Coder Social home page Coder Social logo

algopractice6's Introduction

Hash Table Research

Introduction

This research focuses on performance of various implementations of hash table and hash functions.

The following hash functions are considered:

  • For integers:
    • Identity function
    • Multiplicative hash
  • For floating-point numbers:
    • Round to nearest integer
    • Interpret IEEE-754 floating point number representation as unsigned integer
  • For strings:
    • Length of string
    • Sum of characters
    • Polynomial hash
    • crc64 hash

The following hash table types are considered:

  • Hash table with closed addressing (with linked list chaining)
  • Hash table with open addressing

Results

Hash functions

Tests for various hash functions yielded the following results:

integer hashes floating-point hashes string hashes

On uniformly-distributed inputs all functions behave similarly, except for hash_string_length which is limited by maximum length of generated string.

However, hash_string_sum_char will behave poorly on larger table sizes, as it will produce normal distribution of bucket sizes according to Central Limit Theorem.

Hash tables

When generating random queries for hash tables, the following results were obtained:

linear scale random tests logarithmic scale random tests

On this data, hash table with open addressing performs about twice as fast as with closed one.

When increasing the probability of insertion operation to 50%, the results were following:

linear scale weighted tests logarithmic scale weighted tests

Now the hash table with open addressing is almost three times faster than another implementation.

Conclusions

Hash functions

On uniform data nearly all function will behave efficiently, as the results show. However, on a more structured data the results might change. For example, if the range of input integers or floating-point numbers is smaller than the size of hash table, the identity and round functions will behave inefficiently, similar to the performance of hash_str_length.

hash_str_sum_char behaves efficiently on relatively small hash tables containing long words, but it will inevitably suffer from the effects of Central Limit Theorem in other cases.

Hash tables

The hash table with open addressing performs much faster than the table with closed addressing due to increased code locality and less frequent memory allocations. This effect becomes more noticeable with larger sample sizes, as spikes in execution time, indicating the occurrence of rehash, are higher for the table with closed addressing. This indicates that the longer insertion times are affecting the performance of a rehash, decreasing the overall hash table performance even further.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.