Coder Social home page Coder Social logo

davidleeds / hashmap Goto Github PK

View Code? Open in Web Editor NEW
265.0 13.0 54.0 107 KB

Templated type-safe hashmap implementation in C using open addressing and linear probing for collision resolution.

License: MIT License

C 93.59% CMake 6.41%
hashmap constant-time linear-probing open-addressing c embedded-linux tiny-library generic-programming template type-safe

hashmap's Introduction

hashmap

ci

Templated type-safe hashmap implementation in C using open addressing and linear probing for collision resolution.

Summary

This project came into existence because there are a notable lack of flexible and easy to use data structures available in C. C data structures with efficient, type-safe interfaces are virtually non-existent. Higher level languages have built-in libraries and templated classes, but plenty of embedded projects or higher level libraries are implemented in C. When it is undesireable to depend on a bulky library like Glib or grapple with a restrictive license agreement, this is the library for you.

Goals

  • To scale gracefully to the full capacity of the numeric primitives in use. We should be able to load enough entries to consume all memory on the system without hitting any bugs relating to integer overflows. Lookups on a hashtable with a hundreds of millions of entries should be performed in close to constant time, no different than lookups in a hashtable with 20 entries. Automatic rehashing occurs and maintains a load factor of 0.75 or less.
  • To provide a clean and easy-to-use interface. C data structures often struggle to strike a balance between flexibility and ease of use. To this end, I wrapped a generic C backend implementation with light-weight pre-processor macros to create a templated interface that enables the compiler to type-check all function arguments and return values. All required type information is encoded in the hashmap declaration using theHASHMAP() macro. Unlike with header-only macro libraries, there is no code duplication or performance disadvantage over a traditional library with a non-type-safe void * interface.
  • To enable easy iteration and safe entry removal during iteration. Applications often need these features, and the data structure should not hold them back. Easy to use hashmap_foreach() macros and a more flexible iterator interface are provided. This hashmap also uses an open addressing scheme, which has superior iteration performance to a similar hashmap implemented using separate chaining (buckets with linked lists). This is because fewer instructions are needed per iteration, and array traversal has superior cache performance than linked list traversal.
  • To use a very unrestrictive software license. Using no license was an option, but I wanted to allow the code to be tracked, simply for my own edification. I chose the MIT license because it is the most common open source license in use, and it grants full rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell the code. Basically, take this code and do what you want with it. Just be nice and leave the license comment and my name at top of the file. Feel free to add your name if you are modifying and redistributing.

API Examples

Declaring a type-specific hashmap

Use the HASHMAP(key_type, value_type) macro to declare a hashmap state struct specific to your needs. Keys and values are always passed in by pointer. Keys are const.

/* Map with string key (const char *) and integer value (int *) */
HASHMAP(char, int) map1;

/* Map with uint64 key (const uint64_t *) and struct value (struct my_value *) */
HASHMAP(uint64_t, struct my_value) map2;

The structure defined by the HASHMAP() macro may be used directly, or named using typedef. For example:

typedef HASHMAP(char, struct my_value) value_map_t;

Initialization and cleanup

Maps must be initialized with a key hash function and a key comparator.

/* Initialize the map structure */
hashmap_init(&map, my_key_hash, my_key_compare);

/* Use the map... */

/* Free resources associated with the map */
hashmap_cleanup(&map);

This library provides some hash functions, so you may not have to write your own:

I recommend using these, unless you have very specific needs.

/* Initialize a map with case-sensitive string keys */
hashmap_init(&map, hashmap_hash_string, strcmp);

Note that memory associated with map keys and values is not managed by the map, so you may need to free this before calling hashmap_cleanup(). Keys are often stored in the same structure as the value, but it is possible to have the map manage key memory allocation internally, by calling hashmap_set_key_alloc_funcs().

Value insertion and access

/* Insert a my_value (fails and returns -EEXIST if the key already exists) */
int result = hashmap_put(&map, "KeyABC", val);

/* Access the value with a given key */
struct my_value *val = hashmap_get(&map, "KeyABC");

Value removal

/* Erase the entry with the given key */
struct my_value *val = hashmap_remove(&map, "KeyABC");

/* Erase all entries */
hashmap_clear(&map);

/* Erase all entries and reset the hash table to its initial size */
hashmap_reset(&map);

Iteration

Iteration may be accomplished using the "convenience" foreach macros, or by using the iterator interface directly. Generally, the foreach macros are the most intuitive and convenient.

const char *key;
struct my_value *val;

/* Iterate over all map entries and access both keys and values */
hashmap_foreach(key, val, &map) {
    /* Access each entry */
}

/* Iterate over all map entries and access just keys */
hashmap_foreach_key(key, &map) {
    /* Access each entry */
}

/* Iterate over all map entries and access just values */
hashmap_foreach_data(val, &map) {
    /* Access each entry */
}

The above iteration macros are only safe for read-only access. To safely remove the current element during iteration, use the macros with a _safe suffix. These require an additional pointer parameter. For example:

const char *key;
struct my_value *val;
void *temp;

/* Okay */
hashmap_foreach_key_safe(key, &map, temp) {
    hashmap_remove(&map, key);
}

Iteration using the iterator interface.

HASHMAP_ITER(map) it;

for (it = hashmap_iter(&map); hashmap_iter_valid(&it); hashmap_iter_next(&it) {
	/*
	 * Access entry using:
	 *   hashmap_iter_get_key()
	 *   hashmap_iter_get_data()
	 *   hashmap_iter_set_data()
	 */
}

Additional examples

Are located in the examples directory in the source tree.

How to Build and Install

This project uses CMake to orchestrate the build and installallation process. To build and install on your host system, follow these easy steps:

  1. git clone https://github.com/DavidLeeds/hashmap.git - download the source
  2. mkdir build-hashmap && cd build-hashmap - create a build directory outside the source tree
  3. cmake ../hashmap - run CMake to setup the build
  4. make - compile the code
  5. make test - run the unit tests (if enabled)
  6. sudo make install - OPTIONAL install the library on this system
CMake Options
  • HASHMAP_BUILD_TESTS - Set to ON to generate unit tests.
  • HASHMAP_BUILD_EXAMPLES - Set to ON to build example code.

Contibutions and Questions

I welcome all questions and contributions. Feel free to e-mail me, or put up a pull request. The core algorithm is stable, but I'm happy to consider CMake improvements, compiler compatibility fixes, or API additions.

hashmap's People

Contributors

chardin-cpi avatar davidleeds avatar dleeds-cpi avatar dmitry-kabanov avatar ecklm avatar foxieflakey avatar tarik02 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

hashmap's Issues

-Wdiscarded-qualifiers at hahsmap.h:29:16

Reimplementing the library I noticed that the foreach declared int the header file discard a 'const' quilifier

warning: assignment discardsconstqualifier from pointer target type [-Wdiscarded-qualifiers]
     ((key) = hashmap_iter_get_key(&__HASHMAP_UNIQUE(x, it))) && 
note: in expansion of macro__HASHMAP_FOREACH#define hashmap_foreach(key, data, h) __HASHMAP_FOREACH(__HASHMAP_MAKE_UNIQUE(__map), (key), (data), (h))

It happens both with the normal and safe foreach, so I suppose it should happen also with those that iterate over keys and values. Does someone understand the problem better than I do?
Thanks!

Traversal error while adding

When I was using the test program, I added elements while traversing, and found that I always traversed to the last element each time, adding and deleting at the same time. This is my test program.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <libgen.h>
#include <errno.h>
#include <pthread.h>

#include "hashmap.h"

#define ERROR(format, args...)  do{printf("\033[1;31m[TEST][%s-%d]\033[m" format, basename(__FILE__), __LINE__, ##args);}while(0)
#define DEBUG(format, args...)  do{printf("\033[1;32m[TEST][%s-%d]\033[m" format, basename(__FILE__), __LINE__, ##args);}while(0)
#define INFO(format, args...)   do{printf("\033[1;35m[TEST][%s-%d]\033[m" format, basename(__FILE__), __LINE__, ##args);}while(0)
#define SYSERR(format, args...) do{printf("\033[1;31m[TEST][%s-%d](%s) \033[m" format, basename(__FILE__), __LINE__, strerror(errno), ##args);}while(0)

typedef struct NPU_IMAGE_ST
{
    uint64_t  au64PhyAddr[3]; /* RW;The physical address of the image */
    uint64_t  au64VirAddr[3]; /* RW;The virtual address of the image */
    uint32_t  au32Stride[3];  /* RW;The stride of the image */  
    uint32_t  u32Width;       /* RW;The width of the image */
    uint32_t  u32Height;      /* RW;The height of the image */
    uint32_t  u32AlignHeight;
    uint32_t  u32Size;
    void* reserve;
}NPU_IMAGE_S;

typedef struct MCV_ALGORITHM_RESULT_ST {
    uint64_t frame_id;
    uint32_t object_size;
    uint32_t bind_size;
    uint32_t cache_list_size;
    // the size of the original received image
    uint32_t original_width;
    uint32_t original_height;
    // the size of the algorithm processe image
    uint32_t processed_width;
    uint32_t processed_height;
    // other kind result, data transmission
    uint32_t meta_info_size;
    void *user_data;     // user data
    int reserved[15];    // reserved
} MCV_ALGORITHM_RESULT_S;

typedef struct FRAME_ALGORITHM_RESULT_ST
{
    NPU_IMAGE_S* npu_image;
    MCV_ALGORITHM_RESULT_S* palgorithm_result;
}FRAME_ALGORITHM_RESULT_S;

typedef HASHMAP(char, FRAME_ALGORITHM_RESULT_S) algo_std_map;
typedef struct FRAME_DATA_MAP_ST{
    pthread_mutex_t *map_data_result_lock;
    algo_std_map *map_data_result;
}FRAME_DATA_MAP_S;

static algo_std_map g_mapFaceDataResult;
static pthread_mutex_t g_mapFaceDataResultLock = PTHREAD_MUTEX_INITIALIZER;

static void hashmap_dump(algo_std_map *map, const char *label)
{
    DEBUG("Hashmap stats: %s\n", label);
    DEBUG("    # entries:           %zu\n", hashmap_size(map));
    DEBUG("    Table size:          %zu\n", map->map_base.table_size);
    DEBUG("    Load factor:         %.4f\n", hashmap_load_factor(map));
    DEBUG("    Collisions mean:     %.4f\n", hashmap_collisions_mean(map));
    DEBUG("    Collisions variance: %.4f\n", hashmap_collisions_variance(map));

    const char *key = NULL;
    const char *temp = NULL;
    FRAME_ALGORITHM_RESULT_S *p_frame_algo_result = NULL;
    hashmap_foreach_safe(key, p_frame_algo_result, map, temp)
    {
        if (hashmap_get(map, key) != p_frame_algo_result)
        {
            ERROR("invalid iterator, key is %s.\n", key);
            continue;
        }
        DEBUG("dump key is %s\n", key);
    }
}


static void* send_data_thread(void *lparam)
{
    uint64_t frame_id = 0;

    while (1)
    {
        NPU_IMAGE_S *npu_image = NULL;
        npu_image = (NPU_IMAGE_S *)calloc(1, sizeof(NPU_IMAGE_S));
        if (NULL == npu_image)
        {
            SYSERR("Fail to calloc npu_image.\n");
            usleep(100*1000);
            continue;
        }
        npu_image->u32Width  = frame_id * 10;
        npu_image->u32Height = frame_id * 5;

        int hashmap_ret = -1;
        char key_str[32] = {0};
        FRAME_ALGORITHM_RESULT_S *pstAlgorithmResult = NULL;
        pstAlgorithmResult = (FRAME_ALGORITHM_RESULT_S *)calloc(1, sizeof(FRAME_ALGORITHM_RESULT_S));
        if (pstAlgorithmResult)
        {
            pthread_mutex_lock(&g_mapFaceDataResultLock);
            pstAlgorithmResult->npu_image = npu_image;
            pstAlgorithmResult->palgorithm_result = NULL;
            snprintf(key_str, sizeof(key_str), "%llu", frame_id++);
            hashmap_ret = hashmap_put(&g_mapFaceDataResult, key_str, (void*)pstAlgorithmResult);
            if (hashmap_ret < 0)
            {
                ERROR("Fail to called hashmap_put, key is %s, because of %s.\n", key_str, strerror(-hashmap_ret));
                free(npu_image);
                free(pstAlgorithmResult);
            }
            else
            {
                DEBUG("input: key:%s - pstAlgorithmResult:%p, npu_image:%p, map size %d.\n", 
                       key_str, pstAlgorithmResult, pstAlgorithmResult->npu_image, hashmap_size(&g_mapFaceDataResult));
            }
            pthread_mutex_unlock(&g_mapFaceDataResultLock);
        }
        usleep(100*1000);
    }


    return NULL;
}

static void* recv_data_thread(void *lparam)
{
    while (1)
    {
        pthread_mutex_lock(&g_mapFaceDataResultLock);
        const void *temp = NULL;
        const char *key = NULL;
        void *data = NULL;
        FRAME_ALGORITHM_RESULT_S *pframe_algorithm_result = NULL;
        INFO("hashmap_size is %d.\n", hashmap_size(&g_mapFaceDataResult));
        hashmap_foreach_safe(key, pframe_algorithm_result, &g_mapFaceDataResult, temp)
        {
            data = NULL;
            data = hashmap_get(&g_mapFaceDataResult, key);
            if (data != pframe_algorithm_result)
            {
                ERROR("invalid iterator, data = %p, pframe_algorithm_result=%p.\n",
                      data, pframe_algorithm_result);
                continue;
            }
            printf("--------------------------------------------------------");
            hashmap_dump(&g_mapFaceDataResult, __FUNCTION__);
            DEBUG("found it, key is %s.\n", key);
            printf("--------------------------------------------------------");
        }
        
        pthread_mutex_unlock(&g_mapFaceDataResultLock);
        usleep(500*1000);
    }

    return NULL;
}

int main(int argc, const char *argv[])
{
    pthread_t send_pid = -1;
    pthread_t recv_pid = -1;

    hashmap_init(&g_mapFaceDataResult, hashmap_hash_string, strcmp);

    pthread_create(&send_pid, NULL, send_data_thread, NULL);
    pthread_create(&recv_pid, NULL, recv_data_thread, NULL);

    while(1);

    return 0;
}

table pointer is NULL

After creating a hashmap the hb->table pointer is null. Any operation will produce memory errors due to this. calling hashmap_reserve will result in a memory error due to the null pointer.
hashmap_rehash() will produce a memory error because of applying an offset to the null pointer.

modifying key interface requirements

Can you provide an interface "hashmap_set" for modifying the key, the function is to add the key if it does not exist, and modify it if it exists. Thank you so much.

incovenient function type in hashmap_set_key_alloc_funcs

Because of the type of the free function this takes the most common use case of just passing 'free' produces a warning

Incompatible function pointer types initializing 'typeof (((&skined_anim_mngr.map))->map_types->t_key_free_func)' (aka 'void (*)(char *)') with an expression of type 'void (void *)'

Therefore wouldn't it make sense to change the type to void free(char *) ?

Multiple errors with typeof

I'm trying to upgrade my project to the latest version of your hashmap (previously using version 2016-2018) , but after rewriting all instances to include new syntax, I'm running into multiple errors at compile time that basically seem to reduce to:
error: ‘typeof’ was not declared in this scope

Searching the internet seems to indicate that this is a GNU extension to C that doesn't work well with C++. This is a mixed C/C++ project. Is there anything I can do?

How to build in nodebug mode

I need to run this lib in sgx. But sgx do not support __assert_fail. So I need to make in no debug mode.

Please help. Thanks

Incompatibility with the c2x standard

root@vps103211:~/projects# gcc -c src/nnetwork.c -o obj/nnetwork.o -lhashmap
root@vps103211:~/projects# gcc -c src/nnetwork.c -o obj/nnetwork.o -lhashmap --std=c2x
In file included from src/../libs/nnetwork.h:5,
                 from src/nnetwork.c:1:
src/nnetwork.c: In functioninit’:
src/nnetwork.c:19:3: error: expected ‘;’ before__map_hash19 |   hashmap_init(&map1, hashmap_hash_string, strcmp);
      |   ^~~~~~~~~~~~
src/nnetwork.c:19:3: error: expected ‘;’ before__map_compare19 |   hashmap_init(&map1, hashmap_hash_string, strcmp);
      |   ^~~~~~~~~~~~
src/nnetwork.c:19:3: error: ‘__map_hashundeclared (first use in this function)
   19 |   hashmap_init(&map1, hashmap_hash_string, strcmp);
      |   ^~~~~~~~~~~~
src/nnetwork.c:19:3: note: each undeclared identifier is reported only once for each function it appears in
src/nnetwork.c:19:3: error: ‘__map_compareundeclared (first use in this function)
   19 |   hashmap_init(&map1, hashmap_hash_string, strcmp);
      |   ^~~~~~~~~~~~
root@vps103211:~/projects#

global hashmap can't be seen outside of where it's declared

Hi, I'm trying to use this hashmap as a global variable but I can't figure out how the variable can be seen outside of the scope of where it's declared. I've loaded the hashmap correctly:
(gdb) p map $499 = {map_base = {table_size_init = 128, table_size = 2048, size = 1466, table = 0x555558c909b0, hash = 0x55555556a12f <hashmap_hash_string>, compare = 0x7ffff70a9c50, key_dup = 0x0, key_free = 0x0}, map_types = 0x7fffffffb550}

But when I enter into a function, the map has a different address and is empty:
(gdb) p map $501 = (struct hashmap_base *) 0x0

The struct also seems to be different and the two maps have different addresses. I've used the macro HASHMAP in the main function because I couldn't get the program to compile using a typedef in the header file. Any ideas on how I can get this to work?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.