greg7mdp / parallel-hashmap Goto Github PK
View Code? Open in Web Editor NEWA family of header-only, very fast and memory-friendly hashmap and btree containers.
Home Page: https://greg7mdp.github.io/parallel-hashmap/
License: Apache License 2.0
A family of header-only, very fast and memory-friendly hashmap and btree containers.
Home Page: https://greg7mdp.github.io/parallel-hashmap/
License: Apache License 2.0
In some applications it would be desirable to have bi-directional iteration over the hash table (even though the standards specifies that iterators are only forward iterators). Would it be possible to add such functionality to node_ and flat_ hash tables?
When using parallel-hashmap and telling it a mutex type to use internally for thread-safety .... you say it's not safe to use iterators while another thread may be writing ....
Obviously iterating with those iterators I would expect to not be thread-safe.
But presumably it is thread-safe to, without any extra external locking, do a .find() - which returns precisely such an iterator - and then retrieve the value from that iterator ....?
Otherwise it's not at all clear how the class could be at all used thread-safely, e.g. how does simply one retrieve a value from the map (in such a case, generally using the [] operator is less desirable that using find(), as we don't want to potentially be writing a new entry to the map as a side-effect of our get, and creating extra contention in the process etc).
Note - I did look at the examples and tests, but trying to wade through all the macros is a ton harder and less informative than if one could just read some simple example code along with an explanation of exactly what is and isn't safe to do.
When compiling latest clone with g++-8 (Ubuntu/Linaro 8.3.0-26ubuntu1~18.04) 8.3.0
on Linux minit 4.9.140-tegra #1 SMP PREEMPT Wed Mar 13 00:32:22 PDT 2019 aarch64 aarch64 aarch64 GNU/Linux
with the following switch -O3 -std=c++1z
In file included from XXX
../include/parallel_hashmap/phmap.h: In function ‘size_t phmap::container_internal::CapacityToGrowth(size_t)’:
/home/minit/nano-tools/include/parallel_hashmap/phmap_config.h:634:58: error: ‘capacity’ is not a constant expression
#define PHMAP_IF_CONSTEXPR(expr) if constexpr ((expr))
../include/parallel_hashmap/phmap.h:499:5: note: in expansion of macro ‘PHMAP_IF_CONSTEXPR’
PHMAP_IF_CONSTEXPR (Group::kWidth == 8 && capacity == 7) {
^~~~~~~~~~~~~~~~~~
../include/parallel_hashmap/phmap.h: In function ‘size_t phmap::container_internal::GrowthToLowerboundCapacity(size_t)’:
/home/minit/nano-tools/include/parallel_hashmap/phmap_config.h:634:58: error: ‘growth’ is not a constant expression
#define PHMAP_IF_CONSTEXPR(expr) if constexpr ((expr))
^
../include/parallel_hashmap/phmap.h:513:5: note: in expansion of macro ‘PHMAP_IF_CONSTEXPR’
PHMAP_IF_CONSTEXPR (Group::kWidth == 8 && growth == 7) {
phmap_base.h
mentions std::result_of
here:
parallel-hashmap/parallel_hashmap/phmap_base.h
Lines 456 to 457 in 2399d65
In VS 2019 16.5, due to compiler improvements for the [[deprecated]]
attribute, this line will emit "warning C4996: 'std::result_of<_Fty>
': warning STL4014: std::result_of
and std::result_of_t
are deprecated in C++17. They are superseded by std::invoke_result
and std::invoke_result_t
. You can define _SILENCE_CXX17_RESULT_OF_DEPRECATION_WARNING
or _SILENCE_ALL_CXX17_DEPRECATION_WARNINGS
to acknowledge that you have received this warning."
It appears that nothing in parallel-hashmap uses result_of_t
. Therefore, I believe that the best fix would be to delete this line entirely.
Thank you Greg for this awesome contribution, I am finding it quite useful for my application.
I plan to generate a very large hashmap with bitsets as the key and integers as the value. The size of the hashmap will be 10GB when serialized to binary. It might also be important to note that the hashmap will not be modified after generation; purely used for look-up purposes.
I went ahead and serialized a 1GB flat hashmap using cereal and that took about 10 minutes, while it took about 20 minutes to deserialize.
From what I could gather, hashmaps are slow to deserialize due to the fact they need to repopulate the entire map again from scratch (or so I've been told).
To mitigate the slow deserialization process I was thinking to split the hashmap up and by some means parallelize the process of deserialization. Or maybe pre-allocating the memory necessary for the hashmap will speed things up. I am not really sure.
I look forward to receiving your response.
Hi,
I'm using the parallel-hashmap for text tokenisation at https://github.com/bnosac/tokenizers.bpe
It is an R package. When compiling the package with Address Sanitizer (Debian Linux, R-devel, GCC ASAN/UBSAN)
I'm getting the following address sanitizer messages from parallel-hashmap. Is it possible to fix these? And how? Thanks!
./parallel_hashmap/phmap.h: In function ‘__m128i phmap::container_internal::_mm_cmpgt_epi8_fixed(__m128i, __m128i)’:
./parallel_hashmap/phmap.h:294:44: warning: overflow in implicit constant conversion [-Woverflow]
const __m128i mask = _mm_set1_epi8(0x80);
^
./parallel_hashmap/phmap_bits.h: In function ‘uint64_t umul128(uint64_t, uint64_t, uint64_t*)’:
./parallel_hashmap/phmap_bits.h:342:44: warning: ISO C++ does not support ‘__int128’ for ‘type name’ [-Wpedantic]
auto result = static_cast<unsigned __int128>(a) * static_cast<unsigned __int128>(b);
^~~~~~~~
./parallel_hashmap/phmap_bits.h:342:80: warning: ISO C++ does not support ‘__int128’ for ‘type name’ [-Wpedantic]
auto result = static_cast<unsigned __int128>(a) * static_cast<unsigned __int128>(b);
^~~~~~~~
Full trace log below
* installing to library ‘/home/docker/R’
* installing *source* package ‘tokenizers.bpe’ ...
** libs
g++ -fsanitize=undefined,bounds-strict -fno-omit-frame-pointer -std=gnu++11 -I"/usr/local/lib/R/include" -DNDEBUG -pthread -DSTRICT_R_HEADERS -I./youtokentome/cpp -I. -I"/home/docker/R/Rcpp/include" -I/usr/local/include -fpic -g -O2 -Wall -pedantic -mtune=native -c youtokentome/cpp/utils.cpp -o youtokentome/cpp/utils.o
In file included from ./parallel_hashmap/phmap_utils.h:26:0,
from ./parallel_hashmap/phmap.h:51,
from youtokentome/cpp/utils.h:6,
from youtokentome/cpp/utils.cpp:2:
./parallel_hashmap/phmap_bits.h: In function ‘uint64_t umul128(uint64_t, uint64_t, uint64_t*)’:
./parallel_hashmap/phmap_bits.h:342:44: warning: ISO C++ does not support ‘__int128’ for ‘type name’ [-Wpedantic]
auto result = static_cast<unsigned __int128>(a) * static_cast<unsigned __int128>(b);
^~~~~~~~
./parallel_hashmap/phmap_bits.h:342:80: warning: ISO C++ does not support ‘__int128’ for ‘type name’ [-Wpedantic]
auto result = static_cast<unsigned __int128>(a) * static_cast<unsigned __int128>(b);
^~~~~~~~
In file included from youtokentome/cpp/utils.h:6:0,
from youtokentome/cpp/utils.cpp:2:
./parallel_hashmap/phmap.h: In function ‘__m128i phmap::container_internal::_mm_cmpgt_epi8_fixed(__m128i, __m128i)’:
./parallel_hashmap/phmap.h:294:44: warning: overflow in implicit constant conversion [-Woverflow]
const __m128i mask = _mm_set1_epi8(0x80);
^
g++ -fsanitize=undefined,bounds-strict -fno-omit-frame-pointer -std=gnu++11 -I"/usr/local/lib/R/include" -DNDEBUG -pthread -DSTRICT_R_HEADERS -I./youtokentome/cpp -I. -I"/home/docker/R/Rcpp/include" -I/usr/local/include -fpic -g -O2 -Wall -pedantic -mtune=native -c youtokentome/cpp/bpe.cpp -o youtokentome/cpp/bpe.o
In file included from ./parallel_hashmap/phmap_utils.h:26:0,
from ./parallel_hashmap/phmap.h:51,
from youtokentome/cpp/bpe.h:6,
from youtokentome/cpp/bpe.cpp:4:
./parallel_hashmap/phmap_bits.h: In function ‘uint64_t umul128(uint64_t, uint64_t, uint64_t*)’:
./parallel_hashmap/phmap_bits.h:342:44: warning: ISO C++ does not support ‘__int128’ for ‘type name’ [-Wpedantic]
auto result = static_cast<unsigned __int128>(a) * static_cast<unsigned __int128>(b);
^~~~~~~~
./parallel_hashmap/phmap_bits.h:342:80: warning: ISO C++ does not support ‘__int128’ for ‘type name’ [-Wpedantic]
auto result = static_cast<unsigned __int128>(a) * static_cast<unsigned __int128>(b);
^~~~~~~~
In file included from youtokentome/cpp/bpe.h:6:0,
from youtokentome/cpp/bpe.cpp:4:
./parallel_hashmap/phmap.h: In function ‘__m128i phmap::container_internal::_mm_cmpgt_epi8_fixed(__m128i, __m128i)’:
./parallel_hashmap/phmap.h:294:44: warning: overflow in implicit constant conversion [-Woverflow]
const __m128i mask = _mm_set1_epi8(0x80);
^
g++ -fsanitize=undefined,bounds-strict -fno-omit-frame-pointer -std=gnu++11 -I"/usr/local/lib/R/include" -DNDEBUG -pthread -DSTRICT_R_HEADERS -I./youtokentome/cpp -I. -I"/home/docker/R/Rcpp/include" -I/usr/local/include -fpic -g -O2 -Wall -pedantic -mtune=native -c youtokentome/cpp/utf8.cpp -o youtokentome/cpp/utf8.o
In file included from ./parallel_hashmap/phmap_utils.h:26:0,
from ./parallel_hashmap/phmap.h:51,
from youtokentome/cpp/utils.h:6,
from youtokentome/cpp/utf8.h:3,
from youtokentome/cpp/utf8.cpp:2:
./parallel_hashmap/phmap_bits.h: In function ‘uint64_t umul128(uint64_t, uint64_t, uint64_t*)’:
./parallel_hashmap/phmap_bits.h:342:44: warning: ISO C++ does not support ‘__int128’ for ‘type name’ [-Wpedantic]
auto result = static_cast<unsigned __int128>(a) * static_cast<unsigned __int128>(b);
^~~~~~~~
./parallel_hashmap/phmap_bits.h:342:80: warning: ISO C++ does not support ‘__int128’ for ‘type name’ [-Wpedantic]
auto result = static_cast<unsigned __int128>(a) * static_cast<unsigned __int128>(b);
^~~~~~~~
In file included from youtokentome/cpp/utils.h:6:0,
from youtokentome/cpp/utf8.h:3,
from youtokentome/cpp/utf8.cpp:2:
./parallel_hashmap/phmap.h: In function ‘__m128i phmap::container_internal::_mm_cmpgt_epi8_fixed(__m128i, __m128i)’:
./parallel_hashmap/phmap.h:294:44: warning: overflow in implicit constant conversion [-Woverflow]
const __m128i mask = _mm_set1_epi8(0x80);
^
g++ -fsanitize=undefined,bounds-strict -fno-omit-frame-pointer -std=gnu++11 -I"/usr/local/lib/R/include" -DNDEBUG -pthread -DSTRICT_R_HEADERS -I./youtokentome/cpp -I. -I"/home/docker/R/Rcpp/include" -I/usr/local/include -fpic -g -O2 -Wall -pedantic -mtune=native -c rcpp_youtokentome.cpp -o rcpp_youtokentome.o
In file included from ./parallel_hashmap/phmap_utils.h:26:0,
from ./parallel_hashmap/phmap.h:51,
from ./youtokentome/cpp/bpe.h:6,
from rcpp_youtokentome.cpp:3:
./parallel_hashmap/phmap_bits.h: In function ‘uint64_t umul128(uint64_t, uint64_t, uint64_t*)’:
./parallel_hashmap/phmap_bits.h:342:44: warning: ISO C++ does not support ‘__int128’ for ‘type name’ [-Wpedantic]
auto result = static_cast<unsigned __int128>(a) * static_cast<unsigned __int128>(b);
^~~~~~~~
./parallel_hashmap/phmap_bits.h:342:80: warning: ISO C++ does not support ‘__int128’ for ‘type name’ [-Wpedantic]
auto result = static_cast<unsigned __int128>(a) * static_cast<unsigned __int128>(b);
^~~~~~~~
In file included from ./youtokentome/cpp/bpe.h:6:0,
from rcpp_youtokentome.cpp:3:
./parallel_hashmap/phmap.h: In function ‘__m128i phmap::container_internal::_mm_cmpgt_epi8_fixed(__m128i, __m128i)’:
./parallel_hashmap/phmap.h:294:44: warning: overflow in implicit constant conversion [-Woverflow]
const __m128i mask = _mm_set1_epi8(0x80);
^
g++ -fsanitize=undefined,bounds-strict -fno-omit-frame-pointer -std=gnu++11 -I"/usr/local/lib/R/include" -DNDEBUG -pthread -DSTRICT_R_HEADERS -I./youtokentome/cpp -I. -I"/home/docker/R/Rcpp/include" -I/usr/local/include -fpic -g -O2 -Wall -pedantic -mtune=native -c RcppExports.cpp -o RcppExports.o
g++ -fsanitize=undefined,bounds-strict -fno-omit-frame-pointer -std=gnu++11 -shared -L/usr/local/lib/R/lib -L/usr/local/lib -o tokenizers.bpe.so youtokentome/cpp/utils.o youtokentome/cpp/bpe.o youtokentome/cpp/utf8.o rcpp_youtokentome.o RcppExports.o -pthread -L/usr/local/lib/R/lib -lR
rm -f youtokentome/cpp/utils.o youtokentome/cpp/bpe.o youtokentome/cpp/utf8.o rcpp_youtokentome.o RcppExports.o
installing to /home/docker/R/tokenizers.bpe/libs
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (tokenizers.bpe)
Intel C++ compiler issues warning:
warning #2196: routine is both "inline" and "noinline"
for methods marked as PHMAP_ATTRIBUTE_NOINLINE. I think this is known issue with icpc, that could be explicitly addressed by adding intel specific pragma.
benchHashMap n = 4086946, keyType = int64_t, valueType = RankItem(128 Byet)
#0 0x00007f171cfdf428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f171cfdf428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007f171cfe102a in __GI_abort () at abort.c:89
#2 0x00007f171d92284d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007f171d9206b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007f171d920701 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007f171d920919 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007f171d920ebc in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x000000000040de84 in __gnu_cxx::new_allocator<void* phmap::container_internal::Allocate<8ul, std::allocator<std::pair<long const, RankItem> > >(std::allocator<std::pair<long const, RankItem> >, unsigned long)::M>::allocate(unsigned long, void const) (this=, __n=) at /usr/include/c++/5/ext/new_allocator.h:104
#8 phmap::allocator_traits<std::allocator<void* phmap::container_internal::Allocate<8ul, std::allocator<std::pair<long const, RankItem> > >(std::allocator<std::pair<long const, RankItem> >, unsigned long)::M> >::allocate(std::allocator<void phmap::container_internal::Allocate<8ul, std::allocator<std::pair<long const, RankItem> > >(std::allocator<std::pair<long const, RankItem> >*, unsigned long)::M>&, unsigned long) (a=, n=) at ./phmap/phmap_base.h:1579
#9 phmap::container_internal::Allocate<8ul, std::allocator<std::pair<long const, RankItem> > > (alloc=0x7ffd8ac798e8, n=) at ./phmap/phmap.h:696
#10 phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<long, RankItem>, std::hash, phmap::EqualTo, std::allocator<std::pair<long const, RankItem> > >::initialize_slots (this=this@entry=0x7ffd8ac798c0)
at ./phmap/phmap.h:1810
#11 0x00000000004274da in phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<long, RankItem>, std::hash, phmap::EqualTo, std::allocator<std::pair<long const, RankItem> > >::resize (this=0x7ffd8ac798c0,
new_capacity=) at ./phmap/phmap.h:1842
#12 0x000000000042f5ec in phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<long, RankItem>, std::hash, phmap::EqualTo, std::allocator<std::pair<long const, RankItem> > >::rehash_and_grow_if_necessary (
this=0x7ffd8ac798c0) at ./phmap/phmap.h:1934
#13 phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<long, RankItem>, std::hash, phmap::EqualTo, std::allocator<std::pair<long const, RankItem> > >::prepare_insert (this=0x7ffd8ac798c0,
hash=11022116168902158974) at ./phmap/phmap.h:2025
#14 0x000000000042f6cd in phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<long, RankItem>, std::hash, phmap::EqualTo, std::allocator<std::pair<long const, RankItem> > >::find_or_prepare_insert (
this=this@entry=0x7ffd8ac798c0, key=@0x7ffd8ac798b8: -187985434075425886, hash=) at ./phmap/phmap.h:2013
#15 0x0000000000430477 in phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<long, RankItem>, std::hash, phmap::EqualTo, std::allocator<std::pair<long const, RankItem> > >::find_or_prepare_insert (
key=@0x7ffd8ac798b8: -187985434075425886, this=0x7ffd8ac798c0) at ./phmap/phmap.h:2018
#16 phmap::container_internal::raw_hash_map<phmap::container_internal::FlatHashMapPolicy<long, RankItem>, std::hash, phmap::EqualTo, std::allocator<std::pair<long const, RankItem> > >::try_emplace_impl<long const&> (
k=@0x7ffd8ac798b8: -187985434075425886, this=0x7ffd8ac798c0) at ./phmap/phmap.h:2293
#17 phmap::container_internal::raw_hash_map<phmap::container_internal::FlatHashMapPolicy<long, RankItem>, std::hash, phmap::EqualTo, std::allocator<std::pair<long const, RankItem> > >::try_emplace<long, , 0>(long const&) (
k=@0x7ffd8ac798b8: -187985434075425886, this=0x7ffd8ac798c0) at ./phmap/phmap.h:2241
#18 phmap::container_internal::raw_hash_map<phmap::container_internal::FlatHashMapPolicy<long, RankItem>, std::hash, phmap::EqualTo, std::allocator<std::pair<long const, RankItem> > >::operator[]<long, phmap::container_internal::FlatHashMapPolicy<long, RankItem> > (key=@0x7ffd8ac798b8: -187985434075425886, this=0x7ffd8ac798c0) at ./phmap/phmap.h:2277
#19 insert_high_load<phmap::flat_hash_map<long, RankItem, std::hash, phmap::EqualTo, std::allocator<std::pair<long const, RankItem> > > > (ahash=..., hash_name="phmap", vList=std::vector of length 4086946, capacity 4086946 = {...})
at ebench.cpp:535
#20 0x0000000000430b3b in benOneHash<phmap::flat_hash_map<long, RankItem, std::hash, phmap::EqualTo, std::allocator<std::pair<long const, RankItem> > > > (tmp=..., hash_name="phmap",
oList=std::vector of length 4086946, capacity 4086946 = {...}) at ebench.cpp:933
#21 0x00000000004064e4 in benchHashMap (n=4086946) at ebench.cpp:1041
#22 0x000000000040230a in main (argc=, argv=) at ebench.cpp:1312
Are iterators still invalidated after each insert?
Can anyone of kindness has implemented a gdb pretty printer for phmap's types? Please offer one here, thanks a lot!!
Hi,
How can I see the contents of a flat_hash_set?
If I knew how to do it, I could write a gdb pretty-printer for it.
At the moment, it's impossible to see the contents of the set, at least for me who doesn't know the structure
Does parallel-hashmap ever make its own clean instance of my custom Allocator and allocates from that instead of a copy of the allocator I passed in the constructor?
I wonder what happens to your SSE code on ARM CPUs that only have NEON.
The flat hash maps may move the keys and values in memory. So if you keep a pointer to something inside a flat hash map, this pointer may become invalid when the map is mutated. The node hash maps don't, and should be used instead if this is a problem.
What's mean "move the keys and values in memory" ?
Why "pointer may become invalid"?I can't understand.
Because used memcpy() and didn't use the copy constructor?
Hey Greg,
I would like to make an unordered unique multimap phmap::flat_hash_map<Key, phmap::flat_hash_set<Value>>
where Key
and Value
are trivially copyable. As one would expect when I try to dump the outer map using phmap::BinaryOutputArchive ar_out( filename.c_str() ); dict.dump( ar_out );
I get an error that the inner set is not trivially copyable.
What would you suggest for serializing?
Also, side question: Is this how you would make a multimap?
Thanks,
Adam
When I run following code:
phmap::parallel_flat_hash_map<std::string, int> my_map;
my_map.max_load_factor(0.5);
my_map.reserve(15);
std::cout << my_map.max_load_factor() << std::endl;
std::cout << my_map.bucket_count() << std::endl;
I got the result:
1
16
But it is supposed to be:
0.5
32
can you add my hash map https://github.com/ktprime/emhash to your bench code?
I'm not exactly sure how this is meant to be used with Abseil. I set -DPHMAP_USE_ABSL_HASH
, -DPHMAP_USE_ABSL_HASHEQ
, include parallel_hashmap/phmap_fwd_decl.h
, and then am I to include absl/hash/hash.h
, finally phmap
?
Playing around with it, but I'm getting cryptic name collisions, and thought I would ask about intended usage oppose to fighting with it more. I'm migrating from vanilla abseil to phmap for stronger concurrency guarantees in parallel_*
. I want to retain my hashing from hashing with abseil. My classes have the CRTP AbslHashValue hook defined, and I was hoping for plug and play.
A provided example would be 🎉
Since it's a header only lib, can it be easily exported as a lib and used by C code?
thanks!
'Tis but a nitpick.
When I see the name "Parallel Hash Map" I immediately think of a hash map for parallel (concurrent) computations.
So the name choice is pretty confusing in this regard.
Not sure if it existed back when this was first added, but currently there is no absl::EqualTo, so compilation breaks when PHMAP_USE_ABSL_HASHEQ is defined. Workaround is to add this before including phmap.h
:
namespace absl {
template <typename T>
struct EqualTo {
bool operator()(const T& a, const T& b) const {
return a == b;
}
};
}
At a glance, couldn't phmap just always use its own EqualTo? At least for me, I'm solely using PHMAP_USE_ABSL_HASHEQ for the hashing.
If I wanted to create a parallel_flat_hash_map with the last parameter being std::mutex, what should i put for the Hash, Eq, Alloc and N, what are the defaults.
template <class K, class V, class Hash, class Eq, class Alloc, size_t N, class Mtx_>
class parallel_flat_hash_map ......;
I tried to look into the codes to figure it out myself, but seems it's beyond my level :P
Hello @greg7mdp !
I'm using a parallel_flat_hash_map<uint64_t, uint32_t>
to store information about some groups, much like a SQL GROUP BY. Map key is a XXH3 cumulative hash of the grouped columns and value is the position of that group.
I submit to 16 worker threads a list of group keys to insert, following example code from your documentation:
using map_type = phmap::parallel_flat_hash_map<uint64_t, uint32_t>;
map_type map;
tv::parallel::for_each(
map.subcnt(),
[&](auto& ctx) {
using hasher_type = typename map_type::hasher;
hasher_type hasher;
// gkeys is a std::vector<uint64_t> containing group keys
for (auto gkey : gkeys) {
auto map_key = hasher(gkey);
auto idx = map.subidx(map_key);
// ctx.local_index is the worker index: from 0 to map.subcnt() - 1
if (idx == ctx.local_index) {
auto gpos = current_group_count;
// This line triggers AddressSanitizer or segfault
auto res = map.insert(map_entry(gkey, gpos));
if (res.second) {
// New element inserted
++current_group_count;
} else {
// Grab existing group position
gpos = res.first->second;
}
// Update size of group
++group_sizes[gpos];
}
}
}
);
When compiled with AddressSanitizer under GCC 8.3.0, I get the attached trace. When compiled without AddressSanitizer, it segfaults at the same location. I've banged my head around (though it starts to hurt a bit now) checking and re-checking to no-avail.
Could you please help me ?
This one in minor. In phmap_base.h, when PHMAP_HAVE_EXCEPTIONS is 0 (false), the implementation of Throw does not reference the parameter 'error' and may generate a warning.
I have the following two types:
class StringView final {
public:
template <size_t sz>
constexpr StringView(const char (&_text)[sz]) noexcept
: _data(_text), _length(sz) {
static_assert(sz > 0, "String must not be empty");
}
constexpr StringView(const char* __restrict _text, size_t len) noexcept
: _data(_text), _length(len) {}
StringView(const StringView v) noexcept
: _data(v._data), _length(v._length) {}
StringView(const String& str) noexcept;
StringView() = delete;
constexpr const char* data() const noexcept { return _data; }
constexpr size_t length() const noexcept { return _length; }
bool operator==(const StringView v) const noexcept;
private:
const char* _data;
const size_t _length;
};
struct String final {
String() noexcept;
String(sds str) noexcept;
String(size_t len) noexcept;
String(const StringView v) noexcept;
String(const char* __restrict str) noexcept;
String(const char* __restrict str, size_t len) noexcept;
~String() noexcept;
String(const String& other) noexcept;
String(String&& other) noexcept;
void operator=(const String& str) noexcept;
void operator=(String&& str) noexcept;
operator sds() const noexcept { return s; }
bool operator==(const String& str) const noexcept;
bool operator==(const char* __restrict str) const noexcept;
bool operator==(const StringView v) const noexcept;
size_t length() const noexcept;
sds s;
};
I would like to implement heterogeneous lookup between them.
I've done the following:
struct StringHash {
using is_transparent = void;
size_t operator()(const StringView v) const;
};
struct StringEq {
using is_transparent = void;
bool operator()(const StringView lhs, const StringView rhs) const;
};
And I've then created the following map:
phmap::node_hash_map<String, String, StringHash, StringEq>
However, it appears emplacing values into it triggers an assert.
where[String(key.data(), key.length())] =
String(value.as_cstring(), value.get_string_length())
parallel-hashmap/parallel_hashmap/phmap.h
Lines 2010 to 2012 in 882162c
How should I go about doing this? Are there any examples available anywhere?
I observe intermittent segmentation fault when I use parallel_flat_hash_map
in a multithreaded application. The fault happens in raw_hash_set::EqualElement::operator()
. I guess references that EqualElement
stores became invalid.
Sanitizer output:
==2547==ERROR: AddressSanitizer: heap-use-after-free on address 0x7fffea323880 at pc 0x000000487e8d bp 0x7fffd4c4e930 sp 0x7fffd4c4e928
READ of size 8 at 0x7fffea323880 thread T7
#0 0x487e8c in std::__1::equal_to<unsigned long>::operator()(unsigned long const&, unsigned long const&) const /somewhere/bin/../include/c++/v1/functional:688:17
#1 0x487e8c in phmap::EqualTo<unsigned long>::operator()(unsigned long const&, unsigned long const&) const /somewhere/include/parallel_hashmap/phmap_base.h:79:16
#2 0x487e8c in bool phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long>::operator()<unsigned long, std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned long const&>, std::__1::tuple<short const&> >(unsigned long const&, std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned long const&>&&, std::__1::tuple<short const&>&&) const /somewhere/include/parallel_hashmap/phmap.h:1674:20
#3 0x487e8c in decltype(std::declval<phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long> >()(std::declval<unsigned long const& const&>(), std::__1::piecewise_construct, std::declval<std::__1::tuple<unsigned long const&> >(), std::declval<std::__1::tuple<short const&> >())) phmap::container_internal::memory_internal::DecomposePairImpl<phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long>, unsigned long const&, std::__1::tuple<short const&> >(phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long>&&, std::__1::pair<std::__1::tuple<unsigned long const&>, std::__1::tuple<short const&> >) /somewhere/include/parallel_hashmap/phmap.h:673:12
#4 0x487e8c in decltype(memory_internal::DecomposePairImpl(std::forward<phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long> >(fp), PairArgs(std::forward<std::__1::pair<unsigned long const, short>&>(fp0)))) phmap::container_internal::DecomposePair<phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long>, std::__1::pair<unsigned long const, short>&>(phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long>&&, std::__1::pair<unsigned long const, short>&) /somewhere/include/parallel_hashmap/phmap.h:3499:12
#5 0x487e8c in decltype(phmap::container_internal::DecomposePair(std::declval<phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long> >(), std::declval<std::__1::pair<unsigned long const, short>&>())) phmap::container_internal::FlatHashMapPolicy<unsigned long, short>::apply<phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long>, std::__1::pair<unsigned long const, short>&>(phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long>&&, std::__1::pair<unsigned long const, short>&) /somewhere/include/parallel_hashmap/phmap.h:3602:16
#6 0x487e8c in decltype(phmap::container_internal::FlatHashMapPolicy<unsigned long, short>::apply(std::forward<phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long> >(fp), std::forward<std::__1::pair<unsigned long const, short>&>(fp0))) phmap::container_internal::hash_policy_traits<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, void>::apply<phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long>, std::__1::pair<unsigned long const, short>&, phmap::container_internal::FlatHashMapPolicy<unsigned long, short> >(phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::EqualElement<unsigned long>&&, std::__1::pair<unsigned long const, short>&) /somewhere/include/parallel_hashmap/phmap_base.h:544:16
#7 0x487e8c in phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::iterator phmap::container_internal::raw_hash_set<phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::find<unsigned long>(unsigned long const&, unsigned long) /somewhere/include/parallel_hashmap/phmap.h:1571:21
#8 0x487e8c in phmap::container_internal::parallel_hash_set<4ul, phmap::container_internal::raw_hash_set, phmap::NullMutex, phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::iterator phmap::container_internal::parallel_hash_set<4ul, phmap::container_internal::raw_hash_set, phmap::NullMutex, phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::find<unsigned long>(unsigned long const&, unsigned long) /somewhere/include/parallel_hashmap/phmap.h:3038:24
#9 0x487e8c in phmap::container_internal::parallel_hash_set<4ul, phmap::container_internal::raw_hash_set, phmap::NullMutex, phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::iterator phmap::container_internal::parallel_hash_set<4ul, phmap::container_internal::raw_hash_set, phmap::NullMutex, phmap::container_internal::FlatHashMapPolicy<unsigned long, short>, phmap::Hash<unsigned long>, phmap::EqualTo<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, short> > >::find<unsigned long>(unsigned long const&) /somewhere/include/parallel_hashmap/phmap.h:3044:16
I've wrote a test case and find that node_hash_map when load from a dump, the first key is 0, and all other key is incorrect, If I has missed some thing? the test code is bellow
TEST(PHMAP, TEST) {
{
phmap::node_hash_map<uint64_t, uint32_t> mp1 = {{100, 99}, {300, 299}};
for (const auto &n : mp1)
std::cout << n.first << "'s value is: " << n.second << "\n";
{
phmap::BinaryOutputArchive ar_out("./dump.data");
mp1.dump(ar_out);
}
}
phmap::node_hash_map<uint64_t, uint32_t> mp2;
{
phmap::BinaryInputArchive ar_in("./dump.data");
mp2.load(ar_in);
}
for (const auto& n : mp2)
std::cout << n.first << "'s value is: " << n.second << "\n";
}
the output is as follows:
300's value is: 299
100's value is: 99
0's value is: 299
93934918124288's value is: 99
There are two places protected with a check of PHMAP_HAVE_SSSE3, both in phmap.h. The protected intrinsics, _mm_movemask_epi8 and _mm_or_si128, are both SSE2 intrinsics.
It seems you could just remove those checks, the corresponding the #else case code, and simplify phmap_config.h to only test for (and require) SSE2.
Hello!
Is std::shared_mutex support for C++14 and higher planned? I know I can provide LockableImpl specialization for it, but what about native support?
Very interesting library! I wonder how the parallel map compares to Folly's ConcurrentHashMap [1]. I think this is a closer comparison than F14 Map.
[1] https://github.com/facebook/folly/blob/master/folly/concurrency/ConcurrentHashMap.h
Hi all,
I have created a python package that binds parallel-hashmap with numpy integration. The package is not finished, but it has complete, vectorized POD object support such as ints, floats, and bools. There is limited support for strings. I plan to precompile some simple tuples and arrays in the future. Extending new datatypes should be fairly straight forward.
I was able to achieve approximately comparable performance to the c++ implementation without any optimization of the bindings yet. (~30 seconds and 2100MB memory for 100M 8B:8B insertions).
The link to the repo is here: https://github.com/atom-moyer/sparsepy.git
Would we be interested in advertising that package on this page?
This is my first open source contribution, so I am quite excited to help as many people as I can with their hash map needs. My engineering research relies heavily on prototyping protocols which require high performance hash maps. Therefore, I have a personal interest in developing the fastest, smallest, and easiest hash maps possible.
Compiler Error:
c:\users\jerem\documents\umich\research\iros\polylidarv2\polylidar\parallel_hashmap\phmap_bits.h(265): error C3861: '_BitScanForward64': identifier not found
c:\users\jerem\documents\umich\research\iros\polylidarv2\polylidar\parallel_hashmap\phmap_bits.h(298): error C3861: '_BitScanForward': identififier not found
These errors occured on Line 265 of phmap_bits.h
. Googling the error informed me that we need to have a #include <intrin.h>
before this line number. However this #include
does not come until Line 327:
To fix this issue I deleted this include statement and put it below Line 162, here:
After this change it worked just fine for me. I think the only issue is that there should be a guard to check for X64 architecture. I tried the #include
other places but additional errors kept appearing.
Hope this helps.
I've encounter a case where I would like to modify the value inside the hash map in a thread-safe way. I would like to propose a function similar to if_contains that I have so far named if_modify. The difference between the two functions is if_modify is not const, and it locks the unique_lock rather than the shared_lock. Otherwise they use the same function body.
I would like to point out that an identifier like “__v
” does eventually not fit to the expected naming convention of the C++ language standard.
Would you like to adjust your selection for unique names?
Hey, just wanted to say that I really like the work you are doing here. We have found that SwissTables performance can suffer dramatically when there is not enough entropy in the hash function. By using a hash of (h ^ (h >> 4)) & 0x0F
to select your subtables, you are creating just such a reduction of entropy. Given that tables rarely grow large enough to use all their high bits, you can probably get better performance by doing something like h >> 56
. At the very least it is worth exploring.
Hello,
I encountered an issue with reserve and capacity.
I am using parallel flat hash map.
When calling reserve(X), The following call to capacity() does not seem to return X or even anything close, I could see why it would be different because of the way the hashmap is implemented, but the difference seems pretty big and as a result a call to capacity() seem useless and dangerous.
My use case was iterating a bunch of images to collect statistics and color info.
For each image I was doing a map.reserve(std::max(map.capacity(), image.size()) to make sure the map's capacity was at least the current image's number of pixels.
The program ended up trying to allocate several GBs of memory and ended up crashing pretty fast. Without the reserve line, memory usage peaks at like 350 MB.
Is this supposed to be this way ?
The Visual Studio 2017/2019 STL exports the template function
template <class _Kty>
inline size_t hash_value(const _Kty& _Keyval) { // hash _Keyval to size_t value one-to-one
return (size_t) _Keyval ^ (size_t) 0xdeadbeef;
}
into the std
namespace. This causes has_hash_value
to always return true
, seeing that the function is well-defined for any type as long as we don't have to actually run it.
This causes an unfortunate interaction with the StringHash
class, which is instantiated unconditionally as soon as you include phmap.h
and C++17 is being used:
template <class T>
struct Hash
{
template <class U, typename std::enable_if<has_hash_value<U>::value, int>::type = 0>
size_t _hash(const T& val) const
{
return hash_value(val);
}
template <class U, typename std::enable_if<!has_hash_value<U>::value, int>::type = 0>
size_t _hash(const T& val) const
{
return std::hash<T>()(val);
}
inline size_t operator()(const T& val) const
{
return _hash<T>(val);
}
};
// ...
struct StringHash
{
using is_transparent = void;
size_t operator()(std::string_view v) const {
return phmap::Hash<std::string_view>{}(v);
}
};
This will try to instantiate the above std::hash_value
on std::string_view
and fail miserably. One workaround is to define hash_value(std::string_view)
somewhere.
Hey. I try implement parallel hashmap to this solution IrrlichtBaw
I change
std::unordered_map
to your
phmap::flat_hash_map
Code where you can look at this
and i have error with allocator
Severity Code Description Project File Line Suppression State
Error C2338 You've instantiated std::aligned_storage<Len, Align> with an extended alignment (in other words, Align > alignof(max_align_t)). Before VS 2017 15.8, the member "type" would non-conformingly have an alignment of only alignof(max_align_t). VS 2017 15.8 was fixed to handle this correctly, but the fix inherently changes layout and breaks binary compatibility (only for uses of aligned_storage with extended alignments). Please define either (1) _ENABLE_EXTENDED_ALIGNED_STORAGE to acknowledge that you understand this message and that you actually want a type with an extended alignment, or (2) _DISABLE_EXTENDED_ALIGNED_STORAGE to silence this message and get the old non-conforming behavior. IrrlichtServer C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.24.28314\include\type_traits 1061
Its a bug ?
My Output
Happy code: phmap::flat_hash_map<void *, bool, std::hash<void *>> Foo; Foo[(void *)0x00] = true;
Sad code: phmap::flat_hash_map<void *, bool> Foo; Foo[(void *)0x00] = true;
Compiler log:
phmap.h(102): error C2440: 'static_cast': cannot convert from 'T *' to 'uintptr_t'
phmap.h(101): note: while compiling class template member function 'size_t phmap::Hash<const T *>::operator ()(T *) noexcept const'
phmap.h(4203): note: see reference to class template instantiation 'phmap::Hash<const T *>' being compiled
phmap.h(2418): note: see reference to function template instantiation 'std::pair<phmap::container_internal::raw_hash_set<Policy,Hash,Eq,Alloc>::iterator,bool> phmap::container_internal::raw_hash_map<Policy,Hash,Eq,Alloc>::try_emplace<_Ty,,0,0x0>(void *&&)' being compiled
with
[
Policy=phmap::container_internal::FlatHashMapPolicy<void *,bool>,
Hash=phmap::container_internal::HashEq<void *,void>::Hash,
Eq=phmap::container_internal::HashEq<void *,void>::Eq,
Alloc=std::allocator<std::pair<void *const ,bool>>,
_Ty=void *
]
C++ isn't my strong suit so I haven't quite figured out what this issue is yet.
But, when I try to build the example in MSVS 2017 and 2019 with Windows SDK 10.0.17763.0, using ICC 2019 update 3 (all are debug builds using default settings), I get these errors:
Severity Code Description Project File Line Suppression State
Error #3466 inheriting constructors must be inherited from a direct base class hash d:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap.h 4144
Error (active) E1389 redeclaration cannot add dllexport/dllimport to "ldexpf" (declared at line 706 of "C:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\ucrt\corecrt_math.h") hash C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019\windows\compiler\include\math.h 704
Error (active) E1389 redeclaration cannot add dllexport/dllimport to "powf" (declared at line 743 of "C:\Program Files (x86)\Windows Kits\10\Include\10.0.17763.0\ucrt\corecrt_math.h") hash C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019\windows\compiler\include\math.h 799
Error (active) E2512 the argument to a feature-test macro must be a simple identifier hash D:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap_config.h 557
Error (active) E2512 the argument to a feature-test macro must be a simple identifier hash D:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap_config.h 573
Error (active) E2512 the argument to a feature-test macro must be a simple identifier hash D:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap_config.h 604
Error #3466 inheriting constructors must be inherited from a direct base class hash d:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap.h 2166
Error #3466 inheriting constructors must be inherited from a direct base class hash d:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap.h 3304
Error #3466 inheriting constructors must be inherited from a direct base class hash d:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap.h 4203
Error #3466 inheriting constructors must be inherited from a direct base class hash d:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap.h 4262
Error #3466 inheriting constructors must be inherited from a direct base class hash d:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap.h 4321
Error #3466 inheriting constructors must be inherited from a direct base class hash d:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap.h 4372
Error #3466 inheriting constructors must be inherited from a direct base class hash d:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap.h 4419
Error #3466 inheriting constructors must be inherited from a direct base class hash d:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap.h 4470
Error #3466 inheriting constructors must be inherited from a direct base class hash d:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap.h 4520
Works fine with Clang and MSVC using c++14 standard though.
Under c++17 with Clang I get:
Error (active) E2512 the argument to a feature-test macro must be a simple identifier hash D:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap_config.h 557
Error (active) E2512 the argument to a feature-test macro must be a simple identifier hash D:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap_config.h 573
Error (active) E2512 the argument to a feature-test macro must be a simple identifier hash D:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap_config.h 604
Error cannot convert 'const std::basic_string_view<char, std::char_traits >' to 'size_t' (aka 'unsigned int') without a conversion operator hash C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\include\xhash 31
And with MSVC
Error (active) E2512 the argument to a feature-test macro must be a simple identifier hash D:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap_config.h 557
Error (active) E2512 the argument to a feature-test macro must be a simple identifier hash D:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap_config.h 573
Error (active) E2512 the argument to a feature-test macro must be a simple identifier hash D:\Downloads\parallel-hashmap-master\parallel_hashmap\phmap_config.h 604
Though you might want to know.
Cheers
I run the examples/emplace.cc and i found the phmap::flat_hash_map slower than unordered_map, is that something wrong?
here is the output
bench: iterations: 100000 / container_size: 1
std::unordered_map: 178 ms
std::map: 120 ms
std::vectorstd::pair: 118 ms
phmap::flat_hash_map: 541 ms
bench: iterations: 100000 / container_size: 10
std::unordered_map: 1300 ms
std::map: 1461 ms
std::vectorstd::pair: 802 ms
phmap::flat_hash_map: 3574 ms
bench: iterations: 100000 / container_size: 50
std::unordered_map: 8307 ms
std::map: 9969 ms
std::vectorstd::pair: 5532 ms
phmap::flat_hash_map: 19734 ms
my g++ version
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
Trying this with MSVC with C++20 enabled and I get the following error by trying to use flat_hash_map:
error C2039: 'result_of': is not a member of 'std'
Looking it up it seems std::result_of was deprecated in 17 and removed in 20
What happens if I want to use a shared_mutex
instead of a regular mutex? I am thinking to take a shared lock on reads, and the exclusive lock on insertions. Use-case:
https://github.com/vinniefalco/BeastLounge/blob/8647895a3d691f7e9b7b3f5fbcfee6ba3f987c19/server/channel_list.cpp#L52
I suppose I could build my own container that is specific to my data types (and avoid templates, to get faster builds) and use the non-synchronized hashmap from this library as a building block.
#include "parallel_hashmap/btree.h"
struct Test
{
int a;
int b;
};
class Comparer {
public:
bool operator()(const Test& l, const Test& r) const noexcept
{
return l.a < r.a;
}
};
using HashSet = phmap::btree_set<Test, Comparer>;
int main(int argc, char** argv)
{
Test a { 1, 1 };
Test b { 2, 1 };
Test c { 2, 2 };
HashSet h;
auto it = h.insert(a);
if ( ! it.second )
printf("failed to insert a\n");
auto it2 = h.insert(b);
if ( ! it2.second )
printf("failed to insert b\n");
auto it3 = h.insert(c);
if ( ! it3.second )
printf("failed to insert c\n");
}
I would expect that to insert all three elements, but for some reason c
fails to insert what should be a unique object.
Hello,
There are 2 ways to give an object a hash function.
However, I am working with memory addresses and I wish I could keep them the same. I tried, specializating the template<> Hash for size_t to return the identity (which I saw that it is already being done like in phmap_utils.h), however printing the hash value in the insert or find is totally different. That might be because of the control bits, if I am not mistaken.
Is there a way I could just pass the hash function as the identity function, so that when I try to store a memory address as the key, it will no longer be hashed ?
All the best!
The following code:
template <typename Container, typename T>
inline void update_hash_table__(Container& S, const T& t) {
auto [it, res] = S.insert(t);
if (res == false) {
auto x = S.extract(it);
x.value().merge(t);
S.insert(std::move(x));
}
} // update_hash_table__
with Container=phmap::node_hash_set
fails with the following assertion:
parallel_hashmap/phmap.h:2010: void phmap::container_internal::raw_hash_set<Policy, Hash, Eq, Alloc>::emplace_at(size_t, Args&& ...) [with Args = {const qap_task&}; Policy = phmap::container_internal::NodeHashSetPolicy<qap_task>; Hash = phmap::Hash<qap_task>; Eq = phmap::EqualTo<qap_task>; Alloc = std::allocator<qap_task>; size_t = long unsigned int]: Assertion `PolicyTraits::apply(FindElement{*this}, *iterator_at(i)) == iterator_at(i) && "constructed value does not match the lookup key"' failed.
T
is a custom type which supplies std::hash<T>
and provides custom operator==
. The method merge
does not change how the updated object hashes or tests for equality. Considering this, why the assertion would be triggered?
Edit: it seems that the failing test case triggers assertion on the first insert in the code snippet.
Hi @greg7mdp , thanks for this amazing chain of powerful projects.
I'd like to ask about the behavior of the phmap::btree_map and I have several inquiries.
Sorry if this seemed to be unprofessional documenting of the issue but I could not find a better way to describe, so I recorded a short video.
Or view it from the following URL.
As you will see, there's inconsistency in the output of compiling and running the same code.
The code used:
#include <iostream>
#include <map>
#include <cstdint>
#include <parallel_hashmap/btree.h>
int main(){
std::cout << "phmap btree Inserting uint64_t ..." << std::endl;
phmap::btree_map<uint64_t, uint64_t> btree_1;
for(uint64_t i; i < 10000; i++) btree_1[i] = i;
std::cout << "btree_1 size: " << btree_1.size() << std::endl << std::endl;
std::cout << "phmap btree Inserting int ..." << std::endl;
phmap::btree_map<int, int> btree_2;
for(int j; j < 10000; j++) btree_2[j] = j;
std::cout << "btree_2 size: " << btree_2.size() << std::endl << std::endl;
}
cmake_minimum_required(VERSION 3.4)
project(btree_map_issue)
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++14 -fPIC -fopenmp -W -Wall -pedantic -O3 -oFast")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -Wall")
file(GLOB SOURCES
"main.cpp"
)
include_directories(parallel-hashmap)
add_executable(run_issue ${SOURCES})
Thank you.
Hello,
I couldn't find any info about the theoretical time complexity.
I'd like to check with an author that this package has an O(1) time complexity for accessing an element / inserting an element.
This is what most people assume when you use the word 'hashmap', but maybe it's not implemented this way, but as long as it remains approx. O(1) in small n, it's "O(1) enough" for me.
Thank you.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.