C++Now 2018: You Can Do Better than std::unordered_map: New Improvements to Hash Table Performance

Ғылым және технология

cppnow.org
-
Presentation Slides, PDFs, Source Code and other presenter materials are available at: cppnow.org/history/2018/talks/
-
The hash table is probably the most important data structure. Because of that importance, there is a large zoo of possible implementations with design trade-offs. The standard hash table, std::unordered_map, traded off performance in order to get backwards compatibility with std::map. Which was probably a good choice, but it does leave us with a lot of hash tables that are slower than necessary, while also using more memory than necessary.
This talk is about replacements for std::unordered_map. How they work, why they are faster, and when you should choose which. Topics include linear probing with Robin Hood Hashing, Google's new trick of using SIMD instructions to look at 16 elements at a time, and optimizations for node based containers, because they can actually be really fast.
I will also talk about recent improvements to hash table performance. Little tricks that shave nanoseconds from table lookup times. In an environment that's already had decades worth of micro-optimizations, it's fascinating to watch as people come up with inventive new ways to keep pushing performance.
-
Malte Skarupke
Malte is an AI programmer at Avalanche Studios in New York. In his free time he likes to optimize algorithms. He blogs at www.probablydance.com
-
Videos Filmed & Edited by Bash Films: www.BashFilms.com
---
*--*
---

Пікірлер: 49

  • @OmarChida
    @OmarChida3 жыл бұрын

    One of the best talks I have ever saw. I quite like his scientific approach.

  • @TheOnlyFeanor
    @TheOnlyFeanor5 жыл бұрын

    This was an excellent talk, thank you.

  • @willw2596
    @willw25963 ай бұрын

    I'm 15 minutes in and find this to be one of the best technical talks I've ever seen.

  • @BoostCon

    @BoostCon

    3 ай бұрын

    Thank you so much for your appreciative comment!

  • @amitm1157
    @amitm11575 жыл бұрын

    Nice explanation about cache hits and misses while using hash tables.

  • @TheLavaBlock
    @TheLavaBlock5 жыл бұрын

    Great talk! Thanks

  • @conradomartinezp
    @conradomartinezp2 жыл бұрын

    As long as you have a way to solve ties in Robin Hood which only depends on the identity of the keys, e.g., the smaller key stays at its slot, the larger continues probing, the layout of the hash table after N inserts will be exactly the same, independent of the order of insertion. If ties are solved by the time of insertion then a cluster of synonims (equal hash value) will be ordered by time of insertion. Robin Hood does not reduce the average time for successful search but it makes a bit reduction of the variance (even in almost full tables) and it also reduces the expected longest probe sequence from O(log n) to O(log log n). It's nice to see that it also does well in practice 😉

  • @origamibulldoser1618
    @origamibulldoser16186 ай бұрын

    This is mad fascinating.

  • @wiadroman
    @wiadroman5 жыл бұрын

    Great material, thanks a lot.

  • @user-ze6fs5ui6h
    @user-ze6fs5ui6h2 ай бұрын

    Thank you! Best talk I have seen on this topic😊

  • @BoostCon

    @BoostCon

    2 ай бұрын

    Thank you! So pleased to hear that you found the presentation informative.

  • @dennisrkb
    @dennisrkb4 жыл бұрын

    you explained google's map much better than they did themselves

  • @skepticmoderate5790

    @skepticmoderate5790

    4 жыл бұрын

    They explained it pretty well...

  • @KilgoreTroutAsf
    @KilgoreTroutAsf5 жыл бұрын

    300 nanoseconds amortized sounds atrocious for a cache miss on modern hardware. could it actually be 300 cpu cycles instead? This would fit the typical 50-80 ns latency for ddr4 and 3~4 Ghz cpu frequency either that or each miss results in O*(4) accesses to ram, with the cost of cpu instructions being too small ( ~0.25 ns per cycle) to matter

  • @lenkite
    @lenkite4 жыл бұрын

    Wonderful and enlightening talk especially the zooms at various N levels. So std::map is great at < 400 elements!. I only wish the size of the element was also shown.

  • @leleo53000

    @leleo53000

    4 жыл бұрын

    it's using `std::map`, the size of the element is probably 4

  • @excitedbox5705

    @excitedbox5705

    3 жыл бұрын

    @@leleo53000 I think he said 8 later on but that might not apply to that.

  • @ratchet1freak
    @ratchet1freak6 жыл бұрын

    I'm pretty sure that you can significantly improve the robinhood adaptation of the google's map implementation is different, If you treat the 16 element bucket as a bucket that each are targets of the hash, instead of the individual elements of the buckets. Then the robin hood variant can use just 2 bits for the distance meta data (0-2 bucket distance + 3 for empty) and keep 6 bits for the hash pre-filter.

  • @Voy2378
    @Voy23786 жыл бұрын

    Benchmarking only on int keys only is deceptive. If you have std::string or any other type with expensive operator == ridiculously high load factor will kill you.

  • @KilgoreTroutAsf

    @KilgoreTroutAsf

    5 жыл бұрын

    A good implementation for large-sized or variable-length keys should never store keys and hashes together, because it completely negates cache locality and memory alignment trickery. The better solution is to use two-level indirection and use the hash table to store (hash,pointer) or (hash,offset) pairs, where the offsets point to a flat array or other memory pool for keys, and are just linearly incremented after each addition. Also, the whole point of hash tables is to use a good universal hash function which digests the object into a fixed-length string of enough bits to guarantee very low collision probability. The final test for true equality should only happen more than once very rarely.

  • @excitedbox5705

    @excitedbox5705

    3 жыл бұрын

    insert your strings into an array and use the keys as your INT. Then when you retrieve the key you can use stringArray[key] which only adds minor overhead.

  • @andrewgrant788

    @andrewgrant788

    2 жыл бұрын

    Agreed. The comparison between std::map() and std::unordered_map() performance for small values of N are very misleading when benchmarked with int keys. Because a std::map is a tree, std::map::find() does a lot of < comparisons of keys which is very cheap for ints and much more expensive for strings or more complex keys. unorded_map first hashes the key to determine the bucket and then compares the keys in the same hash bucket. Keys can be pre-hashed to further optimized performance.

  • @FindecanorNotGmail
    @FindecanorNotGmail Жыл бұрын

    In the final algorithm, to evict an element inside a linked list you'd have to unlink it from the list by adjusting the link to it to point to the element that the evictee was pointing to, and then replace it last in the list. (Or am I missing something?) But that means that to be able to evict any element, then the sum of the jump distance _to_ it and _from_ it would also need to be in jump_distances[]. If not, you _could_ grow the table, but would you be guaranteed that it would fit _then_ ?

  • @jacksonallan954

    @jacksonallan954

    10 ай бұрын

    I haven't checked Malte's code to see how he does it, but I did implement a similar hash table. You are imagining that the jump-distance value is the jump distance from the *current* bucket to the bucket containing the next element in the chain. However, I think the jump distance values have to be relative to the chain's *home* bucket, i.e. the bucket to which all the elements in the chain belong and that contains the first element in the chain. That way, the problem you identified disappears. We can always re-link elements existing in a chain.

  • @77oxf
    @77oxf3 жыл бұрын

    Why is std::map faster for fewer elements?

  • @TheJGAdams

    @TheJGAdams

    29 күн бұрын

    They're O(log(n))

  • @roman9509
    @roman95095 жыл бұрын

    cool

  • @Sychonut
    @Sychonut Жыл бұрын

    The content was great and much appreciated, but damn it felt at several points throughout that it's a nerds fest with people trying to show off to one another how smart they are. It's just not a good look.

  • @ABaumstumpf
    @ABaumstumpf4 жыл бұрын

    If only the real world was that simple - a hasmap for int-int, gosh how it would like that. instead we have things like.... normally we have maps of maps of vectors and such things.........

  • @blarghblargh

    @blarghblargh

    4 жыл бұрын

    his benchmarks are open source. go prove how his map works worse for those, and show us you actually know what you're talking about

  • @muonline2067

    @muonline2067

    2 жыл бұрын

    @@blarghblargh my question is: where can I download his unordered_map for testing man?

  • @joestevenson5568
    @joestevenson5568Ай бұрын

    Great talk, terrible audience

  • @zyxyuv1650
    @zyxyuv16505 жыл бұрын

    Obvious German

  • @seditt5146

    @seditt5146

    5 жыл бұрын

    Really? Because I thought he sounded as though he had a slight Irish accent not German.

  • @pruibiebehastoet1914

    @pruibiebehastoet1914

    2 жыл бұрын

    Because of the gründlichkeit ?

Келесі