Saturday, March 31, 2007

Talk on lock-free hash tables

Cliff Click gave an interesting Google engEdu talk on an implementation of "A Lock-Free Hash Table" for multiprocessor systems with large numbers (>32) of CPUs.

Cliff also has a couple weblog posts ([1] [2]) that cover some of the material from his talk. A brief excerpt:
A Non-Blocking (concurrent, lock-free) HashTable ... [where the] single-threaded performance appears to equal java.util.HashTable, and at high CPU count (750 cpus on a high-end Azul machine) it's about twice as fast as Doug Lea's marvelous java.util.concurrent.ConcurrentHashMap - with 4096-way striping.

I've chosen a power-of-2 closed table - collisions cause re-probing into the table ... I do not need any memory fencing or any locking [even on a resize]. I do need CAS [(Compare-and-swap)] when changing the table.
Early in the talk, I was wondering how Cliff handles deletes. As it turns out, he does not. Old deleted keys are only cleaned up on a resize, which works and nicely dodges delete nastiness.

I seem to be bumping into a lot of interesting work on large scale multiprocessor systems lately. For example, "Evaluating MapReduce for Multi-core and Multiprocessor Systems" (PDF) shows some promising results for using MapReduce to simplify some parallel programming on a single box with 8+ CPUs (rather than on a cluster of boxes), at least for tasks that easily can be translated to a MapReduce problem.

There is also a Google engEdu talk on that MapReduce paper, but the video is hard to hear in parts. However, there were some interesting tidbits in the Q&A, including people questioning the small data sizes used for the tests (almost all under 1G) and whether the extraordinary speedup (~x50) reported on the ReverseIndex task was an issue with the coding of the serial version of the algorithm rather than a true speedup.

Broadening out to issues with programming multiprocessor systems in general, another Google engEdu talk, "Advanced Topics in Programming Languages: The Java Memory Model", provides a nice overview of some tricky issues that may trip up even the best of us when trying to do multithreaded programming.

Finally, that Java memory model talk serves as good motivation for another talk, "Software Transactional Memory", which argues for considering the software transactional memory model -- allowing a series of reads/writes to be marked as one logical action and then using abort and retries to handle most contention -- over error-prone traditional locking. If you have never heard about software transactional memory before (I had not), skimming the Wikipedia page or watching the video might be worthwhile.

1 comment:

Ben Meyer said...

Adding to the pile on Qt Labs they recently added QtConcurrent which is another nice framework for doing map reduce code on multi cpu boxes.