Wednesday, February 2, 2011

Consistent hashing and performance

Most of today's in-memory data grids rely on Consistent hashing for achieving scalability.
From a performance perspective it is generally preferable to have CH functions that spread the data evenly between the nodes in the grid.
Uneven data distribution causes increased stress on the more heavily loaded nodes. That slows down the cluster and also increases the risk of that node crashing (e.g. OOM).
In order to measure the performance fault caused by uneven distribution I enhanced Radargun with with an "ideal" consistent hash that guarantees an equal number of keys per node. Then I benchmarked Infinispan using Radargun. The benchmark was run twice, once using Infinispan's built in CH and then using the "ideal" CH.
Following graph shows the result of these runs (cluster size on X and throughput on Y):

The actual number of keys per node is:
Configuration : dist-sync.xml
Cluster size: 3 -> ( 2175 494 331)
Cluster size: 5 -> ( 500 660 1855 585 1400)
Cluster size: 7 -> ( 1775 487 37 239 1396 457 2609)
Cluster size: 9 -> ( 1548 3 545 2854 201 2025 7 1609 208)

Configuration : idealdistribution/dist-sync-ideal-distribution.xml
Cluster size: 3 -> ( 1000 1000 1000)
Cluster size: 5 -> ( 1000 1000 1000 1000 1000)
Cluster size: 7 -> ( 1000 1000 1000 1000 1000 1000 1000)
Cluster size: 9 -> ( 1000 1000 1000 1000 1000 1000 1000 1000 1000)


Observations:
  1. For a 3-nodes cluster uneven distribution over performed. This is explained by the fact that one of the node holds most of the data, so it needed to do very few RPC calls to other cluster members, drastically increasing the average performance.
  2. For clusters made out of 5, 7 and 9 nodes the "ideal" distribution is 5-20% better
  3. The discrepancy tends to increase together with the cluster size.
Would be interesting to see the difference on a 30+ nodes machine!

More...
- vendor-specific approaches to data partitioning: Infinispan, Coherence, Gigaspaces.
- project Radargun