UUID Collisions

How does Swarm deal with the very unlikely scenario of UUID collisions?

There are two types of collisions to consider:


A) Two nodes generate exactly the same truly random seed at startup 

B) Two nodes have their PRNG streams "cross" each other so that an isolated UUID collision ensues


Condition A is catastrophic if it occurs, because the PRNGs (pseudorandom number generators) of both nodes will be perfectly synchronized, so both nodes can generate the exact same sequence of UUIDs. Swarm detects this condition as nodes join the cluster, and both colliding nodes immediately shut down in order to prevent problems. A critical error is logged in this circumstance so that an administrator knows why the nodes took themselves offline.
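To illustrate why a shared seed is so damaging, here is a minimal sketch using Python's standard `random` module. It is not Swarm's actual generator or seeding scheme; the function name `uuid_stream` and the seed values are hypothetical and chosen only for illustration.

```python
import random
import uuid

def uuid_stream(seed, count):
    """Return `count` 128-bit identifiers from a PRNG seeded with `seed`.

    Assumption: random.Random stands in for whatever PRNG a node uses.
    """
    rng = random.Random(seed)
    return [uuid.UUID(int=rng.getrandbits(128)) for _ in range(count)]

# Two hypothetical nodes that happened to start from the same seed.
node_a = uuid_stream(seed=12345, count=5)
node_b = uuid_stream(seed=12345, count=5)

# A third hypothetical node with a different seed.
node_c = uuid_stream(seed=67890, count=5)

# Identical seeds give identical sequences, element for element.
same_seed_identical = node_a == node_b

# Different seeds give sequences with no UUIDs in common here.
different_seed_overlap = set(node_a) & set(node_c)

print(same_seed_identical, len(different_seed_overlap))
```

Every UUID produced by the two same-seed nodes matches, which is why shutting down both nodes is the safe response to condition A.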


Condition B, where two nodes just happen to generate the exact same UUID for two different streams, cannot be detected by Swarm. If it occurs, the two streams are likely to coexist for some period of time, during which a GET will randomly choose one or the other of the streams. At some point, the replication and load balancing processes are likely to cause one of the streams to be lost as they copy and trim these streams. Other streams will be unaffected, and Swarm will continue to function correctly in all other respects.


Condition B is so statistically improbable that Swarm assumes UUIDs are indeed universally unique. We also assume the ALU (arithmetic logic unit) will always perform XORs correctly, that the FPU (floating point unit) will always multiply two floating point numbers correctly, and that RAM will detect bit errors and won't return bad data. Statistically, it is more likely that a hardware design error (e.g., the Pentium FPU flaw) or a manufacturing glitch would cause one of the latter errors than that two truly random 128-bit numbers would accidentally collide.

The probability that Swarm will attempt to assign two different streams the same UUID is essentially zero. It is about as likely as the pieces of a broken wine glass spontaneously reassembling themselves when placed in a paper bag and shaken. You would have to store something like 1.8 x 10**19 (that is, 2**64, the square root of the 2**128 possible values), or about 20 quintillion streams, in your cluster before a collision like this becomes a likely event. If you could generate and use 50,000 UUIDs per second on a single node, and you had a 1 million node cluster, it would take about 215 quadrillion millennia before you used up all available UUIDs.
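The figures above can be checked with a short calculation. This sketch assumes all 128 bits of a UUID are uniformly random and uses the standard birthday approximation p ≈ 1 - exp(-n² / 2N); the variable and function names are illustrative only.

```python
import math

UUID_BITS = 128
TOTAL_UUIDS = 2 ** UUID_BITS  # assumption: every bit is random

# Stream count at which collisions stop being negligible: sqrt(2**128) = 2**64.
birthday_scale = math.isqrt(TOTAL_UUIDS)

def collision_probability(n):
    """Approximate chance of at least one collision among n random UUIDs."""
    return -math.expm1(-(n * n) / (2 * TOTAL_UUIDS))

# Exhaustion time: 50,000 UUIDs per second per node on 1,000,000 nodes.
uuids_per_second = 50_000 * 1_000_000
seconds_per_millennium = 1000 * 365.25 * 24 * 3600
millennia_to_exhaust = TOTAL_UUIDS / uuids_per_second / seconds_per_millennium

print(f"birthday scale:        {birthday_scale:.3e} streams")
print(f"p(collision) at 2**64: {collision_probability(birthday_scale):.3f}")
print(f"p(collision) at 1e12:  {collision_probability(10**12):.3e}")
print(f"millennia to exhaust:  {millennia_to_exhaust:.3e}")
```

Under these assumptions, even a trillion stored streams leaves the collision probability many orders of magnitude below one in a trillion.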

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.