A UUID is a 128-bit number (i.e., 16 bytes) that does not need to be assigned from a "central" location (and thus can be generated without an internet connection), and that has a very high probability of being unique. UUIDs let the devices on a network create packets of data and assign IDs to them independently, so that when the packets are brought together, very few, if any, will share an ID.
The current standard (RFC 9562) defines seven versions of UUIDs, plus a Version 8 that is reserved for custom, vendor-specific layouts -- and anyone who needs an ID for their own purposes doesn't even have to adhere to these standards. Twitter, for example, created its own "Snowflake ID" format, a 64-bit ID used to keep track of messages, which has since been adopted (and sometimes modified) by various other communication platforms.
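To make the Snowflake idea concrete, here is a minimal sketch of how such a 64-bit ID can be packed and unpacked. It assumes the commonly described Twitter layout (41-bit millisecond timestamp, 10-bit machine ID, 12-bit sequence number) and Twitter's custom epoch; the function names are my own, and other platforms split the bits differently.

```python
# Assumed layout (the commonly described Twitter one; other platforms differ):
#   41 bits: milliseconds since a custom epoch
#   10 bits: machine (worker) ID
#   12 bits: per-millisecond sequence number
TWITTER_EPOCH_MS = 1288834974657  # assumed custom epoch (November 2010)
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12


def make_snowflake(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Pack the three fields into a single integer that fits in 64 bits."""
    elapsed = timestamp_ms - TWITTER_EPOCH_MS
    if not 0 <= elapsed < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp out of range")
    if not 0 <= machine_id < (1 << MACHINE_BITS):
        raise ValueError("machine_id out of range")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence out of range")
    return (
        (elapsed << (MACHINE_BITS + SEQUENCE_BITS))
        | (machine_id << SEQUENCE_BITS)
        | sequence
    )


def split_snowflake(snowflake: int) -> tuple:
    """Recover (timestamp_ms, machine_id, sequence) from a packed ID."""
    sequence = snowflake & ((1 << SEQUENCE_BITS) - 1)
    machine_id = (snowflake >> SEQUENCE_BITS) & ((1 << MACHINE_BITS) - 1)
    elapsed = snowflake >> (MACHINE_BITS + SEQUENCE_BITS)
    return elapsed + TWITTER_EPOCH_MS, machine_id, sequence
```

Because the timestamp sits in the high bits, IDs built this way sort roughly by creation time, which is handy for message feeds but also means anyone holding an ID can read off when it was made.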
Most of the different versions are various mashups of device IDs, organization IDs, incremental counters (in case you create several items in the same millisecond), hashes, timestamps, and even random numbers, with a few bits reserved for designating which version is being used. The version that most amuses me (I refuse to say I have a "favorite" version, both because I haven't used them much, and because I can see reasons for using any of several of the versions) is Version 4: it is a completely random number, with the exception of the six bits (four for the version, two for the variant) that are reserved to say "I'm just a completely random number!"
Version 4 amuses me in no small part because the chance of two separate devices generating the same 122-bit random value is practically nonexistent -- but collisions have nonetheless been reported in the "wild", both because of bad random number generation and, reportedly, by sheer chance.
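For a feel of just how "practically nonexistent" that chance is, here is a small sketch using Python's standard uuid module. It generates a Version 4 UUID, checks the fixed bits, and estimates the collision odds with the usual birthday-bound approximation. The helper name collision_probability is my own, and the estimate assumes a perfect random number generator (four version bits and two variant bits are fixed, which leaves 122 random bits).

```python
import math
import uuid

# Generate a random (Version 4) UUID and look at its fixed fields.
u = uuid.uuid4()
assert u.version == 4                # the four version bits
assert u.variant == uuid.RFC_4122    # the two variant bits

# 128 bits total, minus 4 version bits and 2 variant bits.
RANDOM_BITS = 128 - 4 - 2


def collision_probability(n_ids: int, bits: int = RANDOM_BITS) -> float:
    """Birthday-bound approximation of the chance that at least two of
    n_ids uniformly random `bits`-bit values coincide:
        p ~= 1 - exp(-n(n-1) / 2^(bits+1))
    """
    exponent = n_ids * (n_ids - 1) / 2 ** (bits + 1)
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for n in (10**6, 10**9, 10**12):
        print(f"{n:>16,} IDs -> p = {collision_probability(n):.3e}")
```

The numbers stay vanishingly small even for billions of IDs, which is why a real-world collision almost always points at a weak or badly seeded random number generator rather than at bad luck.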
So, why would anyone want to use Version 4, when other versions pretty much guarantee you'll always get a different ID? (So long, at least, as you ensure that device IDs and the other parts are unique -- apparently even the other versions have had their own collisions.) One major reason is that all the other versions are predictable -- and what's worse, they leak useful information, like timestamps, device IDs, and even the sequence in which particular packets of data were created.
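To see how much a non-random UUID gives away, here is a minimal sketch that pulls the creation time and node ID back out of a Version 1 UUID using Python's standard uuid module. The helper name inspect_v1 is my own; note also that Python's uuid.uuid1() may substitute a random node value when it cannot determine a hardware address, so the "device ID" is not always a real MAC address.

```python
import datetime
import uuid

# Number of 100-nanosecond ticks between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def inspect_v1(u: uuid.UUID) -> dict:
    """Extract the information embedded in a Version 1 UUID."""
    if u.version != 1:
        raise ValueError("not a Version 1 UUID")
    unix_seconds = (u.time - UUID_EPOCH_OFFSET) / 1e7
    created = datetime.datetime.fromtimestamp(
        unix_seconds, tz=datetime.timezone.utc
    )
    return {
        "created": created,            # when the ID was generated
        "node": f"{u.node:012x}",      # usually the generating device's MAC
        "clock_seq": u.clock_seq,      # counter guarding against clock resets
    }


if __name__ == "__main__":
    print(inspect_v1(uuid.uuid1()))
```

Anyone who sees such an ID can run the same extraction, which is exactly the kind of leak a random Version 4 UUID avoids.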
If an ID is created at random, none of these issues arise. Granted, random IDs aren't "useful" in the way that other IDs are -- but, provided the random numbers come from a cryptographically secure generator, they can't be guessed, or forged, or created out of whole cloth, either.
Such an ID can serve as a "nonce" -- a term for a value that's intended to be used only once -- whenever a "throwaway" ID is needed. A nonce can be used to protect passwords, uniquely identify packets, maintain session connections, and seed cryptographic algorithms to provide even more randomness to their results.
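As a rough illustration of the "used only once" idea, here is a minimal sketch built on Python's standard secrets module. The NonceRegistry class is hypothetical, something I made up for this example rather than any library's API, and it keeps everything in memory; a real service would also expire old nonces and store them somewhere durable.

```python
import secrets


class NonceRegistry:
    """Hypothetical helper: issues random nonces and accepts each one once."""

    def __init__(self) -> None:
        self._outstanding = set()

    def issue(self, n_bytes: int = 16) -> str:
        """Create a fresh nonce (16 random bytes by default, hex-encoded)."""
        nonce = secrets.token_hex(n_bytes)
        self._outstanding.add(nonce)
        return nonce

    def redeem(self, nonce: str) -> bool:
        """Return True the first time a valid nonce is presented, else False."""
        if nonce in self._outstanding:
            self._outstanding.remove(nonce)
            return True
        return False


if __name__ == "__main__":
    registry = NonceRegistry()
    token = registry.issue()
    print(registry.redeem(token))  # first use is accepted
    print(registry.redeem(token))  # a replay of the same nonce is rejected
```

The secrets module draws from the operating system's cryptographically secure random source, which is the property that makes the nonce hard to guess.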