The Kember Identity May Not Exist
I used to work with Elliot Kember at Eventfinder. He recently proposed a search for the MD5 "Kember identity", a string which, when hashed, returns itself.
Something I've been thinking about: it's entirely possible such an identity doesn't exist.
Imagine a hashing algorithm that takes a two bit input and returns a two bit output. That is, the domain and range is {0, 1, 2, 3}. If the algorithm perfectly distributes its hashes over the range then, for each input value, there's a 0.25 probability of it being an identity. So, the probability of no identity being produced by the hash function over any of its domain is (0.75 ^ 4) = 0.32; about a one in three chance.
MD5 has a range of 2 ^ 128. For finding the identity, the domain must be the same as a range. So, there's a 1 in 2 ^ 128 chance of a random input being the identity. The probability of an input not being an identity is ((2 ^ 128) - 1 / (2 ^ 128)). And the probability of there not being any Kember Identity at all is ((2 ^ 128) - 1 / (2 ^ 128)) ^ (2 ^ 128).
Which, according to some quick mpmath, is:
0.367879441171442321595523770161
46086744527058077566697138371152
93233486842268612093790091730736
201865…
So, there's a decent chance that there is no Kember identity. Good to know if you're considering starting to churn away on the problem.