Generally speaking, hash functions are never perfect, so you must be prepared to handle the situation where two different substances end up with identical MD5 values. With MD5 this is an extremely rare situation, but still, for obvious reasons, MD5 can enumerate "only" 2^128 substances.
If you use InChI strings as indexes you don't have that problem, as they have been designed with uniqueness in mind.
But InChI is not a perfect solution either if you don't have full information about the molecule available.
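
If you do go with MD5 keys, being prepared for a collision can be as simple as keeping the full identifier next to each record and comparing it on lookup. The following is only a minimal sketch of that idea: the class `SubstanceIndex` and its `add`/`get` methods are hypothetical names made up for illustration, and the identifier is assumed to be whatever string you hash (for example an InChI string).

```python
import hashlib
from collections import defaultdict


class SubstanceIndex:
    """Hypothetical index keyed by MD5 digest that tolerates collisions."""

    def __init__(self):
        # digest -> list of (identifier, record) pairs sharing that digest
        self._buckets = defaultdict(list)

    @staticmethod
    def _digest(identifier):
        # MD5 of the UTF-8 encoded identifier, as a 32-character hex string
        return hashlib.md5(identifier.encode("utf-8")).hexdigest()

    def add(self, identifier, record):
        bucket = self._buckets[self._digest(identifier)]
        for i, (existing, _) in enumerate(bucket):
            if existing == identifier:
                # Same substance seen again: overwrite its record
                bucket[i] = (identifier, record)
                return
        # New substance (possibly colliding with another one): append
        bucket.append((identifier, record))

    def get(self, identifier):
        # Compare the full identifier, not just the digest
        for existing, record in self._buckets.get(self._digest(identifier), []):
            if existing == identifier:
                return record
        return None


index = SubstanceIndex()
index.add("InChI=1S/CH4/h1H4", "methane")
index.add("InChI=1S/H2O/h1H2", "water")
print(index.get("InChI=1S/CH4/h1H4"))
```

In practice nearly every bucket will hold a single entry, so the extra comparison costs almost nothing, and two different substances that happen to share a digest are still kept apart.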