1) To get a very low collision rate, can I get away with using just 64 of the 160 bits of a SHA-1 digest rather than dealing with the full digest? I understand a truncated hash is not suitable for cryptographic purposes, but I just need the hashes as hash table keys.
2) Computation time isn't a priority, and in any case I'm hashing very small pieces of data. In particular, I'll mostly be taking two or three 64-bit hashes and hashing them together to get another 64-bit hash. Is there a better option than SHA-1 for this purpose? Again, collisions should be very unlikely.
3) I'm new to SQL. Is it a good idea to use 64-bit hashes as IDs in SQL? Will 64-bit IDs cause performance problems in SQLite or PostgreSQL? I need to coordinate data across multiple data stores (including a Lucene index), so I figured I should use the hashes directly in the tables rather than auto-incremented IDs, which would only be meaningful within one database. I figure 64 bits is a good compromise: big enough to make collisions unlikely, but small enough to save space (and lookup time?).
4) What about CRC-64? Are its outputs distributed randomly enough to keep collisions unlikely?
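
To make questions 1 to 3 concrete, here is a minimal Python sketch of the scheme being asked about. It assumes the first 8 bytes of the SHA-1 digest are the ones kept and that the combined hashes are packed big-endian before rehashing. The function names (`hash64`, `combine64`, `to_signed64`) are made up for illustration. The last helper is there because SQLite's INTEGER and PostgreSQL's BIGINT are signed 64-bit types.

```python
import hashlib
import struct


def hash64(data: bytes) -> int:
    """Return the first 8 bytes of the SHA-1 digest as an unsigned 64-bit int."""
    digest = hashlib.sha1(data).digest()  # 20 bytes (160 bits)
    return int.from_bytes(digest[:8], "big")


def combine64(*hashes: int) -> int:
    """Hash two or three (or any number of) 64-bit values into one 64-bit value.

    The inputs are packed as fixed-width big-endian unsigned 64-bit integers,
    so the result depends on their order.
    """
    packed = struct.pack(f">{len(hashes)}Q", *hashes)
    return hash64(packed)


def to_signed64(h: int) -> int:
    """Map an unsigned 64-bit value onto the signed 64-bit range.

    Illustrative helper for storing the hash in a signed 64-bit SQL column.
    """
    return h - (1 << 64) if h >= (1 << 63) else h


if __name__ == "__main__":
    a = hash64(b"first item")
    b = hash64(b"second item")
    key = combine64(a, b)
    print(f"{key:016x}", to_signed64(key))
```

Packing the inputs at a fixed width avoids ambiguity about where one hash ends and the next begins, which naive string concatenation of variable-length values would not guarantee.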