What I would like to do is scan a disc or drive (USB, main HDD, etc.) for files and store their info in a DB. Then I could search the DB for a particular file to find where it is stored. Alternatively, I could check how old copies are for archiving reasons, see whether I already have dupes of something and don't need to re-archive it, or look for a dupe in case I purposely backed something up several times and one of my discs got scratched or a drive was corrupted.

Here is what I am thinking:

- OS + FS flag (1 byte?)
- st_mode (even if not on Linux), 2 bytes
- win32_attr (even if not on Windows), 4 bytes (this covers hidden, dir vs. file, locked, etc.)
- file size (64 bits)
- a/m/c time, 64 bits
- index/unique key as fileID
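As a rough illustration of where the fields above could come from, here is a minimal sketch in Python (an assumption; the question does not name a language). The names `file_record` and `scan` are hypothetical helpers, and the OS + FS flag is left out because detecting it is platform-specific.

```python
import os
import stat

def file_record(path):
    """Collect the per-file metadata fields for one path (sketch only)."""
    st = os.lstat(path)  # lstat: describe a symlink itself, do not follow it
    return {
        "name": os.path.basename(path),
        "st_mode": st.st_mode,
        # st_file_attributes is only present on Windows; store 0 elsewhere
        "win32_attr": getattr(st, "st_file_attributes", 0),
        "size": st.st_size,
        # timestamps truncated to whole seconds so they fit an integer column
        "atime": int(st.st_atime),
        "mtime": int(st.st_mtime),
        "ctime": int(st.st_ctime),
        "is_dir": stat.S_ISDIR(st.st_mode),
    }

def scan(root):
    """Yield one record per directory and file found under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            yield file_record(os.path.join(dirpath, name))
```

Each yielded dict would map to one row of the file table, with fileID assigned by the database.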

Should I store the name as a variable-length field in its own table, looked up by its matching fileID? Or should I have a fixed 260-character filename column in the DB, or a variable-length filename column?

Then I have blobs of however many bits each checksum requires (MD5, SHA-1, SHA-512, etc., one blob for each) in a checksum/hash table looked up by fileID.
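For illustration, the blobs could be produced in a single pass over each file. This is a minimal sketch assuming Python's standard `hashlib`; the function name `file_digests`, the default algorithm list, and the chunk size are hypothetical choices, not part of the design above.

```python
import hashlib

def file_digests(path, algorithms=("md5", "sha1", "sha512"), chunk_size=1 << 20):
    """Return {algorithm name: raw digest bytes}, reading the file only once."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # read in fixed-size chunks so large files never sit fully in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.digest() for name, h in hashers.items()}
```

The raw digests are 16, 20, and 64 bytes for MD5, SHA-1, and SHA-512 respectively, so storing them as blobs takes half the space of their hex strings.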

I was thinking my hash table should have fileID (int, the same length as the index?), hashType (int), and hashValue (varchar).

A: 

Put the filename as a varchar in the file table, at least varchar[ 1024 ]. Windows has a limit on total path length in some OS combos, as do ISO CD/DVDs.

Put the hashes in an association table like:

Hash
{
 fileId int,
 hash_type int,         -- or enum
 hash varchar[ 255 ],   -- or sized for the largest hash type
 PK ( fileId, hash_type ),
 index( fileId )
}

This lets you add new hash types later, and it doesn't require you to support every hash type for every file.
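To make the association table concrete, here is a minimal sketch using Python's built-in `sqlite3` (an assumption; the thread does not name a database). The `file` table columns, the `MD5`/`SHA1` type codes, the extra `hash_lookup` index, and the `find_duplicates` helper are all hypothetical, and the hash is stored as a BLOB rather than a varchar.

```python
import sqlite3

SCHEMA = """
CREATE TABLE file (
    fileId INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    size   INTEGER NOT NULL
);
CREATE TABLE hash (
    fileId    INTEGER NOT NULL REFERENCES file(fileId),
    hash_type INTEGER NOT NULL,
    hash      BLOB NOT NULL,
    PRIMARY KEY (fileId, hash_type)   -- at most one hash of each type per file
);
-- assumption: an index for searching by hash value, e.g. when hunting dupes
CREATE INDEX hash_lookup ON hash (hash_type, hash);
"""

MD5, SHA1 = 1, 2  # hypothetical hash_type codes

def find_duplicates(conn, hash_type):
    """Return {hash: [file names]} for hashes shared by more than one file."""
    rows = conn.execute(
        """
        SELECT h.hash, f.name
        FROM hash AS h
        JOIN file AS f ON f.fileId = h.fileId
        WHERE h.hash_type = ?
        ORDER BY f.name
        """,
        (hash_type,),
    )
    groups = {}
    for digest, name in rows:
        groups.setdefault(digest, []).append(name)
    return {d: names for d, names in groups.items() if len(names) > 1}
```

Supporting a new hash type then means inserting rows with a new `hash_type` code, with no change to the schema, and a file with only some hashes computed simply has fewer rows.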

sfossen
What is PK and what is index? Why do I need them? I am adding the hash table I had in mind to my question so you can better explain what is wrong with it.
acidzombie24
PK === primary key on fileId and hash_type; index is another index on fileId.
sfossen
You'd link to the file table by the fileId.
sfossen
Are you saying I link to the hash index in my file table?
acidzombie24
fileId is a foreign key in Hash and part of the composite primary key in Hash. File's primary key is fileId.
sfossen