I'd like to write my own music streaming web application for personal use, but I'm racking my brain over how to manage the library. Existing files and their locations rarely change, but they can (fixing a filename, editing ID3 tags, moving to /The Chemical Brothers instead of /Chemical Brothers). How would the community manage all of these files? I can gather a lot of information with just an ID3 reader and my file system, but it would also be nice to keep track of things like play counts. Would iTunes's .xml file be a good choice, i.e. keeping my music current in iTunes and basing my web application's data on it? I was also thinking of tracking my music by MD5-hashing each file and using the hash as the unique identifier, but if I change the ID3 tags, will that change the MD5 value?

I suppose my real question is: how do you keep track of large amounts of music? Keep the metadata in a database? If so, how would I connect each file to its database entry? Or should I just use a read-when-needed filesystem setup?

A: 

For inspiration, and maybe even for a complete solution, check out Ampache. I don't know what you call large, but Ampache (a PHP application backed by a MySQL database) easily handles music collections of tens of thousands of tracks.

Recently I discovered SubSonic; the web site says "Manage 100,000+ files in your music collection without hassle", but I haven't been able to test it yet. It's written in Java and the source looks pretty neat at first sight, so there may be inspiration to find there too.

fvu
I did check it out, but I was left wondering whether it's able to dynamically add and remove tracks in its library.
Well hmm. http://ampache.org/wiki/install:catalog#maintaining_a_catalog sounds promising. I was hoping I could do something myself. Pet project?
There's a script that performs library maintenance: files that have disappeared (deleted from disk) are removed from the db, and newly added files are picked up and added to the db.
fvu
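For a do-it-yourself version, the maintenance pass described in the comment above (drop files that vanished, pick up new ones) can be sketched in a few lines. This is only an illustration, not how Ampache actually does it: the `tracks` table, its columns, the `AUDIO_EXTENSIONS` set, and both function names are made up for the example, and SQLite stands in for whatever database you'd really use.

```python
import os
import sqlite3

# Assumption: which extensions count as music is up to you.
AUDIO_EXTENSIONS = {".mp3", ".flac", ".ogg", ".m4a"}


def scan_files(root):
    """Return {relative_path: (size, mtime)} for every audio file under root."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in AUDIO_EXTENSIONS:
                full = os.path.join(dirpath, name)
                st = os.stat(full)
                found[os.path.relpath(full, root)] = (st.st_size, int(st.st_mtime))
    return found


def sync_catalog(conn, root):
    """Add new files, drop vanished ones; return (added, removed) path lists."""
    # Hypothetical schema: one row per file, keyed by its path relative to root.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tracks ("
        " id INTEGER PRIMARY KEY,"
        " path TEXT UNIQUE NOT NULL,"
        " size INTEGER, mtime INTEGER,"
        " play_count INTEGER NOT NULL DEFAULT 0)"
    )
    on_disk = scan_files(root)
    in_db = {row[0] for row in conn.execute("SELECT path FROM tracks")}
    added = sorted(set(on_disk) - in_db)
    removed = sorted(in_db - set(on_disk))
    conn.executemany(
        "INSERT INTO tracks (path, size, mtime) VALUES (?, ?, ?)",
        [(p, *on_disk[p]) for p in added],
    )
    conn.executemany("DELETE FROM tracks WHERE path = ?", [(p,) for p in removed])
    conn.commit()
    return added, removed
```

Note the weakness of keying on the path: a renamed file shows up as one removal plus one addition, so its play count is lost. That is the reason a content-based identifier is tempting.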
A: 

I missed part 2 of your question (the MD5 thing). I don't think a hash such as MD5 or SHA will work well, because hashes don't let you find duplicates in your collection (like popular tracks that appear on many different samplers). And especially with big collections, that's something you will want to do someday.
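To answer the literal question as well: yes, ID3 tags are stored inside the file, so editing them changes the whole-file MD5. A rough sketch of a workaround is to hash only the bytes between the tags. Everything below is an assumption for illustration: the function names are invented, it only understands ID3v2 at the start and ID3v1 at the end of an MP3 (not APE or Lyrics3 tags), and it reads the whole file into memory.

```python
import hashlib


def audio_bounds(data):
    """Return (start, end) offsets of the bytes between ID3v2 and ID3v1 tags."""
    start, end = 0, len(data)
    # ID3v2: 10-byte header = "ID3" + version (2) + flags (1) + syncsafe size (4).
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)  # 7 useful bits per byte
        start = 10 + size
        if data[5] & 0x10:  # footer flag (ID3v2.4) adds another 10 bytes
            start += 10
    # ID3v1: fixed 128-byte trailer that begins with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return min(start, end), end


def file_md5(data):
    """MD5 of the complete file contents, tags included."""
    return hashlib.md5(data).hexdigest()


def audio_md5(data):
    """MD5 of the file contents with leading/trailing ID3 tags skipped."""
    start, end = audio_bounds(data)
    return hashlib.md5(data[start:end]).hexdigest()
```

Usage would be along the lines of `audio_md5(open(path, "rb").read())`. This stays stable across tag edits, but two different rips or encodings of the same song still get different hashes, so it does nothing for the duplicate problem.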

There's a technique called acoustic fingerprinting that shows a lot of promise; have a look here for a quick intro. Even if there are minor differences in recording levels (like those popular "normalized" tracks), the acoustic fingerprint should remain the same. I say should, because none of the techniques I tested is really 100% error-free. Another advantage of acoustic fingerprints is that they can help you with tagging: a service like FreeDB only works on complete CDs, whereas acoustic fingerprints can identify single tracks.

fvu
I like that, acoustic fingerprinting. That might be my ticket. On that note, I am trying Ampache and it works surprisingly well, but it is doubling, tripling and sometimes quadrupling artists and tracks. That's an Ampache issue for sure, but it's a good base and will be a nice inspiration indeed.