I have a project at home I'm thinking about using as a way of teaching myself something new.

On an external NTFS drive I have a few tens of GBs of music albums, all in mp3 format and all nice and neatly sorted in directories. Some are quite obscure. I would like to create a playlist, in .m3u or (preferably) as an iTunes playlist, of all the tracks that were officially released as singles. How would you go about this? And what free tools, programming languages, APIs, scripts would you recommend exploring as a way of achieving this?

I'm really just looking for rough guidelines and ideas rather than finished solutions. I'm also committed to starting to learn PHP next week for a work project, so any ideas that involve PHP would be very welcome too.

+1  A: 

There are three distinct things you need to do to accomplish this:

  • Build an M3U playlist

Building an M3U file in and of itself is a fairly simple operation. In its simplest form, with no extended information, an M3U file is just a newline-delimited list of the files that should be played. So to create a playlist, you simply need to write the paths of all the files you want to play into a new file. Whatever scripting or programming language you use for this, there's usually a fairly simple library for writing data to a file.
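As a minimal sketch of that step, assuming Python and assuming the track paths have already been collected into a list, writing the playlist could look like this. The function name `write_m3u` and the choice of UTF-8 are mine, not part of any standard; some players expect a different encoding for plain .m3u files.

```python
from pathlib import Path


def write_m3u(track_paths, playlist_path):
    """Write one track path per line to a plain (non-extended) M3U file."""
    # Assumption: UTF-8 output; adjust the encoding if your player needs another.
    text = "".join(f"{path}\n" for path in track_paths)
    Path(playlist_path).write_text(text, encoding="utf-8")
```

Calling `write_m3u(["Album/01 Track.mp3"], "singles.m3u")` would produce a one-line playlist.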

  • Iterate over a directory structure

To process all your mp3 files, you will need to walk through a structure of subdirectories so that every file gets analyzed. This is a textbook case for recursion, which involves code that calls itself, commonly to work through a tree structure such as a directory with its subdirectories. Here's some pseudo-code of how moving through a directory structure would look:

function CheckDirectory(directory)

    for each file in directory
        // do something with the file

    for each subdirectory in directory
        CheckDirectory(subdirectory) // this will run the same code for each subdirectory

Eventually, this code will move down each "branch" in the directory structure and check all files, regardless of how deep the directory structure is or where in the directory structure files are located.
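For a concrete version of that pseudo-code, here is a rough Python sketch. The name `find_mp3s` is just illustrative, and "do something with the file" is assumed to mean "collect the path of every .mp3". Python's standard `os.walk` would do the same job without explicit recursion; the recursive form is shown only because it mirrors the pseudo-code.

```python
import os


def find_mp3s(directory):
    """Recursively collect the paths of all .mp3 files under directory."""
    found = []
    # Sort entries by name so the result order is predictable.
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        if entry.is_file() and entry.name.lower().endswith(".mp3"):
            found.append(entry.path)
        elif entry.is_dir(follow_symlinks=False):
            # Same code, one level further down the tree.
            found.extend(find_mp3s(entry.path))
    return found
```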

  • Identify what constitutes a "single"

MP3 tags, to my knowledge, don't have a field identifying a track as a single, so there's no straightforward way to determine what is a single and what isn't. A couple of approaches you could try:

  1. Consider any directory with only one MP3 in it to represent a single. This is the most straightforward approach, but there are a lot of cases where this will produce false positives - maybe you only own one song from an album, or there's a stray file somewhere for some reason.
  2. Compare the "song name" in the MP3 file to the "album name". Somewhat more sophisticated - if you go down this route you will need to parse the MP3 tags, which you can probably find a library for depending on what language you are using for this. Also keep in mind there's a good chance of edge conditions where the "album name" for a single may not match the song name exactly. A simple idea in theory, but the edge cases may make it very complicated if 100% accuracy is important.
  3. Find some sort of online web service that informs you of whether a song title is a single. If this exists, awesome, but I've certainly never heard of it.

All in all, your realistic options are probably 1 and 2. 1 is simpler, and will produce some false positives (songs that aren't singles but your logic thinks they are). 2 is more complicated, and will produce some false negatives (songs that ARE singles that your logic doesn't recognize). It's really up to what suits you best.
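Option 1 is simple enough to sketch. Assuming you already have a flat list of mp3 paths, something like the following Python would pick out the files that sit alone in their directory. The helper name `lone_tracks` is made up for illustration, and the false positives mentioned above still apply.

```python
import os
from collections import defaultdict


def lone_tracks(mp3_paths):
    """Return the mp3 files that are the only mp3 in their directory."""
    by_directory = defaultdict(list)
    for path in mp3_paths:
        by_directory[os.path.dirname(path)].append(path)
    # A directory holding exactly one mp3 is treated as a "single".
    return sorted(tracks[0] for tracks in by_directory.values() if len(tracks) == 1)
```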

Ryan Brunner
I think any album with five or fewer mp3s can be considered a single, since there are often remixes and/or b-sides. Beyond that you get into 'remix album' territory, so I think five is a reasonable number.
Pies
You should probably also look at the total album length, which should be less than, say, half an hour if it's a single. This is important because of albums with very few but very long songs, long mixtapes, audiobooks etc.
Pies
Sorry, I don't think I explained this clearly enough. Each directory is a full album of roughly 8 to 20 tracks, so none of the directories are single (or EP) releases. What I'm looking for is a way of determining which tracks were released as promotional singles: which tracks from the album would you have heard on the radio, or in the Top 40 (if my taste wasn't so obscure)? I was thinking about trying to use an API from something like All Music Guide to identify singles. Or maybe the MusicBrainz database.
Mark G
+1  A: 

I think something like PHP or Ruby would work well for this.

My algorithm would probably have steps like this:

  1. Iterate through mp3s and collect ID3 tag information
  2. Store the information so that it can be sorted by artist (I would use SQLite or something)
  3. Query the server (MusicBrainz or whatever) for all of the singles by each of the artists in your database or data structure. For instance, if your database indicates that you have any tracks by "The Beatles" then you would query the server for "singles by The Beatles." Then store those results.
  4. Compare your list of singles by each artist to the collections of tracks in your library by that artist
  5. When there is a match, simply add the filename to the M3U file.
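The steps above can be sketched roughly as follows in Python, under some loud assumptions: the tag information has already been read into `(artist, title, filename)` tuples, and `fetch_singles` is a hypothetical callable you supply that asks whichever service you choose for an artist's singles. Neither name comes from a real library, and a plain dictionary stands in for SQLite here.

```python
from collections import defaultdict


def pick_singles(library, fetch_singles):
    """Return filenames of library tracks that appear in their artist's singles.

    library: iterable of (artist, title, filename) tuples (assumed already
        extracted from the ID3 tags).
    fetch_singles: hypothetical callable, artist name -> iterable of titles.
    """
    # Step 2: group tracks by artist.
    by_artist = defaultdict(list)
    for artist, title, filename in library:
        by_artist[artist].append((title, filename))

    playlist = []
    for artist, tracks in by_artist.items():
        # Step 3: one lookup per artist, not one per track.
        singles = {title.casefold() for title in fetch_singles(artist)}
        # Steps 4 and 5: keep the files whose title matches a single.
        playlist.extend(fn for title, fn in tracks if title.casefold() in singles)
    return playlist
```

The returned list of filenames is what you would then write out, one per line, as the M3U file.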

This method should perform much better than querying the API for every track, since it makes one request per artist rather than one per track.

Also:

  • I would use regular expressions when comparing the track titles to avoid incorrect negative matches when there are things such as (album version) or (remix) in the title
  • I recommend using the Discogs API for musical release information. It's RESTful and the database is pretty accurate.
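To illustrate the regular-expression idea, here is one possible Python sketch. It assumes that qualifiers such as (album version) or (remix) always appear as bracketed text at the end of the title, which won't hold for every library. The name `normalize_title` is made up for this example.

```python
import re

# Matches one trailing "(...)" or "[...]" group, with surrounding whitespace.
_TRAILING_QUALIFIER = re.compile(r"\s*[\(\[][^\)\]]*[\)\]]\s*$")


def normalize_title(title):
    """Strip trailing bracketed qualifiers and fold case for comparison."""
    previous = None
    while previous != title:  # repeat until no qualifier is left to strip
        previous = title
        title = _TRAILING_QUALIFIER.sub("", title)
    return title.strip().casefold()
```

You would compare `normalize_title(track_title) == normalize_title(single_title)` instead of the raw strings. Note that a title consisting only of a bracketed phrase would normalize to an empty string, so you may want to special-case that.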

Have fun, let us know when you've written Foobar2001.

gregsabo
A: 

To determine if a track was released as a single, you can use MusicBrainz, the open source music metadata database. MusicBrainz identifies all releases by a 'type' - one of which is 'Single'.
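As a hedged sketch only: my understanding of the MusicBrainz web service is that release groups can be browsed by artist ID and filtered by type, and that the JSON reply lists them under a "release-groups" key with "title" and "primary-type" fields. Treat the URL parameters and the response shape below as assumptions to check against the current MusicBrainz documentation; the helper names are made up, and no request is actually sent here.

```python
from urllib.parse import urlencode

MB_ROOT = "https://musicbrainz.org/ws/2"


def singles_url(artist_mbid, limit=100):
    """Build a browse URL for an artist's release groups of type 'single'.

    Assumption: the 'artist', 'type', 'fmt' and 'limit' parameters behave
    as described in the MusicBrainz API documentation.
    """
    params = {"artist": artist_mbid, "type": "single", "fmt": "json", "limit": limit}
    return f"{MB_ROOT}/release-group?{urlencode(params)}"


def single_titles(response):
    """Pull single titles out of a decoded JSON reply (assumed shape)."""
    groups = response.get("release-groups", [])
    return [g["title"] for g in groups if g.get("primary-type") == "Single"]
```

You would fetch the URL with whatever HTTP client you prefer, decode the JSON, and pass the result to `single_titles`. As far as I know, MusicBrainz also asks clients to identify themselves and to keep their request rate low, so check their usage guidelines before running this over a whole library.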

plamere