I've created a Winamp-like music player in Delphi. Not so complex, of course. Just a simple one.

But now I would like to add a more complex feature: Songs in the library should be automatically rated based on the user's listening habits.

This means: The application should "understand" if the user likes a song or not. And not only whether he/she likes it but also how much.

My approach so far (data which could be used):

  • Simply measure how often a song was played per unit of time. Start counting when the song was added to the library, so that recently added songs are not at a disadvantage.
  • Measure how long a song was played on average (minutes).
  • Starting a song but immediately switching to another one should lower the song's ranking, since the user apparently didn't like it.
  • ...

Could you please help me with this problem? I would just like to have some ideas. I don't need the implementation in Delphi.

+3  A: 

Measure how long a song was played on average (minutes).

I don't think this is a good metric, because a long song would gain an unfair advantage over a short song. You should use a percentage instead:

avg. time played / total song length
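
A minimal sketch of this ratio, in Python purely for illustration. The function name and the cap at 100% per play are assumptions, not part of the suggestion above:

```python
def average_fraction_played(play_seconds, song_length_seconds):
    """Mean fraction of the song heard per play, each play capped at 1.0.

    play_seconds: list of seconds listened, one entry per play (hypothetical input).
    """
    if song_length_seconds <= 0 or not play_seconds:
        return 0.0
    fractions = [min(t / song_length_seconds, 1.0) for t in play_seconds]
    return sum(fractions) / len(fractions)


# A 240-second song played once halfway and once in full.
print(average_fraction_played([120, 240], 240))
```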

dbyrne
I usually moan about the opposite problem: if you measure the number of times a song's played as an indication of its "niceness", long songs get penalised! So your take-the-average idea sounds rather neat.
Frank Shearar
Thank you very much, dbyrne, nice idea. So I will measure the average time played in percent instead of minutes.
That doesn't quite work out nicely either, though: if you measure average percentage played, a short song played once will have a 100% rating, whereas a very long song that is often played for just the first few minutes will have a low rating despite being played often.
Eamon Nerbonne
Last.FM simply clips; a song counts as played when it's played for at least 30 seconds or 50% of the total length if that's longer, or 2(?) minutes if that's shorter - I'm unsure of the exact clipping values; not that it really matters much.
Eamon Nerbonne
+8  A: 

I would track all of your users' listening habits in a central database, so you can make recommendations based on what other people like too ("people who liked this song also liked these other songs").

some other metrics to consider:

  • proportion of times that the song was immediately replayed (ex. this song was immediately replayed 12% of the times it was played)

  • did they turn on the "repeat this song" button during play?

  • times played per hour, day, week, month

  • proportion of times this song was skipped. (ex. this song was played, but immediately skipped 99% of the time)

  • proportion of song listened to (the user listened to 50% of this song on average, versus 100% of some other song)
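
A rough sketch of how some of these proportions could be computed from a per-play log. The `Play` record, its field names, and the 5% skip threshold are all assumptions made for the example:

```python
from dataclasses import dataclass


@dataclass
class Play:
    seconds_played: float
    song_length: float
    replayed_next: bool = False  # True if the same song was started again right after


SKIP_FRACTION = 0.05  # assumed: under 5% heard counts as an immediate skip


def summarize(plays):
    """Return replay rate, skip rate and average fraction heard for one song."""
    n = len(plays)
    if n == 0:
        return {"replay_rate": 0.0, "skip_rate": 0.0, "avg_fraction": 0.0}
    fractions = [min(p.seconds_played / p.song_length, 1.0) for p in plays]
    return {
        "replay_rate": sum(p.replayed_next for p in plays) / n,
        "skip_rate": sum(f < SKIP_FRACTION for f in fractions) / n,
        "avg_fraction": sum(fractions) / n,
    }


history = [Play(200, 200, True), Play(4, 200), Play(100, 200), Play(200, 200)]
print(summarize(history))
```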

also:

listen in on the user's microphone. do they sing along? :D

what volume do they play the song? do they crank it up?

Put in a "recommend this song to friends" button (that emails song title to friend or something). Songs they recommend, they probably like.

You might want to do some feature extraction on the audio stream, and find similar songs. This is hard, but you can read more about it here:

"Automatic Feature Extraction for Classifying Audio Data" http://www.springerlink.com/content/g71368g57x013j48/

"Understandable models Of music collections based on exhaustive feature generation with temporal statistics" http://portal.acm.org/citation.cfm?id=1150523

"Collaborative Use of Features in a Distributed System for the Organization of Music Collections" http://www.idea-group.com/Bookstore/Chapter.aspx?TitleId=24432

el chief
Thank you very much, el chief. There are some nice ideas in your answer. Concerning the first paragraph: I know this approach (last.fm) but I build a single-user application. So I can't compare the user's habits with other users' habits.
Concerning your additional metrics: Shouldn't one combine metric #4 and metric #5? If a song is immediately skipped, then the proportion listened to is just 1% or so, right?
Your metric #3 corresponds to my metric #1, doesn't it? Whether I measure the times played per week or per year doesn't draw any distinction, does it?
Re skipping: you are right. An immediate skip would correspond to playing, say, 5% of the song, so remove metric #4. Re time unit: you should track the date/time of the last play in any case. Songs that were played more in the last week might be considered "hotter" or "more liked at the moment" than ones that were played more, but farther in the past. Example: songA and songB were added to the library on the same date. I played songB ten times in week 1, and songA ten times in week 10. I played them the same total number of times, and they are the same age, but you might say I like songA better right now.
el chief
Also track each song's BPM and guess the style of music. Compare this with the other songs that are played often: if a song is in the same BPM range, then more than likely the user listens to a lot of music from that range, so it is their type of music and they will probably like the song too.
RobertPitt
A: 
(ListenPartCount * (ListenFullCount ^ 2)) + (AverageTotalListenTime * ListenPartTimeAverage)
--------------------------------------------------------------------------------------------
               ((AverageTotalListenTime - ListenPartTimeAverage) + 0.0001f)

This formula will produce a nice result: the user could really like just part of a song, and this should be visible in the score; if the user likes the full song, the weight should be doubled.

You can tweak this formula in various ways, e.g. by including the user's tree of listening (for example, the user listens to one song and afterwards listens to another song a few times, etc.).
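
For experimenting, the formula can be transcribed directly. This sketch assumes that `^` means "to the power of" and that the four inputs are per-song statistics collected elsewhere; the Python names are just snake_case versions of the ones in the formula:

```python
def formula_score(listen_part_count, listen_full_count,
                  avg_total_listen_time, listen_part_time_avg):
    """Direct transcription of the formula above (assumes ^ is exponentiation)."""
    numerator = (listen_part_count * listen_full_count ** 2
                 + avg_total_listen_time * listen_part_time_avg)
    # The small constant avoids division by zero when both averages are equal.
    denominator = (avg_total_listen_time - listen_part_time_avg) + 0.0001
    return numerator / denominator


print(formula_score(2, 3, 200.0, 100.0))
```

Note that the result turns negative if the partial-listen average exceeds the total average, so the inputs may need to be constrained.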

Lukas Šalkauskas
A: 

Use the date the song was added to the library as a starting point.

Measure how often the song/genre/artist/album is played (fully, or in part or skipped) - this will also allow you to measure how often a song/genre/artist/album is not played.

Come up with a weighting based on these parameters. When a song, its genre, artist or album has not been played frequently, it should rank poorly. When an artist is played every day, songs from this artist should get a boost; but if one of the artist's songs is never played, that song should still rank pretty low.

narkie1987
+1  A: 

Concerning your additional metrics: Shouldn't one combine metric #4 and metric #5? If a song is immediately skipped, then the proportion listened to is just 1% or so, right? – marco92w May 21 at 15:08

These should be separate. Skipping should result in negative rating for the song that was skipped. However, if the user closes the application when a song begins, you should not consider it as negative rating, even though only a low percentage of the song was played.
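
One way to keep the two cases apart is to record why playback ended. A minimal sketch, where the `ended_by` labels and the size of the penalty are made up for the example:

```python
def rating_delta(fraction_played, ended_by):
    """Rating change for one play.

    ended_by is one of 'finished', 'skipped' or 'app_closed' (hypothetical labels).
    """
    if ended_by == "app_closed":
        return 0.0  # closing the player says nothing about the song
    if ended_by == "skipped":
        # Assumed weighting: the earlier the skip, the larger the penalty.
        return -(1.0 - fraction_played)
    return 1.0


print(rating_delta(0.02, "app_closed"))
print(rating_delta(0.25, "skipped"))
```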

notmg
A: 

Simply measure how often a song was played per time.

Often, I go to play a particular song, and then just let my iPod run until the end of an album. So this method would give an unfair advantage to songs late in an album. Something you might want to compensate for if your music player works the same way.
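
One possible compensation is to count a play the user started explicitly more heavily than a play that began only because the previous track ended. A small sketch, with an arbitrary assumed weight of 0.5 for auto-advanced plays:

```python
AUTO_ADVANCE_WEIGHT = 0.5  # assumed value, tune to taste


def weighted_play_count(plays):
    """plays: iterable of booleans, True when the user picked the song explicitly."""
    return sum(1.0 if chosen else AUTO_ADVANCE_WEIGHT for chosen in plays)


# One explicit play followed by two plays reached by letting the album run.
print(weighted_play_count([True, False, False]))
```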

dan04
A: 

What about applying artificial intelligence to this problem?

Well! Starting from scratch, it could be real fun to use a network of clients, each with its own "intelligence", and finally collect the client results in a central "intelligence".

Each client could produce its own "user ratings" based on the user's habits (as already said: average listening time, listen count, etc.).

Then a central "intelligent" collector could merge the individual ratings into "global ratings" showing trends, suggestions and every high-level rating you need.

Anyway, training such a "brain" means that you have to solve the problem analytically first, but it really could be fun to build such a cloud of small interconnected brains to produce a higher-level "intelligence".

As usual, since I don't know your skills: take a look at neural networks, genetic algorithms, fuzzy logic, pattern recognition and similar problems for a deeper understanding.

DrFalk3n
A: 

You can use some simple function like:

listened_time_of_song/(length_of_song + 15s) 

or

 listened_time_of_song/(length_of_song * 1.1) 

This means that a song stopped after 15 seconds gets a very low score. The second variant may be even better, because the length of the song then has no effect on the final score when the user listens to the whole song.
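
To compare the two variants, here is a small sketch with all times in seconds. The function names are invented for the example:

```python
def score_additive(listened, length, pad_seconds=15.0):
    """First variant: listened_time / (length + 15s)."""
    return listened / (length + pad_seconds)


def score_scaled(listened, length, factor=1.1):
    """Second variant: listened_time / (length * 1.1)."""
    return listened / (length * factor)


# Full listens of a 60 s song and a 600 s song.
print(score_additive(60, 60), score_additive(600, 600))
print(score_scaled(60, 60), score_scaled(600, 600))
```

With the first variant a full listen of a short song scores lower than a full listen of a long one (60/75 versus 600/615), while the second gives every full listen the same score of about 0.91.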

Another way may be to use neural networks, if you are familiar with the subject.

+2  A: 

Please let the rating decay over time. You seem to like songs better if you heard them often during the last n days, while older songs should only get a casual mention, since you like them but have probably heard them way too much.
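
A common way to get this effect is exponential decay, where each play counts less the longer ago it happened. A minimal sketch; the 30-day half-life is an arbitrary assumption:

```python
def decayed_play_score(play_ages_days, half_life_days=30.0):
    """Sum of plays, each weighted by 0.5 ** (age / half_life).

    play_ages_days: how many days ago each play happened (hypothetical input).
    """
    return sum(0.5 ** (age / half_life_days) for age in play_ages_days)


recent = decayed_play_score([1] * 10)   # ten plays yesterday
old = decayed_play_score([70] * 10)     # ten plays ten weeks ago
print(recent > old)
```

Both songs have the same total play count here, but the recently played one ends up with the higher score.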

Last but not least, you could add beat detection (and maybe frequency spectrum analysis) to find similar songs, which could provide you with more data than the user supplies by listening to the songs.

I would also group songs that have the same MP3 ID tag here, since this also gives a hint about what the user is currently into. And if you want to provide some autoplay function, it would help there too. After hearing a great Goa song, switching to Punk is strange, even if I like songs from both worlds.

Daniel