
I want to dissect Pandora

ButterBean Registered User regular
edited May 2009 in Help / Advice Forum
I'm interested in accessing the database that controls Pandora's ability to create stations and match artists. The mechanism behind Pandora's ability to create stations seems to be based on something called the Music Genome Project.

From what I understand (according to Wikipedia), the Music Genome Project (MGP) categorizes songs/bands using a list of musical attributes, which they call musical 'genes'. Some songs can have up to 500(!!!) of these markers. The Wikipedia article explains it better than I can: http://en.wikipedia.org/wiki/Music_Genome_Project

When you create a new station, it will tell you some of the attributes of the songs you've input, but sometimes it doesn't seem to work well. I get tired of Pandora - they replay some songs over and over, or I'll insert a couple of songs I like and the station will pick up on the 'wrong' attributes I like about a particular group of songs. What I'd really like to be able to do is look up a song or artist, see the WHOLE genome they've created for it, and then do a search of all songs in the database with genome attributes x,y and z.
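To make that kind of lookup concrete, here is a minimal sketch of what "see the whole genome, then search by attributes x, y and z" could look like if the data were available. Everything in it is an assumption for illustration: the song titles, the attribute names, and the helper functions `find_songs` and `attribute_counts` are made up, and none of it reflects Pandora's actual database or any real API.

```python
from collections import Counter

# Hypothetical, hand-made data. These attribute names are illustrative only
# and are not actual Music Genome Project genes.
SONGS = {
    "Song A": {"acoustic guitar", "minor key", "female vocals"},
    "Song B": {"acoustic guitar", "major key", "male vocals"},
    "Song C": {"acoustic guitar", "minor key", "male vocals", "syncopation"},
}


def find_songs(songs, required):
    """Return the sorted titles whose attribute set contains every required attribute."""
    required = set(required)
    return sorted(title for title, attrs in songs.items() if required <= attrs)


def attribute_counts(songs):
    """Count how many songs carry each attribute (e.g. to find the most common instrument)."""
    counts = Counter()
    for attrs in songs.values():
        counts.update(attrs)
    return counts


if __name__ == "__main__":
    print(find_songs(SONGS, ["acoustic guitar", "minor key"]))
    print(attribute_counts(SONGS).most_common(1))
```

The same counting idea would cover questions like "most commonly used instrument for a band", given a per-band subset of the data.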

Aside from using it to find new music myself based on specifically chosen attributes (rather than letting some algorithm assume certain things), I just think it would be pretty interesting to see things like what the most popular instrument in music is at any given time, the instruments most commonly used by a particular band, how a band's sound changed after a key member left, etc.

I've done a few searches into the origin of the MGP and Pandora but can't find anything except pretty sparse details and news about Pandora as a business and service. The Wikipedia article is the most substantial thing I've found about the MGP. This seems kind of weird, because according to Pandora's site, the MGP has been around since 2000. That's ancient by web standards - you'd think there would be a lot more documentation of such an interesting database.

Is there a site or service that offers the kind of access I describe, or is it pretty much a pipe dream at this point? I've tried Google, but I think part of my problem is that I'm not sure the search terms and language I'm using are correct.


Posts

  • Hamster_style Registered User regular
    edited May 2009
    I'd expect a lot of the stuff in Pandora's database is very proprietary.

    This may be a little tangential, but an alternative you may be interested in checking out is a thing called "Boffin" from last.fm:

    http://www.last.fm/group/Audioscrobbler+Beta/forum/30705/_/510180

    It's very beta, but I think it does something close to what you're looking for, i.e., looking up a bunch of stuff by attributes x, y, z. However, the attributes here are user-generated "tags" rather than stuff from the genome project - but I'm not gonna lie, some of that genome stuff is kinda off.

  • lifeincognito Registered User regular
    edited May 2009
    I recently read an article about how Pandora works in the May IEEE Spectrum magazine, so you may want to go find a copy and read the blurb (it isn't a full-length article, but it may prove helpful to you).

    From what I understand of the article, Pandora runs light on the algorithms and the filtering and heavy on using actual music experts to rate and categorize music. This tends to make stations get 'stuck' if you thumb up or down too many songs, as you found on your own. At the same time, you have to remember that they have people assessing the musicality of the songs, so they might not be tracking the instruments being used so much as the feel and the actual score of the music. They claim that having actual people do the rating captures many more technical aspects. That is also why they went bankrupt shortly after starting work in 2000: the dot-com bubble burst while they had a rather large staff of people doing nothing but rating music.

    Something that might take a more quantitative approach to music is gnoosic, but sadly I don't know much about how it works.

    losers weepers. jawas keepers.
  • CrystalMethodist Registered User regular
    edited May 2009
    Like other people have mentioned, Pandora is mostly hand-tagged. There aren't a whole lot of algorithms in there; they just have a list of characteristics for each song/artist, and they return songs/artists with similar characteristics.
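    As a rough illustration of "return songs with similar characteristics", here is a minimal sketch that ranks songs by Jaccard similarity over hand-tagged attribute sets. Jaccard is just one simple choice swapped in for the example; Pandora's actual matching method isn't public as far as this thread knows, and the catalog, attribute names, and the `jaccard` / `most_similar` helpers are all hypothetical.

```python
# Hypothetical hand-tagged catalog. Titles and attributes are made up.
CATALOG = {
    "Song A": {"acoustic guitar", "minor key", "female vocals"},
    "Song B": {"acoustic guitar", "major key", "male vocals"},
    "Song C": {"acoustic guitar", "minor key", "male vocals", "syncopation"},
    "Song D": {"synth", "major key"},
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both sets are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(seed, catalog, k=2):
    """Return the k titles most similar to the seed, best first, ties broken by title."""
    seed_attrs = catalog[seed]
    scored = [
        (jaccard(seed_attrs, attrs), title)
        for title, attrs in catalog.items()
        if title != seed
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:k]]


if __name__ == "__main__":
    print(most_similar("Song A", CATALOG))
```

    Note that the ranking depends entirely on which attributes are tagged, which is one way a station could latch onto the 'wrong' attributes of a seed song.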

    Last.fm is much more about algorithms, mostly collaborative filtering, which is actually what I'm spending a lot of time working with for my master's degree. Poke around the Netflix Prize to give yourself an introduction to the subject.
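    For a feel of what collaborative filtering means, here is a toy user-based sketch: it recommends artists that listeners with overlapping taste have played, without looking at any musical attributes at all. It is not how Last.fm or the Netflix Prize entries actually work; the listening histories and the `recommend` function are made up for the example.

```python
from collections import Counter

# Hypothetical listening histories: user -> set of artists they have played.
HISTORIES = {
    "alice": {"Artist 1", "Artist 2", "Artist 3"},
    "bob": {"Artist 1", "Artist 2", "Artist 4"},
    "carol": {"Artist 2", "Artist 3", "Artist 5"},
    "dave": {"Artist 6"},
}


def recommend(user, histories, k=2):
    """Suggest up to k unheard artists, weighted by how much each other user overlaps."""
    mine = histories[user]
    scores = Counter()
    for other, theirs in histories.items():
        if other == user:
            continue
        overlap = len(mine & theirs)  # shared artists act as a crude similarity weight
        if overlap == 0:
            continue  # users with nothing in common contribute nothing
        for artist in theirs - mine:
            scores[artist] += overlap
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [artist for artist, _ in ranked[:k]]


if __name__ == "__main__":
    print(recommend("alice", HISTORIES))
```

    The contrast with the hand-tagged approach is that nothing here knows what the music sounds like; it only uses who listens to what.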
