Showing posts with label Music. Show all posts

Monday, April 30, 2012

One Million... Song Dataset Challenge [Think "Data geek music listening analysis and prediction challenge"]

kaggle - Million Song Dataset Challenge


"The Million Song Dataset Challenge aims at being the best possible offline evaluation of a music recommendation system. Any type of algorithm can be used: collaborative filtering, content-based methods, web crawling, even human oracles!* By relying on the Million Song Dataset, the data for the competition is completely open: almost everything is known and possibly available.

What is the task in a few words? You have: 1) the full listening history for 1M users, 2) half of the listening history for 110K users (10K validation set, 100K test set), and you must predict the missing half. How much easier can it get?

The most straightforward approach to this task is pure collaborative filtering, but remember that there is a wealth of information available to you through the Million Song Dataset. Go ahead, explore! If you have questions, we recommend that you consult the MSD Mailing List.

Ready to start recommending? Read through our Getting Started tutorial..."
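To make the "pure collaborative filtering" idea a bit more concrete, here is a minimal sketch of one simple flavor of it: count how often songs show up together in the same listening history, then rank unseen songs by how often they co-occur with the half you can see. The user and song ids are made-up toy data, and the function names are my own; this is not the challenge's official baseline or its data format.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(histories):
    """Count how often each pair of songs appears in the same user's history."""
    co = defaultdict(Counter)
    for songs in histories.values():
        for a, b in combinations(sorted(set(songs)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(visible, co, k=3):
    """Rank unseen songs by summed co-occurrence with the visible songs."""
    seen = set(visible)
    scores = Counter()
    for song in seen:
        for other, count in co[song].items():
            if other not in seen:
                scores[other] += count
    # Highest score first; ties broken by song id so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [song for song, _ in ranked[:k]]


# Toy "full history" users (hypothetical ids, not real MSD data).
train = {
    "u1": ["s1", "s2", "s3"],
    "u2": ["s1", "s2", "s4"],
    "u3": ["s2", "s3", "s4"],
    "u4": ["s1", "s3"],
}
co = build_cooccurrence(train)

# A test user whose visible half is s1 and s2; guess the hidden half.
print(recommend(["s1", "s2"], co))  # ['s3', 's4']
```

Pairwise counting like this would need a much more careful implementation at the real 1M-user scale, but it shows the shape of the task: learn from the full histories, then fill in the missing half.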

Million Song Dataset

"The Million Song Dataset is a freely-available collection of audio features and metadata for a million contemporary popular music tracks.

Its purposes are:

  • To encourage research on algorithms that scale to commercial sizes
  • To provide a reference dataset for evaluating research
  • As a shortcut alternative to creating a large dataset with APIs (e.g. The Echo Nest's)
  • To help new researchers get started in the MIR field

The core of the dataset is the feature analysis and metadata for one million songs, provided by The Echo Nest. The dataset does not include any audio, only the derived features. Note, however, that sample audio can be fetched from services like 7digital, using code we provide.

The Million Song Dataset is also a cluster of complementary datasets contributed by the community:

The Million Song Dataset started as a collaborative project between The Echo Nest and LabROSA. It was supported in part by the NSF."


I love this part from the MSD Getting Started section:

To start your own experiments, you can download the entire dataset (280 GB). We also provide a subset of 10,000 songs (1%, 1.8 GB compressed) for a quick taste.

280GB. Now that's a dataset.


I found the MSD Challenge interesting; it called out to my inner data geek.

(via KDnuggets Home - Million Song Dataset Challenge)

Wednesday, July 20, 2011

Soundtorch - Turning sound/music browsing on its ear...



Audio browsing redefined

Soundtorch lets you browse through your audio collection extremely fast. Just drag your audio files onto the Soundtorch view ― all sounds are immediately arranged in a meaningful way and ready to be auditioned. Soundtorch is driven by the C.A.S.E. (Computer Aided Sound Exploration) engine, a sophisticated suite of algorithms that analyze and intelligently classify your audio collection.

With Soundtorch you can listen to sounds using a virtual torchlight ― all sounds illuminated by the light are played back simultaneously. This allows you to get an overview of literally thousands of sound files in a matter of mere minutes. As you zoom in or focus the light beam, you can still listen to one sound at a time.

Even though Soundtorch plays so many sounds at once, there's no cacophony. Your mind is powerful enough to easily spot the sound you like. Soundtorch further positions each playing sound on a surround system, or uses advanced surround virtualization when listening with headphones.


Check out the video then come back...

Isn't that pretty awesome? What a cool way to explore and discover music. And the fact that it's reportedly written with XNA is pretty cool too.  :)

(via reddit/r/xna - Audio Search Engine, written in XNA)

Friday, March 28, 2008

Let the Machine do the Listening to your music - Machine Listening API from The Echo Nest

TechCrunch - First “Machine Listening” API Flies From The Echo Nest

"“Machine Listening” is the idea that computers can be programmed to interpret audio signals the same way humans do. This means that they can tell when a song belongs to the blues genre rather than techno. And they can detect musical characteristics like tempos, transition types, and harmonies.


The Echo Nest is a company that’s bringing machine listening to Web 2.0. It was founded by two MIT PhD students and is supported by a government grant. Today, the company releases the first of several “Musical Brain” APIs intended to improve three main aspects of music-related web services: search, recommendations, and interactivity.

The first API, which focuses on signature analysis and ...


The Echo Nest will lend all of its APIs to non-commercial projects for free, but it will charge commercial sites with a usage fee. ..."

I am a huge sucker for APIs. And I dig the thought that I could use this, or future APIs, to help me build some cool playlists for my Zune. Or maybe a social music recommendation app.

The API is a simple HTTP/GET/POX one (i.e. REST?), and while there are no .NET samples yet, it should be VERY easy to use from .NET.
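For a feel of what "HTTP GET plus plain old XML" means in practice, here is a rough sketch in Python. The endpoint URL, parameter names, and response shape are all invented for illustration; they are NOT The Echo Nest's actual API. The sketch only builds the request URL and parses a canned XML body, so it runs without any network access.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint, for illustration only (not a real Echo Nest URL).
BASE_URL = "http://api.example.com/analyze"


def build_request_url(api_key, track_url):
    """Build a GET URL carrying the (assumed) parameters in the query string."""
    query = urlencode({"api_key": api_key, "url": track_url})
    return f"{BASE_URL}?{query}"


def parse_response(xml_text):
    """Turn a flat XML response into a dict of tag -> text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


# A made-up response body standing in for what a server might send back.
sample = "<response><status>ok</status><tempo>120.5</tempo></response>"

url = build_request_url("MY_KEY", "http://example.com/song.mp3")
result = parse_response(sample)
print(url)
print(result)
```

Against a real service you would fetch the body with something like `urllib.request.urlopen(url).read()` and hand it to `parse_response`; any language with an HTTP client and an XML parser can do the same.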

Just one more thing for me to play with... :)