For Music Hack Day San Francisco, I teamed up with Steven Lloyd to integrate a music visualizer into Songbird. As far as I know, this is the first music visualization for Songbird. Since we didn’t have access to the raw audio through Songbird, we used The Echo Nest’s analysis data, which gives high-level, musically meaningful information to display. SongbirdVis represents pitch, timbre, loudness, rhythmic, and structural information from the analysis data, and synchronizes it with the audio. In addition to showing information about the currently playing track, SongbirdVis is interactive. You can click on part of the visualization to hear the music at that point. You can also click and drag to select a part of the visualization to zoom in on.
To get analysis data for a track, all you need is a free Echo Nest API key. Instead of uploading the track for analysis, we use the new v4 beta search API to see if The Echo Nest has already analyzed the currently selected track. When the API is updated sometime in the next month, we’ll update the code to upload the track, or search by the track’s MD5 checksum.
The analysis data contains a lot of deep and musically interesting information about a piece of music: bars, beats, structural sections, and pitch and timbre data for every perceptual event in the track. The trick for a developer is to decide how to represent all of that visually.
I’ve been thinking a lot about those issues lately as a part of visualizer.fm, a project to synchronize HTML5 audio playback of music with visualizations of Echo Nest analysis data using processing.js (more about visualizer.fm in a separate post). I decided to port the diagnostic visualizer to work in Songbird. I thought it was a good choice since it shows the whole song at once, so it works both for viewing a track’s analysis and for use during playback.
Songbird displaying SongbirdVis for the track “Dancing Queen” by ABBA.
Zooming in on the first section.
The timbre, pitch, and loudness features are all in terms of segments. A segment corresponds to a perceptual event (e.g. guitar note, drum hit) in a song.
Working from top to bottom, here’s what SongbirdVis displays:
The timbre display shows the 12-dimensional timbre vector for each segment. Longer segments take up more horizontal space. The timbre vector is colored by interpreting the first 3 dimensions as RGB values.
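To make the timbre display a bit more concrete, here is a minimal sketch in Python (not the actual processing.js code). It assumes each segment is a dict with `start`, `duration`, and `timbre` keys, and it assumes the first three timbre dimensions are min-max scaled across the track before being used as RGB; the real visualizer may scale them differently.

```python
def timbre_colors(segments):
    """Turn the first 3 timbre dimensions of each segment into an RGB tuple.

    Assumption: each dimension is min-max scaled over the whole track,
    since raw timbre values are not confined to the 0-255 range.
    """
    dims = range(3)
    lows = [min(s["timbre"][d] for s in segments) for d in dims]
    highs = [max(s["timbre"][d] for s in segments) for d in dims]
    colors = []
    for s in segments:
        rgb = []
        for d in dims:
            span = highs[d] - lows[d]
            # A dimension that never changes is drawn as 0 to avoid dividing by zero.
            scaled = 0.0 if span == 0 else (s["timbre"][d] - lows[d]) / span
            rgb.append(round(scaled * 255))
        colors.append(tuple(rgb))
    return colors


def segment_rects(segments, track_duration, width_px):
    """Return (x, width) in pixels per segment, so longer segments are wider."""
    scale = width_px / track_duration
    return [(s["start"] * scale, s["duration"] * scale) for s in segments]
```

Each segment then becomes one colored rectangle: `segment_rects` gives its horizontal extent and `timbre_colors` gives its fill.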
The pitch display shows the 12-dimensional pitch vector for each segment. Each value corresponds to the strength of a pitch at that point. So, if a guitar plays the note G, and there is no other sound, only the bin for G would be colored in. Because of percussive and other non-pitched sounds, we see a lot of color on the pitch display. Some filtering or weighting might be in order to make this display a bit cleaner.
Pitch colors are chosen by taking a note frequency in hertz and finding the color of the corresponding wavelength of light.
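One way to read "the corresponding wavelength of light" is to raise the note by octaves until its frequency lands in the visible spectrum, then convert that wavelength to a color. The Python sketch below does that; the octave-doubling step, the visible-range constant, and the piecewise-linear wavelength-to-RGB approximation are all assumptions for illustration, not necessarily what SongbirdVis does.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second
VISIBLE_MIN_HZ = 4.0e14         # assumed red end of visible light (~750 nm)


def note_to_wavelength_nm(freq_hz):
    """Double the frequency (one octave up each time) until it is visible light."""
    f = freq_hz
    while f < VISIBLE_MIN_HZ:
        f *= 2.0
    return SPEED_OF_LIGHT / f * 1e9


def wavelength_to_rgb(nm):
    """Rough piecewise-linear wavelength-to-RGB approximation (no gamma, no falloff)."""
    nm = max(380.0, min(750.0, nm))  # clamp to the range the approximation covers
    if nm < 440:
        r, g, b = (440 - nm) / 60, 0.0, 1.0
    elif nm < 490:
        r, g, b = 0.0, (nm - 440) / 50, 1.0
    elif nm < 510:
        r, g, b = 0.0, 1.0, (510 - nm) / 20
    elif nm < 580:
        r, g, b = (nm - 510) / 70, 1.0, 0.0
    elif nm < 645:
        r, g, b = 1.0, (645 - nm) / 65, 0.0
    else:
        r, g, b = 1.0, 0.0, 0.0
    return (round(r * 255), round(g * 255), round(b * 255))


def pitch_color(freq_hz):
    return wavelength_to_rgb(note_to_wavelength_nm(freq_hz))
```

Because every octave of a note maps to the same wavelength, each of the 12 pitch classes gets one fixed color under this scheme.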
The loudness curve shows the perceptual loudness (in dB) over the course of the track. The thickness of the curve shows the difference between the loudness at the beginning of a segment (loudness start), and the maximum loudness for that segment. The white line shows the overall loudness for the track. The vertical white lines show where the analysis data has marked “end of fade in” and “start of fade out”.
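The thickness of the loudness curve can be sketched as a band between two values per segment. This Python example assumes segments carry `start`, `loudness_start`, and `loudness_max` keys, and it assumes a display range of -60 dB to 0 dB; both the field names and the range are illustrative choices.

```python
def db_to_y(db, height_px, floor_db=-60.0, ceil_db=0.0):
    """Map a loudness in dB to a y pixel, with louder values nearer the top (y = 0)."""
    clamped = max(floor_db, min(ceil_db, db))
    frac = (clamped - floor_db) / (ceil_db - floor_db)
    return height_px * (1.0 - frac)


def loudness_band(segments, track_duration, width_px, height_px):
    """Return (x, y_top, y_bottom) per segment.

    The vertical gap is the difference between the loudness at the start
    of the segment and the segment's maximum loudness.
    """
    band = []
    for s in segments:
        x = s["start"] / track_duration * width_px
        y_start = db_to_y(s["loudness_start"], height_px)
        y_max = db_to_y(s["loudness_max"], height_px)
        band.append((x, min(y_start, y_max), max(y_start, y_max)))
    return band
```

Connecting the top points and the bottom points across segments gives a curve that is thick where a segment has a sharp attack and thin where the loudness barely changes.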
The meter display shows Bars, Beats, and Tatums as blue, red, and white squares, respectively. Because the beats and tatums are so close to each other when fully zoomed out, they look like thick lines. The blue, red, and white curves show the confidence associated with each of the corresponding squares. In Dancing Queen, you can see that the confidence associated with beats is a lot higher, on average, than the confidence associated with bars or tatums. The gray vertical bars correspond to sections, higher-level structures in the song, such as chorus and verse. Since clicking the visualization seeks to that point, the section bars make it easy to jump around between chorus and verse.
- Sync - SongbirdVis receives an updated timestamp from Songbird as the track plays, allowing the visualizer to match what is playing.
- Seek - Clicking on SongbirdVis allows you to hear the track at the place you clicked by setting the currently playing position in the track.
- Track changed - When a new track starts playing, SongbirdVis queries the Echo Nest API to display the visualization for the new track.
- Resizing - When the Songbird window is resized, SongbirdVis resizes itself.
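Sync, seek, zoom, and resize all come down to converting between pixels and track time. The Python sketch below shows one way to organize that; the `View` class and its method names are hypothetical, and the calls into Songbird for reading or setting the playback position are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class View:
    """The time window currently shown, and how wide it is drawn."""
    start_s: float
    end_s: float
    width_px: float

    def x_to_time(self, x):
        """Seek: convert a click position to a track time in seconds."""
        frac = min(max(x / self.width_px, 0.0), 1.0)
        return self.start_s + frac * (self.end_s - self.start_s)

    def time_to_x(self, t):
        """Sync: convert the player's timestamp to a playhead x position."""
        return (t - self.start_s) / (self.end_s - self.start_s) * self.width_px

    def zoom_to(self, x0, x1):
        """Zoom: a click-and-drag from x0 to x1 becomes the new time window."""
        a, b = sorted((self.x_to_time(x0), self.x_to_time(x1)))
        return View(a, b, self.width_px)

    def resized(self, new_width_px):
        """Resize: keep the same time window, drawn at a new width."""
        return View(self.start_s, self.end_s, new_width_px)
```

For example, with `View(0.0, 200.0, 800.0)`, a click at x = 400 maps to the 100-second mark, and dragging from x = 200 to x = 600 zooms to the window from 50 to 150 seconds.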
SongbirdVis needs The Echo Nest to update the v4 search API and track upload API before it can be released for general use. There are also a couple of things that remain to be fixed in our hacked-up copy of processing.js. In the meantime, the code is available on GitHub for your forking pleasure.
SF Music Hack Day was a great experience, with loads of smart developers doing cool things. You should definitely check out the full list of other hacks, here. My favorite was Leonard’s Set Summary. The best part about it, for me, was that it uses Capsule. It’s a real joy to see people make cool stuff with code I’ve developed.