Presentation at DMRN+7

The Digital Music Research Network one-day workshop (DMRN+7) was held on Tuesday, 18 December 2012, at Queen Mary University of London. The workshop included invited and contributed talks and poster sessions. In my talk I gave an overview of the project, its main goals and background, and a brief summary of Semantic Web technologies, and then discussed a shared vocabulary for audio features in more detail. The final part of the talk was reserved for a practical demonstration of the Sonic Annotator Web Application (SAWA), in which Mel-frequency cepstral coefficients (MFCCs) were extracted from an uploaded audio file (in this case Fela Kuti's song "Gentleman").
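For readers who want to try a comparable extraction locally, here is a minimal sketch using the librosa Python library rather than the Vamp-based pipeline that SAWA uses; the file name and the number of coefficients are placeholders, not values from the demonstration.

```python
# Minimal MFCC extraction sketch. Note: the demo itself ran Vamp plugins
# through SAWA, not librosa; this is only an illustrative stand-in.
import librosa

# Load the audio at its native sample rate (file name is a placeholder).
y, sr = librosa.load("gentleman.wav", sr=None)

# Compute 20 MFCCs per frame; the result has shape (n_mfcc, n_frames).
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)

print(f"Extracted {mfccs.shape[0]} coefficients over {mfccs.shape[1]} frames")
```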

SAWA, implemented by György Fazekas, is a web application that uses Vamp plugins to extract user-chosen features and can return the results in several formats, including RDF. The uploaded file can be identified using a MusicDNS fingerprint and MusicBrainz. This enables the file to be linked with related resources, including the MusicBrainz and DBTune databases and the artist's BBC page. Once the extraction is completed, the user can download the RDF file and open it in the Sonic Visualiser application.

In the demonstration I compared MFCCs extracted with three different tools: SAWA, LibXtract, and the bextract command-line tool from the Marsyas toolkit, in order to illustrate one of the complications we face while developing a shared vocabulary for audio features: it is difficult to know whether the same features computed by different tools are interchangeable and compatible. Even this simple comparison showed that the three sets of MFCCs for the same track were distinctly different. This raises the issue of how to annotate features that are equivalent in theory but in practice produce significantly divergent results.
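As an illustration of the kind of check involved, the sketch below compares two exported MFCC matrices numerically. The CSV file names and layout (one frame per row, one coefficient per column) are assumptions for the example; each tool actually emits its own output format, which is part of the problem being described.

```python
# Sketch of a numerical comparison between MFCC outputs from two tools.
# File names and CSV layout are hypothetical; real tool outputs differ.
import numpy as np

sawa = np.loadtxt("sawa_mfcc.csv", delimiter=",")
marsyas = np.loadtxt("bextract_mfcc.csv", delimiter=",")

# Tools rarely agree even on dimensions: frame counts vary with window
# and hop size, coefficient counts with configuration, so trim to the
# overlapping region before comparing.
frames = min(sawa.shape[0], marsyas.shape[0])
coeffs = min(sawa.shape[1], marsyas.shape[1])
a, b = sawa[:frames, :coeffs], marsyas[:frames, :coeffs]

# Mean absolute difference per coefficient highlights where the
# implementations diverge (filterbank shape, liftering, scaling, ...).
diff = np.mean(np.abs(a - b), axis=0)
print("Mean |difference| per coefficient:", np.round(diff, 3))
```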

The presentation slides are available in the publications section of this website.