Last Wednesday (14/11/12) I made an announcement on the music-ir and public-lod mailing lists to engage the MIR and Linked Data research communities in the project activities. The music-ir list is the main mailing list for MIR researchers, while public-lod provides a discussion forum for members of the Linking Open Data project and the broader Linked Data community.

Last week we submitted a late-breaking session paper to the ISMIR post-conference abstracts, which will be published on the ISMIR website. The paper outlines the main goals of the SOVARR project in the wider context of research efforts at the Centre for Digital Music investigating the benefits of common data representations when dealing with large collections of media. The late-breaking session at ISMIR 2012 was held in collaboration with Sebastian Ewert of the Semantic Media project.

Quite a few papers related to the SOVARR project were presented at ISMIR 2012. Here are the ones that I found particularly interesting:

Towards a (Better) Definition of the Description of Annotated MIR Corpora. In this paper, Geoffroy Peeters and Karën Fort define a methodology for describing annotated corpora for MIR (e.g. evaluation data sets).

Last week we held the first meeting of Semantic Media, a new project related to SOVARR that explores how industry and universities can work together and addresses the challenge of time-based navigation in large collections of media. During the meeting I had a chance to talk to Edoardo Pignotti, who previously looked at how to create shared vocabularies for the social sciences in an e-Science context, and how to support users in describing research artefacts and activities in a structured way.

There has been tremendous work in the MIR community to create easy-to-use feature extraction tools (e.g. Marsyas, jMIR, MIR toolbox and Vamp plugins). However, it remains difficult to know whether a feature computed by one tool is the same as (or compatible or interchangeable with) a feature computed by another tool. Moreover, if different tools are used in the same experiment, their outputs typically need to be converted to some sort of common format, and for reproducibility this glue code has to evolve as the tools themselves change.
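To make the glue-code problem concrete, here is a minimal Python sketch of the kind of conversion layer described above. Everything in it is an assumption made for illustration: "tool_a" and "tool_b" are hypothetical extractors (not Marsyas, jMIR or any real tool), their output layouts are invented, and the SHARED_NAMES table is a stand-in for a shared vocabulary rather than anything SOVARR defines.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class FeatureFrame:
    """Common representation: one feature vector at one time instant."""
    feature: str                # term from the (hypothetical) shared vocabulary
    time_sec: float             # frame start time in seconds
    values: Tuple[float, ...]   # feature values for this frame


# Hypothetical mapping from tool-specific feature names to a shared term.
SHARED_NAMES = {
    ("tool_a", "MFCC"): "mfcc",
    ("tool_b", "mel_cepstrum"): "mfcc",
}


def from_tool_a(name: str,
                rows: Iterable[Tuple[float, Sequence[float]]]) -> List[FeatureFrame]:
    """Hypothetical tool A emits (time_in_seconds, values) pairs."""
    term = SHARED_NAMES[("tool_a", name)]
    return [FeatureFrame(term, float(t), tuple(v)) for t, v in rows]


def from_tool_b(name: str,
                frames: Iterable[Sequence[float]],
                hop_size: int,
                sample_rate: int) -> List[FeatureFrame]:
    """Hypothetical tool B emits bare frames; time is implied by the hop size."""
    term = SHARED_NAMES[("tool_b", name)]
    return [
        FeatureFrame(term, i * hop_size / sample_rate, tuple(f))
        for i, f in enumerate(frames)
    ]
```

Each adapter encodes assumptions about one tool's output (naming, time base, layout), which is exactly the part that has to be revisited whenever that tool changes. Note also that mapping two names to the same term only aligns labels and structure; it does not establish that the two tools compute numerically compatible features.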

We are pleased to announce the launch of the Shared Open Vocabulary for Audio Research and Retrieval (SOVARR) project as of October 1, 2012. We are hoping to engage the audio research community in our investigation and will publish regular updates and news related to the project on this website. A detailed description of the project can be found in the about section.