The Audio Feature Vocabulary

The Audio Feature Vocabulary defines terms for the tool- and task-specific ontologies and implements the model layer of the ontology framework. It is a cleaned-up version of the catalogue that lists only the features, without any of their properties, and consolidates many duplicated terms. This makes it possible to define tool- and task-specific feature implementations and leaves any categorisation or taxonomic organisation to be specified in the implementation layer.

The vocabulary also specifies computational workflow models for some of the features, which lower-level ontologies can link to. The workflow models are based on feature signatures as described by Dalibor Mitrovic, Matthias Zeppelzauer and Christian Breiteneder. A signature represents the mathematical operations employed in the feature extraction process, with each operation assigned a lexical symbol. Signatures offer a compact description of each feature and make it easier to compare features according to their extraction workflows. Converting the signatures into a linked data format for inclusion in the vocabulary involves defining a set of OWL classes that handle the representation and sequential nature of the calculations. The operations are implemented as sub-classes of three general classes: transformations, filters and aggregations. For each abstract feature, we define a model property whose range is the ComputationalModel class in the Audio Feature Ontology namespace. The operation sequence is attached to the model through its operation_sequence property, and each operation points to its successor through next_operation. For example, the signature of the Chromagram feature is "f F l Σ", which designates a sequence of (1) windowing (f), (2) Discrete Fourier Transform (F), (3) logarithm (l) and (4) sum (Σ). It is expressed as the following sequence of RDF statements:

afv:Chromagram a owl:Class ;
    afo:model afv:ChromagramModel ;
    rdfs:subClassOf afo:AudioFeature .

afv:ChromagramModel a afo:ComputationalModel ;
    afo:operation_sequence afv:Chromagram_operation_sequence_1 .

afv:Chromagram_operation_sequence_1 a afv:Windowing ;
    afo:next_operation afv:Chromagram_operation_sequence_2 .

afv:Chromagram_operation_sequence_2 a afv:DiscreteFourierTransform ;
    afo:next_operation afv:Chromagram_operation_sequence_3 .

afv:Chromagram_operation_sequence_3 a afv:Logarithm ;
    afo:next_operation afv:Chromagram_operation_sequence_4 .

afv:Chromagram_operation_sequence_4 a afo:LastOperation, afv:Sum .
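
The mapping from a signature to such a chain is mechanical, and a minimal Python sketch may make it concrete. The symbol table below covers only the four symbols named above, and the function name signature_to_turtle and the node-naming scheme are assumptions made for illustration, not part of the vocabulary's own tooling.

```python
# Hypothetical symbol table: only the four symbols named in the text.
SYMBOL_TO_CLASS = {
    "f": "afv:Windowing",
    "F": "afv:DiscreteFourierTransform",
    "l": "afv:Logarithm",
    "Σ": "afv:Sum",
}


def signature_to_turtle(feature, signature):
    """Render a space-separated signature as a chain of Turtle statements."""
    symbols = signature.split()
    if not symbols:
        raise ValueError("signature must contain at least one symbol")

    # Feature class and its computational model.
    lines = [
        f"afv:{feature} a owl:Class ;",
        f"    afo:model afv:{feature}Model ;",
        "    rdfs:subClassOf afo:AudioFeature .",
        "",
        f"afv:{feature}Model a afo:ComputationalModel ;",
        f"    afo:operation_sequence afv:{feature}_operation_sequence_1 .",
    ]

    # One node per operation, each linked to its successor.
    for i, symbol in enumerate(symbols, start=1):
        node = f"afv:{feature}_operation_sequence_{i}"
        op_class = SYMBOL_TO_CLASS[symbol]
        lines.append("")
        if i < len(symbols):
            successor = f"afv:{feature}_operation_sequence_{i + 1}"
            lines.append(f"{node} a {op_class} ;")
            lines.append(f"    afo:next_operation {successor} .")
        else:
            # The final node is additionally typed as the last operation.
            lines.append(f"{node} a afo:LastOperation, {op_class} .")
    return "\n".join(lines)


print(signature_to_turtle("Chromagram", "f F l Σ"))
```

Called with the Chromagram signature, the sketch prints statements of the same shape as the listing above.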

Representing each workflow as an explicit chain of typed operations makes it possible to write SPARQL queries that compare features across the vocabulary. As a simple example, the following query asks which features in the vocabulary employ the Discrete Cosine Transform:

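 # Features whose workflow includes a Discrete Cosine Transform step
 SELECT DISTINCT ?feature
 WHERE { 
    # any operation node typed as a Discrete Cosine Transform
    ?sequence rdf:type afv:DiscreteCosineTransform .
    # ?x precedes that node by one or more next_operation steps
    ?x afo:next_operation+ ?sequence .
    # if ?x heads a model's sequence, bind the feature that owns the model
    OPTIONAL { ?model afo:operation_sequence ?x .  ?feature afo:model ?model }
    # drops blank nodes and solutions where ?feature is unbound
    FILTER (!isBlank(?feature))
 }
 ORDER BY ?feature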

This query, when executed by an engine that implements the SPARQL 1.1 specification, produces the following result:

AutocorrelationMFCCs
BarkscaleFrequencyCepstralCoefficients
ModifiedGroupDelay
ModulationHarmonicCoefficients
NoiseRobustAuditoryFeature
PerceptualLinearPrediction
RelativeSpectralPLP

The query does not work under SPARQL 1.0, because the '+' operator (one or more repetitions of a property) belongs to property paths, which were introduced in SPARQL 1.1.
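
To make the role of the '+' operator concrete, the traversal can be imitated in plain Python over a toy set of triples. This is only a sketch: the helper names are hypothetical, the Chromagram chain is taken from the example above, and ex:ToyFeature is an invented feature that exists solely to give the search a match.

```python
def chain(feature, ops):
    """Build (subject, predicate, object) triples for one feature's workflow."""
    model = feature + "Model"
    nodes = [f"{feature}_operation_sequence_{i}" for i in range(1, len(ops) + 1)]
    triples = [
        (feature, "afo:model", model),
        (model, "afo:operation_sequence", nodes[0]),
    ]
    for i, (node, op) in enumerate(zip(nodes, ops)):
        triples.append((node, "rdf:type", op))
        if i + 1 < len(nodes):
            triples.append((node, "afo:next_operation", nodes[i + 1]))
    return triples


# Toy graph: Chromagram as in the text, plus one invented feature.
GRAPH = chain(
    "afv:Chromagram",
    ["afv:Windowing", "afv:DiscreteFourierTransform", "afv:Logarithm", "afv:Sum"],
) + chain(
    "ex:ToyFeature",  # hypothetical, not part of the vocabulary
    ["afv:Windowing", "afv:DiscreteFourierTransform", "afv:Logarithm",
     "afv:DiscreteCosineTransform"],
)


def objects(graph, subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def reachable(graph, start):
    """Nodes reached from start by ONE OR MORE next_operation steps ('+')."""
    found = set()
    frontier = objects(graph, start, "afo:next_operation")
    while frontier:
        node = frontier.pop()
        if node not in found:
            found.add(node)
            frontier.extend(objects(graph, node, "afo:next_operation"))
    return found


def features_using(graph, operation):
    """Features whose sequence reaches a node of the given operation type."""
    result = set()
    for feature, predicate, model in graph:
        if predicate != "afo:model":
            continue
        for head in objects(graph, model, "afo:operation_sequence"):
            for node in reachable(graph, head):
                if operation in objects(graph, node, "rdf:type"):
                    result.add(feature)
    return sorted(result)


print(features_using(GRAPH, "afv:DiscreteCosineTransform"))  # ['ex:ToyFeature']
print(features_using(GRAPH, "afv:Windowing"))  # []
```

Because '+' requires at least one step, a feature whose very first operation is the target is not matched, which the Windowing call illustrates. The '*' operator (zero or more steps) would include such features.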

The vocabulary is accessible in HTML, RDF, and N3: http://sovarr.c4dm.eecs.qmul.ac.uk/af/vocabulary/1.0#