nif:isString
|
-
For each audio recording in our dataset, we extract music descriptors in three steps: a) filtering out speech segments detected by a speech/music discriminator algorithm, b) extracting audio descriptors that capture aspects of music style, and c) applying feature learning to reduce dimensionality and project the recordings into a similarity space.
|
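The three steps can be sketched as follows. This is only an illustration under stated assumptions: each recording is assumed to be already available as a per-frame feature matrix with a boolean music/speech mask from some discriminator, all function names are hypothetical, and the mean/std summary and PCA are stand-ins for the actual descriptors and feature learning method, which the text does not specify.

```python
import numpy as np


def drop_speech_frames(frames, is_music):
    """Step (a): keep only the frames a discriminator flagged as music."""
    return frames[np.asarray(is_music, dtype=bool)]


def summarize_recording(frames):
    """Step (b): stand-in descriptor, per-dimension mean and std over frames."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])


def pca_project(descriptors, n_components):
    """Step (c): PCA via SVD, used here as one possible dimensionality reduction."""
    centered = descriptors - descriptors.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def build_similarity_space(recordings, music_masks, n_components=2):
    """Run steps (a) to (c) and return one low-dimensional point per recording."""
    descriptors = np.stack([
        summarize_recording(drop_speech_frames(frames, mask))
        for frames, mask in zip(recordings, music_masks)
    ])
    return pca_project(descriptors, n_components)
```

Each row of the returned array is one recording's position in the reduced space, so distances between rows can serve as a simple similarity measure. Every mask must keep at least one frame, otherwise the summary statistics are undefined.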