A list of interesting datasets

Audio classification

A few datasets useful for audio classification:

Google AudioSet (Google Machine Listening Group): “632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos”.
FSD (Freesound Dataset): “The AudioSet Ontology is a hierarchical collection of over 600 sound classes and we have filled them with 268,261 audio samples from Freesound.” Maintained by the Freesound team at the Music Technology Group (MTG) of Universitat Pompeu Fabra (UPF).
- See this Kaggle competition: https://www.kaggle.com/c/freesound-audio-tagging
- See also this Kaggle kernel, written for this dataset but applicable to audio data in general: https://www.kaggle.com/fizzbuzz/beginner-s-guide-to-audio-data
VocalSet (Wilkins et al. 2018): “A singing voice dataset consisting of 10.1 hours of monophonic recorded audio of professional singers demonstrating both standard and extended vocal techniques on all 5 vowels.”
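
The descriptions above quote clip lengths and total hours (10-second clips, 10.1 hours of recordings), so a quick sanity check after downloading any of these datasets is to measure how long the files actually are. Here is a minimal sketch using only the Python standard library. It assumes the files are uncompressed PCM WAV, which may not hold for every dataset listed here, and the helper names `wav_duration_seconds` and `make_test_tone` are made up for illustration.

```python
import io
import math
import struct
import wave


def wav_duration_seconds(source) -> float:
    """Return the duration in seconds of a PCM WAV file (path or file-like object)."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def make_test_tone(seconds: float = 1.0, rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV holding a 440 Hz sine (demo data only)."""
    n_frames = int(seconds * rate)
    samples = (
        int(32767 * 0.5 * math.sin(2 * math.pi * 440 * i / rate))
        for i in range(n_frames)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))
    buffer.seek(0)  # rewind so the buffer can be read back like a file
    return buffer


if __name__ == "__main__":
    # Stand-in for a real clip; replace with a path to a downloaded WAV file.
    clip = make_test_tone(seconds=10.0)
    print(f"duration: {wav_duration_seconds(clip):.2f} s")
```

With a real dataset you would call `wav_duration_seconds` on each file path and sum the results to compare against the advertised total.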