Additional Information Page for ICASSP 2017 Paper

This webpage contains additional information for our ICASSP 2017 paper, titled KNOWLEDGE TRANSFER FROM WEAKLY LABELED AUDIO USING CONVOLUTIONAL NEURAL NETWORK FOR SOUND EVENTS AND SCENES.

If you did not arrive at this page from the paper, it might be a good idea to read the paper first (pdf).

We provide additional results and analysis here.

This paper sets state-of-the-art results on the Audioset and ESC-50 datasets. On ESC-50, it achieves better-than-human accuracy. On Audioset, it sets the state of the art when training on the balanced training set.

Authors: Anurag Kumar, Maksim Khadkevich, Christian Fügen

Email: alnu AT andrew DOT cmu DOT edu, fugen AT fb DOT com


Code to Extract Features

Here is the code to extract features.

Audioset Dataset Results

Audioset [1] is a large-scale, weakly labeled [2] dataset for sound events. It contains a total of 527 sound events, for which labeled videos from YouTube are provided.

Click Here For More Details and Results on Audioset


Sound Event Classification (ESC-50) Results

Here, we show results on sound event classification using the proposed approaches to learn representations using \(\mathcal{N}_S\). The ESC-50 dataset [4] consists of a total of 50 different sound events.

Click Here For More Details and Results on ESC-50

Acoustic Scene Classification (DCASE-2016) Results

Click Here For More Details and Results on DCASE-2016

Semantic Understanding

Click Here For More Details on Semantic Understanding using our methods


References

[1] Jort F Gemmeke, Daniel PW Ellis, Dylan Freedman, Aren Jansen, Wade Lawrence, R Channing Moore, Manoj Plakal, and Marvin Ritter, “Audio Set: An ontology and human-labeled dataset for audio events,” in IEEE ICASSP, 2017.
[2] Anurag Kumar and Bhiksha Raj, “Audio event detection using weakly labeled data,” in ACM Multimedia (MM), 2016.
[3] Anurag Kumar and Bhiksha Raj, “Weakly supervised scalable audio content analysis,” in IEEE ICME, 2016.
[4] Karol J Piczak, “ESC: Dataset for environmental sound classification,” in Proceedings of the 23rd ACM Multimedia, ACM, 2015.
[5] Annamaria Mesaros, Toni Heittola, and Tuomas Virtanen, “TUT database for acoustic scene classification and sound event detection,” in EUSIPCO, 2016.


