KNOWLEDGE TRANSFER FROM WEAKLY LABELED AUDIO USING CONVOLUTIONAL NEURAL NETWORK FOR SOUND EVENTS AND SCENES
Authors: Anurag Kumar, Maksim Khadkevich, Christian Fügen
Figure: Average Precision (AP) of \(\mathcal{N}_S\) compared with that of \(\mathcal{N}_S^{slat}\). The index of sound events 1 to 50 is given below. The first column is the index used in the figure above, the second is the event ID (magenta) as used in the Audioset dataset, and the third (blue) is the sound event name.
1 /m/07s0s5r Strum
2 /m/042v_gx Acoustic guitar
3 /m/02hnl Drum kit
4 /m/02qldy Narration, monologue
5 /m/02mk9 Engine
6 /m/0l14md Percussion
7 /m/07gxw Techno
8 /m/05zppz Male speech, man speaking
9 /m/07s72n Dubstep
10 /m/068hy Domestic animals, pets
11 /m/026t6 Drum
12 /m/015p6 Bird
13 /t/dd00126 Inside, large room or hall
14 /m/07y_7 Violin, fiddle
15 /t/dd00128 Outside, urban or manmade
16 /t/dd00129 Outside, rural or natural
17 /m/02lkt Electronic music
18 /m/0jbk Animal
19 /m/0k4j Car
20 /m/015lz1 Singing
21 /m/0fx80y Plucked string instrument
22 /m/0342h Guitar
23 /t/dd00125 Inside, small room
24 /m/04szw Musical instrument
25 /m/07yv9 Vehicle
26 /m/09x0r Speech
27 /m/04rlf Music