KNOWLEDGE TRANSFER FROM WEAKLY LABELED AUDIO USING CONVOLUTIONAL NEURAL NETWORK FOR SOUND EVENTS AND SCENES
Authors: Anurag Kumar, Maksim Khadkevich, Christian Fügen
Localization of Sound Events
The figures below show some examples of sound event localization. Each figure plots how the output activation for the sound event of interest changes over time. Note how the activation (red line) rises when the event of interest is occurring. The background is the logmel spectrogram.
Fig 1. Sound event label: Whoosh-swoosh-swish. The red line shows the localization done by the network.
Fig 2. Sound event label: Sidetone. Sidetone is a very short event, whereas the segment size is 128 logmel frames (~1.5 s).
Fig 3. Sound event labels: Machine gun and Gunfire.
Fig 4. Sound event label: Sneeze.
Fig 5. Sound event labels: Horse, Neigh-whinny sound.
Fig 6. Sound event label: Bird.
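As a rough illustration of how an activation curve like the ones plotted here could be turned into time stamps, the sketch below maps per-segment activations to segment centre times and extracts the intervals where the activation exceeds a threshold. It is a minimal sketch, not the authors' procedure: only the segment size (128 logmel frames, ~1.5 s) comes from the text, while the hop between segments, the 0.5 threshold, and the function names `segment_centers` and `active_intervals` are assumptions made for the example.

```python
import numpy as np

SEGMENT_FRAMES = 128    # from the text: one segment spans 128 logmel frames
SEGMENT_SECONDS = 1.5   # from the text: approximate duration of one segment
# Assumption: frames are uniformly spaced, so the frame hop follows from the two values above.
FRAME_HOP_S = SEGMENT_SECONDS / SEGMENT_FRAMES


def segment_centers(num_segments, segment_hop_frames):
    """Centre time (in seconds) of each segment, assuming a fixed hop between segments."""
    starts = np.arange(num_segments) * segment_hop_frames
    return (starts + SEGMENT_FRAMES / 2) * FRAME_HOP_S


def active_intervals(activations, times, threshold=0.5):
    """Return (start_time, end_time) pairs for runs where activation >= threshold."""
    active = np.asarray(activations) >= threshold
    # Pad with False so every run has a detectable rising and falling edge.
    padded = np.concatenate(([False], active, [False]))
    edges = np.diff(padded.astype(int))
    run_starts = np.flatnonzero(edges == 1)
    run_ends = np.flatnonzero(edges == -1) - 1
    return [(float(times[s]), float(times[e])) for s, e in zip(run_starts, run_ends)]


# Hypothetical activations for one event class; the hop of 64 frames is an assumption.
times = segment_centers(num_segments=6, segment_hop_frames=64)
print(active_intervals([0.1, 0.2, 0.8, 0.9, 0.3, 0.7], times))
# [(2.25, 3.0), (4.5, 4.5)]
```

Because each activation summarizes a whole segment of about 1.5 s, intervals obtained this way are only as fine as the segment hop, which is worth keeping in mind for very short events such as Sidetone.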