Multimodal sound monitoring

- Devdoot Chatterjee

About me

Student at Delhi Technological University, Delhi, India.
Majoring in Mechanical Engineering, also pursuing a minor degree in Data Science.
Passionate about Machine Learning and Deep Learning.
Worked on perception algorithms for autonomous vehicles.

Acoustic Separator

Built a bioacoustic source separator that isolates orca vocalizations from background noise in hydrophone recordings.

Spleeter

Open-source audio source separation library.
Uses a U-Net architecture for source separation.
Outputs a mask that, when multiplied element-wise with the original spectrogram, yields the spectrogram of the isolated source.
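
The masking step can be illustrated with a minimal, dependency-free sketch. It is not Spleeter's own code: the function name `apply_mask` and all values are hypothetical, and the mask here is written by hand to stand in for the network output. In practice the spectrograms would be array types rather than nested lists.

```python
def apply_mask(spectrogram, mask):
    """Element-wise multiply a magnitude spectrogram by a soft mask.

    Both arguments are lists of rows (frequency bins x time frames);
    mask values are expected to lie in [0, 1].
    """
    if len(spectrogram) != len(mask):
        raise ValueError("spectrogram and mask must have the same shape")
    masked = []
    for spec_row, mask_row in zip(spectrogram, mask):
        if len(spec_row) != len(mask_row):
            raise ValueError("spectrogram and mask must have the same shape")
        masked.append([s * m for s, m in zip(spec_row, mask_row)])
    return masked


# Toy 2x3 mixture spectrogram (hypothetical values).
mixture = [[4.0, 2.0, 0.0],
           [1.0, 3.0, 5.0]]
# Hypothetical mask standing in for the U-Net output.
vocal_mask = [[1.0, 0.5, 0.0],
              [0.0, 0.5, 1.0]]

isolated = apply_mask(mixture, vocal_mask)
```

A mask value near 1 keeps a time-frequency bin for the target source, and a value near 0 suppresses it.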

Dataset preparation

The dataset used to train the model was extracted from the PodCast rounds.
The sound separation dataset was generated by randomly overlapping these orca vocalizations with background noise such as sea waves, ships, and boats.
Additionally, recordings tagged "squeaky" (sounds that resemble orca vocals) were retrieved through CosmoDB’s API to make the model more robust.
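
The random-overlap step might look something like the sketch below. This is an illustration under assumptions, not the project's actual pipeline: the function name `mix_at_random_offset`, the `noise_gain` parameter, and the sample values are all hypothetical, and clips are plain lists of samples at a shared sample rate.

```python
import random


def mix_at_random_offset(call, noise, rng, noise_gain=1.0):
    """Overlay a short vocalization clip on a longer noise clip.

    call, noise: lists of float samples at the same sample rate.
    Returns (mixture, call_track, noise_track), each as long as `noise`,
    so the two tracks can serve as separation targets for the mixture.
    """
    if len(call) > len(noise):
        raise ValueError("noise clip must be at least as long as the call")
    offset = rng.randint(0, len(noise) - len(call))  # bounds are inclusive
    call_track = [0.0] * len(noise)
    call_track[offset:offset + len(call)] = call  # place call at the offset
    noise_track = [noise_gain * n for n in noise]
    mixture = [c + n for c, n in zip(call_track, noise_track)]
    return mixture, call_track, noise_track


rng = random.Random(0)  # fixed seed so the example is repeatable
call = [0.5, -0.5, 0.5]  # hypothetical vocalization samples
noise = [rng.uniform(-0.1, 0.1) for _ in range(10)]  # hypothetical noise
mixture, call_track, noise_track = mix_at_random_offset(call, noise, rng)
```

Keeping the padded call and the scaled noise alongside the mixture gives each training example its two reference sources.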

Results

Evaluation Metrics	Value
Mean Absolute Difference	0.4626
Mean Absolute Difference (vocals)	0.234
Mean Absolute Difference (accompaniment)	0.2286
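
The slides do not define the metric. A plausible reading, assumed here, is the mean absolute error between estimated and reference spectrograms per source, with the overall figure being the sum of the two per-source values (consistent with 0.234 + 0.2286 = 0.4626). The sketch below uses a hypothetical function name and made-up toy values.

```python
def mean_absolute_difference(estimate, reference):
    """Average of |estimate - reference| over all spectrogram bins."""
    flat_est = [v for row in estimate for v in row]
    flat_ref = [v for row in reference for v in row]
    if not flat_est or len(flat_est) != len(flat_ref):
        raise ValueError("inputs must be non-empty and the same size")
    total = sum(abs(e - r) for e, r in zip(flat_est, flat_ref))
    return total / len(flat_est)


# Hypothetical 2x2 spectrograms for the two sources.
est_vocals = [[1.0, 2.0], [3.0, 4.0]]
ref_vocals = [[1.0, 1.0], [3.0, 2.0]]
est_accomp = [[0.0, 1.0], [1.0, 0.0]]
ref_accomp = [[0.0, 0.0], [0.0, 0.0]]

mad_vocals = mean_absolute_difference(est_vocals, ref_vocals)
mad_accomp = mean_absolute_difference(est_accomp, ref_accomp)
mad_total = mad_vocals + mad_accomp  # assumed: overall value is the sum
```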

GUI

Ambra, another GSoC volunteer, designed a GUI for pre-processing audio data.
The same GUI can also extract orca vocalizations from a hydrophone recording using the fine-tuned Spleeter model.

Future Work

Build a web app with Gradio that runs the source separator in real time.
Work on BioCPPNet, a lightweight deep learning architecture optimized for bioacoustic source separation.