1 of 8

Multimodal sound monitoring

- Devdoot Chatterjee

2 of 8

About me

  • Student at Delhi Technological University, Delhi, India.
  • Majoring in Mechanical Engineering, also pursuing a minor degree in Data Science.
  • Passionate about machine learning and deep learning.
  • Worked on perception algorithms for autonomous vehicles.

3 of 8

Acoustic Separator

Built a bioacoustic source separator that isolates orca vocalizations from background noise in hydrophone recordings.

4 of 8

Spleeter

  • Open-source audio source separation library.
  • Uses a U-Net architecture for source separation.
  • Outputs a mask that, when multiplied element-wise with the original spectrogram, yields the spectrogram of the isolated source.
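
The masking step can be illustrated with a small sketch. This is not Spleeter's own code: the function name `apply_mask`, the toy values, and the use of plain nested lists (rather than real spectrogram arrays) are assumptions made for illustration. It also assumes a soft mask with values in [0, 1] and a complementary mask for the background.

```python
def apply_mask(spectrogram, mask):
    """Multiply a magnitude spectrogram by a mask, element by element.

    Both arguments are lists of rows (frequency bins) of equal shape.
    Hypothetical helper for illustration, not part of Spleeter.
    """
    if len(spectrogram) != len(mask):
        raise ValueError("spectrogram and mask must have the same shape")
    masked = []
    for spec_row, mask_row in zip(spectrogram, mask):
        if len(spec_row) != len(mask_row):
            raise ValueError("spectrogram and mask must have the same shape")
        masked.append([s * m for s, m in zip(spec_row, mask_row)])
    return masked


# Toy 2x3 magnitude spectrogram of the mixture (made-up numbers).
mixture = [[4.0, 2.0, 0.0],
           [1.0, 3.0, 8.0]]

# Soft mask for the vocal source: 1.0 keeps a bin, 0.0 removes it.
vocal_mask = [[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.25]]

# Assumption: the background mask is the complement of the vocal mask.
background_mask = [[1.0 - m for m in row] for row in vocal_mask]

vocals = apply_mask(mixture, vocal_mask)
background = apply_mask(mixture, background_mask)
```

With complementary masks, the two estimated spectrograms add back up to the mixture in every bin, which is a convenient sanity check when debugging a separation pipeline.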

5 of 8

Dataset preparation

  • The dataset used to train the model was extracted from the PodCast rounds.
  • The sound separation dataset was generated by randomly overlapping these orca vocalizations with background noise such as sea waves, ships, and boats.
  • Additionally, used CosmoDB’s API to retrieve recordings tagged "squeaky" (sounds that resemble orca vocals) to make the model more robust.
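
A minimal sketch of the random-overlap idea follows. The actual mixing script is not shown in these slides, so the function name `mix_at_random_offset`, the `vocal_gain` parameter, and the toy sample values are all hypothetical. Clips are represented as plain lists of samples at the same sample rate.

```python
import random


def mix_at_random_offset(vocal, noise, rng, vocal_gain=1.0):
    """Overlay a vocalization onto a noise clip at a random position.

    Returns (mixture, vocal_track, offset), where vocal_track is the
    vocalization zero-padded to the length of the noise clip, so it can
    serve as the separation target for the mixture.
    Hypothetical helper for illustration only.
    """
    if len(vocal) > len(noise):
        raise ValueError("noise clip must be at least as long as the vocalization")
    # Pick where the vocalization starts inside the noise clip.
    offset = rng.randint(0, len(noise) - len(vocal))
    vocal_track = [0.0] * len(noise)
    for i, sample in enumerate(vocal):
        vocal_track[offset + i] = vocal_gain * sample
    # The mixture is the sample-wise sum of noise and the placed vocalization.
    mixture = [n + v for n, v in zip(noise, vocal_track)]
    return mixture, vocal_track, offset


rng = random.Random(0)          # seeded so the example is reproducible
vocal = [0.5, -0.5, 0.25]       # made-up orca call samples
noise = [0.125] * 8             # made-up background noise samples
mixture, vocal_track, offset = mix_at_random_offset(vocal, noise, rng)
```

Each call yields one training example: the mixture as model input, with the padded vocal track and the noise clip as the two target sources.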

6 of 8

Results

Evaluation Metric                           Value
Mean Absolute Difference                    0.4626
Mean Absolute Difference (vocals)           0.234
Mean Absolute Difference (accompaniment)    0.2286
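
For reference, a mean absolute difference between an estimated and a reference signal can be computed as in the sketch below. The slides do not say whether the metric was taken over waveforms or spectrograms, so the flat-list inputs and the function name are assumptions. Note that the overall value reported above equals the sum of the two per-stem values (0.234 + 0.2286 = 0.4626), which suggests it is a total over both stems rather than an average.

```python
def mean_absolute_difference(estimate, reference):
    """Average of |estimate[i] - reference[i]| over all positions.

    Hypothetical helper; inputs are equal-length, non-empty sequences.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    if not estimate:
        raise ValueError("inputs must not be empty")
    total = sum(abs(e - r) for e, r in zip(estimate, reference))
    return total / len(estimate)


# Made-up values: absolute differences are 0.25, 0.5, 0.0 and 0.25.
reference = [0.0, 1.0, 0.5, 0.25]
estimate = [0.25, 0.5, 0.5, 0.0]
mad = mean_absolute_difference(estimate, reference)  # 0.25
```

Lower values mean the estimated source is closer to the reference; a perfect separation would give 0.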

7 of 8

GUI

  • Ambra, another GSoC volunteer, designed a GUI for pre-processing audio data.
  • The same GUI can also extract orca vocalizations from a hydrophone recording using the fine-tuned Spleeter model.

8 of 8

Future Work

  • Build a web app (using Gradio) that runs the source separator in real time.
  • Work on BioCPPNet, a lightweight deep learning architecture optimized for bioacoustic source separation.