Hidden Echoes Survive Generative Audio Instrument Training
Chris Tralie, Ursinus College
Matt Amery, Create & Innovate UK
Ian Utz and Ben Douglas, Ursinus College
Motivation
Motivation: Where Is Training Data Used?
Example courtesy of Josh Brown in CS 372, spring 2023
https://ursinus-cs372-s2023.github.io/CoursePage/Assignments/HW6_StringAlong/
Engel, Jesse, Chenjie Gu, and Adam Roberts. "DDSP: Differentiable Digital Signal Processing." International Conference on Learning Representations. 2019.
Echo Hiding: A Simple Classical Idea
Gruhl, Daniel, Anthony Lu, and Walter Bender. "Echo hiding." Information Hiding: First International Workshop Cambridge, UK, May 30–June 1, 1996 Proceedings 1. Springer Berlin Heidelberg, 1996.
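The core idea of echo hiding can be sketched in a few lines: add a faint, delayed copy of the signal to itself. This is a minimal illustration, not the authors' code. The function name `add_echo`, the delay of 75 samples, and the strength `alpha=0.4` are assumptions chosen for the example.

```python
import numpy as np

def add_echo(x, delay, alpha=0.4):
    """Return y[n] = x[n] + alpha * x[n - delay], a single hidden echo.

    delay is in samples; alpha is the echo strength (both illustrative).
    """
    x = np.asarray(x, dtype=float)
    if not 0 < delay < len(x):
        raise ValueError("delay must be between 1 and len(x) - 1")
    y = x.copy()
    # Samples before `delay` have no earlier sample to echo, so they are unchanged.
    y[delay:] += alpha * x[:-delay]
    return y

# Stand-in for one second of audio: white noise at an assumed 44.1 kHz rate.
rng = np.random.default_rng(0)
x = rng.standard_normal(44100)
y = add_echo(x, delay=75, alpha=0.4)
```

At short delays like this the echo is typically perceived as a slight coloration of the sound rather than a distinct repeat, which is what makes it usable as a watermark.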
Uncovering Hidden Echoes via Cepstrum
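A single echo shows up as a peak in the real cepstrum (the inverse Fourier transform of the log magnitude spectrum) at the quefrency equal to the echo delay. The sketch below embeds an echo in white noise and recovers its delay. The helper name `real_cepstrum`, the search range of 25 to 200 samples, and the signal parameters are assumptions made for illustration.

```python
import numpy as np

def real_cepstrum(y):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.rfft(y)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small constant avoids log(0)
    return np.fft.irfft(log_mag, n=len(y))

# White noise stands in for audio; delay and alpha are illustrative.
rng = np.random.default_rng(0)
x = rng.standard_normal(2**15)
delay, alpha = 75, 0.4
y = x.copy()
y[delay:] += alpha * x[:-delay]

c = real_cepstrum(y)
lo, hi = 25, 200  # assumed search range; skips the low-quefrency spectral envelope
detected = lo + int(np.argmax(c[lo:hi]))
```

Here `detected` should match the embedded delay, because the echo contributes a peak of roughly `alpha / 2` at that quefrency, well above the noise floor of the cepstrum.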
Examples: Watermarking RAVE
[1] Caillon, Antoine, and Philippe Esling. "RAVE: A variational autoencoder for fast and high-quality neural audio synthesis." arXiv preprint arXiv:2111.05011 (2021).
[2] Xi, Qingyang, et al. "GuitarSet: A Dataset for Guitar Transcription." ISMIR. 2018.
Panels: Clean, 50, 75, 100
Examples: Watermarking Dance Diffusion
[3] Evans, Z. 2022. Dance Diffusion. https://github.com/harmonai-org/sample-generator.
[4] Gillick, J.; Roberts, A.; Engel, J.; Eck, D.; and Bamman, D. 2019. Learning to Groove with Inverse Sequence Transformations. In International Conference on Machine Learning (ICML).
Panels: Clean, 50, 75, 100
Single Echo Results
Pseudorandom Time-Spread Echo Patterns
Ko, B.-S.; Nishimura, R.; and Suzuki, Y. 2005. Time-spread echo method for digital audio watermarking. IEEE Transactions on Multimedia, 7(2): 212–221.
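A time-spread echo replaces the single delayed copy with many very faint echoes whose signs follow a pseudorandom (PN) key. Detection correlates the cepstrum with that key. The sketch below is a simplified reading of this idea, not the authors' implementation. The names `embed_time_spread_echo` and `decode_score`, the ±1 key, the key length of 1024, the offset of 75 samples, and `alpha=0.01` are all assumptions.

```python
import numpy as np

def real_cepstrum(y):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.rfft(y)) + 1e-12)
    return np.fft.irfft(log_mag, n=len(y))

def embed_time_spread_echo(x, pn, offset, alpha):
    """Convolve x with the kernel delta[n] + alpha * pn[n - offset]."""
    kernel = np.zeros(offset + len(pn))
    kernel[0] = 1.0
    kernel[offset:] = alpha * pn
    return np.convolve(x, kernel)[:len(x)]  # truncate to the original length

def decode_score(y, pn, offset):
    """Correlate the cepstrum with a candidate key at the given offset."""
    c = real_cepstrum(y)
    return float(np.dot(c[offset:offset + len(pn)], pn))

rng = np.random.default_rng(0)
x = rng.standard_normal(2**16)               # white noise stands in for audio
key = rng.choice([-1.0, 1.0], size=1024)     # the embedded PN key (assumed ±1)
wrong_key = rng.choice([-1.0, 1.0], size=1024)

y = embed_time_spread_echo(x, key, offset=75, alpha=0.01)
score_right = decode_score(y, key, offset=75)
score_wrong = decode_score(y, wrong_key, offset=75)
```

Each individual echo is far too weak to detect on its own, but summing over the whole key gives the correct key a much larger score than an unrelated one. A longer key accumulates more evidence, at the cost of spreading the echo over more samples.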
Longer durations are more robust
Pseudorandom Time-Spread Echo Patterns on RAVE and DDSP
Area under ROC curves
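For reference, the area under an ROC curve equals the probability that a randomly chosen positive example (for instance, a score computed with the correct key) outranks a randomly chosen negative one. The sketch below computes it directly from that definition. The function name `auroc` and the toy scores are made up for illustration and are not results from this work.

```python
import numpy as np

def auroc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise ranking; ties count as half."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Toy scores only: 8 of the 9 positive/negative pairs are ranked correctly.
auc_example = auroc([3.1, 2.4, 0.9], [1.0, 0.2, -0.5])
```

A value of 1.0 means the two groups of scores are perfectly separated, and 0.5 means the detector does no better than chance.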
Single Echoes Survive Pitch Shift Data Augmentation
Z-scores generally decrease as the probability of pitch augmentation increases, though the echoes remain detectable even at high augmentation rates.
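The slides do not spell out how the z-score is computed. One plausible reading, shown here as an assumption, is to compare the cepstrum value at the expected echo delay against the mean and standard deviation of the cepstrum at nearby quefrencies. The name `echo_z_score` and the window of 25 to 200 samples are hypothetical.

```python
import numpy as np

def real_cepstrum(y):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.rfft(y)) + 1e-12)
    return np.fft.irfft(log_mag, n=len(y))

def echo_z_score(y, delay, lo=25, hi=200):
    """Z-score of the cepstrum at `delay` relative to the rest of [lo, hi)."""
    c = real_cepstrum(y)
    # Exclude the tested quefrency itself from the background statistics.
    others = np.delete(c[lo:hi], delay - lo)
    return float((c[delay] - others.mean()) / others.std())

rng = np.random.default_rng(0)
clean = rng.standard_normal(2**15)   # white noise stands in for audio
marked = clean.copy()
marked[75:] += 0.4 * clean[:-75]     # illustrative echo: delay 75, alpha 0.4

z_marked = echo_z_score(marked, delay=75)
z_clean = echo_z_score(clean, delay=75)
```

Under this definition a clean signal gives a z-score near zero, while a watermarked one gives a large positive value. Weaker echoes, such as those that survive heavy augmentation, would land somewhere in between.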
Mixed Echoes Can Be Demixed Using Demucs
Défossez, A.; Usunier, N.; Bottou, L.; and Bach, F. 2019. Music Source Separation in the Waveform Domain. arXiv preprint arXiv:1911.13254.
Rafii, Z.; Liutkus, A.; Stöter, F.-R.; Mimilakis, S. I.; and Bittner, R. 2019. MUSDB18-HQ - an uncompressed version of MUSDB18.
Original mix from Demucs
Mixed RAVE style transfer on individual tracks
Next Steps
RAVE VocalSet male/female
Dance Diffusion Fine-Tuning
Special Thanks to Bill Mongan and Leslie New
For letting me run computers in their offices every day for 5 months straight…
Code, Supplementary Material, Etc.