1 of 53

Exploring the hidden potential of sound data

👩🏻 Charlie Gerard

👩🏻‍💻 @devdevcharlie

2 of 53

3 of 53

9 of 53

LOCATION: Restaurant, Home

ACTIVITY: Eating, Eating, Cooking

SUB-ACTIVITY: Breakfast / Lunch / Dinner, Breakfast / Lunch / Dinner

10 of 53

Charlie Gerard

Front-end developer

Google Developer Expert & Mozilla Tech Speaker

11 of 53

ACOUSTIC ACTIVITY RECOGNITION

12 of 53

ACOUSTIC ACTIVITY RECOGNITION

Using the rich properties of sound to gain insights about an activity or environment

13 of 53

14 of 53

15 of 53

Web audio API

16 of 53

Visualizations

Waveform / Oscilloscope, Frequency bar graph

https://webaudioapi.com/samples/visualizer/
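A minimal sketch of how the numbers behind these two visualizations might be read, assuming the Web Audio API's `AnalyserNode`. The helper names `readFrames` and `startMicrophoneAnalyser` are hypothetical and not from the talk, and the `fftSize` of 1024 is an arbitrary choice.

```javascript
// Hypothetical helper: reads one frame of waveform and frequency data from an
// AnalyserNode (or any object exposing the same members).
function readFrames(analyser) {
  const waveform = new Uint8Array(analyser.fftSize);
  const frequencies = new Uint8Array(analyser.frequencyBinCount);
  analyser.getByteTimeDomainData(waveform);   // data for the waveform / oscilloscope
  analyser.getByteFrequencyData(frequencies); // data for the frequency bar graph
  return {
    waveform: Array.from(waveform),
    frequencies: Array.from(frequencies),
  };
}

// Browser-only: connects the microphone to an analyser and reports a frame
// on every animation frame. Nothing runs until this function is called.
async function startMicrophoneAnalyser(onFrame) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const audioContext = new AudioContext();
  const analyser = audioContext.createAnalyser();
  analyser.fftSize = 1024; // arbitrary choice; yields 512 frequency bins
  audioContext.createMediaStreamSource(stream).connect(analyser);

  const tick = () => {
    onFrame(readFrames(analyser));
    requestAnimationFrame(tick);
  };
  tick();
}
```

Each value is a byte between 0 and 255, so a frequency frame is a plain array of small integers that can be drawn as bars or stored for later.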

20 of 53

Spectrogram

  • Time

  • Frequencies

  • Amplitude
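One way to picture these three dimensions in code is a 2D array where each row is a moment in time, each column is a frequency bin, and each value is an amplitude. This is only a sketch: `pushFrame` and `amplitudeToColor` are hypothetical helpers, and the greyscale mapping is an arbitrary illustration rather than the palette used in the talk.

```javascript
// Hypothetical helper: keeps a rolling window of frequency frames.
// Rows are time steps, columns are frequency bins, values are amplitudes (0-255).
function pushFrame(spectrogram, frame, maxFrames) {
  const next = [...spectrogram, frame];
  return next.length > maxFrames ? next.slice(next.length - maxFrames) : next;
}

// Hypothetical helper: maps an amplitude (0-255) to a CSS colour so that
// louder frequency bins are drawn brighter.
function amplitudeToColor(amplitude) {
  const lightness = Math.round((amplitude / 255) * 100);
  return `hsl(0, 0%, ${lightness}%)`;
}

let spectrogram = [];
spectrogram = pushFrame(spectrogram, [204, 10, 34], 2);
spectrogram = pushFrame(spectrogram, [11, 0, 3], 2);
spectrogram = pushFrame(spectrogram, [56, 78, 23], 2);
// Only the two most recent frames are kept: [[11, 0, 3], [56, 78, 23]]
```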

21 of 53

22 of 53

23 of 53

29 of 53

Collecting data

[
  [204, 10, 34, 11, 0, 3, 56, 78, 23, 89, 56, 67, …],
  [204, 10, 34, 11, 0, 3, 56, 78, 23, 89, 56, 67, …],
  [204, 10, 34, 11, 0, 3, 56, 78, 23, 89, 56, 67, …],
  [204, 10, 34, 11, 0, 3, 56, 78, 23, 89, 56, 67, …],
  ...
]

Data transformation

[
  {
    label: 0,
    features: [
      [204, 10, 34, 11, 0, 3, 56, 78, 23, 89, 56, 67, …],
      [204, 10, 34, 11, 0, 3, 56, 78, 23, 89, 56, 67, …],
      …
    ]
  },
  {
    label: 1,
    features: [
      [204, 10, 34, 11, 0, 3, 56, 78, 23, 89, 56, 67, …],
      [204, 10, 34, 11, 0, 3, 56, 78, 23, 89, 56, 67, …],
      …
    ]
  },
  ...
]

Tensors

// labels
[
  [0,0,0,0,0],
  [1,1,1,1,1],
  …
]

// features
[
  [
    [204, 10, …],
    [25, 45, …],
    …
  ],
  [
    [45, 37, …],
    [23, 67, …],
    …
  ],
  ...
]

Algorithm

Training

Output / prediction
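The data transformation step on this slide can be sketched in plain JavaScript. The helpers `toExample` and `splitExamples` are hypothetical names, and the sketch assumes each label is repeated once per recorded frame, which is one reading of the `labels` array shown on the slide.

```javascript
// Hypothetical helper: attaches a numeric label to the frames recorded
// for one activity.
function toExample(label, frames) {
  return { label, features: frames };
}

// Hypothetical helper: splits labelled examples into parallel `labels` and
// `features` arrays, repeating each label once per recorded frame.
function splitExamples(examples) {
  return {
    labels: examples.map(({ label, features }) => features.map(() => label)),
    features: examples.map(({ features }) => features),
  };
}

const examples = [
  toExample(0, [[204, 10], [25, 45]]),
  toExample(1, [[45, 37], [23, 67]]),
];

const { labels, features } = splitExamples(examples);
// labels   -> [[0, 0], [1, 1]]
// features -> [[[204, 10], [25, 45]], [[45, 37], [23, 67]]]
```

From there, rectangular arrays like these could be handed to TensorFlow.js (for example with `tf.tensor`) before training.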

30 of 53

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.3.1/dist/tf.min.js"></script>

<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/speech-commands@0.4.0/dist/speech-commands.min.js"></script>

31 of 53

// Shared state, declared here so setupModel does not create implicit globals
let model;
let predictionCallback;

async function setupModel(URL, predictionCB) {
  // store the prediction callback function
  predictionCallback = predictionCB;

  const modelURL = `${URL}/model.json`;
  const metadataURL = `${URL}/metadata.json`;

  // build a recognizer from the custom model and its metadata
  model = window.speechCommands.create('BROWSER_FFT', undefined, modelURL, metadataURL);
  await model.ensureModelLoaded();

  const modelParameters = {
    invokeCallbackOnNoiseAndUnknown: true, // run even when only background noise is detected
    includeSpectrogram: true, // give us access to numerical audio data
    overlapFactor: 0.5 // overlap between consecutive audio windows; with a window of about one second, 0.5 gives roughly two predictions per second
  };

  model.listen(
    // This callback function is invoked each time the model has a prediction.
    prediction => {
      predictionCallback(prediction.scores);
    },
    modelParameters
  );
}
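The `overlapFactor` comment can be checked with a little arithmetic. The sketch below assumes the behaviour described in the speech-commands README, where a prediction happens every `windowMs * (1 - overlapFactor)` milliseconds. `predictionIntervalMs` is a hypothetical helper, and the 1000 ms window length is an assumption made for illustration.

```javascript
// Hypothetical helper: time between two predictions, given the length of one
// audio window and how much consecutive windows overlap.
function predictionIntervalMs(windowMs, overlapFactor) {
  if (overlapFactor < 0 || overlapFactor >= 1) {
    throw new RangeError("overlapFactor must be >= 0 and < 1");
  }
  return windowMs * (1 - overlapFactor);
}

// Assumption: each audio window is about 1000 ms long.
const halfOverlap = predictionIntervalMs(1000, 0.5);     // 500 ms, about twice per second
const quarterOverlap = predictionIntervalMs(1000, 0.25); // 750 ms
```

A higher overlap gives more frequent predictions at the cost of more computation.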

35 of 53

const labels = ["Clapping", "Speaking", "_background_noise_"];
let currentPrediction = "";

// URL is the base URL of the hosted model, assumed to be defined elsewhere
setupModel(URL, data => {
  // data will look like this [0.87689, 0.21456, 0.56789]
  switch (Math.max(...data)) {
    case data[0]:
      currentPrediction = labels[0];
      break;
    case data[1]:
      currentPrediction = labels[1];
      break;
    default:
      // background noise has the highest score: no activity detected
      currentPrediction = "";
      break;
  }
  return currentPrediction;
});
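The `switch` above handles two labels by hand. A more general sketch, assuming the scores arrive in the same order as the labels, picks the index of the highest score and treats background noise as no prediction. `pickLabel`, `ignoredLabels` and `activityLabels` are hypothetical names, not part of the talk's code.

```javascript
// Hypothetical helper: returns the label with the highest score, or an empty
// string when there are no scores or the best label is in the ignored list.
function pickLabel(scores, labels, ignoredLabels = ["_background_noise_"]) {
  if (scores.length === 0) return "";
  const bestIndex = scores.indexOf(Math.max(...scores));
  const label = labels[bestIndex];
  if (label === undefined || ignoredLabels.includes(label)) return "";
  return label;
}

const activityLabels = ["Clapping", "Speaking", "_background_noise_"];

const clapping = pickLabel([0.87689, 0.21456, 0.56789], activityLabels); // "Clapping"
const nothing = pickLabel([0.1, 0.2, 0.7], activityLabels);              // ""
```

Adding a new activity then only requires extending the labels array, not the selection logic.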

36 of 53

DEMO

37 of 53

Acoustic-ml.netlify.com

(⚠️ Early prototype optimised for Chrome desktop)

38 of 53

39 of 53

BENEFITS

40 of 53

1 sensor to rule them all

41 of 53

APPLICATIONS

44 of 53

Applications

  • Smart home / office

  • Interactive storytelling

  • Health tracking

45 of 53

Applications

Automatic YouTube video captions of sound effects

46 of 53

Applications

47 of 53

LIMITATIONS

48 of 53

Limitations

  • Need lots of (good) samples

  • Model file size can be very large

  • Systems can be fooled

  • Privacy concerns

  • Not always able to understand multiple activities at once

49 of 53

General purpose synthetic sensors

50 of 53

51 of 53

RESOURCES

52 of 53

Resources

53 of 53

♥️

THANK YOU!
