Progress Update - 20th April
Group 1 - Emotional Speech Synthesis using HMMs
Summary of updates as per previous timeline
DL Approach I - Vanilla Tacotron Fine-tuning
What we have done/are doing:
Why are we doing this?
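In outline, the fine-tuning setup looks like the sketch below: start from a pretrained Tacotron checkpoint and keep training on the emotional recordings at a reduced learning rate. The module names (model.Tacotron, data.EmotionalSpeechSet), file names, and hyperparameters are illustrative placeholders, not our actual code.

# Illustrative fine-tuning loop; Tacotron and EmotionalSpeechSet are
# hypothetical stand-ins for the real model and dataset classes.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

from model import Tacotron             # hypothetical Tacotron implementation
from data import EmotionalSpeechSet    # hypothetical (text, mel) dataset over the emotional corpus

model = Tacotron()
ckpt = torch.load("pretrained_tacotron.pt", map_location="cpu")   # placeholder checkpoint name
model.load_state_dict(ckpt["state_dict"])

# Continue training on emotional speech with a reduced learning rate so the
# pretrained attention behaviour is not destroyed.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
loader = DataLoader(EmotionalSpeechSet("emov_db/"), batch_size=16, shuffle=True)

model.train()
for step, (text, mel) in enumerate(loader, start=1):
    optimizer.zero_grad()
    mel_pred, alignment = model(text, mel)     # teacher-forced decoding
    loss = F.l1_loss(mel_pred, mel)
    loss.backward()
    optimizer.step()
    if step % 1000 == 0:                       # snapshot every 1k iterations
        torch.save({"state_dict": model.state_dict()}, f"finetuned_{step}.pt")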
Vanilla Tacotron FT Results
After 1k iters.
After 12k iters.
After 25k iters.
After 37k iters.
After 50k iters.
No fine-tuning
Observations
DL Approach II - DCTTS
DL Approach II - DCTTS : EMOV-DB Dataset
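As a rough illustration of how the corpus is indexed for training, the sketch below builds (wav, emotion) pairs from a local EMOV-DB copy; the directory layout and emotion keywords are assumptions and have to be adapted to the actual download.

# Sketch of indexing a local EMOV-DB copy into (wav_path, emotion) pairs.
# EMOV-DB covers neutral, amused, angry, sleepy, and disgusted speech; the
# keyword strings and folder layout below are assumptions, not the real naming.
from pathlib import Path
from collections import Counter

EMOTIONS = ("neutral", "amused", "anger", "sleep", "disgust")

def build_filelist(root="emov_db"):
    items = []
    for wav in sorted(Path(root).rglob("*.wav")):
        name = wav.name.lower()
        for emotion in EMOTIONS:
            if emotion in name:
                items.append((str(wav), emotion))
                break
    return items

if __name__ == "__main__":
    # Example: count recordings per emotion class.
    print(Counter(emotion for _, emotion in build_filelist()))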
DL Approach II - DCTTS : DCTTS Model
Implementation details -
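For reference, a minimal NumPy sketch of the guided attention loss defined in the DCTTS paper (Tachibana et al., 2017), which pushes the text-to-frame attention towards a monotonic, near-diagonal alignment. The value g = 0.2 follows the paper; this is a reference sketch, not necessarily how our DCTTS code implements it.

# Guided attention loss from the DCTTS paper: penalise attention weight that
# falls far from the diagonal of the (text positions x mel frames) matrix.
import numpy as np

def guided_attention_weights(N, T, g=0.2):
    """W[n, t] = 1 - exp(-((n/N - t/T)**2) / (2 * g**2))."""
    n = np.arange(N).reshape(-1, 1) / N
    t = np.arange(T).reshape(1, -1) / T
    return 1.0 - np.exp(-((n - t) ** 2) / (2.0 * g ** 2))

def guided_attention_loss(A, g=0.2):
    """Mean of A * W over an (N, T) attention matrix A."""
    W = guided_attention_weights(*A.shape, g=g)
    return float(np.mean(A * W))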
Pre-training results -
Fine-tuning results - No audio is being generated (the 5-second output file is silent), although the attention alignment looks near perfect; a quick sanity check is sketched below.
Attention matrix for the sentence: "I wrote this sentence myself to test whether it works."
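One way to debug the silent output, assuming the pipeline inverts the SSRN magnitude spectrogram with Griffin-Lim as in the reference DCTTS setup (all names below are illustrative): if the predicted magnitudes are essentially zero, the inverted waveform will be silent no matter how clean the attention looks.

# Sanity check on the predicted spectrogram before and after Griffin-Lim
# inversion; `mag` is assumed to be the linear-magnitude spectrogram from SSRN.
import numpy as np
import librosa
import soundfile as sf

def inspect_and_invert(mag, hop_length=256, sr=22050, out_path="check.wav"):
    """mag: predicted magnitude spectrogram, shape (1 + n_fft // 2, frames)."""
    print("magnitude  min=%.5f  max=%.5f  mean=%.5f"
          % (mag.min(), mag.max(), mag.mean()))
    wav = librosa.griffinlim(mag, n_iter=60, hop_length=hop_length)
    print("waveform peak amplitude: %.6f" % np.max(np.abs(wav)))
    sf.write(out_path, wav, sr)
    return wav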
“Hello and welcome to the winter offering of speech recognition and understanding.”
“Hi my name is Brihi and I am working on this repository as a part of my course project.”
More samples available here
DL Approaches for TTS
Challenges
Next steps:
HMM-based Speech Synthesis
Demo!
Next Steps
Challenges
Thank you!
All samples today are available here: https://drive.google.com/open?id=1TVfzep6fqF7d9FkQR23dcNINp5TCy6Zh