MCD DTW tutorial #96

styagi130 · 2022-11-02T16:05:31Z

MCD tutorial

tts-evaluation-MCD-DTW.ipynb

blisc · 2022-11-04T15:15:00Z

tts-evaluation-MCD-DTW.ipynb

+    "## Conclusion\n",
+    "<img src=\"imgs/riva-tts-MCD_DTW_final_comparision.jpeg\">\n",
+    "\n",
+    "From the graph above the value of MCD is greater for radtts audios than radtts audios, this is also reflected in the average MCD value for both models. Therefore we can conclude that fastpitch has better convergence than radtts. However we cannot evaluate the quality of audios generated by these models using MCD. MCD is a great tool for testing model convergence, but generated audios may have pronunciation and quality artefacts. Therefore MCD evaluation should be followed by a MOS(Mean opinion score) and CMOS(Comparative mean opinion scores) evaluation."


Can you clarify this sentence?
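For readers following this thread, the quantity the quoted conclusion compares can be sketched as follows. This is a minimal pure-Python illustration of mel cepstral distortion averaged along a dynamic-time-warping path, not the notebook's implementation. The function names (`frame_distance`, `dtw_path`, `mcd_dtw`) are hypothetical, the scaling constant is the commonly used 10/ln(10) * sqrt(2), and whether the 0th coefficient is dropped before calling is left to the caller.

```python
import math

# Commonly used MCD scaling constant (assumed here; the notebook may differ).
MCD_CONST = 10.0 / math.log(10.0) * math.sqrt(2.0)


def frame_distance(a, b):
    """Euclidean distance between two MFCC frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_path(ref, syn):
    """Return the DTW alignment between two frame sequences as (i, j) pairs."""
    n, m = len(ref), len(syn)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(ref[i - 1], syn[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return path


def mcd_dtw(ref, syn):
    """Mean MCD (in dB) over the frames paired by the DTW alignment."""
    path = dtw_path(ref, syn)
    total = sum(frame_distance(ref[i], syn[j]) for i, j in path)
    return MCD_CONST * total / len(path)


# Illustrative one-dimensional "MFCC" frames; syn repeats the first frame.
ref = [[0.0], [1.0]]
syn = [[0.0], [0.0], [1.0]]
score = mcd_dtw(ref, syn)
```

A lower value means the synthesized frames sit closer to the reference frames after alignment. As the quoted conclusion itself says, that is not a measure of perceived quality.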

tts-evaluation-MCD-DTW.ipynb

blisc · 2022-11-08T15:41:14Z

tts-evaluation-MCD-DTW.ipynb

+    "sr = 22050\n",
+    "\n",
+    "## Mfcc params\n",
+    "n_mfcc=n_mels"


Can you set n_mfcc to 34?

blisc · 2022-11-08T15:41:34Z

tts-evaluation-MCD-DTW.ipynb

+   "source": [
+    "def mel2mfcc(mels):\n",
+    "    mfcc = librosa.feature.mfcc(S=mels, n_mfcc=n_mfcc)\n",
+    "    mfcc = librosa.power_to_db(mfcc, ref=np.max)\n",


Drop the power_to_db
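One plausible reading of this request, offered as an assumption rather than the reviewer's stated reasoning: MFCCs come from a discrete cosine transform of a log-mel spectrogram, so they are signed values, while `power_to_db` is meant for non-negative power values. The sketch below uses only the standard library to show that an orthonormal DCT-II of a log-mel frame produces negative coefficients. The helper `dct2_ortho` and the frame values are hypothetical and purely illustrative.

```python
import math


def dct2_ortho(frame, n_coeffs):
    """Orthonormal DCT-II of one frame, keeping the first n_coeffs outputs."""
    n = len(frame)
    coeffs = []
    for k in range(n_coeffs):
        total = sum(
            x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, x in enumerate(frame)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


# Hypothetical log-mel frame in dB (illustrative values only).
log_mel_frame = [-80.0, -60.0, -40.0, -20.0]
mfcc_frame = dct2_ortho(log_mel_frame, n_coeffs=3)

# Cepstral coefficients are signed, so a power-to-dB conversion does not apply.
has_negative = any(c < 0 for c in mfcc_frame)
```

Under that reading, `mel2mfcc` would simply return the output of the MFCC step without further conversion.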

tts-evaluation-MCD-DTW.ipynb

blisc · 2022-11-08T15:46:40Z

tts-evaluation-MCD-DTW.ipynb

@@ -0,0 +1,495 @@
+{


How did you get the mel spectrograms from the models?

For radTTS I used: https://github.com/NVIDIA/radtts
For fastpitch, I used the method described here: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/tts_en_fastpitch

Do you have a code snippet that you can add to the notebook?

I can add one for FastPitch. For RadTTS I had to make some changes to their inference script, so that won't be possible in the notebook.

siddhartht130 added 2 commits November 2, 2022 20:00

MCD DTW tutorial

61ccefa

Ljspeech audio samples and images

311ee56

styagi130 marked this pull request as ready for review November 2, 2022 17:31

blisc reviewed Nov 4, 2022


siddhartht130 added 4 commits November 7, 2022 12:34

Fixed rendering

11c1843

added mels specs from models

b013b85

Moved to processing on mels from audios

c145503

Fixed typos

ea2e62b

blisc reviewed Nov 8, 2022


siddhartht130 added 2 commits November 9, 2022 13:17

Removed extra audios

e13ec60

Generated mels for gt

c9a7d49

redoctopus mentioned this pull request Jan 11, 2023

Port Riva's mel cepstral distortion w/ dynamic time warping notebook NVIDIA/NeMo#5778

Merged

3 tasks
