A way to execute hydra_train.py for Ksponspeech #191

taejin0128 · 2023-01-11T04:28:16Z

❓ Questions & Help

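Is it possible to build an STT model that transcribes bilingual (code-switched) speech, keeping each word in its own language? For example, converting human speech to text such as '현재 비행기가 Turbulence로 인해 흔들리고 있습니다.' ("The plane is currently shaking due to turbulence."), where the Korean words are written in Korean and the English word stays in English.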

1-1) If it is possible, how can I train such a model?
If I want to train on Korean data from a specific field, which of the following is the usual way to train additional data on top of a pre-trained model?
Method 1.
step 1) Train a model on the Ksponspeech data
step 2) Continue training the model from step 1 on the Korean data from the specific field

Method 2. (I think this takes too much effort and time, because the data has to be transformed so that it can be trained together)
step 1) Build a combined dictionary (vocabulary) for the Ksponspeech data and the other Korean data
step 2) Train on all the data (Ksponspeech & the other Korean data) together
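
To make step 1 of Method 2 concrete, this is roughly what I have in mind: a minimal sketch in plain Python, not the project's own preprocessing code. The function names, the special tokens, and the sample transcripts are all my own assumptions for illustration; the real vocabulary format expected by hydra_train.py may differ.

```python
from collections import Counter

# Assumed special tokens; the actual set used by the training code may differ.
SPECIAL_TOKENS = ["<pad>", "<sos>", "<eos>", "<unk>"]


def build_char_vocab(*corpora, min_count=1):
    """Build one character-level vocabulary shared by several transcript lists."""
    counts = Counter()
    for transcripts in corpora:
        for line in transcripts:
            # Count every character of the transcript, including inner spaces.
            counts.update(line.strip())
    chars = sorted(ch for ch, n in counts.items() if n >= min_count)
    return {tok: idx for idx, tok in enumerate(SPECIAL_TOKENS + chars)}


def encode(text, vocab):
    """Map a transcript to label ids, falling back to <unk> for unseen characters."""
    unk = vocab["<unk>"]
    return [vocab.get(ch, unk) for ch in text]


# Hypothetical sample transcripts standing in for the two datasets.
ksponspeech = ["안녕하세요", "비행기가 흔들리고 있습니다"]
domain = ["Turbulence로 인해 흔들리고 있습니다"]

vocab = build_char_vocab(ksponspeech, domain)
labels = encode("Turbulence로 인해", vocab)
```

With a shared vocabulary like this, both datasets map to the same label ids, so the Latin characters in words such as "Turbulence" are no longer out-of-vocabulary.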
To improve recognition of field-specific terminology, is it appropriate to train additional data with the "subword" option on top of a pre-trained model that was trained on the Ksponspeech data in "character" mode?

taejin0128 changed the title ~~The way to execute hydra_train.py for Ksponspeech~~ A way to execute hydra_train.py for Ksponspeech Jan 15, 2023