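Is it possible to build an STT model that transcribes bilingual (code-switched) speech into text that keeps both languages? For example, transcribing speech as '현재 비행기가 Turbulence로 인해 흔들리고 있습니다.' ("The plane is currently shaking due to turbulence."), where the Korean words and the English word "Turbulence" each stay in their own script.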
1-1) If it is possible, how can I train the model?
If I want to train on Korean-language data from a specific field, which of the following is the common way to train a pre-trained model on additional data?
Method 1.
step 1) Train a model on the Ksponspeech data
step 2) Continue training the model from step 1 on the Korean-language data for the specific field
Method 2. (I think this takes too much effort and time, because all the data has to be transformed so it can be trained together)
step 1) Make a combined dictionary for the Ksponspeech data and the other Korean-language data
step 2) Train on all the data (Ksponspeech & other Korean-language data) together
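Step 1 of Method 2 could look roughly like the sketch below. It is only an illustration of merging two transcript sets into one character-level dictionary, not the dictionary format any particular toolkit expects: the function name `build_combined_vocab`, the special tokens, and the sample transcripts are all assumptions made up for this example.

```python
from collections import Counter


def build_combined_vocab(*corpora, specials=("<pad>", "<sos>", "<eos>", "<unk>")):
    """Map every character seen in any corpus to an integer id.

    Hypothetical helper: the special tokens and the id layout are assumptions.
    """
    counts = Counter()
    for corpus in corpora:
        for sentence in corpus:
            # Iterating a string yields single characters (spaces included).
            counts.update(sentence)
    # Most frequent characters first; ties broken by the character itself
    # so the resulting ids are deterministic.
    chars = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return {token: idx for idx, token in enumerate(list(specials) + chars)}


# Made-up sample transcripts standing in for the two datasets.
ksponspeech_transcripts = ["안녕하세요", "비행기가 흔들립니다"]
domain_transcripts = ["현재 비행기가 Turbulence로 인해 흔들리고 있습니다"]

vocab = build_combined_vocab(ksponspeech_transcripts, domain_transcripts)
```

With a dictionary built this way, Korean syllables and Latin letters share one id space, so a mixed sentence like the example above can be encoded without unknown tokens.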
To improve terminology recognition for a specific field, is it appropriate to take a pre-trained model (trained on the Ksponspeech data in "character" mode) and train it on the additional data with the "subword" option?
taejin0128 changed the title from "The way to execute hydra_train.py for Ksponspeech" to "A way to execute hydra_train.py for Ksponspeech" on Jan 15, 2023