Problem with Persian voice #404
Comments
I don't maintain the Colab, so maybe it's an old version with a bug that has already been fixed. I recall that the default sample rate was wrong for the fairseq TTS engine and was fixed a long time ago.
I'm the one maintaining the Colab. It should be pulling the latest from main though... and that issue should have been fixed... I'll check, but hit me up if I forget about this.
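For readers wondering why a wrong sample rate would make the voice sound too fast: the sketch below is only an illustration, not the project's code. It writes the same number of samples into two WAV files with different header rates and compares the durations a player would report. The rates `MODEL_RATE` and `WRONG_RATE` and the helper `wav_duration_seconds` are assumptions made up for this example.

```python
import io
import wave


def wav_duration_seconds(num_samples: int, header_rate: int) -> float:
    """Write num_samples of 16-bit mono silence with the given header rate
    and return the duration implied by the WAV header."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)      # mono
        writer.setsampwidth(2)      # 16-bit samples
        writer.setframerate(header_rate)
        writer.writeframes(b"\x00\x00" * num_samples)
    buf.seek(0)
    with wave.open(buf, "rb") as reader:
        return reader.getnframes() / reader.getframerate()


MODEL_RATE = 16000  # assumed rate at which the model generated the audio
WRONG_RATE = 24000  # hypothetical wrong default written into the file

samples = MODEL_RATE * 2  # two seconds of generated audio
correct = wav_duration_seconds(samples, MODEL_RATE)
wrong = wav_duration_seconds(samples, WRONG_RATE)
speedup = correct / wrong  # how much faster the mislabeled file plays
```

With these assumed numbers the mislabeled file plays 1.5 times faster and at a higher pitch, which would match the "robot or alien" description if a rate mismatch is indeed the cause.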
Could you give us a sample text that is running into that issue on your end for us to test out?
Cause I'm not getting the issue you're describing when running it in Colab or locally. My results: my input, my local output, my Colab output.
Yes definitely, this is an example.
Have you tried giving it input files that are not PDF? Perhaps an EPUB or TXT instead? PDFs are notoriously difficult to convert into text. That may be the cause of your problem.
Yeah, it works better with .txt, about 85%, but it still sometimes reads strange, meaningless text...
Hm, it could potentially be that the Persian model is not very good... Also when you say "some time read the strange text and meaningless"
Yes, you're probably right.
fairseq is not the same quality as XTTSv2 or Bark. There can be some glitches caused by special punctuation or just a space in the wrong place.
OK, whatever, it's so weak for Persian.
You can help us improve it rather than give up. Without community help nothing can evolve.
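To make the punctuation and spacing point concrete, here is a minimal sketch of the kind of cleanup one could run on Persian text before sending it to a TTS engine. It is not what ebook2audiobook actually does; the function name `normalize_for_tts` and the specific rules are assumptions chosen for illustration.

```python
import re

# Arabic code points that often show up in Persian text in place of the
# Persian letters (for example after PDF extraction).
_CHAR_MAP = str.maketrans({
    "\u064A": "\u06CC",  # Arabic Yeh -> Farsi Yeh
    "\u0643": "\u06A9",  # Arabic Kaf -> Keheh
})


def normalize_for_tts(text: str) -> str:
    """Hypothetical cleanup pass; the rules here are illustrative only."""
    text = text.translate(_CHAR_MAP)
    # Drop invisible formatting marks, but keep ZWNJ (U+200C), which is
    # meaningful in Persian spelling.
    text = re.sub(r"[\u200B\u200E\u200F\u202A-\u202E\uFEFF]", "", text)
    # Collapse runs of spaces, tabs, and no-break spaces into one space.
    text = re.sub(r"[ \t\u00A0]+", " ", text)
    # Remove a stray space before Persian or Latin punctuation.
    text = re.sub(r" +([\u060C\u061B\u061F.!:])", r"\1", text)
    return text.strip()


sample = "\u0633\u064A\u0633\u062A\u0645  \u062E\u0648\u0628 \u0627\u0633\u062A \u060C  \u0628\u0644\u0647"
cleaned = normalize_for_tts(sample)
```

Whether these particular rules help with the fairseq Persian model is untested here; the point is only that stray spaces and look-alike characters can be normalized before synthesis.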
It looks like there's also another Persian glow-tts model when I look at the list of coqui-tts models.
I would like a comment from @DonMonro, who can help us find a better model for fairseq or, failing that, another TTS engine.
Hi. Unfortunately, the Persian community is very weak in this field, and if something is found, it has to be searched for in English. I searched a lot, and the only thing I found that has been pre-trained is this: https://github.com/SadeghKrmi/pertts-streamlit But I can't tell if this can be used in your project or not.
This model is used with the piper-tts engine, which we wanted to integrate into eb2ab, but last time we checked it was not possible since piper-tts is locked to Python 3.10 max. Maybe we should push their devs to upgrade to Python 3.12.
If anyone ever gets these added to coqui-tts in a PR, that would also help with this: https://github.com/karim23657/awesome-Persian-Speech?tab=readme-ov-file
@DonMonro could you provide the text you had the issue with?
It mispronounces many words, especially those that have come from foreign languages into Persian, such as 'system' or in Persian 'سیستم'. This word has a lot of usage in Persian, but it is inherently English. This problem stems from the language model. Often, it creates irregular and unnecessary pauses between words, and sometimes the speaker's voice changes within the text. For example, I tested this on this text.
I need only the original text to see if it's the punctuation causing the glitches. The rest can't be solved, like foreign words pronounced in Persian, etc.
Btw, perfect AI TTS pronunciation does not come only from the model, but also from how the text is written, and sometimes the word itself must be changed phonetically to be pronounced well.
Maybe fixed in the next update.
How does the VITS female (best) voice in this free demo Hugging Face space sound to you? Cause I made a PR to coqui-tts to add that model here, idiap/coqui-ai-TTS#332, and we need feedback from a Persian speaker :)
In fact, it works really well. There is still a lot of room for improvement, but it's the best Persian model I've ever seen. I'm very excited that you were able to find this.
Thank you, I will report this to the coqui team on my PR.
Give it a thumbs up or something to give it more attention, as it might help it pass faster.
@mahdi155000 is this attached audio (you must unzip it) a better result for the text you provided?
It's just a little better. The issue with the pauses has been resolved, but it still pronounces the words poorly. Overall, it has improved, but it doesn't reach the quality of coqui.
Pronunciation is due to the model quality; I worked on the code for the pauses issue. Thanks for your report.
Hello
I realized when I ran your Colab for Persian that it wasn't working properly: the voice playback speed was too high and it sounded very strange, like a robot or an alien. What's the problem? Sometimes it also reads strange text that is not Persian and has no meaning...