possibility to run this tool locally? #2

Open
danielw97 opened this issue Jan 9, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@danielw97

Hi there,
Thanks very much for your work on this project; it seems quite interesting.
I'm just wondering whether there is any way to run this locally, for those of us who have good GPUs?
It could possibly leverage the StyleTTS2 REST API or importable script at https://github.com/NeuralVox/StyleTTS2
As someone who isn't a developer, I realize this is probably a lot more complicated than I'm making it out to be, but this is great work so far nonetheless.

@duplaja
Owner

duplaja commented Jan 9, 2024

Hey there, you are welcome! It should be very doable to modify this to run locally. Unfortunately, I don't have the hardware to test or build it out myself (which is what led to this project).

One would essentially need to replace the code in the convert_chapter function:

def convert_chapter(client,chapter_path, chapter_paragraphs, style_voice):

to either call a local API in that version of StyleTTS 2, or pass the chapter text in by whatever method is desired, and get the resulting WAV files back. One would also want to strip out the code that spins the HF Space up and down.
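As a rough, untested sketch of what that replacement could look like: the version below keeps the same signature but treats client as any callable that turns text into WAV bytes, so a local backend can be dropped in. The endpoint URL, the form field names, the one-file-per-paragraph layout, and the synthesize_local helper are all assumptions for illustration; they are not taken from this project or from any particular StyleTTS 2 server.

```python
import urllib.parse
import urllib.request
from pathlib import Path

# Assumption: a local StyleTTS 2 server that accepts a form-encoded POST and
# answers with raw WAV bytes. The URL and field names are hypothetical.
LOCAL_TTS_URL = "http://localhost:5000/tts"


def synthesize_local(text, style_voice, url=LOCAL_TTS_URL):
    """POST one paragraph to the (hypothetical) local endpoint; return WAV bytes."""
    data = urllib.parse.urlencode({"text": text, "voice": style_voice}).encode("utf-8")
    request = urllib.request.Request(url, data=data, method="POST")
    with urllib.request.urlopen(request, timeout=600) as response:
        return response.read()


def convert_chapter(client, chapter_path, chapter_paragraphs, style_voice):
    """Write one WAV file per non-empty paragraph and return the file paths.

    Assumptions: `client` is a callable (text, style_voice) -> bytes, and
    `chapter_path` is used as the output directory for the chapter.
    """
    out_dir = Path(chapter_path)
    out_dir.mkdir(parents=True, exist_ok=True)
    wav_paths = []
    for index, paragraph in enumerate(chapter_paragraphs):
        text = paragraph.strip()
        if not text:
            continue  # nothing to synthesize for blank paragraphs
        wav_path = out_dir / f"{index:04d}.wav"
        wav_path.write_bytes(client(text, style_voice))
        wav_paths.append(wav_path)
    return wav_paths
```

With a server actually listening on that address, the call would be convert_chapter(synthesize_local, "out/chapter1", paragraphs, "some_voice"). Passing the backend in as a callable also makes it easy to test the loop without a GPU by handing it a stub.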

You can look at some of what is being done on the StyleTTS 2 side, by looking at app.py in the HF Space's code: https://huggingface.co/spaces/Dupaja/styletts2-public/blob/main/app.py

I'd certainly give my blessing if anyone wanted to try and make a local fork!

@duplaja duplaja assigned duplaja and unassigned duplaja Jan 9, 2024
@duplaja duplaja added the enhancement New feature or request label Jan 9, 2024
@nixolas1

I did a quick and dirty local fork, but it works!
Check it out: https://github.com/nixolas1/epub-to-audiobook-local

@danielw97
Author

Thanks very much for your work on this.
Unfortunately, when I tried to test this on WSL just now, it appeared to start processing the book but didn't actually generate any speech.
I'm running the Docker container on port 5000, and ffmpeg as well as the requirements are installed.

@nixolas1

Strange. No errors in the script, or in the docker logs? (I usually check the logs on the container in Docker Desktop).
I'll try doing a fresh install later, maybe I forgot some local changes.

I remember having to fix some crashes in the Docker image (turning on audio normalization, or something like that), but I don't have the time to set this up properly yet.
A better setup would be a single image with the audiobook logic alongside StyleTTS, rather than relying on such an obscure TTS image.

3 participants