possibility to run this tool locally? #2

danielw97 · 2024-01-09T16:17:03Z

Hi there,
Thanks very much for your work on this project, it seems quite interesting.
I'm just wondering whether there is any way to run this locally, for those of us who have good GPUs.
It could possibly leverage the StyleTTS2 REST API or importable script at https://github.com/NeuralVox/StyleTTS2
As someone who isn't a developer, I realize this is probably a lot more complicated than I'm making it out to be, but this is great work so far nonetheless.

duplaja · 2024-01-09T16:28:38Z

Hey there, you are welcome! It should be very doable to modify this to run locally. Unfortunately, I don't have the hardware to test or build it out myself (which is what led to this project).

One would essentially need to replace the code in the convert_chapter function (epub-to-audiobook-hf/epub-to-audiobook-hf.py, line 87 in a67d799):

def convert_chapter(client,chapter_path, chapter_paragraphs, style_voice):

The replacement would either call a local API in that version of StyleTTS 2, or pass the chapter text in by whatever method is desired, and get the resulting wav files back. One would also want to strip out the code that spins the HF Space up and down.
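A rough sketch of what such a replacement could look like, under stated assumptions: the endpoint URL, the JSON payload shape, and the idea that the server returns raw WAV bytes are all hypothetical and would need to be checked against whatever local StyleTTS 2 server is actually used. The name convert_chapter_local and the synthesize parameter are made up for illustration, and the client argument is dropped because there is no HF Space client to pass in.

```python
import io
import json
import urllib.request
import wave

# Hypothetical local endpoint: the real route, port, and payload depend on the server.
LOCAL_TTS_URL = "http://localhost:5000/tts"


def synthesize_via_http(text, voice, url=LOCAL_TTS_URL):
    """POST one paragraph to a local TTS server; assumes the response body is a WAV file."""
    payload = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        return response.read()


def convert_chapter_local(chapter_path, chapter_paragraphs, style_voice,
                          synthesize=synthesize_via_http):
    """Synthesize each paragraph and write the joined audio to chapter_path as one WAV.

    Returns the number of paragraphs that were synthesized.
    """
    audio_format = None  # (channels, sample width, frame rate) of the first clip
    chunks = []
    for paragraph in chapter_paragraphs:
        text = paragraph.strip()
        if not text:
            continue  # skip empty paragraphs
        wav_bytes = synthesize(text, style_voice)
        with wave.open(io.BytesIO(wav_bytes), "rb") as clip:
            clip_format = (clip.getnchannels(), clip.getsampwidth(), clip.getframerate())
            if audio_format is None:
                audio_format = clip_format
            elif clip_format != audio_format:
                raise ValueError("clips have mismatched audio formats")
            chunks.append(clip.readframes(clip.getnframes()))

    if audio_format is None:
        return 0  # nothing to write

    with wave.open(str(chapter_path), "wb") as out:
        out.setnchannels(audio_format[0])
        out.setsampwidth(audio_format[1])
        out.setframerate(audio_format[2])
        for chunk in chunks:
            out.writeframes(chunk)
    return len(chunks)
```

Passing the synthesis step in as a function keeps the chapter logic independent of the backend, so the HTTP call could be swapped for a direct import of a StyleTTS 2 inference script without touching the rest.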

You can look at some of what is being done on the StyleTTS 2 side, by looking at app.py in the HF Space's code: https://huggingface.co/spaces/Dupaja/styletts2-public/blob/main/app.py

I'd certainly give my blessing if anyone wanted to try and make a local fork!

nixolas1 · 2024-01-14T23:47:36Z

I did a quick and dirty local fork, but it works!
Check it out: https://github.com/nixolas1/epub-to-audiobook-local

danielw97 · 2024-01-15T00:15:34Z

Thanks much for your work on this.
Unfortunately, when I tried to test this on WSL, it appeared to start processing the book but didn't actually produce any speech.
I'm running the docker container on port 5000, and ffmpeg as well as the requirements are installed.

nixolas1 · 2024-01-15T11:08:20Z

Strange. No errors in the script, or in the docker logs? (I usually check the logs on the container in Docker Desktop).
I'll try doing a fresh install later, maybe I forgot some local changes.

I remember having to fix some crashes in the docker image (turning on normalization on audio or something), but I don't have the time to set this up properly yet.
A more ideal setup would be to have just one image, with the audiobook logic alongside StyleTTS, rather than relying on such an obscure TTS image.

duplaja assigned duplaja and unassigned duplaja Jan 9, 2024

duplaja added the enhancement New feature or request label Jan 9, 2024
