[Feature] Allow CONCURRENT requests and Multiple Instances management; Add API authentication; and configuration improvements #225
base: master
Conversation
CodePothunter
commented
Mar 7, 2025
- Implement OpenAI-compatible API key authentication
- Add configuration options for GPU instances, concurrency, and request handling
- Update README with authentication instructions
- Modify configuration and routing to support optional API key verification
- Enhance system information and debug endpoints to expose authentication status
This looks great, will take a look through today
- Modify audio chunk concatenation to handle float32 audio data
- Add explicit conversion from float32 to int16 using amplitude scaling
- Remove unnecessary dtype specification in np.concatenate
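The conversion this commit describes can be sketched as follows: float32 PCM samples in the nominal [-1.0, 1.0] range are concatenated and then scaled to int16. The clipping step is an assumption on my part; it guards against wrap-around for samples slightly outside the nominal range.

```python
import numpy as np

# Sketch of float32 -> int16 amplitude scaling after concatenation.
# Clipping first (an assumed safeguard) avoids int16 overflow.
def float32_to_int16(chunks: list[np.ndarray]) -> np.ndarray:
    audio = np.concatenate(chunks)      # dtype follows the inputs (float32)
    audio = np.clip(audio, -1.0, 1.0)   # prevent wrap-around at the extremes
    return (audio * 32767).astype(np.int16)
```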
- Create GPU-specific startup script
- Set environment variables for GPU and project configuration
- Use uv to install GPU extras and run FastAPI server
Is there a reason you deleted start-gpu and not start-cpu as well?
That was a mistake; I've added it back in my new commit.
As far as I can tell, this PR breaks streaming. I tested by running Test.py, and it produced an empty WAV file (outputstream.wav).
The reason is that when stream=True, the audio conversion functions are never actually called.
This has been fixed in the most recent commit. The cause was that with stream=True the audio conversion functions were never called, unlike in the non-stream mode.
When I run the Docker container using the config in: it doesn't seem to respect env vars, whether I put them in a .env file or add them to the docker-compose file.
Sorry, I did not consider the docker-related issues.
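One container-friendly way to address the report above is to read every setting from the environment with a fallback default, so values injected via a .env file or a docker-compose `environment:` block take effect. This is a hedged sketch; the variable names (`PORT`, `GPU_INSTANCES`, `MAX_CONCURRENCY`, `API_KEY`) are illustrative assumptions, not the PR's actual config keys.

```python
import os

# Hypothetical env-driven config loader; variable names are assumptions.
# Passing `env` explicitly makes the loader easy to test.
def load_config(env=os.environ):
    return {
        "port": int(env.get("PORT", "50888")),
        "gpu_instances": int(env.get("GPU_INSTANCES", "1")),
        "max_concurrency": int(env.get("MAX_CONCURRENCY", "4")),
        "api_key": env.get("API_KEY"),  # None disables authentication
    }
```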
Also when running it in a docker container (I havn't tested this outsidee of one) running Test.py twice to generate four querys causes the container to exit with code 139 (Note that I am using the gpu container):
…easing audio container

Refactor StreamingAudioWriter to improve audio encoding reliability:
- Restructure audio encoding logic for better error handling
- Create a new method `_create_container()` to manage container creation
- Improve handling of different audio formats and encoding scenarios
- Add error logging for audio chunk encoding failures
- Simplify container and stream management in write_chunk method
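The shape of the refactor the commit message describes might look like the following. This is only a structural sketch: the in-memory buffer stands in for the project's real format-specific encoder, and only the lazy `_create_container()` call and the per-chunk error logging reflect the commit's stated intent.

```python
import io
import logging

logger = logging.getLogger("streaming_audio_writer")

# Structural sketch only: the buffer is a stand-in for a real
# format-specific audio container/encoder.
class StreamingAudioWriter:
    def __init__(self, fmt: str = "wav"):
        self.fmt = fmt
        self.container = None

    def _create_container(self) -> io.BytesIO:
        # Real code would open a container matching self.fmt here.
        return io.BytesIO()

    def write_chunk(self, chunk: bytes) -> None:
        if self.container is None:           # created lazily, once
            self.container = self._create_container()
        try:
            self.container.write(chunk)      # placeholder for encoding
        except Exception:
            logger.exception("failed to encode audio chunk")
            raise
```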
This PR currently breaks or removes the following features:
- The webui does not actually stream or receive any audio
- All text normalization that was in there is no longer being called
- There is no option to change the speed as it is not being passed into the generation system
- _process_chunk is in tts_service but it never gets called
- Captioned speech is broken because no timestamps are ever requested
- Streaming is broken as only the first chunk of text is returned
- Trimming audio is always disabled, even though it makes sense for chunks that contain speech
- smart_split is never called, so I'm not really sure how it is supposed to split text in a sensible way
- process_text_chunk is never called
Honestly, this PR feels unfinished and untested.
- Update InstancePool to accept and process speed parameter
- Modify TTSService to pass speed to instance pool
- Update Test.py with new port and authentication
- Adjust start-gpu.sh to use port 50888
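After this commit, a client like Test.py would presumably send the key and the speed parameter against the new port. The port (50888) comes from the commit itself; the `/v1/audio/speech` path, payload fields, and key value are placeholder assumptions in this sketch.

```python
# Sketch of the request a client such as Test.py would build after this
# change; path, payload fields, and key are illustrative assumptions.
BASE_URL = "http://localhost:50888"  # port taken from start-gpu.sh

def build_request(api_key: str, text: str, speed: float = 1.0):
    url = f"{BASE_URL}/v1/audio/speech"
    headers = {"Authorization": f"Bearer {api_key}"}
    payload = {"input": text, "speed": speed, "stream": True}
    return url, headers, payload
```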
Why did you change the GPU port to 50888?