[BUG🐛] Unable to generate voices using OpenAI api #64

maxi1134 · 2025-02-04T14:59:40Z

Bug Description

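Generating a voice through the "OpenAI API" server (`auralis.openai`) fails: the request is rejected with a validation error on the `voice` field.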

Minimal Reproducible Example

Create the pip environment, then start the server with `auralis.openai --host 0.0.0.0 --port 8000 --model AstraMindAI/xttsv2 --gpt_model AstraMindAI/xtts2-gpt --max_concurrency 8 --vllm_logging_level warn`

Then try to generate audio by sending a POST request to the http://192.168.0.14:8000/v1/audio/speech endpoint with this body:

{
  "input": "this is a test",
  "model": "xttsv2",
  "voice": [
    "/examples/ncage.wav"
  ],
  "response_format": "wav",
  "speed": 0,
  "enhance_speech": false,
  "language": "auto",
  "max_ref_length": 60,
  "gpt_cond_len": 30,
  "gpt_cond_chunk_len": 4,
  "temperature": 0.75,
  "top_p": 0.85,
  "top_k": 50,
  "repetition_penalty": 5,
  "length_penalty": 1,
  "do_sample": true
}

curl -X 'POST' \
  'http://192.168.0.14:8000/v1/audio/speech' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "input": "this is a test",
  "model": "xttsv2",
  "voice": [
    "/examples/ncage.wav"
  ],
  "response_format": "wav",
  "speed": 0,
  "enhance_speech": false,
  "language": "auto",
  "max_ref_length": 60,
  "gpt_cond_len": 30,
  "gpt_cond_chunk_len": 4,
  "temperature": 0.75,
  "top_p": 0.85,
  "top_k": 50,
  "repetition_penalty": 5,
  "length_penalty": 1,
  "do_sample": true
}'

The voice is present in the /examples folder.
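
For repeatable testing, the same request can be scripted. The following is a minimal sketch using only the Python standard library; the host, path, and payload are copied from the curl command above, while the helper names `build_speech_request` and `send` are hypothetical and not part of Auralis.

```python
import json
import urllib.error
import urllib.request

# Payload copied verbatim from the report (JSON booleans become Python booleans).
PAYLOAD = {
    "input": "this is a test",
    "model": "xttsv2",
    "voice": ["/examples/ncage.wav"],
    "response_format": "wav",
    "speed": 0,
    "enhance_speech": False,
    "language": "auto",
    "max_ref_length": 60,
    "gpt_cond_len": 30,
    "gpt_cond_chunk_len": 4,
    "temperature": 0.75,
    "top_p": 0.85,
    "top_k": 50,
    "repetition_penalty": 5,
    "length_penalty": 1,
    "do_sample": True,
}


def build_speech_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request to /v1/audio/speech."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request) -> tuple[int, bytes]:
    """Send the request and return (status code, raw body), including error responses."""
    try:
        with urllib.request.urlopen(req, timeout=60) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # Non-2xx responses such as 422 are raised as HTTPError; keep their body.
        return err.code, err.read()
```

Calling `send(build_speech_request("http://192.168.0.14:8000", PAYLOAD))` against the server from this report should return the same 422 status and error body as the curl command.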

Expected Behavior

The voice is generated and the audio is returned.

Actual Behavior

I receive:

{
  "detail": [
    {
      "type": "value_error",
      "loc": [
        "body",
        "voice"
      ],
      "msg": "Value error, Invalid base64 encoding in voice file",
      "input": [
        "/examples/ncage.wav"
      ],
      "ctx": {
        "error": {}
      }
    }
  ]
}
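
The response body above has the `detail` / `loc` / `msg` layout that FastAPI uses for request validation errors. When triaging, it can help to flatten it into one line per problem. This is a small sketch under that assumption; `summarize_validation_errors` is a hypothetical helper, not an Auralis function.

```python
import json


def summarize_validation_errors(body: str) -> list[str]:
    """Turn a validation error body into 'location: message' strings."""
    detail = json.loads(body).get("detail", [])
    if isinstance(detail, str):
        # Some error responses carry a plain string instead of a list.
        return [detail]
    lines = []
    for err in detail:
        location = ".".join(str(part) for part in err.get("loc", []))
        lines.append(f"{location}: {err.get('msg', '')}")
    return lines
```

For the response in this report, the helper yields a single entry pointing at `body.voice`, which is the field the server rejected.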

Error Logs

INFO:     192.168.0.69:63960 - "POST /v1/audio/speech HTTP/1.1" 422 Unprocessable Entity
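
The 422 status indicates the request was rejected during validation, before any synthesis ran, and the `msg` text ("Invalid base64 encoding in voice file") suggests the server tries to base64-decode each `voice` entry. The sketch below shows one thing worth trying, under the unconfirmed assumption that the endpoint expects the reference audio as base64-encoded file contents rather than a server-side path. The helper names are hypothetical, and the payload is trimmed to a few fields from the report.

```python
import base64
from pathlib import Path


def encode_voice_file(path: str) -> str:
    """Return the file's raw bytes as an ASCII base64 string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def build_payload(text: str, voice_paths: list[str]) -> dict:
    """Build a request body with the reference audio embedded as base64.

    ASSUMPTION: the server base64-decodes each "voice" entry. This is inferred
    from the error message only and is not confirmed by the report.
    """
    return {
        "input": text,
        "model": "xttsv2",
        "voice": [encode_voice_file(p) for p in voice_paths],
        "response_format": "wav",
    }
```

Note that the path passed to `encode_voice_file` must exist on the client machine sending the request, which is not necessarily the machine where `/examples/ncage.wav` lives.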

Environment

Please run the following commands and include the output:

# OS Information
`Linux machinelearning 6.8.0-52-generic #53-Ubuntu SMP PREEMPT_DYNAMIC Sat Jan 11 00:06:25 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux`


# Python version
`Python 3.12.3`

# Installed Python packages


Package                           Version
--------------------------------- -------------
aiofiles                          24.1.0
aiohappyeyeballs                  2.4.4
aiohttp                           3.11.11
aiosignal                         1.3.2
annotated-types                   0.7.0
anyio                             4.8.0
asttokens                         3.0.0
attrs                             25.1.0
audioread                         3.0.1
auralis                           0.2.8.post2
beautifulsoup4                    4.13.1
blis                              0.7.11
cachetools                        5.5.1
catalogue                         2.0.10
certifi                           2025.1.31
cffi                              1.17.1
charset-normalizer                3.4.1
click                             8.1.8
cloudpathlib                      0.20.0
cloudpickle                       3.1.1
colorama                          0.4.6
compressed-tensors                0.8.0
confection                        0.1.5
cutlet                            0.5.0
cymem                             2.0.11
datasets                          3.2.0
decorator                         5.1.1
dill                              0.3.8
diskcache                         5.6.3
distro                            1.9.0
docopt                            0.6.2
EbookLib                          0.18
einops                            0.8.0
executing                         2.2.0
fastapi                           0.115.8
ffmpeg                            1.4
filelock                          3.17.0
frozenlist                        1.5.0
fsspec                            2024.9.0
fugashi                           1.4.0
future                            1.0.0
gguf                              0.10.0
h11                               0.14.0
hangul-romanize                   0.1.0
httpcore                          1.0.7
httptools                         0.6.4
httpx                             0.28.1
huggingface-hub                   0.28.1
idna                              3.10
importlib_metadata                8.6.1
iniconfig                         2.0.0
interegular                       0.3.3
ipython                           8.32.0
jaconv                            0.4.0
jedi                              0.19.2
Jinja2                            3.1.5
jiter                             0.8.2
joblib                            1.4.2
jsonschema                        4.23.0
jsonschema-specifications         2024.10.1
langcodes                         3.5.0
langid                            1.1.6
language_data                     1.3.0
lark                              1.2.2
lazy_loader                       0.4
librosa                           0.10.2.post1
llvmlite                          0.44.0
lm-format-enforcer                0.10.9
lxml                              5.3.0
marisa-trie                       1.2.1
markdown-it-py                    3.0.0
MarkupSafe                        3.0.2
matplotlib-inline                 0.1.7
mdurl                             0.1.2
mistral_common                    1.5.2
mojimoji                          0.0.13
mpmath                            1.3.0
msgpack                           1.1.0
msgspec                           0.19.0
multidict                         6.1.0
multiprocess                      0.70.16
murmurhash                        1.0.12
nest-asyncio                      1.6.0
networkx                          3.4.2
num2words                         0.5.14
numba                             0.61.0
numpy                             1.26.4
nvidia-cublas-cu12                12.4.5.8
nvidia-cuda-cupti-cu12            12.4.127
nvidia-cuda-nvrtc-cu12            12.4.127
nvidia-cuda-runtime-cu12          12.4.127
nvidia-cudnn-cu12                 9.1.0.70
nvidia-cufft-cu12                 11.2.1.3
nvidia-curand-cu12                10.3.5.147
nvidia-cusolver-cu12              11.6.1.9
nvidia-cusparse-cu12              12.3.1.170
nvidia-ml-py                      12.570.86
nvidia-nccl-cu12                  2.21.5
nvidia-nvjitlink-cu12             12.4.127
nvidia-nvtx-cu12                  12.4.127
openai                            1.61.0
OpenCC                            1.1.9
opencv-python-headless            4.11.0.86
outlines                          0.0.46
packaging                         24.2
pandas                            2.2.3
parso                             0.8.4
partial-json-parser               0.2.1.1.post5
pexpect                           4.9.0
pillow                            10.4.0
pip                               24.0
platformdirs                      4.3.6
pluggy                            1.5.0
pooch                             1.8.2
preshed                           3.0.9
prometheus_client                 0.21.1
prometheus-fastapi-instrumentator 7.0.2
prompt_toolkit                    3.0.50
propcache                         0.2.1
protobuf                          5.29.3
psutil                            6.1.1
ptyprocess                        0.7.0
pure_eval                         0.2.3
py-cpuinfo                        9.0.0
pyairports                        2.1.1
pyarrow                           19.0.0
pycountry                         24.6.1
pycparser                         2.22
pydantic                          2.10.6
pydantic_core                     2.27.2
Pygments                          2.19.1
pyloudnorm                        0.1.1
pypinyin                          0.53.0
pytest                            8.3.4
python-dateutil                   2.9.0.post0
python-dotenv                     1.0.1
pytz                              2025.1
PyYAML                            6.0.2
pyzmq                             26.2.1
ray                               2.42.0
referencing                       0.36.2
regex                             2024.11.6
requests                          2.32.3
rich                              13.9.4
rpds-py                           0.22.3
safetensors                       0.5.2
scikit-learn                      1.6.1
scipy                             1.15.1
sentencepiece                     0.2.0
setuptools                        75.8.0
shellingham                       1.5.4
six                               1.17.0
smart-open                        7.1.0
sniffio                           1.3.1
sounddevice                       0.5.1
soundfile                         0.13.1
soupsieve                         2.6
soxr                              0.5.0.post1
spacy                             3.7.5
spacy-legacy                      3.0.12
spacy-loggers                     1.0.5
srsly                             2.5.1
stack-data                        0.6.3
starlette                         0.45.3
sympy                             1.13.1
thinc                             8.2.5
threadpoolctl                     3.5.0
tiktoken                          0.7.0
tokenizers                        0.21.0
torch                             2.5.1
torchaudio                        2.5.1
torchvision                       0.20.1
tqdm                              4.67.1
traitlets                         5.14.3
transformers                      4.48.2
triton                            3.1.0
typer                             0.15.1
typing_extensions                 4.12.2
tzdata                            2025.1
urllib3                           2.3.0
uvicorn                           0.34.0
uvloop                            0.21.0
vllm                              0.6.4.post1
wasabi                            1.1.3
watchfiles                        1.0.4
wcwidth                           0.2.13
weasel                            0.4.1
websockets                        14.2
wrapt                             1.17.2
xformers                          0.0.28.post3
xxhash                            3.5.0
yarl                              1.18.3
zipp                              3.21.0

GPU Information (if applicable)


+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.77                 Driver Version: 565.77         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:01:00.0 Off |                  N/A |
|  0%   38C    P8             17W /  370W |   19834MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     57527      C   python3                                      3410MiB |
|    0   N/A  N/A   1602657      C   ...unners/cuda_v12/ollama_llama_server      16408MiB |
+-----------------------------------------------------------------------------------------+

CUDA version (if applicable)

-bash: nvcc: command not found


Possible Solutions

I have no clue, but I am willing to help!

Additional Information

The end goal is to use this GitHub project alongside https://github.com/sfortis/openai_tts

maxi1134 added the bug Something isn't working label Feb 4, 2025

maxi1134 changed the title ~~[BUG🐛]~~ [BUG🐛] Unable to generate voices using OpenAI api Feb 4, 2025
