Error in uploading the voice cloning sample silence_threshold #388

MrE3gman · 2025-03-02T14:02:39Z

Script Mode

Docker image

Process Mode

Gradio GUI

Operating System:

Windows

Describe the bug
Uploading an audio sample for voice cloning that previously worked (a 6-second WAV at 24 kHz) now throws this error in the GUI:

Error
extract_voice() error: _trim_and_clean() error: cannot access local variable 'silence_threshold' where it is not associated with a value

Terminal log

2025-03-02 15:00:18 IPs available for connection:
2025-03-02 15:00:18 ['127.0.0.1', '::1', '172.19.0.2']
2025-03-02 15:00:18 Note: 0.0.0.0 is not the IP to connect. Instead use an IP above to connect.
2025-03-02 15:00:18 * Running on local URL: http://0.0.0.0:7860
2025-03-02 15:00:18
2025-03-02 15:00:18 To create a public link, set share=True in launch().
2025-03-02 15:00:54 Input file valid
2025-03-02 15:00:54 ffmpeg version 5.1.6-0+deb12u1 Copyright (c) 2000-2024 the FFmpeg developers
2025-03-02 15:00:54 built with gcc 12 (Debian 12.2.0-14)
2025-03-02 15:00:54 configuration: --prefix=/usr --extra-version=0+deb12u1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --disable-sndio --enable-libjxl --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared
2025-03-02 15:00:54 libavutil 57. 28.100 / 57. 28.100
2025-03-02 15:00:54 libavcodec 59. 37.100 / 59. 37.100
2025-03-02 15:00:54 libavformat 59. 27.100 / 59. 27.100
2025-03-02 15:00:54 libavdevice 59. 7.100 / 59. 7.100
2025-03-02 15:00:54 libavfilter 8. 44.100 / 8. 44.100
2025-03-02 15:00:54 libswscale 6. 7.100 / 6. 7.100
2025-03-02 15:00:54 libswresample 4. 7.100 / 4. 7.100
2025-03-02 15:00:54 libpostproc 56. 6.100 / 56. 6.100
2025-03-02 15:00:54 Guessed Channel Layout for Input Stream #0.0 : stereo
2025-03-02 15:00:54 Input #0, wav, from '/tmp/gradio/40e6d78261744dbf168bba408d40d0fcc48bd669a2122af4551b505c55e52bae/claudio.wav':
2025-03-02 15:00:54 Duration: 00:00:05.83, bitrate: 768 kb/s
2025-03-02 15:00:54 Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 24000 Hz, stereo, s16, 768 kb/s
2025-03-02 15:00:54 Stream mapping:
2025-03-02 15:00:54 Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
2025-03-02 15:00:54 Press [q] to stop, [?] for help
2025-03-02 15:00:54 Output #0, wav, to '/home/user/app/voices/__sessions/voice-dc82c937-1c86-4b47-a6d1-107bb14fba0b/claudio.wav':
2025-03-02 15:00:54 Metadata:
2025-03-02 15:00:54 ISFT : Lavf59.27.100
2025-03-02 15:00:54 Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, mono, s16, 705 kb/s
2025-03-02 15:00:54 Metadata:
2025-03-02 15:00:54 encoder : Lavc59.37.100 pcm_s16le
2025-03-02 15:00:54 size= 4kB time=00:00:00.04 bitrate= 720.5kbits/s speed=N/A
2025-03-02 15:00:54 size= 503kB time=00:00:05.83 bitrate= 705.7kbits/s speed= 557x
2025-03-02 15:00:54 video:0kB audio:503kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.015157%
2025-03-02 15:00:54 Conversion to .wav format for processing successful
2025-03-02 15:00:55 Noise Score: 4225.04
2025-03-02 15:00:55 No background noise or music detected. Skipping separation.
2025-03-02 15:00:55 Traceback (most recent call last):
2025-03-02 15:00:55 File "/home/user/app/lib/classes/voice_extractor.py", line 148, in _trim_and_clean
2025-03-02 15:00:55 self._remove_silences(audio, silence_threshold)
2025-03-02 15:00:55 ^^^^^^^^^^^^^^^^^
2025-03-02 15:00:55 UnboundLocalError: cannot access local variable 'silence_threshold' where it is not associated with a value
2025-03-02 15:00:55
2025-03-02 15:00:55 During handling of the above exception, another exception occurred:
2025-03-02 15:00:55
2025-03-02 15:00:55 Traceback (most recent call last):
2025-03-02 15:00:55 File "/home/user/app/lib/classes/voice_extractor.py", line 292, in extract_voice
2025-03-02 15:00:55 success, msg = self._trim_and_clean()
2025-03-02 15:00:55 ^^^^^^^^^^^^^^^^^^^^^^
2025-03-02 15:00:55 File "/home/user/app/lib/classes/voice_extractor.py", line 210, in _trim_and_clean
2025-03-02 15:00:55 raise ValueError(error)
2025-03-02 15:00:55 ValueError: _trim_and_clean() error: cannot access local variable 'silence_threshold' where it is not associated with a value
2025-03-02 15:00:55
2025-03-02 15:00:55 During handling of the above exception, another exception occurred:
2025-03-02 15:00:55
2025-03-02 15:00:55 Traceback (most recent call last):
2025-03-02 15:00:55 File "/home/user/.local/lib/python3.12/site-packages/gradio/queueing.py", line 625, in process_events
2025-03-02 15:00:55 response = await route_utils.call_process_api(
2025-03-02 15:00:55 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-02 15:00:55 File "/home/user/.local/lib/python3.12/site-packages/gradio/route_utils.py", line 322, in call_process_api
2025-03-02 15:00:55 output = await app.get_blocks().process_api(
2025-03-02 15:00:55 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-02 15:00:55 File "/home/user/.local/lib/python3.12/site-packages/gradio/blocks.py", line 2096, in process_api
2025-03-02 15:00:55 result = await self.call_function(
2025-03-02 15:00:55 ^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-02 15:00:55 File "/home/user/.local/lib/python3.12/site-packages/gradio/blocks.py", line 1643, in call_function
2025-03-02 15:00:55 prediction = await anyio.to_thread.run_sync( # type: ignore
2025-03-02 15:00:55 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-02 15:00:55 File "/home/user/.local/lib/python3.12/site-packages/anyio/to_thread.py", line 56, in run_sync
2025-03-02 15:00:55 return await get_async_backend().run_sync_in_worker_thread(
2025-03-02 15:00:55 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-02 15:00:55 File "/home/user/.local/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2461, in run_sync_in_worker_thread
2025-03-02 15:00:55 return await future
2025-03-02 15:00:55 ^^^^^^^^^^^^
2025-03-02 15:00:55 File "/home/user/.local/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 962, in run
2025-03-02 15:00:55 result = context.run(func, *args)
2025-03-02 15:00:55 ^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-02 15:00:55 File "/home/user/.local/lib/python3.12/site-packages/gradio/utils.py", line 890, in wrapper
2025-03-02 15:00:55 response = f(*args, **kwargs)
2025-03-02 15:00:55 ^^^^^^^^^^^^^^^^^^
2025-03-02 15:00:55 File "/home/user/app/lib/functions.py", line 1887, in change_gr_voice_file
2025-03-02 15:00:55 status, msg = extractor.extract_voice()
2025-03-02 15:00:55 ^^^^^^^^^^^^^^^^^^^^^^^^^
2025-03-02 15:00:55 File "/home/user/app/lib/classes/voice_extractor.py", line 299, in extract_voice
2025-03-02 15:00:55 raise ValueError(msg)
2025-03-02 15:00:55 ValueError: extract_voice() error: _trim_and_clean() error: cannot access local variable 'silence_threshold' where it is not associated with a value
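
The `UnboundLocalError` in the traceback above is what Python raises when a local name is read on a code path where it was never assigned. The sketch below is not the project's code. It is a minimal illustration of one plausible cause, assuming the threshold is only assigned on a branch that gets skipped (the failure follows the "Skipping separation" message). The function names, the `noisy` flag, and the threshold values are all hypothetical.

```python
def trim_and_clean_buggy(noisy: bool) -> float:
    # Hypothetical: silence_threshold is assigned on one branch only ...
    if noisy:
        silence_threshold = -40.0
    # ... so reading it here fails whenever noisy is False.
    return silence_threshold


def trim_and_clean_fixed(noisy: bool) -> float:
    # Assign a default before branching so every path has a value.
    silence_threshold = -60.0
    if noisy:
        silence_threshold = -40.0
    return silence_threshold


try:
    trim_and_clean_buggy(noisy=False)
except UnboundLocalError as exc:
    print(f"buggy version failed: {exc}")

print(f"fixed version returned: {trim_and_clean_fixed(noisy=False)}")
```

Giving the variable a default before any conditional is usually the simplest guard against this class of error, since the name is then bound on every path through the function.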


ROBERT-MCDOWELL · 2025-03-02T14:12:03Z

Please try the latest native version from the git repo rather than the Docker image, which is behind the latest version for now.

agentxan · 2025-03-02T14:57:52Z

I also get this error using the latest from git in headless mode.

v25.2.27 native mode
Input file valid
ffmpeg version 7.1-full_build-www.gyan.dev Copyright (c) 2000-2024 the FFmpeg developers
built with gcc 14.2.0 (Rev1, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libopenjpeg --enable-libquirc --enable-libuavs3d --enable-libxevd --enable-libzvbi --enable-libqrencode --enable-librav1e --enable-libsvtav1 --enable-libvvenc --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxeve --enable-libxvid --enable-libaom --enable-libjxl --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-liblc3 --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil 59. 39.100 / 59. 39.100
libavcodec 61. 19.100 / 61. 19.100
libavformat 61. 7.100 / 61. 7.100
libavdevice 61. 3.100 / 61. 3.100
libavfilter 10. 4.100 / 10. 4.100
libswscale 8. 3.100 / 8. 3.100
libswresample 5. 3.100 / 5. 3.100
libpostproc 58. 3.100 / 58. 3.100
[aist#0:0/pcm_s16le @ 000002843d82f780] Guessed Channel Layout: stereo
Input #0, wav, from 'D:\ebooktest\voices\jenna.wav':
Duration: 00:00:09.60, bitrate: 1411 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'd:\ebook2audiobook\voices__sessions\voice-798c4557-a1f6-4fc5-88a2-777a7196d97c\jenna.wav':
Metadata:
ISFT : Lavf61.7.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, mono, s16, 705 kb/s
Metadata:
encoder : Lavc61.19.100 pcm_s16le
[out#0/wav @ 000002843d82f940] video:0KiB audio:827KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 0.009212%
size= 827KiB time=00:00:09.60 bitrate= 705.7kbits/s speed= 425x
Conversion to .wav format for processing successful
Noise Score: 5270.12
No background noise or music detected. Skipping separation.
convert_ebook() Exception: extract_voice() error: _trim_and_clean() error: cannot access local variable 'silence_threshold' where it is not associated with a value
Conversion failed: extract_voice() error: _trim_and_clean() error: cannot access local variable 'silence_threshold' where it is not associated with a value
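
The stacked prefixes in the message above (`extract_voice() error: _trim_and_clean() error: ...`) are consistent with each layer catching an exception and re-raising it with its own name prepended. The following is a simplified sketch of that pattern, not the project's actual implementation. The function bodies are assumptions, and the error text is shortened for illustration.

```python
def _trim_and_clean() -> None:
    try:
        # Stand-in for the real failure inside the function.
        raise UnboundLocalError("cannot access local variable 'silence_threshold'")
    except Exception as e:
        # Re-raising inside an except block (without "from") keeps the
        # original exception attached as __context__, which is why the
        # traceback prints "During handling of the above exception ...".
        raise ValueError(f"_trim_and_clean() error: {e}")


def extract_voice() -> None:
    try:
        _trim_and_clean()
    except Exception as e:
        # Each layer prepends its own name to the message.
        raise ValueError(f"extract_voice() error: {e}")


message = ""
root_cause = None
try:
    extract_voice()
except ValueError as e:
    message = str(e)
    # Walk the implicit chain back to the first exception.
    root_cause = e
    while root_cause.__context__ is not None:
        root_cause = root_cause.__context__

print(message)
print(type(root_cause).__name__)
```

Reading the message from right to left therefore gives the call order from the innermost failure outward, and the root cause is the last fragment.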

fix #388

ROBERT-MCDOWELL · 2025-03-02T17:24:24Z

fixed and merged.

ROBERT-MCDOWELL added a commit to ROBERT-MCDOWELL/ebook2audiobook that referenced this issue Mar 2, 2025

fix DrewThomasson#388

b867e59

ROBERT-MCDOWELL added a commit that referenced this issue Mar 2, 2025

Merge pull request #390 from ROBERT-MCDOWELL/v25

0c65456

fix #388

ROBERT-MCDOWELL closed this as completed Mar 2, 2025

ROBERT-MCDOWELL added the Fixed and merged label Mar 3, 2025

MrE3gman mentioned this issue Mar 4, 2025

Audio truncated and wierd silences on Spanish XTTS #410

Open
