CoreML not usable for large files anymore? #1619

Open
mirozahorak opened this issue Dec 10, 2023 · 3 comments
Labels
question Further information is requested

Comments

@mirozahorak

mirozahorak commented Dec 10, 2023

I upgraded whisper.cpp to the latest version and downloaded the large-v3 model.
whisper.cpp was working wonderfully for me before.
I am processing the same kind of files as before, 30-80 minutes long.
Now it is very hard to get the same quality, or even to finish transcribing a file.
Often, after 30-40 minutes, it just repeats the same sentences.

But most of the time it just crashes (sometimes after a few minutes, sometimes after 20 minutes) with this error:

whisper_full_with_state: failed to decode
/Volumes/DEVEL/TRANSCRIPTION/w2/whisper.cpp/main: failed to process audio

I have tested all kinds of settings, and they do have an influence, but I cannot get back the previous quality and simplicity. With default settings it simply does not work anymore.

I have recompiled and double-checked, but either something is broken or I am doing something wrong.
Let me know what info I can provide to help solve this problem.

I have tested changing these parameters:

-bo 8 -mc 64 -bs 8 -et 2.9
-bo 8 -mc 56 -lpt -0.9 -wt 0.005 -sow

and other combinations of the above, but while they change the behaviour, I was not able to match the quality of the whisper.cpp version I installed in August with the large-v2 model.
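
For readers unfamiliar with the short flags, the first combination above can be spelled out with the long option names that whisper.cpp's `main` accepts. This is only a sketch: the model path `models/ggml-large-v3.bin` and the input `audio.wav` are placeholder assumptions, not the reporter's actual files, and the flag descriptions in the comments are paraphrased from the tool's help text. By default the command is printed rather than executed.

```sh
#!/bin/sh
# Sketch: first flag combination from this report, written with long option names.
# WHISPER_MODEL and WHISPER_AUDIO are hypothetical placeholders; adjust to your setup.
MODEL="${WHISPER_MODEL:-models/ggml-large-v3.bin}"
AUDIO="${WHISPER_AUDIO:-audio.wav}"

run_whisper() {
  # --best-of        (-bo)  number of best candidates to keep
  # --max-context    (-mc)  maximum number of text context tokens to store
  # --beam-size      (-bs)  beam size for beam search
  # --entropy-thold  (-et)  entropy threshold for decoder fail
  set -- ./main -m "$MODEL" -f "$AUDIO" \
    --best-of 8 --max-context 64 --beam-size 8 --entropy-thold 2.9
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"   # print the command only
  else
    "$@"        # actually run it
  fi
}

run_whisper
```

Set `DRY_RUN=0` to run the command instead of printing it. The second combination uses `-lpt` (`--logprob-thold`), `-wt` (`--word-thold`) and `-sow` (`--split-on-word`) in the same way.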

@bobqianic
Collaborator

> I have upgraded whisper to latest version and downloaded large-v3 model.
>
> Now it is very hard to achieve the same quality

It's better to use large-v2 instead, as the problem lies with large-v3 itself, which has experienced a significant decline in quality compared to large-v2.

openai/whisper#1762

> and even finished transcription of files.
>
> whisper_full_with_state: failed to decode
> /Volumes/DEVEL/TRANSCRIPTION/w2/whisper.cpp/main: failed to process audio

Give large-v2 a try and check if the issue persists.
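
A minimal sketch of that comparison, assuming a standard whisper.cpp checkout with its bundled `models/download-ggml-model.sh` script. The input name `audio.wav` is a hypothetical placeholder. A CoreML build would additionally need the converted encoder for large-v2 next to the model file, which this sketch leaves out. By default each step is only printed.

```sh
#!/bin/sh
# Sketch: fetch large-v2 and rerun the same file with default settings.
# WHISPER_AUDIO is a hypothetical placeholder for the file that failed with large-v3.
AUDIO="${WHISPER_AUDIO:-audio.wav}"

# Print each command; execute it only when DRY_RUN=0.
step() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" != "1" ]; then
    "$@"
  fi
}

try_large_v2() {
  step bash ./models/download-ggml-model.sh large-v2 &&
  step ./main -m models/ggml-large-v2.bin -f "$AUDIO"
}

try_large_v2
```

Run it from the whisper.cpp directory with `DRY_RUN=0` to perform the download and the transcription.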

bobqianic added the question label Dec 10, 2023

@ggerganov
Owner

Can you test: #1633

@simicvm

simicvm commented Feb 25, 2024

> Can you test: #1633

Seems like this is still a problem. On big files, large-v3 at some point just starts repeating sentences, while large-v2 transcribes them without issues.
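
To compare models on long files without reading every transcript, the repetition symptom can be checked mechanically. The helper below is a hypothetical sketch, not part of whisper.cpp: it assumes a plain-text transcript with one segment per line and no timestamps (for example the file written by `-otxt`), and reports the longest run of identical consecutive lines.

```sh
#!/bin/sh
# Sketch: report the longest run of identical consecutive lines in a transcript.
# Usage: longest_repeat transcript.txt
# Output: "<run length><TAB><repeated line>"
longest_repeat() {
  awk '
    {
      if (NR > 1 && $0 == prev) run++; else run = 1   # extend or restart the run
      if (run > best) { best = run; text = $0 }       # remember the longest run so far
      prev = $0
    }
    END { printf "%d\t%s\n", best, text }
  ' "$1"
}
```

A run length far above 1 on a large-v3 transcript, with a low value on the large-v2 transcript of the same audio, would match the behaviour described in this thread.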
