Support setting custom alignment heads for dtw #301


jettoblack
Contributor

This PR adds the ability to configure a custom alignment heads array, which enables DTW time alignment for non-standard Whisper models that have no named preset in the WhisperAlignmentHeadsPreset enumeration. The standard models have presets built into whisper.cpp, but for any other model you must specify the model-specific values yourself.

For example, for the model "Distil Large V3", you need to use:

        RuntimeOptions.Instance.SetUseDtwTimeStamps(true);
        RuntimeOptions.Instance.SetHeadsPreset(WhisperAlignmentHeadsPreset.Custom);
        RuntimeOptions.Instance.SetAlignmentHeads(
            Enumerable.Range(0, 20).Select(i => new WhisperAlignmentHead(1, i)).ToArray()
        );

This one is simple (an array where TextLayer is always 1 and Head runs from 0 to 19), but that isn't the case for every model. For example, look at the preset in whisper.cpp for g_aheads_large_v2:

static const whisper_ahead g_aheads_large_v2[]  = { {10, 12}, {13, 17}, {16, 11}, {16, 12}, {16, 13}, {17, 15}, {17, 16}, {18, 4}, {18, 11}, {18, 19}, {19, 11}, {21, 2}, {21, 3}, {22, 3}, {22, 9}, {22, 12}, {23, 5}, {23, 7}, {23, 13}, {25, 5}, {26, 1}, {26, 12}, {27, 15} };
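To make the mapping concrete, here is a minimal sketch of how `{layer, head}` pairs like the ones above could be written down in C# before handing them to `SetAlignmentHeads`. It uses plain tuples instead of `WhisperAlignmentHead` so that it compiles without the Whisper.net package, and the variable names are made up for illustration. large_v2 already has a built-in preset, so it serves here only as an example of an irregular table.

```csharp
using System;
using System.Linq;

// (TextLayer, Head) pairs copied from g_aheads_large_v2 as quoted above.
// Tuples are a stand-in for WhisperAlignmentHead so this sketch runs on its own.
(int TextLayer, int Head)[] largeV2Pairs =
{
    (10, 12), (13, 17), (16, 11), (16, 12), (16, 13), (17, 15), (17, 16),
    (18, 4), (18, 11), (18, 19), (19, 11), (21, 2), (21, 3), (22, 3),
    (22, 9), (22, 12), (23, 5), (23, 7), (23, 13), (25, 5), (26, 1),
    (26, 12), (27, 15)
};

// The regular case from the Distil Large V3 example: layer 1, heads 0..19.
(int TextLayer, int Head)[] distilLargeV3Pairs =
    Enumerable.Range(0, 20).Select(i => (TextLayer: 1, Head: i)).ToArray();

// Cheap sanity check (an addition of this sketch, not part of the PR):
// a hand-copied table should not contain duplicate or negative entries.
bool largeV2LooksSane =
    largeV2Pairs.Distinct().Count() == largeV2Pairs.Length &&
    largeV2Pairs.All(p => p.TextLayer >= 0 && p.Head >= 0);

Console.WriteLine($"large_v2 heads: {largeV2Pairs.Length}, sane: {largeV2LooksSane}");
Console.WriteLine($"distil-large-v3 heads: {distilLargeV3Pairs.Length}");

// With Whisper.net referenced, the pairs would be converted and registered
// the same way as in the example above:
// RuntimeOptions.Instance.SetAlignmentHeads(
//     largeV2Pairs.Select(p => new WhisperAlignmentHead(p.TextLayer, p.Head)).ToArray());
```

Keeping the table as data and converting it in one `Select` makes it easy to diff against the upstream whisper.cpp source when adding heads for another model.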

@sandrohanea
Owner

Hey @jettoblack ,
Thanks for the contribution!

I think it makes sense for now (it follows the same pattern as the rest). However, we might change it in the future so that RuntimeOptions doesn't keep growing with settings that can be moved and initialized per factory.

RuntimeOptions will keep only the options related to runtime loading that cannot be set twice (e.g. defaultRuntimeOrder, bypassLoading, etc.), and we'll create new Options that can be defined for each factory.

@sandrohanea sandrohanea merged commit 78e1651 into sandrohanea:main Dec 19, 2024