Diarization #1556
@nnegrey, @dizcology, @theacodes Can you please review this PR?
Looks like these use the
python transcribe_diarization.py \
    resources/Google_Gnome.wav
python transcribe_diarization.py \
    gs://cloud-ml-api-e2e-testing/speech/stereo_audio.wav
Is cloud-ml-api-e2e-testing a public bucket?
Good catch, @nnegrey. I moved all 4 beta samples to beta_snippets.py and simplified a few things.
print('First alternative of result {}: {}'
      .format(i, alternative.transcript))
print('Speaker Tag for the first word: {}'
      .format(alternative.words[0].speaker_tag))
Does each word in alternative.words have other relevant information concerning diarization besides speaker_tag? If so, print that as well; if not, please ignore this comment.
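For illustration, a per-word printout might look roughly like the sketch below. It runs on stand-in objects rather than a real API response: the `Word` tuple and the `format_speaker_words` helper are hypothetical, and the attribute names `word` and `speaker_tag` are assumed to match what each entry in alternative.words exposes.

```python
from collections import namedtuple

# Stand-in for one entry of alternative.words (assumption: the real
# objects expose `word` and `speaker_tag` attributes).
Word = namedtuple('Word', ['word', 'speaker_tag'])


def format_speaker_words(words):
    """Return one 'word: ..., speaker_tag: ...' line per word."""
    return ['word: {}, speaker_tag: {}'.format(w.word, w.speaker_tag)
            for w in words]


if __name__ == '__main__':
    # Made-up sample data, not taken from any audio file in this PR.
    sample = [Word('OK', 1), Word('Google', 1), Word('sure', 2)]
    for line in format_speaker_words(sample):
        print(line)
```

Printing every word rather than only words[0] makes it visible where the speaker changes within a single alternative.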
    os.path.join(RESOURCES, 'Google_Gnome.wav'))
out, err = capsys.readouterr()

assert 'OK Google stream stranger things from Netflix to my TV' in out
Are there two different speakers in this audio file? If so, assert something about the speaker_tags being returned correctly.
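A possible shape for such a check, as a sketch only: parse the captured output for the "Speaker Tag for the first word" lines the sample prints and assert on the values. The helper names `speaker_tags_in_output` and `check_speaker_tags` are hypothetical, and the expectation that a populated tag is a positive integer is an assumption; the actual number of speakers in the file would need to be confirmed before asserting specific tags.

```python
import re


def speaker_tags_in_output(out):
    """Extract integer speaker tags from the lines printed by the sample."""
    return [int(tag) for tag in
            re.findall(r'Speaker Tag for the first word: (\d+)', out)]


def check_speaker_tags(out):
    """Assert that at least one tag was printed and all tags are positive.

    Assumption: a populated speaker tag is an integer greater than zero.
    """
    tags = speaker_tags_in_output(out)
    assert tags, 'expected at least one speaker tag in the output'
    assert all(tag > 0 for tag in tags), 'speaker tags should be positive'
    return tags
```

In the test, this could be called as check_speaker_tags(out) right after capsys.readouterr().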
The samples in this PR demonstrate diarization: identifying who said what when more than one person is talking.
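To make "who said what" concrete, here is a minimal sketch that collapses per-word speaker tags into speaker turns. It works on plain (word, speaker_tag) pairs rather than the API response, and the function name `group_by_speaker` and the sample data are hypothetical.

```python
from itertools import groupby


def group_by_speaker(tagged_words):
    """Collapse consecutive (word, speaker_tag) pairs into speaker turns."""
    turns = []
    # groupby only merges adjacent items, so a speaker who talks twice
    # produces two separate turns, preserving the conversation order.
    for tag, run in groupby(tagged_words, key=lambda pair: pair[1]):
        turns.append((tag, ' '.join(word for word, _ in run)))
    return turns


if __name__ == '__main__':
    # Made-up sample data for illustration.
    words = [('hi', 1), ('there', 1), ('hello', 2), ('bye', 1)]
    for tag, text in group_by_speaker(words):
        print('Speaker {}: {}'.format(tag, text))
```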