Adds audio querying to MultimodalQ&A gateway #974
Signed-off-by: okhleif-IL <[email protected]>
* added in audio dict creation
* separated audio from prompt
* added ASR endpoint
* removed ASR endpoints from mm embedding
* edited return logic, fixed function call
* added megaservice to elif
* reworked helper func
* Append audio to prompt
* Reworked handle messages, added metadata
* Moved dictionary logic to right place
* changed logic to rely on message len
* list --> empty str
---------
Signed-off-by: Melanie Buehler <[email protected]>
Signed-off-by: okhleif-IL <[email protected]>
Signed-off-by: dmsuehir <[email protected]>
for more information, see https://pre-commit.ci
Signed-off-by: okhleif-IL <[email protected]>
Fixed role bug where enumeration was wrong
LGTM!
Signed-off-by: Melanie Buehler <[email protected]>
Adds unit test coverage for audio query
Signed-off-by: Melanie Buehler <[email protected]>
Fix port number placement
Looks good to me!
        else:
            return prompt

    def convert_audio_to_text(self, audio):
        # translate audio to text by passing in dictionary to ASR
The comment is quirky: "dictionary" is a data type here, but it can be confused with the English word "dictionary" (a book of word meanings).
        else:
            input_dict = {"byte_str": audio[0]}

        response = requests.post(self.asr_endpoint, data=json.dumps(input_dict), proxies={"http": None})
Should proxies be read from an environment variable for a more general solution?
Why is this setting proxies in the first place? Shouldn't those be set well before this point?
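One way to address both questions is to derive the `proxies` argument from configuration instead of hard-coding it at the call site. The following is only a minimal sketch: the `ASR_DISABLE_PROXY` variable and the `asr_proxies` helper are hypothetical names introduced for illustration and are not part of this PR.

```python
import os


def asr_proxies(env=None, var="ASR_DISABLE_PROXY"):
    """Return a value for the `proxies` argument of requests.post.

    `ASR_DISABLE_PROXY` is a hypothetical variable name used only in this
    sketch. When it is truthy (the default here), the HTTP proxy is bypassed
    as in the reviewed line; otherwise None is returned so that requests
    falls back to its normal proxy handling.
    """
    env = os.environ if env is None else env
    flag = env.get(var, "true").strip().lower()
    if flag in ("1", "true", "yes"):
        return {"http": None}
    return None
```

The call site would then read `requests.post(self.asr_endpoint, data=json.dumps(input_dict), proxies=asr_proxies())`. Note that `requests` already honors the standard `HTTP_PROXY`, `HTTPS_PROXY`, and `NO_PROXY` environment variables by default, so setting `NO_PROXY` for the ASR host at deployment time may make the explicit argument unnecessary.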
import unittest
from typing import Union

import requests
from fastapi import Request

os.environ["ASR_SERVICE_PORT"] = "8086"
Why does this override the environment instead of taking the value from the environment?
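A hedged sketch of what the question suggests: read `ASR_SERVICE_PORT` from the environment and fall back to `8086` only when it is unset. The `asr_service_port` helper is a hypothetical name used for illustration; it is not code from this PR.

```python
import os


def asr_service_port(env=None, default="8086"):
    """Return the ASR service port, preferring the environment over the default.

    Hypothetical helper for illustration: an existing ASR_SERVICE_PORT value
    is respected, and the default applies only when the variable is unset.
    """
    env = os.environ if env is None else env
    return int(env.get("ASR_SERVICE_PORT", default))
```

In a test module, `os.environ.setdefault("ASR_SERVICE_PORT", "8086")` has a similar effect in one line: it sets the value only if the variable is not already defined.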
Labeling this as |
Looking at new changes in |
Description
Adds ASR endpoint, speech audio processing, prompt construction, and return of decoded audio in response metadata. This goes with GenAIExamples PR: opea-project/GenAIExamples#1225.
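The flow described above (send audio to ASR, fold the decoded text into the prompt, return it in metadata) can be sketched roughly as follows. This is an illustration under stated assumptions, not the gateway's actual code: only the `{"byte_str": audio[0]}` payload and the `json.dumps` POST come from the reviewed diff. The injected `post` callable, the `text_key` response field, the `build_prompt_with_audio` helper, and the `"audio"` metadata key are all hypothetical.

```python
import json


def convert_audio_to_text(audio, asr_endpoint, post, text_key="text"):
    """Send base64 audio to an ASR service and return the decoded text.

    `post` is any callable shaped like requests.post (injected so the sketch
    can run without a network). `text_key` is a hypothetical response field
    name; the real ASR response format is not shown in the PR excerpt.
    """
    # Payload shape taken from the reviewed diff
    input_dict = {"byte_str": audio[0]}
    response = post(asr_endpoint, data=json.dumps(input_dict))
    return response.json()[text_key]


def build_prompt_with_audio(prompt, decoded_audio):
    """Append decoded speech to the prompt and return it with metadata.

    The "audio" metadata key is an assumption made for this sketch.
    """
    text = f"{prompt} {decoded_audio}".strip()
    return text, {"audio": decoded_audio}
```

With the real service, `post` would be `requests.post` and `asr_endpoint` would be the URL of the ASR endpoint configured for the gateway.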
Issues
Part of the MultimodalQnA Audio & Image Enhancements RFC
Type of change
Dependencies
N/A
Tests
Automated tests were added to GenAIExamples.