Adds audio querying to MultimodalQ&A gateway #974

mhbuehler · 2024-12-04T19:00:25Z

Description

Adds ASR endpoint, speech audio processing, prompt construction, and return of decoded audio in response metadata. This goes with GenAIExamples PR: opea-project/GenAIExamples#1225.

Issues

Part of the MultimodalQnA Audio & Image Enhancements RFC

Type of change

New feature (non-breaking change which adds new functionality)

Dependencies

N/A

Tests

Automated tests were added to GenAIExamples.

Signed-off-by: okhleif-IL <[email protected]>

* added in audio dict creation
  Signed-off-by: okhleif-IL <[email protected]>
* separated audio from prompt
  Signed-off-by: okhleif-IL <[email protected]>
* added ASR endpoint
  Signed-off-by: okhleif-IL <[email protected]>
* removed ASR endpoints from mm embedding
  Signed-off-by: okhleif-IL <[email protected]>
* edited return logic, fixed function call
  Signed-off-by: okhleif-IL <[email protected]>
* added megaservice to elif
  Signed-off-by: okhleif-IL <[email protected]>
* reworked helper func
  Signed-off-by: okhleif-IL <[email protected]>
* Append audio to prompt
  Signed-off-by: okhleif-IL <[email protected]>
* Reworked handle messages, added metadata
  Signed-off-by: okhleif-IL <[email protected]>
* Moved dictionary logic to right place
  Signed-off-by: okhleif-IL <[email protected]>
* changed logic to rely on message len
  Signed-off-by: okhleif-IL <[email protected]>
* list --> empty str
  Signed-off-by: okhleif-IL <[email protected]>

---------

Signed-off-by: Melanie Buehler <[email protected]>
Signed-off-by: okhleif-IL <[email protected]>
Signed-off-by: dmsuehir <[email protected]>

for more information, see https://pre-commit.ci

Signed-off-by: okhleif-IL <[email protected]>

Fixed role bug where enumeration was wrong

codecov · 2024-12-04T23:23:46Z

Codecov Report

Attention: Patch coverage is 66.66667% with 25 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
comps/cores/mega/gateway.py	66.66%	25 Missing ⚠️

Files with missing lines	Coverage Δ
comps/cores/mega/gateway.py	`31.43% <66.66%> (+3.29%)`	⬆️

comps/cores/mega/gateway.py

ashahba

LGTM!

Signed-off-by: Melanie Buehler <[email protected]>

Adds unit test coverage for audio query

Signed-off-by: Melanie Buehler <[email protected]>

Fix port number placement

mkbhanda

Looks good to me!

mkbhanda · 2024-12-06T19:13:28Z

comps/cores/mega/gateway.py

        else:
            return prompt

+    def convert_audio_to_text(self, audio):
+        # translate audio to text by passing in dictionary to ASR


The comment is quirky! "dictionary" is a data type here, but it can be confused with the English word dictionary (word meanings).

mkbhanda · 2024-12-06T19:15:23Z

comps/cores/mega/gateway.py

+        else:
+            input_dict = {"byte_str": audio[0]}
+
+        response = requests.post(self.asr_endpoint, data=json.dumps(input_dict), proxies={"http": None})


Should proxies be read from an environment variable for a more general solution?

Why is this setting proxies in the first place? Shouldn't those be set well before this point?
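
One way to address the proxy questions above, sketched under assumptions: rather than hard-coding `proxies={"http": None}`, let an environment variable decide whether to bypass proxies, and otherwise pass no `proxies` argument so that `requests` falls back to its default handling of the standard `HTTP_PROXY`/`HTTPS_PROXY`/`NO_PROXY` variables. The variable name `ASR_BYPASS_PROXY` and the helper `build_asr_request_kwargs` are hypothetical and not part of this PR; only the `byte_str` payload key and the `{"http": None}` mapping come from the diff.

```python
import json
import os
from typing import Any, Dict, Mapping, Optional

_TRUTHY = {"1", "true", "yes"}


def build_asr_request_kwargs(
    byte_str: str, env: Optional[Mapping[str, str]] = None
) -> Dict[str, Any]:
    """Build keyword arguments for requests.post() to the ASR endpoint.

    ASR_BYPASS_PROXY is a hypothetical opt-in variable (an assumption,
    not something the PR defines).
    """
    env = os.environ if env is None else env

    # JSON payload with the same key the diff uses.
    kwargs: Dict[str, Any] = {"data": json.dumps({"byte_str": byte_str})}

    # Only override proxy handling when explicitly requested; otherwise
    # leave "proxies" out so requests applies its own environment defaults.
    if env.get("ASR_BYPASS_PROXY", "").strip().lower() in _TRUTHY:
        kwargs["proxies"] = {"http": None}

    return kwargs
```

With a helper like this, the call site could read `requests.post(self.asr_endpoint, **build_asr_request_kwargs(audio[0]))`, which keeps the proxy decision out of the gateway code.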

eero-t · 2024-12-09T12:23:50Z

tests/cores/mega/test_multimodalqna_gateway.py

 import unittest
 from typing import Union

 import requests
 from fastapi import Request

+os.environ["ASR_SERVICE_PORT"] = "8086"


Why does this override the environment instead of taking the value from the environment?
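
A minimal sketch of what the question above suggests, assuming the test only needs a fallback port: read `ASR_SERVICE_PORT` from the environment and use `8086` (the value hard-coded in the test) only when it is unset. The helper name `asr_service_port` and the constant `DEFAULT_ASR_SERVICE_PORT` are hypothetical.

```python
import os
from typing import Mapping, Optional

# Fallback taken from the value the test in this PR hard-codes.
DEFAULT_ASR_SERVICE_PORT = "8086"


def asr_service_port(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the ASR port from the environment, or the default if unset."""
    env = os.environ if env is None else env
    return env.get("ASR_SERVICE_PORT", DEFAULT_ASR_SERVICE_PORT)
```

If the gateway itself reads the variable at import time, `os.environ.setdefault("ASR_SERVICE_PORT", DEFAULT_ASR_SERVICE_PORT)` at the top of the test module would have a similar effect while still respecting a value set by the caller.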

ashahba · 2024-12-10T01:04:54Z

Labeling this as WIP since we may not need it after all.

ashahba · 2024-12-10T23:41:31Z

Looking at the new changes in GenAIExamples, it seems we don't need this PR after all, since opea-project/GenAIExamples#1225 is self-contained.

mhbuehler requested a review from lvliang-intel as a code owner December 4, 2024 19:00

mhbuehler and others added 3 commits December 4, 2024 11:01

Merge branch 'main' into mmqna-audio-query

e1e5fde

[pre-commit.ci] auto fixes from pre-commit.com hooks

70c54e1

for more information, see https://pre-commit.ci

fixed role bug where i never was > 0

6a71843

Signed-off-by: okhleif-IL <[email protected]>

mhbuehler mentioned this pull request Dec 4, 2024

Adds audio querying to MultimodalQ&A Example opea-project/GenAIExamples#1225

Merged

1 task

okhleif-IL and others added 3 commits December 4, 2024 15:14

removed whitespace

615459b

Signed-off-by: okhleif-IL <[email protected]>

Merge pull request #13 from mhbuehler/omar/role-debug

1753473

Fixed role bug where enumeration was wrong

[pre-commit.ci] auto fixes from pre-commit.com hooks

dcafe8d

for more information, see https://pre-commit.ci

ashahba added WIP r1.2 labels Dec 4, 2024

ashahba added this to the v1.2 milestone Dec 4, 2024

ashahba reviewed Dec 5, 2024

comps/cores/mega/gateway.py

ashahba approved these changes Dec 5, 2024

mhbuehler and others added 8 commits December 5, 2024 14:45

Adds unit test coverage for audio query

fa47959

Signed-off-by: Melanie Buehler <[email protected]>

Port number fix

37826be

Signed-off-by: Melanie Buehler <[email protected]>

Formatting

40d34db

Signed-off-by: Melanie Buehler <[email protected]>

Merge pull request #14 from mhbuehler/melanie/add_test_coverage

6f2a753

Adds unit test coverage for audio query

[pre-commit.ci] auto fixes from pre-commit.com hooks

a665c3c

for more information, see https://pre-commit.ci

Merge branch 'main' into mmqna-audio-query

4a5c8ea

Fixed place where port number is set

d9ab567

Signed-off-by: Melanie Buehler <[email protected]>

Merge pull request #15 from mhbuehler/melanie/port_placement

75b135f

Fix port number placement

ashahba removed the WIP label Dec 6, 2024

mkbhanda approved these changes Dec 6, 2024

eero-t reviewed Dec 9, 2024

okhleif-IL mentioned this pull request Dec 9, 2024

Moved Audio Query Gateway changes to multimodalqna.py mhbuehler/GenAIExamples#29

Merged

4 tasks

ashahba added the WIP label Dec 10, 2024

ashahba closed this Dec 10, 2024
