
[InferenceClient] Support proxy calls for 3rd party providers #2781

Merged: 7 commits merged into main from add-proxy-calls on Jan 24, 2025

Conversation

hanouticelina (Contributor):
Resolves #2780

This is a first draft of the implementation of proxied calls. For now, this is done directly in the prepare_request method of each provider, as there might be some particularities for some providers (e.g. we need to update headers for fal.ai when the call is proxied).
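For context, here is a minimal sketch of what such a provider-specific prepare_request could look like. The class name, attributes, and the proxy base URL are hypothetical placeholders, not taken from the actual huggingface_hub code:

```python
# Illustrative sketch only: class names, attributes, and the proxy base URL are
# hypothetical placeholders, not the actual huggingface_hub implementation.
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class RequestParameters:
    url: str
    headers: Dict[str, str]
    json: Optional[Dict[str, Any]] = None


class FalAIProviderHelper:
    """Hypothetical helper showing how a proxied call could be prepared."""

    DIRECT_BASE_URL = "https://fal.run"               # direct call to the provider
    PROXY_BASE_URL = "https://example-hf-proxy.test"  # placeholder for the HF proxy

    def prepare_request(
        self, *, inputs: Any, model: str, api_key: str, proxied: bool
    ) -> RequestParameters:
        if proxied:
            # Route through Hugging Face: authenticate with the HF token and
            # never forward it to the third-party provider.
            url = f"{self.PROXY_BASE_URL}/fal-ai/{model}"
            headers = {"Authorization": f"Bearer {api_key}"}
        else:
            # Direct call: fal.ai expects its own 'Key <api key>' scheme,
            # hence the provider-specific header handling mentioned above.
            url = f"{self.DIRECT_BASE_URL}/{model}"
            headers = {"Authorization": f"Key {api_key}"}
        return RequestParameters(url=url, headers=headers, json={"input": inputs})
```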

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@@ -1186,3 +1186,203 @@ def test_chat_completion_error_in_stream():
def test_resolve_chat_completion_url(model_url: str, expected_url: str):
url = _build_chat_completion_url(model_url)
assert url == expected_url


class TestHFInferenceProvider:

hanouticelina (Contributor Author):
There might be some refactoring here as the tests are quite similar, but I opted to have separate test classes per provider to make adding tests for a new provider more straightforward.
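As an illustration of that per-provider layout, a test class might look like the following. It reuses the hypothetical FalAIProviderHelper sketched in the description above, so the names and expected values are illustrative rather than copied from the real test file:

```python
# Hypothetical per-provider test class; reuses the FalAIProviderHelper sketch
# above, so names and expected values are illustrative only.
class TestFalAIProvider:
    def test_prepare_request_direct(self):
        helper = FalAIProviderHelper()
        params = helper.prepare_request(
            inputs="a cat", model="fal-ai/flux/dev", api_key="my-fal-key", proxied=False
        )
        # Direct calls authenticate with the provider's own key scheme.
        assert params.headers["Authorization"] == "Key my-fal-key"

    def test_prepare_request_proxied(self):
        helper = FalAIProviderHelper()
        params = helper.prepare_request(
            inputs="a cat", model="fal-ai/flux/dev", api_key="hf_token", proxied=True
        )
        # Proxied calls must target the HF proxy and send the HF token as a
        # Bearer token, never leaking it to the third-party provider.
        assert params.url.startswith(FalAIProviderHelper.PROXY_BASE_URL)
        assert params.headers["Authorization"] == "Bearer hf_token"
```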

Contributor:
(likely to be revisited at some point yes but not shocked by having some redundancy in these tests)

@@ -2806,7 +2806,7 @@ def visual_question_answering(
"""
provider_helper = get_provider_helper(self.provider, task="visual-question-answering")
request_parameters = provider_helper.prepare_request(
-    inputs=None,
+    inputs=image,

hanouticelina (Contributor Author):
not sure how the test did not fail before 😅
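For what it's worth, a regression test along the following lines would catch a dropped image payload. prepare_vqa_request is a stand-in for the real provider-helper call, and the payload layout is an assumption for illustration only:

```python
# Hedged sketch of a regression test for the inputs=None bug: assert that the
# image actually reaches the request payload. 'prepare_vqa_request' is a
# stand-in, not the real huggingface_hub API.
import base64
from typing import Any, Dict


def prepare_vqa_request(inputs: bytes, question: str) -> Dict[str, Any]:
    """Stand-in that mimics building a VQA payload from raw image bytes."""
    return {
        "inputs": {
            "image": base64.b64encode(inputs).decode(),
            "question": question,
        }
    }


def test_vqa_payload_contains_image():
    image_bytes = b"\x89PNG\r\n\x1a\nfake-image-bytes"
    payload = prepare_vqa_request(image_bytes, "What animal is shown?")
    # With the pre-fix call (inputs=None) the image would be missing entirely.
    assert payload["inputs"]["image"] == base64.b64encode(image_bytes).decode()
```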

Contributor:
🙈

hanouticelina requested a review from Wauplin on January 24, 2025 at 14:18.

Wauplin (Contributor) left a comment:
Thanks! Mostly some high-level questions but overall logic looks good to me


Comment on lines 56 to 57
"Routing the call through Hugging Face's infrastructure using your HF token, "
"and the usage will be billed directly to your Hugging Face account"

Contributor:
Is that actually the case? My understanding was that if a user sets their api_key in their HF account, the proxy will use it. So usage is billed to the HF account only when using the proxy and no api_key has been set in the user settings. cc @SBrandeis for confirmation.

Contributor:
Suggested change:
- "Routing the call through Hugging Face's infrastructure using your HF token, "
- "and the usage will be billed directly to your Hugging Face account"
+ "Calling fal.ai provider through Hugging Face proxy."

hanouticelina (Contributor Author):
ah yes maybe you're right

Member:
> My understanding was that if a user sets their api_key in their HF account, the proxy will use it.

correct (but this changed today, there was some back'n'forth)
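
A rough sketch of the neutral logging suggested above, conditioned on whether the caller passed their own provider key; the function name and exact wording are illustrative, not the library's actual code:

```python
# Sketch of the neutral message discussed above. The client cannot know whether
# the proxy will bill the HF account or a provider api_key stored in the user's
# HF settings, so the proxied message only states that the call is proxied.
import logging

logger = logging.getLogger(__name__)


def log_provider_call(provider: str, has_provider_api_key: bool) -> None:
    if has_provider_api_key:
        # The caller supplied a provider key: the request goes straight to the provider.
        logger.info("Calling %s provider directly with your API key.", provider)
    else:
        # No provider key: route through Hugging Face without claiming who gets billed.
        logger.info("Calling %s provider through Hugging Face proxy.", provider)
```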

hanouticelina marked this pull request as ready for review on January 24, 2025 at 16:30.
hanouticelina requested a review from Wauplin on January 24, 2025 at 16:30.

Wauplin (Contributor) left a comment:

Let's go!

hanouticelina merged commit 9410c22 into main on Jan 24, 2025 (17 checks passed).
hanouticelina deleted the add-proxy-calls branch on January 24, 2025 at 16:59.
Successfully merging this pull request may close the following issue: implement proxy-ed calls (+ make sure we never leak HF token to another provider).