Add get_model_status to get a model status on InferenceAPI #1558

Closed
Wauplin opened this issue Jul 12, 2023 · 12 comments
Labels
good first issue Good for newcomers

Comments

@Wauplin
Contributor

Wauplin commented Jul 12, 2023

Related to #1557 but can be implemented separately.

InferenceClient could have an extra method get_model_status to get the status of a deployed model. For now this only makes sense for InferenceAPI. We could think about a status for InferenceEndpoint as well, but that will probably be implemented separately (see #1541). The API endpoint for a given model is https://api-inference.huggingface.co/status/{model_id}.

This /status endpoint returns two main pieces of information:

  • is the model loadable?
  • is the model loaded?
# Not loaded model
>>> curl https://api-inference.huggingface.co/status/google/flan-t5-xxl
{"loaded":false,"state":"Loadable","compute_type":"gpu","framework":"text-generation-inference"}

# Not loadable model
>>> curl https://api-inference.huggingface.co/status/bigscience/bloomz
{"loaded":false,"state":"TooBig","compute_type":"cpu","framework":"transformers"}

# Loaded model
>>> curl https://api-inference.huggingface.co/status/bigcode/starcoder
{"loaded":true,"state":"Loaded","compute_type":"gpu","framework":"text-generation-inference"}

# Missing model
>>> curl https://api-inference.huggingface.co/status/unknown/model
{"error":"Model unknown/model does not exist"}

Here is how I see the method signature (open to discussion). It would be good to define a dataclass for the returned value.

class InferenceClient:
    ...

    def get_model_status(self, model: Optional[str] = None, *, token: Optional[str] = None) -> ModelStatus:
        model = model or self.model
        if model is None:
            raise ValueError("Model id not provided")
        if model.startswith("https://"):
            raise ValueError("...")  # only works for InferenceAPI, not any URL

        return ...
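
For illustration, a caller could then do something like this (a sketch, assuming the signature above and a ModelStatus dataclass mirroring the /status payload):

>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient()
>>> status = client.get_model_status("google/flan-t5-xxl")
>>> status.loaded, status.state
(False, 'Loadable')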
@sifisKoen
Contributor

Hello, it's a nice issue and I think I solved it following your initial function blueprint. I also added a @dataclass.
Here is the dataclass:

@dataclass
class ModelStatus:
    loaded: bool
    state: str
    compute_type: str
    framework: str

Then, after your if statements, I think you could include something like this:

huggingface_interface_response = requests.get(f"https://api-inference.huggingface.co/status/{model}")
huggingface_interface_response.raise_for_status()

Finally, you can add a couple of lines to check whether the response data contains an error. Something like this:

response_data = huggingface_interface_response.json()
if "error" in data:
    raise ValueError(response_data["error"])
elif response_data["loaded"] == False:
    raise ValueError(response_data["state"])

Then you can return the data through the dataclass you have already created. Something like this:

return ModelStatus(
    loaded=response_data["loaded"],
    state=response_data["state"],
    compute_type=response_data["compute_type"],
    framework=response_data["framework"],
)

FYI you need to import some modules.

from dataclasses import dataclass
import requests

I think this is what you need for this issue 😄
Let me know if it helped you.

@Wauplin
Contributor Author

Wauplin commented Jul 12, 2023

Thanks for your comment @sifisKoen! Yes, I definitely think this is the way to go. The only thing I would do differently is to not raise an error if the model is not loaded. If response_data["loaded"] == False, that is still valid information for the user.

Would you like to open a PR to put everything together? 🤗

@sifisKoen
Contributor

Thank you for your reply.

Ok, I got it. So you are thinking of something like:

1st Case

if response_data["loaded"] == False then we raise a ValueError

if response_data["loaded"] == False:
    raise ValueError(response_data["state"])

2nd Case

if response_data["loaded"] == False then we just return the state of the model

if response_data["loaded"] == False:
    return response_data["state"]

3rd Case

if response_data["loaded"] == False then we do nothing and we just make return. Witch I think is the less optimized way.

if response_data["loaded"] == False:
    # Do nothing and just do return?
    return

I am just asking so that I can provide the most suitable code for the project.

Yep I will do it. 😃
You have a very nice project here btw. 👏

@Wauplin
Contributor Author

Wauplin commented Jul 12, 2023

Ok, so my idea would be to return the dataclass no matter the value of loaded. Something like this:

response_data = huggingface_interface_response.json()
if "error" in response_data:
    raise ValueError(response_data["error"])
return ModelStatus(
    loaded=response_data["loaded"],
    state=response_data["state"],
    compute_type=response_data["compute_type"],
    framework=response_data["framework"],
)

By doing so, we let the user decide what's best to do with the information.
Thanks for your help and enthusiasm! 🤗 ❤️
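
For reference, putting the pieces discussed in this thread together, the whole method could look roughly like this (a sketch only; the actual implementation in the PR may be organized differently):

from dataclasses import dataclass
from typing import Optional

import requests


@dataclass
class ModelStatus:
    loaded: bool
    state: str
    compute_type: str
    framework: str


class InferenceClient:
    ...

    def get_model_status(self, model: Optional[str] = None, *, token: Optional[str] = None) -> ModelStatus:
        # Fall back to the client's default model if none is given.
        model = model or self.model
        if model is None:
            raise ValueError("Model id not provided.")
        if model.startswith("https://"):
            raise ValueError("Model status is only available for Inference API models, not arbitrary URLs.")

        # Query the /status endpoint; token handling is omitted in this sketch.
        response = requests.get(f"https://api-inference.huggingface.co/status/{model}")
        response.raise_for_status()
        response_data = response.json()

        # An "error" key means the model does not exist (or another server-side error).
        if "error" in response_data:
            raise ValueError(response_data["error"])

        # Return the status as-is, even if the model is not loaded,
        # so the user can decide what to do with the information.
        return ModelStatus(
            loaded=response_data["loaded"],
            state=response_data["state"],
            compute_type=response_data["compute_type"],
            framework=response_data["framework"],
        )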

@sifisKoen
Contributor

Ohhhh, cool, got you. I will open a new PR with the new function and a docstring describing it.

No problem, I will check other issues to see if I can help with them too. 🤝

sifisKoen added a commit to sifisKoen/huggingface_hub that referenced this issue Jul 12, 2023
@sjjpo2002

Any update on the progress for this issue? Knowing the status of deployed models is especially important, as I couldn't find a way to list all the models that are deployed and available through the API.

@Wauplin
Contributor Author

Wauplin commented Aug 3, 2023

@sjjpo2002 Implementation has started in this PR: #1559
cc @sifisKoen about the development itself (implementation is mostly done, just some tests remaining)

@sifisKoen
Contributor

@sjjpo2002 Implementation has started in this PR: #1559 cc @sifisKoen about the development itself (implementation is mostly done, just some tests remaining)

Hello, these days I am out of office on vacation. I will be back in 5 days, so I will finish the tests then. @Wauplin

@Wauplin
Contributor Author

Wauplin commented Aug 3, 2023

Great, thanks @sifisKoen. Enjoy your vacation time! 🎉 😄

@sifisKoen
Contributor

@Wauplin hey mate. I just finished the tests and will upload them now. Sorry for my delay though. 😞

sifisKoen added a commit to sifisKoen/huggingface_hub that referenced this issue Aug 28, 2023
Wauplin added a commit that referenced this issue Aug 29, 2023
* Add get_model_status function (#1558)

* Update src/huggingface_hub/inference/_client.py

Accepting the suggestion.

Co-authored-by: Lucain <[email protected]>

* Update src/huggingface_hub/inference/_client.py

Accept the changes get_model_status function doc string.

Co-authored-by: Lucain <[email protected]>

* Update src/huggingface_hub/inference/_client.py

Accept the string inclusion in Error raise.

Co-authored-by: Lucain <[email protected]>

* Use interface endpoint constant and add get_session

Accept the two changes about the INFERENCE_ENDPOINT and get_session().

Co-authored-by: Lucain <[email protected]>

* Add dataclass in _common.py

* Docstring refactor

* Add new comments and modifications.

Co-authored-by: Lucain <[email protected]>

* Update src/huggingface_hub/inference/_client.py comments. From Wauplin

Co-authored-by: Lucain <[email protected]>

* Update src/huggingface_hub/inference/_client.py comments so to follow the correct syntax. From Wauplin

Co-authored-by: Lucain <[email protected]>

* Update src/huggingface_hub/inference/_common.py

Co-authored-by: Lucain <[email protected]>

* Add the tests for #1558

* fix async get_model_status

---------

Co-authored-by: Lucain <[email protected]>
@Wauplin
Contributor Author

Wauplin commented Aug 29, 2023

This feature is now available! Thanks @sifisKoen for your work ❤️
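
For anyone finding this issue later, a quick usage sketch of the released method (the output shown is illustrative):

>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient()
>>> client.get_model_status("bigcode/starcoder")
ModelStatus(loaded=True, state='Loaded', compute_type='gpu', framework='text-generation-inference')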

Wauplin closed this as completed Aug 29, 2023
@sifisKoen
Contributor

Thank you @Wauplin, loved working with you 😄
