Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix calls to defunct AsyncEngineClient #138

Merged
merged 4 commits into from
Sep 24, 2024

Conversation

NickLucche
Copy link
Contributor

@NickLucche NickLucche commented Sep 23, 2024

Address breaking changes in upstream main (v0.6.2?) vllm-project/vllm#8673 and vllm-project/vllm#8157. The latter is yet another bulky change to the rpc client-server interaction: I couldn't spot differences in functionality and tests are fine on our side, but I could really use some help to double check that what's been implemented upstream is still fully compatible with the use being made of here.

I branched off #136, so we should be merging that PR first.

@codecov-commenter
Copy link

codecov-commenter commented Sep 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 57.26%. Comparing base (10dd1d9) to head (f929661).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #138      +/-   ##
==========================================
+ Coverage   56.80%   57.26%   +0.45%     
==========================================
  Files          25       25              
  Lines        1542     1542              
  Branches      256      256              
==========================================
+ Hits          876      883       +7     
+ Misses        588      581       -7     
  Partials       78       78              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dtrifiro dtrifiro force-pushed the asyncengineclient-replacement-fix branch from 1525dd3 to 08652e5 Compare September 23, 2024 15:20
@@ -262,7 +262,7 @@ async def Generate(
log_tracing_disabled_warning()
generators.append(
self.engine.generate(
inputs=inputs,
inputs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not remove this, I prefer explicitly specifying the names of the input, as this has caused issues in the past

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can I move inputs to prompt and assume >v0.6.1.post2 or do I have to be backward compatible?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sad this wasn't done in a non-breaking way in vllm: https://github.com/vllm-project/vllm/pull/8673/files#diff-e8f334a55697b72398acbbed4d280bcb3e5eb6c2e66fc81ccde477818b4b0351R38. Maybe we can simply comment here that the kwarg name is removed due to this change, and then in later releases add in prompt=?

Might also be worth changing the name of the inputs variable here to prompt

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are reverting upstream for now vllm-project/vllm#8750.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it could've been done in a less abrupt way upstream by warning about deprecation of inputs while adding both inputs and prompts for some time..

@dtrifiro dtrifiro self-requested a review September 23, 2024 15:22
Copy link
Contributor

@dtrifiro dtrifiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, one minor nit

@@ -177,12 +177,12 @@ class TextGenerationService(generation_pb2_grpc.GenerationServiceServicer):

def __init__(
self,
engine: AsyncEngineClient | AsyncLLMEngine,
engine: EngineClient | AsyncLLMEngine,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that the union should be needed here.. EngineClient is a Protocol and AsyncLLMEngine implements that protocol.. so only the former should be needed I think?

@@ -65,7 +65,7 @@
from transformers import PreTrainedTokenizer
from vllm import CompletionOutput, RequestOutput
from vllm.config import ModelConfig
from vllm.engine.protocol import AsyncEngineClient
from vllm.engine.protocol import EngineClient
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this was renamed in the latest changes, won't this break usage of the adapter with existing releases (which I thought @dtrifiro tries to maintain)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from vllm.engine.protocol import EngineClient
try:
from vllm.engine.protocol import EngineClient
except ImportError:
from vllm.engine.protocol import AsyncEngineClient as EngineClient

Maybe this is simple to alias like so?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep my bad, good one!

@joerunde joerunde merged commit 5852586 into main Sep 24, 2024
3 checks passed
@joerunde joerunde deleted the asyncengineclient-replacement-fix branch September 24, 2024 14:55
@joerunde
Copy link
Contributor

Thanks @NickLucche!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants