Fix calls to defunct AsyncEngineClient #138
Conversation
**Codecov Report:** All modified and coverable lines are covered by tests ✅

```
@@ Coverage Diff @@
##             main     #138      +/-   ##
==========================================
+ Coverage   56.80%   57.26%   +0.45%
==========================================
  Files          25       25
  Lines        1542     1542
  Branches      256      256
==========================================
+ Hits          876      883       +7
+ Misses        588      581       -7
  Partials       78       78
```

☔ View full report in Codecov by Sentry.
Force-pushed (…eClient type with interface) from 1525dd3 to 08652e5.
```diff
@@ -262,7 +262,7 @@ async def Generate(
                 log_tracing_disabled_warning()
             generators.append(
                 self.engine.generate(
-                    inputs=inputs,
+                    inputs,
```
I'd rather not remove this; I prefer explicitly specifying the argument names, as this has caused issues in the past.
Can I move `inputs` to `prompt` and assume >v0.6.1.post2, or do I have to be backward compatible?
Ah, sad this wasn't done in a non-breaking way in vllm: https://github.com/vllm-project/vllm/pull/8673/files#diff-e8f334a55697b72398acbbed4d280bcb3e5eb6c2e66fc81ccde477818b4b0351R38. Maybe we can simply comment here that the kwarg name is removed due to this change, and then in later releases add in `prompt=`? Might also be worth renaming the `inputs` variable here to `prompt`.
We are reverting upstream for now: vllm-project/vllm#8750.
I guess it could've been done in a less abrupt way upstream by warning about the deprecation of `inputs` while accepting both `inputs` and `prompts` for some time.
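The non-breaking rename described above could have looked like this hypothetical shim (a sketch, not vllm's actual code): accept both keywords for a deprecation window and warn on the old one.

```python
import warnings


def generate(prompt=None, *, inputs=None, **kwargs):
    # Hypothetical compatibility shim: honor the legacy "inputs"
    # kwarg while steering callers toward the new "prompt" name.
    if inputs is not None:
        warnings.warn(
            "'inputs' is deprecated, use 'prompt' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        if prompt is None:
            prompt = inputs
    return prompt
```

Old call sites keep working (with a warning) until the legacy kwarg is finally dropped in a later release.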
Looks good, one minor nit:
```diff
@@ -177,12 +177,12 @@ class TextGenerationService(generation_pb2_grpc.GenerationServiceServicer):

     def __init__(
         self,
-        engine: AsyncEngineClient | AsyncLLMEngine,
+        engine: EngineClient | AsyncLLMEngine,
```
I'm not sure that the union should be needed here. `EngineClient` is a `Protocol` and `AsyncLLMEngine` implements that protocol, so only the former should be needed, I think?
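For illustration, structural typing with `typing.Protocol` is what makes the union redundant. `EngineLike` and `ConcreteEngine` below are hypothetical stand-ins, not vllm's real classes:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class EngineLike(Protocol):
    # Minimal stand-in: a Protocol only declares the required shape.
    async def generate(self, prompt, sampling_params, request_id): ...


class ConcreteEngine:
    # No explicit inheritance from EngineLike: matching the method
    # shape is enough for static type checkers (and for isinstance,
    # thanks to @runtime_checkable).
    async def generate(self, prompt, sampling_params, request_id):
        return prompt


def serve(engine: EngineLike) -> None:
    # Annotating with the Protocol alone accepts ConcreteEngine,
    # so a union like "EngineLike | ConcreteEngine" adds nothing.
    pass
```

If `AsyncLLMEngine` satisfies `EngineClient`'s protocol, annotating with `EngineClient` alone should type-check against both.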
```diff
@@ -65,7 +65,7 @@
 from transformers import PreTrainedTokenizer
 from vllm import CompletionOutput, RequestOutput
 from vllm.config import ModelConfig
-from vllm.engine.protocol import AsyncEngineClient
+from vllm.engine.protocol import EngineClient
```
Since this was renamed in the latest changes, won't this break usage of the adapter with existing releases (which I thought @dtrifiro tries to maintain)?
Maybe this is simple to alias, like so?

```diff
-from vllm.engine.protocol import EngineClient
+try:
+    from vllm.engine.protocol import EngineClient
+except ImportError:
+    from vllm.engine.protocol import AsyncEngineClient as EngineClient
```
yep my bad, good one!
Thanks @NickLucche!
Address breaking changes in upstream main (v0.6.2?): vllm-project/vllm#8673 and vllm-project/vllm#8157. The latter is yet another bulky change to the RPC client-server interaction: I couldn't spot differences in functionality and tests pass on our side, but I could really use some help double-checking that what's been implemented upstream is still fully compatible with the use made of it here.
I branched off #136, so we should merge that PR first.