Try fetching stop_reason from EngineOutput before checking the request
bnellnm committed Feb 11, 2025
1 parent 565c1ef commit 481ced7
Showing 1 changed file with 5 additions and 2 deletions.
7 changes: 5 additions & 2 deletions vllm/v1/engine/output_processor.py
@@ -179,11 +179,14 @@ def process_outputs(
             # in the EngineCore.
             req_state.is_prefilling = not new_token_ids
 
+            stop_reason = engine_core_output.stop_reason
+
             # 2) Detokenize the token ids into text and check for stop
             #    strings.
-            stop_reason = req_state.detokenizer.update(new_token_ids)
-            if stop_reason:
+            stop_string = req_state.detokenizer.update(new_token_ids)
+            if stop_string and finish_reason != FinishReason.STOP:
                 finish_reason = FinishReason.STOP
+                stop_reason = stop_string
 
             # 3) Compute sample and prompt logprobs for request,
             #    if required.
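
The precedence this change introduces can be sketched in isolation. The following is a minimal, standalone illustration and is not the vLLM implementation: `resolve_stop` and its parameters are hypothetical names, the `FinishReason` enum here is a stand-in for the one referenced in the diff, and it assumes that `finish_reason` was populated from the engine core output earlier in `process_outputs` (that line is outside the hunk shown).

```python
from enum import Enum
from typing import Optional, Tuple, Union


class FinishReason(Enum):
    # Hypothetical stand-in for the FinishReason used in the diff.
    STOP = "stop"
    LENGTH = "length"


def resolve_stop(
    engine_finish_reason: Optional[FinishReason],
    engine_stop_reason: Union[int, str, None],
    detokenizer_stop_string: Optional[str],
) -> Tuple[Optional[FinishReason], Union[int, str, None]]:
    """Hypothetical helper mirroring the branch added in this commit."""
    # Start from whatever the engine core already reported.
    finish_reason = engine_finish_reason
    stop_reason = engine_stop_reason

    # A stop string found during detokenization only takes effect when
    # the engine has not already finished the request with STOP.
    if detokenizer_stop_string and finish_reason != FinishReason.STOP:
        finish_reason = FinishReason.STOP
        stop_reason = detokenizer_stop_string

    return finish_reason, stop_reason


# The engine already stopped (e.g. on a stop token id): its reason is kept.
print(resolve_stop(FinishReason.STOP, 42, "END"))
# The engine did not stop: the detokenizer's stop string wins.
print(resolve_stop(None, None, "END"))
```

Under these assumptions, a `stop_reason` supplied by the engine is no longer overwritten by the detokenizer result, which appears to be the behavior the commit title describes.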
