
[Bug]: AttributeError in OpenAIServingChat when accessing chat_template when using ray serve #4296

Closed
BoussouarSari opened this issue Apr 23, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@BoussouarSari

Your current environment

vLLM version: v0.4.1

🐛 Describe the bug

Description

I'm encountering an AttributeError in the OpenAIServingChat module when integrating Ray Serve with the OpenAI API. The error arises because the tokenizer object is accessed before it is fully initialized.

Error Message

The error occurs at the line where the chat template is read from the tokenizer:

AttributeError: 'NoneType' object has no attribute 'chat_template'

Issue Details

The tokenizer is instantiated asynchronously in the _post_init() function of the OpenAIServing class. However, this instantiation happens conditionally within the constructor, depending on whether an event loop is already running.

The _load_chat_template function, which relies on the tokenizer, is invoked synchronously in the constructor, potentially before the tokenizer has been initialized, as sketched below.
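
A minimal sketch of that ordering problem, using hypothetical names rather than the actual vLLM classes (OpenAIServingLike, _FakeTokenizer, and the asyncio.sleep delay are all illustrative assumptions):

    import asyncio

    class _FakeTokenizer:
        # Stand-in tokenizer exposing only the attribute that matters here.
        chat_template = "{{ messages }}"

    class OpenAIServingLike:
        # Hypothetical stand-in for OpenAIServing, just to show the race.
        def __init__(self):
            self.tokenizer = None
            try:
                # Under Ray Serve an event loop is already running, so the
                # async init is only scheduled here, not awaited.
                loop = asyncio.get_running_loop()
                loop.create_task(self._post_init())
            except RuntimeError:
                # No running loop: block until init completes.
                asyncio.run(self._post_init())
            # In the Ray Serve case this runs before _post_init() has set
            # self.tokenizer, so .chat_template is read from None.
            self._load_chat_template()

        async def _post_init(self):
            await asyncio.sleep(0)  # stand-in for building the tokenizer
            self.tokenizer = _FakeTokenizer()

        def _load_chat_template(self):
            # AttributeError: 'NoneType' object has no attribute 'chat_template'
            print(self.tokenizer.chat_template)

    async def main():
        # Constructing the object inside a running loop reproduces the error.
        OpenAIServingLike()

    asyncio.run(main())

Constructed outside a running loop, the asyncio.run branch finishes _post_init first and the same call succeeds; constructed inside an already-running loop (as under Ray Serve), it raises the AttributeError above.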

Suggested Solution

Convert _load_chat_template to an asynchronous function and invoke it the same way as _post_init, so that it only runs after the tokenizer has been initialized. This keeps the initialization order intact and avoids the premature access.
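
A hedged sketch of that ordering, continuing the hypothetical class above rather than the actual patch that later landed: _load_chat_template becomes a coroutine awaited at the end of _post_init, and the direct call in the constructor is dropped.

        async def _post_init(self):
            await asyncio.sleep(0)  # stand-in for building the tokenizer
            self.tokenizer = _FakeTokenizer()
            # Only runs once the tokenizer is assigned, regardless of whether
            # _post_init was awaited directly or scheduled on a running loop.
            await self._load_chat_template()

        async def _load_chat_template(self):
            print(self.tokenizer.chat_template)  # tokenizer is guaranteed here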

@schoennenbeck
Contributor

See this PR: #2727

@DarkLight1337
Member

Fixed by #2727

@Iven2132

Hi @DarkLight1337, it hasn't been fixed yet; I still hit the same error with this code:

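    # NOTE: snippet from inside a larger deployment function; MODEL_NAME and
    # N_GPU are assumed to be defined elsewhere in the script.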
    import fastapi
    import vllm.entrypoints.openai.api_server as api_server
    from vllm.engine.arg_utils import AsyncEngineArgs
    from vllm.engine.async_llm_engine import AsyncLLMEngine
    from vllm.entrypoints.logger import RequestLogger
    from vllm.entrypoints.openai.serving_chat import OpenAIServingChat
    from vllm.entrypoints.openai.serving_completion import (
        OpenAIServingCompletion,
    )
    from vllm.entrypoints.openai.serving_engine import BaseModelPath
    from vllm.usage.usage_lib import UsageContext

    # create a fastAPI app that uses vLLM's OpenAI-compatible router
    web_app = fastapi.FastAPI(
        title=f"OpenAI-compatible {MODEL_NAME} server",
        description="Run an OpenAI-compatible LLM server with vLLM on modal.com 🚀",
        version="0.0.1",
        docs_url="/docs",
    )

    router = fastapi.APIRouter()

    # wrap vllm's router in auth router
    router.include_router(api_server.router)
    # add authed vllm to our fastAPI app
    web_app.include_router(router)

    engine_args = AsyncEngineArgs(
        model=MODEL_NAME,
        tensor_parallel_size=N_GPU,
        gpu_memory_utilization=0.90,
        max_model_len=8096,
        enforce_eager=False,  # capture the graph for faster inference, but slower cold starts (30s > 20s)
    )

    engine = AsyncLLMEngine.from_engine_args(
        engine_args, usage_context=UsageContext.OPENAI_API_SERVER
    )

    model_config = engine.get_model_config()

    request_logger = RequestLogger(max_log_len=2048)

    base_model_paths = [
        BaseModelPath(name=MODEL_NAME.split("/")[1], model_path=MODEL_NAME)
    ]

    api_server.chat = lambda s: OpenAIServingChat(
        engine,
        model_config=model_config,
        base_model_paths=base_model_paths,
        chat_template=None,
        response_role="assistant",
        lora_modules=[],
        prompt_adapters=[],
        request_logger=request_logger,
    )
    api_server.completion = lambda s: OpenAIServingCompletion(
        engine,
        model_config=model_config,
        base_model_paths=base_model_paths,
        lora_modules=[],
        prompt_adapters=[],
        request_logger=request_logger,
    )

    return web_app

@DarkLight1337
Member

Can you open a new issue and provide more detailed information?
