Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[server] phi-3 uses <|endoftext|> instead of <|end|> when applying chat template in /chat/completions #7432

Closed
andysalerno opened this issue May 21, 2024 · 1 comment · Fixed by #7449

Comments

@andysalerno
Copy link
Contributor

andysalerno commented May 21, 2024

When using phi-3 without the option --chat-template phi3, the tokenization is incorrect.

For example, if I do use --chat-template phi3, here is the log output when I send the message "hi":

{
    "level": "VERB",
    "function": "update_slots",
    "line": 1954,
    "msg": "prompt tokenized",
    "id_slot": 0,
    "id_task": 1,
    "n_ctx": 8192,
    "n_keep": 0,
    "n_prompt_tokens": 7,
    "prompt_tokens": "<s><|system|><|end|><|user|> hi<|end|><|assistant|>"
}

actually the extra space after <|user|> is concerning, it should be a newline, but maybe that's just an artifact of how the log message is formatted.

But here's what happens when the --chat-template phi3 is omitted:

{
    "level": "VERB",
    "function": "update_slots",
    "line": 1954,
    "msg": "prompt tokenized",
    "id_slot": 0,
    "id_task": 0,
    "n_ctx": 8192,
    "n_keep": 0,
    "n_prompt_tokens": 11,
    "prompt_tokens": "<s><|system|><|endoftext|> \n<|user|> hi<|endoftext|> \n<|assistant|>"
}

See how it uses <|endoftext|> (wrong) instead of <|end|> (correct) which causes really bad generation.

I am using the gguf straight from Microsoft, so I guess it is as official as it gets:

https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf

Possibly the problem is in the gguf itself? Even so, it's weird that using the "official" gguf results in incorrect tokenization output from the template applied.

Now, you could just always use --chat-template phi3. But my expectation is the phi3 chat template should automatically be picked up by the detection heuristic, when using the canonical/official Phi-3 models, since they purport to support phi3.

@tristandruyen
Copy link
Contributor

tristandruyen commented May 22, 2024

So regarding your note about the space, the code actually uses a newline after <|user|> as you can see here, so it seems like a display artifact.

The template auto-detection seems broken, it mistakenly selects the zephyr chat template due to the matching <|user|>:

    # ....
    } else if (tmpl == "zephyr" || tmpl.find("<|user|>") != std::string::npos) {
        // zephyr template
        for (auto message : chat) {
            ss << "<|" << message->role << "|>" << "\n" << message->content << "<|endoftext|>\n";
        }
        if (add_ass) {
            ss << "<|assistant|>\n";
        }
    } 

and this check happens before the phi template check

    # ....
    } else if (tmpl == "phi3" || (tmpl.find("<|assistant|>") != std::string::npos && tmpl.find("<|end|>") != std::string::npos )) {
        // Phi 3
        for (auto message : chat) {
            std::string role(message->role);
            ss << "<|" << role << "|>\n" << trim(message->content) << "<|end|>\n";
        }
        if (add_ass) {
            ss << "<|assistant|>\n";
        }
    } 

This should be a pretty easy fix though, I'll make a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants