Update LLMModelFactory.swift #183

Open: wants to merge 2 commits into main

Conversation

@hasnat
Author

hasnat commented Jan 23, 2025

Adds a ModelConfiguration for DeepSeek (mlx-community/DeepSeek-R1-Distill-Qwen-7B-4bit).

I'm unsure whether anything needs to be done for ModelTypeRegistry.creators, but it worked for me.
[Image: IMG_0E75D5E1FB1E-1]

@DePasqualeOrg
Contributor

A silent error hides it in the model's output here, but the model is not actually using its chat template, which results in worse output. Fixing that requires @pcuenca to create a new version tag for swift-transformers, ideally after merging my PR for tool use as well as his preferred formatting solution. mlx-swift-examples could then use the latest versions of Jinja and swift-transformers, which would enable function calling, chat templates for vision models, and support for some recent models (Phi-4 and DeepSeek R1).

@awni
Member

awni commented Jan 23, 2025

Thanks for the PR. Let's wait until Swift Transformers updates Jinja and tags a new release so we can update that here. I'm guessing @pcuenca will get to it soon :).

In the meantime, anyone who wants to try the model can clone this PR, but be sure to manually update the Jinja package. Otherwise you will be using the model without a chat template, and it will give pretty bad results.

@hasnat
Author

hasnat commented Jan 23, 2025 via email

@pcuenca
Contributor

pcuenca commented Jan 24, 2025

Sorry for the delay. There's still a problem when applying the chat template in swift-transformers; I'm looking into it.

@pcuenca
Contributor

pcuenca commented Jan 24, 2025

I just pushed swift-transformers 0.1.15, which includes the new Jinja engine, fixes for tokenization issues that affected the DeepSeek tokenizer, fixes for Phi-4, and more.

@awni
Member

awni commented Jan 24, 2025

Thanks @pcuenca, that's awesome!!

@davidkoski
Collaborator

OK, so I think we just need to get the branch/tag pointers updated here, both in the xcodeproj and the Project.swift.
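
For anyone following along, the manifest side of that pointer update might look roughly like the sketch below. It assumes "Project.swift" refers to the Swift Package Manager manifest (Package.swift). The package name `ExamplePackage`, the target name `ExampleTarget`, the platform minimums, and the repository URL are assumptions for illustration and are not taken from this PR. Only the 0.1.15 version comes from the thread.

```swift
// swift-tools-version: 5.9
// Hypothetical manifest: "ExamplePackage" and "ExampleTarget" are placeholder
// names, not the actual mlx-swift-examples package layout.
import PackageDescription

let package = Package(
    name: "ExamplePackage",
    // Assumed minimums; adjust to whatever the dependency actually requires.
    platforms: [.macOS(.v14), .iOS(.v16)],
    dependencies: [
        // Pin to the release mentioned above (0.1.15) or any later
        // compatible version. The URL is an assumption.
        .package(
            url: "https://github.com/huggingface/swift-transformers",
            from: "0.1.15"
        )
    ],
    targets: [
        .target(
            name: "ExampleTarget",
            dependencies: [
                // "Transformers" is assumed to be the library product name.
                .product(name: "Transformers", package: "swift-transformers")
            ]
        )
    ]
)
```

After changing the requirement, running `swift package update` (or updating package versions from Xcode) should re-resolve swift-transformers and its transitive dependencies. The xcodeproj keeps its own package reference, so it would need the same version bump separately.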
