<1 tok/sec with A100 with DeepSeek-R1-Distill-Llama-8B #11555

Answered by cmp-nct
sb98052 asked this question in Q&A

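Unless something has changed in the past few months, you still have to choose how many layers to offload to the GPU with -ngl. Without it the model most likely runs entirely on the CPU, which would explain the <1 tok/sec you're seeing on an A100.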
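
For illustration, here is a rough sketch of what passing -ngl looks like in practice. It only assembles and prints the command line. The helper name `build_llama_cli_command`, the GGUF file name, and the prompt are made up for this example, and the `llama-cli` binary name assumes a reasonably recent GPU-enabled (e.g. CUDA) build of llama.cpp. A value like 99 is a common way to ask for all layers, since anything above the model's layer count just offloads everything.

```python
import shlex


def build_llama_cli_command(model_path, n_gpu_layers, prompt, binary="llama-cli"):
    """Assemble an argv list for llama.cpp's CLI with GPU offload requested.

    Hypothetical helper for illustration only; it does not run anything.
    """
    return [
        binary,
        "-m", model_path,            # path to the GGUF model file
        "-ngl", str(n_gpu_layers),   # number of layers to offload to the GPU
        "-p", prompt,                # prompt text
    ]


# Assumed file name; substitute the GGUF you actually downloaded.
cmd = build_llama_cli_command(
    "DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf",
    99,  # larger than the model's layer count, i.e. "offload everything"
    "Hello",
)

# Print a copy-pasteable shell command.
print(shlex.join(cmd))
```

If the GPU is actually being used, the startup log should report layers being offloaded; if it reports none, the build probably lacks GPU support.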

Answer selected by sb98052