🚀 The feature, motivation and pitch
Prospective Crusoe customers would like to understand how to run common Llama workflows (inference, fine-tuning, training) on our cloud. We'd like to contribute Llama-centered solutions to the 3p_integrations repo, beginning with a tutorial on benchmarking FP8 quants served via vLLM. The tutorial covers how to deploy resources on Crusoe, start a vLLM server, run and interpret benchmarks, and finally how to create FP8 quants of existing Llama 3 fine-tunes.
We hope for this to be the first of a series of solutions for common Llama workflows!
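As a rough sketch of the workflow the proposed tutorial would cover, serving a Llama 3 model with FP8 quantization in vLLM and benchmarking it could look like the following. The model name, port, dataset, and benchmark flags here are illustrative assumptions, not details from this issue, and FP8 serving generally assumes a GPU with native FP8 support (e.g. NVIDIA H100):

```shell
# Start an OpenAI-compatible vLLM server with FP8 quantization.
# Model name and port are illustrative placeholders.
vllm serve meta-llama/Meta-Llama-3-8B-Instruct \
    --quantization fp8 \
    --port 8000

# In a second shell, run vLLM's serving benchmark against that server.
# benchmark_serving.py ships in the vLLM repo under benchmarks/;
# the ShareGPT dataset path is an assumption for this sketch.
python benchmarks/benchmark_serving.py \
    --backend vllm \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --dataset-name sharegpt \
    --dataset-path ShareGPT_V3_unfiltered_cleaned_split.json \
    --num-prompts 200
```

The benchmark script reports request throughput and token-level latency statistics (e.g. TTFT and inter-token latency), which is the kind of output the tutorial would walk through interpreting.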
Alternatives
No response
Additional context
No response