garak audit for large offline models that won't fit in one GPU #1090

I am able to run garak on a smaller model. How would I do it for a bigger model, say Llama 3 70B, that requires multiple GPUs to start inferencing?

Replies: 1 comment

- I'd recommend either using a version hosted elsewhere and one of the endpoint-based generators, e.g. REST, …
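To make the reply concrete: garak's REST generator is driven by a JSON options file passed on the command line. The sketch below assumes the model has been served behind an HTTP endpoint (e.g. a vLLM or TGI deployment spread across the GPUs, exposing a `/generate`-style route); the URI, header, and response-field values here are placeholders for illustration, and the exact option names should be checked against the garak REST generator documentation for the installed version.

```json
{
  "rest": {
    "RestGenerator": {
      "name": "llama-3-70b (hosted)",
      "uri": "https://your-inference-host/generate",
      "method": "post",
      "headers": {
        "Authorization": "Bearer $KEY",
        "Content-Type": "application/json"
      },
      "req_template_json_object": {
        "prompt": "$INPUT"
      },
      "response_json": true,
      "response_json_field": "text"
    }
  }
}
```

With a file like this saved as, say, `rest_config.json`, the scan is then pointed at the endpoint rather than a local checkpoint, along the lines of `python -m garak --model_type rest -G rest_config.json --probes <probe>`; this way garak itself never needs to load the 70B weights, and the multi-GPU placement is handled entirely by the serving stack.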