garak audit for large offline models that won't fit in one GPU #1090

I am able to run garak on a smaller model. How would I do it for a bigger model, say Llama 3 70B, that requires multiple GPUs to start inferencing?

Replies: 1 comment

- I'd recommend either using a version hosted elsewhere and one of the endpoint-based generators, e.g. REST, …
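To make the reply concrete: garak's REST generator is driven by a JSON options file passed on the command line. The sketch below assumes the model has been served behind an HTTP endpoint (e.g. a vLLM or TGI deployment spread across the GPUs, exposing a `/generate`-style route); the URI, header, and response-field values here are placeholders for illustration, and the exact option names should be checked against the garak REST generator documentation for the installed version.

```json
{
  "rest": {
    "RestGenerator": {
      "name": "llama-3-70b (hosted)",
      "uri": "https://your-inference-host/generate",
      "method": "post",
      "headers": {
        "Authorization": "Bearer $KEY",
        "Content-Type": "application/json"
      },
      "req_template_json_object": {
        "prompt": "$INPUT"
      },
      "response_json": true,
      "response_json_field": "text"
    }
  }
}
```

With a file like this saved as, say, `rest_config.json`, the scan is then pointed at the endpoint rather than a local checkpoint, along the lines of `python -m garak --model_type rest -G rest_config.json --probes <probe>`; this way garak itself never needs to load the 70B weights, and the multi-GPU placement is handled entirely by the serving stack.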