-
I was trying to use NeMo with a DeepSeek-distilled Llama model, and these models always start their responses with an opening `<think>` tag. After some debugging, I noticed that nemo-guardrails makes calls to the LLM with `max_tokens=3`, which captures only this opening think tag, and the rail blocks any further flow. Question is: is there a way to customize this, maybe a shim to parse the LLM response, or something to extend the token size, etc.?
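For illustration, the failure mode looks roughly like the sketch below. `fake_llm_call` mocks a reasoning model whose every completion opens with a `<think>` trace; it is a hypothetical stand-in, not a nemo-guardrails API.

```python
# Hypothetical sketch of the failure mode: a DeepSeek-distilled model
# spends its first tokens on the reasoning tag, so a 3-token cap
# returns only the opening trace. Not a real nemo-guardrails call.
def fake_llm_call(prompt: str, max_tokens: int) -> str:
    tokens = ["<think>", "\n", "The", " user", " asks", "...", "</think>", "\n", "Yes"]
    return "".join(tokens[:max_tokens])

# The rail expects a short answer, so it caps the call at 3 tokens...
print(fake_llm_call("Should the bot answer? Yes or No.", max_tokens=3))
# -> "<think>\nThe" : only the opening trace arrives, and the rail blocks.
```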
Replies: 2 comments
-
Hi @YaphetKG, yes, it is possible to set `max_tokens` per task. For example, have a look at the content safety prompts.yml. If your issue persists, please let me know; it could be that the task you are using does not support a `max_tokens` override. |
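For reference, a per-task override can look roughly like this in `prompts.yml` (a sketch modeled on the content safety example; the task name, prompt text, and value are illustrative, so check the shipped prompts.yml for the exact keys your tasks support):

```yaml
prompts:
  - task: content_safety_check_input $model=content_safety
    content: |
      Is the following user message safe? Answer "safe" or "unsafe".
      User message: "{{ user_input }}"
    # Raised from the default so the completion is not cut off
    # inside a reasoning trace.
    max_tokens: 200
```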
-
Hi @YaphetKG, while changing the `max_tokens` was needed, using reasoning models that emit reasoning traces (`deepseek-r1` or distilled models) in the output required a bit of extra work from the user. We now have an MR to make the process painless, see #996. This is still in review, but will be merged soon into `develop`. If you can test the changes and provide feedback, it would be helpful. Thanks, Traian |
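Until that lands, a user-side workaround might look like the sketch below: a small shim that strips the reasoning trace before the rails see the text. This is only an illustration of the "extra work" mentioned above, not the approach #996 takes.

```python
import re

# Matches a complete DeepSeek-style reasoning block.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_reasoning_trace(text: str) -> str:
    """Drop the <think>...</think> trace so rails only see the answer.

    If the completion was truncated mid-trace, everything from the
    opening tag onward is removed.
    """
    cleaned = THINK_BLOCK.sub("", text)
    if "<think>" in cleaned:  # trace was cut off before </think>
        cleaned = cleaned.split("<think>", 1)[0]
    return cleaned.strip()

print(strip_reasoning_trace("<think>\nreasoning...\n</think>\nYes, this is safe."))
# -> "Yes, this is safe."
```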