refactor: enhance model selection and input reduction for token limit exceedance #52
Conversation
if total_tokens < code_max_length:
    break

if total_tokens >= code_max_length:
If the request text is too long, raise `ContextLengthExceededError` directly, so this interaction can be skipped entirely to reduce cost.
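The fail-fast idea discussed above could be sketched as follows. This is a minimal illustration, assuming an exception class and a pre-check helper; `ContextLengthExceededError` appears in the diff, but `check_context_window` is an illustrative name, not the project's actual API.

```python
# Minimal sketch of the fail-fast pre-check; check_context_window is a
# hypothetical helper name, not part of the project's real interface.

class ContextLengthExceededError(Exception):
    """Raised when the prompt cannot fit into the model's context window."""

def check_context_window(total_tokens: int, code_max_length: int) -> None:
    # Raising before the chat request is sent skips the round trip
    # entirely, so nothing is billed for a call that would fail anyway.
    if total_tokens >= code_max_length:
        raise ContextLengthExceededError(
            f"Prompt uses {total_tokens} tokens; limit is {code_max_length}."
        )
```

The caller would invoke this check right before dispatching the request, and let the exception propagate to abort the current generation task.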
if total_tokens >= code_max_length:
    error_message = (
        f"Context length of {total_tokens} exceeds the maximum limit of {code_max_length} tokens "
        f"after {max_attempts} attempts. Please reduce the input size manually."
This line shouldn't be emitted; there is no "manually" step in this flow. If the input is still over the limit after two reduction attempts, we simply skip the object.
- Agreed, there should be no "manually" step here.
- The intent, though, is to raise an error when the reduction measures fail, rather than simply `pass`. Raising the error aborts the current documentation-generation task and prevents the subsequent interaction logic from running. What do you think?
No need. This case almost never happens, so just skipping it is enough. Besides, raising an error wouldn't help: a human can't do anything about a single object's code being too long anyway.
larger_models = {k: v for k, v in max_input_tokens_map.items() if v > code_max_length}
if larger_models:
    # Select a model with a larger input limit
    model = max(larger_models, key=larger_models.get)
Shouldn't there be a check that the model is actually usable? What if the config file lists a model but the user hasn't configured an api_key, or the api_key lacks permission? Maybe add exception handling to catch the possible errors and keep switching models until one works (or, when building the final dict from the config, only include entries whose api_key field is non-empty). Also handle failures after switching: keep trying other models with longer context windows until one is available. Finally, if no model is usable at all, fall through to the second-level reduction logic.
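The config-filtering variant of this suggestion could look like the sketch below. All names here are assumptions for illustration: `model_configs` stands in for the parsed config file, and `eligible_larger_models` is a hypothetical helper, not existing project code.

```python
# Hypothetical sketch of the reviewer's suggestion: only models whose
# config entry carries a non-empty api_key are eligible as fallbacks.
# model_configs maps model name -> its config dict from the config file.

def eligible_larger_models(max_input_tokens_map: dict,
                           model_configs: dict,
                           code_max_length: int) -> dict:
    return {
        name: limit
        for name, limit in max_input_tokens_map.items()
        if limit > code_max_length
        and model_configs.get(name, {}).get("api_key")  # skip unconfigured models
    }
```

With this filter applied up front, `max(larger_models, key=larger_models.get)` can only ever pick a model the user has credentials for.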
> The config file lists a model but the user hasn't configured an api_key (or, when building the final dict from the config, only include entries whose api_key field is non-empty)

This shouldn't be the responsibility of the OpenAI-interaction class; adding it here would introduce bloat, in my opinion. I think some library should instead validate the configuration right after reading the config file, guaranteeing that the config entries are complete.

> Handle failures after switching models, e.g. keep trying other models with longer context windows until one is available; finally, if no model is usable at all, fall through to the second-level reduction logic.

The current approach checks the request against the model's context window before interacting with the model, precisely to ensure the completion quality meets expectations. So in theory, a context window exceed cannot occur at this point.

Even if an error does occur, it must be of some other kind. In that case, after switching to the fallback model, the whole process has to restart: (1) check the context window, (2) resend the request.
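The retry flow described above can be sketched as a loop over fallback models. Everything here is illustrative, under the assumption that `send` is some function performing the actual chat request and `models` is a list of `(name, max_input_tokens)` pairs sorted by ascending context size.

```python
# Hypothetical sketch of the fallback flow: on any error, switch to the
# next larger-context model, re-run the context window check, and resend.

def request_with_fallback(send, models, total_tokens):
    """models: list of (name, max_input_tokens) pairs, sorted ascending."""
    last_error = None
    for name, limit in models:
        if total_tokens >= limit:        # (1) re-check the context window
            continue
        try:
            return send(name)            # (2) resend the request
        except Exception as exc:         # broad catch, for the sketch only
            last_error = exc
    raise RuntimeError("All fallback models failed") from last_error
```

If the loop exhausts every model, the caller would then proceed to the second-level input-reduction logic instead.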
This commit improves the handling of inputs whose token count exceeds the selected model's limit. When the combined token count of the system and user prompts surpasses the current model's limit, the code now dynamically selects a model with a higher input-token capacity. If no larger model is available, a fallback mechanism iteratively reduces the input size through specific content-removal strategies. This makes processing of large inputs more flexible and efficient, and improves the system's adaptability to varying input sizes.