Your current environment
n/a
🐛 Describe the bug
After #6578 was merged, the `AsyncMetricsCollector` respects the time interval between collections of speculative decoding metrics: it now collects them every 5 seconds instead of continuously.

While this is the desired behaviour, in practice the speculative decoding stats are now never logged. To be logged, the stats must be collected in exactly the same step in which we decide to log the output, and that decision is controlled by a different time interval. With the default settings, the two do not seem to coincide.
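The timing mismatch can be illustrated with a small, self-contained simulation. This is only a sketch and does not use any vLLM code: the function `simulate`, its parameters, the 30 ms step duration, the 5-second logging interval, and the 1-second offset between the two timers are all assumptions chosen for illustration. The `hold_latest` flag shows one possible remedy (keeping the most recent stats until the next log), which is not necessarily the right fix for the project.

```python
def simulate(total_ms, step_ms, collect_every_ms, log_every_ms,
             log_timer_start_ms=0, hold_latest=False):
    """Count how many log events include speculative decoding stats.

    Hypothetical model: each engine step asks a collector for stats and
    then asks a logger whether it is time to log. The two keep separate
    timers. Times are integer milliseconds to avoid float rounding.
    """
    last_collect = 0
    last_log = log_timer_start_ms  # the logger's timer may start at a different time
    pending = None                 # only used when hold_latest is True
    logs_with_stats = 0

    for now in range(step_ms, total_ms + 1, step_ms):
        # Collector: returns stats only once its own interval has elapsed.
        stats = None
        if now - last_collect >= collect_every_ms:
            stats = {"collected_at_ms": now}
            last_collect = now

        if hold_latest and stats is not None:
            pending = stats  # remember the stats until the next log event

        # Logger: emits output only once its own interval has elapsed.
        if now - last_log >= log_every_ms:
            last_log = now
            to_log = pending if hold_latest else stats
            if to_log is not None:
                logs_with_stats += 1
            pending = None

    return logs_with_stats


# Assumed setup: 60 s of 30 ms steps, both intervals 5 s,
# logger timer started 1 s before the collector timer.
naive = simulate(60_000, 30, 5_000, 5_000, log_timer_start_ms=-1_000)
held = simulate(60_000, 30, 5_000, 5_000, log_timer_start_ms=-1_000,
                hold_latest=True)
print(naive, held)
```

Under these assumed settings the naive variant never logs the stats, because the collection step and the logging step stay offset from each other, while the `hold_latest` variant logs them at each logging step that follows a collection.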