TimerTest started to fail consistently #1230
Comments
Once we have a fix, we should merge it together with #1231 so that CI passes again.
@koevskinikola are you on that already? We want to build RC1 today for the release.
I'm looking into this right now.
@koevskinikola asked me to look into this urgently because it's a release blocker for today and he suspected that the recent controllable clock changes may be the root cause. Indeed, if we add a debug log to
These are from before the test travels backwards in time to 2023. Afterwards, we see that the schedule service is not called again. This is a bug in Zeebe: traveling backwards in time prevents further calls to the schedule service, so tasks scheduled by the engine are never triggered. As a workaround, you can change the tests to travel forwards in time instead, but this still needs to be fixed properly in Zeebe.
PR #1232 introduces only a temporary fix to this issue, so I decided not to close the issue with it. I'll create a separate |
@koevskinikola did you get a chance to create the issue in |
@lenaschoenburg I didn't. I'm writing it right now.
Note: The workaround from PR #1232 seems to fix only the embedded engine test. The containerized engine test (i.e. the Testcontainers-based one) continues to fail. It's still not clear why, but the engine implementations for these two tests are different. PR #1233 disables the |
@lenaschoenburg here is the Zeebe issue related to this one: I used the explanation you provided here to flesh out the description. |
Description
AbstractTimerTest was introduced in this PR as a result of solving this issue.
The root cause of that issue was in the compareTo(final Bytes other) method of the Bytes class.
Previously, we compared byte arrays in the following way:
This led to incorrect date comparisons because we were comparing signed bytes, and sometimes the date that was supposed to come after another date contained a negative byte (e.g. -97).
For example:
The date 2023.10.10 15:50:00 looks as follows when converted to a byte array: [0, 0, 1, -117, 25...]
The date 2023.11.5 15:50:00 looks as follows when converted to a byte array: [0, 0, 1, -117, -97...]
When comparing these two dates, we previously got 2023.11.5 15:50:00 as the smaller date (because -97 < 25 as signed bytes), which resulted in the 2023.10.10 15:50:00 timer event not being activated.
The fix is to use Guava's UnsignedBytes.compare(ourByte, otherByte), which compares each byte as a value in the range 0 to 255.
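To make the difference concrete, here is a minimal sketch that compares the two byte prefixes quoted above, once with signed and once with unsigned semantics. It is not the actual Bytes.compareTo implementation: the class name ByteOrderSketch and both helper methods are hypothetical, and the JDK's Byte.compareUnsigned (Java 9+) is used as a stand-in for Guava's UnsignedBytes.compare so that the example has no external dependencies.

```java
final class ByteOrderSketch {

    // Hypothetical helper: lexicographic comparison that treats each byte as
    // signed (-128..127). This is the kind of ordering the description
    // identifies as the root cause.
    static int compareSigned(final byte[] ours, final byte[] other) {
        final int length = Math.min(ours.length, other.length);
        for (int i = 0; i < length; i++) {
            final int result = Byte.compare(ours[i], other[i]);
            if (result != 0) {
                return result;
            }
        }
        return Integer.compare(ours.length, other.length);
    }

    // Hypothetical helper: the same loop, but each byte is treated as
    // unsigned (0..255). Byte.compareUnsigned stands in for Guava's
    // UnsignedBytes.compare here.
    static int compareUnsigned(final byte[] ours, final byte[] other) {
        final int length = Math.min(ours.length, other.length);
        for (int i = 0; i < length; i++) {
            final int result = Byte.compareUnsigned(ours[i], other[i]);
            if (result != 0) {
                return result;
            }
        }
        return Integer.compare(ours.length, other.length);
    }

    public static void main(final String[] args) {
        // Byte prefixes taken from the example dates in the description.
        final byte[] october = {0, 0, 1, -117, 25};
        final byte[] november = {0, 0, 1, -117, -97};

        // Signed: 25 > -97, so October wrongly sorts after November.
        System.out.println("signed:   " + Integer.signum(compareSigned(october, november)));
        // Unsigned: 25 < 159, so October correctly sorts before November.
        System.out.println("unsigned: " + Integer.signum(compareUnsigned(october, november)));
    }
}
```

Running main prints `signed:   1` followed by `unsigned: -1`: only the unsigned comparison orders the October timestamp before the November one.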
TimerTest failures were reported this week (19 August) and the test is failing consistently now.
I can reproduce the failure every time when running the test:
stack trace