Better XDP error handling #391
Conversation
So this is probably a good step, but it just raises a different exception, which means the Controller will still crash, only with a different exception.
This can be seen in your crash test logs:
LNST Controller crashed with an exception:
Traceback (most recent call last):
File "/mnt/tests/data.lnst.anl.eng.rdu2.dc.redhat.com/data-server-content/gitlab-tasks/beaker-lnst-tasks/master.tar.gz/lnst/test-runner/./do-my-test", line 35, in main
ctl.run(recipe, multimatch=bool(params.get("MULTIMATCH", False)))
File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.12/lib/python3.12/site-packages/lnst/Controller/Controller.py", line 172, in run
recipe.test()
File "/root/rhextensions-lnst/lnst/RHExtensions/RHRecipeMixin.py", line 109, in test
super(RHRecipeMixin, self).test()
File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.12/lib/python3.12/site-packages/lnst/Recipes/ENRT/BaseEnrtRecipe.py", line 210, in test
self.do_tests(recipe_config)
File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.12/lib/python3.12/site-packages/lnst/Recipes/ENRT/BaseEnrtRecipe.py", line 332, in do_tests
self.do_perf_tests(recipe_config)
File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.12/lib/python3.12/site-packages/lnst/Recipes/ENRT/BaseEnrtRecipe.py", line 357, in do_perf_tests
result = self.perf_test(perf_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/rhextensions-lnst/lnst/RHExtensions/RHRecipeMixin.py", line 425, in perf_test
return super().perf_test(recipe_conf)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.12/lib/python3.12/site-packages/lnst/RecipeCommon/Perf/Recipe.py", line 162, in perf_test
self.perf_test_iteration(recipe_conf, results)
File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.12/lib/python3.12/site-packages/lnst/RecipeCommon/Perf/Recipe.py", line 192, in perf_test_iteration
measurement_results = measurement.collect_results()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.12/lib/python3.12/site-packages/lnst/RecipeCommon/Perf/Measurements/XDPBenchMeasurement.py", line 154, in collect_results
flow_results.generator_results = self._parse_generator_results(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/virtualenvs/rhextensions-lnst-Xo1BSm3a-py3.12/lib/python3.12/site-packages/lnst/RecipeCommon/Perf/Measurements/XDPBenchMeasurement.py", line 169, in _parse_generator_results
raise job.result["exception"] # propagate exception from agent
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
lnst.Tests.BaseTestModule.TestModuleError: pktgen module is not loaded
This is better than before since it explains what went wrong, but our other Measurement modules don't re-raise exceptions; instead they return "fail results" with error messages to indicate issues.
This may be important when running multiple tests, where some can fail due to errors while others run just fine.
It could also let the rerun mechanism retry the test automatically in case of random errors, similar to what I did here: #382
In some cases it's probably good to raise an exception to indicate serious issues, but I'm not sure this should always be the case. In what situations will this occur?
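For illustration, a minimal sketch of the "fail results" pattern described above, under stated assumptions: `FlowResults` and `parse_generator_results` are hypothetical names standing in for LNST's real result classes, not its actual API. The agent's exception is recorded as a failed result instead of being re-raised, so sibling tests in the same run can continue.

```python
from dataclasses import dataclass, field

@dataclass
class FlowResults:
    # Hypothetical container standing in for LNST's real result classes.
    success: bool = True
    error: str = ""
    samples: list = field(default_factory=list)

def parse_generator_results(job_result: dict) -> FlowResults:
    # If the agent shipped an exception back, record a failed result
    # carrying the error message instead of re-raising on the controller.
    if "exception" in job_result:
        return FlowResults(success=False, error=str(job_result["exception"]))
    return FlowResults(samples=job_result.get("data", []))
```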
The PktGen test module may raise an exception, which needs to be handled on the controller side as well to prevent it from treating the exception as a result.
Handle exceptions from the agent's test modules to prevent a controller crash, since the controller expects results to be present.
Force-pushed from 2a88663 to 70fc317
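For context, a hedged sketch of the kind of agent-side check that could produce the error seen in the logs above; `pktgen_loaded` is a hypothetical helper (the actual check in the module may differ), while `TestModuleError` comes from `lnst.Tests.BaseTestModule` as shown in the traceback.

```python
from lnst.Tests.BaseTestModule import TestModuleError

def pktgen_loaded() -> bool:
    # /proc/modules lists loaded kernel modules, one per line,
    # with the module name in the first column.
    with open("/proc/modules") as f:
        return any(line.split()[0] == "pktgen" for line in f)

# Raised on the agent; without controller-side handling, the controller
# treats the missing results as valid data and crashes later.
if not pktgen_loaded():
    raise TestModuleError("pktgen module is not loaded")
```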
Hmm, actually yeah, in the context of our tooling it makes sense to just report "invalid" results.
Description
XDP's test modules might raise an exception when running on the agent machine. Currently these are not propagated to the controller, so it treats incomplete/wrong results as valid ones, and the controller then crashes on strange issues because of that.
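A minimal sketch of the propagation pattern this change introduces, under stated assumptions: `run_module` and `collect_results` are hypothetical names, but the re-raise mirrors the `raise job.result["exception"]` line visible in the traceback above.

```python
# Agent side: wrap the test module run and ship any exception back
# inside the job result instead of losing it.
def run_module(module) -> dict:
    try:
        return {"data": module.run()}
    except Exception as exc:
        return {"exception": exc}

# Controller side: re-raise the agent's exception instead of parsing
# an incomplete result as if it were valid.
def collect_results(job_result: dict):
    if "exception" in job_result:
        raise job_result["exception"]  # propagate exception from agent
    return job_result["data"]
```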
Tests
J:10515937
J:10515938