-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optserv 1269 part 2 servox async logic hardening #571
Optserv 1269 part 2 servox async logic hardening #571
Conversation
linkous8
commented
Apr 26, 2024
- refactor connector initialization to be member method load_connectors of servo instead of procedural assembly logic
- update servo to store connector type routes loaded from config yam/default values
- fix servo attach() and detach() referencing servo module
- refactor servo shutdown to send gootbye event, add reason param to shutdown to include in request
- use TaskGroup for checks rememdy processing to avoid task leaks
- Move AssemblyRunner run() args poll, interactive to class properties set on init
- fix incorrected type hint on _default_context_manager
- Update progress handler to support creating its queue processor task with an external task group tied to lifetime of the AssemblyRunner
- Update progress handler sink to require servo from extra or context and include in progress dict
- alphebetize progress dict by key
- move servo to api connection details from ServoRunner to servo
- Update ServoRunner to utilize a task group to organize tasks that used to be created and stored as properties with additional management boilerplate
- Coallesce main_loop into run_main_loop
- get rid of pointlress _reraise_if_neccessary method cluttering stack trace
- remove hello event logic from run and set it to run in servo.startup()
- fix main_loop_task not set to main loop of ServoRunner
- move goodbye event logic to servo shutdown, call assembly.remove_servo() to send it, use _main_loop_task and _task_group for cleanup and auditing
- Update AssemblyRunner to kick off and store single root task from synchronous invocation
- Move most of run() logic into async version _run()
- Update AssemblyRunner to use two task group for tracking tasks that used to be spawned from create_task without storing any refernces to them
- _task_group contextualizes all tasks run within the AssemblyRunner._run() function to ensure the lifetime of the Progress handler is tied to the lifetime of the running servos and gets cancelled by crash errors thrown by them
- _runners_task_group holds only the servo tasks. Due to the progressHandlers queue_processor running an infinite loop, _task_group would run forever if allowed to. _runners_task_group allows us to determine if all servo runner tasks have completed and gracefully exit the _task_group when that is the case
- move _report_progress from dynamic method to class method
- move _handle_progress_exception from dynamic method to class method
- add AssemblyRunner.shutdown guard for re-entrant shutdown
- Update AssemblyRunner.shutdown to attempt to shutdown gracefully and kill the root_task instead of killing all tasks. This should propagate a cancellation down the chain of task groups this agent is now structured under
- Update _handle_progress_exception to shutdown and restart only the individual servo that threw the error instead of all_tasks. Makes use of previously dead code Assembly.add/remove_servo as well as added task groups
- Update Servo connectors lists init to be self contained
- Fix type hint error warnings caused by DNSSubdomainName, DNSLabelName, ContainerTagName
- Disable unused pubsub startup in prometheus connector
- fix unresolved import errors in IDE for devtools.debug
- Fix captured_logs type hint error
- Fix type hints for minikube fixture
- Use task group for FakeAPI server
- Use task group for kubectl_ports_forwarded
- Remove any remaining logic that clobbers asyncio.all_tasks (only tests left at this point)
- fix servo_test failures
- replace prometheus test tasks with task group
- refactor connector initialization to be member method load_connectors of servo instead of procedural assembly logic - update servo to store connector type routes loaded from config yam/default values - fix servo attach() and detach() referencing servo module - refactor servo shutdown to send gootbye event, add reason param to shutdown to include in request - use TaskGroup for checks rememdy processing to avoid task leaks - Move AssemblyRunner run() args poll, interactive to class properties set on init - fix incorrected type hint on _default_context_manager - Update progress handler to support creating its queue processor task with an external task group tied to lifetime of the AssemblyRunner - Update progress handler sink to require servo from extra or context and include in progress dict - alphebetize progress dict by key - move servo to api connection details from ServoRunner to servo - Update ServoRunner to utilize a task group to organize tasks that used to be created and stored as properties with additional management boilerplate - Coallesce main_loop into run_main_loop - get rid of pointlress _reraise_if_neccessary method cluttering stack trace - remove hello event logic from run and set it to run in servo.startup() - fix main_loop_task not set to main loop of ServoRunner - move goodbye event logic to servo shutdown, call assembly.remove_servo() to send it, use _main_loop_task and _task_group for cleanup and auditing - Update AssemblyRunner to kick off and store single root task from synchronous invocation - Move most of run() logic into async version _run() - Update AssemblyRunner to use two task group for tracking tasks that used to be spawned from create_task without storing any refernces to them - _task_group contextualizes all tasks run within the AssemblyRunner._run() function to ensure the lifetime of the Progress handler is tied to the lifetime of the running servos and gets cancelled by crash errors thrown by them - _runners_task_group holds only the servo tasks. Due to the progressHandlers queue_processor running an infinite loop, _task_group would run forever if allowed to. _runners_task_group allows us to determine if all servo runner tasks have completed and gracefully exit the _task_group when that is the case - move _report_progress from dynamic method to class method - move _handle_progress_exception from dynamic method to class method - add AssemblyRunner.shutdown guard for re-entrant shutdown - Update AssemblyRunner.shutdown to attempt to shutdown gracefully and kill the root_task instead of killing all tasks. This should propagate a cancellation down the chain of task groups this agent is now structured under - Update _handle_progress_exception to shutdown and restart only the individual servo that threw the error instead of all_tasks. Makes use of previously dead code Assembly.add/remove_servo as well as added task groups - Update Servo connectors lists init to be self contained - Fix type hint error warnings caused by DNSSubdomainName, DNSLabelName, ContainerTagName - Disable unused pubsub startup in prometheus connector - fix unresolved import errors in IDE for devtools.debug - Fix captured_logs type hint error - Fix type hints for minikube fixture - Use task group for FakeAPI server - Use task group for kubectl_ports_forwarded - Remove any remaining logic that clobbers asyncio.all_tasks (only tests left at this point) - fix servo_test failures - replace prometheus test tasks with task group
- update Servo to accept __connectors__ arg from tests - update progress_request unused params to kwargs - Fix new k8s Annotated types - fix logging test - fix k8s test
- support ExceptionGroup in api.Status.from_error() - fix assembly shutdown not passing reason to goodbye moved to servo shutdown - Fix settlement rejection reason override to work with exception group - move unit test api_mock to tests.helpers - fix runner_test - fix k8s integration tests to expect exception groups
- use issubclass to check for EventError in k8s connector - add test helper to alleviate lack of pytest support for exceptiongroups - fix k8s tests need to unwrap exception groups
Each taskgroup raises a new excpetion group even if its just bubbling up a single task excpetion. To avoid any more significant k8s connector refactors, I am impelementing rudimentary logic to roll the groups into a single error and will apply better logic in a later update
- Add EventAborted error to priorities for legacy measurement abort tests - fix shutdown missing lines of code that actually cancels main loop - apply exception group rollup events invoked by failing tests (TODO apply to runner exec_command) - fix runner_test block until timeout due to task restructure - fix other runner_test to handle exception group and use mock api
- add temporary method of communicating additional errors to backend - fix misunderstanding of ExceptionGroup structure with temporary bandaid to coallesce groups into the most important error (aligns with previous run model of only raising single error) - fix prom test
- add our intercept handler to asyncio logs (support asyncio loglevel) - fix unintended property removal in servorunner - (hopefully) fix exceptions not bubbling out of root_task
- ensure test fixtures instantiating progresshandlers tear them down - add run_until_complete to mocked loop functions in test_assembly_shutdown_with_non_running_servo
- fix logging bug where task_done() could be called too many times if cancelled during queue.get() - update unwrap_exception_group to treat as tree instead of list - fix runner_test needlessly marking all tests integration - refactor event_loop mock patching into helper method - add test for exceptions not raising, reproducable with previous impelementation
) -> None: # noqa: D107 | ||
super().__init__(*args, connectors=[], **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of 'connectors' vs 'connectors' is not easy to understand by looking at the code. If I got it right, the connectors is exclusively for the 'events' mixin and is a copy of 'connectors', except it also includes 'self'.
A short comment explaining the use might be helpful here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand it 100% either but I think one is used for the event bus (includes servo) for dispatching events and the other is used by the servo (excludes servo self reference) for non-event collector handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The markdown let me down here - I meant __connectors__
not connectors. Anyway, this is optional - but if you have an idea what to put in a docstring, it will be great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should go in the events mixin but really there should be some logic to unify the variable so that its less confusing and the mixin is self-contained instead of having keyword arg details spilling out into other components.
Line 646 in f3ade34
__connectors__: List[Mixin] = None, |
note: (follow up on the comment about changing 'black' to a larger line_length): |
- add ServoStatuses.from_error method for use in Status.from_error - fix flaws in exception group processing of Status.from_error - fix check remedy dynamic coroutine not called before passing into create_task - Update servo_error_from_group to return default_error for consistency - remove unneeded end param from HELLO log - kubernetes model comment/docstring cleanup