Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List tasks information for hung workers #110

Merged
merged 2 commits into from
Sep 4, 2018
Merged

Conversation

sergw
Copy link
Contributor

@sergw sergw commented Aug 10, 2018

If HungListener found some hung tests it will print information about:

  • worker id
  • test name
  • test params
  • output from .result file

Closes #107

@sergw sergw requested a review from Totktonada August 10, 2018 05:51
@sergw
Copy link
Contributor Author

sergw commented Aug 10, 2018

Checked on some hung tests / result empty. Looks like needs to flush.

Copy link
Member

@Totktonada Totktonada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not close #107? Why 'Part of'?

Comments are below.

listeners.py Outdated
task_name, task_param = self.get_task_by_worker_id(worker_id)
color_stdout("[{0:03d}] {1} {2}\n".format(worker_id,
task_name,
task_param or ''),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Underindented?

dispatcher.py Outdated
@@ -78,10 +78,14 @@ def __init__(self, task_groups, max_workers_cnt, randomize):
self.worker_next_id = 1

tasks_cnt = 0
self.current_task_queue = SimpleQueue()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the new queue is needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sergei (@sergw) helps me to remember the context: workers are separate processes, so we need a queue to pass information about a test that a worker receive.

I propose the following approach. A worker sends an object of specific type to the results queue (see 'Worker results' at worker.py:113) when receive a task. The HangWatcher listener (listeners.py) checks for this type of an object in process_result and saves necessary info. Then it can be used in process_timeout method.

So, we’ll avoid usage of extra queue and we’ll reuse our flexible approach to handle events depending of its type.

Copy link
Member

@Totktonada Totktonada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposed another (hopefully, simpler) approach to implement the feature.

dispatcher.py Outdated
@@ -78,10 +78,14 @@ def __init__(self, task_groups, max_workers_cnt, randomize):
self.worker_next_id = 1

tasks_cnt = 0
self.current_task_queue = SimpleQueue()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sergei (@sergw) helps me to remember the context: workers are separate processes, so we need a queue to pass information about a test that a worker receive.

I propose the following approach. A worker sends an object of specific type to the results queue (see 'Worker results' at worker.py:113) when receive a task. The HangWatcher listener (listeners.py) checks for this type of an object in process_result and saves necessary info. Then it can be used in process_timeout method.

So, we’ll avoid usage of extra queue and we’ll reuse our flexible approach to handle events depending of its type.

listeners.py Outdated
task_name,
task_param or ''),
schema='test_var')
task_path = "{0:03d}_{1}".format(worker_id,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to get a path to result file from a worker and don’t try to reconstruct it here.

lib/worker.py Outdated
@@ -285,6 +286,8 @@ def run_loop(self, task_queue, result_queue):
schema='test_var')
self.stop_worker(task_queue, result_queue)
break

current_task_queue.put((self.id, task_id))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, create a subclass of BaseWorkerMessage to hold the data.

@sergw sergw force-pushed the list-tasks-for-hung-workers branch from 0483493 to b559025 Compare August 17, 2018 09:56
@sergw sergw self-assigned this Aug 17, 2018
@sergw sergw force-pushed the list-tasks-for-hung-workers branch from b559025 to 974dbcc Compare August 17, 2018 12:41
If HungListener found some hung tests it will print information about:
- worker id
- test name
- test params
- last 15 lines from .result file

Closes #107
@sergw sergw force-pushed the list-tasks-for-hung-workers branch 2 times, most recently from 6c3791a to 54dcd33 Compare August 22, 2018 18:10
Copy link
Member

@Totktonada Totktonada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Sergei!

The patch looks good for me. I have one tiny comment, though.

I filed the follow up issue #119 to show output of hang 'core = app' tests. @Korablev77 asks me about that before.

listeners.py Outdated
in self.worker_current_task.iteritems()
if worker_id in worker_ids]
for current_task in hung_tasks:
color_stdout("[{0:03d}] {1} {2}\n".format(current_task.worker_id,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be look more habitual if we'll use the same encoding (one-line yaml) as in reproduce files, like [app-tap/tap.test.lua, null].

@sergw sergw force-pushed the list-tasks-for-hung-workers branch from 54dcd33 to 68e41d1 Compare August 31, 2018 08:52
@sergw sergw merged commit cc0de90 into master Sep 4, 2018
@sergw sergw deleted the list-tasks-for-hung-workers branch September 4, 2018 08:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants