Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

async... await in tests #3706

Merged
merged 7 commits into from
Apr 18, 2020
Merged

async... await in tests #3706

merged 7 commits into from
Apr 18, 2020

Conversation

crusaderky
Copy link
Collaborator

@crusaderky crusaderky commented Apr 15, 2020

  • Update all tests to replace @gen.coroutine... yield with async... await
  • Replace all gen.sleep with asyncio.sleep
  • Clean up @skip markers for Python < 3.6
  • Fix bug in Client(asynchronous=True).close() that, if invoked twice, on the second round would return None; it now returns Awaitable[None].
  • Decorators @gen_cluster and @gen_test now accept only async functions
  • @gen_cluster now properly waits for all Comm and Client objects to be closed, and crashes in case some keep lingering
  • Fix flaky tests

@mrocklin
Copy link
Member

Oh wow, this is amazing. Thank you @crusaderky !

port = a.http_server.port

future = c.submit(sleep, 1)
yield gen.sleep(0.1)
await gen.sleep(0.1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can gen.sleep also be replaced with asyncio.sleep ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, why not

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

@crusaderky crusaderky changed the title WIP: async... await in tests async... await in tests Apr 16, 2020
@crusaderky
Copy link
Collaborator Author

crusaderky commented Apr 16, 2020

I'm done, however travis is randomly failing. The failures are concentrated in (but not exclusive to) Python 3.6, and are completely random - run again, and they just move somewhere else. I can't reproduce them locally.
Help is welcome.

For reviewers: since reading 2800 lines of changes can make your eyes cross, I presented the change in 3 commits:

  1. everything else. The code after this change is not valid but it's where all manual interventions are.
  2. replace all yield with await when used in coroutines; change 'def' to 'async def'
  3. replace gen.sleep with asyncio.sleep

@mrocklin
Copy link
Member

Thanks again for doing this @crusaderky . This is great.

For reviewers: since reading 2800 lines of changes can make your eyes cross, I presented the change in 3 commits:

That's super thoughtful. Thanks!

The failures are concentrated in (but not exclusive to) Python 3.6, and are completely random - run again, and they just move somewhere else. I can't reproduce them locally.
Help is welcome.

Hrm, that looks frustrating. At first glance I don't know what's going on. Those do appear to be new intermittent failures though, so I suspect that they are related to some change here. There are some subtle differences between asyncio and tornado that can come up very rarely, like asyncio.sleep(0) behaving differently than gen.sleep(0) or similar topics.

One rather unpleasant way to figure out what is going on is to do a binary search with CI. For example you could commit the changes in half of the files and see if CI keeps complaining. That is a really painful iteration cycle though.

Fix flaky tests


Readd test
@crusaderky
Copy link
Collaborator Author

crusaderky commented Apr 17, 2020

Ok I brought it down to two flaky tests.

  • test_oversubscribing_leases randomly fails on all environments. I did not touch this test at all.
  • test_dont_steal_unknown_functions is deterministically broken on Python 3.6 Linux only, and randomly (rarely) fails in other environments. Note how I changed the test to be less susceptible to randomness.

Not sure what to do with either but I'm fairly convinced it's not my fault...

@mrocklin
Copy link
Member

See #3717 for one

test_dont_steal_unknown_functions seems new to me, but I'm not entirely sure how it could be related. My recommendation here would be to raise an empty PR to test CI and see if it's failing on master.

@jrbourbeau
Copy link
Member

test_dont_steal_unknown_functions is a known flaky test (xref #3574)

@mrocklin
Copy link
Member

Well great. In that case I'm +1 here. Any objection to merging @jrbourbeau ?

@crusaderky
Copy link
Collaborator Author

should I mark them as xfail?

@crusaderky crusaderky closed this Apr 18, 2020
@crusaderky crusaderky reopened this Apr 18, 2020
@mrocklin mrocklin merged commit 9622b8f into dask:master Apr 18, 2020
@mrocklin
Copy link
Member

Merging this in. Thanks again @crusaderky !

@chinmaychandak
Copy link

chinmaychandak commented Apr 22, 2020

A bunch of @gen_cluster tests in streamz are also failing.

An example test is the one below (link here):

@gen_cluster(client=True)
def test_map(c, s, a, b):
    source = Stream(asynchronous=True)
    futures = scatter(source).map(inc)
    futures_L = futures.sink_to_list()
    L = futures.gather().sink_to_list()

    for i in range(5):
        yield source.emit(i)

    assert L == [1, 2, 3, 4, 5]
    assert all(isinstance(f, Future) for f in futures_L)

Just from the looks of it, and by reading the comments here, upgrading from Python 3.6 and adding an async def for tests with @gen_cluster should work.

But, even after making these changes, all @gen_cluster tests are still failing on travisCI with the error: ValueError: @gen_cluster should wrap async def functions. These tests seem to pass locally, so apologies since I do not know of an easy way to post reproducible code here.

Also note that I had to add pytest-tornasync to travis since pytest alone does not handle async tests.

Any help is appreciated since this is blocking another PR as well.

@mrocklin
Copy link
Member

@crusaderky do you have time to revert the check that causes these failures downstream?

@crusaderky
Copy link
Collaborator Author

@mrocklin yes I will

@crusaderky
Copy link
Collaborator Author

But, even after making these changes, all @gen_cluster tests are still failing on travisCI with the error: ValueError: @gen_cluster should wrap async def functions.

@chinmaychandak could you post the output of sys.version and platform.platform() when running on travis?

@chinmaychandak
Copy link

chinmaychandak commented Apr 23, 2020

@chinmaychandak could you post the output of sys.version and platform.platform() when running on travis?

Does this help — https://travis-ci.org/github/python-streamz/streamz/builds/677897003? It shows a Linux platform with Python 3.8.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants