DHT Benchmark with asynchronous w/r #406

Merged
merged 20 commits into from
Nov 30, 2021
Conversation

MuXauJl11110
Contributor

Based on #350

@justheuristic justheuristic self-requested a review November 7, 2021 19:14
@justheuristic
Member

justheuristic commented Nov 7, 2021

Hi!
Please refer to contributing.md#code-style for code style and how to convert the code automatically.

return value, expiration


async def corouting_task(
Member

@justheuristic justheuristic Nov 7, 2021

Consider renaming this into something more informative, e.g. store_and_get_task

@codecov

codecov bot commented Nov 7, 2021

Codecov Report

Merging #406 (6bdb734) into master (5d31c3b) will decrease coverage by 0.05%.
The diff coverage is 66.66%.

@@            Coverage Diff             @@
##           master     #406      +/-   ##
==========================================
- Coverage   83.52%   83.46%   -0.06%     
==========================================
  Files          77       77              
  Lines        7783     7785       +2     
==========================================
- Hits         6501     6498       -3     
- Misses       1282     1287       +5     
Impacted Files Coverage Δ
hivemind/utils/mpfuture.py 94.09% <66.66%> (-1.33%) ⬇️
hivemind/optim/experimental/progress_tracker.py 98.28% <0.00%> (-1.15%) ⬇️

return store_ok


async def get_task(peer, key):
Member

@justheuristic justheuristic Nov 7, 2021

This function is essentially a single expression that is used only once;
it would be best to inline it into the main coroutine so that one can read it top-to-bottom without looking up auxiliary functions.
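The inlining suggested above might look like this sketch; `FakePeer` and the surrounding names are illustrative stand-ins, not the hivemind API:

```python
import asyncio

class FakePeer:
    """Hypothetical stand-in for a DHT node (illustrative only)."""
    def __init__(self, value):
        self.value = value

    async def get(self, key, latest=False):
        return self.value

# Before: a single-expression helper that is used exactly once
async def get_task(peer, key):
    return await peer.get(key, latest=True)

# After inlining, the main coroutine reads top-to-bottom with no lookups:
async def run_gets(peers, key):
    return await asyncio.gather(*(peer.get(key, latest=True) for peer in peers))

print(asyncio.run(run_gets([FakePeer(1), FakePeer(2)], "some_key")))  # [1, 2]
```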

@justheuristic
Member

justheuristic commented Nov 7, 2021

Let's create a task that runs this benchmark (and the two remaining ones) on pull-request using the same python version as codecov_in_develop_mode

More on other benchmarks:
https://learning-at-home.readthedocs.io/en/latest/user/benchmarks.html

python benchmark_tensor_compression.py
python benchmark_throughput.py --preset minimalistic
[enter whichever command runs your benchmark here]

For instance, you can create a separate job in run_tests.yml

Rationale:

  • why CI: we often break benchmarks, introducing CI will ensure that all benchmarks work
    • for example, benchmark_averaging is currently broken, probably my fault
  • why separate CI job: we could add this to tests, but it would extend the (already long) test runtime
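A separate job along these lines could be added to run_tests.yml. This is a minimal sketch assuming a GitHub Actions workflow; the job name, Python version, and install step are illustrative, not the merged configuration:

```yaml
# Hypothetical extra job for .github/workflows/run_tests.yml (illustrative)
benchmark:
  runs-on: ubuntu-latest
  timeout-minutes: 15
  steps:
    - uses: actions/checkout@v2
    - uses: actions/setup-python@v2
      with:
        python-version: 3.8   # keep in sync with codecov_in_develop_mode
    - run: pip install -e .
    - name: Run benchmarks
      run: |
        cd benchmarks
        python benchmark_tensor_compression.py
        python benchmark_throughput.py --preset minimalistic
        python benchmark_dht.py
```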

@justheuristic
Member

justheuristic commented Nov 7, 2021

Before merge

If you still have time after that, let's implement the failure rate as described in #350

logger.info(f"Sampled {len(expert_uids)} unique ids (after deduplication)")
random.shuffle(expert_uids)
task_list = [
loop.create_task(
Member

@justheuristic justheuristic Nov 7, 2021

consider using asyncio.run with asyncio.create_task

optionally, make the whole benchmark async and asyncio.run it from main
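The suggested pattern could be sketched as follows; `store_and_get_task` here is a trivial placeholder, not the benchmark's real workload:

```python
import asyncio

async def store_and_get_task(i):
    # placeholder for one store+get round against the DHT
    await asyncio.sleep(0)
    return i * 2

async def run_benchmark(num_tasks):
    # asyncio.create_task schedules coroutines on the already-running loop,
    # so no manual get_event_loop()/run_until_complete() bookkeeping is needed
    tasks = [asyncio.create_task(store_and_get_task(i)) for i in range(num_tasks)]
    return await asyncio.gather(*tasks)

print(asyncio.run(run_benchmark(4)))  # [0, 2, 4, 6]
```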

logger.warning(
"keys expired midway during get requests. If that isn't desired, increase expiration_time param"
)
loop.run_until_complete(asyncio.wait(task_list))
Member

Suggested change
loop.run_until_complete(asyncio.wait(task_list))
loop.run_until_complete(asyncio.gather(*task_list))
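A minimal illustration of why `gather` is usually preferable to `wait` here: results come back in input order and exceptions propagate immediately. The `fetch` coroutine is a made-up stand-in:

```python
import asyncio

async def fetch(i):
    if i < 0:
        raise ValueError("simulated failure")
    return i

async def main():
    # asyncio.wait returns (done, pending) sets of futures in arbitrary
    # order, and exceptions stay hidden until each .result() call (it also
    # no longer accepts bare coroutines in recent Python versions).
    # asyncio.gather returns results in input order and re-raises the
    # first exception automatically.
    ordered = await asyncio.gather(fetch(3), fetch(1), fetch(2))
    try:
        await asyncio.gather(fetch(1), fetch(-1))
    except ValueError:
        failure_seen = True
    return ordered, failure_seen

print(asyncio.run(main()))  # ([3, 1, 2], True)
```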

on: [ push ]
on: [ push, pull_request ]
Member

I'm not sure if these changes are necessary for this PR; if possible, I'd keep the diff as short as possible

Member

@justheuristic justheuristic Nov 8, 2021

They are, the required tests do not run in a forked PR w/o this change

@borzunov borzunov self-requested a review November 8, 2021 13:47
args = vars(parser.parse_args())
parser.add_argument("--expiration", type=float, default=300, required=False)
parser.add_argument("--latest", type=bool, default=True, required=False)
parser.add_argument("--failure_rate", type=float, default=0.1, required=False)
Member

Let's keep the option to increase the file limit, for the sake of benchmarking with a very large number of peers.


store_start = time.perf_counter()
store_peers = random.sample(peers, min(num_store_peers, len(peers)))
store_tasks = [store_task(peer, key, value, expiration) for peer in store_peers]
Member

Suggested change
store_tasks = [store_task(peer, key, value, expiration) for peer in store_peers]
subkeys = [uuid.uuid4().hex for peer in store_peers]
subkeys = [uuid.uuid4().hex for peer in store_peers]
store_tasks = [
    peer.store(key, value, get_dht_time() + expiration, subkey=subkey, return_future=True)
    for peer, subkey in zip(store_peers, subkeys)
]

To the best of my knowledge, this coro is only used once. I would recommend either of:

  • inlining it: see suggestion above
  • or formatting it: add docstring and type hints
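The subkey pattern from the suggestion above can be shown self-contained, using a hypothetical `FakePeer` in place of a real DHT node (the real hivemind `store` also takes an absolute expiration time and a `return_future` flag):

```python
import asyncio
import uuid

class FakePeer:
    """Hypothetical stand-in for a DHT peer (illustrative only)."""
    def __init__(self):
        self.storage = {}

    async def store(self, key, value, expiration_time, subkey=None):
        self.storage[(key, subkey)] = (value, expiration_time)
        return True

async def store_with_subkeys(peers, key, value, expiration_time):
    # one unique subkey per writer, so concurrent stores under the same
    # key do not overwrite each other
    subkeys = [uuid.uuid4().hex for _ in peers]
    tasks = [peer.store(key, value, expiration_time, subkey=sk)
             for peer, sk in zip(peers, subkeys)]
    return await asyncio.gather(*tasks)

peers = [FakePeer() for _ in range(3)]
print(asyncio.run(store_with_subkeys(peers, "expert.0", b"v", 1234.5)))  # [True, True, True]
```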

expiration: float,
latest: bool,
failure_rate: float,
):
Member

Suggested change
):
) -> Tuple[int, int, int, int]:
"""Iteratively choose random peers to store data onto the DHT, then retrieve it with another random subset of peers"""

get_start = time.perf_counter()
get_peers = random.sample(peers, min(num_get_peers, len(peers)))
get_tasks = [peer.get(key, latest, return_future=True) for peer in get_peers]
get_result, _ = await asyncio.wait(get_tasks, return_when=asyncio.ALL_COMPLETED)
Member

Suggested change
get_result, _ = await asyncio.wait(get_tasks, return_when=asyncio.ALL_COMPLETED)
get_result = await asyncio.gather(*get_tasks)

cd benchmarks
python benchmark_throughput.py
python benchmark_tensor_compression.py
python benchmark_dht.py
Member

Suggested change
python benchmark_dht.py
python benchmark_dht.py

[add a trailing newline]

- name: Benchmark
run: |
cd benchmarks
python benchmark_throughput.py
Member

please choose presets that fit into the time limit, e.g. --preset minimalistic

@justheuristic justheuristic marked this pull request as ready for review November 22, 2021 12:30
python benchmark_throughput.py --preset minimalistic
python benchmark_tensor_compression.py
python benchmark_dht.py

Member

[add a trailing newline]

@justheuristic justheuristic merged commit a960438 into learning-at-home:master Nov 30, 2021
@justheuristic justheuristic mentioned this pull request Dec 15, 2021