DHT Benchmark with asynchronous w/r #406
Conversation
Hi!
benchmarks/benchmark_dht.py (outdated)

    return value, expiration

    async def corouting_task(
Consider renaming this to something more informative, e.g. store_and_get_task.
Codecov Report

    @@            Coverage Diff             @@
    ##           master     #406      +/-   ##
    ==========================================
    - Coverage   83.52%   83.46%   -0.06%
    ==========================================
      Files          77       77
      Lines        7783     7785       +2
    ==========================================
    - Hits         6501     6498       -3
    - Misses       1282     1287       +5
benchmarks/benchmark_dht.py (outdated)

    return store_ok

    async def get_task(peer, key):
This function is essentially a single expression that is used once; perhaps it would be best to inline it into the main coro so that one can read it top-to-bottom without looking up auxiliary functions.
Let's create a task that runs this benchmark (and the two remaining ones) on pull request, using the same Python version as codecov_in_develop_mode. For instance, you can create a separate job in run_tests.yml.
Before merge
If you still have time after that, let's implement the failure rate as described in #350.
benchmarks/benchmark_dht.py (outdated)

    logger.info(f"Sampled {len(expert_uids)} unique ids (after deduplication)")
    random.shuffle(expert_uids)
    task_list = [
        loop.create_task(
Consider using asyncio.run together with asyncio.create_task; optionally, make the whole benchmark async and asyncio.run it from main.
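As a sketch of that suggestion (all names here are placeholders, not the actual benchmark code): make the benchmark body an async function, spawn work with asyncio.create_task inside it, and hand it to asyncio.run from main instead of managing an event loop by hand.

```python
import asyncio

# Hypothetical sketch: instead of loop.create_task(...) plus
# loop.run_until_complete(...), the whole benchmark becomes one coroutine
# that asyncio.run() drives. store_and_get_task stands in for the real
# store/get round-trip against a DHT peer.

async def store_and_get_task(key):
    await asyncio.sleep(0)  # placeholder for real DHT store/get calls
    return key

async def run_benchmark(keys):
    # asyncio.create_task requires a running loop, which asyncio.run provides
    tasks = [asyncio.create_task(store_and_get_task(k)) for k in keys]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_benchmark(["a", "b", "c"]))
print(results)  # ['a', 'b', 'c']
```

This removes the manual loop bookkeeping and lets the benchmark read top-to-bottom as ordinary async code.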
benchmarks/benchmark_dht.py
Outdated
logger.warning( | ||
"keys expired midway during get requests. If that isn't desired, increase expiration_time param" | ||
) | ||
loop.run_until_complete(asyncio.wait(task_list)) |
Suggested change:

    - loop.run_until_complete(asyncio.wait(task_list))
    + loop.run_until_complete(asyncio.gather(*task_list))
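The practical difference behind this suggestion: asyncio.wait returns (done, pending) sets of Task objects in arbitrary order, so results must be unpacked with .result(), while asyncio.gather returns result values directly, in the order the tasks were passed. A minimal illustration (toy coroutines, not the benchmark's code):

```python
import asyncio

# Jobs are timed so that later-numbered jobs finish first; gather still
# returns results in submission order, not completion order.

async def job(i):
    await asyncio.sleep(0.01 * (3 - i))  # job 2 finishes first, job 0 last
    return i

async def main():
    tasks = [asyncio.create_task(job(i)) for i in range(3)]
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
print(results)  # [0, 1, 2] regardless of completion order
```

gather also propagates the first exception immediately, whereas wait silently stores exceptions inside the done tasks.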
.github/workflows/check-style.yml (outdated)

    - on: [ push ]
    + on: [ push, pull_request ]
I'm not sure if these changes are necessary for this PR; if possible, I'd keep the diff as short as possible
They are; the required tests do not run in a forked PR without this change.
    parser.add_argument("--expiration", type=float, default=300, required=False)
    parser.add_argument("--latest", type=bool, default=True, required=False)
    parser.add_argument("--failure_rate", type=float, default=0.1, required=False)
    args = vars(parser.parse_args())
Let's keep the option to increase the file limit, for the sake of benchmarking with a very large number of peers.
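A sketch of what such a file-limit option might do (the project's actual helper may differ): raise the soft RLIMIT_NOFILE cap toward the hard cap, since each DHT peer holds open sockets and a large swarm exhausts the default limit.

```python
import resource

# Hypothetical sketch of a file-limit raiser for large-swarm benchmarks.
# The soft limit can be raised by an unprivileged process up to the hard limit.

def raise_file_limit(target):
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard == resource.RLIM_INFINITY:
        new_soft = max(soft, target)
    else:
        new_soft = min(max(soft, target), hard)  # never exceed the hard cap
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft

new_soft = raise_file_limit(4096)
print(f"soft RLIMIT_NOFILE is now {new_soft}")
```

Exposing this as a benchmark flag keeps runs with hundreds of peers from failing with "too many open files".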
benchmarks/benchmark_dht.py
Outdated
|
||
store_start = time.perf_counter() | ||
store_peers = random.sample(peers, min(num_store_peers, len(peers))) | ||
store_tasks = [store_task(peer, key, value, expiration) for peer in store_peers] |
Suggested change:

    - store_tasks = [store_task(peer, key, value, expiration) for peer in store_peers]
    + subkeys = [uuid.uuid4().hex for peer in store_peers]
    + store_tasks = [
    +     peer.store(key, subkey=subkey, value=value, expiration_time=get_dht_time() + expiration, return_future=True)
    +     for peer, subkey in zip(store_peers, subkeys)
    + ]
To the best of my knowledge, this coro is only used once. I would recommend either of:
- inlining it: see suggestion above
- or formatting it: add docstring and type hints
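A sketch of the second option (docstring plus type hints), with every name hypothetical; the Peer class below is a stand-in, not the project's real DHT peer API:

```python
import asyncio
from typing import Any

# Hypothetical illustration of the "format it" option: the same single-use
# store coroutine, but documented and type-annotated so its purpose is clear
# without reading the call site.

class Peer:
    def __init__(self):
        self.storage = {}

    async def store(self, key: str, value: Any, expiration_time: float) -> bool:
        await asyncio.sleep(0)  # placeholder for a network round-trip
        self.storage[key] = (value, expiration_time)
        return True

async def store_task(peer: Peer, key: str, value: Any, expiration: float) -> bool:
    """Store a single key on one DHT peer; returns True if the peer accepted it."""
    return await peer.store(key, value, expiration_time=expiration)

ok = asyncio.run(store_task(Peer(), "expert.0", b"weights", 300.0))
print(ok)  # True
```

Either option works; the point is that the reader should not need to jump between two definitions to follow one expression.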
benchmarks/benchmark_dht.py
Outdated
expiration: float, | ||
latest: bool, | ||
failure_rate: float, | ||
): |
Suggested change:

    - ):
    + ) -> Tuple[int, int, int, int]:
    +     """Iteratively choose random peers to store data onto the DHT, then retrieve it with another random subset of peers"""
benchmarks/benchmark_dht.py
Outdated
get_start = time.perf_counter() | ||
get_peers = random.sample(peers, min(num_get_peers, len(peers))) | ||
get_tasks = [peer.get(key, latest, return_future=True) for peer in get_peers] | ||
get_result, _ = await asyncio.wait(get_tasks, return_when=asyncio.ALL_COMPLETED) |
Suggested change:

    - get_result, _ = await asyncio.wait(get_tasks, return_when=asyncio.ALL_COMPLETED)
    + get_result = await asyncio.gather(*get_tasks)
.github/workflows/run-benchmarks.yml (outdated)

    cd benchmarks
    python benchmark_throughput.py
    python benchmark_tensor_compression.py
    python benchmark_dht.py
Suggested change: add a trailing newline after `python benchmark_dht.py`.
.github/workflows/run-benchmarks.yml (outdated)

    - name: Benchmark
      run: |
        cd benchmarks
        python benchmark_throughput.py
please choose presets that fit into the time limit, e.g. --preset minimalistic
.github/workflows/run-benchmarks.yml (outdated)

    python benchmark_throughput.py --preset minimalistic
    python benchmark_tensor_compression.py
    python benchmark_dht.py
Add a trailing newline (`\n`) at the end of the file.
Based on #350