Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize unary handlers with persistent connections to P2P daemon #328

Merged
merged 99 commits into from
Aug 24, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
99 commits
Select commit Hold shift + click to select a range
3f7c2e3
add unary handler support to p2p control class
dvmazur Jul 18, 2021
779bcee
fix naming
dvmazur Jul 18, 2021
914fa51
implement request handling
dvmazur Jul 18, 2021
013938f
add unary handler support to P2P
dvmazur Jul 18, 2021
03ad71a
fixup: restore rpcerror proto message
dvmazur Jul 18, 2021
a476e3c
update call id type
dvmazur Jul 18, 2021
72ad4cc
fix: unary calls and adding handlers now working
dvmazur Jul 19, 2021
9ba6a51
run black on hivemind/p2p
dvmazur Jul 19, 2021
044b232
reimplement protobuf handlers
dvmazur Jul 19, 2021
632c20c
run black on updated files
dvmazur Jul 19, 2021
864cb3e
pass remote id to handler context
dvmazur Jul 19, 2021
e87d862
fix: await unary handler response
dvmazur Jul 19, 2021
14ed9d5
fix: race condition in control._self_ensure_conn
dvmazur Jul 19, 2021
81d115a
style: remote line
dvmazur Jul 19, 2021
a0c534f
fix: future typing
dvmazur Jul 19, 2021
6a61895
refactor: use P2PHandlerError instread of RemoteError
dvmazur Jul 19, 2021
cbcbae7
Merge branch 'unary-handlers' of github.com:learning-at-home/hivemind…
dvmazur Jul 19, 2021
010b466
style: remove unnecessary type annotation
dvmazur Jul 19, 2021
0d1b02b
fix: check if handler is registered before sending messages to daemon
dvmazur Jul 19, 2021
fca7d69
Merge branch 'unary-handlers' of github.com:learning-at-home/hivemind…
dvmazur Jul 19, 2021
86c10dc
fix: asyncio.lock
dvmazur Jul 19, 2021
56d0efd
fix: await unary call
dvmazur Jul 19, 2021
c92e633
fix: represent exception as bytes
dvmazur Jul 19, 2021
b49b9bd
decode error message
dvmazur Jul 19, 2021
71ca067
add steam_output support
dvmazur Jul 22, 2021
56ade26
fix: stream_output argument in call_protobuf_handler
dvmazur Jul 22, 2021
463fba9
fix: cast call_protobuf_handler to async iterator if needed
dvmazur Jul 22, 2021
0358a75
fix: unused variable
dvmazur Jul 22, 2021
e08177b
fix:
dvmazur Jul 22, 2021
9c59a4d
reimplement daemon connection protocol
dvmazur Jul 22, 2021
70c61c4
fix nits
dvmazur Jul 22, 2021
58887d2
Refactor daemon control protocol (#333)
dvmazur Jul 22, 2021
86d01c8
add cancellation support for unary handlers
dvmazur Jul 27, 2021
b1a2ca2
run black on files
dvmazur Jul 27, 2021
38a182c
hotfix
dvmazur Jul 27, 2021
0e3a4b1
Merge branch 'better-messages' into unary-handlers
dvmazur Jul 28, 2021
cb07a45
fix style
dvmazur Jul 28, 2021
d315709
restore p2p_servicer test
dvmazur Jul 28, 2021
8322485
update protobufs
dvmazur Jul 30, 2021
f271c76
fix nits
dvmazur Jul 30, 2021
1925723
update protobufs
dvmazur Aug 3, 2021
1feb884
remove stream_output from call_protobuf_handler
dvmazur Aug 3, 2021
a39a157
remove debug print statements from code
dvmazur Aug 3, 2021
96c59b9
fix suggestions
dvmazur Aug 4, 2021
b058e6e
Merge branch 'master' into unary-handlers
dvmazur Aug 4, 2021
90592e0
sort imports
dvmazur Aug 4, 2021
e6e6dde
Update tests/test_p2p_daemon.py
dvmazur Aug 5, 2021
8d4f4a4
Apply suggestions from code review
dvmazur Aug 5, 2021
bca5b9f
resolve comments
dvmazur Aug 5, 2021
56cfb4e
Add daemon termination support (#348)
dvmazur Aug 10, 2021
2292d62
fix suggesstions
dvmazur Aug 12, 2021
97c6c17
isort control.py
dvmazur Aug 12, 2021
619931d
Merge branch 'master' into unary-handlers
dvmazur Aug 12, 2021
7d8eb40
fix type annotations
dvmazur Aug 13, 2021
72d3705
update p2pd version in setup.py
dvmazur Aug 13, 2021
f7d43f7
fix control
dvmazur Aug 13, 2021
ab5f3ec
fix dht tests
dvmazur Aug 13, 2021
1e96285
fix test warning
dvmazur Aug 15, 2021
105c104
Update hivemind/p2p/p2p_daemon_bindings/control.py
dvmazur Aug 15, 2021
635d1fe
Apply suggestions from code review
dvmazur Aug 15, 2021
5a84ba7
Fix finalizer warnings in ControlClient
borzunov Aug 15, 2021
ff55d4a
Remove unused import
borzunov Aug 15, 2021
8ceb45c
Update hivemind/p2p/p2p_daemon.py
dvmazur Aug 16, 2021
595362c
lotta tests
dvmazur Aug 17, 2021
e02b762
Update run-tests.yml
dvmazur Aug 17, 2021
cfde3f5
remove termination
dvmazur Aug 18, 2021
55fcefb
Merge branch 'unary-handlers' of github.com:learning-at-home/hivemind…
dvmazur Aug 18, 2021
3b5619a
update checksum
dvmazur Aug 18, 2021
eca939d
Merge branch 'master' into unary-handlers
dvmazur Aug 18, 2021
a2fe880
add le to PeerID
dvmazur Aug 18, 2021
e6db7f0
debug hypothesis: sleep
justheuristic Aug 18, 2021
ae40827
handle daemon errors
dvmazur Aug 20, 2021
3c6ec13
refactor logger.warn to logger.warning
dvmazur Aug 20, 2021
527f0ad
skip faulty test
dvmazur Aug 20, 2021
b01bac7
restore timeout in tests
dvmazur Aug 20, 2021
26ee55e
Merge branch 'master' into unary-handlers
dvmazur Aug 20, 2021
62294cc
black hivemind
dvmazur Aug 20, 2021
36cc66c
remove delay from control.py
dvmazur Aug 20, 2021
5f1a65f
replicate P2P in DHT
dvmazur Aug 20, 2021
209d3e1
remove redundant replicate from dht tests
dvmazur Aug 20, 2021
2758410
add add unary handler test
dvmazur Aug 20, 2021
d8e0ae1
followup
dvmazur Aug 20, 2021
995ce49
refactor DHT
dvmazur Aug 20, 2021
73d52ba
refactor hivemind
dvmazur Aug 20, 2021
9676cf4
fix suggestion
dvmazur Aug 20, 2021
602ffdc
refactor daemon
dvmazur Aug 20, 2021
bbc7ee8
hotfix
dvmazur Aug 20, 2021
938aeaa
update daemon protocol and add p2p tests
dvmazur Aug 23, 2021
2e53b47
black tests
dvmazur Aug 23, 2021
eacf5d7
restore idle timeout
dvmazur Aug 23, 2021
6b6c38a
update setup.py
dvmazur Aug 23, 2021
18e8c38
add repr for peerinfo
dvmazur Aug 23, 2021
d044882
refactor peerinfo
dvmazur Aug 23, 2021
85b9e0a
gather adding p2p handlers in servicer
dvmazur Aug 23, 2021
1ded110
fix test_transports
dvmazur Aug 23, 2021
4336578
Update hivemind/p2p/p2p_daemon_bindings/datastructures.py
dvmazur Aug 23, 2021
f4ce4c3
bump p2p version
dvmazur Aug 23, 2021
3855df6
Merge branch 'unary-handlers' of github.com:learning-at-home/hivemind…
dvmazur Aug 23, 2021
f730559
Merge branch 'master' into unary-handlers
dvmazur Aug 23, 2021
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 0 additions & 2 deletions .github/workflows/run-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,6 @@ jobs:
run: |
cd tests
pytest --durations=0 --durations-min=1.0 -v

build_and_test_p2pd:
runs-on: ubuntu-latest
timeout-minutes: 10
Expand All @@ -61,7 +60,6 @@ jobs:
run: |
cd tests
pytest -k "p2p" -v

codecov_in_develop_mode:

runs-on: ubuntu-latest
Expand Down
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,7 @@ coverage.xml
.project
.pydevproject
.idea
.vscode
.ipynb_checkpoints

# Rope
Expand Down
9 changes: 9 additions & 0 deletions hivemind/dht/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,7 @@ def __init__(
initial_peers: Optional[Sequence[Union[Multiaddr, str]]] = None,
*,
start: bool,
p2p: Optional[P2P] = None,
daemon: bool = True,
num_workers: int = DEFAULT_NUM_WORKERS,
record_validators: Iterable[RecordValidatorBase] = (),
Expand Down Expand Up @@ -94,6 +95,8 @@ def __init__(
self._client_mode = None
self._p2p_replica = None

self._daemon_listen_maddr = p2p.daemon_listen_maddr if p2p is not None else None

if start:
self.run_in_background(await_ready=await_ready)

Expand All @@ -105,10 +108,16 @@ def run(self) -> None:

async def _run():
try:
if self._daemon_listen_maddr is not None:
replicated_p2p = await P2P.replicate(self._daemon_listen_maddr)
else:
replicated_p2p = None

self._node = await DHTNode.create(
initial_peers=self.initial_peers,
num_workers=self.num_workers,
record_validator=self._record_validator,
p2p=replicated_p2p,
**self.kwargs,
)
except Exception as e:
Expand Down
2 changes: 1 addition & 1 deletion hivemind/dht/protocol.py
Original file line number Diff line number Diff line change
Expand Up @@ -81,7 +81,7 @@ async def create(

def __init__(self, *, _initialized_with_create=False):
"""Internal init method. Please use DHTProtocol.create coroutine to spawn new protocol instances"""
assert _initialized_with_create, " Please use DHTProtocol.create coroutine to spawn new protocol instances "
assert _initialized_with_create, "Please use DHTProtocol.create coroutine to spawn new protocol instances"
super().__init__()

def get_stub(self, peer: PeerID) -> AuthRPCWrapper:
Expand Down
90 changes: 69 additions & 21 deletions hivemind/p2p/p2p_daemon.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,12 +5,14 @@
from contextlib import closing, suppress
from dataclasses import dataclass
from importlib.resources import path
from typing import Any, AsyncIterator, Awaitable, Callable, List, Optional, Sequence, Tuple, TypeVar, Union
from typing import Any, AsyncIterator, Awaitable, Callable, Dict, List, Optional, Sequence, Tuple, Type, TypeVar, Union

from google.protobuf.message import Message
from multiaddr import Multiaddr

import hivemind.hivemind_cli as cli
import hivemind.p2p.p2p_daemon_bindings.p2pclient as p2pclient
from hivemind.p2p.p2p_daemon_bindings.control import P2PDaemonError, P2PHandlerError
from hivemind.p2p.p2p_daemon_bindings.datastructures import PeerID, PeerInfo, StreamInfo
from hivemind.proto.p2pd_pb2 import RPCError
from hivemind.utils.asyncio import aiter, asingle
Expand All @@ -27,7 +29,6 @@ class P2PContext(object):
handle_name: str
local_id: PeerID
remote_id: PeerID = None
remote_maddr: Multiaddr = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To future reviewers: I and @deniskamazur have agreed to remove this because

  1. Handlers are supposed to work with peer IDs only (not with multiaddrs, which is a lower-level concept)
  2. Absence of remote_maddr simplifies the unary handler implementation



class P2P:
Expand Down Expand Up @@ -65,6 +66,7 @@ class P2P:

def __init__(self):
self.peer_id = None
self._client = None
self._child = None
self._alive = False
self._reader_task = None
Expand All @@ -90,6 +92,7 @@ async def create(
use_auto_relay: bool = False,
relay_hop_limit: int = 0,
startup_timeout: float = 15,
idle_timeout: float = 30,
) -> "P2P":
"""
Start a new p2pd process and connect to it.
Expand All @@ -111,6 +114,8 @@ async def create(
:param use_auto_relay: enables autorelay
:param relay_hop_limit: sets the hop limit for hop relays
:param startup_timeout: raise a P2PDaemonError if the daemon does not start in ``startup_timeout`` seconds
:param idle_timeout: kill daemon if client has been idle for a given number of
seconds before opening persistent streams
:return: a wrapper for the p2p daemon
"""

Expand Down Expand Up @@ -150,6 +155,7 @@ async def create(
relayDiscovery=use_relay_discovery,
autoRelay=use_auto_relay,
relayHopLimit=relay_hop_limit,
idleTimeout=f"{idle_timeout}s",
b=need_bootstrap,
**process_kwargs,
)
Expand All @@ -167,7 +173,7 @@ async def create(
await self.shutdown()
raise P2PDaemonError(f"Daemon failed to start in {startup_timeout:.1f} seconds")

self._client = p2pclient.Client(self._daemon_listen_maddr, self._client_listen_maddr)
self._client = await p2pclient.Client.create(self._daemon_listen_maddr, self._client_listen_maddr)
await self._ping_daemon()
return self

Expand All @@ -189,7 +195,7 @@ async def replicate(cls, daemon_listen_maddr: Multiaddr) -> "P2P":
self._daemon_listen_maddr = daemon_listen_maddr
self._client_listen_maddr = Multiaddr(cls._UNIX_SOCKET_PREFIX + f"p2pclient-{socket_uid}.sock")

self._client = p2pclient.Client(self._daemon_listen_maddr, self._client_listen_maddr)
self._client = await p2pclient.Client.create(self._daemon_listen_maddr, self._client_listen_maddr)

await self._ping_daemon()
return self
Expand Down Expand Up @@ -258,7 +264,7 @@ async def send_protobuf(protobuf: Union[TOutputProtobuf, RPCError], writer: asyn

@staticmethod
async def receive_protobuf(
input_protobuf_type: type, reader: asyncio.StreamReader
input_protobuf_type: Type[Message], reader: asyncio.StreamReader
) -> Tuple[Optional[TInputProtobuf], Optional[RPCError]]:
msg_type = await reader.readexactly(1)
if msg_type == P2P.MESSAGE_MARKER:
Expand All @@ -279,7 +285,7 @@ async def _add_protobuf_stream_handler(
self,
name: str,
handler: Callable[[TInputStream, P2PContext], TOutputStream],
input_protobuf_type: type,
input_protobuf_type: Type[Message],
max_prefetch: int = 5,
) -> None:
"""
Expand All @@ -297,7 +303,6 @@ async def _handle_stream(
handle_name=name,
local_id=self.peer_id,
remote_id=stream_info.peer_id,
remote_maddr=stream_info.addr,
)
requests = asyncio.Queue(max_prefetch)

Expand Down Expand Up @@ -349,7 +354,7 @@ async def _process_stream() -> None:
await self.add_binary_stream_handler(name, _handle_stream)

async def _iterate_protobuf_stream_handler(
self, peer_id: PeerID, name: str, requests: TInputStream, output_protobuf_type: type
self, peer_id: PeerID, name: str, requests: TInputStream, output_protobuf_type: Type[Message]
) -> TOutputStream:
_, reader, writer = await self.call_binary_stream_handler(peer_id, name)

Expand Down Expand Up @@ -381,15 +386,22 @@ async def add_protobuf_handler(
handler: Callable[
[Union[TInputProtobuf, TInputStream], P2PContext], Union[Awaitable[TOutputProtobuf], TOutputStream]
],
input_protobuf_type: type,
input_protobuf_type: Type[Message],
*,
stream_input: bool = False,
stream_output: bool = False,
) -> None:
"""
:param stream_input: If True, assume ``handler`` to take ``TInputStream``
(not just ``TInputProtobuf``) as input.
:param stream_output: If True, assume ``handler`` to return ``TOutputStream``
(not ``Awaitable[TOutputProtobuf]``).
"""

if not stream_input and not stream_output:
await self._add_protobuf_unary_handler(name, handler, input_protobuf_type)
return

async def _stream_handler(requests: P2P.TInputStream, context: P2PContext) -> P2P.TOutputStream:
input = requests if stream_input else await asingle(requests)
output = handler(input, context)
Expand All @@ -402,23 +414,65 @@ async def _stream_handler(requests: P2P.TInputStream, context: P2PContext) -> P2

await self._add_protobuf_stream_handler(name, _stream_handler, input_protobuf_type)

async def _add_protobuf_unary_handler(
self,
handle_name: str,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
handle_name: str,
handler_name: str,

Copy link
Collaborator Author

@dvmazur dvmazur Aug 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we both use handle_name and handler_name in the codebase. Maybe, we should rather clean it up in a follow-up PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, if possible, let's make a note of it somewhere (as an issue or as a TODO in the top message of this PR)

handler: Callable[[TInputProtobuf, P2PContext], Awaitable[TOutputProtobuf]],
input_protobuf_type: Type[Message],
) -> None:
"""
Register a request-response (unary) handler. Unary requests and responses
are sent through persistent multiplexed connections to the daemon for the
sake of reducing the number of open files.
:param handle_name: name of the handler (protocol id)
:param handler: function handling the unary requests
:param input_protobuf_type: protobuf type of the request
"""

async def _unary_handler(request: bytes, remote_id: PeerID) -> bytes:
input_serialized = input_protobuf_type.FromString(request)
context = P2PContext(
handle_name=handle_name,
local_id=self.peer_id,
remote_id=remote_id,
)

response = await handler(input_serialized, context)
return response.SerializeToString()

await self._client.add_unary_handler(handle_name, _unary_handler)

async def call_protobuf_handler(
self,
peer_id: PeerID,
name: str,
input: Union[TInputProtobuf, TInputStream],
output_protobuf_type: type,
output_protobuf_type: Type[Message],
) -> Awaitable[TOutputProtobuf]:
requests = input if isinstance(input, AsyncIterableABC) else aiter(input)
responses = self._iterate_protobuf_stream_handler(peer_id, name, requests, output_protobuf_type)

if not isinstance(input, AsyncIterableABC):
return await self._call_unary_protobuf_handler(peer_id, name, input, output_protobuf_type)

responses = self._iterate_protobuf_stream_handler(peer_id, name, input, output_protobuf_type)
return await asingle(responses)

async def _call_unary_protobuf_handler(
self,
peer_id: PeerID,
handle_name: str,
input: TInputProtobuf,
output_protobuf_type: Type[Message],
) -> Awaitable[TOutputProtobuf]:
serialized_input = input.SerializeToString()
response = await self._client.call_unary_handler(peer_id, handle_name, serialized_input)
return output_protobuf_type.FromString(response)

def iterate_protobuf_handler(
self,
peer_id: PeerID,
name: str,
input: Union[TInputProtobuf, TInputStream],
output_protobuf_type: type,
output_protobuf_type: Type[Message],
) -> TOutputStream:
requests = input if isinstance(input, AsyncIterableABC) else aiter(input)
return self._iterate_protobuf_stream_handler(peer_id, name, requests, output_protobuf_type)
Expand Down Expand Up @@ -453,6 +507,8 @@ async def shutdown(self) -> None:
await self._child.wait()

def _terminate(self) -> None:
if self._client is not None:
self._client.close()
if self._listen_task is not None:
self._listen_task.cancel()
if self._reader_task is not None:
Expand Down Expand Up @@ -501,11 +557,3 @@ async def _read_outputs(self, ready: asyncio.Future) -> None:

if not ready.done():
ready.set_exception(P2PDaemonError(f"Daemon failed to start: {last_line}"))


class P2PDaemonError(RuntimeError):
pass


class P2PHandlerError(Exception):
pass
Loading