
Technical debt: RemoteMixtureOfExperts (v0.8) #64

Closed
5 of 7 tasks
justheuristic opened this issue Jul 3, 2020 · 1 comment
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments


justheuristic commented Jul 3, 2020

  • beam search uses tuple endpoints (i.e., address and port), while the DHT switched to string endpoints; see the endpoint helper in the sketch after this list
  • beam search needs one extra step because prefix.123.321 != expert.123.321
  • we may no longer need parallel autograd if it is implemented natively in PyTorch (currently not the case)
    • remove hivemind.utils.autograd in favor of _RemoteExpertCallMany
  • add a more feature-rich test for moe.py (with several DHT nodes and experts)
  • cancel unused queries in first_k_active?
  • when declaring experts, introduce some kind of "grace period": only declare prefixes that have not been updated within that period (rationale: the first prefixes are likely to have been updated by other peers already); see the sketch after this list
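
For illustration, a minimal sketch of two of the items above: converting a legacy (address, port) tuple into a string endpoint, and the proposed grace-period check before (re)declaring a prefix. All names here (`as_string_endpoint`, `should_declare`, `GRACE_PERIOD`, `last_updated`) are hypothetical, not part of the hivemind API, and the grace period value is an arbitrary placeholder.

```python
from typing import Dict, Optional, Tuple
import time

# Hypothetical value; the issue does not specify a concrete grace period.
GRACE_PERIOD = 30.0  # seconds


def as_string_endpoint(endpoint: Tuple[str, int]) -> str:
    """Convert a legacy (address, port) tuple into the "address:port" string format."""
    address, port = endpoint
    return f"{address}:{port}"


def should_declare(prefix: str, last_updated: Dict[str, float],
                   now: Optional[float] = None) -> bool:
    """Return True if `prefix` should be (re)declared to the DHT.

    Prefixes refreshed within GRACE_PERIOD are skipped, since the first
    (shortest) prefixes are likely to have been declared by other peers already.
    Unseen prefixes (not in `last_updated`) are always declared.
    """
    now = time.time() if now is None else now
    return now - last_updated.get(prefix, 0.0) >= GRACE_PERIOD
```

Usage would amount to filtering the candidate prefixes through `should_declare` before issuing DHT store requests, so that only stale or unseen prefixes generate traffic.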
justheuristic added the enhancement and help wanted labels on Jul 3, 2020
justheuristic linked a pull request (8 tasks) on Jul 5, 2020 that will close this issue
justheuristic (Member, Author) commented:

merged as #80
