-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Megular Expressions #263
base: master
Are you sure you want to change the base?
Megular Expressions #263
Conversation
3fbfcdd
to
126f9ef
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very useful to me modulo comments. See how nice the IsWebRTCDirectMultiaddr
method is now vs compared to master.
I think we should make this package Experimental / Alpha and start using this in go-libp2p. After some experience we can remove the Experimental tag, till then we can keep iterating on a satisfactory api.
meg/sugar.go
Outdated
import ( | ||
"errors" | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the API. Some other things I'd like are:
Any
which matches any protocol
Match the pattern somewhere in the middle, not necessarily at the start.
Any()*
which will match everything from that point onwards, so we can match IP, TCP, Port, Rest,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have any now. Here's an example that captures a ipport and any*
found, err := m.Match(
CaptureAddrPort(&network, &addrPort),
meg.ZeroOrMore(meg.Any),
)
meg/meg.go
Outdated
if s.capture != nil { | ||
cm[s.capture] = append(cm[s.capture], c.Value()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's more powerful to capture the entire Component and not just Value. That way for some uses you don't have to move back from the string from to the component form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is done in #269
util.go
Outdated
func (m Multiaddr) Match(p ...meg.Pattern) (bool, error) { | ||
s := meg.PatternToMatchState(p...) | ||
return meg.Match(s, m) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to make this a method as opposed to:
MatchPattern(m Multiaddr, p ...meg.Pattern)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason. I think it's a bit more ergonomic for it to be a method, but I don't have a strong preference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect most users to not want to run this everytime. s := meg.PatternToMatchState(p...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow. They would just do:
var m Multiaddr
// ...
m.Match(<some pattern>)
No use of meg.PatternToMatchState
directly
493f175
to
47c55fc
Compare
126f9ef
to
6bbe24b
Compare
x/meg/sugar.go
Outdated
func Optional(s Pattern) Pattern { | ||
return func(next *MatchState) *MatchState { | ||
return &MatchState{ | ||
kind: split, | ||
next: s(next), | ||
nextSplit: next, | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want an Optional
or do we want a MatchIfExists
? By MatchIfExists
I mean if the component is there we always want to match it. While an Optional if I understand correctly, may be ignored if it is a prefix of the following match components.This might simplify the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the regex equivalent of MatchIfExists?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if there is. I think the expectation is to rewrite such regexs.
Implementation wise, if the next character doesn't match we still move the state ahead without moving the character index ahead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation wise, if the next character doesn't match we still move the state ahead without moving the character index ahead.
This sounds a bit like Optional.
Overall, I think we do want Optional. For example, SNI is an optional field in many multiaddrs
2a8b8af
to
be1c5ad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one unaddresses comment:
#263 (comment)
Support captures export some things wip thinking about public API Think about exposing meg as a public API doc comments Finish rename Add helper for meg and add test add comment for devs
twice as fast without the copy
* much cheaper copies of captures * Add a benchmark * allocate to a slice. Use indexes as handles * cleanup * Add nocapture loop benchmark It's really fast. No surprise * cleanup * nits
* Use Matchable interface * Add Bytes to Matchable interface * feat(x/meg): Support capturing bytes * Export CaptureWithF Can be used by more specific capturers (e.g capture net.AddrIP) * Support Any match, RawValue, and multiple Concatenations * Add CaptureAddrPort
c14016b
to
ae47e22
Compare
ad1932c
to
0c5383d
Compare
A very simple regular expression matcher for Multiaddr components. Supports capturing values. Matches in linear time (no back tracking).
The core logic is about 100 LOC. The sugar to make it nicer to use is about another 100 LOC.
Motivation
If we are going to treat Multiaddrs as encoding, then we need to make it more ergonomic to parse multiaddrs. Right now we have a lot of somewhat wrong manual parsers using
ForEach
. This should be able to replace those, make it cleaner, and most importantly make them obviously correct.In draft while I try this out in go-libp2p.
Example
Parsing a WebTransport Multiaddr
m
.