Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

switched chord label validation to regexp method #242

Merged
merged 1 commit into from
Mar 17, 2017

Conversation

bmcfee
Copy link
Collaborator

@bmcfee bmcfee commented Mar 17, 2017

This fixes #217 by using a regexp to parse chords, rather than explicit case splitting.

The regexp is ported over from JAMS, where it has already been thoroughly tested. I added an extra test case 'asdf' that was mentioned in the #217 thread as previously passing.

Because the regexp is such a monster, I disabled the pep8 length check for the corresponding line.

Added test case for chord regexp
@bmcfee bmcfee added this to the 0.5 milestone Mar 17, 2017
@bmcfee bmcfee requested review from craffel and ejhumphrey March 17, 2017 18:41
@bmcfee bmcfee mentioned this pull request Mar 17, 2017
Copy link
Collaborator

@craffel craffel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I assume chord experts (@ejhumphrey @mattmcvicar or others) have given this a thorough run-through in JAMS?

# which is in turn derived from the context-free grammar of
# Harte et al., 2005.

pattern = re.compile(r'''^((N|X)|(([A-G](b*|#*))((:(maj|min|dim|aug|1|5|sus2|sus4|maj6|min6|7|maj7|min7|dim7|hdim7|minmaj7|aug7|9|maj9|min9|11|maj11|min11|13|maj13|min13)(\((\*?((b*|#*)([1-9]|1[0-3]?))(,\*?((b*|#*)([1-9]|1[0-3]?)))*)\))?)|(:\((\*?((b*|#*)([1-9]|1[0-3]?))(,\*?((b*|#*)([1-9]|1[0-3]?)))*)\)))?((/((b*|#*)([1-9]|1[0-3]?)))?)?))$''') # nopep8
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use a triple quote if it's only one line?
Splitting into a few lines would not be so bad either.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting the line also prevents GitHub from doing nasty scrolling things.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting a regexp into multiple lines is inviting trouble, IMO. The triple-quotes aren't strictly necessary, but they don't hurt either.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting a regexp into multiple lines is inviting trouble, IMO.

I dunno, wcgw? And you can even add comments for each part of the otherwise inscrutable regex:
https://github.com/craffel/pretty-midi/blob/master/pretty_midi/utilities.py#L82

The triple-quotes aren't strictly necessary, but they don't hurt either.

Isn't this true of literally every string in Python?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dunno, wcgw?

I generally dislike relying on invisible string concatenation because it's brittle. If, somehow, the string no longer exists inside parentheses, say because you pull it out to its own variable name exp = '....' then it fails silently. I've been burned by this kind of thing before, so I try to avoid it.

Isn't this true of literally every string in Python?

Not strings that could have apostrophes in them (which we don't in this case) or could be interpolated via %/{} (which we also don't). Maybe it's just my own quirk to use r''' ''' for all regexps just in case those things do pop up naturally in the expression.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if, somehow, the string no longer exists inside parentheses,

You could make this argument for any expression in Python which appears in parentheses.

I've been burned by this kind of thing before, so I try to avoid it.

I sense some PTSD. We can just leave it as-is.

@bmcfee
Copy link
Collaborator Author

bmcfee commented Mar 17, 2017

Thanks. I assume chord experts (@ejhumphrey @mattmcvicar or others) have given this a thorough run-through in JAMS?

Yup. It's exactly the grammar from Harte's paper (which is right-recursive and hence regular, hurray for the chomsky language hierarchy!), with additional terminals for X, 1, 5, 13, and sus2 chords. The code to generate the regexp is here and there are extensive unit tests within jams.

@craffel craffel merged commit 98ae38f into mir-evaluation:master Mar 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make validate_chord_label more strict
2 participants