Inconsistent annotations for LS numbers #464
Comments
Looking across the different treebanks, the EWT treebank is separating the [...]. They are also keeping multi-section list items grouped, such as in [...].
Thanks. A Grew-match query for these: See also #440
See email-enronsent38_01-0002 and successive sentences. They are kept as one token.
Perhaps, but I'm guessing they were separated in the original text with newlines or something. Messing with the sentence boundaries is something I'm a little reluctant to do...let's move that discussion to #415.
Will open a separate issue for this.
Validation issues:
There are several issues here:
- `NUM` instead of `X`, to be consistent with the other LS annotations.
- `NumType=Ord|NumForm=Digit` features -- there may be other cases like this.

Note: I'm using `NumType=Ord` here instead of `Card` as these are ordered values -- first, second, third, etc. -- not counted values.
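For anyone who wants to look for similar cases locally, here is a rough sketch of the consistency check described in this issue. It assumes the list markers carry `LS` in the XPOS column of a standard 10-column CoNLL-U file, and it only looks at markers that contain a digit. The function name `find_inconsistent_ls` and the sample sentences are hypothetical and not taken from any treebank.

```python
import re

# Features proposed in this issue for digit-form LS numbers.
EXPECTED_FEATS = {"NumType=Ord", "NumForm=Digit"}


def find_inconsistent_ls(conllu_text):
    """Return (sent_id, token_id, form, upos, feats) for digit LS tokens
    that are not tagged NUM or that lack the expected features."""
    problems = []
    sent_id = None
    for line in conllu_text.splitlines():
        if line.startswith("# sent_id ="):
            sent_id = line.split("=", 1)[1].strip()
            continue
        if not line or line.startswith("#"):
            continue  # blank separator or other comment line
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        tok_id, form, _lemma, upos, xpos, feats = cols[:6]
        # Assumption: list markers are identified by XPOS == "LS".
        if xpos != "LS" or not re.search(r"\d", form):
            continue
        feat_set = set() if feats == "_" else set(feats.split("|"))
        if upos != "NUM" or not EXPECTED_FEATS <= feat_set:
            problems.append((sent_id, tok_id, form, upos, feats))
    return problems


if __name__ == "__main__":
    # Hypothetical two-sentence sample; columns not relevant here are "_".
    sample = (
        "# sent_id = demo-1\n"
        "1\t1\t1\tX\tLS\t_\t_\t_\t_\t_\n"
        "2\tItem\titem\tNOUN\tNN\tNumber=Sing\t_\t_\t_\t_\n"
        "\n"
        "# sent_id = demo-2\n"
        "1\t2\t2\tNUM\tLS\tNumForm=Digit|NumType=Ord\t_\t_\t_\t_\n"
    )
    for hit in find_inconsistent_ls(sample):
        print(hit)
```

The subset test on the feature set means extra features on a token are tolerated; only a missing `NumType=Ord` or `NumForm=Digit`, or a UPOS other than `NUM`, is reported. Letter or roman-numeral markers are deliberately skipped, since the issue only proposes features for digit forms.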