-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[EN DateTime V2] Added support for cases like "April ninth through 15th" (#2905) #2994
Conversation
@@ -346,7 +346,7 @@ private List<ExtractResult> ExtractImpl(string text, DateObject reference) | |||
var simpleCasesResults = Token.MergeAllTokens(tokens, text, ExtractorName); | |||
var ordinalExtractions = config.OrdinalExtractor.Extract(text); | |||
|
|||
tokens.AddRange(MergeTwoTimePoints(text, reference)); | |||
tokens.AddRange(MergeTwoTimePoints(text, new List<ExtractResult>(ordinalExtractions), reference)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to create a new list here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just following the same syntax used for the other methods in the list, will modify to pass directly the original list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed the creation of new lists from the other extracting methods
if (er.Count < 2) | ||
{ | ||
er = this.config.DateExtractor.Extract(this.config.TokenBeforeDate + text, referenceDate); | ||
if (er.Count >= 2) | ||
er.ForEach(o => o.Start -= this.config.TokenBeforeDate.Length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit brittle. Can you add a code comment explaining the rationale?
Also, could you run some perf tests? All this new extraction/parsing seems a bit heavy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't some equivalent logic be run in Merge instead of directly in Period?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I will check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implemented new approach by adding written ordinal numbers to the existing regexes, there is no perf impact in this way.
Test results for the mean test case execution time:
Base: (59.6 +/- 3.1)ms
Fix with OrdinalExtractor: (63.9 +/- 3.7)ms
Fix with regexes: (58.5 +/- 3.1)ms
{ | ||
var pr1Value = (DateTimeResolutionResult)pr1.Value; | ||
var pr2Value = (DateTimeResolutionResult)pr2.Value; | ||
var timexList1 = pr1.TimexStr.Split(Constants.DateTimexConnector[0]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why timex parsing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the timex of the ordinal number needs to be updated to refer to the same month of the other date.
It is similar to what happens to the year in the method SyncYear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed method
Fix to issue #2905.
Modified methods MergeTwoTimePoints in Base DatePeriod Extractor/Parser to support cases where one of the date points is an ordinal number.
Fixed issue in Hindi where "पहले" (first) was wrongly extracted from "से पहले" (before).
Test cases added to EN DateTimeModel.