Date Match with NeighborhoodRange greater than 0.91 fails to give valid results #35

Closed
manishobhatia opened this issue Sep 3, 2020 · 1 comment

Comments

@manishobhatia
Contributor

The DATE element type allows users to override the NeighborhoodRange, but a value greater than 0.91 causes poor matches to show up.

The following test case in MatchServiceTest.java reproduces the failure:

    @Test
    public void itShouldApplyMatchWithDate() {
        List<Object> dates = Arrays.asList(getDate("01/01/2020"), getDate("01/02/2020"), getDate("07/15/2019"));
        List<Document> documentList = getTestDocuments(dates, DATE, 0.91);
        Map<Document, List<Match<Document>>> result = matchService.applyMatch(documentList);
        result.entrySet().forEach(entry -> {
            entry.getValue().forEach(match -> {
                System.out.println("Data: " + match.getData() + " Matched With: " + match.getMatchedWith() + " Score: " + match.getScore().getResult());
            });
        });

        Assert.assertEquals(2, result.size());
    }

As we increase the value beyond 0.91, dates that are not in the neighborhood show up in the results.

The issue is primarily caused by the incorrect usage of this constant:

private static final double DATE_SCALE_FACTOR = 1.1;

It widens the TokenRange lower and upper bounds, causing incorrect matches to show up.
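
One way to see why 0.91 could be the tipping point: 1.1 × 0.91 = 1.001, which is just over 1. The sketch below is only an illustration, not the library's actual code. It assumes (this is not quoted anywhere in this issue) that the scale factor is multiplied into the NeighborhoodRange and that the window half-width is the absolute value of `value * (1 - effectiveRange)`. The class `DateRangeSketch` and its methods are hypothetical names.

```java
import java.util.concurrent.TimeUnit;

public class DateRangeSketch {
    // Value quoted in the issue.
    static final double DATE_SCALE_FACTOR = 1.1;

    // ASSUMPTION: the scale factor is multiplied into the neighborhood range.
    static double effectiveRange(double neighborhoodRange) {
        return DATE_SCALE_FACTOR * neighborhoodRange;
    }

    // ASSUMPTION: the window half-width is |value * (1 - effectiveRange)|,
    // computed on the date's epoch milliseconds.
    static double halfWidthMillis(long epochMillis, double neighborhoodRange) {
        return Math.abs(epochMillis * (1.0 - effectiveRange(neighborhoodRange)));
    }

    // Same half-width, expressed in days for readability.
    static double halfWidthDays(long epochMillis, double neighborhoodRange) {
        return halfWidthMillis(epochMillis, neighborhoodRange) / TimeUnit.DAYS.toMillis(1);
    }

    public static void main(String[] args) {
        long jan1 = 1577836800000L; // 2020-01-01T00:00:00Z
        for (double range : new double[] {0.91, 0.93, 0.95}) {
            System.out.printf("range=%.2f effective=%.3f halfWidthDays=%.1f%n",
                    range, effectiveRange(range), halfWidthDays(jan1, range));
        }
    }
}
```

Under these assumptions, once the effective range exceeds 1 the half-width grows again as the NeighborhoodRange rises, so a date such as 07/15/2019 (about 170 days before 01/01/2020) would fall inside the window at higher settings even though it is excluded at 0.91.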

@aavaas

aavaas commented Sep 23, 2020

Hi! I would like to take on this bug! Please assign.

Thank you!

aavaas pushed a commit to aavaas/fuzzy-matcher that referenced this issue Oct 2, 2020
aavaas pushed a commit to aavaas/fuzzy-matcher that referenced this issue Nov 10, 2020
manishobhatia added a commit that referenced this issue Nov 12, 2020
#35 fix DateMatch with NeighborhoodRange greater than 0.91 failing

2 participants