Bug(?) in string parsing #45

flipz357 · 2024-02-08T13:18:27Z

I don't need this fixed, but maybe it's of interest:

I noticed that this smatch version returns a score of 1.00 for the different graphs:

(s / see-01
   :ARG0 (p / person
         :name (n / name
              :op1 "Hans")))

and

(s / see-01
   :ARG0 (p / person
         :name (n / name
              :op1 "Hans_")))

Wondering if there's a bug, or if there is some reason for this? It might be some sort of preprocessing that happens here which is not obvious. Since there are also https link etc. in AMR that may contain stuff like "_" I think it may not be sensible to remove characters.

edit

It can be even more severe. The score is also 1.00 for the very different graphs:

(s / see-01
   :ARG0 (p / person
         :name (n / name
              :op1 "Hans Meier")))

and

(s / see-01
   :ARG0 (p / person
         :name (n / name
              :op1 "Hans")))

While the first two graphs could be some pre-processing quirk, the second two ones clearly seem like bug.

The text was updated successfully, but these errors were encountered:

snowblink14 · 2024-02-09T06:41:21Z

Thanks for reporting this, @flipz357. I think this old piece of code has bug handling blank space and underscores when processing the values.

Should be an easy fix though -- I'll fix this soon.

jzw2 · 2024-07-03T08:24:32Z

Is this intended behaviour? I noticed the given test_input1 and test_input2 have a similar example, with "Wiilliam" differeing in capitalization and having underscores. It currently counts it as a match.

flipz357 · 2024-07-03T09:32:58Z

Hi @jzw2 I guess from the view of AMR evaluation/similarity it can seem okay to not differ between upper- and lower-case and be considered a user choice (for very strict evaluation, we might want to differ though). The observed treatment of underscores (the bug you experience, or the example shown above), however, seems indeed like wrong/unintended behavior and should not count as a match.

Maybe you can check out my SMATCH++ that has a best-practice evaluation protocol with AMR normalization (and ILP solver).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug(?) in string parsing #45

Bug(?) in string parsing #45

flipz357 commented Feb 8, 2024 •

edited

Loading

snowblink14 commented Feb 9, 2024

jzw2 commented Jul 3, 2024

flipz357 commented Jul 3, 2024

Bug(?) in string parsing #45

Bug(?) in string parsing #45

Comments

flipz357 commented Feb 8, 2024 • edited Loading

snowblink14 commented Feb 9, 2024

jzw2 commented Jul 3, 2024

flipz357 commented Jul 3, 2024

flipz357 commented Feb 8, 2024 •

edited

Loading