Fixes unique and equal checks #820

nezhar · 2021-06-29T09:06:55Z

Extends the equal check for dictionaries and lists.

This solves #686, but it's also useful for newer spec implementations such as draft 2020-12.
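
For context, a minimal sketch of the underlying problem, not the code from this PR: in Python `bool` is a subclass of `int`, so `0 == False`, while JSON Schema treats booleans and numbers as distinct values. `strict_equal` below is a hypothetical helper name used only for illustration, and it deliberately handles only plain `list` and `dict`.

```python
# In Python, bool is a subclass of int, so these all compare equal.
assert 0 == False and 1 == True
assert [0] == [False]
assert {"a": 1} == {"a": True}


def strict_equal(one, two):
    """Hypothetical helper: like ==, but never equates a bool with a number."""
    if isinstance(one, bool) or isinstance(two, bool):
        # Only two bools with the same value are equal.
        return isinstance(one, bool) and isinstance(two, bool) and one == two
    if isinstance(one, list) and isinstance(two, list):
        return len(one) == len(two) and all(
            strict_equal(i, j) for i, j in zip(one, two)
        )
    if isinstance(one, dict) and isinstance(two, dict):
        return one.keys() == two.keys() and all(
            strict_equal(one[k], two[k]) for k in one
        )
    return one == two
```

With a comparison like this, `[[0], [False]]` can be treated as having unique items, which is what the draft 2020-12 test suite expects.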

codecov · 2021-06-29T09:09:44Z

Codecov Report

Merging #820 (27661b2) into main (0287da9) will increase coverage by 0.20%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #820      +/-   ##
==========================================
+ Coverage   95.90%   96.11%   +0.20%     
==========================================
  Files          17       18       +1     
  Lines        2587     2726     +139     
  Branches      299      310      +11     
==========================================
+ Hits         2481     2620     +139     
  Misses         86       86              
  Partials       20       20

Impacted Files	Coverage Δ
jsonschema/tests/test_jsonschema_test_suite.py	`87.50% <ø> (ø)`
jsonschema/_utils.py	`90.51% <100.00%> (+2.21%)`	⬆️
jsonschema/tests/test_utils.py	`100.00% <100.00%> (ø)`
jsonschema/tests/test_validators.py	`98.60% <100.00%> (+0.02%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0287da9...27661b2.

Julian · 2021-06-29T09:56:24Z

There are some interactions with TypeChecker we need to account for -- specifically, we need a new test here internally (which I'm pretty sure will fail) where someone does type_checker.redefine("array", some_collections.Sequence_that_does_not_inherit_from_list), which is a supported use of the library, and then validates an instance of that non-list-inheriting sequence which still contains 1 and True.

And then the same for dict.

nezhar · 2021-06-29T20:16:43Z

Does this go in the right direction? Is there anything similar in the code you can point me to?

Julian · 2021-07-01T06:08:32Z

Sorry for being terse -- yes, that's in the right direction, but now the point is not to test the behavior in that test case itself (which is already covered elsewhere in the suite); it's whether #686 is fixed there. I.e. whether validating deque([[False], [0]]) against {'type': 'array', 'uniqueItems': True} also fails properly now (because it contains nonunique items). And the same for [MyMapping({"a": 0}), MyMapping({"a": False})].

Let me know if that makes sense; if not, I'll write up something longer. And thanks again.

nezhar · 2021-07-01T09:37:52Z

Thanks for the explanation. I added the examples you mentioned in the test 🙂

Maybe I'm missing something, but according to these tests this should be valid: https://github.com/json-schema-org/JSON-Schema-Test-Suite/blob/master/tests/draft2020-12/uniqueItems.json#L21

This is what actually fails:

validator.validate([deque([False]), deque([0])])

validator.validate([MyMapping('a', 0), MyMapping('a', False)])

According to the spec this should be valid, right?

Julian · 2021-07-01T09:42:17Z

Yes, sorry, I was typing too quickly -- valid, not invalid!

I didn't try them with the change here but yeah we want to make sure the fix covers those as well.
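
To make the two cases concrete, here is a rough sketch (not the PR's implementation) of a uniqueness check that relies on `collections.abc` rather than `list` and `dict`, so it also covers `deque` and custom mappings. `MyMapping`, `equal`, and `is_unique` here are hypothetical stand-ins; the `MyMapping` used in the actual tests is not shown in this thread and may have a different constructor.

```python
from collections import deque
from collections.abc import Mapping, Sequence


class MyMapping(Mapping):
    """Hypothetical mapping that does not inherit from dict."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def equal(one, two):
    """Compare recursively, treating bools as distinct from numbers."""
    if isinstance(one, bool) or isinstance(two, bool):
        return isinstance(one, bool) and isinstance(two, bool) and one == two
    if isinstance(one, str) or isinstance(two, str):
        # str is a Sequence too; compare directly to avoid endless recursion.
        return one == two
    if isinstance(one, Sequence) and isinstance(two, Sequence):
        return len(one) == len(two) and all(
            equal(i, j) for i, j in zip(one, two)
        )
    if isinstance(one, Mapping) and isinstance(two, Mapping):
        return one.keys() == two.keys() and all(
            equal(one[k], two[k]) for k in one
        )
    return one == two


def is_unique(container):
    """Brute-force pairwise check: O(n^2), but needs no hashing or sorting."""
    items = list(container)
    return not any(
        equal(items[i], items[j])
        for i in range(len(items))
        for j in range(i + 1, len(items))
    )
```

Under these assumptions, `is_unique(deque([[False], [0]]))` and `is_unique([MyMapping({"a": 0}), MyMapping({"a": False})])` both hold, matching the "valid" outcome agreed on above.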

nezhar · 2021-07-01T10:07:51Z

I removed the first check in uniq, as the set-based comparison would interfere with the equal check. The tests seem to be OK now; what do you think?

Julian · 2021-07-02T05:06:21Z

jsonschema/_utils.py

+        sliced = itertools.islice(sort, 1, None)
+
+        for i, j in zip(sort, sliced):
+            return not list_equal(list(i), list(j))


I think this will now likely fail for triply-nested non-list containers.

It also will now do a lot of copying (every object will always be copied).

It's a bit better if we instead just use isinstance checks against collections.abc.Sequence or collections.abc.Mapping, I suspect.

But yeah please check the additional nesting, I suspect we need even better tests that uncover that. Really the nice thing would be a hypothesis test that generates arbitrarily nested containers which it puts 0s and Falses inside and ensures they're always considered unique. If you know how to do that that'd be great, otherwise yeah I suspect at least another manual test is needed.

nezhar · 2021-07-02

You are absolutely right, triple nesting caused new failures; this was related to the equal function, whose behavior had not been extended. I refactored the tests and added some more cases, as I'm not really sure how to create a data generator here and I'm afraid it would make debugging harder.
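
As a rough illustration of the generated test suggested above, here is a standard-library sketch that uses a seeded `random.Random` in place of hypothesis, so failures stay reproducible. It is not part of this PR; `equal`, `nest`, `WRAPPERS`, and `check_nested_uniqueness` are hypothetical names, and `equal` stands in for whatever comparison the library ends up using.

```python
import random
from collections import deque
from collections.abc import Mapping, Sequence
from types import MappingProxyType


def equal(one, two):
    """Stand-in comparison: recursive, with bools distinct from numbers."""
    if isinstance(one, bool) or isinstance(two, bool):
        return isinstance(one, bool) and isinstance(two, bool) and one == two
    if isinstance(one, str) or isinstance(two, str):
        return one == two
    if isinstance(one, Sequence) and isinstance(two, Sequence):
        return len(one) == len(two) and all(
            equal(i, j) for i, j in zip(one, two)
        )
    if isinstance(one, Mapping) and isinstance(two, Mapping):
        return one.keys() == two.keys() and all(
            equal(one[k], two[k]) for k in one
        )
    return one == two


# Each wrapper nests a value one level deeper in a different container type.
WRAPPERS = [
    lambda v: [v],
    lambda v: deque([v]),
    lambda v: {"a": v},
    lambda v: MappingProxyType({"a": v}),
]


def nest(value, wrappers):
    """Wrap value in each container in turn, innermost first."""
    for wrap in wrappers:
        value = wrap(value)
    return value


def check_nested_uniqueness(rounds=200, max_depth=6, seed=0):
    """0 and False must differ, and 0 must equal 0, at every nesting shape."""
    rng = random.Random(seed)
    for _ in range(rounds):
        shape = [rng.choice(WRAPPERS) for _ in range(rng.randint(0, max_depth))]
        assert not equal(nest(0, shape), nest(False, shape))
        assert equal(nest(0, shape), nest(0, shape))
    return True
```

Fixing the seed addresses part of the debugging concern: a failing shape can be regenerated exactly and turned into a manual test case.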

Julian · 2021-07-02T05:07:17Z

jsonschema/_utils.py

+            e = unbool(e)
+
+            for i in seen:
+                if isinstance(i, dict) and isinstance(e, dict):


I'm also suspicious of this line if it still contains isinstance(..., dict) -- it'll then fail for non-dict mappings, so we need to exercise that case as well.

Let me know if that makes sense.

nezhar · 2021-07-02

This has been removed; the code now uses the equal function instead.
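
A small sketch of why the `isinstance(..., dict)` branch was a concern, using only the standard library. `types.MappingProxyType` is just one convenient example of a mapping that is not a `dict` subclass; it is an assumption that a user might register such a type as "object".

```python
from collections.abc import Mapping
from types import MappingProxyType

plain = {"a": 0}
proxy = MappingProxyType({"a": 0})  # read-only mapping, not a dict subclass

assert isinstance(plain, dict) and isinstance(plain, Mapping)
assert not isinstance(proxy, dict)  # a dict-only branch skips this object
assert isinstance(proxy, Mapping)   # the abstract base class still matches
```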

Julian · 2021-07-09T16:09:20Z

Apologies for the delay here. I made some small tweaks, but have merged this.

Much appreciated!

I'll get you some feedback on the other PR next!

nezhar added 2 commits June 29, 2021 11:44

#686: Fixes unique and equal checks

5ce1de7

Add test for list_equal and dict_equal

3e5781d

nezhar mentioned this pull request Jun 29, 2021

Add support for draft 2020-12 #817

Merged

Add test case with custom sequance type

4ef44a7

Julian reviewed Jul 2, 2021

View reviewed changes

Extend sequance and mapping check

27661b2

Julian closed this Jul 9, 2021

Julian mentioned this pull request Dec 15, 2021

perf: Undesired fallback to brute force container uniqueness check on certain input types #893

Merged
