Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes unique and equal checks #820

Closed
wants to merge 4 commits into from
Closed

Fixes unique and equal checks #820

wants to merge 4 commits into from

Conversation

nezhar
Copy link
Contributor

@nezhar nezhar commented Jun 29, 2021

Extends the equal check for dictionaries and lists.

This solves #686, but it's also usefull for newer spec implementations such as draft 2020-12

@codecov
Copy link

codecov bot commented Jun 29, 2021

Codecov Report

Merging #820 (27661b2) into main (0287da9) will increase coverage by 0.20%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #820      +/-   ##
==========================================
+ Coverage   95.90%   96.11%   +0.20%     
==========================================
  Files          17       18       +1     
  Lines        2587     2726     +139     
  Branches      299      310      +11     
==========================================
+ Hits         2481     2620     +139     
  Misses         86       86              
  Partials       20       20              
Impacted Files Coverage Δ
jsonschema/tests/test_jsonschema_test_suite.py 87.50% <ø> (ø)
jsonschema/_utils.py 90.51% <100.00%> (+2.21%) ⬆️
jsonschema/tests/test_utils.py 100.00% <100.00%> (ø)
jsonschema/tests/test_validators.py 98.60% <100.00%> (+0.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0287da9...27661b2. Read the comment docs.

@Julian
Copy link
Member

Julian commented Jun 29, 2021

There's some interactions with TypeChecker we need to account for -- specifically, we need a new test here internally (which I'm pretty sure will fail) where someone does type_checker.redefine("array", some_collections.Sequence_that_does_not_inherit_from_list), which is supported use of the library, and then which still contains 1 and True in that non-list-inheriting sequence.

And then the same for dict.

@nezhar
Copy link
Contributor Author

nezhar commented Jun 29, 2021

There's some interactions with TypeChecker we need to account for -- specifically, we need a new test here internally (which I'm pretty sure will fail) where someone does type_checker.redefine("array", some_collections.Sequence_that_does_not_inherit_from_list), which is supported use of the library, and then which still contains 1 and True in that non-list-inheriting sequence.

And then the same for dict.

Does this go in the right direction? Is there anything similar in the code you can point me to?

@Julian
Copy link
Member

Julian commented Jul 1, 2021

Sorry for being terse -- yes that's in the right direction, but now the point is not to test the behavior in that test case itself (which is already covered elsewhere in the suite), it's whether #686 is fixed there. I.e. whether deque([[False], [0]]), {'type': 'array', 'uniqueItems': True}) also fails properly now (because it contains nonunique items). And same for [MyMapping({"a": 0}), MyMapping({"a": False})]

Let me know if that makes sense, if not will write something longer up. And thanks again.

@nezhar
Copy link
Contributor Author

nezhar commented Jul 1, 2021

Sorry for being terse -- yes that's in the right direction, but now the point is not to test the behavior in that test case itself (which is already covered elsewhere in the suite), it's whether #686 is fixed there. I.e. whether deque([[False], [0]]), {'type': 'array', 'uniqueItems': True}) also fails properly now (because it contains nonunique items). And same for [MyMapping({"a": 0}), MyMapping({"a": False})]

Let me know if that makes sense, if not will write something longer up. And thanks again.

Thanks for the explanation. I added the examples you mentioned in the test 🙂

Maybe I miss something, but according to these tests this should be valid: https://github.com/json-schema-org/JSON-Schema-Test-Suite/blob/master/tests/draft2020-12/uniqueItems.json#L21

This is what actually fails

validator.validate([deque([False]), deque([0])])

validator.validate([MyMapping('a', 0),  MyMapping('a', False)])

According to the spec this should be valid, right?

@Julian
Copy link
Member

Julian commented Jul 1, 2021

Yes, sorry, I was typing too quickly -- valid, not invalid!

I didn't try them with the change here but yeah we want to make sure the fix covers those as well.

@nezhar
Copy link
Contributor Author

nezhar commented Jul 1, 2021

Yes, sorry, I was typing too quickly -- valid, not invalid!

I didn't try them with the change here but yeah we want to make sure the fix covers those as well.

I removed the first check in uniq as set would interfere in the equal check. Tests seem to be ok now, what do you think?

sliced = itertools.islice(sort, 1, None)

for i, j in zip(sort, sliced):
return not list_equal(list(i), list(j))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will now likely fail for triply-nested non-list containers.

It also will now do a lot of copying (every object will always be copied).

It's a bit better if we instead just use isinstance(collections.Sequence or collections.Mapping I suspect.

But yeah please check the additional nesting, I suspect we need even better tests that uncover that. Really the nice thing would be a hypothesis test that generates arbitrarily nested containers which it puts 0s and Falses inside and ensures they're always considered unique. If you know how to do that that'd be great, otherwise yeah I suspect at least another manual test is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are absolutely right, a tripple nesting caused new failures, this was related to the equal function where the behavior was not extended. I refactored the tests and added some more cases as I'm not really sure how to create a data generator here and I'm afraid this makes debugging harder.

e = unbool(e)

for i in seen:
if isinstance(i, dict) and isinstance(e, dict):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line too I'm suspicious of if it still contains isinstance(, dict) -- it'll fail for non-dict mappings then, so we need to exercise that case as well.

Let me know if that makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been removed and it makes use of the equal function instead.

@Julian
Copy link
Member

Julian commented Jul 9, 2021

Apologies for the delay here. I made some small tweaks, but have merged this.

Much appreciated!

I'll get you some feedback on the other PR next!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants