Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow "$ref" in place of any schema. #194

Closed
wants to merge 3 commits into from

Conversation

handrews
Copy link
Contributor

This is an alternative to #188 that includes both root schemas and subschemas.

handrews and others added 3 commits December 12, 2016 11:29
@handrews
Copy link
Contributor Author

While the diff looks large, all it does is move the existing top level schemas under "schema" and "hyperSchema" and make the top level a "oneOf" between either a reference or a regular schema/hyper-schema. Because of how that works, the default of {} needs to be re-specified in hyper-schema now, but that is the only other change.

This was referenced Dec 13, 2016
@awwright
Copy link
Member

Yeah, git diff -w helps reduce the diff size.

I'd like to hear some feedback from some more people about the direction to go. This is a bit of a change from how we've handled $ref up until now, and it might be the case the wording in the document itself needs to change instead.

@handrews
Copy link
Contributor Author

@awwright you should have gotten that feedback when you changed the spec.

@Relequestual
Copy link
Member

Pro tip, add &w=1 to a github diff page to ignore whitespace! =]

@Relequestual Relequestual added this to the Meta-schema draft-05 milestone Dec 20, 2016
@Relequestual
Copy link
Member

@awwright @handrews I much prefer this PR to #188.
This PR clearly fixes, as you quoted here #188 (comment)

7.  Schema references with $ref

	The "$ref" keyword is used to reference a schema, and provides the
	ability to validate recursive structures through self-reference.

	An object schema with a "$ref" property MUST be interpreted as a
	"$ref" reference.  The value of the "$ref" property MUST be a URI
	Reference.  Resolved against the current URI base, it identifies the
	URI of a schema to use.  All other properties in a "$ref" object MUST
	be ignored.

Do you not believe this language makes this PR required?

@Relequestual
Copy link
Member

@epoberezkin would really appreciate you weighing in on this as it's a blocker to a critical issue on the website, and an important issue for the draft-5 meta-schema.

@epoberezkin
Copy link
Member

epoberezkin commented Dec 20, 2016

@Relequestual I am not sure what is the contention here. I thought everybody agrees with #194, at least generally :)

As long as $ref means what I believe it means (I'll get to it in a second) I'm pretty neutral about all of the following options:

  • only allow $ref in subschemas (better looking meta-schema from aesthetics point of view, also easier for some users to process it)
  • allow $ref in root schema as well (more consistent, but uglier looking meta-schema)
  • not mention $ref in meta-schema at all (as it is in draft 4)

Whether we allow or dis-allow $ref in sub/root-schemas, the same should be done for booleans. Both approaches have their advantages and disadvantages. This one (#194) seems to have more support, so I am fine with it.

Some people may struggle with not having type: object and properties at the top level, but they will survive.

Regarding my note about what I think $ref means. What I think is missing in the current $ref definition (#188 (comment)) in spec is the statement like this:

base URI inside referenced (sub-)schema is determined by the base URI in the source schema and not by the URI used in $ref.

There are cases when these URIs are not the same thing, as I was trying to point out on many occasions. Also this is how the spec says (or a least implies) currently and how it is in the the test suite. Changing it would be quite damaging.

@awwright do you agree? Can we add something to this regard in the spec to be explicit?

@Relequestual
Copy link
Member

Relequestual commented Dec 20, 2016

@epoberezkin Thanks for the feedback. This issue is just about fixing the meta-schema for draft-5 language, and not about any new draft-6 functionality.

I'd be HUGELY greatful if you could extract parts of your comment to a different issue accoridngly please, as I really want to make 100% sure this PR comments remain on topic and do not diverge.

If I'm missunderstanding, and all of your comments are link to the draft-5 language, and not new draft-6 functionalit, then please dissregard. I'm not 100% clear on everything =/

@Relequestual Relequestual self-assigned this Dec 20, 2016
@handrews
Copy link
Contributor Author

Pro tip, add &w=1 to a github diff page to ignore whitespace! =]

OMG and they couldn't just put a button on the UI for that? Here's the URL for anyone who wants to see the much simpler diff that ignores whitespace:

https://github.com/json-schema-org/json-schema-spec/pull/194/files?w=1

@epoberezkin
Copy link
Member

Yeah, whitespace option is long due...

@Relequestual I thought that the intention to correct meta-schema is related to draft-6, why the same meta-schema as for draft-4 cannot be used, given that draft-5 is already published and the intention was to clarify rather than to change anything?

@handrews
Copy link
Contributor Author

why the same meta-schema as for draft-4 cannot be used, given that draft-5 is already published and the intention was to clarify rather than to change anything?

Because it failed at that intention.

In Draft 4, $ref was not part of JSON Schema at all, it was a separate spec that was considered separately. So it could not appear in the meta-schema, and didn't need to as it was always valid to put anywhere. And it wasn't possible because you could not define a property named $ref because $ref always indicated a reference no matter where it was.

Draft 5 moved $ref into the spec and placed restrictions on where it can appear. So $ref is now a keyword that can only appear in certain places, which allows for us to name properties $ref. This was the main point of making that change. So now $ref is only valid in some places, and needs to be part of the meta-schema to express that.

@epoberezkin
Copy link
Member

Draft 5 moved $ref into the spec and placed restrictions on where it can appear. So $ref is now a keyword that can only appear in certain places, which allows for us to name properties $ref. This was the main point of making that change. So now $ref is only valid in some places, and needs to be part of the meta-schema to express that.

I understand that but I am not convinced that it requires changes in meta-schema. I-D can have restrictions on the schema that are not expressed in meta-schema and as far as I understand $ref was supported as a property name in most validators anyway. But, if you say so, I don't mind.

I just thought it would be more productive to focus on getting draft-6 out rather than creating meta-schema for draft-5 which is very unlikely to be used by anybody...

@handrews
Copy link
Contributor Author

handrews commented Dec 20, 2016

I understand that but I am not convinced that it requires changes in meta-schema. I-D can have restrictions on the schema that are not expressed in meta-schema

Why? Why are you and awwright obsessed with making it harder to write a conforming schema by denying schema authors an accurate meta-schema? What is the downside of having a correct meta-schema? Seriously, how does it hurt us to actually require what we say we require?

Also THIS IS FOR DRAFT 6. Why do people keep thinking it is for draft 5? The fact that this particularl change would also be useful for a hypothetical but non-existing draft 5 meta-schema is irrelevant. It's for draft 6.

There is a draft 5 proposal, it's in the web site repo (b/c it was requested there). No one is paying attention to it because we all want to get draft 6 out. That is why I'm doing this.

@awwright
Copy link
Member

awwright commented Dec 20, 2016

We should also consider what we do in the event that we describe a feature that can't itself be described in the meta-schema. How we handle the "$schema" keyword itself is very close to this scenario.


We should also look back on what it is, exactly, that we consider a schema.

Well, a schema must be a JSON document -- itself a string of octets. We could, theoretically, describe a JSON Schema document in an ABNF. So at the very least, we should consider true, false, and $ref as directives that can appear. This seems to imply they should probably be described in some fashion in the meta-schema.

An ABNF doesn't have to exhaustively describe all the restrictions of a valid document, but there shouldn't be a case where the ABNF/grammar/schema disallows something that's actually valid. Which is what the meta-schema is doing now if you have a schema like {"$ref":"", "minimum":""}


I don't tend to agree that if true/false should be treated the same as $ref for consistency's sake. There might be a good reason we want to consider true/false a schema unto themselves, but consider $ref merely as a directive, without being a schema.

For example, third parties might be asked to write a schema, and true or false seem like perfectly reasonable values in they would want to provide.

But I could make the same argument for $ref too, what if a user wants to simply reference another schema, when asked to provide a schema.

Ultimately, I think anything that's legal in "allOf" should be legal by itself.

The other argument I've made against $ref in root is to help prevent infinite loops. But we have to implement infinite loop detection regardless, because of "allOf" and similar keywords. So this is pretty weak.


There's one other alternative here, which was used in earlier drafts: Allow a mere string as a URI reference to a schema. This would be in addition to allowing objects and booleans as a schema.

@handrews
Copy link
Contributor Author

We should also consider what we do in the event that we describe a feature that can't itself be described in the meta-schema.

Sure, but that's not the point of this PR and does not need to be decided in order to handle this PR.

There's one other alternative here, which was used in earlier drafts: Allow a mere string as a URI reference to a schema. This would be in addition to allowing objects and booleans as a schema.

Again, this PR is about bringing the meta-schema in alignment with the current draft, within the scope of what we have agreed should ideally be in Draft 6.

The idea of going back to a string URI reference, which is a radical departure from the last two drafts, is NOT suitable for discussion in this PR. If you want to propose that, please file it separately and we can consider it for Draft 7 or later.

@awwright
Copy link
Member

awwright commented Dec 20, 2016

Discussion about other features is appropriate if it's sizing up a possible alternative, in comparison to the PR. But yes, extended discussion should take place elsewhere.

There's one more option, if we're worried about putting keywords at root-level or not: Just disallow all additionalProperties in a "$ref" schema: { "dependencies": {"$ref": {"properties":{"$ref":true}, "additionalProperties":false}}} I would definitely like to see a SHOULD NOT be added to this effect.

@handrews
Copy link
Contributor Author

handrews commented Dec 20, 2016

Just disallow all additionalProperties in a "$ref" schema

I disagree with this but won't discuss it here. I'd be happy to elaborate in a separate issue.

This PR is just for aligning the meta-schema with what is already in the specification. We are not discussing alternatives here. We are fixing a bug.

"Alternatives" here means changing the spec, so file an issue if you want to change the spec. This is a bug fix.

@handrews handrews mentioned this pull request Dec 27, 2016
@epoberezkin
Copy link
Member

epoberezkin commented Dec 28, 2016

If the intention to prohibit all keywords but $ref, I very much prefer @awwright 's suggestion above (#194 (comment)) to this PR. I was actually going to post something similar.

@Relequestual even though it's intended for draft5, the same approach would go into draft6, and I don't like this approach.

I very much dislike the idea of not having properties on the top level and having such convoluted meta-schema because:

  1. it's very confusing - nobody will understand the intention
  2. it is likely to break some tools that expect properties on the top level

Meta-schema should not only validate schemas correctly, it should also document what the schema should be. With this PR it doesn't look like a documentation at all - it is obscure.

If we only want to prohibit validation keywords and allow everything else, then we can do this in draft5:

{
  "dependencies": {
    "$ref": {
      "not": {
        "anyOf": [
          { "required": ["properties"] },
          { "required": ["items"] },
          ...
        ]
      }
    }
  }
}

And in draft 6 it will be simpler:

{
  "dependencies": {
    "$ref": {
      "propertyNames": {
        "not": { "enum": ["properties", "items", ...] }
      }
    }
  }
}

In both cases the language in spec will have to change from "ignore" to "MUST NOT be present". If we want to keep "ignore", all that's needed is to add $ref into meta-schema inside properties keyword.

@awwright @handrews @Relequestual I can make a PR if that seems ok, we just need to settle on the option we want.

@handrews
Copy link
Contributor Author

@awwright @epoberezkin let me get this clear: You are both advocating that we fundamentally change how $ref works in draft 6 instead of simply bringing the meta-schema in line with how it works today.

Do you both understand that that is a major change and one that needs a lot of debate somewhere more public than a PR that is just a simple bug fix? This is going to majorly delay draft 6 if we want to make this change. Do you both feel that it is more important to change how referencing works than it is to get a functional draft with meta-schema out for public commentary and feedback?

@epoberezkin
Copy link
Member

epoberezkin commented Dec 28, 2016

@handrews I am not advocating that we change it. I made several suggestions, if you've read my comment, some of them keep it almost as it is. I just wrote that I prefer what @awwright wrote to this PR that even we can't reason about.

The compromise can be to simply define $ref as a property and not impose any restrictions on the presence of other properties in meta-schema, and simply keep saying in the spec that other keywords should be ignored.

All you achieve by this PR is that you are making a bigger number of schemas valid than without this PR. With this PR this schema is valid: { "$ref": "foo", "maximum": "bar" }. Without this PR the same schema is invalid. Was it worth going to all this trouble to achieve this (questionable) result?

@epoberezkin
Copy link
Member

epoberezkin commented Dec 28, 2016

What @awwright is trying to say that unless we want to restrict what can be together with $ref, this PR is absolutely pointless - it makes some weird schemas valid. @handrews if you don't want to make the decision about what can and what cannot be together with $ref, then this PR should simply be dropped and $ref simply defined as other keywords without any context restrictions (in meta-schema, they can still be in the spec). Otherwise we should decide now how we want to restrict it.

Again, you may think that this PR restricts something but it achieves the opposite effect - it extends what is a valid schema.

@handrews
Copy link
Contributor Author

What @awwright is trying to say that unless we want to restrict what can be together with $ref, this PR is absolutely pointless - it makes some weird schemas valid.

It is not pointless, it implements what we have in the spec. Why is that so hard for you and @awwright to understand? This is what the spec says. Lots of weird schemas are valid.

Again, you may think that this PR restricts something but it achieves the opposite effect - it extends what is a valid schema.

@epoberezkin Show me a schema that was invalid before this PR but is now valid after this PR.

@epoberezkin
Copy link
Member

epoberezkin commented Dec 28, 2016

Show me a schema that was invalid before this PR but is now valid after this PR.

It is in my comment above: { "$ref": "foo", "maximum": "bar" }

The presence of $ref allows any additional property so you can have anything you want in any keyword.

@epoberezkin
Copy link
Member

epoberezkin commented Dec 28, 2016

There is not a single schema that became invalid because of this PR. That's why I say it's pointless - it only extends what is valid at the price of a very high complexity.

@handrews
Copy link
Contributor Author

Show me a schema that was invalid before this PR but is now valid after this PR.

It is in my comment above: { "$ref": "foo", "maximum": "bar" }

The presence of $ref allows any additional property so you can have anything you want in any keyword.

That was valid before and it's still valid. This change doesn't make it more valid. The spec has always allowed additional properties but required that they be ignored.

There is not a single schema that became invalid because of this PR. That's why I say it's pointless.

This PR is just implementing what is in the spec. It can't make more things invalid because that is not in the spec, and we do not have anything close to agreement on changing the spec. If we do change the spec, we can (and should) change the schema to match the changed spec. It will not be harder to do that then. Whenever that is.

it only extends what is valid.

It doesn't extend anything! Show me a schema that was invalid before and is valid now. There are none, that's the whole point. This is what the spec says, nothing more, nothing less. Nothing that was valid became invalid. Nothing that was invalid became valid.

at the price of a very high complexity

What complexity? Using a definition for most of the schema? Nearly all non-trivial schemas I've ever worked with are done almost entirely in "definitions" plus just a bit at the top level to bring in the various definitions by name. Most people like naming things and referencing names. The names are more clear.

@epoberezkin
Copy link
Member

epoberezkin commented Dec 28, 2016

That was valid before and it's still valid.

Really? "maximum" must be a number. With this PR it can be anything, as long as there is $ref alongside it.

This PR is just implementing what is in the spec.

Not sure I understand what exactly it achieves that is not achieved already without it.

What complexity? Using a definition for most of the schema?

Yes + a convoluted oneOf / allOf combination where you still don't see that it extends the validity.

@epoberezkin
Copy link
Member

epoberezkin commented Dec 28, 2016

EDIT: When I say extends what is valid, I mean the result of validation of schema against meta-schema, not spec-compliance/non-compliance.

Just try validating schema { "$ref": "foo", "maximum": "bar" } by meta-schema before this PR and after, you will see that it was invalid and became valid.

So when you say "it implements the spec" what exactly it achieves apart from the extension above? Or you're really trying to say that the schema above MUST be valid and to achieve it we should make meta-schema as complex as you suggest? I don't think you'll get much support with this idea. I think everybody assumed that this PR restricts something... @Relequestual @awwright

@handrews
Copy link
Contributor Author

Just try validating schema { "$ref": "foo", "maximum": "bar" } by meta-schema before this PR and after, you will see that it was invalid and became valid.

It was only invalid because "$ref" wasn't in the schema at all. That was a bug not a feature. In draft 4 "$ref" was specified entirely separately and put no restrictions on what else was beside it as long as those properties were ignored. In draft 4, the meta schema was not used with "$ref" at all, as the rule was that first dereferencing occurred, then the meta-schema was applied to the result.

In draft 4's meta-schema "format" was left out by mistake. So technically "format": 42 was valid under draft 4's meta-schema, but we shouldn't use it to argue for anything because it was an error in the first place.

Draft 5 put "$ref" under the control of the JSON Schema spec without changing its syntax. That is what this implements. You are taking out your frustrations with "$ref" on this meta-schema bug fix instead of filing an actual issue to change "$ref" (and so is @awwright )

@epoberezkin
Copy link
Member

epoberezkin commented Dec 28, 2016

Even if you add $ref in meta-schema it would still validate other keywords, even though the spec instructs to ignore them during validation.

So, am I right that all you want to achieve with this PR is to specifically allow any value of any property alongside $ref, even if this property happens to be a JSON Schema keyword?

I would be very surprised if anybody understood it was the intention... If that's the case, it's not just pointless, it's almost malicious :)

@epoberezkin
Copy link
Member

epoberezkin commented Dec 28, 2016

That was a bug not a feature

So technically "format": 42 was valid under draft 4's meta-schema

That's not a bug. That's a simple recognition of the fact that the meta-schema can be a bit less/more restrictive than the spec. Again, you only do structural validation with the schema and keep other things outside. It equally applies to the meta-schema validation as it applies to any validation.

Your attempt to cram all the spec into meta-schema leads to unnecessarily overcomplicated meta-schema without adding any value to it.

@epoberezkin
Copy link
Member

You are taking out your frustrations with "$ref" on this meta-schema bug fix instead of filing an actual issue to change "$ref"

You are not right, and let's not make it personal. I have no frustration with $ref. I can only remind you that Ajv is the ONLY JavaScript validator at the moment that implements $ref correctly. So I know a thing or two about it.

I am fighting your super-meticulous approach of mirroring all the spec requirements in meta-schema at the expense of its clarity and usefulness not only as a schema to validate against, but also a documentation.

@handrews
Copy link
Contributor Author

I didn't view the "frustration" comment as a personal insult, I honestly thought it was a fairly obvious observation across all of the numerous discussions about $ref. It was certainly not my intention to be insulting so I apologize for that.

I still don't understand how using a definition makes anything less clear.

@handrews
Copy link
Contributor Author

@epoberezkin so are you going to open an issue for the changes you want?

@epoberezkin
Copy link
Member

I can make a new PR if we are in agreement to drop this one and about the course of action.

I am ok with a very simple change - add a property for $ref into a schema. Yes, it will not allow mixing invalid values of validation keywords with $ref. Then, spec doesn't say that invalid values should be allowed, it simply says that they should be ignored. So they won't be ignored during validation by meta-schema and will be ignored during validation - what's the damage?

@handrews
Copy link
Contributor Author

@epoberezkin I think I follow what you are saying there. Please submit the PR and if I still think it makes sense I will close this one in favor of the new one.

@epoberezkin
Copy link
Member

ok, will do

@epoberezkin
Copy link
Member

epoberezkin commented Dec 28, 2016

@handrews Should it include the changes for additionalProperties/additionalItems/boolean type? Or should there be as @awwright wanted several PRs?

@epoberezkin
Copy link
Member

@handrews ok, done. Others things have already been merged.

@handrews
Copy link
Contributor Author

Closing in favor of #211 as explained in that PR.

If @awwright or @Relequestual object to #211, I may re-submit this, but will likely do so as a new PR based on whatever reasons there are to object to #211. But I'm hoping #211 is it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants