Comments (16)
I'm not really in favor of this; it seems to solve a problem that is adequately handled by existing keywords. However, I am not deeply attached to that position, and maybe I would come to love the convenience of this approach (I've considered implementing something similar for my own purposes before). My thoughts/suggestions, supposing something like this were to go in:
Given this is a pointer, and per "This principle can also be extended to pointers of rank greater than 1":
- This is pretty fundamentally different from properties/prefixItems in applying at an arbitrary depth. I would consider it in a different category. That's fine/good, as changing properties is quite unlikely to happen.
- Changing required is (just my opinion) definitely a nonstarter; it is incompatible with countless schemas.
"The use of JSON pointer and the keyword 'properties' are compatible if we accept for the keyword 'required' indifferently name or JSON pointer."
This is incorrect; a property name may be any string, including any valid JSON pointer, so treating required entries or properties names as either property names or pointers is unresolvably ambiguous. But a new keyword for this is possible.
- Terminology: "child" should refer to the immediate child of a node, not a node at any depth below. It doesn't appear in the spec currently, but the term "descendent" is what I use in my tooling to refer to any node below another node (or the node itself, i.e. a node is a descendent of itself at pointer ""). I'll use that term below.
And, I know this is half the point of your suggestion, but the idea of having pointers as properties of JSON Schemas is, I think, unlikely to actually get merged (again, just my opinion, which has no bearing on any actual outcome). It's possible, since the spec doesn't define any existing keywords that start with / - I don't know if any other existing vocabularies do (I doubt it, but it is quite possible) - but this would restrict that possibility. Adding a new keyword is less unlikely.
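The ambiguity between property names and pointers described above can be illustrated with a minimal Python sketch. The helper name lookup_as_pointer and the sample instance are hypothetical, and the helper assumes every pointer segment names an object property (no arrays):

```python
def lookup_as_pointer(instance, pointer):
    """Walk an RFC 6901 JSON Pointer through nested objects only (assumption: no arrays)."""
    node = instance
    for token in pointer.split("/")[1:]:
        # Unescape in the order required by RFC 6901: ~1 first, then ~0.
        node = node[token.replace("~1", "/").replace("~0", "~")]
    return node


# Hypothetical instance: it has a property literally named "/a/b"
# and also a descendent located at the pointer /a/b.
instance = {"/a/b": "property named /a/b", "a": {"b": "descendent at /a/b"}}
entry = "/a/b"  # an entry as it might appear in a required array

as_property_name = instance[entry]
as_pointer = lookup_as_pointer(instance, entry)
print(as_property_name)
print(as_pointer)
```

Both readings are legal for the same string and select different values, which is why a keyword would have to commit to one interpretation.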
Given the above, I envision something like this (adapted/cut down from your example):
{
"$schema": "https://json-schema.org/spec-with-descendents",
"type": "object",
"pointerDescendents": {
"/name": {
"type": "string"
},
"/address": {
"type": "object"
},
"/address/street": {
"type": "string"
}
},
"requiredPointerDescendents": [
"/name",
"/address/street"
]
}
It's a major addition, more than just the keywords: existing keywords all apply in-place to the instance itself or to a child of the instance, so introducing ones that apply anywhere below the instance is a whole new domain. For that reason I avoided the keyword descendents without "pointer" - it opens the possibility of application to descendent nodes using, e.g. JSON Path on a jsonPathDescendents keyword, etc.
There is more to work out but I'll stop there because, as I started with, I'm not really of the opinion that this is a good thing to add (even with what I believe to be improvements suggested above). The complexity of applying to any descendent would be a significant challenge to implement in my own tooling, and I don't see enough benefit from being a bit less verbose to introduce such complexity.
from json-schema-spec.
This last point is indicative of what I think is a larger problem (and at the root of the challenge for code generation - and some classes of optimization).
Another quite fundamental but as yet unspoken feature of JSON Schema is its locality. It is not possible to "reach out" into other parts of the document. This might just be "implicit philosophy" rather than deliberate intent, but it makes it a lot simpler to reason about and avoids most of the ordering problems you get with other constraint languages.
As to code gen - I would need to create "pseudo-schema" types at the target locations to inject those properties, and I fear it would quickly become a mess. Especially when dynamic references are in play - I would need to walk the dynamic scope to find out if anyone was injecting properties anywhere.
from json-schema-spec.
it seems simpler to me to initially only deal with single level json-pointers
If it's not targeting arbitrary depth, why use pointers? Just indicating the array index seems much simpler, and would look and function very similar to properties.
{
"type": "array",
"indexItems": {
"0": {
"title": "the first element"
},
"4": {
"title": "the fifth element"
}
}
}
Reading through the considerations people explore above, my opinion that targeting arbitrary-depth descendents with pointers is more problematic than beneficial is stronger than when I first came to this issue.
from json-schema-spec.
This is an interesting idea, and it's not the first time we've seen something like it. However, it breaks a fundamental operating model of JSON Schema: constraints are expressed using keywords.
In your example, /name and /age become separate constraints, but with the current approach, they're grouped under a single constraint, properties, which then has sub-constraints. Not necessarily a deal-breaker, but it's definitely something to consider.
This also breaks (or at least makes more difficult) other use cases for JSON Schema, like form and code generation.
Currently additionalProperties looks at properties to find out which properties it should validate. With an explicit properties keyword, this is simple, as you just take the keys in the keyword. Making the properties pointers would mean that additionalProperties would have to parse each pointer to determine which properties to exclude.
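As a rough illustration of the extra parsing step described above, here is a minimal Python sketch. The function name top_level_names and the sample schema are hypothetical; it assumes pointer-style keys are recognized by a leading "/" and that only the first segment matters for deciding which properties are "additional":

```python
def top_level_names(pointer_keys):
    """Collect the first segment of each pointer key as a declared property name."""
    names = set()
    for pointer in pointer_keys:
        first = pointer.split("/")[1]
        # Unescape per RFC 6901: ~1 first, then ~0.
        names.add(first.replace("~1", "/").replace("~0", "~"))
    return names


# Hypothetical schema using pointer keys directly, as in the proposal.
schema = {
    "type": "object",
    "/name": {"type": "string"},
    "/address/street": {"type": "string"},
}
declared = top_level_names(key for key in schema if key.startswith("/"))

instance = {"name": "Ann", "address": {"street": "Main St"}, "extra": 1}
additional = sorted(set(instance) - declared)
print(additional)
```

With an explicit properties keyword the declared set is just the keyword's keys; here every schema key has to be inspected and parsed first.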
Lastly, keywords can technically be any string. To support this, we'd need to restrict keywords to non-pointers since they'd have special meaning. Again, not a deal-breaker, but worth noting.
from json-schema-spec.
This is an interesting idea, and it's not the first time we've seen something like it. However, it breaks a fundamental operating model of JSON Schema: constraints are expressed using keywords.
I agree with you: constraints are expressed using keywords.
But the properties keyword is one of the "Keywords for Applying Subschemas to Child Instances" and does not itself express a constraint.
In your example, /name and /age become separate constraints, but with the current approach, they're grouped under a single constraint properties which then have sub-constraints. Not necessarily a deal-breaker, but it's definitely something to consider.
The difference is subtle:
- "name": { "type": "string"} is a constraint introduced by the keyword properties followed by the name of the child instance.
- "/name": { "type": "string"} is a constraint introduced by the child instance pointer.
The only difference is that in the first case the "pointer names" are grouped in the properties keyword and in the second they are not grouped (not very difficult in terms of code -> see example to switch a complete schema with and without pointers).
This also breaks (or at least makes more difficult) other use cases for JSON Schema, like form and code generation.
I don't know about this topic (can you explain it?)
Currently additionalProperties looks at properties to find out which properties it should validate. With an explicit properties keyword, this is simple, as you just take the keys in the keyword. Making the properties pointers would mean that additionalProperties would have to parse each pointer to determine which properties to exclude.
Yes, it is the same point as below.
Lastly, keywords can technically be any string. To support this, we'd need to restrict keywords to non-pointers since they'd have special meaning. Again, not a deal-breaker, but worth noting.
Yes, we'd need to restrict keywords so that the first character cannot be '/', but I'm not convinced that having such a keyword will be considered.
To conclude, I understand that the main obstacle is justifying the change.
In this case, it would be necessary to take into account:
- that this proposal does not call into question what already exists (it is an addition),
- that this responds to difficulties:
  - solution to access an element of an array (does not exist today),
  - solution to directly access an instance located deep in a tree (today we are forced to chain properties)
  - gain in the size and readability of schemas
I don't know JSON Schema's strategy regarding pointers but it would also be interesting to position this proposal in relation to this strategy.
from json-schema-spec.
But the properties keyword is a Keywords for Applying Subschemas to Child Instances and not to express a constraint.
properties and most of the other applicators do in fact provide assertions.
Validation succeeds if, for each name that appears in both the instance and as a name within this keyword's value, the child instance for that name successfully validates against the corresponding schema. - Core 10.3.2.1
This also breaks (or at least makes more difficult) other use cases for JSON Schema, like form and code generation.
I don't know about this topic (can you explain it ?)
While the JSON Schema specification is written targeting validation and annotation use cases, people also use it as a sort of data definition (which it isn't, really) in order to generate data entry forms or even generate code (e.g. creating models from schemas found in OpenAPI documents).
Moving to support pointers as keys may break these use cases. I'm sure that many of these kinds of users will adjust and find a way to still support their uses, but in the short term, it will break.
but i'm not convinced that having such a keyword will be considered.
User-built vocabularies can define whatever keywords they want. However, we are looking at reserving $ keywords for the Core vocab.
- solution to access an element of an array (does not exist today)
I'm not sure what this is solving. Can you elaborate?
- solution to directly access an instance located deep in a tree (today we are forced to chain properties)
- gain in the size and readability of schemas
These were the previous arguments used.
Another thing to consider is pointer ambiguity. The JSON Pointer /foo/1/bar could apply to both
{
"foo": [
{},
{
"bar": 42
}
]
}
and
{
"foo": {
"1": {
"bar": 42
}
}
}
This is one place where the form and code generation can break down. There's no information as to whether /foo is supposed to be an array or an object.
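To make the ambiguity concrete, the following is a minimal Python sketch of RFC 6901 resolution. The resolve helper and the two sample documents are illustrative assumptions; the array variant has two elements so that index 1 exists:

```python
def resolve(instance, pointer):
    """Resolve an RFC 6901 JSON Pointer against a decoded JSON value."""
    node = instance
    for raw in pointer.split("/")[1:]:
        token = raw.replace("~1", "/").replace("~0", "~")
        # The same token is an index for arrays and a key for objects.
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


with_array = {"foo": [{}, {"bar": 42}]}
with_object = {"foo": {"1": {"bar": 42}}}
pointer = "/foo/1/bar"

results = [resolve(with_array, pointer), resolve(with_object, pointer)]
container_types = [type(doc["foo"]).__name__ for doc in (with_array, with_object)]
print(results, container_types)
```

The pointer resolves in both documents, so the pointer string alone does not tell a generator whether to emit an array or an object for /foo.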
I'm not shutting this down. I'm stating the difficulties we've had with this approach before.
TBH, I think this could probably be implemented as a new vocab. There's no requirement of vocabs to define discrete keywords, so technically a vocab could define this as a family of keywords.
I'd like to see how this would affect some "in the wild" schemas. For example, how would one of the meta-schemas be changed?
Also, what guidance would you give for when to use pointers (implicit structure) vs properties (explicit structure)?
from json-schema-spec.
While the JSON Schema specification is written targeting validation and annotation use cases, people also use it as a sort of data definition (which it isn't, really) in order to generate data entry forms or even generate code (e.g. creating models from schemas found in OpenAPI documents).
Moving to support pointers as keys may break these use cases. I'm sure that many of these kinds of users will adjust and find a way to still support their uses, but in the short term, it will break.
I think it is not realistic to abandon the properties keyword; this proposal is just an additional tool. Everyone should have the choice of whether or not to use the properties keyword.
- solution to access an element of an array (does not exist today)
I'm not sure what this is solving. Can you elaborate?
For example, if you have a code composed of ten numbers where the last is equal to 999 (e.g. [10, 25, 574, 65, 89, 5, 8, 56, 8, 999]), the schema could be:
{ "type": "array",
"items": {"type": "integer"},
"/9": {"const": 999}}
"/0": { "type": "number" },
This is one place where the form and code generation can break down. There's no information as to whether /foo is supposed to be an array or an object.
I agree about this pointer ambiguity, but I think it is not a problem if the schema specifies the type of the instance:
{ "type": "array",
"/0": {"const": 42}}
{ "type": "object",
"/1": {"const": "bar"}}
Also, what guidance would you give for when to use pointers (implicit structure) vs properties (explicit structure)?
I don't know! I think it is necessary to have feedback from other users to better identify the benefit of this approach.
from json-schema-spec.
Thank you @notEthan for this comment.
Here are my remarks:
- about required, I agree: it's not realistic to change its scope; an additional keyword is a better option,
- about pointerDescendents: having a new keyword is in fact perhaps a better option which does not call into question the current principles, but which still raises the question of cohabitation with the associated keywords (e.g. additionalProperties) -> see comments from @gregsdennis
- about the arbitrary depth: I agree that this is a major addition that requires detailed analysis (which remains to be undertaken)
Maybe we can have a multi-step strategy:
- step 1: solution to associate a subschema with an element of a json-array (there is currently no solution defined to identify an element of an array)
{
"type": "array",
"pointerChild": {
"/5": {
"const": 42
}
}
}
( or with another keyword, or without keyword)
- step 2: extending the solution to json-object (without arbitrary depth)
- step 3: extending to arbitrary depth
from json-schema-spec.
I personally prefer the new keyword option. I'd like to propose locationSchemas.
Considering what this looks like in a subschema, we get
{
"type": "array",
"items": {
"locationSchemas": {
"/foo/bar": { "const": 42 }
}
}
}
This case would need to be implemented like this since JSON Pointer doesn't have a wildcard segment, which you'd need to indicate "all items". So it becomes apparent that having this work in a subschema is a necessity. However, it should also be apparent (although we'll probably have to explicitly state it in the spec) that these pointers are relative to the instance location at this point in the evaluation, even though they're not Relative JSON Pointers.
Further, this should probably work like properties, where the property values are defined but not required. That would mean that the locations in locationSchemas are not required, but if a value exists at that location, its value must match the subschema. So in the above, an item could be a number or array, but if it's an object and the /foo/bar location exists within it, the value at that location must be 42.
To that end, we'd need a new keyword to indicate specific locations are required, requiredLocations.
{
"type": "array",
"items": {
"locationSchemas": {
"/foo/bar": { "const": 42 }
},
"requiredLocations": [
"/foo/bar"
]
}
}
A fallout of requiredLocations is that the type structure must support that location. In this case, it would carry an implied schema declaring types, etc. This would be an equivalent schema.
{
"type": "array",
"items": {
"type": "object",
"properties": {
"foo": {
"type": "object",
"properties": {
"bar": { "const": 42 }
},
"required": [ "bar" ]
}
},
"required": [
"foo"
]
}
}
I don't think the former is any less readable, and I think it should be reasonably easy to implement this.
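The equivalence described above can be sketched in Python as a mechanical expansion. The function name expand_location is hypothetical, and the sketch assumes every pointer segment is an object property name (it does not attempt to decide between arrays and objects):

```python
def expand_location(pointer, subschema):
    """Expand a required location into nested type/properties/required schemas.

    Assumption: every segment of the pointer names an object property.
    """
    tokens = [
        token.replace("~1", "/").replace("~0", "~")
        for token in pointer.split("/")[1:]
    ]
    schema = subschema
    # Wrap from the innermost segment outwards.
    for token in reversed(tokens):
        schema = {
            "type": "object",
            "properties": {token: schema},
            "required": [token],
        }
    return schema


expanded = expand_location("/foo/bar", {"const": 42})
print(expanded)
```

For /foo/bar with { "const": 42 }, the result matches the items subschema in the equivalent schema shown above.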
from json-schema-spec.
How would locationSchemas interact with additionalProperties/unevaluatedProperties?
from json-schema-spec.
If it were implemented as an external vocab, I expect that it wouldn't interact with them.
If we implemented it as part of the Core spec (which currently defines all applicators), we'd have the option to try and figure that out. At best, I expect it could define properties and items for each segment in the pointer.
This is all stream of consciousness...
The part that makes this keyword moot, though, is that you couldn't define additionalProperties/unevaluatedProperties without defining schemas at all of the appropriate levels, which is what locationSchemas is intended to avoid.
Using the previous example, if the /*/foo (to use a wildcard in a pointer...) allowed no additional properties, then you'd have to have an additionalProperties at that level. You might be tempted to do that within the locationSchemas:
{
"type": "array",
"items": {
"locationSchemas": {
"/foo": { "additionalProperties": false },
"/foo/bar": { "const": 42 }
},
"requiredLocations": [
"/foo/bar"
]
}
}
The problem here is that the additionalProperties can't see that bar has been defined anywhere.
So you try it outside of locationSchemas:
{
"type": "array",
"items": {
"locationSchemas": {
"/foo/bar": { "const": 42 }
},
"requiredLocations": [
"/foo/bar"
],
"properties": {
"foo": { "additionalProperties": false }
}
}
}
but you have the same problem. You still need to define bar within the same schema as the additionalProperties:
{
"type": "array",
"items": {
"locationSchemas": {
"/foo/bar": { "const": 42 }
},
"requiredLocations": [
"/foo/bar"
],
"properties": {
"foo": {
"properties": { "bar": true },
"additionalProperties": false
}
}
}
}
which... why are we using locationSchemas at this point?
The other option is that additionalProperties just doesn't interact at all and we define yet another keyword for this kind of functionality, e.g. additionalLocations, that's used alongside the other two.
{
"type": "array",
"items": {
"locationSchemas": {
"/foo": { "additionalProperties": false },
"/foo/bar": { "const": 42 }
},
"requiredLocations": [
"/foo/bar"
],
"additionalLocations": false
}
}
This would disallow (or provide a schema for) any locations not specified by the pointers in an adjacent locationSchemas. Similarly, unevaluatedLocations could provide a schema for any locations not specified by pointers in adjacent or child-of-adjacent locationSchemas (would have to do some pointer math, probably).
These keywords, together, could be implemented in a separate vocab, too.
from json-schema-spec.
We need to consider JSON pointers in two cases:
- single-level: this is how JSON Schema currently works (a schema is a nested set of single-level schemas)
- multi-level: this is a different approach from the single-level approach
Single-level:
- JSON Pointer is a string (e.g. /foo): This case is the same as properties, and additionalProperties/unevaluatedProperties are available.
Three options are possible:
  - option 1 - same keyword: With this option we can use foo or /foo interchangeably
  - option 2 - new keyword: This option is complex because it is difficult to define compatibility with additionalProperties/unevaluatedProperties
  - option 3 - no keyword: With this option, the JSON pointer is not allowed because this use case is identical to the properties use case
- JSON pointer is a number (for example /2): This case is similar to properties/additionalProperties/unevaluatedProperties.
Three options are possible:
  - option 1 - use the keywords properties/additionalProperties/unevaluatedProperties: This option is simple and consists only of an extension of the semantics of 'properties'.
  - option 2 - new keywords (for example elements/additionalElements/unevaluatedElements or locations/additionalLocations/unevaluatedLocations): this option has the advantage of not interfering with the 'properties' keywords
  - option 3 - no keyword: This option should not be considered because with the current keywords, it is not possible to address an element of an array.
Multi-level:
The multi-level approach is more comprehensive if we consider an instance as a tree of JSON pointers:
Example (from “Getting Started”):
{
"order": {
"orderId": "ORD123",
"items": [
{
"name": "Product A",
"price": 50
},
{
"name": "Product B",
"price": 30
}
]
}
}
This example is equivalent to the JSON pointer tree below:
{" ": ["/order"],
"/order": ["/order/orderId", "/order/items"],
"/order/orderId": "ORD123",
"/order/items": ["/order/items/0", "/order/items/1"],
"/order/items/0": ["/order/items/0/name", "/order/items/0/price"],
"/order/items/0/name": "Product A",
"/order/items/0/price": 50,
"/order/items/1": ["/order/items/1/name", "/order/items/1/price"],
"/order/items/1/name": "Product B",
"/order/items/1/price": 30
}
In this representation, an instance is an object where the keys are JSON pointers to subschemas and the values are the contents of the subschemas: either an array of child JSON pointers (nodes) or a string/number/boolean/null (leaves).
The type of subschemas is deduced from the values (array if the child JSON pointers end with a number, object otherwise)
With this dual representation, the use of the JSON pointer is explicit.
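The pointer-tree representation described above can be produced mechanically. The following Python sketch is only an illustration: the function name pointer_tree is hypothetical, and it uses the empty string (the RFC 6901 pointer for the whole document) as the root key:

```python
def pointer_tree(instance, pointer=""):
    """Flatten a decoded JSON value into a map of JSON Pointer -> children or leaf value."""

    def escape(token):
        # RFC 6901 escaping: ~ becomes ~0, then / becomes ~1.
        return token.replace("~", "~0").replace("/", "~1")

    if isinstance(instance, dict):
        children = [(f"{pointer}/{escape(key)}", value) for key, value in instance.items()]
    elif isinstance(instance, list):
        children = [(f"{pointer}/{index}", value) for index, value in enumerate(instance)]
    else:
        return {pointer: instance}  # leaf: string/number/boolean/null

    tree = {pointer: [child for child, _ in children]}
    for child, value in children:
        tree.update(pointer_tree(value, child))
    return tree


order = {
    "order": {
        "orderId": "ORD123",
        "items": [
            {"name": "Product A", "price": 50},
            {"name": "Product B", "price": 30},
        ],
    }
}
tree = pointer_tree(order)
print(tree["/order/items"])
```

Note that in this flattened form an empty array and an empty object both become an empty list of children, so the representation alone cannot distinguish them.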
Returning to @gregsdennis's example, I have a few comments:
- The structure with items/locationSchemas is more complex than a direct JSON pointer because "items" is used to validate all items and not just one. Pointers used with locationSchemas should also be translated to /0/foo/bar, /1/foo/bar and so on.
{
"type": "array",
"items": {
"locationSchemas": {
"/foo/bar": { "const": 42 }
}
}
}
- it seems simpler to me to initially keep the use of locationSchemas with an explicit JSON pointer. Example:
{
"type": "array",
"locationSchemas": {
"/1/foo": { "additionalProperties": false },
"/1/foo/bar": { "const": 42 }
},
"requiredLocations": [ "/1/foo/bar" ]
}
- because multi-level is not the main approach of JSON schema, I think that the “additional” / “unevaluated” question is complex and should perhaps be addressed in a second step.
Summary
To conclude, my opinion is the following:
- keeping option 1 for single-level JSON pointers (extending the properties keyword to JSON pointers and arrays) is the simplest way to introduce JSON pointers, together with adding the new keywords locations (or locationSchemas)/requiredLocations (without additionalLocations/unevaluatedLocations) for multi-level usage.
- in case it is semantically difficult to extend the properties keyword to JSON pointers or arrays, I would propose to use the new keywords locations/requiredLocations/additionalLocations/unevaluatedLocations for arrays and for multi-level JSON pointers (but without additionalLocations/unevaluatedLocations initially). In this case, locations will also be applicable to objects.
from json-schema-spec.
Output is going to be gross.
Going back to the simple example:
{
"$id": "http://example.com/schema",
"type": "array",
"items": {
"locationSchemas": {
"/foo/bar": { "const": 42 }
}
}
}
Output contains a couple of properties which include JSON Pointers indicating properties in the schema: schemaLocation and evaluationPath.
However, now, since some of the segments are themselves JSON Pointers, those pointers need to be encoded before appending to the evaluation path.
So the output unit resulting from evaluating /foo/bar in the instance will be:
{
"valid": true,
"schemaLocation": "http://example.com/schema#/items/locationSchemas/~1foo~1bar",
"evaluationPath": "/items/locationSchemas/~1foo~1bar",
"instanceLocation": "/1/foo/bar"
}
The ~1foo~1bar as a pointer segment isn't great, but it's what you have to do.
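The encoding step that produces ~1foo~1bar follows the RFC 6901 escaping rules and can be sketched in Python as below. The helper names escape_segment and append_segment are hypothetical:

```python
def escape_segment(segment):
    """Escape one reference token per RFC 6901: ~ becomes ~0, then / becomes ~1."""
    return segment.replace("~", "~0").replace("/", "~1")


def append_segment(pointer, segment):
    """Append a raw (unescaped) segment to an existing JSON Pointer."""
    return f"{pointer}/{escape_segment(segment)}"


evaluation_path = ""
# The last segment is itself a pointer-shaped schema key, so its slashes get escaped.
for keyword in ("items", "locationSchemas", "/foo/bar"):
    evaluation_path = append_segment(evaluation_path, keyword)
print(evaluation_path)
```

The order of the two replacements matters: escaping / before ~ would turn the ~ introduced by ~1 into ~0 and corrupt the segment.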
from json-schema-spec.
I agree that a multi-level approach is not a continuation of the current single-level approach and that a more global reflection is necessary if we want to generalize json-pointers.
This is why it seems simpler to me to initially only deal with single level json-pointers (meets the need for access to an element of an array).
from json-schema-spec.
This is why it seems simpler to me to initially only deal with single level json-pointers
However, single-level pointers don't really get you much. You have a bunch of /foo keywords instead of a single properties: { 'foo': {}, ... } keyword. I'm not sure that small of a change is worth the work that implementations would have to put into it.
It also still doesn't solve the output problem of having to include pointers inside of another pointer.
from json-schema-spec.
To @notEthan's point, there's also #1323 by @awwright which could also solve the problem of index-specific constraints.
from json-schema-spec.