Use JSON Pointers instead of 'properties' or 'prefixItems' keywords (json-schema-org/json-schema-spec)

Comments (16)

notEthan commented on September 23, 2024

I'm not in favor of this, really, it seems to solve a problem that is adequately handled by existing keywords. However, I am not deeply attached to that position, and maybe I would come to love the convenience of this approach (I've considered implementing similar for my own purposes before). My thoughts/suggestions, supposing something like this were to go in:

Given this is a pointer, and per "This principle can also be extended to pointers of rank greater than 1":

This is pretty fundamentally different from properties / prefixItems in applying at an arbitrary depth. I would consider it in a different category. That's fine/good as changing properties is quite unlikely to happen.
Changing required is (just my opinion) definitely a nonstarter, it is incompatible with countless schemas.

The use of JSON pointer and the keyword 'properties' are compatible if we accept for the keyword 'required' indifferently name or JSON pointer.

This is incorrect; a property name may be any string including any valid JSON pointer, so treating required entries or properties names as either property names or pointers is unresolvably ambiguous. But a new keyword for this is possible.
Terminology: "child" should refer to the immediate child of a node, not a node at any depth below. It doesn't appear in the spec currently but the term "descendent" is what I use in my tooling to refer to any node below another node (or the node itself, i.e. a node is a descendent of itself at pointer ""). I'll use that term below.
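
To make the ambiguity around required entries concrete, here is a minimal sketch. The instance and the variable names are made up for illustration; the only assumption is that an entry such as "/name" could be read either as a property name or as a single-segment JSON Pointer.

```python
# Hypothetical instance that has both a "name" property and a property
# literally called "/name" (any string is a legal property name).
instance = {"name": "Ada", "/name": "literal key"}

entry = "/name"  # an entry of "required", under the mixed interpretation

# Reading 1: the entry is a plain property name.
as_property_name = instance[entry]

# Reading 2: the entry is a single-segment JSON Pointer, i.e. property "name".
as_pointer = instance[entry[1:]]

# Both readings resolve, but to different members, so a validator cannot
# tell which one the schema author meant.
assert as_property_name == "literal key"
assert as_pointer == "Ada"
```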

And, I know this is half the point of your suggestion, but I think the idea of having pointers as properties of JSON Schemas is unlikely to actually get merged (again, just my opinion, which has no bearing on any actual outcome). It's possible, since the spec doesn't define any existing keywords that start with /. I don't know if any other existing vocabularies do (I doubt it, but it is quite possible), but this would restrict that possibility. Adding a new keyword is less unlikely.

Given the above, I envision something like this (adapted/cut down from your example):

{
  "$schema": "https://json-schema.org/spec-with-descendents",
  "type": "object",
  "pointerDescendents": {
    "/name": {
      "type": "string"
    },
    "/address": {
      "type": "object"
    },
    "/address/street": {
      "type": "string"
    }
  },
  "requiredPointerDescendents": [
    "/name",
    "/address/street"
  ]
}

It's a major addition, more than just the keywords: existing keywords all apply in-place to the instance itself or to a child of the instance, so introducing ones that apply anywhere below the instance is a whole new domain. For that reason I avoided the keyword descendents without "pointer" - it opens the possibility of application to descendent nodes using, e.g. JSON Path on a jsonPathDescendents keyword, etc.

There is more to work out but I'll stop there because, as I started with, I'm not really of the opinion that this is a good thing to add (even with what I believe to be improvements suggested above). The complexity of applying to any descendent would be a significant challenge to implement in my own tooling, and I don't see enough benefit from being a bit less verbose to introduce such complexity.

from json-schema-spec.

mwadams commented on September 23, 2024

This last point is indicative of what I think is a larger problem (and at the root of the challenge for code generation - and some classes of optimization).

Another quite fundamental but as yet unspoken feature of JSON Schema is its locality. It is not possible to "reach out" into other parts of the document. This might just be "implicit philosophy" rather than deliberate intent but it makes it a lot simpler to reason about and avoids most of the ordering problems you get with other constraints languages.

As to code gen - I would need to create "pseudo-schema" types at the target locations to inject those properties, and I fear it would quickly become a mess. Especially when dynamic references are in play - I would need to walk the dynamic scope to find out if anyone was injecting properties anywhere.

notEthan commented on September 23, 2024

it seems simpler to me to initially only deal with single level json-pointers

If it's not targeting arbitrary depth, why use pointers? Just indicating the array index seems much simpler, and would look and function very similar to properties.

{
  "type": "array",
  "indexItems": {
    "0": {
      "title": "the first element"
    },
    "4": {
      "title": "the fifth element"
    }
  }
}
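
A minimal sketch of how such an index-keyed keyword could behave, assuming it mirrors properties (an index that is not present in the array is simply skipped). `indexItems` is only the name floated above, and the helper checks nothing but `const`, which is an assumption made to keep the example short.

```python
def failing_indexes(index_items, instance):
    """Return the indexes whose element violates its 'const' subschema."""
    failures = []
    for key, sub in index_items.items():
        index = int(key)  # keys are decimal strings such as "0" or "4"
        # Like 'properties', an index that is absent from the array is ignored.
        if index < len(instance) and "const" in sub and instance[index] != sub["const"]:
            failures.append(index)
    return failures


index_items = {"9": {"const": 999}}
ok = failing_indexes(index_items, [10, 25, 574, 65, 89, 5, 8, 56, 8, 999])
bad = failing_indexes(index_items, [10, 25, 574, 65, 89, 5, 8, 56, 8, 0])
short = failing_indexes(index_items, [1, 2])
```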

Reading through the considerations people explore above, my opinion that targeting arbitrary-depth descendents with pointers is more problematic than beneficial is stronger than when I first came to this issue.

gregsdennis commented on September 23, 2024

This is an interesting idea, and it's not the first time we've seen something like it. However, it breaks a fundamental operating model of JSON Schema: constraints are expressed using keywords.

In your example, /name and /age become separate constraints, but with the current approach, they're grouped under a single constraint properties which then have sub-constraints. Not necessarily a deal-breaker, but it's definitely something to consider.

This also breaks (or at least makes more difficult) other use cases for JSON Schema, like form and code generation.

Currently additionalProperties looks at properties to find out which properties it should validate. With an explicit properties keyword, this is simple, as you just take the keys in the keyword. Making the properties pointers would mean that additionalProperties would have to parse each pointer to determine which properties to exclude.
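
To illustrate the extra parsing step, here is a hedged sketch. It assumes the proposal's style of pointer-shaped keys placed directly in the schema object; the function names are hypothetical and do not come from any implementation.

```python
def first_segment(pointer):
    """Return the unescaped first reference token of an RFC 6901 pointer."""
    if not pointer.startswith("/"):
        raise ValueError(f"not a non-empty JSON Pointer: {pointer!r}")
    raw = pointer[1:].split("/", 1)[0]
    return raw.replace("~1", "/").replace("~0", "~")


def declared_properties(schema):
    """Property names declared through pointer-shaped keys (keys starting with '/')."""
    return {first_segment(key) for key in schema if key.startswith("/")}


def additional_property_names(schema, instance):
    """Names that additionalProperties would have to look at."""
    declared = declared_properties(schema)
    return sorted(name for name in instance if name not in declared)


pointer_schema = {
    "type": "object",
    "/name": {"type": "string"},
    "/address/street": {"type": "string"},
    "additionalProperties": False,
}
extra = additional_property_names(
    pointer_schema, {"name": "A", "address": {}, "age": 3}
)
```

With an explicit properties keyword the declared set is just the keys of that object; here every key has to be inspected, unescaped, and truncated to its first segment first.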

Lastly, keywords can technically be any string. To support this, we'd need to restrict keywords to non-pointers since they'd have special meaning. Again, not a deal-breaker, but worth noting.

loco-philippe commented on September 23, 2024

This is an interesting idea, and it's not the first time we've seen something like it. However, it breaks a fundamental operating model of JSON Schema: constraints are expressed using keywords.

I agree with you : constraints are expressed using keywords.
But the properties keyword is a Keywords for Applying Subschemas to Child Instances and not to express a constraint.

In your example, /name and /age become separate constraints, but with the current approach, they're grouped under a single constraint properties which then have sub-constraints. Not necessarily a deal-breaker, but it's definitely something to consider.

The difference is subtle:

"name": { "type": "string"} is a constraint introduced by the keyword properties followed by the name of child instance.
"/name": { "type": "string"} is a constraint introduced by the child instance pointer.
The only difference is that in the first case the "pointer names" are grouped in the properties keyword and in the second they are not grouped (not very difficult in terms of code -> see example to switch a complete schema with and without pointers).

This also breaks (or at least makes more difficult) other use cases for JSON Schema, like form and code generation.

I don't know about this topic (can you explain it ?)

Currently additionalProperties looks at properties to find out which properties it should validate. With an explicit properties keyword, this is simple, as you just take the keys in the keyword. Making the properties pointers would mean that additionalProperties would have to parse each pointer to determine which properties to exclude.

Yes, it is the same point as below.

Lastly, keywords can technically be any string. To support this, we'd need to restrict keywords to non-pointers since they'd have special meaning. Again, not a deal-breaker, but worth noting.

Yes, we'd need to restrict keywords so that the first character cannot be '/', but i'm not convinced that having such a keyword will be considered.

To conclude, I understand that the main obstacle is justifying the change.
In this case, it would be necessary to take into account:

that this proposal does not call into question what already exists (it is an addition),
that this responds to difficulties:
- solution to access an element of an array (does not exist today),
- solution to directly access an instance located deep in a tree (today we are forced to chain properties)
- gain in the size and readability of schemas

I don't know JSON Schema's strategy regarding pointers but it would also be interesting to position this proposal in relation to this strategy.

gregsdennis commented on September 23, 2024

But the properties keyword is a Keywords for Applying Subschemas to Child Instances and not to express a constraint.

properties and most of the other applicators do in fact provide assertions.

Validation succeeds if, for each name that appears in both the instance and as a name within this keyword's value, the child instance for that name successfully validates against the corresponding schema. - Core 10.3.2.1

This also breaks (or at least makes more difficult) other use cases for JSON Schema, like form and code generation.

I don't know about this topic (can you explain it ?)

While the JSON Schema specification is written targeting validation and annotation use cases, people also use it as a sort of data definition (which it isn't, really) in order to generate data entry forms or even generate code (e.g. creating models from schemas found in OpenAPI documents).

Moving to support pointers as keys may break these use cases. I'm sure that many of these kinds of users will adjust and find a way to still support their uses, but in the short term, it will break.

but i'm not convinced that having such a keyword will be considered.

User-built vocabularies can define whatever keywords they want. However, we are looking at reserving $ keywords for the Core vocab.

solution to access an element of an array (does not exist today)

I'm not sure what this is solving. Can you elaborate?

solution to directly access an instance located deep in a tree (today we are forced to chain properties)

gain in the size and readability of schemas

These were the previous arguments used.

Another thing to consider is pointer ambiguity. The JSON Pointer /foo/1/bar could apply to both

{
  "foo": [
    {
      "bar": 42
    }
  ]
}

and

{
  "foo": {
    "1": {
      "bar": 42
    }
  }
}

This is one place where the form and code generation can break down. There's no information as to whether /foo is supposed to be an array or an object.
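
A small sketch of that ambiguity, using a deliberately naive RFC 6901 resolver. The two documents are illustrative (the array variant has two elements so that index 1 exists); the same pointer resolves in both shapes.

```python
def resolve(doc, pointer):
    """Naive RFC 6901 resolution; raises KeyError/IndexError if the path is absent."""
    for raw in pointer.split("/")[1:]:
        token = raw.replace("~1", "/").replace("~0", "~")
        # The token "1" is an index for arrays but a member name for objects.
        doc = doc[int(token)] if isinstance(doc, list) else doc[token]
    return doc


as_array = {"foo": [{"bar": 0}, {"bar": 42}]}
as_object = {"foo": {"1": {"bar": 42}}}

from_array = resolve(as_array, "/foo/1/bar")
from_object = resolve(as_object, "/foo/1/bar")
```

Nothing in the pointer itself says which shape was intended, which is the information a form or code generator would be missing.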

I'm not shutting this down. I'm stating the difficulties we've had with this approach before.

TBH, I think this could probably be implemented as a new vocab. There's no requirement of vocabs to define discrete keywords, so technically a vocab could define this as a family of keywords.

I'd like to see how this would affect some "in the wild" schemas. For example, how would one of the meta-schemas be changed?

Also, what guidance would you give for when to use pointers (implicit structure) vs properties (explicit structure)?

loco-philippe commented on September 23, 2024

While the JSON Schema specification is written targeting validation and annotation use cases, people also use it as a sort of data definition (which it isn't, really) in order to generate data entry forms or even generate code (e.g. creating models from schemas found in OpenAPI documents).

Moving to support pointers as keys may break these use cases. I'm sure that many of these kinds of users will adjust and find a way to still support their uses, but in the short term, it will break.

I think it is not realistic to abandon the properties keyword, this proposal is just an additional tool. Everyone should have the choice of whether or not to use the properties keyword.

solution to access an element of an array (does not exist today)

I'm not sure what this is solving. Can you elaborate?

For example, if you have a code composed of ten numbers where the last is equal to 999 (e.g. [10, 25, 574, 65, 89, 5, 8, 56, 8, 999]), the schema could be:

{ "type": "array",
  "items": {"type": "integer"},
  "/9": {"const": 999}}

"/0": { "type": "number" },

This is one place where the form and code generation can break down. There's no information as to whether /foo is supposed to be an array or an object.

I agree with this pointer ambiguity but i think it is not a problem if in the schema you specify the type of instance:

{ "type": "array",
  "/0": {"const": 42}}

{ "type": "object",
  "/1": {"const": "bar"}}

Also, what guidance would you give for when to use pointers (implicit structure) vs properties (explicit structure)?

I don't know ! I think it is necessary to have feedback from other users to better identify the benefit of this approach.

loco-philippe commented on September 23, 2024

Thank you @notEthan for this comment.

Here are my remarks:

about required, I agree : It's not realistic to change its scope, an additional keyword is a better option,
about pointerDescendents : Having a new keyword is in fact perhaps a better option which does not call into question the current principles but which still raises the question of cohabitation with the associated keywords (e.g. additionalProperties) -> See comments from @gregsdennis
about the arbitrary depth : I agree that this is a major addition that requires detailed analysis (which remains to be undertaken)

Maybe we can have a multi-step strategy :

step 1: solution to associate a subschema with an element of a json-array (there is currently no solution defined to identify an element of an array)

{
  "type": "array",
  "pointerChild": {
    "/5": {
      "const": 42
    }
  }
}

( or with another keyword, or without keyword)

step 2: extending the solution to json-object (without arbitrary depth)
step 3: extending to arbitrary depth

gregsdennis commented on September 23, 2024

I personally prefer the new keyword option. I'd like to propose locationSchemas.

Considering what this looks like in a subschema, we get

{
  "type": "array",
  "items": {
    "locationSchemas": { 
      "/foo/bar": { "const": 42 }
    }
  }
}

This case would need to be implemented like this since JSON Pointer doesn't have a wildcard segment, which you'd need to indicate "all items". So it becomes apparent that having this work in a subschema is a necessity. However, it should also be apparent (although we'll probably have to explicitly state it in the spec) that these pointers are relative to the instance location at this point in the evaluation, even though they're not Relative JSON Pointers.

Further, this should probably work like properties where the property values are defined but not required. That would mean that the locations in locationSchemas are not required, but if a value exists at that location, its value must match the subschema. So in the above, an item could be a number or array, but if it's an object and the /foo/bar location exists within it, the value at that location must be 42.

To that end, we'd need a new keyword to indicate specific locations are required, requiredLocations.

{
  "type": "array",
  "items": {
    "locationSchemas": { 
      "/foo/bar": { "const": 42 }
    },
    "requiredLocations": [
      "/foo/bar"
    ]
  }
}
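
A hedged sketch of the semantics described so far: pointers are resolved relative to each item, locationSchemas only applies where the location exists, and requiredLocations demands that it exists. The keyword names are the ones proposed above; the evaluator is hypothetical and only checks `const`.

```python
_MISSING = object()  # sentinel: the pointer does not resolve in this item


def _is_index(token):
    # RFC 6901 array indexes: "0", or digits without a leading zero
    return token.isascii() and token.isdigit() and (token == "0" or token[0] != "0")


def resolve(doc, pointer):
    """Resolve an RFC 6901 pointer, returning _MISSING when the location is absent."""
    for raw in pointer.split("/")[1:]:
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(doc, dict) and token in doc:
            doc = doc[token]
        elif isinstance(doc, list) and _is_index(token) and int(token) < len(doc):
            doc = doc[int(token)]
        else:
            return _MISSING
    return doc


def check_items(items_schema, array):
    """Apply both keywords to every item; return failing instance locations."""
    failures = []
    for i, item in enumerate(array):
        for pointer, sub in items_schema.get("locationSchemas", {}).items():
            value = resolve(item, pointer)  # relative to the item, not the root
            if value is not _MISSING and "const" in sub and value != sub["const"]:
                failures.append(f"/{i}{pointer}")
        for pointer in items_schema.get("requiredLocations", []):
            if resolve(item, pointer) is _MISSING:
                failures.append(f"/{i}{pointer} (required)")
    return failures


optional_only = {"locationSchemas": {"/foo/bar": {"const": 42}}}
with_required = {**optional_only, "requiredLocations": ["/foo/bar"]}
```

With `optional_only`, an array such as `[5, [1], {"foo": {"bar": 42}}, {"foo": {}}]` produces no failures; with `with_required`, every item lacking /foo/bar is reported.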

A fallout of requiredLocations is that the type structure must support that location. In this case, it would carry an implied schema declaring types, etc. This would be an equivalent schema.

{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "foo": {
        "type": "object",
        "properties": {
          "bar": { "const": 42 }
        },
        "required": [ "bar" ]
      }
    },
    "required": [
      "foo"
    ]
  }
}

I don't think the former is any less readable, and I think it should be reasonably easy to implement this.
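
The rewrite into the equivalent nested schema can be sketched mechanically. This is an illustration only: `expand_locations` is a hypothetical helper, it assumes every pointer segment names an object property (array indexes are not handled), and it assumes subschemas are objects rather than booleans.

```python
def _unescape(token):
    return token.replace("~1", "/").replace("~0", "~")


def expand_locations(location_schemas, required_locations=()):
    """Rewrite pointer-keyed subschemas as nested 'properties' / 'required'."""
    root = {}

    def descend(pointer, make_required):
        node = root
        for token in (_unescape(t) for t in pointer.split("/")[1:]):
            if make_required:
                # A required location implies an object at every level above it.
                node.setdefault("type", "object")
                required = node.setdefault("required", [])
                if token not in required:
                    required.append(token)
            node = node.setdefault("properties", {}).setdefault(token, {})
        return node

    for pointer, sub in location_schemas.items():
        descend(pointer, False).update(sub)
    for pointer in required_locations:
        descend(pointer, True)
    return root


expanded = expand_locations({"/foo/bar": {"const": 42}}, ["/foo/bar"])
```

For the example above, `expanded` has the same content as the value of items in the equivalent schema; without the required list, only the nested properties are produced.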

jdesrosiers commented on September 23, 2024

How would locationSchemas interact with additionalProperties/unevaluatedProperties?

gregsdennis commented on September 23, 2024

If it were implemented as an external vocab, I expect that it wouldn't interact with them.

If we implemented it as part of the Core spec (which currently defines all applicators), we'd have the option to try and figure that out. At best, I expect it could define properties and items for each segment in the pointer.

This is all stream of consciousness...

The part that makes this keyword moot, though, is that you couldn't define additionalProperties/unevaluatedProperties without defining schemas at all of the appropriate levels, which is what locationSchemas is intended to avoid.

Using the previous example, if the /*/foo (to use a wildcard in a pointer...) allowed no additional properties, then you'd have to have an additionalProperties at that level. You might be tempted to do that within the locationSchemas:

{
  "type": "array",
  "items": {
    "locationSchemas": {
      "/foo": { "additionalProperties": false },
      "/foo/bar": { "const": 42 }
    },
    "requiredLocations": [
      "/foo/bar"
    ]
  }
}

The problem here is that the additionalProperties can't see that bar has been defined anywhere.

So you try it outside of locationSchemas:

{
  "type": "array",
  "items": {
    "locationSchemas": {
      "/foo/bar": { "const": 42 }
    },
    "requiredLocations": [
      "/foo/bar"
    ],
    "properties": {
      "foo": { "additionalProperties": false }
    }
  }
}

but you have the same problem. You still need to define bar within the same schema as the additionalProperties:

{
  "type": "array",
  "items": {
    "locationSchemas": {
      "/foo/bar": { "const": 42 }
    },
    "requiredLocations": [
      "/foo/bar"
    ],
    "properties": {
      "foo": {
        "properties": { "bar": true },
        "additionalProperties": false
      }
    }
  }
}

which... why are we using locationSchemas at this point?

The other option is that additionalProperties just doesn't interact at all and we define yet another keyword for this kind of functionality, e.g. additionalLocations, that's used alongside the other two.

{
  "type": "array",
  "items": {
    "locationSchemas": {
      "/foo": { "additionalProperties": false },
      "/foo/bar": { "const": 42 }
    },
    "requiredLocations": [
      "/foo/bar"
    ],
    "additionalLocations": false
  }
}

This would disallow (or provide a schema for) any locations not specified by the pointers in an adjacent locationSchemas. Similarly, unevaluatedLocations could provide a schema for any locations not specified by pointers in adjacent or child-of-adjacent locationSchemas (would have to do some pointer math, probably).

These keywords, together, could be implemented in a separate vocab, too.

loco-philippe commented on September 23, 2024

We need to consider JSON pointers in two cases:

single-level: this is how Json schema currently works (a schema is a nested set of single-level schemas)
multi-level: This is a different approach from the single-level approach

single-level:

JSON Pointer is a string (e.g. /foo): This case is the same as properties and additionalProperties / unevaluatedProperties are available.
Three options are possible:
- option 1 - same keyword: With this option we can use foo or /foo interchangeably
- option 2 - new keyword: This option is complex because it is difficult to define compatibility with additionalProperties / unevaluatedProperties
- option 3 - no keyword: With this option, the JSON pointer is not allowed because this use case is identical to the property use case
JSON pointer is a number (for example /2): This case is similar to properties / additionalProperties / unevaluatedProperties.
Three options are possible:
- option 1 - use the keyword properties / additionalProperties / unevaluatedProperties: This option is simple and consists only of an extension of the semantics of 'properties'.
- option 2 - new keywords (for example elements / additionalElements / unevaluatedElements or locations / additionalLocations / unevaluatedLocations): this option has the advantage of not interfering with the 'properties' keywords
- option 3 - no keyword: This option should not be considered because with the current keywords, it is not possible to address an element of an array.

Multi-level:

The multi-level approach is more comprehensive if we consider an instance as a tree of JSON pointers:

Example (from “Getting Started”):

{
  "order": {
    "orderId": "ORD123",
    "items": [
      {
        "name": "Product A",
        "price": 50
      },
      {
        "name": "Product B",
        "price": 30
      }
    ]
  }
}

This example is equivalent to the JSON pointer tree below:

{"":                     ["/order"],
 "/order":               ["/order/orderId", "/order/items"],
 "/order/orderId":       "ORD123",
 "/order/items":         ["/order/items/0", "/order/items/1"],
 "/order/items/0":       ["/order/items/0/name", "/order/items/0/price"],
 "/order/items/0/name":  "Product A",
 "/order/items/0/price": 50,
 "/order/items/1":       ["/order/items/1/name", "/order/items/1/price"],
 "/order/items/1/name":  "Product B",
 "/order/items/1/price": 30
}

In this representation, an instance is an object where the keys are JSON pointers to subschemas and the values are the contents of the subschemas: an array of child JSON pointers (for nodes) or a string/number/boolean/null (for leaves).

The type of subschemas is deduced from the values (array if the child JSON pointers end with a number, object otherwise)

With this dual representation, the use of the JSON pointer is explicit.
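
The flattening described above can be sketched in a few lines. The function name is hypothetical, and the root is keyed by the empty string, which is the RFC 6901 pointer for the whole document.

```python
def _escape(token):
    # RFC 6901 escaping: "~" first, then "/"
    return token.replace("~", "~0").replace("/", "~1")


def pointer_tree(instance):
    """Flatten a JSON instance into {pointer: child pointers or leaf value}."""
    tree = {}

    def walk(value, pointer):
        if isinstance(value, dict):
            children = [f"{pointer}/{_escape(key)}" for key in value]
            child_values = list(value.values())
        elif isinstance(value, list):
            children = [f"{pointer}/{i}" for i in range(len(value))]
            child_values = value
        else:
            tree[pointer] = value  # leaf: string, number, boolean or null
            return
        tree[pointer] = children  # node: list of child pointers
        for child, child_value in zip(children, child_values):
            walk(child_value, child)

    walk(instance, "")
    return tree


order = {
    "order": {
        "orderId": "ORD123",
        "items": [
            {"name": "Product A", "price": 50},
            {"name": "Product B", "price": 30},
        ],
    }
}
tree = pointer_tree(order)
```

For the order example this yields ten entries, one per pointer in the tree shown above.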

Returning to @gregsdennis example, I have a few comments:

The structure with items / locationSchemas is more complex than a direct JSON pointer because "items" is used to validate all items and not just one. Pointers used with locationSchemas should also be translated to /0/foo/bar, /1/foo/bar and so on.

{
  "type": "array",
  "items": {
    "locationSchemas": { 
      "/foo/bar": { "const": 42 }
    }
  }
}

It seems simpler to me to initially keep the use of locationSchemas with an explicit JSON pointer.

Example:

{
  "type": "array",
  "locationSchemas": {
    "/1/foo": { "additionalProperties": false },
    "/1/foo/bar": { "const": 42 }
  },
  "requiredLocations": [ "/1/foo/bar" ]
}

Because multi-level is not the main approach of JSON Schema, I think that the “additional” / “unevaluated” question is complex and should perhaps be addressed in a second step.

Summary

To conclude, my opinion is the following:

Keep option 1 for single-level JSON pointers (extending the properties keyword to JSON pointers and arrays), which is the simplest way to introduce JSON pointers, and add a new keyword locations (or locationSchemas) / requiredLocations (without additionalLocations / unevaluatedLocations) for multi-level usage.
If it proves semantically difficult to extend the properties keyword to JSON pointers or arrays, I would propose using the new keywords locations / requiredLocations / additionalLocations / unevaluatedLocations for arrays and for multi-level JSON pointers (but without additionalLocations / unevaluatedLocations initially). In this case, locations would also be applicable to objects.

gregsdennis commented on September 23, 2024

Output is going to be gross.

Going back to the simple example:

{
  "$id": "http://example.com/schema",
  "type": "array",
  "items": {
    "locationSchemas": { 
      "/foo/bar": { "const": 42 }
    }
  }
}

Output contains a couple properties which include JSON Pointers indicating properties in the schema: schemaLocation and evaluationPath.

However, now, since some of the segments are themselves JSON Pointers, those pointers need to be encoded before appending to the evaluation path.

So the output unit resulting from evaluating /foo/bar in the instance will be:

{
  "valid": true,
  "schemaLocation": "http://example.com/schema#/items/locationSchemas/~1foo~1bar",
  "evaluationPath": "/items/locationSchemas/~1foo~1bar",
  "instanceLocation": "/1/foo/bar"
}

The ~1foo~1bar as a pointer segment isn't great, but it's what you have to do.
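
The encoding step is the standard RFC 6901 escaping of a reference token ("~" becomes "~0", "/" becomes "~1"). A minimal sketch, with hypothetical helper names, of building the evaluation path above:

```python
def escape_segment(segment):
    """Escape one reference token per RFC 6901: '~' first, then '/'."""
    return segment.replace("~", "~0").replace("/", "~1")


def append_segment(pointer, segment):
    """Append a single (possibly pointer-shaped) key to an existing pointer."""
    return f"{pointer}/{escape_segment(segment)}"


evaluation_path = ""
for keyword in ["items", "locationSchemas", "/foo/bar"]:
    evaluation_path = append_segment(evaluation_path, keyword)

assert evaluation_path == "/items/locationSchemas/~1foo~1bar"
```

The order of the two replacements matters: escaping "/" first would turn it into "~1", and the following "~" pass would then corrupt it into "~01".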

loco-philippe commented on September 23, 2024

I agree that a multi-level approach is not a continuation of the current single-level approach and that a more global reflection is necessary if we want to generalize json-pointers.

This is why it seems simpler to me to initially only deal with single level json-pointers (meets the need for access to an element of an array).

gregsdennis commented on September 23, 2024

This is why it seems simpler to me to initially only deal with single level json-pointers

However, single-level pointers don't really get you much. You have a bunch of /foo keywords instead of a single properties: { 'foo': {}, ... } keyword. I'm not sure that small of a change is worth the work that implementations would have to put into it.

It also still doesn't solve the output problem of having to include pointers inside of another pointer.

gregsdennis commented on September 23, 2024

To @notEthan's point, there's also #1323 by @awwright which could also solve the problem of index-specific constraints.
