Comments (5)
And in that case you want to essentially defer checking that the schema is valid until the first time it's looked up, right? If so, I think that's definitely doable: deferring validation until "crawl-time", without forcing you to crawl all the schemas, sounds like the "ideal" behavior.
from referencing.
A further consideration, I suppose, is that whilst we don't yet have any OpenAPI / AsyncAPI-specific support, when we do, that support may also benefit from such an object. That is possibly a further argument to not couple at all to `jsonschema`, and to think about how to do this quite generically while still making it easy for users to use.
Regarding how this relates back to #119 (I think it does relate; I think it's a good fit!), I don't think it's necessary or desirable that all of the validation happens when a resource is added to the registry. You probably already have this idea in mind, but I'd allow for
invalid_schema = ...
registry = invalid_schema @ registry # fine
registry.lookup(...) # big boom!
as valid/correct behavior.
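To make the behavior above concrete, here is a minimal sketch of a registry that accepts anything when a schema is added and only validates on first lookup. `LazyRegistry`, `InvalidSchema`, `with_schema` and `is_bool_or_object` are hypothetical names made up for illustration; this is not the `referencing` API, and the check is a stand-in for real schema validation.

```python
class InvalidSchema(Exception):
    """Raised when a stored schema turns out to be invalid on lookup."""


def is_bool_or_object(schema):
    # Stand-in for real validation: a schema must be a boolean or an object.
    return isinstance(schema, (bool, dict))


class LazyRegistry:
    """Hypothetical registry: adding never validates, looking up does."""

    def __init__(self, schemas=None, check=is_bool_or_object):
        self._schemas = dict(schemas or {})
        self._check = check
        self._validated = set()  # URIs already checked by this registry

    def with_schema(self, uri, schema):
        # Returns a new registry and performs no validation.
        return LazyRegistry({**self._schemas, uri: schema}, self._check)

    def lookup(self, uri):
        schema = self._schemas[uri]
        if uri not in self._validated:
            if not self._check(schema):
                raise InvalidSchema(uri)
            self._validated.add(uri)
        return schema


registry = LazyRegistry().with_schema("urn:example:bad", 12)  # fine
registry = registry.with_schema("urn:example:good", {"type": "string"})
good = registry.lookup("urn:example:good")  # validated here, on first use
```

Looking up `urn:example:bad` on this registry raises `InvalidSchema`, while a caller who never touches it never pays for (or trips over) its validation.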
For `jsonschema`, specifically, my worry would be that calling `check_schema` on every subschema could be very slow, and would effectively re-check many parts of a document many times. I think that might only be fully needed at the top-level. But not checking every subschema seems like it's a less clean design. Perhaps this is solvable with a "simple" caching version of `check_schema`?
(I have not fully specced this out so definitely definitely your thoughts are even more welcome than they always are, and also as usual I'll respond just with first thoughts so it's possible I'll change my mind quite easily :)
> For `jsonschema`, specifically, my worry would be that calling `check_schema` on every subschema could be very slow, and would effectively re-check many parts of a document many times.
The idea I think was to tie this to registry `.crawl`ing -- i.e. that a crawled `SchemaValidatingRegistry` will have the invariant (that all resources are valid) guaranteed, and an uncrawled one not. That also I think solves the "slow" case, as that should™ guarantee we only check every (sub)schema once (and thereby not need any caching -- I'm loath to even think about caching in `jsonschema` -- if/when we need that, maybe as part of Dialect v2 work, we may as well then require schemas to be immutable dicts... But that too seems down the line.).
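As a rough illustration of why tying validation to crawling avoids repeated work, the sketch below walks a document and applies a shallow check to each (sub)schema exactly once. It is not the `referencing` or `jsonschema` implementation: `crawl`, `subschemas`, `shallow_check` and the keyword tuples are assumptions chosen for the example, and the keyword list is deliberately incomplete.

```python
# Simplified, assumed set of keywords whose values contain subschemas.
MAPPING_KEYWORDS = ("properties", "$defs")
LIST_KEYWORDS = ("anyOf", "allOf", "oneOf")
SINGLE_KEYWORDS = ("items", "not")


def subschemas(schema):
    """Yield the direct subschemas of `schema` (toy keyword coverage)."""
    if not isinstance(schema, dict):
        return
    for keyword in MAPPING_KEYWORDS:
        value = schema.get(keyword)
        if isinstance(value, dict):
            yield from value.values()
    for keyword in LIST_KEYWORDS:
        value = schema.get(keyword)
        if isinstance(value, list):
            yield from value
    for keyword in SINGLE_KEYWORDS:
        if keyword in schema:
            yield schema[keyword]


def shallow_check(schema):
    # Only checks this level; children are checked when the crawl reaches them.
    if not isinstance(schema, (bool, dict)):
        raise ValueError(f"not a schema: {schema!r}")


def crawl(root, check=shallow_check):
    """Check every (sub)schema once and return how many were checked."""
    checked = 0
    stack = [root]
    while stack:
        schema = stack.pop()
        check(schema)
        checked += 1
        stack.extend(subschemas(schema))
    return checked


document = {
    "properties": {
        "a": {"type": "string"},
        "b": {"items": {"anyOf": [True, {"type": "integer"}]}},
    },
}
count = crawl(document)
```

Because each node is visited once and the check does not recurse, no cache is needed to avoid re-checking nested parts of the document.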
Am I understanding correctly that `crawl()` would eagerly load and validate all `$ref`s for a schema?
That's a potential problem for certain schemas, which may be factored out into many remote refs with the expectation that the validator will usually not have to load all of them. For a simple example:
"anyOf": [
{"$ref": "https://example.com/schemas/version5.json"},
{"$ref": "https://example.com/schemas/version4.json"},
{"$ref": "https://example.com/schemas/version3.json"},
{"$ref": "https://example.com/schemas/version2.json"},
{"$ref": "https://example.com/schemas/version1.json"}
]
The expectation is that an implementation will try to match against `version5.json` before trying `version4.json` (or even loading it), and so on.
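The lazy behavior described here can be sketched with a toy evaluator that retrieves each `$ref` only when it is about to be tried, and stops at the first match. Everything in it is an assumption for illustration: the contents of the five documents are invented, `matches` understands only `required`, and `make_retriever` / `validate_any_of` are hypothetical helpers rather than any library's API.

```python
from typing import Optional

BASE = "https://example.com/schemas/"
REFS = [f"{BASE}version{n}.json" for n in (5, 4, 3, 2, 1)]

# Invented stand-ins for the remote documents; no network access happens.
DOCUMENTS = {f"{BASE}version{n}.json": {"required": [f"v{n}"]} for n in range(1, 6)}


def make_retriever(documents):
    """Return a retrieve function plus the list of URIs it has fetched."""
    fetched = []

    def retrieve(uri):
        fetched.append(uri)
        return documents[uri]

    return retrieve, fetched


def matches(instance, schema):
    # Toy validator: only the "required" keyword is understood.
    return all(key in instance for key in schema.get("required", []))


def validate_any_of(instance, refs, retrieve) -> Optional[str]:
    """Try refs in order, loading each only when needed; return the match."""
    for ref in refs:
        if matches(instance, retrieve(ref)):
            return ref
    return None


retrieve, fetched = make_retriever(DOCUMENTS)
matched = validate_any_of({"v5": True}, REFS, retrieve)
```

With an instance that matches the first branch, only `version5.json` is ever retrieved, whereas an eager crawl of the same `anyOf` would retrieve all five documents up front.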
The only project I've encountered in the wild where this is very significant is mkdocs-material. They have a plugin community which provides their own schemas. As a result, `$ref` resolution can crawl dozens-to-hundreds of files! 😬
So that's sort of a question / input here: I think lazy evaluation is desirable for some users. Am I following the idea of `crawl()` correctly?