Usage¶
Invenio-JSONSchemas is a module for building and serving JSON Schemas.
Using this module one can:
- Define JSON Schemas and expose them under a /schemas endpoint.
- Validate data using locally defined and/or external JSON Schemas.
- Resolve complex schemas and expand their references ($ref) or allOf tags for usage with other libraries that do not support them.
JSON Schema basics¶
Using JSON Schemas is a popular way to define and publish the internal structure of complex entities used inside an application. Since Invenio is a digital library framework, dealing with entities that contain complex metadata, such as bibliographic records, is common and, if not handled properly, can lead to data inconsistencies.
We will not attempt to explain in detail how JSON Schemas are defined and work, since much more thorough resources are available on the official JSON Schema website. A basic knowledge of them is nevertheless recommended in order to understand what this module provides on top of them.
Here is a basic JSON Schema defining the structure of a bibliographic record with a title, an optional description, a list of creators and a publication_year:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "id": "https://myapp.org/schemas/record.json",
  "type": "object",
  "properties": {
    "title": { "type": "string" },
    "description": { "type": "string" },
    "creators": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": { "type": "string" }
        },
        "required": ["name"]
      }
    },
    "publication_year": { "type": "integer" }
  },
  "required": ["title", "creators", "publication_year"]
}
A valid record for this JSON Schema would be the following:
{
  "title": "This is a record title",
  "creators": [ { "name": "Doe, John" }, { "name": "Roe, Jane" } ],
  "publication_year": 2018
}
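To make the relationship between the schema and the record concrete, here is a minimal sketch of the kind of checks a validator performs. It handles only the type, properties, items and required keywords used above. The check helper is hypothetical and written for this illustration; it is not part of Invenio-JSONSchemas, and a real application would rely on a complete JSON Schema validator instead.

```python
# Hypothetical, simplified checker covering only the keywords used above.
TYPES = {"object": dict, "array": list, "string": str, "integer": int}


def check(schema, data, path="$"):
    """Return a list of human-readable problems (empty if none were found)."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(data, TYPES[expected])
        # bool is a subclass of int in Python, but it is not a JSON integer.
        if expected == "integer" and isinstance(data, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    if isinstance(data, dict):
        for name in schema.get("required", []):
            if name not in data:
                errors.append(f"{path}: missing required property '{name}'")
        for name, subschema in schema.get("properties", {}).items():
            if name in data:
                errors += check(subschema, data[name], f"{path}.{name}")
    if isinstance(data, list) and "items" in schema:
        for i, item in enumerate(data):
            errors += check(schema["items"], item, f"{path}[{i}]")
    return errors


schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"},
        "creators": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"name": {"type": "string"}},
                "required": ["name"],
            },
        },
        "publication_year": {"type": "integer"},
    },
    "required": ["title", "creators", "publication_year"],
}

record = {
    "title": "This is a record title",
    "creators": [{"name": "Doe, John"}, {"name": "Roe, Jane"}],
    "publication_year": 2018,
}

print(check(schema, record))  # the valid record yields no problems
print(check(schema, {"title": 42, "creators": [{}]}))  # several problems
```

The second call reports the missing publication_year, the non-string title and the creator without a name, which are exactly the constraints the schema expresses.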
Initialization¶
First create a Flask application:
>>> from flask import Flask
>>> app = Flask(__name__)
Configuration¶
Before we initialize the InvenioJSONSchemas extension, we need to configure our application’s host. This lets the extension skip additional HTTP requests when fetching locally defined JSON Schemas. More about how this works is described in Composable schemas with JSONRef.
>>> # If your website's host is e.g. "myapp.org"
>>> app.config['JSONSCHEMAS_HOST'] = 'myapp.org'
Last, but not least, let’s initialize the extension:
>>> from invenio_jsonschemas import InvenioJSONSchemas
>>> ext = InvenioJSONSchemas(app)
Setuptools integration¶
The above steps didn’t actually register any JSON Schemas. For your JSON Schemas to be registered, you must specify in your package’s setup.py an entry point item in the invenio_jsonschemas.schemas group, pointing to a Python module where the actual JSON Schema .json files are placed. Invenio-JSONSchemas then takes care of loading them automatically during application initialization.
By default the extension loads schemas from the entry point group invenio_jsonschemas.schemas, but you can change that as shown below:
ext = InvenioJSONSchemas(app, entry_point_group=<entrypoint_group_name>)
Registering JSON Schemas¶
Here is a directory structure containing two JSON Schemas, biology/animal_record_schema.json and record_schema.json, taken from this module’s example application:
$ tree --dirsfirst invenio-jsonschemas/examples/samplepkg
invenio-jsonschemas/examples/samplepkg
├── samplepkg
│   ├── jsonschemas
│   │   ├── biology
│   │   │   └── animal_record_schema.json
│   │   ├── __init__.py
│   │   └── record_schema.json
│   └── __init__.py
└── setup.py
The first step in using invenio_jsonschemas is to register the folder that holds your schemas. To do so, add to your package an entry point that points to your schema folder, as shown below:
# invenio-jsonschemas/examples/samplepkg/setup.py
...
entry_points={
    'invenio_jsonschemas.schemas': [
        'samplepkg = samplepkg.jsonschemas'  # path to your schema folder
    ],
},
...
After you register your schemas folder, the extension knows where to find your schemas and how to load them. The extension loads every schema that is under {JSONSCHEMAS_HOST}/{JSONSCHEMAS_ENDPOINT} by fetching it locally instead of making a network request, which means that it reads the schema file directly. The same happens when a schema contains a $ref field pointing to a URL of the same format. The function that is called when a schema URL is requested can be found in the module’s source code.
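The decision between reading a file and making a network request can be pictured with the following sketch. It is not the extension’s own code: url_to_local_path is a hypothetical helper, and the host and endpoint values simply mirror the configuration used in this guide.

```python
from urllib.parse import urlsplit

# Values mirroring the configuration used in this guide.
JSONSCHEMAS_HOST = "myapp.org"
JSONSCHEMAS_ENDPOINT = "/schemas"


def url_to_local_path(url, host=JSONSCHEMAS_HOST, endpoint=JSONSCHEMAS_ENDPOINT):
    """Return the schema path relative to the endpoint, or None if not local."""
    parts = urlsplit(url)
    prefix = endpoint.rstrip("/") + "/"
    if parts.netloc != host or not parts.path.startswith(prefix):
        # Different host or outside the schemas endpoint: not a local schema.
        return None
    return parts.path[len(prefix):]


print(url_to_local_path("https://myapp.org/schemas/biology/animal_record_schema.json"))
print(url_to_local_path("https://foo.org/schemas/person.json"))
```

A URL on the configured host and endpoint maps to a relative path such as biology/animal_record_schema.json, which can be read from the registered folder; any other URL yields None and would have to be fetched over the network.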
Exposing JSON Schemas¶
You can enable or disable the endpoint that serves your schemas when initializing the InvenioJSONSchemas extension by passing the register_config_blueprint parameter. This parameter names the configuration variable that controls whether the schemas are served. So, to disable schema serving, set that configuration variable to False and pass its name as register_config_blueprint.
Also, by default the schema endpoint is prefixed by /schemas. To change that, set the JSONSCHEMAS_ENDPOINT configuration variable. For more available configuration options see Configuration.
For the above example the two schemas would be available under:
- https://myapp.org/schemas/record_schema.json, and
- https://myapp.org/schemas/biology/animal_record_schema.json
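The mapping from files in the schema folder to public URLs can be sketched as follows. This is only an illustration: schema_urls is a hypothetical helper, the https scheme is an assumption, and the real extension discovers schemas through the entry point group rather than through a function like this one.

```python
import os
import tempfile


def schema_urls(root, host="myapp.org", endpoint="/schemas"):
    """Map every .json file below root to the URL it would be served under."""
    urls = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".json"):
                rel = os.path.relpath(os.path.join(dirpath, filename), root)
                # URLs always use forward slashes, whatever the platform.
                urls.append(f"https://{host}{endpoint}/" + rel.replace(os.sep, "/"))
    return sorted(urls)


# Recreate the layout of the sample package in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "biology"))
    for rel in ("record_schema.json", os.path.join("biology", "animal_record_schema.json")):
        with open(os.path.join(root, rel), "w") as fp:
            fp.write("{}")
    # Non-schema files such as __init__.py are ignored.
    open(os.path.join(root, "__init__.py"), "w").close()
    urls = schema_urls(root)

for url in urls:
    print(url)
```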
Composable schemas with JSONRef¶
A JSON Schema can be a fully fleshed out schema, composed of only the “primitive” types provided in the specification, but usually this is impractical when there are sub-entities that are repeated throughout the schema. For that reason JSON Schema provides $ref fields which can point to internal or external schemas. See such an example below:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "id": "https://myapp.org/schemas/record.json",
  "definitions": {
    "location": {
      "type": "object",
      "properties": {
        "city": { "type": "string" },
        "country": { "type": "string" },
        "address": { "type": "string" }
      }
    }
  },
  "type": "object",
  "properties": {
    "title": { "type": "string" },
    "creator": { "$ref": "https://foo.org/schemas/person.json" },
    "origin": { "$ref": "#/definitions/location" }
  }
}
Invenio-JSONSchemas can serve either the fully resolved schema or the compact version including one or many $ref fields. To have the extension serve the resolved schema, either pass the querystring parameter refs=1 when fetching a schema or set the JSONSCHEMAS_REPLACE_REFS configuration variable to True. Internally the module uses the JsonRef package for resolving the references in the schema.
If you make a request to GET https://myapp.org/schemas/record.json?refs=1, you will get something similar to:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "id": "https://myapp.org/schemas/record.json",
  "definitions": { ... },
  "type": "object",
  "properties": {
    "title": { "type": "string" },
    "creator": {
      // person.json schema will be fetched and expanded here
      ...
    },
    "origin": {
      "type": "object",
      "properties": {
        "city": { "type": "string" },
        "country": { "type": "string" },
        "address": { "type": "string" }
      }
    }
  }
}
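To illustrate what replacing references means, here is a minimal hand-written sketch that expands only internal references of the form #/definitions/.... It is a simplified stand-in, not the JsonRef package the module actually uses: resolve_pointer and expand_local_refs are hypothetical helpers, external references are left untouched, and circular references are not handled.

```python
import copy


def resolve_pointer(document, ref):
    """Follow a local reference such as '#/definitions/location'."""
    target = document
    for token in ref[2:].split("/"):
        # Undo JSON Pointer escaping: "~1" stands for "/", "~0" for "~".
        target = target[token.replace("~1", "/").replace("~0", "~")]
    return target


def expand_local_refs(node, root):
    """Return a copy of node with local '$ref' objects replaced by their targets.

    External references are left untouched, and circular references are not
    handled (they would recurse forever).
    """
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/"):
            return expand_local_refs(copy.deepcopy(resolve_pointer(root, ref)), root)
        return {key: expand_local_refs(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_local_refs(item, root) for item in node]
    return node


schema = {
    "definitions": {
        "location": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
        }
    },
    "type": "object",
    "properties": {
        "creator": {"$ref": "https://foo.org/schemas/person.json"},
        "origin": {"$ref": "#/definitions/location"},
    },
}

expanded = expand_local_refs(schema, schema)
print(expanded["properties"]["origin"])   # the inlined "location" definition
print(expanded["properties"]["creator"])  # external reference, left as-is
```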
The module also expands the allOf tags when the resolved=1 querystring parameter is passed or the JSONSCHEMAS_RESOLVE_SCHEMA configuration variable is set to True. An example of a schema that includes the allOf tag is shown below:
...
"id": "https://myapp.org/schemas/record.json",
"allOf": [
  { "properties": { "title": { "type": "string" } } },
  { "properties": { "status": { "enum": [ "published", "draft" ] } } }
]
...
If you make a request to GET https://myapp.org/schemas/record.json?resolved=1
you would get a response in the following format:
...
// The "allOf" items have been merged in a single object
"properties": {
  "title": { "type": "string" },
  "status": { "enum": [ "published", "draft" ] }
}
...
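The merge can be pictured with the following sketch. It is a deliberately simplified illustration and not the module’s actual algorithm: merge_all_of is a hypothetical helper that only combines properties and required, and when two items define the same keyword the later one wins, which is weaker than the real allOf semantics.

```python
def merge_all_of(schema):
    """Fold the items of a top-level "allOf" into a single schema object."""
    merged = {}
    base = {key: value for key, value in schema.items() if key != "allOf"}
    for part in [base] + list(schema.get("allOf", [])):
        for key, value in part.items():
            if key == "properties":
                # Collect the properties of every item in one mapping.
                merged.setdefault("properties", {}).update(value)
            elif key == "required":
                required = merged.setdefault("required", [])
                for name in value:
                    if name not in required:
                        required.append(name)
            else:
                # Simplification: for any other keyword the later item wins.
                merged[key] = value
    return merged


schema = {
    "id": "https://myapp.org/schemas/record.json",
    "allOf": [
        {"properties": {"title": {"type": "string"}}},
        {"properties": {"status": {"enum": ["published", "draft"]}}},
    ],
}

merged = merge_all_of(schema)
print(merged["properties"])
```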
Note on storing absolute URLs¶
As discussed in this issue, it is not recommended to store and expose absolute URLs in $ref fields, as they can change in the future. Instead, try to use DOI/EPIC or other kinds of identifiers that are certain never to change, to avoid broken references.
Using with Invenio-Records¶
Invenio-JSONSchemas includes an invenio_records.jsonresolver entry point item which registers a JSONResolver plugin for Invenio-Records. This means that validating records against schemas whose $ref fields point to locally defined schemas does not trigger HTTP requests to fetch those schemas; they are resolved locally instead. You can read more about record validation in the documentation of Invenio-Records.