hasura / graphql-data-specification

A specification for Data APIs with GraphQL

Home Page: https://playground.graphql-data.com

Haskell 71.47% TypeScript 23.73% JavaScript 0.67% CSS 3.18% Dockerfile 0.85% Shell 0.10%

GraphQL Data Specification (GDS)

Here is the talk at Enterprise GraphQL Conf '22 motivating the need and introducing the key ideas in this specification.

Status: DRAFT. This specification is a work in progress and currently an early draft. It will be complete once the specification is entirely formalized and has a reference implementation (currently in /tooling).

Contact: If you or your organization is interested in collaborating on the specification and/or working on a GraphQL Data API platform, reach out to us at: [email protected] and we'd love to exchange notes.

GDS is a GraphQL API specification for accessing transactional, analytical and streaming data across multiple data sources that contain semantically related data. Accessing refers to reading, writing or subscribing to data.

GDS solves for the following requirements:

  • High performance out of the box: high-concurrency & low-latency
    • Automated query planning (compilation with JSON aggregation > data-loader > n+1)
    • Authorization rules integrated into data fetching automatically (predicate push-down)
    • Automated caching (cache-key discovery)
  • Security
    • Intuitive fine-grained authorization
    • Declarative
  • Standardization without losing flexibility
    • Federation across multiple types of sources (databases, API services)
    • Expose data-source specific capabilities (eg: specific types, operators or aggregation functions from upstream source)

There are 3 main components of this specification:

  1. Domain Graph Description Language
  2. Node Level Security
  3. GraphQL Schema and API specification

Check out the playground to get a feel for the DGDL, NLS and the resulting GraphQL schema.

Domain Graph Description Language (DGDL)

The Domain Graph Description Language is a DSL that describes the business domain of the user and the relationships between nodes of that domain graph. This domain graph can be used by GraphQL engines to generate a GraphQL schema and API that allows clients to access and operate on the domain graph.

The DGDL has the following concepts:

Model

A model is backed by a data source and can be connected to other models in the same or other data sources. A model indicates whether it can be read from and written to.

Model
  name :: String
  fields :: [Field]
  edges :: [Edge]
  selectable :: Boolean
  insertable :: Boolean
  updateable :: Boolean  // (only if selectable)
  deleteable :: Boolean  // (only if selectable)
  implements :: [VirtualModelName]
  implemented_by :: [VirtualModelName]
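
As a rough illustration, the Model grammar above could be mirrored in TypeScript as follows. This is only a sketch: the `validateModel` helper and the `author` example are hypothetical, fields and edges are reduced to plain names, and the only rule checked is the "only if selectable" note attached to `updateable` and `deleteable`.

```typescript
// Simplified, hypothetical mirror of the Model grammar.
// Fields and edges are reduced to their names to keep the sketch short.
interface Model {
  name: string;
  fields: string[];
  edges: string[];
  selectable: boolean;
  insertable: boolean;
  updateable: boolean; // only meaningful if selectable
  deleteable: boolean; // only meaningful if selectable
  implements: string[];
  implemented_by: string[];
}

// Returns the violations of the "only if selectable" rule for a model.
function validateModel(m: Model): string[] {
  const errors: string[] = [];
  if (m.updateable && !m.selectable) {
    errors.push(`${m.name}: updateable requires selectable`);
  }
  if (m.deleteable && !m.selectable) {
    errors.push(`${m.name}: deleteable requires selectable`);
  }
  return errors;
}

// Hypothetical example model.
const author: Model = {
  name: "Author",
  fields: ["id", "name"],
  edges: ["articles"],
  selectable: true,
  insertable: true,
  updateable: true,
  deleteable: true,
  implements: [],
  implemented_by: [],
};
```

Calling `validateModel(author)` yields an empty list, while the same model with `selectable: false` would report both the update and delete flags.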

Virtual Model

A virtual model is a complex entity of the domain that may not exist concretely in the upstream data sources but is brought into existence as a part of the data graph during the lifetime of a data API request/response.

A virtual model would typically be used to represent one of 3 types of concepts:

  1. A complex property of a model
  2. A piece of data that is required as an input to a data API request
  3. A piece of data that is output from a data API request

Like a model, a virtual model can also be semantically connected to other models and virtual models in the domain's data graph.

VirtualModel
	name :: String
	fields :: [Field]
	edges :: [Edge]
	implements :: [VirtualModelName]
	implemented_by :: [VirtualModelName]

Fields & edges

Fields and edges represent properties of a model or a virtual model. Fields are used to represent a property of the model itself, and edges are properties of the model that reference other models or virtual models that the parent model is "related to".


Field
	name :: FieldName 
	input :: VirtualModel
	output :: FieldType

FieldType
	FieldTypeNamed FieldTypeName | 
	FieldTypeList FieldType | 
	FieldTypeNotNull FieldType

FieldTypeName 
	FieldTypeNameScalar ScalarFieldTypeName | 
	FieldTypeNameVirtualModel VirtualModel | 
	FieldTypeNameEnum EnumName

ScalarFieldTypeName 
	Int | 
	String |
	Boolean | 
	Id | ... | 

Edge
	name :: String
	target :: VirtualModel | Model
	kind :: Object | Array

Enum
	name :: String
	values: [String]
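
The recursive FieldType grammar can be read as a small algebraic data type. The sketch below encodes it as a TypeScript discriminated union and renders a value in GraphQL type notation. The `renderFieldType` helper and the mapping to GraphQL list and non-null syntax are assumptions made for illustration; the specification itself does not define this rendering.

```typescript
// Hypothetical encoding of FieldTypeName: scalar, virtual model or enum, by name.
type FieldTypeName =
  | { tag: "scalar"; name: string }
  | { tag: "virtualModel"; name: string }
  | { tag: "enum"; name: string };

// Hypothetical encoding of FieldType: a named type, a list, or a not-null wrapper.
type FieldType =
  | { tag: "named"; name: FieldTypeName }
  | { tag: "list"; of: FieldType }
  | { tag: "notNull"; of: FieldType };

// Renders a FieldType using GraphQL's [T] and T! notation (an assumed mapping).
function renderFieldType(t: FieldType): string {
  switch (t.tag) {
    case "named":
      return t.name.name;
    case "list":
      return `[${renderFieldType(t.of)}]`;
    case "notNull":
      return `${renderFieldType(t.of)}!`;
  }
}

// A non-null list of non-null Int values.
const scores: FieldType = {
  tag: "notNull",
  of: {
    tag: "list",
    of: { tag: "notNull", of: { tag: "named", name: { tag: "scalar", name: "Int" } } },
  },
};

const rendered = renderFieldType(scores);
```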

Commands

Commands are methods that operate on the domain graph. They take a node as an input, perform an operation on the underlying data sources and return a node as an output. The logic that executes inside a command is opaque to the DGDL and not its concern.

Command
	name :: String
  input :: VirtualModel
  output :: VirtualModel
  operationType :: Read | Write
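
A minimal sketch of the Command grammar in TypeScript, assuming virtual models are referenced by name. The `rootTypeFor` helper and the two example commands are hypothetical; the routing it shows (read commands under Query, write commands under Mutation) follows the GraphQL schema conventions described later in this document.

```typescript
// Hypothetical mirror of the Command grammar; virtual models are referenced by name.
interface Command {
  name: string;
  input: string;
  output: string;
  operationType: "Read" | "Write";
}

// Read commands are exposed under Query, write commands under Mutation.
function rootTypeFor(command: Command): "Query" | "Mutation" {
  return command.operationType === "Read" ? "Query" : "Mutation";
}

// Hypothetical example commands.
const searchArticles: Command = {
  name: "searchArticles",
  input: "SearchInput",
  output: "SearchResult",
  operationType: "Read",
};

const publishArticle: Command = {
  name: "publishArticle",
  input: "PublishInput",
  output: "PublishResult",
  operationType: "Write",
};
```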

Aggregation functions

Any list of models or list of virtual models can be aggregated over using aggregation functions. These aggregation functions are composed along with predicate functions and are also available in the final GraphQL API that is exposed.

For every model or virtual model, the following aggregation function is generated:

// Input
ModelAggregateExpression: {
	groupBySet: [ModelScalarField], 
	aggregatedFields: [
		{AggregationOperator: {arguments:..., ModelField}}
	],
	where: ModelBooleanExpression
}

// Output
AggregatedModel:
	groupBySet: [ModelScalarField]
	aggregatedFields: [ModelField]
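
To make the shape of an aggregate expression more concrete, here is an in-memory sketch in TypeScript. It is not how a GDS engine would execute aggregations (those would normally be pushed down to the data source). The `count` and `sum` operators, the `operator_field` output naming, and the use of a plain function for `where` are all assumptions made for illustration.

```typescript
type Row = Record<string, string | number>;

// Hypothetical, simplified ModelAggregateExpression.
interface AggregateExpression {
  groupBySet: string[];
  aggregatedFields: { operator: "count" | "sum"; field: string }[];
  where?: (row: Row) => boolean; // stand-in for ModelBooleanExpression
}

// Filters rows, groups them by groupBySet, then applies each aggregation operator.
function aggregate(rows: Row[], expr: AggregateExpression): Row[] {
  const groups = new Map<string, { key: Row; members: Row[] }>();
  for (const row of rows) {
    if (expr.where && !expr.where(row)) continue;
    const key: Row = {};
    for (const f of expr.groupBySet) key[f] = row[f];
    const id = JSON.stringify(key);
    const group = groups.get(id) ?? { key, members: [] };
    group.members.push(row);
    groups.set(id, group);
  }
  const result: Row[] = [];
  groups.forEach(({ key, members }) => {
    const out: Row = { ...key };
    for (const { operator, field } of expr.aggregatedFields) {
      out[`${operator}_${field}`] =
        operator === "count"
          ? members.length
          : members.reduce((acc, r) => acc + Number(r[field]), 0);
    }
    result.push(out);
  });
  return result;
}

// Hypothetical data: articles with an author and a like count.
const articles: Row[] = [
  { id: 1, author: "ada", likes: 2 },
  { id: 2, author: "ada", likes: 3 },
  { id: 3, author: "bob", likes: 7 },
  { id: 4, author: "bob", likes: 0 },
];

// Count and total likes per author, ignoring articles without likes.
const likesPerAuthor = aggregate(articles, {
  groupBySet: ["author"],
  aggregatedFields: [
    { operator: "count", field: "id" },
    { operator: "sum", field: "likes" },
  ],
  where: (row) => Number(row["likes"]) > 0,
});
```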

Predicate functions

Predicate functions operate on input and return a true or a false. Since they are type-safe, predicate functions rely on boolean expressions that are unique to each model and virtual model and allow the composition of fields, edges & aggregation functions.

Predicate functions are key to implementing filter arguments in the final GraphQL API and are used for Node Level Security.

For every model or virtual model, the following boolean expression is generated:

FieldBooleanExpression:

  1. And / Or / Not expressions that allow composition with boolean algebra
  2. Boolean expressions for scalar fields follow this syntax:
	FieldName: { FieldTypeOperator: Input }
  3. Boolean expressions for fields that take an input follow this syntax:
	FieldName: { arguments: {...}, output: FieldBooleanExpression}
  4. Boolean expressions for fields that are virtual models follow this syntax:
	Field: VirtualModelBooleanExpression
  5. Boolean expressions for edges follow this syntax:
	EdgeName: VirtualModelBooleanExpression | ModelBooleanExpression
  6. Boolean expressions for fields & edges that are a list of models or virtual models follow this syntax:
	FieldName: {arguments: ModelAggregateExpression, ModelFieldBooleanExpression}

Predicate functions (represented as boolean expressions) are the core of what allow validation, filtering and fine-grained security when accessing or operating on a data graph.
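
The composition of And / Or / Not with scalar field comparisons can be sketched as a small evaluator. This covers only the first two cases of the grammar above, and the operator names (`eq`, `gt`, `lt`) and the `BoolExp` encoding are assumptions rather than part of the specification.

```typescript
type Scalar = string | number | boolean;
type GraphNode = { [field: string]: Scalar };

// Hypothetical encoding of a boolean expression over scalar fields.
type BoolExp =
  | { and: BoolExp[] }
  | { or: BoolExp[] }
  | { not: BoolExp }
  | { field: string; op: "eq" | "gt" | "lt"; value: Scalar };

// Evaluates a boolean expression against a single node.
function evaluate(exp: BoolExp, node: GraphNode): boolean {
  if ("and" in exp) return exp.and.every((e) => evaluate(e, node));
  if ("or" in exp) return exp.or.some((e) => evaluate(e, node));
  if ("not" in exp) return !evaluate(exp.not, node);
  const actual = node[exp.field];
  switch (exp.op) {
    case "eq":
      return actual === exp.value;
    case "gt":
      // Ordering comparisons are restricted to numbers in this sketch.
      return typeof actual === "number" && typeof exp.value === "number" && actual > exp.value;
    case "lt":
      return typeof actual === "number" && typeof exp.value === "number" && actual < exp.value;
  }
}

// "Published articles that do not have fewer than 5 likes".
const popularAndPublished: BoolExp = {
  and: [
    { field: "published", op: "eq", value: true },
    { not: { field: "likes", op: "lt", value: 5 } },
  ],
};

const article: GraphNode = { id: 1, published: true, likes: 10 };
const matches = evaluate(popularAndPublished, article);
```

The same expression value can serve as a `where` argument, an NLS filter or an NLS check, which is why the grammar is generated once per model.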

Node Level Security

Node level security is an authorization policy engine that allows creating fine-grained policies and privileges to scope access and operations on a data graph for end users.

Node level security rules can be applied to Models, Virtual Models & Commands.

NLS for models

Terminology:

  • Node: A node is a concrete instance of a model
  • Filter: Filter is a boolean expression that allows a particular role to select specific nodes in the domain graph that can be accessed or operated on
  • Fields: Fields is a list of fields in the node that can be accessed or operated on
  • Check: Check is a boolean expression that validates whether a particular node meets that constraint, after it is operated on

Grammar:

ReadPermission:
 modelName: String
 roleName: String
 fields: [ModelFieldNames]
 filter: ModelBooleanExpression

InsertPermission:
 modelName: String
 roleName: String
 fields: [ModelFieldNames]
 presets: [(ModelFieldName, LiteralValue)] // For every scalar & enum field
 check: ModelBooleanExpression
 
UpdatePermission:
 modelName: String
 roleName: String
 fields: [ModelFieldNames]
 presets: [(ModelFieldName, LiteralValue)] // For every scalar & enum field
 filter: ModelBooleanExpression
 check: ModelBooleanExpression

DeletePermission:
 modelName: String
 roleName: String
 fields: [ModelFieldNames]
 filter: ModelBooleanExpression
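
As an illustration of how a ReadPermission might be applied, the sketch below first selects the nodes that pass the filter and then projects only the permitted fields. The filter is modeled as a plain function standing in for a ModelBooleanExpression, and `currentUserId` is a hypothetical session value; a real engine would push both steps down into the data source query.

```typescript
type NodeData = Record<string, unknown>;

// Hypothetical, simplified ReadPermission.
interface ReadPermission {
  modelName: string;
  roleName: string;
  fields: string[];
  filter: (node: NodeData) => boolean; // stand-in for ModelBooleanExpression
}

// Keeps only the nodes the role may see, and only the fields it may read.
function applyReadPermission(nodes: NodeData[], perm: ReadPermission): NodeData[] {
  return nodes
    .filter((node) => perm.filter(node))
    .map((node) => {
      const visible: NodeData = {};
      for (const f of perm.fields) {
        if (f in node) visible[f] = node[f];
      }
      return visible;
    });
}

// Hypothetical session value and data.
const currentUserId = "u1";

const storedArticles: NodeData[] = [
  { id: 1, authorId: "u1", title: "Draft A", internalNotes: "todo" },
  { id: 2, authorId: "u2", title: "Draft B", internalNotes: "n/a" },
];

// Authors may read the id and title of their own articles only.
const authorReadsOwnArticles: ReadPermission = {
  modelName: "Article",
  roleName: "author",
  fields: ["id", "title"],
  filter: (node) => node["authorId"] === currentUserId,
};

const visibleArticles = applyReadPermission(storedArticles, authorReadsOwnArticles);
```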

NLS for virtual models

A virtual model doesn't support any particular operations but only exists in the context of an operation on the data graph (an entity in the data API request/response).

Grammar:

Permission:
 modelName: String
 roleName: String
 fields: [ModelFieldNames]
 constraint: ModelBooleanExpression

GraphQL schema & API

The GraphQL schema which defines the permitted list of operations on the data graph is created automatically from the Domain Graph Description Language with the following specification.

We're making it easy to browse the GraphQL schema & API convention via the GDS playground.

Query

The Query field of the GraphQL API contains:

  • For each model:
    • A GraphQL field to select one model, a list of models or an aggregate property of the model
  • For each command:
    • A GraphQL root field to invoke the command

GraphQL field to select one model or a list of models

Follows the following convention:

  • GraphQL field name: ModelName or ModelNameList
  • GraphQL field arguments:
    • where: ModelBooleanExpression
    • limit, offset
    • order_by: ModelSortExpression
  • GraphQL field type (selection set):
    • The fields of the model
    • For each edge:
      • A field that selects one instance or a list of instances from the edge
      • A field that allows selecting an aggregate property from the edge

GraphQL field for commands of operationType read

Follows the following convention:

  • GraphQL field name: CommandName
  • GraphQL field arguments:
    • input: VirtualModel | [VirtualModel]
  • GraphQL field type (selection set):
    • OutputModel | [OutputModel]

Mutation

GraphQL field for commands of operationType write

Follows the following convention:

  • GraphQL field name: CommandName
  • GraphQL field arguments:
    • input: InputModel | [InputModel]
  • GraphQL field type (selection set):
    • OutputModel | [OutputModel]

GraphQL field to insert models (writeable models only)

Follows the following convention:

  • GraphQL field name: insert[ModelName]
  • GraphQL field arguments: Model
    • The fields available in the Model are the writeable fields
    • Edges that are writeable can also be inserted
  • GraphQL field type (selection set):
    • affectedNodes: Int
    • returning: [Model]

GraphQL field to update models

Follows the following convention:

  • GraphQL field name: update[ModelName]
  • GraphQL field arguments:
    • where: ModelBooleanExpression
    • _set: Model
      • The fields available in the Model are the writeable fields
      • Edges that are updateable can also be traversed and updated
  • GraphQL field type (selection set):
    • affectedNodes: Int
    • returning: [Model]

GraphQL field to delete models

Follows the following convention:

  • GraphQL field name: delete[ModelName]
  • GraphQL field arguments:
    • where: ModelBooleanExpression
  • GraphQL field type (selection set):
    • affectedNodes: Int
    • returning: [Model]

Subscription

GraphQL field for live queries

Follows the following convention:

  • GraphQL field name: ModelName
  • GraphQL field arguments:
    • where: ModelBooleanExpression
  • GraphQL field type (selection set):
    • The fields of the model
    • For each edge:
      • A field that selects one instance or a list of instances from the edge
      • A field that allows selecting an aggregate property from the edge

GraphQL field for subscribing to a stream of models

Follows the following convention:

  • GraphQL field name: stream[ModelName]
  • GraphQL field arguments:
    • where: ModelBooleanExpression
    • cursor: [ModelFields]
    • order_by: ModelSortExpression
  • GraphQL field type (selection set):
    • The fields of the model
    • For each edge:
      • A field that selects one instance or a list of instances from the edge
      • A field that allows selecting an aggregate property from the edge
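
The naming conventions for Query, Mutation and Subscription fields can be summarized with a small generator. This is a sketch under several assumptions: names such as `insert[ModelName]` are read as plain concatenation, update and delete fields are gated on the `updateable` and `deleteable` flags, and the aggregate field is omitted because the conventions above do not name it.

```typescript
// The subset of the Model description needed to derive root field names.
interface ModelFlags {
  name: string;
  selectable: boolean;
  insertable: boolean;
  updateable: boolean;
  deleteable: boolean;
}

interface RootFields {
  query: string[];
  mutation: string[];
  subscription: string[];
}

// Derives root field names for one model, following the conventions above.
function rootFieldsFor(model: ModelFlags): RootFields {
  const fields: RootFields = { query: [], mutation: [], subscription: [] };
  if (model.selectable) {
    fields.query.push(model.name, `${model.name}List`);
    fields.subscription.push(model.name, `stream${model.name}`);
  }
  if (model.insertable) fields.mutation.push(`insert${model.name}`);
  if (model.updateable) fields.mutation.push(`update${model.name}`);
  if (model.deleteable) fields.mutation.push(`delete${model.name}`);
  return fields;
}

// Hypothetical model that can be read, inserted and updated, but not deleted.
const articleFields = rootFieldsFor({
  name: "Article",
  selectable: true,
  insertable: true,
  updateable: true,
  deleteable: false,
});
```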

Data source mapping

A data source contains one or both of the following entities:

  1. Logical or physical data models that can be read or written
  2. Methods that access underlying data models (read or write) and expose a view of the data

A data source can be any kind of database or an API service.

Conventions for exposing models from a database

  • The mapping layer should provide a way of creating a logical model from the underlying data model in the database
  • The logical model definition should only depend on other entities within the same data source
  • This includes computed fields, views, parameterized queries
  • The logical model should indicate if it can be read from or written to (insert, update, delete)
  • Each data source also indicates a set of types, and operators & functions that can operate on each type

Examples:

  • Postgres table with computed fields
  • Parameterized query (a parameterized view)

Conventions for exposing endpoints from an API service

  • The mapping layer should provide a way of annotating the following properties of an API backed resource:
    • (required) A Get method to fetch one instance of the resource
    • (optional) A List method to fetch multiple instances of the resource
    • (optional) A set of key parameters that are required to access the resource
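
One possible shape for such an annotation, sketched as a TypeScript interface. The names `ApiResourceMapping` and `missingKeys` are hypothetical, and the methods are synchronous and backed by an in-memory array purely to keep the example self-contained; a real mapping would call the upstream API.

```typescript
// Hypothetical annotation of an API backed resource.
interface ApiResourceMapping<T> {
  name: string;
  get: (keys: Record<string, string>) => T | undefined; // required: fetch one instance
  list?: () => T[]; // optional: fetch multiple instances
  keyParameters?: string[]; // optional: parameters required to access the resource
}

// Returns the key parameters that a request has not supplied.
function missingKeys<T>(mapping: ApiResourceMapping<T>, keys: Record<string, string>): string[] {
  return (mapping.keyParameters ?? []).filter((k) => !(k in keys));
}

interface User {
  id: string;
  name: string;
}

// In-memory stand-in for an upstream user service.
const users: User[] = [{ id: "1", name: "Ada" }];

const userResource: ApiResourceMapping<User> = {
  name: "User",
  get: (keys) => users.filter((u) => u.id === keys["id"])[0],
  list: () => users,
  keyParameters: ["id"],
};
```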

Examples:

  • An algolia search endpoint
  • A createUser API endpoint

Contributors

0x777, coco98, wawhal

graphql-data-specification's Issues

Take a final call on the authoring experience

A GDS compliant engine takes GDS input and presents a GraphQL API.

GDS Input:

  1. Source capabilities:
    • List of sources: Each source has:
      • catalog
      • operators
      • predicate functions
      • aggregate functions
      • projection functions
    • List of engine provided features (that can be applied on any datasource):
      • least common denominator operators, predicate functions, aggregation functions
      • custom projection functions
  2. DGDL: Models, Edges & Mapping
  3. NLS

Field constraints in GraphQL schemas

I really like the idea of GDS ... and even the current Hasura metadata model! 😃

One aspect of Hasura-generated GraphQL schemas that I find lacking is the absence of constraints on model fields. While I understand that this isn't natively supported by GraphQL schemas, it would be incredibly helpful for clients utilizing the GraphQL schema to be aware of these constraints. Examples of useful constraints include field requirements, numerical data ranges, or text data formats (such as email, URL, or regex-defined constraints).

In an ideal world, Hasura metadata or GDS could be used to define field constraints and generate GraphQL directives (such as @constraint) for the fields. Is there a plan to implement such a feature in GDS?

Clarify Predicate functions expressiveness and flexibility

Hi,

I'm trying to understand the spec in its current form. Could you clarify how expressive you intend predicate functions to be? Specifically, can they be defined as arbitrary functions using some simple scripting language? Or are they limited to a few keywords? In either case, can you elaborate on the goals and constraints that informed your current thinking?

Best,
J

Add specification for projection functions

Projection functions are stateless functions that take an input and provide an output.

They are very useful, especially in allowing addition of domain specific logic to NLS (eg: input validation) (#16)

Allow usage of functions in predicate expressions

We should extend the predicate function grammar to allow application of functions and not just : : style expressions.

This will allow the predicate expressions to capture more complex use-cases.

// Custom type validation (check if email is valid)
isValid(node.email): true

Possible syntax that works well when represented in a JSON DSL:

apply:
  functionName: isValid
  arguments: [node.email]

Add support for ability to chain operations

Based on a conversation with Pierre on the Legend team at GS:

Aggregations on a GraphQL API are much harder for data consumers to pick up and use, and people prefer a tabular data structure that they can project, filter and aggregate. The reason they prefer the tabular structure is because they aren't constrained by the type system and can easily chain a sequence of operations. This is the KEY ask for teams working in financial services.

What would the equivalent in the GraphQL world be?

Eg: Do we need to introduce a meta type to represent a tabular structure that can be introspected on the fly, to permit tools like graphiql and graphql-codegen to validate queries entirely on the client side?
