
Elasticsearch API Specification

The Elasticsearch API Specification provides the contract for communication between client and server components within the Elasticsearch stack. With almost 500 API endpoints and around 3000 data types across the entire API surface, this project is a vitally important part of sustaining our engineering efforts at scale.

The repository has the following structure:

Path Description
api-design-guidelines/ Knowledge base of best practices for API design.
compiler/ Compiler that transforms the TypeScript specification into the JSON representation.
compiler-rs/ Rust tooling for processing the compiled schema.
docs/ Documentation for the specification and its tooling.
output/ Generated artifacts, including the JSON representation at output/schema/schema.json.
specification/ Elasticsearch request/response definitions in TypeScript.
typescript-generator/ Generator that produces TypeScript types from the compiled schema.

The JSON representation is formally defined by a set of TypeScript definitions (a meta-model) that also documents the various properties and their values.

Prepare the environment

To generate the JSON representation and run the validation code, you need to install and configure Node.js in your development environment.

You can install Node.js with nvm:

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.1/install.sh | bash

Once nvm is installed, use it to install Node.js:

# this command will install the version configured in .nvmrc
nvm install

How to generate the JSON representation

# clone the project
$ git clone https://github.com/elastic/elasticsearch-specification.git

# install the dependencies
$ make setup

# generate the JSON representation
$ make generate

# the generated output can be found in ./output/schema/schema.json
$ cat output/schema/schema.json

Make Targets

Usage:
  make <target>
  validate         Validate a given endpoint request or response
  validate-no-cache  Validate a given endpoint request or response without local cache
  generate         Generate the output spec
  compile          Compile the specification
  license-check    Check that the license headers are present in the files
  license-add      Add the license headers to the files
  spec-format-check  Check specification formatting rules
  spec-format-fix  Format/fix the specification according to the formatting rules
  spec-dangling-types  Generate the dangling types report
  setup            Install dependencies for contrib target
  clean-dep        Clean npm dependencies
  contrib          Pre contribution target
  help             Display help

Structure of the JSON representation

The JSON representation is formally defined as TypeScript definitions. Refer to them for the full details. It is an object with two top level keys:

{
  "types": [...],
  "endpoints": [...]
}

The first one, types, contains all the type definitions from the specification, such as IndexRequest or MainError, while the second one, endpoints, contains every endpoint of Elasticsearch and the respective type mapping. For example:

{
  "types": [    {
    "attachedBehaviors": [
      "CommonQueryParameters"
    ],
    "body": {
      "kind": "value",
      "value": {
        "kind": "instance_of",
        "type": {
          "name": "TDocument",
          "namespace": "_global.index"
        }
      }
    },
    "generics": [
      {
        "name": "TDocument",
        "namespace": "_global.index"
      }
    ],
    "inherits": {
      "type": {
        "name": "RequestBase",
        "namespace": "_types"
      }
    },
    "kind": "request",
    "name": {
      "name": "Request",
      "namespace": "_global.index"
    },
    "path": [...],
    "query": [...]
  }, {
    "inherits": {
      "type": {
        "name": "WriteResponseBase",
        "namespace": "_types"
      }
    },
    "kind": "response",
    "name": {
      "name": "Response",
      "namespace": "_global.index"
    }
  }],
  "endpoints": [{
      "accept": [
        "application/json"
      ],
      "contentType": [
        "application/json"
      ],
      "description": "Creates or updates a document in an index.",
      "docUrl": "https://www.elastic.co/guide/en/elasticsearch/reference/master/docs-index_.html",
      "name": "index",
      "request": {
        "name": "Request",
        "namespace": "_global.index"
      },
      "requestBodyRequired": true,
      "response": {
        "name": "Response",
        "namespace": "_global.index"
      },
      "since": "0.0.0",
      "stability": "stable",
      "urls": [...],
      "visibility": "public"
    }]
}

The example above represents the index request. Inside the endpoints array you can find the API name and the type mappings under request.name and response.name; the respective type definitions can be found inside the types array.

In some cases an endpoint might be defined before its type definitions exist; in that case, the request and response values will be null.
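As a sketch of how a consumer of schema.json might use this, the following TypeScript (the interfaces are reduced to the fields used here, and the sample data is hypothetical) filters out endpoints whose types are still missing:

```typescript
// Minimal shape of an endpoint entry, reduced to the fields used here.
interface TypeName { name: string; namespace: string }
interface Endpoint {
  name: string
  request: TypeName | null
  response: TypeName | null
}

// Sample data in the shape of schema.json's "endpoints" array.
const endpoints: Endpoint[] = [
  { name: 'index',
    request: { name: 'Request', namespace: '_global.index' },
    response: { name: 'Response', namespace: '_global.index' } },
  { name: 'some.new_api', request: null, response: null },
]

// Endpoints that still lack a request or response definition.
const missing = endpoints
  .filter(e => e.request === null || e.response === null)
  .map(e => e.name)

console.log(missing) // in this sample, only 'some.new_api' has null types
```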

How to validate the specification

The specification is validated daily by the client-flight-recorder project. The validation result can be found here.

Validate the specification in your machine

The following steps apply only if you don't have ~/.elastic/github.token in place.

Create a GitHub token to allow authentication with Vault:

  • Go to https://github.com/settings/tokens.
  • Click Generate new token.
  • Give your token a name and make sure to select the repo and read:org scopes.
  • Create a file at ~/.elastic/github.token and paste the GitHub token into it.
  • Change the file permissions so that only your user can access it: chmod 600 ~/.elastic/github.token

You can see here how to generate a token.

Once you have configured the environment, run the following commands:

git clone https://github.com/elastic/elasticsearch-specification.git
git clone https://github.com/elastic/clients-flight-recorder.git

cd elasticsearch-specification
# this will validate the xpack.info request type against the 8.1.0 stack version
make validate api=xpack.info type=request stack-version=8.1.0-SNAPSHOT

# this will validate the xpack.info request and response types against the 8.1.0 stack version
make validate api=xpack.info stack-version=8.1.0-SNAPSHOT

The last command above will install all the dependencies, download the test recordings, and finally validate the specification. If you need to download the recordings again, run make validate-no-cache api=xpack.info type=request stack-version=8.1.0-SNAPSHOT.

Once you see the errors, you can fix the original definition in /specification and then run the command again until the type validator no longer reports errors. Finally, open a pull request with your changes.

Documentation

FAQ

I want to see a report of how types and namespaces are being used.

You can find a report of the main branch here.

A specific property is not always present, how do I define it?

When you define a property the syntax is propertyName: propertyType. By default a property is required to exist. If you know that a property will not always be present, you can add a question mark just before the colon:

propertyRequired: string
propertyOptional?: string
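To illustrate with a small, hypothetical interface (not a real spec type): omitting the optional property still type-checks, while omitting the required one would not.

```typescript
// Hypothetical interface demonstrating required vs. optional properties.
interface Example {
  propertyRequired: string
  propertyOptional?: string
}

const full: Example = { propertyRequired: 'a', propertyOptional: 'b' }
const partial: Example = { propertyRequired: 'a' } // fine: the optional property is omitted

console.log(partial.propertyOptional ?? 'absent') // prints "absent"
```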

A definition is missing, how do I add it?

See here.

A definition is not correct, how do I fix it?

All the definitions are inside the /specification folder. Search for the bad definition and update it; see above for how to run the validation of the spec.

An endpoint is missing, how do I add it?

See here.

An endpoint definition is not correct, how do I fix it?

All the endpoint definitions are inside the /specification/_json_spec folder, which contains a series of JSON files taken directly from the Elasticsearch rest-api-spec. You should copy the updated endpoint definition from there and change it here.

The validation is broken on GitHub but works on my machine!

Very likely the recordings on your machine are stale, rerun the validation with the validate-no-cache make target.

You should pull the latest change from the client-flight-recorder as well.

cd client-flight-recorder
git pull

Where do I find the generated test?

Every time you run the make validate script, a series of tests is generated and dumped on disk. You can find the failed tests in clients-flight-recorder/scripts/types-validator/workbench. The content of this folder is a series of recorded responses from Elasticsearch wrapped inside a helper that verifies whether the type definition is correct.

Which editor should I use?

Any editor is fine, but to have a better development experience it should be configured to work with TypeScript. Visual Studio Code and IntelliJ IDEA come with TypeScript support out of the box.

Is there a complete example of the process?

Yes, take a look here.

realpath: command not found

The validation script uses realpath, which may not be present on your system. If you are using macOS, run the following command to fix the issue:

brew install coreutils

I need to modify the compiler, help!

Take a look at the compiler documentation.

Bird's-eye overview

The work of several repositories comes together in this repository. This diagram sketches an overview of how the different pieces connect.

overview.png

elasticsearch-specification's People

Contributors

abdonpijpelink, albertzaharovits, anaethelion, carlosdelest, davidkyle, delvedor, dependabot[bot], droberts195, elasticmachine, ezimuel, flobernd, github-actions[bot], jedrazb, joshmock, jrodewig, kderusso, l-trotta, lcawl, maxhniebergall, miriam-eid, mpdreamz, n1v0lg, philkra, picandocodigo, pquentin, sethmlarson, stevejgordon, swallez, szabosteve, technige


elasticsearch-specification's Issues

Additional Information for type_alias items.

We have various type_alias definitions which represent a union of two possible instance items. Often these are fairly primitive types such as this example for MinimumShouldMatch.

{
  "kind": "type_alias",
  "name": {
	"name": "MinimumShouldMatch",
	"namespace": "common_options.minimum_should_match"
  },
  "type": {
	"items": [
	  {
		"kind": "instance_of",
		"type": {
		  "name": "integer",
		  "namespace": "internal"
		}
	  },
	  {
		"kind": "instance_of",
		"type": {
		  "name": "string",
		  "namespace": "internal"
		}
	  }
	],
	"kind": "union_of"
  },
  "url": "https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-minimum-should-match.html"
}
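For reference, in the specification's TypeScript source such an alias is written as a plain union. The sketch below stubs the spec's built-in `integer` primitive alias as `number` so the snippet stands alone:

```typescript
// Stub for the spec's built-in primitive alias (assumption for this sketch).
type integer = number

// The union alias, as it appears in the specification source.
type MinimumShouldMatch = integer | string

const fixed: MinimumShouldMatch = 2        // absolute count
const percent: MinimumShouldMatch = '75%'  // percentage form

console.log(typeof fixed, typeof percent) // prints "number string"
```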

In the existing NEST client, we provide a type deriving from our Union<T1, T2> base. For MinimumShouldMatch it provides constructors to initialise it with either a string or integer. We also provide two factory methods in this example to make it clearer and easier to create an instance. For example, we allow a double value to be provided as an argument, which we then format as a percentage string.

public class MinimumShouldMatch : Union<int?, string>
{
    public MinimumShouldMatch(int count) : base(count) { }
    public MinimumShouldMatch(string percentage) : base(percentage) { }

    public static MinimumShouldMatch Fixed(int count) => count;
    public static MinimumShouldMatch Percentage(double percentage) => $"{percentage}%";
}

The current schema does not encode enough information to support generating these rich, user-friendly types and I wanted to open a discussion around this.

On the code generator sync on Monday, we did consider the option of using a type_alias representing Percentage and either Fixed or Count as appropriate. The Union would then use those type aliases, rather than the primitive types. This has some advantages in that we can create these richer types during code generation. However, it may cause an explosion in the number of type aliases in the schema.

Even using type aliases, I'm not sure how/if we'd like to encode sufficient information into the model to support strongly-typed clients providing rich creation of a percentage string from a double.

Default, Version & Description Parameter Annotations

🚀 Feature Proposal

Enable the enrichment of request parameters via annotation with

  • a default value
  • version introduced
  • parameter description

All values should be optional, and this must be seen as a long-term goal to provide this data in collaboration with other teams.

Motivation

  1. Developer experience, by getting a hint on the default value of a parameter and a proper description.
  2. The docs team could provide more elaborate generated documentation.
  3. The UI could add more context to UI fields and generate larger parts of forms.

Example

Example definition of a param within a request class:

/**
 * @default 30s
 * @description This is the description of a parameter that was introduced in 7.8.0 with the default value 30s
 * @version 7.8.0
 */
a_parameter: string

json output:

{
  "name": "a_parameter",
  "required": true,
  "default": "30s",
  "description": "This is the description of a parameter that was introduced in 7.8.0 with the default value 30s",
  "version": "7.8.0",
  "type": {
    "kind": "instance_of",
    "type": {
      "name": "SearchableSnapshotsUsage",
      "namespace": "x_pack.info.x_pack_usage"
    }
  }
}

Create Report of Import Graph for improving Organisation

🚀 Feature Proposal

Track non Request & Response class usage for optimising organisation.

Motivation

It's difficult to identify common classes. Having a usage graph would allow moving more classes into common and, conversely, pushing classes (and enums) that are used only once back into the calling class.

Example

{
  "name": "ApiKey",
  "namespace": "common.security",
  "imported_by": [
    {
      "name": "GetApiKeyRequest",
      "namespace": "security"
    },
    {
      "name": "PutApiKeyRequest",
      "namespace": "security"
    }
  ]
}
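A sketch of how such a report could be derived, using hypothetical data: invert a per-type list of imports into the `imported_by` relation shown above.

```typescript
interface Ref { name: string; namespace: string }

// Hypothetical per-type import lists (importer -> types it imports).
const importsByType: Record<string, Ref[]> = {
  'security.GetApiKeyRequest': [{ name: 'ApiKey', namespace: 'common.security' }],
  'security.PutApiKeyRequest': [{ name: 'ApiKey', namespace: 'common.security' }],
}

// Invert into imported_by: imported type -> list of importers.
const importedBy: Record<string, string[]> = {}
for (const [importer, refs] of Object.entries(importsByType)) {
  for (const ref of refs) {
    const key = `${ref.namespace}.${ref.name}`
    ;(importedBy[key] ??= []).push(importer)
  }
}

console.log(importedBy['common.security.ApiKey'])
// both request types import ApiKey in this sample
```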

Validation with both request and response specified is incomplete

When I run ./run-validations.sh with both --request --response arguments, the output is unclear and doesn't signal if tests have been run or not.

Iterating over spec, found API's: 377
✔ Validated 1 endpoints (0 recordings tested, 0 skipped)
security.change_password
┌─────────┐
│ (index) │
├─────────┤
└─────────┘

Running either --request or --response individually provides the expected output.

✔ Validated 1 endpoints (10 recordings tested, 0 skipped)
security.clear_api_key_cache response validated successfully!
security.clear_api_key_cache
┌──────────┬─────────────┬────────────┬───────────────┬──────────┬───────────┐
│ (index)  │ error_count │ test_files │ tests_skipped │ has_type │ exception │
├──────────┼─────────────┼────────────┼───────────────┼──────────┼───────────┤
│ response │      0      │     10     │       0       │   true   │ undefined │

Empty interfaces should not exist

Due to how the TypeScript type system works, if you define an empty interface, then every object will be accepted.

Eg:

import { expectAssignable, expectType } from 'tsd'

interface Foo {}

expectAssignable<Foo>({ hello: 'world' })
expectType<Foo>({ hello: 'world' })

The test above will pass.

The only way to make the test above fail is to use the TypeScript never type:

import { expectAssignable, expectType } from 'tsd'

interface Foo {
  [k: string]: never
}

expectAssignable<Foo>({ hello: 'world' })
expectType<Foo>({ hello: 'world' })

It's very hard to determine the impact of this change on our tests, but given that it is a fairly simple step to add in the TypeScript generator, it's easier to just do it and see what happens.

Discrepancy between generic parameters declaration and references

There is a discrepancy in metamodel.ts between how we declare the generic parameters of a type, and how they are later referenced in its definition:

  • declaration uses generics: string[]
  • usage in the definition uses a TypeName like for any other type, consisting of a name and a namespace. For open generic parameters references, the name is the one used in the declaration, which is expected, but the namespace is the one of the enclosing type, which is more surprising.

Example:

{
  "kind": "interface",
  "name": {
    "name": "KeyedBucket",
    "namespace": "aggregations"
  },
  "generics": [
    "TKey"
  ],
  "properties": [
  {
    "name": "key",
    "required": true,
    "type": {
      "kind": "instance_of",
      "type": {
        "name": "TKey",
        "namespace": "aggregations"
      }
    }
  },
  ...

The fact that declarations use a simple name and references use a namespace that depends on the current type makes processing generic parameters more complicated than needed.

To resolve this inconsistency, we can consider different approaches:

  • change the declaration to also use namespaces: generics: TypeName with the same namespace as the one used when they are referenced:

    {
      "kind": "interface",
      "name": {
        "name": "KeyedBucket",
        "namespace": "aggregations"
      },
      "generics": [
        {
          "name": "TKey",             // <------
          "namespace": "aggregations" // <------
        }
      ],
      "properties": [
      {
        "name": "key",
        "required": true,
        "type": {
          "kind": "instance_of",
          "type": {
            "name": "TKey",
            "namespace": "aggregations"
          }
        }
      },
      ...
  • keep the generics?: string[] declaration and consider that open generic parameters all live in a builtin generic_parameters namespace. References would then use that namespace:

    {
      "kind": "interface",
      "name": {
        "name": "KeyedBucket",
        "namespace": "aggregations"
      },
      "generics": [
        "TKey"
      ],
      "properties": [
      {
        "name": "key",
        "required": true,
        "type": {
          "kind": "instance_of",
          "type": {
            "name": "TKey",
            "namespace": "generic_parameters" // <------
          }
        }
      },
      ...

The first approach makes the model more self-contained (no magic builtin namespace) but requires code generators to keep track of type names that are open generic parameters in the enclosing scope, while the second approach introduces a convention but spares code generators from building that open-generics context.

A third approach could be to combine the two, using full type names for declarations so that the model is internally consistent, but making sure (and validating in validate-model.ts) that open generics live in the generic_parameters namespace so that code generators can just match on that namespace to distinguish them:

{
  "kind": "interface",
  "name": {
    "name": "KeyedBucket",
    "namespace": "aggregations"
  },
  "generics": [
    {
      "name": "TKey",                   // <------
      "namespace": "generic_parameters" // <------
    }
  ],
  "properties": [
  {
    "name": "key",
    "required": true,
    "type": {
      "kind": "instance_of",
      "type": {
        "name": "TKey",
        "namespace": "generic_parameters" // <------
      }
    }
  },
  ...

Thoughts?

/cc @Mpdreamz @delvedor @stevejgordon

Note: TypeAlias generics use a ValueOf[]. This is a bug in the model and it should use the same approach as other type declarations.

Specification re-organization

Now that we have adopted the new specification folder structure that follows the rest-api-spec and migrated the request and response definitions, the next step is to reorganize every other definition to ensure it's stored in the right place.

For example, the cat.aliases API has 3 definitions: request, response, and record. Request and response definitions should always go inside the API namespace, in this case /cat/aliases, while every other type can go either inside the API folder or inside the namespace's _types folder. A good rule of thumb is that if a definition is used across different APIs within a namespace, it should go in _types; otherwise it belongs inside the API folder. For instance, the cat aliases record definition should go inside /cat/aliases, while the cat base definition should go inside /cat/_types.

Furthermore, many folders will need to be updated according to the API name. For example, the cat.aliases API should be defined in /cat/aliases and not /cat/cat_aliases.

Finally, some types will need to be renamed. A language generator can create a type name by combining the namespace and the type name (the file name is not important), which means that every type should have a meaningful name based on where it's defined.
Let's take the cat.aliases API again. Other than the Request and Response definitions, it defines the alias record as well. The record should be named AliasRecord and not CatAliasesRecord. This way, language generators will produce nice, easy-to-discover type names.

Remember to run make contrib before opening a PR!

FAQ:

Where should I start?

If you never worked on the specification, start here and here.

There is no specific order to follow, but it's recommended to work on one namespace at a time.
There is no need to fork the project, you can create a branch directly here. I recommend naming the branch {username}/fix-409-{namespace} to smooth things :)

Which editor should I use?

See here.
Pro tip: if you move files inside the tree view of VSCode, it will automatically fix the imports!

While renaming and moving type definitions, I broke too many imports, what should I do?

While it's recommended to fix the imports while you are moving the types, you can run make spec-imports-fix, which will fix most of the imports (import aliases won't be updated). Be aware that if you rename a type from FooBar to Foo, you should also update all its references before running the spec imports fix command.

How can I be sure that I didn't forget anything behind?

Run make spec-compile. It will run the TypeScript compiler and tell you if there is an error in the specification.
You can also validate the API you are updating.

The CI is failing!

Very likely it's a code style or missing header error. Run make contrib :)

Something broke badly.

Ping @delvedor

Wheel of fortune:

Path parameters should be represented in the spec

Currently, the path parameters are inferred from the json spec by the compiler and added on the fly.
We should start tracking the path parameters directly in the input spec as well, so we can reuse already present definitions (such as Indices) and make it easier to contribute to the spec.

For example:

 class IndexRequest<TDocument> extends RequestBase {
+  path: {
+    index: IndexName;
+    id?: Id;
+  }
   query_parameters?: {
     if_primary_term?: long;
     if_seq_no?: long;
     op_type?: OpType;
     pipeline?: string;
     refresh?: Refresh;
     routing?: Routing;
     timeout?: Time;
     version?: long;
     version_type?: VersionType;
     wait_for_active_shards?: string;
     require_alias?: boolean;
   }
   body?: TDocument;
 }

Add duplicate properties to Validation Report

🚀 Feature Proposal

The validation report should contain the information of duplicate properties within a Request. E.g. query_parameters has a property force and so does body in StopDatafeedRequest.

Motivation

Feedback loop with the server team to remove these duplicate parameters.

Example

"duplicateProperties": {
  "StopDatafeedRequest": {
    "force": ["query_parameters", "body"]
  }
}
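A minimal sketch of how such duplicates could be computed from two property-name lists. The helper and the sample data are hypothetical, not part of the validator:

```typescript
// Returns the property names present in both lists (hypothetical helper).
function duplicateProps(queryParams: string[], bodyProps: string[]): string[] {
  const body = new Set(bodyProps)
  return queryParams.filter(p => body.has(p))
}

// Sample mirroring the StopDatafeedRequest case: `force` appears in both.
const dupes = duplicateProps(['force', 'timeout'], ['force', 'allow_no_match'])
console.log(dupes) // prints [ 'force' ]
```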

Unpack native ES Error type into union type

🚀 Feature Proposal

It would be useful to have distinct ES Error types rather than a single ErrorCause container.
ErrorCause can be something like:

type ErrorCause = ESErrorTypeA | ESErrorTypeB | ...

The final shape of ErrorCause will contain all the current properties, but rather than being all optional/partial, they will be certain based on the specific type.

Motivation

Having a union of types could help when traversing error types, in particular when building the caused_by chain.

Example

function getPainlessErrorReasons(e: PainlessError){
  // here I know type is PainlessError and can traverse it safely 
  // rather than duck type all props
  ...
}

function getErrorsReason(e: ErrorCause){
  const errors = [];
  if(hasPainlessScriptError(e)){
    errors.push(getPainlessErrorReasons(e));
  }
  if(hasErrorX(e)){
    errors.push(getErrorXReasons(e));
  }
  return errors;
}

Feedback from Kibana v7.13

The list is based on the integration work in elastic/kibana#83808

  • no way to declare a type for aggregation in the search response
  • MultiGetHit._source is optional. our code handles it as required.
  • DeleteByQueryRequest doesn't accept q parameter
  • helpers.bulk doesn't support TransportRequestOptions as second argument
  • GetIndexResponse should provide IndexState.alias and IndexState.mappings as required
  • GetTaskResponse doesn't contain response and error properties
  • Hit._id expected to be string
  • PropertyBase is missing dynamic, ignore_above, value, and fields,
  • AsyncSearchGetResponse missing all properties
  • asyncSearch.status method is not defined
  • existsAlias, existsTemplate response body boolean, code expects { exists: boolean }
  • GetUserAccessTokenResponse declares authentication: string, but expected AuthenticatedUser
  • XPackRole type doesn't define applications and transient_metadata.
  • XPackRoleMapping type doesn't define role_templates property.
  • AggregationRange declares from and to as number, but Kibana declares them as strings
  • DeleteResponse should have an optional error property which when present has a type property
  • NodeUsageInformation doesn't contain scripted_metric
  • date_range aggregations support from?: integer and to?: integer, but not from_as_string?: string, to_as_string?: string.
  • HistogramAggregation doesn't contain buckets fields
  • CreateApiKeyResponse has expiration: number, expected: expiration: string
  • GetUserAccessTokenResponse requires kerberos_authentication_response_token
  • HasPrivilegesResponse Kibana expects index: Record<string, Record<string, boolean>>, application: Record<string, boolean>;
  • StoredScript.language is required, but it's unlikely since it's omitted in Kibana code
  • indices.resolveIndex is not typed
  • indices.getIndexTemplate is not typed
  • Policy doesn't contain name property
  • PutMappingRequest doesn't declare write_index_only property
  • Transform interface is empty
  • DeleteSnapshotLifecycleRequest.policy_id is required, but our code uses policy property only
  • AsyncSearchSubmitRequest.indices_boost expected to be Record<IndexName, double>[]
  • AsyncSearchSubmitRequest doesn't declare body.fields type
  • security.grantApiKey method is not defined
  • AuthenticateResponse doesn't define authentication_type and enabled.
  • UpdateByQueryResponse declares updated, total, and task as optional
  • TransformPivot.max_page_search_size expected to be optional
  • SearchRequest.sort expected to be the same type as SearchRequest.body.sort. Or what is the purpose of it?
  • CatIndicesRecord the next properties are expected to be required: 'docs.count', 'docs.deleted'
  • ByteSize is expected to be string
  • ListTasksRequest declares actions?: string, expected: actions?: string[]
  • All properties of DynamicTemplate are required (in the type)
  • docvalue_fields in SearchRequest and SearchRequest['body'] has incompatible types
  • should accept readonly version of request (readonly string [], for example)
  • SortResults contains null in Array<string | number | null>. does it make sense? Our code has to filter the result.
  • skipped is optional in ShardStatistics
  • RolloverIndexRequest.alias expected to accept a string
  • BulkResponseItemBase._id?: string | null. expected _id: string.
  • PutRoleRequest.body doesn't declare transient_metadata property
  • PrivilegesActions doesn't declare application and name properties.
  • OpenPointInTimeRequest.index expected to accept string[]
  • ApiKey.role_descriptors expected to be Record<string, any>
  • TaskId expected be a string
  • start_time_in_millis expected to be number in all the types (AsyncSearchResponseBase, for example)
  • TopHitsAggregation._source should accepts string[]
  • MultiGetHit.found expected to be required

Handling of union type and `server_default`

The compiler refuses to accept a value for @server_default if the type is a union of types.

Example:

/** @server_default 10 */
size: integer | float

I assume there is a type check in the compiler that the value is assignable to both types. We should either force the convention to always use the first type defined, or be more lenient: if the value is assignable to one of the types, the compiler accepts the input.

Ref: #251

Interface for RuntimeField missing string option

🐛 Wrong type

From the documentation, it seems like the script field in a runtime mapping can be a string or a script object, so the following two examples should both be valid:

PUT my-index/_mappings
{
  "runtime": {
    "http.clientip": {
      "type": "ip",
      "script": """
        String clientip=grok('%{COMMONAPACHELOG}').extract(doc["message"].value)?.clientip;
        if (clientip != null) emit(clientip); 
      """
    }
  }
}
  "runtime_mappings": {
    "day_of_week": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
      }
    }
  },

The type: FieldType is also incorrect here, because currently runtime fields support only the following types:

boolean
date
double
geo_point
ip
keyword
long

Definition

If possible provide a snippet with the fix.

export type RuntimeType = 'keyword' | 'long' | 'double' | 'date' | 'ip' | 'boolean'

export interface RuntimeField {
  format?: string
  script?: string | StoredScript
  type: RuntimeType
}
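With this definition, both forms from the documentation type-check. A self-contained sketch, restating the proposed types and stubbing StoredScript minimally (the stub's fields are an assumption for this sketch):

```typescript
// Minimal stub of StoredScript for this sketch (assumed fields).
interface StoredScript { source: string; lang?: string }

type RuntimeType = 'keyword' | 'long' | 'double' | 'date' | 'ip' | 'boolean'

interface RuntimeField {
  format?: string
  script?: string | StoredScript
  type: RuntimeType
}

// Script given as a plain string, as in the first example above.
const asString: RuntimeField = { type: 'ip', script: "emit('127.0.0.1')" }
// Script given as an object, as in the second example above.
const asObject: RuntimeField = { type: 'keyword', script: { source: "emit('x')" } }

console.log(typeof asString.script, typeof asObject.script)
```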

Track content type directly in the input spec

Currently, the input specification does not contain any information about the body content type; we assume it is plain JSON. In some cases it should be ndjson, which we normally represent as an array of objects.

For example, here's the bulk API:

https://github.com/elastic/elastic-client-generator/blob/ba32e0ee31ba6d2054903f10c6dd586689d7e7e9/specification/specs/document/multiple/bulk/BulkRequest.ts#L43

The only way for language generators to know this information is to read the contentType field of the Endpoint definition, which we gather from the rest-api-spec. For example:

"contentType": [
   "application/x-ndjson"
]

🚀 Feature Proposal

Create a new js doc tag to track this information directly in the input spec.

Motivation

The rest-api-spec will not be there forever, and over time we'll move most of the rest-api-spec information directly into the input spec. Furthermore, I think strongly typed languages will benefit from having this information easily accessible, but let's discuss it further.

If we have a specific js doc tag for this, we can instruct the compiler to throw if the body does not respect certain rules. For example, at the moment every ndjson API has its body defined as an array.

Example

 /**
  * @rest_spec_name bulk
  * @since 0.0.0
  * @stability stable
+ * @content_type ndjson
  */
 interface BulkRequest<TSource> extends RequestBase {
   path_parts?: {
     index?: IndexName
     type?: Type
   }
   query_parameters?: {
     pipeline?: string
     refresh?: Refresh
     routing?: Routing
     _source?: boolean
     _source_excludes?: Fields
     _source_includes?: Fields
     timeout?: Time
     type_query_string?: string
     wait_for_active_shards?: WaitForActiveShards
     require_alias?: boolean
   }
   body?: Array<BulkOperationContainer | TSource>
 }

Add missing APIs

There are a bunch of endpoints that are not tracked here. Adding them is straightforward: it's enough to add the json spec definition.

Calling "simulateIndexTemplate" without a "name" parameter causes client to throw a 500 error

This is blocking elastic/kibana#105863, which I'd like to ship with 7.15.

This is occurring on master in Kibana, which is on elasticsearch-canary@^8.0.0-canary.13. To reproduce, call this method with these arguments (you'll need to create a component template called "demo" first):

indices.simulateIndexTemplate({
  index_patterns: [ 'foo' ],
  composed_of: [ '.deprecation-indexing-mappings', 'demo' ]
});

You'll get a 500 error and in the logs you'll see this message:

server    log   [20:37:37.561] [error][http] ConfigurationError: Missing required parameter: name
    at IndicesApi.simulateIndexTemplate (/Users/cjcenizal/Documents/GitHub/Elastic/kibana/node_modules/@elastic/elasticsearch/api/api/indices.js:1359:17)

According to the ES docs, name should be optional. Here are the corresponding client docs for reference.

Rename Repository and default branch

Once the reorganisation of the spec #409 is completed, we shall:

  • rename the default branch to main
  • rename the repository to elasticsearch-specification

/cc @elastic/clients-team

Incorrect TS type for ilm.explainLifecycle response

🐛 Wrong type

The ilm.explainLifecycle response's indices property is typed as Record<string, IlmExplainLifecycleLifecycleExplain> | IlmExplainLifecycleLifecycleExplainProject.

Definition

If I try to access indices[indexName], TS complains with this error:

Element implicitly has an 'any' type because expression of type 'string' can't be used to index type 'Record<string, IlmExplainLifecycleLifecycleExplain> | IlmExplainLifecycleLifecycleExplainProject'.
  No index signature with a parameter of type 'string' was found on type 'Record<string, IlmExplainLifecycleLifecycleExplain> | IlmExplainLifecycleLifecycleExplainProject'.ts(7053)

This is defined in https://github.com/elastic/elasticsearch-specification/blob/main/specification/ilm/explain_lifecycle/ExplainLifecycleResponse.ts#L26.

Based on the Explain lifecycle API docs, I don't see how LifecycleExplainProject could be relevant to the response. I believe the correct type is:

export class Response {
  body: {
-    indices: Dictionary<IndexName, LifecycleExplain> | LifecycleExplainProject
+    indices: Dictionary<IndexName, LifecycleExplain>
  }
}

Add URL Annotation

🚀 Feature Proposal

An annotation that associates a URL with a property and/or type, which is then reflected in the spec.

Motivation

Additional metadata to improve the developer experience, e.g. by linking to a resource that provides more context.

Example

in TypeScript:

/** @url https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#byte-units */
type ByteSize = string

in the spec:

{
    "name": "data.input_bytes",
    "required": false,
    "type": {
        "kind": "instance_of",
        "type": {
            "name": "ByteSize",
            "namespace": "internal",
            "url": "https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#byte-units"
        }
    }
}

Missing TS types for hidden setting and data stream property on Get indices response

🐛 Wrong type

The indices.get response's indices are missing two properties:

  • settings.index.hidden
  • data_stream

Definition

If I attempt to access the hidden property of an index's settings, e.g. index.settings.index.hidden, TS complains:

Property 'index' does not exist on type 'IndicesIndexSettings | IndicesIndexStatePrefixedSettings'.
  Property 'index' does not exist on type 'IndicesIndexSettings'. ts(2339)

However, this property exists and is accessible in the response. It looks like it's defined as an alias here: https://github.com/elastic/elasticsearch-specification/blob/main/specification/indices/_types/IndexSettings.ts#L74. My TS fu is not strong enough to suggest a fix for this.

If I attempt to access the data_stream property of an index, e.g. index.data_stream, TS complains:

Property 'data_stream' does not exist on type 'IndicesIndexState'.ts(2339)

I believe we need to update IndexState to have this property, defined here: https://github.com/elastic/elasticsearch-specification/blob/main/specification/indices/_types/IndexState.ts.

Define all the endpoints as TypeScript classes

Currently, the endpoints in the model are generated from the rest-api-spec. We should define the endpoints directly in TypeScript, so we can connect the request and response definitions to the endpoint without relying on naming conventions and js doc tags.

For example:

class MyEndpoint extends Endpoint {
  name: 'my-endpoint'
  description: 'description'
  docUrl: 'https://elastic.co'
  request: MyEndpointRequest
  requestBodyRequired: boolean
  response: MyEndpointResponse
  urls: [{
    path: '/'
    methods: ['GET']
  }]
  since: '7.10.0'
  stability: 'stable'
  visibility: 'public'
  accept: ['application/json']
  contentType: ['application/json']
}

interface MyEndpointRequest {}

class MyEndpointResponse

We can create a script with ts-morph to keep the rest-api-spec in sync.
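For context, a minimal sketch of what the Endpoint base type might look like (every name here is assumed from the example above, not an existing definition):

```typescript
// Hypothetical base type that concrete endpoint definitions would extend
interface UrlTemplate {
  path: string
  methods: string[]
}

abstract class Endpoint {
  abstract name: string
  abstract since: string
  abstract stability: 'stable' | 'beta' | 'experimental'
  abstract visibility: 'public' | 'private' | 'feature_flag'
  abstract urls: UrlTemplate[]
  // Sensible defaults; concrete endpoints override only what differs
  requestBodyRequired = false
  accept = ['application/json']
  contentType = ['application/json']
}
```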

ML job reset endpoint

A new endpoint has been added for resetting anomaly detection jobs:
elastic/elasticsearch#73908

POST _ml/anomaly_detectors/<job_id>/_reset

I imagine the request type will look something like this:

export interface MlResetJobRequest extends RequestBase {
  job_id: Id
  wait_for_completion?: boolean
}

MlJob will also need updating with the new properties added as part of this reset work:

 class MlJob {
...
+ blocked?: MlJobBlocked
 }

where MlJobBlocked should look like this:

export class MlJobBlocked {
  reason: 'delete' | 'revert' | 'reset'
  task_id?: string
}

Add unsigned numbers and custom type

  • add types for unsigned integer uint and unsigned long ulong
  • add custom type for Version of type ulong and apply it to all *_version fields
  • add custom type for SeqNo of type long and apply it to all _seq_no fields

Add preamble to spec

Please add the following preamble to the generated output/spec:

{
  "info": {
    "version": "x.y.z",
    "title": "Elasticsearch Request & Response Specification",
    "license": {
      "name": "Apache 2.0",
      "url": "https://github.com/elastic/elastic-client-generator/blob/master/LICENSE"
    }
  }
}

`server_default` annotation unable to accept `@` value

The @server_default annotation is unable to accept a value whose first character is another @. For example, in EQL you can override the timestamp_field, whose default value is @timestamp, see docs.

The compiler does not accept any variant, such as "@timestamp" or @timestamp.

Ref: #251

Implement Versioned Branching Strategy

Currently we are not distinguishing between Elasticsearch's master and 7.x branches. The goal is to follow the same branching strategy.

Blocking Tasks

  • Merge all open PRs

New Branches

  • Create 7.x branch from main
  • Create 7.14 branch from 7.x

Tooling & Infra

  • Investigate & update the scripts to be version aware - #503
  • Add the backporting bot and ensure only compiler changes are backport-able - #502
  • Read the version number & build hash from the spec artefact and write these values to the schema.json _info object - #501
  • This action should be updated as well.

TypeError for update_by_query_rethrottle response validation

Running the command ./run-validations.sh --api update_by_query_rethrottle --response tells me there are no tests, but it also fails with a TypeError.

Iterating over spec, found API's: 350
✔ Validated 1 endpoints (0 recordings tested, 0 skipped)
TypeError: Cannot read property 'length' of undefined
    at /home/stevejgordon/clients-flight-recorder/scripts/types-validator/index.js:146:36
    at Array.map (<anonymous>)
    at generate (/home/stevejgordon/clients-flight-recorder/scripts/types-validator/index.js:140:37)
    at Object.<anonymous> (/home/stevejgordon/clients-flight-recorder/scripts/types-validator/index.js:178:1)
    at Module._compile (internal/modules/cjs/loader.js:1063:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1092:10)
    at Module.load (internal/modules/cjs/loader.js:928:32)
    at Function.Module._load (internal/modules/cjs/loader.js:769:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:72:12)
    at internal/main/run_main_module.js:17:47

Additional model validations

Validations we should add to the model parser/translator/exporter:

  • Verify js_doc annotations: both their names (which must be known) and their values (e.g. the @since version format)
  • Verify that the path and query parameters in TS requests match those in the json rest spec.
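As a concrete example of the first item, the @since value could be checked against an x.y.z pattern (a sketch; the real validator would live in the compiler):

```typescript
// Sketch: reject @since annotations that are not in x.y.z form
const SINCE_FORMAT = /^\d+\.\d+\.\d+$/

function validateSince(value: string): void {
  if (!SINCE_FORMAT.test(value)) {
    throw new Error(`@since must be an x.y.z version, got '${value}'`)
  }
}
```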

Types.ts: Cannot find name ResponseBase.

After pulling in the latest commits, when validating an endpoint with tests (even ones previously working on master), I'm getting an error from the generated types.ts.

✔ Validated 1 endpoints (69 recordings tested, 0 skipped)
  node_modules/elasticsearch-client-specification/output/typescript/types.ts:2787:82
  ✖  2787:82  Cannot find name ResponseBase.
  1 error

That line is also flagged in VS code:

export interface DictionaryResponseBase<TKey = unknown, TValue = unknown> extends ResponseBase {

Support for enums which serialize as a numeric value

Right now, most enums are serialized as strings, where the member name is the string used in the JSON.

I'm looking at GeoTilePrecision, which right now is defined as export type GeoTilePrecision = number. In the .NET client we define this as an enum, since the range is limited to 0-29 and most values have a specific meaning, which we include in the IntelliSense documentation, e.g.

public enum GeoTilePrecision
{
    /// <summary>
    /// Whole world
    /// </summary>
    Precision0 = 0,
    Precision1 = 1,
    /// <summary>
    /// Subcontinental area
    /// </summary>
    Precision2 = 2,
    ...
}

This provides a way to help consumers choose a valid value, while also providing a descriptive tooltip for values where possible.

I'd like to encode this in the spec, but I believe that right now it would assume the value is serialized to/from a string. It's a similar story for GeoHashPrecision.
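In TypeScript terms, the request is for number-valued enum members; unlike a string union, a numeric enum round-trips through JSON as a number (member names below are illustrative, adapted from the .NET example above):

```typescript
// A numeric enum serializes to its numeric value in JSON
enum GeoTilePrecision {
  WholeWorld = 0,
  Precision1 = 1,
  SubcontinentalArea = 2,
}

const payload = { precision: GeoTilePrecision.SubcontinentalArea }
const json = JSON.stringify(payload) // '{"precision":2}'
```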

Auto-generate kibana route schemas

ML provides Kibana endpoint wrappers for every ML Elasticsearch endpoint.
This means we need to maintain a collection of schemas to validate the endpoint arguments. These schemas are basically identical to the types supplied by the Elasticsearch client.

Would it be possible to have the endpoint request schemas auto-generated along with the types?

Having them autogenerated should also reduce the risk of discrepancies between the schemas and types when the client is updated.

An example of one of the ML schema files:
https://github.com/elastic/kibana/blob/master/x-pack/plugins/ml/server/routes/schemas/anomaly_detectors_schema.ts

Rename nullable to required in InterfaceProperty

We should rename nullable to required in the InterfaceProperty definition, as the two words are semantically very different:
nullable means that a value can be either its declared type or null, while required indicates whether a value must be defined at all.

  export class InterfaceProperty {
    name: string
    type: InstanceOf
-   nullable: boolean
+   required: boolean

The version can't be automatically updated due to branch protection

Unfortunately, the update-model-info.yml action can't be executed successfully because of the branch protection rules we currently have in place.

remote: error: GH006: Protected branch update failed for refs/heads/main.        
remote: error: 2 of 2 required status checks are expected. At least 1 approving review is required by reviewers with write access.   

Not sure if there is any way to solve this other than going through a PR each time or disabling branch protection. I'll investigate.

Related: #499

Add `visibility` property to the canonical spec

🚀 Feature Proposal

The visibility property has landed in the rest-api-spec (see bulk, for example) and generates value for every consumer of the canonical spec. Please add that property to the endpoint definition, next to stability in the tree.

Motivation

as previously described.

Example

{
      ...
      "since": "7.7.0",
      "stability": "stable",
      "visbility": "public",
      "urls": [
        ...
      ]
    },

Allow `AdditionalProperties` behavior on parent class.

I would like to update the definition of DecayFunction by moving the decay function configuration, represented with an AdditionalProperties behavior, to DecayFunctionBase instead of each leaf class. This makes the definition more correct, as we can access the DecayPlacement fields from a DecayFunction instance, and simpler, by factoring the behavior into the common parent class.

export class DecayFunctionBase<TOrigin, TScale> extends ScoreFunctionBase
  implements AdditionalProperty<Field, DecayPlacement<TOrigin, TScale>> {
  multi_value_mode?: MultiValueMode
}

export class NumericDecayFunction extends DecayFunctionBase<double, double> {}
export class DateDecayFunction extends DecayFunctionBase<DateMath, Time> {}
export class GeoDecayFunction extends DecayFunctionBase<GeoLocation, Distance> {}

export type DecayFunction =
  | DateDecayFunction
  | NumericDecayFunction
  | GeoDecayFunction

However, when doing so, the TypeScript generator doesn't generate the behavior on the DecayFunctionBase parent class but only on the child classes, and it uses the open generic parameter names instead of their actual values (see <QueryDslTOrigin, QueryDslTScale> below):

export interface QueryDslDateDecayFunctionKeys extends QueryDslDecayFunctionBase<DateMath, Time> {
}
export type QueryDslDateDecayFunction = QueryDslDateDecayFunctionKeys |
    { [property: string]: QueryDslDecayPlacement<QueryDslTOrigin, QueryDslTScale> }

export interface QueryDslDateDistanceFeatureQuery extends QueryDslDistanceFeatureQueryBase<DateMath, Time> {
}

export type QueryDslDecayFunction = QueryDslDateDecayFunction | QueryDslNumericDecayFunction | QueryDslGeoDecayFunction

export interface QueryDslDecayFunctionBase<TOrigin = unknown, TScale = unknown> extends QueryDslScoreFunctionBase {
  multi_value_mode?: QueryDslMultiValueMode
}

I tried to fix the TS generator, but this is a rather complex area and I did not manage to find a fix. Help needed!
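For reference, the expected output would substitute the concrete type arguments at each instantiation site, roughly like this (a hand-written illustration with stub types, not real generator output):

```typescript
// Stub types so the snippet stands alone
type DateMath = string
type Time = string
interface QueryDslDecayPlacement<TOrigin = unknown, TScale = unknown> {
  origin?: TOrigin
  scale?: TScale
  decay?: number
}
interface QueryDslDecayFunctionBase<TOrigin = unknown, TScale = unknown> {
  multi_value_mode?: string
}

// Expected: the behavior flows down from the base class, with the open
// generic parameters replaced by their actual arguments
interface QueryDslDateDecayFunctionKeys extends QueryDslDecayFunctionBase<DateMath, Time> {}
type QueryDslDateDecayFunction = QueryDslDateDecayFunctionKeys |
  { [property: string]: QueryDslDecayPlacement<DateMath, Time> }
```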

ML Additional missing count types

Missing properties in ML types

 class TimingStats {
...
+  total_bucket_processing_time_ms: number
 }
 class DatafeedTimingStats {
...
+  average_search_time_per_bucket_ms: number
 }
 class Job {
...
-  groups: Array<string>
+  groups?: Array<string>

-  model_plot: ModelPlotConfig
 }
 class Datafeed {
...
-  chunking_config: ChunkingConfig
+  chunking_config?: ChunkingConfig
 }
 class DetectionRule {
...
-  conditions: Array<RuleCondition>
+  conditions?: Array<RuleCondition>
 }
 class PutJobRequest {
...
-  analysis_config?: AnalysisConfig
+  analysis_config: AnalysisConfig
 }
 class ModelSizeStats {
...
+  model_bytes_exceeded: number
+  model_bytes_memory_limit: number
+  peak_model_bytes?: number
+  categorized_doc_count: number
+  total_category_count: number
+  frequent_category_count: number
+  rare_category_count: number
+  dead_category_count: number
+  categorization_status: 'ok' | 'warn'
 }
 class JobForecastStatistics {
...
+  forecasted_jobs: number
 }

Question regarding IndicesOptions

Should all the properties in IndicesOptions be optional?

TypeError: Cannot read property 'map' of undefined

As per #92, if the DictionaryDecompounderTokenFilter is included as part of the TokenFilter type, it causes the following exception. Once this issue is fixed, it can be uncommented from TokenFilter in the TokenFilterBase.ts file.

/home/stevejgordon/elastic-client-generator/specification/src/specification/type-reader.ts:268
    const args: ts.Node[] = t.typeArguments.map(n => n as ts.Node)
                                            ^
TypeError: Cannot read property 'map' of undefined
    at InterfaceVisitor.createDictionary (/home/stevejgordon/elastic-client-generator/specification/src/specification/type-reader.ts:268:45)
    at InterfaceVisitor.visitTypeReference (/home/stevejgordon/elastic-client-generator/specification/src/specification/type-reader.ts:247:103)
    at InterfaceVisitor.visitTypeNode (/home/stevejgordon/elastic-client-generator/specification/src/specification/type-reader.ts:197:54)
    at /home/stevejgordon/elastic-client-generator/specification/src/specification/type-reader.ts:215:59
    at Array.map (<anonymous>)
    at InterfaceVisitor.visitUnionType (/home/stevejgordon/elastic-client-generator/specification/src/specification/type-reader.ts:215:42)
    at InterfaceVisitor.visitTypeNode (/home/stevejgordon/elastic-client-generator/specification/src/specification/type-reader.ts:203:49)
    at InterfaceVisitor.visit (/home/stevejgordon/elastic-client-generator/specification/src/specification/type-reader.ts:73:32)
    at TypeReader.visit (/home/stevejgordon/elastic-client-generator/specification/src/specification/type-reader.ts:354:33)
    at /home/stevejgordon/elastic-client-generator/specification/src/specification/type-reader.ts:358:56
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] generate-schema: `ts-node src/metamodel_generate.ts`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] generate-schema script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
npm ERR! A complete log of this run can be found in:
npm ERR!     /home/stevejgordon/.npm/_logs/2021-01-26T01_17_25_913Z-debug.log
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] compile:canonical-json: `npm run generate-schema --prefix specification`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] compile:canonical-json script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
npm WARN Local package.json exists, but node_modules missing, did you mean to install?
npm ERR! A complete log of this run can be found in:
npm ERR!     /home/stevejgordon/.npm/_logs/2021-01-26T01_17_25_930Z-debug.log

msearch api body asking for object but examples are using an array

💬 Questions and Help

The msearch API request defines a body property typed as an object with nested operations, but the example docs show the operations passed into the body as an array of objects.

Curious which form is the expected format for the request? Do the docs need to be updated? Is the request typed incorrectly?

Thanks!
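For reference, the documented examples send the msearch body as an ndjson stream of alternating header and body lines, which clients usually model as a flat array (the objects below are adapted from the docs, not from the spec):

```typescript
// Alternating search-header / search-body pairs, one search per pair
const msearchBody: Array<Record<string, unknown>> = [
  { index: 'my-index' },                               // header
  { query: { match_all: {} } },                        // body
  { index: 'my-other-index' },                         // header
  { query: { match: { message: 'this is a test' } } }, // body
]
```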


Incorrect TS types for `GET _snapshot/<repo>/<snapshot>` response

🐛 Bug Report

The TS type for calling client.snapshot.get is associated with the following type:

https://github.com/elastic/elasticsearch-js/blob/7a3cfe7e23b7a81c563f4deaa2990fed4c321ee9/api/types.d.ts#L6069

When looking at the actual response from the cluster it is of the form:

{
  "responses": [
    {
      "repository": "my-repo",
      "snapshots": [ /* list of snapshots... */ ]
    }
  ]
}

The TS type is missing the responses array of objects in which snapshots is nested.

Actual request:

const response = await client.snapshot.get({
  repository,
  snapshot: '_all',
  ignore_unavailable: true, // Allow request to succeed even if some snapshots are unavailable.
});

Expected behavior

The TS types should match the shape of the response from the cluster.

Your Environment

  • @elastic/elasticsearch npm:@elastic/elasticsearch-canary@^8.0.0-canary.4
