apollographql / apollo-tracing

A GraphQL extension for performance tracing

Archival

This repo was archived by the Apollo Security team on 2023-05-26

Apollo Tracing

[2022-02-16] Notice: This tracing format was designed to provide tracing data from graphs to the Apollo Engine engineproxy, a project which was retired in 2018. We learned that a trace format which describes resolvers with a flat list of paths (with no way to aggregate similar nodes or repeated path prefixes) was inefficient enough to have real impacts on server performance, and so we have not been actively developing consumers or producers of this format for several years. Apollo Server (as of v3) no longer ships with support for producing this format, and engineproxy which consumed it is no longer supported. We suggest that people looking for formats for describing performance traces consider either the Apollo Studio protobuf-based trace format or a more generic format such as OpenTelemetry.

Apollo Tracing is a GraphQL extension for performance tracing.

Thanks to the community, Apollo Tracing already works with most popular GraphQL server libraries, including Node, Ruby, Scala, Java, Elixir, Go and .NET, and it enables you to easily get resolver-level performance information as part of a GraphQL response.

Apollo Tracing works by including data in the extensions field of the GraphQL response, which is reserved by the GraphQL spec for extra information that a server wants to return. That way, you have access to performance traces alongside the data returned by your query.

It’s already supported by Apollo Engine, and we’re excited to see what other kinds of integrations people can build on top of this format.

We think this format is broadly useful, and we’d love to work with you to add support for it to your tools of choice. If you’re looking for a first idea, we especially think it would be great to see support for Apollo Tracing in GraphiQL and the Apollo Client developer tools!

If you’re interested in working on support for other GraphQL servers, or integrations with more tools, please get in touch on the #apollo-tracing channel on the Apollo Slack.

Supported GraphQL Servers

Response Format

The GraphQL specification allows servers to include additional information as part of the response under an extensions key:

The response map may also contain an entry with key extensions. This entry, if set, must have a map as its value. This entry is reserved for implementors to extend the protocol however they see fit, and hence there are no additional restrictions on its contents.

Apollo Tracing exposes trace data for an individual request under a tracing key in extensions:

{
  "data": <>,
  "errors": <>,
  "extensions": {
    "tracing": {
      "version": 1,
      "startTime": <>,
      "endTime": <>,
      "duration": <>,
      "parsing": {
        "startOffset": <>,
        "duration": <>
      },
      "validation": {
        "startOffset": <>,
        "duration": <>
      },
      "execution": {
        "resolvers": [
          {
            "path": [<>, ...],
            "parentType": <>,
            "fieldName": <>,
            "returnType": <>,
            "startOffset": <>,
            "duration": <>
          },
          ...
        ]
      }
    }
  }
}

Collected data

The startTime and endTime of the request are timestamps in RFC 3339 format with at least millisecond but up to nanosecond precision (depending on platform support).

Some more details (adapted from the description of the JSON encoding of Protobuf's Timestamp type):

A timestamp is encoded as a string in the RFC 3339 format. That is, the format is "{year}-{month}-{day}T{hour}:{min}:{sec}[.{frac_sec}]Z" where {year} is always expressed using four digits while {month}, {day}, {hour}, {min}, and {sec} are zero-padded to two digits each. The fractional seconds, which can go up to 9 digits (i.e. up to 1 nanosecond resolution), are optional. The "Z" suffix indicates the timezone ("UTC"); the timezone is required, though only UTC (as indicated by "Z") is presently supported. For example, "2017-01-15T01:30:15.01Z" encodes 15.01 seconds past 01:30 UTC on January 15, 2017. In JavaScript, one can convert a Date object to this format using the standard toISOString() method. In Python, a standard datetime.datetime object can be converted to this format using strftime with the time format spec '%Y-%m-%dT%H:%M:%S.%fZ'. Likewise, in Java, one can use the Joda Time's ISODateTimeFormat.dateTime() to obtain a formatter capable of generating timestamps in this format.
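
As a small illustration of the JavaScript conversion mentioned above, the sketch below formats a Date with toISOString() and checks the result against the timestamp layout described here. The helper name isTracingTimestamp and its regular expression are assumptions made for this example and are not part of the format.

```javascript
// Hypothetical helper: checks the "{year}-{month}-{day}T{hour}:{min}:{sec}[.{frac_sec}]Z" layout.
// It validates the shape only, not calendar correctness (month 13 would still match).
function isTracingTimestamp(value) {
  return /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{1,9})?Z$/.test(value);
}

// Date carries millisecond precision, so toISOString() always emits three fractional digits.
const startTime = new Date(Date.UTC(2017, 0, 15, 1, 30, 15, 10)).toISOString();

console.log(startTime); // "2017-01-15T01:30:15.010Z"
console.log(isTracingTimestamp(startTime)); // true
console.log(isTracingTimestamp("2017-01-15 01:30:15")); // false
```

Because Date stops at milliseconds, a server that wants finer precision in startTime and endTime would need another time source.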

Resolver timings should be collected in nanoseconds using a monotonic clock like process.hrtime() in Node.js or System.nanoTime() in Java.

The limited precision of numbers in JavaScript is not an issue for our purposes, because Number.MAX_SAFE_INTEGER nanoseconds is about 104 days, which should be plenty even for long running requests!
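
The 104-day figure can be checked with a few lines of arithmetic. The constant name NANOS_PER_DAY is only a label chosen for this sketch.

```javascript
// 86,400 seconds per day, 1e9 nanoseconds per second.
const NANOS_PER_DAY = 24 * 60 * 60 * 1e9;

// Largest integer a JavaScript Number represents exactly, read as a nanosecond count.
const maxDays = Number.MAX_SAFE_INTEGER / NANOS_PER_DAY;

console.log(Math.floor(maxDays)); // 104
```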

The server should keep the start time of the request both as wall time, and as monotonic time to calculate startOffsets and durations (for the request as a whole and for individual resolver calls, see below).
The duration of a request is in nanoseconds, relative to the request start, as an integer.
The startOffset of parsing, validation, or a resolver call is in nanoseconds, relative to the request start, as an integer.
The duration of parsing, validation, or a resolver call is in nanoseconds, relative to the start of that phase or resolver call, as an integer.
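
The rules above can be sketched for Node.js as follows. This is a minimal illustration, not Apollo's implementation: startRequestTrace, timePhase and finish are hypothetical names, and deriving endTime from the wall-clock start plus the monotonic duration is a choice made for this example.

```javascript
// Hypothetical helper, not an Apollo API: captures both clocks once per request.
function startRequestTrace() {
  const wallStart = new Date();              // wall time, used only for startTime and endTime
  const monoStart = process.hrtime.bigint(); // monotonic nanoseconds, used for offsets and durations

  // Nanoseconds elapsed since the request started, as an integer Number.
  const sinceStart = () => Number(process.hrtime.bigint() - monoStart);

  return {
    // Time a synchronous phase such as parsing or validation.
    timePhase(fn) {
      const startOffset = sinceStart();
      const result = fn();
      const duration = sinceStart() - startOffset;
      return { result, timing: { startOffset, duration } };
    },
    // Build the request-level fields once the response is ready.
    finish() {
      const duration = sinceStart();
      const endTime = new Date(wallStart.getTime() + duration / 1e6);
      return {
        version: 1,
        startTime: wallStart.toISOString(),
        endTime: endTime.toISOString(),
        duration,
      };
    },
  };
}

const trace = startRequestTrace();
// JSON.parse stands in for a real GraphQL parse step.
const parsing = trace.timePhase(() => JSON.parse('{"query":"{ hero { name } }"}'));
const summary = trace.finish();
console.log(parsing.timing, summary);
```

Keeping the wall clock out of the offset and duration arithmetic means a system clock adjustment during the request cannot produce negative or inflated timings.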

The end of a resolver call represents the return of a value for a field, but it does not include resolving subfields. If a resolver returns an asynchronous value such as a promise, however, the resolver call isn't considered to have ended until that value has been resolved.
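
One way to honor this rule is to wrap each resolver and stop the clock only after the returned value has settled. The sketch below assumes Node.js; traceResolver and its arguments are hypothetical names, and only the timing fields are recorded to keep it short.

```javascript
// Hypothetical wrapper: records one entry per resolver call into `resolvers`.
// `now` must return monotonic nanoseconds since the request start.
function traceResolver(resolve, resolvers, now) {
  return async function tracedResolver(...args) {
    const startOffset = now();
    try {
      // Awaiting covers both plain values and promises returned by the resolver.
      return await resolve(...args);
    } finally {
      // Runs once the value has settled; subfields resolve later and are not included.
      resolvers.push({ startOffset, duration: now() - startOffset });
    }
  };
}

const requestStart = process.hrtime.bigint();
const now = () => Number(process.hrtime.bigint() - requestStart);
const resolvers = [];

// A resolver that answers asynchronously after roughly 5 ms.
const hero = traceResolver(
  () => new Promise((done) => setTimeout(() => done({ name: "R2-D2" }), 5)),
  resolvers,
  now
);

const pending = hero().then((value) => {
  console.log(value.name, resolvers[0]);
  return value;
});
```

Note that this wrapper turns every resolver into a promise-returning function, including synchronous ones. A production tracer would probably check whether the result is a promise before awaiting it.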

The path is the response path of the current resolver in a format similar to the error result format specified in the GraphQL specification:

This field should be a list of path segments starting at the root of the response and ending with the field associated with the error. Path segments that represent fields should be strings, and path segments that represent list indices should be 0‐indexed integers. If the error happens in an aliased field, the path to the error should use the aliased name, since it represents a path in the response, not in the query.
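
In graphql-js, the info argument carries the response path as a linked list rather than an array. Assuming nodes of the shape { prev, key }, a conversion to the array form used by the tracing format could look like this; pathToArray is a hypothetical name for the sketch.

```javascript
// Assumes a linked list of { prev, key } nodes, ending with prev === undefined.
function pathToArray(path) {
  const segments = [];
  for (let node = path; node; node = node.prev) {
    segments.push(node.key); // field names stay strings, list indices stay integers
  }
  return segments.reverse(); // the list is linked leaf-to-root, the format wants root-to-leaf
}

// Path of the first friend's name in the example query.
const namePath = {
  key: "name",
  prev: { key: 0, prev: { key: "friends", prev: { key: "hero", prev: undefined } } },
};

console.log(pathToArray(namePath)); // [ 'hero', 'friends', 0, 'name' ]
```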

parentType, fieldName and returnType are strings that reflect the runtime type information usually passed to resolvers (e.g. in the info argument for graphql-js).

Example

query {
  hero {
    name
    friends {
      name
    }
  }
}

{
  "data": {
    "hero": {
      "name": "R2-D2",
      "friends": [
        {
          "name": "Luke Skywalker"
        },
        {
          "name": "Han Solo"
        },
        {
          "name": "Leia Organa"
        }
      ]
    }
  },
  "extensions": {
    "tracing": {
      "version": 1,
      "startTime": "2017-07-28T14:20:32.106Z",
      "endTime": "2017-07-28T14:20:32.109Z",
      "duration": 2694443,
      "parsing": {
        "startOffset": 34953,
        "duration": 351736
      },
      "validation": {
        "startOffset": 412349,
        "duration": 670107
      },
      "execution": {
        "resolvers": [
          {
            "path": [
              "hero"
            ],
            "parentType": "Query",
            "fieldName": "hero",
            "returnType": "Character",
            "startOffset": 1172456,
            "duration": 215657
          },
          {
            "path": [
              "hero",
              "name"
            ],
            "parentType": "Droid",
            "fieldName": "name",
            "returnType": "String!",
            "startOffset": 1903307,
            "duration": 73098
          },
          {
            "path": [
              "hero",
              "friends"
            ],
            "parentType": "Droid",
            "fieldName": "friends",
            "returnType": "[Character]",
            "startOffset": 1992644,
            "duration": 522178
          },
          {
            "path": [
              "hero",
              "friends",
              0,
              "name"
            ],
            "parentType": "Human",
            "fieldName": "name",
            "returnType": "String!",
            "startOffset": 2445097,
            "duration": 18902
          },
          {
            "path": [
              "hero",
              "friends",
              1,
              "name"
            ],
            "parentType": "Human",
            "fieldName": "name",
            "returnType": "String!",
            "startOffset": 2488750,
            "duration": 2141
          },
          {
            "path": [
              "hero",
              "friends",
              2,
              "name"
            ],
            "parentType": "Human",
            "fieldName": "name",
            "returnType": "String!",
            "startOffset": 2501461,
            "duration": 1657
          }
        ]
      }
    }
  }
}
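
A consumer of this format might start by ranking resolver calls by duration. The sketch below does that for an abbreviated copy of the example response; slowestResolvers is a hypothetical helper, not part of any Apollo library.

```javascript
// Hypothetical helper: returns the `limit` slowest resolver calls, slowest first.
function slowestResolvers(response, limit = 3) {
  const tracing = response.extensions && response.extensions.tracing;
  if (!tracing || !tracing.execution) return [];
  return [...tracing.execution.resolvers]
    .sort((a, b) => b.duration - a.duration)
    .slice(0, limit)
    .map((r) => ({ path: r.path.join("."), duration: r.duration }));
}

// Abbreviated copy of the example response (path and timing fields only).
const response = {
  extensions: {
    tracing: {
      version: 1,
      duration: 2694443,
      execution: {
        resolvers: [
          { path: ["hero"], startOffset: 1172456, duration: 215657 },
          { path: ["hero", "name"], startOffset: 1903307, duration: 73098 },
          { path: ["hero", "friends"], startOffset: 1992644, duration: 522178 },
          { path: ["hero", "friends", 0, "name"], startOffset: 2445097, duration: 18902 },
          { path: ["hero", "friends", 1, "name"], startOffset: 2488750, duration: 2141 },
          { path: ["hero", "friends", 2, "name"], startOffset: 2501461, duration: 1657 },
        ],
      },
    },
  },
};

console.log(slowestResolvers(response, 2));
// [ { path: 'hero.friends', duration: 522178 }, { path: 'hero', duration: 215657 } ]
```

Because the format is a flat list, any grouping by field or by path prefix has to be done by the consumer in this way.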

Compression

We recommend that people enable compression in their GraphQL server, because the tracing format adds to the response size, but compresses well.

Although we tried other approaches to make the tracing format more compact (including deduplication of keys, common items, and structure) this complicated generating and interpreting trace data, and didn't bring the size down as much as compressing the entire HTTP response body does.

In our tests on Node.js, the processing overhead of compression is less than the overhead of sending additional bytes for an uncompressed response. But more test results from different server environments are definitely welcome, so we can help people make an informed decision about this.
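
To get a rough feel for how well a trace compresses, one can gzip a synthetic payload with Node's built-in zlib module. The payload below is made up for illustration and is not one of the measurements mentioned above; real ratios depend on the schema and the query.

```javascript
const zlib = require("zlib");

// Synthetic payload in the shape of the tracing format; all numbers are invented.
const resolvers = Array.from({ length: 200 }, (_, i) => ({
  path: ["hero", "friends", i, "name"],
  parentType: "Human",
  fieldName: "name",
  returnType: "String!",
  startOffset: 2445097 + i * 12711,
  duration: 1500 + (i % 7) * 311,
}));
const body = JSON.stringify({
  extensions: { tracing: { version: 1, execution: { resolvers } } },
});

const rawBytes = Buffer.byteLength(body);
const gzipBytes = zlib.gzipSync(body).length;

console.log({ rawBytes, gzipBytes, ratio: (gzipBytes / rawBytes).toFixed(2) });
```

The repeated keys and type names are what make the format verbose, and they are also what a general-purpose compressor handles well.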

apollo-tracing's Issues

Link to Node.js is broken

The Node.js link listed under the Supported GraphQL Servers section appears to be broken. Here is the link: https://github.com/apollographql/apollo-tracing#supported-graphql-servers

Add an "extra" field

Awesome! This spec looks pretty sensible, from a first impression.

I do have one question:

If I wanted to send extra analytics information not accounted for in this spec, is there a story for achieving that? My example is: a count of database queries per resolver.

:resolution not found in: %Absinthe.Blueprint

I'm not entirely sure whether this is due to apollo-tracing, but I'm receiving a ":resolution not found in: %Absinthe.Blueprint" error. Here are my GraphQL-related deps:

{:absinthe, "~> 1.4.0-rc.3", override: true},
{:absinthe_plug, "~> 1.4.0-rc.1"},
{:absinthe_ecto, "~> 0.1.2"},
{:absinthe_phoenix, github: "absinthe-graphql/absinthe_phoenix"},
{:apollo_tracing, git: "https://github.com/sikanhe/apollo-tracing-elixir.git"},

If I remove tracing from the pipeline, no errors (which makes me think the error originates from this module). It can be hard to track down since I'm a little fuzzy on how all the related modules interrelate. Also note, I'm using sockets as well (which don't pass through tracing), and that works.

I believe I had this working with a previous version of absinthe and/or absinthe_plug, but I'd rather keep up with the core module versions as tracing isn't mission critical.

Thoughts?

[question] Tracing validation errors

/label question

Is there a way to have apollo-tracing log validation errors?

See https://github.com/apollographql/apollo-server/blob/master/packages/apollo-tracing/src/index.ts#L54

What's necessary to support that?

Expose Extensions in Schema Introspection

For tools like GraphiQL to make use of Extensions, I think there needs to be some Extension introspection of some sort, so that the extensions that exist can be known by tools like GraphiQL.

I envision GraphiQL having some interface, like a multi-select, that lets you choose which extensions you want to include in your request.

But for that to happen, GraphiQL would have to have knowledge of registered extensions, so that it could build out a Select with the options.

This ties in with #5, where a standard for dictating which extensions to include or exclude could be handled in the request in some fashion. (Of course, server-side rules could be configured to have certain extensions always on or always off, regardless of the client's request.)

Add sections for query parsing and validation

The spec as I see it offers plenty of support for the individual data fetchers, but sometimes a significant proportion of the query time is taken up by parsing the query into an AST and validating it.

Could these areas be added to the specification? I might imagine something similar to this (note the "parse" and "validate" keys):

{
  "data": ...,
  "errors": ...,
  "extensions": {
    "tracing": {
      "version": 1,
      "startTime": ...,
      "endTime": ...,
      "duration": ...,
      "execution": {
        "parse": {
          "startOffset": ...,
          "duration": ...,
        },
        "validate": {
          "startOffset": ...,
          "duration": ...,
        },
        "resolvers": [
          {
            "path": [..., ...],
            "parentType": ...,
            "fieldName": ...,
            "returnType": ...,
            "startOffset": ...,
            "duration": ...,
          },
          ...
        ]
      }
    }
  }
}

Spec question - are the tracing times just for field fetching or for complete field resolution?

There are 2 stages to field fetching in graphql - the raw field resolver stage and then the complete field resolution of that field and its sub values.

Given a user field query like this

    user(id:1234) {
        name
        friends {
            name
        }
    }

I just wanted to confirm whether the tracing times cover only the "fetch" of the user(id:1234) object, or the fetch plus the resolution times of the subfields.

I am guessing it's just the fetch time. The spec currently says:

The duration of a resolver call is relative to the resolver call start.

But it's not explicit about what that covers. Can we please make it more explicit?

module init issue in combination with `aws xray sdk`

This seems to be a new issue: I have used the two together for a while now, but the behaviour suddenly changed.

When I am using graphql-server-express in combination with aws-xray-sdk on AWS Lambda I am now getting a module init error for the apollo-tracing module and something related to Promise.resolve() & type error while requiring apollo-tracing.

If I am disabling the aws-xray-sdk, it starts working again.

Is that a known issue?

Since I am more interested in X-Ray than in apollo-tracing right now, and apollo-tracing is still experimental, is there a way to disable this module within graphql-server-express?

I am using:
"aws-sdk": "^2.121.0",
"aws-xray-sdk": "^1.1.4",
"aws-serverless-express": "^3.0.2",
"express": "^4.15.4",
"graphql-server-express": "^1.1.2"

warning: update peer dependencies to support graphql 0.12.x

[email protected] requires a peer of graphql@^0.10.0 || ^0.11.0 but none is installed. You must install peer dependencies yourself.

Include/Exclude in the response?

Has there been any discussion on how the consumer can specify whether to include tracing in the response?

My initial thought for my implementation would be to check the request headers to see if tracing is being requested or not.
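
A header-based opt-in along those lines could be sketched as below. The header name "x-apollo-tracing", the value "1", and both function names are assumptions made for this illustration; nothing in the format itself defines how a client asks for tracing.

```javascript
// Hypothetical opt-in check; header name and value are assumptions for this sketch.
function wantsTracing(headers) {
  // Node.js lower-cases incoming header names, so a lower-case lookup is enough there.
  return headers["x-apollo-tracing"] === "1";
}

// Attach tracing to the response only when the client asked for it.
function buildResponse(result, tracing, headers) {
  if (!wantsTracing(headers)) return result;
  return { ...result, extensions: { ...result.extensions, tracing } };
}

const result = { data: { hero: { name: "R2-D2" } } };
const tracing = { version: 1, duration: 2694443 };

console.log(buildResponse(result, tracing, {}));
console.log(buildResponse(result, tracing, { "x-apollo-tracing": "1" }));
```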

Next.js Support

In moving from Express to the Next.js API server, I'm losing support for Apollo Studio. Perhaps apollo-server-micro doesn't support tracing? Are there any plans to add support for the engine to report the schema via apollo-server-micro served through the Next.js API routes?

[feature] Tracing Errors

Errors from the resolvers are not traced.

const books = [
...
];

const resolvers = {
  Query: {
    books: () => books,
    reservedBooks: () => {
      throw new Error("Reserved books could not be found!")
    }
  }
};

Here the reservedBooks resolver throws an error, but apollo-tracing does not record it in any way:

{  
   "version":1,
   "startTime":"2019-03-21T15:12:52.808Z",
   "endTime":"2019-03-21T15:12:52.809Z",
   "duration":972315,
   "execution":{  
      "resolvers":[  
         {  
            "path":[  
               "books"
            ],
            "parentType":"Query",
            "fieldName":"books",
            "returnType":"[Book]",
            "startOffset":573807,
            "duration":9734
         },
         ...
         {  
            "path":[  
               "reservedBooks"
            ],
            "parentType":"Query",
            "fieldName":"reservedBooks",
            "returnType":"[Book]",
            "startOffset":651279,
            "duration":16398
         }
      ]
   }
}
