vaticle / typedb-protocol
TypeDB (Core and Cluster) RPC Communication Protocol
License: Mozilla Public License 2.0
We currently use oneof to denote an optional field. As of Protobuf 3.13+ this is no longer the standard approach, and it is ugly.
Replace 'oneof' with 'optional' once all dependants are on Protobuf 3.13+.
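A minimal sketch of the change, assuming an illustrative field named label (the message and field names here are examples, not the actual protocol):

```protobuf
syntax = "proto3";

// Current pattern: a single-field oneof whose only purpose is
// to give the field a "not set" state.
message GetLabelReq {
  oneof label {
    string value = 1;
  }
}

// With proto3 'optional', the wrapper disappears and explicit
// presence tracking (a has_label accessor) is generated directly.
message GetLabelReqNew {
  optional string label = 1;
}
```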
We have left-over compilation tooling for Rust using the grpc crate. Once we've verified the tonic-based compiler introduced in #160 is working fully, we should delete the old tooling.
Implement user role management protocol supporting the following functionalities:
ConceptMethod now has a oneof identifier - either an IID or a Label - which is weird.
Split ConceptMethod into ThingMethod and TypeMethod.
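A rough sketch of the proposed split (the message shapes and field names below are assumptions for illustration, not the actual protocol definitions):

```protobuf
syntax = "proto3";

// Today: one message whose identifier is a oneof of two unrelated kinds.
message ConceptMethodReq {
  oneof identifier {
    bytes iid = 1;     // identifies a Thing
    string label = 2;  // identifies a Type
  }
}

// Proposed: two messages, each carrying only the identifier that fits it.
message ThingMethodReq {
  bytes iid = 1;
}

message TypeMethodReq {
  string label = 1;
}
```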
[This issue has been generalised based on Alex's message below]
The protocol spec includes definitions specific to Core and Cluster in one place. However, this exposes Cluster APIs on the Core server, where they could accidentally be implemented, and it represents a domain leak. The Cluster protocol should be separated out, extending the Core protocol in a separately released package.
We currently have a very large number of Concept API methods - and a lot of them feel like they should just be overloads rather than being separate methods. For example:
ThingType.GetOwns.Req thing_type_get_owns_req = 303;
ThingType.GetOwnsExplicit.Req thing_type_get_owns_explicit_req = 310;
ThingType.GetOwnsOverridden.Req thing_type_get_owns_overridden_req = 311;
In the example above, explicit and overridden should really be parameters - not methods in their own right. We should add them to ThingType.GetOwns.Req:
message GetOwns {
  message Req {
    oneof filter {
      AttributeType.ValueType value_type = 1;
    }
    bool keys_only = 2;
    bool explicit_only = 3;
    bool overridden_only = 4;
  }
  message ResPart {
    repeated Type attribute_types = 1;
  }
}
We can apply this same simplification to a great number of Concept API methods to bring down the total count.
Problem to solve
The nodejs genrule is cumbersome, and it also requires exposing the raw .proto files with a filegroup instead of accessing them via the provided proto_library rules.
Proposed solution
Rewrite the genrule as a fully-fledged Bazel rule, accessing the source files via proto_library.src_file, and delete the associated filegroup from the /proto/BUILD file.
We're renaming this class in grakn.core.graql.Answer (server), so we need to align it in the Session.proto definition. However, we also need to update the client drivers, so let's only do this after we complete vaticle/typedb#4890.
Right now, some dependencies and their respective version numbers are declared in package.json. This is not ideal; they should be declared in Bazel instead.
UUID representation as a string is inefficient.
Store Session IDs and Request IDs (and any other UUIDs) as bytes, not strings.
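A minimal sketch of the change, assuming a field named session_id (the message and field names are illustrative):

```protobuf
syntax = "proto3";

// Before: the UUID travels as its 36-character textual form,
// e.g. "550e8400-e29b-41d4-a716-446655440000" - 36 bytes plus tag.
message SessionOpenReq {
  string session_id = 1;
}

// After: the same UUID as its raw 16 bytes, less than half the size
// and with no parsing/formatting on either end.
message SessionOpenReqNew {
  bytes session_id = 1;  // 16-byte UUID
}
```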
We use a oneof with a custom Null message to represent deliberately missing values in protobuf. This is non-standard, since Google provides standard null messages (e.g. google.protobuf.NullValue); moreover, a oneof in protobuf already has a "not set" state, making the Null message redundant.
We should replace all of our usages with either single-field oneofs or ordinary message fields (which are always optionally not set).
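A sketch of the replacement, using an illustrative GetSupertype method (the message names are assumptions):

```protobuf
syntax = "proto3";

message Type {
  string label = 1;
}

// Current pattern: an explicit Null arm inside the oneof.
message Null {}

message GetSupertypeRes {
  oneof res {
    Type type = 1;
    Null null = 2;  // redundant: the oneof can simply be left unset
  }
}

// Proposed: "no supertype" is just the unset state. An ordinary
// message field works too, since message fields track presence.
message GetSupertypeResNew {
  Type type = 1;  // unset field means no supertype
}
```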
In Rust, we have the ability to split up our gRPC bindings, producing a server package and a client package. This would currently make the client package imported by typedb_client roughly 25% smaller, and the server package imported by typedb roughly 15% smaller.
One naive way would be to compile two distinct crates (typedb_protocol_client and typedb_protocol_server). But that's not great: we should be importing from typedb_protocol, not from typedb_protocol_client.
A better strategy exists: compile the server and client bindings individually, then pack them both into the same crate, but gate them behind crate features. If the user wants to depend on the client, they specify the following in their Cargo.toml:
[dependencies]
typedb_protocol = { version = "1.0.0", features = ["client"] }
And if they want the server, they specify features = ["server"].
At the time of writing, producing a crate with optional features is most likely not supported in bazel-distribution.
The current structure of having separate packages for Session, Keyspace and KGMS doesn't accurately reflect the programming story of interacting with the gRPC protocol, and could be better organised.
UUIDv4 generation is quite costly, and generating a new one for each query in rapid succession (e.g. during data ingestion) can make it a bottleneck.
Generate a UUID once per transaction and append a sequence number to each request in the transaction, or move away from UUIDs altogether.
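One possible shape for this, sketched in protobuf (the field name and the exact encoding are assumptions, not a decided design):

```protobuf
syntax = "proto3";

message TransactionReq {
  // Instead of a fresh UUIDv4 per request: the transaction's UUID
  // (16 bytes, generated once at transaction open) followed by a
  // 4-byte big-endian per-transaction sequence counter. Still unique,
  // but only one random UUID is generated per transaction.
  bytes req_id = 1;
}
```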
When performing a query that returns many results, the query performance is always relatively bad, scaling with the connection latency to the database, because the protocol explicitly iterates one result at a time despite the server being able to generate all of the results quickly.
Currently there is no known workaround to this issue.
This performance issue could be mitigated by allowing the server to continue sending results in batches.
The iteration request could specify a batch size (up to ALL) in Iter.Req, and the grakn server would return the requested number of results as multiple Iter.Res messages, ending with an Iter.Res.done = true if the results complete before the batch size is met (or at the end of the results for ALL).
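This proposal could be sketched in protobuf as follows (the message shapes, field names and the QueryResult type are assumptions for illustration):

```protobuf
syntax = "proto3";

message QueryResult {
  bytes payload = 1;  // placeholder for an actual result message
}

message Iter {
  message Req {
    oneof batch {
      int32 size = 1;  // return up to this many results in one batch
      bool all = 2;    // stream every remaining result
    }
  }
  message Res {
    oneof res {
      QueryResult result = 1;  // one result within the batch
      bool done = 2;           // true: iteration finished
    }
  }
}
```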
Client                    Server
======                    ======
Req(ALL)     -->
                          Res
Receive      <--          Res
Receive      <--          Res
Receive      <--          Res
Receive      <--          Done
Done
This works well for execute requests, since ALL can always be requested (as the execute is always planning to receive all).
In streaming requests, a large batch size could be used instead. The client can begin iterating the results as fast as it receives them, but the batch size also prevents the server from overwhelming it with results (a very simplified variation of back-pressure). In future, this back-pressure would be better handled with custom flow control for the Transaction RPC.
The only issue with using large batches for a streaming request is that if the client wants to use the Concept API during the transaction, it must wait until it has received the whole of the last batch.
This change also does not impact existing client implementations, since not providing a batch size would default to a single iteration (as before).
An alternative to using large batches in streaming would be a double-buffered batching stream, which ensures the next batch is being fetched while the previous batch is being consumed; however, this would be better built on top of the current work. The best variation of this would be an "adaptive" double-buffered stream, which adapts the batch size based on the iteration latency.