mpihlak / mongoproxy

Lightweight proxy to collect MongoDB client metrics

License: MIT License

Rust 87.59% Python 8.52% Dockerfile 0.38% Shell 0.71% Go 2.80%
metrics sidecar mongodb tracing observability

mongoproxy's Introduction

Mongoproxy the Observable

There are many MongoDB proxies, but this one is about observability. It passes bytes between the client and the server, records query latencies, response sizes, and document counts, and exposes them as Prometheus metrics. It also emits Jaeger tracing spans if the client passes a trace id in the $comment field of the operation.

All bytes are passed through unchanged. Furthermore, the metrics processing is decoupled from the proxying so that latency or bugs in the metrics processor have no impact on moving the bytes.

Under the hood it uses Tokio for async I/O and a custom streaming BSON parser that extracts only the interesting bits from the stream.

Current state

Supports MongoDB 3.6 and greater (OP_MSG protocol), and produces throughput and latency metrics at both the network and the document level. The legacy OP_COMMAND protocol used by some drivers and older Mongo versions is not fully supported. The proxy won't crash, but the collected metrics will be limited.
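
To make the protocol distinction concrete, here is a minimal sketch of reading the standard 16-byte MongoDB wire protocol header and classifying the opcode. The field layout and opcode values come from the MongoDB wire protocol; `parse_header` and `op_name` are hypothetical names used for illustration and are not mongoproxy's actual parser.

```rust
/// Standard MongoDB message header: four little-endian int32 fields.
#[derive(Debug, PartialEq)]
struct MsgHeader {
    message_length: i32,
    request_id: i32,
    response_to: i32,
    op_code: i32,
}

// Opcode values as defined by the MongoDB wire protocol.
const OP_QUERY: i32 = 2004;
const OP_COMMAND: i32 = 2010;
const OP_MSG: i32 = 2013;

/// Hypothetical helper: decode the header from the first 16 bytes of a message.
/// Returns None if the buffer is too short to hold a full header.
fn parse_header(buf: &[u8]) -> Option<MsgHeader> {
    if buf.len() < 16 {
        return None;
    }
    // Read one little-endian i32 starting at byte offset `i`.
    let field = |i: usize| i32::from_le_bytes([buf[i], buf[i + 1], buf[i + 2], buf[i + 3]]);
    Some(MsgHeader {
        message_length: field(0),
        request_id: field(4),
        response_to: field(8),
        op_code: field(12),
    })
}

/// Hypothetical helper: map an opcode to a readable name.
fn op_name(op_code: i32) -> &'static str {
    match op_code {
        OP_MSG => "OP_MSG",
        OP_QUERY => "OP_QUERY",
        OP_COMMAND => "OP_COMMAND",
        _ => "other",
    }
}
```

A proxy that only understands OP_MSG can still forward any message unchanged, because `message_length` alone says how many bytes to pass along.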

Jaeger tracing is supported for the OP_MSG protocol only (sorry, no legacy clients), and only for operations that support $comment strings.

Performance is good. Expect the proxy process to add just a few millicores of CPU and less than 10 MB of memory. The actual numbers depend on the workload.

Usage

Sidecar with iptables port forwarding

mongoproxy --proxy 27111

This mode is used when running the proxy as a sidecar in a Kubernetes pod. iptables rules need to be set up to redirect all port 27017 traffic through the proxy. The proxy then determines the original destination address via getsockopt and forwards the requests to their original destination. Because it captures all traffic to Mongo ports, it automatically supports replica set connections.
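
On Linux, the original destination of a redirected connection is typically read with getsockopt at level SOL_IP using the SO_ORIGINAL_DST option, which fills in a `sockaddr_in`. The getsockopt call itself needs a crate such as libc and is not shown here. The sketch below only decodes the returned 16 bytes into a socket address. `decode_sockaddr_in` is a hypothetical helper, and whether mongoproxy does it exactly this way is an assumption.

```rust
use std::net::{Ipv4Addr, SocketAddrV4};

/// Linux value of AF_INET.
const AF_INET: u16 = 2;

/// Hypothetical helper: decode a raw Linux `sockaddr_in`, as filled in by
/// getsockopt(SOL_IP, SO_ORIGINAL_DST), into a SocketAddrV4.
/// Layout: family (u16, native endian), port (u16, network byte order),
/// IPv4 address (4 bytes, network byte order), 8 bytes of padding.
fn decode_sockaddr_in(raw: &[u8; 16]) -> Option<SocketAddrV4> {
    let family = u16::from_ne_bytes([raw[0], raw[1]]);
    if family != AF_INET {
        return None; // not an IPv4 address
    }
    let port = u16::from_be_bytes([raw[2], raw[3]]);
    let ip = Ipv4Addr::new(raw[4], raw[5], raw[6], raw[7]);
    Some(SocketAddrV4::new(ip, port))
}
```

The decoded address is where the client was trying to connect before iptables redirected it, so the proxy can open its upstream connection there.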

See the manually added or automatically injected sidecar examples.

Static server address

mongoproxy --proxy 27113:localhost:27017

This will proxy all requests on port 27113 to the MongoDB instance running on localhost:27017. Useful when running as a shared front-proxy. See the front proxy for a basic example.

Note that this mode does not automatically support replica sets, because replica set connections can be redirected to any host in the set. To work around this, the proxy needs to run on each of the replica set nodes and intercept incoming port 27017 traffic. For example, with iptables:

iptables -t nat -A PREROUTING -i ${IFACE} -p tcp --dport ${MONGO_PORT} -j REDIRECT --to-port ${PROXY_PORT}
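
The two `--proxy` forms shown so far differ only in whether a target address follows the listen port. As a rough sketch of how such a value could be told apart (the `ProxySpec` type and `parse_proxy_spec` function are hypothetical, not mongoproxy's actual argument handling):

```rust
/// Hypothetical representation of a `--proxy` argument.
#[derive(Debug, PartialEq)]
enum ProxySpec {
    /// `--proxy 27111`: listen only, destination comes from the redirected socket.
    OriginalDst { listen_port: u16 },
    /// `--proxy 27113:localhost:27017`: listen and forward to a fixed address.
    Static { listen_port: u16, host: String, port: u16 },
}

/// Hypothetical parser for the two forms above. Returns None on malformed input.
fn parse_proxy_spec(spec: &str) -> Option<ProxySpec> {
    // Split off the listen port; everything after the first ':' is the target.
    let mut parts = spec.splitn(2, ':');
    let listen_port = parts.next()?.parse::<u16>().ok()?;
    match parts.next() {
        None => Some(ProxySpec::OriginalDst { listen_port }),
        Some(target) => {
            // The target must look like host:port.
            let (host, port) = target.rsplit_once(':')?;
            if host.is_empty() {
                return None;
            }
            let port = port.parse::<u16>().ok()?;
            Some(ProxySpec::Static { listen_port, host: host.to_string(), port })
        }
    }
}
```

With this shape, `27111` selects the sidecar behaviour and `27113:localhost:27017` selects the static one.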

With Jaeger tracing

mongoproxy --proxy 27113:localhost:27017 \
    --service-name mongoproxy-ftw \
    --enable-jaeger \
    --jaeger-addr localhost:6831

Same as above, but with Jaeger tracing enabled. Spans will be sent to the collector at localhost:6831. The service name for the traces is set to mongoproxy-ftw.

Running with --enable-jaeger adds some overhead as the full query text is parsed and tagged to the trace.

Other tips

More verbose logging can be enabled by setting the RUST_LOG environment variable to info or debug. Add RUST_BACKTRACE=1 for troubleshooting those (rare) crashes.

To log all MongoDB messages, specify --log-mongo-messages.

Metrics

Per-request histograms:

  • mongoproxy_response_latency_seconds - Response latency.
  • mongoproxy_documents_returned_total - How many documents were returned.
  • mongoproxy_documents_changed_total - How many documents were changed by insert, update or delete.
  • mongoproxy_client_request_bytes_total - Request size distribution.
  • mongoproxy_server_response_bytes_total - Response size distribution.

All per-request metrics are labeled with client (IP address), app (appName from connection metadata), op, collection, db, server and replicaset.

Connection counters:

  • mongoproxy_client_connections_established_total
  • mongoproxy_client_bytes_sent_total
  • mongoproxy_client_bytes_received_total
  • mongoproxy_client_disconnections_total
  • mongoproxy_client_connection_errors_total

Per-connection metrics are labeled only with client.

Example:

[Image: metrics example]

Tracing

Mongoproxy will not create tracing spans unless the application explicitly requests it. The application does this by passing the trace id in the $comment field of the MongoDB query. For example, if a find operation has uber-trace-id:6d697c0f076183c:6d697c0f076183c:0:1 in the comment, the proxy picks this up and creates a child span for the find operation:

[Image: trace example]
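
The comment string above follows the Jaeger propagation format: trace id, span id, parent span id, and flags, all hex-encoded and separated by colons, with the lowest flag bit marking the trace as sampled. Here is a minimal sketch of extracting those fields. `TraceContext` and `parse_trace_comment` are hypothetical names, and the sketch assumes the comment consists of exactly the `uber-trace-id:` prefix followed by the four fields.

```rust
/// Hypothetical holder for the fields carried in the $comment string.
#[derive(Debug, PartialEq)]
struct TraceContext {
    trace_id: u128, // Jaeger trace ids may be 64 or 128 bits wide
    span_id: u64,
    parent_span_id: u64,
    flags: u8,
}

impl TraceContext {
    /// Bit 0 of the flags marks the trace as sampled.
    fn is_sampled(&self) -> bool {
        self.flags & 0x01 != 0
    }
}

/// Hypothetical parser: returns None unless the comment is
/// `uber-trace-id:{trace-id}:{span-id}:{parent-span-id}:{flags}` in hex.
fn parse_trace_comment(comment: &str) -> Option<TraceContext> {
    let rest = comment.strip_prefix("uber-trace-id:")?;
    let fields: Vec<&str> = rest.split(':').collect();
    if fields.len() != 4 {
        return None;
    }
    Some(TraceContext {
        trace_id: u128::from_str_radix(fields[0], 16).ok()?,
        span_id: u64::from_str_radix(fields[1], 16).ok()?,
        parent_span_id: u64::from_str_radix(fields[2], 16).ok()?,
        flags: u8::from_str_radix(fields[3], 16).ok()?,
    })
}
```

A comment that does not parse simply yields no trace context, which matches the rule that spans are only created when the application asks for them.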

mongoproxy's Issues

Robo3T 1.4.1 sends extra data that prevents logging mongo messages

I'm interested in capturing logs of client activity, e.g. what queries they send. To do this I look for MsgOpMsg output from the client tracker. Unfortunately, Robo3T 1.4.1 sends extra data that causes the rdr.read*() calls that use await? to exit with an io::ErrorKind::UnexpectedEof, even if a document has been successfully read.

Robo3T 1.1.1 seems to work fine.

Reproducing the issue:

  1. Start mongoproxy: RUST_LOG=info target/release/mongoproxy --log-mongo-messages --proxy 27018:localhost:27017 | grep "client tracker"
  2. Connect to the proxy using Robo3T 1.4.1
  3. Query a collection.
  4. Observe the following errors and the absence of query logging (no { find: ...}):
Oct 22 09:07:49.577  INFO handle_connection{client_addr="127.0.0.1:51290" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_QUERY BSON: { isMaster: 1, client: { application: { name: "robo3t-1.4.1" }, driver: { name: "MongoDB Internal Client", version: "4.2.6-17-g6bce88c" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "20.04" } } }
Oct 22 09:07:49.578  INFO handle_connection{client_addr="127.0.0.1:51290" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_MSG BSON: { listDatabases: 1, nameOnly: true, $readPreference: { mode: "secondaryPreferred" }, $db: "admin" }
Oct 22 09:07:49.578 ERROR handle_connection{client_addr="127.0.0.1:51290" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: Failed to parse MongoDb 2013 message: unexpected end of file
Oct 22 09:07:49.631  INFO handle_connection{client_addr="127.0.0.1:51296" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_QUERY BSON: { isMaster: 1, client: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.2.6-17-g6bce88c" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "20.04" } } }
Oct 22 09:07:49.632  INFO handle_connection{client_addr="127.0.0.1:51296" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_MSG BSON: { whatsmyuri: 1, $db: "admin" }
Oct 22 09:07:49.632 ERROR handle_connection{client_addr="127.0.0.1:51296" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: Failed to parse MongoDb 2013 message: unexpected end of file
Oct 22 09:07:56.704  INFO handle_connection{client_addr="127.0.0.1:51312" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_QUERY BSON: { isMaster: 1, client: { application: { name: "robo3t-1.4.1" }, driver: { name: "MongoDB Internal Client", version: "4.2.6-17-g6bce88c" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "20.04" } } }
Oct 22 09:07:56.706  INFO handle_connection{client_addr="127.0.0.1:51312" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_MSG BSON: { ping: 1, $db: "admin" }
Oct 22 09:07:56.706 ERROR handle_connection{client_addr="127.0.0.1:51312" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: Failed to parse MongoDb 2013 message: unexpected end of file

Expected behaviour (from a PR I will open soon):

Oct 22 09:10:14.693  INFO handle_connection{client_addr="127.0.0.1:51408" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_QUERY BSON: { isMaster: 1, client: { application: { name: "robo3t-1.4.1" }, driver: { name: "MongoDB Internal Client", version: "4.2.6-17-g6bce88c" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "20.04" } } }
Oct 22 09:10:14.694  INFO handle_connection{client_addr="127.0.0.1:51408" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_MSG BSON: { listDatabases: 1, nameOnly: true, $readPreference: { mode: "secondaryPreferred" }, $db: "admin" }
Oct 22 09:10:14.743  INFO handle_connection{client_addr="127.0.0.1:51414" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_QUERY BSON: { isMaster: 1, client: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.2.6-17-g6bce88c" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "20.04" } } }
Oct 22 09:10:14.744  INFO handle_connection{client_addr="127.0.0.1:51414" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_MSG BSON: { whatsmyuri: 1, $db: "admin" }
Oct 22 09:10:14.747  INFO handle_connection{client_addr="127.0.0.1:51414" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_MSG BSON: { buildinfo: 1, $db: "admin" }
Oct 22 09:10:14.755  INFO handle_connection{client_addr="127.0.0.1:51408" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_MSG BSON: { serverStatus: "1", $db: "db" }
Oct 22 09:10:14.755  WARN handle_connection{client_addr="127.0.0.1:51408" server_addr="localhost:27017"}:client tracker: mongoproxy::tracker: unsupported op: serverStatus
Oct 22 09:10:14.756  INFO handle_connection{client_addr="127.0.0.1:51408" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_MSG BSON: { buildInfo: "1", $db: "db" }
Oct 22 09:10:14.756  INFO handle_connection{client_addr="127.0.0.1:51408" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_MSG BSON: { buildInfo: "1", $db: "db" }
Oct 22 09:10:14.760  INFO handle_connection{client_addr="127.0.0.1:51414" server_addr="localhost:27017"}:client tracker: mongoproxy::mongodb: OP_MSG BSON: { find: "user", filter: {}, lsid: { id: BinData(0x4, NNUSAJRbSh+lVvRyOsMazg==) }, $clusterTime: { clusterTime: Timestamp(1603379410, 4), signature: { hash: BinData(0x0, AAAAAAAAAAAAAAAAAAAAAAAAAAA=), keyId: 0 } }, $readPreference: { mode: "secondaryPreferred" }, $db: "test" }

Support outbound TLS encryption.

Since your proxy inspects the traffic, the traffic obviously cannot be encrypted. That is fine for me if I deploy the proxy as a sidecar.

However, my application must be HIPAA compliant. So, I need to encrypt the traffic when it leaves the proxy.

At present, it appears that the proxy does not support outbound TLS encryption. Any plans to support it?

Question: is this proxy stable?

I need a mongo proxy that works with the 3.6+ wire protocol. My primary need is to intercept write requests and return success while letting read requests through.

Is this proxy a good place to start? If not, do you know of others?

Service mesh integration

I use linkerd, which uses sidecars with iptables. I want to use your proxy in that environment. I can figure it out myself, but if you already have a config that avoids conflicts I would appreciate it.
