mongobetween's Introduction

mongobetween

mongobetween is a lightweight MongoDB connection pooler written in Go. Its primary function is to handle a large number of incoming connections and multiplex them across a smaller connection pool to one or more MongoDB clusters.

mongobetween is used in production at Coinbase. It is currently deployed as a Docker sidecar alongside a Rails application using the Ruby Mongo driver, connecting to a number of sharded MongoDB clusters. It was designed to connect to mongos routers, which are responsible for server selection for read/write preferences (connecting directly to a replica set's mongod instances hasn't been battle tested).

How it works

mongobetween listens for incoming connections from an application, and proxies any queries to the MongoDB Go driver which is connected to a MongoDB cluster. It also intercepts any ismaster commands from the application, and responds with "I'm a shard router (mongos)", without proxying. This means mongobetween appears to the application as an always-available MongoDB shard router, and any MongoDB connection issues or failovers are handled internally by the Go driver.
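
A rough sketch of the intercept-or-forward decision described above, using plain Go maps in place of real BSON wire messages. The `command` type and the `isHandshake`, `mongosReply` and `handle` functions are hypothetical and not taken from the mongobetween source. The `msg: "isdbgrid"` field is how a mongos identifies itself in an ismaster reply, and the wire versions 0 and 7 are the values quoted in the issue about heartbeat responses further down this page.

```go
package main

import (
	"fmt"
	"strings"
)

// command is a simplified stand-in for a decoded MongoDB command document.
// Real messages are BSON carried in OP_QUERY or OP_MSG frames, and the
// command name is the first key; a Go map does not preserve that order.
type command map[string]any

// isHandshake reports whether cmd looks like an ismaster/isMaster command.
func isHandshake(cmd command) bool {
	for name := range cmd {
		if strings.EqualFold(name, "ismaster") {
			return true
		}
	}
	return false
}

// mongosReply builds a canned "I'm a shard router" reply.
func mongosReply() command {
	return command{
		"ismaster":       true,
		"msg":            "isdbgrid", // marks the responder as a mongos
		"minWireVersion": 0,
		"maxWireVersion": 7,
		"ok":             1,
	}
}

// handle answers handshakes locally and forwards everything else upstream.
func handle(cmd command, forward func(command) command) command {
	if isHandshake(cmd) {
		return mongosReply()
	}
	return forward(cmd)
}

func main() {
	forwarded := 0
	forward := func(command) command {
		forwarded++
		return command{"ok": 1}
	}

	fmt.Println(handle(command{"isMaster": 1}, forward)["msg"]) // isdbgrid
	handle(command{"find": "users"}, forward)
	fmt.Println("forwarded:", forwarded) // forwarded: 1
}
```

Because the handshake never reaches the cluster, the application sees a healthy router even while the Go driver behind the proxy is reconnecting or failing over.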

Installation

go install github.com/coinbase/mongobetween

Usage

Usage: mongobetween [OPTIONS] address1=uri1 [address2=uri2] ...
  -loglevel string
    	One of: debug, info, warn, error, dpanic, panic, fatal (default "info")
  -network string
    	One of: tcp, tcp4, tcp6, unix or unixpacket (default "tcp4")
  -password string
    	MongoDB password
  -ping
    	Ping downstream MongoDB before listening
  -pretty
    	Pretty print logging
  -statsd string
    	Statsd address (default "localhost:8125")
  -unlink
    	Unlink existing unix sockets before listening
  -username string
    	MongoDB username
  -dynamic string
    	File or URL to query for dynamic configuration
  -enable-sdam-metrics
    	Enable SDAM (Server Discovery And Monitoring) metrics
  -enable-sdam-logging
    	Enable SDAM (Server Discovery And Monitoring) logging

TCP socket example:

mongobetween ":27016=mongodb+srv://username:[email protected]/database?maxpoolsize=10&label=cluster0"

Unix socket example:

mongobetween -network unix "/tmp/mongo.sock=mongodb+srv://username:[email protected]/database?maxpoolsize=10&label=cluster0"

Proxying multiple clusters:

mongobetween -network unix \
  "/tmp/mongo1.sock=mongodb+srv://username:[email protected]/database?maxpoolsize=10&label=cluster1" \
  "/tmp/mongo2.sock=mongodb+srv://username:[email protected]/database?maxpoolsize=10&label=cluster2"

The label query parameter in the connection URI is used to tag any statsd metrics or logs for that connection.

Dynamic configuration

Passing a file or URL as the -dynamic argument will allow somewhat dynamic configuration of mongobetween. Example supported file format:

{
  "Clusters": {
    ":12345": {
      "DisableWrites": true,
      "RedirectTo": ""
    },
    "/var/tmp/cluster1.sock": {
      "DisableWrites": false,
      "RedirectTo": "/var/tmp/cluster2.sock"
    }
  }
}

This will disable writes to the proxy served from address :12345, and redirect any traffic sent to /var/tmp/cluster1.sock to the proxy running on /var/tmp/cluster2.sock. This is useful for minimal-downtime migrations between clusters.
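
The file format above maps directly onto Go structs. The sketch below decodes the example and looks up the two settings per listen address. The type and method names (`dynamicConfig`, `clusterConfig`, `target`, `writesDisabled`) are assumptions made for illustration and may not match the mongobetween source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clusterConfig holds the per-address settings from the dynamic file.
type clusterConfig struct {
	DisableWrites bool
	RedirectTo    string
}

// dynamicConfig mirrors the top-level "Clusters" object, keyed by listen address.
type dynamicConfig struct {
	Clusters map[string]clusterConfig
}

func parseDynamic(data []byte) (dynamicConfig, error) {
	var cfg dynamicConfig
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

// target returns where traffic for addr should go: RedirectTo when set,
// otherwise addr itself.
func (c dynamicConfig) target(addr string) string {
	if to := c.Clusters[addr].RedirectTo; to != "" {
		return to
	}
	return addr
}

// writesDisabled reports whether writes are disabled for addr.
func (c dynamicConfig) writesDisabled(addr string) bool {
	return c.Clusters[addr].DisableWrites
}

// sample is the example file from the README.
const sample = `{
  "Clusters": {
    ":12345": {"DisableWrites": true, "RedirectTo": ""},
    "/var/tmp/cluster1.sock": {"DisableWrites": false, "RedirectTo": "/var/tmp/cluster2.sock"}
  }
}`

func main() {
	cfg, err := parseDynamic([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.writesDisabled(":12345"))       // true
	fmt.Println(cfg.target("/var/tmp/cluster1.sock")) // /var/tmp/cluster2.sock
}
```

Addresses that do not appear in the file fall back to the zero values: writes enabled and no redirect.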

TODO

Current known missing features:

  • Transaction server pinning
  • Different cursors on separate servers with the same cursor ID value

Statsd

mongobetween supports reporting health metrics to a local statsd sidecar, using the Datadog Go library. By default it reports to localhost:8125. The following metrics are reported:

  • mongobetween.handle_message (Timing) - end-to-end time handling an incoming message from the application
  • mongobetween.round_trip (Timing) - round trip time sending a request and receiving a response from MongoDB
  • mongobetween.request_size (Distribution) - request size to MongoDB
  • mongobetween.response_size (Distribution) - response size from MongoDB
  • mongobetween.open_connections (Gauge) - number of open connections between the proxy and the application
  • mongobetween.connection_opened (Counter) - connection opened with the application
  • mongobetween.connection_closed (Counter) - connection closed with the application
  • mongobetween.cursors (Gauge) - number of open cursors being tracked (for cursor -> server mapping)
  • mongobetween.transactions (Gauge) - number of transactions being tracked (for client sessions -> server mapping)
  • mongobetween.server_selection (Timing) - Go driver server selection timing
  • mongobetween.checkout_connection (Timing) - Go driver connection checkout timing
  • mongobetween.pool.checked_out_connections (Gauge) - number of connections checked out from the Go driver connection pool
  • mongobetween.pool.open_connections (Gauge) - number of open connections from the Go driver to MongoDB
  • mongobetween.pool_event.connection_closed (Counter) - Go driver connection closed
  • mongobetween.pool_event.connection_pool_created (Counter) - Go driver connection pool created
  • mongobetween.pool_event.connection_created (Counter) - Go driver connection created
  • mongobetween.pool_event.connection_check_out_failed (Counter) - Go driver connection check out failed
  • mongobetween.pool_event.connection_checked_out (Counter) - Go driver connection checked out
  • mongobetween.pool_event.connection_checked_in (Counter) - Go driver connection checked in
  • mongobetween.pool_event.connection_pool_cleared (Counter) - Go driver connection pool cleared
  • mongobetween.pool_event.connection_pool_closed (Counter) - Go driver connection pool closed

Background

mongobetween was built to address a connection storm issue between a high-scale Rails app and MongoDB (see blog post). Due to Ruby MRI's global interpreter lock, multi-threaded web applications don't utilize multiple CPU cores. To achieve better CPU utilization, Puma is run with multiple workers (processes), each of which needs a separate MongoDB connection pool. This leads to a large number of connections to MongoDB, sometimes exceeding MongoDB's upstream connection limit of 128k connections.

mongobetween has reduced connection counts by an order of magnitude: spikes of up to 30k connections are now reduced to around 2k. It has also significantly reduced ismaster commands on the cluster, as there's only a single monitor goroutine per mongobetween process, instead of a monitor thread for each Ruby process.

mongobetween's People

Contributors

d2army, divjotarora, juwit, kounat, mdehoog, prestonvasquez, rdeavilafloqast, rexiaprevail, rootcss, taganaka, thathurleyguy

mongobetween's Issues

Need help with integrating it with PyMongo and MongoDB Node.js driver

Hi
I am working on integrating MongoDB with PyMongo and the MongoDB Node.js driver. I need some documentation on how to do this. We will be using a connection string to connect to the database, which is hosted in AWS DocumentDB. I need help with integrating this.

Thank you.

opMsg.IsIsMaster always reports false

The *opMsg.IsIsMaster function always returns false. However, drivers will use OP_MSG for the periodic server heartbeats. OP_QUERY is only used when creating new connections because the server version is unknown at that point. The server monitor uses a dedicated connection, so after the initial connection handshake, subsequent heartbeats will use OP_MSG because the server version is known to be greater than 3.6.

I think the implication here is that mongobetween will actually proxy these heartbeats to the underlying cluster rather than intercepting and manually responding.

Note: I wasn't sure where to file issues for mongobetween, so feel free to move this to another location if GitHub issues aren't correct.

Using SSL .pem file with mongobetween

Hello,

We need to use a .pem file for the SSL-enabled target Mongo database.

Can you provide an example of this? I could not find any parameter in the Usage section of mongobetween for specifying a .pem file location.

Thanks

Allow minWireVersion/maxWireVersion fields in heartbeat responses to be configured

mongobetween currently responds to heartbeats (periodic isMaster requests) with minWireVersion=0 and maxWireVersion=7 in order to present as a 4.0 server. To be more flexible, users should be able to configure these via command line parameters and/or environment variables. This would allow mongobetween to support multiple server versions and would also make server upgrades easier because no code changes would be required to the proxy.

Required URI workaround should not be marked as a driver bug

The uriWorkaround function calls out a Go driver bug because the driver does not allow setting an authSource without a username. For v1.3.4, this behavior was in accordance with the auth spec. DRIVERS-796 has amended the spec to say that a URI like mongodb://localhost:27017/?authSource=foo is valid and the generated ticket for the Go Driver (GODRIVER-1473) has been finished and will be released in v1.4.0.

This is low priority, but I'd like to remove the wording saying this is a driver bug so anyone reading the proxy does not think a ticket is necessary to fix the behavior upstream in the driver.

Allow the customisation of the base namespace of the statsd namespace

As it is, all stats are aggregated under the mongobetween. namespace.
If you have a set of pools, all of them are shown in that one namespace.

We should be able to add a new parameter to modify the namespace, so we can have more granular metrics that include the host name as well.
Maybe expose a parameter or something.

type *statsd.Client has no field or method Tags

go install github.com/coinbase/mongobetween
# github.com/coinbase/mongobetween/util
go/src/github.com/coinbase/mongobetween/util/statsd.go:10:22: client.Tags undefined (type *statsd.Client has no field or method Tags)

Appears to be an issue from v4 -> v5 with statsd from datadog, most of these things are now lowercased.

Require authentication

I managed to run mongobetween between my MongoDB replica and me, but wonder if there's a way to secure that connection with a username/password.

As I understand the -username and -password options are to be only applied to the downstream connection. Is there a way? I basically would want the same password to be required, that is used to connect to the replica.

Drivers could mark mongobetween Unknown for server issues

The README says that mongobetween appears as an always-available mongos server, but I believe a driver would mark mongobetween as Unknown if there were a server error like NotMaster. In this case, mongobetween would extract the error and use the Go driver's ProcessError function to mark the actual mongos Unknown, which is correct, but would also proxy the message back to the original driver, which would then mark mongobetween Unknown per the error handling section of the SDAM spec.

EDIT: I think the same is true for connection errors. mongo.RoundTrip returns an error if the WriteWireMessage or ReadWireMessage calls fail. This is propagated upward so handleConnection returns the error to the goroutine launched by Proxy.accept, which closes the connection. This would show up in the application as a non-timeout network error, which would cause the application to mark the proxy as Unknown and clear its connection pool.

Cannot connect to mongobetween

Hello, thank you for open sourcing this project, it seems to be just what I need for my Next.js application which deploys my API as AWS Lambda functions, causing way too many database connections.

Unfortunately I can't seem to get mongobetween working. It connects to my database ok, but my application is not able to connect to mongobetween. I'm hoping to get some help.

I'm attempting to use mongobetween to proxy to a MongoDB Atlas database at version 4.4.6 (it's a 3-node replica set with no sharding).
My application uses Mongoose 5.11.8 (which uses the MongoDB Node.js driver, I'm assuming v3.6 but I couldn't confirm this).
I spun up a basic server on DigitalOcean and installed Go v1.16.5 with the latest mongobetween.

Command used to start mongobetween:
./go/bin/mongobetween -pretty -loglevel debug ":27016=mongodb+srv://username:[email protected]/database?maxpoolsize=50&label=cluster0"

mongobetween seems to connect to my Atlas database ok, but my application receives the following error:

MongoServerSelectionError: The client metadata document may only be sent in the first isMaster
    at Timeout._onTimeout (C:\Users\user\Documents\GitHub\app\node_modules\mongodb\lib\core\sdam\topology.js:438:30)
    at listOnTimeout (internal/timers.js:554:17)
    at processTimers (internal/timers.js:497:7) {
  reason: TopologyDescription {
    type: 'Unknown',
    setName: null,
    maxSetVersion: null,
    maxElectionId: null,
    servers: Map(1) {
      '134.209.211.213:27016' => ServerDescription {
        address: '134.209.211.213:27016',
        error: MongoError: The client metadata document may only be sent in the first isMaster
            at MessageStream.messageHandler (C:\Users\user\Documents\GitHub\app\node_modules\mongodb\lib\cmap\connection.js
            at MessageStream.emit (events.js:315:20)
            at processIncomingData (C:\Users\user\Documents\GitHub\app\node_modules\mongodb\lib\cmap\message_stream.js:144:
            at MessageStream._write (C:\Users\user\Documents\GitHub\app\node_modules\mongodb\lib\cmap\message_stream.js:42:
            at writeOrBuffer (internal/streams/writable.js:358:12)
            at MessageStream.Writable.write (internal/streams/writable.js:303:10)
            at Socket.ondata (internal/streams/readable.js:719:22)
            at Socket.emit (events.js:315:20)
            at addChunk (internal/streams/readable.js:309:12)
            at readableAddChunk (internal/streams/readable.js:284:9)
            at Socket.Readable.push (internal/streams/readable.js:223:10)
            at TCP.onStreamRead (internal/stream_base_commons.js:188:23) {
          operationTime: Timestamp {
            _bsontype: 'Timestamp',
            low_: 1,
            high_: 1623700272
          },
          ok: 0,
          code: 186,
          codeName: 'ClientMetadataCannotBeMutated',
          '$clusterTime': { clusterTime: [Timestamp], signature: [Object] }
        },
        roundTripTime: -1,
        lastWriteDate: null,
        opTime: null,
        type: 'Unknown',
        topologyVersion: undefined,
        minWireVersion: 0,
        maxWireVersion: 0,
        hosts: [],
        passives: [],
        arbiters: [],
        tags: []
      }
    },
    stale: false,
    compatible: true,
    compatibilityError: null,
    logicalSessionTimeoutMinutes: null,
    heartbeatFrequencyMS: 10000,
    localThresholdMS: 15,
    commonWireVersion: null
  }
}

I then tried connecting directly from the mongo shell and got Authentication failed:

MongoDB shell version v4.4.4
connecting to: mongodb://mongobetween_server:27016/database?compressors=disabled&gssapiServiceName=mongodb
Error: Authentication failed. :
connect@src/mongo/shell/mongo.js:374:17
@(connect):2:6
exception: connect failed
exiting with code 1

Could this be from using too new of a Mongo client version?
Thank you so much for any help you can provide.

Raymond
