ipfs / rainbow

A specialized IPFS HTTP gateway

Home Page: https://docs.ipfs.tech/reference/http/gateway/

License: Other

Languages: Go 95.19%, Dockerfile 1.60%, Shell 1.64%, HTML 1.56%
Topics: http, ipfs, ipfs-gateway

rainbow's Introduction


Rainbow logo
Rainbow

A to-be-released production-grade IPFS HTTP Gateway written in Go (using Boxo).



About

Rainbow is an implementation of the IPFS HTTP Gateway API, based on boxo, the library that powers the Kubo IPFS implementation. It uses the same Go code as the HTTP gateway in Kubo, but is fully specialized to act only as a gateway:

  • Rainbow acts as an Amino DHT and Bitswap client only.
  • Rainbow does not pin or permanently store any content. It is only meant to act as a gateway to content present in the network.
  • Rainbow settings are optimized for production deployments and streamlined for specific choices (flatfs datastore, write-through uncached blockstore, etc.).
  • Denylist and denylist subscription support is included.
  • And more to come...

Building

go build

Running

rainbow

Use rainbow --help for documentation.

Docker

Automated Docker container releases are available from the GitHub container registry:

  • 🟢 Releases
    • latest always points at the latest stable release
    • vN.N.N points at a specific release tag
  • 🟠 Unreleased developer builds
    • main-latest always points at the HEAD of the main branch
    • main-YYYY-DD-MM-GITSHA points at a specific commit from the main branch
  • ⚠️ Experimental, unstable builds
    • staging-latest always points at the HEAD of the staging branch
    • staging-YYYY-DD-MM-GITSHA points at a specific commit from the staging branch
    • This tag is used by developers for internal testing, not intended for end users

When using Docker, make sure to pass necessary config via -e:

$ docker pull ghcr.io/ipfs/rainbow:main-latest
$ docker run --rm -it --net=host -e RAINBOW_SUBDOMAIN_GATEWAY_DOMAINS=dweb.link ghcr.io/ipfs/rainbow:main-latest

See /docs/environment-variables.md.

Configuration

CLI and Environment Variables

Rainbow can be configured via command-line arguments or environment variables.

See rainbow --help and /docs/environment-variables.md for information on the available options.

Rainbow uses --datadir (or the RAINBOW_DATADIR environment variable) as the location for persisted data. It defaults to the folder in which rainbow is run.

Peer Identity

Using a key file: By default, Rainbow generates a libp2p.key in its data folder if none exists yet. This file stores the libp2p peer identity.

Using a seed + index: Alternatively, Rainbow can be initialized with a 32-byte, b58-encoded seed and a derivation index. This allows using the same seed for multiple instances of rainbow, changing only the derivation index.

The seed and index can be provided as command-line arguments or environment variables (--seed, --seed-index). The seed can also be provided as a seed file in the datadir folder. A new random seed can be generated with:

rainbow gen-seed > seed
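
For example (hypothetical index values), multiple gateways can then derive distinct identities from the same seed:

rainbow --seed "$(cat seed)" --seed-index 0
rainbow --seed "$(cat seed)" --seed-index 1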

To facilitate the use of rainbow with systemd LoadCredential= directive, we look for both libp2p.key and seed in $CREDENTIALS_DIRECTORY first.

Denylists

Rainbow can subscribe to append-only denylists using the --denylists flag. The value is a comma-separated list of URLs to subscribe to, for example: https://denyli.st/badbits.deny. This will download the denylist and automatically update it when new entries are added.
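
For example, to subscribe to the denylist mentioned above:

rainbow --denylists https://denyli.st/badbits.deny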

Denylists can be manually placed in the $RAINBOW_DATADIR/denylists folder too.

See NoPFS for an explanation of the denylist format. Note that denylists should only be appended to while Rainbow is running. Any other modifications, or adding new denylist files, should be done while Rainbow is stopped.

Blockstores

Rainbow ships with a number of possible blockstores for caching data locally. Because Rainbow, as a gateway-only IPFS implementation, is not designed for long-term data storage, there are no long-term guarantees of support for any particular backing data store.

See Blockstores for more details.

Garbage Collection

Over time, the datastore can fill up with previously fetched blocks. To free up the used disk space, garbage collection can be run. Garbage collection needs to be triggered manually; this can be automated with a cron job.

By default, the API route to trigger GC is http://$RAINBOW_CTL_LISTEN_ADDRESS/mgr/gc. The BytesToFree parameter must be passed in order to specify the upper limit of how much disk space should be cleared. Setting this parameter to a very high value will GC the entire datastore.

Example cURL command to run GC:

curl -v --data '{"BytesToFree": 1099511627776}' http://127.0.0.1:8091/mgr/gc
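
To automate this with cron, as mentioned above, a crontab entry along these lines (illustrative schedule; 1099511627776 bytes = 1 TiB) would run GC nightly:

0 3 * * * curl --data '{"BytesToFree": 1099511627776}' http://127.0.0.1:8091/mgr/gc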

Logging

While logging can be controlled via environment variables, it is also possible to modify logging levels dynamically at runtime:

  • http://$RAINBOW_CTL_LISTEN_ADDRESS/mgr/log/level?subsystem=<subsystem name, or * for all subsystems>&level=<level> will set the logging level for a subsystem
  • http://$RAINBOW_CTL_LISTEN_ADDRESS/mgr/log/ls will return a comma-separated list of available logging subsystems
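
For example, assuming the same control listener address as in the GC example above:

curl "http://127.0.0.1:8091/mgr/log/ls"
curl "http://127.0.0.1:8091/mgr/log/level?subsystem=*&level=debug"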

Deployment

The suggested method for self-hosting is to run a prebuilt Docker image.

An ansible role to deploy Rainbow is available within the ipfs.ipfs collection in Ansible Galaxy (https://github.com/ipfs-shipyard/ansible). It includes a systemd service unit file.

Release

  1. Create a PR from branch release-vX.Y.Z against main that:
    1. Tidies the CHANGELOG.md with the changes for the current release
    2. Updates the version.json file
  2. Once the release checker creates a draft release, copy-paste the changelog into the draft
  3. Merge the PR; the release will be created automatically once the PR is merged

License

Dual-licensed under MIT + Apache 2.0

rainbow's People

Contributors

2color, acejam, aschmahmann, biglep, dependabot[bot], galargh, gammazero, hacdias, hsanjuan, ipfs-mgmt-read-write[bot], lidel, ns4plabs, web-flow, web3-bot, whyrusleeping


rainbow's Issues

Configurable Listen Addresses

Tracking issue to add a flag (and env variable) to configure the listen addresses. Right now we use the default ones. There was prior art in #59, but at the time it was not needed.

Blockstore options

Consider enabling noPrefix and writethrough.

(I'm not sure about noPrefix right now, but if badger is the backend, writeThrough might be a good thing).

GC and TTLing blocks

Badger has support for TTL records.

We could use TTL on records, but we would also have to modify the blockstore so that any blocks read are written back with a new TTL. This implies making sure that only happens when blocks have been read for the gateway, and not for other things (e.g. providing them over bitswap, or announcing them to the DHT).

Then badger's GC would automatically remove unused blocks.

This doesn't ensure that you don't run out of space (e.g. lots of very recent blocks). It would be better if we could "delete blocks older than X" rather than giving every block a TTL. In any case, it might need to be combined with another GC strategy.
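
For reference, a minimal sketch of what TTL-based writes look like with badger's API (key, value, and TTL are illustrative; this is not rainbow code):

// Write a block with a TTL; badger's GC can then drop it after expiry.
err := db.Update(func(txn *badger.Txn) error {
	entry := badger.NewEntry(blockKey, blockData).WithTTL(24 * time.Hour)
	return txn.SetEntry(entry)
})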

Serve static frontpage

Visiting the gateway without any path/CID, or with an obviously 404 path (e.g. gw.url/ifps/...), should provide more information.

Attention should be paid to whether we are talking to a browser or not (HTML response vs. plain text, etc.).

/routing/v1 http client metrics and configuration

Problem

It seems that we have hardcoded some settings related to delegated routing over HTTP.

The 15s timeout on a cold cache might lead to an undesired denial of service if content is only announced to IPNI at cid.contact and either the client or the server is under load, so that receiving a response takes more than 15s.

Solution

I think we should expose HTTP routing client metrics to see if/when things fail, make things configurable (at least the routing timeout), and use our infra to adjust the defaults based on real-world performance:

  • expose timeout as a configuration setting, allowing us to fine-tune it on ipfs.io infra
    • config option for adjusting timeout should follow whatever naming convention we end up in #113
    • the ipfs.io gateway infra times out (HTTP 504) at ~1m, so I think it would not hurt to wait for the routing response a bit longer than 15s
  • have success/failure metrics for each defined /routing/v1 endpoint

Rainbow request latency and size metrics cannot be aggregated

Problem

We instrument the code with Prometheus Summaries to measure:

  • duration/latency of requests
  • request and response byte size

rainbow/metrics.go, lines 79 to 92 at commit 6832d41:

opts.Name = "request_duration_seconds"
opts.Help = "The HTTP request latencies in seconds."
reqDur := prometheus.NewSummaryVec(opts, labels)
prometheus.MustRegister(reqDur)
opts.Name = "request_size_bytes"
opts.Help = "The HTTP request sizes in bytes."
reqSz := prometheus.NewSummaryVec(opts, labels)
prometheus.MustRegister(reqSz)
opts.Name = "response_size_bytes"
opts.Help = "The HTTP response sizes in bytes."
resSz := prometheus.NewSummaryVec(opts, labels)
prometheus.MustRegister(resSz)

Unfortunately, summaries cannot be aggregated, which means we cannot calculate quantiles across all rainbow instances in the public waterworks infrastructure.

Suggested solution

Switch these over to use Histograms

https://www.robustperception.io/how-does-a-prometheus-histogram-work/
https://prometheus.io/docs/practices/histograms/
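
A minimal sketch of the suggested change, mirroring the first metric from the snippet above (bucket values are illustrative; see also the separate bucket-size issue below):

// Histograms, unlike Summaries, can be aggregated across instances.
reqDur := prometheus.NewHistogramVec(prometheus.HistogramOpts{
	Name:    "request_duration_seconds",
	Help:    "The HTTP request latencies in seconds.",
	Buckets: []float64{0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 30, 60},
}, labels)
prometheus.MustRegister(reqDur)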

Built-in GC configuration to run periodically

Problem: Right now, the GC needs to be triggered externally, which is far from the best situation.

Solution: we could periodically run GC in Rainbow by default, deleting a certain number of bytes based on available disk space, if possible.
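
A minimal sketch of such a loop, assuming a hypothetical gc(ctx, bytesToFree) helper that wraps the existing GC trigger (interval and threshold would be configurable):

// Hypothetical periodic GC loop; gc() stands in for the existing GC logic.
func runGCLoop(ctx context.Context, interval time.Duration, bytesToFree int64) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if err := gc(ctx, bytesToFree); err != nil {
				log.Printf("periodic GC failed: %v", err)
			}
		}
	}
}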


@ns4plabs I heard you may be working on this. But if you aren't, I can take a stab at it.

More rainbows?

How can I incentivize an even more stripped-down version of this?

I don't want any features except a glorified webserver that can serve /ipfs/ paths from a given blockstore, without any of the related datastore, DHT, etc.

I plan to run GC externally as well so I'm not even particularly attached to having GC as a feature.

The use case here is using it with a filesystem like btrfs or ZFS that has snapshot and mount features, i.e.:

  • Shutdown the kubo daemon
  • Snapshot the current blockstore dir
  • Mount the snapshot readonly
  • Spin up some rainbow instances backed by the mounted snapshot.
  • Run some GC on the 'live' version of the blockstore.
  • Start kubo
  • Spin down rainbow instances
  • Resume business as usual.

This whole process could be abstracted even further with containers or virtual machines or both, and all sorts of fancy disk management thereof.

The (perhaps only perceived) problem with the current state of affairs is that to run the DHT you need to load the datastore, resolve peer IDs, etc.
Things requiring libp2p-land features, like resolving peer IDs and keys (e.g. IPNS), should be handled by a full Kubo node elsewhere.

Also, could there please be some documentation of which blockstore repo versions this works with?

If I was really being greedy, would it be possible to load blockstore plugins e.g. aws-s3-blockstore?

I'm seriously interested in moving this forward, what's needed to incentivize a little bit of attention here?

Add RAINBOW_TRUSTLESS_GATEWAY_DOMAINS=trustless-gateway.link

Problem

  • https://trustless-gateway.link should be limited to trustless responses.
  • Right now rainbow allows deserialized responses on all domains, so limiting response types to trustless car/block/ipns-record on that domain is done at Nginx, which is a bit janky.
    • For example, it supports Accept: application/vnd.ipld.car but errors on ?format=car, which is a bug we need to fix eventually.

Solution

I believe explicit is better than implicit here.

Initial idea: we already have RAINBOW_SUBDOMAIN_GATEWAY_DOMAINS. Add a similar, explicit RAINBOW_TRUSTLESS_GATEWAY_DOMAINS which limits the specified domains to only support trustless responses and returns HTTP 400 Bad Request for everything else, and use it with trustless-gateway.link.
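
Hypothetical usage once implemented (variable name as proposed above):

RAINBOW_TRUSTLESS_GATEWAY_DOMAINS=trustless-gateway.link rainbow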

cc @ns4plabs @hacdias

Histogram buckets are too small

Problem

HTTP gateway requests for non-cached blocks can often take much longer than a couple of seconds. In fact, it's not uncommon for them to take ~60 seconds or more.

Given that, the current bucket configuration for the histograms exposed by Rainbow doesn't make sense, since it is tuned for much quicker responses, which are rather unlikely in an uncontrolled peer-to-peer network:

Buckets: []float64{0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 30, 60},

Suggestion

Add a bunch of larger bucket sizes, like we do in boxo:

https://github.com/ipfs/boxo/blob/0f223aada9b8beefe449b94ee9601d917f482121/gateway/metrics.go#L17-L20
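
For illustration, extending the current slice with a few larger buckets might look like this (values are illustrative, not the exact boxo defaults):

Buckets: []float64{0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 30, 60, 120, 240, 480, 960},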

Related: https://github.com/ipshipyard/waterworks-infra/issues/141

Check block cache across multiple rainbow instances

Problem

At inbrowser.dev (backed by rainbow from the ipfs.io gateway, so a general problem in our infra), we see inconsistent page load times across regions, and sometimes across requests within the same region.

A user can get an instant response from one instance, and then on a subsequent page load or request get a stalled page load and timeout, even though the data exists in the cache of one of the other rainbows in the global cluster. We also see inconsistency across subresources on a single page.

Scope

  • Rainbow users running multiple instances should have a means of "logically merging" their block caches
  • This should be an opt-in feature that requires manual configuration by the rainbow operator
  • (Open question) Do we want to run bitswap server in rainbow, or HTTP client to avoid "the unsustainable manual peering trap"?
  • We don't want to invent any new protocols. Use HTTP stack if possible.

Solutions

A: Add HTTP Retrieval Client to Rainbow, leverage Cache-Control: only-if-cached

We know we need an HTTP retrieval client for Kubo to enable HTTP Gateway over Libp2p by default, and to make direct HTTP retrieval from service providers more feasible. We can't do that without a client and end-to-end tests. Prototyping one in Rainbow sounds like a good plan, improving multiple work streams at the same time.

The idea here is to introduce an HTTP client which runs in addition to, or in parallel with, bitswap retrieval.
Keep it simple, don't mix abstractions, and do opportunistic block retrieval like bitswap, but over HTTP.

Using application/vnd.ipld.raw and the trustless gateway protocol is a good match here: it allows us to benefit from HTTP caching and middleware, making it more flexible than bitswap.

Rainbow could:

  • Have a list of other rainbow instances in the form of URLs with trustless gateway endpoints
    • In the case of the ipfs.io gateway, we could produce a list with shuffled same-region instances first, and the rest of the instances after them.
  • Make inexpensive block requests with Cache-Control: only-if-cached, going over the list in sequence.
    • This does not cost any expensive IO: if a rainbow does not have the block locally, it will instantly respond with HTTP 412.

This way, once a block lands in any of our rainbow caches, we will discover it, and requests won't time out after 1m in unlucky scenarios.
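
A rough sketch of such a probe (hypothetical helper, not actual rainbow code; peers are base URLs of trustless gateway endpoints):

package probe

import (
	"context"
	"errors"
	"io"
	"net/http"
)

// probeCaches asks peer gateways for a raw block without triggering p2p
// retrieval: with only-if-cached, a miss fails fast and cheaply (HTTP 412).
func probeCaches(ctx context.Context, peers []string, cid string) ([]byte, error) {
	for _, base := range peers {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, base+"/ipfs/"+cid, nil)
		if err != nil {
			return nil, err
		}
		req.Header.Set("Cache-Control", "only-if-cached")
		req.Header.Set("Accept", "application/vnd.ipld.raw")
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			continue // unreachable peer: try the next one
		}
		if resp.StatusCode == http.StatusOK {
			data, err := io.ReadAll(resp.Body)
			resp.Body.Close()
			return data, err
		}
		resp.Body.Close() // cache miss: try the next peer
	}
	return nil, errors.New("no peer had the block cached")
}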

Open questions:

  • Is a sequential, inexpensive HTTP check enough to avoid amplification attacks?
  • Is it OK to start at the same time as bitswap, or do we want to delay and act as a fallback when we are unable to find the block by regular means (>10-30s)?

B: Set up reverse proxy (nginx, lb) to try rainbows with Cache-Control: only-if-cached first

Writing this down just to have something other than (A), I don't personally believe (B) is feasible.

The idea here is to update the way our infrastructure proxies gateway requests to rainbow instances: first ask all upstream instances within the region for the resource with Cache-Control: only-if-cached, and if none of them has it, retry with a normal request that will trigger p2p retrieval.

The downside here is that this feels like antipattern:

  • Overrides any user-provided Cache-Control
  • Creates cache hot spots: popular data is not distributed across rainbow instances, but always served by a specific instance which fetched it first.

C: Reuse Bitswap client and server we already have

Right now, Rainbow runs Bitswap in read-only mode. It always says it does not have data when asked over bitswap.

What we could do is a permissioned version of peering:

  • libp2p preconnect to a safelisted set of peers and protect these peering connections from being closed
    • If Rainbow does not announce peer records to DHT, we should require full /ip|dns*/.../p2p/peerid, otherwise we
  • (for now) allow serving data over bitswap to a safe-listed set of /p2p/ multiaddrs (quick and easy), leveraging existing peering config / libraries where possible (#35)
  • (allows us to do more in the future) switch to HTTP retrieval (over libp2p or /http)

D: ?

Ideas welcome.

HTTP caching layer directly in the gateway

Caching on top of rainbow means we cannot effectively integrate the content-blocking layer, as the cache would bypass it.

If we can introduce an HTTP cache directly at the gateway handlers that performs moderately well, then we could resolve this problem. Investigate what options exist in Go.

Bitswap options

Bitswap internal options need to be adjustable to real gateway traffic.

Add libp2p resource manager

We might check what customizations over the defaults the current gateways are using and make those configurable too.

Add remote backend mode from bifrost-gateway

Filing here so we don't forget; upstream details in ipfs/boxo#576.

Once we have block / car backends in boxo/gateway, it will be easy to add a config flag to rainbow which disables the libp2p stack and only uses delegated retrieval via HTTP (trustless gateway + delegated routing).

I think the basic configuration options would be (hypothetical usage sketched after the list):

  • RAINBOW_REMOTE_BACKEND with a list of URLs of trustless gateways
  • RAINBOW_REMOTE_BACKEND_MODE=block|car where block is the implicit default (if not set)
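
Hypothetical usage once implemented (variable names as proposed above; image tag as in the Docker section of the README):

docker run --rm -it --net=host \
  -e RAINBOW_REMOTE_BACKEND=https://trustless-gateway.link \
  -e RAINBOW_REMOTE_BACKEND_MODE=block \
  ghcr.io/ipfs/rainbow:main-latest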

End-to-end configuration tests

Rainbow needs end-to-end regression tests that build the binary and confirm sensitive configuration is applied correctly:

  • confirm env RAINBOW_TRUSTLESS_GATEWAY_DOMAINS is applied correctly
  • confirm cli param --trustless-gateway-domains is applied correctly

We can't afford a regression in trustless-only mode, but once we have an end-to-end setup, we can test more with it.
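
A sketch of what such a test could look like (hypothetical domain and CID; the gateway port assumes defaults and would need to match the build):

go build -o rainbow .
RAINBOW_TRUSTLESS_GATEWAY_DOMAINS=example.net ./rainbow &
# a deserialized response on a trustless-only domain must be rejected
curl -s -o /dev/null -w "%{http_code}\n" -H "Host: example.net" \
  "http://127.0.0.1:8090/ipfs/bafkqaaa/"   # expect 4xx, not 200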

Bitswap client only

Every time we connect to someone looking for something, we might receive their wantlist as a present.

We don't care about their wantlist and definitely don't want to give them any traffic over bitswap. Investigate possible mitigations here.

Support consuming https://badbits.dwebops.pub/badbits.deny

We want rainbow to be enough to do badbits handling at ipfs.io / dweb.link.
If it is good enough for us, it will be good enough for people who self-host.

Unfortunately, https://badbits.dwebops.pub/badbits.deny is not append-only (https://github.com/protocol/badbits.dwebops.pub/issues/32733), and HTTPSubscriber from nopfs assumes the list is. We also have a bug around CAR handling.

Below is a list of known issues that we need to close to make it a viable recommendation:

Support direct HTTP retrieval from /https providers

This is the Go version of ipfs/service-worker-gateway#72.

We want rainbow to benefit from /https providers (example) and use them in addition to bitswap.

Ideally, we would be prioritizing HTTP retrieval over bitswap, where possible, as it lowers the cost of content providers, and incentivizes them to configure, expose, and announce HTTPS endpoints.

MVP scope

Focus should be on block (application/vnd.ipld.raw, ?format=raw) requests, as these will always work across all implementations and provide the best cacheability for the HTTP infrastructure we have.

CAR with IPIP-402 may be more involved, and may lead to duplicated block retrievals due to the way loading a page with a dozen subresources works (all share the same parent and are fetched in parallel, which may lead to a racy case where parent blocks are fetched multiple times, slowing down page loads).

Switch storage to badger/v4

Ensure we expose meaningful configuration options and adjust the block size to 256 KiB, which should be the most common.

Debug endpoint to check peering status

It would be advantageous to have a debug endpoint where we can check the peering status. This can be especially useful when using seed-based automatic peering.

The only way of checking it now is using GOLOG_LOG_LEVEL="peering=debug" and checking the debug logs.

Performance benchmarking

I tested Kubo 0.23.0 vs Rainbow (master) on Thunderdome.

We see rainbow is roughly 5x faster in TTFB metrics.

Rainbow uses far fewer resources than Kubo: roughly 10x less mean heap usage, 5x fewer goroutines, and 6x less CPU.

Rainbow has similar request metrics, and seems to be slightly faster.

Looking at the time dimension, Kubo has waves where TTFB increases greatly, while rainbow doesn't.

In general, Rainbow appears to be much more performant in terms of resource usage than Kubo, with slightly better results when processing requests in terms of return codes (with the exception of TTFB, where rainbow is much better), although we see similar graphs of dropped requests/timeouts.
