Δ Dataset Manager (DDM)

What is this?

DDM is a tool for managing and tracking deal replication when onboarding datasets to the Filecoin network via import storage deals. It provides a way to quickly make deals for massive amounts of data, where the transfer itself is better handled out-of-band.

Core Concepts

Dataset

The top-level logical grouping of data in DDM is the dataset. Each dataset is identified by a name (aka "slug") and carries a replication quota, a deal length, and a wallet from which its deals are made. Datasets are created independently of the content that makes them up.

Content

Once a dataset has been created, content may be added to it. Each content item represents a .CAR file: an archive of data that will be shipped to the storage provider (SP) and loaded into their operation. Content is identified by its PieceCID (CommP), has two sizes (the raw file size and the Padded Piece Size), and also records the CID of the actual data (the Payload CID).
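To illustrate how the two sizes relate, here is a minimal sketch in Python. It assumes the general Filecoin piece rules (every 127 bytes of data occupy 128 bytes once padded, and a padded piece size is a power of two no smaller than 128 bytes). The function name padded_piece_size is hypothetical and is not part of DDM, which receives both sizes as inputs rather than computing them.

```python
def padded_piece_size(raw_size: int) -> int:
    """Smallest padded piece size (a power of two, >= 128) that fits raw_size bytes.

    Assumption: a padded piece of P bytes holds P - P/128 bytes of raw data.
    """
    if raw_size <= 0:
        raise ValueError("raw_size must be a positive number of bytes")
    padded = 128
    # Double the candidate size until its usable capacity covers the raw data.
    while padded - padded // 128 < raw_size:
        padded *= 2
    return padded


if __name__ == "__main__":
    # A CAR file of exactly 127 bytes fits the smallest piece; 128 bytes does not.
    print(padded_piece_size(127))
    print(padded_piece_size(128))
```

In practice the PieceCID and both sizes usually come from whatever tool produced the .CAR file, so a calculation like this is mainly useful as a sanity check on the values before adding content.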

Providers

DDM tracks deals to Storage Providers in the network. Add a list of storage providers to DDM before making deals to begin tracking them.

Replication Profiles

A Replication Profile is what ties a Dataset together with a Provider. It defines the parameters for any deals made to that provider for that dataset. Currently, it allows specifying whether to keep an unsealed copy and whether to announce to the IPNI indexer. This allows for flexibility in how deals are made to different providers, such as defining a single SP to host the unsealed copies for retrieval while the others maintain a cold copy for backup.

Replication

Once a Dataset, Content, Providers, and a Replication Profile have been specified, DDM can make replications for the content to the providers. A Replication is a single deal made to a single provider for a single piece of content. Replications are tracked by DDM, and can be queried for status and deal information.
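The relationships between these concepts can be sketched as a small data model. This is an illustration only, not DDM's actual schema: all class names, field names, and the eligible_providers helper are hypothetical, the identifiers in the usage example are made up, and the sketch assumes the replication quota means the target number of replicas per piece of content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str               # the "slug"
    replication_quota: int  # assumed: target number of replicas per piece
    deal_length_days: int
    wallet: str


@dataclass(frozen=True)
class Content:
    dataset: str
    piece_cid: str    # CommP
    payload_cid: str
    size: int         # raw file size in bytes
    padded_size: int  # Padded Piece Size in bytes


@dataclass(frozen=True)
class ReplicationProfile:
    dataset: str
    provider: str
    unsealed: bool  # keep an unsealed copy
    indexed: bool   # announce to the IPNI indexer


@dataclass(frozen=True)
class Replication:
    piece_cid: str
    provider: str
    status: str


def eligible_providers(dataset, content, profiles, replications):
    """Providers with a profile for the dataset that do not yet hold this piece."""
    holders = {r.provider for r in replications if r.piece_cid == content.piece_cid}
    if len(holders) >= dataset.replication_quota:
        return []  # quota already met for this piece
    return [
        p.provider
        for p in profiles
        if p.dataset == dataset.name and p.provider not in holders
    ]


if __name__ == "__main__":
    ds = Dataset("my-dataset", 2, 540, "example-wallet")
    piece = Content("my-dataset", "example-piece-cid", "example-payload-cid", 1000, 1024)
    profiles = [
        ReplicationProfile("my-dataset", "provider-a", unsealed=True, indexed=True),
        ReplicationProfile("my-dataset", "provider-b", unsealed=False, indexed=False),
    ]
    print(eligible_providers(ds, piece, profiles, []))
```

Keeping the profile separate from both the dataset and the provider is what allows one provider to hold the unsealed, indexed copy while the others keep cold copies of the same dataset.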

Instructions

  • Set the DELTA_AUTH environment variable to your Delta API key. It can also be provided via the CLI flag --delta-auth
  • DDM defaults to a Delta instance running at localhost:1414. That instance must be running or DDM will not start. Override the URL by specifying the DELTA_API environment variable, or the CLI flag --delta-api
  • DDM uses the Estuary Auth server by default. It can be overridden by specifying the AUTH_URL environment variable, or the CLI flag --auth-url
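Since DDM will not start without a reachable Delta instance, a wrapper script might check the configuration before launching the daemon. The following Python sketch shows one way to do that. The helper names are hypothetical, the rule that a CLI flag takes precedence over the environment variable is an assumption (the precedence is not documented here), and so is the http:// scheme used for the default URL.

```python
import os
import socket
from urllib.parse import urlsplit


def resolve_setting(flag_value, env_name, default=None, environ=None):
    """Pick a setting from a CLI flag, then an environment variable, then a default.

    Assumption: the flag wins over the environment variable.
    """
    environ = os.environ if environ is None else environ
    if flag_value is not None:
        return flag_value
    return environ.get(env_name, default)


def is_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    if parts.hostname is None or parts.port is None:
        return False  # this sketch requires an explicit scheme://host:port
    try:
        with socket.create_connection((parts.hostname, parts.port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # No flag given here, so fall back to DELTA_API, then to the assumed default.
    delta_api = resolve_setting(None, "DELTA_API", "http://localhost:1414")
    if not is_reachable(delta_api):
        print(f"Delta does not appear to be running at {delta_api}")
```

A TCP check only confirms that something is listening on the port. It does not validate the API key in DELTA_AUTH.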

Usage

DDM runs as a daemon that exposes a web server. Start it with the daemon command.

./delta-dm daemon

By default, delta-dm daemon runs on port 1415. It can be changed with the --port flag or DELTA_DM_PORT environment variable.

Once running, you can interact with DDM through the API, the CLI, or the Delta Web frontend.

API

See the API docs in /docs/api.md.

Command-Line Interface

See the CLI docs in /docs/cmd.md.

Provider Self-service

See docs in /docs/self-service.md.

Importing CIDs from Singularity

See docs in /docs/singularity-import.md.

Developer Tips

By default, DDM will run using a SQLite database. This is fine for development, but for production use, it is recommended to use a Postgres database. To test this, you can run a Postgres instance in Docker and connect to it with DDM.

docker run --name ddm-postgres -p 5432:5432 -e POSTGRES_PASSWORD=password -d postgres:14.7
psql postgres://postgres:password@localhost:5432/ # to connect to the database

Update the env file (or pass the --db flag) to connect to the dev Postgres database.

DB_DSN="postgres://postgres:password@localhost:5432/"
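Before pointing DDM at a database, it can help to confirm that the DSN parses the way you expect. The sketch below uses only the Python standard library. The describe_dsn helper is hypothetical and not part of DDM, and falling back to port 5432 when none is given is an assumption based on the Postgres default.

```python
from urllib.parse import urlsplit


def describe_dsn(dsn):
    """Split a Postgres DSN into its parts, leaving the password out of the result."""
    parts = urlsplit(dsn)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"not a Postgres DSN: scheme is {parts.scheme!r}")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # assumed default when the DSN omits a port
        "database": parts.path.lstrip("/") or None,  # None: no database named
    }


if __name__ == "__main__":
    print(describe_dsn("postgres://postgres:password@localhost:5432/"))
```

The password is deliberately dropped from the returned dictionary so the result is safe to print or log.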

Contributors

anjor, elijaharita, jcace, lucroy
