4mc

4mc - splittable LZ4 and ZSTD in Hadoop/Spark/Flink

airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

angel

A Flexible and Powerful Parameter Server for large-scale machine learning

antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

arctic

Arctic is a streaming lake warehouse service open-sourced by NetEase

arcticdb

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

aresdb

A GPU-powered real-time analytics storage and query engine.

arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing

arrow-rs

Official Rust implementation of Apache Arrow

arrow-spark-publication

Implementation connecting Arrow to Spark, effectively making all code related to reading in Spark redundant.

arroyo

Arroyo is a distributed stream processing engine written in Rust

avro

Apache Avro is a data serialization system.

awadb

AI-native database for embedding vectors

beam

Apache Beam is a unified programming model for batch and streaming data processing

blaze

A blazing-fast query execution engine that speaks the Apache Spark language and has Arrow DataFusion at its core.

braft

An industrial-grade C++ implementation of the Raft consensus algorithm based on brpc, widely used inside Baidu to build highly available distributed systems.

buck2

Build system, successor to Buck

buf

A new way of working with Protocol Buffers.

ceph

Ceph is a distributed object, block, and file storage platform

ceresdb

CeresDB is a high-performance, distributed, cloud-native time-series database.

chubaofs

ChubaoFS (abbrev. CBFS) is a cloud-native distributed file system and object store.

