brucemen711,github

airbyte

Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.

appdocs

Application Performance Optimization Summary

awesome-datascience

An awesome Data Science repository to learn and apply for real-world problems.

awesome-scalability

The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

awesome-sysadmin

A curated list of amazingly awesome open source sysadmin resources inspired by Awesome PHP.

book-notes

Notes from books and other interesting things that I've read. Table of contents at the end 👇

citus

Scalable PostgreSQL for multi-tenant and real-time analytics workloads

dag-factory

Dynamically generate Apache Airflow DAGs from YAML configuration files

data-science-ipython-notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

databaseology

Collection of Papers On Database Management Systems

dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks

debezium-examples

Examples for running Debezium (Configuration, Docker Compose files etc.)

etl-with-airflow

ETL best practices with Airflow, with examples

herringbone

Tools for working with Parquet, Impala, and Hive

hive-metastore-docker

Example for the article "Running Spark 3 with standalone Hive Metastore 3.0"

iceberg

Apache Iceberg

iceberg-stack-docker

Iceberg Stack

incubator-dolphinscheduler

Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available `out of the box`.

integrations-extras

Community-developed integrations and plugins for the Datadog Agent.

k8s-example

example

kafka-spark-consumer

High Performance Kafka Connector for Spark Streaming. Supports Multi Topic Fetch, Kafka Security. Reliable offset management in Zookeeper. No Data-loss. No dependency on HDFS and WAL. In-built PID rate controller. Supports Message Handler. Offset Lag checker.
