dataengineering101's Introduction

What is it?

  • A companion to DataScience101: personal notes on becoming a better machine learning engineer.

  • 50+ notes so far, continuously updated.

Course & Introduction

data_engineering_brief_intro_by_google

data engineer roadmap 2021

awesome data engineering tools

Emerging Architectures for Modern Data Infrastructure 2022

Books

Designing Data-Intensive Applications

Hands On Course

Youtube - hands on data engineering on gcp

Google - data engineering course

DataCamp - Data Engineer with Python

Udemy - Data Engineering on Google Cloud Platform

CSXXX

CS246 Mining Massive datasets - Notes

Note of CS246 Mining Massive Datasets

Note of CS329S Machine Learning System Design

Theorem

Designing Data-Intensive Applications

Chp1 Reliable, Scalable, and Maintainable Data Systems

Chp2 Data Models and Query Languages

Chp3 Data Storage and Retrieval

Database Fundamentals

CAP theorem

OLTP vs OLAP (database vs data warehouse)

Dimensional Modeling

data warehouse, data lake, data mesh and other buzzwords

Data Processing

Lambda and Kappa Architecture

Computational Framework Survey

Batch

Spark - installation

pyspark 101

RAPIDS for spark

Streaming

data ingestion

streaming framework survey

spark streaming introduction

structured streaming introduction

case study - near-realtime architecture for recommender in LinkedIn

realtime mvp for recommendation from Chip Huyen

Cloud Logging

Pipeline Management (ETL Management)

data pipeline 101 - I - mirroring

data pipeline 101 - II - partition mirroring

data pipeline 101 - II - accumulated mirroring

data pipeline 101 - III - etl, elt

data pipeline 101 - IV - pipeline design - functionality

data pipeline 101 - V - Idempotency

data pipeline 101 - VI - Guard

data pipeline 101 - VII - Checkpoint, Security, Accounts

data pipeline 101 - VIII - etl development

schema-changeable system

data modeling

Data Governance

data governance

metadata management

Google Cloud Platform

GCP command

GCP data_lake_warehouse

GCP BigQuery

GCP streaming

Google App Engine

Google Kubernetes Engine Introduction

Google Kubernetes Getting Started

VPC

PubSub

IAM

Kubernetes

Kubernetes for the Absolute Beginners - Hands-on

BigData Algorithm

Storage

lsh family

join algorithm

Relational databases

MySQL install and python connector

database wrapper sqlalchemy, pymysql, pyodbc

Basic sql injection

sql hint

sql 101

Non-relational databases

Document

ElasticSearch 101

Graph

Key-Value

redis 101

Wide Column

Workflow Scheduling

airflow 101

other python scheduler

Crawler

Coding stuff

python crawler packages

Web Analysis

web analysis hits

HTML Tags

GraphQL

Cache

Introduction

Data Ingestion & Store

sync tables from database

Primary Key, Index and Partition

CI/CD

Introduction & Circle CI

Data Parsing / Cleaning 101

data cleaning for traffic analysis

Infra101

data science workstation

dataengineering101's People

Contributors

yltsai0609
