satyam245
Name: Satyam
Type: User
Bio: I am an aspiring Data Engineer with a passion for designing, building and optimizing data pipelines.
Streamline logistics data orchestration with Apache Airflow on Google Cloud Platform. Automate ingestion, transformation, and storage of CSV files in Google Cloud Storage (GCS) into Hive tables on Google Cloud Dataproc. Utilizes dynamic partitioning for scalability and efficiency.
Efficiently ingest daily airline data into AWS using a seamless end-to-end pipeline, integrating S3 uploads, Glue schema discovery, Redshift data transformation, and SNS notifications.
‘Save To the Cloud’ is a full-stack web application for storing files on cloud infrastructure.
Personal Data Engineering Projects
Data Engineering with AWS, Published by Packt
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Data Engineering YouTube Analysis Project by Darshil Parmar
😈 Complete End-to-End ETL Pipeline with Spark, Airflow, & AWS
🌐 Seamless E-commerce Data Integration Pipeline using Python, GCP Pub/Sub, DataStax Cassandra, and Pandas. This repository includes scripts to load, publish, and consume data, along with instructions for setting up the environment. Simplify your data integration process for e-commerce orders with this efficient and scalable solution.
Transform daily bank transactions effortlessly with this AWS ETL pipeline. Ingest CSVs to S3, trigger Glue jobs with Lambda, store securely in Parquet, and analyze seamlessly using Athena.
Learn the entire ETL process based on Spotify API data
Google IT Automation with Python Professional Certificate - Practice files
A Python application integrating Kafka and MongoDB for efficient logistics data processing, with Avro serialization, Docker scaling, and an API for seamless interaction.
The Order Tracking Incremental Load Project automates the integration of order tracking data using Apache Spark in Databricks. Leveraging Google Cloud Storage (GCS) for input, it features efficient stage processing, upserts to a target Delta table, and automated execution.
"Real-Time Data Processing with GCP Pub/Sub and DataStax Cassandra" is a project demonstrating the integration of Google Cloud Platform's Pub/Sub and DataStax Cassandra for efficient real-time data processing. It handles orders and payments data streams, ingesting them via Pub/Sub and storing them in Cassandra tables.
Stock Market Kafka Project: A robust data pipeline leveraging Python, Confluent Kafka, AWS S3, IAM, Glue, and Athena to efficiently process and analyze stock market data. Streamline your workflow from CSV ingestion to insights with this comprehensive solution.
A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.