Hi, I'm Martin. Welcome to my GitHub

Data Engineer @Honda Research Institute/99P Labs. Data Analytics Instructor @COOP Careers. Curious human.

About Me

I am a data engineer and a data analytics instructor with experience across various industries, including higher education and automotive.

At Honda Research Institute/99P Labs, I develop streaming data pipelines using open-source technologies such as Apache Spark and Kafka. At COOP Careers, I teach fundamental skills in Excel, SQL, Tableau, and Python to aspiring data analysts and provide guidance to junior instructors. My previous experiences range from higher education to web development, content editing, and software quality assurance.

I am passionate about technology, education, and helping people who want to break into the tech industry by making instruction in the basics accessible and easy to digest. Check out some of the work that my friends and I have created to help spread free tech education at The Freestack Initiative.

Click here to learn a little more about my work.

Projects

Here are some projects that I'm particularly proud of (WIP = Work in Progress):

DataLab (WIP)

DataLab is a curated, extensible data analytics environment that helps you get hands-on practice with common industry tools. This project was born out of a desire to practice the skills that I consistently saw in data and analytics engineering positions, as well as wanting to learn Docker. Think of it as a data laboratory in a box - using Docker, I created containers for a Python environment, a database server, a database administration tool, and a visualization tool. Using this library, you can avoid the headache of setting up your own database server and getting all these pieces connected and talking to one another. Or, if you're curious, you can dig into what I did and make it your own. This project is still under development.


teachdb (WIP)

teachdb is an in-memory micro relational database, powered by DuckDB. It was made with two types of users in mind: instructors who want to teach SQL concepts, and students who want to learn and practice the fundamentals. Combined with a Jupyter Notebook, teachdb provides a database that can be used to demonstrate fundamental SQL concepts such as select queries, filtering, aggregations, and joins. It can even be used to introduce more advanced topics such as analytical/window functions, common table expressions (CTEs), and data definition language (DDL) commands.

For students, it provides a safe environment to learn and experiment with a SQL database without the need to set up their own server or download additional software.
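To give a feel for the kind of lesson an in-memory database makes possible, here is a minimal sketch of filtering, a join with aggregation, and a CTE. It does not use teachdb or DuckDB: it swaps in Python's standard-library sqlite3 module as a stand-in so that it runs with no installation, and the table names, column names, and rows are all made up for illustration.

```python
import sqlite3

# Stand-in for an in-memory teaching database. This is sqlite3 from the
# standard library, NOT teachdb or DuckDB; tables and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE scores (student_id INTEGER, subject TEXT, score INTEGER);
    INSERT INTO students VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO scores VALUES
        (1, 'sql', 90), (1, 'python', 80),
        (2, 'sql', 70), (2, 'python', 100);
    """
)


def run(query, params=()):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(query, params).fetchall()


# Filtering: keep only rows at or above a score threshold.
high_scores = run(
    "SELECT student_id, subject FROM scores "
    "WHERE score >= ? ORDER BY student_id, subject",
    (90,),
)

# Join + aggregation: average score per student. The LEFT JOIN keeps
# students who have no scores, so their average comes back as NULL (None).
averages = run(
    """
    SELECT s.name, AVG(sc.score) AS avg_score
    FROM students AS s
    LEFT JOIN scores AS sc ON sc.student_id = s.id
    GROUP BY s.id, s.name
    ORDER BY s.id
    """
)

# CTE: name an intermediate result, then query it like a table.
top_subject = run(
    """
    WITH subject_avg AS (
        SELECT subject, AVG(score) AS avg_score
        FROM scores
        GROUP BY subject
    )
    SELECT subject FROM subject_avg
    ORDER BY avg_score DESC, subject
    LIMIT 1
    """
)

print(high_scores)
print(averages)
print(top_subject)
```

Because everything lives in memory, a student can rerun the cell to reset the data after experimenting with inserts, updates, or dropped tables.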


I recently worked with COOP Careers to revise their introductory SQL curriculum for the Fall 2023 semester. The final deliverable includes approximately 6 hours' worth of material designed to take learners with little to no SQL experience and get them ready for technical interviews. There is a crash course on database theory, a short course on combining data in SQL, and three Jupyter Notebooks that include interactive SQL lessons utilizing my teachdb library.

By leveraging teachdb and Google Colab, we are able to set up a basic database environment within a notebook that runs in the browser. This means that all students need in order to work with and learn on a real database is an internet connection - no configuration required - which was really important for the COOP community.


This is one of my favorite projects because I was fortunate to be able to have a positive impact on a great organization while sharpening my own skills. In Fall 2021, I worked with a non-profit organization called Celebrate Dyslexia to analyze raw text data to help them create a curriculum for young children with dyslexia.

I created a web scraper to grab the data, used an NLP model to help me extract verbs, analyzed the results to find the most common verbs across each subject, and created a dashboard that would allow the team to interactively look through the results to help craft their curriculum.
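The counting step at the end of that pipeline can be sketched roughly as follows. This is an illustration only, not the code used in the project: it assumes the scraping and NLP steps have already produced (subject, verb) pairs, and the function name `most_common_verbs` and the sample pairs are hypothetical.

```python
from collections import Counter, defaultdict


def most_common_verbs(tagged_verbs, top_n=3):
    """Return the top_n most frequent verbs for each subject.

    tagged_verbs is an iterable of (subject, verb) pairs, assumed to come
    from an upstream verb-extraction step that is not shown here.
    """
    counts = defaultdict(Counter)
    for subject, verb in tagged_verbs:
        # Lowercase so that "Add" and "add" are counted as the same verb.
        counts[subject][verb.lower()] += 1
    return {
        subject: counter.most_common(top_n)
        for subject, counter in counts.items()
    }


# Made-up sample input, for illustration only.
sample = [
    ("math", "add"),
    ("math", "Add"),
    ("math", "count"),
    ("reading", "read"),
    ("reading", "spell"),
    ("reading", "read"),
]

result = most_common_verbs(sample, top_n=1)
print(result)
```

A per-subject table of counts like this is the kind of structure a dashboard could then filter and display interactively.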

Overall, the project was a great success and my work had a positive impact on the organization.

Martin Arroyo's Projects

animequote

Generate a random quote from your favorite anime title or character with Golang. This is a simple project that I am using to help me learn how to program in Go.

arknights

One of my very first projects, completed during college. Learning web design and applying my knowledge of HTML, CSS, and Java to build my own project gave me a passion for creating things from a vision. However, there is still plenty of room for improvement.

coop-da-env

A portable learning environment for the COOP Data Analytics Track

coop_grader

A package to assist with autograding Jupyter Notebook assignments for COOP Careers Data Analytics track

cv

My Personal Website

datalab

A curated, open source, data analytics laboratory

datalab-py

The Docker image for the DataLab Python environment

dbtoolbox

A toolbox of wrapper functions for common database tasks

de-env

A sample data pipeline setup with Docker.

exoplanet

An end-to-end data pipeline, including analysis, using data from The Extrasolar Planets Encyclopaedia.

exoplanet_analysis

An ETL Pipeline and analysis of Exoplanet data from The Extrasolar Planets Encyclopaedia

f1tenth_gym

This is the repository of the F1TENTH Gym environment.

golang-practice

A repository with practice code, notes, and a defined environment for practicing and learning Golang

golinkedlist

A practice exercise in implementing a simple LinkedList using Golang

hudi

Upserts, Deletes And Incremental Processing on Big Data.
