Welcome to Module 1 projects! You have been assigned to a pair and will have until Tuesday evening to work on your project. On Wednesday, we'll move on to Module 2 and hold the science fair in the afternoon.
Projects are designed to review the material we covered in Module 1:
- data structures
- object orientation
- SQL
- ORMs
- Flask and Dash
- ETL
Requirements for projects are the following:
- At least 3 models. The models must include a belongs-to/has-many (one-to-many) relationship.
- Use of ORM aggregate methods: AVG, SUM, COUNT, etc.
- ETL with use of at least one, and ideally two, APIs. Alternatively, you can scrape data from a website. (Essentially, you are not allowed to simply download a pre-built CSV file.)
- A Dash app that is connected to a Flask/SQLAlchemy back end
- Choose a question you want to answer with data
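
The model and aggregate requirements above can be sketched roughly as follows. This is a minimal illustration, not a required design: the `City`, `Station`, and `Reading` models, their columns, and the in-memory SQLite database are all assumptions made for the example, and it uses plain SQLAlchemy (1.4 or later) rather than any particular Flask integration.

```python
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


# Hypothetical models: a City has many Stations, a Station has many Readings.
class City(Base):
    __tablename__ = "cities"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    stations = relationship("Station", back_populates="city")  # has many


class Station(Base):
    __tablename__ = "stations"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    city_id = Column(Integer, ForeignKey("cities.id"))
    city = relationship("City", back_populates="stations")  # belongs to
    readings = relationship("Reading", back_populates="station")  # has many


class Reading(Base):
    __tablename__ = "readings"
    id = Column(Integer, primary_key=True)
    value = Column(Float, nullable=False)
    station_id = Column(Integer, ForeignKey("stations.id"))
    station = relationship("Station", back_populates="readings")  # belongs to


# Assumption: an in-memory SQLite database, used only for illustration.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)


def seed(session):
    """Insert a few made-up rows so the aggregate query has something to count."""
    city = City(name="Exampleville")
    Station(name="North", city=city, readings=[Reading(value=10.0), Reading(value=20.0)])
    Station(name="South", city=city, readings=[Reading(value=30.0)])
    session.add(city)  # stations and readings are saved through the relationships
    session.commit()


def station_stats(session):
    """Return (station name, COUNT, AVG, SUM) of readings for each station."""
    rows = (
        session.query(
            Station.name,
            func.count(Reading.id),
            func.avg(Reading.value),
            func.sum(Reading.value),
        )
        .join(Reading, Reading.station_id == Station.id)
        .group_by(Station.id, Station.name)
        .order_by(Station.name)
        .all()
    )
    return [tuple(row) for row in rows]
```

Opening a session with `with Session(engine) as session:` and calling `seed(session)` followed by `station_stats(session)` gives one summary row per station. The foreign key lives on the "belongs to" side (`Station.city_id`, `Reading.station_id`), which is what makes the has-many side a simple list.
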
For inspiration, look at other data projects:
- reddit - data is beautiful
Look at some nice datasets:
- 538 datasets
- github awesome data
- nyc datasets
- Get a sense of whether or not you can answer your original question with the data
- Are the datasets high quality? Are there any missing values?
- Can the data be modeled with relationships? What are the relationships?
- Do some basic exploration of the datasets in a Jupyter notebook
- Clean your data and ensure that each of your fields is in a sensible format/datatype
- Model your database, and build your models
- Integrate with API and seed database using ETL
- Build some dummy data in your models
- Begin with some basic charts
- Add callback functions for interactive components
- Make visualizations with data from your database
- Iterate, iterate, iterate
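
The cleaning and seeding steps above might look something like the sketch below. It is only an illustration under stated assumptions: `RAW_RECORDS` is a made-up payload standing in for a real API response, the `readings` table and its columns are hypothetical, and the load step uses the standard-library `sqlite3` module to stay self-contained, whereas in your project you would load through your own ORM models.

```python
import sqlite3
from datetime import datetime

# Hypothetical payload standing in for the decoded JSON an API might return.
RAW_RECORDS = [
    {"station": " North ", "recorded_at": "2019-03-01T08:00:00", "value": "10.5"},
    {"station": "South", "recorded_at": "2019-03-01T08:00:00", "value": None},
    {"station": "south", "recorded_at": "2019-03-01T09:00:00", "value": "12"},
]


def extract():
    """Extract: a real project would request and decode data from an API here."""
    return RAW_RECORDS


def transform(records):
    """Transform: drop rows with missing values and coerce fields to sensible types."""
    cleaned = []
    for record in records:
        if record["value"] is None:
            continue  # skip incomplete rows rather than guessing a value
        cleaned.append(
            {
                "station": record["station"].strip().title(),
                "recorded_at": datetime.fromisoformat(record["recorded_at"]),
                "value": float(record["value"]),
            }
        )
    return cleaned


def load(connection, rows):
    """Load: create the table if needed, insert the cleaned rows, return the count."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "station TEXT NOT NULL, recorded_at TEXT NOT NULL, value REAL NOT NULL)"
    )
    connection.executemany(
        "INSERT INTO readings (station, recorded_at, value) VALUES (?, ?, ?)",
        [(r["station"], r["recorded_at"].isoformat(), r["value"]) for r in rows],
    )
    connection.commit()
    return len(rows)
```

A typical run would be `load(sqlite3.connect(":memory:"), transform(extract()))`. Keeping the three stages as separate functions makes it easy to test the cleaning logic on dummy data before pointing `extract` at a live API.
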