This short lesson summarizes the topics we covered in section 15 and why they'll be important to you as a data scientist.
You will be able to:
- Understand and explain what was covered in this section
- Understand and explain why this section will help you become a data scientist
In this section we reviewed SQL, the language you will often use to retrieve information for your projects, and object-oriented (OO) programming in Python, a skill that will be invaluable as you work on bigger projects or have to productionize your models. We also got some hands-on experience with SQLAlchemy, one of the most popular ORMs in the Python world. Key takeaways include:
- Normalization refers to the practice of spreading information across multiple tables in a database to reduce duplication of information
- Entity Relationship Diagrams (ERDs) are a common way for developers to document the structure (schema) of their databases. When you start working with a new database, they can be a good starting point in understanding how the database is structured (although they are neither necessary nor sufficient for fully understanding a database)
- Object Relational Mappers (ORMs) make it easy for programmers writing OO code to retrieve information from a database without having to write a bunch of SQL by hand
- Many-to-many relationships require a join table to connect the two tables, e.g. an OrderItem table that links the Products in a catalog to the Orders that contain them
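
To make the normalization and join-table takeaways concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than SQLAlchemy, so it runs without any installation. The table names, column names, and sample rows (`products`, `orders`, `order_items`, and the customers) are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: each product and each order is stored exactly once
# (normalization), and order_items is the join table that connects them.
conn.executescript("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        price REAL NOT NULL
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL
    );
    CREATE TABLE order_items (
        order_id INTEGER NOT NULL REFERENCES orders(id),
        product_id INTEGER NOT NULL REFERENCES products(id),
        quantity INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );

    INSERT INTO products VALUES (1, 'Keyboard', 50.0), (2, 'Mouse', 20.0);
    INSERT INTO orders VALUES (1, 'Ada'), (2, 'Grace');
    -- One order can hold many products, and one product can appear in many orders.
    INSERT INTO order_items VALUES (1, 1, 1), (1, 2, 2), (2, 2, 1);
""")

# Two joins walk from orders, through the join table, to products.
rows = conn.execute("""
    SELECT o.customer, p.name, oi.quantity
    FROM orders AS o
    JOIN order_items AS oi ON oi.order_id = o.id
    JOIN products AS p ON p.id = oi.product_id
    ORDER BY o.id, p.id
""").fetchall()

for customer, product, quantity in rows:
    print(customer, product, quantity)
```

Note that the product name 'Mouse' is stored in a single row of `products` even though it appears in two orders. That is the duplication normalization avoids, and the join table is what makes it possible to reassemble the full picture at query time.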