Here's a bonus real-world example lab. In this repo, we have a dataset collected by FiveThirtyEight, a popular online publication known for its data-focused articles.
The dataset lists every guest Jon Stewart had as host of 'The Daily Show'. For each guest it includes their name, the date they appeared on the show, their occupation, and the category ("Group") that occupation falls under.
- First, in the `parse.rb` file, write a script to convert the rows of this CSV into a SQL database that we can run queries against.
- Write a query to answer each of the following questions. Once you've come up with a working query, write it in the `answers.sql` file.
  - Who did Jon Stewart have on 'The Daily Show' the most?
  - What was the most popular profession of guest for each year Jon Stewart hosted 'The Daily Show'?
  - What profession was on the show most overall?
  - How many guests with the first name Bill did Jon Stewart have on?
  - What dates did Patrick Stewart appear on the show?
  - Which year had the most guests?
  - What was the most popular "Group" for each year Jon Stewart hosted?
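
As a starting point for the first step, here is a minimal sketch of how `parse.rb` might turn CSV rows into SQL statements. It is one possible approach, not the expected solution. The header names in the sample (`YEAR`, `GoogleKnowledge_Occupation`, `Show`, `Group`, `Raw_Guest_List`), the `guests` table, its column names, and the two sample rows are all assumptions made for illustration, so check them against the actual CSV in the repo. The sketch only builds SQL text; it does not talk to a database itself.

```ruby
require "csv"

# Made-up sample rows for illustration. The header names are an assumption;
# compare them with the first line of the real CSV file.
SAMPLE_CSV = <<~CSV
  YEAR,GoogleKnowledge_Occupation,Show,Group,Raw_Guest_List
  1999,actor,1/11/99,Acting,Guest A
  1999,comedian,1/12/99,Comedy,Guest O'Example
CSV

# Hypothetical schema: one row per guest appearance.
CREATE_TABLE = <<~SQL.strip
  CREATE TABLE guests (
    id INTEGER PRIMARY KEY,
    year INTEGER,
    occupation TEXT,
    show_date TEXT,
    category TEXT,
    name TEXT
  );
SQL

# Quote a value as a SQL string literal, doubling embedded single quotes.
def sql_quote(value)
  "'#{value.to_s.gsub("'", "''")}'"
end

# Return an array of SQL statements: the CREATE TABLE, then one INSERT per CSV row.
def build_sql(csv_text)
  statements = [CREATE_TABLE]

  CSV.parse(csv_text, headers: true).each do |row|
    values = [
      Integer(row["YEAR"]),
      sql_quote(row["GoogleKnowledge_Occupation"]),
      sql_quote(row["Show"]),
      sql_quote(row["Group"]),
      sql_quote(row["Raw_Guest_List"])
    ]
    statements << "INSERT INTO guests (year, occupation, show_date, category, name) " \
                  "VALUES (#{values.join(', ')});"
  end

  statements
end

puts build_sql(SAMPLE_CSV)
```

To use this on the real data, you could replace `SAMPLE_CSV` with `File.read` on the CSV file in the repo and redirect the output to a `.sql` file, which can then be loaded into a database such as SQLite. Generating plain SQL text keeps the sketch free of extra gems; inserting rows directly through a database library is an equally reasonable design.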