Working with Known JSON Schemas - Lab

Introduction

In this lab, you'll practice working with JSON files whose schema you know beforehand.

Objectives

You will be able to:

  • Use the json module to load and parse JSON documents
  • Extract data using predefined JSON schemas
  • Convert JSON to a pandas DataFrame

Reading a JSON Schema

Here's the JSON schema provided for a section of the NY Times API:

or a fully expanded view:

You can read more in the documentation here.

Note that this is a different schema than the schema used in the previous lesson, although both come from the New York Times.

Loading the JSON Data

Open the JSON file located at ny_times_movies.json, and use the json module to load the data into a variable called data.

# Your code here
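
As a rough illustration of the loading pattern (not the lab solution itself), the sketch below writes a tiny stand-in file to a temporary directory and reads it back with json.load. The file name sample_movies.json, the sample_ variable names, and the sample contents are all assumptions made for this example; only the 'num_results' and 'results' keys come from the lab text.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for the real file; contents are invented for illustration
sample = {"num_results": 1, "results": [{"byline": "EXAMPLE CRITIC"}]}

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample_movies.json"
    sample_path.write_text(json.dumps(sample), encoding="utf-8")

    # json.load takes an open file object and returns plain Python objects
    with open(sample_path, encoding="utf-8") as f:
        sample_data = json.load(f)

print("`sample_data` has type", type(sample_data))
print("The keys are", list(sample_data.keys()))
```

The same open-then-json.load pattern should apply to ny_times_movies.json, assuming the file sits in the working directory.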

Run the code below to investigate its contents:

# Run this cell without changes
print("`data` has type", type(data))
print("The keys are", list(data.keys()))

Loading Results

Create a variable results that contains the value associated with the 'results' key.

# Your code here

Below we display this variable as a table using pandas:

# Run this cell without changes
import pandas as pd
df = pd.DataFrame(results)
df

Data Analysis

Now that you have a general sense of the data, answer some questions about it.

How many results are in the file?

The metadata says this:

# Run this cell without changes
data['num_results']

Double-check that by looking at results. Does it line up?

# Your code here
"""
Your written answer here
"""
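
One way to make this comparison, sketched on invented sample data rather than the lab file: take the length of the list under 'results' and compare it with the 'num_results' metadata value. The sample_ names and the records themselves are assumptions for illustration.

```python
# Hypothetical parsed JSON with the same top-level keys the lab describes
sample_data = {
    "num_results": 3,
    "results": [
        {"byline": "CRITIC ONE"},
        {"byline": "CRITIC TWO"},
        {"byline": "CRITIC ONE"},
    ],
}

sample_results = sample_data["results"]

# The metadata count and the actual list length should agree
counts_match = len(sample_results) == sample_data["num_results"]
print(len(sample_results), sample_data["num_results"], counts_match)
```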

How many unique critics are there?

A critic's name can be identified using the 'byline' key. Assign your answer to the variable unique_critics.

# Your code here
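
A minimal sketch of counting distinct values for one key, again on made-up records: collecting the 'byline' values into a set removes duplicates, and the set's length is the number of unique critics. The sample_ names and byline strings are hypothetical.

```python
# Hypothetical review records; only the 'byline' key is taken from the lab text
sample_results = [
    {"byline": "CRITIC ONE"},
    {"byline": "CRITIC TWO"},
    {"byline": "CRITIC ONE"},
]

# A set keeps each distinct byline once
sample_bylines = {review["byline"] for review in sample_results}
sample_unique_critics = len(sample_bylines)
print(sample_unique_critics)
```

This assumes bylines are spelled consistently; differing capitalization or whitespace would count as separate critics.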

This code checks your answer.

# Run this cell without changes
assert unique_critics == 7

Flattening Data

Create a list review_urls that contains the URL for each review. This can be found using the 'url' key nested under 'link'.

# Your code here (create more cells as needed)
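
The nested lookup described above can be sketched with a list comprehension that indexes first by 'link' and then by 'url'. The records and example.com URLs below are invented for illustration, and the sample_ names are assumptions; the real file's records contain more fields.

```python
# Hypothetical records with a nested 'link' dictionary, mirroring the lab's description
sample_results = [
    {"link": {"url": "https://example.com/review-one"}},
    {"link": {"url": "https://example.com/review-two"}},
]

# Index into the outer dict, then the nested one, for every record
sample_urls = [review["link"]["url"] for review in sample_results]
print(sample_urls)
```

If some records might lack a 'link' entry, chained .get calls would avoid a KeyError, though the known schema suggests every record has one.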

The following code will check your answer:

# Run this cell without changes

# review_urls should be a list
assert type(review_urls) == list

# The length should be 20, same as the length of results
assert len(review_urls) == 20

# The data type contained should be string
assert type(review_urls[0]) == str and type(review_urls[-1]) == str

# Spot checking a specific value
assert review_urls[6] == 'http://www.nytimes.com/2018/10/11/movies/barbara-review.html'

Summary

Well done! In this lab you continued to practice extracting and transforming data from JSON files with known schemas.

Contributors

mathymitchell, lmcm18, mas16, fpolchow, hoffm386
