Python scripts for scraping movie ratings and reviews from Common Sense Media, built with BeautifulSoup.
To run the scripts, install the following packages:
- BeautifulSoup (published on PyPI as beautifulsoup4)
- requests
- pandas
To build the list of movies, run the following command:
python scrape_movies_list.py
# Outputs a list of movies with their URLs
To scrape movie details, run the following command:
python get_data.py
# Outputs a dataset of movie details
This creates a CSV file with the following columns:
- Movie Title
- Movie URL
- Movie Rating
- Movie Year
- Movie Genres
- Movie Reviews by Available Categories: Positive Messages, Role Models, Consumerism, and Other. If a category is unavailable for a movie, its cell is left empty.
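As a rough illustration of working with the output, the sketch below reads a CSV in the column layout listed above and treats empty review cells as missing values. It uses only the standard library csv module. The sample rows are placeholders rather than real scraped data, the helper name load_movies is hypothetical, and it assumes each review category (for example Positive Messages) is stored in a column of that name, which this README does not confirm.

```python
import csv
import io

# Placeholder rows in the documented column layout; not real scraped data.
# Assumption: each review category is its own column, e.g. "Positive Messages".
SAMPLE_CSV = """Movie Title,Movie URL,Movie Rating,Movie Year,Movie Genres,Positive Messages
Example Movie A,https://example.com/a,4,2001,Comedy,Some review text
Example Movie B,https://example.com/b,3,2015,Drama,
"""


def load_movies(handle):
    """Read rows from an open CSV file, mapping empty cells to None."""
    reader = csv.DictReader(handle)
    return [
        {column: (value if value != "" else None) for column, value in row.items()}
        for row in reader
    ]


movies = load_movies(io.StringIO(SAMPLE_CSV))

# Titles of movies that have no review in the "Positive Messages" category.
missing = [m["Movie Title"] for m in movies if m["Positive Messages"] is None]
print(missing)
```

To read the real output instead, pass a file opened with open(path, newline="", encoding="utf-8") to load_movies, where path is whatever file get_data.py wrote.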
get_data.py scrapes the details of every movie in the list, around 11,000 movies, and saves them to the CSV file. There is currently no option to limit the number of movies scraped.
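Until a limit option exists, one possible workaround is to cap the list of URLs before scraping. The following is a minimal sketch of that idea, not code from get_data.py: scrape_limited and fake_scrape are hypothetical names, and fake_scrape stands in for the real per-movie scraping function so the example runs without network access.

```python
from itertools import islice


def scrape_limited(urls, scrape_one, limit=None):
    """Apply scrape_one to at most `limit` URLs (all of them if limit is None)."""
    return [scrape_one(url) for url in islice(urls, limit)]


# Hypothetical stand-in for the real per-movie scraper.
def fake_scrape(url):
    return {"Movie URL": url}


# Placeholder URLs; the real list comes from scrape_movies_list.py.
urls = [f"https://example.com/movie-{i}" for i in range(10)]

rows = scrape_limited(urls, fake_scrape, limit=3)
print(len(rows))
```

Because islice stops after the requested number of items, only the first few pages would be requested, which is useful for a quick test run before committing to the full list.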
For research and educational purposes only.
This data belongs to Common Sense Media and must not be used for commercial or non-academic purposes. If you use this dataset as a whole, or create a visualization based on it, you must publish your work with the following attribution:
Copyright (c) 2021 Common Sense Media
Common Sense Media Dataset by Aman Bhargava is licensed under CC BY-NC 4.0