
simonw / cdc-vaccination-history


A git scraper recording the CDC's Covid Data Tracker numbers on number of vaccinations per state.

Home Page: https://cdc-vaccination-history.datasette.io/

Python 100.00%
git-scraping

cdc-vaccination-history's Introduction

cdc-vaccination-history

Project retired as of 25th October 2023

A git scraper recording the CDC's Covid Data Tracker numbers on number of vaccinations per state.

Archives the JSON from https://covid.cdc.gov/covid-data-tracker/COVIDData/getAjaxData?id=vaccination_data every time it changes, checking three times an hour.
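The core of the "archive it every time it changes" idea can be sketched in a few lines of Python. This is a minimal illustration, not the code this repository runs (the real scraper is a GitHub Actions workflow); the output file name vaccination_data.json and the helper names are assumptions made for the example.

```python
from pathlib import Path
from urllib.request import urlopen

URL = (
    "https://covid.cdc.gov/covid-data-tracker/COVIDData/"
    "getAjaxData?id=vaccination_data"
)


def save_if_changed(path, new_bytes):
    """Write new_bytes to path only if it differs from what is on disk.

    Returns True when the file was created or updated, False otherwise.
    """
    path = Path(path)
    if path.exists() and path.read_bytes() == new_bytes:
        return False
    path.write_bytes(new_bytes)
    return True


def scrape(path="vaccination_data.json"):
    """Fetch the JSON and store it; the file name is hypothetical."""
    with urlopen(URL, timeout=30) as response:
        return save_if_changed(path, response.read())
```

A scheduled job would call scrape() and commit the file only when it returns True, so the git history ends up with one commit per observed change.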

Watch "Git scraping, the five minute lightning talk" to see me live-code the creation of this repository.

This data as CSV

If you want to grab the entire dataset I'm now publishing it as two CSV files here:

This data in Datasette

The build_database.py script loops through the full commit history and uses it to build a SQLite database with a row for every daily report, mainly as a demonstration of how Python code can be used to extract data from a git scraped repository.
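As a rough sketch of that technique (not the actual build_database.py), the following walks every committed version of a file with plain git commands and loads the rows into SQLite. The JSON key vaccination_data and the row fields Location and Date are assumptions about the file's shape, and the table layout is invented for the example.

```python
import json
import sqlite3
import subprocess


def iter_file_versions(repo_dir, path):
    """Yield (commit_hash, file_text) for each commit touching path, oldest first."""
    hashes = subprocess.run(
        ["git", "log", "--reverse", "--format=%H", "--", path],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout.split()
    for commit in hashes:
        text = subprocess.run(
            ["git", "show", f"{commit}:{path}"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        ).stdout
        yield commit, text


def load_versions(conn, versions, list_key="vaccination_data"):
    """Insert one row per (location, date); later commits overwrite earlier ones."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_reports ("
        "commit_hash TEXT, location TEXT, report_date TEXT, data TEXT, "
        "PRIMARY KEY (location, report_date))"
    )
    for commit, text in versions:
        # Field names below are assumed, not confirmed by the source.
        for row in json.loads(text)[list_key]:
            conn.execute(
                "INSERT OR REPLACE INTO daily_reports VALUES (?, ?, ?, ?)",
                (commit, row["Location"], row["Date"], json.dumps(row)),
            )
    conn.commit()
```

Usage would look something like load_versions(sqlite3.connect("cdc.db"), iter_file_versions(".", "vaccination_data.json")), where both file names are placeholders.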

That database is then deployed using Datasette - you can browse the data at https://cdc-vaccination-history.datasette.io/cdc/daily_reports

You can filter down to individual states like so:
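Datasette supports exact-match filtering through query string arguments of the form ?column=value, so a filter URL can be built programmatically. The sketch below assumes the state column is named Location, which this page does not confirm, and since the project is retired the resulting URLs may no longer resolve.

```python
from urllib.parse import urlencode

BASE_URL = "https://cdc-vaccination-history.datasette.io/cdc/daily_reports"


def state_filter_url(state, column="Location", fmt=None):
    """Build a Datasette URL filtered to one state.

    column is an assumed column name; fmt can be "json" or "csv"
    to request that export format instead of the HTML page.
    """
    suffix = f".{fmt}" if fmt else ""
    return f"{BASE_URL}{suffix}?{urlencode({column: state})}"
```

For example, state_filter_url("CA", fmt="csv") produces a link to the CSV export for a single state under that naming assumption.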

Take a look at the scrape.yml GitHub Actions workflow to see how the scraper runs, and how the data is then built into a database and published to Vercel using datasette publish.

Should you trust these numbers?

I honestly don't know. These are not coming from a documented API - I found it using the Firefox developer tools network pane. I don't know how the CDC are sourcing these. I don't know if they themselves consider them to be accurate.

All I know is that these are the numbers they are displaying on their own site - so you should treat this repository as tracking "numbers that were displayed on the CDC's website" as opposed to assuming it represents the full truth on the ground.

cdc-vaccination-history's People

Contributors

simonw


cdc-vaccination-history's Issues

Old county JSONs

Hi, do you archive the prior JSONs for the county vaccination data? I only see the latest here. Thanks!

How do you download the full data set?

I am using the link provided: https://cdc-vaccination-history.datasette.io/cdc/daily_reports_counties
Under CSV options I check off "download file" and "stream all rows", and then click on "export csv". I get the following error:

An error occurred with your deployment

504: GATEWAY_TIMEOUT Code: FUNCTION_INVOCATION_TIMEOUT ID: iad1::xd25w-1621449834980-16b0300f009e

If you are a visitor, contact the website owner or try again later.
If you are the owner, check the logs for the application error. 

Is there another way you could post the whole file? Maybe a separate GitHub repository could hold the results?
