owidr's Introduction

owidR

owidR is an interface to Our World in Data (OWID) datasets. It lets you search the data behind more than 3,000 charts and load it into R.

Installation

To install from CRAN:

install.packages("owidR")

To install the development version from GitHub:

devtools::install_github("piersyork/owidR")

Using the package

The main function in owidR is owid(), which takes a chart id and returns a data.table of the corresponding OWID dataset. To find chart ids, use owid_search(), which lists all chart ids that match a keyword or regular expression.

Example

Let's use the core functions to get data on how human rights have changed over time, starting with a search for charts on human rights.

library(owidR)

owid_search("human rights")
##      chart_id                                                                   
## [1,] "human-rights-index-vs-electoral-democracy-index"                          
## [2,] "cases-of-killed-human-rights-defenders-journalists-trade-unionists"       
## [3,] "countries-with-independent-national-human-rights-institution"             
## [4,] "distribution-human-rights-index-vdem"                                     
## [5,] "human-rights-index-vdem"                                                  
## [6,] "human-rights-index-population-weighted"                                   
## [7,] "human-rights-index-vs-gdp-per-capita"                                     
## [8,] "share-countries-accredited-independent-national-human-rights-institutions"
##      title                                                                              
## [1,] "Human rights index vs. electoral democracy index"                                 
## [2,] "Confirmed killings of human rights defenders, journalists and trade unionists"    
## [3,] "Countries with accredited independent national human rights institutions"         
## [4,] "Distribution of human rights index"                                               
## [5,] "Human rights index"                                                               
## [6,] "Human rights index"                                                               
## [7,] "Human rights index vs. GDP per capita"                                            
## [8,] "Share of countries with accredited independent national human rights institutions"
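
Since the search accepts a keyword or a regular expression, it can help to see how a pattern narrows a list of charts. The following is a minimal sketch in base R that works on a few rows copied from the output above. filter_charts() is a hypothetical helper written for illustration; it is not part of owidR and does not claim to mirror how owid_search() works internally.

```r
# A few chart ids and titles copied from the owid_search("human rights") output.
charts <- data.frame(
  chart_id = c(
    "human-rights-index-vs-electoral-democracy-index",
    "distribution-human-rights-index-vdem",
    "human-rights-index-vdem",
    "human-rights-index-population-weighted",
    "human-rights-index-vs-gdp-per-capita"
  ),
  title = c(
    "Human rights index vs. electoral democracy index",
    "Distribution of human rights index",
    "Human rights index",
    "Human rights index",
    "Human rights index vs. GDP per capita"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not an owidR function): keep the rows whose title
# matches a regular expression, ignoring case.
filter_charts <- function(charts, pattern) {
  charts[grepl(pattern, charts$title, ignore.case = TRUE), , drop = FALSE]
}

# "vs\\." matches the literal text "vs." (the dot is escaped).
vs_charts <- filter_charts(charts, "vs\\.")
vs_charts$chart_id

# Anchors restrict the match to titles that are exactly "Human rights index".
exact_charts <- filter_charts(charts, "^human rights index$")
exact_charts$chart_id
```

The second call shows why chart ids matter: two different charts share the same title, so the id is the only unambiguous handle to pass to owid().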

Let's use the V-Dem human rights index dataset.

rights <- owid("human-rights-index-vdem")

rights
## Key: <entity, code, year>
##             entity   code  year civ_libs_vdem_owid civ_libs_vdem_high_owid
##             <char> <char> <int>              <num>                   <num>
##     1: Afghanistan    AFG  1789              0.125                   0.169
##     2: Afghanistan    AFG  1790              0.125                   0.169
##     3: Afghanistan    AFG  1791              0.125                   0.169
##     4: Afghanistan    AFG  1792              0.125                   0.169
##     5: Afghanistan    AFG  1793              0.125                   0.169
##    ---                                                                    
## 33373:    Zimbabwe    ZWE  2018              0.428                   0.473
## 33374:    Zimbabwe    ZWE  2019              0.403                   0.456
## 33375:    Zimbabwe    ZWE  2020              0.413                   0.469
## 33376:    Zimbabwe    ZWE  2021              0.395                   0.443
## 33377:    Zimbabwe    ZWE  2022              0.388                   0.432
##        civ_libs_vdem_low_owid
##                         <num>
##     1:                  0.089
##     2:                  0.089
##     3:                  0.089
##     4:                  0.089
##     5:                  0.089
##    ---                       
## 33373:                  0.381
## 33374:                  0.361
## 33375:                  0.373
## 33376:                  0.344
## 33377:                  0.331

ggplot2 makes it easy to visualise our data.

library(ggplot2)
library(dplyr)

rights |> 
  filter(entity %in% c("United Kingdom", "France", "United States")) |> 
  ggplot(aes(year, civ_libs_vdem_owid, colour = entity)) +
  geom_line()
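
Because owid() returns a data.table, rows can also be filtered with data.table's own syntax instead of dplyr. The sketch below assumes the data.table package is installed and uses a small sample built from the rows printed above (only the main index column is included), so it runs without a network connection. rights_sample is an illustrative name, not an object created by owidR.

```r
library(data.table)

# Sample rows copied from the printed `rights` output above.
rights_sample <- data.table(
  entity = rep(c("Afghanistan", "Zimbabwe"), each = 5),
  code   = rep(c("AFG", "ZWE"), each = 5),
  year   = c(1789:1793, 2018:2022),
  civ_libs_vdem_owid = c(rep(0.125, 5), 0.428, 0.403, 0.413, 0.395, 0.388)
)

# data.table form: DT[rows, columns]. Filter rows and select columns in one step.
recent <- rights_sample[
  entity == "Zimbabwe" & year >= 2020,
  .(entity, year, civ_libs_vdem_owid)
]
recent
```

The same expression should work on the full table returned by owid("human-rights-index-vdem"), provided the column names match those shown above.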

COVID-19 Data

You can quickly download global COVID-19 data, including vaccination rates, using owid_covid().

covid <- owid_covid()

str(covid)
## Classes 'owid', 'data.table' and 'data.frame':   351765 obs. of  67 variables:
##  $ iso_code                                  : chr  "AFG" "AFG" "AFG" "AFG" ...
##  $ continent                                 : chr  "Asia" "Asia" "Asia" "Asia" ...
##  $ location                                  : chr  "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
##  $ date                                      : IDate, format: "2020-01-03" "2020-01-04" ...
##  $ total_cases                               : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_cases                                 : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ new_cases_smoothed                        : num  NA NA NA NA NA 0 0 0 0 0 ...
##  $ total_deaths                              : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_deaths                                : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ new_deaths_smoothed                       : num  NA NA NA NA NA 0 0 0 0 0 ...
##  $ total_cases_per_million                   : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_cases_per_million                     : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ new_cases_smoothed_per_million            : num  NA NA NA NA NA 0 0 0 0 0 ...
##  $ total_deaths_per_million                  : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_deaths_per_million                    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ new_deaths_smoothed_per_million           : num  NA NA NA NA NA 0 0 0 0 0 ...
##  $ reproduction_rate                         : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ icu_patients                              : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ icu_patients_per_million                  : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ hosp_patients                             : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ hosp_patients_per_million                 : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ weekly_icu_admissions                     : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ weekly_icu_admissions_per_million         : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ weekly_hosp_admissions                    : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ weekly_hosp_admissions_per_million        : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_tests                               : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_tests                                 : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_tests_per_thousand                  : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_tests_per_thousand                    : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_tests_smoothed                        : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_tests_smoothed_per_thousand           : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ positive_rate                             : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ tests_per_case                            : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ tests_units                               : chr  "" "" "" "" ...
##  $ total_vaccinations                        : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ people_vaccinated                         : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ people_fully_vaccinated                   : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_boosters                            : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_vaccinations                          : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_vaccinations_smoothed                 : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_vaccinations_per_hundred            : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ people_vaccinated_per_hundred             : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ people_fully_vaccinated_per_hundred       : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ total_boosters_per_hundred                : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_vaccinations_smoothed_per_million     : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_people_vaccinated_smoothed            : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ new_people_vaccinated_smoothed_per_hundred: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ stringency_index                          : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ population_density                        : num  54.4 54.4 54.4 54.4 54.4 ...
##  $ median_age                                : num  18.6 18.6 18.6 18.6 18.6 18.6 18.6 18.6 18.6 18.6 ...
##  $ aged_65_older                             : num  2.58 2.58 2.58 2.58 2.58 ...
##  $ aged_70_older                             : num  1.34 1.34 1.34 1.34 1.34 ...
##  $ gdp_per_capita                            : num  1804 1804 1804 1804 1804 ...
##  $ extreme_poverty                           : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ cardiovasc_death_rate                     : num  597 597 597 597 597 ...
##  $ diabetes_prevalence                       : num  9.59 9.59 9.59 9.59 9.59 9.59 9.59 9.59 9.59 9.59 ...
##  $ female_smokers                            : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ male_smokers                              : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ handwashing_facilities                    : num  37.7 37.7 37.7 37.7 37.7 ...
##  $ hospital_beds_per_thousand                : num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
##  $ life_expectancy                           : num  64.8 64.8 64.8 64.8 64.8 ...
##  $ human_development_index                   : num  0.511 0.511 0.511 0.511 0.511 0.511 0.511 0.511 0.511 0.511 ...
##  $ population                                : num  41128772 41128772 41128772 41128772 41128772 ...
##  $ excess_mortality_cumulative_absolute      : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ excess_mortality_cumulative               : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ excess_mortality                          : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ excess_mortality_cumulative_per_million   : num  NA NA NA NA NA NA NA NA NA NA ...
##  - attr(*, ".internal.selfref")=<externalptr>

To-do

  • Add function to load multiple country datasets into one dataframe
  • Add caching of data (inc. backend)
  • Remove interactive plotting to reduce dependencies
  • Create way to import owid explorers

owidr's People

Contributors

edomt, piersyork


owidr's Issues

Fail to retrieve some datasets

When I run:
owid("population-density")

I get the following error:
Error in `*tmp*`[[1]] : subscript out of bounds

The same happens when I run owid("co2-intensity")

However, running owid("ocean-plastic-waste-per-capita-vs-gdp") returns the data as requested.

I'm using owidR version 1.4.2.9000.

Session info
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
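
Until the failing ids are fixed, one way to keep a batch download going is to wrap each call in tryCatch() and record which ids fail. This is only a sketch: fake_fetch() is a made-up stand-in that simulates the failures reported above so the example runs offline, and fetch_all() is a hypothetical helper, not part of owidR. Passing owidR::owid as the fetch argument is the assumed way to use it against the real service.

```r
# Hypothetical stand-in for owid(): fails for the two ids reported above.
fake_fetch <- function(chart_id) {
  if (chart_id %in% c("population-density", "co2-intensity")) {
    stop("subscript out of bounds")
  }
  data.frame(chart_id = chart_id, stringsAsFactors = FALSE)
}

# Hypothetical helper: try each id, keep the successes, and store the
# error message of each failure under the id that caused it.
fetch_all <- function(chart_ids, fetch) {
  results <- list()
  failures <- character()
  for (id in chart_ids) {
    out <- tryCatch(fetch(id), error = function(e) e)
    if (inherits(out, "error")) {
      failures[id] <- conditionMessage(out)
    } else {
      results[[id]] <- out
    }
  }
  list(results = results, failures = failures)
}

ids <- c(
  "population-density",
  "co2-intensity",
  "ocean-plastic-waste-per-capita-vs-gdp"
)
res <- fetch_all(ids, fake_fetch)

names(res$results)   # ids that loaded
res$failures         # error message per failing id
```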

error: Problem with `id`

When I try to import a dataset from owid (e.g., tempdata <- owid("share-graduates-stem-female")), I get the following error:

Error in left_join():
! Join columns in y must be present in the data.
✖ Problem with id.

The backtrace suggests that the id column or its values may be missing:

  1. ├─owidR::owid("share-graduates-stem-female")
  2. │ └─... %>% arrange(entity, year)
  3. ├─dplyr::arrange(., entity, year)
  4. ├─dplyr::select(., entity = name, code, year = years, values)
  5. ├─dplyr::left_join(., entities, by = c(entities = "id"))
  6. └─dplyr:::left_join.data.frame(., entities, by = c(entities = "id"))
  7.   └─dplyr:::join_mutate(...)
  8.     └─dplyr:::join_cols(...)
  9.       └─dplyr:::check_join_vars(by$y, y_names, by$condition, "y", error_call = error_call)
 10.         └─rlang::abort(bullets, call = error_call)

Invalid char in json text

Hello,
I am getting an error when fetching the JSON from ourworldindata; it worked fine a couple of days ago.
The error is as follows:

Error: lexical error: invalid char in json text.
https://ourworldindata.org/grap
(right here) ------^

The issue is most likely in line 114 of owid.R:
data <- jsonlite::fromJSON(data_url)
As suggested on Stack Overflow, adding a readLines() call should be enough:
data <- jsonlite::fromJSON(readLines(data_url))
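
The idea behind that suggestion is to read the raw text first and hand the parser the text itself, not a location. A minimal offline sketch follows, assuming the jsonlite package is installed. The temporary file and its contents are made up for illustration and do not reflect the real OWID response format; json_path stands in for data_url.

```r
library(jsonlite)

# Stand-in for the remote resource: a local file with a small JSON document.
# (Illustrative payload only, not the OWID format.)
json_path <- tempfile(fileext = ".json")
writeLines(
  c("{",
    '  "entity": ["France", "Kenya"],',
    '  "year": [2020, 2021]',
    "}"),
  json_path
)

# Step 1: read the raw lines. readLines() accepts file paths and URLs.
json_text <- readLines(json_path, warn = FALSE)

# Step 2: join the lines into one string and parse the text itself.
data <- fromJSON(paste(json_text, collapse = "\n"))

str(data)
```

Splitting the read from the parse also makes it easy to inspect json_text when parsing fails, for example to check whether the server returned an HTML error page instead of JSON.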
