
request's Introduction

request


request is a DSL for HTTP requests for R, inspired by the CLI tool httpie.

request is built on httr, though it may allow using the R packages RCurl or curl as optional backends at some point.

I presented a poster at useR! 2016; it's in my talks repo.

Philosophy

  • The web is increasingly a JSON world, so we assume application/json by default, but give back other content types when the response isn't JSON
  • The workflow follows logically, or at least should, from "hey, I got this URL" to "I need to add some options" to "execute the request"
  • Whenever possible, we transform output to data.frames, facilitating downstream manipulation via dplyr, etc.
  • We do GET requests by default. Specify a different verb if you don't want GET
  • You can use non-standard evaluation to easily pass in query parameters without worrying about &'s, URL escaping, etc. (see api_query())
  • Same for body parameters (see api_body() and the sketch below)

All of the defaults just mentioned can be changed.
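For instance, a minimal sketch of query and body parameters together (it assumes api_body() mirrors api_query()'s name = value interface; httpbin.org is just a placeholder endpoint, and peep shows the request without sending it):

# query parameters via NSE - no manual &'s or URL escaping
api("http://api.plos.org/search") %>%
  api_query(q = ecology, wt = json) %>%
  peep

# body parameters, assuming api_body() takes name = value pairs like api_query()
api("https://httpbin.org/post") %>%
  api_body(x = hello, y = stuff) %>%
  peep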

Auto execute http requests with pipes

When using pipes, we detect that a pipe is being used within the function calls and automatically perform the appropriate HTTP request on the last piped function call. When you call the functions without pipes, you have to call http() explicitly to make the HTTP request.
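For example:

# piped: the HTTP request fires automatically at the end of the chain
'https://api.github.com/' %>% api()

# not piped: build the request object, then execute it explicitly
req <- api('https://api.github.com/')
http(req)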

low level http

Low-level access is available with http_client(), which returns an R6 object with various methods for inspecting HTTP request results.
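A minimal sketch; the method names used on the returned object below are assumptions - check ?http_client for the actual interface:

# build a low-level client for the request (returns an R6 object)
cli <- api('https://api.github.com/') %>% http_client()

# assumed method names for inspecting the result - see ?http_client
cli$status()   # response status (assumed)
cli$parse()    # parsed response body (assumed)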

Peek at a request

The function peep() lets you peek at a request without performing the HTTP request.

Install

From CRAN

install.packages("request")

Development version from GitHub

remotes::install_github("sckott/request")
library("request")

NSE and SE

NSE is supported

api('https://api.github.com/') %>%
  api_path(repos, ropensci, rgbif, issues)

as well as SE

api('https://api.github.com/') %>%
  api_path_('repos', 'ropensci', 'rgbif', 'issues')

Building API routes

Works with full or partial URLs

api('https://api.github.com/')
#> URL: https://api.github.com/
api('http://api.gbif.org/v1')
#> URL: http://api.gbif.org/v1
api('api.gbif.org/v1')
#> URL: http://api.gbif.org/v1

Works with ports, full or partial

api('http://localhost:9200')
#> URL: http://localhost:9200
api('localhost:9200')
#> URL: http://localhost:9200
api(':9200')
#> URL: http://localhost:9200
api('9200')
#> URL: http://localhost:9200
api('9200/stuff')
#> URL: http://localhost:9200/stuff

Make HTTP requests

The above examples with api() are not passed through a pipe, so they only define a URL and don't make an HTTP request. To make an HTTP request, you can either pipe a URL or partial URL to api(), or call http() at the end of a string of function calls:

'https://api.github.com/' %>% api()
#> $current_user_url
#> [1] "https://api.github.com/user"
#> 
#> $current_user_authorizations_html_url
#> [1] "https://github.com/settings/connections/applications{/client_id}"
#> 
#> $authorizations_url
#> [1] "https://api.github.com/authorizations"
#> 
#> $code_search_url
...

Or

api('https://api.github.com/') %>% http()
#> $current_user_url
#> [1] "https://api.github.com/user"
#> 
#> $current_user_authorizations_html_url
#> [1] "https://github.com/settings/connections/applications{/client_id}"
#> 
#> $authorizations_url
#> [1] "https://api.github.com/authorizations"
#> 
#> $code_search_url
...

http() is called automatically at the end of a chain of piped commands, so there's no need to invoke it explicitly. However, you can if you like.

Templating

repo_info <- list(username = 'craigcitro', repo = 'r-travis')
api('https://api.github.com/') %>%
  api_template(template = 'repos/{{username}}/{{repo}}/issues', data = repo_info) %>%
  peep
#> <http request> 
#>   url: https://api.github.com/
#>   paths: 
#>   query: 
#>   body: 
#>   paging: 
#>   headers: 
#>   rate limit: 
#>   retry (n/delay (s)): /
#>   error handler: 
#>   write: 
#>   config:

Set paths

api_path() adds paths to the base URL; see api_query() for query parameters.

api('https://api.github.com/') %>%
  api_path(repos, ropensci, rgbif, issues) %>%
  peep
#> <http request> 
#>   url: https://api.github.com/
#>   paths: repos/ropensci/rgbif/issues
#>   query: 
#>   body: 
#>   paging: 
#>   headers: 
#>   rate limit: 
#>   retry (n/delay (s)): /
#>   error handler: 
#>   write: 
#>   config:

Query

api("http://api.plos.org/search") %>%
  api_query(q = ecology, wt = json, fl = 'id,journal') %>%
  peep
#> <http request> 
#>   url: http://api.plos.org/search
#>   paths: 
#>   query: q=ecology, wt=json, fl=id,journal
#>   body: 
#>   paging: 
#>   headers: 
#>   rate limit: 
#>   retry (n/delay (s)): /
#>   error handler: 
#>   write: 
#>   config:

ToDo

See the issues for discussion of these

  • Paging
  • Retry
  • Rate limit

Meta

  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

request's People

Contributors

sckott

request's Issues

Caching helper fxns

Consider functions to make it simple to cache responses locally, matching on request URL, query, etc.

Ideally, just use another package that does this well.
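Not part of request; a rough sketch of what such a helper might look like, using httr and digest directly and caching on the request URL plus query (the function name is hypothetical):

library(httr)
library(digest)

# hypothetical helper: cache parsed GET responses on disk, keyed by URL + query
cached_get <- function(url, query = list(), cache_dir = tempdir()) {
  key <- file.path(cache_dir, paste0(digest(list(url, query)), ".rds"))
  if (file.exists(key)) return(readRDS(key))    # cache hit: read from disk
  res <- content(GET(url, query = query))       # cache miss: make the request
  saveRDS(res, key)
  res
}

cached_get("https://api.github.com/")  # second call returns the cached copy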

api_template doesn't get shown correctly in peep

This works correctly for the similar api_path() function, but not for api_template(), e.g.:

repo_info <- list(username = 'craigcitro', repo = 'r-travis')
api('https://api.github.com/') %>%
    api_template(template = 'repos/{{username}}/{{repo}}/issues', data = repo_info) %>% 
    peep
#> <http request> 
#>   url: https://api.github.com/
#>   paths: ## should be shown here
#>   query: 
#>   body: 
#>   paging: 
#>   headers: 
#>   rate limit: 
#>   retry (n/delay (s)): /
#>   error handler: 
#>   config: 

standard evaluation errors?

This problem is so simple, but I can't seem to figure it out. I have some very simple code to interact with the Bing Web Search API.

query <- "depression icd 10"
mkt <- "en-US"

hits <- api("https://api.cognitive.microsoft.com") %>%
  api_path(bing, v7.0, search) %>%
  api_headers('Ocp-Apim-Subscription-Key' = 'fake') %>%
  api_query_(q = bquote(.(query)), count = 20) %>%
  peep

It produces an error:

> hits <- api("https://api.cognitive.microsoft.com") %>%
+   api_path(bing, v7.0, search) %>%
+   api_headers('Ocp-Apim-Subscription-Key' = 'fake') %>%
+   api_query_(q = bquote(.(query)), count = 20) %>%
+   peep
Error in parse(text = x) : <text>:1:12: unexpected symbol
1: depression icd

Retry helper

A user may want to retry a request every X seconds - e.g., if they are getting 503 errors because a server is temporarily down, they may want to set retry = 5 to retry their request every 5 seconds (see the sketch after this list).

  • retry
  • retry_end
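request doesn't implement this yet; in the meantime, a sketch of the idea using httr::RETRY directly (the URL is just a placeholder):

library(httr)

# retry failed requests up to 5 times, pausing roughly 5 seconds between attempts
res <- RETRY("GET", "https://api.github.com/",
             times = 5, pause_base = 5, pause_cap = 5)
status_code(res)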

Paging

various paging patterns:

  • using query parameters
    • limit & offset - specify no. desired, and which record to start at
    • per_page & page - specify no. desired, and which page to return (page size variable)
  • using link headers
    • headers return some combination of next, first, last, or prev links - if these are provided, we should probably use them in all cases - GitHub recommends that, at least

examples

  • github:
    • params: page, per_page
    • strategy: GH suggests using link headers to do paging because sometimes paging is based on SHAs instead of page numbers
  • GBIF
    • params: limit, offset
    • strategy: page via query parameters
  • Crossref
    • params: rows, offset
  • idigbio
    • params: limit, offset (Elasticsearch backed, but not exposed directly)
  • vertnet
    • params: limit, cursor
  • Tropicos
    • params: pagesize, startrow
  • NOAA NCDC v2
    • params: limit, offset
  • CKAN API
    • params: limit, offset
  • Berkeley Ecoengine
    • params: page_size, page
  • iNaturalist
    • params: per_page, page
  • DataCite
    • params: rows, start (solr backed)
  • PLOS Search API
    • params: rows, start (solr backed)
  • Europeana API
    • params: rows, start (solr backed)
  • ORCID API
    • params: rows, start (solr backed)
  • DPLA API
    • params: page_size, page (Elasticsearch backed)
  • Twitter search API
    • params: count (that's it, AFAICT, not sure this is accurate)
  • Enigma
    • params: limit, page

approach

Automagically figure out what params to pass and their values given user input. We'll need some user input:

  • name of query parameters
  • how many records they want
  • what record to start at
  • maximum records allowed (if known)

Things we can figure out automatically

  • whether API uses link headers or not (just look for link headers)
  • if a cursor is used (e.g., Vertnet API) look for a cursor
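Until something like that exists in request, a manual sketch of the limit/offset pattern against the GBIF API, using httr directly:

library(httr)

# fetch 3 pages of 20 records each via the limit/offset query parameters
pages <- lapply(0:2, function(i) {
  content(GET("http://api.gbif.org/v1/occurrence/search",
              query = list(limit = 20, offset = i * 20)))
})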

Rate limiting

Creating a general-purpose rate-limiting helper would be useful, I think. Aspects:

  • Arguments:
    • value - Limit value (e.g., 100)
    • period - Per time frame (e.g. sec, min, hr, day)
    • on_limit - what to do when limit reached? (see below)
    • May need extra arguments depending on API
  • on_limit -
    • stop - and give an error message to the user
    • warn - and give a warning message to the user
    • queue - and give a max time to wait in the queue (if the R session quits, this queue is gone)
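Not implemented in request; a bare-bones sketch of the queue-style behaviour (sleep until the window resets), with all names hypothetical:

# hypothetical limiter: allow at most `value` calls per `period` seconds,
# sleeping (the queue behaviour) once the limit is reached
make_limiter <- function(value = 100, period = 60) {
  calls <- numeric(0)  # timestamps of recent calls; lost if the R session quits
  function(fun, ...) {
    now <- as.numeric(Sys.time())
    calls <<- calls[calls > now - period]     # drop calls outside the window
    if (length(calls) >= value) {
      Sys.sleep(period - (now - min(calls)))  # wait for the oldest call to expire
    }
    calls <<- c(calls, as.numeric(Sys.time()))
    fun(...)
  }
}

limited_get <- make_limiter(value = 100, period = 60)
# limited_get(httr::GET, "https://api.github.com/")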

Please assist with auth code snippet?

Dear Scott,

We are seeking to use your request package for an authenticated call to Swagger.

Would it be possible for you to send a code snippet indicating how to pass the API key?

Help much appreciated.

Kind regards,
Tobe
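request doesn't have a dedicated auth helper, but an API key that goes in a request header can be passed with api_headers(); a sketch, where the base URL, path, and header name are placeholders - check the API's docs for the header it expects:

api('https://api.example.com') %>%                 # placeholder base URL
  api_path(some, route) %>%                        # placeholder path
  api_headers('X-Api-Key' = 'your-key-here') %>%   # placeholder header name and key
  peep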
