malaria-atlas-project / malariaatlas

An R interface to open-access malaria data, hosted by the Malaria Atlas Project.

Home Page: https://malariajournal.biomedcentral.com/articles/10.1186/s12936-018-2500-5

License: Other

R 100.00%
malaria opendata database raster r

malariaatlas's Introduction

malariaAtlas

An R interface to open-access malaria data, hosted by the Malaria Atlas Project.

The GitHub version of the malariaAtlas package has some additional bugfixes over the stable CRAN package. If you have any issues, try installing the latest GitHub version. See below for instructions.

Overview

This package allows you to download parasite rate data (Plasmodium falciparum and P. vivax), survey occurrence data of the 41 dominant malaria vector species, and modelled raster outputs from the Malaria Atlas Project.

More details and example analyses can be found in the published paper (https://malariajournal.biomedcentral.com/articles/10.1186/s12936-018-2500-5).

Available Data:

The data can be explored at https://data.malariaatlas.org/maps.

List Versions Functions

The list version functions are used to list the available versions of different datasets, and all return a data.frame with a single column for version. These versions can be passed to functions such as listShp, listSpecies, listPRPointCountries, listVecOccPointCountries, getPR, getVecOcc and getShp.

Use:

  • listPRPointVersions() to see the available versions for PR point data, which can then be used in listPRPointCountries and getPR.

  • listVecOccPointVersions() to see the available versions for vector occurrence data, which can then be used in listSpecies, listVecOccPointCountries and getVecOcc.

  • listShpVersions() to see the available versions for admin unit shape data, which can then be used in listShp and getShp.

listPRPointVersions()
listVecOccPointVersions()
listShpVersions()
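
Since each of these functions is described as returning a data.frame with a single version column, picking and pinning a version could be sketched as below. This is a minimal offline illustration on a mocked data.frame, not server output: the two version strings are simply the ones used elsewhere in this README, and treating them as sortable YYYYMM strings is an assumption.

```r
# Mock of the documented return shape: one `version` column.
# A real call to listPRPointVersions() may return different values.
versions <- data.frame(version = c("201201", "202206"), stringsAsFactors = FALSE)

# Assumption: versions are YYYYMM strings, so the lexicographic maximum
# is also the most recent one.
latest_version <- max(versions$version)
print(latest_version)

# Passing the version explicitly keeps a later rerun on the same data:
# pr <- malariaAtlas::getPR(country = "Madagascar", species = "both",
#                           version = latest_version)
```

Pinning the version in a script, rather than relying on the default, means the analysis does not silently change when a newer dataset is published.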

List Countries and Species Functions

To list the countries where there is available data for PR points or vector occurrence points, use:

  • listPRPointCountries() for PR points
  • listVecOccPointCountries() for vector occurrence points

To list the species available for vector point data, use listSpecies().

All three of these functions can optionally take a version parameter (which can be found with the list versions functions). If you choose not to provide a version, the most recent version of the relevant dataset will be selected by default.

listPRPointCountries(version = "202206")
listVecOccPointCountries(version = "201201")
listSpecies(version = "201201")

List Administrative Units

To list administrative units for which shapefiles are stored on the MAP geoserver, use listShp(). Similar to the list countries and species functions, this function can optionally take a version.

listShp(version = "202206")

List Raster Function

listRaster() gets minimal information on all available rasters. It returns a data.frame with several columns for each raster such as dataset_id, title, abstract, min_raster_year and max_raster_year. The dataset_id can then be used in getRaster and extractRaster.

listRaster()

Is Available Functions

isAvailable_pr confirms whether or not PR survey point data is available to download for a specified country, ISO3 code or continent.

Check whether PR data is available for Madagascar:

isAvailable_pr(country = "Madagascar")

Check whether PR data is available for the United States of America by ISO code:

isAvailable_pr(ISO = "USA")

Check whether PR data is available for Asia:

isAvailable_pr(continent = "Asia")

isAvailable_vec confirms whether or not vector survey point data is available to download for a specified country, ISO3 code or continent.

Check whether vector data is available for Myanmar:

isAvailable_vec(country = "Myanmar")

Check whether vector data is available for multiple countries:

isAvailable_vec(country = c("Nigeria", "Ethiopia"))

You can also pass these functions a dataset version. If you don't, they will default to the most recent version.

isAvailable_pr(country = "Madagascar", version = "202206")

Downloading & Visualising Data:

get* functions & autoplot methods

Parasite Rate Survey Points

getPR() downloads all publicly available PR data points for a specified location (country, ISO, continent or extent) and Plasmodium species (Pf, Pv or both) and returns them as a dataframe with the following format:

MDG_pr_data <- getPR(country = "Madagascar", species = "both")
## Rows: 395
## Columns: 28
## $ dhs_id                    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ site_id                   <int> 8689, 6221, 18093, 6021, 15070, 15795, 7374, 13099, 9849, 11961, 21475, 11572, 15943, 7930, 13748, 16323,…
## $ site_name                 <chr> "Vodivohitra", "Andranomasina", "Ankazobe", "Andasibe", "Ambohimarina", "Antohobe", "Ambohimazava", "Anke…
## $ latitude                  <dbl> -16.21700, -18.71700, -18.31600, -19.83400, -18.73400, -19.76990, -25.03230, -18.70100, -18.71920, -19.36…
## $ longitude                 <dbl> 49.68300, 47.46600, 47.11800, 47.85000, 47.25200, 46.68700, 46.99600, 47.16600, 47.49050, 48.16667, 47.46…
## $ rural_urban               <chr> "RURAL", "UNKNOWN", "RURAL", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN",…
## $ country                   <chr> "Madagascar", "Madagascar", "Madagascar", "Madagascar", "Madagascar", "Madagascar", "Madagascar", "Madaga…
## $ country_id                <chr> "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", …
## $ continent_id              <chr> "Africa", "Africa", "Africa", "Africa", "Africa", "Africa", "Africa", "Africa", "Africa", "Africa", "Afri…
## $ month_start               <int> 11, 1, 11, 3, 1, 7, 4, 1, 1, 2, 7, 11, 4, 7, 11, 4, 9, 7, 7, 3, 7, 7, 7, 11, 3, 4, 6, 3, 11, 11, 7, 7, 7,…
## $ year_start                <int> 1989, 1987, 1989, 1987, 1987, 1995, 1986, 1987, 1987, 2003, 1995, 1989, 1986, 1995, 1997, 1986, 1991, 199…
## $ month_end                 <int> 11, 1, 12, 3, 1, 8, 6, 1, 1, 2, 8, 12, 4, 8, 11, 6, 9, 8, 8, 6, 7, 7, 7, 12, 3, 6, 6, 6, 11, 11, 7, 8, 8,…
## $ year_end                  <int> 1989, 1987, 1989, 1987, 1987, 1995, 1986, 1987, 1987, 2003, 1995, 1989, 1986, 1995, 1997, 1986, 1991, 199…
## $ lower_age                 <dbl> 5, 0, 5, 0, 0, 2, 7, 0, 0, 0, 2, 5, 6, 2, 2, 7, 0, 2, 2, 0, 2, 0, 0, 5, 0, 7, 0, 0, 6, 5, 0, 2, 2, 2, 13,…
## $ upper_age                 <int> 15, 99, 15, 99, 99, 9, 22, 99, 99, 99, 9, 15, 12, 9, 9, 22, 99, 9, 9, 5, 9, 99, 99, 15, 99, 22, 99, 5, 12…
## $ examined                  <int> 165, 50, 258, 246, 50, 50, 119, 50, 50, 210, 50, 340, 20, 50, 61, 156, 104, 50, 50, 147, 147, 944, 541, 9…
## $ positive                  <dbl> 144.0, 7.5, 139.0, 126.0, 2.5, 6.0, 37.0, 13.5, 4.5, 34.0, 11.5, 255.0, 8.0, 7.0, 3.0, 97.0, 24.0, 33.0, …
## $ pr                        <dbl> 0.87272727, 0.15000000, 0.53875969, 0.51219512, 0.05000000, 0.12000000, 0.31092437, 0.27000000, 0.0900000…
## $ species                   <chr> "P. falciparum", "P. falciparum", "P. falciparum", "P. falciparum", "P. falciparum", "P. falciparum", "P.…
## $ method                    <chr> "Microscopy", "Microscopy", "Microscopy", "Microscopy", "Microscopy", "Microscopy", "Microscopy", "Micros…
## $ rdt_type                  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ pcr_type                  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ malaria_metrics_available <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU…
## $ location_available        <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU…
## $ permissions_info          <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ citation1                 <chr> "Lepers, J.P. (1989). <i>Rapport sur la situation du paludisme dans la région de Mananara Nord.</i> . Ant…
## $ citation2                 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ citation3                 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
Africa_pvpr_data <- getPR(continent = "Africa", species = "Pv")
Extent_pfpr_data <- getPR(extent = rbind(c(-2.460181, 13.581921), c(-3.867188, 34.277344)), species = "Pf")
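
The extent argument in the last call is a 2x2 matrix. A minimal sketch of a helper that builds one is given below; make_extent is a hypothetical name, not part of malariaAtlas, and the layout (rows are x and y, columns are min and max) is an assumption inferred from the example above and should be checked against ?getPR.

```r
# Hypothetical helper: build a 2x2 extent matrix.
# Assumed layout: row 1 = x (longitude), row 2 = y (latitude),
# column 1 = minimum, column 2 = maximum.
make_extent <- function(xmin, xmax, ymin, ymax) {
  stopifnot(xmin < xmax, ymin < ymax)
  matrix(c(xmin, ymin, xmax, ymax), nrow = 2, ncol = 2,
         dimnames = list(c("x", "y"), c("min", "max")))
}

ext <- make_extent(xmin = -2.460181, xmax = 13.581921,
                   ymin = -3.867188, ymax = 34.277344)
print(ext)

# Same numbers as the rbind() form used in the README example:
# Extent_pfpr_data <- malariaAtlas::getPR(extent = ext, species = "Pf")
```

Naming the four bounds makes it harder to swap a latitude for a longitude, and the stopifnot() catches inverted bounds before any download starts.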

You can also pass this function a dataset version. If you don't, it will default to the most recent version.

MDG_pr_data_202206 <- getPR(country = "Madagascar", species = "both", version = "202206")

autoplot.pr.points configures the autoplot method to enable quick mapping of the locations of downloaded PR points.

autoplot(MDG_pr_data)

A version without faceting is also available.

autoplot(MDG_pr_data,
         facet = FALSE)

Vector Survey Points

getVecOcc() downloads all publicly available vector survey points for a specified location (country, ISO, continent or extent) and species (options for which can be found with listSpecies) and returns them as a dataframe with the following format:

MMR_vec_data <- getVecOcc(country = "Myanmar")
## Rows: 2,866
## Columns: 25
## $ id             <int> 1945, 1946, 1951, 1952, 790, 781, 772, 791, 773, 783, 774, 776, 777, 792, 778, 779, 780, 1953, 784, 785, 786, 788, 7…
## $ site_id        <int> 30243, 30243, 30243, 30243, 1000000072, 1000000071, 1000000071, 1000000072, 1000000071, 1000000071, 1000000071, 1000…
## $ latitude       <dbl> 16.2570, 16.2570, 16.2570, 16.2570, 17.3500, 17.3800, 17.3800, 17.3500, 17.3800, 17.3800, 17.3800, 17.3800, 17.3800,…
## $ longitude      <dbl> 97.7250, 97.7250, 97.7250, 97.7250, 96.0410, 96.0370, 96.0370, 96.0410, 96.0370, 96.0370, 96.0370, 96.0370, 96.0370,…
## $ country        <chr> "Myanmar", "Myanmar", "Myanmar", "Myanmar", "Myanmar", "Myanmar", "Myanmar", "Myanmar", "Myanmar", "Myanmar", "Myanm…
## $ country_id     <chr> "MMR", "MMR", "MMR", "MMR", "MMR", "MMR", "MMR", "MMR", "MMR", "MMR", "MMR", "MMR", "MMR", "MMR", "MMR", "MMR", "MMR…
## $ continent_id   <chr> "Asia", "Asia", "Asia", "Asia", "Asia", "Asia", "Asia", "Asia", "Asia", "Asia", "Asia", "Asia", "Asia", "Asia", "Asi…
## $ month_start    <int> 2, 3, 8, 9, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 10, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5…
## $ year_start     <int> 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 1998, 19…
## $ month_end      <int> 2, 3, 8, 9, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 10, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
## $ year_end       <int> 1998, 1998, 1998, 1998, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 1998, 2000, 20…
## $ anopheline_id  <int> 17, 17, 17, 17, 50, 49, 17, 51, 11, 4, 15, 1, 35, 30, 50, 51, 30, 17, 17, 11, 15, 1, 35, 49, 4, 17, 11, 15, 1, 35, 5…
## $ species        <chr> "Anopheles dirus species complex", "Anopheles dirus species complex", "Anopheles dirus species complex", "Anopheles …
## $ species_plain  <chr> "Anopheles dirus", "Anopheles dirus", "Anopheles dirus", "Anopheles dirus", "Anopheles stephensi", "Anopheles sinens…
## $ id_method1     <chr> "unknown", "unknown", "unknown", "unknown", "morphology", "morphology", "morphology", "morphology", "morphology", "m…
## $ id_method2     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ sample_method1 <chr> "man biting", "man biting", "man biting", "man biting", "man biting indoors", "man biting indoors", "man biting indo…
## $ sample_method2 <chr> "animal baited net trap", "animal baited net trap", "animal baited net trap", "animal baited net trap", "man biting …
## $ sample_method3 <chr> NA, NA, NA, NA, "animal baited net trap", "animal baited net trap", "animal baited net trap", "animal baited net tra…
## $ sample_method4 <chr> NA, NA, NA, NA, "house resting inside", "house resting inside", "house resting inside", "house resting inside", "hou…
## $ assi           <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", …
## $ citation       <chr> "Oo, T.T., Storch, V. and Becker, N. (2003).  <b><i>Anopheles</i> <i>dirus</i> and its role in malaria transmission …
## $ time_start     <date> 1998-02-01, 1998-03-01, 1998-08-01, 1998-09-01, 1998-05-01, 1998-05-01, 1998-05-01, 1998-05-01, 1998-05-01, 1998-05…
## $ time_end       <date> 1998-02-01, 1998-03-01, 1998-08-01, 1998-09-01, 2000-03-01, 2000-03-01, 2000-03-01, 2000-03-01, 2000-03-01, 2000-03…
## $ geometry       <POINT [°]> POINT (97.725 16.257), POINT (97.725 16.257), POINT (97.725 16.257), POINT (97.725 16.257), POINT (96.041 17.3…

You can also pass this function a dataset version. If you don't, it will default to the most recent version.

MMR_vec_data_201201 <- getVecOcc(country = "Myanmar", version = "201201")

autoplot.vector.points configures the autoplot method to enable quick mapping of the locations of downloaded vector points.

autoplot(MMR_vec_data)

N.B. Facet-wrapped option is also available for species stratification.

autoplot(MMR_vec_data,
         facet = TRUE)

Shapefiles

getShp() downloads a shapefile for a specified country (or countries) and returns this as a simple feature object.

MDG_shp <- getShp(ISO = "MDG", admin_level = c("admin0", "admin1"))
## Rows: 23
## Columns: 17
## $ iso           <chr> "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG", "MDG"…
## $ admn_level    <int> 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
## $ name_0        <chr> "Madagascar", "Madagascar", "Madagascar", "Madagascar", "Madagascar", "Madagascar", "Madagascar", "Madagascar", "Mada…
## $ id_0          <int> 10000910, 10000910, 10000910, 10000910, 10000910, 10000910, 10000910, 10000910, 10000910, 10000910, 10000910, 1000091…
## $ type_0        <chr> "Country", "Country", "Country", "Country", "Country", "Country", "Country", "Country", "Country", "Country", "Countr…
## $ name_1        <chr> NA, "Alaotra Mangoro", "Amoron I Mania", "Analamanga", "Analanjirofo", "Androy", "Anosy", "Atsimo Andrefana", "Atsimo…
## $ id_1          <int> NA, 10022998, 10022989, 10022983, 10022999, 10023001, 10023002, 10023003, 10022990, 10023000, 10022994, 10022995, 100…
## $ type_1        <chr> NA, "Region", "Region", "Region", "Region", "Region", "Region", "Region", "Region", "Region", "Region", "Region", "Re…
## $ name_2        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ id_2          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ type_2        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ name_3        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ id_3          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ type_3        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ source        <chr> "Madagascar NMCP 2016", "Madagascar NMCP 2016", "Madagascar NMCP 2016", "Madagascar NMCP 2016", "Madagascar NMCP 2016…
## $ country_level <chr> "MDG_0", "MDG_1", "MDG_1", "MDG_1", "MDG_1", "MDG_1", "MDG_1", "MDG_1", "MDG_1", "MDG_1", "MDG_1", "MDG_1", "MDG_1", …
## $ geometry      <MULTIPOLYGON [°]> MULTIPOLYGON (((44.2278 -25..., MULTIPOLYGON (((48.2394 -16..., MULTIPOLYGON (((45.7685 -19..., MULTIPOLYGON (((46.74…
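
In the output above, name_1 is NA for the admin0 row. When picking out a single region by name, a plain logical comparison carries that NA through into the result. The sketch below illustrates this on an ordinary data.frame that mocks three of the attribute columns; it does not use a real sf object, whose subsetting behaviour may differ.

```r
# Mock of a few attribute columns from the getShp() output (no geometry).
shp_attrs <- data.frame(
  iso        = c("MDG", "MDG", "MDG"),
  admn_level = c(0L, 1L, 1L),
  name_1     = c(NA, "Alaotra Mangoro", "Analamanga"),
  stringsAsFactors = FALSE
)

# Naive subsetting: the NA in name_1 yields an extra all-NA row.
naive <- shp_attrs[shp_attrs$name_1 == "Analamanga", ]

# which() drops the NA comparison and keeps only true matches.
safe <- shp_attrs[which(shp_attrs$name_1 == "Analamanga"), ]

print(nrow(naive))
print(nrow(safe))
```

Wrapping the condition in which() is one common way to make the selection NA-safe; filtering on admn_level first would work as well.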

autoplot.sf configures the autoplot method to enable quick mapping of downloaded shapefiles.

autoplot(MDG_shp)

N.B. Facet-wrapped option is also available for stratification by administrative level.

autoplot(MDG_shp,
         facet = TRUE,
         map_title = "Example of facetted shapefiles.")

Modelled Rasters

getRaster() downloads publicly available MAP rasters for a specific dataset_id and year, clipped to a given bounding box or shapefile.

MDG_shp <- getShp(ISO = "MDG", admin_level = "admin0")
MDG_PfPR2_10 <- getRaster(dataset_id = "Explorer__2020_Global_PfPR", shp = MDG_shp, year = 2013)

autoplot.SpatRaster and autoplot.SpatRasterCollection configure the autoplot method to enable quick mapping of downloaded rasters.

p <- autoplot(MDG_PfPR2_10, shp_df = MDG_shp)

Combined visualisation

By using the above tools along with ggplot, simple comparison figures can be easily produced.

library(malariaAtlas)
library(ggplot2)

MDG_shp <- getShp(ISO = "MDG", admin_level = "admin0")
MDG_PfPR2_10 <- getRaster(dataset_id = "Explorer__2020_Global_PfPR", shp = MDG_shp, year = 2013)

# printed = FALSE: build the plot without printing it, so layers can be added
p <- autoplot(MDG_PfPR2_10, shp_df = MDG_shp, printed = FALSE)

pr <- getPR(country = c("Madagascar"), species = "Pf")
# Overlay the 2013 survey points on the raster map
p[[1]] +
  geom_point(data = pr[pr$year_start == 2013, ], aes(longitude, latitude, fill = positive / examined, size = examined), shape = 21) +
  scale_size_continuous(name = "Survey Size") +
  scale_fill_distiller(name = "PfPR", palette = "RdYlBu") +
  ggtitle("Raw PfPR Survey points\n + Modelled PfPR 2-10 in Madagascar in 2013")
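
The fill aesthetic above is positive / examined, and the pr column returned by getPR() appears to hold the same ratio. A small offline sketch, using the first three rows of the Madagascar output shown earlier (no download involved), checks that arithmetic and the same kind of year filter:

```r
# First three rows of the Madagascar getPR() output shown earlier.
survey <- data.frame(
  year_start = c(1989L, 1987L, 1989L),
  examined   = c(165, 50, 258),
  positive   = c(144, 7.5, 139)
)

# Prevalence as the proportion of examined individuals who tested positive.
survey$pr <- survey$positive / survey$examined

# Same style of row filter as pr[pr$year_start == 2013, ] in the plot code.
surveys_1989 <- survey[survey$year_start == 1989, ]

print(round(survey$pr, 4))
print(nrow(surveys_1989))
```

The computed values agree with the pr column printed in the README (0.8727..., 0.15, 0.5387...), which supports reading pr as positive / examined for these rows.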

Similarly for vector survey data:

MMR_shp <- getShp(ISO = "MMR", admin_level = "admin0")
MMR_An_dirus <- getRaster(dataset_id = "Explorer__2010_Anopheles_dirus_complex", shp = MMR_shp)

p <- autoplot(MMR_An_dirus, shp_df = MMR_shp, printed = FALSE)

vec <- getVecOcc(country = c("Myanmar"), species = "Anopheles dirus")
p[[1]] +
  geom_point(data = vec, aes(longitude, latitude, colour = species)) +
  scale_colour_manual(values = "black", name = "Vector survey locations") +
  scale_fill_distiller(name = "Predicted distribution of An. dirus complex", palette = "PuBuGn", direction = 1) +
  ggtitle("Vector Survey points\n + The predicted distribution of An. dirus complex")

Installation

Latest stable version from CRAN

Just install using install.packages("malariaAtlas") or using the package manager in RStudio.

Latest version from GitHub

While this version is not as well-tested, it may include additional bugfixes not in the stable CRAN version. Install the devtools package and then install using devtools::install_github('malaria-atlas-project/malariaAtlas').

malariaatlas's People

Contributors

danpfeffer, geryan, joemap, mauricio-tki, ojwatson, shk313, timcdlucas

malariaatlas's Issues

Error message with fillDHSCoordinates

Hello - thank you for creating this great package! I'm trying to use the fillDHSCoordinates function to link the DHS geo-coordinates to parasite prevalence rates from the Malaria Atlas project. However the following command:

pf <- getPR(country = c("Nigeria", "Cameroon", "Malawi"), species = "pf")
pf <- fillDHSCoordinates(pf, 
                        email = "[email protected]",
                        project = "My Project Name")

leads to the following error message:

Writing your configuration to:
   -> ~/Library/Caches/rdhs/rdhs.json

Error in handle_pagination_json(endpoint, query, all_results, timeout) : 
  Records returned equal to 0. Most likely your query terms are too specific or there is a typo that does not trigger a 404 or 500 error

I've checked and ensured that the email address and the project name are consistent with what I had in my DHS account. I also have access to the GPS data I try to link. Could you give me a hint on how to correctly link the DHS to malaria Atlas data here? Many thanks!

fillDHSCoordinates - "These requested datasets are not available from your DHS login credentials"

I am trying to download the DHS data from Tanzania.

I run the code:

    username = "[email protected]"
    projectname = "Predicting malaria from spatiotemporal data"
    TANZ_pr_data <- getPR(country = "Tanzania", species = "both")
    TANZ_pr_data_plus <- fillDHSCoordinates(TANZ_pr_data,
                                            email = username,
                                            project = projectname,
                                            password_prompt = TRUE)

I get the following error:

  Please enter password in TK window (Alt+Tab)
  Writing your configuration to:
     -> C:\Users\r-kli\AppData\Local/r-kli/rdhs/Cache/rdhs.json
  
  Downloading DHS data.
  These requested datasets are not available from your DHS login credentials:
  ---
  TZGE52FL.zip, TZGE6AFL.ZIP, TZGE7AFL.zip, TZGE7IFL.ZIP
  ---
  Please request permission for these datasets from the DHS website to be able to download them
  Error in readRDS(geo[[file_name]]) : bad 'file' argument

I have access to the files and can download them from the DHS website.

Are there any restrictions on the format of the username or password or what could the mistake be?

Being good citizens

Thinking about the package license vs data license discussion we had made me think. If someone is using data (under CC-BY or something) accessed via our package, they should still cite the data source as well as our package.

I guess we should add lines in the documentation and citation info that citing our package does not cover their citation obligations.

Now think this should only go in the citation info. Probably won't bother making citation info until AFTER the paper is published (i.e. we want people to cite the paper.) So moving this to a new milestone called post-paper.

Write vignette

It's not much more than the readme. Maybe just some str() calls etc.

getRaster fails for default PR due to multiple Plasmodium falciparum PR2-10 titles

Hi there,

Noticed the following that used to work, now errors as follows:

ken <- malariaAtlas::getShp("Kenya")
#> OGR data source with driver: ESRI Shapefile 
#> Source: "/tmp/RtmpMaNcl9/shp/shp413a46542f0d/mapadmin_0_2018.shp", layer: "mapadmin_0_2018"
#> with 1 features
#> It has 8 fields
raster <- malariaAtlas::getRaster(shp=ken,year = 2008:2015,surface = "Plasmodium falciparum PR2-10")
#> Downloading list of available rasters...
#> All specified surfaces are available to download.
#> Checking if the following Surface-Year combinations are available to download:
#> 
#>     RASTER CODE  YEAR
#>   - Plasmodium falciparum PR2-10:  2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015
#> Error in year[[which(raster_code_list == i)]]: subscript out of bounds

I had a debug and seems to be because there are multiple rasters with the title Plasmodium falciparum PR2-10 - c("2015_Nature_Africa_PR","2018_Global_PfPR"), which then causes issues later when trying to match the years to the second raster.
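
The ambiguity described above can be illustrated offline with a mocked raster listing. The sketch below is not the package's internal code: the data.frame and its column names are assumptions for illustration, the two dataset ids are the ones quoted in this issue, and the third row is made up.

```r
# Mock listing: two rasters share one title (ids taken from the issue text).
rasters <- data.frame(
  dataset_id = c("2015_Nature_Africa_PR", "2018_Global_PfPR", "hypothetical_other_id"),
  title      = c("Plasmodium falciparum PR2-10",
                 "Plasmodium falciparum PR2-10",
                 "Some other surface"),
  stringsAsFactors = FALSE
)

# Titles that map to more than one raster cannot identify a surface uniquely.
ambiguous_titles <- unique(rasters$title[duplicated(rasters$title)])

# Looking up by title returns two ids, so code expecting one match breaks.
matches <- rasters$dataset_id[rasters$title == "Plasmodium falciparum PR2-10"]

print(ambiguous_titles)
print(matches)
```

A lookup keyed on a unique identifier rather than on the title avoids this class of problem, which is consistent with the dataset_id argument used by getRaster() in the current README.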

Let me know if there is anything else that would be useful for debugging.


Session Info
sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 19

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] raster_2.8-4    sp_1.3-1        forcats_0.3.0   stringr_1.3.1   dplyr_0.8.0.1   purrr_0.2.5     readr_1.3.1     tidyr_0.8.2     tibble_2.0.1    ggplot2_3.1.0  
[11] tidyverse_1.2.1

loaded via a namespace (and not attached):
 [1] xfun_0.4           tidyselect_0.2.5   haven_2.0.0        lattice_0.20-38    malariaAtlas_0.0.3 colorspace_1.4-0   generics_0.0.2     htmltools_0.3.6   
 [9] rlang_0.3.1        pillar_1.3.1       glue_1.3.0         withr_2.1.2        modelr_0.1.2       readxl_1.2.0       plyr_1.8.4         munsell_0.5.0     
[17] gtable_0.2.0       cellranger_1.1.0   rvest_0.3.2        codetools_0.2-16   evaluate_0.12      knitr_1.21         callr_3.1.1        ps_1.3.0          
[25] curl_3.2           broom_0.5.1        Rcpp_1.0.0         clipr_0.4.1        scales_1.0.0       backports_1.1.3    jsonlite_1.6       fs_1.2.6          
[33] digest_0.6.18      hms_0.4.2          packrat_0.5.0      stringi_1.2.4      processx_3.2.1     grid_3.5.2         rgdal_1.3-6        cli_1.0.1         
[41] tools_3.5.2        magrittr_1.5       lazyeval_0.2.1     whisker_0.3-2      crayon_1.3.4       pkgconfig_2.0.2    data.table_1.11.8  xml2_1.2.0        
[49] reprex_0.2.1       lubridate_1.7.4    assertthat_0.2.0   rmarkdown_1.11     httr_1.4.0         rstudioapi_0.8     R6_2.3.0           nlme_3.1-137      
[57] compiler_3.5.2    

extent plus species doesn't work

#d2 <- getVecOcc(extent = matrix(c(-100, -30, -40, -10), nrow = 2), species = 'Anopheles gambiae')
#expect_true(inherits(d2, 'vector.points'))
#expect_true(nrow(d2) > 0)
#expect_true(length(unique(d2$country)) > 1)

having a bbox and a cql_filter together doesn't seem to work.

`getShp` returns error when passed the 'ALL' parameter

When using the version 1.5.1 of the interface installed via remotes::install_github("malaria-atlas-project/malariaAtlas") the getShp function returns the following errors:

> shp <- malariaAtlas::getShp(ISO = "ALL", version = '202206')
Error in malariaAtlas::getShp(ISO = "ALL", version = "202206") : 
  One or more ISO codes are wrong
> shp <- malariaAtlas::getShp(country = "ALL", version = '202206')
Error in malariaAtlas::getShp(country = "ALL", version = "202206") : 
  One or more country names are wrong

Based upon the documentation hosted on rdrr.io, I would expect to be able to download all of the shapefiles using either of these commands.

Bump up test coverage

In the short term, simple tests for all autoplot functions will make the README look greener...

Authenticated internal use

If we want this package to replace many lines of code in our internal work, we need it to get the DHS data (and other private data) via authenticated access.

Ideally, we want to keep this as hidden from external users as possible. So for example, we don't want

d <- getPR('ZAF', username = 'tim', password = 'mystrongpassword')

because the docs on CRAN will just have to say "this is not for you". Which is not good software documentation.

Make sure "I can't find this raster" doesn't get saved.

Going to be hard to make a reproducible example.

For example

  1. Turn off internet
  2. Fetch (and fail) raster
  3. Turn on internet
  4. Fetch.

This should work but might possibly fail if the results from 2 get saved.
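
One way to reason about the four steps above without touching the network is a small cache that stores only successful results. Everything in this sketch is hypothetical (fetch_cached, fake_fetch and the online flag are not part of malariaAtlas); it only shows the behaviour the issue asks for.

```r
# Session-level cache; failures are never written to it.
cache <- new.env(parent = emptyenv())

# Hypothetical wrapper: return a cached value, or fetch and cache on success.
fetch_cached <- function(key, fetch_fun) {
  if (exists(key, envir = cache, inherits = FALSE)) {
    return(get(key, envir = cache, inherits = FALSE))
  }
  result <- tryCatch(fetch_fun(key), error = function(e) NULL)
  if (!is.null(result)) {
    assign(key, result, envir = cache)  # only successes are saved
  }
  result
}

# Stand-in for a download that fails while "offline".
online <- FALSE
fake_fetch <- function(key) {
  if (!online) stop("cannot open the connection")
  paste("raster for", key)
}

first <- fetch_cached("some_raster", fake_fetch)   # steps 1-2: fails, nothing cached
online <- TRUE                                     # step 3
second <- fetch_cached("some_raster", fake_fetch)  # step 4: succeeds, now cached
print(is.null(first))
print(second)
```

Because the failed attempt leaves the cache untouched, the second fetch goes through without restarting the session.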

I believe this happened when Dan M fixed an issue. I had to restart my session. But not 100%.

More reproducible PR data retrieval

One aim for the package is for it to enable reproducible analyses.

If you write a script

d <- getPR()
lm(x ~ y, data = d)

and I run the script 6 months later, getting the same result is useful. But currently, if data has been added, we will get different results.

Spoke to Joe and he said

"
the code i wrote for importing new PR data creates a log table with timestamps correlated with the created IDs

so i think it should be a relatively simple matter to change the process Daniel uses to export PR data into the explorer to include a outer join to this log and provide the info the api function would need to filter and include stuff only as it was at a certain date (assuming all the dates are in the future, i think we have no ability to tell when PR data added before the log table existed were put in)
"

So this should be possible. In future the syntax would be:

d <- getPR(asAt = '2018-02-02')
lm(x ~ y, data = d)

Which should be completely reproducible. Then to update the analysis you'd simply do

d <- getPR(asAt = '2018-06-07')
lm(x ~ y, data = d)
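
The filtering that an asAt argument would imply can be sketched offline. The asAt parameter is only proposed in this issue, and the log table below, with its added_on column, is a made-up stand-in for the timestamp log Joe describes.

```r
# Hypothetical survey log: each record carries the date it was added.
pr_log <- data.frame(
  id       = 1:4,
  pr       = c(0.10, 0.25, 0.40, 0.05),
  added_on = as.Date(c("2018-01-15", "2018-02-01", "2018-03-20", "2018-06-30"))
)

# Keep only records that already existed on the requested date.
filter_as_at <- function(data, asAt) {
  data[data$added_on <= as.Date(asAt), , drop = FALSE]
}

d_feb <- filter_as_at(pr_log, "2018-02-02")
d_jun <- filter_as_at(pr_log, "2018-06-07")

print(nrow(d_feb))
print(nrow(d_jun))
```

Rerunning with the same asAt date always selects the same rows, however many records are added later, which is the reproducibility property the issue is after.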

Liberally add skip_on_cran to tests

ggplot2 devs got failing tests from us when they were testing reverse dependencies. Pretty sure it was just a connection fail. But want to avoid these as much as possible.

Duplicated surface names

r <- getRaster('Prevalence of improved housing')

r <- getRaster('Relative Abundance Africa')

These two names refer to multiple rasters and don't work.

Bug: shapefiles for the USA not loading properly

I use malariaAtlas in my example script for making custom accessibility surfaces. In the past, I've used the following getShp code to download a shapefile for Colorado state:

USA.shp <- malariaAtlas::getShp(ISO = "USA", admin_level = "admin1")
analysis.shp <- USA.shp[USA.shp@data$name=="Colorado",]

Running this code now returns analysis.shp as a SpatialPolygonsDataFrame with 0 features, which makes sense because the @data field of USA.shp is no longer populated:

[screenshot]

The same is true for other non-malaria-endemic countries like Italy, but countries like Zambia and Vietnam load just fine. Are there now only subnational shapefiles available for endemic countries? Can we bring the non-endemic countries back?

Change over to new database

Hi @danielmamay.

The functions in this package aren't working now. Did you switch over to the new database?
For example

utils::read.csv("https://map.ox.ac.uk/geoserver/Explorer/ows?service=wfs&version=2.0.0&request=GetFeature&outputFormat=csv&TypeName=surveys_pr&PROPERTYNAME=country,country_id,continent_id", encoding = "UTF-8")

returns some mess.

If so what's the easiest way for me to work out how to fix it given that I don't know anything about these calls to the server? Is it best for me to sit down with you at some point? Or is it easy to explain?

Thanks,

Pf temperature suitability not working

@timcdlucas the explorer is showing this as static for me? maybe you have the time-varying pfpr underneath on the explorer and this is what is showing the time-bar? will investigate the getRaster for MDG

Oh quite possibly. That seems a bit weird that the time-bar doesn't disappear when a different layer is on top. But maybe it makes sense.

Feature Request : Python implementation

Hi,

Wanted to say thanks for all the time and effort that went into this great package! Are there any plans to have this implemented as a Python package?

Regards,

Reuben

Get extractRaster fixed

I don't know if this is server side, but I can't see anything wrong package side.

extractRaster doesn't work for any layers.

Shape data not downloadable

Hi there,

Thank you for the great package! I just got some error message when I tried to download some shape files. Could you help?

Input:
listData(datatype = "shape")

Error message:

[1] "Error in file(file, \"rt\") : \n cannot open the connection to 'https://malariaatlas.org/geoserver/Explorer/ows?service=wfs&version=2.0.0&request=GetFeature&outputFormat=csv&TypeName=mapadmin_0_2018&PROPERTYNAME=iso,admn_level,name_0,id_0,type_0,source'\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in file(file, "rt"): cannot open the connection to 'https://malariaatlas.org/geoserver/Explorer/ows?service=wfs&version=2.0.0&request=GetFeature&outputFormat=csv&TypeName=mapadmin_0_2018&PROPERTYNAME=iso,admn_level,name_0,id_0,type_0,source'>

Input:

MDG_shp <- getShp(ISO = "MDG", admin_level = c("admin0", "admin1"))

Error message:

Error in file(file, "rt") : cannot open the connection to 'https://malariaatlas.org/geoserver/Explorer/ows?service=wfs&version=2.0.0&request=GetFeature&outputFormat=csv&TypeName=mapadmin_0_2018&PROPERTYNAME=iso,admn_level,name_0,id_0,type_0,source'
In addition: Warning message:
In file(file, "rt") : cannot open URL 'https://malariaatlas.org/geoserver/Explorer/ows?service=wfs&version=2.0.0&request=GetFeature&outputFormat=csv&TypeName=mapadmin_0_2018&PROPERTYNAME=iso,admn_level,name_0,id_0,type_0,source': HTTP status was '400 Bad Request'
Error in file(file, "rt") : cannot open the connection to 'https://malariaatlas.org/geoserver/Explorer/ows?service=wfs&version=2.0.0&request=GetFeature&outputFormat=csv&TypeName=mapadmin_0_2018&PROPERTYNAME=iso,admn_level,name_0,id_0,type_0,source'
Error in file(file, "rt") : cannot open the connection to 'https://malariaatlas.org/geoserver/Explorer/ows?service=wfs&version=2.0.0&request=GetFeature&outputFormat=csv&TypeName=mapadmin_0_2018&PROPERTYNAME=iso,admn_level,name_0,id_0,type_0,source'

Please remove dependencies on **rgdal**, **rgeos**, and/or **maptools**

This package depends on (depends, imports or suggests) raster and one or more of the retiring packages rgdal, rgeos or maptools (https://r-spatial.org/r/2022/04/12/evolution.html, https://r-spatial.org/r/2022/12/14/evolution2.html). Since raster 3.6.3, all use of external FOSS library functionality has been transferred to terra, making the retiring packages very likely redundant. It would help greatly if you could remove dependencies on the retiring packages as soon as possible.

And disaggregation too, please!
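
One way to confirm whether a package still lists the retiring packages is to read its DESCRIPTION file directly. This is only a sketch: `retiring_deps` is a hypothetical helper, and the DESCRIPTION contents in the example are made up for illustration.

```r
# Report which retiring packages appear in Depends, Imports or Suggests.
# `retiring_deps` is a hypothetical helper, not part of any package.
retiring_deps <- function(description_path,
                          retiring = c("rgdal", "rgeos", "maptools")) {
  fields <- c("Depends", "Imports", "Suggests")
  dcf <- read.dcf(description_path, fields = fields)
  found <- lapply(fields, function(f) {
    raw <- dcf[1, f]
    if (is.na(raw)) return(character(0))  # field not present
    pkgs <- strsplit(raw, ",", fixed = TRUE)[[1]]
    # Drop version constraints such as "(>= 3.6.3)" and surrounding whitespace
    pkgs <- trimws(sub("\\(.*\\)", "", pkgs))
    intersect(pkgs, retiring)
  })
  stats::setNames(found, fields)
}

# Made-up DESCRIPTION for illustration only
tmp <- tempfile()
writeLines(c(
  "Package: examplepkg",
  "Imports: raster (>= 3.6.3), rgdal,",
  "    sf",
  "Suggests: maptools, testthat"
), tmp)

res <- retiring_deps(tmp)
res
```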

check `file_path` before download begins in `getRaster`

feature request

Hi MAP team!

If file_path is somehow misspecified, getRaster only fails at the end of a long, slow download. That is annoying for me, and I suspect very, very frustrating for colleagues with slower internet.

Something like stopifnot(dir.exists(file_path)) at the beginning of the function should solve this.

Related: the writeRaster(...... overwrite = FALSE) hard coding would also lead to (unnecessary) failure after a long download if the file already existed. Checking if the target file exists before downloading would be kind - this is how geodata::gadm operates, i.e., if files exist it just reads them in instead of re-downloading.
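
A rough sketch of both checks is below, assuming the target file name is known before the download starts. `check_raster_target` and its `file_name` argument are hypothetical and are not part of getRaster; this only illustrates failing fast and reusing an existing file.

```r
# Validate the download target before any data is fetched.
# `check_raster_target` is a hypothetical helper, not part of malariaAtlas.
check_raster_target <- function(file_path, file_name, overwrite = FALSE) {
  # Fail immediately if the directory is wrong, instead of after the download
  if (!dir.exists(file_path)) {
    stop("`file_path` does not exist: ", file_path, call. = FALSE)
  }
  target <- file.path(file_path, file_name)
  # If the file is already there, signal that it can be read in directly
  if (file.exists(target) && !overwrite) {
    return(list(path = target, download = FALSE))
  }
  list(path = target, download = TRUE)
}

# Example using a temporary directory
out_dir <- file.path(tempdir(), "raster_check_demo")
dir.create(out_dir, showWarnings = FALSE)
unlink(file.path(out_dir, "example.tif"))  # start from a clean state

first <- check_raster_target(out_dir, "example.tif")   # nothing on disk yet
file.create(first$path)
second <- check_raster_target(out_dir, "example.tif")  # file now exists
```

Here `first$download` is `TRUE` and `second$download` is `FALSE`, so a caller could skip the download on the second call and read the existing file instead.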

Use rdhs for dhs data

Hey,

I was sending someone who wanted to get all the data used for the malaria maps to this package and noticed the DHS coordinates were missing and then saw this issue :)

The following gets you very close to what you may want. I've started it in a fork, but there were a couple of dhs_ids I could not match correctly within the DHS surveys, which are commented in the code below.

Most of the function documentation is the same as that for rdhs::set_rdhs_config, which does the auth bits for you.

Let me know what you think/any ideas on the odd dhs_ids

Ta, OJ

#' Add DHS locations to malaria data
#'
#'
#' @inheritParams rdhs::as_factor
#' @param data Data to add DHS coordinates to
#' @examples 
#' 
#' pf <- malariaAtlas::getPR("all",species = "pf")
#' pf <- fillDHSCoordinates(pf, 
#' email = "[email protected]",
#' project = "Testing Malaria Investigations")

fillDHSCoordinates <- function(data,
                                email = NULL, project = NULL, 
                                cache_path = NULL, config_path = NULL, 
                                global = TRUE, verbose_download = FALSE, 
                                verbose_setup = TRUE, data_frame = NULL, 
                                timeout = 30, password_prompt = FALSE, 
                                prompt = TRUE) {
  
  # set up a config for rdhs
  rdhs::set_rdhs_config(email = email, project = project, cache_path = cache_path, config_path = config_path, 
    global = global, verbose_download = verbose_download, verbose_setup = verbose_setup, 
    data_frame = data_frame, timeout = timeout, password_prompt = password_prompt, 
    prompt = prompt)

  # get stems and remove blanks
  dhs_id_stems <- unique(substr(data$dhs_id, 1, 6))
  dhs_id_stems <- dhs_id_stems[nchar(dhs_id_stems)==6]
  
  # then there are some odd dhs ids I noticed
  dhs_id_stems[dhs_id_stems=="MDG201"] <- "MD2011"
  
  # I couldn't find the following ids in the datasets
  # dhs_id_stems[dhs_id_stems=="BI2012"] <- "BU2012"
  # dhs_id_stems[dhs_id_stems=="MZ2014"] <- "MZ2014"
  
  # find the necessary geographic data files from the DHS API
  dats <- rdhs::dhs_datasets(countryIds = unique(substr(dhs_id_stems, 1, 2)),
                             surveyYear = unique(substr(dhs_id_stems, 3, 6)),
                             fileType = "GE")
  dats <- dats[which(substr(dats$SurveyId, 1, 6) %in% dhs_id_stems),]
  
  # download the datasets
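  geo <- rdhs::get_datasets(dats)
  no_permission <- "Dataset is not available with your DHS login credentials"
  # keep only accessible datasets; logical indexing is safe when nothing matches,
  # whereas geo[-which(...)] would drop every element in that case
  geo <- geo[unlist(geo) != no_permission]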
  
  # missing info (can add more depending on factors, e.g. encoding of urban/rural)
  mis_info <- c("dhs_id","site_id", "latitude", "longitude")
  dhs_info <- c("DHSID","DHSCLUST", "LATNUM", "LONGNUM")
  
  # fill in blanks
  for(stem in dhs_id_stems) {
    
    # what file does the stem relate to
    file_name_match <- dats$FileName[which(substr(dats$SurveyId, 1, 6) == stem)]
    file_name <- sub("\\.zip$", "", file_name_match, ignore.case = TRUE)
    
    # did we find that file
    if (length(file_name)==1) {
      
      # read in the data and then fill in blanks
      shp <- readRDS(geo[[file_name]])@data
      matches <- match(shp$DHSID,data$dhs_id)
      
      data[na.omit(matches), mis_info] <- shp[which(!is.na(matches)), dhs_info]
      
    } 
  }
  
  return(data)
  
}

Originally posted by @OJWatson in #5 (comment)

Switch to Australian domain

This will be annoying.

We will have to change the package. Everyone's install will break etc.

Things we should do considerably before the change:

  • Add a message that prints when the package loads?
  • Add a big banner on the github readme
  • Email/twitter to notify anyone we know that uses the package.

edit: I didn't do these. Oh well.
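
For reference, the load-time message could be done with an `.onAttach` hook, which by convention lives in a file such as `R/zzz.R`. This is only a sketch and the message wording is a placeholder.

```r
# Print a notice when the package is attached with library().
# The message text below is a placeholder, not an official announcement.
.onAttach <- function(libname, pkgname) {
  packageStartupMessage(
    "The data server behind this package is moving to a new domain. ",
    "Please update to the latest version once it is released."
  )
}
```

Using `packageStartupMessage()` rather than `message()` lets users silence the notice with `suppressPackageStartupMessages()`.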

Things to do when we know the new server set up:

  • Create a branch of the package that reads from the new server and test it.
  • Do all checks for CRAN so that the new version of the package definitely functions.

Things to do when the server switches:

  • Update GitHub master to have the new URL
  • Update the big banner on the github readme
  • Update the package on CRAN
  • Advertise the fact that everyone needs to update their package install.

Trailing whitespaces in listRaster()

I found a trailing whitespace in the title for the walking-only friction surface, so I want to go through and look for others. I'll try to do it this weekend and submit a PR.
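
A quick way to look for the others might be something like the sketch below. It assumes the titles are available as a character column called `title`; the data frame here is made up and `find_untrimmed` is a hypothetical helper.

```r
# Return the titles that have leading or trailing whitespace.
# `find_untrimmed` is a hypothetical helper for illustration.
find_untrimmed <- function(titles) {
  titles[titles != trimws(titles)]
}

# Made-up stand-in for the table of available rasters
rasters <- data.frame(
  title = c("Surface A", "Surface B ", " Surface C"),
  stringsAsFactors = FALSE
)

bad <- find_untrimmed(rasters$title)   # titles that need fixing
rasters$title <- trimws(rasters$title) # cleaned copy
```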

Re-add R-devel to Travis

27e6672

removed because travis was failing because rcpp was installed before R 4.0.0

I don't know how to fix it, but probably once 4.0.0 is released it will all fix itself.

New friction surfaces don't work

Hi all,

Some of the new friction surfaces don't seem to work.

USA_shp <- getShp(ISO = "USA", admin_level = "admin0")
USA_moto_19 <- getRaster(surface = 'Global friction surface enumerating land-based travel speed with access to motorized transport for a nominal year 2019', shp = USA_shp)
Downloading list of available rasters...
The following surfaces have been incorrectly specified, use listRaster to confirm spelling of raster 'title':
  - Global friction surface enumerating land-based travel speed with access to motorized transport for a nominal year 2019

The old accessibility surfaces seem to work though, so I guess it's just a naming issue.

Add data dictionary to getPR

Currently looks like this:

COLUMNNAME description of contents

COLUMNNAME description of contents

COLUMNNAME description of contents
