epiverse-trace / epiCo

R package for statistical and spatiotemporal modeling of epidemiological data for vector-borne diseases in Colombia

Home Page: https://epiverse-trace.github.io/epiCo/

License: Other

R 100.00%
colombia decision-support demographics epiverse outbreak-analysis r r-package spatio-temporal-analysis vector-borne-diseases

epico's People

Contributors

bisaloo, jd-otero, juan-umana, juanmontenegro99

epico's Issues

Error in the introduction - Demographic variables

When using Google Translate, the introduction shows a wording problem. For example:

El módulo le permite:

-Consulta y visualiza la pirámide poblacional de un municipio, departamento o país para un año de interés.
-Consulte las definiciones de variables demográficas como etnias, grupos especiales de población y etiquetas ocupacionales.

As shown, the correct word is neither "consulta" nor "consulte"; it should be "consultar" in both cases.

Demographic describe_ethnicity

Please place an "x" in all the boxes that apply

  • I have the most recent version of linelist and R
  • I have found a bug

Please include a brief description of the problem with a code example:

The output of describe_ethnicity still contains a literal "\n" in the strings.

Bug age_risk gender

There is a bug in the age_risk function when you pass gender as NULL. It shows the following error:

Error in plot.window(xlim, ylim, "", ...) :
se necesitan valores finitos de 'ylim'

Endemic channel plot

data_ibague <- data_event[data_event$cod_mun_o == 73675, ]
When the municipality code is changed, an error occurs that prevents plotting.
ERROR:
Error in data.frame(central = central, up_lim = up_lim, low_lim = low_lim, :
arguments imply differing number of rows: 53, 52
Además: Warning messages:
1: In endemic_channel(incidence_historic = incidence_historic, observations = observations, :
Data prior to 2016-01-03 were not used for the endemic channel calculation.
2: In endemic_channel(incidence_historic = incidence_historic, observations = observations, :
Data after 2020-12-27 were not used for the endemic channel calculation.
3: In matrix(counts_historic, nrow = length(years), byrow = TRUE) :
la longitud de los datos [261] no es un submúltiplo o múltiplo del número de filas [5] en la matriz

Edit epi_data built-in dataset

Please place an "x" in all the boxes that apply

  • I have the most recent version of linelist and R
  • I have found a bug
  • I have a reproducible example
  • I want to request a new feature

Bugs on epi_data

  1. epiCo functions require complete DIVIPOLA codes (the first two digits correspond to the department and the last three to the municipality). In the built-in epi_data dataset only the municipality of notification has this structure; the municipalities of residence and occurrence should use the same format.

     These codes also need cleaning: some have a leading zero, producing codes such as 05250 instead of 5250, and the incidence functions treat these as different groups. This seems to happen for municipalities in departments with a one-digit code.

  2. Country of residence should be cleaned: some cases reported as living in Colombia (code 170) have NA or NULL values in this column, and codes for other countries should not start with a zero.

  3. Age in years: calculate age in years from the edad and uni_med columns.

  4. Coherence between character and numeric types: we need to decide whether to store code and age columns as numeric or character, and use them consistently in functions and documentation. For example, the age_risk example in the demographic vignette selects municipality data using a numeric code, while the data is stored as character.
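
The code cleaning described in this issue could be sketched as follows. This is only an illustration: `normalize_divipola()` is a hypothetical helper, not part of epiCo, and it assumes that the canonical form is a five-character, zero-padded string (two department digits plus three municipality digits). The issue does not settle which form the package should adopt.

```r
# Hypothetical helper (not part of epiCo): bring municipality codes to one
# canonical form so that "5250" and "05250" end up in the same group.
# Assumption: the canonical form is a 5-character zero-padded string.
normalize_divipola <- function(codes, width = 5L) {
  codes <- trimws(as.character(codes))
  # Treat empty strings and textual NA/NULL markers as missing
  codes[codes %in% c("", "NA", "NULL")] <- NA_character_
  is_num <- !is.na(codes) & grepl("^[0-9]+$", codes)
  out <- rep(NA_character_, length(codes))
  # Left-pad with zeros up to `width` digits
  out[is_num] <- formatC(as.integer(codes[is_num]),
                         width = width, flag = "0", format = "d")
  out
}

normalize_divipola(c("5250", "05250", "73001", NA))
```

Anything that is not made up of digits only is returned as NA, so unexpected values surface instead of silently forming their own group.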


EWMA

Documentation of detect_outbreaks_EWMA.R

Endemic channels: Calculation of the standard deviation

According to PAHO, the cut-off points to determine the channels are calculated based on the geometric mean (expected value), the standard deviation and the value of the t distribution.

  • $E(\mu) - t*SD(\mu)/\sqrt{n}$
  • $E(\mu)$
  • $E(\mu) + t*SD(\mu)/\sqrt{n}$

Where $\mu$ is a transformation of the cases and

$SD(\mu) = \sqrt{Var(\mu)}$

$Var(\mu) = \frac{1}{n} \sum (\mu - E(\mu) )^2$

The endemic channel function uses stats::sd(x) to calculate the standard deviation. This function is based on stats::var(x), which calls a C routine that takes the arithmetic mean as the expected value.

I therefore recommend that, in the code that determines the endemic channels, the variance (and hence the standard deviation) be calculated using the geometric mean as the expected value. A helper function could be added for this.
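
A minimal sketch of such a helper is shown below. The function names, the 95% confidence level and the use of n - 1 degrees of freedom for the t value are assumptions made for illustration; they are not taken from epiCo or from the PAHO text. The inputs must be strictly positive for the geometric mean to be defined.

```r
# Geometric mean of strictly positive values.
geometric_mean <- function(x) exp(mean(log(x)))

# Standard deviation measured around an arbitrary centre, using the 1/n
# variance from the formulas above. By contrast, stats::sd() centres on
# the arithmetic mean and divides by n - 1.
sd_about <- function(x, centre) sqrt(mean((x - centre)^2))

# Lower, central and upper limits: centre -/+ t * SD / sqrt(n).
# Assumptions: two-sided confidence level, t with n - 1 degrees of freedom.
channel_limits <- function(x, conf_level = 0.95) {
  n <- length(x)
  centre <- geometric_mean(x)
  t_value <- stats::qt(1 - (1 - conf_level) / 2, df = n - 1)
  half_width <- t_value * sd_about(x, centre) / sqrt(n)
  c(low_lim = centre - half_width,
    central = centre,
    up_lim = centre + half_width)
}

cases_week_10 <- c(1, 2, 4)  # made-up counts for one epidemiological week
channel_limits(cases_week_10)
```

Keeping the centre as an explicit argument of `sd_about()` makes it easy to compare the result against the arithmetic-mean version.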

Error in age_risk function

Please place an "x" in all the boxes that apply

  • [ ] I have the most recent version of linelist and R
  • [x] I have found a bug
  • [x] I have a reproducible example
  • [ ] I want to request a new feature

Please include a brief description of the problem with a code example:
This is the line that I run
age_risk_data <- age_risk(
  age = data_ibague$edad,
  population_pyramid = ibague_pyramid_2019,
  gender = data_ibague$sexo, plot = TRUE
)

and this is the error
Error in age_risk(age = data_ibague$edad, population_pyramid = ibague_pyramid_2019, :
population_pyramid should include gender

Endemic channel vignette

The expected results of the code should be added to the vignettes so that a comparison can be made.

poisson test

Poisson test used by the National Health Institute to evaluate significant difference in incidence on hypoendemic municipalities
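
As a rough illustration of what such a test could look like in base R, the sketch below compares an observed count with an expected one using stats::poisson.test(). The numbers are made up, and the one-sided alternative and the 0.05 threshold are assumptions; the exact procedure used by the National Health Institute is not described in this issue.

```r
# Hypothetical inputs: cases observed in the current period and the
# expected count for the same period taken from historical data.
observed_cases <- 12
expected_cases <- 4

# One-sided exact Poisson test: is the observed count higher than expected?
result <- stats::poisson.test(observed_cases, T = 1, r = expected_cases,
                              alternative = "greater")

# Assumed significance threshold of 0.05
result$p.value < 0.05
```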

Offer spatial data as sf object

This issue is to request that the Colombia administrative region spatial data should be provided as sf objects.

The advantages are that sf objects play well with Tidyverse functions and packages including {dplyr} and {ggplot2} allowing easy subsetting and plotting; {sf} is the preferred package for modern spatial data science in R, and is taught in texts such as Geocomputation with R

sf objects can also be handled by {spdep}, and {spdep} depends on {sf}, so most users will already have it installed.

Fix describe_ethnicity order

  • I have found a bug

When ethnicity is displayed, each item is accompanied by its position in the list, and this generates confusion with respect to the original nomenclature. Additionally, if it does not exist, it should simply not display that item instead of displaying NA.

Bug describe_occupation

Please place an "x" in all the boxes that apply

  • I have found a bug

Please include a brief description of the problem with a code example:
The function does not display the occupation labels in all cases:

epiCo::describe_occupation(111, output_level = 1)
[[1]]
[1] 1

epiCo::describe_occupation(111, output_level = 2)
[[1]]
[1] "Legislators, Senior Officials and Managers "

Bugs on population_pyramid function

Please place an "x" in all the boxes that apply

  • I have found a bug

Please include a brief description of the problem with a code example:

  1. The population_pyramid function crashes when range is one year.
  2. The function does not check that the range parameter is an integer between 1 and 100.
epiCo::population_pyramid(73001, 2013, range = 1)

Error in data.frame(age = rep(seq(0, length(female_counts) - range, range), :
arguments imply differing number of rows: 202, 200
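
The missing check on range could be sketched like this. `check_range()` is a hypothetical helper, not an epiCo function, and it only covers the validation in item 2; it does not address the crash in item 1.

```r
# Hypothetical validator: `range` must be a single whole number in [1, 100].
check_range <- function(range) {
  ok <- is.numeric(range) && length(range) == 1L && is.finite(range) &&
    range %% 1 == 0 && range >= 1 && range <= 100
  if (!ok) {
    stop("`range` must be a single whole number between 1 and 100",
         call. = FALSE)
  }
  as.integer(range)
}

check_range(5)
```

Calling such a helper at the top of the function would turn values like 0, 2.5 or "a" into a clear error message before any data frame is built.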


echarts4R

I would recommend considering echarts as the basis for some of the charts that will be presented in interactive dashboards or Shiny apps, because of its interactive features.

For example, the charts at the following link

Problems with incidence_rate

Please place an "x" in all the boxes that apply

  • [ ] I have the most recent version of linelist and R
  • [x] I have found a bug
  • [ ] I have a reproducible example
  • [ ] I want to request a new feature

Please include a brief description of the problem with a code example:
When incidence_rate is called with level 1, it throws an error without saying why it occurs, even though the function also works with level 0.

incidence_object <- incidence(
  dates = data_tolima$fec_not,
  groups = data_tolima$cod_mun_o,
  interval = "1 epiweek"
)
incidence_rate_object <- incidence_rate(incidence_object, level = 1)
head(incidence_rate_object$counts)

incidence_rate_errors_handling

  • Error handling: the incidence_rate function fails for incidence objects whose groups differ from those expected from the DIVIPOLA codes.
  • Data handling: it is not necessary to load all population projections when calling the function.
  • Automatic level detection: it may be helpful for the function to detect the administrative level of the input as a way of validating it.
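
One possible way to approach the automatic level detection is to look at the length of the group codes. The sketch below is hypothetical: `detect_admin_level()` is not an epiCo function, and it assumes that two-character codes are departments (level 1) and five-character codes are municipalities (level 2), stored as character so that leading zeros are preserved.

```r
# Hypothetical helper: infer the administrative level from DIVIPOLA code
# length. Assumptions: 2 characters = department (level 1),
# 5 characters = municipality (level 2).
detect_admin_level <- function(groups) {
  n_chars <- unique(nchar(as.character(groups)))
  if (length(n_chars) != 1L) {
    stop("groups mix codes of different lengths", call. = FALSE)
  }
  switch(as.character(n_chars),
    "2" = 1L,
    "5" = 2L,
    stop("unrecognised code length: ", n_chars, call. = FALSE)
  )
}

detect_admin_level(c("73001", "73520"))
detect_admin_level(c("05", "73"))
```

Codes stored as numbers lose their leading zero (05250 becomes 5250), so under these assumptions they would be reported as unrecognised rather than assigned to a level.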

Contributors

Review the documentation and update the authors and contributors.

Failing to install epiCo in Ubuntu 22.04 LTS

Please place an "x" in all the boxes that apply

  • [ ] I have the most recent version of linelist and R
  • [x] I have found a bug
  • [ ] I have a reproducible example
  • [ ] I want to request a new feature

Please include a brief description of the problem with a code example:
I tried to install epiCo on my PC running Ubuntu 22.04 LTS via conda, but the installation failed. I created a conda virtual environment specifically for the package and, before attempting the installation, installed R (v4.2) and r-remotes (2.4.2) from the r anaconda channel. Then I opened an R session and ran:

remotes::install_github("epiverse-trace/epiCo")

After loading for a while, I got the following error:

ERROR: dependency ‘raster’ is not available for package ‘leaflet’
* removing ‘/home/acs98/anaconda3/envs/epico/lib/R/library/leaflet’

The downloaded source packages are in
	‘/tmp/RtmpinhsiP/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
Running `R CMD build`...
* checking for file ‘/tmp/RtmpinhsiP/remotes5a40353abc3f/epiverse-trace-epiCo-62dff7d/DESCRIPTION’ ... OK
* preparing ‘epiCo’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘epiCo_0.2.tar.gz’
ERROR: dependencies ‘spdep’, ‘leaflet’ are not available for package ‘epiCo’
* removing ‘/home/acs98/anaconda3/envs/epico/lib/R/library/epiCo’
Warning messages:
1: In i.p(...) : installation of package ‘terra’ had non-zero exit status
2: In i.p(...) :
  installation of package ‘stringi’ had non-zero exit status
3: In i.p(...) : installation of package ‘units’ had non-zero exit status
4: In i.p(...) : installation of package ‘raster’ had non-zero exit status
5: In i.p(...) : installation of package ‘sf’ had non-zero exit status
6: In i.p(...) : installation of package ‘spdep’ had non-zero exit status
7: In i.p(...) :
  installation of package ‘leaflet’ had non-zero exit status
8: In i.p(...) :
  installation of package ‘/tmp/RtmpinhsiP/file5a404ee4a0ef/epiCo_0.2.tar.gz’ had non-zero exit status

![Screenshot from 2023-10-18 14-56-59](https://github.com/epiverse-trace/epiCo/assets/87541159/2ff3e1d8-ab41-4de8-b2dc-cabf2e23d021)


demographics-vignette

Development of the vignette for the Demographics Module
The vignette should consist of an example of demographic analyses at national, departmental, and municipal level using the following functions:

  • Population pyramids
  • Age risk assessment / viz
  • Socio-economic status risk assessment / viz
  • Ethnic risk assessment / viz
  • Occupational risk assessment / viz
  • Incidence rate calculation
  • Incidence rate visualization

Update vignettes table visualization

Tables in the vignettes are currently displayed by calling the "head" function on the data frame. To improve the look of the vignettes, it would be better to use other functions that produce a cleaner table display.

Fix incidence historic length handling

Bugs on the errors handling of endemic_channel function

  • Verify that the user provides at least one year of data in the incidence historic.
  • If the user provides observations, they have to be numeric and positive, with a maximum of 52 (weekly) or 12 (monthly) observations.
  • The incidence historic has to start in week/month 1 and finish in week/month 52/12. If the user's data has excess counts (incomplete years), they will be excluded from the analysis and reported to the user with a warning.
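
The trimming to complete years described above could be sketched as follows. `counts_to_matrix()` is a hypothetical helper, not the package's implementation. It assumes one count per year and period, drops any period beyond 52 (or 12), and keeps only years in which every period is present. A year with 53 epidemiological weeks is one way to end up with a number of counts that is not a multiple of the number of years.

```r
# Hypothetical helper: arrange counts into a years x periods matrix,
# keeping only complete years. Assumes one count per (year, period).
counts_to_matrix <- function(counts, year, period, n_periods = 52L) {
  stopifnot(length(counts) == length(year),
            length(counts) == length(period))
  # Drop periods outside 1..n_periods (for example, epidemiological week 53)
  keep <- period >= 1 & period <= n_periods
  years <- sort(unique(year[keep]))
  mat <- matrix(0, nrow = length(years), ncol = n_periods,
                dimnames = list(as.character(years),
                                as.character(seq_len(n_periods))))
  row_idx <- match(year[keep], years)
  mat[cbind(row_idx, period[keep])] <- counts[keep]
  # A year is complete when all n_periods periods were supplied
  complete <- tabulate(row_idx, nbins = length(years)) == n_periods
  if (!all(complete)) {
    warning("Incomplete years were excluded: ",
            paste(years[!complete], collapse = ", "), call. = FALSE)
  }
  mat[complete, , drop = FALSE]
}

# Made-up example: 2016 has 53 weeks, 2017 has 52, 2018 has only 10
year <- c(rep(2016, 53), rep(2017, 52), rep(2018, 10))
period <- c(1:53, 1:52, 1:10)
counts <- seq_along(year)
dim(suppressWarnings(counts_to_matrix(counts, year, period)))
```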

modify moran index function

  • Modify morans_index function in the spatiotemporal.R module so the user inputs an incidence object and the incidence rate is estimated within the function using incidence_rate from utils.R
  • Update morans_index tests accordingly

Update the index_moran map

  • Assign colors and the missing legend for the municipalities that present Low-High and High-Low.
  • Return the map as an object so that the user can obtain the values and make their own analysis.

Store spatial data as Geopackages

This issue is to suggest that the spatial data provided with the package could be stored as a Geopackage, rather than as .Rda files.

This reduces the language-specific aspect of the data, and makes it readable from other languages as well. Alternatives include shapefiles (vendor specific) and GeoJSONs (slower loading for larger files, I think).

This change would require adding a data access function that reads in the package data, e.g. get_colombia_regions(admin_level = n), but this could be a good way to access any of the three admin levels via an argument to the function.

Happy to chat or help with this as well.

Update the occupation_plot and describe_occupation functions

  • Correct the function to pass the lint tests.
  • The National Administrative Department of Statistics (DANE) published a new coding system for each of the occupations in Colombia. Discuss the relevance of this new system within epiCo and update the describe_occupation function.
  • Update the graph of the occupation_plot function to show the frequency of occupations and improve the division by gender.

Review and update vignettes

  • Given the different changes in the epidemiological data and the functions of the package, review the correct functioning of the vignettes.
  • Add the option to install the latest version of the library in each of the vignettes.
  • Verify that all the updated graphs are displayed for each of the chunks.

Function to convert count data into incidence object

Is your feature request related to a problem? Please describe.
Users typically have count data (weekly or monthly) for a disease rather than the exact date of each case.

Describe the solution you'd like
A function that converts count data to an incidence object by repeating the dates.
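
The core of the requested conversion could be sketched with base R as below. `counts_to_dates()` is a hypothetical name; it only expands the counts into one date per case and leaves the construction of the incidence object to the caller.

```r
# Hypothetical helper: expand per-period counts into one date per case.
# `dates` holds one representative date per period (e.g. the first day of
# each epidemiological week) and `counts` the number of cases in it.
counts_to_dates <- function(dates, counts) {
  dates <- as.Date(dates)
  stopifnot(length(dates) == length(counts),
            all(counts >= 0), all(counts %% 1 == 0))
  rep(dates, times = counts)
}

# Made-up weekly counts
week_start <- c("2019-01-06", "2019-01-13", "2019-01-20")
case_dates <- counts_to_dates(week_start, c(2, 0, 3))
length(case_dates)
```

The resulting vector can then be passed to incidence() with interval = "1 epiweek", in the same way as the date columns used in the other examples on this page.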

Dynamic database

Please place an "x" in all the boxes that apply

  • [ ] I have the most recent version of linelist and R
  • [ ] I have found a bug
  • [ ] I have a reproducible example
  • [x] I want to request a new feature

The epi_data database could include other data or be updated in real time for the municipalities, so that the people who need the data can have it immediately.


readme-vignette

Turn the README into an introductory vignette with:

  • Package description
  • What does it do?
  • How to install it?
  • What problem does it solve?
  • Context and justification
  • What are the usages?
  • Modules overview
  • Main functions
  • Resources and bibliography

Printing error in describe_ethnicity

Is your feature request related to a problem? Please describe.
When I print the result of describe_ethnicity(demog_data$ethnicity_label), the output includes a literal \n instead of a line break.

Describe the solution you'd like
I think that is a problem of the R version.
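
For reference, base R behaves this way regardless of version: print() shows escape sequences inside character strings, while cat() writes the characters they stand for. The string below is made up and is not the actual output of describe_ethnicity.

```r
# Made-up label containing a newline escape
label <- "Line one\nLine two"

# print() displays the string with the escape sequence visible
print(label)

# cat() writes the actual characters, so the text breaks onto a new line
cat(label, "\n")
```

If the labels are meant for display, emitting them with cat() or message() instead of returning them for auto-printing would render the line breaks.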

Update the population projections

Given the COVID-19 pandemic, the National Administrative Department of Statistics (DANE) updated the population projections for each of the administrative units in Colombia.
For this reason, the data loaded in epiCo as population_projection_col_0.rda, population_projection_col_1.rda, and population_projection_col_2.rda must be updated.

Bug: endemic channel

Please place an "x" in all the boxes that apply

  • [ ] I have the most recent version of linelist and R
  • [x] I have found a bug
  • [ ] I have a reproducible example
  • [ ] I want to request a new feature

I found an error when using municipality code 73520. Everything works at first, but the endemic channel code finds no incidence for the municipality even though there were cases.

library(epiCo)
library(incidence)

data("epi_data")
data_event <- epi_data

data_ibague <- data_event[data_event$cod_mun_o == 73520, ]

## Building the historic incidence data

incidence_historic <- incidence(data_ibague$fec_not,
  interval = "1 epiweek"
)

observations <- sample(15:100, 52, replace = FALSE)

outlier_years <- c(2016, 2019)

ibague_endemic_chanel <- endemic_channel(
  incidence_historic = incidence_historic,
  observations = observations,
  outlier_years = outlier_years,
  plot = FALSE
)

ibague_endemic_chanel$endemic_channel_plot
