tsufz / recotox

REcoTox is a semi-automated, interactive R workflow to process US EPA ECOTOX Knowledgebase entire database ASCII files


recotox's Introduction

License: AGPL v3

Background

Searching for and extracting experimental ecotoxicological information is often tedious work. A good and comprehensive data source is the US EPA ECOTOX Knowledgebase. It contains about 1 million data points for more than 12,000 chemicals and 13,000 single species. However, for a high-throughput hazard assessment, it is not possible to extract all relevant data from the online database. REcoTox extracts the relevant information from the entire database ASCII files and aggregates the data according to user-defined criteria.

Introduction

REcoTox is a semi-automated, interactive workflow to process US EPA ECOTOX Knowledgebase entire database ASCII files to extract and process ecotoxicological data relevant to (but not restricted to) the ecotoxicity groups algae, crustaceans, and fish in the aquatic domain. The latest version of the ASCII files is available from the US EPA ECOTOX Knowledgebase. The focus is aquatic ecotoxicity, and the retrieved data are reported in mg/L.

Requirements

REcoTox expects an R version >4.3.0. Please also install the R packages tidyverse, data.table, EnvStats, and webchem.
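
Before installing, it can help to check which of the required packages are already available. The following is a minimal sketch using base R only; the variable names (`required`, `installed`, `missing_pkgs`, `r_ok`) are illustrative and not part of REcoTox.

```r
# Packages named in the requirements above (CRAN names)
required <- c("tidyverse", "data.table", "EnvStats", "webchem")

# TRUE for each package that can be loaded from the current library paths
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install the missing packages from CRAN:
  # install.packages(missing_pkgs)
}

# Check the running R version against the stated requirement (>4.3.0)
r_ok <- getRversion() > "4.3.0"
if (!r_ok) message("R version is ", getRversion(), "; REcoTox expects >4.3.0")
```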

Installation

To use REcoTox, install it from GitHub.

To install the latest stable version (0.4.0):

remotes::install_github("tsufz/[email protected]", build = TRUE, build_manual = TRUE)

To install the latest beta version (main):

remotes::install_github("tsufz/recotox@main", build = TRUE, build_manual = TRUE)

To install the latest development version (dev):

remotes::install_github("tsufz/recotox@dev", build = TRUE, build_manual = TRUE)

Workflow

The file Query_Ecotox_DB.R contains the workflow and loads all relevant packages and functions. The workflow allows filtering by endpoints, measurements, and species. The ecotoxicity data is interactively enriched with chemical information (e.g. the average mass), ideally with data linked to the US EPA CompTox Chemicals Dashboard, for example by using the output of the batch search as shown in Figure 1 and Figure 2.

Figure 1: US EPA CompTox Chemicals Dashboard Batch Search - Enter Identifiers to Search

Figure 2: US EPA CompTox Chemicals Dashboard Batch Search - Recommended selection of identifiers and properties
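
The filtering and enrichment idea described above can be sketched in base R as follows. This is not the REcoTox implementation: the tables, column names (`chem_id`, `endpoint`, `group`, `average_mass`), and values are hypothetical placeholders chosen only to illustrate the two steps.

```r
# Hypothetical ecotoxicity records (illustrative values only)
records <- data.frame(
  chem_id  = c("chem_A", "chem_A", "chem_B", "chem_C"),
  endpoint = c("EC50", "NOEC", "EC50", "EC50"),
  group    = c("Algae", "Fish", "Crustaceans", "Insects"),
  stringsAsFactors = FALSE
)

# Hypothetical chemical information, e.g. taken from a batch search export
chem_info <- data.frame(
  chem_id      = "chem_A",
  average_mass = 200,
  stringsAsFactors = FALSE
)

# Step 1: keep only the endpoint and ecotoxicity groups of interest
filtered <- subset(
  records,
  endpoint == "EC50" & group %in% c("Algae", "Crustaceans", "Fish")
)

# Step 2: left join the chemical information; chemicals without a match
# keep NA in average_mass and would need to be completed by the user
enriched <- merge(filtered, chem_info, by = "chem_id", all.x = TRUE)
needs_curation <- enriched$chem_id[is.na(enriched$average_mass)]
```

In this sketch, chemicals left without an average mass are collected in `needs_curation`, which mirrors the interactive step where missing chemical information has to be supplied before concentrations can be converted.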

At a minimum, the molecular weight or average mass is required to convert water concentrations from molar units to mg/L. The main purpose of this workflow is to generate data for the hazard assessment of chemical pressures on aquatic organisms. Thus, only relevant data is aggregated and all data is converted to mg/L.
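
The unit conversion itself is simple arithmetic: a concentration in mol/L multiplied by the molar mass in g/mol gives g/L, and a factor of 1000 gives mg/L. A minimal sketch, assuming a hypothetical helper `molar_to_mg_per_l()` that is not part of REcoTox:

```r
# Convert a molar concentration (mol/L) to mg/L.
# mass_g_per_mol is the molecular weight or average mass of the chemical.
molar_to_mg_per_l <- function(conc_mol_per_l, mass_g_per_mol) {
  stopifnot(is.numeric(conc_mol_per_l), is.numeric(mass_g_per_mol))
  conc_mol_per_l * mass_g_per_mol * 1000
}

# Illustrative values: 1e-6 mol/L of a chemical with 200 g/mol
# 1e-6 mol/L * 200 g/mol = 2e-4 g/L = 0.2 mg/L
example_mg_l <- molar_to_mg_per_l(1e-6, 200)
```

The function is vectorised, so it can be applied to whole columns of concentrations and masses at once.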

The data output contains long pivot tables with all filtered datasets as the basis for further data processing and aggregation for the users' purposes. It also includes a further pivoting step to wider pivot tables containing aggregated information, e.g. the geometric mean and the 5th percentile of the extracted data for each chemical, endpoint, and species.
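
The aggregation step can be illustrated with base R. This is a sketch under stated assumptions, not the code REcoTox uses: the table `tox_long`, its column names, and its values are hypothetical, and the percentile uses the default method of `quantile()`.

```r
# Hypothetical long-format table of filtered results (illustrative values only)
tox_long <- data.frame(
  chem_id   = c("chem_A", "chem_A", "chem_A", "chem_B", "chem_B"),
  endpoint  = c("EC50", "EC50", "EC50", "LC50", "LC50"),
  species   = c("species_1", "species_1", "species_1", "species_2", "species_2"),
  conc_mg_l = c(1, 10, 100, 4, 9),
  stringsAsFactors = FALSE
)

# Geometric mean for strictly positive values
geomean <- function(x) exp(mean(log(x)))

# One row per chemical, endpoint, and species with both summary statistics
tox_agg <- aggregate(
  conc_mg_l ~ chem_id + endpoint + species,
  data = tox_long,
  FUN = function(x) c(geomean = geomean(x),
                      p05 = unname(quantile(x, probs = 0.05)))
)

# aggregate() returns the two statistics as a matrix column; flatten it into
# the separate columns conc_mg_l.geomean and conc_mg_l.p05
tox_agg <- do.call(data.frame, tox_agg)
```

With the EnvStats package installed, `EnvStats::geoMean()` could replace the small `geomean()` helper defined here.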

Note

This workflow will be further developed. Contributions and suggestions are welcome. Please create an issue to start the discussion.
