
ess-dive-community / essdive-sample-id-metadata


READY TO USE. A reporting format for assigning sample identifiers and metadata to characterize samples, their collection details, and linkages to other samples and data.

Home Page: https://ess-dive.gitbook.io/sample-id-and-metadata/

License: Creative Commons Attribution 4.0 International

data-standard reporting-format samples sample-event sample-location igsn

essdive-sample-id-metadata's Introduction

ESS-DIVE Sample ID and Metadata Reporting Format (IGSN-ESS) v1.1.0

ESS-DIVE recommends registering samples for International Generic Sample Numbers (IGSNs) through the System for Earth Sample Registration (SESAR). IGSNs are associated with standardized metadata that characterize a variety of samples and their collection details. These sample identifiers facilitate sample discovery, tracking, and reuse; they are especially useful when sample data are shared with collaborators, sent to different labs or user facilities for analyses, or distributed across different data files, datasets, and/or publications.

ESS-DIVE has worked with our community scientists to test the use of IGSNs and associated metadata in interdisciplinary Environmental Systems Science (ESS). Here we outline modified IGSN metadata guidelines that account for the needs of a variety of related geological and biological samples. While generally following the IGSN core descriptive metadata schema, we provide recommendations for extending sample type terms and for connecting to related templates geared towards biodiversity (Darwin Core) and genomic (Minimum Information about any Sequence, MIxS) samples and specimens. The resulting template and recommendations extend the IGSN core metadata and are called IGSN for Environmental Systems Science (IGSN-ESS).

Getting started

ESS-DIVE's IGSN metadata reference guide:
IGSN sample metadata guide, modified (from the SESAR IGSN guide - see link and citation below) for interdisciplinary Environmental Systems Science (ESS) samples.

Other documents to get started:

  • Instructions document: Instructions to register samples for IGSNs through SESAR, and submit related datasets to ESS-DIVE.
  • Sample metadata template: Download spreadsheet template with standard fields to register samples for IGSNs.

Updates in v1.1.0

  • This is the second release of the Sample ID and Metadata Reporting Format

File formatting requirements

ESS-DIVE has higher-level requirements and recommendations for documenting individual files, including those in CSV spreadsheet formats. Here is a quick summary of the relevant requirements.

  • File names should contain only ASCII characters, with words joined by camelCase, underscores, or dashes.
  • There should be no empty rows.
  • Reporting should be consistent within a column, for example with the same number of decimal places where relevant.
  • Use "NA" for missing values in columns with text, and "-9999" for missing values in columns with numbers.
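The requirements above could be checked automatically before a file is submitted. The following is a minimal sketch in Python, not an ESS-DIVE tool. The function name `check_csv`, the exact file-name pattern, and the rule that a column counts as numeric when all of its non-missing values parse as numbers are assumptions made for illustration.

```python
import csv
import io
import re

# Assumed reading of the file-name rule: ASCII letters, digits, underscores,
# and dashes, followed by a .csv extension.
FILE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+\.csv")
TEXT_MISSING = "NA"        # missing-value code for text columns
NUMERIC_MISSING = "-9999"  # missing-value code for numeric columns


def is_number(value):
    """Return True if the cell parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def decimal_places(value):
    """Count digits after the decimal point in a plain decimal string."""
    return len(value.split(".")[1]) if "." in value else 0


def check_csv(file_name, text):
    """Return a list of human-readable problems found in one CSV file."""
    problems = []
    if not FILE_NAME_PATTERN.fullmatch(file_name):
        problems.append(f"file name {file_name!r} uses characters outside the assumed set")

    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return problems + ["file has no header row"]
    header, body = rows[0], rows[1:]

    # Row 1 is the header, so data rows are numbered from 2.
    for row_no, row in enumerate(body, start=2):
        if not any(cell.strip() for cell in row):
            problems.append(f"row {row_no} is empty")

    data = [row for row in body if any(cell.strip() for cell in row)]
    for index, name in enumerate(header):
        cells = [row[index].strip() for row in data if index < len(row)]
        # Blank cells and missing-value codes do not decide the column type.
        values = [c for c in cells if c and c not in (TEXT_MISSING, NUMERIC_MISSING)]
        if not values:
            continue  # only missing values, so the column type is unknown
        if all(is_number(v) for v in values):
            if TEXT_MISSING in cells:
                problems.append(
                    f"column {name!r} is numeric but uses {TEXT_MISSING}; use {NUMERIC_MISSING}"
                )
            if len({decimal_places(v) for v in values}) > 1:
                problems.append(f"column {name!r} mixes different numbers of decimal places")
        elif NUMERIC_MISSING in cells:
            problems.append(
                f"column {name!r} is text but uses {NUMERIC_MISSING}; use {TEXT_MISSING}"
            )
    return problems


# Hypothetical file content, used only to exercise the checks.
example = "sample_name,depth_m,material\nS1,1.50,soil\n\nS2,2.5,NA\nS3,NA,rock\n"
for problem in check_csv("my samples.csv", example):
    print(problem)
```

The sketch reports problems rather than repairing them, because choosing the correct missing-value code or number of decimal places depends on what each column is meant to hold.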

How to contribute:

Want to make a change to the reporting format? Submit a new issue and use one of several templates that we provide to: suggest a new term, modify a term, or propose changes to documentation within this repository.

Have a question? Contact us at ess-dive-support [at] lbl.gov.

The issue templates we use are modeled on those provided by Darwin Core:

Darwin Core maintenance group, Biodiversity Information Standards (TDWG) (2014). Darwin Core. Zenodo. https://doi.org/10.5281/zenodo.592792


About the sample ID and metadata reporting format

The ESS-DIVE sample ID and metadata reporting format primarily follows the SESAR IGSN guide and template, with modifications to address ESS sample needs and practicalities. To develop recommendations for ESS, we researched related sample standards and templates and worked with project scientists to test the use of IGSNs and standard metadata. See our related paper describing the recommended sample ID and metadata reporting format: http://doi.org/10.5334/dsj-2021-011.

Below we outline supporting documents used to compare related metadata and vocabulary terms.

Metadata research documents: Files with descriptions of metadata fields and controlled vocabulary terms. Metadata fields with controlled vocabularies are described in the sample metadata reference guide.

Copyright information

The ESS-DIVE Sample ID and metadata reporting format (IGSN-ESS) is licensed under the Creative Commons Attribution 4.0 International license (CC BY 4.0).

Funding and acknowledgements

Funding for the development of ESS-DIVE's Sample ID and metadata reporting format was provided by the US Department of Energy (DOE), Biological and Environmental Research Program, Earth and Environmental Systems Sciences Division, Data Management.

The updated sample reporting format for interdisciplinary ESS samples contains modifications and extensions of guidelines originally created by the System for Earth Sample Registration (SESAR) and IGSN organization. Individuals within these organizations are responsible for creating IGSN identifiers and standard sample metadata templates. We especially thank Kerstin Lehnert, Jens Klump, Lesley Wyborn, and Sarah Ramdeen for their development and engagement work, and/or direct assistance with using IGSN.

As outlined in the sample metadata sources document, many recommended metadata additions to the IGSN guidelines/schema come from Darwin Core and MIxS. We also model the format of our user guides on Darwin Core resources.

ESS community scientists helped test the use of IGSNs and provided critical feedback that shaped our final recommendations. They came from eight ESS projects:

  • SLAC Subsurface Biogeochemical Research (SBR) Scientific Focus Area (SFA), National Accelerator Laboratory Groundwater Quality: Zach Perzan and Kristin Boye
  • Lawrence Livermore National Lab SBR SFA, Subsurface Biogeochemistry of Actinides: Nancy Merino and Mavrik Zavarin
  • Pacific Northwest National Lab SBR SFA, Worldwide Hydrobiogeochemistry Observation Network for Dynamic River Systems (WHONDRS): Amy Goldman and James Stegen
  • Argonne National Lab SBR SFA, Argonne Wetland Hydrobiogeochemistry: Pamela Weisenhorn
  • Lawrence Berkeley National Lab SBR SFA, Watershed Function (2 sub-projects): Patrick Sorensen, Dana Chadwick, Zarine Kakalia, Eoin Brodie
  • Lawrence Berkeley National Lab Terrestrial Ecosystem Function (TES) SFA, Belowground Biogeochemistry
  • Next Generation Ecosystem Experiments -Arctic (NGEE-Arctic) and -Tropics (NGEE-Tropics) projects: Kim Ely

We thank representatives from other US DOE data systems and user facilities that work with ESS samples who have contributed to sample discussions, including:

  • Joint Genome Institute (JGI): Kjiersten Fagnan, David Hays
  • National Microbiome Data Collaborative (NMDC): Emiley Eloe-Fadrosh, Elisha Wood-Charlson, Chris Mungall, William Duncan
  • The Department of Energy Systems Biology Knowledgebase (KBase): Shane Canon, Paramvir Dehal
  • Environmental Molecular Sciences Laboratory: Lee Ann McCue, Dave Millard, Steven Wiley

Recommended citation

Please cite the SESAR IGSN guide and/or schema, Darwin Core, and ESS-DIVE's modified version for Environmental Systems Science Samples when describing sample collection and data management methods in your related datasets and publications.

System for Earth Sample Registration (SESAR). 2020. SESAR Batch Registration Quick Guide (Version 7.0). Zenodo. http://doi.org/10.5281/zenodo.3874923

System for Earth Sample Registration (SESAR). 2020. SESAR XML Schema for samples (Version 4.0). Zenodo. http://doi.org/10.5281/zenodo.3875531

Darwin Core maintenance group, Biodiversity Information Standards (TDWG) (2014). Darwin Core. Zenodo. https://doi.org/10.5281/zenodo.592792

Damerow J, Varadharajan C, Boye K, Brodie E, Burrus M, Chadwick D, Cholia S, Crystal-Ornelas R, Elbashandy H, Eloy Alves R, Ely K, Goldman A, Hendrix V, Jones C, Jones M, Kakalia Z, Kemner K, Kersting A, Maher K, Merino N, O'Brien F, Perzan Z, Robles E, Snavely C, Sorensen P, Stegen J, Weisenhorn P, Whitenack K, Zavarin M, Agarwal D (2020). Sample Identifiers and Metadata Reporting Format for Environmental Systems Science. Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE). https://doi.org/10.15485/1660470


essdive-sample-id-metadata's Issues

renaming contribute.md

Let's rename contribute.md to CONTRIBUTING.md. The new file name will be consistent with how other groups that use GitHub name their file about contributing to repos.

Broken link

  • Submitter: Kristin Boye

I suggest the following changes:
The link to the Sample metadata translation table is not working in the About... section

Adding new issue templates

Change documentation

I suggest the following changes: I propose that we convert our GitHub issue templates from a copy and paste version of the templates, to one that auto generates when users click the "submit an issue" button. Users will have the ability to still choose from 3 different issue types and provide a template, but they will not have to copy and paste the templates from a separate page.

locationID or siteIdentifier

  • Submitter: Kim

Seeking clarification on the use of locationID as defined in the Sample ID and metadata guide, compared to siteIdentifier listed as the ESS-DIVE preferred term in the metadata translation table. These would appear to be the same thing, but do they have different applications?

thanks,
Kim

review of Material_translation_table.csv

I saved the CSV as a spreadsheet workbook, added a sheet with the iSamples high-level sample vocabularies for SpecimenType, MaterialType, and SampledFeatureType, and added columns in the material_translation_table worksheet for mapping to the iSamples vocabularies. I also added an iSamples comments column.
The mapping is pretty straightforward for the most part. Several of the ESS-DIVE material types conflate material type with sampled feature (e.g., gaseous environmental material: air vs. gaseous environmental material) or with genesis (e.g., alluvium vs. gravel vs. scree).

Error in the acknowledgements

Emily Robles is incorrectly acknowledged as part of BNL in the README.
(Is using "Issues" the best way of advising of these kinds of small changes? I'm not fully up to speed on the real power of Git yet. This is more of a test than a real issue).

groundwater term

Kristin Boye noted that the suggested ESS-DIVE vocabulary for materials includes the option "liquid environmental material: liquid water: underground water: groundwater". As I read the description, it covers porewater AND groundwater. Porewater can be groundwater, but it can also come from the unsaturated zone, and groundwater can come from a well, which would not be considered porewater. I'm not sure this is essential to address, and I think it's fine to call it all "underground water". In that case, though, I suggest removing "groundwater" to allow for a wider application and to avoid confusion for people trying to register, e.g., a "soil water sample" or a "porewater sample" from the unsaturated zone.
