cidgoh / pathogen-genomics-package

This is the DataHarmonizer spreadsheet web application bundled with pathogen genomics data entry and validation templates

License: MIT License

HTML 100.00%
data-harmonization infectious-disease

pathogen-genomics-package's People

Contributors

cmrn-rhi, ddooley, griffie

pathogen-genomics-package's Issues

MPX template: remove LIMS export field "PH_INSTRUMENT_CGN"

In the CanCOGeN DH template, the sequencing instrument information is supposed to export to the PH_INSTRUMENT_CGN field, whereas the MPX template exports the sequencing instrument to PH_INSTRUMENT.

For some reason both are showing up in the MPX export. Can we turn off the "PH_INSTRUMENT_CGN" field in the MPX export?

I'm not even sure how that field is showing up since it's not in the MPX template's LIMS export column at all...

Canadian MPX template: age bin data overwritten when a saved file is re-opened in the DH

Currently in the Canadian MPX template a user can have a dataset in which the age value is a null value, but they have entered an age bin instead (so that the exact age is obfuscated but the general age range is shared). There is code in the DH that says "if there is a null value in the age field, put the same null value in the age bin field", which was created in order to automate populating associated fields and reduce data entry. If the user saves the dataset and re-opens it later, then the entered/saved age bin data gets overwritten with the same null value as in the age field.

This is erasing information that the user has entered. And we had the same issue in the CanCOGeN template.

You put a fix in place in the CanCOGeN template so that upon opening a file, whatever is in the age bin field will remain untouched. BUT when a user is entering fresh data and enters a null value for the age field, the age bin field still autofills the same null value (which they can edit themselves if they want).

Can you put the same fix into the MPX template that you put in for the CanCOGeN template to address this issue?

Thanks!
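
A minimal sketch of the requested behaviour, written in Python for illustration only (the DH itself is a web application, and this is not its actual code). The field names "host age" and "host age bin", the null-value labels, the example bin "20 - 29", and the loading_saved_file flag are all assumptions made for this example.

```python
# Illustrative null-value labels (an assumption; use the template's own picklist).
NULL_VALUES = {"Not Applicable", "Missing", "Not Collected", "Not Provided", "Restricted Access"}


def autofill_age_bin(row, loading_saved_file):
    """Return a copy of the row with the age bin autofilled where appropriate.

    The null value in the age field is copied into the age bin field only
    during fresh data entry. When a saved file is being opened, the row is
    returned unchanged so that a saved age bin is never overwritten.
    """
    if loading_saved_file:
        return dict(row)
    age = row.get("host age")
    if age in NULL_VALUES:
        return {**row, "host age bin": age}
    return dict(row)


# Hypothetical saved row: age obfuscated, age bin shared.
saved = {"host age": "Not Provided", "host age bin": "20 - 29"}
reopened = autofill_age_bin(saved, loading_saved_file=True)

# Hypothetical fresh entry: the null value is propagated to the empty bin.
fresh = autofill_age_bin({"host age": "Not Provided", "host age bin": ""}, loading_saved_file=False)
```

The key design choice is that the autofill rule is tied to the data-entry event rather than to the state of the row, so re-opening a file never triggers it.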

CanCOGeN Template - NML LIMS Export

I believe that values entered more than once across the vaccination fields (which are concatenated into the PH_VACCINATION_HISTORY field) are being unintentionally omitted from the NML LIMS DataHarmonizer export of the CanCOGeN template.

When, for example:

Astrazeneca (Vaxzevria) is entered for more than one vaccination dose name

the result is:

<Host Vaccination Status>;Astrazeneca (Vaxzevria);2022-11-01;2023-01-01;2023-03-01;2023-06-01;<Vaccination History>

rather than the expected:

<Host Vaccination Status>;Astrazeneca (Vaxzevria);2022-11-01;Astrazeneca (Vaxzevria);2023-01-01;Astrazeneca (Vaxzevria);2023-03-01;Astrazeneca (Vaxzevria);2023-06-01;<Vaccination History>

The same occurs when the same date is used for more than one vaccination dose.

Thank you!
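
The reported output is what one would get if repeated values were dropped before joining. That cause is only a guess, not a confirmed diagnosis of the export code. The Python sketch below contrasts the two behaviours; the function names are hypothetical, and the status and history placeholders from the example are left out.

```python
def concat_deduplicated(values, sep=";"):
    """Join values after dropping repeats (keeps the first occurrence of each).

    This reproduces the symptom described in the issue.
    """
    return sep.join(dict.fromkeys(values))


def concat_all(values, sep=";"):
    """Join every value in order, repeats included (the expected behaviour)."""
    return sep.join(values)


# Dose name and date pairs taken from the example in this issue.
doses = [
    "Astrazeneca (Vaxzevria)", "2022-11-01",
    "Astrazeneca (Vaxzevria)", "2023-01-01",
    "Astrazeneca (Vaxzevria)", "2023-03-01",
    "Astrazeneca (Vaxzevria)", "2023-06-01",
]

observed = concat_deduplicated(doses)
expected = concat_all(doses)
```

Here `observed` has the vaccine name only once, followed by the four dates, while `expected` keeps each name and date pair.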

AMBR template: new release request

Some fields have been removed, others added. New picklists have been added. Ontology IDs have been added. Guidance and examples have been updated.

Can we do a new release of the AMBR template pretty please?

I tracked the changes in the version tracker and bumped the proposed version number to 2.1.1 (in red) as there were changes to fields (x), terms (y) and guidance/defs/IDs (z).

Pathogen Genomics Package "Get latest release" needs to be updated (goes to DH repo)

If you open your instance of the PGP and click on "Get latest release" under the Help button, it takes you to the latest DH release (in the DH repo), not the latest release in the PGP repo.

e.g.
Go into the pathogen-genomics-package-PGPv1.3.7 DataHarmonizer
and click "Get latest release".
It takes you here: https://github.com/cidgoh/DataHarmonizer/releases
but shouldn't it go here: https://github.com/cidgoh/pathogen-genomics-package/releases

Can we update please?

CanCOGeN template: new field and export reqs for NML LIMS

I added a new field called "travel history availability" and values.

Can the values from this field be added to those that are concatenated in the NML LIMS field "PH_TRAVEL" in the NML LIMS export, pretty please?

And then can we do a new release (at the same time as the AMBR release maybe?)? I bumped the template version to 2.1.2 (in red) because there were changes to fields, terms and guidance/examples.

I suggested bumping the PGP version to 2.0.1 (in red) as there were no new templates added (x), no new schemas (y), but there were changes to existing templates (z).

ReadMe - more details

Can we make it so this readme includes the Stand-Alone DataHarmonizer Functionality section from the main DataHarmonizer readme? Also the information from the old "stand-alone" installation instructions?

DH modularity and 1:N wish list

A sample can contain multiple organisms, multiple kinds of the same organism (i.e. multiple isolates), and isolates may be sequenced multiple times using different protocols or instruments. This creates a 1-to-many issue, where one sample may need to be linked to multiple organisms, isolates, library IDs, associated tests (AMR drug panels from different companies) etc.

Currently, the contextual data for organisms, isolates, etc. from the same sample have to be entered repeatedly, which creates a data entry burden for data providers.

Ideally, modularity could be created so that sample information could be entered once and linked to different isolates.
Similarly, isolate information could be entered once and linked to different libraries with different processing details/instruments.
Also similarly, libraries could be linked to multiple sequencing runs and/or associated tests.

[image]
To submit the data to LIMS or public repositories, every library, isolate, or organism would need the metadata from the sample, so ideally upon export the DH would populate that information and present each one as a separate line in a spreadsheet.
e.g. the situation pictured above would appear like:
sample 1 --> organism 1 --> isolate A --> library 1 --> sequence 1
sample 1 --> organism 2 --> isolate B --> library 2 --> sequence 2
sample 1 --> organism 2 --> isolate C --> library 3 --> sequence 3
sample 1 --> organism 2 --> isolate C --> library 4 --> sequence 4
sample 1 --> organism 2 --> isolate C --> library 4 --> sequence 5
*But the data provider wouldn't have to enter the different metadata multiple times.

Can we make the DH do this modular/1:N data capture and transformation (pretty please)?
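
To illustrate the export transformation being requested, here is a rough Python sketch, assuming a hypothetical nested structure in which each record is entered once and holds its own children. The node and flatten helpers and the "fields"/"children" keys are invented for this example and are not part of the DH.

```python
def node(fields, *children):
    """Build one record (hypothetical structure): its own fields plus child records."""
    return {"fields": fields, "children": list(children)}


def flatten(record, inherited=None):
    """Yield one flat row per leaf record, carrying all parent fields down."""
    row = {**(inherited or {}), **record["fields"]}
    if not record["children"]:
        yield row
        return
    for child in record["children"]:
        yield from flatten(child, row)


# The situation from the issue: each record is entered exactly once.
sample = node(
    {"sample": "sample 1"},
    node({"organism": "organism 1"},
         node({"isolate": "isolate A"},
              node({"library": "library 1"},
                   node({"sequence": "sequence 1"})))),
    node({"organism": "organism 2"},
         node({"isolate": "isolate B"},
              node({"library": "library 2"},
                   node({"sequence": "sequence 2"}))),
         node({"isolate": "isolate C"},
              node({"library": "library 3"},
                   node({"sequence": "sequence 3"})),
              node({"library": "library 4"},
                   node({"sequence": "sequence 4"}),
                   node({"sequence": "sequence 5"})))),
)

rows = list(flatten(sample))
for r in rows:
    print(" --> ".join(r.values()))
```

Each printed line corresponds to one spreadsheet row in the export, with the sample, organism, isolate and library values repeated automatically rather than re-entered by the data provider.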

MPX template: Replace "PH_CANCOGEN_AUTHORS" with "PH_SEQUENCING_AUTHORS"

Found another CanCOGeN artifact in the NML LIMS export from the MPX template.

Can we please replace "PH_CANCOGEN_AUTHORS" with "PH_SEQUENCING_AUTHORS" after the "SUBMITTED_RESLT - Gene Target #5 CT Value" field in the NML export, pretty please?

In the DH template it's supposed to export as "PH_SEQUENCING_AUTHORS" so I'm not sure where "PH_CANCOGEN_AUTHORS" is coming from...

CanCOGeN and MPXV templates: fix rule for transforming Homo sapiens to Human in NML LIMS output

The field "host (scientific name)" outputs to 2 places in the NML LIMS export: the field PH_SPECIMEN_SOURCE, which goes into NML LIMS, and the DH field "host (scientific name)", which also appears in the export file but doesn't get uploaded to LIMS.

In the recent changes to the PGP, we lost the rule that says IF host (scientific name) is Homo sapiens THEN PH_SPECIMEN_SOURCE is Human. The NML uses "Human" instead of "Homo sapiens".

The issue we had before was that the DH applied the Human rule to the host (scientific name) field as well as to PH_SPECIMEN_SOURCE.
i.e. IF host (scientific name) is Homo sapiens THEN host (scientific name) should remain Homo sapiens and NOT become Human.

In other words, we want the entered data (Homo sapiens) to be in the DH output fields (lower case after the Provenance field), but the transformed value (Human) in the NML LIMS field (PH_SPECIMEN_SOURCE, before the Provenance field).

Can we do this?
The fix is needed for the NML LIMS export from both the CanCOGeN and Monkeypox templates.
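
The intended rule can be sketched as follows, in Python for illustration only. The export_host_fields function is hypothetical, and the real export contains many more columns; only the two field names from this issue are used.

```python
def export_host_fields(entered_host):
    """Build the two host-related export columns from the entered value.

    Only PH_SPECIMEN_SOURCE receives the transformed value; the DH field
    keeps exactly what the user entered.
    """
    transformed = "Human" if entered_host == "Homo sapiens" else entered_host
    return {
        "PH_SPECIMEN_SOURCE": transformed,       # NML LIMS field
        "host (scientific name)": entered_host,  # DH field, not uploaded to LIMS
    }


human_row = export_host_fields("Homo sapiens")
```

With this rule, `human_row` holds "Human" under PH_SPECIMEN_SOURCE and "Homo sapiens" under host (scientific name); any other host value passes through unchanged in both columns.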
