mipscripts's People

Contributors

arisp99, asimkinbrown, iek

mipscripts's Issues

Sample sheet column names non standard

During the merge_sampleset subcommand, we merge multiple sample sheets by column name. When column names differ between sheets, the columns do not align, which can cause issues.

We recommend snake case as the standard for column names. However, this is not always followed: the capture plate name column may appear as capture_plate_name in one sample sheet and as Capture Plate Name in another.

To address this, we should standardize file column names when files are fed into mipscripts functions.
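
One way this standardization could look is sketched below. It is a minimal illustration, not the mipscripts implementation: the function names to_snake_case and standardize_header are hypothetical, and the rule applied (trim, replace runs of non-alphanumeric characters with an underscore, lowercase) is an assumption about what "standardize" should mean here.

```python
import re


def to_snake_case(name: str) -> str:
    """Normalize one column header (hypothetical helper, not part of mipscripts)."""
    # Replace every run of non-alphanumeric characters (spaces, hyphens,
    # repeated underscores) with a single underscore.
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    # Drop underscores left at either end and lowercase the result.
    return cleaned.strip("_").lower()


def standardize_header(columns):
    """Apply to_snake_case to every column name, preserving order."""
    return [to_snake_case(col) for col in columns]


if __name__ == "__main__":
    print(standardize_header(["Capture Plate Name", "capture_plate_name"]))
```

With this rule, Capture Plate Name and capture_plate_name map to the same name, so the two sheets would align when merged. Headers written in camel case (for example CapturePlateName) are only lowercased, not split, and would need an extra rule.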

Duplicate sample sheet column names

If column names are duplicated within a sample sheet, this can cause problems when we try to merge sheets or even when we compute stats on the sequenced data.

It would be ideal to throw an error when this occurs to inform the user of the issue.
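
A check along these lines might look like the sketch below. The function names and the error message are hypothetical and not taken from mipscripts. The check assumes it receives the header exactly as written in the file, since some readers (pandas.read_csv, by default) rename repeated headers before the caller sees them.

```python
from collections import Counter


def find_duplicate_columns(columns):
    """Return the column names that occur more than once, in first-seen order."""
    counts = Counter(columns)
    return [name for name, n in counts.items() if n > 1]


def check_unique_columns(columns, path="<sample sheet>"):
    """Raise ValueError if any column name is repeated (hypothetical helper)."""
    duplicates = find_duplicate_columns(columns)
    if duplicates:
        raise ValueError(
            f"Duplicate column names in {path}: {', '.join(duplicates)}"
        )


if __name__ == "__main__":
    try:
        check_unique_columns(["sample_name", "replicate", "sample_name"])
    except ValueError as err:
        print(err)
```

Running the check after column names have been standardized would also catch duplicates that differ only in case or spacing.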

FASTQs with similar names flagged as duplicate

During the seqrun_stats subcommand, we check if FASTQ names for a particular sample name, sample set, and replicate are duplicated. In some cases, we flag FASTQs as duplicates even though they are not. For example,

python3 -m mipscripts seqrun_stats --samplesheet /nfs/jbailey5/baileyweb/bailey_share/raw_data/220518_nextseq/220518_samples.tsv

returns an error for the FASTQs 36DA-EPHI-1_S22_R1_001.fastq.gz and 1836DA-EPHI-1_S92_R1_001.fastq.gz. Here seqrun_stats cannot distinguish between the sample names 36DA and 1836DA (note that 36DA is a substring of 1836DA).
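
The issue does not say how seqrun_stats matches FASTQs to samples, but a substring test would produce this behavior, because 36DA-EPHI-1 occurs inside 1836DA-EPHI-1_S92_R1_001.fastq.gz. The sketch below shows one possible fix under that assumption: anchor the match at the start of the file name. The function name fastq_matches_sample is hypothetical, and the file name layout (sample name, sample set, and replicate joined by hyphens, followed by _S<number>_R<read>_<number>.fastq.gz) is inferred only from the two file names above.

```python
import re


def fastq_matches_sample(fastq_name, sample_name, sample_set, replicate):
    """Return True only if the FASTQ name starts with exactly this sample's prefix.

    Hypothetical helper; the file name layout is an assumption based on the
    two example FASTQs in the issue.
    """
    prefix = f"{sample_name}-{sample_set}-{replicate}"
    # ^ anchors the prefix at the start of the name, so "36DA-EPHI-1" no
    # longer matches a file that begins with "1836DA-EPHI-1".
    pattern = re.compile(rf"^{re.escape(prefix)}_S\d+_R[12]_\d+\.fastq\.gz$")
    return pattern.match(fastq_name) is not None


if __name__ == "__main__":
    fastqs = [
        "36DA-EPHI-1_S22_R1_001.fastq.gz",
        "1836DA-EPHI-1_S92_R1_001.fastq.gz",
    ]
    for name in fastqs:
        print(name, fastq_matches_sample(name, "36DA", "EPHI", 1))
```

With the anchored pattern, each of the two FASTQs matches only its own sample name, so neither would be counted as a duplicate of the other.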
