
flow's People

Contributors

evanbiederstedt, imielinski, kevinmhadi, mfansler, mskilab, shaiberalon, zining01

flow's Issues

Sync core counts from Task to Job

Problem

The number of cores requested via the Job constructor is independent of the number of threads actually used by the task. As a result, the end user has to know what to specify based on knowledge of the run.sh internals, or has to specify a thread count elsewhere already (e.g., in the entities file).

Possible solution

Create a reserved keyword argument to be used in .task files to specify the default number of threads used by the module. If no cores parameter is specified by the user at Job instantiation, default to the task's value.
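As a sketch of the proposed keyword (the directive name `cores` and its placement are hypothetical, not existing flow syntax), a .task file might declare the module's default thread count alongside its inputs:

```
~/modules/JaBbA
cores   8
input   SlackPenalty    slack   value   100
```

Job instantiation would then fall back to this value whenever the user supplies no cores argument.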

Allow a task file to only include a subset of fields and use default values for missing fields

I am listing this here since I think it would be useful, and I outline two potential solutions below.

Template task file

The idea would be to have a template task file that includes default values for all required fields.
Then each task file can point to that template and include only the subset of fields whose values should override the defaults from the template task file.

So for example, we can have a template task file for JaBbA ~/tasks/JaBbA.task:

~/modules/JaBbA
input   CovFile cov_rds path
input   RAfile  junctionFilePath        path
input   CovField        field   value   "ratio"
input   RAsupp  junctionUnfiltered      path    /dev/null
input   TierFieldName   tfield  value   "tier"
input   NormalSegFile   cbs_nseg_rds    path    /dev/null
input   SegFile cbs_seg_rds     path    /dev/null
input   SlackPenalty    slack   value   100
input   OptionalHetPileupOutput het_pileups_wgs path    /dev/null
input   Purity  purity  value   NA
input   Ploidy  ploidy  value   NA
input   tilim   tilim   value   6000
input   epgap   epgap   value   1e-6
input   ppmethod        pp.method       value   "sequenza"
input   maxna   maxna   value   -1
input   flags   flags   value   ""
input   blacklist.coverage      blacklist.coverage      path    /dev/null
input   blacklist.junctions     blacklist.junctions     path    /dev/null
input   NumIterations   iter    value   0
input   TumorName       pair    value   "tumor"
input   job.spec.memory job.spec.memory value   15
input   indel           indel   value   "exclude"
input   cnsignif        cnsignif        value   0.00001
input   lp      lp      value   "TRUE"
input   ism     ism     value   "TRUE"
input   mem     treemem value   16
input   fix.thres       fix.thres       value   5
input   max.threads     max.threads     value   "Inf"
output  jabba_rds       jabba.simple.rds$
output  jabba_gg        jabba.simple.gg.rds$
output  jabba_vcf       jabba.simple.vcf$
output  jabba_raw_rds   jabba.raw.rds$
output  opti    opt.report.rds$
output  jabba_seg       jabba.seg$

And then another task file ~/tasks/JaBbA.hg38.task:

~/modules/JaBbA
~/tasks/JaBbA.task
input   blacklist.coverage      blacklist.coverage      path    '/path/to/hg38.coverage.mask.rds'

This way, if we want to update the default values for a module, we do not need to repeat the change in every flavor of that module's task files.
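One way to realize the template lookup is a simple two-layer merge keyed on the parameter name. Below is a minimal sketch in Python, assuming a whitespace-delimited format in which the second column of each `input` line is the parameter name; flow's actual parser and the template-reference convention may differ:

```python
# Hypothetical resolver for layered .task files: a task file may name a
# template task file, and its own "input" lines override the template's
# entries with the same parameter name. The field layout is an assumption,
# not flow's documented format.

def parse_task(lines):
    """Split a task file into (header_lines, {param_name: line})."""
    header, inputs = [], {}
    for line in lines:
        if line.startswith("input"):
            name = line.split()[1]      # second column = parameter name
            inputs[name] = line
        else:
            header.append(line)
    return header, inputs

def resolve(child_lines, template_lines):
    """Overlay the child's input lines on top of the template's defaults."""
    _, base = parse_task(template_lines)
    header, overrides = parse_task(child_lines)
    base.update(overrides)              # child wins on name collisions
    return header + list(base.values())

template = [
    "~/modules/JaBbA",
    "input\tSlackPenalty\tslack\tvalue\t100",
    "input\tblacklist.coverage\tblacklist.coverage\tpath\t/dev/null",
]
child = [
    "~/modules/JaBbA",
    "input\tblacklist.coverage\tblacklist.coverage\tpath\t/path/to/hg38.coverage.mask.rds",
]
resolved = resolve(child, template)
# resolved keeps the template's SlackPenalty default but uses the
# child's hg38 blacklist.coverage path instead of /dev/null.
```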

Optional vs. required parameters

An alternative solution would be to loosen the requirement that every parameter expected by the flow.deploy file be listed in the task file. Instead, parameters could be either required or optional: if a parameter is optional, the task file can omit it, and the default value defined in the module itself will prevail.

`cores` argument ignored

The Job(...) constructor fails to pass the cores argument through to the initializer for the Job class. Hence, all jobs are created with bcmds carrying a -n 1,1 flag, i.e., one core, irrespective of the value specified for cores.

NB: This is just a stub for the fix that I'm about to push.
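The failure mode is a wrapper silently dropping an argument before it reaches the initializer. A minimal sketch in Python (class, function, and field names here are illustrative; flow's real implementation is in R):

```python
# Minimal illustration of the bug: a factory builds Job without forwarding
# 'cores', so every submission string ends up with "-n 1,1".
# All names are illustrative, not flow's actual internals.

class Job:
    def __init__(self, cmd, cores=1):
        self.cores = cores
        # bsub-style resource flag: min,max cores requested
        self.bcmd = f"bsub -n {cores},{cores} {cmd}"

def make_job_buggy(cmd, cores=1):
    return Job(cmd)                  # bug: 'cores' is accepted but never passed on

def make_job_fixed(cmd, cores=1):
    return Job(cmd, cores=cores)     # fix: forward the argument to the initializer

# make_job_buggy("run.sh", cores=4).bcmd -> "bsub -n 1,1 run.sh"
# make_job_fixed("run.sh", cores=4).bcmd -> "bsub -n 4,4 run.sh"
```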

Incompatible with stringr 1.2.0

Upgrading to stringr 1.2.0 will result in incorrectly generated cmd strings. For example, using stringr 1.1.0:

> Module('~/modules/BWAMem/')
#Module BWAMem ("sh <libdir>run.sh <libdir> <sampleName> <FASTQPathFile> <ReferenceFASTA> <NumCores> <jobs.spec.memor...")
...

but with stringr 1.2.0, the result is

> Module('~/modules/BWAMem/')
#Module BWAMem ("sh <libdir>run.sh <libdir> NA<sampleName> NA<FASTQPathFile> NA<ReferenceFASTA> NA<NumCores> NA<jobs....")
...

I suspect the issue comes from Line 89.
