Comments (5)

mlathara commented on August 23, 2024

Can you tell us more about how you imported the data?

  • Did you reblock the GVCFs? This should help a lot with reducing the data size/memory requirements: https://gatk.broadinstitute.org/hc/en-us/articles/13832696945307-ReblockGVCF
  • How did you partition into 3200 intervals?
  • Can you share what options you used for import?
  • What sort of memory/core count did each import job have available?
  • Did the import jobs eventually error out or finish?

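For context, a typical ReblockGVCF invocation looks roughly like the following. This is a generic sketch, not the poster's actual command; the reference and sample paths are placeholders.

```shell
# Hypothetical example: reblock one input GVCF before GenomicsDB import.
# reference.fasta and sample.g.vcf.gz are placeholder paths.
gatk ReblockGVCF \
    -R reference.fasta \
    -V sample.g.vcf.gz \
    -O sample.reblocked.g.vcf.gz
```

Reblocking merges adjacent reference blocks into coarser GQ bands, which is where the input-size reduction comes from.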
from gatk.

DarioS commented on August 23, 2024
  • No reblocking.
  • Approximately equal-width intervals of about 1 million bases across the human genome.
  • The import command came from the university bioinformatics core facility's pipeline, not mine.
  • 1 core and 4 GB RAM per task, but tasks seem to be using only about 1 GB RAM per task. 768 tasks (16 nodes) in total.
                                %CPU  WallTime  Time Lim     RSS    mem memlim cpus
normal-exe = open&run
105581211 R ds6924 hm82 genotype   4  00:18:25  02:00:00 1064GB 1064GB 3072GB   768
  • Jobs eventually finish if not running out of allocated time.
  • Takes a long time to begin processing the first set of variants.
13:51:37.925 INFO  GenotypeGVCFs - ------------------------------------------------------------
13:51:39.736 INFO  GenotypeGVCFs - Done initializing engine
13:51:39.923 INFO  ProgressMeter - Starting traversal
13:51:39.923 INFO  ProgressMeter -        Current Locus  Elapsed Minutes    Variants Processed  Variants/Minute
14:23:57.323 WARN  ReferenceConfidenceVariantContextMerger - Detected invalid annotations: When trying to merge variant contexts at location chr17:18363145 the annotation AS_RAW_MQ=64800.000|50400.000|0.000 was not a numerical value and was ignored
14:23:57.346 WARN  ReferenceConfidenceVariantContextMerger - Reducible annotation 'AS_RAW_MQ' detected, add -G Standard -G AS_Standard to the command to annotate in the final VC with this annotation.
14:23:58.180 INFO  ProgressMeter -       chr17:18363854             32.3                  1000             31.0
14:24:13.258 INFO  ProgressMeter -       chr17:18376854             32.6                 14000            430.0
14:24:58.358 INFO  ProgressMeter -       chr17:18382854             33.3                 20000            600.5
14:32:49.287 INFO  ProgressMeter -       chr17:18393855             41.2                 31000            753.2
14:33:39.240 INFO  ProgressMeter -       chr17:18405856             42.0                 43000           1024.1
14:33:49.493 INFO  ProgressMeter -       chr17:18411856             42.2                 49000           1162.3
14:34:17.285 INFO  ProgressMeter -       chr17:18425856             42.6                 63000           1478.1

CPU utilisation does not improve even once variants begin processing, after the half hour spent preparing the traversal.

                                %CPU  WallTime  Time Lim     RSS    mem memlim cpus
normal-exe = open&run
105581211 R ds6924 hm82 genotype   4  00:42:34  02:00:00 1200GB 1200GB 3072GB   768
  • Excellent CPU efficiency if running serially (but that defeats the purpose of an HPC system with Lustre).
                                %CPU  WallTime  Time Lim     RSS    mem memlim cpus
normal-exe = open&run
105381052 R ds6924 hm82 genotype  61  00:19:55  10:00:00 1487MB 1487MB 4096MB     1

09:17:51.114 INFO  ProgressMeter -      chr10:106687146              1.2                  1000            822.3
09:18:01.308 INFO  ProgressMeter -      chr10:106710146              1.4                 24000          17315.6
09:18:21.691 INFO  ProgressMeter -      chr10:106721171              1.7                 35000          20281.0
09:18:31.944 INFO  ProgressMeter -      chr10:106742172              1.9                 56000          29526.0

When run serially, intervals take about fifteen minutes each instead of about seven hours. Outputting results to the $PBS_JOBFS folder on the compute node instead of directly to the project folder did not improve performance at all.
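The ~1 Mb equal-width partitioning described above can be sketched from a samtools-style FASTA index (first two columns: contig name, contig length). This is a generic reconstruction, not the core facility's actual pipeline; a tiny two-contig demo index stands in for the real reference here.

```shell
# Hypothetical sketch: emit ~1 Mb intervals (chr:start-end, 1-based)
# from a .fai-style index. demo.fai stands in for reference.fasta.fai.
printf 'chr1\t2500000\nchr2\t900000\n' > demo.fai
awk -v size=1000000 '{
    for (start = 1; start <= $2; start += size) {
        end = start + size - 1
        if (end > $2) end = $2          # clamp final window to contig end
        printf "%s:%d-%d\n", $1, start, end
    }
}' demo.fai > intervals.list
```

Each line of intervals.list can then be passed to GenotypeGVCFs via -L, one interval per job.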


nalinigans commented on August 23, 2024

Not sure what your GenotypeGVCFs command was, but did you use the --genomicsdb-shared-posixfs-optimizations option? This option is available for the import too and may improve your performance.

--genomicsdb-shared-posixfs-optimizations <Boolean>
                              Allow for optimizations to improve the usability and performance for shared Posix
                              Filesystems(e.g. NFS, Lustre). If set, file level locking is disabled and file system
                              writes are minimized.  Default value: false. Possible values: {true, false} 
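A GenotypeGVCFs invocation with that flag might look like the following; the reference, workspace name, interval, and output paths are all placeholders, not the poster's actual settings.

```shell
# Hypothetical invocation reading a GenomicsDB workspace with the
# shared-POSIX-filesystem optimizations enabled (placeholder paths).
gatk GenotypeGVCFs \
    -R reference.fasta \
    -V gendb://my_workspace \
    -L chr17:18000000-19000000 \
    --genomicsdb-shared-posixfs-optimizations true \
    -O output.vcf.gz
```

The same flag can be passed to GenomicsDBImport when the workspace is created.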


mlathara commented on August 23, 2024

As @nalinigans suggested, --genomicsdb-shared-posixfs-optimizations should help, though probably mostly for import. Similarly, I would highly recommend --bypass-feature-reader for the import as well.

As I mentioned before, reblocking will help import and query - mainly because it reduces the input GVCF size by 5x-8x. It shouldn't be necessary for the number of samples you indicate, but will become more important as the number of samples scales up (and does help at any number of samples, I should add).

That doesn't seem to be the crux of your problem though...you note that running serially does better than trying to parallelize across many cores. I don't have a lot of insight into Lustre specifically, but do you have any metrics on what the IOPS look like for the Lustre FS in each case? Also, the bit about the first set of variants taking a while - does that time look different when running serially versus in parallel?

One experiment to consider - maybe try to copy the workspace to the $PBS_JOBFS folder on the compute node before running GenotypeGVCFs. Not sure it is feasible in terms of amount of storage, etc but it would at least rule out possible Lustre issues.
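That staging experiment could look roughly like the following job-script fragment. The workspace name and interval are assumptions; $PBS_JOBFS is the node-local scratch directory PBS provides, and /scratch/hm82/ is the Lustre path mentioned in this thread.

```shell
# Hypothetical PBS fragment: stage the GenomicsDB workspace to node-local
# storage before querying, taking Lustre out of the read path entirely.
cp -r /scratch/hm82/my_workspace "$PBS_JOBFS/"

gatk GenotypeGVCFs \
    -R reference.fasta \
    -V "gendb://$PBS_JOBFS/my_workspace" \
    -L chr17:18000000-19000000 \
    -O output.vcf.gz
```

If the serial/parallel gap disappears with the workspace on local disk, Lustre contention is the likely culprit.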


DarioS commented on August 23, 2024

I copied the Genomics DB to the compute nodes rather than reading it from /scratch/hm82/ Lustre and voila! Good guess.

                                %CPU  WallTime  Time Lim     RSS    mem memlim cpus
normal-exe = open&run
105643164 R ds6924 hm82 genotype  60  00:10:03  02:00:00 2266GB 2266GB 3072GB   768
