Comments (10)

cjw85 avatar cjw85 commented on July 17, 2024

Processing the short regions efficiently has often been a thorny issue. We have made some changes to how these are handled in v0.6.4, which is available now on GitHub and PyPI (a conda package is waiting on the resolution of bioconda/bioconda-recipes#14433).

from medaka.

eernst avatar eernst commented on July 17, 2024

Unfortunately, the v0.6.4 update seems to have slowed things down severely (I was running v0.6.2 previously).

The same runs that were running relatively quickly (though still taking a disproportionate amount of time in the short regions and eventually failing) are now moving extremely slowly:

[23:04:31 - PWorker] 100.0% Done (747.7/747.7 Mbases) in 22537.3s
[04:34:34 - PWorker] All done, 530 remainder regions.
[04:34:34 - Predict] Processing 530 short region(s).
[04:34:34 - ModelLoad] Building model (steps, features, classes): (None, 10, 5)
[04:34:34 - ModelLoad] With cudnn: False
[04:34:35 - ModelLoad] Loading weights from /mnt/grid/martienssen/hpc/home/data/eernst/src/medaka-env/lib/python3.6/site-packages/medaka/data/r941_flip235_model.hdf5
[04:34:35 - PWorker] Running inference for 0.0M draft bases.
[04:34:35 - Sampler] Initializing sampler for consensus of region ctg1:2996042-3000000.
[04:34:36 - Feature] Processed ctg1:2996042.0-2999999.1 (median depth 8.0)
[04:34:36 - Sampler] Took 0.66s to make features.
[04:34:36 - Sampler] Pileup for ctg1:2996042.0-2999999.1 is of width 4854
[09:38:42 - PWorker] All done, 0 remainder regions.
[09:38:42 - PWorker] Running inference for 0.0M draft bases.
[09:38:42 - Sampler] Initializing sampler for consensus of region ctg2:4213693-4215290.
[09:38:43 - Feature] Processed ctg2:4213693.0-4215289.0 (median depth 1.0)
[09:38:43 - Sampler] Took 0.20s to make features.
[09:38:43 - Sampler] Pileup for ctg2:4213693.0-4215289.0 is of width 1744

It appears to be taking about 5 hours for each short region:

$ grep -i process 03-medaka/logs/medaka-consensus.log
[...]
[04:34:34 - Predict] Processing 530 short region(s).
[04:34:36 - Feature] Processed ctg1:2996042.0-2999999.1 (median depth 8.0)
[09:38:43 - Feature] Processed ctg2:4213693.0-4215289.0 (median depth 1.0)
[15:11:00 - Feature] Processed ctg29:4128474.0-4130248.0 (median depth 2.0)
[21:29:19 - Feature] Processed ctg29:4133104.0-4135866.0 (median depth 2.0)
[04:07:00 - Feature] Processed ctg79:3945261.0-3945992.0 (median depth 1.0)
[09:07:24 - Feature] Processed ctg101:1131000.0-1132157.0 (median depth 1.0)
[14:03:16 - Feature] Processed ctg129:2999092.0-2999999.0 (median depth 8.0)
[20:13:54 - Feature] Processed ctg140:1242445.0-1251787.0 (median depth 3.0)
[01:14:01 - Feature] Processed ctg140:1251913.0-1255851.0 (median depth 1.0)
[06:08:58 - Feature] Processed ctg149:3267277.0-3268391.0 (median depth 1.0)
[11:04:10 - Feature] Processed ctg162:999000.0-1001270.0 (median depth 53.0)
[16:12:36 - Feature] Processed ctg165:2840290.0-2840425.0 (median depth 5.0)
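The per-region time can be estimated from the grep output above rather than read off by eye. The following is a minimal sketch, not part of medaka: `region_durations` is a hypothetical helper that assumes the `[HH:MM:SS - Feature] Processed <region>` line format shown in the log, treats the gap to the next `Processed` line as the time spent on a region, and assumes each gap is shorter than 24 hours (the timestamps carry no date, so a negative gap is taken to mean the clock passed midnight once).

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# [04:34:36 - Feature] Processed ctg1:2996042.0-2999999.1 (median depth 8.0)
LOG_LINE = re.compile(r"^\[(\d{2}:\d{2}:\d{2}) - Feature\] Processed (\S+)")


def region_durations(lines):
    """Return [(region, timedelta)] for gaps between consecutive 'Processed' lines.

    The gap is attributed to the earlier region. This is only an estimate:
    it is the time until the next region's features were produced.
    """
    results = []
    previous = None  # (timestamp, region) of the last matching line
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        region = match.group(2)
        if previous is not None:
            gap = stamp - previous[0]
            if gap < timedelta(0):
                # Timestamps have no date; assume one midnight rollover.
                gap += timedelta(days=1)
            results.append((previous[1], gap))
        previous = (stamp, region)
    return results
```

Fed the first two `Processed` lines above, this reports a gap of 5:04:07 for `ctg1:2996042.0-2999999.1`, consistent with the roughly 5 hours per region estimated by eye.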

cjw85 avatar cjw85 commented on July 17, 2024

I've managed to replicate this; in my case the program seems to be in an uninterruptible sleep state. Running medaka consensus on only the region that made the full calculation hang completes successfully.

We will continue to debug this, first by finding an example that doesn't take two hours before hanging!
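For anyone wanting to check whether their own run is in the same state, here is a minimal sketch, assuming a Linux host. It is not part of medaka; `parse_proc_state` and `is_uninterruptible` are hypothetical helper names. It reads the state field of `/proc/<pid>/stat`, where `D` denotes uninterruptible sleep (the same letter `ps` shows in its STAT column).

```python
from pathlib import Path


def parse_proc_state(stat_text):
    """Return the one-letter state field from the contents of /proc/<pid>/stat."""
    # The layout is "pid (comm) state ...". The command name is wrapped in
    # parentheses and may itself contain spaces or ')', so split on the
    # LAST closing parenthesis before reading the state.
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]


def is_uninterruptible(pid):
    """True if the Linux process `pid` is currently in state 'D'."""
    stat_text = Path(f"/proc/{pid}/stat").read_text()
    return parse_proc_state(stat_text) == "D"
```

A single `D` reading is not conclusive, since processes pass through that state briefly during normal I/O; a process that stays in `D` across repeated checks is the more telling sign.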

cjw85 avatar cjw85 commented on July 17, 2024

We have identified the cause of the slowdown (some unnecessary verification of the output file), and will have a bugfix release ASAP.

caity-s avatar caity-s commented on July 17, 2024

Great timing! I have just run into this with my sample run. It also seems to be using just one thread at this short-region processing stage. I am looking forward to the fix!

Thanks!

cjw85 avatar cjw85 commented on July 17, 2024

We will have a new release later today.

cjw85 avatar cjw85 commented on July 17, 2024

medaka v0.6.5 is now available on GitHub and PyPI; a bioconda package should follow shortly.

dominik-handler avatar dominik-handler commented on July 17, 2024

Just wanted to let you know that this bug still occurs in v0.6.5. Killing the job and re-running only the affected region fixed it, and that run completed.

Dominik

eernst avatar eernst commented on July 17, 2024

On all four of my datasets v0.6.5 now runs successfully in a reasonable amount of time. Thanks!

valntn avatar valntn commented on July 17, 2024

Quoting cjw85: "I've managed to replicate this; in my case the program seems to be in an uninterruptible sleep state. Running medaka consensus only on the region which made the full calculation hang, completes successfully.

We will continue to debug this, first by finding an example that doesn't take two hours before hanging!"

Hi,

I am running the latest conda installation of medaka (0.10.1), and I am encountering a similar problem. The process seems to go to sleep at random during the short-region processing stage, and not always within the same region or contig. Running medaka consensus on the contig in question works fine.

It does report back some messages concerning tensorflow. Not sure what to make of them.

0: [08:58:40 - Sampler] Region contig_10079:4499.0-6515.0 (2341 positions) is smaller than inference chunk length 10000, quarantining.
0: [08:58:40 - Sampler] Region contig_10079:7002.0-16161.0 (9899 positions) is smaller than inference chunk length 10000, quarantining.
0: [08:58:40 - Sampler] Region contig_10079:17079.0-19253.0 (2250 positions) is smaller than inference chunk length 10000, quarantining.
0: [08:58:40 - Sampler] Region contig_10079:19342.0-26003.0 (6823 positions) is smaller than inference chunk length 10000, quarantining.
0: [08:58:40 - Sampler] Region contig_10079:26516.0-29471.0 (3113 positions) is smaller than inference chunk length 10000, quarantining.
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11886 thread 798 bound to OS proc set 1
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11887 thread 799 bound to OS proc set 2
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11888 thread 800 bound to OS proc set 3
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11885 thread 797 bound to OS proc set 39
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11889 thread 801 bound to OS proc set 4
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11890 thread 802 bound to OS proc set 5
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11891 thread 803 bound to OS proc set 6
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11892 thread 804 bound to OS proc set 7
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11893 thread 805 bound to OS proc set 8
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11894 thread 806 bound to OS proc set 9
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11895 thread 807 bound to OS proc set 10
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11897 thread 809 bound to OS proc set 12
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11896 thread 808 bound to OS proc set 11
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11898 thread 810 bound to OS proc set 13
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11899 thread 811 bound to OS proc set 14
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11900 thread 812 bound to OS proc set 15
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11902 thread 814 bound to OS proc set 17
0: OMP: Info #250: KMP_AFFINITY: pid 8985 tid 11901 thread 813 bound to OS proc set 16
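To see which regions were set aside as short in a log like the one above, the `quarantining` lines can be pulled out and the interleaved OMP messages ignored. This is a minimal sketch, not part of medaka: `quarantined_regions` is a hypothetical helper, and the pattern assumes the exact message wording shown in the log excerpt.

```python
import re

# Matches the Sampler message shown in the log, for example:
# Region contig_10079:4499.0-6515.0 (2341 positions) is smaller than
# inference chunk length 10000, quarantining.
QUARANTINE = re.compile(
    r"Region (?P<name>\S+):(?P<start>[\d.]+)-(?P<end>[\d.]+) "
    r"\((?P<positions>\d+) positions\) is smaller than "
    r"inference chunk length (?P<chunk>\d+)"
)


def quarantined_regions(lines):
    """Return [(region, positions, chunk_length)] for each quarantined region."""
    found = []
    for line in lines:
        match = QUARANTINE.search(line)
        if match is None:
            continue  # e.g. OMP/KMP_AFFINITY lines or other log messages
        region = f"{match['name']}:{match['start']}-{match['end']}"
        found.append((region, int(match["positions"]), int(match["chunk"])))
    return found
```

Counting the entries in the result gives the number of short regions to expect in the later short-region stage, which helps when deciding which ones to re-run individually after a hang.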

Valentin
