bioinfo-pf-curie / mpisort
License: Other
Deletion of Makefile.in files as they are generated by automake.
The idea is to build a complete pipeline from FASTQ to duplicate marking using MPI tools and samtools:
mpiBWA + samtools fixmate + mpiSORT + samtools markdup (+ samtools merge).
Add examples directory to store test data and script with an example.
Problem with an old assert in write.c, line 1430.
It should be removed.
It is written in the docs in the section Informatic resources / Memory :
The total memory used during the sorting is around one and a half the size of the SAM file.
The total memory needed in this case is around 2.5 the individual SAM size.
I think that it should be 2.5 times the size of the SAM file. Is that correct?
Deletion of compile file as it is generated by automake
Datatypes are expensive to build. See
https://github.com/besnardjb/mpiSORT/tree/mpc-trial
to test and merge.
So far only paired-end reads are supported.
Remove compilation warnings, essentially unused variables.
So far we link with curl, lzma and bz2.
Add an option in configure.ac to enable linking with libcrypto, and update the doc.
Remove some memory leaks.
Refactoring the code to allow MPMD with Slurm.
line Release notes
Harmonize source file naming convention.
Rename executable psort into mpiSORT.
Deletion of src/config.h* files as they are generated by configure
Results for discordant and unmapped reads are not written to the proper SAM file.
Add the documentation in the archive built with make dist.
Provide a Singularity recipe with the mpiSORT executable.
Discordant or mate-unmapped reads could be kept in the sorted ChrN.sam; the parser.c code should be reviewed to include them.
In docs/README.md, an example with Slurm is provided:
mpirun $MPISORT $SAM $OUTPUTDIR -q 0
Can the -n or -np parameter be omitted when mpirun is invoked with Slurm? Or should it be possible to add -np like this: mpirun -np ${SLURM_NPROCS} mpiSORT
In the case of PBS/Torque, it seems that it is required: mpirun -np ${PBS_NP}
@fredjarlier, can you provide more details about the -n or -np parameter when the code is launched with Slurm or PBS/Torque, please?
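To make the question concrete, here is a minimal Python sketch of a launcher helper that adds -np only when the scheduler exposes a process count. The function name `mpirun_command` and the file names in the example are hypothetical; SLURM_NPROCS and PBS_NP are the variables quoted above. Whether mpirun actually needs -np depends on the MPI implementation and how it was built, so this is an illustration and not an answer from the maintainers.

```python
import os


def mpirun_command(executable, args, env=None):
    """Build an mpirun command line, adding -np when a scheduler variable is set."""
    env = os.environ if env is None else env
    # Use the process count exported by Slurm first, then by PBS/Torque.
    nprocs = env.get("SLURM_NPROCS") or env.get("PBS_NP")
    cmd = ["mpirun"]
    if nprocs:
        cmd += ["-np", str(nprocs)]
    return cmd + [executable] + list(args)


# Hypothetical input and output names, with a simulated Slurm environment.
print(mpirun_command("mpiSORT", ["in.sam", "outdir", "-q", "0"],
                     env={"SLURM_NPROCS": "4"}))
```

When neither variable is set, the helper leaves -np out and lets mpirun decide.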
So far the memory limit during the Bruck is 2 GB (2^31-1) per job.
Propose a method to overcome this limit.
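One possible approach, sketched here in Python rather than in the project's C, is to split any exchange larger than 2^31-1 into several smaller messages. The function name `split_into_chunks` and the choice of 2^31-1 as the maximum chunk size are assumptions for illustration; this is not how mpiSORT currently handles the Bruck.

```python
INT_MAX = 2**31 - 1  # largest value a 32-bit signed count can hold


def split_into_chunks(total_bytes, max_chunk=INT_MAX):
    """Return (offset, length) pairs covering total_bytes, each length <= max_chunk."""
    if total_bytes < 0 or max_chunk <= 0:
        raise ValueError("total_bytes must be >= 0 and max_chunk must be > 0")
    chunks = []
    offset = 0
    while offset < total_bytes:
        length = min(max_chunk, total_bytes - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


# A 5 GiB buffer would be sent as three messages instead of one.
for offset, length in split_into_chunks(5 * 2**30):
    print(offset, length)
```

Each (offset, length) pair would then correspond to one send/receive of the underlying exchange.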
Propose a simple script to explore the cluster architecture.
The idea is to get an overview of the cluster topology (partition, node, network, etc.) in order to help users build the right command line.
When we sort a SAM file containing a single chromosome, memory usage can grow very high because we keep the whole SAM in a buffer (x5). The problem is a memory copy before the Bruck.
Solution: free the SAM buffer before the Bruck.
Replace the existing @HD SO: field when multiple consecutive sorts are applied (query-name sort followed by coordinate sort), for instance when marking duplicates with samtools.
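A minimal Python sketch of such a header rewrite follows. It assumes the header is available as tab-separated text lines, as in the SAM format; the function name `set_sort_order` and the sample header are hypothetical, and the project itself would do this in C.

```python
def set_sort_order(header_text, sort_order):
    """Return header_text with the SO: tag of the @HD line set to sort_order."""
    lines = header_text.split("\n")
    for i, line in enumerate(lines):
        if line.startswith("@HD"):
            fields = line.split("\t")
            replaced = False
            for j, field in enumerate(fields):
                if field.startswith("SO:"):
                    fields[j] = "SO:" + sort_order  # overwrite the existing tag
                    replaced = True
            if not replaced:
                fields.append("SO:" + sort_order)  # no SO: tag yet, so add one
            lines[i] = "\t".join(fields)
            break  # a header has at most one @HD line
    return "\n".join(lines)


header = "@HD\tVN:1.6\tSO:queryname\n@SQ\tSN:chr1\tLN:248956422"
print(set_sort_order(header, "coordinate"))
```

A header without an @HD line is returned unchanged in this sketch.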
Hi,
I have the following Slurm script:
#!/bin/bash
#SBATCH -N 2
#SBATCH -n 4
#SBATCH -c 1
#SBATCH --tasks-per-node=2
#SBATCH -t 12:0:0
#SBATCH -p normal
#SBATCH --output=sortmpi.out
module load 2020
module load OpenMPI/4.0.3-GCC-9.3.0
module load impi/2019.8.254-iccifort-2020.1.217
with the sbatch command:
mpirun -n 4 /home/tahmad/hawk/mpiSORT/src/mpiSORT /scratch-shared/tahmad/bio_data/ERR194147/out/ERR194147_mpibwa.sam.sam /scratch-shared/tahmad/bio_data/ERR194147/out -p -q 0 2> /scratch-shared/tahmad/bio_data/ERR194147/out/mpiSORT.log
Every time, the mpiSORT program crashes with the following error:
Number of processes : 4
Reads' quality threshold : 0
Compression Level is : 3
SAM file to read : /scratch-shared/tahmad/bio_data/ERR194147/out/ERR194147_mpibwa.sam.sam
Output directory : /scratch-shared/tahmad/bio_data/ERR194147/out
The size of the file is 30898833423 bytes
Header has 195+2 references
rank 3 ::: counter = 3356333
rank 0 ::: counter = 3464114
rank 1 ::: counter = 3448595
rank 2 ::: counter = 3374532
rank 3 ::: counter = 3426006
rank 1 ::: counter = 3446897
rank 2 ::: counter = 3435650
rank 0 ::: counter = 3406663
rank 3 ::: counter = 3441393
rank 0 ::: counter = 3470229
rank 1 ::: counter = 3449038
rank 2 ::: counter = 3441602
rank 3 ::: counter = 3435720
rank 0 ::: counter = 3411970
rank 2 ::: counter = 3437440
rank 1 ::: counter = 3452366
rank 3 ::: counter = 3418915
rank 0 ::: counter = 3454414
rank 2 ::: counter = 3445559
rank 1 ::: counter = 3449876
rank 3 ::: counter = 3410467
rank 2 ::: counter = 3439039
rank 0 ::: counter = 3451221
rank 1 ::: counter = 3423934
rank 3 ::: counter = 3263389
rank 2 ::: counter = 3424659
rank 0 ::: counter = 3448015
rank 3 ::: counter = 636846
rank 2 ::: counter = 671044
3 (99.51)::::: *** FINISH PARSING FILE ***
2 (100.04)::::: *** FINISH PARSING FILE ***
rank 0 ::: counter = 667634
rank 1 ::: counter = 3460384
0 (101.26)::::: *** FINISH PARSING FILE ***
rank 1 ::: counter = 669584
1 (104.64)::::: *** FINISH PARSING FILE ***
rank 0 ::::[MPISORT] total reads parsed = 98633528
rank 0 ::::[MPISORT] total read to sort for unmapped = 127110
Rank 0 :::::[WRITE] Time for chromosome unmapped writing 0.100934 seconds
rank 0 :::::[MPISORT] Time to write chromosom unmapped , 0.951701 seconds
rank 0 ::::[MPISORT] total read to sort for discordant = 662518
Rank 0 :::::[WRITE] Time for chromosome discordant writing 0.070292 seconds
rank 0 :::::[MPISORT] Time to write chromosom discordant , 3.475659 seconds
rank 0 ::::[MPISORT] Elected rank = 0
rank 0 ::::[MPISORT] we don't split the rank
Rank 0 :::::[MPISORT] Dimensions for bitonic = 4
Rank 0 :::::[MPISORT] Split size = 4
rank 0 :::::[MPISORT][MALLOC 1] time spent = 0.003256 s
rank 0 :::::[MPISORT][LOCAL SORT] time spent = 0.022284 s
rank 0 :::::[MPISORT][BITONIC 2] time spent = 1.687243 s
rank 0 :::::[MPISORT][TRIMMING] time spent = 0.001534 s
rank 0 :::::[MPISORT][BRUCK 3] time spent = 0.556936 s
rank 0 :::::[MPISORT][FREE + MALLOC] time spent = 0.000414 s
rank 0 :::::[MPISORT] we call write SAM
Rank 0 :::::[WRITE][BITONIC 2] Time spent sorting sources offsets = 1.927455
Rank 0 :::::[WRITE][BRUCK 2] Time spent in bruck = 0.395096
Rank 0 :::::[WRITE][LOCAL SORT] Time = 0.111249 seconds
Rank 0 :::::[WRITE][COMPUTE BUFFER SIZE] Time = 0.058617 seconds
Rank 0 :::: [WRITE][COMPUTE PACKs] Number of packs buffer = 3
Rank 0 :::::[WRITE][DATA PACK 0] : Time = 0.463420 seconds
mpiSORT: write.c:1395: writeSam: Assertion `number_of_reads_recieved == previous_local_readNum' failed.
mpiSORT: write.c:1395: writeSam: Assertion `number_of_reads_recieved == previous_local_readNum' failed.
mpiSORT: write.c:1395: writeSam: Assertion `number_of_reads_recieved == previous_local_readNum' failed.
Rank 0 :::::[WRITE][BRUCK PACK 0] :: Time = 0.239733 s
Rank 0 :::::[WRITE][DATA PACK 1] : Time = 0.608196 seconds
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 26525 RUNNING AT tcn1028
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 26526 RUNNING AT tcn1028
= KILLED BY SIGNAL: 6 (Aborted)
===================================================================================
It only generates the following incomplete output files:
-rw-r--r-- 1 tahmad tahmad 700M Sep 9 19:28 chr1.gz
-rw-r--r-- 1 tahmad tahmad 79M Sep 9 19:39 discordant.gz
-rw-r--r-- 1 tahmad tahmad 29G Aug 27 08:28 ERR194147_mpibwa.sam.sam
-rw-r--r-- 1 tahmad tahmad 3.3K Sep 9 19:39 mpiSORT.log
-rw-r--r-- 1 tahmad tahmad 12M Sep 9 19:28 (null)_unmapped.gz
-rw-r--r-- 1 tahmad tahmad 12M Sep 9 19:39 unmapped.gz
Deletion of depcomp, install-sh and missing files as they are generated by automake.
Modify the header of the source files to add the CeCILL license.
Write a bench method like in mpiBWA
Is it possible to read (input) multiple SAM files from NFS instead of a single SAM file?
From the docs, it is not clear what the unmapped and discordant SAM files contain after running mpiSORT.
Propose an option to sort by read name before sorting by coordinates.
So far the name sorting is done independently before the coordinate sorting; doing both in the same job could give a speed-up.
The current version is very old and should be updated.
Deletion of configure file as it is generated by autoconf.
How to reproduce the error:
git clone https://github.com/bioinfo-pf-curie/mpiSORT.git
git checkout d4294c7
Then compile the program according to the installation procedure.
Launch the following command:
mpirun -n 4 src/mpiSORT examples/data/HCC1187C_70K_READS.sam ~/mpiSORTExample/ -n
I got the following error:
Number of processes : 4
Reads' quality threshold : 0
Compression Level is : 3
SAM file to read : examples/data/HCC1187C_70K_READS.sam
Output directory : /home/philippe/mpiSORTExample/
The size of the file is 24471289 bytes
Header has 25+2 references
rank 3 ::: counter = 18803
rank 0 ::: counter = 18786
0 (0.08)::::: *** FINISH PARSING FILE ***
3 (0.08)::::: *** FINISH PARSING FILE ***
rank 2 ::: counter = 18782
rank 1 ::: counter = 18805
2 (0.10)::::: *** FINISH PARSING FILE ***
1 (0.10)::::: *** FINISH PARSING FILE ***
rank 0 :::: total read to sort for unmapped = 910
Rank 0 :::::[WRITE] Time for chromosome discordant writing 0.000046 seconds
rank 0 :::::[MPISORT] Time to write chromosom discordant , 0.003063 seconds
rank 0 :::: total read to sort for discordant = 194
Rank 0 :::::[WRITE] Time for chromosome unmapped writing 0.000031 seconds
rank 0 :::::[MPISORT] Time to write chromosom unmapped , 0.000917 seconds
rank 0 :::: Elected rank = 0
rank 0 ::::[MPISORT] we don't split the rank
Rank 0 :::::[MPISORT] Dimensions for bitonic = 4
Rank 0 :::::[MPISORT] Split size = 4
rank 0 :::::[MPISORT][MALLOC 1] time spent = 0.000054 s
rank 0 :::::[MPISORT][LOCAL SORT] time spent = 0.000451 s
rank 0 :::::[MPISORT][BITONIC 2] time spent = 0.000844 s
rank 0 :::::[MPISORT][TRIMMING] time spent = 0.000017 s
rank 0 :::::[MPISORT][BRUCK 3] time spent = 0.000238 s
rank 0 :::::[MPISORT][FREE + MALLOC] time spent = 0.000002 s
corrupted double-linked list
[06751] *** Process received signal ***
[06751] Signal: Aborted (6)
[06751] Signal code: (-6)
[06751] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f8f1beb0890]
[06751] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7f8f1baebe97]
[06751] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7f8f1baed801]
[06751] [ 3] /lib/x86_64-linux-gnu/libc.so.6(+0x89897)[0x7f8f1bb36897]
[06751] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x9090a)[0x7f8f1bb3d90a]
[06751] [ 5] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x7ea)[0x7f8f1bb4513a]
[06751] [ 6] src/mpiSORT(+0x40fb)[0x556ce0c950fb]
[06751] [ 7] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f8f1baceb97]
[06751] [ 8] src/mpiSORT(+0x533a)[0x556ce0c9633a]
[06751] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node hupe exited on signal 6 (Aborted).
--------------------------------------------------------------------------
The ressources script in examples/ should print the command-line options so that they can be passed directly to the Slurm command.
Sometimes, when linking with htslib, pthread must be linked as well.
Add AC_CHECK_LIB([pthread], [pthread_create]) in the AM_COND_IF for libhts.
So far the sorting is not stable. We only consider one key (the coordinate) when sorting, so the original order of the reads in the SAM file is not preserved for ties. It is not clear how important this is or whether it affects subsequent analyses, but it could be a problem for reproducibility when files are compared by md5.
The trouble comes from the quick sort used before the bitonic sort. We could replace the quick sort with any stable sort: merge sort, Timsort, grail sort, etc.
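A minimal Python sketch of the idea (not the project's C code): pairing each coordinate with the read's original index in the input gives a total order, so ties come out in input order whichever sorting algorithm is used. The sample `reads` and the function name `sort_reads_stably` are hypothetical.

```python
def sort_reads_stably(reads):
    """Sort (coordinate, name) reads by coordinate, keeping input order for ties."""
    # Attach the original position so that no two keys compare equal.
    indexed = [(coord, idx, name) for idx, (coord, name) in enumerate(reads)]
    indexed.sort(key=lambda r: (r[0], r[1]))
    return [(coord, name) for coord, _, name in indexed]


reads = [(100, "readC"), (50, "readA"), (100, "readB"), (50, "readD")]
print(sort_reads_stably(reads))
# [(50, 'readA'), (50, 'readD'), (100, 'readC'), (100, 'readB')]
```

Python's built-in sort is already stable; the explicit index is there to show a key that would also give a deterministic result with an unstable algorithm such as quick sort.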
Add an example script to launch the code with PBS/Torque in the Examples section of the file docs/README.md.
Add details in the documentation about the options of the program: explain the purpose of the -q and -n options.
Propose an option to write in SAM file format, to avoid compression and decompression overhead.
Documentation refactoring.
When sorting a single-chromosome file, a segfault can occur at line 286 of mpiSort during strdup.
Deletion of the aclocal.m4 file as it is generated by aclocal.