⛓ Long Interval Nucleotide K-mer Scaffolder

License: GNU General Public License v3.0

LINKS

Long Interval Nucleotide K-mer Scaffolder

LINKS v2.0.1 T. Murathan Goktas, Yaman Malkoc, René L. Warren 2014-present

Contents

  1. Description
  2. Dependencies
  3. Installation
  4. Running LINKS v2.0.1
  5. What's new?
  6. Documentation
  7. Citing LINKS
  8. Credits
  9. Command and options
  10. Tips
  11. Test data
  12. Testing the Bloom filters
  13. Algorithm
  14. Output files
  15. License

Description


LINKS is a genomics application for scaffolding genome assemblies with long reads, such as those produced by Oxford Nanopore Technologies Ltd. It can scaffold high-quality draft genome assemblies with any long sequences (e.g. ONT reads, PacBio reads, other draft genomes). It is also used to scaffold contig pairs linked by ARCS/ARKS.

! NOTE: LINKS will throw an error when the number of threads (-j) exceeds 4. Please adjust accordingly while we look for a fix.

Dependencies


  • GCC (tested on v7.2.0)
  • Perl (tested on v5.32.1)
  • Autotools (if cloning directly from repository)

Installation


Installing LINKS using Conda

conda install -c bioconda -c conda-forge links

You may also run one of the following (Conda install):

conda install -c bioconda links
conda install -c "bioconda/label/cf201901" links

Installing LINKS from the source code

git clone --recursive https://github.com/bcgsc/LINKS.git
cd LINKS

Generate autotools scripts:

./autogen.sh

To install LINKS:

./configure && make install

To install LINKS in a specific directory:

./configure --prefix=/LINKS/PATH && make install

*These steps worked on a CentOS 7 system with 128 CPU Intel(R) Xeon(R) CPU E7-8867 v3 @ 2.50GHz

Running LINKS v2.0.1


All executable files (LINKS, LINKS-make, LINKS_CPP, LINKS.pl) must be accessible in a directory listed in PATH.

To run LINKS with default parameters:

LINKS -f NA1281_draft.fa -s NA1281_reads.fof
  • *LINKS v2.0.0+ is implemented in C++ and Perl. The C++ executable is src/LINKS_CPP and the Perl executable is src/LINKS.pl. LINKS-make runs the full LINKS pipeline, but because it is a Makefile it takes arguments as d=xx t=yy rather than -d xx -t yy. To keep the argument style used in previous LINKS versions (-d xx -t yy), run LINKS, a shell script that wraps LINKS-make.

What's new in v2.0.1 ?


  • bug fixes

What's new in v2.0.0 ?


  • C++ LINKS (re)implementation
  • ~5x less memory usage (vs. PERL implementation)
  • ~3x faster run time (vs. PERL implementation)
  • no need for compiling SWIG Bloom filter wrappers
  • drastically lower memory requirements enables extracting more information from reads using smaller step sizes (-t) and more distances (-d) in a single LINKS run
  • No longer supports MPET reads

What's new in v1.8.7 ?


minor bug fixes, improved documentation

What's new in v1.8.6 ?


When pipelined with the ARCS/ARKS scaffolder, LINKS v1.8.6 prioritizes paths with shorter gaps only when there are no ambiguous gap distances with the neighboring sequences considered. Adds linkage information (ID and size) to the output .gv graph. We now provide a utility (./tools/consolidateGraphs.pl) to highlight differences between two LINKS graphs. Implements -z min_size with an ARCS checkpoint.

What's new in v1.8.5 ?


When pipelined with the ARCS scaffolder, LINKS now extracts information from the tigpair_checkpoint.tsv file to generate an enhanced .gv file with additional information regarding other potential linking partners, number of supports and sequence orientation.

WHEN USING LONG READS FOR SCAFFOLDING, WE RECOMMEND THE USE OF v1.8.5 (FOR NOW), AS IT CONSUMES LESS MEMORY. THE LATEST RELEASE CONCERNS THE ARCS/ARKS PIPELINE.

What's new in v1.8.4 ?


Changed license to GPLv3

What's new in v1.8.3 ?


Fixes a bug introduced in v1.8.1 that caused sequence overuse

What's new in v1.8.2 ?


Implements the -z option, minimum contig size cutoff for scaffolding

What's new in v1.8.1 ?


Stratifies/prioritizes short-to-long distances when building the scaffold layout

What's new in v1.8 ?


Native support for iterative k-mer pair extraction at distinct length intervals

What's new in v1.7 ?


Support for scaffolding with MPET (jumping library) reads.

Support for reading compressed long sequence [reads] and assembly files.

Implemented a mid-scaffolding checkpoint to:

  • more quickly test certain parameters (-l min. links / -a min. links ratio)
  • quickly recover from crash
  • explore very large kmer spaces

What's new in v1.6.1 ?


Added a new output file, .assembly_correspondence.tsv. This human-readable correspondence file lists the scaffold ID, contig ID, original assembly contig name, orientation, #linking kmer pairs, links ratio, gap or overlap.

What's new in v1.6 ?


Incorporation of the BC Genome Sciences Centre custom Bloom filter with the fast ntHash recursive nucleotide hash function. This new data structure supports the creation of Bloom filters from large genome assemblies (tested on assemblies of 3 Gbp human and 20 Gbp white spruce).

The Bloom filter data structure swap in v1.6+ offers a ~30-fold kmer insert speed-up (~6x query speed-up) over v1.5.2, while supporting the creation of filters from large genome assembly drafts.

What's new in v1.5.2 ?


LINKS outputs a scaffold graph in gv format, with highlighted merges and edge attributes

What's new in v1.5.1 ?


Fixed a bug that prevented the creation of Bloom filters with a different false positive rate (FPR) than default. Using lower FPR does not influence scaffolding itself, only run time. For large genomes (>1Gbp), using a higher FPR is recommended when compute memory (RAM) is limiting.
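To get a rough feel for why a higher false positive rate needs less memory, the textbook sizing formula for an optimally tuned Bloom filter, m = -n ln(p) / (ln 2)^2 bits for n elements, can be evaluated as in the sketch below. This is a back-of-the-envelope estimate only: the function names are hypothetical, the k-mer count is an assumed round figure, and the filter LINKS actually builds may be sized differently.

```python
import math


def bloom_filter_bits(n_kmers: int, fpr: float) -> int:
    """Textbook size, in bits, of an optimally tuned Bloom filter.

    This is the generic formula, not necessarily what LINKS allocates.
    """
    if n_kmers <= 0 or not 0.0 < fpr < 1.0:
        raise ValueError("need n_kmers > 0 and 0 < fpr < 1")
    return math.ceil(-n_kmers * math.log(fpr) / (math.log(2) ** 2))


def bits_to_gib(bits: int) -> float:
    """Convert a bit count to gibibytes."""
    return bits / 8 / 2**30


if __name__ == "__main__":
    # Assumption: roughly one distinct k-mer per base of a 3 Gbp assembly.
    n = 3_000_000_000
    for p in (0.0001, 0.001, 0.01):
        print(f"-p {p}: ~{bits_to_gib(bloom_filter_bits(n, p)):.1f} GiB")
```

Under this formula, each tenfold increase in -p saves a fixed number of bits per k-mer, which is why raising it helps on very large assemblies.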

What's new in v1.5 ?


LINKS uses a Bloom filter to limit hashed paired k-mers to only those found in the sequence file to re-scaffold. This feature decreases RAM usage by over 60%, while the run time is nearly unchanged. When run iteratively, users can re-use Bloom filters with the -r option, which cuts run times by up to half compared to v1.3 and earlier.

What's new in v1.3 ?


Added support for fastq files. Added support for multiple long-read files. With v1.3, the reads file is not supplied directly through -s, but through a file-of-filenames instead: a text file listing the full path of each FASTA or FASTQ file on your system. The file-of-filenames supplied through the -s option may include a mixture of FASTA and FASTQ files.
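A file-of-filenames is just a text file with one full path per line, so it can be produced by hand or with a few lines of script. The helper below is a minimal sketch; write_fof is a hypothetical name, not part of LINKS.

```python
from pathlib import Path


def write_fof(read_files, fof_path):
    """Write one absolute path per line and return the number of entries."""
    paths = [Path(p).resolve() for p in read_files]
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing read files: " + ", ".join(missing))
    Path(fof_path).write_text("".join(f"{p}\n" for p in paths))
    return len(paths)
```

The resulting file is what would be passed to -s; FASTA and FASTQ paths can be mixed in the same list.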

What's new in v1.2 ?


Fixed bug that prevented reading traditional FASTA sequences (where a sequence is represented as a series of lines typically no longer than 120 characters)

What's new in v1.1 ?


Included offset option (-o option), which enables LINKS to explore a wider k-mer space range when running iteratively.

Minor fixes: IUPAC codes are now preserved.

Documentation


Refer to the LINKS-readme.txt/LINKS-readme.pdf file for instructions on running LINKS, and to the LINKS web site for information about the software and its performance: www.bcgsc.ca/bioinfo/software/links

Citing LINKS


Thank you for your Stars and for using, developing and promoting this free software!

LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads. Warren RL, Yang C, Vandervalk BP, Behsaz B, Lagman A, Jones SJ, Birol I. Gigascience. 2015 Aug 4;4:35. doi: 10.1186/s13742-015-0076-3. eCollection 2015.

Credits


LINKS: Rene Warren, Yaman Malkoc, T. Murathan Goktas

SWIG/BloomFilter.pm: Sarah Yeo, Justin Chu

https://github.com/bcgsc/bloomfilter: Justin Chu, Ben Vandervalk, Hamid Mohamadi (ntHash), Sarah Yeo

Command and options


e.g. ./LINKS -f ecoliK12_abyss_illumina_contig_baseline.fa -s K12_F2D.fof -b ecoliK12-ONT_linksSingleIterationTIG

Usage: ./LINKS [v2.0.0]
-f  sequences to scaffold (Multi-FASTA format with each sequence on a single line, required)
-s  file-of-filenames, full path to long sequence reads or MPET pairs [see below] (Multi-FASTA/fastq format, required)
-d  distance between k-mer pairs (ie. target distances to re-scaffold on. default -d 4000, optional)
	Multiple distances are separated by comma. eg. -d 500,1000,2000,3000
-k  k-mer value (default -k 15, optional)
-t  step of sliding window when extracting k-mer pairs from long reads
(default -t 2, optional)
	Multiple steps are separated by comma. eg. -t 10,5
-j  threads  (default -j 3, optional) 
-o  offset position for extracting k-mer pairs (default -o 0, optional)
-e  error (%) allowed on -d distance   e.g. -e 0.1  == distance +/- 10%
(default -e 0.1, optional)
-l  minimum number of links (k-mer pairs) to compute scaffold (default -l 5, optional) 
-a  maximum link ratio between two best contig pairs (default -a 0.3, optional)
	 *higher values lead to least accurate scaffolding*
-z  minimum contig length to consider for scaffolding (default -z 500, optional)
-b  base name for your output files (optional)
-r  Bloom filter input file for sequences supplied in -s (optional, if none provided will output to .bloom)
	 NOTE: BLOOM FILTER MUST BE DERIVED FROM THE SAME FILE SUPPLIED IN -f WITH SAME -k VALUE
	 IF YOU DO NOT SUPPLY A BLOOM FILTER, ONE WILL BE CREATED (.bloom)
-p  Bloom filter false positive rate (default -p 0.001, optional; increase to prevent memory allocation errors)
-x  Turn off Bloom filter functionality (-x 1 = yes, default = no, optional)
-v  Runs in verbose mode (-v 1 = yes, default = no, optional)


Notes:
-s K12_F2D.fof specifies a file-of-filenames (text file) listing: K12_full2dONT_longread.fa (see ./test folder)
   -f and -s : sequences must be on a SINGLE line with no linebreaks
   eg.  
   >LONGREAD-1
   AATACAATAGACGCACA...ATGAACGCAGACTTACAG
   >LONGREAD-2
   TGTGCTCTCTGTAATGTTC...ATACAGAACACGCAGCCAAGCGA

-x When turned on (-x 1), LINKS will run with a behaviour equivalent to v1.3 (no Bloom filters).  
This may be useful for large genome assembly drafts and when long reads are extremely high quality.
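Because -f and -s inputs must have each sequence on a single line, a wrapped multi-line FASTA file needs to be flattened first. The generator below is one possible way to do that; linearize_fasta is a hypothetical helper, not a LINKS tool.

```python
def linearize_fasta(lines):
    """Yield FASTA lines with each record's sequence joined onto one line.

    Blank lines are skipped; sequence text found before the first header
    is ignored.
    """
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header
                yield "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header
        yield "".join(chunks)
```

It accepts any iterable of lines, so an open file handle can be passed directly and the output written line by line without loading the whole assembly into memory.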

Tips to minimize memory usage and additional notes


The most important parameters for decreasing RAM usage are -t and -d. The largest dataset used for scaffolding by our group was a draft assembly of the white spruce genome (20 Gb)*. For this, a large sliding window step (-t 200) was used, and it was decreased as the k-mer distance -d increased. *Refer to LINKSrecipe_pglaucaPG29-WS77111.sh in the ./test folder.

Because you want to start with a low -d for scaffolding, you have to estimate how many minimum links (-l) would fit in a -d window (+/- error -e) given the sliding window step -t. For instance, it may not make sense to use -t 200 -d 500 at low coverage, but with at least 10-fold coverage it might, since, in principle, you should be able to derive sufficient k-mer pairs within the same locus if there is no bias in genome sequencing.

For re-scaffolding white spruce, only 1X coverage was available (since the re-scaffolding used a draft assembly instead of long reads), yet even -t 200 -d 5000 (1st iteration) did merge scaffolds. In theory, the -e parameter plays an important role in limiting linkages outside of the target range (-d +/- -e %). This is especially true when using raw MPET for scaffolding, to limit spurious linkages by contaminating PETs.

On the data side, reducing the coverage (using fewer long reads) and limiting the input to only the highest-quality reads would help decrease RAM usage.

In v1.5, LINKS builds a Bloom filter that comprises all k-mers of the supplied (-f) genome draft and uses it to hash only those k-mer pairs from long reads that have an equivalent in the Bloom filter. When LINKS runs iteratively, the Bloom filter built at the first iteration is re-used, saving execution time.
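The filtering idea can be illustrated with a toy Bloom filter. This is a sketch under stated assumptions, not the LINKS implementation: it hashes with SHA-256 rather than ntHash, inserts k-mers from the forward strand only, and every name in it (TinyBloomFilter, build_filter, keep_pair) is hypothetical.

```python
import hashlib


class TinyBloomFilter:
    """Toy Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, n_bits=1 << 20, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item):
        # Derive n_hashes bit positions by salting the item with a seed.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(item))


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_filter(contigs, k, **filter_args):
    """Insert all forward-strand k-mers of the draft contigs."""
    bloom = TinyBloomFilter(**filter_args)
    for seq in contigs:
        for kmer in kmers(seq.upper(), k):
            bloom.add(kmer)
    return bloom


def keep_pair(bloom, kmer1, kmer2):
    """Keep a read-derived k-mer pair only if both k-mers look like draft k-mers."""
    return kmer1 in bloom and kmer2 in bloom
```

Pairs rejected by keep_pair never reach the hash table, which is where the memory saving comes from; the occasional false positive only costs a little memory, not correctness.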

In v1.8, users may input multiple distances using the -d parameter, separating each distance by a comma, e.g. -d 500,1000,2000,4000,5000 will extract k-mer pairs at these five distance intervals.

Similarly, the window step size now accepts multiple integers, each separated by a comma and with the order matching that in -d. However, the list can be shorter, in which case the last valid -t is propagated to subsequent distances for which no step is defined, e.g. with -t 20,10,5, step sizes of 20, 10, 5, 5 and 5 bp will be used when exploring the five distances supplied above.
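The propagation rule for a short -t list can be written in a few lines. The sketch below only mirrors the behaviour described above; the function names are hypothetical, and rejecting a -t list longer than the -d list is an assumption, since the text does not say what LINKS does in that case.

```python
def parse_int_list(text):
    """Parse a comma-separated option value such as '500,1000,2000'."""
    return [int(item) for item in text.split(",")]


def expand_steps(distances, steps):
    """Pair each distance with a step, repeating the last step as needed."""
    if not distances or not steps:
        raise ValueError("need at least one distance and one step")
    if len(steps) > len(distances):
        # Assumption: treated as an error here; LINKS may behave differently.
        raise ValueError("more steps than distances")
    padded = list(steps) + [steps[-1]] * (len(distances) - len(steps))
    return list(zip(distances, padded))


if __name__ == "__main__":
    pairs = expand_steps(parse_int_list("500,1000,2000,4000,5000"),
                         parse_int_list("20,10,5"))
    for distance, step in pairs:
        print(f"-d {distance} uses -t {step}")
```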

In v1.8, a single round of scaffolding is done, using the kmer pair space extracted at the specified distances. Accordingly, v1.8 is not expected to yield the exact same results as separate iterative LINKS runs. For users comfortable with the original set up, no change is needed to make use of the previous LINKS behavior.

Simultaneous exploration of vast kmer space is expected to yield better scaffolding results.

WARNING: Specifying many distances will require large amount of RAM, especially with low -t values.

Test data


Go to ./test

(cd test)

run:

-------------------------------------
./runme_EcoliK12single.sh

The script will download the baseline E. coli ABySS scaffold assembly and full 2D ONT reads (Quick et al 2014) and use the latter to re-scaffold the former, with default parameters (Table 1D in paper). 
NEED ~8GB RAM WITH CURRENT PARAMETERS. Increase (-t) to use less RAM.

./runme_EcoliK12singleMPET.sh will scaffold using E. coli K12 MPET reads
(~42-90GB RAM for trimmed vs raw MPET)

-------------------------------------
./runme_EcoliK12iterative.sh

The script will download the baseline E. coli ABySS scaffold assembly and full 2D ONT reads (Quick et al 2014) and use the latter to re-scaffold the former, iteratively 30 times, increasing the distance between k-mer pairs at each iteration (Table 1F in paper). 
NEED ~16GB RAM WITH CURRENT PARAMETERS. Increase (-t) to use less RAM.

-------------------------------------
./runme_ScerevisiaeW303iterative.sh

This script will download the S. cerevisiae W303 raw ONT long reads and use them iteratively to scaffold a baseline Illumina MiSeq assembly (both data sets from http://schatzlab.cshl.edu/data/nanocorr/).  You will need a computer with at least 132GB RAM.  This process was clocked at 6:08:21 (h:mm:ss wall clock) and used 118GB RAM on an Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz 16 dualcore (but running on a single CPU). (Fig 1 in the main ms, FigS8 in preprint). 
NEED <132GB RAM WITH CURRENT PARAMETERS. Increase (-t) to use less RAM.

-------------------------------------
./runme_ScerevisiaeS288citerative.sh

This script will download the S. cerevisiae W303 raw ONT long reads and use
them iteratively to scaffold a baseline ABySS assembly of Illumina data.  You will need a computer with at least 132GB RAM.
(Fig 1 in the main ms, FigS8 in preprint).
NEED <132GB RAM WITH CURRENT PARAMETERS. Increase (-t) to use less RAM.

-------------------------------------
./runme_StyphiH58iterative.sh

This script will download the S. typhi H58 2D ONT long reads and use
them iteratively to scaffold a baseline assembly of Illumina data (both from
Ashton,P.M. 2015. Nat.Biotechnol.33,296–300). You will need a computer with at least 132GB RAM.
(Fig 1 in the main ms, FigS8 in preprint).

-------------------------------------
or 
./runall.sh
This script will run ALL of the above examples.

Additional info:

The file: LINKSrecipe_pglaucaPG29-WS77111.sh is provided to show the re-scaffolding recipe used to produce a re-scaffolded white spruce (P. glauca) genome assembly.

The files LINKSrecipe_athaliana_ectools.sh and LINKSrecipe_athaliana_raw.sh are provided to show the re-scaffolding of the A. thaliana high-quality genome draft using ECTools-corrected or raw Pacific Biosciences reads.

Testing the Bloom filters


# To test insertions:
cd tools
./writeBloom.pl
Usage: ./writeBloom.pl
-f  sequences to scaffold (Multi-FASTA format, required)
-k  k-mer value (default -k 15, optional)
-p  Bloom filter false positive rate (default -p 0.0001, optional - increase to prevent memory allocation errors)

# To test queries:
cd tools
./testBloom.pl
Usage: ./testBloom.pl
-f  sequences to test (Multi-FASTA format, required)
-k  k-mer value (default -k 15, optional)
-r  Bloom filter file

Algorithm


Process: nanopore/long reads are supplied as input (-s option, fasta/fastq format) and k-mer pairs are extracted using a user-defined k-mer length (-k) and distance between the 5’-ends of each pair (-d) over a sliding window (-t). Unique k-mer pairs at set distance are hashed. Fasta sequences to scaffold are supplied as input (-f), and are shredded to k-mers on both strands, tracking the [contig] sequence of origin, k-mer positions and frequencies of observation.
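The extraction step can be sketched as follows. This is an illustration, not the LINKS source: it assumes that each pair is taken from a window of d bases, using the first k bases as-is and the last k bases reverse-complemented (so that the two 5' ends are d apart, as with paired reads), and it does not handle IUPAC codes. The function names are hypothetical; the defaults match the documented -k, -d, -t and -o defaults.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a plain A/C/G/T sequence (no IUPAC handling)."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_kmer_pairs(read, k=15, d=4000, t=2, offset=0):
    """Yield (start, kmer1, kmer2) for windows of d bases taken every t bases.

    kmer1 is the first k bases of the window; kmer2 is the last k bases,
    reverse-complemented (an assumed layout, see the text above).
    """
    if k <= 0 or t <= 0 or d < k:
        raise ValueError("need k > 0, t > 0 and d >= k")
    for start in range(offset, len(read) - d + 1, t):
        window = read[start:start + d]
        yield start, window[:k], reverse_complement(window[-k:])
```

A read shorter than d yields no pairs, and a larger step t yields proportionally fewer pairs per read, which is why raising -t lowers memory use.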

Algorithm: LINKS has two main stages, a contig pairing and a scaffold layout phase. Cycling through each k-mer pair, k-mers that are uniquely placed on contigs are identified, and putative contig pairs are formed if k-mers are placed on different contigs. Contig pairs are only considered if the calculated distances between them satisfy the mean distance provided (-d) while allowing for a deviation (-e). Contig pairs having a valid gap or overlap are allowed to proceed to the scaffolding stage. Contigs in pairs may be ambiguous: a given contig may link to multiple contigs. To mitigate, the number of spanning k-mer pairs (links) between any given contig pair is recorded, along with a mean putative gap or overlap. Once pairing between contigs is complete, the scaffolds are built using contigs as seeds. Contigs are used in turn until all have been incorporated into a scaffold. Scaffolding is controlled by merging sequences only when a minimum number of links (-l) join two contig pairs, and when links are dominant compared to that of another possible pairing (-a). The predecessor of LINKS is the unpublished scaffolding engine in the widely-used SSAKE assembler (Warren et al. 2007), and foundation of the SSPACE-LongRead scaffolder (Boetzer and Pirovano, 2014).
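The distance check in the pairing stage amounts to a tolerance window around -d. A minimal sketch, assuming the window is simply d +/- (e x d) as the -e option description suggests; how LINKS derives the observed distance from k-mer positions is not shown, and within_tolerance is a hypothetical name.

```python
def within_tolerance(observed, d=4000, e=0.1):
    """Return True if an observed distance lies within d +/- (e * d)."""
    if d <= 0 or e < 0:
        raise ValueError("need d > 0 and e >= 0")
    return abs(observed - d) <= e * d
```

With the defaults (-d 4000, -e 0.1), distances between roughly 3600 and 4400 bp pass the check.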

Output: A summary of the scaffold layout is provided (.scaffolds) as a text file and captures the linking information of successful scaffolds. A fasta file (.scaffolds.fa) is generated using that information, placing sized N-pads for gaps and a single “n” in cases of overlaps between contigs. A log summary of k-mer pairing in the assembly is provided (.log) along with a text file describing possible issues in pairing (.pairing_issues) and pairing distribution (.pairing_distribution.csv).

Consider the following contig pairs (AB, AC and rAD):

    A         B
========= ======== 
  ->       <-
   ->        <-
    ->      <-
       ->       <-

    A       C
========= ======
  ->        <-
    ->        <-

   rA        D           equivalent to rDA, in this order
========= =======
      ->   <-
     ->   <-
       ->   <-

Two parameters control scaffolding (-l and -a). The -l option specifies the minimum number of links (k-mer pairs) a valid contig pair MUST have to be considered. The -a option specifies the maximum ratio between the best two contig pairs for a given seed/contig being extended. For example, contig A shares 4 links with B and 2 links with C, in this orientation. Contig rA (reverse) also shares 3 links with D. When it's time to extend contig A (with the options -l and -a set to 2 and 0.7, respectively), both contig pairs AB and AC are considered. Since C (second-best) has 2 links and B (best) has 4, the ratio is 2/4 = 0.5, which is below the maximum ratio of 0.7, so A will be linked with B in the scaffold and C will be kept for another extension. If AC had 3 links, the resulting ratio (3/4 = 0.75), above the user-defined maximum of 0.7, would have caused the extension to terminate at A, with both B and C considered for a different scaffold. A maximum links ratio of 1 (not recommended) means that the best two candidate contig pairs may have the same number of links; LINKS will accept the first one since both have a valid gap/overlap. When a scaffold extension is terminated on one side, the scaffold is extended on the "left", by looking for contig pairs that involve the reverse of the seed (in this example, rAD). With AB and AC having 4 and 2 links, respectively, and rAD being the only pair on the left, the final scaffolds output by LINKS would be:

  1. rD-A-B
  2. C
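The -l / -a decision in the worked example above can be condensed into a small function. This is a sketch of the rule as described in the text, not the LINKS code; choose_extension is a hypothetical name, and applying the -l cutoff before computing the ratio is an assumption.

```python
def choose_extension(candidates, min_links=5, max_ratio=0.3):
    """Pick the neighbour to extend into, or None to stop the extension.

    candidates maps a neighbouring contig name to its number of links.
    """
    # Assumption: pairs below the -l cutoff are not considered at all.
    eligible = {name: links for name, links in candidates.items()
                if links >= min_links}
    if not eligible:
        return None
    # Stable sort: on ties, the first candidate listed stays first.
    ranked = sorted(eligible.items(), key=lambda item: item[1], reverse=True)
    best_name, best_links = ranked[0]
    if len(ranked) == 1:
        return best_name
    second_links = ranked[1][1]
    if second_links / best_links <= max_ratio:
        return best_name
    return None


if __name__ == "__main__":
    print(choose_extension({"B": 4, "C": 2}, min_links=2, max_ratio=0.7))
    print(choose_extension({"B": 4, "C": 3}, min_links=2, max_ratio=0.7))
```

The first call reproduces the AB merge (ratio 0.5) and the second reproduces the terminated extension (ratio 0.75).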

LINKS outputs a .scaffolds file with linkage information between contigs (see "Understanding the .scaffolds csv file" below). Accurate scaffolding depends on many factors. The number and nature of repeats in your target sequence, the tuning of distance (-d), deviation on the distance (-e), k-mer size (-k), minimum number of links (-l) and link ratio (-a), and data quality will all affect LINKS's ability to build scaffolds.

NOTE: IT IS ADVISED TO RUN LINKS WITH SMALLER DISTANCES (-d) FIRST, ESPECIALLY WHEN ASSEMBLIES ARE VERY FRAGMENTED.

MPET INPUT (deprecated in v2.0.0+)


In v1.7, a new option (-m) instructs LINKS that the long-read source (-s) is MPET. Users should prepare their input as shown in ./test/runme_EcoliK12singleMPET.sh

The MPET input is a custom format akin to FASTA and the sequence record must consist of read1:read2

>template
ACGACACATCTACGCAGCGACGACGATAAATATAC:ATCAGCACAGCGACGCAGCGACAGCAGGACGACGAC

NOTES:

  • Paired MPET reads are supplied in their original outward orientation <- ->
  • MPET sequences do not need to be trimmed (the Bloom filter will take care of eliminating erroneous kmers not found in the assembly)
  • You CANNOT combine MPET and long reads simultaneously in the same LINKS process
  • You may trim or process MPET reads if you wish (eg. with NxTrim), but remember to supply resulting MPETs in their original, outward-facing configuration (ie. <- ->). The script in ./tools/makeMPETOutput2EQUALfiles.pl does that for you.

The default behaviour is to extract kmer pairs from long-read FASTA/FASTQ files specified in -s.

Alternatively, when set to the MPET read length, the -m option will signal LINKS to extract kmer pairs across a distance set in -d, for each MPET pair in the files supplied under -s.

When doing so, ensure that -t is set to extract at least ~5 kmer pairs per MPET pair. As a rule of thumb, -l should be set to at least double that value (-l 10 in this case).

Preparing the MPET input (deprecated in v2.0.0+)


For each fastq MPET file, convert to fasta:
 gunzip -c EcMG1_S7_L001_R1_001.fastq.gz | perl -ne '$ct++;if($ct>4){$ct=1;}print if($ct<3);' > mpet4k_1.fa
 gunzip -c EcMG1_S7_L001_R2_001.fastq.gz | perl -ne '$ct++;if($ct>4){$ct=1;}print if($ct<3);' > mpet4k_2.fa
 gunzip -c set1_R1.mp.fastq.gz | perl -ne '$ct++;if($ct>4){$ct=1;}print if($ct<3);' > trimmedmpet4k_1.fa
 gunzip -c set1_R2.mp.fastq.gz | perl -ne '$ct++;if($ct>4){$ct=1;}print if($ct<3);' > trimmedmpet4k_2.fa

Generate the paired input (refer to the tools folder):
Usage: ./makeMPETOutput2EQUALfiles.pl <fasta file 1> <fasta file 2> <optional orientation: 1=PET (-><-)>
** fasta files must have the same number of records and be arranged in the same order

RAW: ../tools/makeMPETOutput2EQUALfiles.pl mpet4k_1.fa mpet4k_2.fa       
TRIMMED: ../tools/makeMPETOutput2EQUALfiles.pl trimmedmpet4k_1.fa trimmedmpet4k_2.fa 1

echo mpet4k_1.fa_paired.fa > mpet.fof
echo trimmedmpet4k_1.fa_paired.fa > trimmedmpet.fof

Output files


Output files and their descriptions:

  • .log: text file; logs execution time / errors / pairing stats
  • .pairing_distribution.csv: comma-separated file; 1st column is the calculated distance for each pair (template) with reads that assembled logically within the same contig. 2nd column is the number of pairs at that distance
  • .pairing_issues: text file; lists all pairing issues encountered between contig pairs and illogical/out-of-bounds pairing
  • .scaffolds: comma-separated file; see below
  • .scaffolds.fa: fasta file of the new scaffold sequence
  • .bloom: Bloom filter created by shredding the -f input into k-mers of size -k
  • .gv: scaffold graph (for visualizing merges), can be rendered in neato, graphviz, etc
  • .assembly_correspondence.tsv: correspondence file; lists the scaffold ID, contig ID, original_name, #linking kmer pairs, links ratio, gap or overlap
  • .simplepair_checkpoint.tsv: checkpoint file; contains info to rebuild the data structure for the .gv graph
  • .tigpair_checkpoint.tsv: if -b BASENAME.tigpair_checkpoint.tsv is present, LINKS will skip the kmer pair extraction and contig pairing stages. Delete this file to force LINKS to start at the beginning. This file can be used to: 1) quickly test parameters (-l min. links / -a min. links ratio), 2) quickly recover from a crash, 3) explore very large kmer spaces, 4) scaffold with output of ARCS

Interpreting .assembly_correspondence.tsv


This human-readable correspondence file lists the scaffold ID, contig ID, original assembly contig name, contig/sequence orientation, #linking kmer pairs, links ratio, gap or overlap(-) in this order

Interpreting the graph / .gv file


  1. Vertices correspond to the sequences being considered for scaffolding, with the LINKS re-numbered sequences displayed in each vertex (unlinked sequences are not shown)

  2. Edges are drawn between vertices when there is evidence for linking scaffolds (even if they are not ultimately scaffolded)

  3. Only vertices/scaffolds highlighted in blue satisfied the user-specified scaffold criteria (-l and -a parameters, and satisfied logic/distance). These are scaffolded in the final LINKS output

  4. Each edge in the graph will have 3 types of information (l=,g=,type=)

l=:number of kmer pairs linking any two vertices/sequences

g=:estimated gap or overlap (-) length between any two sequences

type=:refers to the orientation of the sequences (forward=1,reverse=0)

Understanding the .scaffolds csv file


scaffold1,7484,f127Z7068k12a0.58m42_f3090z62k7a0.14m76_f1473z354

column 1: a unique scaffold identifier
column 2: the sum of all contig sizes that made it to the scaffold/supercontig
column 3: a contig chain representing the layout:

e.g. f127Z7068k12a0.58m42_f3090z62k7a0.14m76_f1473z354

means: contig f127 (strand=f/+), size (z) 7068 (Z if the contig was used as the seed sequence), has 12 links (k) and a link ratio of 0.58 (a), with a mean gap of 42 nt (m), to contig 3090 (strand f, size 62) on its right; an r prefix would denote the reverse strand. Negative m values mean that a possible overlap between the contigs was calculated from the mean distance supplied by the user and the positions of the reads flanking the contig. Because the pairing distance distribution usually follows a Normal/Gaussian distribution, some distances are expected to be larger than the median size expected/observed. In reality, if the exact distance between each read pair were known, we would not expect many negative m values unless a break occurred during contig extension (likely due to base errors/SNPs).
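The contig chain in column 3 is regular enough to parse with a short script. The parser below is a sketch based only on the single example line shown above; the function names and dictionary keys are hypothetical, and it assumes gap values are integers.

```python
import re

# One chain element: strand, contig id, z/Z + size, then optional k/a/m values.
_CONTIG = re.compile(
    r"(?P<strand>[fr])(?P<contig>\d+)(?P<size_tag>[zZ])(?P<size>\d+)"
    r"(?:k(?P<links>\d+)a(?P<ratio>\d*\.?\d+)m(?P<gap>-?\d+))?"
)


def parse_chain(chain):
    """Split a contig chain on '_' and decode each element."""
    entries = []
    for token in chain.split("_"):
        match = _CONTIG.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised chain element: {token!r}")
        has_link = match["links"] is not None
        entries.append({
            "strand": match["strand"],
            "contig": int(match["contig"]),
            "size": int(match["size"]),
            "is_seed": match["size_tag"] == "Z",
            "links": int(match["links"]) if has_link else None,
            "ratio": float(match["ratio"]) if has_link else None,
            "gap": int(match["gap"]) if has_link else None,
        })
    return entries


def parse_scaffold_line(line):
    """Return (scaffold id, total contig size, decoded chain) for one line."""
    scaffold_id, total_size, chain = line.strip().split(",", 2)
    return scaffold_id, int(total_size), parse_chain(chain)
```

For the example line, the three contig sizes (7068, 62 and 354) sum to 7484, the value in column 2.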

License


LINKS Copyright (c) 2014-present British Columbia Cancer Agency Branch. All rights reserved.

LINKS is released under the GNU General Public License v3

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

For commercial licensing options, please contact Patrick Rebstein [email protected]

links's Issues

working with 2 Gb genome; stuck on bloom filter being built

Hi Rene,
I've tried submitting a LINKS job using a few different parameters (including your recent suggestion of modifying the -t and -d parameters), but unfortunately my observations remain the same: the program is stuck on building the bloom filter for at least 24 hours. Perhaps it needs even longer before any additional messages are written to the .log file?

I have access to a variety of machines on our computer cluster with anywhere from 128 Gb to 995 Gb of RAM. I've been using the more readily available smaller machines to run LINKS so far, but I don't get a sense that the issue is memory at the moment.

I'm working with a 2 Gb genome and trying to close up an existing assembly with Nanopore reads. I have about 30 Gb of Nanopore data thus far. The two commands I've tried which left me with the same sort of holding pattern on the bloom filter portion were:

(default parameters)

LINKS -f {my.fasta} -s {myreadslist}

then following your suggestions, as well as modifying the kmer size

LINKS -f {my.fasta} -s {myreadslist} -t 50 -d 5000 -k23

Perhaps I need to provide more memory? I've attached a typical .log file that is generated below. Thank you for any information you can provide which might help me get over the bloom filter step.

Running: /mnt/lustre/software/linuxbrew/colsa/bin/LINKS [v1.8.6]
-f mylu.fa
-s readsfile
-m 
-d 5000
-k 23
-e 0.1
-l 5
-a 0.3
-t 50
-o 0
-z 500
-b mylu
-r 
-p 0.001
-x 0

----------------- Verifying files -----------------

Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run1/workspace/pass/fastq_runid_71710fab0f2bebd2a96ab42e073d0292ec138392.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run1/workspace/pass/fastq_runid_fa6f064a9f2c29d5f8dcdd435c5cec74e8bf3a04.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run1/workspace/pass/fastq_runid_8a1d68cce34fae5417a223dfd574649ed98d966a.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run1/workspace/pass/fastq_runid_6642b51f73438e475c0b83bd2fd69e99a632038e.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run1/workspace/pass/fastq_runid_f77e1c24e730f2e852318bbe697ec6c8420a1a85.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run1/workspace/pass/fastq_runid_76a6fa268897bf1a1be7a57f36b1fa6d4b6e9c24.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run1/workspace/pass/fastq_runid_4062f55aba7613d771243721850b42921b756677.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run1/workspace/pass/fastq_runid_93db73a2dc228333583e0ebe7acc862ca86ef30f.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run1/workspace/pass/fastq_runid_00f228b639289da986865a2d08a3a26f0d8aa592.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run1/workspace/pass/fastq_runid_58b24ffcdefabc7d926abbfc0522bb0c7800989f.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run2/workspace/pass/fastq_runid_5e30f9268bf9cb4bfd5563130c37f4027d287302.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run2/workspace/pass/fastq_runid_6ddb3559f7f102736fccb5ad6f4ec8e0d3bbe70d.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run2/workspace/pass/fastq_runid_f8b124142fd75e2fc0fcd28fe9e9d4bd944b0d02.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run2/workspace/pass/fastq_runid_d049807b20a620d75ac2de2c114428519c6c1942.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run2/workspace/pass/fastq_runid_0cb08f98dc433c79323d84e38d2f88c86a13e2f9.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run2/workspace/pass/fastq_runid_96d81b473fa5f5ca9e18c0ee474cb65ebb4d0ba7.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run2/workspace/pass/fastq_runid_781736dad98b46b7f47e23a496400586cdc75822.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run2/workspace/pass/fastq_runid_afa5caddbf6e48ff1c9dd653cdc72ee99e4b7b30.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_9b35438e109c2ee6882da220324f3d410586b708.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_ff1e252c580268da42ea57ecb730cfcc39503e87.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_1e4c06d79f4dd9b73d2ed6c270bc31dc5d28e964.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_c279c1faf060eb3dab57a9236d0c086bb03e8840.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_e41283aab878f4ab890838b5799bc5a93fb19531.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_72905c435ba6356f56ec14b83d0e428ad3976a56.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_7bce09f51eff449030d198d1fd8431b66e42f17a.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_1ea55c258d63be75be732bcf3f7fa9d0c4f48668.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_c2b5f2811d94cd980a537281e83f2fecb0afd047.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_4413d6ee11a8d8e92f21b8601659b5012a1c1e81.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_06ec875936967e6a407fed3a8723a65fb98a681f.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_5f12e1c0c79f2f9b0d0c33ad82b7947c5d2fba7e.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_40dffa7bdb63e62b02bf16e3c2d32b409e749061.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_272b541d7da883ee8680cdbcc19e9eef569cb82e.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_de629cff38f9c6be60a7caa350020c4bbd1915c1.fastq...ok
Checking /mnt/lustre/macmaneslab/devon/nanoPore/data/reads/mylu/bat13/albout/run3/workspace/pass/fastq_runid_b5c1afe9d8c9f013ecae46acd988f7945a4fb035.fastq...ok
Checking sequence target file mylu.fa...ok


=>Reading contig/sequence assembly file : Mon Jun 18 11:42:15 EDT 2018
Building a Bloom filter using 23-mers derived from sequences in -f mylu.fa...

LINKS stuck at first contig

It seems that the LINKS run is stuck processing the first contig. The LINKS executable is running on a node with 128 GB RAM; it has been using 6 GB RAM for over 4 hours and is still on the first contig. The test cases run through without errors on the same machine.

=>Reading contig/sequence assembly file : Sun Aug 12 11:57:49 PDT 2018
Building a Bloom filter using 17-mers derived from sequences in -f fasta...


Bloom filter specs
elements=2925948358
FPR=0.001
size (bits)=42068078784
hash functions=9


Contigs (>= 5000 bp) processed k=17:
1
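
The Bloom filter specs in the log above can be sanity-checked against the textbook sizing formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The sketch below is an assumption about how the numbers arise, not LINKS code: the function names are hypothetical, and both the rounding up to a multiple of 64 bits and the rounding down of k are inferred from the logged values. Under those assumptions it reproduces the logged size and hash count, and the filter alone comes to roughly 5 GB, which is in line with the 6 GB of RAM reported.

```python
import math


def bloom_size_bits(elements, fpr, word_bits=64):
    """Textbook Bloom filter size in bits for `elements` items at false
    positive rate `fpr`, rounded up to a whole number of words.
    The 64-bit word rounding is an assumption inferred from the log."""
    raw_bits = -elements * math.log(fpr) / (math.log(2) ** 2)
    return math.ceil(raw_bits / word_bits) * word_bits


def bloom_num_hashes(elements, size_bits):
    """Textbook number of hash functions, rounded down (assumption)."""
    return max(1, int(size_bits / elements * math.log(2)))


# Values taken from the "Bloom filter specs" block in the log above.
elements = 2925948358
fpr = 0.001

size_bits = bloom_size_bits(elements, fpr)
hashes = bloom_num_hashes(elements, size_bits)
size_gb = size_bits / 8 / 1e9  # bits -> bytes -> gigabytes

print(f"size (bits)={size_bits}")
print(f"hash functions={hashes}")
print(f"approx. memory for the filter alone: {size_gb:.2f} GB")
```

Since the filter size depends only on the k-mer count and -p, this kind of estimate can be made before a run to judge whether a node has enough memory.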

Very small contig parameters

I'm looking for some advice on how to reduce the number of small contigs.

I'm running LINKS on an ABySS assembly that is highly fragmented (3.8 million contigs, 1.7kb N50, and ~2Gbp genome size). I have corrected ONT reads at roughly 7x coverage. Over half of my contigs are <250bp. So far I've set k=15, l=4 and started with d=1000 and t=10. This reduced the contig number by 100,000 but that's a drop in the bucket for 3.8 million. Theoretically though with these parameters, a 100bp contig has a small chance of being scaffolded. Which parameters are best to tune first in order to scaffold up small contigs?

Given that I have corrected long reads, would it be safer to reduce -l or would starting with a smaller -d be better?

License issue

I started to look into updating the bioconda recipe for LINKS. Looking through the code I realized that LINKS is using GPL2 while the included bcgsc version of bloomfilter still uses a custom bcca license. I would be happy if you could look into this as those two licenses are not compatible.

Illumina contig scaffolding with PacBio reads

I have a fastq file with draft genome contigs assembled by ABySS from Illumina paired-end reads and 10x Genomics reads. I would like to scaffold these contigs with PacBio reads (about 30x coverage) using LINKS. Should I correct and trim the PacBio reads before running LINKS? I did not find any information about this in the manual. What computing resources (CPU cores, memory and free disk space) are recommended for a 4 Gb de novo plant genome?

No links using ONT

I am trying to scaffold a relatively fragmented Illumina assembly using ONT reads. I have run several iterations with different parameter values, yet I find no links between the contigs.
./LINKS -f $genome -s $nano -b Nano -d 500,1000,2000,3000,4000,5000,6000,7000 -e $e
The ONT reads are definitely at less than 10X coverage.
What do you think might be causing this?

===========PAIRED K-MER STATS===========
Total number of pairs extracted from -s /home/nano.txt: 38852317
At least one sequence/pair missing from contigs: 0
Ambiguous kmer pairs (both kmers are ambiguous): 38828431
Assembled pairs: 0 (0 sequences)
Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 0
Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 0
Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 0
---
Satisfied in distance/logic within a given contig pair (pre-scaffold): 0
Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 0
---
Total satisfied: 0 unsatisfied: 0
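
In the stats above nearly every extracted pair is counted as ambiguous, so nothing is left to assemble. A minimal sketch for pulling these counters out of a LINKS log and computing that fraction is given below; the function name and the regular expressions are hypothetical and only match the line wording shown in this log.

```python
import re


def parse_pair_stats(log_text):
    """Extract a few counters from the PAIRED K-MER STATS block.
    The patterns follow the wording of the log lines quoted above."""
    patterns = {
        "total": r"Total number of pairs extracted from -s [^\n]*?: (\d+)",
        "ambiguous": r"Ambiguous kmer pairs \(both kmers are ambiguous\): (\d+)",
        "assembled": r"Assembled pairs: (\d+)",
    }
    stats = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, log_text)  # first occurrence only
        if match:
            stats[name] = int(match.group(1))
    return stats


# Lines copied from the log in this report.
log_text = """\
Total number of pairs extracted from -s /home/nano.txt: 38852317
At least one sequence/pair missing from contigs: 0
Ambiguous kmer pairs (both kmers are ambiguous): 38828431
Assembled pairs: 0 (0 sequences)
"""

stats = parse_pair_stats(log_text)
ambiguous_fraction = stats["ambiguous"] / stats["total"]
print(stats)
print(f"ambiguous fraction: {ambiguous_fraction:.4%}")
```

Tracking this fraction across runs with different -k values is one way to see whether a parameter change actually reduces ambiguity.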

AGP file Generation

Hi,
I used LINKS 1.8.5 to scaffold a 100Mb genome and am very happy with the results. However, I would like to generate an AGP file. Is there a way to do so with the available output files?

LINKS finding no kmer pairs on 10X data

We have a fly genome assembled from PacBio that we treated with Tigmint and want to scaffold with ARKS and LINKS using 10X data.

I have run the commands presented in "infos_for_message_to_arks_and_links_authors.md". In this file, I also show the head/tail of important steps:
infos_for_message_to_arks_and_links_authors.txt

Here are also both the ARKS and LINKS logs attached:
log_arks_13956097.txt
links_scaffolds_aa.log.txt

In the end, LINKS "Extracted 0 (zero) 30-mer pairs" and there is of course no scaffolding.

We are using 10X data to scaffold, which was prepared with longranger basic, ARKS 1.0.2, and LINKS 1.8.6 (I also tried 1.8.5).

I cannot track down the reason why zero 30-mer pairs are used. I would be much obliged if you could help me track my error.

Installation problem on Debian 9 - with solution

When trying to install links 1.8.6 on Debian 9 (Stretch), I observed the following problems.

  • Trying to use the binary failed with this error:
LINKS-1.8.6/releases/links_v1.8.6$ ./LINKS
Can't load
'/localhome/schloegl/src/links-c.build/LINKS-1.8.6/releases/links_v1.8.6/./lib/bloomfilter/swig/BloomFilter.so'
for module BloomFilter:
/localhome/schloegl/src/links-c.build/LINKS-1.8.6/releases/links_v1.8.6/./lib/bloomfilter/swig/BloomFilter.so:
undefined symbol: PL_stack_sp at
/usr/lib/x86_64-linux-gnu/perl/5.24/DynaLoader.pm line 187.
 at
/localhome/schloegl/src/links-c.build/LINKS-1.8.6/releases/links_v1.8.6/./lib/bloomfilter/swig/BloomFilter.pm
line 11.
Compilation failed in require at ./LINKS line 26.
BEGIN failed--compilation aborted at ./LINKS line 26.

  • Trying to compile LINKS according to the instructions
  g++ -c BloomFilter_wrap.cxx -I/usr/lib64/perl5/CORE -fPIC -Dbool=char -O3
  g++ -Wall -shared BloomFilter_wrap.o -o BloomFilter.so -O3

failed with

In file included from /usr/include/c++/6/cmath:42:0,
                  from /usr/include/c++/6/math.h:36,
                  from BloomFilter_wrap.cxx:758:
/usr/include/c++/6/bits/cpp_type_traits.h:145:12: error: redefinition
of ‘struct std::__is_integer<char>’
      struct __is_integer<char>
             ^~~~~~~~~~~~~~~~~~
/usr/include/c++/6/bits/cpp_type_traits.h:138:12: error: previous
definition of ‘struct std::__is_integer<char>’
      struct __is_integer<bool>
             ^~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/6/exception:172:0,
                  from /usr/include/c++/6/stdexcept:38,
                  from BloomFilter_wrap.cxx:1568:
/usr/include/c++/6/bits/exception_ptr.h: In member function
‘std::__exception_ptr::exception_ptr::operator char() const’:
/usr/include/c++/6/bits/exception_ptr.h:143:16: error: invalid
conversion from ‘void*’ to ‘char’ [-fpermissive]
        { return _M_exception_object; }
                 ^~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/6/bits/move.h:57:0,
                  from /usr/include/c++/6/bits/nested_exception.h:40,
                  from /usr/include/c++/6/exception:173,
                  from /usr/include/c++/6/stdexcept:38,
                  from BloomFilter_wrap.cxx:1568:
/usr/include/c++/6/type_traits: At global scope:
/usr/include/c++/6/type_traits:224:12: error: redefinition of ‘struct
std::__is_integral_helper<char>’
      struct __is_integral_helper<char>
             ^~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/c++/6/type_traits:220:12: error: previous definition of
‘struct std::__is_integral_helper<char>’
      struct __is_integral_helper<bool>
             ^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/6/exception:173:0,
                  from /usr/include/c++/6/stdexcept:38,
                  from BloomFilter_wrap.cxx:1568:
/usr/include/c++/6/bits/nested_exception.h: In member function ‘void
std::nested_exception::rethrow_nested() const’:
/usr/include/c++/6/bits/nested_exception.h:75:11: error: invalid
user-defined conversion from ‘const
std::__exception_ptr::exception_ptr’ to ‘bool’ [-fpermissive]
        if (_M_ptr)
            ^~~~~~
In file included from /usr/include/c++/6/exception:172:0,
                  from /usr/include/c++/6/stdexcept:38,
                  from BloomFilter_wrap.cxx:1568:
/usr/include/c++/6/bits/exception_ptr.h:142:16: note: candidate is:
std::__exception_ptr::exception_ptr::operator char() const <near match>
        explicit operator bool() const
                 ^~~~~~~~
/usr/include/c++/6/bits/exception_ptr.h:142:16: note:   return type
‘char’ of explicit conversion function cannot be converted to ‘bool’
with a qualification conversion
In file included from /usr/include/c++/6/bits/basic_string.h:5643:0,
                  from /usr/include/c++/6/string:52,
                  from /usr/include/c++/6/stdexcept:39,
                  from BloomFilter_wrap.cxx:1568:
/usr/include/c++/6/bits/functional_hash.h: At global scope:
/usr/include/c++/6/bits/functional_hash.h:111:3: error: redefinition
of ‘struct std::hash<char>’
    _Cxx_hashtable_define_trivial_hash(char)
    ^
/usr/include/c++/6/bits/functional_hash.h:108:3: error: previous
definition of ‘struct std::hash<char>’
    _Cxx_hashtable_define_trivial_hash(bool)
    ^
In file included from /usr/include/c++/6/bits/uniform_int_dist.h:35:0,
                  from /usr/include/c++/6/bits/stl_algo.h:66,
                  from /usr/include/c++/6/algorithm:62,
                  from BloomFilter_wrap.cxx:1610:
/usr/include/c++/6/limits:451:12: error: redefinition of ‘struct
std::numeric_limits<char>’
      struct numeric_limits<char>
             ^~~~~~~~~~~~~~~~~~~~
/usr/include/c++/6/limits:382:12: error: previous definition of
‘struct std::numeric_limits<char>’
      struct numeric_limits<bool>
             ^~~~~~~~~~~~~~~~~~~~

The default compiler in Debian 9 is g++ v6.3.

To address these problems, the following must be considered:

  • setting the proper Perl configure path, and
  • using g++-4.9 or
  • omitting the argument "-Dbool=char"

In summary, the following commands enabled the compilation on Debian 9:

    swig -Wall -c++ -perl5 BloomFilter.i 
    g++ -c BloomFilter_wrap.cxx -I$(perl -e 'use Config; print $Config{archlib};')"/CORE" -fPIC -O3
    g++ -Wall -shared BloomFilter_wrap.o -o BloomFilter.so -O3

BTW, "-Dbool=char" does not seem to be needed with gcc-4.9 either.

I'd suggest including this information in the installation documentation, so others do not need to figure this out again.
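
The three commands above can also be generated from a script, so that the Perl include path is never hard-coded. This is only a sketch under assumptions: the function names are hypothetical, the commands are the ones quoted in this report (without "-Dbool=char"), and the example path is taken from the DynaLoader.pm location in the error message above.

```python
import shlex
import subprocess


def perl_core_dir(perl="perl"):
    """Ask the given perl for its archlib and return the CORE header dir.
    Equivalent to: perl -e 'use Config; print $Config{archlib};'"""
    result = subprocess.run(
        [perl, "-MConfig", "-e", "print $Config{archlib}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() + "/CORE"


def build_commands(core_dir):
    """Return the swig/g++ command lines from this report as argument lists.
    Note that -Dbool=char is deliberately left out."""
    return [
        ["swig", "-Wall", "-c++", "-perl5", "BloomFilter.i"],
        ["g++", "-c", "BloomFilter_wrap.cxx", "-I" + core_dir, "-fPIC", "-O3"],
        ["g++", "-Wall", "-shared", "BloomFilter_wrap.o",
         "-o", "BloomFilter.so", "-O3"],
    ]


# Example with a fixed path; on a real system use perl_core_dir() instead,
# so the headers match the perl that will later load BloomFilter.so.
example_core = "/usr/lib/x86_64-linux-gnu/perl/5.24/CORE"
commands = build_commands(example_core)
for command in commands:
    print(shlex.join(command))
```

Each list can be passed to subprocess.run(command, check=True) from the swig directory once the printed commands look right.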

Invalid file of filenames: FULL_PATH -- fatal error

Hi @warrenlr - I'm trying to use LINKS to scaffold a pre-assembled genome with Illumina Moleculo reads. I have SAM and BAM files that I produced using Minimap2, but it doesn't look like LINKS needs them. I have a .fof file with the absolute path of the fastq file that contains my Moleculo reads, but I keep getting the following error.

Invalid file of filenames: /home/taruna/hyb1-a2/analysis-results/links/files4link.txt -- fatal

I have checked the path many times, and both the directory and the file are readable. Is there a way I can fix this error? Please advise. Thank you so much!
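
A file of filenames can be checked independently of LINKS before a run. The sketch below is an assumption about what is worth checking (one existing, readable file per line, no blank lines, no stray whitespace or Windows line endings); it does not reproduce the test LINKS itself applies, and the function name is hypothetical.

```python
import os
import tempfile


def check_fof(fof_path):
    """Return a list of human-readable problems found in a file of filenames."""
    if not os.path.isfile(fof_path):
        return [f"{fof_path}: not a regular file"]
    problems = []
    # newline="" keeps carriage returns so Windows line endings are visible.
    with open(fof_path, newline="") as handle:
        for number, raw in enumerate(handle, start=1):
            entry = raw.rstrip("\n")
            path = entry.strip()
            if not path:
                problems.append(f"line {number}: blank line")
                continue
            if entry != path:
                problems.append(
                    f"line {number}: stray whitespace or carriage return")
            if not os.path.isfile(path):
                problems.append(f"line {number}: no such file: {path}")
            elif not os.access(path, os.R_OK):
                problems.append(f"line {number}: not readable: {path}")
    return problems


# Self-contained demo: one good entry, one missing file with a CRLF ending.
with tempfile.TemporaryDirectory() as tmp:
    reads = os.path.join(tmp, "reads.fastq")
    open(reads, "w").close()
    fof = os.path.join(tmp, "files.fof")
    with open(fof, "w", newline="") as handle:
        handle.write(reads + "\n")
        handle.write(os.path.join(tmp, "missing.fastq") + "\r\n")
    demo_problems = check_fof(fof)

for problem in demo_problems:
    print(problem)
```

An empty result means every listed path exists and is readable; it does not guarantee that the files are valid fasta or fastq.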

could not make scaffold using long reads

Hi,
I am trying to assemble plant mitochondrial genomes. I have two contigs, of 430 kb and 193 kb, that contain all the mitochondrial genes, and I suspect that repeat regions are present at the ends of these two contigs. I would therefore like to scaffold them with Nanopore long reads using LINKS with the default parameters, as shown below. The log file is attached for your reference.
$LINKS -f ./cke_mt_two_contigs.fasta -s ./cke_ont_ref_gen.fof

I did not obtain any improvement: the output contains exactly the input contigs. How can I get a scaffold, or which parameters should I tweak?

cke.log

Thank you and looking forward to hearing from you.

Unbundling bloomfilter out of the LINKS distribution tarball and more cleanup

Hi,
The installation procedure described in https://github.com/bcgsc/LINKS/blob/master/README.md does not match the contents of the https://github.com/bcgsc/LINKS/archive/v1.8.6.tar.gz file. The file contains:

drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/
-rw-rw-r-- root/root     35004 2018-03-13 23:17 LINKS-1.8.6/LICENSE
-rwxrwxr-x root/root     28165 2018-03-13 23:17 LINKS-1.8.6/README.md
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/bin/
-rwxrwxr-x root/root     65311 2018-03-13 23:17 LINKS-1.8.6/bin/LINKS
-rw-rw-r-- root/root     65640 2018-03-13 23:17 LINKS-1.8.6/links-logo.png
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/
-rw-rw-r-- root/root    717165 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-5-1.tar.gz
-rw-rw-r-- root/root    714073 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-5-2.tar.gz
-rw-rw-r-- root/root    717241 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-5.tar.gz
-rw-rw-r-- root/root   1054299 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-6-1.tar.gz
-rw-rw-r-- root/root   1052833 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-6.tar.gz
-rw-rw-r-- root/root   1060349 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-7.tar.gz
-rw-rw-r-- root/root   1062773 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-8-1.tar.gz
-rw-rw-r-- root/root   1063183 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-8-2.tar.gz
-rw-rw-r-- root/root   1062998 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-8-3.tar.gz
-rw-rw-r-- root/root   1068147 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-8-4.tar.gz
-rw-rw-r-- root/root   1075810 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-8-5.tar.gz
-rw-rw-r-- root/root    685978 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-8-6.tar.gz
-rw-rw-r-- root/root   1153831 2018-03-13 23:17 LINKS-1.8.6/releases/binaries/links_v1-8.tar.gz
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/
-rwxrwxr-x root/root     59848 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/LINKS
-rw-rw-r-- root/root    432833 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/LINKS-readme.pdf
-rwxrwxr-x root/root     27041 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/LINKS-readme.txt
lrwxrwxrwx root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/LINKS.pl -> LINKS
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/
-rwxrwxr-x root/root       915 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/LINKSrecipe_athaliana_ectools.sh
-rwxrwxr-x root/root       915 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/LINKSrecipe_athaliana_raw.sh
-rwxrwxr-x root/root      1752 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/LINKSrecipe_pglaucaPG29-WS77111.sh
-rwxrwxr-x root/root      3088 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runIterativeLINKS_ECK12.sh
-rwxrwxr-x root/root      3176 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runIterativeLINKS_ECK12A2D.sh
-rwxrwxr-x root/root      3116 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runIterativeLINKS_ECK12raw.sh
-rwxrwxr-x root/root      3528 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runIterativeLINKS_SCS288c.sh
-rwxrwxr-x root/root      3201 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runIterativeLINKS_SCW303.sh
-rwxrwxr-x root/root      1316 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runIterativeLINKS_STH58.sh
-rwxrwxr-x root/root       222 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runall.sh
-rwxrwxr-x root/root       996 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runme_EcoliK12iterative.sh
-rwxrwxr-x root/root      1038 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runme_EcoliK12iterativeA2D.sh
-rwxrwxr-x root/root      1030 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runme_EcoliK12iterativeRAW.sh
-rwxrwxr-x root/root      2081 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runme_EcoliK12single.sh
-rwxrwxr-x root/root      2466 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runme_EcoliK12singleMPET.sh
-rwxrwxr-x root/root      1226 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runme_ScerevisiaeS288citerative.sh
-rwxrwxr-x root/root      1226 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runme_ScerevisiaeW303iterative.sh
-rwxrwxr-x root/root      1029 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/test/runme_StyphiH58iterative.sh
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/tools/
-rwxrwxr-x root/root      1718 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/tools/makeMPETOutput2EQUALfiles.pl
-rwxrwxr-x root/root      3764 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/tools/testBloom.pl
-rwxrwxr-x root/root      4732 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.4/tools/writeBloom.pl
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/
-rwxrwxr-x root/root     61976 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/LINKS
-rw-rw-r-- root/root    433249 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/LINKS-readme.pdf
-rwxrwxr-x root/root     27350 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/LINKS-readme.txt
lrwxrwxrwx root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/LINKS.pl -> LINKS
-rw-rw-r-- root/root    657720 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/lib.tar.gz
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/
-rwxrwxr-x root/root       915 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/LINKSrecipe_athaliana_ectools.sh
-rwxrwxr-x root/root       915 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/LINKSrecipe_athaliana_raw.sh
-rwxrwxr-x root/root      1752 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/LINKSrecipe_pglaucaPG29-WS77111.sh
-rwxrwxr-x root/root      3088 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runIterativeLINKS_ECK12.sh
-rwxrwxr-x root/root      3176 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runIterativeLINKS_ECK12A2D.sh
-rwxrwxr-x root/root      3116 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runIterativeLINKS_ECK12raw.sh
-rwxrwxr-x root/root      3528 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runIterativeLINKS_SCS288c.sh
-rwxrwxr-x root/root      3201 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runIterativeLINKS_SCW303.sh
-rwxrwxr-x root/root      1316 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runIterativeLINKS_STH58.sh
-rwxrwxr-x root/root       222 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runall.sh
-rwxrwxr-x root/root       996 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runme_EcoliK12iterative.sh
-rwxrwxr-x root/root      1038 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runme_EcoliK12iterativeA2D.sh
-rwxrwxr-x root/root      1030 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runme_EcoliK12iterativeRAW.sh
-rwxrwxr-x root/root      2081 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runme_EcoliK12single.sh
-rwxrwxr-x root/root      2466 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runme_EcoliK12singleMPET.sh
-rwxrwxr-x root/root      1226 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runme_ScerevisiaeS288citerative.sh
-rwxrwxr-x root/root      1226 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runme_ScerevisiaeW303iterative.sh
-rwxrwxr-x root/root      1029 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/test/runme_StyphiH58iterative.sh
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/tools/
-rwxrwxr-x root/root      1718 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/tools/makeMPETOutput2EQUALfiles.pl
-rwxrwxr-x root/root      3764 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/tools/testBloom.pl
-rwxrwxr-x root/root      4732 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.5/tools/writeBloom.pl
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/
-rwxrwxr-x root/root     65311 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/LINKS
-rwxrwxr-x root/root     65311 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/LINKS.pl
-rwxrwxr-x root/root     28165 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/README.md
-rw-rw-r-- root/root    655063 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/lib.tar.gz
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/
-rwxrwxr-x root/root       915 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/LINKSrecipe_athaliana_ectools.sh
-rwxrwxr-x root/root       915 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/LINKSrecipe_athaliana_raw.sh
-rwxrwxr-x root/root      1752 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/LINKSrecipe_pglaucaPG29-WS77111.sh
-rwxrwxr-x root/root      3088 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runIterativeLINKS_ECK12.sh
-rwxrwxr-x root/root      3176 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runIterativeLINKS_ECK12A2D.sh
-rwxrwxr-x root/root      3116 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runIterativeLINKS_ECK12raw.sh
-rwxrwxr-x root/root      3528 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runIterativeLINKS_SCS288c.sh
-rwxrwxr-x root/root      3201 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runIterativeLINKS_SCW303.sh
-rwxrwxr-x root/root      1316 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runIterativeLINKS_STH58.sh
-rwxrwxr-x root/root       222 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runall.sh
-rwxrwxr-x root/root       996 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runme_EcoliK12iterative.sh
-rwxrwxr-x root/root      1038 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runme_EcoliK12iterativeA2D.sh
-rwxrwxr-x root/root      1030 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runme_EcoliK12iterativeRAW.sh
-rwxrwxr-x root/root      2081 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runme_EcoliK12single.sh
-rwxrwxr-x root/root      2466 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runme_EcoliK12singleMPET.sh
-rwxrwxr-x root/root      1226 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runme_ScerevisiaeS288citerative.sh
-rwxrwxr-x root/root      1226 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runme_ScerevisiaeW303iterative.sh
-rwxrwxr-x root/root      1029 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/test/runme_StyphiH58iterative.sh
drwxrwxr-x root/root         0 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/tools/
-rwxrwxr-x root/root      1971 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/tools/consolidateGraphs.pl
-rwxrwxr-x root/root      1718 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/tools/makeMPETOutput2EQUALfiles.pl
-rwxrwxr-x root/root      3764 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/tools/testBloom.pl
-rwxrwxr-x root/root      4752 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/tools/writeBloom.pl
-rwxrwxr-x root/root      2016 2018-03-13 23:17 LINKS-1.8.6/scaffoldsToAGP2.pl

As you can see, the bloomfilter is hidden in the lib.tar.gz file. That archive even contains .git contents, which end users do not need.

Also, why are LINKS 1.8.5 and 1.8.4 included in the 1.8.6 tarball at all?

There is no symlink used for this pair:

-rwxrwxr-x root/root     65311 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/LINKS
-rwxrwxr-x root/root     65311 2018-03-13 23:17 LINKS-1.8.6/releases/links_v1.8.6/LINKS.pl

Anyway, would it be possible to move lib/bloomfilter into a separate package with its own version number and drop it from the LINKS tarball altogether? Would it also be possible to import BloomFilter conditionally, so that step 4 in README.md (shown below) is no longer needed?

4. CHANGE the path to BloomFilter.pm in LINKS/writeBloom.pl/testBloom.pl

Thank you for your efforts.

Output analysis confusion!

Hi,
I just ran links_v1.8.6 on my CANU genome assembly and I'm not sure how to interpret the results.
When I run QUAST on my output, it is exactly the same as the input...

Here is the log:
Running: ./LINKS [v1.8.6]
-f long_long2.fasta
-s matepair.fof
-m 1
-d 1000
-k 15
-e 0.5
-l 5
-a 0.3
-t 20
-o 0
-z 500
-b mpet1k_v2
-r
-p 0.001
-x 0

----------------- Verifying files -----------------

Checking mpet1k_1.fa_paired.fa...ok
Checking sequence target file long_long2.fasta...ok

=>Reading contig/sequence assembly file : jeudi 12 juillet 2018, 11:30:18 (UTC-0400)
Building a Bloom filter using 15-mers derived from sequences in -f long_long2.fasta...


Bloom filter specs
elements=253506447
FPR=0.001
size (bits)=3644811200
hash functions=9


Contigs (>= 500 bp) processed k=15:
2402

=>Writing Bloom filter to disk (mpet1k_v2.bloom) : jeudi 12 juillet 2018, 11:33:10 (UTC-0400)

=>Reading long reads, building hash table : jeudi 12 juillet 2018, 11:33:10 (UTC-0400)
Reads processed k=15, dist=1000, offset=0 nt, sliding step=20 nt:

Reads processed from file 1/1, mpet1k_1.fa_paired.fa:
133240440
Extracted 220116677 15-mer pairs at -d 1000, from all 133240440 sequences provided in matepair.fof
Extracted 220116677 15-mer pairs overall. This is the set that will be used for scaffolding

=>Reading sequence contigs (to scaffold), tracking k-mer positions : jeudi 12 juillet 2018, 12:27:56 (UTC-0400)

Contigs (>= 500 bp) processed k=15:
2402
=>Scaffolding initiated : jeudi 12 juillet 2018, 12:45:36 (UTC-0400)

===========PAIRED K-MER STATS===========
Total number of pairs extracted from -s matepair.fof: 220116677
At least one sequence/pair missing from contigs: 0
Ambiguous kmer pairs (both kmers are ambiguous): 20104140
Assembled pairs: 1515121 (3030242 sequences)
Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 4294
Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 509692
Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 587143
---
Satisfied in distance/logic within a given contig pair (pre-scaffold): 47
Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 413945
---
Total satisfied: 4341 unsatisfied: 1510780

Breakdown by distances (-d):
--------k-mers separated by 1000 bp (outer distance)--------
MIN:500 MAX:1500 as defined by 1000 * 0.5
At least one sequence/pair missing:
Assembled pairs: 1515121
Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 4294
Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 509692
Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 587143
---
Satisfied in distance/logic within a given contig pair (pre-scaffold): 47
Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 413945

We are in the last steps of our genome assembly draft, so please help me understand the output.

Thanks a lot,

PY
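
The log line "MIN:500 MAX:1500 as defined by 1000 * 0.5" suggests that the accepted distance window is -d plus or minus -d times -e. A minimal sketch of that reading follows; it is inferred from the log line only, and the function names are hypothetical rather than taken from LINKS.

```python
def distance_window(distance, error):
    """Return (min, max) accepted distance for a given -d and -e,
    assuming the window is d - d*e .. d + d*e as the log line suggests."""
    slack = round(distance * error)
    return distance - slack, distance + slack


def is_within_window(observed, distance, error):
    """True if an observed k-mer pair distance falls inside the window."""
    low, high = distance_window(distance, error)
    return low <= observed <= high


# Parameters from the run above: -d 1000 -e 0.5
window = distance_window(1000, 0.5)
print(f"MIN:{window[0]} MAX:{window[1]}")

# Counts copied from the PAIRED K-MER STATS block above.
satisfied = 4341
assembled = 1515121
satisfied_share = satisfied / assembled
print(f"satisfied share of assembled pairs: {satisfied_share:.2%}")
```

In this run only 4341 of 1515121 assembled pairs (about 0.3%) are counted as satisfied, and just 47 of those connect two different contigs.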

Can't locate object method "new" via package "BloomFilter::BloomFilter" (perhaps you forgot to load "BloomFilter::BloomFilter"?)

Hello, I'm trying to run LINKS v.1.8.6 installed from conda on an hpc to help with scaffolding an ONT/Illumina assembly of a killifish species. I received the following error that I can't seem to figure out.

Can't locate object method "new" via package "BloomFilter::BloomFilter" (perhaps you forgot to load "BloomFilter::BloomFilter"?) at /pylon5/bi5fpmp/ljcohen/miniconda3/envs/ONT_links/bin/LINKS line 215.

This issue was also reported in the bioconda-recipes repo.

The README.md mentions "building the BloomFilter PERL module", but I was under the impression this was something conda had already installed. Searching for BloomFilter files in the Python 3 conda env:

[ljcohen@login005 ONT_links]$ find . -name "*BloomFilter*"
./man/man3/BloomFilter.3
./lib/site_perl/5.26.2/x86_64-linux-thread-multi/auto/BloomFilter
./lib/site_perl/5.26.2/x86_64-linux-thread-multi/auto/BloomFilter/BloomFilter.so
./lib/site_perl/5.26.2/x86_64-linux-thread-multi/BloomFilter.pm

Full output:

[ljcohen@login006 Fcat]$ LINKS -f Fcat_pilon.fasta -s reads.fof -v 1 -b Fcat_links

Running: /pylon5/bi5fpmp/ljcohen/miniconda3/envs/ONT_links/bin/LINKS [v1.8.6]
-f Fcat_pilon.fasta
-s reads.fof
-m 
-d 4000
-k 15
-e 0.1
-l 5
-a 0.3
-t 2
-o 0
-z 500
-b Fcat_links
-r 
-p 0.001
-x 0

----------------- Verifying files -----------------

Checking /pylon5/mc5phkp/ljcohen/kfish_ONT/F_catenatus_trimmed.Q5.fq.gz...ok
Checking sequence target file Fcat_pilon.fasta...ok


=>Reading contig/sequence assembly file : Wed May 29 16:54:19 EDT 2019
Building a Bloom filter using 15-mers derived from sequences in -f Fcat_pilon.fasta...
*****
Bloom filter specs
elements=1179858966
FPR=0.001
size (bits)=16963525632
hash functions=9
*****

Something went wrong running /pylon5/bi5fpmp/ljcohen/miniconda3/envs/ONT_links/bin/LINKS Wed May 29 16:54:19 EDT 2019
Can't locate object method "new" via package "BloomFilter::BloomFilter" (perhaps you forgot to load "BloomFilter::BloomFilter"?) at /pylon5/bi5fpmp/ljcohen/miniconda3/envs/ONT_links/bin/LINKS line 215.

/pylon5/bi5fpmp/ljcohen/miniconda3/envs/ONT_links/bin/LINKS [v1.8.6] terminated successfully on Wed May 29 16:54:19 EDT 2019

Installation issue: Bloomfilter

Hi all,

I'm trying to install LINKS for hybrid assembly of 10X and shotgun reads. As described below, I get an error when trying to compile the BloomFilter Perl module. Could it be due to a difference in capitalization, e.g. EXTERN.h and xsub.h?

From the installation instructions:
"TO COMPILE, swig needs the following Perl5 headers:

#include "Extern.h"
#include "perl.h"
#include "XSUB.h"
If they are not located in /usr/lib64/perl5, you can run "perl -e 'use Config; print $Config{archlib};'" to locate them."

Our cluster has those headers in lib64, but with different capitalization: EXTERN.h and xsub.h

I tried using gcc versions 4.9.3 and 6.1.0 - see attached error logs.

error_gccv4.9.3.log
error_gccv6.1.0.log

Thank you in advance!

Amanda

Problem with bloomfilter

Hello!

I get the following error message when executing LINKS:

Can't load '/data2/bioProgs/LINKS/releases/links_v1.8.6/./lib/bloomfilter/swig/BloomFilter.so' for module BloomFilter: /data2/bioProgs/LINKS/releases/links_v1.8.6/./lib/bloomfilter/swig/BloomFilter.so: undefined symbol: PL_stack_sp at /usr/lib/x86_64-linux-gnu/perl/5.22/DynaLoader.pm line 187.
 at /data2/bioProgs/LINKS/releases/links_v1.8.6/./lib/bloomfilter/swig/BloomFilter.pm line 11.
Compilation failed in require at ./LINKS line 26.
BEGIN failed--compilation aborted at ./LINKS line 26.

The thing is, the compilation of the bloomfilter went without a problem. I can run the test.pl script within the swig subdirectory and it just works, no error messages whatsoever.
Would be great to get some help with this.
Thanks

Philipp

LINKS terminated with no error message when I used hybrid reads to scaffold

Hi Warren,

I'm using LINKS v1.8.5 to scaffold the contigs produced by MEGAHIT.
I used hybrid reads, including PE, MPE, and PacBio long reads. The PE and MPE reads were fed to LINKS according to the manual, but LINKS stopped with no error message. The following is the log file:

Running: /home/software/links_v1.8.5/LINKS [v1.8.5]
-f /megahit_out/final.contigs.fa
-s PE.MPE.Pacbio.fof
-m 
-d 4000
-k 15
-e 0.1
-l 5
-a 0.3
-t 2
-o 0
-z 500
-b MEGAHIT.scaffold.LINKS
-r 
-p 0.001
-x 0

----------------- Verifying files -----------------

Checking 270B_R1.fasta_paired.fa...ok
Checking 500B_R1.fasta_paired.fa...ok
Checking 800B_R1.fasta_paired.fa...ok
Checking 3k_1_R1.fasta_paired.fa...ok
Checking 5k-1_R1.fasta_paired.fa...ok
Checking 5k-2_R1.fasta_paired.fa...ok
Checking 10k_R1.fasta_paired.fa...ok
Checking av_20k.fasta...ok
Checking sequence target file /megahit_out/final.contigs.fa...ok


=>Reading contig/sequence assembly file: Sat Jul 15 20:54:23 CST 2017
Building a Bloom filter using 15-mers derived from sequences in -f /megahit_out/final.contigs.fa...
*****
Bloom filter specs
elements=810918104
FPR=0.001
size (bits)=11659046080
hash functions=9
*****
Contigs (>= 500 bp) processed k=15:
392101



=>Writing Bloom filter to disk (MEGAHIT.scaffold.LINKS.bloom): Sat Jul 15 21:00:30 CST 2017


=>Reading long reads, building hash table: Sat Jul 15 21:00:32 CST 2017
Reads processed k=15, dist=4000, offset=0 nt, sliding step=2 nt:

Reads processed from file 1/8, 270B_R1.fasta_paired.fa:
146753470


Reads processed from file 2/8, 500B_R1.fasta_paired.fa:
228144635


Reads processed from file 3/8, 800B_R1.fasta_paired.fa:
280845230


Reads processed from file 4/8, 3k_1_R1.fasta_paired.fa:
280845396


Reads processed from file 5/8, 5k-1_R1.fasta_paired.fa:
380208084


Reads processed from file 6/8, 5k-2_R1.fasta_paired.fa:
380208494


Reads processed from file 7/8, 10k_R1.fasta_paired.fa:
438639581


Reads processed from file 8/8, av_20k.fasta:

When it started to process the PacBio long reads, the program stopped. I've tried twice and got the same result.

Any suggestions would be appreciated.

Yiwei Niu

interpretation of output

Hello,
I am running LINKS to scaffold a repeat-rich plant genome (5 Gb, N50 197 kb, N70 71 kb) with ONT data (80 Gb, 16x, all above 5 kb reads). I trust the assembly, since it has been aligned to an optical map and the vast majority of the sequences align continuously to the restriction pattern.

I run it as
LINKS.pl -f genome_v2.2.fa -s input_ONT_b123.fofn -k 23 -d 4000 -t 100 -a 0.2 -z 2000 -r genome.bloom -b base
with a parameter sweep (-d from 1000 to 4000, -t 100 and 50, 8 combinations for now) and I am getting the first results - very little increase in N values (199 and 72 kb). I want to be strict in the scaffolding to avoid linking just by the presence of one repeat, so I decreased -a.
I wonder if the output files contain some information I could use to tune the parameters and find a good combination.

The pairing_issues file has very large negative values:

Pairs unsatisfied in distance within a contig pair.  A-> <-B  WITH tig#825 -> -849140 <- tig#71, A=654381 nt (start:477924, end:477947) B=1445674 nt (start:674683, end:674660) CALCULATED DISTANCE APART: -849140 < -200
Pairs unsatisfied in distance within a contig pair.  rB-> <-A WITH tig#r.2180 -> -690599 <- tig#735, B=406356 nt (start:228961, end:228938) A=684072 nt (start:463638, end:463615) CALCULATED DISTANCE APART: -690599 < -200
Pairs unsatisfied in distance within a contig pair.  A-> <-B  WITH tig#8861 -> -157439 <- tig#12512, A=135117 nt (start:32672, end:32695) B=88749 nt (start:56994, end:56971) CALCULATED DISTANCE APART: -157439 < -200
Pairs unsatisfied in distance within a contig pair.  A-> <-B  WITH tig#8167 -> -150869 <- tig#17200, A=146588 nt (start:22003, end:22026) B=47450 nt (start:28284, end:28261) CALCULATED DISTANCE APART: -150869 < -200
Pairs unsatisfied in distance within a contig.  Pair (GAGACTAAATTCTGAATATGTTG - GGTAGTGTAAAGATAGGAGTAGC) on contig 1044 (592359 nt) Astart:365063 Aend:365086 Bstart:367293 Bend:367270 CALCULATED DISTANCE APART: 2230
Pairs unsatisfied in distance within a contig pair.  A-> <-B  WITH tig#1642 -> -844309 <- tig#942, A=474302 nt (start:196149, end:196172) B=617172 nt (start:568156, end:568133) CALCULATED DISTANCE APART: -844309 < -200
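
The contig-pair distances reported above can be reproduced by hand. The sketch below is an inference from the lines shown, not code from LINKS: the function name `implied_gap` is hypothetical, and d=2000 is an assumption (it is the value that reproduces the logged numbers, which suggests these lines come from a run of the sweep with -d 2000 rather than 4000).

```python
def implied_gap(d, first_len, first_start, second_start, first_reversed=False):
    """Gap between two contigs implied by one k-mer pair at distance d.

    Inferred from the pairing_issues lines, not taken from LINKS source.
    For 'A-> <-B' the first k-mer is followed by (first_len - first_start)
    nt of the first contig; for a reversed first contig ('rB-> <-A') the
    part that follows it is first_start nt long. The second k-mer is
    preceded by second_start nt of the second contig.
    """
    tail = first_start if first_reversed else first_len - first_start
    return d - tail - second_start

D = 2000  # assumed -d for this run; see the note above

# tig#825 -> tig#71
print(implied_gap(D, 654381, 477924, 674683))
# tig#r.2180 -> tig#735 (first contig reversed)
print(implied_gap(D, 406356, 228961, 463638, first_reversed=True))
# tig#8861 -> tig#12512
print(implied_gap(D, 135117, 32672, 56994))
```

Under these assumptions the three calls give -849140, -690599 and -157439, matching the log, and the bound of -200 equals -(d * e) with e=0.1. A gap this negative means both k-mers sit far inside their contigs, so the pair could only be correct if the contigs overlapped massively; such pairs may well come from repeated k-mers rather than from true adjacency.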

and this is what the log reports:

Total number of pairs extracted from -s input_ONT_b123.fofn: 14278810
At least one sequence/pair missing from contigs: 0
Assembled pairs: 617317 (1234634 sequences)
        Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 414425
        Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 13350
        Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 903
        ---
        Satisfied in distance/logic within a given contig pair (pre-scaffold): 6557
        Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 182082
        ---
Total satisfied: 420982 unsatisfied: 196335

Breakdown by distances (-d):
--------k-mers separated by 4000 bp (outer distance)--------
MIN:3600 MAX:4400 as defined by 4000 * 0.1
At least one sequence/pair missing:
Assembled pairs: 617317
        Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 414425
        Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 13350
        Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 903
        ---
        Satisfied in distance/logic within a given contig pair (pre-scaffold): 6557
        Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 182082

Which values should I maximize or keep an eye on?
I don't expect to increase the numbers by much, but if possible I would like to take advantage of the few >30 kb reads I have to put some pieces together.
Would it help to use error-corrected reads (with FMLRC)?
Thanks,

Dario
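
The summary in the log can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers printed in the log; the dictionary keys and the helper `distance_window` are hypothetical names, and reading the "satisfied within a given contig pair" count as the links that can actually support scaffolding is an interpretation of the log labels, not a statement from the LINKS documentation.

```python
# Counts copied from the log above; the key names are hypothetical.
counts = {
    "within_ok": 414425,            # satisfied within contigs
    "within_bad_distance": 13350,   # out-of-bounds within contigs
    "within_bad_logic": 903,        # illogical orientation within contigs
    "between_ok": 6557,             # satisfied within a contig pair
    "between_bad_distance": 182082, # out-of-bounds within a contig pair
}

def distance_window(d, e):
    """Accepted distance range for -d and -e, as printed in the MIN/MAX line."""
    tolerance = round(d * e)
    return d - tolerance, d + tolerance

satisfied = counts["within_ok"] + counts["between_ok"]
unsatisfied = (counts["within_bad_distance"] + counts["within_bad_logic"]
               + counts["between_bad_distance"])
between_total = counts["between_ok"] + counts["between_bad_distance"]
between_ok_fraction = counts["between_ok"] / between_total

print(distance_window(4000, 0.1))
print(satisfied, unsatisfied, satisfied + unsatisfied)
print(f"{between_ok_fraction:.1%} of contig-pair links are within bounds")
```

The totals reproduce the log (420982 satisfied, 196335 unsatisfied, 617317 assembled pairs), and only about 3.5% of the pairs that span two contigs fall inside the 3600 to 4400 window. Under the interpretation above, that fraction and the absolute count of in-bounds contig-pair links are the values to compare across the -d and -t sweep.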

AGP file creation: ARCS gaps mixed with Pilon gaps

Hello @sjackman @warrenlr,

First of all, I want to say thanks for the Tigmint+LINKS pipeline. Using it, we made an amazing de novo plant genome assembly, and we are now planning to submit the genome to NCBI. However, we are having a problem with the AGP file: we tried to create it with the abyss-fatoagp program, which could not produce a correct AGP file. The reason is that we used Pilon in gap-insert mode (10 bp) to polish the base (PacBio) assembly before the ARCS pipeline. The base assembly was then run through Tigmint and ARCS with four sets of 10X data.

The final output has 10 bp runs of 'N' from ARCS as well as from Pilon, which confuses the abyss-fatoagp output. Could you please advise how to solve this issue? Here is the process we followed, in detail:

1. Created the base assembly with PacBio reads.
2. Ran Quiver and Pilon to correct and polish the genome. Pilon was run in gap-insert mode and inserted 10 bp gaps.
3. Generated four sets of 10X data: a. leaf_plug, b. leaf, c. shoot, d. shoot_plug.
4. Ran Tigmint with the four sets of 10X data recursively (leaf_plug -> leaf -> shoot -> shoot_plug).
5. Ran ARCS with the four sets of 10X data recursively (leaf_plug -> leaf -> shoot -> shoot_plug).

Now I want to create an AGP file for the assembly. How can I do that?

Installation issue

Hello,
I am unable to compile the bloom filter
g++ -c BloomFilter_wrap.cxx -I/usr/lib/perl5/core_perl/CORE -fPIC -Dbool=char -O3

This returns

In file included from /usr/include/c++/7.1.1/cmath:42:0,
                 from /usr/include/c++/7.1.1/math.h:36,
                 from BloomFilter_wrap.cxx:758:
/usr/include/c++/7.1.1/bits/cpp_type_traits.h:145:12: error: redefinition of ‘struct std::__is_integer<char>’
     struct __is_integer<char>
            ^~~~~~~~~~~~~~~~~~
/usr/include/c++/7.1.1/bits/cpp_type_traits.h:138:12: note: previous definition of ‘struct std::__is_integer<char>’
     struct __is_integer<bool>
            ^~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/7.1.1/exception:142:0,
                 from /usr/include/c++/7.1.1/stdexcept:38,
                 from BloomFilter_wrap.cxx:1569:
/usr/include/c++/7.1.1/bits/exception_ptr.h: In member function ‘std::__exception_ptr::exception_ptr::operator char() const’:
/usr/include/c++/7.1.1/bits/exception_ptr.h:145:16: error: invalid conversion from ‘void*’ to ‘char’ [-fpermissive]
       { return _M_exception_object; }
                ^~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/7.1.1/bits/move.h:54:0,
                 from /usr/include/c++/7.1.1/bits/nested_exception.h:40,
                 from /usr/include/c++/7.1.1/exception:143,
                 from /usr/include/c++/7.1.1/stdexcept:38,
                 from BloomFilter_wrap.cxx:1569:
/usr/include/c++/7.1.1/type_traits: At global scope:
/usr/include/c++/7.1.1/type_traits:230:12: error: redefinition of ‘struct std::__is_integral_helper<char>’
     struct __is_integral_helper<char>
            ^~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/c++/7.1.1/type_traits:226:12: note: previous definition of ‘struct std::__is_integral_helper<char>’
     struct __is_integral_helper<bool>
            ^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/7.1.1/exception:143:0,
                 from /usr/include/c++/7.1.1/stdexcept:38,
                 from BloomFilter_wrap.cxx:1569:
/usr/include/c++/7.1.1/bits/nested_exception.h: In member function ‘void std::nested_exception::rethrow_nested() const’:
/usr/include/c++/7.1.1/bits/nested_exception.h:69:11: error: invalid user-defined conversion from ‘const std::__exception_ptr::exception_ptr’ to ‘bool’ [-fpermissive]
       if (_M_ptr)
           ^~~~~~
In file included from /usr/include/c++/7.1.1/exception:142:0,
                 from /usr/include/c++/7.1.1/stdexcept:38,
                 from BloomFilter_wrap.cxx:1569:
/usr/include/c++/7.1.1/bits/exception_ptr.h:144:16: note: candidate is: std::__exception_ptr::exception_ptr::operator char() const <near match>
       explicit operator bool() const
                ^~~~~~~~
/usr/include/c++/7.1.1/bits/exception_ptr.h:144:16: note:   return type ‘char’ of explicit conversion function cannot be converted to ‘bool’ with a qualification conversion
In file included from /usr/include/c++/7.1.1/bits/basic_string.h:6385:0,
                 from /usr/include/c++/7.1.1/string:52,
                 from /usr/include/c++/7.1.1/stdexcept:39,
                 from BloomFilter_wrap.cxx:1569:
/usr/include/c++/7.1.1/bits/functional_hash.h: At global scope:
/usr/include/c++/7.1.1/bits/functional_hash.h:127:3: error: redefinition of ‘struct std::hash<char>’
   _Cxx_hashtable_define_trivial_hash(char)
   ^
/usr/include/c++/7.1.1/bits/functional_hash.h:124:3: note: previous definition of ‘struct std::hash<char>’
   _Cxx_hashtable_define_trivial_hash(bool)
   ^
In file included from /usr/include/c++/7.1.1/bits/uniform_int_dist.h:35:0,
                 from /usr/include/c++/7.1.1/bits/stl_algo.h:66,
                 from /usr/include/c++/7.1.1/algorithm:62,
                 from BloomFilter_wrap.cxx:1615:
/usr/include/c++/7.1.1/limits:452:12: error: redefinition of ‘struct std::numeric_limits<char>’
     struct numeric_limits<char>
            ^~~~~~~~~~~~~~~~~~~~
/usr/include/c++/7.1.1/limits:383:12: note: previous definition of ‘struct std::numeric_limits<char>’
     struct numeric_limits<bool>
            ^~~~~~~~~~~~~~~~~~~~

If I keep going with the install process, the next step seems to work, but an attempt at testing returns

/usr/bin/perl: symbol lookup error: /home/alessandro/tools/links_v1.8.5/lib/bloomfilter/swig/BloomFilter.so: undefined symbol: Perl_Gthr_key_ptr

Any idea?

Thanks

Out of memory error

Hello,

I have a large contig file (~19 Gb) that I would like to scaffold using PacBio data, but I keep getting the "Out of memory!" error. When I ran it with only 500 of the 136,774,993 PacBio reads, it completed successfully, which suggests the full PacBio dataset is too big for LINKS to process. All other options were left at their defaults.

Is there a way to split up the PacBio data, run the pieces in parallel, and combine the results afterwards? What is the maximum size allowed for the PacBio input data?
Could I simply split it into, for instance, 100 files and run them one after another, 100 times, using the same Bloom filter file?

Any help would be greatly appreciated!

Thank you,
Jenny

Attempt to free unreferenced scalar

Hi guys,

I am really interested in using this tool for scaffolding with 10X Chromium reads. I cloned this repository and built the Bloom filter with swig/3.0.12 without any problem. There are a few differences between the actual directory structure and the one assumed in the instructions. Inside the LINKS directory there is a bin directory with LINKS in it, so I created a lib directory inside it and cloned the bloomfilter module there.
The test runs correctly:

$ ./writeBloom_rolling.pl -f test.fasta

Running:./writeBloom_rolling.pl -f test.fasta -k 15 -p 0.0001

Checking sequence target file test.fasta...ok
Wed Jun 27 16:09:47 AEST 2018:Estimating number of elements from file size
*****
Bloom filter specs
elements=58086
FPR=0.0001
size (bits)=1113536
hash functions=13
*****
Wed Jun 27 16:09:47 AEST 2018:Shredding supplied sequence file (-f test.fasta) into 15-mers..
Contigs processed k=15:
35
Wed Jun 27 16:09:47 AEST 2018:Writing Bloom filter to disk (test.fasta_k15_p0.0001_rolling.bf)
Storing filter. Filter is 139192 bytes.

Wed Jun 27 16:09:47 AEST 2018:./writeBloom_rolling.pl executed normally

but when running the real case I get the following error:

./LINKS -f contigs_ge500.fasta -s empty.fof -b contigs_ge500.fasta.scaff_s98_c5_l0_d0_e15000_r0.05_original.tigpair_checkpoint

Running: ./LINKS [v1.8.6]
-f contigs_ge500.fasta
-s empty.fof
-m
-d 4000
-k 15
-e 0.1
-l 5
-a 0.3
-t 2
-o 0
-z 500
-b contigs_ge500.fasta.scaff_s98_c5_l0_d0_e15000_r0.05_original.tigpair_checkpoint
-r
-p 0.001
-x 0

----------------- Verifying files -----------------

Checking sequence target file contigs_ge500.fasta...ok


=>Reading contig/sequence assembly file : Wed Jun 27 16:10:33 AEST 2018
Building a Bloom filter using 15-mers derived from sequences in -f contigs_ge500.fasta...
Attempt to free unreferenced scalar: SV 0xd76fb8, Perl interpreter: 0xd55010 at /data/Bioinfo/bioinfo-proj-jmontenegro/Programs/LINKS/bin/./lib/bloomfilter/swig/BloomFilter.pm line 118.
*****
Bloom filter specs
elements=2913009959
FPR=0.001
size (bits)=41882055808
hash functions=9
*****
Contigs (>= 500 bp) processed k=15:
1
Something went wrong running ./LINKS Wed Jun 27 16:10:38 AEST 2018
RuntimeError Usage: insertSeq(bloom,seq,numHashes,k); at ./LINKS line 807, <IN> line 20.

So it seems something is broken. I have no idea what could be going wrong here. Could you help me sort this out?

Kind regards,
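
For readers comparing the two logs in this report: the "Bloom filter specs" values are consistent with the textbook Bloom filter sizing formulas. The sketch below is a reconstruction from the logged numbers, not code taken from LINKS; the function name `bloom_specs`, the rounding of the bit count up to a multiple of 64, and the truncation of the hash count are assumptions inferred from the output.

```python
import math

def bloom_specs(elements, fpr, word_bits=64):
    """Return (size_bits, num_hashes) for a Bloom filter.

    Uses the standard formulas m = -n*ln(p)/(ln 2)^2 and k = (m/n)*ln 2.
    Rounding m up to a multiple of word_bits and truncating k are
    assumptions chosen to match the logged values, not LINKS code.
    """
    raw_bits = -elements * math.log(fpr) / (math.log(2) ** 2)
    size_bits = math.ceil(raw_bits / word_bits) * word_bits
    num_hashes = int(size_bits / elements * math.log(2))
    return size_bits, num_hashes

# Element counts and false-positive rates copied from the two logs above.
for n, p in [(58086, 0.0001), (2913009959, 0.001)]:
    bits, hashes = bloom_specs(n, p)
    print(f"elements={n} FPR={p} size (bits)={bits} "
          f"hash functions={hashes} (~{bits / 8 / 2**30:.2f} GiB)")
```

For the test filter this gives 1113536 bits (139192 bytes) and 13 hash functions, as logged. For the real run it gives 41882055808 bits and 9 hash functions, that is, a filter of roughly 4.9 GiB.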

System is not responding and Pairing issues

Hi,
I am trying to scaffold the contigs of a hybrid assembly (Illumina and Nanopore) with LINKS.

First, I generated a .fof file for the ONT sequences:
gunzip -c ONT.fastq.gz | perl -ne '$ct++;if($ct>4){$ct=1;}print if($ct<3);' > ONT.fa
echo ONT.fa_paired.fa > ONT.fof
Is this correct for raw Nanopore reads?
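
Two things stand out in the commands above: the Perl one-liner keeps the first two lines of every four-line FASTQ record, so the headers still start with `@`, and it writes ONT.fa while the .fof lists ONT.fa_paired.fa. The Python sketch below does the same conversion but writes `>` headers and refuses to list a file that does not exist. The function names `fastq_to_fasta` and `write_fof` are hypothetical, and the code assumes strict four-line FASTQ records, as the one-liner does.

```python
import gzip
import os

def fastq_to_fasta(fastq_gz, fasta_out):
    """Convert a gzipped four-line FASTQ file to FASTA; return the record count."""
    records = 0
    with gzip.open(fastq_gz, "rt") as src, open(fasta_out, "w") as dst:
        while True:
            header = src.readline()
            if not header:
                break                      # end of file
            sequence = src.readline()
            src.readline()                 # skip the '+' separator line
            src.readline()                 # skip the quality line
            dst.write(">" + header[1:].rstrip("\n") + "\n")
            dst.write(sequence.rstrip("\n") + "\n")
            records += 1
    return records

def write_fof(paths, fof_out):
    """Write a file of filenames, one path per line, checking each path exists."""
    for path in paths:
        if not os.path.isfile(path):
            raise FileNotFoundError(f"listed in .fof but missing: {path}")
    with open(fof_out, "w") as dst:
        for path in paths:
            dst.write(path + "\n")
```

Calling `fastq_to_fasta("ONT.fastq.gz", "ONT.fa")` and then `write_fof(["ONT.fa"], "ONT.fof")` would keep the file name in the .fof in step with the file actually written.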

Second, I used the following command for scaffolding:
bin$LINKS -f contigs.fa -s ONT.fof -d 1000,2500,5000,5000,7500,10000,12500,15000,30000 -t 10,5,5,4,4,3,3,2 -b ONT_scaffold
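
A quick sanity check of the comma-separated -d and -t lists in the command above can be sketched as follows. The helper name `check_sweep` is hypothetical, and whether LINKS requires one -t value per -d value is not stated in this thread, so a mismatch is only something to verify against the LINKS documentation.

```python
def check_sweep(d_arg, t_arg):
    """Count the values passed to -d and -t and report repeated -d values."""
    d_values = [int(x) for x in d_arg.split(",")]
    t_values = [int(x) for x in t_arg.split(",")]
    duplicates = sorted({x for x in d_values if d_values.count(x) > 1})
    return {"n_d": len(d_values), "n_t": len(t_values), "duplicate_d": duplicates}

# Arguments copied from the command above.
report = check_sweep("1000,2500,5000,5000,7500,10000,12500,15000,30000",
                     "10,5,5,4,4,3,3,2")
print(report)
```

For this command the check reports nine -d values, eight -t values, and 5000 listed twice.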

The contig file contains 101 contigs.

After one day, the system is not responding and I have no results. Among the output files, I can see ONT.bloom and ONT.pairing_issues, but both appear empty when I open them.
How can I increase the memory available to LINKS? My system has 188 GB of free RAM.

How can I solve this problem?

Thank you.

con_kei_LINKS.log

issues compiling BloomFilter

hello!
i have issues compiling BloomFilter. i saw that others have similar issues, but their solutions did not work for me.

i have this g++ version installed:
g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36)

and i checked if the path to perl is working (it is).
it would be very nice if you have a solution to my problem.
i'm rather new at installing programs :)

this is the command i try to run:
g++ -c BloomFilter_wrap.cxx -I/usr/lib64/perl5/CORE -fPIC -Dbool=char -O3

In file included from /usr/include/c++/4.8.2/cstdint:35:0,
                 from ../vendor/cpptoml/include/cpptoml.h:13,
                 from ../BloomFilter.hpp:13,
                 from ../KmerBloomFilter.hpp:11,
                 from BloomFilter_wrap.cxx:1857:
/usr/include/c++/4.8.2/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
 #error This file requires compiler and library support for the \
  ^
In file included from ../BloomFilter.hpp:13:0,
                 from ../KmerBloomFilter.hpp:11,
                 from BloomFilter_wrap.cxx:1857:
../vendor/cpptoml/include/cpptoml.h:50:7: error: expected nested-name-specifier before ‘string_to_base_map’
 using string_to_base_map
       ^
../vendor/cpptoml/include/cpptoml.h:50:7: error: ‘string_to_base_map’ has not been declared
../vendor/cpptoml/include/cpptoml.h:51:5: error: expected ‘;’ before ‘=’ token
     = std::unordered_map<std::string, std::shared_ptr<base>>;
     ^
../vendor/cpptoml/include/cpptoml.h:51:5: error: expected unqualified-id before ‘=’ token
../vendor/cpptoml/include/cpptoml.h:72:30: warning: explicit conversion operators only available with -std=c++11 or -std=gnu++11 [enabled by default]
     explicit operator bool() const
                              ^
../vendor/cpptoml/include/cpptoml.h:88:17: error: expected ‘,’ or ‘...’ before ‘&&’ token
     T value_or(U&& alternative) const
                 ^
../vendor/cpptoml/include/cpptoml.h: In constructor ‘cpptoml::option<T>::option()’:
../vendor/cpptoml/include/cpptoml.h:62:16: warning: extended initializer lists only available with -std=c++11 or -std=gnu++11 [enabled by default]
     option() : empty_{true}
                ^
../vendor/cpptoml/include/cpptoml.h: In constructor ‘cpptoml::option<T>::option(T)’:
../vendor/cpptoml/include/cpptoml.h:67:23: warning: extended initializer lists only available with -std=c++11 or -std=gnu++11 [enabled by default]
     option(T value) : empty_{false}, value_(std::move(value))
                       ^
../vendor/cpptoml/include/cpptoml.h:67:45: error: ‘move’ is not a member of ‘std’
     option(T value) : empty_{false}, value_(std::move(value))
                                             ^
../vendor/cpptoml/include/cpptoml.h: In member function ‘T cpptoml::option<T>::value_or(U) const’:
../vendor/cpptoml/include/cpptoml.h:92:31: error: ‘forward’ is not a member of ‘std’
         return static_cast<T>(std::forward<U>(alternative));
                               ^
../vendor/cpptoml/include/cpptoml.h:92:45: error: expected primary-expression before ‘>’ token
         return static_cast<T>(std::forward<U>(alternative));
                                             ^
../vendor/cpptoml/include/cpptoml.h:92:47: error: ‘alternative’ was not declared in this scope
         return static_cast<T>(std::forward<U>(alternative));
                                               ^
../vendor/cpptoml/include/cpptoml.h: At global scope:
../vendor/cpptoml/include/cpptoml.h:102:16: warning: non-static data member initializers only available with -std=c++11 or -std=gnu++11 [enabled by default]
     int year = 0;
                ^
../vendor/cpptoml/include/cpptoml.h:103:17: warning: non-static data member initializers only available with -std=c++11 or -std=gnu++11 [enabled by default]
     int month = 0;
                 ^
../vendor/cpptoml/include/cpptoml.h:104:15: warning: non-static data member initializers only available with -std=c++11 or -std=gnu++11 [enabled by default]
     int day = 0;
               ^
../vendor/cpptoml/include/cpptoml.h:109:16: warning: non-static data member initializers only available with -std=c++11 or -std=gnu++11 [enabled by default]
     int hour = 0;
                ^
../vendor/cpptoml/include/cpptoml.h:110:18: warning: non-static data member initializers only available with -std=c++11 or -std=gnu++11 [enabled by default]
     int minute = 0;
                  ^
../vendor/cpptoml/include/cpptoml.h:111:18: warning: non-static data member initializers only available with -std=c++11 or -std=gnu++11 [enabled by default]
     int second = 0;
                  ^
../vendor/cpptoml/include/cpptoml.h:112:23: warning: non-static data member initializers only available with -std=c++11 or -std=gnu++11 [enabled by default]
     int microsecond = 0;
                       ^
../vendor/cpptoml/include/cpptoml.h:117:23: warning: non-static data member initializers only available with -std=c++11 or -std=gnu++11 [enabled by default]
     int hour_offset = 0;
                       ^
../vendor/cpptoml/include/cpptoml.h:118:25: warning: non-static data member initializers only available with -std=c++11 or -std=gnu++11 [enabled by default]
     int minute_offset = 0;
                         ^
../vendor/cpptoml/include/cpptoml.h: In static member function ‘static cpptoml::offset_datetime cpptoml::offset_datetime::from_zoned(const tm&)’:
../vendor/cpptoml/include/cpptoml.h:140:22: error: ‘stoi’ is not a member of ‘std’
         int offset = std::stoi(buf);
                      ^
../vendor/cpptoml/include/cpptoml.h: In constructor ‘cpptoml::fill_guard::fill_guard(std::ostream&)’:
../vendor/cpptoml/include/cpptoml.h:171:45: warning: extended initializer lists only available with -std=c++11 or -std=gnu++11 [enabled by default]
     fill_guard(std::ostream& os) : os_(os), fill_{os.fill()}
                                             ^
../vendor/cpptoml/include/cpptoml.h: In function ‘std::ostream& cpptoml::operator<<(std::ostream&, const cpptoml::local_date&)’:
../vendor/cpptoml/include/cpptoml.h:188:16: warning: extended initializer lists only available with -std=c++11 or -std=gnu++11 [enabled by default]
     fill_guard g{os};
                ^
../vendor/cpptoml/include/cpptoml.h:188:20: error: in C++98 ‘g’ must be initialized by constructor, not by ‘{...}’
     fill_guard g{os};
                    ^
../vendor/cpptoml/include/cpptoml.h: In function ‘std::ostream& cpptoml::operator<<(std::ostream&, const cpptoml::local_time&)’:
../vendor/cpptoml/include/cpptoml.h:200:16: warning: extended initializer lists only available with -std=c++11 or -std=gnu++11 [enabled by default]
     fill_guard g{os};
                ^
../vendor/cpptoml/include/cpptoml.h:200:20: error: in C++98 ‘g’ must be initialized by constructor, not by ‘{...}’
     fill_guard g{os};
                    ^
../vendor/cpptoml/include/cpptoml.h:213:18: error: ‘num’ does not name a type
             auto num = curr_us / power;
                  ^
../vendor/cpptoml/include/cpptoml.h:214:19: error: ‘num’ was not declared in this scope
             os << num;
                   ^
../vendor/cpptoml/include/cpptoml.h: In function ‘std::ostream& cpptoml::operator<<(std::ostream&, const cpptoml::zone_offset&)’:
../vendor/cpptoml/include/cpptoml.h:224:16: warning: extended initializer lists only available with -std=c++11 or -std=gnu++11 [enabled by default]
     fill_guard g{os};
                ^
../vendor/cpptoml/include/cpptoml.h:224:20: error: in C++98 ‘g’ must be initialized by constructor, not by ‘{...}’
     fill_guard g{os};
                    ^
../vendor/cpptoml/include/cpptoml.h: At global scope:
../vendor/cpptoml/include/cpptoml.h:262:25: warning: variadic templates only available with -std=c++11 or -std=gnu++11 [enabled by default]
 template <class T, class... Ts>
                         ^
../vendor/cpptoml/include/cpptoml.h:266:38: error: expected template-name before ‘<’ token
 struct is_one_of<T, V> : std::is_same<T, V>
                                      ^
../vendor/cpptoml/include/cpptoml.h:266:38: error: expected ‘{’ before ‘<’ token
../vendor/cpptoml/include/cpptoml.h:266:38: error: expected unqualified-id before ‘<’ token
../vendor/cpptoml/include/cpptoml.h:270:34: warning: variadic templates only available with -std=c++11 or -std=gnu++11 [enabled by default]
 template <class T, class V, class... Ts>
                                  ^
../vendor/cpptoml/include/cpptoml.h:274:11: error: ‘is_same’ is not a member of ‘std’
         = std::is_same<T, V>::value || is_one_of<T, Ts...>::value;
           ^
../vendor/cpptoml/include/cpptoml.h:274:25: error: expected primary-expression before ‘,’ token
         = std::is_same<T, V>::value || is_one_of<T, Ts...>::value;
                         ^
../vendor/cpptoml/include/cpptoml.h:274:27: error: expected ‘;’ at end of member declaration
         = std::is_same<T, V>::value || is_one_of<T, Ts...>::value;
                           ^
../vendor/cpptoml/include/cpptoml.h:274:27: error: declaration of ‘const bool cpptoml::is_one_of<T, V, Ts ...>::V’
../vendor/cpptoml/include/cpptoml.h:270:20: error:  shadows template parm ‘class V’
 template <class T, class V, class... Ts>
                    ^
../vendor/cpptoml/include/cpptoml.h:274:28: error: expected unqualified-id before ‘>’ token
         = std::is_same<T, V>::value || is_one_of<T, Ts...>::value;
                            ^
../vendor/cpptoml/include/cpptoml.h:294:57: error: ‘decay’ in namespace ‘std’ does not name a type
     const static bool value = valid_value<typename std::decay<T>::type>::value
                                                         ^
../vendor/cpptoml/include/cpptoml.h:294:62: error: expected template-argument before ‘<’ token
     const static bool value = valid_value<typename std::decay<T>::type>::value
                                                              ^
../vendor/cpptoml/include/cpptoml.h:294:62: error: expected ‘>’ before ‘<’ token
../vendor/cpptoml/include/cpptoml.h:294:71: error: template argument 1 is invalid
     const static bool value = valid_value<typename std::decay<T>::type>::value
                                                                       ^
../vendor/cpptoml/include/cpptoml.h:294:74: error: expected ‘(’ before ‘value’
     const static bool value = valid_value<typename std::decay<T>::type>::value
                                                                          ^
cc1plus: error: expected ‘;’ at end of member declaration
../vendor/cpptoml/include/cpptoml.h:294:74: error: ‘value’ does not name a type
../vendor/cpptoml/include/cpptoml.h:299:38: error: ‘enable_if’ in namespace ‘std’ does not name a type
 struct value_traits<T, typename std::enable_if<
                                      ^
../vendor/cpptoml/include/cpptoml.h:299:47: error: expected template-argument before ‘<’ token
 struct value_traits<T, typename std::enable_if<
                                               ^
../vendor/cpptoml/include/cpptoml.h:299:47: error: expected ‘>’ before ‘<’ token
../vendor/cpptoml/include/cpptoml.h:300:78: error: template argument 2 is invalid
                            valid_value_or_string_convertible<T>::value>::type>
                                                                              ^
../vendor/cpptoml/include/cpptoml.h:301:1: error: expected ‘::’ before ‘{’ token
 {
 ^
../vendor/cpptoml/include/cpptoml.h:301:1: error: expected identifier before ‘{’ token
../vendor/cpptoml/include/cpptoml.h:301:1: error: qualified name does not name a class before ‘{’ token
../vendor/cpptoml/include/cpptoml.h:317:19: error: ‘enable_if’ in namespace ‘std’ does not name a type
     typename std::enable_if<
                   ^
../vendor/cpptoml/include/cpptoml.h:317:28: error: expected template-argument before ‘<’ token
     typename std::enable_if<
                            ^
../vendor/cpptoml/include/cpptoml.h:317:28: error: expected ‘>’ before ‘<’ token
../vendor/cpptoml/include/cpptoml.h:319:78: error: template argument 2 is invalid
         && std::is_floating_point<typename std::decay<T>::type>::value>::type>
                                                                              ^
../vendor/cpptoml/include/cpptoml.h:320:1: error: expected ‘::’ before ‘{’ token
 {
 ^
../vendor/cpptoml/include/cpptoml.h:320:1: error: expected identifier before ‘{’ token
../vendor/cpptoml/include/cpptoml.h:320:1: error: qualified name does not name a class before ‘{’ token
../vendor/cpptoml/include/cpptoml.h:333:22: error: ‘enable_if’ in namespace ‘std’ does not name a type
     T, typename std::enable_if<
                      ^
../vendor/cpptoml/include/cpptoml.h:333:31: error: expected template-argument before ‘<’ token
     T, typename std::enable_if<
                               ^
../vendor/cpptoml/include/cpptoml.h:333:31: error: expected ‘>’ before ‘<’ token
../vendor/cpptoml/include/cpptoml.h:336:73: error: template argument 2 is invalid
            && std::is_signed<typename std::decay<T>::type>::value>::type>
                                                                         ^
../vendor/cpptoml/include/cpptoml.h:337:1: error: expected ‘::’ before ‘{’ token
 {
 ^
../vendor/cpptoml/include/cpptoml.h:337:1: error: expected identifier before ‘{’ token
../vendor/cpptoml/include/cpptoml.h:337:1: error: qualified name does not name a class before ‘{’ token
../vendor/cpptoml/include/cpptoml.h:359:22: error: ‘enable_if’ in namespace ‘std’ does not name a type
     T, typename std::enable_if<
                      ^
../vendor/cpptoml/include/cpptoml.h:359:31: error: expected template-argument before ‘<’ token
     T, typename std::enable_if<
                               ^
../vendor/cpptoml/include/cpptoml.h:359:31: error: expected ‘>’ before ‘<’ token
../vendor/cpptoml/include/cpptoml.h:361:75: error: template argument 2 is invalid
            && std::is_unsigned<typename std::decay<T>::type>::value>::type>
                                                                           ^
../vendor/cpptoml/include/cpptoml.h:362:1: error: expected ‘::’ before ‘{’ token
 {
 ^
../vendor/cpptoml/include/cpptoml.h:362:1: error: expected identifier before ‘{’ token
../vendor/cpptoml/include/cpptoml.h:362:1: error: qualified name does not name a class before ‘{’ token
../vendor/cpptoml/include/cpptoml.h:384:11: error: expected nested-name-specifier before ‘return_type’
     using return_type = option<std::vector<T>>;
           ^
../vendor/cpptoml/include/cpptoml.h:384:11: error: using-declaration for non-member at class scope
../vendor/cpptoml/include/cpptoml.h:384:23: error: expected ‘;’ before ‘=’ token
     using return_type = option<std::vector<T>>;
                       ^
../vendor/cpptoml/include/cpptoml.h:384:23: error: expected unqualified-id before ‘=’ token
../vendor/cpptoml/include/cpptoml.h:390:11: error: expected nested-name-specifier before ‘return_type’
     using return_type = option<std::vector<std::shared_ptr<array>>>;
           ^
../vendor/cpptoml/include/cpptoml.h:390:11: error: using-declaration for non-member at class scope
../vendor/cpptoml/include/cpptoml.h:390:23: error: expected ‘;’ before ‘=’ token
     using return_type = option<std::vector<std::shared_ptr<array>>>;
                       ^
../vendor/cpptoml/include/cpptoml.h:390:23: error: expected unqualified-id before ‘=’ token
../vendor/cpptoml/include/cpptoml.h:394:8: error: ‘shared_ptr’ in namespace ‘std’ does not name a type
 inline std::shared_ptr<typename value_traits<T>::type> make_value(T&& val);
        ^
../vendor/cpptoml/include/cpptoml.h:395:8: error: ‘shared_ptr’ in namespace ‘std’ does not name a type
 inline std::shared_ptr<array> make_array();
        ^
../vendor/cpptoml/include/cpptoml.h:400:8: error: ‘shared_ptr’ in namespace ‘std’ does not name a type
 inline std::shared_ptr<T> make_element();
        ^
../vendor/cpptoml/include/cpptoml.h:403:8: error: ‘shared_ptr’ in namespace ‘std’ does not name a type
 inline std::shared_ptr<table> make_table();
        ^
../vendor/cpptoml/include/cpptoml.h:404:8: error: ‘shared_ptr’ in namespace ‘std’ does not name a type
 inline std::shared_ptr<table_array> make_table_array(bool is_inline = false);
        ^
../vendor/cpptoml/include/cpptoml.h:498:49: error: expected template-name before ‘<’ token
 class base : public std::enable_shared_from_this<base>
                                                 ^
../vendor/cpptoml/include/cpptoml.h:498:49: error: expected ‘{’ before ‘<’ token
../vendor/cpptoml/include/cpptoml.h:498:49: error: expected unqualified-id before ‘<’ token
BloomFilter_wrap.cxx:3850:1: error: expected ‘}’ at end of input

In the first few lines, the compiler asked for either -std=c++11 or -std=gnu++11 to be added to the command.

After adding it, other kinds of warnings appear:
g++ -std=c++11 -c BloomFilter_wrap.cxx -I/usr/lib64/perl5/CORE -fPIC -Dbool=char -O3

In file included from /usr/lib64/perl5/CORE/perl.h:3441:0,
                  from BloomFilter_wrap.cxx:744:
 /usr/lib64/perl5/CORE/pad.h:143:19: warning: invalid suffix on literal; C++11 requires a space between literal and identifier [-Wliteral-suffix]
   Perl_croak(aTHX_ "panic: illegal pad in %s: 0x%"UVxf"[0x%"UVxf"]",\
                    ^
 /usr/lib64/perl5/CORE/pad.h:143:54: warning: invalid suffix on literal; C++11 requires a space between literal and identifier [-Wliteral-suffix]
   Perl_croak(aTHX_ "panic: illegal pad in %s: 0x%"UVxf"[0x%"UVxf"]",\
                                                       ^
 /usr/lib64/perl5/CORE/pad.h:150:19: warning: invalid suffix on literal; C++11 requires a space between literal and identifier [-Wliteral-suffix]
   Perl_croak(aTHX_ "panic: invalid pad in %s: 0x%"UVxf"[0x%"UVxf"]",\

Links-Rails_Cobbler-Ntedit - Setup parameters

Hi @warrenlr,

First of all, thank you for the amazing software that you and your team have been developing. If possible, we would like some advice on using the BCGSC software in our pipelines.

We are working with several genomes, from fish to mammals, with high levels of heterozygosity and repeats. To mitigate these characteristics we have sequenced PacBio HiFi CCS datasets and short reads (PE 250); unfortunately the datasets are from different individuals.
Now we are trying to integrate and use both datasets to improve the base assembly.

Briefly, we have produced a PacBio HiFi assembly with hifiasm (contig N50, 3 Mb).
In addition, we have produced other assemblies (from the same PacBio dataset with different parameters).
Next, we have produced 3 short-read assemblies from the PE 250 datasets (average N50 of 8 kb).
Now we are wondering whether it would be possible to integrate all datasets with the following strategy.

1st - LINKS
Scaffolding of the base assembly with the short-read assemblies.
Below, you can find the parameters. These parameters were obtained from your previous study (https://www.mdpi.com/2073-4425/8/12/378/html).

LINKS -f Assembly.fa -s PE250Drafts.txt -k 26 -l 5 -a 0.3 -d 1000,2500,5000,7500,10000,12500,15000,20000 -t 10,5,5,4,4,3,3,2

Regarding -a 0.3, is it suitable for our purposes, or should we increase the value to 0.7 or 0.9?
Are the remaining parameters adequate? Does LINKS only perform the scaffolding, or does it introduce some content into the assembly? (As both datasets are from different individuals, in this step we would like to produce scaffolding without introducing "new nucleotide content".)

2nd - Rails & Cobbler
Re-scaffolding and gap-closing of the assembly from step one, with the remaining PacBio assemblies.

runRAILSminimap.sh Assembly_PE250.fa Other_Assemblies.fa 2000 0.95 0 5 nil

3rd - Rails & Cobbler
Re-scaffolding and gap-closing of the assembly from step two, with the PacBio raw reads.

runRAILSminimap.sh Assembly_PE250_Others.fa CCS.fa 2000 0.95 0 2 pacbio

Are the parameters adequate in steps 2 and 3?
Since we are working with CCS reads and assemblies, we have set the max. softclip to 0 in steps 2 and 3. Should we increase or decrease the minimum sequence identity fraction?

4th - ntEdit
Polishing the final assembly with PacBio reads using ntEdit.

Thank you in advance,
André

PacBio scaffolding no change. Any recommendations ?

Hello,

I have 2 species with a 1.2 Gb genome size whose assemblies are quite fragmented (N50 species 1: 60k, N50 species 2: 110k), made with Illumina reads.

For species 1: I have 25X PacBio uncorrected

For species 2: I have 25X PacBio corrected by CANU.

I have tried only species 2 for now and followed your recommendation about memory optimization (I am limited to 400 GB RAM): -t 200 -d 500

These parameters didn't change anything (or very minor).

Do you have any recommendations for these 2 cases ?

Thanks a lot.

parameter information

hello!
after installing LINKS with anaconda, running it is pretty easy and straight forward!

but now i have a question regarding the parameters: -k, -d and -t
-k is the most straightforward one, but i have problems understanding how these three play together and how they influence the scaffolding process.
i tried to look into the closed tickets and i got some information, but i'm still not sure when to fine-tune what and what the consequences will be. would it be possible for you to describe it a little bit more, so that i get a better understanding? i'm totally aware that there is no strict recipe to follow.
that's why i would like to better understand what the parameters are doing and what will happen if i change them in one or the other direction.

for one, i have problems understanding what -t does: does a lower -t mean less RAM and a higher -t more RAM?
so how does it then influence the scaffolding? the default is -t 2, but what does -t 4 or -t 100 do?

for -d, what i understand is that a smaller -d is advised for many small scaffolds and a bigger -d for fewer, larger scaffolds. so if you have an assembly with a combination, should multiple distances be used? and how does -t influence such assemblies?

it would be nice if you have time for answering this ticket!
i really like working with this program and hope to improve my outcome further :)
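
A rough way to picture how -k, -d and -t interact is the sketch below. It is only an illustration in Python, not the LINKS code: it assumes that -d is the length of the window spanned by a k-mer pair (start of the first k-mer to end of the second) and that the window slides along each read in steps of -t, as the "dist=..., sliding step=..." line of the LINKS log suggests. The function names `kmer_pairs` and `count_pairs` are made up for this example, and the Bloom filter screening that LINKS applies to k-mers is left out.

```python
def kmer_pairs(read, k=15, d=4000, t=2):
    """Return (position, first_kmer, second_kmer) tuples for one read.

    Assumptions (not confirmed against the LINKS source):
    - d is the window length from the start of the first k-mer
      to the end of the second k-mer;
    - the window start advances t nucleotides at a time.
    """
    pairs = []
    for pos in range(0, len(read) - d + 1, t):
        first = read[pos:pos + k]            # k-mer at the window start
        second = read[pos + d - k:pos + d]   # k-mer at the window end
        pairs.append((pos, first, second))
    return pairs


def count_pairs(read_length, d, t):
    """Number of windows of length d that fit in a read when stepping by t."""
    if read_length < d:
        return 0  # a read shorter than d cannot hold a single pair
    return (read_length - d) // t + 1


if __name__ == "__main__":
    for step in (2, 4, 100, 200):
        print("t =", step, "->", count_pairs(10_000, 4000, step), "pairs per 10 kb read")
```

Under these assumptions a 10 kb read gives 3001 candidate pairs at -d 4000 -t 2 but only 31 at -t 200, which would explain why a larger -t lowers memory use at the cost of fewer pairs supporting each link. A read shorter than -d contributes no pairs at all.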

writeBloom.pl

Hi,
After much trouble, I succeeded in compiling bloomfilter using g++ 4.8.
Now when running either through links or through tools I get the following error:

Attempt to free unreferenced scalar: SV 0x55eef06034f8, Perl interpreter: 0x55eef05e2010 at /local/workdir/eoren/links_v1.8.6/tools/../lib/bloomfilter/swig/BloomFilter.pm line 118.
Wed Jan 2 11:48:23 EST 2019:Shredding supplied sequence file (-f ../../melon/assembly/dul/input/ONT/dul2.short.fastq) into 21-mers..
Contigs processed k=21:
1

Something went wrong running writeBloom.pl Wed Jan  2 11:50:10 EST 2019'
RuntimeError Usage: insertSeq(bloom,seq,numHashes,k); at writeBloom.pl line 184, <IN> line 5278940.

BloomFilter_wrap.cxx:763:20: fatal error: EXTERN.h: No such file or directory

Hi,
After a long struggle, I successfully compiled the Perl module.
When I try to assemble using the following command, I am getting the following error.

./../bin/bin/LINKS-master/bin/LINKS -f scaffoldsK77_full_ONT\ extraction.fasta -s 201907_ONT_PSJ.fastq.gz -d 1000,2500,5000,7500,10000,12500,15000,30000 -t 10,5,5,4,4,3,3,2 -b ./LINK_ONT_23032020

Can't locate BloomFilter.pm in @INC (you may need to install the BloomFilter module) (@INC contains: /home/pmslab/Desktop/Raman/bin/bin/LINKS-master/bin/./lib/bloomfilter/swig /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.22.1 /usr/local/share/perl/5.22.1 /usr/lib/x86_64-linux-gnu/perl5/5.22 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.22 /usr/share/perl/5.22 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base .) at ./../bin/bin/LINKS-master/bin/LINKS line 26.

BEGIN failed--compilation aborted at ./../bin/bin/LINKS-master/bin/LINKS line 26.

Please help to sort out this issue. Thank you.

LINKS segmentation error

LINKS terminates with a segmentation fault during long-read processing, with no further information to help debug.

[ekw10@debruijn Pear]$ /scratch/biotools/software/assembly/links_v1.8.5/LINKS -f final.genome.scf.ge500bp.fasta -r final.genome.scf.ge500bp.fasta_k17_p0.001_rolling.bloom -s PacBio_reads.txt -d 1000 -k 17 -t 150 -l 5 -z 500 -b linked -v

Running: /scratch/biotools/software/assembly/links_v1.8.5/LINKS [v1.8.5]
-f final.genome.scf.ge500bp.fasta
-s PacBio_reads.txt
-m 
-d 1000
-k 17
-e 0.1
-l 5
-a 0.3
-t 150
-o 0
-z 500
-b linked
-r final.genome.scf.ge500bp.fasta_k17_p0.001_rolling.bloom
-p 0.001
-x 0

----------------- Verifying files -----------------

Checking /scratch/projects/Pear/pear.trimmedReads.fasta.gz...ok
Checking sequence target file final.genome.scf.ge500bp.fasta...ok


=>Reading contig/sequence assembly file: Mon May 27 22:51:56 EDT 2019
A Bloom filter was supplied (final.genome.scf.ge500bp.fasta_k17_p0.001_rolling.bloom) and will be used instead of building a new one from -f final.genome.scf.ge500bp.fasta
Checking Bloom filter file final.genome.scf.ge500bp.fasta_k17_p0.001_rolling.bloom...ok
Loading bloom filter of size 9726797760 from final.genome.scf.ge500bp.fasta_k17_p0.001_rolling.bloom


=>Reading long reads, building hash table: Mon May 27 22:51:57 EDT 2019
Reads processed k=17, dist=1000, offset=0 nt, sliding step=150 nt:

Reads processed from file 1/1, /scratch/projects/Pear/pear.trimmedReads.fasta.gz:
1Segmentation fault

Output relative to preexisting scaffold names?

Hello,
Links seems to use its own internal scaffold names.
For the output to be usable I'd really want a table reusing the names in my old assembly. (and ideally this table would also indicate which or how many reads support each link; I understand if algorithmically the latter is unfeasible). The AGP format (or the format used by 454 back in the day) are relatively standard...
Kind regards,
Yannick

packaging in Anaconda - request/question

Given that an older LINKS version had been produced as an Anaconda distribution (see here), I was wondering if it would be possible to package the more recent versions in the same manner?
If so, would the dependency of installing the PERL modules persist, or automatically be generated? It's been my experience (as a total computer novice) that the dependency issue has required system administrator privileges and has made troubleshooting more time consuming.
Thanks for your consideration and information

LINKS is not finding any kmer pairs in my ONT reads

Hi I've tried this with default settings and some variations on -d and -t, but no luck- Links does not find any kmer pairs! My read avg length is about 10kb and N50 25kb; here's the stdout for current run with default settings:
`Contigs (>= 500 bp) processed k=15:
248522

=>Writing Bloom filter to disk (links-default/links-def.bloom) : Mon Jul 23 14:43:31 CDT 2018
Storing filter. Filter is 1303720144bytes.
Writting header... magic: BlOOMFXX hlen: 72 size: 10429761152 nhash: 9 kmer: 15 dFPR: 0 aFPR: 0 rFPR: 0 nEntry: 0 tEntry: 0

=>Reading long reads, building hash table : Mon Jul 23 14:43:32 CDT 2018
Reads processed k=15, dist=4000, offset=0 nt, sliding step=2 nt:

Reads processed from file 1/1, /media/bigdata2/zeh/nanopore_all_cell1.fastq:
341656
Extracted 0 15-mer pairs at -d 4000, from all 341656 sequences provided in /media/bigdata2/zeh/reads1.txt

Extracted 0 15-mer pairs overall. This is the set that will be used for scaffolding`

No scaffolding in the result.

Hello,
I have been trying to scaffold a 2.5 Gb genome using nanopore reads. The commands I am using are as follows. None of the iterations seems to scaffold the genome. The corresponding assembly file shows no scaffolding. Can you think of a reason why this is happening? I have around 5X ONT data. I have also included the log file for the first iteration.

`./LINKS -f $INFASTA -s $FOFPATH -b CPI_OG1000 -d 1000 -t 10 -k 15 -l 5 -a 0.3
./LINKS -f CPI_OG1000.scaffolds.fa -s $FOFPATH -b CPI_OG2500 -d 2500 -t 5 -k 15 -l 5 -a 0.3 -r CPI_OG1000.bloom
./LINKS -f CPI_OG2500.scaffolds.fa -s $FOFPATH -b CPI_OG5000 -d 5000 -t 5 -k 15 -l 5 -a 0.3 -r CPI_OG1000.bloom
./LINKS -f CPI_OG5000.scaffolds.fa -s $FOFPATH -b CPI_OG7500 -d 7500 -t 4 -k 15 -l 5 -a 0.3 -r CPI_OG1000.bloom
./LINKS -f CPI_OG7500.scaffolds.fa -s $FOFPATH -b CPI_OG10000 -d 10000 -t 4 -k 15 -l 5 -a 0.3 -r CPI_OG1000.bloom
./LINKS -f CPI_OG10000.scaffolds.fa -s $FOFPATH -b CPI_OG12500 -d 12500 -t 3 -k 15 -l 5 -a 0.3 -r CPI_OG1000.bloom
./LINKS -f CPI_OG12500.scaffolds.fa -s $FOFPATH -b CPI_OG15000 -d 15000 -t 3 -k 15 -l 5 -a 0.3 -r CPI_OG1000.bloom
./LINKS -f CPI_OG15000.scaffolds.fa -s $FOFPATH -b CPI_OG30000 -d 30000 -t 2 -k 15 -l 5 -a 0.3 -r CPI_OG1000.bloom

`

CPI_OG1000.log
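
The eight commands in this post follow one pattern: each iteration scaffolds the output of the previous one with a larger -d and a smaller -t, and reuses the Bloom filter written by the first run. A small Python sketch that generates such a command list is given here to make the pattern explicit. The helper name `iterative_links_commands` and its arguments are hypothetical, and only options that already appear in the post are used.

```python
def iterative_links_commands(assembly, fof, prefix, distances, steps,
                             k=15, l=5, a=0.3):
    """Build one LINKS command line per (distance, step) pair.

    Each run reads the scaffolds written by the previous run, and every
    run after the first reuses the Bloom filter from the first run (-r),
    mirroring the commands in the post. Output file names
    (<base>.scaffolds.fa, <base>.bloom) are taken from that post.
    """
    if len(distances) != len(steps):
        raise ValueError("distances and steps must have the same length")

    commands = []
    current = assembly
    bloom = None
    for d, t in zip(distances, steps):
        base = f"{prefix}{d}"
        cmd = (f"./LINKS -f {current} -s {fof} -b {base} "
               f"-d {d} -t {t} -k {k} -l {l} -a {a}")
        if bloom is None:
            bloom = f"{base}.bloom"      # written by the first iteration
        else:
            cmd += f" -r {bloom}"        # reused by later iterations
        commands.append(cmd)
        current = f"{base}.scaffolds.fa"  # input of the next iteration
    return commands


if __name__ == "__main__":
    dists = [1000, 2500, 5000, 7500, 10000, 12500, 15000, 30000]
    steps = [10, 5, 5, 4, 4, 3, 3, 2]
    for line in iterative_links_commands("$INFASTA", "$FOFPATH", "CPI_OG", dists, steps):
        print(line)
```

Printing the commands before running them makes it easy to check that every iteration points at the scaffolds file of the one before it.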

Installing LINKS with the latest SWIG version 3.0.12

Hi, I am kind of a newbie and I am struggling to install Links,
I have downloaded the tar file and extracted it, I installed SWIG (3.0.12), etc, then got to the TEST INSTALL step and got the following output:
ubuntu@tee9seq:/mnt/cris/links_v1.8.5/lib/bloomfilter/swig$ ./test.pl
de novo bf tests done
Storing filter. Filter is 125000000bytes.
Loading header...
premade bf tests done
Filter Info: Pop - 20, numHash - 5, kmerSize - 20, size - 1000000000
0 TAGAA found
1 AGAAT found
2 GAATC found
3 AATCA found
4 ATCAC found
5 TCACC found
6 CACCC found
7 ACCCA found
8 CCCAA found
9 CCAAA found
10 CAAAG found
11 AAAGA found
Done!

Is this output ok?

Also, I installed LINKS in a different directory. Could someone explain to me how to do step 4 (CHANGE the path to BloomFilter.pm in LINKS/writeBloom.pl/testBloom.pl)?

Thanks!

Cris.

LINKS run terminated without any error message in the .log file, not getting output files

Hi
I am running LINKS to scaffold a draft genome assembly using the following command:

LINKS -f /home/ashutosh/assembly/assembly_1kb.fa -s /home/ashutosh/assembly/fof -b 1kb_raw

However, after producing the 1kb_raw.bloom file, no more output files are generated, and the log file shows the following status after the LINKS run terminated:

### Running: /home/ashutosh/LINKS/releases/links_v1.8.6/LINKS [v1.8.6]
-f /home/ashutosh/assembly/assembly_1kb.fa
-s /home/ashutosh/assembly/fof
-m
-d 4000
-k 15
-e 0.1
-l 5
-a 0.3
-t 2
-o 0
-z 500
-b 1kb_raw
-r
-p 0.001
-x 0

----------------- Verifying files -----------------

Checking /home/ashutosh/asmset_longread/assembly.chopped.filt.fastq...ok
Checking sequence target file /home/ashutosh/assembly/assembly_1kb.fa...ok

=>Reading contig/sequence assembly file : Wed Jul  4 11:42:36 AEST 2018
Building a Bloom filter using 15-mers derived from sequences in -f /home/ashutosh/assembly/assembly_1kb.fa...
*****
Bloom filter specs
elements=908707947
FPR=0.001
size (bits)=13065028096
hash functions=9
*****
Contigs (>= 500 bp) processed k=15:
2866

=>Writing Bloom filter to disk (1kb_raw.bloom) : Wed Jul  4 12:05:52 AEST 2018

Could you please advise me regarding this issue?

Thanks in advance
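
The "Bloom filter specs" in the log of this post can be sanity-checked with the textbook Bloom filter formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m/n) ln 2 hash functions. The Python sketch below is not taken from LINKS; rounding the size up to a multiple of 64 bits and truncating the number of hash functions are assumptions, chosen because they reproduce the logged values.

```python
import math


def bloom_filter_specs(elements, fpr, word_bits=64):
    """Return (size_in_bits, hash_functions) for a standard Bloom filter.

    elements : expected number of inserted k-mers (n)
    fpr      : target false positive rate (p)
    word_bits: the size is rounded up to a multiple of this many bits
               (an assumption, not confirmed against the LINKS source)
    """
    # Optimal number of bits: m = -n * ln(p) / (ln 2)^2
    bits = math.ceil(-elements * math.log(fpr) / (math.log(2) ** 2))
    # Round up to a whole number of machine words
    bits = -(-bits // word_bits) * word_bits
    # Optimal number of hash functions, truncated: k = (m / n) * ln 2
    hashes = int(bits / elements * math.log(2))
    return bits, hashes


if __name__ == "__main__":
    bits, hashes = bloom_filter_specs(908707947, 0.001)
    print("size (bits) =", bits)
    print("size (bytes) =", bits // 8)
    print("hash functions =", hashes)
```

With elements=908707947 and FPR=0.001 this gives 13065028096 bits (about 1.6 GB) and 9 hash functions, the same numbers as in the log, so the filter itself appears to have been sized as expected before the run stopped.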

Does LINKs break pre-existing scaffolds?

It's not a problem, but I just wanted clarification- it appears that LINKs does NOT break a supplied sequence with internal NNNs (eg generated contigs from an input scaffold sequence). Can that be confirmed, and would you suggest I try running only unscaffolded contigs, in case my input scaffolds have errors?
Thanks! Scott
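
If you do want to test LINKS on unscaffolded contigs, the input scaffolds can be cut at their runs of N beforehand. The Python sketch below is one way to do that, assuming the sequences are already loaded as (name, sequence) pairs. The function name `split_at_gaps`, the `min_gap` parameter and the `_1`, `_2` suffix naming are illustrative choices, not part of LINKS.

```python
import re


def split_at_gaps(name, sequence, min_gap=1):
    """Cut a scaffold into contigs at every run of at least min_gap N's.

    Returns a list of (contig_name, contig_sequence) tuples; empty pieces
    (for example from leading or trailing N's) are dropped.
    """
    pieces = re.split(r"[Nn]{%d,}" % min_gap, sequence)
    contigs = []
    for piece in pieces:
        if piece:
            contigs.append((f"{name}_{len(contigs) + 1}", piece))
    return contigs


if __name__ == "__main__":
    for contig_name, contig_seq in split_at_gaps("scaffold1", "ACGTNNNNGGCCnnA"):
        print(">" + contig_name)
        print(contig_seq)
```

Raising `min_gap` keeps short N runs (single ambiguous bases, for instance) inside a contig instead of splitting on them.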

Job killed even though I have enough RAM

Hi,

I've got one of those ugly errors (v1.8.7). My job is killed even though I have 300G of RAM. The max RAM usage was 90G. Anything I can do? Perhaps recompile?

=>Reading long reads, building hash table : Wed Dec  9 09:59:02 CET 2020
Reads processed k=20, dist=4000, offset=0 nt, sliding step=2 nt:

>Reads processed from file 1/1, MY_DIR/d_tenuifolia/reads.fa:
183075(base)

Writing a 1004578712 byte filter to d_tenuifolia_scaf.bloom on disk.
/var/spool/pbs/mom_priv/jobs/7048737.hpc-batch14.SC: line 24: 118682 Killed        /gpfs/project/projects/qggp/src/LINKS/bin/LINKS -f assembly-renamed.fa -s reads.fof -k 20 -b ${species}_scaf -l 5 -t 2 -a 0.3 

Cheers,
Ricardo

installation and usage help

hello,

there is a lot of info in the readme. But I am having some trouble.

I downloaded from the link in the other issue thread (as downloading the .tar.gz files for the versions isn't an option in the documentation...)

I unzipped:

tar -zxvf ~/links_v1-8-6.tar.gz
x links_v1.8.6/
x links_v1.8.6/README.md
x links_v1.8.6/tools/
x links_v1.8.6/tools/testBloom.pl
x links_v1.8.6/tools/makeMPETOutput2EQUALfiles.pl
x links_v1.8.6/tools/writeBloom.pl
x links_v1.8.6/tools/consolidateGraphs.pl
x links_v1.8.6/lib/
x links_v1.8.6/lib/bloomfilter/
x links_v1.8.6/lib/bloomfilter/autogen.sh
x links_v1.8.6/lib/bloomfilter/bloomfilter.py
x links_v1.8.6/lib/bloomfilter/configure.ac
x links_v1.8.6/lib/bloomfilter/.gitignore
x links_v1.8.6/lib/bloomfilter/AUTHORS
x links_v1.8.6/lib/bloomfilter/BloomFilter.hpp
x links_v1.8.6/lib/bloomfilter/BloomFilterInfo.hpp
x links_v1.8.6/lib/bloomfilter/BloomFilterUtil.h
x links_v1.8.6/lib/bloomfilter/BloomFilter_pythonwrapper.cpp
x links_v1.8.6/lib/bloomfilter/BloomMap.hpp
x links_v1.8.6/lib/bloomfilter/COPYING
x links_v1.8.6/lib/bloomfilter/ChangeLog
x links_v1.8.6/lib/bloomfilter/ConvertUTF.h
x links_v1.8.6/lib/bloomfilter/CountingBloomFilter.hpp
x links_v1.8.6/lib/bloomfilter/Examples/
x links_v1.8.6/lib/bloomfilter/Examples/RollingFilterLoad.cpp
x links_v1.8.6/lib/bloomfilter/Jamroot
x links_v1.8.6/lib/bloomfilter/Makefile.am
x links_v1.8.6/lib/bloomfilter/BloomFilter.pm
x links_v1.8.6/lib/bloomfilter/python_wrapper.readme
x links_v1.8.6/lib/bloomfilter/rolling.h
x links_v1.8.6/lib/bloomfilter/swig/
x links_v1.8.6/lib/bloomfilter/swig/BloomFilter.i
x links_v1.8.6/lib/bloomfilter/swig/README.md
x links_v1.8.6/lib/bloomfilter/swig/test.pl
x links_v1.8.6/lib/bloomfilter/swig/testBloom_rolling.cpp
x links_v1.8.6/lib/bloomfilter/swig/testBloom_rolling.pl
x links_v1.8.6/lib/bloomfilter/swig/writeBloom_rolling.cpp
x links_v1.8.6/lib/bloomfilter/swig/writeBloom_rolling.pl
x links_v1.8.6/lib/bloomfilter/swig/BloomFilter_wrap.cxx
x links_v1.8.6/lib/bloomfilter/swig/BloomFilter.pm
x links_v1.8.6/lib/bloomfilter/swig/BloomFilter_wrap.o
x links_v1.8.6/lib/bloomfilter/swig/BloomFilter.so
x links_v1.8.6/lib/bloomfilter/swig/BloomFilter.bf
x links_v1.8.6/lib/bloomfilter/NEWS
x links_v1.8.6/lib/bloomfilter/README
x links_v1.8.6/lib/bloomfilter/README.md
x links_v1.8.6/lib/bloomfilter/RollingHash.h
x links_v1.8.6/lib/bloomfilter/RollingHashIterator.h
x links_v1.8.6/lib/bloomfilter/SimpleIni.h
x links_v1.8.6/lib/bloomfilter/Tests/
x links_v1.8.6/lib/bloomfilter/Tests/AdHoc/
x links_v1.8.6/lib/bloomfilter/Tests/AdHoc/BloomFilterInfoTests.cpp
x links_v1.8.6/lib/bloomfilter/Tests/AdHoc/BloomFilterTest.cpp
x links_v1.8.6/lib/bloomfilter/Tests/AdHoc/Makefile.am
x links_v1.8.6/lib/bloomfilter/Tests/AdHoc/ParallelFilter.cpp
x links_v1.8.6/lib/bloomfilter/Tests/Unit/
x links_v1.8.6/lib/bloomfilter/Tests/Unit/BloomFilterTests.cpp
x links_v1.8.6/lib/bloomfilter/Tests/Unit/BloomMapTests.cpp
x links_v1.8.6/lib/bloomfilter/Tests/Unit/CountingBloomFilterTests.cpp
x links_v1.8.6/lib/bloomfilter/Tests/Unit/Makefile.am
x links_v1.8.6/lib/bloomfilter/Tests/Unit/catch.hpp
x links_v1.8.6/lib/bloomfilter/BloomFilter.i
x links_v1.8.6/lib/bloomfilter/BloomFilter_wrap.cxx
x links_v1.8.6/lib/bloomfilter/.git/
x links_v1.8.6/lib/bloomfilter/.git/HEAD
x links_v1.8.6/lib/bloomfilter/.git/config
x links_v1.8.6/lib/bloomfilter/.git/logs/
x links_v1.8.6/lib/bloomfilter/.git/logs/refs/
x links_v1.8.6/lib/bloomfilter/.git/logs/refs/remotes/
x links_v1.8.6/lib/bloomfilter/.git/logs/refs/remotes/origin/
x links_v1.8.6/lib/bloomfilter/.git/logs/refs/remotes/origin/HEAD
x links_v1.8.6/lib/bloomfilter/.git/logs/refs/heads/
x links_v1.8.6/lib/bloomfilter/.git/logs/refs/heads/master
x links_v1.8.6/lib/bloomfilter/.git/logs/HEAD
x links_v1.8.6/lib/bloomfilter/.git/info/
x links_v1.8.6/lib/bloomfilter/.git/info/exclude
x links_v1.8.6/lib/bloomfilter/.git/objects/
x links_v1.8.6/lib/bloomfilter/.git/objects/pack/
x links_v1.8.6/lib/bloomfilter/.git/objects/pack/pack-66e9f1f975904fa5440f06dd377cbbb6ce384015.pack
x links_v1.8.6/lib/bloomfilter/.git/objects/pack/pack-66e9f1f975904fa5440f06dd377cbbb6ce384015.idx
x links_v1.8.6/lib/bloomfilter/.git/objects/info/
x links_v1.8.6/lib/bloomfilter/.git/description
x links_v1.8.6/lib/bloomfilter/.git/hooks/
x links_v1.8.6/lib/bloomfilter/.git/hooks/pre-push.sample
x links_v1.8.6/lib/bloomfilter/.git/hooks/update.sample
x links_v1.8.6/lib/bloomfilter/.git/hooks/pre-applypatch.sample
x links_v1.8.6/lib/bloomfilter/.git/hooks/post-update.sample
x links_v1.8.6/lib/bloomfilter/.git/hooks/pre-commit.sample
x links_v1.8.6/lib/bloomfilter/.git/hooks/prepare-commit-msg.sample
x links_v1.8.6/lib/bloomfilter/.git/hooks/applypatch-msg.sample
x links_v1.8.6/lib/bloomfilter/.git/hooks/commit-msg.sample
x links_v1.8.6/lib/bloomfilter/.git/hooks/pre-rebase.sample
x links_v1.8.6/lib/bloomfilter/.git/packed-refs
x links_v1.8.6/lib/bloomfilter/.git/index
x links_v1.8.6/lib/bloomfilter/.git/refs/
x links_v1.8.6/lib/bloomfilter/.git/refs/heads/
x links_v1.8.6/lib/bloomfilter/.git/refs/heads/master
x links_v1.8.6/lib/bloomfilter/.git/refs/tags/
x links_v1.8.6/lib/bloomfilter/.git/refs/remotes/
x links_v1.8.6/lib/bloomfilter/.git/refs/remotes/origin/
x links_v1.8.6/lib/bloomfilter/.git/refs/remotes/origin/HEAD
x links_v1.8.6/LINKS
x links_v1.8.6/LINKS.pl
x links_v1.8.6/test/
x links_v1.8.6/test/LINKSrecipe_pglaucaPG29-WS77111.sh
x links_v1.8.6/test/runall.sh
x links_v1.8.6/test/LINKSrecipe_athaliana_raw.sh
x links_v1.8.6/test/runme_EcoliK12singleMPET.sh
x links_v1.8.6/test/runme_EcoliK12iterativeA2D.sh
x links_v1.8.6/test/runIterativeLINKS_ECK12A2D.sh
x links_v1.8.6/test/runIterativeLINKS_ECK12raw.sh
x links_v1.8.6/test/runIterativeLINKS_ECK12.sh
x links_v1.8.6/test/runIterativeLINKS_SCS288c.sh
x links_v1.8.6/test/runIterativeLINKS_SCW303.sh
x links_v1.8.6/test/runIterativeLINKS_STH58.sh
x links_v1.8.6/test/runme_EcoliK12iterativeRAW.sh
x links_v1.8.6/test/runme_EcoliK12iterative.sh
x links_v1.8.6/test/runme_EcoliK12single.sh
x links_v1.8.6/test/runme_ScerevisiaeS288citerative.sh
x links_v1.8.6/test/runme_ScerevisiaeW303iterative.sh
x links_v1.8.6/test/runme_StyphiH58iterative.sh
x links_v1.8.6/test/LINKSrecipe_athaliana_ectools.sh

It went somewhere in my path:

$ ls links_v1.8.6/
LINKS LINKS.pl README.md lib/ test/ tools/

And then there is a break in my understanding of how to run this, as documentation gets a bit into an example with all the parameters and not much else.

it reads to do:

./LINKS
-bash: ./LINKS: No such file or directory
(base) ARSLA18061508:pagglo smwaters$

so, i do

links_v1.8.6/LINKS -f amprain.pagglo.fa
Can't locate loadable object for module BloomFilter in @INC (@INC contains: /Users/smwaters/Documents/RESEARCH/sequencing/MinION/RUNS/amprain_blast_read_lists/pagglo/links_v1.8.6/./lib/bloomfilter/swig /Users/smwaters/anaconda3/lib/site_perl/5.26.2/darwin-thread-multi-2level /Users/smwaters/anaconda3/lib/site_perl/5.26.2 /Users/smwaters/anaconda3/lib/5.26.2/darwin-thread-multi-2level /Users/smwaters/anaconda3/lib/5.26.2 .) at /Users/smwaters/Documents/RESEARCH/sequencing/MinION/RUNS/amprain_blast_read_lists/pagglo/links_v1.8.6/./lib/bloomfilter/swig/BloomFilter.pm line 11.
Compilation failed in require at links_v1.8.6/LINKS line 26.
BEGIN failed--compilation aborted at links_v1.8.6/LINKS line 26.

or

perl links_v1.8.6/LINKS.pl -f amprain.pagglo.fa
Can't locate loadable object for module BloomFilter in @INC (@INC contains: /Users/smwaters/Documents/RESEARCH/sequencing/MinION/RUNS/amprain_blast_read_lists/pagglo/links_v1.8.6/./lib/bloomfilter/swig /Users/smwaters/anaconda3/lib/site_perl/5.26.2/darwin-thread-multi-2level /Users/smwaters/anaconda3/lib/site_perl/5.26.2 /Users/smwaters/anaconda3/lib/5.26.2/darwin-thread-multi-2level /Users/smwaters/anaconda3/lib/5.26.2 .) at /Users/smwaters/Documents/RESEARCH/sequencing/MinION/RUNS/amprain_blast_read_lists/pagglo/links_v1.8.6/./lib/bloomfilter/swig/BloomFilter.pm line 11.
Compilation failed in require at links_v1.8.6/LINKS.pl line 26.
BEGIN failed--compilation aborted at links_v1.8.6/LINKS.pl line 26.

I can see there is an issue with the compilation, but I don't know how to fix it. I can also see it is looking for BloomFilter, but it looked like there was a bloomfilter already there, because when I do:

(base) ARSLA18061508:pagglo smwaters$ cd ./links_v1.8.6/lib
(base) ARSLA18061508:lib smwaters$ git clone git://github.com/bcgsc/bloomfilter.git
fatal: destination path 'bloomfilter' already exists and is not an empty directory.
(base) ARSLA18061508:lib smwaters$ ls
bloomfilter
(base) ARSLA18061508:lib smwaters$ cd swig
-bash: cd: swig: No such file or directory
(base) ARSLA18061508:lib smwaters$

So...
realizing there was a newer release of LINKS, downloaded that:
tar -zxvf LINKS-1.8.7.tar.gz
x LINKS-1.8.7/
x LINKS-1.8.7/LICENSE
x LINKS-1.8.7/README.md
x LINKS-1.8.7/bin/
x LINKS-1.8.7/bin/LINKS
x LINKS-1.8.7/links-logo.png
x LINKS-1.8.7/releases/
x LINKS-1.8.7/releases/binaries/
x LINKS-1.8.7/releases/binaries/links_v1-5-1.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-5-2.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-5.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-6-1.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-6.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-7.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-8-1.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-8-2.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-8-3.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-8-4.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-8-5.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-8-6.tar.gz
x LINKS-1.8.7/releases/binaries/links_v1-8.tar.gz
x LINKS-1.8.7/releases/links_v1.8.4/
x LINKS-1.8.7/releases/links_v1.8.4/LINKS
x LINKS-1.8.7/releases/links_v1.8.4/LINKS-readme.pdf
x LINKS-1.8.7/releases/links_v1.8.4/LINKS-readme.txt
x LINKS-1.8.7/releases/links_v1.8.4/LINKS.pl
x LINKS-1.8.7/releases/links_v1.8.4/test/
x LINKS-1.8.7/releases/links_v1.8.4/test/LINKSrecipe_athaliana_ectools.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/LINKSrecipe_athaliana_raw.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/LINKSrecipe_pglaucaPG29-WS77111.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runIterativeLINKS_ECK12.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runIterativeLINKS_ECK12A2D.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runIterativeLINKS_ECK12raw.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runIterativeLINKS_SCS288c.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runIterativeLINKS_SCW303.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runIterativeLINKS_STH58.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runall.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runme_EcoliK12iterative.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runme_EcoliK12iterativeA2D.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runme_EcoliK12iterativeRAW.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runme_EcoliK12single.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runme_EcoliK12singleMPET.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runme_ScerevisiaeS288citerative.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runme_ScerevisiaeW303iterative.sh
x LINKS-1.8.7/releases/links_v1.8.4/test/runme_StyphiH58iterative.sh
x LINKS-1.8.7/releases/links_v1.8.4/tools/
x LINKS-1.8.7/releases/links_v1.8.4/tools/makeMPETOutput2EQUALfiles.pl
x LINKS-1.8.7/releases/links_v1.8.4/tools/testBloom.pl
x LINKS-1.8.7/releases/links_v1.8.4/tools/writeBloom.pl
x LINKS-1.8.7/releases/links_v1.8.5/
x LINKS-1.8.7/releases/links_v1.8.5/LINKS
x LINKS-1.8.7/releases/links_v1.8.5/LINKS-readme.pdf
x LINKS-1.8.7/releases/links_v1.8.5/LINKS-readme.txt
x LINKS-1.8.7/releases/links_v1.8.5/LINKS.pl
x LINKS-1.8.7/releases/links_v1.8.5/lib.tar.gz
x LINKS-1.8.7/releases/links_v1.8.5/test/
x LINKS-1.8.7/releases/links_v1.8.5/test/LINKSrecipe_athaliana_ectools.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/LINKSrecipe_athaliana_raw.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/LINKSrecipe_pglaucaPG29-WS77111.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runIterativeLINKS_ECK12.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runIterativeLINKS_ECK12A2D.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runIterativeLINKS_ECK12raw.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runIterativeLINKS_SCS288c.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runIterativeLINKS_SCW303.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runIterativeLINKS_STH58.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runall.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runme_EcoliK12iterative.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runme_EcoliK12iterativeA2D.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runme_EcoliK12iterativeRAW.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runme_EcoliK12single.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runme_EcoliK12singleMPET.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runme_ScerevisiaeS288citerative.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runme_ScerevisiaeW303iterative.sh
x LINKS-1.8.7/releases/links_v1.8.5/test/runme_StyphiH58iterative.sh
x LINKS-1.8.7/releases/links_v1.8.5/tools/
x LINKS-1.8.7/releases/links_v1.8.5/tools/makeMPETOutput2EQUALfiles.pl
x LINKS-1.8.7/releases/links_v1.8.5/tools/testBloom.pl
x LINKS-1.8.7/releases/links_v1.8.5/tools/writeBloom.pl
x LINKS-1.8.7/releases/links_v1.8.6/
x LINKS-1.8.7/releases/links_v1.8.6/LINKS
x LINKS-1.8.7/releases/links_v1.8.6/LINKS.pl
x LINKS-1.8.7/releases/links_v1.8.6/README.md
x LINKS-1.8.7/releases/links_v1.8.6/lib.tar.gz
x LINKS-1.8.7/releases/links_v1.8.6/test/
x LINKS-1.8.7/releases/links_v1.8.6/test/LINKSrecipe_athaliana_ectools.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/LINKSrecipe_athaliana_raw.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/LINKSrecipe_pglaucaPG29-WS77111.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runIterativeLINKS_ECK12.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runIterativeLINKS_ECK12A2D.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runIterativeLINKS_ECK12raw.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runIterativeLINKS_SCS288c.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runIterativeLINKS_SCW303.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runIterativeLINKS_STH58.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runall.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runme_EcoliK12iterative.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runme_EcoliK12iterativeA2D.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runme_EcoliK12iterativeRAW.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runme_EcoliK12single.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runme_EcoliK12singleMPET.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runme_ScerevisiaeS288citerative.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runme_ScerevisiaeW303iterative.sh
x LINKS-1.8.7/releases/links_v1.8.6/test/runme_StyphiH58iterative.sh
x LINKS-1.8.7/releases/links_v1.8.6/tools/
x LINKS-1.8.7/releases/links_v1.8.6/tools/consolidateGraphs.pl
x LINKS-1.8.7/releases/links_v1.8.6/tools/makeMPETOutput2EQUALfiles.pl
x LINKS-1.8.7/releases/links_v1.8.6/tools/testBloom.pl
x LINKS-1.8.7/releases/links_v1.8.6/tools/writeBloom.pl
x LINKS-1.8.7/scaffoldsToAGP2.pl

But...that doesn't have a lib directory:

~/LINKS-1.8.7/
README.md bin/ releases/ scaffoldsToAGP2.pl

So...any help or advice would be appreciated.

Thank you.

Issue building BloomFilter.pm

Hi Rene,

So, we did a small variation on this by using the original bloomfilter folder that ships with v1.8.4 (just in case it needed anything other than the headers).

cd bloomfilter/swig

Then we ran the commands you have on git changing -I to our perl version core path. This overwrites the shipped BloomFilter.pm which then works great.

Hope this helps and thanks for being super quick on replying!!

Many thanks,

Victoria

On 1 Feb 2017, at 18:01, Rene Warren wrote:

Hi Victoria,

Copying the older header files worked? Or did you manage to work with the packaged binaries?

Thanks for letting me know how you solved the problem (I may post on github in case others have the same issue).

Cheers,
Rene


From: Victoria Offord
Sent: Wednesday, February 01, 2017 9:58 AM
To: Rene Warren
Subject: Re: Help with running LINKS v1.8.4

Hi Rene,

Not to worry, that works wonderfully! Thank you!

Many thanks,

Victoria

On 01/02/2017, 17:56, "Rene Warren" wrote:

Hello Victoria,

I was notified that some changes were made to the bloomfilter.git repo yesterday.
We effectively replaced RollingHashIterator.h and RollingHash.h with ntHashIterator.h and nthash.h (updating the bloom filter to use the newest rolling hash).

My apologies for the inconvenience. We will have to update the swig instructions and BloomFilter.i interface file to work with nthash.h instead of the RollingHashIterator.h

In the meantime, you may be able to build BloomFilter.pm on your system by doing the following:

cd ../lib
mv bloomfilter/ bloomfilter_old
git clone git://github.com/bcgsc/bloomfilter.git
cp -rf ./bloomfilter_old/rolling.h ./bloomfilter/.
cp -rf ./bloomfilter_old/RollingHashIterator.h ./bloomfilter/.
cp -rf ./bloomfilter_old/RollingHash.h ./bloomfilter/.
cd bloomfilter
swig -Wall -c++ -perl5 BloomFilter.i
g++ -c BloomFilter_wrap.cxx -I/usr/lib/x86_64-linux-gnu/perl/5.22 -fPIC -Dbool=char -O3
...

Let me know,
Rene


From: Victoria Offord
Sent: Wednesday, February 01, 2017 8:19 AM
To: Rene Warren
Subject: Help with running LINKS v1.8.4

Dear Rene,

I have been having some issues getting LINKS v1.8.4 working on my local machine (Ubuntu xenial, perl 5.22.1).

After downloading the tar file and uncompressing, I tried running the tests but got a bloomfilter perl error:

cd test
./runme_EcoliK12single.sh


done. Initiating LINKS scaffolding ETA 1-2 min depending on system...

/usr/bin/perl: symbol lookup error: /home/ubuntu/installs/links_v1.8.4/./lib/bloomfilter/swig/BloomFilter.so: undefined symbol: Perl_Gthr_key_ptr
/usr/bin/perl: symbol lookup error: /home/ubuntu/installs/links_v1.8.4/./lib/bloomfilter/swig/BloomFilter.so: undefined symbol: Perl_Gthr_key_ptr

done. Initiating LINKS scaffolding with iterative distances ETA 3 min depending on system...

/usr/bin/perl: symbol lookup error: /home/ubuntu/installs/links_v1.8.4/./lib/bloomfilter/swig/BloomFilter.so: undefined symbol: Perl_Gthr_key_ptr

So, I then tried the instructions for building BloomFilter.pm. This then gave an error re. a header file:

cd ../lib
mv bloomfilter/ bloomfilter_old
git clone git://github.com/bcgsc/bloomfilter.git
cd bloomfilter/swig/
swig -Wall -c++ -perl5 BloomFilter.i
g++ -c BloomFilter_wrap.cxx -I/usr/lib/x86_64-linux-gnu/perl/5.22 -fPIC -Dbool=char -O3

BloomFilter_wrap.cxx:1893:36: fatal error: ../RollingHashIterator.h: No such file or directory
compilation terminated

Is there something I haven’t configured correctly or am doing wrong?

Any help with this would be much appreciated!!

Many thanks,

Victoria

Compiling with G++6 fails but G++4 works

As suggested in issue #13, compiling the bloomfilter with G++ v4 works, while G++ v6 results in errors (which were partially resolved by modifying the typing of some variables?). As I'm not knowledgeable enough about this, is there any feedback on how to compile with g++ v6?

Issue Building BloomFilter Perl5 Module

I am running LINKS on a virtual Ubuntu machine. When I get to steps b and c of building the module, the -03 option throws an error: "unrecognized command line option '-03'". I am able to execute both b and c without "-03". Is this a problem?

My main issue is that after setting up the perl5 module, I tried to run the test file and get this error:
Can't load './BloomFilter.so' for module BloomFilter: ./BloomFilter.so: undefined symbol: PL_stack_sp at /usr/lib/x86_64-linux-gnu/perl/5.22/DynaLoader.pm line 187.
at BloomFilter.pm line 11.
Compilation failed in require at ./test.pl line 3.
BEGIN failed--compilation aborted at ./test.pl line 3.
I then ran the test as root and got a Segmentation fault (core dumped) error.
Do you know what might be going on?
I'm really new to command line, so forgive me if I'm missing something obvious. Let me know if you need more info as well.
Thanks!

Identifying the long read(s) that contig pair scaffolding is based upon

Is there a way to identify which long read(s) in the reads input provide the information for two contigs in the assembly input to be scaffolded? I have a situation where two very long contigs are being scaffolded where the orthologous long contigs in the long read input file are separate contigs. Maybe another long read is providing the scaffolding info, but which one?

fof conversion failed

When I used the command below to convert my reads into fof format, the resulting file was far too small.

The original file is 8.8 GB, whereas the converted file is only 4 kb.

echo ONT.fastq.gz > ONT.fof

I then tried again with the first method. The .fa file is 8.9 GB and the .fof file is only 4 kb. What could be the problem?

The expected mitochondrial genome size is ~1 Mb. However, the ONT nanopore reads cover the whole genome, including chloroplast, mitochondrial and nuclear sequences, and the read file is ~8.8 GB. The read N50 is 22,193. Which -t and -d values would be best to start with for this assembly?
Thank you.
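On the small .fof file in the question above: a file-of-filenames normally stores only the path to each read file, one per line, so a size of a few bytes would be expected regardless of how large the reads are. The sketch below illustrates this under that assumption. The temporary directory and variable names are hypothetical, and the reported 4 kb may simply be the filesystem's minimum allocation size rather than the byte count.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten
workdir="$(mktemp -d)"
fof="$workdir/ONT.fof"

# Same command as in the question: this writes the file NAME, not the reads
echo ONT.fastq.gz > "$fof"

cat "$fof"       # shows the single path stored in the .fof
wc -c < "$fof"   # byte count of that path plus the trailing newline
```

If the .fof contains the expected path, the conversion has not failed; the reads themselves stay in the original file.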

About install

Hi Rene
Thank you for your tools, but I have a problem installing LINKS.
When I run LINKS, I get the following error:
Can't locate Time/HiRes.pm in @INC (@INC contains: /lustre/work/chaozhang/soft/loadfile/links_v1.8.5/./lib/bloomfilter/swig /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at ./LINKS line 28.
BEGIN failed--compilation aborted at ./LINKS line 28.
However, I had already installed the packages beforehand. That is my first question.
The second is that when I run the following command, I get this error:
perl LINKS.pl
perl: symbol lookup error: /lustre/work/chaozhang/soft/loadfile/links_v1.8.5/./lib/bloomfilter/swig/BloomFilter.so: undefined symbol: Perl_Gthr_key_ptr

So what is the main cause?

Many thanks,
chaozhang
