KAT comp - issue with big genome (kat issue, OPEN)

matryoskina commented on August 26, 2024
KAT comp - issue with big genome

from kat.

Comments (7)

jonwright99 commented on August 26, 2024

Hi, I think you have a problem with your command line. The reads should be the first parameter and the genome the second. You are including a third, which makes comp behave very differently. From the log file, it looks like you are passing one assembly as the first parameter, another assembly as the second, and the reads as the third, which will give odd results.

matryoskina commented on August 26, 2024

Hi,
Thanks for your help! I reran the analysis with only the fastq files and one genome, but the problem is still there: no peak was found. Should I increase the k-mer size, or is there something else I am missing? I am attaching the new log file.
Thanks!
slurm-6522577.txt

jonwright99 commented on August 26, 2024

Is there a plot created? If so, can you post it?

Also, can you rerun without using -h? And if you set -H, you will speed up the run, as it won't need to double the hash size many times to find the correct size. I use -H100000000000.

So your command line above should read:
kat comp -t 32 -m 17 -H100000000000 -o genome1VSgenome2 'fastq1_R1.fastq.gz fastq1_R.fastq.gz fastq2_R1.fastq.gz fastq2_R.fastq.gz fastq3_R1.fastq.gz fastq3_R3.fastq.gz' genome1.fa
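
For anyone checking their own invocation, here is a minimal shell sketch of the argument layout described above. It assembles the suggested command and prints each argument on its own line, so it is easy to confirm that there are exactly two positional inputs: the quoted reads group, then the assembly. The file names are the placeholders from this thread, and the line that actually runs KAT is left commented out.

```shell
# Sketch only: file names are the placeholders used in this thread.
reads='fastq1_R1.fastq.gz fastq1_R.fastq.gz fastq2_R1.fastq.gz fastq2_R.fastq.gz fastq3_R1.fastq.gz fastq3_R3.fastq.gz'
assembly='genome1.fa'

# Build the command as positional parameters so the quoting is preserved:
# the whole reads group stays one argument, the assembly is the next one.
set -- kat comp -t 32 -m 17 -H100000000000 -o genome1VSgenome2 "$reads" "$assembly"

# Print one argument per line; the last two lines are the two inputs.
for arg in "$@"; do
    printf '[%s]\n' "$arg"
done

# Uncomment to run once the printed arguments look right:
# "$@"
```

If a second assembly or an extra read file shows up as its own bracketed line at the end, comp is being given a third input, which is the situation described earlier in the thread.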

matryoskina commented on August 26, 2024

There is no plot created from this job. I have one created from a previous run:
[image: osph0 7 plot]

jonwright99 commented on August 26, 2024

There's something very odd with your reads here. Are they paired-end reads? Also, were all the fastq files you included in the analysis the ones used to generate the assembly? I've seen this type of plot, with no peak, where the libraries either were not paired-end reads or had multiple rounds of PCR before sequencing.

matryoskina commented on August 26, 2024

Yes, the reads are all paired-end. Regarding the assembly, the genome was assembled with long reads, and those short reads were used for misassembly correction. Then I used a Hi-C library (Illumina paired-end) to get the chromosomes. Do you think I should use this library instead? Also, could I just compare two genomes without Illumina reads?
Thanks

jonwright99 commented on August 26, 2024

Ah, that makes sense now. Do you know roughly the coverage of the paired-end reads that you used for misassembly correction? I'm guessing it is quite low and not enough to generate a peak on the plot. KAT is designed to compare an Illumina read dataset to an assembly generated from that dataset, to show how the k-mer content of the reads is represented in the assembly. Because your datasets were used differently to generate the assembly, the plots are not working as intended.
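
As a rough way to answer the coverage question, sequencing depth is commonly approximated as total sequenced bases divided by genome size. The sketch below does that arithmetic in plain shell. The function name and all three input numbers are hypothetical, chosen only to illustrate the calculation; they are not taken from this dataset.

```shell
# Rough coverage estimate: (number of reads * mean read length) / genome size.
# estimate_coverage is a hypothetical helper, not part of KAT.
estimate_coverage() {
    # $1 = total number of reads (count both mates of each pair)
    # $2 = mean read length in bp
    # $3 = genome size in bp
    # Integer arithmetic: the result is truncated to a whole number.
    echo $(( $1 * $2 / $3 ))
}

# Hypothetical inputs: 400 million reads of 150 bp against a 6 Gbp genome.
cov=$(estimate_coverage 400000000 150 6000000000)
echo "Estimated coverage: ${cov}x"
```

With these made-up inputs the estimate is 10x. Substitute the real read count, read length, and genome size to get a figure for your own libraries.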
