Comments (3)

marekkokot commented on August 26, 2024

Hello, @8banzhuan
First of all, thanks for using CoLoRd! It is hard to tell what causes this, but we have some suspicions. One is that the data you are compressing is of such high quality that the reference genome brings no benefit at all. It would be great if you could provide your input FASTA file. Also, I'm not sure what you mean by "comparison ratio". It would also help if you could share your whole testing pipeline (command lines).

from colord.

8banzhuan commented on August 26, 2024

Thank you for your reply! By "comparison ratio" I mean the alignment rate: the fraction of my FASTA reads that map successfully to the reference genome. I mapped the reads with Heng Li's minimap2 and then used samtools to compute the alignment rate from the SAM file. I wanted to find out how the alignment rate affects the compression ratio when a reference genome is used.
The command I use is similar to the following:
colord compress-hifi -G reference.fasta inputfile.fasta outputfile
What I observe is that the reference genome does not seem to improve my compression ratio very much.
In other words, with a reference genome that gives an alignment rate of about 25% and one that gives about 9%, the compression ratio is almost the same (in both cases the output is about 10% of the original file size). For a reference genome that the reads align to poorly, is CoLoRd's compression in reference mode much better than compression without a reference genome?
Looking forward to your reply, best wishes!
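
For readers who want to reproduce the alignment-rate measurement described above, here is a minimal sketch, not the poster's actual pipeline. It assumes a plain-text SAM file as produced by minimap2 and counts one primary record per read using the standard SAM FLAG bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary). The function name `alignment_rate` is hypothetical.

```python
def alignment_rate(sam_lines):
    """Fraction of reads whose primary record is mapped.

    `sam_lines` is any iterable of SAM text lines (for example an open file).
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        # Secondary (0x100) and supplementary (0x800) records would count
        # the same read more than once, so they are ignored.
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        if not flag & 0x4:  # 0x4 set means the read is unmapped
            mapped += 1
    return mapped / total if total else 0.0
```

It could be called as `alignment_rate(open("aln.sam"))`, where `aln.sam` is a placeholder file name. The result may differ slightly from what a samtools summary reports, depending on which records that summary counts.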

marekkokot commented on August 26, 2024

Hi,

Sorry for the late response; I didn't get (or missed) the notification.
What is the compression ratio when the reference genome is not used at all?
Since you are using HiFi data, its quality may be so good that using the reference genome brings no benefit.
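
One way to answer that question is to compress the same input twice, once with and once without the reference genome, and compare the output sizes. The sketch below only does the arithmetic on byte counts (which could be read with `os.path.getsize`). The helper names are hypothetical and nothing here calls CoLoRd itself.

```python
def compressed_fraction(original_bytes, compressed_bytes):
    """Compressed size as a fraction of the original size (0.1 means 10%)."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return compressed_bytes / original_bytes


def reference_gain(no_ref_bytes, with_ref_bytes):
    """Relative size reduction obtained by adding the reference genome.

    0.0 means the reference made no difference; 0.25 means the archive
    built with the reference is 25% smaller than the one built without it.
    """
    if no_ref_bytes <= 0:
        raise ValueError("size without reference must be positive")
    return 1 - with_ref_bytes / no_ref_bytes
```

If `reference_gain` stays close to zero for both the 25% and the 9% reference, that would be consistent with the suggestion that the reference adds little for this data set.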
