
Too many bins to plot? (checkm, CLOSED)

ecogenomics commented on July 17, 2024
Too many bins to plot?

from checkm.

Comments (4)

donovan-h-parks commented on July 17, 2024

Hello Mattias,

In the next version of CheckM, I will put in some options to only plot a specific number of the "best" bins. In the meantime, there are two workarounds. You can reduce the DPI (--dpi), as this will allow more bins to be plotted. Alternatively, you can create a directory with only the bins you really want to have plotted (202 is a lot to look at!) and point the bin_qa_plot command to this directory. I usually create a directory and symlink to the bins that are good enough to consider for further processing and to include in a manuscript.
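The symlink workaround could be scripted roughly as follows. This is only a minimal sketch: the function name `link_selected_bins`, the `fna` file extension, and the directory layout (one FASTA file per bin, named after the bin ID) are assumptions for illustration and are not part of CheckM.

```python
from pathlib import Path


def link_selected_bins(bin_dir, selected_ids, out_dir, extension="fna"):
    """Symlink the chosen bins from bin_dir into out_dir and return the links.

    Assumes each bin is stored as <bin_id>.<extension> inside bin_dir
    (a hypothetical layout; adjust to match your own files).
    """
    bin_dir = Path(bin_dir).resolve()  # absolute targets keep the links valid
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    linked = []
    for bin_id in selected_ids:
        source = bin_dir / f"{bin_id}.{extension}"
        if not source.is_file():
            raise FileNotFoundError(f"No bin file found for {bin_id}: {source}")
        target = out_dir / source.name
        # Replace a stale link left over from an earlier selection.
        if target.is_symlink() or target.exists():
            target.unlink()
        target.symlink_to(source)
        linked.append(target)
    return linked
```

After running something like `link_selected_bins("bins", ["bin_1", "bin_3"], "selected_bins")`, the bin_qa_plot command can be pointed at the `selected_bins` directory instead of the full bin directory.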

Cheers,
Donovan


mdehollander commented on July 17, 2024

Hi Donovan,

Thanks! I will have a look at that and make a selection of bins. I have to see why there are so many bins.

Greetings,
Mattias


xvazquezc commented on July 17, 2024

Hi all,
I have been able to plot 507 bins from my data, but only after adjusting the --dpi option. I never got the error shown above, though. Most of the time this was the error:

*******************************************************************************
 [CheckM - bin_qa_plot] Creating bar plot of bin quality.
*******************************************************************************

  Calculating AAI between multi-copy marker genes.
[Error] There are too many bins to plot.
The resulting plot would be 100776 pixels in height and the maximum allowed size is 32768.
Please reduce the number of bins to be plotted or decrease the DPI (--dpi).

  { Current stage: 0:00:41.618 || Total: 0:00:41.618 }
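The numbers in this message are enough to estimate a DPI that would fit. The sketch below assumes the plot height in pixels scales linearly with the DPI (height in inches times DPI), which is an assumption about how the figure is rendered rather than something confirmed in this thread. The function name and the starting DPI of 600 in the usage note are hypothetical.

```python
import math

MAX_PIXELS = 32768  # maximum allowed size quoted in the error message


def max_dpi_that_fits(current_dpi, reported_height_px, max_px=MAX_PIXELS):
    """Return the largest whole DPI at which the plot height stays within max_px.

    Assumes height_px = height_inches * dpi, so the height shrinks in
    proportion to the DPI.
    """
    if current_dpi <= 0 or reported_height_px <= 0:
        raise ValueError("current_dpi and reported_height_px must be positive")
    return math.floor(current_dpi * max_px / reported_height_px)
```

For example, if the run that reported 100776 pixels had used a DPI of 600 (a hypothetical value), `max_dpi_that_fits(600, 100776)` gives 195, so any --dpi at or below that should stay under the limit under the linear-scaling assumption.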

Another alternative I tried was to set the output file format to SVG, but the size of the file (>167 MB) makes it impossible to handle in Inkscape, or at least it would take way too long to open.


donovan-h-parks commented on July 17, 2024

Hello,
The bin_qa_plot command is really designed to provide a visualization of at most a few dozen bins. More than this and the plot quickly becomes unreasonably large. Generally, we select the best bins and place these in a separate folder. You can then produce a bin_qa_plot for just these bins. I hope to add some filtering criteria to this plotting function in the future, but for now a manual selection of the bins to plot is required when you have more than a few dozen bins.
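Until such filtering exists, the manual selection could be approximated with a small script over a tab-separated quality table. This is a sketch under stated assumptions: the column names `Bin Id`, `Completeness`, and `Contamination`, the thresholds of 90 and 5, and the function name `select_best_bins` are illustrative choices, not criteria given in this thread, so check them against your own table.

```python
import csv
import io


def select_best_bins(table_text, min_completeness=90.0, max_contamination=5.0,
                     max_bins=None, id_col="Bin Id",
                     comp_col="Completeness", cont_col="Contamination"):
    """Return bin IDs passing the thresholds, best bins first.

    table_text is the content of a tab-separated table with a header row.
    Column names and thresholds are assumptions; pass your own if they differ.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    passing = []
    for row in reader:
        completeness = float(row[comp_col])
        contamination = float(row[cont_col])
        if completeness >= min_completeness and contamination <= max_contamination:
            passing.append((row[id_col], completeness, contamination))

    # Rank by highest completeness, then by lowest contamination.
    passing.sort(key=lambda item: (-item[1], item[2]))
    ids = [bin_id for bin_id, _, _ in passing]
    # Optionally cap the list, e.g. at a few dozen bins.
    return ids if max_bins is None else ids[:max_bins]
```

The returned IDs can then be used to populate a separate folder that holds only the bins to be plotted.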
