
conifer's People

Contributors

ivarz, mbhall88

conifer's Issues

read length percentile

Hi Ivar,

Thank you for making this useful tool! I was wondering whether it is possible to also get the read length percentiles for each taxon assignment.

Thanks!
Hena
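
Read length percentiles per taxon can also be computed directly from the Kraken 2 output. The following is a minimal sketch, not part of Conifer. It assumes the standard tab-separated Kraken 2 output (status, read ID, taxid, length, LCA mapping), skips unclassified reads, and sums the two mate lengths for paired reads; the function name `read_length_quartiles` is hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def read_length_quartiles(kraken_lines):
    """Return {taxid: (P25, P50, P75)} of read lengths per assigned taxid."""
    lengths = defaultdict(list)
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4 or fields[0] != "C":
            continue  # skip malformed lines and unclassified reads
        taxid = fields[2]
        # Paired reads report "len1|len2"; summing the mates is a choice
        # made for this sketch, not something Conifer is known to do.
        lengths[taxid].append(sum(int(x) for x in fields[3].split("|")))

    result = {}
    for taxid, values in lengths.items():
        if len(values) == 1:
            # quantiles() needs at least two data points on older Pythons.
            result[taxid] = (values[0],) * 3
        else:
            result[taxid] = tuple(quantiles(values, n=4, method="inclusive"))
    return result
```

Since the function only iterates over lines, an open file handle can be passed in directly, for example `read_length_quartiles(open("kraken.out.txt"))`.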

Missing output

I wanted to try out your tool as you recommended in my issue on kraken. I started it with:

./conifer --both_scores -s -i kraken.out.txt -d /scratch/databases/Standard_v2/taxo.k2d

and then saw this output:

1000000 lines processed...
2000000 lines processed...
3000000 lines processed...
4000000 lines processed...
5000000 lines processed...
6000000 lines processed...
7000000 lines processed...
8000000 lines processed...
9000000 lines processed...
10000000 lines processed...
11000000 lines processed...
12000000 lines processed...
13000000 lines processed...
14000000 lines processed...
15000000 lines processed...
16000000 lines processed...
17000000 lines processed...
18000000 lines processed...
19000000 lines processed...
20000000 lines processed...
21000000 lines processed...
22000000 lines processed...
23000000 lines processed...
24000000 lines processed...
25000000 lines processed...
26000000 lines processed...
27000000 lines processed...
28000000 lines processed...
29000000 lines processed...
30000000 lines processed...
31000000 lines processed...
32000000 lines processed...
33000000 lines processed...
34000000 lines processed...
35000000 lines processed...
36000000 lines processed...
37000000 lines processed...
38000000 lines processed...
39000000 lines processed...
40000000 lines processed...
41000000 lines processed...
42000000 lines processed...
taxon_name      taxid   reads   P25_conf        P50_conf        P75_conf        P25_rtl P50_rtl P75_rtl

I expected to see more in the table. Any ideas what could cause this?

Docker image

Hi @Ivarz,

I recently created a Docker image for conifer. I thought I'd leave it here in case it's useful for you or someone else.

# Copyright (c) 2020, Moritz E. Beber.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM bitnami/minideb:buster AS builder

RUN set -eux \
    && install_packages \
        build-essential \
        ca-certificates \
        git \
        libz-dev

WORKDIR /opt

RUN set -eux \
    && git clone https://github.com/Ivarz/Conifer.git \
    && cd Conifer \
    && git submodule update --init --recursive \
    && gcc -static -std=c99 -Wall -Wextra -O3 -D_POSIX_C_SOURCE=200809L -I third_party/uthash/src -I . src/utils.c src/kraken_stats.c src/kraken_taxo.c src/main.c -o conifer -l:libm.a -l:libz.a

FROM busybox:glibc

COPY --from=builder /opt/Conifer/conifer /

ENTRYPOINT ["/conifer"]

I'll track progress of this file over here.

length and confidence length differ

Hi Ivarz,
Thanks for your convenient tool.
I am trying to calculate confidence scores from Kraken2 results. I am wondering why the lengths summed from the k-mer counts do not equal 100.

C V100006960L1C001R001000420 853 100|100 0:16 853:8 1783272:2 748224:2 1783272:2 168384:5 186801:6 0:2 168384:5 0:18 |:|748224:7 0:2 748224:5 0:21 853:4 748224:7 0:5 748224:3 0:12
read1: 16+8+2+2+2+5+6+2+5+18 = 66
read2: 7+2+5+21+4+7+5+3+12 = 66
Thanks!
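
The sums above can be reproduced with a short script. This is a sketch, not Conifer code: it parses the LCA mapping string from the line quoted above and adds up the k-mer counts per mate. The value `K = 35` is an assumption (Kraken 2's default k-mer length; a custom database may use another). Under that assumption a read of length 100 contains 100 - 35 + 1 = 66 k-mers, which matches the totals.

```python
def kmer_counts(lca_field):
    """Sum the k-mer counts per mate in a Kraken 2 LCA mapping string."""
    totals = []
    for mate in lca_field.split("|:|"):  # "|:|" separates the two mates
        total = 0
        for pair in mate.split():
            _taxid, count = pair.split(":")
            total += int(count)
        totals.append(total)
    return totals


K = 35  # assumed: Kraken 2's default k-mer length
READ_LENGTH = 100

lca = (
    "0:16 853:8 1783272:2 748224:2 1783272:2 168384:5 186801:6 0:2 "
    "168384:5 0:18 |:|748224:7 0:2 748224:5 0:21 853:4 748224:7 0:5 "
    "748224:3 0:12"
)

print(kmer_counts(lca))     # [66, 66]
print(READ_LENGTH - K + 1)  # 66
```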

conifer output not to specific readIDs

Hello,

Thanks for developing this tool! I have recently come across it and thought it would help me to fine-tune the accuracy of my Kraken2 results. I read in the readme file that it generates the confidence scores for each readID. However, in my conifer output file, I see the confidence score for each taxid/taxname, as opposed to each readID. Did I do something wrong?

Thanks,
Elly

Report the taxid and/or name

Hello again,

I've been using Conifer for a bit now and I find it very useful. Thank you for that. At the moment, Conifer in its simplest form reports

kraken output | read1 confidence | read2 confidence | average

Since Conifer can obviously do this, as seen for the summary report, I would love to get the output as

taxid | name (optional) | read1 confidence | read2 confidence | average

and simply have additional rows for the same taxid. Does this make sense? Would you consider adding this output option? Or maybe there is a different simple way to map the kraken output to the taxid that I am missing right now.
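
One way to get from the per-read output to the taxid is to read it from the Kraken 2 columns themselves, since the assigned taxid is the third field of every Kraken 2 output line. The sketch below assumes tab-separated lines in which the score columns are appended after the Kraken 2 fields, as described above; the function name `taxid_with_scores` and the example line are hypothetical. Taxon names are not handled here, as they would need a separate taxid-to-name table.

```python
def taxid_with_scores(line, n_scores=3):
    """Return [taxid, score_1, ..., score_n] for one per-read output line.

    Assumes tab-separated fields: the Kraken 2 columns first (taxid in
    column 3), followed by n_scores appended score columns.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3 + n_scores:
        raise ValueError("unexpected number of columns")
    return [fields[2]] + fields[-n_scores:]


# Made-up line for illustration only; the scores are not real Conifer output.
example = "C\tread_1\t853\t100|100\t853:66 |:| 853:66\t1.0000\t1.0000\t1.0000"
print("\t".join(taxid_with_scores(example)))
```

For single-end data, where only one score column would be appended, `n_scores=1` would be the corresponding call.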

New release

Would it be possible to make a new release after adding the --help message/option? That way the bioconda recipe will trigger a new release too and the bioconda install of conifer will then have access to that option.
