Using built HMMs or from COGs (metatag issue, closed)

Robaina commented on September 15, 2024
Using built HMMs or from COGs

from metatag.

Comments (3)

Robaina commented on September 15, 2024

Alright, closing the issue then!

Robaina commented on September 15, 2024

Hi there,

Judging by hmmsearch's error messages, you could replace hmmsearch's --cut_nc parameter with --cut_ga (or drop it altogether if that doesn't work). According to the hmmsearch documentation (p. 76), these parameters control model-specific score thresholding.

To change the setting, go to makedatabase.py and, on line 86, replace additional_args='--cut_nc' with additional_args='--cut_ga' or additional_args=None:

https://github.com/Robaina/TRAITS/blob/ea8aaf3484c6f3e70ae384adee43cbf62af48872/code/makedatabase.py#L81-L87

These values are passed to hmmsearch as additional arguments, so any valid hmmsearch argument can be added this way.
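
As a rough illustration of how such an extra-arguments string could end up on the hmmsearch command line, here is a minimal sketch. The helper name `build_hmmsearch_command` and its parameters are hypothetical and are not taken from makedatabase.py; only the hmmsearch options `--tblout`, `--cut_ga`, `--cut_nc` and `--cut_tc` are real.

```python
import shlex


def build_hmmsearch_command(hmm_model, input_fasta, hmmer_output, additional_args=None):
    """Assemble an hmmsearch argument list (hypothetical helper, for illustration only)."""
    # Write the per-sequence hit table to the requested output file.
    cmd = ["hmmsearch", "--tblout", hmmer_output]
    if additional_args:
        # Split a string such as "--cut_ga" or "-E 1e-5 --cpu 4" into separate arguments.
        cmd.extend(shlex.split(additional_args))
    # hmmsearch expects the profile file first, then the sequence database.
    cmd.extend([hmm_model, input_fasta])
    return cmd
```

The resulting list could be handed to `subprocess.run(cmd, check=True)`. With `additional_args=None`, no thresholding option is added and hmmsearch falls back to its default reporting thresholds.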

If that doesn't work we will consider other options :)

gecko1990 commented on September 15, 2024

It worked well, although the catch is that the value has to be written twice:

TC 207.409 207.409

At the end, I added the value from the COG webpage as the TC value. According to the HMMER documentation (page 76), the "TC thresholds are generally considered to be the score of the lowest-scoring known true positive that is above all known false positives." This seems to "correspond" to the definition of the value on the COG webpage, the Threshold Bit Score, defined as:

"the domain specific threshold score (shown as a bit score) that an RPS-BLAST hit must meet or exceed in order to be considered a specific hit, which represents a high confidence association between a protein query sequence and a conserved domain and therefore a high confidence level for the inferred function of the protein query sequence. The threshold is equal to the weakest E-value (and highest bit score) among self-hits of a domains member protein sequences to the resulting domain model (illustrated example). Domain-specific threshold scores are calculated only for NCBI-curated domains."

The code in makedatabase.py was modified accordingly:

    print('* Making peptide-specific reference database...')
    with TemporaryFilePath() as tempfasta, TemporaryFilePath() as tempfasta2:
        filterFASTAByHMM(
            hmm_model=args.hmm,
            input_fasta=args.data,
            output_fasta=tempfasta,
            hmmer_output=hmmer_output,
            additional_args='--cut_tc'  # use the TC threshold stored in the HMM file
        )
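
For `--cut_tc` to work, the profile has to carry a TC line like the `TC 207.409 207.409` shown above. A minimal sketch of adding that line programmatically follows, assuming a single-model HMMER3 text profile and assuming the TC line may be placed anywhere in the header before the line that starts the model section (`HMM`). The function name `add_tc_threshold` is hypothetical.

```python
def add_tc_threshold(hmm_text, score):
    """Return the profile text with a 'TC <score> <score>' header line added.

    Sketch only: handles a single-model HMMER3 text file and replaces any
    TC line that is already present.
    """
    out = []
    inserted = False
    for line in hmm_text.splitlines():
        if line.startswith("TC "):
            continue  # drop an existing TC line so that only one remains
        if not inserted and line.startswith("HMM "):
            # The value is written twice: per-sequence and per-domain threshold.
            out.append(f"TC    {score} {score}")
            inserted = True
        out.append(line)
    if not inserted:
        raise ValueError("no 'HMM' line found; is this a HMMER3 text profile?")
    return "\n".join(out) + "\n"
```

The function works on the file contents as a string, so it can be applied with something like `path.write_text(add_tc_threshold(path.read_text(), 207.409))`, where `path` is a `pathlib.Path` pointing at the profile.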
