omgene's People

Contributors

mpdprot, mpdunne

Forkers

photocyte

omgene's Issues

bedtools compatibility issues

Hi there,

I'm trying to run omgene, but the embedded bedtools commands seem to be out of date (my install is bedtools v2.27.1, the most recent version). For example, when I run omgene on my .tsv file, it runs for a while and then produces this error:

python2 /lab/solexa_weng/testtube/omgene/omgene.py -i test.tsv
loading and checking input data locations...
Grabbing cds and aa sequences for inputted transcripts...
Traceback (most recent call last):
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 3421, in <module>
    go(path_inf, path_ref, path_resultsDir, path_wDir, minintron, minexon, int_numCores, int_slopAmount)
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 3318, in go
    dict_generegions = prepareGeneregions(dict_seqInfo, dict_genomeInfo, path_wDir, int_numCores,int_slopAmount)#qe
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 1111, in prepareGeneregions
    path_generegion	= writeGeneRegionFile(line, generegion, path_mDir_l + "/" + str(generegion) + ".gtf")
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 1082, in writeGeneRegionFile
    ".", line[3], ".", "transcript_id \"" + generegion + "\"; gene_id \"" + generegion + "\"", generegion]
IndexError: list index out of range

The "line" variable referenced seems to be the output of a bedtools stream from a previous command, and indeed if I print(line), it has only 3 columns, so line[3] is out of range. This patch (photocyte@9f48d5f), which uses the -c and -o parameters of bedtools, seems to fix it, but I'm not sure I implemented it properly, as the script crashes later on with this error:

python2 /lab/solexa_weng/testtube/omgene/omgene.py -i test.tsv
Checking installed programs...
loading and checking input data locations...
Grabbing cds and aa sequences for inputted transcripts...
Traceback (most recent call last):
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 3421, in <module>
    go(path_inf, path_ref, path_resultsDir, path_wDir, minintron, minexon, int_numCores, int_slopAmount)
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 3318, in go
    dict_generegions = prepareGeneregions(dict_seqInfo, dict_genomeInfo, path_wDir, int_numCores,int_slopAmount)#qe
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 1151, in prepareGeneregions
    dg["cdsBase_data"] = str(readSeqs(dg["cdsBase"])[0].seq)
IndexError: list index out of range

Any ideas?

All the best,
-Tim
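
A minimal sketch of the column issue described above, not omgene's actual code. It assumes the stream comes from `bedtools merge` and that the missing fourth column is the strand (column 6 of the input BED); the helper names `merge_command` and `parse_merged_line` are hypothetical. The `-s`, `-i`, `-c` and `-o` options are real `bedtools merge` options, but whether this exact combination matches what omgene needs is an assumption.

```python
def merge_command(path_bed, extra_cols=("6",), operation="distinct"):
    """Build an argv list for `bedtools merge` that reports extra columns.

    Assumption: column 6 (strand) is the field the downstream code
    expects to find at line[3]. The list is only built here, not executed.
    """
    return ["bedtools", "merge", "-s", "-i", path_bed,
            "-c", ",".join(extra_cols), "-o", operation]


def parse_merged_line(line, min_cols=4):
    """Split one tab-separated output line and fail early with a clear message."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < min_cols:
        raise ValueError(
            "expected at least %d columns from bedtools, got %d: %r"
            % (min_cols, len(fields), line)
        )
    return fields
```

Checking the column count where the line is read would turn the bare IndexError into a message that points at the bedtools output rather than at the GTF-writing code.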

Malformed gene model causes errors

Hi there Michael,

I'm trying to repair a gene model that has an error at the C-terminus. If I exclude this gene from the omgene optimization, the whole script works, but if I include it (ILUMI_23468), Biopython emits a warning, and then omgene crashes later.

See the .zip file linked below for the .tsv, .gtf, and .gff3 files. One thing that seems strange between the ILUMI_23468 .gff3 and .gtf files is that there is a stop codon in the .gtf, but the translated peptide of the .gff3 doesn't have a stop codon. The .gtf was produced with the script you recommend on the omgene main page.

files.zip
(The genome reference FASTAs can be found here http://www.fireflybase.org/firefly_data.html)

omgene output here:

Checking installed programs...
loading and checking input data locations...
Grabbing cds and aa sequences for inputted transcripts...
/usr/local/lib/python2.7/dist-packages/Bio/Seq.py:2309: BiopythonWarning: Partial codon, len(sequence) not a multiple of three. Explicitly trim the sequence or add trailing N before translation. This may become an error in future.
  BiopythonWarning)
Relativising sequences...
Performing first round exonerate...
Exonerating and piling sequences against sequence regions...
Performing second round exonerate...
Performing third round exonerate...
done getting options
Fixing 0
Fixing 1
Fixing 2
Fixing 3
Fixing 4
Fixing 5
Fixing 6
Fixing 7
Fixing 8
Fixing 9
Fixing 10
Fixing 11
Fixing 12
Traceback (most recent call last):
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 3438, in <module>
    go(path_inf, path_ref, path_resultsDir, path_wDir, minintron, minexon, int_numCores, int_slopAmount)
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 3369, in go
    res   = fixIt(adj, parts, dict_generegions, path_wDir, minintron, minexon, path_winnersAln)#qe
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 1816, in fixIt
    res = incrementalFixRecursive(res, path_fix, minintron, minexon, path_winnersAln, d_gr)
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 1838, in incrementalFixRecursive
    res, prevAln = incrementalFix(inparts, path_iDir, mi, mx, path_winnersAln, d_gr, tTerminal=tTerminal)
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 1829, in incrementalFix
    return processLabels(labels, options, p_lDir, path_winnersAln, False, refine)
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 3088, in processLabels
    winners  = chooseWinnersRef(options, path_fDir, path_refAln = path_refAln, doubleCheck = True, orAln = True, parallelCheck = True)
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 2607, in chooseWinnersRef
    return chooseThem(path_allAln, tryBlank, seqlookup, seqlookup_rev, path_wDir)
  File "/lab/solexa_weng/testtube/omgene/omgene.py", line 2740, in chooseThem
    w.id   = seqlookup[k][winner[k].id]
KeyError: 'generegion_5.option_0'

Thoughts?

All the best,
-Tim
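
A quick way to inspect a suspect CDS before handing it to omgene, as a hedged sketch: it checks the two symptoms mentioned above, a length that is not a multiple of three (the condition behind the Biopython "Partial codon" warning) and a missing terminal stop codon. The function name `describe_cds` is hypothetical, and the stop codons assume the standard genetic code.

```python
# Stop codons of the standard genetic code (assumption: no alternative table).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def describe_cds(seq):
    """Report basic frame diagnostics for a coding sequence string."""
    seq = seq.upper()
    trailing = len(seq) % 3  # bases left over after the last full codon
    complete = seq[:len(seq) - trailing] if trailing else seq
    last_codon = complete[-3:] if len(complete) >= 3 else ""
    return {
        "length": len(seq),
        "trailing_bases": trailing,
        "ends_with_stop": last_codon in STOP_CODONS,
    }
```

For example, `describe_cds("ATGAAATA")` reports two trailing bases and no terminal stop, which is the kind of truncated C-terminus that could explain the mismatch between the two annotation files.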
