pssun / circcode

A Python 3-based pipeline for identifying translated circular RNA (circRNA)

License: GNU General Public License v3.0

Python 98.74% Shell 1.26%
circrna bioinformatics circrna-prediction python ribosome-profiling-data

circcode's Introduction

Hi, welcome to my GitHub HomePage 🍉

👤 About me:

  • 🔭 I’m currently working on ST-seq
  • 🌱 I’m currently studying at XJTU
  • 📫 How to reach me: [email protected] / [email protected]
  • 🖥 Major: Control Science and Engineering / Bioinformatics

📑 Paper:

  • [1] Sun P, Li G. CircCode: A Powerful Tool for Identifying circRNA Coding Ability[J]. Frontiers in Genetics, 2019, 10: 981.
  • [2] Sun P, Wang H, Li G. Rcirc: An R Package for circRNA Analyses and Visualization[J]. Frontiers in Genetics, 2020, 11.

📑 ORCID:

https://orcid.org/0000-0003-0796-2133

circcode's People

Contributors

dependabot[bot], pssun

circcode's Issues

Problem of pipeline

Hello,

I'm using CircCode, which looks like a powerful tool, but I have run into a problem. All the files I used come from your example.
When I run python3 make_virtual_genomes.py -y config.yaml, I get this error:
No. 1 length: 281
Traceback (most recent call last):
  File "make_virtual_genomes.py", line 218, in <module>
    main()
  File "make_virtual_genomes.py", line 210, in main
    info.make_genome()
  File "make_virtual_genomes.py", line 95, in make_genome
    self.genome += (circrna.seq * 2 + polyN)
TypeError: unsupported operand type(s) for *: 'Seq' and 'int'

Could you give me some advice? Thank you very much!

Best,

Lu Han
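
A possible workaround, offered as an assumption rather than a confirmed diagnosis: older Biopython releases did not define `*` for `Seq` objects, which would produce exactly this TypeError. Converting the sequence to a plain string before repeating it avoids the operator altogether. The sketch below is not the author's fix; `doubled_with_spacer` and `spacer_len` are hypothetical names, and the real length of `polyN` in make_virtual_genomes.py is not shown in the traceback.

```python
def doubled_with_spacer(seq, spacer_len=100):
    """Return the sequence repeated twice, followed by a run of 'N'.

    `seq` can be a plain string or any object whose str() is the sequence
    text (a Biopython Seq behaves this way). `spacer_len` is a hypothetical
    stand-in for the length of `polyN` in the original script.
    """
    text = str(seq)            # plain str supports `* 2` on every Python version
    return text * 2 + "N" * spacer_len


# Example with a plain string standing in for circrna.seq
print(doubled_with_spacer("ACGT", spacer_len=3))
```

Upgrading Biopython inside the CircCode environment may have the same effect without touching the script, but that depends on the versions the pipeline was written against.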

find_RCRJ_and_classify.py

Hey

I'm using CircCode, but I hit the same problem every time. I would appreciate any help resolving it.

When I run find_RCRJ_and_classify.py, it gives the following error:
Error in (function (edges, n = max(edges), directed = TRUE) :
  At type_indexededgelist.c:117 : cannot create empty graph with negative number of vertices, Invalid value
Calls: classification -> createNet -> graph -> do.call -> <Anonymous>
In addition: Warning message:
In max(edges) : no non-missing arguments to max; returning -Inf
Execution halted
Classify successfully!
sed: can't read /userdata/lab/RPFAligned.sortedByCoord.out.bam.merge_result.RCRJ_result.csv_translated_circ.fa: No such file or directory
Traceback (most recent call last):
  File "find_RCRJ_and_classify.py", line 369, in <module>
    main()
  File "find_RCRJ_and_classify.py", line 355, in main
    classify(coding_seq,
  File "find_RCRJ_and_classify.py", line 102, in classify
    seqs = SeqIO.parse(final_trans_file, 'fasta')
  File "/home/lab/.conda/envs/circcode_env/lib/python3.8/site-packages/Bio/SeqIO/__init__.py", line 607, in parse
    return iterator_generator(handle)
  File "/home/lab/.conda/envs/circcode_env/lib/python3.8/site-packages/Bio/SeqIO/FastaIO.py", line 183, in __init__
    super().__init__(source, mode="t", fmt="Fasta")
  File "/home/lab/.conda/envs/circcode_env/lib/python3.8/site-packages/Bio/SeqIO/Interfaces.py", line 47, in __init__
    self.stream = open(source, "r" + mode)
FileNotFoundError: [Errno 2] No such file or directory: '/userdata/lab/RPFAligned.sortedByCoord.out.bam.merge_result.RCRJ_result.csv_translated_circ.fa'

Please note that the file "RPFAligned.sortedByCoord.out.bam.merge_result.RCRJ_result.csv_translated_circ.fa" is never generated after the mapping in the previous step.

Could you help me figure this out? Thanks in advance.

Best,

Tanvi
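
The log above shows the R step halting, "Classify successfully!" being printed anyway, and the Python step then failing on a file that was never written. A minimal sketch of a guard that would surface the real failure earlier is given below. It is an illustration under assumptions, not code from CircCode: `output_ready` and `require_output` are hypothetical helpers, and where they would be called inside find_RCRJ_and_classify.py is not known from the traceback.

```python
import os


def output_ready(path):
    """Return True only if `path` is an existing, non-empty regular file."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


def require_output(path, step):
    """Raise a descriptive error if an earlier step left no usable output.

    `step` is a free-text label used only in the error message.
    """
    if not output_ready(path):
        raise RuntimeError(
            f"{step} did not produce {path!r}; check the log of that step "
            "before continuing"
        )
    return path
```

Calling such a check on the `_translated_circ.fa` path before `SeqIO.parse` would report that classification produced nothing, which points back to the empty edge list in the R error rather than to Biopython.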

Bug in the output function

There is a bug in the last function that prevents CircCode from writing the final .csv file. The rest of the pipeline is not affected.

We have temporarily removed this feature and will fix it soon.

find_RCRJ_and_classify.py error

We hit the following error while running 'find_RCRJ_and_classify.py':

~/miniconda3/lib/python3.6/site-packages/Bio/Seq.py:2715: BiopythonWarning: Partial codon, len(sequence) not a multiple of three. Explicitly trim the sequence or add trailing N before translation. This may become an error in future.
  BiopythonWarning)
Traceback (most recent call last):
  File "find_RCRJ_and_classify.py", line 292, in <module>
    main()
  File "find_RCRJ_and_classify.py", line 288, in main
    find_longest(tmp_file_location, ribo_name, result_file_location, number)
  File "find_RCRJ_and_classify.py", line 115, in find_longest
    end = seq.id.split(':')[-1].split('-')[1]
IndexError: list index out of range

My Python is not strong. Could you please fix it?
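
The IndexError means that, for at least one record, the text after the last ':' in `seq.id` contains no '-', so `split('-')` returns a single element. A defensive version of that lookup could look like the sketch below. It assumes IDs are meant to look like `name:start-end`, which is inferred from the failing line and not confirmed by the author; `parse_end` is a hypothetical helper.

```python
def parse_end(seq_id):
    """Return the end coordinate (as a string) from an ID like 'name:start-end'.

    Returns None instead of raising when the ID has no 'start-end' part,
    so the caller can skip or log the record.
    """
    coords = seq_id.split(':')[-1]   # text after the last ':'
    parts = coords.split('-')
    if len(parts) < 2 or not parts[1]:
        return None
    return parts[1]


# Example: a well-formed ID and one that would trigger the IndexError
print(parse_end("chr1:100-200"))
print(parse_end("chr1_100_200"))
```

Printing the IDs for which this returns None would show which records in the input break the expected naming pattern.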

find_RCRJ_and_classify.py

Hello,

When I run "python3.7 find_RCRJ_and_classify.py -y config.yaml", I get this error:
Traceback (most recent call last):
  File "find_RCRJ_and_classify.py", line 8, in <module>
    import rpy2.robjects as robjects
  File "/software/biosoft/software/python/python2019/lib/python3.7/site-packages/rpy2/robjects/__init__.py", line 27, in <module>
    from . import language
  File "/software/biosoft/software/python/python2019/lib/python3.7/site-packages/rpy2/robjects/language.py", line 16, in <module>
    _str2lang = ri.baseenv['str2lang']
  File "/software/biosoft/software/python/python2019/lib/python3.7/site-packages/rpy2/rinterface_lib/conversion.py", line 44, in _
    cdata = function(*args, **kwargs)
  File "/software/biosoft/software/python/python2019/lib/python3.7/site-packages/rpy2/rinterface_lib/_rinterface_capi.py", line 282, in _
    robj = function(*args, **kwargs)
  File "/software/biosoft/software/python/python2019/lib/python3.7/site-packages/rpy2/rinterface_lib/sexp.py", line 355, in __getitem__
    raise KeyError("'%s' not found" % key)
KeyError: "'str2lang' not found"

I guess the error is related to earlier steps, because there are no "junction_result" or "junction_filter_result" files in the tmp_file folder. All the files I used are from your example. Could you give me some advice?

Thank you very much!

Lu
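
One thing worth ruling out, stated as an assumption and not a diagnosis: the traceback fails while rpy2 is being imported, on a lookup of the base R function `str2lang`, which was introduced in R 3.6.0. An older R on the PATH would therefore fail here regardless of the earlier pipeline steps. The sketch below parses an R version banner and checks it against that threshold; the helper names are hypothetical.

```python
import re

# str2lang() was added to base R in 3.6.0
MIN_R_FOR_STR2LANG = (3, 6, 0)


def parse_r_version(banner):
    """Extract (major, minor, patch) from text like 'R version 3.5.1 (...)'."""
    match = re.search(r"R version (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def r_has_str2lang(banner):
    """Return True if the banner reports an R version with str2lang()."""
    version = parse_r_version(banner)
    return version is not None and version >= MIN_R_FOR_STR2LANG


print(r_has_str2lang("R version 3.5.1 (2018-07-02)"))
```

The banner text can be taken from the first line printed by `R --version` in the same environment that runs the pipeline.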

Welcome to submit any questions here

Dear users,
If you encounter any problems, bugs, or doubts while using CircCode, or if you have any suggestions, you are welcome to submit them here to help us improve it.
Thank you for using CircCode.
:)

Bug in find_RCRJ_and_classify.py

Fixed an issue where the 'coverage_counts' parameter specified by the user in the configuration file could not be properly applied to the process.

Error in running python3 find_RCRJ_and_classify.py -y config.yaml

15447
15448
15449
15450
15451
Error in (function (edges, n = max(edges), directed = TRUE) :
  At type_indexededgelist.c:117 : cannot create empty graph with negative number of vertices, Invalid value
Calls: classification -> createNet -> graph -> do.call -> <Anonymous>
In addition: Warning message:
In max(edges) : no non-missing arguments to max; returning -Inf
Execution halted
Classify successfully!
sed: can't read /path/to/my_directory/CircCode_tem/ERR45678Aligned.sortedByCoord.out.bam.merge_result.RCRJ_result.csv_translated_circ.fa: No such file or directory
Traceback (most recent call last):
  File "find_RCRJ_and_classify.py", line 369, in <module>
    main()
  File "find_RCRJ_and_classify.py", line 355, in main
    classify(coding_seq,
  File "find_RCRJ_and_classify.py", line 102, in classify
    seqs = SeqIO.parse(final_trans_file, 'fasta')
  File "/home/.conda/envs/circcode_env/lib/python3.8/site-packages/Bio/SeqIO/__init__.py", line 607, in parse
    return iterator_generator(handle)
  File "/home/.conda/envs/circcode_env/lib/python3.8/site-packages/Bio/SeqIO/FastaIO.py", line 183, in __init__
    super().__init__(source, mode="t", fmt="Fasta")
  File "/home/.conda/envs/circcode_env/lib/python3.8/site-packages/Bio/SeqIO/Interfaces.py", line 47, in __init__
    self.stream = open(source, "r" + mode)
FileNotFoundError: [Errno 2] No such file or directory: '/path/to/my_directory/CircCode_tem/ERR45678Aligned.sortedByCoord.out.bam.merge_result.RCRJ_result.csv_translated_circ.fa'

Help needed!

find_RCRJ_and_classify.py error

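When I run find_RCRJ_and_classify.py, it fails at the end with:
Error in seq_len(limitThreshold) : argument must be coercible to non-negative integer
Calls: classification -> sapply -> lapply
In addition: Warning message:
In max(numeric(0), ..., na.rm = na.rm) : no non-missing arguments to max; returning -Inf
Execution halted

How can I solve this problem? Thank you.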

Access to necessary data for running Circcode

Hi,
It is not clear to me where you obtained the human ribosomal RNA sequences from Ensembl.
It is also not clear to me what you mean by coding and non-coding sequences.
Finally, can the tool be used on RNA-seq data from ribosomal-RNA-depleted libraries?

Regards
