microgenomics / tutorials

In this repo you will find demos and tutorials prepared by members of the CBIB. Feel free to use them for non-commercial activities given that you give proper credit. Have fun!

License: MIT License

Python 100.00%

tutorials's People

Contributors

ecastron, katterinne

tutorials's Issues

Interpretation of roary output

Hello All,
I would be grateful if you could help me interpret the attached image, which is an output of your Roary tutorial.
(attached image: pangenome_frequency-2)

I am especially unsure about the number of genomes and the corresponding number of genes. If the plot shows, say, 2 genomes, which of the genomes does it refer to? Should I assume it is an average?
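
A note on reading the plot, assuming it is the gene-frequency histogram drawn by the roary_plots step of the tutorial: the x-axis value is the number of genomes a gene is found in, so the bar at 2 would count the genes present in exactly two genomes (any two), rather than an average or two particular genomes. The sketch below reproduces that count from a presence/absence table. The function name `genes_by_genome_count`, the tiny `demo` table and the default of 14 leading metadata columns are assumptions made for illustration; check the header of your own gene_presence_absence.csv before relying on the default.

```python
import csv
import io
from collections import Counter


def genes_by_genome_count(csv_text, n_metadata_columns=14):
    """Map 'number of genomes' -> 'number of genes found in exactly that many'."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    counts = Counter()
    for row in reader:
        # Everything after the metadata columns is one cell per genome;
        # an empty cell means the gene is absent from that genome.
        genome_cells = row[n_metadata_columns:]
        n_present = sum(1 for cell in genome_cells if cell.strip())
        counts[n_present] += 1
    return dict(counts)


# Hypothetical toy table: one metadata column (Gene) and three genomes.
demo = (
    "Gene,g1,g2,g3\n"
    "geneA,a1,a2,a3\n"
    "geneB,b1,,b3\n"
    "geneC,,,c3\n"
    "geneD,d1,d2,\n"
)
hist = genes_by_genome_count(demo, n_metadata_columns=1)
```

In this toy table, geneB (in g1 and g3) and geneD (in g1 and g2) both land in the "2 genomes" bin even though they are found in different pairs of genomes.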

Pangenome sequence analysis

Hello,

I started working with the DBGenerator.py script a few days ago. It's a great help for me, and I got it to work with Python 2.7 and with genomes which still have their version numbers attached (e.g. NC_008253.1). It worked well on my example data, but not on the whole dataset.

The problem is apparently duplicated genes in the gene_presence_absence.csv table. In that table, a sample can have multiple gene IDs for one gene, separated by tabs. In genomas_locus.csv, I then get multiple entries as well, like this:
NZ_CP027766.1|['NZ_CP027766.1_00163', 'NZ_CP027766.1_00164']
NZ_CP027766.1|['NZ_CP027766.1_00163', 'NZ_CP027766.1_00164']

I am not sure how to proceed with this. Did this happen in your analysis as well? You have the

else:
    print(locus)
    raise

lines in the get_locus_sequence() function, so maybe you were looking at this already?

Thanks!
@LilithElina
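
One possible way to handle such cells, sketched under the assumption that the locus IDs within a cell are separated by tabs as described above: split every cell into individual IDs before writing one row per genome and locus. `split_locus_ids` and `genome_locus_rows` are hypothetical helper names, not functions from DBGenerator.py.

```python
def split_locus_ids(cell):
    """Return the individual locus IDs stored in one table cell."""
    return [part.strip() for part in cell.split("\t") if part.strip()]


def genome_locus_rows(genome, cells):
    """Yield one (genome, locus) pair per locus ID, skipping repeats."""
    seen = set()
    for cell in cells:
        for locus in split_locus_ids(cell):
            if locus not in seen:
                seen.add(locus)
                yield genome, locus


# Example with the IDs quoted in the issue; the second cell repeats one ID.
rows = list(genome_locus_rows(
    "NZ_CP027766.1",
    ["NZ_CP027766.1_00163\tNZ_CP027766.1_00164", "NZ_CP027766.1_00163"],
))
```

Each locus then appears once per genome as a plain string instead of as a Python list written into the CSV.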

Possible modifications or aids for the user

Hi.
I really appreciate the tutorial. I worked through it and noticed a few changes and missing details that would improve it. I hope this is not too obvious.

In the "Determining the Pangenome" and "roary_plots" parts you show the core alignment tree, but the program only provides the accessory tree (at least for me), so you need to run "FastTree -nt -gtr core_gene_alignment.aln > core_alignment.newick" to obtain the core tree and plot it.
Also, in the "Pangenome sequence analysis" part the Python script is named DBGenerator.py, but the command line above it says ...GeneratorDB.py.., which makes the command fail, and some people may not notice why.

Thanks for everything.
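
The FastTree step mentioned above can also be scripted. This is a minimal sketch that wraps the command quoted in this issue and nothing more; the helper names `fasttree_command` and `build_core_tree` are hypothetical, and it assumes the FastTree executable is available on PATH.

```python
import shutil
import subprocess


def fasttree_command(alignment="core_gene_alignment.aln"):
    """Build the argument list for the FastTree call quoted in the issue."""
    return ["FastTree", "-nt", "-gtr", alignment]


def build_core_tree(alignment="core_gene_alignment.aln",
                    output="core_alignment.newick"):
    """Run FastTree and write the tree it prints on stdout to `output`."""
    if shutil.which("FastTree") is None:
        raise FileNotFoundError("FastTree executable not found on PATH")
    with open(output, "w") as handle:
        # check=True raises CalledProcessError if FastTree exits with an error.
        subprocess.run(fasttree_command(alignment), stdout=handle, check=True)
    return output
```

Calling `build_core_tree()` in the Roary output directory would produce core_alignment.newick, which can then be passed to the plotting step.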

About sqlite3

Hi,
I'm sorry to bother you. I am having trouble with the database: I have created it, but I can't select any results from it.
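
A first thing to check in a situation like this, sketched with Python's built-in sqlite3 module: list the tables in the database and count their rows, which shows whether the tables were created and whether they are empty. The in-memory database and the table `demo_locus` below are made up for illustration; for a real database, pass its file path to `sqlite3.connect`.

```python
import sqlite3


def summarize_tables(connection):
    """Return {table name: row count} for every table in the database."""
    cursor = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    summary = {}
    for (name,) in cursor.fetchall():
        # Quote the identifier so unusual table names do not break the query.
        quoted = '"' + name.replace('"', '""') + '"'
        count = connection.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        summary[name] = count
    return summary


# Hypothetical demo database with a single two-row table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo_locus (genome TEXT, locus TEXT)")
demo.executemany(
    "INSERT INTO demo_locus VALUES (?, ?)",
    [("NC_008253.1", "NC_008253.1_00001"), ("NC_008253.1", "NC_008253.1_00002")],
)
summary = summarize_tables(demo)
```

If a table is listed with a count of 0, the SELECT is probably fine and the problem lies in the step that was supposed to fill the table.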

Pangenome sequence analysis using the python script

Has anyone ever had a problem with the Python script that brings the sequences from Prokka's .ffn files into the analysis results from Roary?
I've used the script from: https://github.com/EnzoAndree/tutorials/blob/patch-1/DBGenerator.py
The first three parts of the script work fine and produce three .csv files. But "get_locus_sequence" does not, and produces an empty table. I have tried with different numbers of sequences and with the demo set, without any luck.
Does anyone know what could be wrong?
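
One way to narrow down an empty sequence table, offered as a diagnostic sketch and not as a confirmed cause: check whether the locus IDs listed in the Roary table actually occur as sequence IDs in the .ffn files. The helper names `fasta_ids` and `missing_ids` and the demo records are hypothetical; the sketch assumes that the sequence ID is the first whitespace-delimited word after `>` in each FASTA header.

```python
def fasta_ids(fasta_text):
    """Collect the sequence IDs (first word after '>') from FASTA text."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                ids.add(words[0])
    return ids


def missing_ids(expected_ids, fasta_text):
    """Return the expected IDs that have no matching FASTA record."""
    return sorted(set(expected_ids) - fasta_ids(fasta_text))


# Hypothetical two-record .ffn content and three IDs taken from a table.
demo_ffn = ">A_00001 hypothetical protein\nATGAAA\n>A_00002 dnaA\nATGCCC\n"
not_found = missing_ids(["A_00001", "A_00002", "A_00003"], demo_ffn)
```

If nearly every ID comes back as missing, the IDs in the two inputs probably follow different naming schemes, which would leave the sequence table empty.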

Disk quota exceeded when running Roary

Hi,

I was following this tutorial and ran into problems when running Roary on the .gff files produced by Prokka.
Here's the message:

Warning: unable to close filehandle $bed_fh properly: Disk quota exceeded at /home/nickolas/anaconda3/lib/site_perl/5.26.2/Bio/Roary/BedFromGFFRole.pm line 41.
sh: line 1: 22825 Aborted

The demo directory is created, but every file is empty.
I installed all the dependencies and checked my disk quota with the quota command.

Can you help me? Thanks in advance,

Nicolas

DBGenerator.py

Hi there,

I'm following your very useful pipeline to analyse 43 bacterial samples.
But when running the DBGenerator.py script, I get the four output files with information for only 2 of the 43 samples!
I have the script, the gene_presence_absence.csv file and the ffn folder containing all the .ffn files from Prokka in the same folder.
This is the command with the output:

python3 DBGenerator.py ffn
Starting get_genomas_locus
Starting get_pangenoma
Starting get_pangenoma_locus
END get_pangenoma
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib64/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "DBGenerator.py", line 24, in get_genomas_locus
    for locus in loci: # separalos!
NameError: name 'loci' is not defined
Starting get_locus_sequence
END get_pangenoma_locus
END get_locus_sequence

(I'm on a server running CentOS 7).

Suggestions?

Edit.
Solved thanks to the other closed issues.
Please update the main code in the front page: https://github.com/microgenomics/tutorials/blob/master/DBGenerator.py with the one in the zip file here: #2 (comment)
