islandcafe's Introduction

IslandCafe

IslandCafe is a genomic island prediction tool that uses sequence composition and functional information to identify genomic islands.

Example

First, set the permissions on the program files:

sudo chmod 775 cafe
sudo chmod 775 cafe.out

To run the program on the example files:

./cafe -phylo -genus bartonella -gbk example.gbk

Input

CAFE requires a genome sequence and its annotations to predict genomic islands. These can be provided as a single GenBank file.

./cafe [options] -gbk genome.gbk

If the genome is not annotated, CAFE can identify the marker genes for genomic islands itself. This requires Prodigal and HMMER to be installed and included in the PATH.

./cafe [options] -annot genome.fna
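
Before running with -annot, it can help to confirm that both tools are reachable. The following is a minimal sketch and not part of CAFE: the function name missing_tools is hypothetical, and the executable names prodigal and hmmsearch are assumptions, since this README does not say which HMMER program CAFE calls.

```shell
#!/bin/sh
# Print the name of each listed tool that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed executable names; adjust them if your installation differs.
missing=$(missing_tools prodigal hmmsearch)
if [ -n "$missing" ]; then
    # $missing is left unquoted so the names print on one line.
    echo "Not found on PATH:" $missing >&2
else
    echo "Prodigal and HMMER executables found on PATH."
fi
```

If a tool is reported as missing, install it or add its directory to PATH before using -annot.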

To use the phylogenetic module of CAFE, first download reference protein sequence files (.faa format) into the faa folder. CAFE requires at least five reference protein files for comparison. The faa folder should contain only reference protein sequence files; remove any pre-existing files from it if you are not running the example genome. The phylogenetic module also requires that the genus name be specified using the -genus option.

./cafe [options] -phylo -genus [genus_name] -gbk genome.gbk

Note: this requires BLAST version 2.6 or higher. The phylogenetic module also requires that users specify the genus name.
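
The faa folder requirements described above can be checked before a run. This is a rough sketch under stated assumptions, not something CAFE provides: count_faa and non_faa are hypothetical helper names, and the FAA_DIR variable is an illustrative way to point at a folder other than faa.

```shell
#!/bin/sh
# Count the *.faa files directly inside a directory.
count_faa() {
    find "$1" -maxdepth 1 -type f -name '*.faa' | wc -l | tr -d ' '
}

# List files directly inside a directory that are not *.faa.
non_faa() {
    find "$1" -maxdepth 1 -type f ! -name '*.faa'
}

faa_dir=${FAA_DIR:-faa}   # FAA_DIR is a hypothetical override
if [ -d "$faa_dir" ]; then
    n=$(count_faa "$faa_dir")
    if [ "$n" -lt 5 ]; then
        echo "Only $n .faa files in $faa_dir; at least five are required." >&2
    fi
    extra=$(non_faa "$faa_dir")
    if [ -n "$extra" ]; then
        echo "Files other than .faa found; consider removing them:" >&2
        printf '%s\n' "$extra" >&2
    fi
else
    echo "Folder $faa_dir not found." >&2
fi
```

The check only inspects file names; it does not verify that the files actually contain protein sequences or that they belong to the chosen genus.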

Options

--help Print help and exit

--info Print program information and exit

--annot Annotate marker genes. This option is only required if the input file is in FASTA format. It requires Prodigal and HMMER to be installed and in the PATH

--Thres Provide segmentation, contiguous clustering, and non-contiguous clustering thresholds (range: 0-1, e.g. 0.8 0.99999 0.999)

--gbk Use a GenBank file as input

--phylo Use the phylogenetic module. The genus must be specified when using the phylogenetic module (e.g. ./cafe -phylo -genus escherichia -gbk ecoli.gbk)

--genus Specify genus name of the input genome

--out Output file name

--verbose Print on screen

--expert Keep temporary files for user analyses

--visual Make a map of genomic islands (requires CGView to be installed and in the PATH)

Output

CAFE outputs a tab-separated text file. The file with the suffix CAFE.txt contains the genomic island predictions. It has four columns: genomic island ID, start coordinate, end coordinate, and length of the genomic island.
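
Because the predictions are plain tab-separated text, they are easy to summarize with standard tools. The sketch below assumes the four-column layout described above; the function name summarize_islands is hypothetical, and skipping non-numeric lines is a precaution in case the file has a header row, which this README does not specify.

```shell
#!/bin/sh
# Count predicted islands and sum their lengths from a CAFE.txt-style file.
summarize_islands() {
    awk -F '\t' '
        # Use only rows whose start, end, and length columns are integers,
        # so blank lines and a possible header row are ignored.
        NF >= 4 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ {
            n++
            total += $4
        }
        END { printf "islands=%d total_length=%d\n", n, total }
    ' "$1"
}
```

For example, summarize_islands followed by the path to the file ending in CAFE.txt prints a single line with the number of islands and their combined length.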

Requirements

IslandCafe requires Perl with BioPerl and the following modules: File::Copy, Bio::SeqIO, List::MoreUtils, and List::Util. The cafe.out file is compiled with the gcc compiler.
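
One way to confirm that the listed modules are installed is to ask Perl to load each one. This is a minimal sketch rather than an official installation check; the function name missing_perl_modules is hypothetical.

```shell
#!/bin/sh
# Print each named Perl module that fails to load.
missing_perl_modules() {
    for mod in "$@"; do
        perl -M"$mod" -e 1 >/dev/null 2>&1 || printf '%s\n' "$mod"
    done
}

missing=$(missing_perl_modules File::Copy Bio::SeqIO List::MoreUtils List::Util)
if [ -n "$missing" ]; then
    # $missing is left unquoted so the names print on one line.
    echo "Missing Perl modules:" $missing >&2
else
    echo "All listed Perl modules load."
fi
```

A module reported as missing can usually be installed from CPAN or from the system package manager.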

Note

This program has been tested on a 64-bit machine and is intended for use on 64-bit computers.

islandcafe's People

Contributors

mehuljani

Forkers

nickp60 axbazin g1o

islandcafe's Issues

Cannot open input file: No such file or directory at cgview_xml_builder.pl line 1820.

I found that cafe did not write the CAFE.fna file in the expected folder, so the input file required by cgview_xml_builder.pl was missing. This produced the error: Cannot open input file: No such file or directory at cgview_xml_builder.pl line 1820.

system ("perl cgview_xml_builder.pl -sequence $in_filename)
$in_filename should be "$name_CAFE.fna" when the input file and output file are not in the same place.

I am not familiar with Perl code, so I made the following simple revision to the code:

$wrong_path=$infile . "_CAFE.fna";
$correct_path="$name\_CAFE.fna";
system ("mv $wrong_path $correct_path");

This revision works fine for me.

Testing with sample data did not work

I used this command:
./cafe -phylo -genus bartonella -gbk example.gbk --verbose

The output was:
2487641 2488298 0.060222165473
2485102 2487640 0.017415755062
2488299 2489200 0.044747224651
Clustering...
It is almost done now
Undefined subroutine &main::abs_path called at ./cafe line 400, line 47389.

No clusters with example data

When running the example, I get the following error:

$ ./cafe -phylo -genus bartonella -gbk example.gbk


Building a new DB, current time: 09/25/2018 12:17:13
New DB name:   /home/nicholas/GitHub/CAFE/faa_database
New DB title:  faa_database
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 7095 sequences in 0.29547 seconds.
No Genomic islands detected
rm: cannot remove 'CAFE_temp': No such file or directory

Looking at cafe, it appears that the call to cafe.out is commented out around lines 386 and 393.

I think I have the fix, so I'll submit a pull request, but if I'm off base please do let me know!

Thanks in advance,

~Nick

Phylogenetic module

Hi! Thank you very much for your software, so far it has worked well for me.
Could you briefly explain how to make the phylogenetic module option work? I really don't understand it, and I would appreciate an explanation of its use so that I can do a better job.

Thank you in advance

Sven
