linyunliu / bioinformatics

Unraveling Evolutionary Relationship between Fungi Species

Java 35.53% Shell 49.96% Python 13.58% Batchfile 0.93%
bioinformatics blastp computer-science maximum-likelihood phylogeny biotechnology

ABOUT

Contribution

Linyun Liu, Tarandeep Kaur, Shika Borge

Research Question

How can analyzing protein sequences from Saccharomyces cerevisiae, Schizosaccharomyces pombe and other fungal species reveal their evolutionary relationships?

Objective

To construct a phylogenetic tree from protein clusters using the maximum likelihood method, in order to reveal the evolutionary relationships between Saccharomyces cerevisiae, Schizosaccharomyces pombe and other fungal species.

CONNECT TO FTP TO DOWNLOAD THE COMMAND LINE TOOL

ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/

REFERENCES

1. Blast Software General Introduction

2. How to blast against a particular set of local sequences (local database)?

3. Building a BLAST database with your (local) sequences

4. Blast Command Line Tool User Manual

ALIGNMENT FILE OUTPUT OPTIONS (-outfmt x)

x Description
0 pairwise
1 query-anchored showing identities
2 query-anchored no identities
3 flat query-anchored, show identities
4 flat query-anchored, no identities
5 XML Blast output
6 tabular
7 tabular with comment lines
8 Text ASN.1
9 Binary ASN.1
10 Comma-separated values
11 BLAST archive format (ASN.1)
12 Seqalign (JSON)
13 Multiple-file BLAST JSON
14 Multiple-file BLAST XML2
15 Single-file BLAST JSON
16 Single-file BLAST XML2
17 Sequence Alignment/Map (SAM)
18 Organism Report
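
Of these formats, the tabular ones (6 and 7) are usually the easiest to post-process from scripts. As a minimal sketch, assuming a search was run with -outfmt 6 and the default column set (in which column 11 is the e-value), the helper below keeps only hits at or below a chosen e-value cutoff. The function name filter_hits_by_evalue and the file name result.tsv are illustrative and not part of BLAST.

```shell
# Hypothetical helper (not part of BLAST): keep hits whose e-value is at or
# below a cutoff. Assumes the default -outfmt 6 columns:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
filter_hits_by_evalue() {
    # $1 = tabular BLAST result file, $2 = maximum e-value (e.g. 1e-5)
    awk -F'\t' -v max="$2" '($11 + 0) <= (max + 0)' "$1"
}

# Example usage (result.tsv is an assumed file name):
# filter_hits_by_evalue result.tsv 1e-5 > significant_hits.tsv
```

If a custom column list is passed to -outfmt, the e-value may sit in a different column, so the field number would need adjusting.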

COMMAND LINE USAGE

1. Making BLAST database of local sequences

The input file must consist of sequences in FASTA format.

$ makeblastdb -in sequence.fasta -parse_seqids -dbtype prot -out /PATH/TO/YOUR/DATABASE

Here, -in is the input file, -dbtype is the sequence type (prot for protein, nucl for nucleotide), and -out is the name of the BLAST database to create. -parse_seqids tells makeblastdb to parse the sequence IDs from the FASTA headers, so that individual sequences can be looked up by ID in later analyses. If the input file is in another directory, provide its full path.
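
Before building the database it can help to sanity-check the FASTA input, since duplicate sequence IDs tend to cause problems when -parse_seqids is used. The following is a minimal sketch using only standard Unix tools; the function names count_fasta_records and duplicate_fasta_ids are hypothetical helpers, and the ID is assumed to be the first whitespace-delimited word after the '>' in each header.

```shell
# Hypothetical helpers (not part of BLAST) for checking a FASTA file.

# Count records: every FASTA record starts with a '>' header line.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Print each sequence ID that occurs more than once.
# Assumes the ID is the first word of the header, directly after '>'.
duplicate_fasta_ids() {
    awk '/^>/ { print substr($1, 2) }' "$1" | sort | uniq -d
}

# Example usage:
# count_fasta_records sequence.fasta
# duplicate_fasta_ids sequence.fasta
```

An empty result from duplicate_fasta_ids means every ID in the file is unique.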

2. BLAST a single sequence against the local database

$ blastp -db /PATH/TO/YOUR/DATABASE -query seq.fasta -outfmt 0 -out result.txt
$ blastp -db /PATH/TO/YOUR/DATABASE -query seq.fasta -outfmt 5 -out result.xml

where -db is the BLAST database created in the previous step, -query is a file containing a sequence in FASTA format, -outfmt is the output format (one of the values in the table above), and -out is the file the results are written to. The optional -num_threads flag sets the number of CPU threads to use during the search. For nucleotide sequences, use blastn or another appropriate BLAST executable.

3. All against all

To BLAST local sequences against the local database created from the same input sequences, the input sequences are used as a query file in FASTA format.

$ blastp -db blastdb -query sequence.fasta -outfmt 0 -out result.txt

In the command above, the database is the local database created in the first step, and the query is the set of input sequences from which that database was built.
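
In an all-against-all search every sequence also matches itself, and those self-hits are usually not wanted downstream. As a sketch under stated assumptions: if the search is run with -outfmt 6 instead of -outfmt 0, so that column 1 is the query ID and column 2 is the subject ID, the self-hits can be dropped with a short filter. The function name drop_self_hits and the file names are illustrative only.

```shell
# Hypothetical helper (not part of BLAST): remove rows where the query ID
# (column 1) equals the subject ID (column 2) in tabular -outfmt 6 output.
drop_self_hits() {
    # $1 = tabular BLAST result file
    awk -F'\t' '$1 != $2' "$1"
}

# Example usage (file names are assumptions):
# drop_self_hits all_vs_all.tsv > all_vs_all.noself.tsv
```

The comparison is an exact, case-sensitive match on the two ID columns.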
