jorgemfs / rfsc

This project forked from cobilab/rfsc

Reference-Free Sequence Classification Tool for DNA sequences in metagenomic samples

License: GNU General Public License v3.0

Shell 54.81% Python 45.08% Dockerfile 0.11%

rfsc's Introduction

RFSC is a Reference-Free Sequence Classification tool that uses machine learning classifiers and relies on an ensemble of experts to provide efficient classification in metagenomic contexts.

Installation

git clone https://github.com/cobilab/RFSC
cd RFSC
chmod +x RFSC.sh 
./RFSC.sh --install

Using Docker

git clone https://github.com/cobilab/RFSC
cd RFSC
docker-compose build
docker-compose up -d && docker exec -it rfsc bash && docker-compose down
chmod +x RFSC.sh 
./RFSC.sh --install

Build NCBI Reference Databases

./RFSC.sh --build-ref-virus --build-ref-bacteria --build-ref-archaea --build-ref-protozoa \
  --build-ref-fungi --build-ref-plant --build-ref-mitochondrial --build-ref-plastid
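
Since a full rebuild needs a lot of disk space, it may be useful to build only some of the databases. The sketch below assembles the --build-ref-* flags for a chosen subset and prints the resulting command as a dry run. It assumes that RFSC.sh accepts any subset of these flags, which the README does not state explicitly; the helper name build_ref_args and the variable KNOWN_GROUPS are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: assemble --build-ref-* flags for a subset of the reference databases.
# Assumption: RFSC.sh accepts any subset of the flags shown in the README.
set -euo pipefail

# Database groups taken from the build command in the README.
KNOWN_GROUPS="virus bacteria archaea protozoa fungi plant mitochondrial plastid"

# build_ref_args GROUP... prints one --build-ref-GROUP flag per argument
# and fails on a group that is not listed in KNOWN_GROUPS.
build_ref_args() {
  local group args=""
  for group in "$@"; do
    case " $KNOWN_GROUPS " in
      *" $group "*) args="$args --build-ref-$group" ;;
      *) echo "unknown database group: $group" >&2; return 1 ;;
    esac
  done
  # Drop the leading space left by the concatenation above.
  echo "${args# }"
}

# Dry run: print the command instead of executing it.
echo "./RFSC.sh $(build_ref_args virus bacteria)"
```

Printing the command first makes it easy to review the selection before starting a long download.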

Running Examples

✨ Generate a synthetic sequence, then run a Reference-Free Reconstruction of it:

./RFSC.sh --clean y
./RFSC.sh --threads 8 --gen-adapters
./RFSC.sh --efetch-fasta 155971 Input_Data/EntrezGenomes 
./RFSC.sh --efetch-fasta EF491856.1 Input_Data/EntrezGenomes 
./RFSC.sh --efetch-fasta MT682520 Input_Data/EntrezGenomes
./RFSC.sh -synt Input_Data/EntrezGenomes/155971.fna Input_Data/EntrezGenomes/EF491856.1.fna Input_Data/EntrezGenomes/MT682520.fna
./RFSC.sh -trim TT PE --run-de-novo
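
The fetch and -synt steps above repeat the same accession list twice, so they can be driven from a single list. The following is a minimal sketch that only prints the commands (a dry run). It assumes each fetched file is stored as <accession>.fna in the output directory, as the -synt call above suggests; the helper names fna_path and plan_commands are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: derive the --efetch-fasta and -synt commands from one accession list.
# Assumption: each fetched genome is saved as <accession>.fna in OUT_DIR.
set -euo pipefail

OUT_DIR="Input_Data/EntrezGenomes"
ACCESSIONS="155971 EF491856.1 MT682520"

# fna_path ACCESSION prints the path where the fetched FASTA file is expected.
fna_path() {
  echo "$OUT_DIR/$1.fna"
}

# plan_commands prints one fetch command per accession, followed by a single
# -synt command that takes all fetched files as input.
plan_commands() {
  local acc inputs=""
  for acc in $ACCESSIONS; do
    echo "./RFSC.sh --efetch-fasta $acc $OUT_DIR"
    inputs="$inputs $(fna_path "$acc")"
  done
  echo "./RFSC.sh -synt$inputs"
}

plan_commands
```

To change the genomes in the synthetic sample, only ACCESSIONS needs to be edited; the printed commands can then be reviewed and run from the RFSC directory.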

✨ Reference-Based Classification, using FALCON-meta
(if the reference databases have already been built and the Reference-Free Reconstruction stage has finished):

./RFSC.sh --threads 8 --set-len-cov 100 3 --set-threshold-max-min 70 1 --run-falcon SO Viral

✨ Reference-Free Classification, using XGBoost:


./RFSC.sh --threads 8 --efetch-fasta 155971 RefFree
./RFSC.sh --run-xgboost

✨ Run all classifiers on real data:


./RFSC.sh --run-all-classifiers Accuracy
./RFSC.sh --run-all-classifiers F1Score
./RFSC.sh --run-all-classifiers

System Requirements

A laptop computer running Ubuntu Linux (for example, 18.04 LTS or higher) with GCC (https://gcc.gnu.org), Conda (https://docs.conda.io) and CMake (https://cmake.org) installed. The hardware must have at least 8 GB of RAM and an 800 GB disk. If the reference databases are not rebuilt, only about 10 GB of disk space is needed.
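
The requirements above can be checked before installing. The following is a rough pre-flight sketch, not part of RFSC: it looks for gcc, conda and cmake on the PATH, reads total memory from /proc/meminfo (Linux only) and checks free space in the current directory. The helper names meets_minimum and missing_tools are hypothetical, and the thresholds are copied from the paragraph above.

```shell
#!/usr/bin/env bash
# Sketch: pre-flight check of tools, memory and disk space (not part of RFSC).

MIN_RAM_GB=8     # from the requirements above
MIN_DISK_GB=10   # without rebuilding the databases; use 800 for a full rebuild

# meets_minimum ACTUAL REQUIRED succeeds when ACTUAL >= REQUIRED (integers).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# missing_tools TOOL... prints every tool that is not found on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools gcc conda cmake)
[ -z "$missing" ] || echo "missing tools:" $missing

# MemTotal is reported in kB; round to the nearest GB because a nominal
# 8 GB machine usually reports slightly less than 8 GB.
ram_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024 + 0.5)}' /proc/meminfo 2>/dev/null)
ram_gb=${ram_gb:-0}
meets_minimum "$ram_gb" "$MIN_RAM_GB" || echo "RAM: ${ram_gb} GB, need ${MIN_RAM_GB} GB"

# Free space in the current directory, in whole GB (POSIX df output in kB).
disk_gb=$(df -Pk . | awk 'NR == 2 {print int($4 / 1024 / 1024)}')
disk_gb=${disk_gb:-0}
meets_minimum "$disk_gb" "$MIN_DISK_GB" || echo "disk: ${disk_gb} GB free, need ${MIN_DISK_GB} GB"
```

The script prints nothing when every check passes. Run it from the directory where RFSC will be cloned so that the disk check looks at the right file system.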

Tools Integrated in RFSC

Tool         URL
Trimmomatic  http://www.usadellab.org/cms/?page=trimmomatic
FASTP        https://github.com/OpenGene/fastp
metaSPAdes   https://cab.spbu.ru/software/meta-spades/
GTO          https://cobilab.github.io/gto/
Entrez       https://www.ncbi.nlm.nih.gov/genome
FALCON-meta  https://github.com/cobilab/falcon
Cryfa        https://github.com/cobilab/cryfa
Blastn       https://blast.ncbi.nlm.nih.gov/Blast.cgi
ORFfinder    https://www.ncbi.nlm.nih.gov/orffinder/
ORFM         https://github.com/wwood/OrfM
GeCo3        https://github.com/cobilab/geco3
AC           https://github.com/cobilab/ac

License

GNU GPL

✨Developed to make a change!✨

Contributors

alexmlourenco, joaorafaelalmeida, jorgemfs, pratas
