genomeannotation's Introduction

Hey! I'm Daren. Welcome to my GitHub profile!


Website  Twitter  Email  Google Scholar  LinkedIn  ORCID  ResearchGate 

Background

Ph.D., Quantitative Biology (Evolutionary Biology, Genetics, Genomics, & Bioinformatics) from the University of Texas at Arlington.

Main academic interest: Convergent, adaptive evolution of vertebrates using cutting-edge genomics approaches.

Currently a postdoctoral fellow in the Edwards lab at Harvard University.

Previously a graduate student studying in the Castoe lab at UT-Arlington.


Genomic Analysis Toolkit

Genome Sequencing, Assembly, & Annotation, RAD-seq, RNA-seq, ATAC-seq, Hi-C, and more!


Coding Languages

Shell Script  R  Python  Git  Markdown 

Badge Sources: Awesome Badges - https://github.com/Envoy-VC/awesome-badges and custom badges constructed using https://shields.io/


genomeannotation's Issues

rmOutToGFF3custom sorts gff3 header into gff3 file

Hi Daren,

Thanks for posting such detailed work on Maker, especially the examples that include custom masked repeats. I found a small error in your rmOutToGFF3custom script: as written, it sorts the ##gff-version 3 header along with the records of the GFF file. Moving one parenthesis fixes it.

#!/usr/bin/env bash
usage()
{
cat << EOF
rmOutToGFF3custom
Version 1 (2019-10-10)
License: GNU GPLv2
To report bugs or errors, please contact Daren Card ([email protected]).
This script is provided as-is, with no support and no guarantee of proper or
desirable functioning.

This script converts the .out file from RepeatMasker to a GFF3 file. Note that the output
is probably not perfect GFF3, so beware with downstream applications. This script
emulates the rmOutToGFF3.pl script supplied with RepeatMasker but provides a fuller ID
("Target=") for each element in column 9 of the GFF. This ID includes the matching element,
like rmOutToGFF3.pl, but also includes the repeat family: in the format <Family>/<Element>.
This change is because many matching elements produced from RepeatModeler have IDs that
provide no information about repeat family classification. Output is written to standard
output (STDOUT).

This script requires awk, which should be available on any standard Unix system.

rmOutToGFF3custom -o <RM.out> [-h] > <name.gff3>

OPTIONS:
        -h              usage information and help (this message)
        -o              RepeatMasker .out file
EOF
}

while getopts "ho:" OPTION
do
        case $OPTION in
                h)
                        usage
                        exit 1
                        ;;
                o)
                        RMOUT=$OPTARG
                        ;;
        esac
done

if [[ -z $RMOUT ]]
then
        usage
        exit 1
fi

# Emit the header in its own stream so that only the records pass through sort.
cat <(echo "##gff-version 3") \
<(tail -n +4 "${RMOUT}" | \
awk -v OFS="\t" '{ if ($12 ~ /)/) print $5, "RepeatMasker", "dispersed_repeat", $6, $7, $1, $9, ".", "Target="$11"/"$10" "$14" "$13; \
else print $5, "RepeatMasker", "dispersed_repeat", $6, $7, $1, $9, ".", "Target="$11"/"$10" "$12" "$13 }' | \
awk -v OFS="\t" '{ if ($7 == "C") print $1, $2, $3, $4, $5, $6, "-", $8, $9; else print $0 }' | \
sort -k1,1 -k4,4n -k5,5n )
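
The effect of the parenthesis placement can be sketched on a toy input. This is a minimal illustration rather than the script itself: the three records and the function names `records`, `header_sorted`, and `header_first` are made up for the example. When the header is sorted together with the records, where it ends up depends on the locale's collation rules, so it may or may not stay on the first line. Sorting only the records avoids that.

```shell
#!/usr/bin/env bash
# Hypothetical, already-converted GFF3 records (made up for illustration), deliberately unsorted.
records() {
    printf 'chr2\tRepeatMasker\tdispersed_repeat\t5\t90\t300\t+\t.\tTarget=LINE/L2 1 86\n'
    printf 'chr1\tRepeatMasker\tdispersed_repeat\t40\t80\t250\t-\t.\tTarget=SINE/MIR 3 43\n'
    printf 'chr1\tRepeatMasker\tdispersed_repeat\t7\t30\t120\t+\t.\tTarget=DNA/hAT 1 24\n'
}

# Header and records both pass through sort, so the header's final
# position depends on how the locale collates "##gff-version".
header_sorted() {
    cat <(echo "##gff-version 3") <(records) | sort -k1,1 -k4,4n -k5,5n
}

# Only the records are sorted; the header is emitted first, untouched.
header_first() {
    cat <(echo "##gff-version 3") <(records | sort -k1,1 -k4,4n -k5,5n)
}

header_first
```

Running `header_sorted` in place of `header_first` shows whether the header stays on the first line under the current locale.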
