
Comments (2)

billzt commented on September 22, 2024

Pragmas (‘##’) have a meaning that may depend on their context, so they should not be placed arbitrarily in the file without careful consideration.

The current script writes all the pragmas (‘##’) at the front of the file. This is wrong.

Thanks to Miklos Csuros for the review.

from gff3sort.

billzt commented on September 22, 2024

Possible pragma lines:

##gff-version 3.2.1
Required; must be the first line of the file

##sequence-region seqid start end
Optional

##feature-ontology URI
##attribute-ontology URI
##source-ontology URI
##species NCBI_Taxonomy_URI
##genome-build source buildName
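
To make the pragma forms listed above concrete, here is a minimal Python sketch of how a line could be classified. The function name `parse_directive` is hypothetical and is not part of gff3sort; the sketch only splits a line into a directive name and its whitespace-separated arguments, without validating them.

```python
from typing import List, Optional, Tuple


def parse_directive(line: str) -> Optional[Tuple[str, List[str]]]:
    """Return (name, args) for a '##' pragma line, or None for anything else.

    Hypothetical helper for illustration only; it does not validate arguments.
    """
    text = line.rstrip("\r\n")
    # The bare '###' line is the "forward references resolved" directive.
    if text.strip() == "###":
        return ("###", [])
    # Ordinary '#' comments, feature lines, and other '###...' lines are not pragmas.
    if not text.startswith("##") or text.startswith("###"):
        return None
    parts = text[2:].split()
    if not parts:
        return None
    return (parts[0], parts[1:])


if __name__ == "__main__":
    for example in ("##gff-version 3.2.1", "##sequence-region chr1 1 1000", "###"):
        print(parse_directive(example))
```

Keeping the line number next to each parsed pragma would let a sorter decide, per pragma type, whether the line can stay at the top or has to stay near the features it refers to.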

###
This directive (three # signs in a row) indicates that all forward references to feature IDs that have been seen to this point have been resolved. After seeing this directive, a program that is processing the file serially can close off any open objects that it has created and return them, thereby allowing iterative access to the file. Otherwise, software cannot know that a feature has been fully populated by its subfeatures until the end of the file has been reached. It is recommended that complex features, such as the canonical gene, be terminated with the ### notation.
(To be indexed properly by tabix, features must be sorted strictly by their start position, regardless of which gene they belong to. This matters for overlapping genes, whose features may interleave with each other. However, we are considering adding a # mark in each block, as an option.)
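
As an illustration of the serial processing that the ### directive allows, here is a minimal Python sketch that yields feature lines block by block. The name `iter_resolved_blocks` is hypothetical and not taken from gff3sort; the sketch assumes the input is an iterable of text lines and stops at a ##FASTA line.

```python
from typing import Iterable, Iterator, List


def iter_resolved_blocks(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield lists of feature lines, closing a block at each '###' line.

    Hypothetical helper for illustration only.
    """
    block: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.strip() == "###":
            # All forward references are resolved: hand the block over.
            if block:
                yield block
                block = []
            continue
        if line.startswith("##FASTA"):
            # The annotation portion ends here.
            break
        if not line.strip() or line.startswith("#"):
            # Skip blank lines, other pragmas, and comments.
            continue
        block.append(line)
    # Without a trailing '###', the last block only closes at end of input.
    if block:
        yield block
```

A caller can process each yielded block and discard it before reading further, which is the iterative access the directive is meant to enable.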

##FASTA
This notation indicates that the annotation portion of the file is at an end and that the remainder of the file contains one or more sequences (nucleotide or protein) in FASTA format. This allows features and sequences to be bundled together. All FASTA sequences included in the file must be included together at the end of the file and may not be interspersed with the features lines. Once a ##FASTA section is encountered no other content beyond valid FASTA sequence is allowed.
(Tabix cannot handle FASTA. Therefore any FASTA block should be removed and stored separately.)
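
One way to separate the FASTA block could look like the following Python sketch. The name `split_fasta` is hypothetical and not part of gff3sort; the sketch assumes the whole file fits in memory as a list of lines and drops the ##FASTA marker line itself.

```python
from typing import Iterable, List, Tuple


def split_fasta(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split GFF3 lines into (annotation_lines, fasta_lines).

    Hypothetical helper for illustration only. Everything after the first
    '##FASTA' line is treated as sequence data.
    """
    annotation: List[str] = []
    fasta: List[str] = []
    in_fasta = False
    for line in lines:
        if not in_fasta and line.startswith("##FASTA"):
            in_fasta = True
            continue  # the marker itself goes to neither part
        if in_fasta:
            fasta.append(line)
        else:
            annotation.append(line)
    return annotation, fasta
```

The annotation part can then be sorted and indexed, while the FASTA part is written to its own file.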

