Coder Social home page Coder Social logo

emacsmirror / 2bit Goto Github PK

View Code? Open in Web Editor NEW

This project forked from davep/2bit.el

0.0 2.0 0.0 75 KB

Library for reading data from 2bit files

Home Page: https://github.com/davep/2bit.el

License: GNU General Public License v3.0

Emacs Lisp 100.00%

2bit's Introduction

2bit.el

MELPA Stable MELPA

Introduction

2bit.el is a package for Emacs that provides a collection of functions, macros and interactive commands that can be used to extract data from 2bit format files. These are files that can hold DNA sequences in a compressed format. You'll see samples of this if you look at the human genome data, for example.

Emacs Lisp support

The package is built such that it provides functions and macros for working with 2bit files, and then goes on to build interactive commands on top of this code. The elisp-level support code includes:

2bit-open

2bit-open should be called to "open" a 2bit file for further use:

(2bit-open FILE &optional MASKING)

FILE is the path to the 2bit file you want to open. MASKING is an optional parameter to say if mask blocks should be taken into account when reading bases from the file. (it's worth noting that in my testing, mask block handling tends to make the reading of data a lot slower)

2bit-sequence-count

2bit-sequence-count can be used to quickly get the count of how many sequences are held in the 2bit file:

(2bit-sequence-count file)

FILE can either be the path to a 2bit file, or a value returned from 2bit-open.

2bit-sequence-names

2bit-sequence-names can be used to quickly get a list of the names of all the sequences held in a 2bit file:

(2bit-sequence-names file)

FILE can either be the path to a 2bit file, or a value returned from 2bit-open.

2bit-sequence

2bit-sequence can be used to quickly get a named sequence from a 2bit file:

(2bit-sequence file sequence)

FILE can either be the path to a 2bit file, or a value returned from 2bit-open.

SEQUENCE is the name of the sequence to get.

2bit-sequence-dna-size

2bit-sequence-dna-size returns the size of the DNA contained in the given sequence.

(2bit-sequence-dna-size sequence)

SEQUENCE must be a value returned from 2bit-sequence.

2bit-bases

2bit-bases can be used to get a string of bases from a sequence in a 2bit file:

(2bit-bases sequence start end)

SEQUENCE is a value returned from a call to 2bit-sequence. START and END describe the sub-sequence to grab. Note that the convention of zero-based, inclusive of START and exclusive of END is used.

2bit-with-file

2bit-with-file is a simple macro provides as a convenience wrapper when working with a 2bit file:

(2bit-with-file (handle file)
  ...body...)

HANDLE is the name to give to the "handle" of the 2bit file, and FILE is the file to open. For example:

;; Get a list of the sizes of each of the numbered chromosomes in the Human
;; Genome.
(2bit-with-file (hg "hg38.2bit")
  (cl-loop for chr from 1 to 22
           collect (2bit-sequence-dna-size (2bit-sequence hg (format "chr%d" chr)))))

Emacs commands

The following interactive commands are available:

2bit-insert-bases

2bit-insert-bases simply inserts the requested sequence at the current point in the current buffer.

2bit-insert-fasta

2bit-insert-fasta simply inserts the requested sequence at the current point in the current buffer, formatted in FASTA format.

Example

Here's a simple recording of a sample of using 2bit-insert-fasta to create a FASTA file from a 2bit file:

2bit's People

Contributors

davep avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.