jeff-k / bio-seq

18.0 18.0 2.0 456 KB

Bit packed and well-typed biological sequences

License: MIT License

Rust 100.00%

bio-seq's People

Forkers

jianshu93 szabgab

bio-seq's Issues

amino acid kmer utility

Hello Jeff,

It's really nice that you have this bit-packed encoding of DNA sequences. Your to-do list says that amino acid support will also be available. I think we would need at least 8 bits, right, instead of the 2 bits used for AGCT or the 4 bits used for IUPAC DNA (16 symbols), because there are 20 amino acids, and more under the IUPAC rules. I am doing amino acid k-mer counting for a huge database of bacterial genomes in amino acid format, and I cannot find any bit-level encoding and decoding of amino acid sequences except this one. Will amino acid k-mers be available soon? I would be very happy to work on this with you. My email is: [email protected]

Many thanks,

Jianshu
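
For readers weighing the bit width asked about above, here is a minimal sketch of the arithmetic and of one possible packing scheme. It is not bio-seq's API: the names `AMINO`, `bits_needed`, `pack_kmer` and `unpack_kmer` are hypothetical, and the alphabetical code assignment is an arbitrary assumption. Distinguishing 20 symbols takes ceil(log2(20)) = 5 bits, so a `u64` can hold a k-mer of up to 12 residues; a full byte per residue also works but is not the minimum.

```rust
/// The 20 standard amino acids in alphabetical order (an arbitrary choice).
/// The index of each letter in this array serves as its 5-bit code.
const AMINO: &[u8; 20] = b"ACDEFGHIKLMNPQRSTVWY";

/// Bits used per residue in this sketch.
const BITS: u32 = 5;

/// Smallest number of bits that can distinguish `n` symbols (requires n >= 1).
fn bits_needed(n: u32) -> u32 {
    // ceil(log2(n)), computed from the position of the highest set bit of n - 1.
    32 - (n - 1).leading_zeros()
}

/// Pack a k-mer of up to 12 residues into a u64 (12 * 5 = 60 bits).
/// Returns None for longer k-mers or for letters outside `AMINO`.
fn pack_kmer(kmer: &[u8]) -> Option<u64> {
    if kmer.len() > 12 {
        return None;
    }
    let mut packed = 0u64;
    for &aa in kmer {
        let code = AMINO.iter().position(|&c| c == aa)? as u64;
        // Shift earlier residues left and append the new code in the low bits.
        packed = (packed << BITS) | code;
    }
    Some(packed)
}

/// Unpack `k` residues from a packed value.
/// Returns None if any 5-bit field is not a valid code (20..=31).
fn unpack_kmer(mut packed: u64, k: usize) -> Option<String> {
    let mut out = vec![0u8; k];
    // The last residue sits in the lowest bits, so fill the buffer backwards.
    for slot in out.iter_mut().rev() {
        *slot = *AMINO.get((packed & 0b1_1111) as usize)?;
        packed >>= BITS;
    }
    String::from_utf8(out).ok()
}
```

With this layout, `bits_needed(4)`, `bits_needed(16)` and `bits_needed(20)` give 2, 4 and 5, which matches the 2-bit and 4-bit DNA encodings mentioned in the issue. Ambiguity codes and a stop symbol would raise the symbol count, but anything up to 32 symbols still fits in 5 bits.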

Why different use statements for 4-letter DNA and IUPAC DNA

The "use" statements seem to work differently for four-letter DNA and IUPAC; see the following two examples. I don't understand why the "use" statement pattern differs. Is there something I've misunderstood?

For DNA:

use bio_seq::*;
use bio_seq::codec::Dna;
fn main() {
    let input_string = "AAAA";
    let dna_seq = Seq::<Dna>::from_str(input_string).unwrap();
    println!("dna_seq: {}", dna_seq);
}

For IUPAC:

use bio_seq::*;
use bio_seq::codec::iupac::Iupac;
fn main() {
    let input_string = "AAAA";
    let iupac_seq = Seq::<Iupac>::from_str(input_string).unwrap();
    println!("iupac_seq: {}", iupac_seq);
}

Note the difference in particular between:
use bio_seq::codec::Dna;
and
use bio_seq::codec::iupac::Iupac;
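
One plausible explanation, offered as a guess about the crate's layout rather than a confirmed fact: in Rust, a parent module can re-export an item from a child module with `pub use`, which makes a shorter path valid. If `Dna` is re-exported at the `codec` level and `Iupac` is not, the two import paths would differ exactly as shown. The sketch below reproduces that situation with a hypothetical stand-in module tree; the module and variant names are illustrative only.

```rust
// Hypothetical stand-in for a crate's `codec` module; not bio-seq's real source.
#[allow(dead_code)]
mod codec {
    pub mod dna {
        #[derive(Debug, PartialEq)]
        pub enum Dna { A, C, G, T }
    }

    pub mod iupac {
        #[derive(Debug, PartialEq)]
        pub enum Iupac { A, N }
    }

    // Re-export: `codec::Dna` now names the same type as `codec::dna::Dna`.
    // There is no matching re-export for `Iupac` in this sketch.
    pub use self::dna::Dna;
}

use codec::Dna;          // short path works because of the re-export
use codec::iupac::Iupac; // full path is required without a re-export

/// Uses both imports so the example type-checks end to end.
fn describe() -> (Dna, Iupac) {
    (Dna::A, Iupac::N)
}
```

Under this assumption both paths are ordinary imports, and `use codec::dna::Dna;` would work just as well as the short form.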

Add support for multiple genetic codes

I've been looking all over for a Rust library that supports multiple genetic codes. This library seems to be the best translation library, so I hope this is on your radar. Thanks!
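
To make the request concrete, here is a minimal sketch of how multiple genetic codes can be represented as interchangeable lookup tables. It does not use bio-seq's API; `STANDARD`, `VERT_MITO`, `base_index` and `translate` are hypothetical names. The two 64-letter strings follow the NCBI convention for translation tables 1 (standard) and 2 (vertebrate mitochondrial), with codons ordered by base T, C, A, G at each position.

```rust
/// NCBI translation table 1 (standard code), codons in TCAG order.
const STANDARD: &[u8; 64] =
    b"FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

/// NCBI translation table 2 (vertebrate mitochondrial code).
const VERT_MITO: &[u8; 64] =
    b"FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG";

/// Map a nucleotide to its position in the TCAG ordering.
fn base_index(b: u8) -> Option<usize> {
    match b.to_ascii_uppercase() {
        b'T' | b'U' => Some(0),
        b'C' => Some(1),
        b'A' => Some(2),
        b'G' => Some(3),
        _ => None, // ambiguity codes are not handled in this sketch
    }
}

/// Translate complete codons using the given table; a trailing partial
/// codon is ignored. Returns None if any base is not A, C, G, T or U.
fn translate(seq: &[u8], table: &[u8; 64]) -> Option<String> {
    seq.chunks_exact(3)
        .map(|codon| {
            let i = base_index(codon[0])? * 16
                + base_index(codon[1])? * 4
                + base_index(codon[2])?;
            Some(table[i] as char)
        })
        .collect()
}
```

Passing the table as a parameter means a new genetic code is just another 64-byte constant. For example, `ATGTGAATA` translates to `M*I` with the standard table and to `MWM` with the vertebrate mitochondrial table, since TGA and ATA are reassigned there.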
