BCF header contig IDX (noodles, closed)

zaeleus commented on July 21, 2024
BCF header contig IDX


Comments (2)

zaeleus commented on July 21, 2024

Thanks for the detailed report. I see the issue here.

It was an unfortunate decision to entangle the format/spec with an implementation detail. A few changes will have to be made to provide compatibility with htslib.

  • bcf::header::StringMap needs to be reworked to allow nonsequential entries. This means it can no longer be backed by an IndexSet. (4291d88 and 4e39a77)
  • A StringMap will need to be created for contig header records. Given that absolute positions can override relative positions, the order of vcf::header::Contigs cannot be used. (17397d7)
  • Document that a record chromosome ID (chrom) is not an index into vcf::header::Contigs but an index into the contig StringMap (6b73f93).
  • Writing a VCF record needs to use the contig StringMap instead of the VCF header contigs (9da12e7).
  • Converting a BCF record to a VCF record needs to use the contig StringMap instead of the VCF header contigs (9c97ba8).
  • Parse IDX field from raw record in vcf::header::Contig (22046f0).
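To illustrate the first two points, here is a minimal sketch of a string map that tolerates gaps, which an insertion-ordered set cannot represent. `SparseStringMap` and its methods are hypothetical names for illustration, not the noodles API, and the rule that an entry without an explicit index takes the position after the current last slot is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical sparse string map: positions may have gaps.
#[derive(Debug, Default)]
pub struct SparseStringMap {
    /// Position -> string; `None` marks an unused slot.
    entries: Vec<Option<String>>,
    /// String -> position, for reverse lookups.
    indices: HashMap<String, usize>,
}

impl SparseStringMap {
    /// Inserts `value` at the next position after the last slot (relative
    /// position). Returns the existing position if the value is already present.
    pub fn insert(&mut self, value: &str) -> usize {
        if let Some(&i) = self.indices.get(value) {
            return i;
        }
        let i = self.entries.len();
        self.insert_at(i, value);
        i
    }

    /// Inserts `value` at an explicit position (e.g., from an `IDX` field),
    /// growing the table and leaving gaps as needed.
    pub fn insert_at(&mut self, i: usize, value: &str) {
        if i >= self.entries.len() {
            self.entries.resize(i + 1, None);
        }
        // Drop the reverse mapping of any entry being overwritten.
        if let Some(old) = self.entries[i].replace(value.to_string()) {
            self.indices.remove(&old);
        }
        self.indices.insert(value.to_string(), i);
    }

    /// Looks up the string stored at position `i`, if any.
    pub fn get_index(&self, i: usize) -> Option<&str> {
        self.entries.get(i).and_then(|e| e.as_deref())
    }

    /// Looks up the position of `value`, if present.
    pub fn get_index_of(&self, value: &str) -> Option<usize> {
        self.indices.get(value).copied()
    }
}
```

With this layout, inserting "DP" at position 3 into a map that only holds "PASS" at 0 leaves positions 1 and 2 empty, and lookups at those positions return `None`.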

I will work on these.


zaeleus commented on July 21, 2024

noodles 0.17.0 includes the changes that resolve this issue, mainly parsing the IDX field on VCF header contig records and also building a string map for those records.
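As a rough sketch of the idea (not the noodles implementation), the following shows how reading `IDX` from raw contig header lines can produce a contig table whose order differs from header order. `parse_contig` and `build_contig_map` are hypothetical helpers, and the parser naively splits on commas, so it assumes no quoted values containing commas.

```rust
/// Extracts `(ID, IDX)` from a raw `##contig=<...>` line.
/// Hypothetical helper: assumes simple `key=value` fields without quoted commas.
pub fn parse_contig(line: &str) -> Option<(String, Option<usize>)> {
    let inner = line.strip_prefix("##contig=<")?.strip_suffix('>')?;

    let mut id = None;
    let mut idx = None;

    for field in inner.split(',') {
        let (key, value) = field.split_once('=')?;

        match key {
            "ID" => id = Some(value.to_string()),
            "IDX" => idx = value.parse().ok(),
            _ => {}
        }
    }

    id.map(|id| (id, idx))
}

/// Builds a position -> contig name table. An explicit `IDX` (absolute
/// position) wins; otherwise the contig is appended after the last slot
/// (an assumption made for this sketch).
pub fn build_contig_map(lines: &[&str]) -> Vec<Option<String>> {
    let mut map: Vec<Option<String>> = Vec::new();

    for line in lines {
        if let Some((id, idx)) = parse_contig(line) {
            let i = idx.unwrap_or(map.len());

            if i >= map.len() {
                map.resize(i + 1, None);
            }

            map[i] = Some(id);
        }
    }

    map
}
```

Given the lines `##contig=<ID=sq0,length=8,IDX=1>` and `##contig=<ID=sq1,length=13,IDX=0>`, position 0 resolves to sq1 even though sq0 appears first in the header, which is why a record's chromosome ID cannot be treated as an index into the header's contig order.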

The implementation is a breaking change. The user now has to parse a bcf::header::StringMaps, which includes both the dictionary of strings and the dictionary of contigs. Typically, if a method needs both maps (e.g., Record::try_into_vcf_record and Writer::write_vcf_record), pass the whole collection; otherwise, use the respective string map (StringMaps::strings or StringMaps::contigs). For example,

let raw_header = reader.read_header()?;
let header = raw_header.parse()?;
let string_maps: StringMaps = raw_header.parse()?;

let record = bcf::Record::default();
let dp = record.info().get(&header, string_maps.strings(), Key::TotalDepth);

let chromosome_id = usize::try_from(record.chromosome_id()).expect("invalid chrom");
let chromosome_name = string_maps.contigs().get_index(chromosome_id);

writer.write_vcf_record(&header, &string_maps, &record)?;
