proforma's People

Contributors

javizca, mobiusklein

Forkers

rfellers oscar-gr

proforma's Issues

Minor inconsistencies in spec

Some of the examples in draft 12 of the specification contain minor inaccuracies:

  • page 8: EM[R: Methionine sulfone]EVEES[O-phospho-L-serine]PEK -> This term doesn't appear in RESID. Note the leading space, but even without that the name is incorrect. Probably it should be L-methionine sulfone (RESID:AA0251)?
  • page 9: EM[UNIMOD:15]EVEES[UNIMOD:56]PEK -> accession UNIMOD:15 does not exist. In case consistency with the previous examples is desired, UNIMOD:35 corresponds to Oxidation. Same for the invalid example with U:15 just underneath.
  • page 11: EVTSEKC[half-cystine]LEMSC[half-cystine]EFD -> half-cystine should be half cystine (no hyphen).
  • page 14: The mass of HexS is specified with only three decimals, whereas other masses in that list have four decimals. It's also not rounded correctly. Instead use 242.0096 as the mass with four decimals.

More conceptual questions:

  • Q: page 14: Parsing glycan compositions is somewhat non-trivial because some labels overlap. It would be easier if spaces between monosaccharides are used (split on space) or cardinality is always specified (split on [a-zA-Z]+\d+). Maybe this can be a bit more strongly recommended in section 4.2.8?
    A: Parsing is possible without enforcing spaces or cardinality by checking for only defined monosaccharides rather than any string.

  • Q: page 18: I'm a bit confused about how parsers should recognize that a global modification is an isotope. The examples (13C, 15N, D) don't seem to be specified using a controlled vocabulary, whereas this is the case throughout the rest of the document. Is it that when no @ is used in the global modification part, as specified in section 4.6.2, it should always be considered an isotope instead?
    A: Yes, I currently interpret global modifications of the form INT* LETTER+ SIGNED_INT* as an isotope and global modifications of the form "[" mod "]@" (AA ",")* AA as global amino acid modifications (so square brackets and "@" sign).

  • Q: page 19: How should multiple global modifications on different amino acids be specified? I guess the following example, with a comma separating the global modifications within the angle brackets, would be in line with the spec, but this is not explicitly detailed: <[Carbamidomethyl]@C,[Oxidation]@M>MTPEILTCNSIGCLK.
    A: Multiple global modifications are each specified in their own block between angled brackets.
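
The parsing answers above can be sketched in Python. This is a minimal illustration under stated assumptions, not the specification's reference grammar: the monosaccharide list is an assumed subset, the function names are hypothetical, and the isotope pattern is one reading of INT* LETTER+ SIGNED_INT* (optional leading digits, one or more letters, optional signed integer).

```python
import re

# Assumed, illustrative subset of monosaccharide labels; the specification's
# own list is authoritative.
MONOSACCHARIDES = ["Hex", "HexNAc", "HexS", "HexP", "HexNAcS",
                   "dHex", "NeuAc", "NeuGc", "Pen", "Fuc"]

# Try longer labels first so that "HexNAc" is not read as "Hex" plus leftovers.
_names = "|".join(re.escape(n) for n in
                  sorted(MONOSACCHARIDES, key=len, reverse=True))
_GLYCAN_TOKEN = re.compile(rf"\s*({_names})(\d+)?\s*")


def parse_glycan_composition(text):
    """Tokenize a glycan composition using only the defined labels.

    Spaces and explicit cardinalities are optional; a missing count means 1.
    """
    counts = {}
    pos = 0
    while pos < len(text):
        m = _GLYCAN_TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unknown monosaccharide at position {pos}: {text!r}")
        name, count = m.group(1), int(m.group(2) or 1)
        counts[name] = counts.get(name, 0) + count
        pos = m.end()
    return counts


# One <...> block per global modification. Simplification: modification names
# that themselves contain '<' or '>' are not handled by this sketch.
_BLOCK = re.compile(r"<([^<>]*)>")
# Assumed reading of: "[" mod "]@" (AA ",")* AA
_FIXED = re.compile(r"\[([^\]]+)\]@([A-Z](?:,[A-Z])*)")
# Assumed reading of: INT* LETTER+ SIGNED_INT*
_ISOTOPE = re.compile(r"\d*[A-Za-z]+(?:[+-]\d+)?")


def parse_global_modifications(proforma):
    """Split leading global-modification blocks from a ProForma string.

    Returns (isotopes, fixed_modifications, remaining_sequence).
    """
    isotopes, fixed = [], []
    pos = 0
    while True:
        m = _BLOCK.match(proforma, pos)
        if m is None:
            break
        body = m.group(1)
        f = _FIXED.fullmatch(body)
        if f:
            fixed.append((f.group(1), f.group(2).split(",")))
        elif _ISOTOPE.fullmatch(body):
            isotopes.append(body)
        else:
            raise ValueError(f"unrecognized global modification: <{body}>")
        pos = m.end()
    return isotopes, fixed, proforma[pos:]
```

With these definitions, HexNAc2Hex3 tokenizes without spaces because HexNAc is tried before Hex. The comma-separated form <[Carbamidomethyl]@C,[Oxidation]@M> is rejected, while <[Carbamidomethyl]@C><[Oxidation]@M> yields two separate amino acid modifications.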

Request specification clarification on sequence truncations

I would like to see a paragraph in the specification indicating how proteoform sequence truncations are to be specified. N-terminal truncations may be biological, as in the removal of the initial Met (perhaps with PTM), the cleavage of a signal peptide, or the action of a viral protease. The truncations may instead be related to sample treatment, such as a rare cutter like CNBr for middle-down proteomics, or be due to a "hot" ion source. I believe ProForma should specify how a proteoform sequence compares to the sequence described by the accession, such as indicating the position of the first and last amino acids in the accession's sequence. Are amino acids preceding and succeeding the proteoform sequence expected to be included?
