Comments (8)
Thanks for reporting and the good example case! I'll have a look at it as soon as I find the time.
from genometools.
I've had a look at the problem and, as you already suspected, the issue is caused by the IDs of the five_prime_UTR features. They are identical, causing the five_prime_UTR to be considered a multi-feature. Note that with the CDS features there already is a multi-feature in the CC, which apparently causes the layout engine to start another block. This behavior was implemented to enable AnnotationSketch to draw diagrams like Figure 1 in http://www.sequenceontology.org/gff3.shtml, where multiple multi-features as the children of the same parent feature cause multiple copies of the parent feature to be drawn.
That's also why the problem disappears when the IDs are removed. I will need to investigate this in further detail.
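To make the ID point concrete, here is a minimal Python sketch (not GenomeTools code) that groups GFF3 lines by their ID attribute; any ID that occurs on more than one line is what ends up being treated as a multi-feature. The coordinates and IDs are made up for illustration and are not taken from the reported file, and multi_features is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical GFF3 lines, for illustration only (not the reporter's file).
GFF3 = """\
chr1\t.\tmRNA\t100\t900\t.\t+\t.\tID=mrna1
chr1\t.\tfive_prime_UTR\t100\t150\t.\t+\t.\tID=utr5;Parent=mrna1
chr1\t.\tfive_prime_UTR\t300\t320\t.\t+\t.\tID=utr5;Parent=mrna1
chr1\t.\tCDS\t321\t401\t.\t+\t0\tID=cds1;Parent=mrna1
chr1\t.\tCDS\t600\t800\t.\t+\t0\tID=cds1;Parent=mrna1
"""


def multi_features(gff3_text):
    """Map each ID used on more than one line to its (type, start, end) parts."""
    by_id = defaultdict(list)
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, pragmas and comments
        cols = line.split("\t")
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if "ID" in attrs:
            by_id[attrs["ID"]].append((cols[2], int(cols[3]), int(cols[4])))
    # Only IDs shared by several lines count as multi-features.
    return {fid: parts for fid, parts in by_id.items() if len(parts) > 1}


if __name__ == "__main__":
    for fid, parts in multi_features(GFF3).items():
        print(fid, parts)
```

In this toy input both the UTR and the CDS come out as multi-features under the same mRNA, which is the situation described above. Removing the ID from the UTR lines would leave only the CDS.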
from genometools.
Well, just as an isoform has a single CDS even if it's split across multiple exons (and thus requires multiple entries in the GFF3 file), it also has a single 5' UTR, even if it is split across multiple exons. The same could be said for the 3' UTR. So as far as expected behavior is concerned, if a feature with multiple multi-feature children can't be rendered correctly, I would consider that a bug rather than a feature.
from genometools.
Yes, I agree that is not the intended behavior. However, we need to find a way to make this consistent -- in your case, one would want both multifeatures drawn in the same parent block, while in the example case mentioned above one would expect the result to be two parent blocks. Maybe it is possible to make this dependent on whether the multi-features overlap: in your case, there is no conflict in drawing both in the same parent because the CDS and 5' UTR naturally do not overlap. If they do overlap, e.g. in the case of two CDS variations given as two multi-features, the parent block is duplicated. What do you say?
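A minimal Python sketch of the overlap test this proposal hinges on, assuming each multi-feature is given as a list of 1-based, inclusive (start, end) parts. The function names are hypothetical and this is not the AnnotationSketch layout code.

```python
def parts_overlap(a, b):
    """True if any part of multi-feature a shares at least one base with any part of b.

    a and b are lists of (start, end) tuples, 1-based and inclusive as in GFF3.
    """
    return any(s1 <= e2 and s2 <= e1 for s1, e1 in a for s2, e2 in b)


def needs_duplicate_parent(a, b):
    """Proposed rule: duplicate the parent block only if the two multi-features overlap."""
    return parts_overlap(a, b)


if __name__ == "__main__":
    utr5 = [(100, 150), (300, 320)]
    cds_a = [(321, 401), (600, 800)]
    cds_b = [(321, 401), (650, 700)]
    print(needs_duplicate_parent(utr5, cds_a))   # UTR vs CDS
    print(needs_duplicate_parent(cds_a, cds_b))  # two CDS variants
```

With these made-up coordinates the UTR and the CDS could share one parent block, while the two CDS variants would trigger a duplicated parent.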
from genometools.
Functionally that would work. In the general case, though, I wonder whether multifeatures of different types should force creation of a new block or whether they could be collapsed on the same block. So we might want to consider creating a new block only if the overlapping multifeatures are of the same type.
We probably need a more concrete example to reinforce whether this is a good idea or not. As far as protein-coding genes are concerned, the only multifeatures you'll commonly encounter are the CDS and UTRs, which by definition are spatially exclusive.
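As a rough sketch of the type-aware variant, assuming the same (start, end) representation: multi-features are placed greedily into the first block where they do not overlap another multi-feature of the same type, and a new block is opened otherwise. MultiFeature and assign_blocks are hypothetical names, and the greedy first-fit strategy is my assumption, so the result depends on input order.

```python
from typing import List, NamedTuple, Tuple


class MultiFeature(NamedTuple):
    fid: str
    ftype: str
    parts: Tuple[Tuple[int, int], ...]  # 1-based, inclusive (start, end)


def overlaps(a: MultiFeature, b: MultiFeature) -> bool:
    """True if any part of a shares at least one base with any part of b."""
    return any(s1 <= e2 and s2 <= e1 for s1, e1 in a.parts for s2, e2 in b.parts)


def conflicts(a: MultiFeature, b: MultiFeature) -> bool:
    """Suggested rule: only overlapping multi-features of the same type conflict."""
    return a.ftype == b.ftype and overlaps(a, b)


def assign_blocks(mfs: List[MultiFeature]) -> List[List[MultiFeature]]:
    """Greedy first-fit: put each multi-feature into the first conflict-free block."""
    blocks: List[List[MultiFeature]] = []
    for mf in mfs:
        for block in blocks:
            if not any(conflicts(mf, other) for other in block):
                block.append(mf)
                break
        else:
            blocks.append([mf])  # no existing block fits, open a new one
    return blocks


if __name__ == "__main__":
    mfs = [
        MultiFeature("utr5", "five_prime_UTR", ((100, 150), (300, 320))),
        MultiFeature("cds_a", "CDS", ((321, 401), (600, 800))),
        MultiFeature("cds_b", "CDS", ((321, 401), (650, 700))),
    ]
    for i, block in enumerate(assign_blocks(mfs)):
        print(i, [mf.fid for mf in block])
```

The CDS and UTR of a typical protein-coding gene end up in a single block under this rule; only a second, overlapping CDS opens another one.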
from genometools.
You are right. I will see when I find the time to extend the code to allow multiple multi-features to collapse into the same parent block as long as they do not overlap.
from genometools.
I'm currently having a look at the problem, and I think we need to fundamentally rethink how to handle such cases in which a parent has multiple multi-features as children, some of which overlap and some don't. Which elements in the block get cloned and which do not? This may in the end even be dependent on the order in which the features appear and may lead to many possible variations to draw the same situation!
Do we want to first find the 'longest' (i.e. covering the maximal amount of bases) non-overlapping set of multi-features and use these as a basis to draw the others? What if these do not overlap? Draw them in one additional block or separately?
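To illustrate what finding such a 'longest' set could involve, here is a brute-force Python sketch under my own assumptions: multi-features are lists of 1-based inclusive (start, end) parts that do not overlap within one multi-feature, and the search simply tries every subset, so it is exponential and only reasonable for the handful of multi-features under a single parent. The names are hypothetical, and ties are broken by enumeration order, which is one way the order-dependence mentioned above could creep in.

```python
from itertools import combinations


def covered_bases(parts):
    """Number of bases covered by (start, end) parts, assuming the parts are disjoint."""
    return sum(end - start + 1 for start, end in parts)


def overlaps(a, b):
    """True if any part of a shares at least one base with any part of b."""
    return any(s1 <= e2 and s2 <= e1 for s1, e1 in a for s2, e2 in b)


def best_non_overlapping(mfs):
    """Return (ids, coverage) of the pairwise non-overlapping subset covering most bases.

    mfs maps a multi-feature ID to its list of (start, end) parts.
    Exhaustive search over all subsets; the first best subset found wins ties.
    """
    ids = sorted(mfs)
    best, best_cov = (), 0
    for r in range(1, len(ids) + 1):
        for subset in combinations(ids, r):
            if any(overlaps(mfs[a], mfs[b]) for a, b in combinations(subset, 2)):
                continue  # at least one pair overlaps, not a valid basis
            cov = sum(covered_bases(mfs[i]) for i in subset)
            if cov > best_cov:
                best, best_cov = subset, cov
    return set(best), best_cov


if __name__ == "__main__":
    mfs = {
        "utr5": [(100, 150), (300, 320)],
        "cds_a": [(321, 401), (600, 800)],
        "cds_b": [(321, 401), (650, 700)],
    }
    print(best_non_overlapping(mfs))
```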
I think if we really want such an automatic cloning we should first agree on one generic, but intuitive and sensible approach on how to handle any possible case and a set of rules to guide the layout process. Any ideas? IMHO, this is a question of trading simplicity and genericity for visual decluttering in special cases.
That being said, I am currently somewhat inclined to drop the automatic cloning of blocks with overlapping multi-features, because it is difficult to find a 'one-size-fits-all' behavior and the current approach leads to unexpected results. Instead, we would simply draw them on top of each other, just as they 'normally' would be drawn given the graph structure and collapsing rules.
from genometools.
Again, I think considering feature type is important. I think I agree with the inclination you stated in your final paragraph--to simply draw multi-features on top of each other. In the absence of a well-defined set of use cases, I see this as a satisfactory solution. However, I think this is only a good idea when the multi-features are of a different type. If two multi-features of the same type overlap (as in the mRNA with multiple coding sequences from the GFF3 spec), it would be IMHO a visual mess to try to draw these on top of each other.
Maybe that's what you meant...
from genometools.