dissertation's Issues
comments on Chapter 1, figure "Changes in Fragmentation"
ref fig-chap1-unitig-frag
This looks pretty interesting! I think some more discussion in the figure caption would be great, though. In particular, is there a takeaway?
Please put axis labels, esp on y axis!
Is there a different way to represent this? Perhaps in an additional figure, since I think this one is also pretty informative? What if you made one where the curves were weighted by unitig size?
Also, I have questions -
- why is the proportion of small nodes increasing for yeast towards the end??
Comments on Chapter 1, figure "Processing rates"
I think this figure would be much easier to parse first time 'round if you could invert the y axis. Alternatively, state in the caption that "larger numbers are faster/better".
comments on Chapter 1, figure "Dynamic metrics of the cDBG during construction"
A few minor questions first -
- so, the sum of the n_ lines should equal one? and kmer_p always ends at x=1, y=1.
- if correct, would suggest adding 'proportion of nodes' as y axis label.
- might also make them a bit bigger so they take up the page width?
I think the discussion of this figure in the Results section is a good start, but I have lots of questions -
- yeast looks pretty different. this is because of high coverage?
- I still have some trouble understanding the dynamics after looking at this for a while. let's see -
- n_circular should generally be small, ok
- n_trivial isn't defined anywhere?
- the kinks in kmer_p are from non-random sequencing, presumably? maybe verify this with shuffling?
- wouldn't you expect a fair number of islands in RNAseq?
- in yeast, it looks like n_islands is converging to a very different point than in the other two data sets; what gives?
- maybe: if this is due to higher coverage, what happens if you subsample the yeast data set a whole bunch?
- n_tips behavior is driven by error?
anyway, I think it would be very helpful to develop some intuition, maybe in another (simpler) figure.
also, a different or perhaps complementary tack - what interesting behavior is occurring in this figure that readers should be alerted to, and what behavior is just boring and "trivial"?
Comments on Chp 2, figure "Sketching distance curves showing saturation of a transcriptomic sample"
First question: is this done on diginormed data, or not?
I still struggle intuitively with the saturation behavior in the leftmost figure. Why would later sketches have high similarity? Can you remind me and then explain it in the caption?
Middle figure, the curve is because you're using flat k-mers w/o abundance, right, so you get a convexity when you have seen a bunch of the k-mers.
Right figure, why on earth is it decreasing a bit towards x=1?
Should you maybe be using containment instead of Jaccard? Seems easier to explain to me.
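To make the containment suggestion concrete, here is a minimal sketch of the difference on exact k-mer sets. It is not the sketching code used in the chapter; the function names, the toy sequences, and k=3 are all made up for illustration.

```python
def kmers(seq, k):
    """Return the set of distinct k-mers in seq (presence only, no abundance)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Size of the intersection divided by size of the union; symmetric."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """Fraction of the k-mers in a that also appear in b; not symmetric."""
    return len(a & b) / len(a) if a else 0.0


# Hypothetical toy data: an early window of the stream vs. everything seen so far.
early = kmers("ACGTAC", 3)
so_far = kmers("ACGTACGGAT", 3)

j = jaccard(early, so_far)           # penalized by k-mers unique to so_far
c_fwd = containment(early, so_far)   # is the early window covered by the full set?
c_rev = containment(so_far, early)   # how much of the full set had the window seen?
```

Jaccard mixes "how much of A is in B" with "how much of B is in A", whereas containment answers one of those at a time, which may be why it is easier to explain for a growing stream.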
Chapter 1: Discussion: Key Points
- The primary and most intensive job of compaction is finding the decision k-mers
- Extracting sequence is incidental
- Tracking unitig metrics is potentially useful for downstream
- Streaming compaction builds a foundation for streaming assembly
- Streaming compaction can be used downstream from lighter-weight sub-linear methods
- Original idea: using compaction metrics for guessing
- Future idea (actually chapter 2): sketching methods for sub-linear compaction / assembly
- Streaming pipelines enable real-time feedback
- Stopping early to preserve computational resources
- Feedback loop w/third generation sequencing instruments
Comments on chp 2 figure Streaming $log$-Bray Curtis distance
Why switch to Bray-Curtis from Jaccard? Why use log? What does this figure mean? I haz questions!
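For context on why the switch matters, a rough sketch of how Bray-Curtis differs from Jaccard: it uses k-mer abundances, not just presence. The log variant below is only one guess at what "log" means here (a log1p transform of counts before comparison); the caption should say what was actually done. All names and counts are hypothetical.

```python
import math
from collections import Counter


def bray_curtis(a, b, transform=lambda x: x):
    """Bray-Curtis dissimilarity between two abundance tables.

    transform is applied to each count first; it must map 0 to 0.
    """
    keys = set(a) | set(b)
    shared = sum(min(transform(a.get(k, 0)), transform(b.get(k, 0))) for k in keys)
    total = sum(transform(v) for v in a.values()) + sum(transform(v) for v in b.values())
    return 1.0 - 2.0 * shared / total if total else 0.0


def jaccard_distance(a, b):
    """Presence/absence only: abundances are ignored."""
    union = set(a) | set(b)
    return 1.0 - len(set(a) & set(b)) / len(union) if union else 0.0


# Hypothetical counts: same k-mers in both samples, very different abundances.
x = Counter({"AAA": 100, "CCC": 1})
y = Counter({"AAA": 1, "CCC": 1})

jd = jaccard_distance(x, y)                         # identical k-mer sets
bc_raw = bray_curtis(x, y)                          # dominated by the abundant k-mer
bc_log = bray_curtis(x, y, transform=math.log1p)    # assumed log variant, damped
```

If this reading is right, the figure is showing abundance differences that Jaccard cannot see, with the log keeping a few very abundant k-mers from dominating.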
Comments on Chp 2, figure "Streaming Jaccard Similarity"
I think I understand this behavior
Q -
What does "moderately sized" mean in the caption? Why is it important?
aaaaaand what is the takeaway from this figure?