- make sure terminology is consistent
- rank-frequency vs. rank-probability relationship; likewise the notation r(w) and P(w) (or f(w))
- probabilities: indices and identities
- MLE
- Zipfianness
- quantities (theoretical vs observed)
- Subsampling (capitalised)
- randomly sampled subcorpus vs. subsample vs. filtered subcorpus vs. ...
- re-work introduction to Chapter 1 (i.e. big picture and abstract motivation)
- move Chomsky vs C&V (i.e. learnability) to end
- re-work sections on Zipf (i.e. elaborate on origin & extent debates, talk about relevance, foreshadow Subsampling)
- editing -> work through comments, ensure readability and understandability
- add references
- editing -> comments, readability
- Tables -> reduce sizes (use \resizebox and \footnotesize)
- Figures -> enlarge (use the wrapfigure environment so text floats around them); add the language names to the subplots
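A minimal LaTeX sketch of the two sizing fixes above (assuming the graphicx package for \resizebox and the wrapfig package for wrapfigure; the file name and table body are only placeholders):

```latex
% smaller table: shrink font and scale the tabular to the text width
\begin{table}[ht]
  \footnotesize
  \resizebox{\textwidth}{!}{%
    \begin{tabular}{lrr}
      % ... table body ...
    \end{tabular}%
  }
  \caption{...}
\end{table}

% larger figure with surrounding text floating around it
\begin{wrapfigure}{r}{0.6\textwidth}
  \centering
  \includegraphics[width=0.58\textwidth]{example-figure}
  \caption{Finnish} % language name on the subplot
\end{wrapfigure}
```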
- for goodness-of-fit measures: add extreme values
- references
- add overall summary/conclusions for Chapter
- mention data availability on GitHub: valevo/Thesis -> folder names correspond to chapter and figure numbers, file names to language names (e.g. valevo/Thesis/Chapter 2/Figure 2.1/FI.png)
- (make plots of MLEs and null models)
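For the MLE plots above, a minimal sketch of how a Zipf exponent could be fit by maximum likelihood (assuming the standard discrete Zipf form P(k) = k^(-s)/zeta(s); the function name and search bounds are illustrative, not from the thesis):

```python
import numpy as np
from scipy.special import zeta
from scipy.optimize import minimize_scalar

def fit_zipf_exponent(observations, lo=1.01, hi=10.0):
    """MLE of the exponent s of a discrete Zipf distribution
    P(k) = k^(-s) / zeta(s), given observed values k_i >= 1.
    (Illustrative sketch; bounds lo/hi are arbitrary choices.)"""
    k = np.asarray(observations, dtype=float)
    n, log_sum = len(k), np.log(k).sum()
    # negative log-likelihood: n * log(zeta(s)) + s * sum(log k_i)
    nll = lambda s: n * np.log(zeta(s)) + s * log_sum
    return minimize_scalar(nll, bounds=(lo, hi), method="bounded").x
```

The same fit could then be repeated on the null-model samples so that observed and null estimates can be plotted side by side.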
- (share colorbars and y-axes in plots)
- note that the plots in this chapter are proper estimates
- (hapax growth)
- editing -> comments
- references
- terminology: distinguish set of samples, corresponding distributions, probability estimates from individual samples, and averages of probability estimates
- terminology (also in plots): articles vs texts
- mention other languages and the fact that observations are similar
- Table 3.1: add column for 10*10^6
- add to conclusion of Section 3.4 that convergence is useful for Chapter 4
- conclusion for Chapter 3
- elaborate on the introductory paragraph of Chapter 3
- Fano factor plot: fix y-axis
- editing -> comments
- (re-do transitioning paragraphs from typicality to filter implementations)
- conclusion of Filtering (algorithms & theory): what does it achieve (aka the gist) and what are the limitations? how does it generalise?
- conclusion for Filtering (chapter as a whole): how can learnability of Zipf be assessed?
- first a list of contributions, then future work
- re-do learnability -> integrate Chomsky vs C&V and make more rigorous -> how does learnability of Zipf generalise to learnability in the sense of C&V? how does it need to be adapted?
- other uses of Filtering: apply it to other laws; use it to study the connection between Zipf's law and semantics (-> could shed light on such explanations -> could help further assess normality) and between Zipf's law and n-gram distributions