tp-dataviz-prototype's Issues

Implement social commenting/tagging

Maybe we could plug in an existing implementation such as Disqus?

However it's implemented, the comments (and tags, if tags are different from comments) need to be searchable.

Bug: some valid variable swaps are rejected

If you have num-users on one axis and something else on the other, and you drag num-users to the other axis, it should swap the roles of the two variables. Instead it reports an invalid assignment, because detectInvalidAssignments is comparing num-users with num-users. It should attempt the swap and run detectInvalidAssignments on the swapped variables before telling you it can't plot!
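
A minimal sketch of the swap-then-validate order described above. The prototype's real data structures are not shown here, so the `Assignments` shape, the `dropVariable` helper, and the body of `detectInvalidAssignments` are all assumptions made for illustration:

```typescript
// Hypothetical shapes: one variable name (or null) per axis.
type Axis = "x" | "y";
type Assignments = { x: string | null; y: string | null };

// Stand-in validator (assumption): rejects the same variable on both axes.
function detectInvalidAssignments(a: Assignments): string | null {
  if (a.x !== null && a.x === a.y) {
    return "Cannot plot " + a.x + " against itself";
  }
  return null;
}

// Build the proposed assignment first, swapping roles when the dragged
// variable already sits on the other axis, and only then validate it.
function dropVariable(current: Assignments, variable: string, target: Axis): Assignments {
  const other: Axis = target === "x" ? "y" : "x";
  const proposed: Assignments = { x: current.x, y: current.y };
  if (current[other] === variable) {
    // The displaced variable takes the axis the dragged one just left.
    proposed[other] = current[target];
  }
  proposed[target] = variable;
  const problem = detectInvalidAssignments(proposed);
  if (problem !== null) {
    throw new Error(problem);
  }
  return proposed;
}
```

With this ordering, dragging num-users from x onto a y axis holding another variable yields the swapped pair instead of an error.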

Lattice needs layout improvements

The margins on the individual lattice subgraphs are too big.

The labels of the individual subgraphs in the lattice can be hidden by the data bars. Labels should be up above.

Shared axis labels could be shown outside of the whole lattice, not repeated for each one.

Axes need labels

Yeah, you can look at the variable assignments to tell what the axes are, but they should also be labeled directly on the graph axes.

Need a sort order option for some axes

Can implement this using drop-down menus (and colon-separated URL params) just like the logarithmic scale (issue 1).

Typical sort options: alphabetical, for factor names; or, in a bar plot, most->least or vice versa.

An event_name axis should default to sorting by the order of the eventNames array in the variables file (but also allow most->least or alphabetical).

A "Firefox Version" axis should default to sorting by version order (neither numeric nor alphabetical: 11.0a1 comes before 11.0 for instance.)
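
One possible comparator for the version ordering, as a sketch only. It assumes version strings look like dotted numbers with an optional pre-release tag (`11.0`, `11.0a1`, `11.0b2`), which is a simplification and not the full Mozilla version format. The names `parseVersion` and `compareVersions` are hypothetical:

```typescript
type ParsedVersion = { nums: number[]; tag: string; tagNum: number };

// Split "11.0a1" into numeric parts [11, 0], tag "a", and tag number 1.
function parseVersion(v: string): ParsedVersion {
  const m = /^(\d+(?:\.\d+)*)(?:([a-z]+)(\d*))?$/.exec(v);
  if (m === null) {
    throw new Error("Unrecognized version string: " + v);
  }
  const nums = m[1].split(".").map(function (s: string): number {
    return parseInt(s, 10);
  });
  const tag = m[2] || "";
  const tagNum = m[3] ? parseInt(m[3], 10) : 0;
  return { nums: nums, tag: tag, tagNum: tagNum };
}

// Negative when a sorts before b. Pre-release tags sort before the
// untagged release with the same numeric parts.
function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  const len = Math.max(pa.nums.length, pb.nums.length);
  for (let i = 0; i < len; i++) {
    const na = i < pa.nums.length ? pa.nums[i] : 0;
    const nb = i < pb.nums.length ? pb.nums[i] : 0;
    if (na !== nb) {
      return na - nb;
    }
  }
  if (pa.tag !== pb.tag) {
    if (pa.tag === "") { return 1; }
    if (pb.tag === "") { return -1; }
    return pa.tag < pb.tag ? -1 : 1;
  }
  return pa.tagNum - pb.tagNum;
}
```

Passing `compareVersions` to `Array.prototype.sort` would give the default order for a "Firefox Version" axis under these assumptions.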

Factor-vs-factor scatterplots not very readable

We're currently jitter-plotting individual users within a square region to try to show density, but it quickly hits a threshold where the points become indistinguishable.

A better way to do factor-vs-factor plots might be to draw boxes and write a number (or %) of users in each one (also color-coding background of boxes?)
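
A rough sketch of the counting step behind such a box grid. It assumes each user record is a flat map from variable name to factor level; `UserRecord`, `countCells`, and `cellLabel` are made-up names, not part of the prototype:

```typescript
// Hypothetical record shape: variable name -> factor level.
type UserRecord = { [field: string]: string };
type CellCounts = { [cell: string]: number };

// Key for one (x level, y level) box. Assumes levels never contain "|".
function cellKey(x: string, y: string): string {
  return x + "|" + y;
}

// Count how many users fall into each box of the factor-vs-factor grid.
function countCells(users: UserRecord[], xVar: string, yVar: string): CellCounts {
  const counts: CellCounts = {};
  users.forEach(function (u: UserRecord): void {
    const key = cellKey(u[xVar], u[yVar]);
    counts[key] = (counts[key] || 0) + 1;
  });
  return counts;
}

// Text to write inside a box: the raw count plus its share of all users.
function cellLabel(count: number, total: number): string {
  const pct = total === 0 ? 0 : Math.round((count / total) * 100);
  return count + " (" + pct + "%)";
}
```

The same counts could drive the background color of each box.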

Automatically generate and show some statistics

On numerical-vs-factor scatterplots, add (optional) box-and-whisker plots to show quartiles.

On histograms, add (optional) median line.

On numerical-vs-numerical scatterplots, add (optional) regression line.
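
The statistics listed above could be computed along these lines. This is a sketch: `quantile` uses linear interpolation between sorted values, which is only one of several common quartile conventions, and `linearRegression` is an ordinary least-squares fit. Both function names are assumptions:

```typescript
// Quantile by linear interpolation; q = 0.25, 0.5, 0.75 give the quartiles.
function quantile(values: number[], q: number): number {
  if (values.length === 0) {
    throw new Error("quantile of empty array");
  }
  const sorted = values.slice().sort(function (a: number, b: number): number {
    return a - b;
  });
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

type Point = { x: number; y: number };

// Ordinary least-squares line y = slope * x + intercept.
function linearRegression(points: Point[]): { slope: number; intercept: number } {
  const n = points.length;
  let sumX = 0;
  let sumY = 0;
  points.forEach(function (p: Point): void {
    sumX += p.x;
    sumY += p.y;
  });
  const meanX = sumX / n;
  const meanY = sumY / n;
  let sxy = 0;
  let sxx = 0;
  points.forEach(function (p: Point): void {
    sxy += (p.x - meanX) * (p.y - meanY);
    sxx += (p.x - meanX) * (p.x - meanX);
  });
  if (sxx === 0) {
    throw new Error("all x values are identical; slope is undefined");
  }
  const slope = sxy / sxx;
  return { slope: slope, intercept: meanY - slope * meanX };
}
```

`quantile(values, 0.5)` gives the median line for a histogram, and the three quartiles give the box of a box-and-whisker plot.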

Need a logarithmic axis option

"num users", "num events", and individual event counts often differ by orders of magnitude, so a linear scale axis makes the data hard to read. Should support logarithmic scale as an option for either axis by offering a drop-down "axis scale" menu whenever appropriate. This should apply to both bar plots and scatter plots.

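As an illustration of what the logarithmic option would compute, here is a small sketch that maps data values to pixel positions. `makeLogScale` is a hypothetical helper, and the sketch assumes strictly positive values; counts of zero would need separate handling (for example, clamping to a minimum), which the issue does not specify:

```typescript
// Map values in [domainMin, domainMax] to positions in [rangeMin, rangeMax]
// so that equal ratios in the data become equal distances on the axis.
function makeLogScale(
  domainMin: number,
  domainMax: number,
  rangeMin: number,
  rangeMax: number
): (value: number) => number {
  if (domainMin <= 0 || domainMax <= domainMin) {
    throw new Error("log scale needs 0 < domainMin < domainMax");
  }
  const logMin = Math.log(domainMin);
  const logSpan = Math.log(domainMax) - logMin;
  return function (value: number): number {
    if (value <= 0) {
      throw new Error("log scale cannot place non-positive values");
    }
    const t = (Math.log(value) - logMin) / logSpan;
    return rangeMin + t * (rangeMax - rangeMin);
  };
}
```

On a 300-pixel axis covering 1 to 1000, the values 1, 10, 100, and 1000 land at evenly spaced positions.
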
Force latticed subgraphs to share axis scales

When latticing, all subgraphs should share the same axes so that they are comparable. Create the axes based on the pre-latticed data, then give those axes to every subgraph. Prereq: factor the axis creation out from the other stuff and make the axis objects arguments that can be passed around.

For histograms, this means creating the bins based on the pre-latticed data, then generating each individual histogram based on those same bins.

If one axis is % of users, it should use the same scale for all latticed subgraphs, but the height of the bars should represent the percentage of users within that lattice group, not the percentage across all users. This is, again, so the height of the bars can be meaningfully compared.
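
The histogram case can be sketched as follows: bin edges come from the pre-latticed data, and each subgraph's bar heights are percentages within its own group. `makeBinEdges` and `percentHistogram` are hypothetical names, and equal-width bins are an assumption:

```typescript
// Equal-width bin edges computed once from ALL values, before latticing.
function makeBinEdges(allValues: number[], binCount: number): number[] {
  const min = Math.min.apply(null, allValues);
  const max = Math.max.apply(null, allValues);
  const edges: number[] = [];
  for (let i = 0; i <= binCount; i++) {
    edges.push(min + ((max - min) * i) / binCount);
  }
  return edges;
}

// Histogram of one lattice group over the shared edges. Heights are the
// percentage of users within this group, not across all users.
function percentHistogram(groupValues: number[], edges: number[]): number[] {
  const binCount = edges.length - 1;
  const min = edges[0];
  const width = (edges[binCount] - min) / binCount;
  const counts: number[] = [];
  for (let i = 0; i < binCount; i++) {
    counts.push(0);
  }
  groupValues.forEach(function (v: number): void {
    let idx = width > 0 ? Math.floor((v - min) / width) : 0;
    if (idx > binCount - 1) { idx = binCount - 1; } // the maximum goes in the last bin
    if (idx < 0) { idx = 0; }
    counts[idx] += 1;
  });
  return counts.map(function (c: number): number {
    return groupValues.length === 0 ? 0 : (c / groupValues.length) * 100;
  });
}
```

Because every subgraph uses the same edges and a 0 to 100 percent scale, bar heights can be compared directly across groups of different sizes.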

Add tooltips to explain data on hover

Should be able to see an exact count for any bar by hovering over it: "1,401 users had between 4 and 5 extensions".

Should be able to see exact coordinates by hovering over any dot in the scatter plot: "This user had Windows 8 and 300 bookmarks in the bookmark toolbar".
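
A small sketch of how the tooltip strings might be assembled. The function names and parameters are assumptions; only the example sentences come from this issue:

```typescript
// Insert thousands separators: 1401 -> "1,401".
function withThousands(n: number): string {
  return String(Math.round(n)).replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// Tooltip for a histogram bar covering the bin [low, high].
function barTooltip(count: number, low: number, high: number, noun: string): string {
  return withThousands(count) + " users had between " + low + " and " + high + " " + noun;
}

// Tooltip for one scatter plot dot, given a description of each coordinate.
function pointTooltip(xDescription: string, yDescription: string): string {
  return "This user had " + xDescription + " and " + yDescription;
}
```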

Normalize latticed density plots

Try doing a number-vs-number scatter plot (density plot) and then latticing it by operating system.

The main thing that jumps out visually is that some OSes are less dense overall, but that's not what we care about -- we're not trying to see that there are fewer Linux users than Windows XP users, we're trying to compare the shape of the density distributions. That's hard to do right now.

When drawing a latticed density plot, somehow normalize density by the number of users in each category to make the categories more comparable to each other.
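
One way the normalization could work, as a sketch: if each subgraph's density is held as a grid of per-cell user counts (an assumption about the representation), dividing every cell by that group's total turns counts into fractions, so each group sums to 1 regardless of its size. `normalizeDensity` is a hypothetical name:

```typescript
// Convert one group's grid of user counts into fractions of that group.
function normalizeDensity(counts: number[][]): number[][] {
  let total = 0;
  counts.forEach(function (row: number[]): void {
    row.forEach(function (c: number): void {
      total += c;
    });
  });
  return counts.map(function (row: number[]): number[] {
    return row.map(function (c: number): number {
      return total === 0 ? 0 : c / total;
    });
  });
}
```

Coloring each subgraph from the normalized grid would make a small group and a large group with the same shape look the same.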

Bug: Number-of-users vs. Number-of-actions plot is broken

"Number of Users" vs "Number of Actions" is broken, because user records output by the current munger script do not have a numEvents property.

However, this was not a very useful plot anyway, so I think we should just get rid of number-of-actions, replacing it with median-actions-per-user and mean-actions-per-user (neither of which can be meaningfully plotted against number-of-users).

Generate explanation of "how to read this graph"

We do a little bit of this, but it's not in a good place, it's not very complete, and it gets clobbered the next time you mouse over a variable. We should devote some time to writing a function that generates a good human-readable explanation of the chart in layman's terms.
