Data discussion about obesity-explorer (CLOSED)

ubc-mds commented on June 23, 2024
Data discussion

from obesity-explorer.

Comments (8)

dusty736 commented on June 23, 2024

World Happiness Report

  • What I like:
    • Complete Dataset
    • Contains spatiotemporal features
    • All numeric features
    • Clear features to filter on (freedom level, life expectancy, generosity, etc)
  • Limitations:
    • Only 9 common features across data
    • Some years have different features
  • Potential Uses:
    • Travel companies (moving recommendations)
    • Public servants (seeing what makes people happy)

jraza19 commented on June 23, 2024

Obesity dataset

  • What I like:

    • simple dataset with time as a variable
    • a nice interactive map could be created from it
    • data was pre-cleaned
  • Limitations

    • Only 3 variables
  • Personas/usage

    • international government agencies
    • non profit organizations
    • dieticians/public health professionals

tanmaysharma19 commented on June 23, 2024

COVID data

  • What I like:

    • contemporary dataset
    • 55 variables
    • time series data
    • data for all countries
    • can subset data easily across time, countries and attributes
  • Limitations

    • some missing data from first half of 2020
  • Personas/usage

    • covid researchers
    • public health agencies
    • government agencies

rtaph commented on June 23, 2024

Desiderata:
• Micro-data (each row represents one unit, without aggregation)
• 5+ categorical dimensions for filtering/disaggregating
• 5+ numeric measures
• Geographic variables (ideally hierarchical, e.g. municipality -> province -> country).
• time-series data
• little to no missing data
• no need for weighting

rtaph commented on June 23, 2024

Cancer dataset:

  • What I like:

    • Many quantitative variables. This opens up potential for interesting graphs.
    • Few variables have missing data
  • Limitations (what I like less):

    • There only seems to be one categorical dimension (the Geography variable). If we want to build in dimensions as disaggregations/filters this might be a bit limiting. Of course, we could bin some categories if we wanted.
    • The data is at the county level, not the individual level (it is not microdata). This might make it a bit tougher to disaggregate and uncover patterns, and it restricts us to inferences at the level of the county and above. Many variables are actually semi-aggregated (e.g. pctbachdeg18_24). This will make it hard to reshape the data since we cannot cross those columns while reshaping. We can probably only analyse the data marginally, rather than drill down by multiple variables.
    • Even if we can aggregate and slice the data above the county level, there is an additional complication: we would likely need to weight every single measure by the population size of the county.
    • There are no time-series to plot. This is not mission critical but would be nice to have.
    • This is a combined sample. This means that some variables were likely clustered in the sampling design itself, which might mean we have a lot of holes in our data if we try to do geographic mapping at a certain level.
  • Potential uses / personas:

    • Clinical researcher / scientific audience
    • Policy planner (municipal gov’t)
    • Physician

See auto-generated data profile report.
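
To make the weighting point above concrete, here is a minimal sketch of how an unweighted and a population-weighted average can diverge when rolling counties up to a higher level. The two county rows and the helper name `weighted_mean` are made up for illustration; they are not taken from the cancer dataset.

```python
def weighted_mean(values, weights):
    """Return the weighted mean of values, skipping pairs whose value is missing."""
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in pairs) / total_weight


# Hypothetical county rows: (percentage measure, county population)
counties = [
    (10.0, 1_000),  # small county
    (30.0, 9_000),  # large county
]
rates = [rate for rate, _ in counties]
pops = [pop for _, pop in counties]

unweighted = sum(rates) / len(rates)   # treats both counties equally
weighted = weighted_mean(rates, pops)  # lets the large county dominate

print(unweighted, weighted)
```

With these made-up numbers the simple average is 20.0 while the population-weighted average is 28.0, which is why an aggregate above the county level would probably need the weights.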

rtaph commented on June 23, 2024

NYC agency performance indicators from the FY20 Mayor's Management Report:

  • What I like:

    • It's a collection of KPIs, which is something that people naturally create dashboards for.
    • It has time depth (FY16-FY20, and probably more years in older datasets), so it can be visualized as several series.
    • It has target data. We could make something similar to this scorecard I made (unrelated), or use bullet charts.
    • Appears quite complete and clean.
    • There probably is a written report somewhere where we can get a lot more information about the data, and validate that our summaries match.
  • Limitations (what I like less):

    • Other than the year, there are not many disaggregating variables. Maybe we need/want more than what exists in the data? We might be able to augment the data ourselves by, say, classifying KPIs into themes.
    • It could also be overwhelming if we have too many KPIs.
    • It might be hard to determine which indicators can safely be summed from year to year (vs. being cumulative or needing distinct counts).
    • The value variable is represented in different ways for different KPIs (e.g. 72 calls, 0:11 average wait time, $12,300, "↓" target, 99.2%).
    • Some KPIs are sub-items of other KPIs, which makes the data a bit tricky to work with. E.g. "– Robbery" is a sub-bullet KPI of "Major felony crime".
  • Potential uses / personas:

    • Mayor of NYC
    • Residents of NYC (taxpayer), accountability dashboard
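
As a rough sketch of what handling the mixed value formats mentioned above could look like, the snippet below normalizes a raw value string into a number plus a unit label. Everything here is an assumption for illustration: the function name `parse_kpi_value` is hypothetical, "0:11" is read as minutes:seconds, and the value cell is assumed to hold only the value itself (e.g. "72", not "72 calls"). The real file may need different rules.

```python
import re

# Matches numbers such as 72, 99.2 or 12,300 (commas as thousands separators).
NUMBER = r"-?[\d,]*\.?\d+"


def parse_kpi_value(raw):
    """Return (number, unit) for a raw KPI string, or (None, "text") if not numeric."""
    text = raw.strip()

    # Durations such as "0:11", assumed to be minutes:seconds, converted to seconds.
    match = re.fullmatch(r"(\d+):(\d{2})", text)
    if match:
        return int(match.group(1)) * 60 + int(match.group(2)), "seconds"

    # Percentages such as "99.2%".
    match = re.fullmatch(rf"({NUMBER})%", text)
    if match:
        return float(match.group(1).replace(",", "")), "percent"

    # Dollar amounts such as "$12,300".
    match = re.fullmatch(rf"\$({NUMBER})", text)
    if match:
        return float(match.group(1).replace(",", "")), "dollars"

    # Plain counts such as "72".
    if re.fullmatch(NUMBER, text):
        return float(text.replace(",", "")), "count"

    # Anything else (e.g. a "↓" target direction) is kept as non-numeric text.
    return None, "text"


for raw in ["72", "0:11", "$12,300", "99.2%", "↓"]:
    print(raw, "->", parse_kpi_value(raw))
```

Keeping the unit next to the number would let a dashboard avoid mixing, say, percentages and dollar amounts on the same axis.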

rtaph commented on June 23, 2024

OECD Business Tendency Data:

  • What I like:

    • OECD generally has very complete and reliable data, especially on economics.
    • Monthly data for all OECD countries going back a decade
    • Many economic metrics
    • Can disaggregate/filter by countries (or regions) and industry
  • Limitations (what I like less):

    • Would be nicer to have more disaggregating variables
    • I don't think we can access micro-data.
    • Likely requires weights for aggregation.
    • Metrics are all on a relative scale. Might make it a little less intuitive for the average person.
  • Potential uses / personas:

    • Economists (which metrics are trending up or down)
    • Public servants (planning agencies)
  • Related ideas:

    • Compare these series with Coronavirus time series to see the impact the pandemic has had on numerous economic measures

rtaph commented on June 23, 2024

Closing this issue out. The team decided to go with the obesity data during a team meeting.
