smartchicago / chicago-atlas

View citywide information about health trends and take action near you to improve your own health.

Home Page: http://www.chicagohealthatlas.org/

Ruby 17.55% JavaScript 0.08% CSS 0.61% Shell 0.05% Python 0.09% HTML 2.07% PLpgSQL 79.53% CoffeeScript 0.02%

chicago-atlas's People

Contributors

mperezdomandtom, muygrafico, parrolabs1, smarziano

chicago-atlas's Issues

Add Metro Chicago Health Facilities

This is the sample point-level dataset I was talking about: https://www.metrochicagodata.org/dataset/Metro-Chicago-Health-Facilities/kt59-57by.

It should have a wide enough distribution of points to be useful to us as example data.

There are some natural ways to associate this point-level data with conditions in the model that I have prescribed for the project. For instance:

All items with the "Dataset description" field starting with the phrase, "Women Infant Children (WIC) locations" (lines 34 - 52) should be associated with these conditions:

Births and Birth Rate - Birth Rate
General Fertility Rate - Fertility Rate
Low Birth Weight - Percent
Prenatal Care - Percent - 1ST TRIMESTER
Prenatal Care - Percent - 2ND TRIMESTER
Prenatal Care - Percent - 3RD TRIMESTER
Prenatal Care - Percent - NO PRENATAL CARE
Prenatal Care - Percent - NOT GIVEN
Preterm Births - Percent

All items with the "SITE NAME" field containing the phrase, "STI Specialty Clinic" (lines 28 - 32), should be associated with these conditions:

Chlamydia in females - Incidence Rate
Gonorrhea in females - Incidence Rate
Gonorrhea in males - Incidence Rate

That should be enough for proof of concept...
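The two association rules above could be sketched as a small rule table. The field names, phrases, and condition names come from this issue; the rule structure and the function name `conditionsForFacility` are assumptions made for illustration, not part of the existing codebase.

```javascript
// Hypothetical rule table: each rule checks one field of a facility row
// and, on a match, associates the row with a list of conditions.
const FACILITY_RULES = [
  {
    field: "Dataset description",
    matches: (value) => value.startsWith("Women Infant Children (WIC) locations"),
    conditions: [
      "Births and Birth Rate - Birth Rate",
      "General Fertility Rate - Fertility Rate",
      "Low Birth Weight - Percent",
      "Prenatal Care - Percent - 1ST TRIMESTER",
      "Prenatal Care - Percent - 2ND TRIMESTER",
      "Prenatal Care - Percent - 3RD TRIMESTER",
      "Prenatal Care - Percent - NO PRENATAL CARE",
      "Prenatal Care - Percent - NOT GIVEN",
      "Preterm Births - Percent",
    ],
  },
  {
    field: "SITE NAME",
    matches: (value) => value.includes("STI Specialty Clinic"),
    conditions: [
      "Chlamydia in females - Incidence Rate",
      "Gonorrhea in females - Incidence Rate",
      "Gonorrhea in males - Incidence Rate",
    ],
  },
];

// Return the de-duplicated list of conditions for one facility row.
function conditionsForFacility(row) {
  const matched = new Set();
  for (const rule of FACILITY_RULES) {
    const value = row[rule.field] || "";
    if (rule.matches(value)) {
      rule.conditions.forEach((condition) => matched.add(condition));
    }
  }
  return [...matched];
}
```

Keeping the rules as data rather than as branching code would make it easier to add further associations once the proof of concept is accepted.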

Consider adding error bars to the new bar graphs on places pages

@derekeder, I think the Places pages are looking better and better. One enhancement to consider is adding error bars (based on the upper and lower CI values) to the graphs for causes of death, etc. This would help users see whether the difference between the City value and the CA value is statistically significant.
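A minimal sketch of what the error bars would need, assuming each charted statistic carries `value`, `lower`, and `upper` fields (hypothetical names). Checking whether two intervals overlap is only a rough visual cue: non-overlapping intervals suggest a real difference, but overlapping intervals do not prove the absence of one.

```javascript
// Distances from the point estimate to each end of the confidence interval,
// which is the shape most charting libraries expect for error bars.
function errorBar(stat) {
  return { minus: stat.value - stat.lower, plus: stat.upper - stat.value };
}

// True when the two confidence intervals share at least one point.
function intervalsOverlap(a, b) {
  return a.lower <= b.upper && b.lower <= a.upper;
}
```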

Identify established standards or benchmarks for reporting crime statistics

from @RoderickJones:

One thing I would recommend is that you try to identify some standards or benchmarks for
reporting of crime statistics from a federal agency (e.g., Dept. of Justice, FBI, etc.). The
meaning of the data will be enhanced if you are using existing, accepted definitions. In
public health, different indicators are calculated with different denominators (could be all
births, all adults, all people, etc.) and the “per” can be per 1,000, 10,000 or 100,000. It is
always best to use accepted definitions (and optimally, cite them).
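To make the point about denominators and the "per" multiplier concrete, here is a small sketch. The function name and the default of 100,000 are assumptions; the correct denominator and multiplier for any given crime statistic would come from whichever published definition is adopted.

```javascript
// Rate of events per `per` people in the chosen denominator population.
function ratePer(count, population, per = 100000) {
  if (population <= 0) {
    throw new RangeError("population must be positive");
  }
  return (count * per) / population;
}
```

The same count yields very different-looking numbers depending on both arguments, which is why citing the definition alongside the figure matters.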

set up Universal Analytics

Set up new Google Analytics tracking code:

(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-XXXXXXX-XX', 'herokuapp.com');
  ga('send', 'pageview');

Also add the ability to track exit links, downloads and emails, similar to https://github.com/smartchicago/chicago-atlas/blob/master/app/assets/javascripts/analytics_lib.js
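As a rough sketch of the exit link, download, and email tracking (not a copy of `analytics_lib.js`), links could first be classified by a pure function. The extension list and the function name are assumptions.

```javascript
// Assumed set of file extensions that count as downloads.
const DOWNLOAD_EXTENSIONS = /\.(csv|pdf|zip|xlsx?|docx?)$/i;

// Classify a link as "email", "download", "exit", or "internal".
// `siteHost` is the host of the current site, e.g. location.host in a browser.
function classifyLink(href, siteHost) {
  if (/^mailto:/i.test(href)) return "email";
  const url = new URL(href, "http://" + siteHost); // resolves relative links
  if (DOWNLOAD_EXTENSIONS.test(url.pathname)) return "download";
  if (url.host !== siteHost) return "exit";
  return "internal";
}

// In the browser, a click handler could then report non-internal links with
// the analytics.js event call, for example:
//   ga('send', 'event', classifyLink(link.href, location.host), 'click', link.href);
```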

Design neighborhood pages

This issue is here to organize the team's thoughts around designing neighborhood pages.

Example: http://chicago-atlas.herokuapp.com/geography/albany-park-community-area

-- delete the phrase "community area" from the end of the URL
-- conditions should be displayed in a list (as they are now)
-- map should be prominent on the page, with the point-level data (clinics, classes and other interventions) displayed and with no data from the condition list displayed
-- map should have the entire neighborhood centered and shown in outline, with the non-neighborhood portions of the Chicago map visible to the edges of the map box
-- hovering over the condition list colors the map with the proper shade for the condition, along with the legend for that condition, and the point-level interventions
-- clicking the condition sets the map, allowing user to drill down

This is a start, and makes our site an awful lot like every other mapping site. We will have to think about this more-- this is a rough-out.

Is there a service that allows us to pull neighborhood metadata and have it be updated through time? I think it would be good to have some "Albany Park is located in the northwest side of Chicago..." content, but I don't want to have to maintain it.

We have to think about how to toggle over to seeing a citywide map, with the neighborhood highlighted.

Explanation of why it would be better to present just one level of the Prenatal Care indicator

I am not in favor of mapping/visualizing all 4 levels of the prenatal care indicator, even though the portal table that’s being used in the Atlas does contain 4 levels. In general, the indicator is defined as the percent of mothers who received prenatal care in the first trimester of pregnancy (or the inverse, i.e., those who did not get care in the first trimester). Let me provide some explanation for the opinion I’m expressing.

In every dataset we work with, there are missing values. Every time I do an analysis I need to assess the extent to which those missing values could lead to misinterpretation of the results. In many instances, the missing values are distributed at random (or close to it) and the conclusions we come to in looking at the results are not substantially affected by the missing values.

Take a moment to look at this table of missing values on birth certificates in Chicago over the course of the 2000s: Table 80 on page 106 of 107 in this PDF:
http://www.cityofchicago.org/city/en/depts/cdph/provdrs/pol_plan_report/news/2012/dec/cdph_releases_comprehensivereportonbirthsinchicago.html
See how prenatal care has a relatively high proportion of missing values, with variation over time. In addition there happens to be variation associated with community area of residence as well.

When this problem presents itself, my approach varies with the audience. In the Births report, we simply conducted the analyses on the complete data (excluding the missings, rather than imputing values for them).

For purposes of Open Data, I chose to make all 4 categories available, and in that way, provide transparency within the dataset itself about the extent of the missing values for this indicator (i.e., the Not Given category). Even though I made this available, I don’t think this category is useful to users of the Atlas.

What I believe would be most meaningful (although still imperfect) is to take the approach of the Births report – to define the indicator as a percent with numerator=# getting prenatal care in the 1st trimester; denominator=sum of all categories except Not Given (i.e., 1st trimester+2nd trimester+3rd trimester).
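The calculation described above could look roughly like this. The key names (`first`, `second`, `third`, `notGiven`) are hypothetical. The sketch sums every category it is given except Not Given, so the question of exactly which categories belong in the denominator is left to the caller. Because it only forms a ratio, it works on counts or on percentages that share a common base.

```javascript
// Percent receiving first-trimester prenatal care among records where the
// timing of care is known (i.e. excluding the Not Given category).
function firstTrimesterPercent(levels) {
  const { notGiven, ...known } = levels; // drop Not Given from the denominator
  const denominator = Object.values(known).reduce((sum, v) => sum + v, 0);
  if (denominator === 0) return null; // nothing known, so no meaningful percent
  return (100 * levels.first) / denominator;
}
```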

Obviously this requires additional math on our Open Data table, so I can understand if this is beyond your scope. But I wanted to convey the perspective that reporting all 4 levels of the prenatal care indicator probably would strike most public health oriented people as odd.

Update About page

Remove the last line ("For technical assistance or to report a bug, email Derek Eder.").

Complete about text, including acknowledgments/ story of the site origin, to come.

Compliance with Data Portal Terms of Use

From my perspective, the site should comply with the terms of use of the open portal.
http://www.cityofchicago.org/city/en/narr/foia/data_disclaimer.html

The terms of use were brought to my attention and I wanted to pass my opinion along.

Perhaps the quickest way to get in compliance is through incorporating this text* into the About page.

*This site provides applications using data that has been modified for use from its original source, www.cityofchicago.org, the official website of the City of Chicago. The City of Chicago makes no claims as to the content, accuracy, timeliness, or completeness of any of the data provided at this site. The data provided at this site is subject to change at any time. It is understood that the data provided at this site is being used at one’s own risk.

Error in values displayed

  1. Only one value for the confidence interval (CI) is shown. A CI contains two values: the lower and the upper.
  2. The values for Chicago aren't averages. When you hover over a point, the balloon states "Chicago average", and the legend says the same. It should state rate, percent, count, etc. where applicable.
  3. I am not sure if this was an error or intentionally done, but I noticed the graph does not appear for community areas that have few teen births (e.g. Loop, Edison Park, etc.). Should a note explain why there isn't a graph for this indicator or similar ones?

Improvements to data on condition index page

Conditions index page is here: http://chicagohealthatlas.org/conditions. Tasks as follows:

  • Currently there is no info on these pages (example: http://chicagohealthatlas.org/condition/prenatal_care)
  • (I know these were just stubs, so no big deal-- just want to document the info that needs to be on these pages)
  • Add city-wide map for browsing the condition by neighborhood in a map interface. For example, this is the map for "Births and Birth Rate - Birth Rate" in 2001 across the city: http://chicagohealthatlas.org/map/24/2001
  • Change URL to display a condition slug rather than a UID (in the example above, change "24" to the word "births")
  • Add all metadata for these pages as provided by data source
  • More to come... this is a starter.
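For the slug task, a minimal sketch follows. The underscore style mirrors existing URLs such as `/condition/prenatal_care`; the `CONDITION_IDS` lookup and the `mapPath` helper are hypothetical names introduced only to show how a slug could replace the numeric ID.

```javascript
// Turn a condition name into a URL slug.
function slugify(name) {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_") // collapse runs of non-alphanumerics
    .replace(/^_+|_+$/g, ""); // trim leading and trailing underscores
}

// Hypothetical lookup from slug to the existing numeric ID.
const CONDITION_IDS = { births: 24 };

// Build a map URL path from a slug instead of the numeric ID.
function mapPath(slug, year) {
  if (!(slug in CONDITION_IDS)) {
    throw new Error("Unknown condition: " + slug);
  }
  return "/map/" + slug + "/" + year;
}
```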

Ability to edit datasets

Relates to #6

Create a back-end interface, similar to the Connect Chicago Locator, that allows admins to create/update/delete datasets. This includes editing:

  • name
  • description
  • provider
  • url
  • category

Once implemented, anyone will be able to update this description text.

handle/remove Chicago aggregates when charting raw numbers

Raw metrics like births, low birth weight births, pre-term births, teen births, and lead screening are being charted against the raw numbers for all of Chicago. These charts should either be removed, or some kind of average calculated for the raw Chicago numbers.
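If the second option is chosen, one possible "average" is the mean raw count across the community areas that report a value. This is only one reading of "some kind of average"; the input shape (an object keyed by area name) is an assumption.

```javascript
// Mean of the raw counts across areas, skipping areas with no numeric value.
function averageAcrossAreas(countsByArea) {
  const values = Object.values(countsByArea).filter((v) => typeof v === "number");
  if (values.length === 0) return null;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}
```

Charting an area's raw count against this mean keeps both series on a comparable scale, which the citywide total does not.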

partner form page

Move the partner form on the home page to its own page so we can link to it from different places in the site.

Create overall brand for Chicago Health Atlas

Need a logotype with some sort of graphic treatment (similar to what we did for Civic Innovation in Chicago).

There is no tagline as of yet, but the top-level description of the site is instructive:

"The Chicago Health Atlas is a place where you can view citywide information about health trends and take action near you to improve your own health."

Upshot: Analysis + Action. No mere display of data w/o a way to take action in your own life. No action w/o specific knowledge about what exists; where.

Along with this brand will be a color palette. I've already designed the basic wireframe of the site with the guiding concepts of browse (by condition/ time/ place) and the centrality of neighborhood pages, so there is less of a design task here than a workflow and feeling task.

Looking forward to taking this to the next step!

add choropleth cutoffs to CDPH datasets

Color group cutoffs provided by Jamyia at CDPH:

Births

  • 6.0 – 11.9
  • 12.0 – 17.9
  • 18.0 – 23.9
  • 24+

Breast Cancer

  • 8.0 – 20.9
  • 21.0 – 27.9
  • 28.0 – 34.9
  • 35.0 – 56.9

Breast Cancer Years of Potential Life Lost

  • 78.9 – 100
  • 200 – 300
  • 400 – 500
  • 600+

Cancer (All Sites)

  • 120 – 160
  • 170 – 210
  • 220 – 260
  • 270 – 280

Cancer (All Sites) Years of Potential Life Lost

  • 591 – 1100
  • 1200 – 1800
  • 1900 – 2500
  • 2600 – 2900

Colorectal Cancer Deaths

  • <14
  • 14 – 16
  • 17 – 22
  • 23 – 29
  • 30 – 55

Colorectal Cancer Deaths Years of Potential Life Lost

  • 10 – 79
  • 80 – 159
  • 160 – 249
  • 250+

Fertility

  • 20 – 59
  • 60 – 79
  • 80 – 99
  • 100+

Firearm-related Deaths

  • 1.0 – 9.9
  • 10.0 – 19.9
  • 20.0 – 29.9
  • 30.0 – 63.9

Firearm-related Deaths Years of Potential Life Lost

  • 1 – 200
  • 300 – 500
  • 600 – 1200
  • 1300+

Gonorrhea (Males)

  • 40 – 500
  • 600 – 1100
  • 1200 – 1700
  • 1800+

Gonorrhea (Females)

  • 34 – 500
  • 600 – 1100
  • 1200 – 1700
  • 1800+

Homicide

  • 0–9
  • 10 – 19
  • 20 – 29
  • 30+

Homicide Years of Potential Life Lost

  • 0 – 299
  • 300 – 899
  • 900 – 1499
  • 1500+

Low birth weight

  • 2.6 – 7.49
  • 7.50 – 12.49
  • 12.50 – 17.49
  • 17.50 – 27.0

Lung Cancer

  • 23.0 – 39.9
  • 40.0 – 49.9
  • 50.0 – 59.9
  • 60.0 – 98.9

Prenatal Care in 1st Trimester

  • <65
  • 65 – 72
  • 73 – 80
  • 81 – 96

Preterm

  • 3–9
  • 10 – 13
  • 14 – 17
  • 18 – 29

Prostate Cancer

  • 2.0 – 19.9
  • 20.0 – 39.9
  • 40.0 – 59.9
  • 60.0+

Prostate Cancer Years of Potential Life Lost

  • 0 – 49.9
  • 50.0 – 99.9
  • 100.0 – 139.9
  • 140+

TB

  • 0 – 3.9
  • 3.0 – 7.9
  • 8.0 – 11.9
  • 12+

Teen Births

  • 0 – 39.9
  • 40.0 – 79.9
  • 80.0 – 119.9
  • 120+
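A sketch of how cutoffs like these could be applied when coloring the map. It stores only the lower bound of each group and treats the groups as half-open ranges, which is an assumption that sidesteps the small gaps between a group's printed upper bound and the next group's lower bound. The example bounds are the Births values listed above.

```javascript
// Lower bounds of the four Births color groups from the list above.
const BIRTHS_LOWER_BOUNDS = [6.0, 12.0, 18.0, 24.0];

// Index of the color group for a value: the last group whose lower bound
// the value reaches. Returns -1 when the value is below every group.
function colorGroup(value, lowerBounds) {
  let group = -1;
  for (let i = 0; i < lowerBounds.length; i++) {
    if (value >= lowerBounds[i]) group = i;
  }
  return group;
}
```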

import CDPH Mortality dataset

The Mortality dataset provided by CDPH has a different format from the rest.

It contains many one-off columns over an aggregated period (2004-2008). These columns don't lend themselves to being charted over time and include:

  • Cause of Death
  • Cumulative Deaths 2004 - 2008
  • Cumulative Deaths Rank
  • Average Annual Deaths 2004 - 2008
  • Average Crude Rate 2004 - 2008
  • Crude Rate Rank
  • Average Adjusted Rate 2004 - 2008
  • Adjusted Rate Rank
  • Average Annual Years of Potential Life Lost (YPLL) Rate 2004 - 2008
  • YPLL Rate RANK

Task: come up with a data model that can handle data in this format. There may be some overlap with demographic data as well.
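One possible shape for such a data model, offered as a starting point rather than a decision: turn each wide row into a list of long-form statistics, pulling the aggregation period out of the column name where one is present. The output field names (`measure`, `yearStart`, `yearEnd`) are assumptions.

```javascript
// Matches a trailing period such as "2004 - 2008" in a column name.
const PERIOD = /\s*(\d{4})\s*-\s*(\d{4})\s*$/;

// Convert one wide Mortality row into long-form statistic records.
function toStatistics(row, geography) {
  const cause = row["Cause of Death"];
  const stats = [];
  for (const [column, value] of Object.entries(row)) {
    if (column === "Cause of Death") continue;
    const match = column.match(PERIOD);
    stats.push({
      geography,
      cause,
      measure: match ? column.replace(PERIOD, "") : column,
      yearStart: match ? Number(match[1]) : null, // null when no period in name
      yearEnd: match ? Number(match[2]) : null,
      value,
    });
  }
  return stats;
}
```

Rank columns carry no period in their names, so in this sketch they come out with null years; a real model would need to decide whether they inherit the period of the measure they rank.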

rename upper and lower CI columns in three CDPH datasets

@JamyiaClark and @RoderickJones, could you rename the following confidence interval columns to match the value column name for consistency?

In causes of death:

  • Adjusted Rate Lower CI => Average Adjusted Rate Lower CI
  • Adjusted Rate Upper CI => Average Adjusted Rate Upper CI

In Tuberculosis:

  • Incidence rate lower CI => Average Annual Incidence Rate 2007-2011 Lower CI
  • Incidence rate upper CI => Average Annual Incidence Rate 2007-2011 Upper CI

In Infant mortality:

  • Rate Lower CI 2004 - 2008 => Average Infant Mortality Rate 2004 - 2008 Lower CI
  • Rate Upper CI 2004 - 2008 => Average Infant Mortality Rate 2004 - 2008 Upper CI
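Until the columns are renamed at the source, an import-side alias table could do the same job. This is a fallback sketch, not the requested fix; it shows the causes of death and Tuberculosis renames listed above, and the function name is hypothetical.

```javascript
// Old column name => name that matches the value column.
const CI_COLUMN_RENAMES = {
  "Adjusted Rate Lower CI": "Average Adjusted Rate Lower CI",
  "Adjusted Rate Upper CI": "Average Adjusted Rate Upper CI",
  "Incidence rate lower CI": "Average Annual Incidence Rate 2007-2011 Lower CI",
  "Incidence rate upper CI": "Average Annual Incidence Rate 2007-2011 Upper CI",
};

// Return a copy of the row with any aliased columns renamed.
function renameColumns(row, renames) {
  const renamed = {};
  for (const [column, value] of Object.entries(row)) {
    renamed[renames[column] || column] = value;
  }
  return renamed;
}
```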

Metadata: Create process for change management

Process should deal with three things:

-- Detect when a dataset is changed and make the change (handled by subscribing to the feed for that data and creating an automated import process).
-- Document the change (somehow capture the content from the version history that appears at the bottom of each dataset description).
-- Broadcast (tweet, etc.). Work with DoIT and CDPH, as they are the canonical source of info.
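For the detect and document steps, a simple row-level diff between the previously imported data and a fresh copy might be enough to start with. This sketch assumes each row has a stable key field and that both copies serialize their fields in the same order; neither assumption is confirmed for the portal data.

```javascript
// Compare two snapshots of a dataset and report which row keys were
// added, removed, or changed.
function diffRows(previous, current, keyField) {
  const index = (rows) =>
    new Map(rows.map((row) => [row[keyField], JSON.stringify(row)]));
  const before = index(previous);
  const after = index(current);

  const added = [];
  const removed = [];
  const changed = [];
  for (const [key, json] of after) {
    if (!before.has(key)) added.push(key);
    else if (before.get(key) !== json) changed.push(key);
  }
  for (const key of before.keys()) {
    if (!after.has(key)) removed.push(key);
  }
  return { added, removed, changed };
}
```

The resulting summary could be stored as the change record and reused as the text of the broadcast.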

No condition data displayed

Determine site structure

There are both data tasks and design tasks associated with this. The classic example I keep giving of this in project meetings is:


"OK, I see the rates of diabetes, by neighborhood, through time, but what can I do today to help control my insulin levels?"

The answer: go to a healthy eating class, which is right here (show points on a map)"


There are a number of ways we'll use this info:

On condition pages
For each condition (http://chicagohealthatlas.org/condition/births_and_birth_rate, for example)*, show a list of interventions along with a citywide map of all places that are associated with that intervention.

On neighborhood condition pages (as per #20)
For each condition per neighborhood (example: http://chicagohealthatlas.org/place/archer_heights), display the intervention points on the neighborhood map. One design question, which we can answer once we have some actual data in there, is whether to display all points in a particular neighborhood or only in relation to conditions. I am leaning toward the latter-- driving people to the most specific page (neighborhood condition pages) is the key, rather than showing yet another map with yet another set of dots. This is where the anagram-style paragraph of text comes in for the neighborhood conditions-- it may be possible to call out the important intervention locations in this paragraph.

  • = note that these condition names are being changed in #18, so these examples may be 404 soon, but the general idea will remain.

import zip code geographies

Some data is provided as aggregated by zip code. The City provides a dataset with these boundary definitions which will need to be imported.

A caveat of these zip codes is that some of the CDPH data is aggregated into multiple zip codes like so:

  • 60606, 60607 & 60661
  • 60601, 60602, 60603, 60604, 60605 & 60611
  • 60610 & 60654
  • 60622 & 60642
  • 60707 & 60635
  • 60827 & 60633

Question: should we treat these groupings as their own geography? If so, can we expect them to be consistent across datasets and time?
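If the groupings are treated as their own geography, each one needs a stable identifier. A minimal sketch, assuming the labels always separate zip codes with commas and an ampersand as in the list above; the key format (sorted zips joined with hyphens) is an arbitrary choice.

```javascript
// Split a label such as "60606, 60607 & 60661" into sorted zip codes.
function parseZipGroup(label) {
  return label
    .split(/\s*(?:,|&)\s*/)
    .map((zip) => zip.trim())
    .filter(Boolean)
    .sort();
}

// Canonical key for a grouping, independent of the order in the label.
function zipGroupKey(label) {
  return parseZipGroup(label).join("-");
}
```

Comparing these keys across datasets and years would be one way to answer the consistency question empirically.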

inconsistent column labeling for CDPH datasets

While going through the import process and reviewing the charts generated on community area detail pages, I noticed that most columns follow a standard naming convention. However, in a few cases, this pattern is broken.

A few examples:

  • Prenatal care, upper and lower CI
  • Teen birth rate, upper and lower CI

These column names either need to be fixed on the CDPH end, or we need to import them manually.

Using Jenks natural breaks for map colors

@RoderickJones - while looking into calculating the category cutoffs by hand, I came across the Jenks natural breaks optimization. It seems to be a very effective way of calculating our categories.

I was able to find a javascript implementation by @tmcw and went ahead and added it to the site. I am passing the algorithm all values for all years (excluding nulls) for each given dataset.

You can see the results here: http://www.chicagohealthatlas.org/map

How do you think this compares to the categories you came up with? If they are as good or better than the ones we already have, using this approach makes sustaining this site much easier as we won't have to take the time to re-calculate the cutoffs manually anymore.

Is it necessary to have a /dataset directory?

I'm not exactly sure how to handle this. I know that I don't like the word "dataset" (too jargon-y), and I wonder if we can just let each condition/dataset be a directory right off the root.

We can still organize other directories (/place, /provider, /etc) but let conditions be at the root of the site.

If there is no technical reason why we can't do this, let's turn this from a question to a bug.

No IE7 Support [was: hover tool goofiness on south side(?)]

This might be a wontfix, but I wanted to share an observation that the hover functionality works great for almost all the city except for far south side. I tried a few different maps and got the same problem. Try South Deering (#51) the big one in far south. The only way I can get the hover to activate is if I hit a sweet spot in the southwest corner of the polygon. This also seems to happen with East Side (#52) and Hegewisch (#55). I'm bringing it to your attention because it seems like all the other CA's activate the hover easily without having to search for the sweet spot. Maybe this is my browser - I have no idea.

home page text and form

It seems that the text in the Google form and on the home page of Chicago Atlas are a bit redundant.

It also makes sense to me to remove this sentence since we're already there:

"If you would like to see the current status of our new website, go here."

Thoughts?
