chicago / census2020_ward_rpt

Census 2020 Report

Home Page: https://www.chicago.gov/city/en/sites/census2020/home/ward-reports.html

License: MIT License

R 0.40% HTML 99.57% Python 0.03% Shell 0.01%

census2020_ward_rpt's Introduction

Census 2020 Ward Report

The ward reports from this project are published at https://www.chicago.gov/city/en/sites/census2020/home/ward-reports.html

These reports are based on demographic data from the US Census Bureau, open data from https://data.cityofchicago.org/, and projections provided by Civis Analytics.

The reports provide current Census response rates combined with demographic data at a ward level, so that Aldermanic offices can better understand performance within their wards.

The reports are rendered weekly as static, standalone HTML files, but this project could easily be modified to run as an application on a Shiny Server.

Technical Details

The current final reports are located in ./WardReport_v2, and the actual report is WardReport_v2/WardReport.Rmd. If you are curious to see how the report works, or want to adapt it to your own needs, this is a good place to start. By default, the report opens the latest cached file in the project and uses the current ward number from WardReport_v2/cur_ward.yaml.

The production process to generate the reports is as follows:

First, create a new cache by running WardReport_v2/10_refresh_cache.R. This is necessary to speed up the render process. The cache includes data for all wards, but the static dashboard is rendered for just one ward's subset at a time.

Second, to render the reports for each ward, run WardReport_v2/20_render_reports.R. This changes the current ward number in WardReport_v2/cur_ward.yaml, renders the report from the report markdown file WardReport_v2/WardReport.Rmd, and saves the results in an output folder.
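The two steps above can be pictured as a small driver loop. The following is only a sketch in Python of that flow, not the project's actual R scripts: the `ward:` key in cur_ward.yaml, the `ward_NN.html` output names, the `output` folder name, and both function names are assumptions made for illustration. `rmarkdown::render` with `output_file` and `output_dir` is the standard R Markdown call.

```python
import subprocess
from pathlib import Path


def render_command(ward, out_dir="output"):
    """Build the Rscript call that renders one ward's report to HTML."""
    expr = (
        "rmarkdown::render('WardReport.Rmd', "
        f"output_file = 'ward_{ward:02d}.html', output_dir = '{out_dir}')"
    )
    return ["Rscript", "-e", expr]


def run_pipeline(report_dir, wards, dry_run=True):
    """Refresh the cache once, then set the ward and render, one ward at a time.

    With dry_run=True no R process is started; cur_ward.yaml is still
    written and the planned commands are returned for inspection.
    """
    report_dir = Path(report_dir)
    steps = [["Rscript", "10_refresh_cache.R"]]
    if not dry_run:
        subprocess.run(steps[0], cwd=report_dir, check=True)
    for ward in wards:
        # Assumed file layout: a single `ward:` key (hypothetical key name).
        (report_dir / "cur_ward.yaml").write_text(f"ward: {ward}\n")
        command = render_command(ward)
        steps.append(command)
        if not dry_run:
            subprocess.run(command, cwd=report_dir, check=True)
    return steps
```

Calling `run_pipeline("WardReport_v2", range(1, 51))` would plan one cache refresh followed by one render per ward for Chicago's 50 wards; passing `dry_run=False` would actually invoke Rscript.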

As a side note, the report could easily be made into a modular application in Shiny's reactive framework.

The final step is to email aldermanic offices on the distribution list maintained on the Civis platform. This is accomplished through WardReport_v2/ward_reports.py. Within Civis Analytics' census platform, the Census Intelligence Center, this script runs in a containerized Docker environment that automates the email distribution process.

census2020_ward_rpt's People

Contributors

dforbush, geneorama, srao-civis

Forkers

ankitpatel0698

census2020_ward_rpt's Issues

Add automated saving of weekly ward progress

In the R markdown file, we should think of a way to automatically save the response rate week-to-week. It should be done in a way that it can be automatically recalled by the code week to week, i.e. so that it can be used to create something like a graph of the response rate over time. This will likely involve saving an .Rda file that is automatically recalled and re-saved week to week.
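One way to picture the week-to-week persistence described above is an append-or-update history file. The sketch below is in Python rather than R and uses a CSV file in place of the .Rda file the issue suggests; the column names, the function name, and the idea of keying rows by (week, ward) are all assumptions. Re-running it for the same week overwrites that week's values instead of duplicating them.

```python
import csv
from datetime import date
from pathlib import Path

FIELDS = ["week", "ward", "response_rate"]  # hypothetical column names


def save_weekly_rates(history_path, week, rates):
    """Merge one week's {ward: response_rate} snapshot into the history file.

    Returns the full history as a dict keyed by (ISO week date, ward).
    """
    history_path = Path(history_path)
    rows = {}
    # Recall whatever was saved in earlier weeks, if the file exists.
    if history_path.exists():
        with history_path.open(newline="") as f:
            for row in csv.DictReader(f):
                key = (row["week"], int(row["ward"]))
                rows[key] = float(row["response_rate"])
    # Add this week's values; the same (week, ward) key replaces old values.
    for ward, rate in rates.items():
        rows[(week.isoformat(), ward)] = rate
    # Re-save the complete history, sorted by week and then ward.
    with history_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        for (wk, ward), rate in sorted(rows.items()):
            writer.writerow({"week": wk, "ward": ward, "response_rate": rate})
    return rows
```

The returned dictionary holds one value per ward per week, which is the shape needed to plot response rate over time.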

Source of CIC data / meaning of HTC values

I was trying to understand the source for the Census Hard to Count percentages in the CIC. On the HTC maps and the Census Bureau sites I've only seen scores, so I was wondering about the source for HTC population percentages.

I started with example #6, tract 207.02, and had a hard time reconciling the figures from the CIC with the planning database. Am I missing something?

I took these 4 blocks in the tract:
image

I then compiled those into a small table. The total column is based on the sum product of the percentages and the populations:
image

The CIC number looks similar to the Total Population from the planning database, but I'm not seeing good matches on the rest of the numbers:

image

The full example can be found in the attached excel file.

Civis example 20702.xlsx

Develop summary table for census tracts by ward

Develop a table in Markdown / R Markdown

Draft of table contents

| Tract | TotPop | TotHH | LEPspanHHs | HH_Renter_Crowded | Owner_TOTAL | Renter_TOTAL | Pct Renter | NoInternet | IntFirstEng |
|---|---|---|---|---|---|---|---|---|---|
| 17031010501 | 4,147 | 2,425 | 0 | 30 | 291 | 2,134 | 88% | 526 | 2,652 |
| 17031010502 | 2,802 | 1,508 | 29 | 70 | 287 | 1,221 | 81% | 262 | 1,796 |
| 17031010503 | 1,713 | 968 | 20 | 59 | 197 | 771 | 80% | 308 | 1,258 |

I used this site to create the table above from an Excel sheet: https://www.tablesgenerator.com/markdown_tables. The ratio column (Pct Renter) is calculated in Excel.
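The ratio column could also be computed in code instead of Excel. The sketch below is in Python and assumes that Pct Renter is Renter_TOTAL divided by Owner_TOTAL plus Renter_TOTAL, which is consistent with the three draft rows; the function names are hypothetical, and only a subset of the columns is shown.

```python
# (Tract, Owner_TOTAL, Renter_TOTAL) taken from the draft table above.
DRAFT_ROWS = [
    ("17031010501", 291, 2134),
    ("17031010502", 287, 1221),
    ("17031010503", 197, 771),
]


def pct_renter(owner, renter):
    """Share of occupied households that rent (assumed definition)."""
    total = owner + renter
    return renter / total if total else 0.0


def to_markdown(rows):
    """Render the rows as a Markdown table with a computed Pct Renter column."""
    lines = [
        "| Tract | Owner_TOTAL | Renter_TOTAL | Pct Renter |",
        "|---|---:|---:|---:|",
    ]
    for tract, owner, renter in rows:
        share = pct_renter(owner, renter)
        lines.append(f"| {tract} | {owner:,} | {renter:,} | {share:.0%} |")
    return "\n".join(lines)


print(to_markdown(DRAFT_ROWS))
```

For the three draft tracts this gives 88%, 81%, and 80%, matching the values entered by hand.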

@DForbush can you put something like that in ./app/dev.R?

Make Decisions about the Civis Dataset

Importing the Civis data is cumbersome for two reasons:

  1. In the Global File, the following code (lines 49-53) takes a long time to run:
    civistable <- "cic.pdb2019trv3_us"
    civisdata <- read_civis(civistable, database="City of Chicago") # this will take a minute or two
    civisdata <- as.data.table(civisdata)
    civisdata <- civisdata[match(shp_tracts$GEOID, civisdata$gidtr)]

This is because the entire dataset is imported and then matched with the Chicago-specific census tracts. Is there a way, in the read_civis import line, to only import the Chicago-specific census tracts?

  2. There are 522 columns in the Civis data set. We probably don't need most of these, and it makes the data unwieldy and slow. How should we filter which columns we want to use?
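Both points come down to pushing the row and column filters into the SQL so that less data leaves the database. The sketch below is in Python and only builds the query string; the helper name and the example column list are hypothetical, and it assumes `gidtr` is stored as text (drop the quotes if it is numeric). The resulting string could then be passed to `read_civis(sql(...))`, the form used in the crosswalk issue further down.

```python
import re

# Plain or schema-qualified SQL identifiers only, e.g. "cic.pdb2019trv3_us".
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def build_tract_query(table, columns, geoids, id_column="gidtr"):
    """Build a SELECT limited to the wanted columns and tract GEOIDs."""
    if not columns or not geoids:
        raise ValueError("need at least one column and one GEOID")
    for name in [table, id_column, *columns]:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    for geoid in geoids:
        if not str(geoid).isdigit():
            raise ValueError(f"GEOIDs must be all digits: {geoid!r}")
    # Assumption: the id column is text, so each GEOID is quoted.
    id_list = ", ".join(f"'{g}'" for g in geoids)
    return (
        f"SELECT {', '.join(columns)} FROM {table} "
        f"WHERE {id_column} IN ({id_list})"
    )
```

With the roughly 800 Chicago tract GEOIDs from `shp_tracts$GEOID` as the `geoids` argument, the query would return only Chicago rows and only the listed columns.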

Census tracts not found in crosswalk

@sherryshenker @srao-civis

Do you know why so few of the census tracts are matched to the crosswalk?

Reading in just the tracts and crosswalk:

ex <- read_civis(sql("SELECT tract FROM cic.pdb2019trv3_us"), database = "City of Chicago")
cw <- read.csv("data_census_planning/crosswalk.csv")

Seeing what's in what:

table(ex$tract %in% cw$census_tract)

This yields the following table, so the vast majority are not found:

FALSE  TRUE 
69599  4425 

The tracts look different:

head(ex$tract)
[1]  20100  20200 981903  20300  20400  20500
head(cw$census_tract)
[1] 1001 1002 1003 1004 1005 1006

Am I comparing the wrong things?
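One likely contributor is that a bare tract code is unique only within a county, and the `_us` table covers the whole country, so comparing tract numbers alone mixes tracts from many counties. A full tract GEOID is the 2-digit state FIPS code, the 3-digit county FIPS code, and the 6-digit tract code, each zero-padded. The Python sketch below builds that key; the function names are hypothetical, and whether crosswalk values such as 1001 are full tract codes or some shortened form cannot be determined from this thread.

```python
def tract_geoid(state_fips, county_fips, tract_code):
    """Build the 11-digit tract GEOID: state (2) + county (3) + tract (6)."""
    return f"{int(state_fips):02d}{int(county_fips):03d}{int(tract_code):06d}"


def match_counts(left_keys, right_keys):
    """Count how many left keys are found (or not) among the right keys."""
    right = set(right_keys)
    found = sum(1 for key in left_keys if key in right)
    return {"TRUE": found, "FALSE": len(left_keys) - found}


# Illinois is state 17 and Cook County is county 031, so the numeric
# tract value 20100 becomes the tract code 020100.
print(tract_geoid(17, 31, 20100))  # 17031020100
```

Comparing full GEOIDs on both sides, rather than integers that have lost their leading zeros and county prefix, avoids accidental matches between unrelated tracts.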

Plan for cic.ward_daily_rates_2020

What is the plan for this table?

I know this is dummy data (should be zeros), but I don't understand why there are 25 rows for each ward / day. For example, ward 1 has 25 entries for 2020-03-12.

Also, there is an NA ward.
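Both observations can be checked mechanically once the table is loaded as a list of records. This is a small Python sketch, assuming hypothetical field names `ward` and `date` and that a missing ward arrives as `None`.

```python
from collections import Counter


def duplicate_keys(rows, key_fields=("ward", "date")):
    """Return {key: count} for every key that appears more than once."""
    counts = Counter(tuple(row.get(f) for f in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def rows_missing_ward(rows):
    """Return the rows whose ward is missing (the NA ward)."""
    return [row for row in rows if row.get("ward") is None]
```

If every ward and day shows the same count, such as 25, that may indicate the rows are repeated along a third dimension not included in the key.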

Matching tract numbers between CIC and other census data

I was trying to understand the CIC numbers, such as Census Hard to Count, and what that percentage means.

I happened to pick Tract 20702 Block 3 as an example.

I tried to match it to a tract in the planning database file data_census_planning/pdb2017tract_2010MRR_2018ACS_IL.xlsx. I could not find an exact match. Two tracts contained "20702": 17031020702 and 17031220702, but the population doesn't match for either.

image

@DForbush did you ever resolve this matching issue?
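Splitting the two candidate GEOIDs into their parts shows why a substring search finds both. The Python sketch below assumes the standard 11-digit tract GEOID layout (2-digit state, 3-digit county, 6-digit tract, where the last two tract digits are the decimal suffix); the function name is hypothetical. It addresses only the identifier ambiguity, not the population mismatch.

```python
def split_tract_geoid(geoid):
    """Split an 11-digit tract GEOID into state, county, tract, and name."""
    geoid = str(geoid)
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError(f"expected an 11-digit GEOID, got {geoid!r}")
    tract = geoid[5:]
    return {
        "state": geoid[:2],
        "county": geoid[2:5],
        "tract": tract,
        # Tract names put a decimal point before the last two digits.
        "name": f"{int(tract[:4])}.{tract[4:]}",
    }


for candidate in ("17031020702", "17031220702"):
    parts = split_tract_geoid(candidate)
    print(candidate, parts["tract"], parts["name"])
```

Under that layout, 17031020702 is tract 207.02 in Cook County, while 17031220702 is tract 2207.02 and only happens to contain the same digits.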

Include outreach events

We need to figure out a way to automate the inclusion of outreach events that occur within each ward. The events will likely come from a Google Form; their locations need to be geocoded, matched to the ward they take place in, and then output as a list in the report.

Should we also try to limit that list based on date, i.e. include only events that have occurred in the last week or events that are coming up within the next week? Or should it be a comprehensive list?
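The date-limited option could look like the following Python sketch. It assumes the events have already been geocoded and assigned a ward, and that each event is a record with hypothetical `ward` and `date` fields; the one-week window on each side is a parameter, not a decision.

```python
from datetime import date, timedelta


def events_for_report(events, ward, report_date, days_back=7, days_ahead=7):
    """Select one ward's events within a window around the report date."""
    start = report_date - timedelta(days=days_back)
    end = report_date + timedelta(days=days_ahead)
    selected = [
        event
        for event in events
        if event["ward"] == ward and start <= event["date"] <= end
    ]
    # Oldest first, so past events precede upcoming ones in the report.
    return sorted(selected, key=lambda event: event["date"])
```

A comprehensive list would be the same call with very large `days_back` and `days_ahead` values, so the choice can be deferred to a report setting.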

Technical question about Civis Platform - read vs query

@sherryshenker (or others) what's the difference between these?

civis::read_civis("cic.ward_visualization_table", database="City of Chicago")
civis::query_civis("SELECT * FROM cic.ward_visualization_table;", database="City of Chicago")

For the bigger tables I was hoping to select just the columns we need, but I'm not getting the same results when I execute a query vs read_civis. Is there a way to filter columns with the read_civis function?
