OakHoods

Evolution of Oakland Neighborhoods

## Overview:

I wanted to add to the current discussion on Oakland gentrification by answering a smaller question: How has crime changed across Oakland over time, and how can that add to our understanding of Oakland neighborhoods?

I used unsupervised learning to identify five distinct types of Oakland neighborhoods:

* Tranquil - Low crime areas, including the Oakland hills.
* Quiet Crime - Areas with only nonviolent larceny crimes and auto break ins.
* Transitional - Areas with high larceny, medium violence, and medium levels of quality crimes (such as noise complaints, graffiti, and loitering).
* Violent - Areas characterized by high assaults and quality crimes.
* Auto Break Ins - Areas with very large spikes in auto break ins.

## Process:

Crime incidents tagged with latitude/longitude, timestamps, and crime categories were loaded into a PostgreSQL database with PostGIS (for geospatial support). I aggregated these incidents by US Census block-group shapefiles and by year. My feature table had a row for each distinct area/year combination.

#### Features included:

* Count of each crime type
* Count of each crime type occurring on the weekend
* Count of each crime type occurring in the morning, day, and evening
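The feature list above can be sketched in plain Python. This is only an illustration, not the project's code: the real aggregation runs against the PostGIS database, the `(area_id, timestamp, category)` tuple layout is assumed, and the hour cutoffs in `day_part` are hypothetical because the README does not state which boundaries were used.

```python
from collections import Counter
from datetime import datetime

def day_part(ts: datetime) -> str:
    """Bucket a timestamp into morning/day/evening (hypothetical cutoffs)."""
    if 5 <= ts.hour < 12:
        return "morning"
    if 12 <= ts.hour < 18:
        return "day"
    return "evening"

def build_features(incidents):
    """Return {(area_id, year): Counter of feature counts}.

    `incidents` is an iterable of (area_id, timestamp, category) tuples,
    where area_id stands in for a census block-group identifier.
    """
    table = {}
    for area_id, ts, category in incidents:
        row = table.setdefault((area_id, ts.year), Counter())
        row[category] += 1                        # count of each crime type
        if ts.weekday() >= 5:                     # Saturday is 5, Sunday is 6
            row[f"{category}_weekend"] += 1
        row[f"{category}_{day_part(ts)}"] += 1    # morning / day / evening
    return table

# Made-up incidents, for illustration only.
incidents = [
    ("bg_001", datetime(2009, 3, 7, 9, 30), "larceny"),   # Saturday morning
    ("bg_001", datetime(2009, 3, 9, 22, 0), "larceny"),   # Monday evening
    ("bg_001", datetime(2010, 1, 4, 14, 0), "assault"),   # Monday afternoon
]
features = build_features(incidents)
```

Each key of `features` corresponds to one area/year row of the feature table; a missing feature reads as zero because the rows are `Counter` objects.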

The area/year rows were normalized for all features. I looked at the variance across the set, and found that the count and distribution in time of quality crimes were the most descriptive of changes between neighborhoods, followed by violent crimes.
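As a rough illustration of the normalization step, the sketch below z-scores each feature column using only the standard library. It assumes "normalized" means standardizing each column to zero mean and unit variance, which matches the sklearn scaler named in the code walk-through; the function name and sample rows are made up.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column of `rows` (a list of equal-length numeric lists)."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Population standard deviation; a constant column gets a scale of 1.0
    # so that it maps to zeros instead of dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    scaled = [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]
    return scaled, means, stds

# Three made-up area/year rows with two features each.
rows = [[10, 0], [20, 0], [30, 0]]
scaled, means, stds = standardize(rows)
```

Returning `means` and `stds` alongside the scaled rows makes it possible to apply the same transformation to rows from other years.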

The engineered weekend, morning, day, and evening features did not segment the data in a meaningful way. These features were kept for use in the data visualization and future efforts. PCA was also attempted, but the trade-off in lost interpretability was too great for this application.

I then passed the 2009 data into k-means (initialized with k-means++) and compared the within-group sum of squares for values of k from 1 to 12, which told me that five neighborhood types could be separated in the data (see types above). I re-initialized k-means with many random seeds and observed that the Tranquil, Violent, and Auto Break Ins neighborhoods were robust to initialization. The remaining two clusters changed between seeds. I picked the seed that defines these last two clusters based on how well they separate geographically.

## Code Walk Through:

#### Preprocessing (/database):

* database.py creates the database and tables
* preprocessing_crime.py de-duplicates the crime data and aggregates it into meta-crime categories by using the dictionary defined in crime_dict.py
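The preprocessing step might look roughly like the sketch below. Everything in it is assumed for illustration: the entries in `CRIME_DICT` are hypothetical stand-ins for the mapping in crime_dict.py, the record layout is made up, and treating two records as duplicates only when every field matches is one possible rule, not necessarily the one preprocessing_crime.py uses.

```python
# Hypothetical excerpt; the real mapping lives in crime_dict.py.
CRIME_DICT = {
    "PETTY THEFT": "larceny",
    "GRAND THEFT": "larceny",
    "BURGLARY-AUTO": "auto_break_in",
    "BATTERY": "violent",
    "VANDALISM": "quality",
}

def preprocess(records):
    """De-duplicate raw records and relabel them with meta-crime categories.

    Each record is a (case_id, timestamp, raw_category) tuple. Exact repeats
    are dropped, and records whose raw category has no entry in CRIME_DICT
    are skipped.
    """
    seen = set()
    cleaned = []
    for record in records:
        if record in seen:
            continue
        seen.add(record)
        case_id, timestamp, raw_category = record
        meta = CRIME_DICT.get(raw_category.strip().upper())
        if meta is not None:
            cleaned.append((case_id, timestamp, meta))
    return cleaned

# Made-up records, for illustration only.
records = [
    ("09-001", "2009-01-02T10:00", "Petty Theft"),
    ("09-001", "2009-01-02T10:00", "Petty Theft"),   # exact duplicate
    ("09-002", "2009-01-03T23:15", "BATTERY"),
    ("09-003", "2009-01-04T08:00", "UNKNOWN CODE"),  # not in the mapping
]
cleaned = preprocess(records)
```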

#### Website and clustering (/app):

* app.py calls model.py to retrieve cluster and crime information and displays it via templates/oakland.html
* oakland.html uses JavaScript, the Google Maps API, and the Google Charts API to display crime information for each year and each area
* model.py queries the database and retrieves the feature tables, scales the data (sklearn StandardScaler), performs k-means (sklearn KMeans) with a set random seed (203) and k=5 on the 2009 feature rows, and maps later area/year rows to the 2009 centroids. The clusters and crime information are returned.
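To make the clustering flow concrete, here is a dependency-free sketch of the same idea: fit k-means on base-year rows, compare the within-group sum of squares across values of k, then assign later rows to the nearest base-year centroid. It is not the code in model.py, which uses sklearn. This version uses plain Lloyd's algorithm with random initial centroids rather than k-means++, the data is a made-up toy set with two obvious groups, and all function names are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, k, seed, n_iter=100):
    """Lloyd's algorithm; returns (centroids, labels, within-group sum of squares)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        labels = [nearest(p, centroids) for p in points]
        new_centroids = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[i])  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    wcss = sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))
    return centroids, labels, wcss

# Toy, already-scaled base-year rows with two well-separated groups.
rows_2009 = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
             [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]

# Compare within-group sum of squares across k (the README scans k from 1 to 12).
wcss_by_k = {k: kmeans(rows_2009, k, seed=203)[2] for k in (1, 2, 3)}

# Fit once on the base year, then map later rows to the base-year centroids.
centroids, labels_2009, _ = kmeans(rows_2009, k=2, seed=203)
later_rows = [[0.5, 0.5], [10.5, 10.5]]
later_labels = [nearest(row, centroids) for row in later_rows]
```

Because later rows are assigned to fixed base-year centroids rather than re-clustered, a label means the same thing in every year, which is what lets an area's neighborhood type be tracked over time.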

Resources: ryd.io, ansonwhitmer.tumblr.com/post/76570597222/sf-hoods-project
