geodacenter / data-and-lab

For CSDS: sample data, data cleaning, labs and miscellaneous

Home Page: https://geodacenter.github.io/data-and-lab/

data-and-lab's People

makosak stuartlynn dpeachpeach

data-and-lab's Issues

Chicago Health Indicators

For these data:
https://uchicago.app.box.com/folder/17277393229

add info from final project and extra files to create standard html metadata file.
Prepared by [name of student on final project].

Fix stereo channels in video lectures

We just noticed that these videos from Luc's UChicago fall class were recorded in mono instead of stereo (or one channel, often the left one, is much clearer and louder compared to the other). You can test this with headphones.

Please download these 7 videos:
https://www.youtube.com/playlist?list=PLzREt6r1Nenkr2vtYgbP4hs44HO_s_qEO

and duplicate the louder/clearer channel (e.g. from left to right) so we get the full stereo effect. For example, here's a way to do this in Adobe Premiere Pro, but VLC and similar tools have comparable functionality.

Once the audio is equally clear on both channels, please re-upload to the site. I'll send you the log-in info for our YouTube site separately.

The UChicago AV team will be taping Luc's spring lectures as well on Mondays and Wednesdays, so we'll need to upload two videos per week to the site.
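If it helps to script this instead of doing it by hand, here is a minimal sketch in Python. It assumes the audio track has already been extracted from the video as a 16-bit stereo WAV file (the extraction and re-muxing steps are not shown), and the function name `duplicate_channel` and the file names in the usage note are made up for illustration.

```python
import array
import sys
import wave


def duplicate_channel(src_path, dst_path, keep="left"):
    """Copy one channel of a 16-bit stereo WAV file onto both channels."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        if params.nchannels != 2 or params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit stereo WAV files")
        frames = src.readframes(params.nframes)

    # Stereo samples are interleaved: left, right, left, right, ...
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    offset = 0 if keep == "left" else 1
    kept = samples[offset::2]

    # Write the kept channel into both the left and the right slots.
    out = array.array("h", samples)
    out[0::2] = kept
    out[1::2] = kept
    if sys.byteorder == "big":
        out.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(out.tobytes())
```

Calling `duplicate_channel("lecture.wav", "lecture_fixed.wav", keep="left")` would write a new file in which the right channel is a copy of the left one. It is still worth checking the result with headphones before re-uploading.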

Replace leaflet with kepler.gl

Add videos to YouTube site

Please create two new playlists on our YouTube site and add the videos that you'll soon get a dropbox invite for:

Spatial Autocorrelation
--global_sa folder
--local_sa folder

Spatial Prediction
--spatial prediction folder

The name of the lecture will be on the title slide. Please also add this description: Lecture by Luc Anselin on X (2016). X = name.

They should all be in stereo, but please confirm with headphones.

2015 King County house prices

For this shapefile:
https://uchicago.app.box.com/folder/16339838528

add documentation from here:
https://www.kaggle.com/harlfoxem/housesalesprediction

add .prj file that works with GeoDa basemaps (WGS84)
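A minimal sketch of how the .prj file could be written next to a shapefile. It assumes the coordinates are already longitude/latitude in WGS84; writing a .prj only labels the data and does not reproject anything. The function name and the file name in the usage note are hypothetical.

```python
from pathlib import Path

# ESRI-style WKT for geographic WGS84 (EPSG:4326).
WGS84_PRJ = (
    'GEOGCS["GCS_WGS_1984",'
    'DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],'
    'PRIMEM["Greenwich",0.0],'
    'UNIT["Degree",0.0174532925199433]]'
)


def write_wgs84_prj(shapefile_path):
    """Write a WGS84 .prj file alongside the given .shp file."""
    prj_path = Path(shapefile_path).with_suffix(".prj")
    prj_path.write_text(WGS84_PRJ)
    return prj_path
```

For example, `write_wgs84_prj("house_sales.shp")` would create `house_sales.prj` in the same folder.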

County income and diversity data

Combine two county datasets on income and race:

Income ratio county to state:
https://philpierdo2.carto.com/me
--you can download the data and county shapefile here that makes up the final map

Merge this with these data on racial diversity:
https://www.kaggle.com/mikejohnsonjr/us-counties-diversity-index

and merge with these data from the spreadsheet
Online Data Table 11: County-level life expectancy estimates for men and women, by income quartile
(needs to be extracted and reformatted)

add html documentation for all three sources (standard template)
add prj file (WGS84)
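The merge step could be sketched as below, assuming each of the three sources has first been read into a list of row dictionaries and that they share a county identifier. The key name `fips` and all column names and values in the example are placeholders, not taken from the actual files.

```python
def merge_on_key(tables, key="fips"):
    """Inner-join several lists of row dicts on a shared key column."""
    merged = {row[key]: dict(row) for row in tables[0]}
    for table in tables[1:]:
        lookup = {row[key]: row for row in table}
        # Keep only counties present in every table so far.
        merged = {
            k: {**row, **lookup[k]}
            for k, row in merged.items()
            if k in lookup
        }
    return list(merged.values())


# Placeholder rows for illustration only (not real data).
income = [{"fips": "00001", "income_ratio": "1.0"},
          {"fips": "00002", "income_ratio": "0.9"}]
diversity = [{"fips": "00001", "diversity_index": "0.5"}]
life_expectancy = [{"fips": "00001", "life_exp": "80"},
                   {"fips": "00003", "life_exp": "79"}]

counties = merge_on_key([income, diversity, life_expectancy])
```

An inner join silently drops counties that are missing from any one source, so comparing the row counts before and after the merge is a useful check.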

NYC Nhood ACS data 2008-12

For these data:
https://uchicago.app.box.com/folder/16416376632

add info from spreadsheet to create standard html metadata file
Prepared by Manoradhan Murugesan.

Aug 3 Rendering Issues

Fix styling issues with the Map size
Fix styling issues w/ Map contents
Update react components on react-geoda (Take a look at mouse hover state and change behavior of the map based on that)
Write either a deploy script or a bash script to clean up auto-generated IDs so that deployment is more automated.

NYC Tract ACS data 2008-12

For these data:
https://uchicago.app.box.com/folder/16417002357

add info from spreadsheet to create standard html metadata file
Prepared by Manoradhan Murugesan.

Add data and metadata from labs 1-9 to Box

Add data and metadata from labs 1-9 to Box using the same format as in the html files from NYC and Nepal data.

NYC Contracts Employment NHood

For these data:
https://uchicago.app.box.com/folder/16418855333

add info from spreadsheet to create standard html metadata file

Education

Hi Julie,

Could you add a new page (not published yet) with the slides from Luc's fall and spring courses? They're on Blackboard and Canvas, which I believe you have access to. This will supplement the YouTube video page. Several people have been asking for the slides so they can teach with them (they should be in color PDF format).

If you can list the course, then the topic with the link to the slides, that would be great. Once it's done, we'll link it to this page: https://spatial.uchicago.edu/content/lectures-luc-anselin

Let me know if you have any questions. Thx

NY LEHD data

Use this data:
https://uchicago.app.box.com/folder/16330267658

with this documentation:
https://uchicago.app.box.com/notes/121106246171

to create html files in the standard format.

for html file, last line:
Prepared by Manoradhan Murugesan.

NYC Education 2000

For NYC Education 2000 data:
https://uchicago.app.box.com/folder/16334961930

Create html file that contains the info from the spreadsheet and is formatted based on our template.

Guerry data

For these data:
https://uchicago.app.box.com/folder/17351831412

add info from html/source to create standard html metadata file.

Add headers to dataset pages to enable Google Dataset Search

I discovered in making the YAML headers to document our datasets that it's only one more step to create a dataset header schema (developer documentation here), so that our datasets show up in Google Dataset Search.

I plan to implement this after YAML is set up so that our datasets will be searchable and findable on Google Dataset Search!
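A rough sketch of what that extra step might look like, assuming each dataset's metadata is available as a dictionary (for instance, taken from the YAML header). Google Dataset Search reads schema.org `Dataset` markup embedded as JSON-LD; the function name `dataset_jsonld` and the input keys used here are assumptions for illustration.

```python
import json


def dataset_jsonld(meta):
    """Build a schema.org Dataset JSON-LD <script> block from a metadata dict."""
    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": meta["name"],
        "description": meta["description"],
    }
    # Optional fields are copied only when present.
    for key in ("url", "keywords", "license"):
        if key in meta:
            doc[key] = meta[key]
    if "author" in meta:
        doc["creator"] = {"@type": "Person", "name": meta["author"]}
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"
```

The returned string would go into the page's HTML, for example through the Jekyll layout, so that every dataset page carries its own header.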

Chicago AirBnB Data

Update the documentation for these data:
https://uchicago.app.box.com/folder/16335184228

based on the standard html template (variable list and description, # of variables and observations, etc.).

Setup jekyll env

@dpeachpeach Please reference this issue (#39) in your pull request. Thanks!

Standardize dataset description parameters with YAML header

Currently, there is no standard structure to our data descriptions, which makes it very difficult to use a machine to read the page. I'd like to develop an HTML/CSS template with a YAML header for our data descriptions in posts to standardize our data descriptions. (I can do this by editing the template in initpost.sh I believe - @lixun910 correct me if I'm wrong.)

The idea is to have a YAML header in each post that we fill out, which Jekyll then plugs into an HTML/CSS template with custom tags that is set up to be easily web-scraped and analyzed. This will also make it easier to add and document datasets in the future, without having to worry about formatting/layout.

Here's an example of what I think this would look like:

---
source: Chicago Open Data Portal
author: Luc Anselin
variables: 77
observations: X
# (more YAML parameters...)
---

<h3>Source:</h3>
<p class="source">{{ page.source }}</p>
<h3>Author:</h3>
<p class="author">{{ page.author }}</p>
<!-- (more HTML/CSS...) -->

Note: This idea was inspired by Software Carpentry's workshop website template - see an example here of how this works, and the produced page.

Add population field to SanFran crime data

Please spatially join population data from 2010 Census block-level data, so we have a denominator for the crime data:

https://s3.amazonaws.com/geoda/data/SFCrime_July_Dec2012.zip

and upload to the Box account afterwards.
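As a rough illustration of the spatial join, here is a pure-Python sketch. It assumes the crime records are points and the Census blocks are simple polygons without holes; in practice a GIS tool or library would do this step. The dictionary keys `ring` and `pop` and both function names are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon with these vertices?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does this edge cross the horizontal line through y?
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def join_population(points, blocks):
    """Attach to each point the population of the first block containing it."""
    joined = []
    for x, y in points:
        pop = None  # stays None when no block contains the point
        for block in blocks:
            if point_in_polygon(x, y, block["ring"]):
                pop = block["pop"]
                break
        joined.append({"x": x, "y": y, "pop": pop})
    return joined
```

Points that fall outside every block keep `pop` as `None`, which makes unmatched records easy to spot before uploading.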

R sample data sets

low priority - for later: R sample data

http://origin.rdrr.io/rforge/splm/
splm/data/Insurance.rda
splm/data/RiceFarms.rda
splm/data/itaww.rda
splm/data/riceww.rda
splm/data/usaww.rda

http://origin.rdrr.io/rforge/spdep/
spdep/data/NY_data.rda
spdep/data/afcon.rda
spdep/data/auckland.rda
spdep/data/baltimore.rda
spdep/data/boston.rda
spdep/data/columbus.rda
spdep/data/datalist
spdep/data/eire.rda
spdep/data/elect80.rda
spdep/data/getisord.rda
spdep/data/hopkins.rda
spdep/data/house.RData
spdep/data/huddersfield.rda
spdep/data/nc.sids.rda
spdep/data/oldcol.rda
spdep/data/used.cars.rda
spdep/data/wheat.rda

From Luc:
no need to add afcon. it’s a very small data set and misses some countries, so it looks weird.
it was used as the example in the LISA paper, but the sample size is too small for “modern” use.

a few of them are “classic” (but then also old), such as

Auckland (used in an early paper to illustrate EB smoothing)
Elect80 (US counties election results), used as example for spatial probit, but
from the looks of it, it doesn’t have the 0-1. we have a better one from LeSage
with the 1996 elections (I used in my Brown class — I can produce if we don’t
have it)
house: Lucas county housing data (points) - is originally from LeSage toolbox
NY_Data is Leukemia data from the Waller-Gotway book, as points
(it would be nice to supplement with the actual tract areas as polygons)
but typically these data sets have very few variables, so not that useful.

Generate YAML dataset documentation from a single CSV

It would be nice to be able to regenerate all the YAML documentation pages from one input file (datasets.csv or something, possibly even an interactive Google Sheet), so that any change to the dataset documentation template or to the information about the datasets can be applied automatically instead of being edited by hand on each page.

I will look into how to do this. I think it could mainly be a matter of templating out the shell script that is used to generate new dataset documentation (initdata.sh) so that it works with an input CSV file. I know how to do this with R but will look into how to do it with the shell.
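For illustration only, here is the same idea sketched in Python rather than shell or R. It assumes one CSV row per dataset and a hypothetical `slug` column that names the page; the other column names are whatever the CSV header contains. It does not reproduce what initdata.sh actually does.

```python
import csv
import io
import json


def front_matter(row):
    """Render one CSV row as a YAML front-matter block."""
    lines = ["---"]
    for key, value in row.items():
        # A JSON string is also a valid double-quoted YAML scalar,
        # so json.dumps handles quoting and escaping.
        lines.append(f"{key}: {json.dumps(value)}")
    lines.append("---")
    return "\n".join(lines) + "\n"


def pages_from_csv(csv_text):
    """Map each dataset's slug to its generated front matter."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["slug"]: front_matter(row) for row in reader}
```

Each value in the returned dictionary could then be written to the top of the matching post file. All values come out as quoted strings, so numeric fields such as the variable count would need separate handling if the template expects numbers.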

integrate reactgeoda submodule + script

Just leaving an issue here as a reminder to work on integrating reactgeoda as a submodule and to write a script for build automation.

Update to next.js

An update of the data-and-lab website to a more modern implementation using next.js, while keeping a similar blog structure.

setup github.io website for GeoDa Data and Lab

Please follow the README in the branch "gh-pages" and set up the dev environment for https://geodacenter.github.io/data-and-lab

Get familiar with how to use jekyll (configure, compile and run). We will need to have our own template for posting new pages.

Denver Foreclosures

For these data from a final student project in Luc's class:
https://uchicago.app.box.com/folder/17277381038

add info from final project paper and extra files in folder to create standard html metadata file.
Prepared by [name of student on final project].

2012 and 2016 county presidential election results

For this dataset:
https://www.kaggle.com/joelwilson/2012-2016-presidential-elections

link CSVs to county shapefile (e.g. from nhgis.org)
add documentation from https://www.kaggle.com/joelwilson/2012-2016-presidential-elections
add .prj file to load with basemap in GeoDa (WGS84)

Nepal Data Documentation

add documentation

Sample data template

If you can use this template for the sample data (we'll use the same thing for the labs afterwards):
https://geodacenter.github.io/data-and-lab/

The webpage is in the branch "gh-pages":
https://github.com/GeoDaCenter/data-and-lab/tree/gh-pages

The idea is to add all of the sample data on our current page: https://spatial.uchicago.edu/sample-data
with the new data you put together (https://uchicago.app.box.com/folder/15959242415) and make it easier to access through the new template.

To query it, we need to categorize the data, e.g. by:

map type (polygons or points, later also networks)
time dimension present (cross-section or multi-year)
year(s)
rates (event and base)
size (<100, <5,000, <50,000, 50k+)
country
city
topic (e.g. crime, housing, health, agriculture, etc.)
spatial resolution (e.g. address, block, tract, county, country, ...)
GeoDa analysis type (ESDA, regression, averages chart, etc.: I can add this in the end)
type of example (toy like columbus or real - I can add this in the end, too)

Once we have the tags, we can filter the datasets flexibly. If you can take a look at the repo and the data page (e.g.: https://geodacenter.github.io/data-and-lab/how-to-use/) and readme file to see what questions you have for implementing this, then we can meet and resolve them and decide which details to highlight in the summary (e.g. thumbnail images of the map and table). I also requested accounts for us on our new RCC server so we can store the data there.
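To make the filtering idea concrete, here is a small sketch, assuming each dataset is described by a dictionary of tags along the lines of the categories listed above. The field names, the function name `filter_datasets`, and the example entries and tag values are all illustrative.

```python
def filter_datasets(datasets, **criteria):
    """Return the datasets whose tags match every given criterion."""
    def matches(ds):
        for field, wanted in criteria.items():
            value = ds.get(field)
            # A tag may hold one value or several (e.g. multiple topics).
            if isinstance(value, (list, tuple, set)):
                values = value
            else:
                values = [value]
            if wanted not in values:
                return False
        return True

    return [ds for ds in datasets if matches(ds)]


# Illustrative entries only; the real tags would come from the dataset pages.
catalog = [
    {"name": "columbus", "map_type": "polygons", "topic": ["crime", "housing"],
     "example_type": "toy"},
    {"name": "example_points", "map_type": "points", "topic": ["housing"],
     "example_type": "real"},
]

polygon_crime = filter_datasets(catalog, map_type="polygons", topic="crime")
```

Criteria are combined with AND, so adding more keyword arguments narrows the result.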

NYC Data

Add documentation

Sample data fixes

when you click on any of the red banners, you get a 404 page not found error
right now, the blue banner is too large relative to the map (see attached): Could you make that smaller (same extent as text) and the map bigger?
at https://geodacenter.github.io/data-and-lab/tags/, could you remove the upload date since we only need the name?
let's disable comments at the bottom ("Comments: We were unable to load Disqus. If you are a moderator please see our troubleshooting guide.")
right now, there's no way to actually download the data :) We could add a "Download" link in the blue banner below the dataset name?
