data-and-lab's People

Contributors

acostak, angela-li, lixun910

data-and-lab's Issues

Fix stereo channels in video lectures

We just noticed that these videos from Luc's UChicago fall class were recorded in mono instead of stereo (or one channel, often the left one, is much clearer and louder compared to the other). You can test this with headphones.

Please download these 7 videos:
https://www.youtube.com/playlist?list=PLzREt6r1Nenkr2vtYgbP4hs44HO_s_qEO

and duplicate the louder/clearer channel (e.g. from left to right) so both channels carry the same audio. Adobe Premiere Pro can do this, and VLC and similar tools offer comparable functionality.
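As an alternative to the tools named above, the channel duplication could also be scripted. The sketch below is only an assumption about one workable approach: it builds an ffmpeg command that uses the `pan` audio filter to copy one input channel to both output channels while leaving the video stream untouched. The function name `build_ffmpeg_command` and the `_stereo` output naming are hypothetical, and ffmpeg itself is not mentioned in this issue.

```python
from __future__ import annotations

from pathlib import Path


def build_ffmpeg_command(source: str, channel: str = "left") -> list[str]:
    """Return an ffmpeg command that copies one audio channel to both outputs.

    Assumes the input file has a two-channel audio track.
    """
    if channel not in ("left", "right"):
        raise ValueError("channel must be 'left' or 'right'")
    # In the pan filter, c0 is the first (left) input channel, c1 the second.
    src = "c0" if channel == "left" else "c1"
    src_path = Path(source)
    # Hypothetical naming convention: lecture01.mp4 -> lecture01_stereo.mp4
    target = src_path.with_name(f"{src_path.stem}_stereo{src_path.suffix}")
    return [
        "ffmpeg",
        "-i", str(src_path),
        "-af", f"pan=stereo|c0={src}|c1={src}",  # same source on both outputs
        "-c:v", "copy",                          # do not re-encode the video
        str(target),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(build_ffmpeg_command("lecture01.mp4")))
```

The command is returned as a list so it can be passed to `subprocess.run` once it has been checked on a single video with headphones.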

Once the audio is equally clear on both channels, please re-upload to the site. I'll send you the log-in info for our YouTube site separately.

The UChicago AV team will be taping Luc's spring lecture as well on Mondays and Wednesdays, so we'll need to upload two videos per week to the site.

Add videos to YouTube site

Please create two new playlists on our YouTube site and add the videos that you'll soon get a dropbox invite for:

Spatial Autocorrelation
--global_sa folder
--local_sa folder

Spatial Prediction
--spatial prediction folder

The name of the lecture will be on the title slide. Please also add this description: Lecture by Luc Anselin on X (2016). X = name.

They should all be stereo, but please confirm with headphones.

County income and diversity data

Combine two county datasets on income and race:

Income ratio county to state:
https://philpierdo2.carto.com/me
--you can download the data and county shapefile that make up the final map here

Merge this with these data on racial diversity:
https://www.kaggle.com/mikejohnsonjr/us-counties-diversity-index

and merge with these data from the spreadsheet
Online Data Table 11: County-level life expectancy estimates for men and women, by income quartile
(needs to be extracted and reformatted)

add html documentation for all three sources (standard template)
add prj file (WGS84)
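
The merge described above could be sketched as follows, assuming each of the three sources can be exported to CSV with a county FIPS column. The column names (`fips`, `FIPS`, `cty`, `income_ratio`, `diversity_index`, `le_q1`) and the sample rows are made up for illustration and are not taken from the actual files. The main point is that FIPS codes should be zero-padded to five digits before joining, since spreadsheets often drop the leading zero.

```python
import csv
import io


def read_by_fips(text, fips_column):
    """Index CSV rows by a 5-digit, zero-padded county FIPS code."""
    rows = {}
    for row in csv.DictReader(io.StringIO(text)):
        fips = row.pop(fips_column).strip().zfill(5)
        rows[fips] = row
    return rows


def merge_on_fips(*tables):
    """Inner-join any number of FIPS-indexed tables."""
    shared = set(tables[0]).intersection(*tables[1:])
    merged = {}
    for fips in sorted(shared):
        combined = {}
        for table in tables:
            combined.update(table[fips])
        merged[fips] = combined
    return merged


# Placeholder rows only; these values are NOT real data.
income = read_by_fips("fips,income_ratio\n1001,0.9\n17031,1.1\n", "fips")
diversity = read_by_fips("FIPS,diversity_index\n01001,0.4\n17031,0.7\n", "FIPS")
life_exp = read_by_fips("cty,le_q1\n17031,78.0\n", "cty")

# Only counties present in all three tables survive the inner join.
merged = merge_on_fips(income, diversity, life_exp)
```

An inner join drops counties missing from any source; comparing the merged count against each input is a quick check for unmatched codes.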

Aug 3 Rendering Issues

  • Fix styling issues with the Map size

  • Fix styling issues w/ Map contents

  • Update react components on react-geoda (Take a look at mouse hover state and change behavior of the map based on that)

  • Write either a deploy or a bash script to clean up auto-generated IDs so that deployment is more automated.

Education

Hi Julie,

Could you add a new page (not published yet) with the slides from Luc's fall and spring course? They're on Blackboard and Canvas, which I believe you have access to. This will supplement the YouTube video page: Several people have been asking for the slides so they can teach with them (they should be color pdf format).

If you can list the course, then the topic with the link to the slides, that would be great. Once it's done, we'll link it to this page: https://spatial.uchicago.edu/content/lectures-luc-anselin

Let me know if you have any questions. Thx

Add headers to dataset pages to enable Google Dataset Search

I discovered in making the YAML headers to document our datasets that it's only one more step to create a dataset header schema (developer documentation here), so that our datasets show up in Google Dataset Search.

I plan to implement this after YAML is set up so that our datasets will be searchable and findable on Google Dataset Search!
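
A rough sketch of what such a header could contain, assuming the schema.org `Dataset` type embedded as JSON-LD in the page. The metadata keys and example values below are placeholders, and the helper names `dataset_jsonld` and `as_script_tag` are hypothetical.

```python
import json


def dataset_jsonld(meta):
    """Build a schema.org Dataset description from a metadata dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": meta["name"],
        "description": meta["description"],
        "url": meta["url"],
        "creator": {"@type": "Organization", "name": meta["creator"]},
    }


def as_script_tag(meta):
    """Wrap the JSON-LD in the script tag that goes into the page's HTML."""
    body = json.dumps(dataset_jsonld(meta), indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder metadata; in practice this would come from the post's YAML header.
example = {
    "name": "Example dataset",
    "description": "Placeholder description of an example dataset.",
    "url": "https://geodacenter.github.io/data-and-lab/",
    "creator": "Example organization",
}
tag = as_script_tag(example)
```

Because the fields map one-to-one onto header values, the same YAML that drives the page layout could feed this block.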

Standardize dataset description parameters with YAML header

Currently, there is no standard structure to our data descriptions, which makes it very difficult to use a machine to read the page. I'd like to develop an HTML/CSS template with a YAML header for our data descriptions in posts to standardize our data descriptions. (I can do this by editing the template in initpost.sh I believe - @lixun910 correct me if I'm wrong.)

The idea is to have a YAML header in each post that we fill out, which Jekyll then plugs into an HTML/CSS template with custom tags that is set up to be easily webscraped and analyzed. This will also make it easier to add and document datasets in the future, without having to worry about formatting/layout.

Here's an example of what I think this would look like:

---
source: Chicago Open Data Portal
author: Luc Anselin
variables: 77 
observations: X
(more YAML parameters...)
---

<h3>Source: </h3>
<p class="source"> {{page.source}}</p>
<h3> Author: </h3>
<p class="author"> {{page.author}} </p>
(more HTML/CSS...)

Note: This idea was inspired by Software Carpentry's workshop website template - see an example here of how this works, and the produced page.
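
To illustrate the substitution step, here is a small stand-in for what Jekyll does with the header and template. It is a simplification under stated assumptions: real Jekyll uses a full YAML parser and the Liquid template engine, whereas this sketch only handles flat `key: value` lines and `{{page.key}}` placeholders. The function names are hypothetical.

```python
import re


def parse_front_matter(text):
    """Split a post into (metadata dict, body) using '---' delimiters."""
    _, header, body = text.split("---\n", 2)
    meta = {}
    for line in header.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    return meta, body


def render(body, meta):
    """Replace {{page.key}} placeholders with values from the header."""
    return re.sub(
        r"\{\{\s*page\.(\w+)\s*\}\}",
        lambda m: meta.get(m.group(1), ""),  # unknown keys render as empty
        body,
    )


post = """---
source: Chicago Open Data Portal
author: Luc Anselin
---
<p class="source"> {{page.source}}</p>
<p class="author"> {{page.author}} </p>
"""
meta, body = parse_front_matter(post)
html = render(body, meta)
```

The class attributes on the rendered paragraphs are what make the page easy to scrape: a script can select `p.source` or `p.author` directly.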

R sample data sets

low priority - for later: R sample data

http://origin.rdrr.io/rforge/splm/
splm/data/Insurance.rda
splm/data/RiceFarms.rda
splm/data/itaww.rda
splm/data/riceww.rda
splm/data/usaww.rda

http://origin.rdrr.io/rforge/spdep/
spdep/data/NY_data.rda
spdep/data/afcon.rda
spdep/data/auckland.rda
spdep/data/baltimore.rda
spdep/data/boston.rda
spdep/data/columbus.rda
spdep/data/datalist
spdep/data/eire.rda
spdep/data/elect80.rda
spdep/data/getisord.rda
spdep/data/hopkins.rda
spdep/data/house.RData
spdep/data/huddersfield.rda
spdep/data/nc.sids.rda
spdep/data/oldcol.rda
spdep/data/used.cars.rda
spdep/data/wheat.rda

From Luc:
no need to add afcon. it’s a very small data set and misses some countries, so it looks weird.
it was used as the example in the LISA paper, but the sample size is too small for “modern” use.

a few of them are “classic” (but then also old), such as

  • Auckland (used in an early paper to illustrate EB smoothing)
  • Elect80 (US counties election results), used as example for spatial probit, but
    from the looks of it, it doesn’t have the 0-1. we have a better one from LeSage
    with the 1996 elections (I used in my Brown class — I can produce if we don’t
    have it)
  • house: Lucas county housing data (points) - is originally from LeSage toolbox
  • NY_Data is Leukemia data from the Waller-Gotway book, as points
    (it would be nice to supplement with the actual tract areas as polygons)
    but typically these data sets have very few variables, so not that useful.

Generate YAML dataset documentation from a single CSV

It would be nice to be able to regenerate all the YAML documentation pages from one input file (datasets.csv or something, possibly even an interactive Google Sheet), so that any changes to the dataset documentation template or to the info about the datasets are applied automatically rather than edited by hand.

I will look into how to do this. I think it could mainly be a matter of templating out the shell script that is used to generate the new dataset documentation (initdata.sh) so that it works with an input CSV file. I know how to do this with R but will look into how to do it with the shell.
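
For comparison with the R and shell options, the same idea could look like this in Python. This is only a sketch: the `slug` column, the `.md` file naming, and the function names are assumptions, and the sample row reuses the values from the YAML example issue rather than real catalog data.

```python
import csv
import io


def front_matter(row):
    """Render one CSV row as a YAML front-matter block."""
    lines = ["---"]
    lines += [f"{key}: {value}" for key, value in row.items()]
    lines.append("---")
    return "\n".join(lines) + "\n"


def generate_pages(csv_text, slug_column="slug"):
    """Map each row to a file name and its front-matter text."""
    pages = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        slug = row.pop(slug_column)  # hypothetical column naming the page
        pages[f"{slug}.md"] = front_matter(row)
    return pages


sample = "slug,source,author\nexample,Chicago Open Data Portal,Luc Anselin\n"
pages = generate_pages(sample)
```

Values that contain a colon or other YAML-special characters would need quoting before being written out; the sketch leaves that step out.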

Update to next.js

Update the data-and-lab website to a more modern implementation using next.js, while keeping a similar blog structure.

Sample data template

If you can use this template for the sample data (we'll use the same thing for the labs afterwards):
https://geodacenter.github.io/data-and-lab/

The webpage is in the branch "gh-pages":
https://github.com/GeoDaCenter/data-and-lab/tree/gh-pages

The idea is to add all of the sample data on our current page: https://spatial.uchicago.edu/sample-data
with the new data you put together (https://uchicago.app.box.com/folder/15959242415) and make it easier to access through the new template.

To query it, we need to categorize the data, e.g. by:

  • map type (polygons or points, later also networks)
  • time dimension present (cross-section or multi-year)
  • year(s)
  • rates (event and base)
  • size (<100, <5,000, <50,000, 50k+)
  • country
  • city
  • topic (e.g. crime, housing, health, agriculture, etc.)
  • spatial resolution (e.g. address, block, tract, county, country, ...)
  • GeoDa analysis type (ESDA, regression, averages chart, etc.: I can add this in the end)
  • type of example (toy like columbus or real - I can add this in the end, too)

Once we have the tags, we can filter the datasets flexibly. If you can take a look at the repo and the data page (e.g.: https://geodacenter.github.io/data-and-lab/how-to-use/) and readme file to see what questions you have for implementing this, then we can meet and resolve them and decide which details to highlight in the summary (e.g. thumbnail images of the map and table). I also requested accounts for us on our new RCC server so we can store the data there.
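
A minimal sketch of tag-based filtering, assuming each dataset is described by a flat record whose keys follow the categories listed above. The key names, the two catalog entries, and the function names are illustrative placeholders, not the site's actual data model.

```python
def size_class(n_observations):
    """Bin an observation count into the size categories listed above."""
    if n_observations < 100:
        return "<100"
    if n_observations < 5_000:
        return "<5,000"
    if n_observations < 50_000:
        return "<50,000"
    return "50k+"


def filter_datasets(datasets, **criteria):
    """Keep datasets whose tags match every given key=value pair."""
    return [
        d for d in datasets
        if all(d.get(key) == value for key, value in criteria.items())
    ]


# Placeholder catalog entries, not real datasets.
catalog = [
    {"name": "example_a", "map_type": "polygons", "topic": "crime", "observations": 49},
    {"name": "example_b", "map_type": "points", "topic": "housing", "observations": 25_000},
]
for entry in catalog:
    entry["size"] = size_class(entry["observations"])

polygons = filter_datasets(catalog, map_type="polygons")
small_crime = filter_datasets(catalog, topic="crime", size="<100")
```

Deriving the size tag from the observation count avoids maintaining it by hand as datasets are added.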

Sample data fixes

  • when you click on any of the red banners, you get a 404 page not found error
  • right now, the blue banner is too large relative to the map (see attached): Could you make that smaller (same extent as text) and the map bigger?
  • at https://geodacenter.github.io/data-and-lab/tags/, could you remove the upload date since we only need the name?
  • let's disable comments at the bottom ("Comments: We were unable to load Disqus. If you are a moderator please see our troubleshooting guide.")
  • right now, there's no way to actually download the data :) We could add a "Download" link in the blue banner below the dataset name?

[Attached screenshot: 2017-08-01 at 1:16:12 pm]
