gdsbook / book

This book serves as an introduction to a whole new way of thinking systematically about geographic data, using geographical analysis and computation to unlock new insights hidden within data.

Home Page: https://geographicdata.science

License: Other

Jupyter Notebook 99.96% Dockerfile 0.01% Makefile 0.01% TeX 0.03% CSS 0.01% Python 0.01%
data-science data-analysis-python geographical-information-system geographic-data spatial-analysis spatial-statistics statistics spatial-data-analysis

book's People

Contributors

actions-user, darribas, jeffcsauer, josiahparry, k20shores, ljwolf, sjsrey

book's Issues

Decide license

This will need to be decided before we go public, just making a collective mental note.

convention to refer to earlier chapters' methods/techniques

In local_autocorrelation.ipynb, there is an inline question:

Should we adopt some scheme so as to refer to earlier chapters, as in the case of maps and weights here to improve the flow of the book and reduce any repetition of basic tasks?

I say yes, we should always refer to a previous chapter if we talked about a topic before.

Renaming to do

Opening a quick ticket to keep track of bits that need to be renamed across the book:

  • File name for "Ch.2 - Geographic Thinking" from 02_spatial_data to 02_geographic_thinking
  • File name for "Ch.3 - Spatial Data" from 03_spatial_data_processing to 03_spatial_data
  • Sub-sub-heading on Ch.10 from "Spatial Feature Engineering" to "Spatial Heterogeneity" (now we have a full chapter, I think this fits best as "heterogeneity")

If @ljwolf and @sjsrey agree on this, I'd suggest to do it when there are no other PRs waiting, and then merge right away to avoid conflicts.

edits for clustering and regionalization chapters

Question 6, @darribas writes:

We could recast this question to make it more practical, along the lines @ljwolf does in other chapters. For example, on this one, it could be:

  • Re-run the analysis in the chapter w/ a different set of weights.
  • Compare the resulting clusters visually
  • What are the key differences between the two W's?
  • How do you think such differences affect the final result?
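The comparison asked for in the last two bullets could start from something like the sketch below. It assumes each set of weights is available as a plain dictionary mapping an observation ID to a list of neighbor IDs (the shape of the `neighbors` attribute on a libpysal `W` object). The function name and the toy grid are hypothetical, only there for illustration.

```python
def neighbor_differences(neighbors_a, neighbors_b):
    """Return, per observation, the neighbors found in only one of the two W's."""
    differences = {}
    for obs in sorted(set(neighbors_a) | set(neighbors_b)):
        only_a = set(neighbors_a.get(obs, [])) - set(neighbors_b.get(obs, []))
        only_b = set(neighbors_b.get(obs, [])) - set(neighbors_a.get(obs, []))
        if only_a or only_b:
            differences[obs] = {"only_a": sorted(only_a), "only_b": sorted(only_b)}
    return differences


# Toy example: rook vs. queen contiguity on a 2x2 grid laid out as
#   0 1
#   2 3
rook = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
queen = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}

diffs = neighbor_differences(rook, queen)
```

Observations that show up in `diffs` are the ones whose neighborhoods change between the two W's, which is a reasonable place to look first when the resulting clusters differ.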

Question 7, @darribas writes

This one I think it's pretty hard for an introductory text.

Question 8, @darribas writes

I'm not sure I understand this one

Typos

Hey guys,

Thank you for the wonderful book! Great start. I went through Part 1 and found a few typos, do you welcome pull requests or would you like me to mention these in the Issues?

Thank you!

Cache data used from `osmnx`

Currently we're reading data off OSM on-the-fly via osmnx. It'd be good to have a cache in the repo that allows us to read them locally if connectivity is not an option, and add a note with code in the chapter that it can be read locally.
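One possible shape for such a cache, as a minimal sketch: a generic helper that reads a local file when it exists and only goes to the network otherwise. The names `read_cached` and `street_network` are hypothetical, and the choice of GraphML as the on-disk format is an assumption; the osmnx calls used are `graph_from_place`, `save_graphml` and `load_graphml`.

```python
from pathlib import Path


def read_cached(path, fetch, save, load):
    """Load `path` if it exists; otherwise fetch the data, save it, and return it."""
    path = Path(path)
    if path.exists():
        return load(path)
    data = fetch()  # only reached when no local copy is available
    path.parent.mkdir(parents=True, exist_ok=True)
    save(data, path)
    return data


def street_network(place, path):
    """Hypothetical wrapper: read an OSM street network from a local GraphML cache."""
    import osmnx  # imported lazily so the helper above works without it

    return read_cached(
        path,
        fetch=lambda: osmnx.graph_from_place(place),
        save=lambda graph, p: osmnx.save_graphml(graph, filepath=p),
        load=lambda p: osmnx.load_graphml(filepath=p),
    )
```

With the cached files committed to the repo, a chapter would call `street_network(...)` with a path inside the data folder and never touch the network unless the file is missing.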

Ch. 1 - need to specify tag in docker `pull`

In the Run the book's container section, I would like to suggest that you change the following docker command:

docker pull gdsbook/stack

to:

docker pull gdsbook/stack:3.0

When I run the command that is currently in the book in Windows PowerShell, I receive the following error:

Error response from daemon: manifest for gdsbook/stack:latest not found: manifest unknown: manifest unknown

Per this thread on the Docker forums, I understand that running the command without the 3.0 tag defaults to pulling the image with tag:latest, but the "latest" tag doesn't exist, so I receive that error.

Thanks very much for all your work on this wonderful book!

Docker container error

On a new Ubuntu machine I'm trying to set things up and hitting a Docker error:

Step 13/22 : RUN cd /home/$NB_USER/testbook  && gem install bundler -v 1.17.2  && bundle install
 ---> Running in 412f6a892798
Successfully installed bundler-1.17.2
Parsing documentation for bundler-1.17.2
Installing ri documentation for bundler-1.17.2
Done installing documentation for bundler after 4 seconds
1 gem installed
/usr/lib/ruby/2.5.0/rubygems.rb:289:in `find_spec_for_exe': can't find gem bundler (>= 0.a) with executable bundle (Gem::GemNotFoundException)
        from /usr/lib/ruby/2.5.0/rubygems.rb:308:in `activate_bin_path'
        from /home/jovyan/gems/bin/bundle:23:in `<main>'
The command '/bin/sh -c cd /home/$NB_USER/testbook  && gem install bundler -v 1.17.2  && bundle install' returned a non-zero code: 1
make: *** [Makefile:2: container] Error 1

Not sure why this is now happening - did the container change upstream maybe?

Higher resolution Matplotlib inline images

By default Matplotlib outputs %matplotlib inline images at a resolution of 100 dpi. This results in some blurry images, e.g. Spatial Weights

This could be adjusted globally by adjusting the settings in the matplotlibrc file. The relevant parameters are:

figure.dpi       : 100

or in a script:

import matplotlib as mpl

mpl.rcParams['figure.dpi'] = 100

Alternatively this magic could be used to output an svg instead: %config InlineBackend.figure_format = 'svg'.

Our relationship to the internet

We need to flesh out our relationship to the internet, specifically when running chapters.

Ideally, we'd like people to be able to execute the chapters without the internet. A few issues arise though:

  1. We occasionally have reads of remote data, such as in the Local Autocorrelation chapter (where we read in brexit returns) and the income inequality chapter (where we read in the county rectified polygons). We need a solution to reduce the size of these datasets and ship them locally. Further, we also use osmnx occasionally, and need to save the network outputs and offer a local option when reading them in. For the large polygonal datasets, consider using the polygon simplification from topojson?
  2. We use remote basemaps a ton with contextily. We need to either add an option to contextily that allows for a "failsafe" mode that returns a white basemap if the provider cannot be reached, or ship a cache of basemaps along with the book.
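Until contextily has such an option, the "failsafe" behaviour in point 2 could be approximated on the book's side. The sketch below is only an assumption about how that might look: `add_basemap_failsafe` is a hypothetical helper, not part of contextily, and it wraps `contextily.add_basemap` so that any failure leaves a white background instead of raising.

```python
def add_basemap_failsafe(ax, add_basemap=None, **kwargs):
    """Try to draw a web basemap on `ax`; fall back to a white background.

    Returns True if the basemap was drawn and False if the fallback was used.
    """
    if add_basemap is None:
        import contextily  # imported lazily; only needed for the default

        add_basemap = contextily.add_basemap
    try:
        add_basemap(ax, **kwargs)
        return True
    except Exception:
        # Provider unreachable (or any other failure): keep the figure usable
        ax.set_facecolor("white")
        return False
```

Catching every exception is deliberately blunt; a stricter version would only catch connection errors so that genuine bugs in the plotting code still surface.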

delabel, don't de-axis

The axis can be useful for multi-facet visualizations: if you want to label every row or every column, removing the axis from each facet means you can't use set_xlabel or set_ylabel.

I use this function to delabel the axis, meaning that you remove ticklabels & ticks, but keep the actual bounding box.

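import numpy


def delabel(ax):
    # Arrays of axes (e.g. from plt.subplots): delabel each one, keep the shape
    if isinstance(ax, numpy.ndarray):
        orig_shape = ax.shape
        result = numpy.asarray([delabel(ax_) for ax_ in ax.flatten()])
        return result.reshape(orig_shape)
    # Single axis: drop ticks and tick labels but keep the bounding box
    ax.set_xticks([])
    ax.set_xticklabels([])
    ax.set_yticks([])
    ax.set_yticklabels([])
    return ax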

Add packages to the docker

In order to use the tools in pointpats for the Points chapter (#28), we need to add the pointpats package to the docker.

loss of bookdata.py

In the reorganization, we lost the bookdata.py file that was used in the weights chapter to read the datasets. We will need to restore it from a previous version.

Permissions error with Docker

This is a hack I use to remap user (and group) IDs on Docker:

docker run -ti --user root -e NB_UID=1001 -e NB_GID=100 -p 8888:8888 -v /home/dani:/home/jovyan/work darribas/gds start.sh

I "think" this remaps the host UID into 1001 and host group ID into 100.
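Rather than hard-coding the IDs, the same command could be assembled from the IDs of whoever runs it. This is a sketch only: `docker_run_command` is a hypothetical helper, it assumes a Unix-like host (where `os.getuid` and `os.getgid` exist), and it reuses the image name, port, and mount point from the command above.

```python
import os


def docker_run_command(host_dir, image="darribas/gds", uid=None, gid=None, port=8888):
    """Build a `docker run` command that passes user/group IDs to the container."""
    # Default to the IDs of the user invoking the script (Unix-like hosts only)
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "-ti",
        "--user", "root",
        "-e", f"NB_UID={uid}",
        "-e", f"NB_GID={gid}",
        "-p", f"{port}:{port}",
        "-v", f"{host_dir}:/home/jovyan/work",
        image, "start.sh",
    ]
```

The returned list can be passed straight to `subprocess.run`; calling it with `uid=1001, gid=100` reproduces the command shown above.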

CHAPTER: Points

Issue set up to track "Chapter 9: Point Pattern Analysis"

Fix belab on book site

The belab button now correctly builds the computational backend (through the Docker setup for Binder, I think), but it's not properly set up for code to run as expected. Mainly, the belab kernel drops you in the home directory of the repository, not in content/notebook, as each notebook (and Binder now) expects.

unable to pull "the image"

I tried "docker pull gdsbook/stack", but the result was as follows: "Using default tag: latest
Error response from daemon: manifest for gdsbook/stack:latest not found: manifest unknown: manifest unknown"

Thank you for your assistance. db

Add new dataset: GHSL

It'd be really cool to add an additional raster with non-traditional raster info. I'm thinking the GHSL for population. This would fit well in:

  • Ch.3 (#23): manipulating objects as surfaces
  • Ch.4 (#26) : weights from rasters
  • Ch.7 (#25): local statistics from a raster (extension since chapter is short)

I'm happy to add it myself. As for regions, I was thinking of using somewhere in Latin America, perhaps Sao Paulo (Brazil)?
