jbaileyh / geogrid


Turning geospatial polygons into regular or hexagonal grids. For other similar functionality see the tilemaps package https://github.com/kaerosen/tilemaps

License: Other

R 41.01% C++ 58.99%

geogrid's Introduction

Algorithmic tessellation with geogrid

Joseph Bailey 2018-12-07


geogrid

Turn geospatial polygons like states, counties or local authorities into regular or hexagonal grids automatically.

Intro

Using geospatial polygons to portray geographical information can be a challenge when polygons are of different sizes. For example, it can be difficult to ensure that larger polygons do not skew how readers retain or absorb information. As a result, many opt to generate maps that use consistent shapes (i.e. regular grids) to ensure that no specific geography is emphasised unfairly. Generally there are four reasons that one might transform geospatial polygons to a different grid (or geospatial representation):

  1. We may use cartograms to represent the number of people (or any value) within a particular geography. For more information and examples see here and here. This cartogram approach changes the size of a particular geography in line with the values that one seeks to visualise.
  2. We may use grids to bin data and typically visualise the spatial density of a particular variable. For an example see here.
  3. We may use grids to segment a geographical region. For example, tessellation can be used in biological sampling or even in generating game environments.
  4. We may use grids to ‘fairly’ represent existing geographical entities (such as US states, UK local authorities, or even countries in Europe). For an example of representing US states as both regular and hexagonal grids see here.

The link in bullet 4 provides an excellent introduction to the notion of tessellation and its challenges. Interestingly, the eventual generation of hexagonal and regular grids demonstrated in the article was done manually. This can be very time-consuming, and though it may stimulate some fun discussion, wouldn’t it be great to do it automatically?

Recent functionality for representing US states, European countries and World countries in a grid has been made available for ggplot2 here, and there are many other great examples of hand-specified or bespoke grids. The challenge is that if you have a less commonly used geography, it might be hard to find a hand-specified or bespoke grid for your area of interest.

What I wanted to do with geogrid is make it easier to generate these grids in ways that might be visually appealing, and then assign the original geographies to their gridded counterparts in a way that makes sense. Given an input of geospatial polygons, geogrid will generate either a regular or hexagonal grid and then assign each of the polygons in your original file to that new grid.

Idea

There are two steps to using geogrid:

  1. Generate a regular or hexagonal grid of your choice. There are lots of different arrangements of these grids, so choosing one with the calculate_grid function and varying the seed is a good place to start.
  2. Use the Hungarian algorithm to efficiently calculate the assignments from the original geography to the new geography. This involves identifying the solution where the total distance between the centroid of every original geography and its new centroid on the grid is minimised. For this I have included a previous implementation of the Hungarian algorithm kindly made available here. Huge thanks to Lars Simon Zehnder for this implementation. A minimal sketch of this assignment problem is shown just after this list.
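
To make the assignment step concrete, here is a minimal sketch of the underlying assignment problem. It does not use geogrid's bundled C++ implementation; it solves a small, made-up centroid cost matrix with the clue package, purely to illustrate what minimising the total centroid-to-centroid distance means.

# Illustration only: geogrid bundles its own C++ Hungarian solver.
# Three made-up original centroids and three made-up grid centroids (x, y).
library(clue)  # provides solve_LSAP(), a linear sum assignment solver

original_centroids <- matrix(c(0, 0,  1, 0,  0, 1), ncol = 2, byrow = TRUE)
grid_centroids     <- matrix(c(0.1, 0.1,  0.9, 0.2,  0.2, 1.1), ncol = 2, byrow = TRUE)

# Cost matrix: distance from every original centroid to every grid centroid
cost <- as.matrix(dist(rbind(original_centroids, grid_centroids)))[1:3, 4:6]

# One-to-one assignment that minimises the total distance
assignment <- solve_LSAP(cost)
assignment
sum(cost[cbind(seq_along(assignment), as.integer(assignment))])  # total minimised cost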

Example

This is a basic example which shows how the assignment of London boroughs could work.

library(geogrid)
library(sf)
library(tmap)

input_file <- system.file("extdata", "london_LA.json", package = "geogrid")
original_shapes <- st_read(input_file) %>% st_set_crs(27700)
original_shapes$SNAME <- substr(original_shapes$NAME, 1, 4)

For reference, let’s see how London’s local authorities are actually bounded in real space. In this example, I have coloured each polygon based on its area. Brighter polygons are larger.

rawplot <- tm_shape(original_shapes) + 
  tm_polygons("HECTARES", palette = "viridis") +
  tm_text("SNAME")
rawplot

So, let’s turn this into a grid to stop places like Bromley, Hillingdon and Havering from stealing our attention. First of all, we can generate a number of different grids by varying the seed. Since there are many ways to dissect the outer boundary of the polygons, you might want to choose an output that appeals to you. I’d recommend looking at different seed values and choosing the one whose outline you like best.

The calculate_grid function takes a SpatialPolygonsDataFrame or sf object, a learning rate (a suggested starting value is 0.03), a grid type (hexagonal or regular) and a seed value. Let’s have a look at some hexagonal grid options for the London local authorities:

par(mfrow = c(2, 3), mar = c(0, 0, 2, 0))
for (i in 1:6) {
  new_cells <- calculate_grid(shape = original_shapes, grid_type = "hexagonal", seed = i)
  plot(new_cells, main = paste("Seed", i, sep = " "))
}

Let’s also look at things with a regular grid:

par(mfrow = c(2, 3), mar = c(0, 0, 2, 0))
for (i in 1:6) {
  new_cells <- calculate_grid(shape = original_shapes, grid_type = "regular", seed = i)
  plot(new_cells, main = paste("Seed", i, sep = " "))
}

As we can see, there are lots of options. Now, let’s choose a grid and assign our existing places to it. I happen to like both grids that have a seed of 3, so I’m going to assign the polygons to those grids. Let’s do that and see what they look like compared to the original.

new_cells_hex <- calculate_grid(shape = original_shapes, grid_type = "hexagonal", seed = 3)
resulthex <- assign_polygons(original_shapes, new_cells_hex)

new_cells_reg <- calculate_grid(shape = original_shapes, grid_type = "regular", seed = 3)
resultreg <- assign_polygons(original_shapes, new_cells_reg)

Now that we have an example transfer from real space to grid space, we can visualise it.

hexplot <- tm_shape(resulthex) + 
  tm_polygons("HECTARES", palette = "viridis") +
  tm_text("SNAME")

regplot <- tm_shape(resultreg) + 
  tm_polygons("HECTARES", palette = "viridis") +
  tm_text("SNAME")

tmap_arrange(rawplot, hexplot, regplot, nrow = 3)

Details

The package has two major functions:

  1. calculate_grid(): given your input polygons, this will generate the grid as specified by your arguments (a snippet showing both calls together follows this list):
    • shape: the original polygons.
    • learning_rate: the rate at which the gradient descent finds the optimum cell size, to ensure that your gridded points fit within the outer boundary of the input polygons.
    • grid_type: either regular for a square grid or hexagonal for a hexagonal grid.
    • seed: the seed, to ensure you get the same grid output each time.
  2. assign_polygons(): this will assign the original polygons to their new locations on the grid generated by calculate_grid(). It finds the solution that minimises the sum of the total distance between the original polygon centroids and the eventual gridded centroids. Arguments:
    • shape: the original polygons.
    • new_polygons: the output (a list) from calculate_grid().
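
For reference, here is a short sketch with every argument written out explicitly; the argument values themselves are illustrative only.

# Illustrative values only; see the London example above for a full workflow.
new_cells <- calculate_grid(
  shape         = original_shapes,  # sf or SpatialPolygonsDataFrame input
  learning_rate = 0.03,             # suggested starting value
  grid_type     = "hexagonal",      # or "regular"
  seed          = 3                 # controls which grid arrangement you get
)
result <- assign_polygons(
  shape        = original_shapes,   # the original polygons
  new_polygons = new_cells          # the list returned by calculate_grid()
)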

TODO

  • Assignment may not always work. Check assign_polygons(): why does it only work sometimes?
  • Make it work (done I think), make it right (not yet), make it fast (not yet).
  • Improve the cellsize calculation methodology.
  • Get someone to answer this Stack Overflow question.

Notes

This is my first attempt at a package. If it doesn’t work, I’d like suggestions for improvements, and thanks in advance for providing them!

I welcome critique and feedback. Blog post to follow.

Thanks

I read a lot of the work by Hadley Wickham, Jenny Bryan, Thomas Lin Pedersen, Mara Averick and Bob Rudis, to name a few. I also love the R community and learn a huge amount from R Bloggers.

Extra thanks go to Ryan Hafen for making this package publishable.

Other examples

From others:

Simon Hailstone has looked at male life expectancy in the South East region of England using the package. Thanks for using it, Simon!

From me:

This time using the contiguous USA. Again, I varied the seed and chose grids that I liked, but I’d recommend you do the same.

input_file2 <- system.file("extdata", "states.json", package = "geogrid")
original_shapes2 <- st_read(input_file2) %>% st_transform(2163)
original_shapes2$SNAME <- substr(original_shapes2$NAME, 1, 4)
  
rawplot2 <- tm_shape(original_shapes2) +
  tm_polygons("CENSUSAREA", palette = "viridis") +
  tm_text("SNAME")

Let’s check the seeds again.

par(mfrow = c(2, 3), mar = c(0, 0, 2, 0))
for (i in 1:6) {
  new_cells <- calculate_grid(shape = original_shapes2, grid_type = "hexagonal", seed = i)
  plot(new_cells, main = paste("Seed", i, sep = " "))
}

par(mfrow = c(2, 3), mar = c(0, 0, 2, 0))
for (i in 1:6) {
  new_cells <- calculate_grid(shape = original_shapes2, grid_type = "regular", seed = i)
  plot(new_cells, main = paste("Seed", i, sep = " "))
}

Now that we’ve seen some seed demos, let’s assign them…

new_cells_hex2 <- calculate_grid(shape = original_shapes2, grid_type = "hexagonal", seed = 6)
resulthex2 <- assign_polygons(original_shapes2, new_cells_hex2)

new_cells_reg2 <- calculate_grid(shape = original_shapes2, grid_type = "regular", seed = 4)
resultreg2 <- assign_polygons(original_shapes2, new_cells_reg2)

hexplot2 <- tm_shape(resulthex2) + 
  tm_polygons("CENSUSAREA", palette = "viridis") +
  tm_text("SNAME")

regplot2 <- tm_shape(resultreg2) + 
  tm_polygons("CENSUSAREA", palette = "viridis") +
  tm_text("SNAME")

tmap_arrange(rawplot2, hexplot2, regplot2, nrow = 3)

Likewise, you can try the Bay Area…

input_file3 <- system.file("extdata", "bay_counties.geojson", package = "geogrid")
original_shapes3 <- st_read(input_file3) %>% st_transform(3310)
original_shapes3$SNAME <- substr(original_shapes3$county, 1, 4)
  
rawplot3 <- tm_shape(original_shapes3) +
  tm_polygons(col = "gray25") +
  tm_text("SNAME")

new_cells_hex3 <- calculate_grid(shape = original_shapes3, grid_type = "hexagonal", seed = 6)
resulthex3 <- assign_polygons(original_shapes3, new_cells_hex3)

new_cells_reg3 <- calculate_grid(shape = original_shapes3, grid_type = "regular", seed = 1)
resultreg3 <- assign_polygons(original_shapes3, new_cells_reg3)

hexplot3 <- tm_shape(resulthex3) + 
  tm_polygons(col = "gray25") +
  tm_text("SNAME")

regplot3 <- tm_shape(resultreg3) + 
  tm_polygons(col = "gray25") +
  tm_text("SNAME")

tmap_arrange(rawplot3, hexplot3, regplot3, nrow = 3)

geogrid's People

Contributors

apoorv74, edavidaja, hafen, jbaileyh, nowosad, sassalley


geogrid's Issues

IDs are not assigned in the correct location

I am using geogrid to convert polygons to hexagons. The resulting map looks slightly off but reasonable for most regions. My main issue is that the IDs of the newly created hexagons are completely off: some polygons from the north west end up in the south east or in central regions, and so on. Any ideas? I can provide the shapefile if needed.

Get path to test data using system.file

Not sure what the rep is doing here:

#In the working directory of the package.
input_file <- rep("inst/extdata/london_LA.json")

but:

input_file = system.file("inst/extdata/london_LA.json",package="hexmapr")

will remove the "In the working directory of the package" requirement and work anywhere.
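
For reference, the current README uses the installed-package form, in which the inst/ prefix is dropped at install time, so the call works from any directory:

input_file <- system.file("extdata", "london_LA.json", package = "geogrid")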

Porting to python

I am in charge of implementing a tool to convert polygons from shapefiles (or other geo data formats) to equal-area cartogram representations. I stumbled over your R implementation and it looks exactly like what I want to achieve.

Would it be OK with you if I port parts of your code to a Python package? I am asking because the License file is mostly empty :) Of course I would reference this repo here.

Thanks in advance

Error

What is wrong?

RcppExports.cpp:4:27: fatal error: RcppArmadillo.h: No such file or directory (Arquivo ou diretório não encontrado)
compilation terminated.
/usr/lib/R/etc/Makeconf:168: recipe for target 'RcppExports.o' failed
make: *** [RcppExports.o] Error 1
ERROR: compilation failed for package ‘geogrid’

  • removing ‘/home/gabriel/R/x86_64-pc-linux-gnu-library/3.4/geogrid’
    Warning in install.packages :
    installation of package ‘geogrid’ had non-zero exit status

The downloaded source packages are in
‘/tmp/Rtmpae5c16/downloaded_packages’

Session Info:

R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
[1] LC_CTYPE=pt_BR.UTF-8 LC_NUMERIC=C LC_TIME=pt_BR.UTF-8 LC_COLLATE=pt_BR.UTF-8
[5] LC_MONETARY=pt_BR.UTF-8 LC_MESSAGES=pt_BR.UTF-8 LC_PAPER=pt_BR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=pt_BR.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] dplyr_0.7.6 ggplot2_3.0.0

loaded via a namespace (and not attached):
[1] Rcpp_0.12.18 rstudioapi_0.7 bindr_0.1.1 magrittr_1.5 devtools_1.13.5 tidyselect_0.2.4
[7] munsell_0.5.0 colorspace_1.3-2 R6_2.2.2 rlang_0.2.2 plyr_1.8.4 tools_3.4.4
[13] grid_3.4.4 gtable_0.2.0 withr_2.1.2 yaml_2.1.19 lazyeval_0.2.1 digest_0.6.17
[19] assertthat_0.2.0 tibble_1.4.2 crayon_1.3.4 bindrcpp_0.2.2 purrr_0.2.5 memoise_1.1.0
[25] glue_1.3.0 compiler_3.4.4 pillar_1.3.0 scales_1.0.0 pkgconfig_2.0.2
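
A likely cause is that the RcppArmadillo headers are not installed on the machine; installing them (and Rcpp) before retrying usually resolves this class of compilation error. This is a suggested fix, not part of the original report.

# Suggested fix (not from the original report): install the missing headers first
install.packages(c("Rcpp", "RcppArmadillo"))
install.packages("geogrid")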

Function clean() not found

Hi Joseph,

I'm planning on using geogrid for a new project, but I couldn't manage to plot the final results. I'm adapting your sample code, and while calculate_grid() and assign_polygons() both work perfectly fine, clean() results in an error (Error in clean(resulthex) : could not find function "clean").

Do you have any ideas on how to fix/bypass this? Thanks in advance! :-)
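
clean() appears to have been a helper function defined in an earlier version of the README rather than a function exported by the package, which is why R cannot find it. The issue "Is there a realistic limit for number of polygons" further down this page reproduces its definition; it is copied here for convenience and assumes broom and dplyr are attached.

# Helper reproduced from the "Is there a realistic limit for number of polygons" issue below.
# Not exported by geogrid; requires broom::tidy() and dplyr::inner_join().
library(broom)
library(dplyr)
clean <- function(shape) {
  shape@data$id <- rownames(shape@data)
  shape.points <- tidy(shape, region = "id")
  shape.df <- inner_join(shape.points, shape@data, by = "id")
}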

Use a projected coordinate system in the states.json example file

Hey @jbaileyh, I would suggest using a projected coordinate system in the states.json example file. A comparison between a geographic and a projected coordinate system is attached below. You can read more about projections at https://source.opennews.org/articles/choosing-right-map-projection/.

library(geogrid)
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.1.4, proj.4 4.9.3
input_file = system.file("extdata", "states.json", package = "geogrid")
us = st_read(input_file)
#> Reading layer `OGRGeoJSON' from data source `/home/jn/R/x86_64-redhat-linux-gnu-library/3.4/geogrid/extdata/states.json' using driver `GeoJSON'
#> Simple feature collection with 49 features and 5 fields
#> geometry type:  MULTIPOLYGON
#> dimension:      XY
#> bbox:           xmin: -124.7332 ymin: 24.5447 xmax: -66.9499 ymax: 49.38436
#> epsg (SRID):    4326
#> proj4string:    +proj=longlat +datum=WGS84 +no_defs
plot(st_geometry(us))

us_2163 = st_transform(us, 2163)
plot(st_geometry(us_2163))

Created on 2018-05-01 by the reprex package (v0.2.0).

Making groups on cells when assign polygons

Hi, I'm trying to visualize an election map of South Korea.
There are provinces and cities, and each province or city has a number of electoral districts (each represented as a cell).

[plot]

I tried the full map of South Korea and coloured it by province or city for convenience. As you can see, some brown cells are apart from the main body of their city. I would like cells belonging to the same city to be adjacent to each other.

Can I group cells in the same city and make them adjacent? Or do you have any ideas that I could implement?

Allow user to use squared distance for optimisation

Hey,

Great package! I think it would be useful if the user could specify squared distance (or other functions of distance) for the assignment optimisation. I am using the package on a ~400-polygon object and I'm getting good overall spatial accuracy at the cost of a few very large errors.

Many thanks!
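
A rough sketch of the idea (an illustration only, not an existing geogrid option): square the centroid cost matrix before it is handed to the assignment solver, which penalises large individual errors more heavily.

# Sketch only: not an existing geogrid argument.
# costmatrix is the original-to-grid centroid distance matrix that
# assign_polygons() builds internally (see the timing reprex further down this page).
squared_costmatrix <- costmatrix^2
# hungarian_costmin <- hungarian_cc(squared_costmatrix)  # then solve on the squared costs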

Broken package for geographical coordinates

An SO user contacted me with this issue:

library(raster)
g <- getData(name = "GADM", country = "GBR", level = 2)
library(hexmapr)
h <- calculate_cell_size(shape = g, seed = 1,
                         shape_details = get_shape_details(g), 
                         learning_rate = 0.03, grid_type = 'hexagonal')
i <- assign_polygons(shape = g, new_polygons = h)
plot(i)
library(maptools)
j <- unionSpatialPolygons(SpP = i, IDs = rep(1, length(i)))
plot(j)

I replied that your assign_polygons() function forces spDists to use planar coordinates (lots of warnings), and that polygon boundaries do not seem to be correctly assigned. It also takes much longer than:

g1 <- rgdal::spTransform(g, CRS("+init=epsg:27700"))
i2 <- HexPoints2SpatialPolygons(spsample(g1, type="hexagonal",
  cellsize=38309)) # cellsize from calculate_cell_size
j2 <- unionSpatialPolygons(SpP = i2, IDs = rep(1, length(i2)))

which does not have the unconnected polygons.

How can I replace get_shape_details for a shapefile?

I have a shapefile rather than a JSON file, so when I use st_read, get_shape_details doesn't work (it is associated with read_polygons, but read_polygons doesn't read .shp files). What command should I use to get the details?

Thank you in advance!
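
For reference, sf::st_read() reads ESRI shapefiles as well as GeoJSON, and the current geogrid API accepts sf objects directly, so a sketch like the following should work; the path is a placeholder.

# Sketch: read a shapefile with sf and pass it straight to calculate_grid()
library(sf)
library(geogrid)
my_shapes <- st_read("path/to/my_shapefile.shp")  # placeholder path
new_cells <- calculate_grid(shape = my_shapes, grid_type = "hexagonal", seed = 1)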

Output format

This is a great project and something I've been looking for for a while. Thanks for sharing it.

At the Open Data Institute Leeds node (ODILeeds) we defined a JSON format for sharing hex-based maps. It is quite basic but lets people add extra data to their hexes too. We've already created some hex maps (by hand) including UK constituencies and ward maps for some councils. People have also made tools for D3.js and R to read maps saved in the format. It'd be great if hexmapr could save generated hex maps in the HexJSON format.

remove binaries from repository

The src directory has some binaries (.so and .o) which you should probably delete. I suspect they may be messing up my install:

 unable to load shared object `/library/hexmapr/libs/hexmapr.so':
  /library/hexmapr/libs/hexmapr.so: invalid ELF header

Add them to your .gitignore

polygons too few

Hello,

How do I change the command if I am told that I have too few (small) polygons? Thanks, Martin

for (i in 1:6) {
  new_cells <- calculate_cell_size(original_shapes, original_details, 0.05, 'hexagonal', i)
  plot(new_cells[[2]], main = paste("Seed", i, sep = " "))
}

[1] 2
[1] 197.2832
[1] "too few polygons"
[1] 1
[1] 197.0859
[1] "too few polygons"

Error in .local(x, n, type, ...) :
  iteration did not converge; try enlarging argument iter

Get CRAN-ready

There are several things to address with R CMD check to get this package ready for release on CRAN. I'm happy to do all of these things and issue a PR.

  • Reduce the number of Depends (ideally to zero) and change them to Imports and handle appropriately when used
  • Fully document functions
  • Use roxygen2 to handle function documentation, imports, etc.
  • Some minimal tests

Regardless of CRAN release, it would still be good to have these updates in place so that other packages can depend on this package without issue. If you are okay with me working on these, let me know and I'll get started.

Update Geogrid to work on R 3.3.3

Hello,

I just tried installing geogrid and got the following error:

Warning in install.packages :
  package ‘geogrid’ is not available (for R version 3.3.3)

Could you update the package on CRAN to allow for downloads in newer R versions please?

Error in get_shape_details_internal(shape) :

I'm getting the following error when trying to follow along with the example documented here.

Error in get_shape_details_internal(shape) : 
  trying to get slot "bbox" from an object (class "sf") that is not an S4 object

I'm currently working with a fresh install of R studio and only have the libraries either referenced in the example or the dependents of those libraries installed.
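
This error typically arises when an sf object is passed to a function that expects an sp (S4) object; older releases of the package only handled sp classes. A common workaround, sketched under that assumption, is to convert the sf object first (shape_sf below is a placeholder name).

library(sf)
shape_sp <- as(shape_sf, "Spatial")  # shape_sf: your sf object (placeholder name)
# then pass shape_sp to the sp-based functions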

create maps from sf object

Is it possible (or could you enhance the package so) that the hex and regular maps can be created from the starting point of a data.frame with a simple feature geometry list-column?

Examples: clean() error: no method for coercing S4 class to vector

Hi there,

Thank you for making this package available. Trying to run through the example and encounter the below:

> result_df_raw <- clean(raw)
Error in as.vector(x) : no method for coercing this S4 class to a vector

Session info:

R version 3.4.2 (2017-09-28)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] hexmapr_0.0.0.1   packrat_0.4.8-1   gridExtra_2.3     units_0.4-6       udunits2_0.13     sf_0.5-5          rgeos_0.3-26     
 [8] rgdal_1.2-15      sp_1.2-5          Rcpp_0.12.13      magrittr_1.5      DBI_0.7           dplyr_0.7.4       plyr_1.8.4       
[15] viridis_0.4.0     viridisLite_0.2.0 ggplot2_2.2.1     devtools_1.13.3  

loaded via a namespace (and not attached):
 [1] compiler_3.4.2   bindr_0.1        class_7.3-14     tools_3.4.2      digest_0.6.12    memoise_1.1.0    tibble_1.3.4    
 [8] gtable_0.2.0     lattice_0.20-35  pkgconfig_2.0.1  rlang_0.1.2      yaml_2.1.14      bindrcpp_0.2     e1071_1.6-8     
[15] withr_2.0.0      classInt_0.1-24  grid_3.4.2       glue_1.1.1       R6_2.2.2         scales_0.5.0     assertthat_0.2.0
[22] colorspace_1.3-2 lazyeval_0.2.0   munsell_0.4.3

Slow processing with assign_polygons() (hungarian_costmin()) for 3 MB shapefile

Here's a reprex using a larger shapefile with US congressional districts. Based on a single polygon taking 10 seconds to pass through hungarian_costmin(), this would take over 80 minutes to render the full cartogram. I've already reduced the shapefile size from 65 MB to 3 MB, but is there a way the Hungarian algorithm can be optimized to run faster during grid assignment?

library(geogrid)
library(sf)
library(tidyverse)
library(tmap)

# Download and read example data
tmp_shps <- tempfile()
tmp_dir <- tempdir()
download.file(
  "https://www2.census.gov/geo/tiger/TIGER2016/CD/tl_2016_us_cd115.zip",
  tmp_shps
)
unzip(tmp_shps, exdir = tmp_dir)
us_shp <- st_read(paste(tmp_dir, "tl_2016_us_cd115.shp", sep = "/"))
  # > 64394144 bytes
shp <- us_shp %>% 
  filter(!STATEFP %in% c('15', '02', '60', '72', '69', '78', '66')) %>% # just mainland
  as('Spatial') %>%
  rmapshaper::ms_simplify() %>% # simplification using rmapshaper. reduces size from 65 MB to 3 MB
  st_as_sf()

# shp %>% object.size()
# > 3594432 bytes

# commented out: initial rawplot as in README example #
# substrRight <- function(x, n){
# substr(x, nchar(x)-n+1, nchar(x))
# }
# shp$SNAME <- paste(shp$STATEFP %>% as.character(), 
#                   substrRight(shp$NAMELSAD %>% as.character(), 2), 
#                               sep = '_') # get congr distr #
# rawplot <- tm_shape(shp) +
#   tm_polygons() +
#   tm_text("SNAME")
# rawplot

# commented out: exploring different grid layouts, settled on hexgrid seed 6 #
# par(mfrow = c(2, 3), mar = c(0, 0, 2, 0))
# for (i in 1:6) {
#   new_cells <- calculate_grid(shape = original_shapes, grid_type = "hexagonal", seed = i)
#   plot(new_cells, main = paste("Seed", i, sep = " "))
# }

# Calculate Grid
    new_cells_hex <- calculate_grid(shape = shp, grid_type = "hexagonal", seed = 6)

# Assign Polygons
    ## assign_polygons() doesn't complete here, sometimes crashes. takes extremely long time to run
    # resulthex3 <- assign_polygons(shp, new_cells_hex) 
    #####################

##
# Where the delay is: hungarian_costmin within assign_polygons()
##

# within assign_polygons.sf, which converts to spdf, and uses
  # assign_polygons.SpatialPolygonsDataFrame()
# running equivalent contents line by line, just 1st element / polygon within loop

  shape <- shp %>% as('Spatial')
  original_points <- rgeos::gCentroid(shape, byid = TRUE)
  shape@data$CENTROIX <- original_points$x
  shape@data$CENTROIY <- original_points$y
  shape@data$key_orig <- paste(original_points$x, original_points$y, 
                               sep = "_")
  
  new_polygons <- new_cells_hex
  new_points <- new_polygons[[1]]
  vector_length <- length(shape)
  new_polygons2 <- new_polygons[[2]]
  s_poly <- sp::SpatialPolygonsDataFrame(new_polygons2, as.data.frame(sp::coordinates(new_polygons2)))
  s_poly$key_new <- paste(s_poly@data$V1, s_poly@data$V2, 
                          sep = "_")
  closest_site_vec <- vector(mode = "numeric", length = vector_length)
  min_dist_vec <- vector(mode = "numeric", length = vector_length)
  taken_vec <- vector(mode = "numeric", length = vector_length)
  taken_vec_index <- integer(vector_length)

## within loop ##
  i <- 1 # just first iteration
  dist_vec <- sp::spDistsN1(original_points, new_points[i], 
                            longlat = FALSE)
  min_dist_vec[i] <- min(dist_vec)
    closest_site_vec[i] <- which.min(dist_vec)

  taken_vec[i] <- which.min(dist_vec)
  taken_vec_index <- taken_vec[taken_vec > 0]
  costmatrix <- sp::spDists(original_points, new_points, 
                            longlat = FALSE)
  colnames(costmatrix) <- paste(s_poly@data$V1, s_poly@data$V2, 
                                sep = "_")
  rownames(costmatrix) <- paste(original_points@coords[, 1], original_points@coords[, 2], sep = "_")
  #################################################
  ### everything up to here evaluates instantly ###
  #################################################

  system.time(hungarian_costmin <- hungarian_cc(costmatrix)) # Just one element takes 10.8 s, MBP 2.2 GHz i7 16 GB RAM
  # user  system elapsed 
  # 10.796   0.033  10.876 
  
  # end loop
  # 11 * 436 obs ~ 4796 seconds / 80 minutes
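
Since the reprex above already builds the full costmatrix, one quick experiment (a suggestion, not a geogrid feature) is to time an alternative linear sum assignment solver on the same matrix, for example solve_LSAP() from the clue package, to see whether the bottleneck is the solver or the matrix size.

  # Suggested benchmark (not part of the original reprex): alternative assignment solver
  library(clue)
  system.time(alt_assignment <- solve_LSAP(costmatrix))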

Is there a realistic limit for number of polygons

I am trying to set up some hexagonal maps for areas of Canada to visualise recent census results.

I have no problem with the code for a small number of areas, i.e. the GTA version below. Although there are warnings with assign_polygons(), it works almost instantaneously.
However, the computer hangs at this stage when I do the same process with a roughly 20x larger set of inputs (TOR).
Just wondering whether there is a practical limit.

NB: I have only run calculate_cell_size on one option for this issue.

library(cancensus)
library(sf)
library(tidyverse)
library(broom) 
library(hexmapr)


options(cancensus.api_key = "CensusMapper_5c16da37f89e276603dd820db030d03a")

clean <- function(shape){
  shape@data$id = rownames(shape@data) 
  shape.points = tidy(shape, region="id") 
  shape.df = inner_join(shape.points, shape@data, by="id")
}

# Greater Toronto ares census sub-divisions

GTA <- get_census(dataset='CA16', regions=list(CMA="35535"),  level='CSD', geo_format = "sf") #24 obs

GTA_sp <- as(GTA, "Spatial")

GTA_shp_details <- get_shape_details(GTA_sp)


GTA_cells <-  calculate_cell_size(GTA_sp, GTA_shp_details,0.03, 'hexagonal', 1)


GTAhex <- assign_polygons(GTA_sp,GTA_cells)
#There were 25 warnings (use warnings() to see them)
warnings()
#25: In spDists(originalPoints, new_points, longlat = FALSE) :
#spDists: argument longlat conflicts with CRS(x); using the value FALSE

GTAhex<- clean(GTAhex) #168 obs

ggplot(GTAhex, aes(long,
                      lat,
                      fill=Population,
                      group=group)) +
  geom_polygon(col="white") +
  geom_text(aes(V1, V2, label = substr(name,1,4)), size=5,color = "white") +
  scale_fill_viridis(option="plasma") +
  coord_equal() + theme_void()

# Toronto census tracts

TOR <- get_census(dataset='CA16', regions=list(CMA="3520"), vectors=c("v_CA16_2447"), level='CT', geo_format = "sf") #572

TOR_sp <- as(TOR, "Spatial")

TOR_shp_details <- get_shape_details(TOR_sp)


TOR_cells <-  calculate_cell_size(TOR_sp, TOR_shp_details,0.03, 'hexagonal', 1)

## hangs here
TORhex <- assign_polygons(TOR_sp,TOR_cells)
TORhex<- clean(TORhex)

The stackoverflow?

I think, but am not certain, that you would like to take a look at RTriangle, based on Jonathan Shewchuk's C implementation, Triangle. Further, as you're a reader, pretty much anything Shewchuk or his protégés wrote is worth a look (Shewchuk, Berkeley), including, I would add, his C code for the comments. I'm not a C programmer, only R to a point, but if I were trying to learn elegant C, I'd read his. And, of course, I could be completely wrong as to your needs.

Positions of some hexagons

Hi, Joseph. First I'd like to thank you for your package; it is helping me a lot with my project.

I have a concern/question about creating a hexagonal grid for South Korea. Whenever I create hexagonal grids I see 3 to 6 different hexagon units generated for a district (an island) which should technically have only two geographic units. At first I thought that perhaps the positions of the hexagons change on each trial, and therefore some of the hexagons move from the mainland to this island. Nevertheless, could you please explain what is possibly causing this, or the mechanism behind it? First I will show you my script and a basic plot of the shapefile to show you what I mean.

#install.packages("raster")
library(raster)
kor2 <- getData(name = "GADM", country = "KOR", level = 2)

str(kor2, max.level = 2)
Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
..@ data :'data.frame': 229 obs. of 15 variables:
..@ polygons :List of 229
..@ plotOrder : int [1:229] 62 64 108 66 117 56 65 121 67 123 ...
..@ bbox : num [1:2, 1:2] 125.1 33.1 130.9 38.6
.. ..- attr(*, "dimnames")=List of 2
..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
#plot the shape file
plot(kor2)

[image]

As you can see, there are only two geographic units on Jeju island in the image above (the big island at the bottom). However, whenever I generate a set of hexagonal grids, a different number of hexagons comes out for that island. See below for my script and the six hexagonal grids generated as a result.

#basic setup for hexgird
library(devtools)
library(hexmapr)
original_shapes <- getData(name = "GADM", country = "KOR", level = 2)
original_details <- get_shape_details(original_shapes)
#generating 6 different hexagonal plots
par(mfrow=c(2,3), mar = c(0,0,2,0))
for (i in 1:6){
new_cells <- calculate_cell_size(original_shapes, original_details,0.03, 'hexagonal', i)
plot(new_cells[[2]], main = paste("Seed",i, sep=" "))
}

[image]

Any ideas on why this is the case? Your feedback will be much appreciated!
