datadesk / census-map-downloader

Easily download U.S. census maps

License: MIT License


census-map-downloader

Easily download U.S. census maps

Installation

pipenv install census-map-downloader

Command line usage

Usage: censusmapdownloader [OPTIONS] COMMAND [ARGS]...

  Easily download U.S. census maps

Options:
  --data-dir TEXT  The folder where you want to download the data
  --year INTEGER   The vintage of data to download. By default it gets the
                   latest year. Not all data are available for every year.

  --help           Show this message and exit.

Commands:
  blocks                   Download blocks
  congress-carto           Download cartographic congressional districts
  counties                 Download counties
  counties-carto           Download cartographic counties
  countysubdivision        Download cartographic county subdivisions
  legislative-lower-carto  Download cartographic state legislative...
  legislative-upper-carto  Download cartographic state legislative...
  places                   Download places
  states-carto             Download cartographic states
  tracts                   Download tracts
  zctas                    Download ZCTAs

Examples

Here's an example of downloading all counties

censusmapdownloader counties

You can specify the download directory with --data-dir

censusmapdownloader --data-dir ./my-special-folder/ counties

Contributing

Install dependencies for development

pipenv install --dev

Run tests

pipenv run python test.py

Adding additional years to a dataset

Downloader classes for different geography types are defined in modules of census_map_downloader.geotypes. For example, the downloader for counties is census_map_downloader.geotypes.counties.CountiesDownloader.

If the URL and fields in a shapefile are the same as those for years that are already supported, you can just add the year to the YEAR_LIST attribute.

If the fields are the same, but the URL changes between groups of years, add logic to the url property method of the downloader class to alter the URL based on self.year.

If the fields and URL change from year to year, consider creating classes for each year and delegating to them. census_map_downloader.geotypes.tracts.TractsDownloader is an example of a class that uses this approach.
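The three cases above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: every class name starting with Hypothetical, the YEAR_TO_CLASS mapping, the year values, and the example.com URLs are assumptions made up for the example. Only the YEAR_LIST attribute, the url property, and self.year come from the description above.

```python
class HypotheticalCountiesDownloader:
    """Illustrative stand-in, not the package's real CountiesDownloader."""

    # Case 1: same URL pattern and fields, so supporting a year is one list entry.
    YEAR_LIST = [2016, 2017, 2018]

    def __init__(self, year=None):
        # Fall back to the latest supported vintage when no year is given.
        self.year = max(self.YEAR_LIST) if year is None else year
        if self.year not in self.YEAR_LIST:
            raise ValueError(f"{self.year} is not in YEAR_LIST")

    @property
    def url(self):
        # Case 2: same fields, but the URL pattern differs between groups of years.
        # Both patterns are placeholders, not real Census Bureau paths.
        if self.year >= 2018:
            return f"https://example.com/newer/{self.year}/counties.zip"
        return f"https://example.com/older/{self.year}/counties.zip"


class HypotheticalTracts2010:
    """Case 3: fields and URL both differ, so each year gets its own class."""

    url = "https://example.com/tracts/2010.zip"


class HypotheticalTracts2018:
    url = "https://example.com/tracts/2018.zip"


class HypotheticalTractsDownloader:
    """Picks the per-year class and delegates to it."""

    YEAR_TO_CLASS = {2010: HypotheticalTracts2010, 2018: HypotheticalTracts2018}

    def __init__(self, year):
        self.delegate = self.YEAR_TO_CLASS[year]()

    @property
    def url(self):
        return self.delegate.url
```

The real downloader classes carry more state (field crosswalks, file paths, processing steps), so treat this only as a picture of where year-specific logic might live.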

Developing the CLI

The command-line interface is implemented using Click and setuptools. To install it locally for development inside your virtual environment, run the following installation command, as prescribed by the Click documentation.

pipenv run pip install --editable .

census-map-downloader's Issues

Error when trying to download nonexistent lower chamber state legislative districts for Nebraska

When running:

censusmapdownloader --data-dir data legislative-lower-carto

I get this error:

Traceback (most recent call last):
  File "/home/codespace/.local/bin/censusmapdownloader", line 11, in <module>
    load_entry_point('census-map-downloader', 'console_scripts', 'censusmapdownloader')()
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/workspaces/census-map-downloader/census_map_downloader/cli.py", line 97, in legislative_lower_carto
    obj.run()
  File "/workspaces/census-map-downloader/census_map_downloader/base.py", line 167, in run
    self.download()
  File "/workspaces/census-map-downloader/census_map_downloader/base.py", line 183, in download
    runner.run()
  File "/workspaces/census-map-downloader/census_map_downloader/base.py", line 51, in run
    self.download()
  File "/workspaces/census-map-downloader/census_map_downloader/base.py", line 108, in download
    urlretrieve(self.url, self.zip_path)
  File "/usr/lib/python3.8/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

This happens when trying to download Nebraska's file, https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_31_sldl_500k.zip, to data/raw/cb_2018_31_sldl_500k.zip. That makes sense, because Nebraska has a unicameral legislature and so has no lower chamber districts.
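One possible way to tolerate the missing file is to treat a 404 as "nothing published for this state" and let every other error propagate. The sketch below is an assumption about how that could look, not the project's fix: download_if_available and its retrieve parameter are hypothetical names, while urlretrieve and HTTPError are the standard-library calls that appear in the traceback.

```python
from urllib.error import HTTPError
from urllib.request import urlretrieve


def download_if_available(url, zip_path, retrieve=urlretrieve):
    """Download url to zip_path; return False instead of raising on a 404.

    `retrieve` is injectable so the behavior can be exercised without
    network access. It defaults to urllib's urlretrieve.
    """
    try:
        retrieve(url, zip_path)
    except HTTPError as error:
        if error.code == 404:
            # No file exists at this URL, e.g. lower chamber districts
            # for a state with a unicameral legislature.
            return False
        # Anything else (500, 403, ...) is a real problem; re-raise it.
        raise
    return True
```

A caller looping over states could then skip any state for which this returns False and carry on with the rest.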

Places data download stops after one state / Shell is not a LinearRing error

Hello,

I've been trying to download all "places" through the CLI, but the download stops after getting Alabama (FIPS 01). I'm able to download all counties by running censusmapdownloader counties, and thought censusmapdownloader places would work the same way. But in the default raw output directory, I only end up with these files:

tl_2018_01_place.cpg
tl_2018_01_place.dbf
tl_2018_01_place.prj
tl_2018_01_place.shp
tl_2018_01_place.shp.ea.iso.xml
tl_2018_01_place.shp.iso.xml
tl_2018_01_place.shx
tl_2018_01_place.zip

I get dozens of lines of these errors in the console (they also appear when downloading counties, even though that download ends up succeeding):

Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Hole is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Shell is not a LinearRing
Hole is not a LinearRing
Shell is not a LinearRing
IllegalArgumentException: geometries must not contain null elements

and then at the end

/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/geopandas/geodataframe.py:432: FutureWarning: Assigning CRS to a GeoDataFrame without a geometry column is now deprecated and will not be supported in the future.
  return GeoDataFrame(rows, columns=columns, crs=crs)
Traceback (most recent call last):
  File "/Users/mpatino14/.pyenv/versions/3.7.1/bin/censusmapdownloader", line 8, in <module>
    sys.exit(cmd())
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/census_map_downloader/cli.py", line 37, in places
    obj.run()
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/census_map_downloader/base.py", line 167, in run
    self.download()
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/census_map_downloader/base.py", line 194, in download
    runner.run()
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/census_map_downloader/base.py", line 53, in run
    self.process()
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/census_map_downloader/base.py", line 145, in process
    cleaned.to_file(self.geojson_path, driver="GeoJSON")
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/geopandas/geodataframe.py", line 746, in to_file
    _to_file(self, filename, driver, schema, index, **kwargs)
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/geopandas/io/file.py", line 239, in _to_file
    schema = infer_schema(df)
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/geopandas/io/file.py", line 299, in infer_schema
    geom_types = _geometry_types(df)
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/geopandas/io/file.py", line 316, in _geometry_types
    geom_types_2D = df[~df.geometry.has_z].geometry.geom_type.unique()
  File "/Users/mpatino14/.pyenv/versions/3.7.1/lib/python3.7/site-packages/pandas/core/generic.py", line 5274, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'has_z'

Which I suppose points to an error with geopandas, but I'm just not sure what's going on there. Is there a certain version that I should have? Is there a reason why I'm not getting more data back?

Here are some geopandas related libs of my current env:

Cartopy==0.18.0
categorical-distance==1.9
census==0.8.17
census-map-downloader==0.1.0
click==7.1.2
click-plugins==1.1.1
descartes==1.1.0
Fiona==1.8.18
gast==0.2.2
GDAL==3.1.2
geographiclib==1.50
geojson==2.5.0
geopandas==0.8.1
geoplot==0.4.1
geopy==2.0.0
numpy==1.19.5
pandas==1.0.5
pyepsg==0.4.0
pyesg==0.1.4
pygeos==0.7.1
pyproj==2.6.1.post1
rasterio==1.1.8

Thanks so much for making this! And looking forward to hopefully using it :)

`FIELD_CROSSWALK` incorrect for cartographic congressional districts

Running censusmapdownloader --data-dir data congress-carto produces this error:

Traceback (most recent call last):
  File "/home/codespace/.local/bin/censusmapdownloader", line 11, in <module>
    load_entry_point('census-map-downloader', 'console_scripts', 'censusmapdownloader')()
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/codespace/.local/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/workspaces/census-map-downloader/census_map_downloader/cli.py", line 77, in congress_carto
    obj.run()
  File "/workspaces/census-map-downloader/census_map_downloader/base.py", line 53, in run
    self.process()
  File "/workspaces/census-map-downloader/census_map_downloader/base.py", line 138, in process
    trimmed = gdf[list(self.FIELD_CROSSWALK.keys())]
  File "/home/codespace/.local/lib/python3.8/site-packages/geopandas/geodataframe.py", line 1299, in __getitem__
    result = super(GeoDataFrame, self).__getitem__(key)
  File "/home/codespace/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 3030, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/home/codespace/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1266, in _get_listlike_indexer
    self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
  File "/home/codespace/.local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1316, in _validate_read_indexer
    raise KeyError(f"{not_found} not in index")
KeyError: "['NAMELSAD'] not in index"

The fields in FIELD_CROSSWALK don't seem to match the fields listed in the documentation PDF in the comments: https://www2.census.gov/geo/tiger/GENZ2018/2018_file_name_def.pdf.

I'll fix this once I get through a bit more testing on adding support for different vintages as part of #8, but wanted to document this somewhere.
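A mismatch like this can be caught before the column selection fails by comparing the crosswalk's keys against the columns actually present. The sketch below is only an illustration: missing_crosswalk_fields is a hypothetical helper, and the crosswalk and column values are assumed for the example rather than taken from the package or the Census file. It relies on the traceback's detail that the keys of FIELD_CROSSWALK are the source field names.

```python
def missing_crosswalk_fields(columns, field_crosswalk):
    """Return the crosswalk's source fields that are absent from `columns`."""
    available = set(columns)
    return [field for field in field_crosswalk if field not in available]


# Assumed values for illustration only.
FIELD_CROSSWALK = {"GEOID": "geoid", "NAMELSAD": "name"}
shapefile_columns = ["STATEFP", "GEOID", "LSAD", "geometry"]

missing = missing_crosswalk_fields(shapefile_columns, FIELD_CROSSWALK)
print(missing)  # ['NAMELSAD']
```

Running a check like this first would allow a clearer error message than the KeyError raised deep inside pandas.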
