
palewire / cpi


Quickly adjust U.S. dollars for inflation using the Consumer Price Index (CPI)

Home Page: https://palewi.re/docs/cpi/

License: MIT License

Makefile 0.01% Python 4.00% Jupyter Notebook 95.98% Shell 0.01%
bls consumer-price-index cpi data-journalism dataset dollar economics inflation journalism money news python sqlite

cpi's Introduction

Hello. My name is Ben Welsh.

I’m an Iowan living in New York City. I work as a journalist, albeit an unconventional one. I specialize in what some people call data journalism, some call computational journalism and some others call computer-assisted reporting.

This README includes a directory of my open-source computer code on GitHub and other platforms. It does not include the dozens of apps, stories and graphics I've published as part of my journalism career. For that, visit palewi.re to find my résumé, a database of my news clips and an archive of my public-speaking engagements.

Table of contents

Products

Websites

repo description
amsat-satellite-index A searchable, sortable table listing all the ham satellites in space
californiacivicdata.org The online home of the California Civic Data Coalition
cummings.ee A collection of the work of Edward Estlin Cummings, as it enters the public domain
news-homepages An open-source archive that gathers, archives and shares news homepages
palewi.re My blog
savemy.news A personal, permanent clipping service
studs-terkel-podcast Selections from WFMT's Studs Terkel Radio Archive delivered to your podcatcher

Lesson plans

repo description
first-automated-chart Learn how you can use Python and the Datawrapper API to create a limitless number of charts and maps
first-django-admin A step-by-step guide to creating a simple web application that empowers you to enlist reporters in data entry and refinement
first-github-scraper An introduction to free, automated web scraping with GitHub Actions
first-pull-request How to propose changes to open-source software using GitHub pull requests
first-python-notebook A step-by-step guide to analyzing data with Python and the Jupyter Notebook
first-visual-story A step-by-step guide to publishing a standalone story from a dataset

Bots

repo description
old-la-photos A bot that posts photographs from the Los Angeles Public Library’s digital collection
metar-weather-bot A bot that posts the latest METAR weather report for LAX airport
muckrockbot A bot that posts the latest public records requests filed and completed at muckrock.com
nyc-open-data-monitor Automated monitoring of new and updated datasets posted to New York City's data portal
reuters-jobs A bot that posts the latest open jobs at Reuters
random-pigeon-gpt A bot that posts AI-generated images of New York City pigeons generated using random adjectives
sanborn-maps-bot A bot that posts random images from the Library of Congress collection of Sanborn Fire Insurance Company maps

Data

Computational notebooks

repo description
baseball-notebooks Python notebooks exploring Major League Baseball data
california-crop-production-wages-analysis Crop worker pay in California
california-electricity-capacity-analysis California's costly power glut
california-fire-zone-analysis California buildings within fire hazard zones
california-h2a-visas-analysis Temporary visas granted to foreign agricultural workers
census-hard-to-map-analysis A census undercount could cost California billions — and L.A. is famously hard to track
cfb-gap-analysis College football's most imbalanced teams
chicago-regions-map Creates a regional map of Chicago based on the city's official designations
chicago-trees-analysis How many trees has Chicago planted? And where?
construction-jobs-analysis Demographics and pay of construction workers
cubs-opening-day-analysis Analysis of the Opening Day starters for the Chicago Cubs baseball team
deadspin-scraper Scrape posts from Deadspin
deleon-district-election-results-analysis How former state Sen. Kevin de León fared in his own district
drudge-domain-analysis A simple example of using storytracker and the PastPages API to conduct a link analysis
faa-drone-license-analysis Who can fly commercially?
ferc-enforcement-analysis Civil penalties issued by FERC
helicopter-accident-analysis A Los Angeles Times analysis of helicopter accident rates
hollister-ranch-analysis Agricultural property tax breaks in Hollister Ranch
houston-flood-zone-analysis Geospatial analysis of Houston homes after Hurricane Harvey
hsr-document-analysis How California’s faltering high-speed rail project was ‘captured’ by costly consultants
judge-home-run-analysis How the Yankee slugger's 2022 pace compares to the past
la-settlements-analysis Legal payouts by L.A. city
la-vacant-building-complaints-analysis Vacant building complaints filed with L.A. city
la-weedmaps-analysis Black market cannabis shops thrive in L.A. even as city cracks down
literary-notebooks Python notebooks exploring Project Gutenberg texts
native-american-census-analysis The 2020 census is coming. Will Native Americans be counted?
promenade-west-sales-report An analysis of downtown Los Angeles housing prices
street-racing-analysis Street racing fatalities in L.A. County
swana-census-analysis Are Arabs and Iranians white? Census says yes, but many disagree
washingtonpost-newswhip-analysis How many pieces does the Washington Post publish?

Git scrapers

repo description
amateur-satellite-database The amateur satellites in space. A machine-readable mirror of JE9PEL's website and the SatNOGS database.
aphis-inspection-reports Scrapes inspection data and PDFs from the USDA's Animal and Plant Health Inspection Service
california-coronavirus-data The Los Angeles Times' open-source archive of California coronavirus data
california-coronavirus-scrapers The open-source web scrapers that feed the Los Angeles Times California coronavirus tracker
fed-dot-plot-scraper Extracting the "dot plot" economic projections posted online by the Federal Open Market Committee
noaa-hurricane-gis-scraper Automated downloads of geographic information system data posted by the National Oceanic and Atmospheric Administration's National Hurricane Center and Central Pacific Hurricane Center

Public records requests

repo description
california-business-entities Corporations and limited-liability companies registered with the California Secretary of State
california-house-members A simple machine-readable list of the 53 men and women California sends to Congress
california-topojson-atlas Simple maps of California's 58 counties
cedar-rapids-buildings-unsafe-after-derecho-2020 Buildings marked as unsafe to occupy by the Cedar Rapids city government following the 2020 derecho storms
la-county-2016-primary-precinct-maps Maps of the consolidated precincts used in Los Angeles County's 2016 primary election
la-county-election-precincts-2018 Final election precincts used by the Los Angeles County Registrar-Recorder/County Clerk in the 2018 elections
la-county-election-precincts-2020 Final election precincts used by the Los Angeles County Registrar-Recorder/County Clerk in the 2020 general elections
la-county-trail-maps Geospatial data of trails managed or planned by the Los Angeles County Department of Parks and Recreation
la-magnets-2016-test-scores A database of test scores for roughly 200 L.A. Unified magnet schools obtained by the Los Angeles Times
la-metro-maps Geospatial data from L.A. Metro's public transportation system
lausd-school-campus-polygons The areas of school campuses at the Los Angeles Unified School District
los-angeles-county-tsunami-hazard-areas California Geological Survey maps of flooding tsunamis could produce in Los Angeles County
noaa-hurricane-hunters-logo An official logo of NOAA's Hurricane Hunters released via FOIA
nrol-39-logo A vector PDF of the official mission logo of NROL-39 released via FOIA
nyc-parks-logo The official logos of NYC Parks released via FOIL
regional-connector-art Public art created for light rail stations on the Los Angeles Metro's Regional Connector line
san-francisco-campaign-contributions Itemized monetary campaign contributions compiled by San Francisco's Ethics Commission
space-force-emblems The official logos of 83 US Space Force units
union-station-site-map The glossy map on display in the Los Angeles transit hub
us-ca-butte_county-addresses_parcels_roads-shp SHP files of addresses, parcels and roads received in a public record request from Butte County, California
us-ca-el_dorado_county-currprcl-shp SHP file of parcels with situs address attached provided via public records request by local government in El Dorado County California
us-ca-lake_county-situs_parcels-shp SHP file of parcels with situs address attached provided via public records request by local government in Lake County California
us-ca-lassen_county-situs_parcels-shp SHP file of parcels with situs address attached provided via public records request by local government in Lassen County California
us-ca-madera_county-situs-shp SHP file of address points provided via public records request by local government in Madera County California
us-ca-orange_county-situs_parcels-shp A SHP file of parcel polygons downloaded from Orange County California's public website
us-ca-san_joaquin_county-situs_parcels-shp SHP file of parcels with situs address attached provided via public records request by local government in San Joaquin County California
us-ca-santa_clara_county-gis-shp SHP files downloaded from Santa Clara County California's password-protected GIS repository
us-ca-santa_cruz_county-PointAddress_SC-shp SHP file of address points provided via public records request by local government in Santa Cruz County California
us-ca-shasta_county-ShastaCountySitusPoints-shp SHP file of address points provided via public records request by Shasta County government
us-ca-sonoma-county-sc_base_adr_addresses-shp SHP file of parcels with situs address attached provided via public records request by local government in Sonoma County California
us-ca-yuba_county-AddressPoints-shp SHP file of address points provided via public records request by local government in Yuba County California
usa-style-guides U.S. government style guides acquired via the Freedom of Information Act
usgs-anss-logo The logo for the U.S. Geological Survey's Advanced National Seismic System
usgs-hawaii-volcano-drone-survey-october-2022 Photography and a digital elevation model from an October 2022 USGS drone mission over the Kilauea volcano's Halema‘uma‘u pit crater.

Python

Templates

repo description
django-heroku-template A template for Django projects hosted by Heroku
python-open-source-template A template for open-source Python software repositories

Packages

repo description
air-quality-index Download air quality index data from AirNow
altair Declarative statistical visualization library for Python
atcf-data-parser Parser for the a-deck data posted online by the Automated Tropical Cyclone Forecasting System
archiveis A simple Python wrapper for the archive.is capturing service
calfire-wildfires Download wildfires data from CalFire
census-data-aggregator Combine U.S. census data responsibly
census-data-downloader Download U.S. census data and reformat it for humans
census-error-analyzer Analyze the margin of error in U.S. census data
census-map-consolidator Combine Census blocks into new shapes
census-map-downloader Easily download U.S. census maps
cpi Quickly adjust U.S. dollars for inflation using the Consumer Price Index (CPI)
django-anss-archive Archive real-time earthquake notifications from the USGS's Advanced National Seismic System
django-bakery A set of helpers for baking your Django site out as flat files
django-calaccess-downloads-website An open-source archive of campaign finance and lobbying disclosure data from the California Secretary of State’s CAL-ACCESS database
django-calaccess-raw-data A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database
django-calaccess-processed-data A Django app to transform and refine campaign-finance data from the California Secretary of State’s CAL-ACCESS database
django-calaccess-scraped-data A Django app to scrape campaign-finance data from the California Secretary of State’s CAL-ACCESS website
django-calaccess-technical-documentation Technical documentation for our pipeline of Django apps that download, extract, load and process the CAL-ACCESS database
django-greeking Django template tools for printing filler, a technique from the days of hot type known as greeking
django-internetarchive-storage A custom Django storage system for Internet Archive collections
django-postgres-copy Quickly import and export delimited data with Django support for PostgreSQL's COPY command
django-yamlfield A Django database field for storing YAML data
inciweb-wildfires Download wildfire data from inciweb
install-python-pipenv-pipfile Easily install Python, pipenv and Pipfile packages in your GitHub Action
ipsos-credibility-interval Calculate Bayesian credibility intervals for online polling using the Ipsos method
mlbcolors Easy access to the official colors of every team in Major League Baseball
nasa-wildfires Download wildfire data from NASA satellites
nifc-wildfires Download wildfires data from NIFC
noaa-wildfires Download wildfires data from NOAA satellites
nws-aurora Download forecast data for Aurora Borealis and Aurora Australis from the National Weather Service
nws-wwa Download watch, warning and advisory data from the National Weather Service
python-censusbatchgeocoder A simple Python wrapper for U.S. Census Geocoding Services API batch service
python-googlegeocoder A simple Python wrapper for version three of Google's geocoder API
python-muckrock A simple Python wrapper for the MuckRock API
reuters-style A Python library that formats dates, numbers and text to conform with the Reuters Style Guide, the standards that guide the world's largest independent newsroom
savepagenow A simple Python wrapper for archive.org's "Save Page Now" capturing service
sphinx-palewire-theme A Sphinx theme for sites hosted at palewi.re
storysniffer Inspect a URL and estimate if it links to a news story

Examples

repo description
altair-column-sort-example An example of how to sort the columns in a bar chart created by the Altair data visualization library
altair-election-maps-example An experiment in creating precinct-level election results maps using Python's Altair library
altair-interactive-scatterplot-example An example of how to add a tooltip to a scatterplot in Python's Altair charting library
dorling-cartogram-example How to calculate a dorling cartogram with Python
geopandas-intersection-area-example How to use geopandas' overlay method to find the area of intersections between two datasets
geopandas-spatial-join-example An example of how to join point to polygon data with geopandas and Python
git-scraper-example An example of a git scraper that downloads, lints, commits and archives a data set
jupyter-notebook-execution-examples Examples of how to remotely execute Jupyter Notebooks from other contexts
pandas-combine-workbooks-example How to use Python's pandas library to combine tabs from multiple Microsoft Excel workbooks into a single CSV
pandas-squarify-example How to use the squarify extension to matplotlib to visualize a pandas DataFrame as a treemap
random-tract A Python hack to respond to a Twitter challenge to "select a random geographic point in the US, with the probability weighted by population."

JavaScript

Examples

repo description
10 PRINT CHR$(205.5+RND(1)); : GOTO 10 RUN A popular one-line script for the Commodore 64
2018-year-in-review A streamgraph of the Los Angeles Times' master branch
analyzing-color Experiments in manipulating color data
baseball-visualizations Abstracting America's pastime
california-fire-zones Maps developed as part of a Los Angeles Times geospatial analysis of fire risk zones
california-in-mercator A base map of the state
california-poppy-generator A randomized generator of California poppies
covid-19-prototypes Experiments developed as part of the Los Angeles Times’ coronavirus tracking effort
delaunay-headline-hero This modification of Mike Bostock's Delaunay Dual diagram was drafted for consideration as lead art on a gallery of data analysis pieces
earthquake-intensity-map A "shakemap" of the 7.1 magnitude earthquake that struck Searles Valley, Calif., on July 5, 2019
election-2013 Results of the October 2013 Buenos Aires elections
election-results-by-education-treemap A treemap of Iowa's 2016 presidential election results by education
election-results-challenge Got a better idea? Here's your chance to prove it
first-observable-notebook A course taught at the 2020 conference of the National Institute for Computer Assisted Reporting
hexbin-headline-hero The first draft of the diagram that serves as the lead art on a gallery of data analysis pieces
how-iowa-voted Mapping election results from the state of Iowa
inglewood-inventory Lunches in the "City of Champions" with the Los Angeles Times Data Desk
iowa-dorling An example of a Dorling cartogram mapping the population of Iowa's 99 counties
load-d3-data-incrementally-using-sorted-value Gradually loads all the cities of California from north to south, and then removes them from south to north.
numbers-in-the-newsroom Calculators for common newsroom needs, including those featured in Sarah Cohen’s book “Numbers in the Newsroom: Using Math and Statistics in News”
observable-helpers Utilities to help do things
spike-chart This variation on a standard bar chart substitutes in a proportionally sized spike for the traditional rectangle
the-ichiro-bet The Ichiro Bet
the-many-voices-of-the-other-americans Visualizing the rotating narrators of Laila Lalami's novel "The Other Americans"
tinting-a-canvas-image How to overlay a color filter on top of a canvas image
trump-tweets Techniques for working with President Donald Trump's posts on twitter.com
us-census-data A variety of methods for visualizing data from the United States Census Bureau
vega-visualizations Examples of using the Vega data visualization toolkit
voronoi-husband-and-wife A photograph of my wife and me stippled using a Voronoi diagram
web-dubois Digital recreations of data visualizations made for W.E.B. Du Bois’ presentation at the “Paris Exposition Universelle” in 1900

Other stuff

repo description
dotfiles My configuration files
ebook-exports Export the e.e. cummings free poetry archive to a variety of ebook formats
internet-archive-upload Upload files to an archive.org item
is-5 Page scans of E.E. Cummings’ 1926 book of verse
tulips-and-chimneys Page scans of E.E. Cummings’ first published book of verse

Inactive projects

Websites

repo description
boundaries.latimes.com An API that serves up local GIS data
documentstacker Use DocumentCloud to publish PDFs for humans
nicar18-datadesk-family-reunion Los Angeles Times Data Desk Reunion @ NICAR 2018
nicar19-datadesk-family-reunion Los Angeles Times Data Desk Reunion @ NICAR 2019
orchestral-motion.github.io L.A. Phil hackday website
pastpages.org The news homepage archive
tablestacker Publish spreadsheets as interactive tables. And do it on deadline.

Lesson plans

repo description
first-news-app A step-by-step guide to publishing a simple news application
first-web-scraper A step-by-step guide to writing a web scraper with Python

Bots

repo description
checkbook-la-watchdog A periodically updated archive of financial data published by the city of Los Angeles' Checkbook LA data portal
everytractcount Statistics about every U.S. Census tract mapped by @everytract
mistadobalina A script that posts raps by Del Tha Funkee Homosapien to @MISTADOBALINA on Twitter
mlb-postseason-bot Twitter bot that posts daily updates on a team’s chance to make the Major League Baseball postseason
questionheds A feed of headlines with question marks in them
trump-tweets All @RealDonaldTrump tweets stored at trumptwitterarchive.com in a single JSON

Templates

repo description
appengine-template Bootstrap a Google App Engine project with Django and other goodies
django-project-template A custom template for initializing a new Django project the Data Desk way
django-calaccess-project-template A custom template for initializing a new Django project with the California Civic Data Coalition's applications for analyzing the California Secretary of State’s CAL-ACCESS database

Packages

repo description
altair-latimes A Los Angeles Times theme for Python's Altair statistical visualization library
calculate Some simple math we use to do journalism
django-a-matter An app for authoring background biographical matter on newsworthy people
django-autoarchive Django helpers for automatically archiving URLs
django-calaccess-campaign-browser A Django app to refine, review and republish campaign finance data drawn from the California Secretary of State’s CAL-ACCESS database
django-calaccess-cookbook A Chef cookbook and Fabfile for deploying the California Civic Data Coalition's applications for analyzing the California Secretary of State’s CAL-ACCESS database on Amazon Web Services
django-calaccess-docker A standalone Docker stack serving the California Civic Data Coalition's applications for analyzing the California Secretary of State’s CAL-ACCESS database
django-calaccess-lobbying-browser A simple Django app to browse California lobbying activity data from CAL-ACCESS
django-correx A set of models and template tags for pulling in lists of content changes across applications
django-memento-framework A set of helpers for Django web sites to enable the Memento framework for time-based access
django-orchestral-motion-db A Django channels app for receiving live motion data from an accelerometer
django-rapture An archive of the Rapture Index at raptureready.com
django-swineflu A quick and dirty data dump of the H1N1 flu vaccine locations that LA County public health currently buries in a PDF
django-urlarchivefield A custom Django model field that automatically archives a URL
lametro-api A simple Python wrapper for the L.A. Metro’s API for bus stops, routes and vehicles
mappingla A Python wrapper for accessing the Mapping L.A. Boundaries API
pastpages2gif Create an animated GIF from the PastPages news homepage archive
pluggablemaps A pluggable GeoDjango app with the boundaries of all states in the United States of America. Geography, loosely coupled
pluggablemaps-hackshackers A GeoDjango app that maps unemployment, meant to demonstrate concepts from my talk
pluggablemaps-lametrorail A pluggable GeoDjango app mapping the Los Angeles Metro Rail system
pluggablemaps-uscounties A pluggable GeoDjango app with the boundaries of United States counties
pyplacefinder A very simple wrapper for Yahoo PlaceFinder
python-elections A Python wrapper for the Associated Press' U.S. election data service
qiklog A simplified wrapper for Python's logging module
scrapy-calaccess-crawler A Scrapy app to scrape campaign-finance data from the California Secretary of State’s CAL-ACCESS website
statestyle A Python library that standardizes the names of U.S. states
storytracker Tools for tracking stories on news homepages
webcitation A simple Python wrapper for the webcitation.org capturing service
wordpress-memento-plugin A plugin for Wordpress web sites to enable the Memento framework for time-based access

Other stuff

repo description
california-k12-notebooks Scripts to download and process California K12 schools data
chirp-ham-radio-channels Channels formatted for the CHIRP amateur radio programming system
first-python-notebook-binder A template for deploying "First Python Notebook" with Binder
ire2010 Class materials for the Django bootcamp at the IRE 2010 conference in Las Vegas
nicar2010 Materials from the Django bootcamp at NICAR 2010
nicar2011 An example Django app for a class at the NICAR 2011 conference
python-calaccess-notebooks Python notebooks analyzing campaign finance and lobbying activity data from California Secretary of State’s CAL-ACCESS database
osm-quiet-la A street-centric base layer for overlaying point data about Southern California
osm-silent-la A template for a black base layer about Southern California
sopr-activity A quick and dirty script for pulling down lobby disclosure docs filed with the Senate Office of Public Records
sopr-contribs Scripts for processing and analyzing federal lobbyist disclosure data reporting contributions to political campaigns
the-mondesi-bet The Mondesi Bet

cpi's People

Contributors

dependabot[bot], palewire


cpi's Issues

Error when importing CPI in ipython3 Jupyter notebook via Anaconda: "no such table: cu area"

Hi palewire,

I am trying to use the cpi library in an ipython3 Jupyter notebook running via Anaconda Navigator on a Windows 10 OS. I can successfully run pip install cpi, but when I import cpi I get the same "no such table: cu area" error message that is mentioned here: #62

For context, I can install and import other libraries through the Jupyter notebook. Also, I was able to both install and import the cpi library in January 2023. Is it possible that the solution you implemented for issue #62 is not translating into my coding environment?

Thank you!

latest year in time series

Hi,
I ran cpi.update() before applying the CPI function to a quarterly time series, and it appeared to work. Oddly, though, there is a gap between my nominal numbers and those adjusted for the CPI change. How can I see which month and/or year is used by default to adjust figures? If the default is the 2018 CPI index, shouldn't my 2018 nominal and adjusted numbers be the same?
Thanks,
Jason
This is the code I used:
cpi.update()
qcew['cpi_total_qtrly_wages'] = qcew.apply(lambda x: cpi.inflate(x.total_qtrly_wages, x.int_year) , axis=1)
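
The question above turns on which target period the adjustment uses. As a minimal sketch of the underlying arithmetic (not the library's internals), the snippet below scales a dollar amount by the ratio of a target index to a source index. The `INDEX` values are made-up placeholders rather than real BLS figures, and `adjust` is a hypothetical helper. Under the assumption that the target defaults to the most recent year in the data, a 2018 value is unchanged only when the target is also 2018.

```python
# Hypothetical annual index values, for illustration only (not real BLS data).
INDEX = {2016: 100.0, 2017: 102.0, 2018: 104.0, 2019: 106.0}


def adjust(value, year, to=None):
    """Scale `value` from `year` dollars into `to` dollars.

    When `to` is omitted, the most recent year in INDEX is used, which is
    the default behavior assumed in this sketch.
    """
    target = max(INDEX) if to is None else to
    return value * INDEX[target] / INDEX[year]


same_year = adjust(1000.0, 2018, to=2018)  # source and target match
latest_year = adjust(1000.0, 2018)         # target falls back to 2019
```

Here `same_year` equals the nominal amount, while `latest_year` is larger because the target year is later than 2018. The library's documentation describes a `to` argument on `cpi.inflate` for choosing the target period explicitly.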

Unable to download data

Currently the BLS website is unavailable, as shown in this screenshot:
[Screenshot from 2020-08-08 17-47-17]

The library simply parses the error HTML page, with no validation that the expected content was retrieved from the website. As a result, importing the library fails entirely. Please add validation of the parsed page and a workaround for these kinds of problems (or give meaningful error messages on import).
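
One way the requested validation could look is sketched below. It is not taken from the library; `looks_like_tsv` is a hypothetical helper, and the sample strings are invented for illustration. The idea is to reject a payload that starts like an HTML page or lacks the expected header columns before it is parsed.

```python
def looks_like_tsv(text, required_columns):
    """Return True if `text` appears to be tab-delimited data with the expected header."""
    stripped = text.lstrip()
    # An HTML error page typically starts with a tag rather than a header row.
    if not stripped or stripped.startswith("<"):
        return False
    header = [name.strip() for name in stripped.splitlines()[0].split("\t")]
    return all(column in header for column in required_columns)


# Sample payloads, invented for this sketch.
good = "area_code\tarea_name\n0000\tU.S. city average\n"
bad = "<html><body>Service unavailable</body></html>"

good_ok = looks_like_tsv(good, ["area_code", "area_name"])
bad_ok = looks_like_tsv(bad, ["area_code", "area_name"])
```

A caller could raise a descriptive error when the check fails instead of loading the payload.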

Autoupdate BLS data

It could work something like this:

  1. Store somewhere in the library the datestamp of the last time the data was downloaded
  2. Also store the last time you checked for it
  3. Each time you import the library, or perhaps when you run inflate, check the datetime of the latest value in the CPI data
  4. Calculate the time difference between that latest value and the download times
  5. If those differences are greater than a threshold (one month?), rerun the download routine.

Could that work?
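
The steps above can be sketched roughly as follows. This is one reading of the proposal, not an existing implementation: `needs_update`, its parameters and the default thresholds are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def needs_update(latest_data, last_download, last_check, now,
                 threshold=timedelta(days=30),
                 check_interval=timedelta(days=1)):
    """Decide whether to rerun the download routine.

    latest_data:   timestamp of the newest value in the stored data
    last_download: when the data was last downloaded
    last_check:    when staleness was last evaluated
    """
    # Skip the comparison entirely if we already checked recently.
    if now - last_check < check_interval:
        return False
    # Refresh only when both the newest data point and the last download are old.
    return (now - latest_data > threshold) and (now - last_download > threshold)


now = datetime(2020, 8, 8)
stale = needs_update(
    latest_data=datetime(2020, 5, 1),
    last_download=datetime(2020, 5, 15),
    last_check=datetime(2020, 8, 1),
    now=now,
)
fresh = needs_update(
    latest_data=datetime(2020, 5, 1),
    last_download=datetime(2020, 8, 1),
    last_check=datetime(2020, 8, 1),
    now=now,
)
```

Tracking the last check separately keeps the comparison itself from running on every import.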

Very Slow to Load/Import

Python 3.6

Every time I import the library, it is extremely slow (~50 seconds).
I don't have time to review the codebase right now for a solution, so I am going to look for other libraries, but this really is a huge problem when you just want to rattle off some code quickly in the shell.

I tried with Python 3.7 from Anaconda distribution and normal 3.6, but both were exceedingly slow. I also tried calling cpi.update() and then closing and re-opening the shell and importing again, but it was still extremely slow.
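
To put a number on the delay, a small timing helper along these lines may be useful. It is only a sketch: `time_import` is a hypothetical name, and the standard-library `json` module stands in for the package being measured.

```python
import importlib
import time


def time_import(module_name):
    """Return the seconds taken to import `module_name` in this process."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# "json" is a stand-in; substitute "cpi" to measure the import discussed above.
elapsed = time_import("json")
```

A module that has already been imported in the same process is served from Python's module cache, so the measurement is only meaningful in a fresh interpreter.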

Error in cpi.update()

Following is the traceback for the issue I faced during cpi.update():

Version of cpi used: 1.0.17


AssertionError Traceback (most recent call last)
Cell In [40], line 1
----> 1 cpi.update()

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\cpi\__init__.py:163, in update()
157 def update():
158 """
159 Updates the Consumer Price Index dataset at the core of this library.
160
161 Requires an Internet connection.
162 """
--> 163 Downloader().update()

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\cpi\download.py:75, in Downloader.update(self)
73 # Download the TSVs
74 logger.debug(f"Downloading {len(self.FILE_LIST)} files from the BLS")
---> 75 [self.get_tsv(file) for file in self.FILE_LIST]
77 # Insert the TSVs
78 logger.debug("Loading data into SQLite database")

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\cpi\download.py:75, in <listcomp>(.0)
73 # Download the TSVs
74 logger.debug(f"Downloading {len(self.FILE_LIST)} files from the BLS")
---> 75 [self.get_tsv(file) for file in self.FILE_LIST]
77 # Insert the TSVs
78 logger.debug("Loading data into SQLite database")

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\cpi\download.py:109, in Downloader.get_tsv(self, file)
107 tsv_path = self.get_data_dir() / f"{file}.tsv"
108 response = requests.get(url)
--> 109 assert response.ok
110 with open(tsv_path, "w") as fp:
111 fp.write(response.text)

AssertionError:
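
The traceback ends in a bare `assert response.ok`, which gives no hint about what went wrong. A hedged sketch of a more informative check follows. `check_response` is a hypothetical helper, not part of the library, and the failed response is simulated with a stand-in object exposing the `ok`, `status_code` and `url` attributes that a `requests` response provides.

```python
from types import SimpleNamespace


def check_response(response):
    """Raise a descriptive error when a download did not succeed."""
    if not response.ok:
        raise RuntimeError(
            f"Download failed for {response.url} (HTTP status {response.status_code})"
        )
    return response


# Stand-in for a failed HTTP response; the URL is a placeholder.
failed = SimpleNamespace(ok=False, status_code=503, url="https://example.com/data.tsv")

try:
    check_response(failed)
    message = ""
except RuntimeError as error:
    message = str(error)
```

With `requests`, calling `response.raise_for_status()` achieves a similar effect by raising an `HTTPError` that includes the status code and URL.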

Autoupdate

Every time I start executing the script, I need to wait for the update process. There should be an auto-update mechanism or a method to check whether the files are up to date.

Operational error no such table: cu.area

When I try importing the CPI library after installation it gives me the error:
OperationalError Traceback (most recent call last)
/var/folders/cx/c0rnck312cxbw_4wbq7ylfw80000gn/T/ipykernel_38424/1241436912.py in <module>
----> 1 import cpi

~/opt/anaconda3/lib/python3.9/site-packages/cpi/__init__.py in <module>
19 # Parse data for use
20 logger.info("Parsing data files from the BLS")
---> 21 areas = parsers.ParseArea().parse()
22 items = parsers.ParseItem().parse()
23 periods = parsers.ParsePeriod().parse()

~/opt/anaconda3/lib/python3.9/site-packages/cpi/parsers.py in parse(self)
58 logger.debug("Parsing area file")
59 object_list = MappingList()
---> 60 for row in self.get_file("cu.area"):
61 obj = Area(row["area_code"], row["area_name"])
62 object_list.append(obj)

~/opt/anaconda3/lib/python3.9/site-packages/cpi/parsers.py in get_file(self, file)
35
36 # Query this file
---> 37 query = cursor.execute(f'SELECT * FROM "{file}"')
38 columns = [d[0] for d in query.description]
39 result_list = [dict(zip(columns, r)) for r in query.fetchall()]

OperationalError: no such table: cu.area

Could someone please help me fix this error
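
A quick way to confirm whether the table named in the error is actually missing is to query SQLite's schema catalog. The sketch below uses an in-memory database as a stand-in, since the location of the library's data file is not given here; `table_exists` is a hypothetical helper.

```python
import sqlite3


def table_exists(connection, table_name):
    """Return True if `table_name` is present in the SQLite database."""
    row = connection.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


# In-memory database standing in for the library's data file.
conn = sqlite3.connect(":memory:")
before = table_exists(conn, "cu.area")
# The table name contains a dot, so it must be double-quoted in SQL.
conn.execute('CREATE TABLE "cu.area" (area_code TEXT, area_name TEXT)')
after = table_exists(conn, "cu.area")
conn.close()
```

If the check returns False against the real database file, the data was never loaded, which points at the download or install step rather than the query.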

sqlite3.OperationalError: near ")": syntax error

When I try to run cpi.update(), I get an error: sqlite3.OperationalError: near ")": syntax error. I'm unsure how to update the underlying cpi tables. Even after I uninstalled and reinstalled, I only have data up until 2023-03-31.

OperationalError: no such table: cu.data.3.AsizeNorthEast

I run this in Python

import cpi
cpi.update()

I get this error. It was working till 2 days back.


OperationalError Traceback (most recent call last)
/var/folders/8b/fjq89b5n05ldt2ytmw4pn5j80000gn/T/ipykernel_94105/2037719277.py in <module>
----> 1 from economics import Inflation

~/opt/anaconda3/lib/python3.9/site-packages/economics/__init__.py in <module>
----> 1 from cpi import CPI
2 from inflation import Inflation

~/opt/anaconda3/lib/python3.9/site-packages/cpi/__init__.py in <module>
23 periods = parsers.ParsePeriod().parse()
24 periodicities = parsers.ParsePeriodicity().parse()
---> 25 series = parsers.ParseSeries(
26 periods=periods, periodicities=periodicities, areas=areas, items=items
27 ).parse()

~/opt/anaconda3/lib/python3.9/site-packages/cpi/parsers.py in parse(self)
165 def parse(self):
166 self.series_list = self.parse_series()
--> 167 self.parse_indexes()
168 return self.series_list
169

~/opt/anaconda3/lib/python3.9/site-packages/cpi/parsers.py in parse_indexes(self)
195 for file in self.FILE_LIST:
196 # ... and for each file ...
--> 197 for row in self.get_file(file):
198 # Get the series
199 series = self.series_list.get_by_id(row["series_id"])

~/opt/anaconda3/lib/python3.9/site-packages/cpi/parsers.py in get_file(self, file)
35
36 # Query this file
---> 37 query = cursor.execute(f'SELECT * FROM "{file}"')
38 columns = [d[0] for d in query.description]
39 result_list = [dict(zip(columns, r)) for r in query.fetchall()]

OperationalError: no such table: cu.data.3.AsizeNorthEast

Project seems dead, alternatives?

This project hasn't been updated in a long time, the data is incomplete, and support for other currencies would be awesome. Does anybody know of an alternative?

Error when importing CPI: sqlite3.OperationalError: no such table: cu.area

Version 1.0.5 works, but the newest version, 1.0.9, does not. It gives me the "no such table" error shown below. It looks like the DB file may be missing, or there may be some issue with the DB file itself:

Python 3.7.6 (default, Jan  8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cpi
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\lib\site-packages\cpi\__init__.py", line 21, in <module>
    areas = parsers.ParseArea().parse()
  File "C:\ProgramData\Anaconda3\lib\site-packages\cpi\parsers.py", line 60, in parse
    for row in self.get_file("cu.area"):
  File "C:\ProgramData\Anaconda3\lib\site-packages\cpi\parsers.py", line 37, in get_file
    query = cursor.execute(f'SELECT * FROM "{file}"')
sqlite3.OperationalError: no such table: cu.area
>>>

Store data globally, or somewhere configurable

If I understand what's happening, data lives and is updated here: https://github.com/datadesk/cpi/blob/master/cpi/data.csv

Two things make me queasy about this (with the caveat that I haven't used this in a project yet):

  • it's changing the codebase in flight, which is scary
  • if I have multiple installs, they could get out of sync, or I just end up with lots of copies of the same data

One way you could avoid that is having a global, or configurable, data cache. It might be $HOME/.python-cpi/data.csv by default, with the option to configure if you needed an isolated copy somewhere. The library could pre-populate the cache, or fall back on what's included in the codebase, or warn if the data is stale.
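The lookup order described above (configured location, then a global cache, then the copy bundled with the code) could be sketched roughly as follows. Everything here is hypothetical: the `PYTHON_CPI_DATA` environment variable, the `resolve_data_path` function and the `bundled_path` argument are names invented for illustration, and only the `$HOME/.python-cpi/data.csv` default comes from the proposal itself.

```python
import os
from pathlib import Path

# Hypothetical names: the library does not define this variable or default.
ENV_VAR = "PYTHON_CPI_DATA"
DEFAULT_CACHE = Path.home() / ".python-cpi" / "data.csv"


def resolve_data_path(bundled_path, env=None):
    """Pick the data file to read.

    Order: explicitly configured path, then the global cache,
    then the copy bundled with the codebase.
    """
    env = os.environ if env is None else env
    configured = env.get(ENV_VAR)
    if configured:
        # An isolated copy requested by the user wins
        return Path(configured)
    if DEFAULT_CACHE.exists():
        # Shared cache, reused across installs
        return DEFAULT_CACHE
    # Fall back on what ships with the code
    return Path(bundled_path)
```

With this shape, updates would write to the cache rather than into the installed package, and multiple installs would read the same file unless one of them is configured otherwise.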

adding support for datetime numpy arrays

While DataFrame.apply() works for small datasets like the example from the docs

df['ADJUSTED'] = df.apply(lambda x: cpi.inflate(x.MEDIAN_HOUSEHOLD_INCOME, x.YEAR), axis=1)

it quickly falls apart if one tries to inflate long series because it inflates each value one at a time instead of taking advantage of numpy and pandas vectorization.

CPI can already handle numpy arrays, and it has both pandas and numpy as dependencies.
[Image: cpi_incomes]
(100,000,000 rows in less than 2 seconds, pretty cool.)

The problem:

CPI takes year_or_month as either an int or a date object and retrieves the corresponding source_index from cpi.db. As far as I understand, this lookup would need to be done for every item in the array, so it would still be very time-consuming for very large datasets.

The solution:

I still don't have any solid solutions.

One way to approach this could be:

  1. receive a numpy array of dates for year_or_month

    • clean it so they all have 01 as the day of the month
      [Image: cpi_dates]
  2. grab the unique values in this array of dates

    • even if you have 100,000,000 rows, you definitely don't have 100,000,000 different year-month combinations.
      • BLS' data goes back to 1913 (1913 through 2017 is 105 years, 105 * 12 = 1260 months + 10 months of 2018 as of now = 1270 unique values at most)
    • create a numpy array of those values matching their date (or a dict() to later use .map() on the dates array.)
  3. map the source_index values to the array of dates

    • look up the CPI value for each of those unique dates and map it back to the original numpy array of dates
  4. cpi.inflate() already just multiplies (value * target_index) / float(source_index)

    • numpy will take care of the rest

Even though one would most likely be inflating values to one specific year or month, this method could be applied both to a single year_or_month and to inflating a series of values from one series of dates to a different series of dates.
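The four steps above can be sketched with plain numpy. This is an illustration of the idea, not the library's implementation: `inflate_array` is a hypothetical function, and `index_by_month` is a hypothetical dict mapping "YYYY-MM" strings to CPI index values, standing in for the per-date lookup against cpi.db.

```python
import numpy as np


def inflate_array(values, dates, target_index, index_by_month):
    """Inflate values observed on dates to the period with target_index.

    index_by_month is a hypothetical {"YYYY-MM": index_value} dict.
    """
    # Step 1: truncate every date to its month
    months = np.asarray(dates, dtype="datetime64[D]").astype("datetime64[M]")
    # Step 2: find the unique months and each row's position among them
    unique_months, inverse = np.unique(months, return_inverse=True)
    # Step 3: one lookup per unique month, then map back to every row
    unique_index = np.array(
        [index_by_month[str(m)] for m in unique_months], dtype=float
    )
    source_index = unique_index[inverse]
    # Step 4: the same arithmetic as a scalar inflate, applied to the array
    return np.asarray(values, dtype=float) * target_index / source_index
```

The expensive lookup runs once per distinct month rather than once per row, so its cost depends on the number of unique dates and not on the length of the array.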

The use:

The particular use I came up with was normalizing different types of incomes from public use microdata. For example, I might go to IPUMS and grab ACS income data for 2000-2016 (earned wages, household income, farm income, social security, etc.).
There are only 17 distinct years, but DataFrame.apply() would go row by row and simply never finish:
[Image: acs2000-16]


I don't have any experience with SQLite, so I couldn't put together a proof of concept, but I hope this explanation is helpful.

April Labor Statistics Not Present

Hello, I've been using the package for monthly food inflation tracking:

cpi_df = cpi.series.get_by_id('CUUR0000SAF1').to_dataframe()
cpi_df_x = cpi_df[cpi_df['period_type'] == 'monthly']

It seems that the latest information present is March; however, the bureau released April's numbers on May 11th.

Can you help me understand the discrepancy?

Yes, I am also running the update function.
cpi.update()

inflate() year_or_month type error not catching years as string format

I'm using a dataset that stores dates as strings "YYYY-MM-DD". I sliced the first four characters, then inflate complained without providing useful error information.

Converting the year_or_month parameter to an integer before performing the check would solve this issue. Alternatively, one could check whether the type is a string and provide a more useful TypeError() message.

    if type(year_or_month) != type(to):
        raise TypeError("Years can only be converted to other years. Months only to other months.")

https://github.com/datadesk/cpi/blob/36d049b45a3f318df97dbced56fbc12dd55415e2/cpi/__init__.py#L138
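A minimal sketch of that suggestion, assuming the arguments are normalized before the type comparison quoted above. `normalize_period` is a hypothetical helper, not a function in the library; it converts digit-only strings to int and raises a descriptive TypeError for anything else it cannot interpret.

```python
import numbers
from datetime import date


def normalize_period(value, name="year_or_month"):
    """Return value as an int year or a date object.

    Hypothetical helper: digit-only strings such as "2015" become ints.
    """
    # Dates (and datetimes, which subclass date) pass through unchanged
    if isinstance(value, date):
        return value
    # Accept any integer type, but reject bool, which subclasses int
    if isinstance(value, numbers.Integral) and not isinstance(value, bool):
        return int(value)
    # Accept strings made up only of decimal digits, e.g. a sliced "YYYY"
    if isinstance(value, str) and value.strip().isdecimal():
        return int(value.strip())
    raise TypeError(
        f"{name} must be an int year or a date object, "
        f"got {type(value).__name__}: {value!r}"
    )
```

Running both `year_or_month` and `to` through such a helper first would let the existing `type(year_or_month) != type(to)` comparison treat "2015" and 2015 the same way, while a full "YYYY-MM-DD" string would fail with a message that names the offending value.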
