Comments (15)
It seems this is a problem that's been active since at least the 2010 data came out:
https://groups.google.com/d/msg/geocommons-geocode/PH6g20m7kaU/Z_W065lbyjkJ
It looks like the data files in the 2011 distribution are kept in the same directories, instead of spread out over different states and counties. The principal directories are:
EDGES
ADDR
FEATNAMES
These just contain the zip files directly. The script tiger2009_import seems to do this:
# Foreach ZIP in FEATNAMES ADDR EDGES
unzip -q $ZIP -d $TMP
# We're building SQL here so create the tables
cat ${SQL}/setup.sql > output.sql
# Foreach file in EDGES do
shp2sqlite -aS ${TMP}/*_${file}.shp tiger_${file} >> output.sql
# Foreach file in FEATNAMES and ADDR
shp2sqlite -an ${TMP}/*_${file}.dbf tiger_${file} >> output.sql
# Now do transformations using temporary tables
cat ${SQL}/convert.sql >> output.sql
# Now run the SQL
cat output.sql | sqlite3 $DATABASE
Would anyone be offended if I rewrote this using Ruby for Tiger2011?
from geocoder.
I get why it's written as a single long pipe command: it's an elegant solution to the problem of the size of the data. I have written the following script in Ruby.
https://gist.github.com/1631758
This took roughly two hours on my quad-core Mac to create the loading.sql file, which was roughly 99 GB. Unfortunately, it seems to get stuck on the "cat loading.sql | sqlite3 #{database}" part. I gave it 16 hours, after which it was stuck using 1% of the CPU. Very strange. I probably need to rewrite it to use a single long pipe.
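One way to avoid the 99 GB intermediate file is to print the SQL on stdout and pipe it straight into sqlite3. The sketch below is only an illustration built from the steps summarized earlier in this thread. The function name `emit_sql`, the `SHP2SQLITE` override, and the one-input-file-per-call loop are assumptions, not a description of what tiger_import actually does.

```shell
#!/bin/sh
# Sketch only: print all import SQL on stdout so it can be piped into
# sqlite3 without ever being stored on disk.
# Assumptions: $1 holds setup.sql and convert.sql, $2 holds the unzipped
# TIGER files, and shp2sqlite accepts one input file per call.
emit_sql() {
  sql_dir=$1
  tmp_dir=$2
  shp2sqlite_cmd=${SHP2SQLITE:-shp2sqlite}

  # Table definitions come first.
  cat "$sql_dir/setup.sql"

  # EDGES: shapefiles with geometry (flags as in the summary above).
  for f in "$tmp_dir"/*_edges.shp; do
    [ -e "$f" ] || continue   # the glob matched nothing
    "$shp2sqlite_cmd" -aS "$f" tiger_edges
  done

  # FEATNAMES and ADDR: attribute-only .dbf tables.
  for layer in featnames addr; do
    for f in "$tmp_dir"/*_"$layer".dbf; do
      [ -e "$f" ] || continue
      "$shp2sqlite_cmd" -an "$f" "tiger_$layer"
    done
  done

  # Transformations on the temporary tables go last.
  cat "$sql_dir/convert.sql"
}

# Example: emit_sql "$SQL" "$TMP" | sqlite3 "$DATABASE"
```

Because nothing is written between the generator and sqlite3, disk usage stays at the size of the unzipped data plus the database itself.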
from geocoder.
I could be wrong, but the state/county organization from TIGER/Line 2009 might be used in later steps after the import step.
from geocoder.
I just double checked and I'm not seeing any place where it's used. It looks like it simply imports the shp and dbf files into the database without regard to the folder names / placement. Of course, this shell script is pretty dense stuff for me. Here's my attempt to rewrite the above script while maintaining the whole pipe mechanism.
https://gist.github.com/1694885
I haven't run it, since I just decided to use a commercial product for geocoding. But I hope we can get to the bottom of this and update geocoder to the new database. I'm going to work on it this weekend.
from geocoder.
Good call. Just out of curiosity, what commercial geocoding software (or service) are you using?
I am working on porting TIGER/Line 2011 onto HDFS instead of a database. I will post an update once there is progress.
from geocoder.
Well, 90% of the data I was working on was just city/state/zip. So for those I used:
http://www.zipcodedownload.com/
Then I used the geocoder gem with Bing Maps for the last 10%:
https://github.com/alexreisner/geocoder
This is not ideal, but I think I ended up with pretty high-quality results. I'm hoping to get this geocoder database fixed: it doesn't have any usage limits and it's not locked down by any corporation or government.
from geocoder.
I can't believe it didn't occur to me, but all you need to do is use the tiger_import script. The import for 2011 goes like this:
First follow the Prerequisites section of the Geocoder man page (https://github.com/geocommons/geocoder) but skip "Additionally, you will need a custom build of the ‘sqlite3-ruby’ gem". It's not needed anymore. Next build the geocoder gem:
git clone git://github.com/geocommons/geocoder.git
cd geocoder
make install
gem install Geocoder-US-2.0.2pre.gem
On Mac OS X it will fail at "make install" with "ld: symbol(s) not found for architecture x86_64". Here's the fix:
cd src/shp2sqlite
make -f Makefile.macosx
cd ../..
make install
gem install Geocoder-US-2.0.2pre.gem
ruby -rgeocoder/us -e ''
# This last command will fail with a nasty error like:
# /Users/jjeffus/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/site_ruby/1.9.1/geocoder/us/database.rb:96:in `load_extension': dlopen(/Users/jjeffus/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/site_ruby/1.9.1/geocoder/us/sqlite3.so, 10): image not found (RuntimeError)
# To get a working geocoder/us you need to take the filename after dlopen( and copy the correct file there. In this case
# the file is: /Users/jjeffus/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/site_ruby/1.9.1/geocoder/us/sqlite3.so
# so: cp lib/geocoder/us/sqlite3.so /Users/jjeffus/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/site_ruby/1.9.1/geocoder/us/sqlite3.so
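The destination path can also be pulled out of the error message rather than copied by hand. A minimal sketch, assuming the error has the `dlopen(PATH, N)` shape shown above; `dlopen_target` is a hypothetical helper name, not part of geocoder.

```shell
#!/bin/sh
# Sketch: read an error message on stdin and print the path found inside
# "dlopen(PATH, N)". Prints nothing when the pattern is absent.
dlopen_target() {
  sed -n 's/.*dlopen(\([^,]*\),.*/\1/p'
}

# Example, run from the geocoder root:
#   target=$(ruby -rgeocoder/us -e '' 2>&1 | dlopen_target)
#   [ -n "$target" ] && cp lib/geocoder/us/sqlite3.so "$target"
```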
After you have successfully built geocoder::us, run the following from the geocoder root directory:
mkdir data
mkdir database
cd data
wget -nd -r -A.zip ftp://ftp2.census.gov/geo/tiger/TIGER2011/ADDR/
wget -nd -r -A.zip ftp://ftp2.census.gov/geo/tiger/TIGER2011/FEATNAMES/
wget -nd -r -A.zip ftp://ftp2.census.gov/geo/tiger/TIGER2011/EDGES/
cd ..
Now open "build/tiger_import" in the text editor of your choice and change:
SHP2SQLITE=../src/shp2sqlite/shp2sqlite
# to
SHP2SQLITE="$BASE/shp2sqlite"
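The same edit can be scripted instead of made by hand. A sketch, assuming the script's assignment sits on a line that starts with `SHP2SQLITE=`; `patch_shp2sqlite` is a hypothetical helper, and the `.bak` suffix keeps a copy of the original file.

```shell
#!/bin/sh
# Sketch: rewrite the SHP2SQLITE= line of the given script in place,
# leaving a backup next to it as FILE.bak.
# The single quotes keep $BASE literal so the script expands it later.
patch_shp2sqlite() {
  sed -i.bak 's|^SHP2SQLITE=.*|SHP2SQLITE="$BASE/shp2sqlite"|' "$1"
}

# Example: patch_shp2sqlite build/tiger_import
```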
Now we can finally do the import:
build/tiger_import database/geocoder.sqlite3 data
chmod +x build/build_indexes
build/build_indexes database/geocoder.sqlite3
sudo gem install text --no-rdoc --no-ri
bin/rebuild_metaphones database/geocoder.sqlite3
It took my Amazon EC2 extra-large instance about 8 hours to do the import. I'm going to put up a torrent of the finished sqlite database, as well as upload it on rapidshare or something. I'll post the links here.
Also, I'm going to fork the codebase and update the docs. This is one of the coolest libraries out there. I hope we can come together as a community and keep this thing working.
from geocoder.
I've uploaded a torrent of the full data here:
http://assuredwebdevelopment.com/geocoder_us_tigerline_2011.7z.torrent
Backup here:
http://www.mybtfiles.com/torrents/65950942/
from geocoder.
Can someone just upload their sqlite db file with 2011 loaded so that we can just use that? Are there problems with this approach?
from geocoder.
hekaldama: I did, it's in my last post. I uploaded it as a Torrent file. Let me know how that works out.
from geocoder.
Trying to download now. I am not sure if my firewall is blocking me or not, but it currently isn't downloading...
from geocoder.
I used this method on the TIGER2012 data. I was able to import and pass the tests. However, there are several lines like this in the log:
/tmp/tiger-import.9161/*_addr.dbf: dbf file (.dbf) can not be opened.
Is that something I should be worried about?
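One possible explanation, not confirmed in this thread, is that some counties have an EDGES archive but no ADDR archive, so the `*_addr.dbf` glob has nothing to match for them. The sketch below lists such counties. It assumes the downloaded archives are named like `tl_2012_01001_edges.zip` and all sit in one directory; `missing_addr` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: print the prefix (e.g. tl_2012_01001) of every county that has
# an edges archive but no matching addr archive in directory $1.
missing_addr() {
  for edges in "$1"/*_edges.zip; do
    [ -e "$edges" ] || continue        # the glob matched nothing
    prefix=${edges%_edges.zip}
    [ -e "${prefix}_addr.zip" ] || basename "$prefix"
  done
}

# Example: missing_addr data
```

If the counties it prints line up with the warnings in the log, the messages only mean there was no address data to load for those counties.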
from geocoder.
Here you go guys: https://www.dropbox.com/s/7so3ivq2npxcndy/geocoder_us_tigerline_2011.7z
from geocoder.
Has anyone uploaded a built 2012 sqlite database? This 2011 7z file throws an error when I try to decompress it :/
from geocoder.
Here is the 2014 raw sqlite db.
http://downloads.codefi.re/shelnutt2/geocoder_tiger_2014.db
from geocoder.