
zipcode_district's Introduction

** Mapping between postal codes and provinces, cities/counties, and districts

I. Background

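While processing data for a project, I needed to aggregate statistics by ** province, city, and district. The data, however, had no complete structured region information, only long strings of very non-standard pinyin or English. The only field that identified the region with reasonable accuracy was the postal code.
So I took it for granted that a complete table could be downloaded from the web, and that postal codes and administrative divisions map one to one.
Actual data processing revealed the following problems:
1. Tables downloaded directly from the web are incomplete and contain errors.
2. Postal codes and administrative divisions are not one-to-one. Some districts share a single postal code. In addition, many people write only the city-level code, and some simply write a wrong one (the plan is to ignore those).
3. Cleaning the data by hand takes a very long time and still leaves many errors.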

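II. Goals and target data structure

A city-level postal code should quickly resolve to its province and city.
A district-level postal code should also resolve to its province and city.
One postal code may belong to several districts, and this must not be glossed over.
The table needs at least 4 fields: province, city/county, district, postal code.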
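
The four-field table and the one-to-many requirement above can be sketched as follows. This is only a minimal illustration using SQLite, not the project's actual schema: the table name `zipcode_district`, the column names, and the helper functions `create_table`, `lookup`, and `lookup_province_city` are all assumptions introduced here. An empty `district` is assumed to mark a city-level row.

```python
import sqlite3


def create_table(conn):
    """Create a hypothetical four-field mapping table.

    The composite primary key lets one postal code appear in several
    rows, one per district that shares it.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS zipcode_district (
            province TEXT NOT NULL,
            city     TEXT NOT NULL,
            district TEXT NOT NULL DEFAULT '',  -- '' assumed for city-level rows
            zipcode  TEXT NOT NULL,
            PRIMARY KEY (zipcode, province, city, district)
        )
        """
    )


def lookup(conn, zipcode):
    """Return every (province, city, district) row for a postal code."""
    cur = conn.execute(
        "SELECT province, city, district FROM zipcode_district "
        "WHERE zipcode = ? ORDER BY province, city, district",
        (zipcode,),
    )
    return cur.fetchall()


def lookup_province_city(conn, zipcode):
    """Return the distinct (province, city) pairs for a postal code."""
    cur = conn.execute(
        "SELECT DISTINCT province, city FROM zipcode_district "
        "WHERE zipcode = ? ORDER BY province, city",
        (zipcode,),
    )
    return cur.fetchall()
```

Because `lookup` returns a list rather than a single row, a postal code shared by several districts yields all of them, while `lookup_province_city` collapses them to the province and city they have in common.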

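III. Steps

1. Scrape the 2345 postal code lookup web page; a single page is enough.
2. Parse the data.
3. Create the table and load the data.
4. Update.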
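
Steps 2 to 4 might look roughly like the sketch below. It does not reproduce the structure of the 2345 page, which is not described here. Instead it assumes the scraped content has already been reduced to tab-separated lines of province, city, district, and postal code. That intermediate format, the table name `zipcode_district`, and the helpers `parse_rows` and `load` are hypothetical.

```python
import sqlite3


def parse_rows(lines):
    """Parse hypothetical tab-separated lines into 4-tuples.

    Lines without exactly four fields, or whose postal code is not
    six ASCII digits, are skipped.
    """
    rows = []
    for line in lines:
        parts = [p.strip() for p in line.rstrip("\n").split("\t")]
        if len(parts) != 4:
            continue
        province, city, district, zipcode = parts
        if not (zipcode.isascii() and zipcode.isdigit() and len(zipcode) == 6):
            continue
        rows.append((province, city, district, zipcode))
    return rows


def load(conn, rows):
    """Create the table if needed, insert rows, and return the row count.

    INSERT OR IGNORE makes a repeated load (the "update" step) idempotent:
    rows already present are left untouched.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS zipcode_district (
            province TEXT NOT NULL,
            city     TEXT NOT NULL,
            district TEXT NOT NULL DEFAULT '',
            zipcode  TEXT NOT NULL,
            PRIMARY KEY (zipcode, province, city, district)
        )
        """
    )
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR IGNORE INTO zipcode_district "
            "(province, city, district, zipcode) VALUES (?, ?, ?, ?)",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM zipcode_district").fetchone()[0]
```

Re-running `load` with a fresh scrape adds only the rows that are new, so the update step can reuse the same code path as the initial load.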

IV. Usage notes
