happyte / buyhouse
🏠 A Python Scrapy crawler that scrapes new housing listings for the Chengdu area from Lianjia (链家网) and displays them on a map using the Amap (高德) API
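The project and the spider are both named fangjia, and running the crawl raises the exception below. Renaming the project package buyhouse/fangjia -> buyhouse/fangjiaCD resolves it (fangjiaCD/settings.py and buyhouse/scrapy.cfg must be updated to match).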
$ scrapy crawl fangjia -o rent.csv -t csv
Traceback (most recent call last):
File "/usr/local/bin/scrapy", line 11, in <module>
sys.exit(execute())
File "/usr/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 148, in execute
cmd.crawler_process = CrawlerProcess(settings)
File "/usr/local/lib/python2.7/site-packages/scrapy/crawler.py", line 243, in __init__
super(CrawlerProcess, self).__init__(settings)
File "/usr/local/lib/python2.7/site-packages/scrapy/crawler.py", line 134, in __init__
self.spider_loader = _get_spider_loader(settings)
File "/usr/local/lib/python2.7/site-packages/scrapy/crawler.py", line 330, in _get_spider_loader
return loader_cls.from_settings(settings.frozencopy())
File "/usr/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 61, in from_settings
return cls(settings)
File "/usr/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 25, in __init__
self._load_all_spiders()
File "/usr/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 47, in _load_all_spiders
for module in walk_modules(name):
File "/usr/local/lib/python2.7/site-packages/scrapy/utils/misc.py", line 71, in walk_modules
submod = import_module(fullpath)
File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "/Users/shidonghua/git-project/buyhouse/fangjia/spiders/fangjia.py", line 3, in <module>
from fangjia.items import FangjiaItem
ImportError: No module named items
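A likely reading of this traceback, stated as an assumption: under Python 2.7 the statement `from fangjia.items import FangjiaItem` inside `fangjia/spiders/fangjia.py` is first tried as an implicit relative import, so `fangjia` resolves to the spider module itself, which has no `items` submodule. The sketch below lists modules inside a package that share the package's name. `find_shadowing_modules` is a hypothetical helper for illustration, not part of Scrapy.

```python
import os


def find_shadowing_modules(project_root, package_name):
    """Return paths (relative to project_root) of modules inside the
    package whose file name equals the package name."""
    package_dir = os.path.join(project_root, package_name)
    target = package_name + ".py"
    matches = []
    # Walk the whole package tree, e.g. fangjia/, fangjia/spiders/, ...
    for dirpath, _dirnames, filenames in os.walk(package_dir):
        if target in filenames:
            full_path = os.path.join(dirpath, target)
            matches.append(os.path.relpath(full_path, project_root))
    return sorted(matches)
```

Called as `find_shadowing_modules("buyhouse", "fangjia")`, it would report `fangjia/spiders/fangjia.py` for the layout shown in the traceback. Besides renaming the package, adding `from __future__ import absolute_import` at the top of the spider module may also avoid the clash on Python 2, since it disables implicit relative imports.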
Exception when parsing address
address = response.xpath('//p[@class="where"]/span/@title').extract()[0]
IndexError: list index out of range
Fix: modify fangjia.py
address = response.xpath('//p[@class="where manager" or @class="where "]/span/@title').extract()[0]
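The fix above enumerates the exact `class` strings seen on the page, and `[0]` still raises `IndexError` when nothing matches. A minimal sketch of the same idea with token-based class matching and a `None` fallback follows. It uses the standard library's `xml.etree.ElementTree` as a stand-in for Scrapy selectors so that it runs on its own; `extract_address` and the sample markup in the usage note are invented for illustration and are not taken from the Lianjia page.

```python
import xml.etree.ElementTree as ET


def extract_address(markup):
    """Return the title of the first <span> directly under a <p> whose
    class list contains the token 'where', or None if there is none."""
    root = ET.fromstring(markup)
    for p in root.iter("p"):
        # Split on whitespace so "where", "where " and "where manager" all match.
        class_tokens = p.get("class", "").split()
        if "where" in class_tokens:
            span = p.find("span")
            if span is not None and span.get("title"):
                return span.get("title")
    # No match: return None instead of raising IndexError.
    return None
```

For example, `extract_address('<div><p class="where manager"><span title="Example Road 1"/></p></div>')` returns the span's title, while markup without a matching `<p>` yields `None`. In the spider itself, the equivalent XPath idiom would be `contains(concat(" ", normalize-space(@class), " "), " where ")`, and Scrapy's `extract_first()` returns `None` rather than raising when the selector is empty.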
from fangjia.items import FangjiaItem
ImportError: No module named items
There is no module named items.
Running the following in the buyhouse directory:
scrapy crawl fangjia -o rent1.csv -t csv
fails with:
File "/usr/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 51, in load
raise KeyError("Spider not found: {}".format(spider_name))
KeyError: 'Spider not found: fangjia'
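One possible cause of `Spider not found`, offered as an assumption rather than a diagnosis, is that `scrapy.cfg` or `settings.py` still points at a package name that no longer matches the directory on disk, so the spider module is never loaded. The sketch below reads the `default` entry of the `[settings]` section in `scrapy.cfg` and checks that the module it names exists. `settings_module_path` is a hypothetical helper, and the code targets Python 3 (`configparser`), not the Python 2.7 shown in the tracebacks.

```python
import configparser
import os


def settings_module_path(project_root):
    """Return (dotted_name, exists) for the settings module that
    scrapy.cfg names under [settings] default."""
    parser = configparser.ConfigParser()
    parser.read(os.path.join(project_root, "scrapy.cfg"))
    dotted = parser.get("settings", "default")
    # "fangjia.settings" -> <project_root>/fangjia/settings.py
    path = os.path.join(project_root, *dotted.split(".")) + ".py"
    return dotted, os.path.isfile(path)
```

If `settings_module_path("buyhouse")` reports that the module does not exist, the package name in `scrapy.cfg` is out of date. If it does exist, the next things to compare are `SPIDER_MODULES` in that settings file and the `name` attribute of the spider class against the name passed to `scrapy crawl`.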