bowenpay / wechat-spider
WeChat official account crawler
Home Page: http://wechatspider.0fenbei.com/
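Some articles get deleted after a while. It would be great to have a feature that saves images to a local server or to the cloud.
Under the project's current framework, there seems to be no convenient way to fetch an official account's historical articles. Is it limited to the 10 most recent ones?
During the crawler startup steps, the terminal shows no output after I run "extractor" and "processor". In the browser, the "waiting to download" count increases, but "latest articles" never updates. What could be the cause?
# Not rate-limited; download can proceed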
DEBUG 2016-09-29 23:23:39,015 remote_connection 85800 140735118328576 POST http://127.0.0.1:60960/hub/session {"requiredCapabilities": {}, "desiredCapabilities": {"binary": "/Applications/Firefox.app/Contents/MacOS/firefox-bin", "firefox_profile": "UEsDBBQAAAAIAPK6PUlBOpIx6gMAAKMOAAAHAAAAdXNlci5qc51XwW7jNhC99ysWOXWBmkg23Ut7SrMpUGCxKdYI9khQ1MhiTJFccmjFf98hZSF2LFFOb5LMIWfevPc4jgE8dx6aX69qgcKDsx6V2TBntZJ7lj6uY9WpEJQ1/+aPd1KCQ6ivfvvQCB3g45+/xKNtbMc68cJl620HPEivHHIfDUfVAcXcXp8G9FDVXu3oHYyoNHAjkF457MBgmD4ErdVbhcwA9tZvO2HEBjyrVUg7UAz6+CYkgIxe4Z71whu+U9BTmbxTL1Cz0NqeWyNh+rRnsRNDHcw6JBxCjvjH3NOjnTkPXhBMyIujIxiBDeXNwHaaHlUOPuXXg9hOB1Te9vTGHp6+3rFbJl6bcp6LcO6iJEZYETR0gH7PPDyDnNt2TCGA8LI9nFDONogG8nPiWCc0VbuQ06HFrEV0iVgraY2hnBKyKwd+RUuJPBR8c12CVIPYJUTzV1hq+akYWhAa2+GdpfOUBNYoH/B7NNMbvLJahBBJCNEQgIGg5KSlmBMuIEqnY3TMEa1p4TsKu4AslrLyqp6m7bgSRRXy/o/m0cFMka8EyPYQ0BKyHnK9DcmfSy9COx17rg9jUTX7p5CxKUlEBnfgTDmtZEWk8tr2rI6dO+JZUa6yBbm9t50jH6qUTicatWlR7xdQODStJd87NO5KVDbiH5UWZns1HaWpdE/2WFbB6KrLdvo2C7IQLZMTR6/LGdWwSwYQGOVjvRy8rQhawVPLUNmm0crMtK9UARN1rZL2xUIxJ7bhWhXa/Sr97EiOKw1mg4mXnz5/nrX6UuGOyO0wJJXwztZCL3Rv3k203WzSxwPcD6VtJk20ePCLU6RKoTULakP3KhlEIIH+jMovHtIChWUtfIFGRI1/Db+86/4qmdL5bfPOy2noQl7/ac4BeqOtqNk4ISTb/9GCWSeOUYLzajvMEtyS+/E65mrSVmXUiNVblkOSofiBpjPJGeiJP4nh/5s941102WjRKA30REbllURuvSLq8WHSK5flQdQJTsA8Dfydrr47ykcuDz+VtnKrVcDLrvmhUauDQayIvBhnhsACLtGlVhXFdJTiqMGS4o9GKBHRziBNIrOG0Oqgq8Cv8+tM9oWBb2k4OScbUWm46M7ZdlRnSvzLQOu1pLgwMTSdkpMtXhpF3S9OWaSzoWOcPIo/B1JNGmPVgNp5F179rG8VGUGiVdnOjuawPCEfzWESPE4fMzEE3Wsb5u6rEwhC+reEnPIsWN8F/7WK5B3lQayktmpNKFd7Xg8+Xb6qRxN4q/7budFONyG6QVUDGWYGGfpUP5ofmYbh5popAto+ff962RhEduu02EP9cHO9/kaTYObM77T4P1BLAQIUAxQAAAAIAPK6PUlBOpIx6gMAAKMOAAAHAAAAAAAAAAAAAACkgQAAAAB1c2VyLmpzUEsFBgAAAAABAAEANQAAAA8EAAAAAA==", "args": [], "marionette": false}}
DEBUG 2016-09-29 23:23:39,126 remote_connection 85800 140735118328576 Finished Request
DEBUG 2016-09-29 23:23:39,127 remote_connection 85800 140735118328576 POST http://127.0.0.1:60960/hub/session/f8517508-33ac-314b-8671-6428b0545704/url {"url": "http://weixin.sogou.com/", "sessionId": "f8517508-33ac-314b-8671-6428b0545704"}
ERROR 2016-09-29 23:23:40,084 downloaders 85800 140735118328576 ''
Traceback (most recent call last):
File "/Users/onestar/wechat_spider/wechat-spider/wechat/downloaders.py", line 101, in download_wechat
self.visit_wechat_index(wechatid)
File "/Users/onestar/wechat_spider/wechat-spider/wechat/downloaders.py", line 163, in visit_wechat_index
browser.get("http://weixin.sogou.com/")
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 248, in get
self.execute(Command.GET, {'url': url})
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 234, in execute
response = self.command_executor.execute(driver_command, params)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 401, in execute
return self._request(command_info[0], url, body=data)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 433, in _request
resp = self._conn.getresponse()
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1136, in getresponse
response.begin()
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 453, in begin
version, status, reason = self._read_status()
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 417, in _read_status
raise BadStatusLine(line)
BadStatusLine: ''
DEBUG 2016-09-29 23:23:40,092 remote_connection 85800 140735118328576 DELETE http://127.0.0.1:60960/hub/session/f8517508-33ac-314b-8671-6428b0545704/cookie {"sessionId": "f8517508-33ac-314b-8671-6428b0545704"}
ERROR 2016-09-29 23:23:40,093 downloaders 85800 140735118328576 [Errno 61] Connection refused
Traceback (most recent call last):
File "/Users/onestar/wechat_spider/wechat-spider/wechat/downloaders.py", line 49, in __exit__
self.browser.delete_all_cookies()
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 655, in delete_all_cookies
self.execute(Command.DELETE_ALL_COOKIES)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 234, in execute
response = self.command_executor.execute(driver_command, params)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 401, in execute
return self._request(command_info[0], url, body=data)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 432, in _request
self._conn.request(method, parsed_url.path, body, headers)
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1057, in request
self._send_request(method, url, body, headers)
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1097, in _send_request
self.endheaders(body)
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1053, in endheaders
self._send_output(message_body)
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 897, in _send_output
self.send(msg)
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 859, in send
self.connect()
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 836, in connect
self.timeout, self.source_address)
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 575, in create_connection
raise err
error: [Errno 61] Connection refused
OS: macOS
I can't really make sense of the debug output.
The data I see in the database looks encrypted. Any ideas?
Applying wechat.0001_initial...Traceback (most recent call last):
File "manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/usr/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 338, in execute_from_command_line
utility.execute()
File "/usr/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 330, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/usr/local/lib/python2.7/site-packages/django/core/management/base.py", line 390, in run_from_argv
self.execute(*args, **cmd_options)
File "/usr/local/lib/python2.7/site-packages/django/core/management/base.py", line 441, in execute
output = self.handle(*args, **options)
File "/usr/local/lib/python2.7/site-packages/django/core/management/commands/migrate.py", line 221, in handle
executor.migrate(targets, plan, fake=fake, fake_initial=fake_initial)
File "/usr/local/lib/python2.7/site-packages/django/db/migrations/executor.py", line 110, in migrate
self.apply_migration(states[migration], migration, fake=fake, fake_initial=fake_initial)
File "/usr/local/lib/python2.7/site-packages/django/db/migrations/executor.py", line 147, in apply_migration
state = migration.apply(state, schema_editor)
File "/usr/local/lib/python2.7/site-packages/django/db/backends/base/schema.py", line 91, in __exit__
self.execute(sql)
File "/usr/local/lib/python2.7/site-packages/django/db/backends/base/schema.py", line 111, in execute
cursor.execute(sql, params)
File "/usr/local/lib/python2.7/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/usr/local/lib/python2.7/site-packages/django/db/utils.py", line 97, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/usr/local/lib/python2.7/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/usr/local/lib/python2.7/site-packages/django/db/backends/mysql/base.py", line 124, in execute
return self.cursor.execute(query, args)
File "/usr/local/lib/python2.7/site-packages/MySQLdb/cursors.py", line 205, in execute
self.errorhandler(self, exc, value)
File "/usr/local/lib/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
django.db.utils.OperationalError: (1071, 'Specified key was too long; max key length is 767 bytes')
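One way to read the error above: the index key limit it reports is 767 bytes, and MySQL budgets the maximum byte width of the column's character set for every character (3 bytes for utf8, 4 for utf8mb4). The arithmetic can be sketched as follows. The 255-character column length is an assumption for illustration and is not taken from the project's models.

```python
# Maximum bytes MySQL reserves per character for common character sets.
BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}

# Limit reported in the error message above.
MAX_KEY_BYTES = 767


def index_key_bytes(char_length, charset):
    """Bytes an index over a VARCHAR(char_length) column would need."""
    return char_length * BYTES_PER_CHAR[charset]


def max_indexable_chars(charset, limit=MAX_KEY_BYTES):
    """Longest VARCHAR length whose full index still fits under the limit."""
    return limit // BYTES_PER_CHAR[charset]


if __name__ == "__main__":
    for charset in ("utf8", "utf8mb4"):
        needed = index_key_bytes(255, charset)  # 255 is a hypothetical length
        print(charset, needed, needed <= MAX_KEY_BYTES, max_indexable_chars(charset))
```

Under these assumptions, an indexed 255-character column fits with utf8 but not with utf8mb4, so the database's default character set is worth checking first.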
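Could you add support for storing images and videos locally?
For the same article there is a temporary link beginning with timestamp=1479828320 and a link that stays accessible permanently. How can I capture the permanent link beginning with __biz=MzI0MjA1Mjg2Ng?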
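As a starting point, the two link styles described above can at least be told apart by their query parameters. This is a minimal sketch: the function name `classify_wechat_url` is hypothetical, and the `mid`, `idx` and `sn` values in the usage note are placeholders. It only classifies a URL; it does not convert a temporary link into a permanent one.

```python
try:
    from urllib.parse import urlparse, parse_qs  # Python 3
except ImportError:
    from urlparse import urlparse, parse_qs  # Python 2


def classify_wechat_url(url):
    """Return 'permanent', 'temporary', or 'unknown' based on query parameters."""
    params = parse_qs(urlparse(url).query)
    if "__biz" in params:
        # Permanent links carry the account identifier in __biz.
        return "permanent"
    if "timestamp" in params and "signature" in params:
        # Temporary links are signed and tied to a timestamp.
        return "temporary"
    return "unknown"
```

For example, `classify_wechat_url("http://mp.weixin.qq.com/s?__biz=MzI0MjA1Mjg2Ng&mid=1&idx=1&sn=x")` falls into the permanent branch, while a URL starting with `timestamp=1479828320` and carrying a `signature` falls into the temporary one.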
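The admin backend shows the proxy status as abnormal. Where do I need to configure the proxy?
Is it possible to run without a proxy?
In the log files: all the other log files are empty, and the downloader reports the following error: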
DEBUG 2017-03-10 16:39:37,900 remote_connection 17038 140391779444544 Finished Request
ERROR 2017-03-10 16:39:37,901 downloaders 17038 140391779444544 Message: Unable to locate element: {"method":"xpath","selector":"//div[@class='txt-box']/p[@class='info']/label"}
Stacktrace:
at FirefoxDriver.prototype.findElementInternal_ (file:///tmp/tmprZaDO2/extensions/fxdriver@googlecode.com/components/driver-component.js:10723)
at FirefoxDriver.prototype.findElement (file:///tmp/tmprZaDO2/extensions/fxdriver@googlecode.com/components/driver-component.js:10732)
at DelayedCommand.prototype.executeInternal_/h (file:///tmp/tmprZaDO2/extensions/fxdriver@googlecode.com/components/command-processor.js:12614)
at DelayedCommand.prototype.executeInternal_ (file:///tmp/tmprZaDO2/extensions/fxdriver@googlecode.com/components/command-processor.js:12619)
at DelayedCommand.prototype.execute/< (file:///tmp/tmprZaDO2/extensions/fxdriver@googlecode.com/components/command-processor.js:12561)
Traceback (most recent call last):
File "/root/wechat-spider/wechat/downloaders.py", line 102, in download_wechat
if self.visit_wechat_topic_list(wechatid):
File "/root/wechat-spider/wechat/downloaders.py", line 188, in visit_wechat_topic_list
element_wechat = browser.find_element_by_xpath("//div[@class='txt-box']/p[@class='info']/label")
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 258, in find_element_by_xpath
return self.find_element(by=By.XPATH, value=xpath)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 712, in find_element
{'using': by, 'value': value})['value']
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 201, in execute
self.error_handler.check_response(response)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
NoSuchElementException: Message: Unable to locate element: {"method":"xpath","selector":"//div[@class='txt-box']/p[@class='info']/label"}
Stacktrace:
at FirefoxDriver.prototype.findElementInternal_ (file:///tmp/tmprZaDO2/extensions/fxdriver@googlecode.com/components/driver-component.js:10723)
at FirefoxDriver.prototype.findElement (file:///tmp/tmprZaDO2/extensions/fxdriver@googlecode.com/components/driver-component.js:10732)
at DelayedCommand.prototype.executeInternal_/h (file:///tmp/tmprZaDO2/extensions/fxdriver@googlecode.com/components/command-processor.js:12614)
at DelayedCommand.prototype.executeInternal_ (file:///tmp/tmprZaDO2/extensions/fxdriver@googlecode.com/components/command-processor.js:12619)
at DelayedCommand.prototype.execute/< (file:///tmp/tmprZaDO2/extensions/fxdriver@googlecode.com/components/command-processor.js:12561)
DEBUG 2017-03-10 16:39:37,903 remote_connection 17038 140391779444544 DELETE http://127.0.0.1:46313/hub/session/30f3cc8c-1257-4f04-b4a3-8175dc26eb00/cookie {"sessionId": "30f3cc8c-1257-4f04-b4a3-8175dc26eb00"}
Could someone help me take a look at this error?
title
origin_title
source
content
words
abstract
avatar
Traceback (most recent call last):
File "extractor.py", line 409, in <module>
my_extractor.run()
File "extractor.py", line 404, in run
self.get_detail(body, data)
File "extractor.py", line 381, in get_detail
col_value = self.extract(content, col_rules, {'data': result})
File "extractor.py", line 335, in extract
extractor = XPathExtractor(res, rule["data"])
File "/home/xiaocui/cui/xiaocui/wechat-spider/wechat/extractors.py", line 130, in __init__
self.tree = etree.parse(StringIO(content), htmlparser)
File "src/lxml/lxml.etree.pyx", line 3427, in lxml.etree.parse (src/lxml/lxml.etree.c:81117)
File "src/lxml/parser.pxi", line 1828, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:118072)
File "src/lxml/parser.pxi", line 1848, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:118341)
File "src/lxml/parser.pxi", line 1729, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:116899)
File "src/lxml/parser.pxi", line 1063, in lxml.etree._BaseParser._parseUnicodeDoc (src/lxml/lxml.etree.c:110886)
File "src/lxml/parser.pxi", line 595, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:105109)
File "src/lxml/parser.pxi", line 706, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:106817)
File "src/lxml/parser.pxi", line 646, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:105963)
File "", line 0
lxml.etree.XMLSyntaxError
The content and source fields are very large, which makes queries too slow. I suggest splitting them into a separate table.
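The suggestion above, moving the large columns out of the frequently queried table, is usually called vertical partitioning. A minimal sketch using SQLite from the standard library follows. The table names `topic` and `topic_body` and the helper functions are hypothetical; the project itself uses Django models on MySQL, so this only illustrates the layout.

```python
import sqlite3


def make_db():
    """Create an in-memory database with the narrow and the wide table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE topic (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            abstract TEXT
        );
        CREATE TABLE topic_body (
            topic_id INTEGER PRIMARY KEY REFERENCES topic(id),
            content TEXT,
            source TEXT
        );
    """)
    return conn


def add_topic(conn, title, abstract, content, source):
    """Insert the small fields and the large fields into their own tables."""
    cur = conn.execute(
        "INSERT INTO topic (title, abstract) VALUES (?, ?)", (title, abstract))
    conn.execute(
        "INSERT INTO topic_body (topic_id, content, source) VALUES (?, ?, ?)",
        (cur.lastrowid, content, source))
    return cur.lastrowid


def list_topics(conn):
    """List queries read only the narrow table and never load content/source."""
    return conn.execute("SELECT id, title FROM topic ORDER BY id").fetchall()


def get_body(conn, topic_id):
    """Load the large fields only when a single article is opened."""
    return conn.execute(
        "SELECT content, source FROM topic_body WHERE topic_id = ?",
        (topic_id,)).fetchone()
```

List pages then scan only short rows, and the large fields are read one article at a time.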
DEBUG 2017-03-26 13:41:36,828 __init__ 16456 140440915134208 version=0.2.3
DEBUG 2017-03-26 13:41:36,829 __init__ 16456 140440915134208 version=0.2.1
^CTraceback (most recent call last):
File "bin/processor.py", line 57, in <module>
processor.run()
File "bin/processor.py", line 45, in run
rsp = r.brpop(settings.CRAWLER_CONFIG["processor"])
File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 1183, in brpop
return self.execute_command('BRPOP', *keys)
File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 573, in execute_command
return self.parse_response(connection, command_name, **options)
File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 585, in parse_response
response = connection.read_response()
File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 577, in read_response
response = self._parser.read_response()
File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 238, in read_response
response = self._buffer.readline()
File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 168, in readline
self._read_from_socket()
File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 126, in _read_from_socket
data = self._sock.recv(socket_read_size)
KeyboardInterrupt
Forbidden (403)
CSRF verification failed. Request aborted.
Help
Reason given for failure:
CSRF token missing or incorrect.
In general, this can occur when there is a genuine Cross Site Request Forgery, or when Django's CSRF mechanism has not been used correctly. For POST forms, you need to ensure:
Your browser is accepting cookies.
The view function passes a request to the template's render method.
In the template, there is a {% csrf_token %} template tag inside each POST form that targets an internal URL.
If you are not using CsrfViewMiddleware, then you must use csrf_protect on any views that use the csrf_token template tag, as well as those that accept the POST data.
The form has a valid CSRF token. After logging in in another browser tab or hitting the back button after a login, you may need to reload the page with the form, because the token is rotated after a login.
You're seeing the help section of this page because you have DEBUG = True in your Django settings file. Change that to False, and only the initial error message will be displayed.
You can customize this page using the CSRF_FAILURE_VIEW setting.
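This shouldn't be raising a CSRF error.
After starting the program, downloader.py reports the error below. I really can't find the cause and would appreciate some help. Thanks.
Note: Firefox 36.0 is installed on this machine; the OS is macOS 10.12.3.
The error message is as follows: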
[Errno 8] Exec format error
Traceback (most recent call last):
File "bin/downloader.py", line 96, in <module>
downloader.run()
File "bin/downloader.py", line 77, in run
with SeleniumDownloaderBackend(proxy=proxy) as browser:
File "/Users/gaoxiaofei/pywww/wechat-spider/wechat/downloaders.py", line 42, in __enter__
self.browser = self.get_browser(self.proxy)
File "/Users/gaoxiaofei/pywww/wechat-spider/wechat/downloaders.py", line 90, in get_browser
browser = webdriver.Firefox(firefox_profile=firefox_profile)
File "/Library/Python/2.7/site-packages/selenium/webdriver/firefox/webdriver.py", line 145, in __init__
self.service.start()
File "/Library/Python/2.7/site-packages/selenium/webdriver/common/service.py", line 74, in start
stdout=self.log_file, stderr=self.log_file)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1335, in _execute_child
raise child_exception
OSError: [Errno 8] Exec format error
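"Exec format error" is what the operating system returns when it is asked to execute a file that is not a valid executable for the current platform. In the traceback above the failure occurs while Selenium starts its driver service, so one plausible cause (an assumption, not confirmed in this thread) is a driver binary built for a different operating system. A minimal sketch for checking a file's format from its first bytes follows; the function names are hypothetical.

```python
import struct

# Mach-O magic numbers (32/64-bit, both byte orders) plus the fat-binary magic.
# Note: 0xCAFEBABE is also used by Java class files.
MACHO_MAGICS = {0xFEEDFACE, 0xFEEDFACF, 0xCEFAEDFE, 0xCFFAEDFE, 0xCAFEBABE}


def detect_format_from_header(head):
    """Guess the executable format from the first four bytes of a file."""
    if head.startswith(b"\x7fELF"):
        return "ELF (Linux)"
    if head.startswith(b"MZ"):
        return "PE (Windows)"
    if head.startswith(b"#!"):
        return "script"
    if len(head) >= 4 and struct.unpack(">I", head[:4])[0] in MACHO_MAGICS:
        return "Mach-O (macOS)"
    return "unknown"


def detect_binary_format(path):
    """Read the header of the file at `path` and classify it."""
    with open(path, "rb") as f:
        return detect_format_from_header(f.read(4))
```

On macOS, a driver binary reported as anything other than Mach-O would be consistent with this error.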
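If you are interested in this project and would like to take part, please leave a comment here to let me know.
Below are the features to be implemented next (checked items have been claimed; unchecked items are still open):
Documentation:
Key features:
Minor features:
If there is an item you are interested in, or a feature you particularly want implemented, leave a comment here and I will reply.
The collaboration workflow:
1) Pick the item you are interested in and leave a comment. I will assign that module to you.
2) Fork my project and submit a pull request with your changes.
3) I review and merge your code, and add you to the list of contributors.
From what I can see, every newly added official account yields at most 10 articles, while Sogou shows all the articles from the 10 most recent broadcasts.
Then I looked at this part of wechat\downloaders.py: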
def download_wechat_topics(self, wechat_id, process_topic):
    """ On the account's article list page, open each article in turn and crawl it """
    browser = self.browser
    js = """ return document.documentElement.innerHTML; """
    body = browser.execute_script(js)
    htmlparser = etree.HTMLParser()
    tree = etree.parse(StringIO(body), htmlparser)
    elems = [item.strip() for item in tree.xpath("//h4[@class='weui_media_title']/text()") if item.strip()]
    hrefs = ['http://mp.weixin.qq.com%s' % item for item in tree.xpath("//h4[@class='weui_media_title']/@hrefs")]
    elems_avatars = tree.xpath("//div[@class='weui_media_box appmsg']/span/@style")
    avatars = [item[21:-1] for item in elems_avatars]
    elems_abstracts = tree.xpath("//p[@class='weui_media_desc']")
    abstracts = [item.text.strip() if item.text else '' for item in elems_abstracts]
    links = []
    for idx, item in enumerate(elems[:10]):
        title = item
        print title
Does this for loop deliberately limit the crawl to 10 articles? Why not crawl everything that Sogou lists?
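The `elems[:10]` slice in the excerpt is what caps the loop at ten items, regardless of how many titles the page contains. A minimal sketch of making that cap an explicit parameter follows; `pair_topics` is a hypothetical helper, not part of the project, and lifting the cap only helps if the list page actually exposes more than ten entries.

```python
def pair_topics(titles, hrefs, limit=10):
    """Pair titles with links, keeping at most `limit` items (None keeps all)."""
    pairs = list(zip(titles, hrefs))
    if limit is None:
        return pairs
    return pairs[:limit]
```

Calling `pair_topics(titles, hrefs, limit=None)` returns every title/link pair found on the page.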
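The system is Mac OS 10.10.5.
1. Installed Python 2.7 using virtualenv.
2. MySQL-python and lxml install fine, but their dependency packages refuse to install. It keeps reporting:
After struggling for a long time without success, I searched PyPI (https://pypi.python.org/pypi) and could not find these dependencies there either, so I skipped that step and then successfully installed Firefox 36.0.
Cloned the code and installed the Python dependencies; in the end everything in requirements.txt was installed.
3. Created the MySQL database: done
4. Installed and started Redis: done
5. Updated the local_settings configuration file: done
6. Initializing the tables failed with a long list of errors:
Could you tell me what might have gone wrong? Thanks.
Also, the command python manage.py runserver 0.0.0.0:8001 presumably doesn't work on Windows? Can this program run on Windows 7?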
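Searching for official accounts still doesn't seem to work. From the logs, it looks like the request is redirected to a CAPTCHA page. The article pages also seem to have added verification: much of what gets crawled now is just "操作过于频繁,请稍后再试" ("Too many requests, please try again later").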
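A crawler can avoid storing these block pages as articles by checking for the message quoted above before processing a page. A minimal sketch follows; `looks_rate_limited` and `backoff_delays` are hypothetical helpers, and the marker string is simply the message reported in this issue, which the site may change at any time.

```python
# -*- coding: utf-8 -*-

# Message reported in the issue above ("Too many requests").
RATE_LIMIT_MARKER = u"操作过于频繁"


def looks_rate_limited(html):
    """Return True if the page body contains the rate-limit message."""
    return RATE_LIMIT_MARKER in html


def backoff_delays(base_seconds, retries):
    """Exponentially growing wait times, e.g. base 5 and 3 retries -> 5, 10, 20."""
    return [base_seconds * (2 ** attempt) for attempt in range(retries)]
```

A downloader could skip any page for which `looks_rate_limited` is true and wait for the next delay before retrying.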
Filter by keyword and view count, keeping only high-quality articles.
Message: Can't load the profile. Profile Dir: /tmp/tmpluVvmZ If you specified a log_file in the FirefoxBinary constructor, check it for details.
This message appears in wechatspider_downloader.stderr.log, and downloading no longer works.
What is the problem here?
DEBUG 2017-03-17 15:31:09,533 remote_connection 1417 140735238418432 POST http://127.0.0.1:50353/session {"requiredCapabilities": {}, "desiredCapabilities": {"javascriptEnabled": true, "platform": "ANY", "browserName": "firefox", "version": "", "moz:firefoxOptions": {"profile": "UEsDBBQAAAAIAOR7cUqBD/yDnAMAAMAMAAAHAAAAdXNlci5qc51WTW/cNhC991cUPrVATPijubQn13GAAEFcZGHkSFDUaEUvRTIccuX99xmKq+zaK1Gb3vTBGT7OvPeGEcFz56H546IWQXhw1gdl1sxZreSOpY+rWHUKUVnz3/DxTkpwAeqLd783QiP8+c9v8SiN7VgnXrhsve2Ao/TKBe6j4UF1QDG3V68Deqhqr7b0DkZUGrgRgV45bMEEpIDg45s9grV6owIzEHrrN50wYg2e1QpTgsmQytue3phtGq0MTGN/FluRATPrAh0YGba2/2Tu6dHOZIaXAAaHxdFRvYDlc8zUZwTy8PT5jt0ycSjmaWrh3Fk5x3oE0NBB8Dvm4RnkXNoRAoLwst3vUEaLooHhOXGjE7onqpQx7Qt9SWdAJrS2Pa92vIZGRB0mYb0mYAtChza/E1K/VRJYozyGr9FMb3lgkkCMRL5oaBekMnDibwRfrgbtHqJjLdHWEZ+4pUxe1cA6DHagzIVaG+vhopwgBdPiq0Lfx9RFQEFUyKjQ5tE8Opg586GXg0IJKTXGw3D8hhTIpRfYTseeMtfYoJrdEw6lmohAkNGrsGMS3b79ZVjJDXplatuzOnbuiDJFIckW5Obedo6soFI67WjUug16t1CFNz1MPROVjeHvSguzmWmcpqN7cqgyoUdjW3a0Eyb1oGUyw+h1GVEN26RlZITHepld59yiabteJ+WUlpegMVHXKrmeWEC5t13WhkBsbxW2u8v025HsLjWYdUiEu3n/ftZdSxAdsdYFTPTnna2FXmjLvGuMBdnX8eEcW35ldMWNX5wiuZGzMSRbECGS6kh536Pyi5u0QGEDyT9kQ/w3/5ln3n62cUtOwGsSoVlzbcXZA+EX50duwrD+Zk7ZvUkA2Dh906T81oJZJYoRvAVjUOi02EH9cH21+kK2I5OS/ppebKAnNiS+/m8ujBOkGP/T3hqlgZ7IT7ySgVuviEg834nK5/Ig6nQ8CMP8/ZgG1h3hkcu3h0pbudEKQxnkqL5c98txzhIVQ8Rfrkt0qYlFaRzdQkQMdqZ0aTQaOn4HXQV+NbzOwPnpu8ps2EBoanCeEqd8OypQ2v1D1sFKUlxKfz0zZjNl2LLjkrZyKThJmT8jh5d0I1MZ/WnDDrLvW0WCSf0qq/7oWjJc9o6uJRL8zBV34hJwry3OjNsz7u5nud+hI6nsPk+CmwWRvFXH7XRu+lQ/mm9Dm/H6iimqgn36+vm8GY2gG4wuczY3Na38AVBLAQIUAxQAAAAIAOR7cUqBD/yDnAMAAMAMAAAHAAAAAAAAAAAAAACkgQAAAAB1c2VyLmpzUEsFBgAAAAABAAEANQAAAMEDAAAAAA=="}, "marionette": true}}
DEBUG 2017-03-17 15:31:10,640 remote_connection 1417 140735238418432 Finished Request
'status'
Traceback (most recent call last):
File "downloader.py", line 96, in <module>
downloader.run()
File "downloader.py", line 77, in run
with SeleniumDownloaderBackend(proxy=proxy) as browser:
File "/Users/macbook/Downloads/wechat-spider-master/wechat/downloaders.py", line 42, in __enter__
self.browser = self.get_browser(self.proxy)
File "/Users/macbook/Downloads/wechat-spider-master/wechat/downloaders.py", line 90, in get_browser
browser = webdriver.Firefox(firefox_profile=firefox_profile)
File "/Library/Python/2.7/site-packages/selenium/webdriver/firefox/webdriver.py", line 150, in __init__
keep_alive=True)
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 92, in __init__
self.start_session(desired_capabilities, browser_profile)
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 179, in start_session
response = self.execute(Command.NEW_SESSION, capabilities)
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
self.error_handler.check_response(response)
File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 104, in check_response
status = value["status"]
KeyError: 'status'
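It seems to depend on the Sogou platform: the platform only shows 10 articles, so the crawler cannot reach the older history.
It succeeded once, but has been failing ever since.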
DEBUG 2017-02-15 18:34:07,670 __init__ 4560 140567195948800 stderr=_XSERVTransSocketOpenCOTSServer: Unable to open socket for inet6
_XSERVTransOpen: transport open failed for inet6/iZbp19znczn4d9menw4e22Z:1011
_XSERVTransMakeAllCOTSServerListeners: failed to open listener for inet6
^CTraceback (most recent call last):
File "bin/downloader.py", line 96, in <module>
downloader.run()
File "bin/downloader.py", line 49, in run
resp_data = r.brpop(settings.CRAWLER_CONFIG["downloader"])
File "/usr/local/lib/python2.7/site-packages/redis/client.py", line 1166, in brpop
return self.execute_command('BRPOP', *keys)
File "/usr/local/lib/python2.7/site-packages/redis/client.py", line 565, in execute_command
return self.parse_response(connection, command_name, **options)
File "/usr/local/lib/python2.7/site-packages/redis/client.py", line 577, in parse_response
response = connection.read_response()
File "/usr/local/lib/python2.7/site-packages/redis/connection.py", line 569, in read_response
response = self._parser.read_response()
File "/usr/local/lib/python2.7/site-packages/redis/connection.py", line 330, in read_response
bufflen = self._sock.recv_into(self._buffer)
KeyboardInterrupt
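I completed the first seven setup steps, but nothing happens when I finally run the four scripts. The proxy status is shown as abnormal. What could be the cause?
I modified the search_wechat method in views.py and tested it against the updated Sogou site. May I contribute the code?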
File "/usr/local/lib/python2.7/site-packages/django/db/backends/mysql/base.py", line 27, in <module>
raise ImproperlyConfigured("Error loading MySQLdb module: %s" % e)
django.core.exceptions.ImproperlyConfigured: Error loading MySQLdb module: libmysqlclient_r.so.16: cannot open shared object file: No such file or directory
What causes an error like this?
def get_proxies(self):
    # Kuaidaili
    #url = 'http://dev.kuaidaili.com/api/getproxy/?orderid=955742122799513&num=100&area=%E5%A4%A7%E9%99%86&b_pcchrome=1&b_pcie=1&b_pcff=1&protocol=1&method=2&an_ha=1&sp1=1&sep=1'
    # Daili666
    #url = 'http://qsdrk.daili666api.com/ip/?tid=559017461234554&num=100&delay=3&category=2&sortby=time&foreign=none&filter=on'
    url = 'http://qsdrk.daili666api.com/ip/?tid=555451817416492&num=100&delay=3&category=2&sortby=time&foreign=none&filter=on'
    r = requests.get(url)
    lines = r.text.split()
    for line in lines:
        logger.debug(line)
        try:
            host, port = line.split(':')
            Proxy.objects.get_or_create(host=host, port=int(port))
        except Exception:
            # Log the offending line with a traceback instead of printing only the exception
            logger.exception("Failed to store proxy line: %s", line)
I updated the proxy API endpoint, but found that the proxies are neither stored nor used.
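One way to narrow this down is to separate parsing from storage, so it becomes visible whether the new endpoint returns anything in the expected `host:port` form at all. A minimal sketch follows; `parse_proxy_lines` is a hypothetical helper, and the response format is an assumption based on the `line.split(':')` call in the snippet above.

```python
def parse_proxy_lines(text):
    """Return (host, port) tuples for well-formed 'host:port' entries; skip the rest."""
    proxies = []
    for line in text.split():
        host, sep, port = line.rpartition(":")
        # Skip entries without a colon, without a host, or with a non-numeric port.
        if not sep or not host or not port.isdigit():
            continue
        proxies.append((host, int(port)))
    return proxies
```

If this returns an empty list for the endpoint's response body, the response is probably an error message or uses a different format, which would explain why nothing reaches the database.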
ServerError at /wechat/add/
{'status': 400, 'details': {}}
Request Method: POST
Request URL: http://xxxx/wechat/add/
Django Version: 1.8.1
Exception Type: ServerError
Exception Value:
{'status': 400, 'details': {}}
Exception Location: /usr/local/lib/python2.7/dist-packages/oss2/api.py in _do, line 149
Python Executable: /usr/bin/python
Python Version: 2.7.9
Python Path:
['/root/wechat-spider',
'/usr/lib/python2.7',
'/usr/lib/python2.7/plat-x86_64-linux-gnu',
'/usr/lib/python2.7/lib-tk',
'/usr/lib/python2.7/lib-old',
'/usr/lib/python2.7/lib-dynload',
'/usr/local/lib/python2.7/dist-packages',
'/usr/lib/python2.7/dist-packages']
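Server time: Wednesday, 21 September 2016 21:51:40 +0800
After starting the program and adding WeChat official accounts, some accounts were not crawled. I checked the logs and the failures all look the same; they contain the following fragment: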
INFO 2016-10-27 16:40:50,715 processor 32482 140638797100800 {"publish_time": "2016-10-27 16:40:50.702217", "title": "分享一个特大写的尴尬!", "url": "http://mp.weixin.qq.com/s?timestamp=1477557622&src=3&ver=1&signature=qpCRLCGlM315db1DuiJml0rC3iNHPFuoBkzU9ixxfUyDLxOrV-2-yFbiczQUgLHDtqGPPyaTj2nKF*e3xvxepzroE2qcuzf6M3GS8Ue7W92aEGPnv6B1YzFV2AP52FuyeMVvoQUGT0tmtSURc7poZMc87sJOowfGJHX5cHJDXMI=", "origin_title": "分享一个特大写的尴尬![省略号]............................", "words": 14, "wechat_id": 2, "read_num": ["100000+"]}
Traceback (most recent call last):
File "/var/wdcp1/wechat-spider/bin/processor.py", line 57, in <module>
processor.run()
File "/var/wdcp1/wechat-spider/bin/processor.py", line 52, in run
self.process(data)
File "/var/wdcp1/wechat-spider/bin/processor.py", line 37, in process
backend.process(data)
File "/var/wdcp1/wechat-spider/wechat/processors.py", line 48, in process
C.objects.update_or_create(uniqueid=params['uniqueid'], defaults=params)
File "/usr/local/lib/python2.7/site-packages/django/db/models/manager.py", line 127, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/django/db/models/query.py", line 422, in update_or_create
obj, created = self._create_object_from_params(lookup, params)
File "/usr/local/lib/python2.7/site-packages/django/db/models/query.py", line 439, in _create_object_from_params
obj = self.create(**params)
File "/usr/local/lib/python2.7/site-packages/django/db/models/query.py", line 348, in create
obj.save(force_insert=True, using=self.db)
File "/usr/local/lib/python2.7/site-packages/django/db/models/base.py", line 710, in save
force_update=force_update, update_fields=update_fields)
File "/usr/local/lib/python2.7/site-packages/django/db/models/base.py", line 738, in save_base
updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
File "/usr/local/lib/python2.7/site-packages/django/db/models/base.py", line 822, in _save_table
result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
File "/usr/local/lib/python2.7/site-packages/django/db/models/base.py", line 861, in _do_insert
using=using, raw=raw)
File "/usr/local/lib/python2.7/site-packages/django/db/models/manager.py", line 127, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/django/db/models/query.py", line 920, in _insert
return query.get_compiler(using=using).execute_sql(return_id)
File "/usr/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 970, in execute_sql
for sql, params in self.as_sql():
File "/usr/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 928, in as_sql
for obj in self.query.objs
File "/usr/local/lib/python2.7/site-packages/django/db/models/fields/__init__.py", line 710, in get_db_prep_save
prepared=False)
File "/usr/local/lib/python2.7/site-packages/django/db/models/fields/__init__.py", line 702, in get_db_prep_value
value = self.get_prep_value(value)
File "/usr/local/lib/python2.7/site-packages/django/db/models/fields/__init__.py", line 1868, in get_prep_value
return int(value)
TypeError: int() argument must be a string or a number, not 'list'
Any help would be appreciated!
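In the log line above, `read_num` arrives as the list `["100000+"]`, and the traceback ends in `int(value)` on an integer field, which is consistent with the `TypeError`. A minimal sketch of coercing such values before saving follows; `normalize_read_num` is a hypothetical helper, and where it should be called in the project is left open.

```python
def normalize_read_num(value):
    """Coerce values such as ["100000+"], "100000+", or 31102 to an int."""
    # Unwrap a list or tuple produced by the extractor; empty means unknown.
    if isinstance(value, (list, tuple)):
        value = value[0] if value else 0
    if isinstance(value, int):
        return value
    # Keep only the digits, so "100000+" becomes 100000.
    digits = "".join(ch for ch in str(value) if ch.isdigit())
    return int(digits) if digits else 0
```

Note that "100000+" is a lower bound shown by the page, so the stored number understates the real count for such articles.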
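Hello, and thank you for this excellent project! While crawling, processor.py raises an error, so the crawled data cannot be stored. What could be the cause? The output is below. Thanks again!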
python bin/processor.py
INFO 2016-10-07 22:57:29,553 processor 23501 140735208751104 {"publish_time": "2016-10-07 22:57:29.531462", "title": "原创 | hi我的毛坯恋人", "url": "http://mp.weixin.qq.com/s?timestamp=1475852221&src=3&ver=1&signature=QIEsiUwOWGsva7wX9feSeQqzK96oUU1Pbm2RyUpak5tecWE88*DIrrTA5u11zush7*J0UhokJMI3UkF2ChRy9GCmrKpPrJtyOIMCOJKCsRHr*I4-RaI42vKWWTDHLy8cVshYA6v3IyGMng1r2AZz0EAih5BGnJgLW7B5wH4715w=", "origin_title": "原创 | hi我的毛坯恋人", "abstract": "这个世界不可能给你准备好一个人,全方位匹配。", "content": "<div class="rich_media_content " id="js_content">\n
<span style="font-size: 12px; color: rgb(136, 136, 136);">
<span style="font-size: 15px;">
<span style="font-size: 15px;">有个朋友跟我说,拎包入住的房子也有,但谁都可以住,不是为你一个人准备的。
<span style="font-size: 15px;">还有一个朋友跟我说,你以为装修好了,房产证上就会写你的名字吗?呃~
<span style="font-size: 15px;">
<span style="font-size: 12px; color: rgb(136, 136, 136);">
<span style="font-size: 15px;">
<span style="font-size: 15px;">有个朋友跟我说,拎包入住的房子也有,但谁都可以住,不是为你一个人准备的。
<span style="font-size: 15px;">还有一个朋友跟我说,你以为装修好了,房产证上就会写你的名字吗?呃
<span style="font-size: 15px;">
\n<a class="reward_access" href="javascript:;" id="js_reward_link">赞赏\n
\n<div class="reward_area_inner" id="js_reward_inner" style="display: none; width: 816px;">\n<p class="tips_global reward_user_tips"><a href="javascript:;" id="js_reward_total">人赞赏\n<div class="reward_user_list" id="js_reward_list">\n\n\n<div class="rich_media_tool" id="js_toobar3" style="display: none;">\n<div class="media_tool_meta tips_global meta_primary" id="js_read_area3" style="display:none;">阅读 <span id="readNum3">\n<span class="media_tool_meta meta_primary tips_global meta_praise" id="like3" style="display:none;">\n<i class="icon_praise_gray"><span class="praise_num" id="likeNum3">\n\n<a class="media_tool_meta tips_global meta_extra" href="javascript:void(0);" id="js_report_article3" style="display:none;">投诉\n\n<div class="rich_media_tool" id="js_sg_bar">\n<div class="media_tool_meta tips_global meta_primary">阅读 <span id="sg_readNum3">31102\n<span class="media_tool_meta meta_primary tips_global meta_praise">\n<i class="icon_praise_gray"><span class="praise_num"><span class="praise_num" id="sg_likeNum3">474\n\n\n\n<div class="rich_media_area_primary sougou" id="sg_tj" style="display:none">\n\n<div class="rich_media_area_extra">\n<div class="mpda_bottom_container" id="js_bottom_ad_area">\n\n<div id="js_iframetest" style="display:none;">\n<div class="rich_media_extra" id="sg_cmt_area">\n<div class="discuss_container" id="sg_cmt_main" style="display: block;">\n<div class="rich_tips with_line title_tips discuss_title_line">\n<span class="tips">精选留言\n\n<ul class="discuss_list" id="sg_cmt_list"><li class="discuss_item" id="cid1125792425286041644"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="1125792425286041644" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">58 <div class="user_info"> <strong class="nickname">✨瑶瑶✨ <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/hqDXUD6csUib4nILP7HJ4IZoMDWiafSyytxxFYj8lX3vbSOMq5HSPRwdqh7QadGLibPwGcMPRRu18s/96\"/> <div 
class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 敲黑板,划重点,倒数第二段,精华!!! <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid1019308417807810564"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="1019308417807810564" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">47 <div class="user_info"> <strong class="nickname">雍 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/hqDXUD6csUicSs5VTSNSiaLxCjknQJdicJ9J0PP27vYkwxPZicdicT0ria1TFjAq9iaW8bIoZibknF58ibuM/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 我喜欢装修,装修完一个就找下一个毛坯。 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid4856806416259743837"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="4856806416259743837" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">41 <div class="user_info"> <strong class="nickname">mr <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGaTtoq2hp47LOyDewhBo314zG55xiccAb6objticXshGzLcO2wcrAbJ1tHQdnklx3Ero/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 这个文末彩蛋,我给你101分,真的是单身久了,不易遇到那个怦然心动的人。 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid1733980517616844814"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="1733980517616844814" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">31 <div class="user_info"> <strong class="nickname">iEdson <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/hqDXUD6csUicViaWLum5gptM0MLET2PcbnBZsMfAJ7SN8zBWb6ibibicfcdMNsqYfDSTfG3b3TBlfc1s/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 曾经毛坯,精装完了,被高价抛售了 <p 
class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid8719248674233778209"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="8719248674233778209" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">29 <div class="user_info"> <strong class="nickname">sany <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGax0cyDKR6CCGELyiaPyxICu3W4SKPnicVBRAWCN1KnXicE6827SrrmLCHhliapEamIqfA/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 恋爱需要机会,但是婚姻需要智慧 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid3236516500466565555"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="3236516500466565555" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">26 <div class="user_info"> <strong class="nickname">梅子 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/YCHXCc6ZUnReZ4rxRPxPVEl9r04q2aOlTnmkeI0ibIJYFSnlcH63aQUs4iaPu3aseRtlltkGzqwXE/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 毛坯或精装,我都不介意,有就行<i class="icon_emotion_single icon14"><i class="icon_emotion_single icon14"> <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid9449362620285976996"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="9449362620285976996" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">23 <div class="user_info"> <strong class="nickname">Amy🌸🔮 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGY1f7NZuU1RI0ZmyqwPYdAwXVu67X2OjZvm01WpbTibkqA3TLxia4KnuWpzt6xSCIPGs/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 我喜欢那行小灰字<i class="icon_emotion_single icon45"> <p class="discuss_extra_info"> 9月28日 <li 
class="discuss_item" id="cid5384725216255541761"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="5384725216255541761" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">21 <div class="user_info"> <strong class="nickname">杨宏玉 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/sMhnyHWADKgpDTGBsmDKE7Y5G6icSictUicoxuicicm4eiatzr4yMKXXiazVialu0n83QMRYpQxd19YVQicQ/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 哦,先生文章说得再透彻不过了,年征不大,悟之深也,不得不说你非常智慧,想必是极爱看书,爱思考,好奇心很重的人,当然做你夫人的人是很幸运的,并且也是有头脑人,爱琢磨的人,肯定有它独到之处,有他做人的底线和原则,我很配服你,国庆将至,祝夫妇俩快乐,虽不认识你,但很欣赏你的文釆。 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid2621462237193175122"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="2621462237193175122" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">20 <div class="user_info"> <strong class="nickname">清 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/hqDXUD6csU9kvtbCZtHFskz3OROAMYcrAmyuHWzttGlOpNEcP14sF21A1CTs4qx7jtLdTNDBZYs/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 单身久了,妥妥的全中枪,我现在就是完全不想跟一个陌生人去磨合迁就,我真的习惯这种生活状态了,到了什么时间该做什么事情,感觉自己很难改变了。。。 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid2648657454715371522"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="2648657454715371522" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">19 <div class="user_info"> <strong class="nickname">吉三🌹 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/hqDXUD6csUibicgvyZrEgbTbJhiaS2iaGYVBuBMBabBQ55bbcSqT4MIAoA6LIfa2ic30zDsuG1D19x9U/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 
虽然说毛坯,其实也是经过筛选的,大体符合也挺不容易,房子可以看内部格局,人却隔着肚皮,恋爱的时候可以迎合你,婚姻之后就不知道了 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid28030253653819464"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="28030253653819464" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">17 <div class="user_info"> <strong class="nickname">姜美名Mina🌸 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/ajNVdqHZLLAJOqPPsyzxz3wb26ic2MRCmlWpggrj6nYX9Uib0ueawmWhDSNfiaqmRHIPTibJ3oxabIo/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 有一行灰字太小,眼睛疼(˵¯͒〰¯͒˵) <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid3124392440719475448"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="3124392440719475448" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">17 <div class="user_info"> <strong class="nickname">大雁闻天 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/hqDXUD6csUibRekuAcWwPponp1DKL7rmSFt4fYVZCAqQ2TteSDP26L7xthE7wAxAO7LuBlbOKTYE/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 我知道人都是不完美的,可是心理洁癖在作怪<i class="icon_emotion_single icon6">我要慢慢改变~ <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid2551332808593244177"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="2551332808593244177" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">16 <div class="user_info"> <strong class="nickname">Unterwegs Jll <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/hqDXUD6csU8kHAJw1AXasnUOhPSF3J3l9Fc9sfe8RMnKAU7dMNu97V9LpTZoknWCr1qDwibu2v3M/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 
虽然说单身久了就不想另外一个人来改变自己的节奏,但是还是期待能遇到那么一个人。他善良,有责任感,正义且真诚。不希望他多有钱,但是一定要有上进心。包容我,理解我,支持我的梦想,逗我开心,一起努力让生活更美好。先生您是楷模,不过我相信还会遇到另外一个您<i class="icon_emotion_single icon21"> <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid9751113111103864841"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="9751113111103864841" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">16 <div class="user_info"> <strong class="nickname">花儿 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/L0vhtibBtYcOLNx6FK16N1hic7QyBnbm9ibibbZEuI2SZvBgD1iaxIlKuQZkb8OibuxEiabxIpjic79lysY/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 与其花时间去塑造完美恋人,不如花时间修炼更好的自己。自己变好了,变成熟了,自然可以容纳那些不完美。 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid4250356943870230602"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="4250356943870230602" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">15 <div class="user_info"> <strong class="nickname">不锈钢制品吴玲 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/hqDXUD6csU83UcicqTVgSwIwgqJPqjJHHd4ia3VJ3UR4puKEOKaMvmxhbtMgmp01PE79qoM4dFySU/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> “不是每个人都跟我一样”镶嵌得很漂亮<i class="icon_emotion_single icon21"> <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid10188080081616764942"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="10188080081616764942" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">14 <div class="user_info"> <strong class="nickname">ΖÐ~ <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGbUSIhickIGteI4282icGn9o4r7C6KViaWuicjpy9MvHiaiauePO4be2j63icDqzFtYO73iaF4/96\"/> <div 
class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 我是一名粉刷匠,粉刷本领强<i class="icon_emotion_single icon21"><i class="icon_emotion_single icon21"><i class="icon_emotion_single icon21"> <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid7843848109250576450"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="7843848109250576450" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">10 <div class="user_info"> <strong class="nickname">半世癫狂 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGawibXm0z4fg1B7H0Fial8888X9ctPhPQTDL3If6Qd0icBWlYiaW1WjYklzQe9ZMcD9bnE/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 因为在乎,所以改变。因为在乎,所以认真。但不要因为苛求完美,而失去爱的勇气?爱都不敢,还能做什么? <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid7741007767342678055"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="7741007767342678055" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">10 <div class="user_info"> <strong class="nickname">苹果 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/neLhv78Z89TicazDmPMJxmlEUD0KmAia5VY47UkYVnrfLUZSr6UyAutsSXWjqzbTKr2oj7LxDzbbo/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 先生已被夫人装修好了吧<i class="icon_emotion_single icon14"> <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid6184375035017297922"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="6184375035017297922" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">9 <div class="user_info"> <strong class="nickname">徐小马 <img class="avatar" 
src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGZgET0VSYZXRBS4naDOlq5BOx3ZzYdLXxeRXe2eYL4Nxd1Y4Ao7uiaIvJ17RCmM4F18/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 我认为这段话更像是说给自己听的。所有的一切外因都是催化剂而已,想要幸福,就得让自己变得更好更可爱。 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid5031181298203361490"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="5031181298203361490" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">9 <div class="user_info"> <strong class="nickname">沈小怡 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGZNheKYtZq7CB0LkCYRz6Yxibf6MIAUibOgHleSAasCGAmCe3YMQntZZ6wfmQa9eqn8s/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 我爱哥哥,我爱原创。 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid4212629349846745178"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="4212629349846745178" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">8 <div class="user_info"> <strong class="nickname">简单就幸福 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/8WTu10ST9LzxClXwVLpNiaM7NsQPLDNRWatibJfFC15E7l9WzULTcrCuKM0lIxDmX6jIo7sxO680Y/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 终于等来全星球最帅的先生的原创,么么哒! 
<p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid9436345772707676282"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="9436345772707676282" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">7 <div class="user_info"> <strong class="nickname">小辣椒💯《赏心内衣》 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGZOJ4yPiaicDaS3aUqxaeibSfHBNS3RdJjZMzqrrHUfCY32jSybp0wrXDWmg1QoDdtibiag/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 爱与不爱很简单,就是你是否让我心甘情愿为你改变!我画这句<i class="icon_emotion_single icon14"> <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid5996982266113294349"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="5996982266113294349" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">7 <div class="user_info"> <strong class="nickname">Jaidly. <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/r2mbb1YuyeHrKGSPnK0LLVtf40SibcqibDoOUB7s5zPB8K35oNYF8qLwrialKMdfRsF0l43nOyibdfI/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 都说陪伴是最长情的告白,但是人越长大是不是越不容易愿意花费时间去了解一个完全陌生的人? 
陪伴,也是有前提的啊。 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid6099272044753453074"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="6099272044753453074" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">7 <div class="user_info"> <strong class="nickname">wall.e <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/YBibK5ib9TUvcSVxYSMJCWJreczWcdlxGs3ZjermTnplFpDSwpMdDWcq8f7YUHL6zqCzoIxdaw9Jo/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 判断自己爱不爱一个人,就看自己愿不愿意为ta改变,不放弃对美的追求,费尽心力改造生活 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid4956185504153337860"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="4956185504153337860" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">6 <div class="user_info"> <strong class="nickname">李咏蔚--小梦 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGbNia14qB3RHvu01sMr0zYIWpuEAubaz0J8BnMOo857OVFhFdibKlPp4OOZRk1NAXVhs/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 先生说的好--两个人在一起就是彼此雕刻。最最重要是有深厚的感情基础,才能愿意为对方去改变,才能在发生矛盾时,最大限度去接纳和包容对方。婚姻也是一场修行!且行且珍惜…… <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid6706532690436292640"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="6706532690436292640" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">6 <div class="user_info"> <strong class="nickname">A陈永钦 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGZQ3qPtyhAmG7Ds90XicYDrb7A7c549zCbQUhKkFtyHWMeLlYHhfeTTPAfnlb9rg3W0/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 
最近铺天盖地的新闻都是谈房之事,压抑得很。也只有先生谈的此房最轻松、与众不同了。中间居然有句引起雷劈的浅灰色文字。要是换我,就不会打雷了。<i class="icon_emotion_single icon14"><i class="icon_emotion_single icon14"> <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid3526279750446219515"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="3526279750446219515" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">6 <div class="user_info"> <strong class="nickname">孙昱淑 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/hqDXUD6csU9SRSdCWj0mZqCbxZF7qLibGkbJUibvKAegib8xNHfWrOVSUWxvwnj7Cj4IzRtRs0p3bg/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> “不是每个人都跟我一样。”这几个字好小哦,根本看不清。<i class="icon_emotion_single icon14"> <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid10680527549873782795"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="10680527549873782795" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">6 <div class="user_info"> <strong class="nickname">A 温柔是罪 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGaa70rhCiaPXAENumpZK1L0FyRA2mVddbtskDc3keicpIthJpJfqoGPSbwicAaBtjfetQ/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 用心装修过一间房子。却因为一个小毛病倒塌了。装修这间房子时候就花光了你所有的精力,只差一个结尾却要重新来过。 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid12224846336083623948"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="12224846336083623948" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">6 <div class="user_info"> <strong class="nickname">Yun _ Summer <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/uNZFBTXiapqwG52qplR5pSDFEqCMAg07tJhBViaBSCRE3Q7Fd9KxIU8dHJ4MAblKaNTJVBf5AT7ZY/96\"/> <div 
class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> Hi, 我们的琢磨先生~ <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid9956451112364015622"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="9956451112364015622" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">5 <div class="user_info"> <strong class="nickname">于济 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGa5l0YgZYQAZu6UV2CFBaJZiatGa5BiapW2XAhfGicB2RtAGBv3tMogDibcwX7bgVVDT04/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 第一次留言!很开心看到我的朋友因为我的推荐把先生的作品发到了朋友圈!!!希望留言能让她也看到! <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid4362290062256242973"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="4362290062256242973" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">5 <div class="user_info"> <strong class="nickname">宸西 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGYJMeXWLnJVGhusJ2Nc9ku9Zp7Pp5WwK9CGFLmXL6SygO2ibwFB0UNNX0k1K0DB2t9w/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 每个人都爱你呀,先生<i class="icon_emotion_single icon22"> <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid6557005603049308195"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="6557005603049308195" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">5 <div class="user_info"> <strong class="nickname">yanconglei <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/K4Ydsulou5e1d1wp6gcLFfr9icrROaUM2GLymhIJrvyy76rFMpazJ4g1GHHauzhK3DrIBibSic3bia8/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 
好的爱情是完善自己的过程中彼此欣赏。 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid4519504071656210458"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="4519504071656210458" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">5 <div class="user_info"> <strong class="nickname">Lindsay <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGZy2KXoZg2DGyUboOSJUubhpeicvfzQrj8gTst7iagQib44yRxz9S9dtpeBN6vXTjwkJ4/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 爱他是我的心甘情愿,可他的心甘情愿未必是爱我。 <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid8678670514710905030"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="8678670514710905030" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">5 <div class="user_info"> <strong class="nickname">青春之帆由我掌握w <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGam9O37J6nBiagPF74SvHORA23RmYuaiahRVjx4NStrw4T0YaicoB9yqSfvQRnFKnU0e0/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 因为追求完美,我宁愿高傲的活着。因为没有遇到对的人,我宁愿单着,看到你写的,有些话是说到我心里,很赞! 
<p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid5425243847531692042"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="5425243847531692042" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">5 <div class="user_info"> <strong class="nickname">赏心悦目 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGZItbfZa53CA6jTkpiaLWzKzVKej0fQ9bNGicx9zDBCjicyU0eVK01dD6Cjhjugz8zuIw/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> 先生总能捕捉到我的心思,妙哉妙哉!<i class="icon_emotion_single icon14"> <p class="discuss_extra_info"> 9月28日 <li class="discuss_item" id="cid6590549211731722289"> <div class="discuss_opr"> <span class="media_tool_meta tips_global meta_praise js_comment_praise " data-content-id="6590549211731722289" data-status="0"> <i class="icon_praise_gray"> <span class="praise_num">5 <div class="user_info"> <strong class="nickname">孙元冬 <img class="avatar" src="http://wx.qlogo.cn/mmopen/vi_24/gia9TticbVQGYu6MaF5YfEJ3bI9xAAnfkZQlOxFoibNh36G5ESJNzTCjNtia4QBMFrt3X1iajbibD9CPI/96\"/> <div class="discuss_message"> <span class="discuss_status"> <div class="discuss_message_content"> hi微信扫一扫
关注该公众号
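What interface does the core mechanism rely on?
Hello, I have only just started with Python. After running your code, logging in immediately raised an error: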
[04/Nov/2016 17:44:04]"POST /admin/login/?next=/wechat/ HTTP/1.1" 302 0
ERROR 2016-11-04 17:44:04,030 base 12058 140062940415744 Internal Server Error: /wechat/
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 132, in get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/wechat-spider/wechatspider/util.py", line 48, in _wrapper
return f(request, *args, **kwargs)
File "/wechat-spider/wechat/views.py", line 62, in index
"downloader": r.llen(CRAWLER_CONFIG['downloader']) or 0,
File "/usr/local/lib/python2.7/site-packages/redis/client.py", line 1202, in llen
return self.execute_command('LLEN', name)
File "/usr/local/lib/python2.7/site-packages/redis/client.py", line 570, in execute_command
connection.send_command(*args)
File "/usr/local/lib/python2.7/site-packages/redis/connection.py", line 556, in send_command
self.send_packed_command(self.pack_command(*args))
File "/usr/local/lib/python2.7/site-packages/redis/connection.py", line 532, in send_packed_command
self.connect()
File "/usr/local/lib/python2.7/site-packages/redis/connection.py", line 436, in connect
raise ConnectionError(self._error_message(e))
ConnectionError: Error 111 connecting to localhost:6379. Connection refused.
[04/Nov/2016 17:44:04]"GET /wechat/ HTTP/1.1" 500 27
The page shows a 500 error. Roughly which part went wrong?
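The last line of the traceback, `ConnectionError: Error 111 connecting to localhost:6379. Connection refused.`, suggests that nothing is listening on the Redis port, so the most likely cause is that Redis is not running (or is listening on a different address). A minimal sketch for checking this with only the standard library follows; the helper name `port_is_open` is hypothetical and not part of wechat-spider.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers "connection refused" as well as timeouts.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # localhost:6379 is the address shown in the traceback.
    if port_is_open("localhost", 6379):
        print("Something is listening on localhost:6379")
    else:
        print("Nothing is listening on localhost:6379; start Redis first")
```

If the check fails, starting the Redis server and reloading /wechat/ is the first thing to try.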
DEBUG 2016-09-30 01:30:19,049 init 16368 139946394367808 version=0.2.2
DEBUG 2016-09-30 01:30:19,050 init 16368 139946394367808 version=0.1.5
DEBUG 2016-09-30 01:32:27,583 downloader 16368 139946394367808 {u'kind': 3, u'word': u'\u6c7d\u8f66'}
DEBUG 2016-09-30 01:32:27,648 init 16368 139946394367808 param: "['Xvfb', '-help']"
DEBUG 2016-09-30 01:32:27,648 init 16368 139946394367808 command: ['Xvfb', '-help']
DEBUG 2016-09-30 01:32:27,648 init 16368 139946394367808 joined command: Xvfb -help
DEBUG 2016-09-30 01:32:27,652 init 16368 139946394367808 process was started (pid=16511)
DEBUG 2016-09-30 01:32:27,658 init 16368 139946394367808 process has ended
DEBUG 2016-09-30 01:32:27,658 init 16368 139946394367808 return code=0
DEBUG 2016-09-30 01:32:27,658 init 16368 139946394367808 stdout=
DEBUG 2016-09-30 01:32:27,658 init 16368 139946394367808 stderr=use: X [:] [option]
-a # default pointer acceleration (factor)
-ac disable access control restrictions
-audit int set audit trail level
-auth file select authorization file
-br create root window with black background
+bs enable any backing store support
-bs disable any backing store support
-c turns off key-click
c # key-click volume (0-100)
-cc int default color visual class
-nocursor disable the cursor
-core generate core dump on fatal error
-displayfd fd file descriptor to write display number to when ready to connect
-dpi int screen resolution in dots per inch
-dpms disables VESA DPMS monitor control
-deferglyphs [none|all|16] defer loading of [no|all|16-bit] glyphs
-f # bell base (0-100)
-fc string cursor font
-fn string default font name
-fp string default font path
-help prints message with these options
+iglx Allow creating indirect GLX contexts
-iglx Prohibit creating indirect GLX contexts (default)
-I ignore all remaining arguments
-ld int limit data space to N Kb
-lf int limit number of open files to N
-ls int limit stack space to N Kb
-nolock disable the locking mechanism
-nolisten string don't listen on protocol
-listen string listen on protocol
-noreset don't reset after last client exists
-background [none] create root window with no background
-reset reset after last client exists
-p # screen-saver pattern duration (minutes)
-pn accept failure to listen on all ports
-nopn reject failure to listen on all ports
-r turns off auto-repeat
r turns on auto-repeat
-render [default|mono|gray|color] set render color alloc policy
-retro start with classic stipple and cursor
-s # screen-saver timeout (minutes)
...skipping...
[+-]accessx [ timeout [ timeout_mask [ feedback [ options_mask] ] ] ]
enable/disable accessx key sequences
-ardelay set XKB autorepeat delay
-arinterval set XKB autorepeat interval
-screen scrn WxHxD set screen's width, height, depth
-pixdepths list-of-int support given pixmap depths
+/-render turn on/off RENDER extension support(default on)
-linebias n adjust thin line pixelization
-blackpixel n pixel value for black
-whitepixel n pixel value for white
-fbdir directory put framebuffers in mmap'ed files in directory
-shmem put framebuffers in shared memory
DEBUG 2016-09-30 01:48:26,340 init 16368 139946394367808 param: "['Xvfb', '-br', '-screen', '0', '1024x768x24', ':1039']"
DEBUG 2016-09-30 01:48:26,340 init 16368 139946394367808 command: ['Xvfb', '-br', '-screen', '0', '1024x768x24', ':1039']
DEBUG 2016-09-30 01:48:26,341 init 16368 139946394367808 joined command: Xvfb -br -screen 0 1024x768x24 :1039
DEBUG 2016-09-30 01:48:26,341 init 16368 139946394367808 param: "['Xvfb', '-br', '-screen', '0', '1024x768x24', ':1024']"
DEBUG 2016-09-30 01:48:26,341 init 16368 139946394367808 command: ['Xvfb', '-br', '-screen', '0', '1024x768x24', ':1024']
DEBUG 2016-09-30 01:48:26,341 init 16368 139946394367808 joined command: Xvfb -br -screen 0 1024x768x24 :1024
DEBUG 2016-09-30 01:48:26,345 init 16368 139946394367808 process was started (pid=19254)
DEBUG 2016-09-30 01:48:26,345 abstractdisplay 16368 139946394367808 DISPLAY=:1024
a/04/link?appid=100520031&url=http://mmbiz.qpic.cn/mmbiz_jpg/8jicXEqfbKCUAjCpxfs4b5hhias7tnxiboQLktLJw4IWo2v4GmqO5dGrFvAcPpc3hHBaibDp1xyDEBqibSqkjB62mCA/0?wx_fmt=jpeg', 'http://img01.sogoucdn.com/net/a/04/link?appid=100520031&url=http://mmbiz.qpic.cn/mmbiz_png/RiaJj5csONAHqPfr3HAk8Y8DooYYbUjBrpwx6bdgAtOI22DriaRLGicM8B9GicSMo39d25dsCZHmnoUqGLpKWrja0w/0?wx_fmt=png', 'http://img01.sogoucdn.com/net/a/04/link?appid=100520031&url=http://mmbiz.qpic.cn/mmbiz_jpg/b2YlTLuGbKAvMUwYLakJaFCh537IS3Wia9qSia3OsRVb7uqNpF4WXzMmpGpzb3uRUxhVPLBhyMKSYyv4xShlHjmQ/0?wx_fmt=jpeg', 'http://img01.sogoucdn.com/net/a/04/link?appid=100520031&url=http://mmbiz.qpic.cn/mmbiz_jpg/VabuFTIoksIgyojNsMA02n5u2j41Jq7qC763z79KQzalWNgzFwV8s8MsPdS0pAPiaf1QuJ6gXVeFECxy7kIhSqg/0?wx_fmt=jpeg', 'http://img01.sogoucdn.com/net/a/04/link?appid=100520031&url=http://mmbiz.qpic.cn/mmbiz_jpg/sQCrvCH4KbFlREVmyNumxuYiaqSXOsc4yqCuBNTPn0ibQl1dwOCpuuTWAlibOGocYyeQTSF3Bs5QZ3vJgLTVOp9pw/0?wx_fmt=jpeg', 'http://img01.sogoucdn.com/net/a/04/link?appid=100520031&url=http://mmbiz.qpic.cn/mmbiz_png/1vNs1G9cnWZpTJc2Jnp0bibo0zicjHWquSibCDgvOalANmuOYiaPE97aZa1Nnd7CyWv5ibB9kQuIS4lRpegLr0N2tgQ/0?wx_fmt=png', 'http://img01.sogoucdn.com/net/a/04/link?appid=100520031&url=http://mmbiz.qpic.cn/mmbiz_jpg/WSj0n9GPvlPo7bvqfzV1cTVoxfrPJUAwSVvre7hlOhia7VhCNcxgFB3jE8fjiaOiaRKggjdDBTAI6xicf0k9PPLv0w/0?wx_fmt=jpeg'] [u'\u60ac\u6302\u7cfb\u7edf\u60ac\u6302\u7684\u4f5c\u7528', u'\u8f66\u4e3b\u5982\u9047\u5230', u'\u4f3d\u5f6c\u5bfc\u8bed \u7206\u80ce\u662f\u6307\u8f6e\u80ce\u5728\u6781\u77ed\u7684\u65f6\u95f4(\u4e00\u822c\u5c11\u4e8e0.1\u79d2)\u56e0\u7834\u88c2\u7a81\u7136\u5931\u53bb\u7a7a\u6c14\u800c\u762a\u6389.\u7206\u80ce\u662f', u'\u201c', u'\u8f6c\u8f7d\u81ea', u'1886\u5e741\u670829\u65e5,\u5fb7\u56fd\u4eba\u5361\u5c14\u2022\u672c\u8328\u4e3a\u5176\u57281885\u5e74\u7814\u5236\u6210\u529f\u7684\u4e09\u8f6e', 
u'\u968f\u7740\u56fd\u5e86\u9ec4\u91d1\u5468\u4e0d\u65ad\u4e34\u8fd1,\u5f88\u591a\u8ba1\u5212\u5229\u7528\u957f\u5047\u81ea\u9a7e\u6e38\u7684\u6d88\u8d39\u8005\u6700\u8fd1\u90fd\u5728\u96c6\u4e2d\u91c7\u8d2d', '', u'\u968f\u7740\u73b0\u4ee3', u'\u751a\u81f3\u662f\u6709\u4e9b\u8f66\u4e3b\u4e70\u4e86\u8f66\u4e4b\u540e\u5c31\u628a\u8f66\u6446\u5bb6\u95e8\u53e3\u4e0d\u52a8\u4e86,\u5916\u51fa\u4f9d\u65e7\u6324\u516c\u4ea4\u3001\u5730\u94c1,\u7f8e\u5176\u540d\u66f0\u201c\u73af\u4fdd\u51fa\u884c\u201d,\u800c\u5b9e\u9645\u4e0a\u5374\u662f\u56e0\u6b64\u6cb9\u8017\u592a\u9ad8\u3001\u6cb9\u4ef7\u592a\u8d35,\u90a3\u53eb\u4e00\u4e2a\u60b2\u50ac\u554a~\u8981\u8bf4\u5173\u4e4e\u517b...']
未被限制,可以下载 (not restricted; download can proceed)
Message: The browser appears to have exited before we could connect. If you specified a log_file in the FirefoxBinary constructor, check it for details.
Traceback (most recent call last):
File "bin/downloader.py", line 96, in <module>
downloader.run()
File "bin/downloader.py", line 77, in run
with SeleniumDownloaderBackend(proxy=proxy) as browser:
File "/home/zzg/wechat-spider/wechat/downloaders.py", line 42, in __enter__
self.browser = self.get_browser(self.proxy)
File "/home/zzg/wechat-spider/wechat/downloaders.py", line 90, in get_browser
browser = webdriver.Firefox(firefox_profile=firefox_profile)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/firefox/webdriver.py", line 78, in __init__
self.binary, timeout)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/firefox/extension_connection.py", line 51, in __init__
self.binary.launch_browser(self.profile, timeout=timeout)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 68, in launch_browser
self._wait_until_connectable(timeout=timeout)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/firefox/firefox_binary.py", line 98, in _wait_until_connectable
raise WebDriverException("The browser appears to have exited "
selenium.common.exceptions.WebDriverException: Message: The browser appears to have exited before we could connect. If you specified a log_file in the FirefoxBinary constructor, check it for details.
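The traceback does not say why Firefox exited. One thing that may be worth ruling out, given that the setup notes on this page ask for Firefox 36.0, is a Firefox version that the installed Selenium cannot drive. `firefox --version` prints a line such as `Mozilla Firefox 36.0.1`; the sketch below only parses that text. The function name `firefox_major_version` is hypothetical, and a version mismatch is an assumption here, not a confirmed diagnosis.

```python
import re


def firefox_major_version(version_output):
    """Extract the major version from text like 'Mozilla Firefox 36.0.1'.

    Returns None when the text does not look like Firefox version output.
    """
    match = re.search(r"Firefox\s+(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1))
```

The input could come from `subprocess.check_output(["firefox", "--version"])`, decoded to text; a result other than 36 would be a reason to try the version the setup notes recommend.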
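After adding an official account, I cannot see the crawler fetching anything, even though all the scripts start normally.
The admin backend also shows the proxy status as abnormal.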
I get a message that these stylesheets cannot be found. Did some Django initialization step not finish?
Is the source code for the process-control module missing? I only see a redirect link.
Hello, when I add a WeChat official account in the admin backend, I get: Forbidden (403), CSRF verification failed. Request aborted.
What is the cause?
Thanks
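I installed this successfully on a Mac and ran into a few small problems along the way. Here is a summary to help whoever comes next. My Mac OS X version is 10.10.5.
I am new to Python, so pointers are welcome. O(∩_∩)O Thanks~~
1. Use virtualenv with Python 2.7
You can pin the version with this command: virtualenv -p python2.7 venv
If you have not used virtualenv before, see Liao's tutorial.
2.1 Install the MySQL-python dependencies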
yum install python-devel mysql-devel gcc
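python-devel and mysql-devel are yum packages and are not available through pip, so they could not be installed on the Mac. Skipping them caused no problems when running the project later.
gcc comes with Xcode. There are also ways to get it without installing Xcode, but I haven't tried them.
2.2 Install the lxml dependencies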
yum install libxslt-devel libxml2-devel
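These aren't available through pip either, but they can be installed with Homebrew.
I skipped this step at first too, but the program later failed at runtime. An answer on Stack Overflow fixed it: run the four commands below (Homebrew must be installed first).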
brew install libxml2
brew install libxslt
brew link libxml2 --force
brew link libxslt --force
2.3 Install the browser environment for the selenium dependency. (On a Mac you only need to install Firefox, but make sure the version is Firefox 36.0.)
Firefox 36.0 download for Mac: https://ftp.mozilla.org/pub/firefox/releases/36.0.1/mac/
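Since the guide pins Firefox to 36.0, a quick check of the installed version can save a debugging round. The following is a minimal sketch, assuming the Firefox binary reports its version as text such as "Mozilla Firefox 36.0.1"; the helper names `parse_firefox_version` and `is_supported` are made up for illustration and are not part of wechat-spider.

```python
import re

# From the guide above: the setup is expected to work with Firefox 36.0.x.
REQUIRED_MAJOR_MINOR = (36, 0)


def parse_firefox_version(version_output):
    """Extract (major, minor, patch) from text like 'Mozilla Firefox 36.0.1'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no version number found in %r" % version_output)
    major, minor, patch = match.groups()
    # A missing patch component (e.g. '36.0') is treated as 0.
    return int(major), int(minor), int(patch or 0)


def is_supported(version_output):
    """Return True when the reported version is in the 36.0 series."""
    return parse_firefox_version(version_output)[:2] == REQUIRED_MAJOR_MINOR
```

The version string would typically come from running the Firefox binary with `--version`; the exact binary path depends on the installation.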
2.4 Clone the code
$ git clone https://github.com/bowenpay/wechat-spider.git
$ cd wechat-spider
$ pip install -r requirements.txt
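This step is simple and trouble-free. If pip isn't using a mirror inside China, downloads will be very slow; see here for how to switch to a domestic mirror.
- Create the MySQL database
- Install and run Redis
- Update the local_settings configuration file
The three steps above are simple; just follow them. At step 6 I hit this error:
Error code 1045: Access denied for user 'root'@'localhost' (using password: YES)
The cause turned out to be that my MySQL account has a password set. Filling it into the 'PASSWORD': '' field in local_settings.py fixed it.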
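For reference, the fix amounts to a non-empty 'PASSWORD' entry in the Django database settings. This is only a sketch of a standard Django `DATABASES` block: apart from the 'PASSWORD' key and the root user mentioned above, every value (database name, host, port) is a placeholder and may differ from the project's actual local_settings.py.

```python
# local_settings.py (sketch; values are placeholders, not the project's defaults)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "wechatspider",              # hypothetical database name
        "USER": "root",
        "PASSWORD": "your-mysql-password",   # leave '' only if the user has no password
        "HOST": "localhost",
        "PORT": "3306",
    }
}
```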
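After that, everything went smoothly. Once inside the site, I found the author has done a great job: it is feature-complete and even includes a very handy search function.
After entering the official accounts, the last step is to start the four scripts, which confused me at first: should they be run one after another, each to completion, or in four windows, one per window? It turns out all four must be running at the same time, each in its own window.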
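As an alternative to opening four terminal windows by hand, the scripts can be started side by side from one small launcher. This is a hedged sketch using only the standard library, not something shipped with wechat-spider; the script paths are taken from the command line because this thread names only bin/downloader.py explicitly.

```python
import subprocess
import sys


def start_all(commands):
    """Start every command without waiting, so they all run concurrently."""
    return [subprocess.Popen(cmd) for cmd in commands]


def wait_all(processes):
    """Block until every process exits; return exit codes in start order."""
    return [proc.wait() for proc in processes]


if __name__ == "__main__":
    # Each argument is the path of one long-running script to launch
    # with the current Python interpreter.
    scripts = sys.argv[1:]
    procs = start_all([[sys.executable, path] for path in scripts])
    codes = wait_all(procs)
    sys.exit(1 if any(codes) else 0)
```

Saved under a name of your choice, it would be invoked with the four script paths as arguments. Output from all processes is interleaved in one terminal, which is the main trade-off compared with separate windows.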
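However, there is now a high chance of hitting a CAPTCHA, so the spider needs to be deployed on a server that rotates IPs automatically. I'll share more once I've sorted out the server deployment.
Thanks to the author @yijingping for open-sourcing such good software.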
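Official-account articles now seem to have permanent links: the WeChat official-account URLs that appear in Google search results and on 开发者头条 are all of this type. Does anyone know how these links are obtained?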
What are the login username and password for your demo site, http://wechatspider.0fenbei.com/? Could you share them so we can take a look?