itstyren / cnki-download
:frog: A crawler for downloading CNKI (知网) documents and document previews
License: MIT License
As the title says: in main.py, line 218,
refence_file = requests.get(self.download_url, headers=HEADER)
could this be changed to:
refence_file = self.session.get(self.download_url) ?
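For context, the point of the suggested change is that `self.session` carries the cookies and headers CNKI sets earlier in the search flow, while a bare `requests.get` starts from a clean slate. A minimal sketch (the cookie value is illustrative; the cookie name and URL are taken from the logs in this thread):

```python
import requests

HEADER = {"User-Agent": "Mozilla/5.0"}

session = requests.Session()
session.headers.update(HEADER)
# Simulate a cookie that an earlier response in the session set
session.cookies.set("cnkiUserKey", "example-key", domain="cnki.net")

# Preparing a request on the session shows that its cookie jar and headers
# are attached automatically; a plain requests.get() would send neither.
req = requests.Request("GET", "http://i.shufang.cnki.net/KRS/KRSWriteHandler.ashx")
prepared = session.prepare_request(req)
print("Cookie" in prepared.headers)
```

This is only a sketch of why the session matters, not a claim about what fixes the download.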
请输入【主题】:python
请输入【篇名】:网络
请输入【篇名】条件类型:(a)并且 (b)或者 (c)不含 c
--------------------------
是否需要规定文献来源(y/n)?y
输入文献来源期刊名称:电子技术与软件工程
正在检索中.....
--------------------------
检索到85条结果,全部下载大约需要00小时07分钟05秒。
是否要全部下载(y/n)?n
请输入需要下载的数量:1
开始下载前1页所有文件,预计用时00小时01分钟40秒
--------------------------
正在下载: Python在商品销售数据分析中的使用.caj
Traceback (most recent call last):
File "D:\idea\Data_mining\venv\lib\site-packages\urllib3\connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "D:\idea\Data_mining\venv\lib\site-packages\urllib3\util\connection.py", line 72, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "C:\Users\11815\AppData\Local\Programs\Python\Python310\lib\socket.py", line 955, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\idea\Data_mining\venv\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "D:\idea\Data_mining\venv\lib\site-packages\urllib3\connectionpool.py", line 398, in _make_request
conn.request(method, url, **httplib_request_kw)
File "D:\idea\Data_mining\venv\lib\site-packages\urllib3\connection.py", line 239, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "C:\Users\11815\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 1282, in request
self._send_request(method, url, body, headers, encode_chunked)
File "C:\Users\11815\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 1328, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "C:\Users\11815\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "C:\Users\11815\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 1037, in _send_output
self.send(msg)
File "C:\Users\11815\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 975, in send
self.connect()
File "D:\idea\Data_mining\venv\lib\site-packages\urllib3\connection.py", line 205, in connect
conn = self._new_conn()
File "D:\idea\Data_mining\venv\lib\site-packages\urllib3\connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x000002B698451D20>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\idea\Data_mining\venv\lib\site-packages\requests\adapters.py", line 489, in send
resp = conn.urlopen(
File "D:\idea\Data_mining\venv\lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "D:\idea\Data_mining\venv\lib\site-packages\urllib3\util\retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='i.shufang.cnki.net', port=80): Max retries exceeded with url: /KRS/KRSWriteHandler.ashx?curUrl=detail.aspx%3FdbCode%3DCJFQ%26fileName%3DDZRU202210049&referUrl=https%3A%2F%2Fkns.cnki.net%2Fkns%2Fbrief%2Fbrief.aspx%3Fpagename%3DASP.brief_default_result_aspx%26isinEn%3D1%26dbPrefix%3DSCDB%26dbCatalog%3D%25e4%25b8%25ad%25e5%259b%25bd%25e5%25ad%25a6%25e6%259c%25af%25e6%259c%259f%25e5%2588%258a%25e7%25bd%2591%25e7%25bb%259c%25e5%2587%25ba%25e7%2589%2588%25e6%2580%25bb%25e5%25ba%2593%26ConfigFile%3DCJFQ.xml%26research%3Doff%26t%3D1544249384932%26keyValue%3Dpython%26S%3D1%26sorttype%3D%23J_ORDER%26&cnkiUserKey=726a6f53-1896-b19c-b08a-c7edde6fcf0&action=file&userName=&td=1544605318654 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000002B698451D20>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\idea\Data_mining\数据挖掘\zhiwang\main.py", line 249, in <module>
main()
File "D:\idea\Data_mining\数据挖掘\zhiwang\main.py", line 243, in main
search.search_reference(get_uesr_inpt())
File "D:\idea\Data_mining\数据挖掘\zhiwang\main.py", line 88, in search_reference
self.parse_page(
File "D:\idea\Data_mining\数据挖掘\zhiwang\main.py", line 176, in parse_page
page_detail.get_detail_page(self.session, self.get_result_url,
File "D:\idea\Data_mining\数据挖掘\zhiwang\GetPageDetail.py", line 70, in get_detail_page
self.session.get(
File "D:\idea\Data_mining\venv\lib\site-packages\requests\sessions.py", line 600, in get
return self.request("GET", url, **kwargs)
File "D:\idea\Data_mining\venv\lib\site-packages\requests\sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "D:\idea\Data_mining\venv\lib\site-packages\requests\sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "D:\idea\Data_mining\venv\lib\site-packages\requests\adapters.py", line 565, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='i.shufang.cnki.net', port=80): Max retries exceeded with url: /KRS/KRSWriteHandler.ashx?curUrl=detail.aspx%3FdbCode%3DCJFQ%26fileName%3DDZRU202210049&referUrl=https%3A%2F%2Fkns.cnki.net%2Fkns%2Fbrief%2Fbrief.aspx%3Fpagename%3DASP.brief_default_result_aspx%26isinEn%3D1%26dbPrefix%3DSCDB%26dbCatalog%3D%25e4%25b8%25ad%25e5%259b%25bd%25e5%25ad%25a6%25e6%259c%25af%25e6%259c%259f%25e5%2588%258a%25e7%25bd%2591%25e7%25bb%259c%25e5%2587%25ba%25e7%2589%2588%25e6%2580%25bb%25e5%25ba%2593%26ConfigFile%3DCJFQ.xml%26research%3Doff%26t%3D1544249384932%26keyValue%3Dpython%26S%3D1%26sorttype%3D%23J_ORDER%26&cnkiUserKey=726a6f53-1896-b19c-b08a-c7edde6fcf0&action=file&userName=&td=1544605318654 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000002B698451D20>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
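The root cause in the traceback above is `getaddrinfo failed`: the hostname `i.shufang.cnki.net` no longer resolves in DNS at all, so no amount of retrying in the script will help. A quick stdlib check (this only tests name resolution, not whether the service still exists):

```python
import socket

def resolves(host: str) -> bool:
    """Return True if DNS can resolve the host; socket.gaierror is what the
    traceback shows as [Errno 11001] getaddrinfo failed on Windows."""
    try:
        socket.getaddrinfo(host, 80, proto=socket.IPPROTO_TCP)
        return True
    except socket.gaierror:
        return False

# resolves('i.shufang.cnki.net') returning False reproduces the failure above
```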
--------------------------
| |
| 请选择检索条件:(可多选) |
|(a)主题 (b)关键词 (c)篇名 |
|(d)摘要 (e)全文 (f)被引文献 |
|(g)中图分类号 |
| |
--------------------------
请选择(以空格分割,如a c):c
--------------------------
您选择的是:
篇名 |
--------------------------
请输入【篇名】:贫化铀
--------------------------
是否需要规定文献来源(y/n)?n
正在检索中.....
--------------------------
Traceback (most recent call last):
File "main.py", line 259, in <module>
main()
File "main.py", line 253, in main
search.search_reference(get_uesr_inpt())
File "main.py", line 99, in search_reference
self.pre_parse_page(second_get_res.text), second_get_res.text)
File "main.py", line 107, in pre_parse_page
page_source).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
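All of the `'NoneType' object has no attribute 'group'` reports in this thread have the same shape: `re.search(...)` found no match (because CNKI changed the page, or an error page came back) and the code chains `.group(1)` directly onto the `None`. A defensive helper, as a sketch (the pattern below is illustrative, not the repo's actual regex):

```python
import re

def safe_group(pattern, text, group=1):
    """re.search that returns None instead of raising AttributeError
    when the pattern is absent from the page source."""
    m = re.search(pattern, text)
    return m.group(group) if m else None

# An error page without the expected marker simply yields None
page_source = "<html><body>记录集失效</body></html>"
assert safe_group(r'找到.*?(\d+).*?条结果', page_source) is None
```

Checking for `None` lets the crawler report "no results / page layout changed" instead of crashing.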
Traceback (most recent call last):
File "C:\Users\Administrator\Desktop\CNKI-download\main.py", line 259, in <module>
main()
File "C:\Users\Administrator\Desktop\CNKI-download\main.py", line 253, in main
search.search_reference(get_uesr_inpt())
File "C:\Users\Administrator\Desktop\CNKI-download\main.py", line 98, in search_reference
self.parse_page(
File "C:\Users\Administrator\Desktop\CNKI-download\main.py", line 186, in parse_page
page_detail.get_detail_page(self.session, self.get_result_url,
File "C:\Users\Administrator\Desktop\CNKI-download\GetPageDetail.py", line 80, in get_detail_page
self.pars_page(get_res.text)
File "C:\Users\Administrator\Desktop\CNKI-download\GetPageDetail.py", line 89, in pars_page
orgn_list = soup.find(name='div', class_='orgn').find_all('a')
AttributeError: 'NoneType' object has no attribute 'find_all'
CNKI has changed its page source code; the part of the page that contains the search results
The only change I made was switching http to https, nothing else, but every downloaded file is only 2 KB and shows as corrupted when opened.
In actual use, I found that once a journal source is specified, the program can only retrieve journal articles, not other document types. For example, if I set the source to "xx University", the results come from "Journal of xx University" with the database listed as "期刊" (journals). Is there any way to retrieve master's and doctoral theses?
After reading the code, I found that this search condition is passed in as the parameter magazine_value1. I wanted to modify this parameter and tried several approaches, but could not work out what value to pass. My knowledge of crawlers and networking is quite shallow; how should this be changed? Thanks!
[crawl]
; crawl/download switches: 0 = off, 1 = on
isDownloadFile = 0
isCrackCode=1
isDetailPage=1
isDownLoadLink=0
stepWaitTime=3
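As a sanity check, the `[crawl]` section above parses cleanly with the stdlib `configparser` the project uses; a sketch with the section inlined as a string:

```python
import configparser

CONFIG_TEXT = """\
[crawl]
; crawl/download switches: 0 = off, 1 = on
isDownloadFile = 0
isCrackCode = 1
isDetailPage = 1
isDownLoadLink = 0
stepWaitTime = 3
"""

conf = configparser.ConfigParser()
conf.read_string(CONFIG_TEXT)
# option names are case-insensitive, so isDownLoadLink is found as-is
assert conf.getint('crawl', 'isDownLoadLink') == 0
assert conf.getint('crawl', 'stepWaitTime') == 3
```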
正在下载: 高中政治教学中渗透科学精神核心素养路径初探.caj
正在下载: 试论初中语文教学中学生表达能力的培养策略.caj
ERROR:root:出现验证码
Traceback (most recent call last):
File "/Users/caizhicheng/Desktop/CNKI_download/CNKI-download/main.py", line 144, in parse_page
tr_table.tr.extract()
AttributeError: 'NoneType' object has no attribute 'tr'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/caizhicheng/Desktop/CNKI_download/CNKI-download/main.py", line 259, in <module>
if __name__ == '__main__':
File "/Users/caizhicheng/Desktop/CNKI_download/CNKI-download/main.py", line 253, in main
search = SearchTools()
File "/Users/caizhicheng/Desktop/CNKI_download/CNKI-download/main.py", line 98, in search_reference
self.parse_page(
File "/Users/caizhicheng/Desktop/CNKI_download/CNKI-download/main.py", line 195, in parse_page
self.get_another_page(download_page_left)
File "/Users/caizhicheng/Desktop/CNKI_download/CNKI-download/main.py", line 209, in get_another_page
self.parse_page(download_page_left, get_res.text)
[the two frames above, parse_page (line 195) and get_another_page (line 209), repeat 7 more times as the functions recurse into each other]
File "/Users/caizhicheng/Desktop/CNKI_download/CNKI-download/main.py", line 149, in parse_page
crack.get_image(self.get_result_url, self.session,
File "/Users/caizhicheng/Desktop/CNKI_download/CNKI-download/CrackVerifyCode.py", line 34, in get_image
self.current_url = re.search(r'(.*?)#', current_url).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
The last line:
class CrackCode(object):
    def get_image(self, current_url, session, page_source):
        '''
        Fetch the CAPTCHA image
        '''
        self.header = HEADER
        self.session = session
        # get the CAPTCHA image URL
        imgurl_pattern_compile = re.compile(r'.*?<img src="(.*?)".*?')
        img_url = re.search(imgurl_pattern_compile, page_source).group(1)
        self.current_url = re.search(r'(.*?)#', current_url).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
I hit the CAPTCHA after about 200 downloads.
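The failing line `re.search(r'(.*?)#', current_url).group(1)` assumes the result-page URL always contains a `#` fragment; when CNKI serves a URL without one, `re.search` returns `None` and `.group(1)` raises the AttributeError shown. A hedged fix sketch:

```python
import re

def strip_fragment(current_url: str) -> str:
    """Return the URL up to the first '#', or the whole URL if there is none."""
    m = re.search(r'(.*?)#', current_url)
    return m.group(1) if m else current_url

assert strip_fragment('https://kns.cnki.net/brief.aspx#J_ORDER') == 'https://kns.cnki.net/brief.aspx'
assert strip_fragment('https://kns.cnki.net/brief.aspx') == 'https://kns.cnki.net/brief.aspx'
```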
是否需要规定文献来源(y/n)?n
正在检索中.....
--------------------------
Traceback (most recent call last):
File "/Users/Desktop/CNKI-download-master/main.py", line 259, in <module>
main()
File "/Users/Desktop/CNKI-download-master/main.py", line 253, in main
search.search_reference(get_uesr_inpt())
File "/Users/Desktop/CNKI-download-master/main.py", line 99, in search_reference
self.pre_parse_page(second_get_res.text), second_get_res.text)
File "/Users/Desktop/CNKI-download-master/main.py", line 106, in pre_parse_page
reference_num = re.search(reference_num_pattern_compile,
AttributeError: 'NoneType' object has no attribute 'group'
Please help!
It seems nothing was matched; single-condition searches by subject, keyword, and so on all raise this same error.
Traceback (most recent call last):
File "C:/Users/Lenovo/Downloads/CNKI-download-master/CNKI-download-master/main.py", line 246, in <module>
main()
File "C:/Users/Lenovo/Downloads/CNKI-download-master/CNKI-download-master/main.py", line 240, in main
search.search_reference(get_uesr_inpt())
File "C:/Users/Lenovo/Downloads/CNKI-download-master/CNKI-download-master/main.py", line 87, in search_reference
second_get_res.text).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
After crawling 100-odd papers, the CAPTCHA handling broke with:
ERROR:root:出现验证码
Traceback (most recent call last):
File "main.py", line 144, in parse_page
tr_table.tr.extract()
AttributeError: 'NoneType' object has no attribute 'tr'
It errors out whether or not CAPTCHA recognition is enabled. Could you take a look at the cause? Thanks!
The code as forked is not directly usable,
so I modified it a bit:
def depoint(self, img):
    """Denoise a binarized image: clear pixels with too many white neighbors."""
    pixdata = img.load()
    w, h = img.size
    # offsets of the 8 surrounding pixels
    neighbors = [(0, -1), (0, 1), (-1, 0), (1, 0),       # up, down, left, right
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]     # the four diagonals
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # count white (>245) neighbors; mostly-white surroundings mean noise
            count = sum(1 for dx, dy in neighbors if pixdata[x + dx, y + dy] > 245)
            if count > 4:
                pixdata[x, y] = 255
    return img
def imge2string(self, image, threshold):
    """
    Convert the image to a string,
    denoising according to `threshold`.
    """
    # convert to grayscale
    image = image.convert('L')
    # binarize
    image = image.point(lambda x: 255 if x > threshold else 0)
    # further denoising
    image = self.depoint(image)
    # recognition // still broken here: tesserocr returns empty content
    result = tesserocr.image_to_text(image)
    print(str(threshold) + " recognized CAPTCHA: " + str(result))
    return result

def crack_code(self):
    '''
    Recognize the CAPTCHA automatically
    '''
    image = Image.open('./data/crack_code.jpeg')
    # converted to grayscale inside imge2string
    # set the binarization threshold
    threshold = 127
    s1 = self.imge2string(image, threshold)
    s2 = self.imge2string(image, threshold + 20)
    s3 = self.imge2string(image, threshold - 20)
    # accept a value that at least two of the three thresholds agree on
    if s1 == s2 == s3 or s1 == s2 or s1 == s3:
        return self.send_code(str(s1))
    elif s2 == s3:
        return self.send_code(str(s2))
The problem is at result = tesserocr.image_to_text(image):
no matter how I recognize or preprocess the image, tesserocr always returns an empty result.
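The image-preparation half of the pipeline can be checked without PIL or tesserocr by applying the same point function to raw pixel values; if this part behaves as expected, the empty result points at Tesseract itself (for example missing language data, or an image still too noisy after binarization). The function below is an illustrative stand-in for `image.point(...)`, not the repo's code:

```python
THRESHOLD = 127

def binarize(pixels, threshold=THRESHOLD):
    """The same mapping as image.point(lambda x: 255 if x > threshold else 0)."""
    return [255 if p > threshold else 0 for p in pixels]

row = [30, 200, 140, 90, 250]
assert binarize(row) == [0, 255, 255, 0, 255]
# raising the threshold by 20, as crack_code does, flips the borderline pixel
assert binarize(row, THRESHOLD + 20) == [0, 255, 0, 0, 255]
```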
Even after changing all the links to https, the error below still occurs. How can this be fixed?
D:\chromedownloads\CNKI-download-master\CNKI-download-master>python main.py
--------------------------
| |
| 请选择检索条件:(可多选) |
|(a)主题 (b)关键词 (c)篇名 |
|(d)摘要 (e)全文 (f)被引文献 |
|(g)中图分类号 |
| |
--------------------------
请选择(以空格分割,如a c):a
--------------------------
您选择的是:
主题 |
--------------------------
请输入【主题】:asdf
--------------------------
是否需要规定文献来源(y/n)?y
输入文献来源期刊名称:
正在检索中.....
--------------------------
检索到4条结果,全部下载大约需要00小时00分钟20秒。
是否要全部下载(y/n)?y
正在下载: 中信:决战澳矿.caj
Traceback (most recent call last):
File "D:\Users\18301\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 600, in urlopen
chunked=chunked)
File "D:\Users\18301\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 343, in _make_request
self._validate_conn(conn)
File "D:\Users\18301\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 839, in _validate_conn
conn.connect()
File "D:\Users\18301\anaconda3\lib\site-packages\urllib3\connection.py", line 364, in connect
_match_hostname(cert, self.assert_hostname or server_hostname)
File "D:\Users\18301\anaconda3\lib\site-packages\urllib3\connection.py", line 374, in _match_hostname
match_hostname(cert, asserted_hostname)
File "D:\Users\18301\anaconda3\lib\ssl.py", line 334, in match_hostname
% (hostname, ', '.join(map(repr, dnsnames))))
ssl.SSLCertVerificationError: ("hostname 'i.shufang.cnki.net' doesn't match either of '.cnki.net', 'www.cnki.net', '.global.cnki.net', '*.oversea.cnki.net', 'big5.book.oversea.cnki.net', 'caj.d.cnki.net', 'caj.oversea.d.cnki.net', 'en.cend.cnki.net', 'eng.tcm.cnki.net', 'gb.book.oversea.cnki.net', 'gb.cend.cnki.net', 'gb.cnbar.cnki.net', 'gb.obaor.cnki.net', 'gb.sczlmz.cnki.net', 'gb.sczlzj.cnki.net', 'gb.tcm.cnki.net', 'kb.tcm.cnki.net', 'oversea.d.cnki.net', 'pdf.d.cnki.net', 'pdf.oversea.d.cnki.net', 'tra.tcm.cnki.net', 'cnki.net'",)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Users\18301\anaconda3\lib\site-packages\requests\adapters.py", line 449, in send
timeout=timeout
File "D:\Users\18301\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "D:\Users\18301\anaconda3\lib\site-packages\urllib3\util\retry.py", line 398, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='i.shufang.cnki.net', port=443): Max retries exceeded with url: /KRS/KRSWriteHandler.ashx?curUrl=detail.aspx%3FdbCode%3DCJFQ%26fileName%3DJJDK202009024&referUrl=https%3A%2F%2Fkns.cnki.net%2Fkns%2Fbrief%2Fbrief.aspx%3Fpagename%3DASP.brief_default_result_aspx%26isinEn%3D1%26dbPrefix%3DSCDB%26dbCatalog%3D%25e4%25b8%25ad%25e5%259b%25bd%25e5%25ad%25a6%25e6%259c%25af%25e6%259c%259f%25e5%2588%258a%25e7%25bd%2591%25e7%25bb%259c%25e5%2587%25ba%25e7%2589%2588%25e6%2580%25bb%25e5%25ba%2593%26ConfigFile%3DCJFQ.xml%26research%3Doff%26t%3D1544249384932%26keyValue%3D%25E6%259B%25BE%25E6%2599%25A8%26S%3D1%26sorttype%3D%23J_ORDER%26&cnkiUserKey=199bceef-d913-a550-9ff0-b5614a82b64&action=file&userName=&td=1544605318654 (Caused by SSLError(SSLCertVerificationError("hostname 'i.shufang.cnki.net' doesn't match either of '.cnki.net', 'www.cnki.net', '.global.cnki.net', '*.oversea.cnki.net', 'big5.book.oversea.cnki.net', 'caj.d.cnki.net', 'caj.oversea.d.cnki.net', 'en.cend.cnki.net', 'eng.tcm.cnki.net', 'gb.book.oversea.cnki.net', 'gb.cend.cnki.net', 'gb.cnbar.cnki.net', 'gb.obaor.cnki.net', 'gb.sczlmz.cnki.net', 'gb.sczlzj.cnki.net', 'gb.tcm.cnki.net', 'kb.tcm.cnki.net', 'oversea.d.cnki.net', 'pdf.d.cnki.net', 'pdf.oversea.d.cnki.net', 'tra.tcm.cnki.net', 'cnki.net'")))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 259, in <module>
main()
File "main.py", line 253, in main
search.search_reference(get_uesr_inpt())
File "main.py", line 99, in search_reference
self.pre_parse_page(second_get_res.text), second_get_res.text)
File "main.py", line 188, in parse_page
self.download_url)
File "D:\chromedownloads\CNKI-download-master\CNKI-download-master\GetPageDetail.py", line 73, in get_detail_page
params=params)
File "D:\Users\18301\anaconda3\lib\site-packages\requests\sessions.py", line 546, in get
return self.request('GET', url, **kwargs)
File "D:\Users\18301\anaconda3\lib\site-packages\requests\sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "D:\Users\18301\anaconda3\lib\site-packages\requests\sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "D:\Users\18301\anaconda3\lib\site-packages\requests\adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='i.shufang.cnki.net', port=443): Max retries exceeded with url: /KRS/KRSWriteHandler.ashx?curUrl=detail.aspx%3FdbCode%3DCJFQ%26fileName%3DJJDK202009024&referUrl=https%3A%2F%2Fkns.cnki.net%2Fkns%2Fbrief%2Fbrief.aspx%3Fpagename%3DASP.brief_default_result_aspx%26isinEn%3D1%26dbPrefix%3DSCDB%26dbCatalog%3D%25e4%25b8%25ad%25e5%259b%25bd%25e5%25ad%25a6%25e6%259c%25af%25e6%259c%259f%25e5%2588%258a%25e7%25bd%2591%25e7%25bb%259c%25e5%2587%25ba%25e7%2589%2588%25e6%2580%25bb%25e5%25ba%2593%26ConfigFile%3DCJFQ.xml%26research%3Doff%26t%3D1544249384932%26keyValue%3D%25E6%259B%25BE%25E6%2599%25A8%26S%3D1%26sorttype%3D%23J_ORDER%26&cnkiUserKey=199bceef-d913-a550-9ff0-b5614a82b64&action=file&userName=&td=1544605318654 (Caused by SSLError(SSLCertVerificationError("hostname 'i.shufang.cnki.net' doesn't match either of '.cnki.net', 'www.cnki.net', '.global.cnki.net', '*.oversea.cnki.net', 'big5.book.oversea.cnki.net', 'caj.d.cnki.net', 'caj.oversea.d.cnki.net', 'en.cend.cnki.net', 'eng.tcm.cnki.net', 'gb.book.oversea.cnki.net', 'gb.cend.cnki.net', 'gb.cnbar.cnki.net', 'gb.obaor.cnki.net', 'gb.sczlmz.cnki.net', 'gb.sczlzj.cnki.net', 'gb.tcm.cnki.net', 'kb.tcm.cnki.net', 'oversea.d.cnki.net', 'pdf.d.cnki.net', 'pdf.oversea.d.cnki.net', 'tra.tcm.cnki.net', 'cnki.net'")))
D:\chromedownloads\CNKI-download-master\CNKI-download-master>
正在检索中.....
--------------------------
Traceback (most recent call last):
File "/Users/irimsky/Downloads/CNKI-download-master/main.py", line 254, in <module>
main()
File "/Users/irimsky/Downloads/CNKI-download-master/main.py", line 248, in main
search.search_reference(get_uesr_inpt())
File "/Users/irimsky/Downloads/CNKI-download-master/main.py", line 92, in search_reference
second_get_res.text).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<title>概览页</title> <script type="text/javascript" src="//piccache.cnki.net/kdn/nvsmkns/script/jquery-1.4.2.min.js"></script> <script type="text/javascript" src="//piccache.cnki.net/kdn/nvsmkns/script/min/gb.BriefPage.min.js?v=D59787997F3B8FCE" ></script> <script type="text/javascript" src="//piccache.cnki.net/kdn/nvsmkns/script/WideScreen.js"></script><input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="/wEdAAIdIRdFWpKA4BYUdQL6KgtZbgvJz/EQj5rQumcYDL2xlhbi/alS8mbfX8EEY1efJIm0syhzU+O3mZ5ahqVI454K" />
$(document).ready(function () {
qkInfoCall();
setAuShow();
// GetHeat();
window.parent.HideWaitDiv();
SetFrameHeight();
isHasAddFav();
try{
parent.window.adsContainer.loadAds(parent.document.getElementById("txt_1_value1").value);
}
catch(e){}
});
</script>
<table border=0 ><tr><td>记录集失效</td></tr></table>
<input name="tpagemode" type="hidden" id="tpagemode" value="L" />
</form>
<script type="text/javascript">
try{}catch(e){}
ChangeDownloadImg();
RevertUserSelect();
briefTableListJSEvent();
BindOnlick_ShowWait();
BindTurnPage_TitleTip();
parent.$("#zyzklist").hide();
//外文推荐
// recWWJDAddToTable();20170921,增加中英文混合检索,不需要再加载外文推荐
var analysisURL = "/KVisual/ArticleAnalysis/index";
modifySql();
function modifySql() {
var param = "";
if (param == null || param == "") {
var obj = parent.document.getElementById("sql");
if (obj) {
obj.value = "2827E4B6502D8710744CC7825A00F3F82BAB6FF9F49C28A8C06DBD3C5D73A36E7A6B95F18DA4019E021F3F1691F6B0A03B99C056E48A0254F8D0AFE1AAB57A9BFDCBBFFEEAAFA080E188818637CF6AADB3910F9CB0D5384C288BBBD10EE5B756BFAE86E762F5587544067EFCB6335F1551B1752FED7007F848A2F65F6361E4969CA97A467AE7DFF1D65FA2333691AE914B807EA865F98F2B4DA7F0E5B53CEDD31A34E99814BE79036EC7A23B28568767B543605EEB42085FF85A2AC02FD02AE188F025E7ADBB5D5456124701C643F785C0E8F466CEE182F0A51495CB44F3F039D6E5D62B005E08337F47C8371201A0DFFCD7B64073A1CD0D600811A47AC221B26485DE690B81866288498CD8DECB643D5A64546FA6FF6D41267ACBE6078EC4D35DF08B166A076AEAA5A7E0C875747A661813A88146D8A0137BBB953F17A54818672367305E80A265A56051CB57C24AD39C2D00E0684CDFA37DD96554F37EE38FD19E0CD5CE82D88DA5FAE4A2031AE3E919BB498FF0449A5F52A7D842DC60B2BD843E9B9509F4BC42505450294895655B83E5650C9144C860DA8E88EE4C6B08E27624BDE654E1FC7AF299653113BC029D0992FBF45C30DBB551D112D5C03A389CD1052A01C8786C738A9F5DF0C441D49E11AFA9584FF3A277196FFB1CA6BDC1A25E6772206DF8EFC2D5447DFAB86DBAA1613C34E184FC2B7B55377B7884B29AECB4936D0C467D89B9E4E9369F64918AEBD8384D5D249B77A9B49004D8D15D3A7ED0C89DFFE7113205E7BB1299D4FC6B0DA8ACB80F7FAC8108D4A4E64B60670662A952D1BE0AE397082DA211E56C8C828AA8E92C268A3FBC1B6198341E104077130909FA61E6683103C1254083F67147DEBD755F6092E3F90395E0CA27CAB3B84317BB47FA03DB85EEBC5B615F588F9DD26A526A277A46AD88D604D532A35D63E94F900E98D9D0C37B0A7BEC09EDDB1D89099BCBF1F2A3A8E1653D4EDD15965D90A79F1A31D6B9BF54835DF333410FFD5BA72C9A8D7B57E62F44302072FFE974835BDE3FE5299B779AF41A80BD39D540926EDC484B56409B2C66FFC44338DD0F61DF4706323FF89C933DADC03DC5BE11F75426D85B473DCFAE42917F52A585ADD81ED18A1EA75F13D4F70F5E8EA50D223A9342048E7986AECE95607D7476F386A9%";
}
}
}
//绑定分析
$("#analysisBox").hoverDelay({
hoverDuring: 200,
outDuring: 0,
hoverEvent: function () {
var $this = $(this);
//显示数字
var fileNameS = new FileNameS();
var pcnt = fileNameS.Count();
var rcnt = 643822;
var ptext = pcnt > -10 ? "<span>(" + pcnt + ")</span>" : "";//始终为真
var rtext = rcnt > -10 ? "<span>(" + rcnt + ")</span>" : "";
$this.find("a").eq(0).html("已选文献分析" + ptext);
//$this.find("a").eq(1).html("检索结果");
$this.find(".imiSelDp").show();
},
outEvent: function () {
$(this).find(".imiSelDp").hide();
}
});
//排序方式缓存 add by LH 2017-7-26
$("#J_ORDER .Btn5 a").click(function() {
SetSortTypeCookie(this);
});
function SetSortTypeCookie(elm) {
var sorttype = GetQueryStringByName($(elm).attr("href"), "sorttype");
var Days = 7;
var exp = new Date();
exp.setTime(exp.getTime() + Days * 24 * 60 * 60 * 1000);
var dbcode = GetQueryStringByName(window.location.href, "dbPrefix");
document.cookie = "KNS_SortType" + "=" + escape(dbcode+"!"+sorttype) + ";expires=" + exp.toGMTString() + ";path=/";
}
window.document.onclick = parent.OnclickForHideMoredo;
</script>
<style type="text/css">
.flyingAdd
{ left: -100px;
top: 0px;
position: absolute;
width: 50px;
text-align: center;
height: 50px;
font-size: 50px;
color: #999;
z-index: 50000;
}
/*等待*/
.loading {
position: absolute;
width: 232px;
height: 32px;
z-index: 300;
background: url(../images/gb/loading.gif) no-repeat scroll center center transparent;
}
</style>
<div style="left: -1000px; top: -100px; opacity: 1; font-size: 50px;" class="flyingAdd">
<img src="../images/gb/checkboxbook.png" alt="" />
</div>
<script type="text/javascript">
LoadScript('/KRS/scripts/Recommend.js');
LoadScript('//piccache.cnki.net/kdn/nvsmkns/script/piwikCommon70.js');
</script>
After downloading the project from GitHub and finishing the configuration, the .caj files I get contain webpage source code when opened, not CAJ documents.
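A tiny `.caj` that opens as webpage source means the server returned an error page (note the `记录集失效`, "record set expired", table in the dump above) instead of the document, and the crawler saved it anyway. One defensive check before writing the file, as a sketch:

```python
def looks_like_html(payload: bytes) -> bool:
    """Error pages start with an HTML marker; a real CAJ download does not."""
    head = payload.lstrip()[:64].lower()
    return head.startswith(b'<!doctype') or head.startswith(b'<html')

assert looks_like_html(b'  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0')
assert not looks_like_html(b'\xc8\x93CAJ binary payload')
```

Skipping (or retrying) a download whose body passes this check would avoid filling the output directory with 2 KB "corrupted" files.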
Traceback (most recent call last):
File "F:\Python\CNKI-爬虫download\main.py", line 27, in <module>
from GetPageDetail import page_detail
File "F:\Python\CNKI-爬虫download\GetPageDetail.py", line 203, in <module>
page_detail = PageDetail()
File "F:\Python\CNKI-爬虫download\GetPageDetail.py", line 39, in __init__
if config.crawl_isDownLoadLink == '1':
File "F:\Python\CNKI-爬虫download\GetConfig.py", line 30, in get
value = self.func(instance)
File "F:\Python\CNKI-爬虫download\GetConfig.py", line 75, in crawl_isDownLoadLink
return int(self.conf.get('crawl', 'isDownLoadLink'))
File "F:\Anaconda\envs\tensorflow-gpu\lib\configparser.py", line 781, in get
d = self._unify_values(section, vars)
File "F:\Anaconda\envs\tensorflow-gpu\lib\configparser.py", line 1141, in _unify_values
raise NoSectionError(section)
configparser.NoSectionError: No section: 'crawl'
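`configparser.NoSectionError: No section: 'crawl'` almost always means the config file was not found: `ConfigParser.read()` silently ignores missing paths and leaves the parser empty, so the error surfaces later at the first lookup. A sketch of the failure mode (the filename is a deliberately nonexistent placeholder):

```python
import configparser

conf = configparser.ConfigParser()
loaded = conf.read('no_such_config.ini')  # a missing file is silently skipped
assert loaded == []                       # nothing was actually read...
assert not conf.has_section('crawl')      # ...so the 'crawl' section is absent
# conf.get('crawl', 'isDownLoadLink') would now raise NoSectionError
```

Checking that `read()` returned the expected filename, or running the script from the directory containing the config, avoids this.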
As follows:
Traceback (most recent call last):
File "D:\Python-3.83\lib\site-packages\urllib3\connectionpool.py", line 667, in urlopen
self._prepare_proxy(conn)
File "D:\Python-3.83\lib\site-packages\urllib3\connectionpool.py", line 930, in _prepare_proxy
conn.connect()
File "D:\Python-3.83\lib\site-packages\urllib3\connection.py", line 396, in connect
_match_hostname(cert, self.assert_hostname or server_hostname)
File "D:\Python-3.83\lib\site-packages\urllib3\connection.py", line 406, in _match_hostname
match_hostname(cert, asserted_hostname)
File "D:\Python-3.83\lib\ssl.py", line 416, in match_hostname
raise CertificateError("hostname %r "
ssl.SSLCertVerificationError: ("hostname 'i.shufang.cnki.net' doesn't match either of '.cnki.net', 'www.cnki.net', '.global.cnki.net', '*.oversea.cnki.net', 'big5.book.oversea.cnki.net', 'caj.d.cnki.net', 'caj.oversea.d.cnki.net', 'en.cend.cnki.net', 'eng.tcm.cnki.net', 'gb.book.oversea.cnki.net', 'gb.cend.cnki.net', 'gb.cnbar.cnki.net', 'gb.obaor.cnki.net', 'gb.sczlmz.cnki.net', 'gb.sczlzj.cnki.net', 'gb.tcm.cnki.net', 'kb.tcm.cnki.net', 'oversea.d.cnki.net', 'pdf.d.cnki.net', 'pdf.oversea.d.cnki.net', 'tra.tcm.cnki.net', 'cnki.net'",)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Python-3.83\lib\site-packages\requests\adapters.py", line 439, in send
resp = conn.urlopen(
File "D:\Python-3.83\lib\site-packages\urllib3\connectionpool.py", line 724, in urlopen
retries = retries.increment(
File "D:\Python-3.83\lib\site-packages\urllib3\util\retry.py", line 439, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='i.shufang.cnki.net', port=443): Max retries exceeded with url: /KRS/KRSWriteHandler.ashx?curUrl=detail.aspx%3FdbCode%3DCJFQ%26fileName%3DTYGY20210331002&referUrl=https%3A%2F%2Fkns.cnki.net%2Fkns%2Fbrief%2Fbrief.aspx%3Fpagename%3DASP.brief_default_result_aspx%26isinEn%3D1%26dbPrefix%3DSCDB%26dbCatalog%3D%25e4%25b8%25ad%25e5%259b%25bd%25e5%25ad%25a6%25e6%259c%25af%25e6%259c%259f%25e5%2588%258a%25e7%25bd%2591%25e7%25bb%259c%25e5%2587%25ba%25e7%2589%2588%25e6%2580%25bb%25e5%25ba%2593%26ConfigFile%3DCJFQ.xml%26research%3Doff%26t%3D1544249384932%26keyValue%3D%25E8%25AF%25AD%25E9%259F%25B3%25E8%25AF%2586%25E5%2588%25AB%26S%3D1%26sorttype%3D%23J_ORDER%26&cnkiUserKey=d522e520-357b-a254-9bd3-9e95fdce484&action=file&userName=&td=1544605318654 (Caused by SSLError(SSLCertVerificationError("hostname 'i.shufang.cnki.net' doesn't match either of '.cnki.net', 'www.cnki.net', '.global.cnki.net', '*.oversea.cnki.net', 'big5.book.oversea.cnki.net', 'caj.d.cnki.net', 'caj.oversea.d.cnki.net', 'en.cend.cnki.net', 'eng.tcm.cnki.net', 'gb.book.oversea.cnki.net', 'gb.cend.cnki.net', 'gb.cnbar.cnki.net', 'gb.obaor.cnki.net', 'gb.sczlmz.cnki.net', 'gb.sczlzj.cnki.net', 'gb.tcm.cnki.net', 'kb.tcm.cnki.net', 'oversea.d.cnki.net', 'pdf.d.cnki.net', 'pdf.oversea.d.cnki.net', 'tra.tcm.cnki.net', 'cnki.net'")))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/ASUS/PycharmProjects/1111111/知网/CNKI-download-master/main.py", line 259, in <module>
main()
File "C:/Users/ASUS/PycharmProjects/1111111/知网/CNKI-download-master/main.py", line 253, in main
search.search_reference(get_uesr_inpt())
File "C:/Users/ASUS/PycharmProjects/1111111/知网/CNKI-download-master/main.py", line 98, in search_reference
self.parse_page(
File "C:/Users/ASUS/PycharmProjects/1111111/知网/CNKI-download-master/main.py", line 186, in parse_page
page_detail.get_detail_page(self.session, self.get_result_url,
File "C:\Users\ASUS\PycharmProjects\1111111\知网\CNKI-download-master\GetPageDetail.py", line 70, in get_detail_page
self.session.get(
File "D:\Python-3.83\lib\site-packages\requests\sessions.py", line 555, in get
return self.request('GET', url, **kwargs)
File "D:\Python-3.83\lib\site-packages\requests\sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "D:\Python-3.83\lib\site-packages\requests\sessions.py", line 677, in send
history = [resp for resp in gen]
File "D:\Python-3.83\lib\site-packages\requests\sessions.py", line 677, in <listcomp>
history = [resp for resp in gen]
File "D:\Python-3.83\lib\site-packages\requests\sessions.py", line 237, in resolve_redirects
resp = self.send(
File "D:\Python-3.83\lib\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "D:\Python-3.83\lib\site-packages\requests\adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='i.shufang.cnki.net', port=443): Max retries exceeded with url: /KRS/KRSWriteHandler.ashx?curUrl=detail.aspx%3FdbCode%3DCJFQ%26fileName%3DTYGY20210331002&referUrl=https%3A%2F%2Fkns.cnki.net%2Fkns%2Fbrief%2Fbrief.aspx%3Fpagename%3DASP.brief_default_result_aspx%26isinEn%3D1%26dbPrefix%3DSCDB%26dbCatalog%3D%25e4%25b8%25ad%25e5%259b%25bd%25e5%25ad%25a6%25e6%259c%25af%25e6%259c%259f%25e5%2588%258a%25e7%25bd%2591%25e7%25bb%259c%25e5%2587%25ba%25e7%2589%2588%25e6%2580%25bb%25e5%25ba%2593%26ConfigFile%3DCJFQ.xml%26research%3Doff%26t%3D1544249384932%26keyValue%3D%25E8%25AF%25AD%25E9%259F%25B3%25E8%25AF%2586%25E5%2588%25AB%26S%3D1%26sorttype%3D%23J_ORDER%26&cnkiUserKey=d522e520-357b-a254-9bd3-9e95fdce484&action=file&userName=&td=1544605318654 (Caused by SSLError(SSLCertVerificationError("hostname 'i.shufang.cnki.net' doesn't match either of '.cnki.net', 'www.cnki.net', '.global.cnki.net', '*.oversea.cnki.net', 'big5.book.oversea.cnki.net', 'caj.d.cnki.net', 'caj.oversea.d.cnki.net', 'en.cend.cnki.net', 'eng.tcm.cnki.net', 'gb.book.oversea.cnki.net', 'gb.cend.cnki.net', 'gb.cnbar.cnki.net', 'gb.obaor.cnki.net', 'gb.sczlmz.cnki.net', 'gb.sczlzj.cnki.net', 'gb.tcm.cnki.net', 'kb.tcm.cnki.net', 'oversea.d.cnki.net', 'pdf.d.cnki.net', 'pdf.oversea.d.cnki.net', 'tra.tcm.cnki.net', 'cnki.net'")))
Hoping for your help with this.
Found 69 results; downloading them all will take about 00h 05m 45s.
Download all of them (y/n)? y
Downloading: 基于文字识别技术的作业自动批改系统.caj
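A common workaround for this hostname-mismatch SSLError is to skip certificate verification for the affected requests. This is a minimal sketch, not the project's actual code, and it trades away TLS security, so use it only if you accept that risk:

```python
import requests
import urllib3

# The certificate presented by i.shufang.cnki.net does not list that
# hostname, so requests raises SSLError during verification. Skipping
# verification is a pragmatic (but insecure) workaround; silence the
# warning noise it produces.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

def fetch_insecure(session: requests.Session, url: str) -> requests.Response:
    """GET a URL without TLS certificate verification (use sparingly)."""
    return session.get(url, verify=False, timeout=30)
```

Replacing only the failing calls (rather than setting `session.verify = False` globally) keeps verification for the hosts whose certificates are valid.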
Traceback (most recent call last):
File "main.py", line 259, in
main()
File "main.py", line 253, in main
search.search_reference(get_uesr_inpt())
File "main.py", line 99, in search_reference
self.pre_parse_page(second_get_res.text), second_get_res.text)
File "main.py", line 188, in parse_page
self.download_url)
File "D:\paper_search_program\CNKI-download-master\GetPageDetail.py", line 80, in get_detail_page
self.pars_page(get_res.text)
File "D:\paper_search_program\CNKI-download-master\GetPageDetail.py", line 89, in pars_page
orgn_list = soup.find(name='div', class_='orgn').find_all('a')
AttributeError: 'NoneType' object has no attribute 'find_all'
How should I fix this? Hoping the author can help. Thanks!
python main.py
--------------------------
|                                                  |
| Select search field(s): (multiple allowed)       |
| (a) Subject   (b) Keyword     (c) Title          |
| (d) Abstract  (e) Full text   (f) Cited reference|
| (g) CLC number                                   |
|                                                  |
--------------------------
Enter your choice(s) (space-separated, e.g. a c): c
--------------------------
You selected:
Title
--------------------------
Enter [Title]: 汉服
--------------------------
Restrict the publication source (y/n)? n
Searching.....
--------------------------
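The `'NoneType' object has no attribute 'find_all'` crash in `pars_page` happens when `soup.find(name='div', class_='orgn')` finds nothing and `.find_all('a')` is called on `None`. A defensive sketch (assuming BeautifulSoup, which the project already uses) that degrades gracefully instead of crashing:

```python
from bs4 import BeautifulSoup  # the project already depends on bs4

def parse_orgn(page_html):
    """Return the author-affiliation link texts from a detail page,
    or an empty list when the div.orgn block is missing entirely
    (which is what triggers the AttributeError above)."""
    soup = BeautifulSoup(page_html, "html.parser")
    orgn_div = soup.find(name="div", class_="orgn")
    if orgn_div is None:
        return []
    return [a.get_text(strip=True) for a in orgn_div.find_all("a")]
```

If the block is missing because CNKI served a login or captcha page rather than the article, skipping the record (as the poster below tried) hides the real problem; logging the raw HTML when the list comes back empty makes that case visible.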
Traceback (most recent call last):
File "main.py", line 259, in
main()
File "main.py", line 253, in main
search.search_reference(get_uesr_inpt())
File "main.py", line 99, in search_reference
self.pre_parse_page(second_get_res.text), second_get_res.text)
File "main.py", line 107, in pre_parse_page
page_source).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
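The `re.search(...).group(1)` crash means the pattern matched nothing, typically because CNKI returned a login or verification page instead of a result list. A guarded sketch; the pattern here is hypothetical, not the project's actual `reference_num_pattern_compile`:

```python
import re

# Hypothetical pattern for the result-count text on the search page;
# the project's real pattern lives in main.py as reference_num_pattern_compile.
reference_num_pattern = re.compile(r"找到\s*([0-9,]+)\s*条结果")

def extract_reference_num(page_source):
    """Parse the result count, raising a readable error instead of an
    AttributeError when the marker is absent from the page."""
    match = re.search(reference_num_pattern, page_source)
    if match is None:
        raise RuntimeError(
            "Result count not found - CNKI probably returned a login "
            "or captcha page; check cookies / network access.")
    return int(match.group(1).replace(",", ""))
```

The explicit error message turns a cryptic `NoneType` traceback into a hint about the actual cause.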
I'm off campus with no public-network access to CNKI; downloading requires logging in with a username and password. How do I handle that?
Traceback (most recent call last):
File "C:\Users\Yuchl\Downloads\CNKI-download-master\main.py", line 259, in
main()
File "C:\Users\Yuchl\Downloads\CNKI-download-master\main.py", line 253, in main
search.search_reference(get_uesr_inpt())
File "C:\Users\Yuchl\Downloads\CNKI-download-master\main.py", line 99, in search_reference
self.pre_parse_page(second_get_res.text), second_get_res.text)
File "C:\Users\Yuchl\Downloads\CNKI-download-master\main.py", line 106, in pre_parse_page
reference_num = re.search(reference_num_pattern_compile,
AttributeError: 'NoneType' object has no attribute 'group'
Error message:
When retrieving article details, a "NoneType ... find_all('a')" error occurs.
What I tried:
I added an if check that skips a record when find() can't locate the needed info (the author affiliation), but then the generated Excel file has no abstracts or keywords at all.
My guess:
I printed the scraped soup and found the fetched HTML contains no abstract at all (the same article does show an abstract on the website). Has the CNKI endpoint changed again? My knowledge of crawlers is shallow, so I'd sincerely appreciate the author taking a moment to answer. Thanks!
I got CNKI access working via my school's IP, but downloading requires a captcha for every single article (a Selenium-driven browser also hits a captcha per article), while requests from a real browser download directly. Is some small parameter missing, or what?
Looking at CNKI-download's download code, it is just a plain GET request with headers added, and it comes back as a 404.
import requests

headers = {
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Accept-Language': 'zh-CN,zh;q=0.9,en-GB;q=0.8,en;q=0.7',
    # 'Cookie': 'SID=020197; Ecp_LoginStuts={"IsAutoLogin":false,"UserName":"DX0434","ShowName":"%e6%b5%99%e6%b1%9f%e7%90%86%e5%b7%a5%e5%a4%a7%e5%ad%a6","UserType":"bk","BUserName":"","BShowName":"","BUserType":"","r":"0rHTHE"}; c_m_LinID=LinID=WEEvREcwSlJHSldRa1FhcEFLUmVicE1SUFRzQTZEZW5Va0VWYitsa2NPMD0=$9A4hF_YAuvQ5obgVAqNKPCYcEjKensW4IQMovwHtwkF4VYPoHbKxJw!!&ot=06/19/2020 13:54:08; LID=WEEvREcwSlJHSldRa1FhcEFLUmVicE1SUFRzQTZEZW5Va0VWYitsa2NPMD0=$9A4hF_YAuvQ5obgVAqNKPCYcEjKensW4IQMovwHtwkF4VYPoHbKxJw!!; c_m_expire=2020-06-19 13:54:08; Ecp_session=1; ASP.NET_SessionId=vughxubnlqvnxrf0vtd0brwz; Ecp_ClientId=5200619133401915832'
}

session = requests.Session()
session.headers.update(headers)

# IP-based login
r = session.get(
    'https://login.cnki.net/TopLogin/api/loginapi/IpLoginFlush')
r.encoding = r.apparent_encoding
# print(r.text)

# Request the article download; this is what returns the captcha page below
res = session.get('https://kns.cnki.net/kns/download.aspx?filename=WRGMhx2KSxkQxNUQD50cSZXZUlHTv8ma3I2RKlnbwpFMrJXcEpHc5dzUPF3Z1BneZFHNGhEdCdFUnJkRzh3ayU1dE9WSiZ2KQxUbGdETQl1KSp1dw40b1JWcpV3cxAzYqFGaydmNQlmSDlXNsRkcQZEZrZVTul2N&tablename=CJFDLAST2018')
res.encoding = res.apparent_encoding
# print(res.headers)
print(res.text)
Output:
</head>
<body>
<div class="c_verify-box">
<form method="post" onsubmit="return validate();">
<h3 class="title">Security verification</h3>
<p class="c_verify-desc">Your current IP is 183.134.192.27. Your requests are too
frequent; to keep your account usable, please enter the verification code:</p>
<dl class="c_verify-code">
<dt><img id="vImg" src="/kdoc/request/ValidateCode.ashx?t=1577242936454" alt="captcha" title="click to switch captcha"></dt>
<dd>
<p class="tips" id="tips"></p>
<input type="password" id="vcode" name="vcode" maxlength="4"><button class="c_btn" type="submit">提交</button>
</dd>
</dl>
</form>
</div>
</body>
</html>
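Since the server silently swaps the file response for this verification page, the downloader can at least detect that before writing an HTML error page to disk as a .caj file. A minimal heuristic based on markers visible in the captured HTML above (the markers may of course change on CNKI's side):

```python
def is_captcha_page(html_text: str) -> bool:
    """Heuristic: does this response body look like CNKI's anti-crawl
    verification page rather than an actual file download? The markers
    are taken from the captured HTML above."""
    markers = ("c_verify-box", "ValidateCode.ashx")
    return any(marker in html_text for marker in markers)
```

Checking the `Content-Type` header (HTML versus a binary type) before saving is a complementary safeguard.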
I recall CNKI has an API parameter that selects whether you get a PDF or a CAJ file. CAJ is a pain, and not every program can even open that format.
/kns/download?filename=5UjSyB3SXd0N18mWrImTGNVYTxETNF0QZhXMWl3R2RVTHRnYIVjRuBzT6dmarVEa5gHVGJEeCplQHJETrZ2Q40UQMVmeTNTZTFEM4cnerglV0hDOoVGVI5WRR5mWod2VilUZ2V2QFN1dqJ2ZKtSMZR0LrFWW1t0U&tablename=CAPJLAST&dflag=pdfdown
dflag=pdfdown gives the PDF download link
dflag=cajdown gives the CAJ download link
Apart from that, the parameters are identical.
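The dflag switch described above can be sketched as a small URL builder. The endpoint path is copied from the example link in this thread and may have changed since; the function and its names are illustrative, not part of the project:

```python
from urllib.parse import urlencode

# Endpoint path as it appears in the example link above; may be outdated.
BASE = "https://kns.cnki.net/kns/download"

def build_download_url(filename: str, tablename: str, fmt: str = "pdf") -> str:
    """Build a CNKI download URL, choosing PDF or CAJ via the dflag
    parameter (dflag=pdfdown / dflag=cajdown) described above."""
    if fmt not in ("pdf", "caj"):
        raise ValueError("fmt must be 'pdf' or 'caj'")
    query = urlencode({
        "filename": filename,
        "tablename": tablename,
        "dflag": fmt + "down",
    })
    return f"{BASE}?{query}"
```

Defaulting to PDF sidesteps the CAJ-reader problem the poster mentions whenever the PDF variant is available.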