fake-useragent
Up-to-date simple useragent faker with real world database
Home Page: https://pypi.python.org/pypi/fake-useragent
License: Apache License 2.0
from fake_useragent import UserAgent
ua = UserAgent()
Traceback (most recent call last):
File "", line 1, in
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/fake.py", line 17, in init
self.load()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/fake.py", line 21, in load
self.data = load_cached()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 138, in load_cached
update()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 133, in update
write(load())
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 99, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 64, in get_browser_versions
html = get(settings.BROWSER_BASE_PAGE.format(browser=quote_plus(browser)))
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 29, in get
return urlopen(request, timeout=settings.HTTP_TIMEOUT).read()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1227, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1197, in do_open
raise URLError(err)
urllib2.URLError:
Not everyone uses pip, and besides pulling from master it would be ideal to be able to fetch tagged releases, from a package-manager perspective. Thanks in advance.
I cannot access "www.w3schools.com" in China, so I can't call get_browsers() in utils.py.
For me, certain sites will not be accessible with the out-of-date random user agents this library produces.
This might just be a limitation of this library, but I thought I would add my 2c: an option to use only the most popular user agent strings, to keep sites from being inaccessible, might be worthwhile.
E.g. the lists here:
https://developers.whatismybrowser.com/useragents/explore/software_name/chrome/
might be of use. I use Chrome a lot, hence this example.
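In the meantime, a workaround in the spirit of this request is to bypass the scraped database entirely and choose from a small, hand-picked pool of recent user-agent strings. A minimal sketch; the pool contents are illustrative examples of the kind of strings found on sites like the one linked above:

```python
import random

# Curated pool of reasonably recent user-agent strings (example values).
MODERN_USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0",
]

def random_modern_ua():
    """Return a random string from the curated pool."""
    return random.choice(MODERN_USER_AGENTS)
```

The pool has to be refreshed by hand occasionally, but it never fails at runtime and never serves a 2013-era string.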
Hi buddy, I found a bug like this:
DEBUG: Error occurred during fetching https://www.w3schools.com/browsers/default.asp
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/fake_useragent/utils.py", line 67, in get
context=context,
File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib64/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/python2.7/urllib2.py", line 1258, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open
raise URLError(err)
Hope it can be solved, thank you.
I am getting this when trying to init a new UserAgent()
Traceback (most recent call last):
File "", line 1, in
File "/Users/davidyu/.virtualenvs/webscraper/lib/python2.7/site-packages/fake_useragent/fake.py", line 10, in init
self.data = load_cached()
File "/Users/davidyu/.virtualenvs/webscraper/lib/python2.7/site-packages/fake_useragent/utils.py", line 140, in load_cached
update()
File "/Users/davidyu/.virtualenvs/webscraper/lib/python2.7/site-packages/fake_useragent/utils.py", line 135, in update
write(load())
File "/Users/davidyu/.virtualenvs/webscraper/lib/python2.7/site-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/Users/davidyu/.virtualenvs/webscraper/lib/python2.7/site-packages/fake_useragent/utils.py", line 55, in get_browser_versions
html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range
I believe it is possibly because internetexplorer's BROWSER_BASE_PAGE is not returning any usable values.
Each time fake-useragent downloads and combines data, the data needs to be verified.
Urllib usage does not account for proxies
If I have time I will write a pull request, but I cannot promise. I'm not sure whether the author is aware of this, so I've created the issue.
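Until the library handles this itself, one stdlib-level workaround is to install a global urllib opener that routes through a proxy, which the library's internal urlopen() calls then pick up. A sketch (Python 3 urllib shown; the proxy address is a placeholder, not a real endpoint):

```python
import urllib.request

# Build an opener that routes HTTP and HTTPS requests through a proxy.
proxy_handler = urllib.request.ProxyHandler({
    "http": "http://127.0.0.1:8080",   # placeholder proxy address
    "https": "http://127.0.0.1:8080",
})
opener = urllib.request.build_opener(proxy_handler)

# install_opener makes every subsequent urllib.request.urlopen() call,
# including those made inside third-party libraries, use this opener.
urllib.request.install_opener(opener)
```

This is a process-wide setting, so it also affects unrelated urllib calls; a library-level proxy option would be cleaner.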
I have been having a lot of problems today with fake_useragent. My real-world use is with selenium, but I can't even get it to pull the string at all.
Python 3.6.1 (default, Jun 8 2017, 06:36:16)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from fake_useragent import UserAgent as UA
>>> ua = UA()
>>> ua.random
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 136, in __getattr__
return random.choice(self.data_browsers[browser])
File "/usr/local/lib/python3.6/random.py", line 257, in choice
raise IndexError('Cannot choose from an empty sequence') from None
IndexError: Cannot choose from an empty sequence
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 139, in __getattr__
raise FakeUserAgentError('Error occurred during getting browser') # noqa
fake_useragent.errors.FakeUserAgentError: Error occurred during getting browser
>>> ua.VERSION
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 136, in __getattr__
return random.choice(self.data_browsers[browser])
KeyError: 'version'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 139, in __getattr__
raise FakeUserAgentError('Error occurred during getting browser') # noqa
fake_useragent.errors.FakeUserAgentError: Error occurred during getting browser
>>> ua.chrome
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 136, in __getattr__
return random.choice(self.data_browsers[browser])
File "/usr/local/lib/python3.6/random.py", line 257, in choice
raise IndexError('Cannot choose from an empty sequence') from None
IndexError: Cannot choose from an empty sequence
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 139, in __getattr__
raise FakeUserAgentError('Error occurred during getting browser') # noqa
fake_useragent.errors.FakeUserAgentError: Error occurred during getting browser
>>> ua['google chrome']
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 136, in __getattr__
return random.choice(self.data_browsers[browser])
File "/usr/local/lib/python3.6/random.py", line 257, in choice
raise IndexError('Cannot choose from an empty sequence') from None
IndexError: Cannot choose from an empty sequence
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 119, in __getitem__
return self.__getattr__(attr)
File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 139, in __getattr__
raise FakeUserAgentError('Error occurred during getting browser') # noqa
fake_useragent.errors.FakeUserAgentError: Error occurred during getting browser
>>>
another traceback with selenium at
#41
I'm getting this error with version 0.1.7 running on Mac OS X. The common suggestion is to update the version, but I think I already have a version where this error should no longer occur? Any ideas?
Traceback (most recent call last):
File "/Users/mikko/dev/norway/lib/python2.7/site-packages/twisted/internet/defer.py", line 1386, in _inlineCallbacks
result = g.send(result)
File "/Users/mikko/dev/norway/lib/python2.7/site-packages/scrapy/core/downloader/middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "/Users/mikko/dev/norway/lib/python2.7/site-packages/scrapy_fake_useragent/middleware.py", line 28, in process_request
self.proxy2ua[proxy] = get_ua()
File "/Users/mikko/dev/norway/lib/python2.7/site-packages/scrapy_fake_useragent/middleware.py", line 23, in get_ua
return getattr(self.ua, self.ua_type)
File "/Users/mikko/dev/norway/lib/python2.7/site-packages/fake_useragent/fake.py", line 139, in __getattr__
raise FakeUserAgentError('Error occurred during getting browser') # noqa
FakeUserAgentError: Error occurred during getting browser
I've seen you have already fixed this problem, but it still happens for me :(
root@vmi52271:~/# pip install --upgrade fake-useragent
Collecting fake-useragent
Downloading fake-useragent-0.1.2.tar.gz
Building wheels for collected packages: fake-useragent
Running setup.py bdist_wheel for fake-useragent ... done
Stored in directory: /root/.cache/pip/wheels/be/63/15/f6e26846756da814630681d9fd98d53310426a8464289d7455
Successfully built fake-useragent
Installing collected packages: fake-useragent
Successfully installed fake-useragent-0.1.2
root@vmi52271:~/# pip install -U fake-useragent
Requirement already up-to-date: fake-useragent in /usr/local/lib/pypy2.7/dist-packages
root@vmi52271:~/# python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from fake_useragent import UserAgent
>>> ua = UserAgent()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/fake.py", line 10, in __init__
self.data = load_cached()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 140, in load_cached
update()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 135, in update
write(load())
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 55, in get_browser_versions
html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range
Thanks for the great work.
Hi there,
For whatever reason, self.data['browsers'] / self.data_browsers[browser] has only keys, no values:
{u'chrome': [], u'opera': [], u'firefox': [], u'internetexplorer': [], u'safari': []}
Therefore our production cron is broken due to this error
raise FakeUserAgentError('Error occurred during getting browser') # noqa
So I am asking you to return a default if the choice list is empty,
i.e. at https://github.com/hellysmile/fake-useragent/blob/master/fake_useragent/fake.py#L136 something like:
if not self.data_browsers[browser]:
return random.choice(['firefox', 'chrome', 'opera', 'safari']) # i.e a fallback list
return random.choice(self.data_browsers[browser])
I hope you can come up with a better solution.
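A variant of this suggestion that falls back to a complete user-agent string rather than a browser name (random.choice(['firefox', ...]) would yield the word 'firefox', which is not a valid User-Agent header value). A sketch only; the default string and the helper name are illustrative:

```python
import random

# Illustrative default; any known-good full UA string would do.
DEFAULT_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"
)

def choose_ua(data_browsers, browser, fallback=DEFAULT_UA):
    """Pick a random UA for `browser`; return `fallback` if no data was loaded."""
    candidates = data_browsers.get(browser) or []
    return random.choice(candidates) if candidates else fallback
```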
fake_useragent.UserAgent().chrome
returns
'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36'
but the newest chrome is
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36'
Chrome/27.0 is too old; when I request 'https://www.zhihu.com/sigup', I'm redirected to 'https://www.zhihu.com/compatibility/index.html'.
Seems that some parameters have changed and it is causing a bug:
>>> import fake_useragent
>>> fake_useragent.UserAgent()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/site-packages/fake_useragent/fake.py", line 10, in __init__
self.data = load_cached()
File "/usr/local/lib/python2.7/site-packages/fake_useragent/utils.py", line 140, in load_cached
update()
File "/usr/local/lib/python2.7/site-packages/fake_useragent/utils.py", line 135, in update
write(load())
File "/usr/local/lib/python2.7/site-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/usr/local/lib/python2.7/site-packages/fake_useragent/utils.py", line 55, in get_browser_versions
html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range
Hello!
Thanks for the cool library.
I have one issue though: on my hardware I see the exception "Error occurred during formatting data. Trying to use fallback server" too often. Hardcoding settings.HTTP_TIMEOUT to a large value like 10 makes it go away. Could you please increase it a bit (say, to 5), or maybe add the possibility to set a particular timeout when creating a UserAgent instance?
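Until such an option exists, the retry-with-a-larger-timeout behaviour the reporter wants can be wrapped around any fetch function. In this sketch, `open_fn` is a stand-in for whatever actually performs the request (e.g. a closure around urlopen), and the timeout ladder is an arbitrary assumption:

```python
import socket

def fetch_with_growing_timeout(open_fn, timeouts=(2, 5, 10)):
    """Call open_fn(timeout=...) with successively larger timeouts and
    return the first successful result; re-raise the last error if all fail."""
    last_err = None
    for t in timeouts:
        try:
            return open_fn(timeout=t)
        except (OSError, socket.timeout) as exc:  # socket.timeout subclasses OSError on Python 3
            last_err = exc
    raise last_err
```

A slow but reachable host then costs a couple of extra attempts instead of an exception.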
It seems the cache server is unreliable, and my programs are failing. They try to load "https://fake-useragent.herokuapp.com/browsers/0.1.8". Sometimes when I visit this page in a browser it is available (loading "browsers": {"chrome":.....); most of the time I get a connection closed error:
Error occurred during loading data. Trying to use cache server https://fake-useragent.herokuapp.com/browsers/0.1.8
Traceback (most recent call last):
File "C:\Python36\lib\site-packages\fake_useragent\utils.py", line 67, in get
context=context,
File "C:\Python36\lib\urllib\request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "C:\Python36\lib\urllib\request.py", line 526, in open
response = self._open(req, data)
File "C:\Python36\lib\urllib\request.py", line 544, in _open
'_open', req)
File "C:\Python36\lib\urllib\request.py", line 504, in _call_chain
result = func(*args)
File "C:\Python36\lib\urllib\request.py", line 1346, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "C:\Python36\lib\urllib\request.py", line 1321, in do_open
r = h.getresponse()
File "C:\Python36\lib\http\client.py", line 1331, in getresponse
response.begin()
File "C:\Python36\lib\http\client.py", line 297, in begin
version, status, reason = self._read_status()
File "C:\Python36\lib\http\client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
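One mitigation, assuming the `fallback` keyword argument that fake-useragent 0.1.x provides: when both the source pages and the cache server are unreachable, the given string is returned instead of raising an error. (Shown as a usage fragment; it needs the package installed and is not exercised here.)

```python
from fake_useragent import UserAgent

# If data can be neither downloaded nor fetched from the cache server,
# attribute access returns this string instead of raising FakeUserAgentError.
ua = UserAgent(fallback=(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"
))
print(ua.random)
```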
I'd like to be able to change settings like BROWSERS_COUNT_LIMIT.
Maybe just as a parameter to the update method.
The service this library relies on to get user agent data is woefully out of date.
This library pulls from this page to get firefox User Agent strings, but it looks like it only goes up to version 40.1, which was released in 2015.
Similarly, chrome versions are pulled from here and only go up to version 41.
I'm getting this error with version 0.1.7 running on Mac OS X with python3.6.1 when I try the code below:
from fake_useragent import UserAgent
ua = UserAgent()
which raises error:
Error occurred during loading data. Trying to use cache server https://fake-useragent.herokuapp.com/browsers/0.1.7
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1318, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 964, in send
self.connect()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1392, in connect
super().connect()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 936, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 722, in create_connection
raise err
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 713, in create_connection
sock.connect(sa)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 67, in get
context=context,
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 526, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 544, in _open
'_open', req)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1361, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1320, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error timed out>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 150, in load
for item in get_browsers(verify_ssl=verify_ssl):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 97, in get_browsers
html = get(settings.BROWSERS_STATS_PAGE, verify_ssl=verify_ssl)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 84, in get
raise FakeUserAgentError('Maximum amount of retries reached')
fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1318, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 964, in send
self.connect()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1392, in connect
super().connect()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 936, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 722, in create_connection
raise err
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 713, in create_connection
sock.connect(sa)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 67, in get
context=context,
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 526, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 544, in _open
'_open', req)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1361, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1320, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error timed out>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 150, in load
for item in get_browsers(verify_ssl=verify_ssl):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 97, in get_browsers
html = get(settings.BROWSERS_STATS_PAGE, verify_ssl=verify_ssl)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 84, in get
raise FakeUserAgentError('Maximum amount of retries reached')
fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1318, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 964, in send
self.connect()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1400, in connect
server_hostname=server_hostname)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 401, in wrap_socket
_context=self, _session=session)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 808, in __init__
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 1061, in do_handshake
self._sslobj.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 683, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 67, in get
context=context,
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 526, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 544, in _open
'_open', req)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1361, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1320, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/fake.py", line 69, in __init__
self.load()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/fake.py", line 78, in load
verify_ssl=self.verify_ssl,
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 246, in load_cached
update(path, use_cache_server=use_cache_server, verify_ssl=verify_ssl)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 241, in update
write(path, load(use_cache_server=use_cache_server, verify_ssl=verify_ssl))
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 185, in load
verify_ssl=verify_ssl,
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 84, in get
raise FakeUserAgentError('Maximum amount of retries reached')
fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached
But the same code works fine with python2.7. Any ideas?
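The CERTIFICATE_VERIFY_FAILED part of the traceback is a known quirk of the python.org macOS installers, which ship their own CA bundle: running the bundled "Install Certificates.command" usually fixes it. The traceback also shows the library threading a verify_ssl flag through its helpers, so UserAgent(verify_ssl=False) may work as a last resort. The equivalent stdlib workaround, which disables certificate checking and should only be used for non-sensitive traffic, looks like:

```python
import ssl
import urllib.request  # urlopen() accepts the context= argument used below

# Build a context that skips hostname and certificate verification.
# check_hostname must be disabled before setting CERT_NONE.
context = ssl.create_default_context()
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE

# usage: urllib.request.urlopen(some_url, context=context)
```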
Hello,
Thank you for the nice module. A nice addition to this library would be the ability to filter user agents by release date (so that you don't get too-old browsers). Sadly, it seems http://useragentstring.com/ doesn't provide this kind of information. Another thing is that most of the UAs on that site seem out of date (for example, for Chrome the latest version on useragentstring.com is 41, when in reality it is 52). I tried to find an up-to-date database of all user agents but failed :( Does anyone know where we could gather/scrape this data?
The best i could found is: https://techblog.willshouse.com/2012/01/03/most-common-user-agents/
Would be great to add android as an option, for example
ua.android
Thanks
Returning 404
Below is the traceback.
The page http://www.w3schools.com/browsers/browsers_stats.asp doesn't have a table with class="reference no translate".
I guess the layout has changed in the last 24 hours.
from fake_useragent import UserAgent
ua = UserAgent(cache=False)
Traceback (most recent call last):
File "", line 1, in
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/fake.py", line 12, in init
self.data = load()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 82, in load
for item in get_browsers():
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 29, in get_browsers
html = html.split('
http://www.w3schools.com/browsers/browsers_stats.asp
I have a package that depends on fake-useragent, and recently its automated unit tests have been failing because fake-useragent times out when it tries to retrieve agents.
File "/home/travis/build/chrisspen/howdou/.tox/py27/lib/python2.7/site-packages/fake_useragent/fake.py", line 98, in __getattr__
raise FakeUserAgentError('Error occurred during getting browser') # noqa
What's causing this? Am I hitting some web resource too much, or is it a bug in fake-useragent? What can I do to minimize this or cache the results locally?
A lot of modules have a version attribute that reports the version of the module itself. Could you add something like that?
Apparently the site isn't responding to requests to 'http://useragentstring.com/pages/%s/'
anymore, causing an initialization of UserAgent to raise a 404 error.
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 2000, in call
return self.wsgi_app(environ, start_response)
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1991, in wsgi_app
response = self.make_response(self.handle_exception(e))
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1567, in handle_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1988, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1641, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1544, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1639, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1625, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/root/appsflyer_pushs/main_with_stream.py", line 127, in listen
if valid_request(ip=request.remote_addr):
File "/root/appsflyer_pushs/main_with_stream.py", line 90, in valid_request
ua = UserAgent(cache=False)
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/fake.py", line 12, in init
self.data = load()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 55, in get_browser_versions
html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range
Traceback (most recent call last):
File "/usr/local/lib64/python2.7/site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
result = g.send(result)
File "/usr/local/lib/python2.7/site-packages/scrapy/core/downloader/middleware.py", line 37, in process_request
response = yield method(request=request, spider=spider)
File "/usr/local/lib/python2.7/site-packages/scrapy_fake_useragent/middleware.py", line 27, in process_request
request.headers.setdefault('User-Agent', self.ua.random)
File "/usr/local/lib/python2.7/site-packages/fake_useragent/fake.py", line 98, in __getattr__
raise FakeUserAgentError('Error occurred during getting browser') # noqa
FakeUserAgentError: Error occurred during getting browser
I keep getting this error on the Linux server when I run multiple spiders concurrently. What should I do to avoid it? Do I have to add more RAM or something?
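One way to keep concurrent spiders alive when the vendor site is flaky is to give `UserAgent` a fallback string, so a failed data fetch degrades to a fixed UA instead of raising `FakeUserAgentError`. A minimal sketch, assuming a fake-useragent release that supports the `fallback` parameter (the `FALLBACK_UA` value and `make_user_agent` helper are hypothetical names, not part of the library):

```python
# Sketch: degrade gracefully when fake-useragent cannot fetch its data.
FALLBACK_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
)

def make_user_agent():
    try:
        from fake_useragent import UserAgent
        # `fallback` (an assumption: present in recent 0.1.x releases)
        # makes .random return FALLBACK_UA if the remote fetch fails.
        return UserAgent(fallback=FALLBACK_UA)
    except Exception:
        # If even import/initialisation fails, return a stub object that
        # always yields the fallback string.
        class _Stub(object):
            random = FALLBACK_UA
        return _Stub()
```

Each spider can then call `make_user_agent().random` without the whole crawl dying on a single upstream outage.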
When I try to use ua = UserAgent()
on an EC2 Ubuntu instance, I get this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/fake.py", line 13, in __init__
self.load()
File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/fake.py", line 17, in load
self.data = load_cached()
File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/utils.py", line 135, in load_cached
update()
File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/utils.py", line 130, in update
write(load())
File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/utils.py", line 96, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/utils.py", line 61, in get_browser_versions
html = get(settings.BROWSER_BASE_PAGE.format(browser=quote_plus(browser)))
File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/utils.py", line 27, in get
return urlopen(request, timeout=settings.HTTP_TIMEOUT).read()
File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.4/urllib/request.py", line 463, in open
response = self._open(req, data)
File "/usr/lib/python3.4/urllib/request.py", line 481, in _open
'_open', req)
File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
result = func(*args)
File "/usr/lib/python3.4/urllib/request.py", line 1210, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/usr/lib/python3.4/urllib/request.py", line 1185, in do_open
r = h.getresponse()
File "/usr/lib/python3.4/http/client.py", line 1171, in getresponse
response.begin()
File "/usr/lib/python3.4/http/client.py", line 351, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.4/http/client.py", line 313, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.4/socket.py", line 374, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
The machine definitely has an internet connection, since I can ping Google. Also, this only happens on a few instances I set up this week; the ones from last month are still running properly. I have updated fake-useragent to its most recent version, 0.1.2.
I'm using fake-useragent in combination with scrapy-fake-useragent, and due to useragentstring.com downtimes this library's get() method yields a socket.timeout: timed out.
It would be nice to have a distinct exception type, e.g. FakeUserAgentError, which says something like Url {0} unreachable instead.
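The requested behavior could be sketched by wrapping the network call and re-raising with the URL in the message. This is not the library's actual implementation, just an illustration; the `fetch` helper and injected `opener` argument are my own names:

```python
import socket

class FakeUserAgentError(Exception):
    """Hypothetical distinct error type carrying the unreachable URL."""

def fetch(url, opener):
    # `opener` stands in for urllib's urlopen and is injected here only
    # so the sketch can be exercised without the network.
    try:
        return opener(url)
    except (socket.timeout, OSError):
        raise FakeUserAgentError('Url {0} unreachable'.format(url))
```

Callers then see one well-named exception with the failing URL instead of a bare socket.timeout.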
Good afternoon. I would like to see, in a new version, a way to always get the same user agent. For example, ua = UserAgent(last=True), after which ua.chrome ALWAYS returns the most recent (for the given database) user agent for that browser.
Why this is needed: every day I log in to a site that, during authorization, also requests a userkey that is computed in part from the user-agent string. If I log in daily (several times) with a different userkey each time, it will inevitably raise suspicion.
PS: "ALWAYS" means not just within one session, but on every run of the script.
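Until such an option exists, the stable-across-runs behavior can be approximated by caching the first picked UA string in a local file and reusing it on every later run. A sketch under that assumption; `pinned_user_agent` and the file path are my own invention:

```python
import os

def pinned_user_agent(path, pick):
    """Return the same UA string on every run by caching the first pick.

    `pick` is any zero-argument callable returning a UA string
    (e.g. lambda: UserAgent().chrome); it is called only once ever.
    """
    if os.path.exists(path):
        with open(path) as fh:
            return fh.read().strip()
    ua = pick()
    with open(path, 'w') as fh:
        fh.write(ua)
    return ua
```

Deleting the cache file is then the explicit way to rotate to a fresh UA.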
It would be useful to keep the very nice "random" behavior but be able to ensure you always (or never) get a user agent corresponding to a mobile device.
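A rough way to get the "never mobile" half of this today is a substring heuristic over the returned strings. This is only a sketch; the marker list and both helper names are mine, and a real check would use a proper UA parser:

```python
MOBILE_MARKERS = ('Mobile', 'Android', 'iPhone', 'iPad', 'Opera Mini', 'IEMobile')

def looks_mobile(ua):
    # Substring heuristic only; it will miss exotic mobile UAs.
    return any(marker in ua for marker in MOBILE_MARKERS)

def random_desktop(ua_source, tries=50):
    # Re-roll `ua_source.random` until a non-mobile string turns up.
    for _ in range(tries):
        candidate = ua_source.random
        if not looks_mobile(candidate):
            return candidate
    raise RuntimeError('no desktop user agent found in %d tries' % tries)
```

Passing a `UserAgent()` instance as `ua_source` keeps the random behavior while filtering out mobile results.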
Hi when I type in the commands:
from fake_useragent import UserAgent
ua = UserAgent()
It shows the following error:
Traceback (most recent call last):
File "scraper.py", line 287, in <module>
ua = UserAgent()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/fake.py", line 10, in __init__
self.data = load_cached()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 140, in load_cached
update()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 135, in update
write(load())
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 55, in get_browser_versions
html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range
I tried uninstalling and reinstalling with pip install fake-useragent
, but the error still shows up. Could you have a look and see what the problem is?
Thanks!
Hi Guys,
I'm new to python so I apologize if this is basic. The first two lines of my script read:
from fake_useragent import UserAgent
ua = UserAgent()
I'm getting the following error:
File "Clicks.py", line 2, in <module>
ua = UserAgent()
File "/Library/Python/2.7/site-packages/fake_useragent/fake.py", line 10, in __init__
self.data = load_cached()
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 140, in load_cached
update()
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 135, in update
write(load())
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 55, in get_browser_versions
html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range
I've used this script many times successfully in the past and nothing in my code has changed. This is the first time I've run it in about a week though. Thanks again for any help.
-Tim
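When the scraped data format changes upstream, a stale local cache can keep reproducing the old failure even after upgrading. A sketch of clearing it, assuming the JSON cache lives in the system temp directory; the exact file-name pattern varies between versions, so the glob below is an assumption:

```python
import glob
import os
import tempfile

def clear_fake_useragent_cache():
    """Delete stale fake-useragent cache files so the next run refetches.

    Returns the list of removed paths for logging/inspection.
    """
    removed = []
    pattern = os.path.join(tempfile.gettempdir(), 'fake_useragent*.json')
    for path in glob.glob(pattern):
        os.remove(path)
        removed.append(path)
    return removed
```

Running this once before re-creating `UserAgent()` rules out a corrupt or outdated cache as the cause.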
from fake_useragent import UserAgent
on calling ua = UserAgent() or ua = UserAgent(cache=False)
Traceback (most recent call last):
File "<pyshell#17>", line 1, in <module>
ua = UserAgent()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent-0.0.6-py2.7.egg/fake_useragent/fake.py", line 12, in __init__
self.data = load_cached()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent-0.0.6-py2.7.egg/fake_useragent/utils.py", line 134, in load_cached
update()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent-0.0.6-py2.7.egg/fake_useragent/utils.py", line 129, in update
write(load())
File "/usr/local/lib/python2.7/dist-packages/fake_useragent-0.0.6-py2.7.egg/fake_useragent/utils.py", line 88, in load
for counter in range(int(float(percent))):
ValueError: could not convert string to float: <a
Here is the whole traceback:
Traceback (most recent call last):
File "...\fake_useragent\fake.py", line 11, in __init__
self.data = load_cached()
File "...\fake_useragent\utils.py", line 133, in load_cached
update()
File "...\fake_useragent\utils.py", line 128, in update
write(load())
File "...\fake_useragent\utils.py", line 87, in load
for counter in range(int(float(percent))):
ValueError: could not convert string to float: <a
I tried upgrading fake-useragent; it did not help.
I also tried pip uninstall fake-useragent followed by pip install fake-useragent again; it returns the same error message.
By the way, my OS is Windows 8.1.
I'd appreciate any help with this issue.
list of alternatives
http://user-agent-string.info/list-of-ua
In [1]: from fake_useragent import UserAgent
In [2]: ua = UserAgent()
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-2-89a0eee92536> in <module>()
----> 1 ua = UserAgent()
/Users/josefson/virtualenvs/CNPQ/lib/python2.7/site-packages/fake_useragent/fake.pyc in __init__(self, cache)
8 def __init__(self, cache=True):
9 if cache:
---> 10 self.data = load_cached()
11 else:
12 self.data = load()
/Users/josefson/virtualenvs/CNPQ/lib/python2.7/site-packages/fake_useragent/utils.pyc in load_cached()
138 def load_cached():
139 if not exist():
--> 140 update()
141
142 return read()
/Users/josefson/virtualenvs/CNPQ/lib/python2.7/site-packages/fake_useragent/utils.pyc in update()
133 rm()
134
--> 135 write(load())
136
137
/Users/josefson/virtualenvs/CNPQ/lib/python2.7/site-packages/fake_useragent/utils.pyc in load()
80 randomize_dict = {}
81
---> 82 for item in get_browsers():
83 browser, percent = item
84
/Users/josefson/virtualenvs/CNPQ/lib/python2.7/site-packages/fake_useragent/utils.pyc in get_browsers()
27 html = get(settings.BROWSERS_STATS_PAGE)
28 html = html.decode('windows-1252')
---> 29 html = html.split('<table class="reference notranslate">')[1]
30 html = html.split('</table>')[0]
31
IndexError: list index out of range
Hi @hellysmile!
Thank you for this nice library, it is a really nice addition to the community! Today I tried to use the library but got stuck when it tried to connect to http://useragentstring.com/pages/Chrome/.
It seems as if that page throws a 404. I have checked through the Wayback Machine and it seems as if that page existed in the past. My guess is that useragentstring.com changed their website architecture.
If you believe the site was down temporarily and everything works as expected then please mark this issue as solved.
Thank you so much!
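Before filing a bug like this, it can help to confirm the status code programmatically. A small sketch (the `page_status` helper and its injectable `opener` argument are my own; the default uses the standard library's `urlopen`):

```python
try:
    from urllib.request import urlopen          # Python 3
    from urllib.error import HTTPError
except ImportError:                             # Python 2
    from urllib2 import urlopen, HTTPError

def page_status(url, opener=urlopen, timeout=10):
    # Return the HTTP status code for `url`; HTTPError carries the code
    # for non-2xx responses, so a 404 is returned rather than raised.
    try:
        return opener(url, timeout=timeout).getcode()
    except HTTPError as exc:
        return exc.code
```

For example, `page_status('http://useragentstring.com/pages/Chrome/')` returning 404 would confirm the architecture change rather than a temporary outage.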
How about a SIMPLE how-to-install guide for those of us who need it?
I don't mind copying and pasting Terminal commands, but I don't script.
I get this out of the errors.log
WARNING:root:Error occurred during loading data. Trying to use cache server https://fake-useragent.herokuapp.com/browsers/0.1.8
Traceback (most recent call last):
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 154, in load
for item in get_browsers(verify_ssl=verify_ssl):
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 97, in get_browsers
html = get(settings.BROWSERS_STATS_PAGE, verify_ssl=verify_ssl)
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 84, in get
raise FakeUserAgentError('Maximum amount of retries reached')
FakeUserAgentError: Maximum amount of retries reached
And this out of executing the script
/data/user/0/org.qpython.qpy/files/bin/qpython-android5-root.sh
"/storage/emulated/0/qpython/instabot.py-master/main.py" && exit
Traceback (most recent call last):
File "/storage/emulated/0/qpython/instabot.py-master/main.py", line 77, in <module>
'yuki_nishitani8','contemporary.paintings','danteslens','ramontrotman','voidcrack',])
File "/storage/emulated/0/qpython/instabot.py-master/src/instabot.py", line 162, in __init__
fake_ua = UserAgent()
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/fake.py", line 69, in __init__
self.load()
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/fake.py", line 78, in load
verify_ssl=self.verify_ssl,
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 250, in load_cached
update(path, use_cache_server=use_cache_server, verify_ssl=verify_ssl)
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 245, in update
write(path, load(use_cache_server=use_cache_server, verify_ssl=verify_ssl))
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 189, in load
verify_ssl=verify_ssl,
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 84, in get
raise FakeUserAgentError('Maximum amount of retries reached')
fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached
1|bullhead:/ $
Can someone tell me what is wrong? I have no issues on Windows or Mac.
I hit this when I use fake-useragent in my Scrapy spider.
When I run "scrapy crawl myspider", it gives me the error "[fake_useragent] DEBUG: Error occurred during fetching https://www.w3schools.com/browsers/default.asp" with
ConnectionRefusedError: [Errno 111] Connection refused.
Then I ran "scrapy fetch" against the site and got the same error.
Finally, I tried opening https://www.w3schools.com/browsers/default.asp in Chrome; that failed as well.
So I guess something is wrong on the w3schools side.
Suggestion: replace urllib.request.urlopen
with twisted.web.client.Agent
I wrote a spider to crawl a site's desktop (PC) front-end with ua.random
, but sometimes it redirects to the mobile front-end, which is not what I expected.
I found it is caused by ua.random
occasionally returning a mobile UA, as below:
import requests
res = requests.get("https://fake-useragent.herokuapp.com/browsers/0.1.5")
print(res.json()['browsers']['safari'][41])
# Mozilla/5.0 (Android 2.2; Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4
From #10 (comment), ua.desktop
and ua.mobile
will eventually be added; I'm really looking forward to them.
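Until `ua.desktop`/`ua.mobile` exist, the browser lists fetched above can be filtered before picking. A sketch over the JSON structure shown in the snippet (`{'browsers': {name: [ua, ...]}}`); the `strip_mobile` name and marker list are my own:

```python
def strip_mobile(browsers, markers=('Android', 'Mobile', 'iPhone', 'iPad')):
    """Return a copy of a {browser: [ua, ...]} dict without mobile-looking entries.

    Substring matching is a heuristic; it will not catch every mobile UA.
    """
    return {
        name: [ua for ua in uas if not any(m in ua for m in markers)]
        for name, uas in browsers.items()
    }
```

Applied to `res.json()['browsers']`, this removes entries like the Android Safari string above before any random selection happens.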