fake-useragent / fake-useragent

Up-to-date simple useragent faker with real world database

Home Page: https://pypi.python.org/pypi/fake-useragent

License: Apache License 2.0

Python 96.71% Shell 3.29%
python python3 user agent fake faker scraping user-agent user-agent-spoofer useragent useragent-scraper

fake-useragent's Issues

urllib2.URLError: <urlopen error timed out>

from fake_useragent import UserAgent
ua = UserAgent()

Traceback (most recent call last):
File "", line 1, in
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/fake.py", line 17, in __init__
self.load()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/fake.py", line 21, in load
self.data = load_cached()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 138, in load_cached
update()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 133, in update
write(load())
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 99, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 64, in get_browser_versions
html = get(settings.BROWSER_BASE_PAGE.format(browser=quote_plus(browser)))
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 29, in get
return urlopen(request, timeout=settings.HTTP_TIMEOUT).read()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1227, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1197, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error timed out>

Create some tag releases

Not everyone uses pip. From a package manager's perspective, it would be ideal to fetch the project from tagged releases rather than pulling from master. Thanks in advance.

Certain sites will not be accessible with out of date random user agents.

For me, certain sites are not accessible when this library returns out-of-date random user agents.

This might just be a limitation of the library, but my 2c: an option to use only the most popular user agent strings, so that sites do not refuse access, might be worthwhile.

E.g., taken from here:
https://developers.whatismybrowser.com/useragents/explore/software_name/chrome/

Might be of use. I use Chrome a lot, hence this example.

Error occurred during fetching https://www.w3schools.com/browsers/default.asp

Hi buddy, I found a bug like this:
DEBUG: Error occurred during fetching https://www.w3schools.com/browsers/default.asp
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/fake_useragent/utils.py", line 67, in get
context=context,
File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib64/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/python2.7/urllib2.py", line 1258, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/usr/lib64/python2.7/urllib2.py", line 1214, in do_open
raise URLError(err)
I hope this can be solved, thank you.

index out of range error on initialization

I am getting this when trying to initialize a new UserAgent():
Traceback (most recent call last):
File "", line 1, in
File "/Users/davidyu/.virtualenvs/webscraper/lib/python2.7/site-packages/fake_useragent/fake.py", line 10, in __init__
self.data = load_cached()
File "/Users/davidyu/.virtualenvs/webscraper/lib/python2.7/site-packages/fake_useragent/utils.py", line 140, in load_cached
update()
File "/Users/davidyu/.virtualenvs/webscraper/lib/python2.7/site-packages/fake_useragent/utils.py", line 135, in update
write(load())
File "/Users/davidyu/.virtualenvs/webscraper/lib/python2.7/site-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/Users/davidyu/.virtualenvs/webscraper/lib/python2.7/site-packages/fake_useragent/utils.py", line 55, in get_browser_versions
html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range

I believe it is possibly because internetexplorer's BROWSER_BASE_PAGE is not returning any usable values.
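
The failing line indexes the result of `split(...)` without checking that the marker is present. A minimal sketch of a more defensive variant follows; `section_after` is a hypothetical helper name, not part of fake-useragent, and the sample HTML is made up for illustration.

```python
def section_after(html, marker):
    """Return the text following the first occurrence of marker, or None if it is absent."""
    head, sep, tail = html.partition(marker)
    # str.partition returns an empty separator when the marker is not found,
    # so the caller can react instead of hitting IndexError on split(...)[1].
    return tail if sep else None


# Illustrative input only; the real page layout is not reproduced here.
page = "<html><div id='liste'><ul><li>UA one</li></ul></div></html>"
print(section_after(page, "<div id='liste'>"))
print(section_after("<html>layout changed</html>", "<div id='liste'>"))
```

A caller could treat `None` as "the page layout changed" and skip that browser rather than crash during initialization.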

Self health check

Each time fake-useragent downloads and combines data, the data needs to be verified.
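
One possible shape for such a check, as a rough sketch: it assumes the data is a dict with a "browsers" key that maps browser names to lists of user agent strings (the shape reported in other issues on this page). `is_usable` is a hypothetical name, not an existing function of the library.

```python
def is_usable(data):
    """Return True only if every browser has at least one user agent string."""
    browsers = data.get("browsers") if isinstance(data, dict) else None
    if not isinstance(browsers, dict) or not browsers:
        return False
    # An empty list for any browser means random.choice() would fail later.
    return all(isinstance(agents, list) and len(agents) > 0
               for agents in browsers.values())


print(is_usable({"browsers": {"chrome": ["example UA"], "opera": []}}))
```

Running the check before the data is written to the cache would keep an empty download from replacing a previously good cache file.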

Add proxy support

The library's urllib usage does not account for proxies.

If I have time, I will write a pull request, but I cannot promise. I am not sure whether the author is aware of this, so I created this issue.
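
For reference, this is roughly how an explicit proxy can be wired into the standard library's urllib on Python 3. It is only a sketch: `make_opener` is a hypothetical helper, the proxy address is a placeholder, and nothing here reflects how fake-useragent actually builds its requests.

```python
from urllib.request import ProxyHandler, build_opener


def make_opener(proxy_url=None):
    """Build a urllib opener that routes HTTP and HTTPS through proxy_url, if given."""
    if proxy_url:
        handler = ProxyHandler({"http": proxy_url, "https": proxy_url})
        return build_opener(handler)
    # Without an explicit proxy, fall back to urllib's default opener.
    return build_opener()


# Placeholder address; no request is sent just by building the opener.
opener = make_opener("http://127.0.0.1:3128")
```

Requests would then go through `opener.open(url, timeout=...)` instead of the module-level `urlopen`.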

Cannot choose from an empty sequence

I have been having a lot of problems with fake_useragent today. They first showed up in a real-world Selenium setup, but I can't even get the library to return a user agent string on its own.

Python 3.6.1 (default, Jun  8 2017, 06:36:16) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from fake_useragent import UserAgent as UA
>>> ua = UA()
>>> ua.random
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 136, in __getattr__
    return random.choice(self.data_browsers[browser])
  File "/usr/local/lib/python3.6/random.py", line 257, in choice
    raise IndexError('Cannot choose from an empty sequence') from None
IndexError: Cannot choose from an empty sequence

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 139, in __getattr__
    raise FakeUserAgentError('Error occurred during getting browser')  # noqa
fake_useragent.errors.FakeUserAgentError: Error occurred during getting browser
>>> ua.VERSION
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 136, in __getattr__
    return random.choice(self.data_browsers[browser])
KeyError: 'version'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 139, in __getattr__
    raise FakeUserAgentError('Error occurred during getting browser')  # noqa
fake_useragent.errors.FakeUserAgentError: Error occurred during getting browser
>>> ua.chrome
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 136, in __getattr__
    return random.choice(self.data_browsers[browser])
  File "/usr/local/lib/python3.6/random.py", line 257, in choice
    raise IndexError('Cannot choose from an empty sequence') from None
IndexError: Cannot choose from an empty sequence

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 139, in __getattr__
    raise FakeUserAgentError('Error occurred during getting browser')  # noqa
fake_useragent.errors.FakeUserAgentError: Error occurred during getting browser
>>> ua['google chrome']
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 136, in __getattr__
    return random.choice(self.data_browsers[browser])
  File "/usr/local/lib/python3.6/random.py", line 257, in choice
    raise IndexError('Cannot choose from an empty sequence') from None
IndexError: Cannot choose from an empty sequence

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 119, in __getitem__
    return self.__getattr__(attr)
  File "/usr/local/lib/python3.6/site-packages/fake_useragent/fake.py", line 139, in __getattr__
    raise FakeUserAgentError('Error occurred during getting browser')  # noqa
fake_useragent.errors.FakeUserAgentError: Error occurred during getting browser
>>> 

Another traceback, with Selenium, is at #41.

FakeUserAgentError('Error occurred during getting browser')

I'm getting this error with version 0.1.7 on Mac OS X. The common suggestion is to update the library, but I think I already have a version in which this error should no longer occur. Any ideas?

Traceback (most recent call last):
  File "/Users/mikko/dev/norway/lib/python2.7/site-packages/twisted/internet/defer.py", line 1386, in _inlineCallbacks
    result = g.send(result)
  File "/Users/mikko/dev/norway/lib/python2.7/site-packages/scrapy/core/downloader/middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "/Users/mikko/dev/norway/lib/python2.7/site-packages/scrapy_fake_useragent/middleware.py", line 28, in process_request
    self.proxy2ua[proxy] = get_ua()
  File "/Users/mikko/dev/norway/lib/python2.7/site-packages/scrapy_fake_useragent/middleware.py", line 23, in get_ua
    return getattr(self.ua, self.ua_type)
  File "/Users/mikko/dev/norway/lib/python2.7/site-packages/fake_useragent/fake.py", line 139, in __getattr__
    raise FakeUserAgentError('Error occurred during getting browser')  # noqa
FakeUserAgentError: Error occurred during getting browser

Still IndexError: list index out of range

I've seen that you have already fixed this problem, but it still occurs for me :(

root@vmi52271:~/# pip install --upgrade fake-useragent
Collecting fake-useragent
  Downloading fake-useragent-0.1.2.tar.gz
Building wheels for collected packages: fake-useragent
  Running setup.py bdist_wheel for fake-useragent ... done
  Stored in directory: /root/.cache/pip/wheels/be/63/15/f6e26846756da814630681d9fd98d53310426a8464289d7455
Successfully built fake-useragent
Installing collected packages: fake-useragent
Successfully installed fake-useragent-0.1.2
root@vmi52271:~/# pip install -U fake-useragent
Requirement already up-to-date: fake-useragent in /usr/local/lib/pypy2.7/dist-packages
root@vmi52271:~/# python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from fake_useragent import UserAgent
>>> ua = UserAgent()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/fake_useragent/fake.py", line 10, in __init__
    self.data = load_cached()
  File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 140, in load_cached
    update()
  File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 135, in update
    write(load())
  File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 92, in load
    browsers_dict[browser_key] = get_browser_versions(browser)
  File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 55, in get_browser_versions
    html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range

Thanks for the great work.

Browser list has no values, raising FakeUserAgentError in fake.py, line 140

Hi there,

For whatever reason, self.data['browsers'] / self.data_browsers[browser] has only keys and no values:

{u'chrome': [], u'opera': [], u'firefox': [], u'internetexplorer': [], u'safari': []}

As a result, our production cron is broken by this error:

raise FakeUserAgentError('Error occurred during getting browser')  # noqa

So I am asking you to return a default browser if the list to choose from is empty,
i.e. at this line: https://github.com/hellysmile/fake-useragent/blob/master/fake_useragent/fake.py#L136

if not self.data_browsers[browser]:
    return random.choice(['firefox', 'chrome', 'opera', 'safari']) # i.e a fallback list
return random.choice(self.data_browsers[browser])

Hope you can give a better solution.
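
One point about the snippet above: it would return a browser name such as 'firefox' rather than a user agent string. A standalone sketch of the same idea that returns a usable string might look like the following. `pick_user_agent` and `FALLBACK_UA` are hypothetical names, and the fallback value is only an illustrative Chrome-style string, not something shipped by the library.

```python
import random

# Illustrative placeholder; a real deployment would choose its own default string.
FALLBACK_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
)


def pick_user_agent(data_browsers, browser, fallback=FALLBACK_UA):
    """Pick a random user agent for browser, or return fallback if none are known."""
    candidates = data_browsers.get(browser) or []
    if not candidates:
        # Covers both a missing key and the empty lists shown in this issue.
        return fallback
    return random.choice(candidates)


empty = {"chrome": [], "opera": [], "firefox": [], "internetexplorer": [], "safari": []}
print(pick_user_agent(empty, "chrome"))
```

With the empty dictionary from this report, the call returns the fallback string instead of raising.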

Index out of range

It seems that some parameters have changed, and this is causing a bug:

>>> import fake_useragent
>>> fake_useragent.UserAgent()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/fake_useragent/fake.py", line 10, in __init__
    self.data = load_cached()
  File "/usr/local/lib/python2.7/site-packages/fake_useragent/utils.py", line 140, in load_cached
    update()
  File "/usr/local/lib/python2.7/site-packages/fake_useragent/utils.py", line 135, in update
    write(load())
  File "/usr/local/lib/python2.7/site-packages/fake_useragent/utils.py", line 92, in load
    browsers_dict[browser_key] = get_browser_versions(browser)
  File "/usr/local/lib/python2.7/site-packages/fake_useragent/utils.py", line 55, in get_browser_versions
    html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range

Urllib timeout

Hello!
Thanks for the cool library.
I have one issue, though: on my hardware I see the exception "Error occurred during formatting data. Trying to use fallback server" too often. Hardcoding settings.HTTP_TIMEOUT to a large value like 10 makes it go away. Could you please increase it a bit (say, to 5), or maybe add a way to set a particular timeout when creating a UserAgent instance?
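
To illustrate the request, here is a sketch of a fetch helper whose timeout and retry count are parameters rather than module-level constants. The names `fetch_with_retries` and `http_fetch` are hypothetical and the defaults are arbitrary; this is not the library's code.

```python
import time
from urllib.request import urlopen


def http_fetch(url, timeout):
    """Plain urllib download with a caller-supplied timeout (Python 3)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_retries(fetch, url, timeout=5, retries=2, delay=0.0):
    """Call fetch(url, timeout) up to retries + 1 times and re-raise the last error."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            return fetch(url, timeout)
        except OSError as exc:
            # On Python 3, urllib.error.URLError and socket.timeout are OSError subclasses.
            last_error = exc
            if attempt < retries:
                time.sleep(delay)
    raise last_error
```

A slow machine could then call `fetch_with_retries(http_fetch, url, timeout=10)` without patching a settings module.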

fake-useragent.herokuapp.com unreliable uptime

It seems the cache server is not reliably available, and my programs are failing as a result. They are trying to load "https://fake-useragent.herokuapp.com/browsers/0.1.8". Sometimes when I visit this page in a browser it is available (loading "browsers": {"chrome":.....), but most of the time I get a connection closed error:

Error occurred during loading data. Trying to use cache server https://fake-useragent.herokuapp.com/browsers/0.1.8
Traceback (most recent call last):
  File "C:\Python36\lib\site-packages\fake_useragent\utils.py", line 67, in get
    context=context,
  File "C:\Python36\lib\urllib\request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python36\lib\urllib\request.py", line 526, in open
    response = self._open(req, data)
  File "C:\Python36\lib\urllib\request.py", line 544, in _open
    '_open', req)
  File "C:\Python36\lib\urllib\request.py", line 504, in _call_chain
    result = func(*args)
  File "C:\Python36\lib\urllib\request.py", line 1346, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "C:\Python36\lib\urllib\request.py", line 1321, in do_open
    r = h.getresponse()
  File "C:\Python36\lib\http\client.py", line 1331, in getresponse
    response.begin()
  File "C:\Python36\lib\http\client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "C:\Python36\lib\http\client.py", line 266, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

Configurable settings

I'd like to be able to change settings like BROWSERS_COUNT_LIMIT.
Maybe just as a parameter to the update method.

Error with Python3

I'm getting this error with version 0.1.7 on Mac OS X with Python 3.6.1 when I try the code below:

from fake_useragent import UserAgent
ua = UserAgent()

which raises this error:

Error occurred during loading data. Trying to use cache server https://fake-useragent.herokuapp.com/browsers/0.1.7
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1392, in connect
    super().connect()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 936, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 722, in create_connection
    raise err
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 713, in create_connection
    sock.connect(sa)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 67, in get
    context=context,
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error timed out>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 150, in load
    for item in get_browsers(verify_ssl=verify_ssl):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 97, in get_browsers
    html = get(settings.BROWSERS_STATS_PAGE, verify_ssl=verify_ssl)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 84, in get
    raise FakeUserAgentError('Maximum amount of retries reached')
fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1392, in connect
    super().connect()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 936, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 722, in create_connection
    raise err
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 713, in create_connection
    sock.connect(sa)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 67, in get
    context=context,
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error timed out>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 150, in load
    for item in get_browsers(verify_ssl=verify_ssl):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 97, in get_browsers
    html = get(settings.BROWSERS_STATS_PAGE, verify_ssl=verify_ssl)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 84, in get
    raise FakeUserAgentError('Maximum amount of retries reached')
fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1400, in connect
    server_hostname=server_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 401, in wrap_socket
    _context=self, _session=session)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 808, in __init__
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 1061, in do_handshake
    self._sslobj.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py", line 683, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 67, in get
    context=context,
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1361, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/fake.py", line 69, in __init__
    self.load()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/fake.py", line 78, in load
    verify_ssl=self.verify_ssl,
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 246, in load_cached
    update(path, use_cache_server=use_cache_server, verify_ssl=verify_ssl)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 241, in update
    write(path, load(use_cache_server=use_cache_server, verify_ssl=verify_ssl))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 185, in load
    verify_ssl=verify_ssl,
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fake_useragent/utils.py", line 84, in get
    raise FakeUserAgentError('Maximum amount of retries reached')
fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached

But the same code works fine with Python 2.7. Any ideas?

Filter on date

Hello,

Thank you for the nice module. A nice addition to this library would be the ability to filter user agents by release date (so that you don't get browsers that are too old). Sadly, it seems http://useragentstring.com/ doesn't provide this kind of information. Another problem is that most of the UAs on the website seem out of date (for example, the latest Chrome version on useragentstring.com is 41 when in reality it is 52). I tried to find an up-to-date database of all user agents, but I failed :( Does anyone know where we could gather or scrape this data?

The best I could find is: https://techblog.willshouse.com/2012/01/03/most-common-user-agents/
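
Since release dates are not available, one workaround is to filter on the major version number embedded in the string instead of the date. The sketch below does this for Chrome only. `chrome_at_least` is a hypothetical helper and the sample strings are illustrative, not taken from any database.

```python
import re

# Matches the major version in tokens such as "Chrome/52.0.2743.116".
CHROME_VERSION = re.compile(r"Chrome/(\d+)\.")


def chrome_at_least(user_agents, min_major):
    """Keep only user agent strings whose Chrome major version is >= min_major."""
    kept = []
    for ua in user_agents:
        match = CHROME_VERSION.search(ua)
        if match and int(match.group(1)) >= min_major:
            kept.append(ua)
    return kept


samples = [
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36",
]
print(chrome_at_least(samples, 50))
```

Strings from other browsers are dropped by this filter, so each browser family would need its own pattern.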

Page http://www.w3schools.com/browsers/browsers_stats.asp has changed.

Below is the traceback.

The page http://www.w3schools.com/browsers/browsers_stats.asp doesn't have a table with class="reference no translate".

I guess the layout has changed in the last 24 hours.

from fake_useragent import UserAgent
ua = UserAgent(cache=False)

Traceback (most recent call last):
File "", line 1, in
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/fake.py", line 12, in __init__
self.data = load()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 82, in load
for item in get_browsers():
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 29, in get_browsers
html = html.split('

')[1]
IndexError: list index out of range

http://www.w3schools.com/browsers/browsers_stats.asp

Error occurred during getting browser

I have a package that depends on fake-useragent, and recently its automated unit tests have been failing because fake-useragent times out when it tries to retrieve agents.

 File "/home/travis/build/chrisspen/howdou/.tox/py27/lib/python2.7/site-packages/fake_useragent/fake.py", line 98, in __getattr__

    raise FakeUserAgentError('Error occurred during getting browser')  # noqa

What's causing this? Am I hitting some web resource too much, or is it a bug in fake-useragent? What can I do to minimize this or cache the results locally?
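
On the local-caching question, a generic pattern is to store the downloaded data in a JSON file and touch the network only when that file is missing. This is a sketch under assumptions: `load_or_fetch` is a hypothetical helper, the `fetch` callable stands in for whatever downloads the data, and none of it describes the library's own cache.

```python
import json
import os
import tempfile


def load_or_fetch(path, fetch):
    """Return cached data from path, calling fetch() only when no cache file exists."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    data = fetch()
    with open(path, "w") as handle:
        json.dump(data, handle)
    return data


# Usage: the second call reads the file and never invokes its fetch callable.
with tempfile.TemporaryDirectory() as folder:
    cache_file = os.path.join(folder, "browsers.json")
    first = load_or_fetch(cache_file, lambda: {"browsers": {"chrome": ["example UA"]}})
    second = load_or_fetch(cache_file, lambda: {})
    print(first == second)
```

In a CI setup, committing or pre-seeding such a file would keep test runs from depending on a remote site being reachable.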

Useragentstring.com change causing 404

Apparently the site isn't responding to requests to 'http://useragentstring.com/pages/%s/' anymore, causing an initialization of UserAgent to raise a 404 error.

Since the last update, all my programs crash with: html = html.split('<div id=\'liste\'>')[1] IndexError: list index out of range

File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 2000, in __call__
return self.wsgi_app(environ, start_response)
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1991, in wsgi_app
response = self.make_response(self.handle_exception(e))
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1567, in handle_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1988, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1641, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1544, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1639, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1625, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/root/appsflyer_pushs/main_with_stream.py", line 127, in listen
if valid_request(ip=request.remote_addr):
File "/root/appsflyer_pushs/main_with_stream.py", line 90, in valid_request
ua = UserAgent(cache=False)
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/fake.py", line 12, in __init__
self.data = load()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/usr/local/lib/python2.7/dist-packages/fake_useragent/utils.py", line 55, in get_browser_versions
html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range

FakeUserAgentError on concurrent requests on Linux Server

Traceback (most recent call last):
  File "/usr/local/lib64/python2.7/site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
    result = g.send(result)
  File "/usr/local/lib/python2.7/site-packages/scrapy/core/downloader/middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "/usr/local/lib/python2.7/site-packages/scrapy_fake_useragent/middleware.py", line 27, in process_request
    request.headers.setdefault('User-Agent', self.ua.random)
  File "/usr/local/lib/python2.7/site-packages/fake_useragent/fake.py", line 98, in __getattr__
    raise FakeUserAgentError('Error occurred during getting browser')  # noqa
FakeUserAgentError: Error occurred during getting browser

I keep getting this error on the Linux server when I run multiple spiders concurrently. What should I do to avoid it? Do I need to add more RAM or something?
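
One possible mitigation, assuming the failures come from several workers trying to load the browser data at the same moment, is to load the data once and share it. The sketch below is generic and only covers threads within one process: the loader argument stands in for whatever slow, failure-prone loading step is used, and FALLBACK_UA is an arbitrary example string. Neither name comes from fake-useragent or Scrapy.

```python
import random
import threading

# Arbitrary example string used when loading fails (assumption, not library data).
FALLBACK_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
)


class SharedUserAgents(object):
    """Load the user-agent list at most once and share it between threads."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable returning a list of strings
        self._lock = threading.Lock()
        self._agents = None

    def _ensure_loaded(self):
        # The lock guarantees that only one thread runs the loader.
        with self._lock:
            if self._agents is None:
                try:
                    self._agents = list(self._loader())
                except Exception:
                    # Remember the failure so the loader is not retried on every call.
                    self._agents = []
        return self._agents

    def random(self):
        """Return a random loaded user agent, or FALLBACK_UA if none loaded."""
        agents = self._ensure_loaded()
        return random.choice(agents) if agents else FALLBACK_UA
```

Separate processes do not share this object, so spiders started as independent processes would still each load the data once.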

socket.timeout: timed out

When I try to use ua = UserAgent() on an EC2 Ubuntu instance, I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/fake.py", line 13, in __init__
    self.load()
  File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/fake.py", line 17, in load
    self.data = load_cached()
  File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/utils.py", line 135, in load_cached
    update()
  File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/utils.py", line 130, in update
    write(load())
  File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/utils.py", line 96, in load
    browsers_dict[browser_key] = get_browser_versions(browser)
  File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/utils.py", line 61, in get_browser_versions
    html = get(settings.BROWSER_BASE_PAGE.format(browser=quote_plus(browser)))
  File "/home/ubuntu/cncases/local/lib/python3.4/site-packages/fake_useragent/utils.py", line 27, in get
    return urlopen(request, timeout=settings.HTTP_TIMEOUT).read()
  File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 463, in open
    response = self._open(req, data)
  File "/usr/lib/python3.4/urllib/request.py", line 481, in _open
    '_open', req)
  File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 1210, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/usr/lib/python3.4/urllib/request.py", line 1185, in do_open
    r = h.getresponse()
  File "/usr/lib/python3.4/http/client.py", line 1171, in getresponse
    response.begin()
  File "/usr/lib/python3.4/http/client.py", line 351, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.4/http/client.py", line 313, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.4/socket.py", line 374, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

The machine definitely has an internet connection, since I can ping Google. Also, this only happens on a few instances I set up this week; the ones from last month are still running properly. I have updated fake-useragent to its most recent version, 0.1.2.
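
A timeout like this can be transient, so one common workaround is to retry the failing call with a growing delay. The following is a minimal sketch under that assumption; call_with_retries is a hypothetical helper, not something fake-useragent provides, and the defaults are arbitrary.

```python
import time


def call_with_retries(func, attempts=3, delay=1.0, backoff=2.0,
                      exceptions=(OSError,), sleep=time.sleep):
    """Call func() until it succeeds or the attempts are exhausted.

    The wait before each retry is multiplied by backoff. When every
    attempt fails, the last exception is re-raised. In Python 3,
    socket.timeout and urllib.error.URLError are subclasses of OSError,
    so the default covers both.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    wait = delay
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise
            sleep(wait)
            wait *= backoff


# Usage sketch (commented out because it needs network access):
# from urllib.request import urlopen
# html = call_with_retries(lambda: urlopen("http://example.com", timeout=10).read())
```

The sleep parameter exists only so the waiting can be replaced in tests; in normal use the default time.sleep applies.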

The same useragent every time

Good afternoon. I would like a future version to offer a way to get the same useragent every time. For example, ua = UserAgent(last=True), after which ua.chrome ALWAYS returns the most recent useragent (for the current database) for that browser.

Why this is needed: every day I log in to a site that, during authorization, also requests a userkey, which is computed in part from the useragent string. If I log in daily (several times a day) with a different userkey each time, it will inevitably raise suspicion.

PS: "ALWAYS" means not just within a single session, but on every run of the script.
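
Until such an option exists, similar behavior can be approximated by picking the highest browser version from a fixed list of strings. The sketch below assumes the caller already has a list of Chrome user-agent strings; chrome_version and latest_chrome are hypothetical helpers, and UserAgent(last=True) remains only the proposal made in this request.

```python
import re

_CHROME_VERSION = re.compile(r"Chrome/(\d+(?:\.\d+)*)")


def chrome_version(user_agent):
    """Return the Chrome version as a tuple of ints, or None if absent."""
    match = _CHROME_VERSION.search(user_agent)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def latest_chrome(user_agents):
    """Return the string with the highest Chrome version.

    The result depends only on the input list, so repeated runs over the
    same data always yield the same user agent.
    """
    versioned = [(chrome_version(ua), ua) for ua in user_agents]
    versioned = [pair for pair in versioned if pair[0] is not None]
    if not versioned:
        raise ValueError("no Chrome user agents in the list")
    # Tuples compare numerically, so (41, ...) ranks above (9, ...).
    return max(versioned)[1]
```

Comparing versions as integer tuples rather than strings matters here: as text, "9.0" would sort above "41.0".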

Add random_browser & random_mobile

It would be useful to keep the very nice "random" behavior while being able to guarantee that the returned useragent either always or never corresponds to a mobile device.

Showing "list index out of range" again and cannot be fixed by reinstalling

Hi, when I run these commands:

from fake_useragent import UserAgent
ua = UserAgent()

It shows the following error:

Traceback (most recent call last):
File "scraper.py", line 287, in <module>
ua = UserAgent()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/fake.py", line 10, in __init__
self.data = load_cached()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 140, in load_cached
update()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 135, in update
write(load())
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fake_useragent/utils.py", line 55, in get_browser_versions
html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range

I tried uninstalling and reinstalling with pip install fake-useragent, but the error still appears. Could you have a look and see what the problem is?
Thanks!

ua = UserAgent() error

Hi Guys,

I'm new to Python, so I apologize if this is basic. The first two lines of my script read:

from fake_useragent import UserAgent
ua = UserAgent()

I'm getting the following error:
File "Clicks.py", line 2, in <module>
ua = UserAgent()
File "/Library/Python/2.7/site-packages/fake_useragent/fake.py", line 10, in __init__
self.data = load_cached()
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 140, in load_cached
update()
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 135, in update
write(load())
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 92, in load
browsers_dict[browser_key] = get_browser_versions(browser)
File "/Library/Python/2.7/site-packages/fake_useragent/utils.py", line 55, in get_browser_versions
html = html.split('<div id=\'liste\'>')[1]
IndexError: list index out of range

I've used this script many times successfully in the past and nothing in my code has changed. This is the first time I've run it in about a week though. Thanks again for any help.

-Tim

UserAgent throws "ValueError: could not convert string to float" error on both cached and uncached calls

After from fake_useragent import UserAgent, calling either ua = UserAgent() or ua = UserAgent(cache=False) raises:

Traceback (most recent call last):
File "<pyshell#17>", line 1, in <module>
ua = UserAgent()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent-0.0.6-py2.7.egg/fake_useragent/fake.py", line 12, in __init__
self.data = load_cached()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent-0.0.6-py2.7.egg/fake_useragent/utils.py", line 134, in load_cached
update()
File "/usr/local/lib/python2.7/dist-packages/fake_useragent-0.0.6-py2.7.egg/fake_useragent/utils.py", line 129, in update
write(load())
File "/usr/local/lib/python2.7/dist-packages/fake_useragent-0.0.6-py2.7.egg/fake_useragent/utils.py", line 88, in load
for counter in range(int(float(percent))):
ValueError: could not convert string to float: <a
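
The failing line passes a scraped table cell straight to float(), which breaks as soon as the cell contains markup such as "<a". A hedged sketch of a more tolerant conversion follows; parse_percent is a hypothetical helper and the accepted cell format (a number with an optional percent sign) is an assumption about the scraped page.

```python
import re

# A number, optionally followed by a percent sign, with nothing else in the cell.
_PERCENT = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*%?\s*$")


def parse_percent(cell):
    """Convert a table cell such as '57.3 %' to a float.

    Returns None when the cell is not a plain number, for example when
    it holds leftover markup like '<a'.
    """
    match = _PERCENT.match(cell)
    return float(match.group(1)) if match else None
```

A caller could then skip rows where parse_percent returns None instead of letting one unexpected cell abort the whole load.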

ValueError: could not convert string to float: <a

Here is the whole traceback:

Traceback (most recent call last):
File "...\fake_useragent\fake.py", line 11, in __init__
self.data = load_cached()
File "...\fake_useragent\utils.py", line 133, in load_cached
update()
File "...\fake_useragent\utils.py", line 128, in update
write(load())
File "...\fake_useragent\utils.py", line 87, in load
for counter in range(int(float(percent))):
ValueError: could not convert string to float: <a

I tried to upgrade fake-useragent. It did not help.
I also tried pip uninstall fake-useragent and pip install fake-useragent again. It returns the same error message.

By the way, my OS is Windows 8.1.

I would appreciate any help with this issue.

IndexError: list index out of range

In [1]: from fake_useragent import UserAgent

In [2]: ua = UserAgent()
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-2-89a0eee92536> in <module>()
----> 1 ua = UserAgent()

/Users/josefson/virtualenvs/CNPQ/lib/python2.7/site-packages/fake_useragent/fake.pyc in __init__(self, cache)
      8     def __init__(self, cache=True):
      9         if cache:
---> 10             self.data = load_cached()
     11         else:
     12             self.data = load()

/Users/josefson/virtualenvs/CNPQ/lib/python2.7/site-packages/fake_useragent/utils.pyc in load_cached()
    138 def load_cached():
    139     if not exist():
--> 140         update()
    141
    142     return read()

/Users/josefson/virtualenvs/CNPQ/lib/python2.7/site-packages/fake_useragent/utils.pyc in update()
    133         rm()
    134
--> 135     write(load())
    136
    137

/Users/josefson/virtualenvs/CNPQ/lib/python2.7/site-packages/fake_useragent/utils.pyc in load()
     80     randomize_dict = {}
     81
---> 82     for item in get_browsers():
     83         browser, percent = item
     84

/Users/josefson/virtualenvs/CNPQ/lib/python2.7/site-packages/fake_useragent/utils.pyc in get_browsers()
     27     html = get(settings.BROWSERS_STATS_PAGE)
     28     html = html.decode('windows-1252')
---> 29     html = html.split('<table class="reference notranslate">')[1]
     30     html = html.split('</table>')[0]
     31

IndexError: list index out of range

Useragentstring.com throws 404

Hi @hellysmile!

Thank you for this nice library, it is a really nice addition to the community! Today I tried to use the library but got stuck when it tried to connect to http://useragentstring.com/pages/Chrome/.

It seems as if that page throws a 404. I have checked through the Wayback Machine and it seems as if that page existed in the past. My guess is that useragentstring.com changed their website architecture.

If you believe the site was down temporarily and everything works as expected then please mark this issue as solved.

Thank you so much!

Installation guide?

How about a SIMPLE how to install guide for those of us who need it?

I don't mind copying & pasting Terminal commands, but I don't script.
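
Other reports on this page install the package from a terminal with pip install fake-useragent. As a small sketch, the snippet below checks from Python 3 whether the package can be imported afterwards; is_installed is a hypothetical helper name, not part of the library.

```python
import importlib.util


def is_installed(module_name):
    """Return True when the import system can locate module_name."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("fake_useragent"):
        print("fake_useragent is installed")
    else:
        print("Not installed; run: pip install fake-useragent")
```

Note that the package is installed under the name fake-useragent but imported as fake_useragent.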

Can't run it on QPython on Android

I get this out of the errors.log

WARNING:root:Error occurred during loading data. Trying to use cache server https://fake-useragent.herokuapp.com/browsers/0.1.8
Traceback (most recent call last):
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 154, in load
for item in get_browsers(verify_ssl=verify_ssl):
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 97, in get_browsers
html = get(settings.BROWSERS_STATS_PAGE, verify_ssl=verify_ssl)
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 84, in get
raise FakeUserAgentError('Maximum amount of retries reached')
FakeUserAgentError: Maximum amount of retries reached
WARNING:root:Error occurred during loading data. Trying to use cache server https://fake-useragent.herokuapp.com/browsers/0.1.8
Traceback (most recent call last):
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 154, in load
for item in get_browsers(verify_ssl=verify_ssl):
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 97, in get_browsers
html = get(settings.BROWSERS_STATS_PAGE, verify_ssl=verify_ssl)
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 84, in get
raise FakeUserAgentError('Maximum amount of retries reached')
FakeUserAgentError: Maximum amount of retries reached

And this out of executing the script

/data/user/0/org.qpython.qpy/files/bin/qpython-android5-root.sh
"/storage/emulated/0/qpython/instabot.py-master/main.py" && exit
/instabot.py-master/main.py" && exit <
Traceback (most recent call last):
File "/storage/emulated/0/qpython/instabot.py-master/main.py", line 77, in <module>
'yuki_nishitani8','contemporary.paintings','danteslens','ramontrotman','voidcrack',])
File "/storage/emulated/0/qpython/instabot.py-master/src/instabot.py", line 162, in __init__
fake_ua = UserAgent()
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/fake.py", line 69, in __init__
self.load()
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/fake.py", line 78, in load
verify_ssl=self.verify_ssl,
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 250, in load_cached
update(path, use_cache_server=use_cache_server, verify_ssl=verify_ssl)
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 245, in update
write(path, load(use_cache_server=use_cache_server, verify_ssl=verify_ssl))
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 189, in load
verify_ssl=verify_ssl,
File "/data/user/0/org.qpython.qpy/files/lib/python2.7/site-packages/fake_useragent-0.1.8-py2.7.egg/fake_useragent/utils.py", line 84, in get
raise FakeUserAgentError('Maximum amount of retries reached')
fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached
1|bullhead:/ $

Can someone tell me what is wrong? I have no issues on Windows or Mac.

Distinguish between desktop and mobile (Android vs iOS)

I wrote a spider that crawls the desktop version of a site using ua.random, but sometimes it gets redirected to the mobile version, which is not what I expected.
I found this happens because ua.random can return a mobile UA, as shown below:

import requests
res = requests.get("https://fake-useragent.herokuapp.com/browsers/0.1.5")
print(res.json()['browsers']['safari'][41])
# Mozilla/5.0 (Android 2.2; Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4

According to #10 (comment), ua.desktop and ua.mobile will eventually be added. I'm really looking forward to that.
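
In the meantime, a rough filter can be applied to any list of user-agent strings. The sketch below is a heuristic only: the token list is an assumption and far from exhaustive, and is_mobile and random_desktop are hypothetical helpers rather than the ua.desktop and ua.mobile attributes mentioned in this issue.

```python
import random
import re

# Tokens that commonly appear in mobile user agents (assumption, not exhaustive).
_MOBILE_TOKENS = re.compile(
    r"Android|iPhone|iPad|iPod|Mobile|Windows Phone|BlackBerry",
    re.IGNORECASE,
)


def is_mobile(user_agent):
    """Heuristically decide whether a user-agent string looks mobile."""
    return _MOBILE_TOKENS.search(user_agent) is not None


def random_desktop(user_agents, rng=random):
    """Pick a random user agent that does not look mobile."""
    desktop = [ua for ua in user_agents if not is_mobile(ua)]
    if not desktop:
        raise ValueError("no desktop user agents available")
    return rng.choice(desktop)
```

The Safari string quoted in this issue contains "Android 2.2", so this filter would classify it as mobile and random_desktop would never return it.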
