ua-parser / uap-python
Python implementation of ua-parser
License: Apache License 2.0
How can I parse if the device is desktop, smartphone, tv?
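ua-parser itself only reports families, brands and models, so one possible approach is a small heuristic on top of the dict that user_agent_parser.Parse returns. The sketch below is only that: the function name, the TV tokens and the category rules are all assumptions made for illustration, not part of the library.

```python
# Heuristic sketch on top of the dict returned by user_agent_parser.Parse().
# Every rule here is an assumption for illustration, not ua-parser behaviour.

TV_TOKENS = ("SmartTV", "SMART-TV", "AppleTV", "GoogleTV")  # assumed markers
DESKTOP_OS_PREFIXES = ("Windows", "Mac OS X", "Linux")


def classify_device(parsed):
    """Return 'tv', 'tablet', 'smartphone', 'desktop' or 'unknown'."""
    ua_string = parsed.get("string") or ""
    device_family = (parsed.get("device") or {}).get("family") or ""
    os_family = (parsed.get("os") or {}).get("family") or ""

    if any(token in ua_string for token in TV_TOKENS):
        return "tv"
    if device_family == "iPad" or "Tablet" in device_family:
        return "tablet"
    if device_family == "iPhone" or os_family in ("iOS", "Windows Phone"):
        return "smartphone"
    if os_family == "Android":
        # Assumed convention: Android phones send "Mobile", tablets usually do not.
        return "smartphone" if "Mobile" in ua_string else "tablet"
    if os_family.startswith(DESKTOP_OS_PREFIXES):
        return "desktop"
    return "unknown"
```

Calling classify_device(user_agent_parser.Parse(ua_string)) would then give a coarse category; expect misclassifications for unusual user agents.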
string: Outlook-Express/7.0 (MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; TmstmpExt)
actual: {'family': 'IE', 'major': '7', 'minor': '0', 'patch': None}
string: Outlook-Express/7.0 (MSIE 7.0; Windows NT 5.1; Trident/4.0; AskTB5.6; TmstmpExt)
actual: {'family': 'IE', 'major': '7', 'minor': '0', 'patch': None}
string: Outlook-Express/7.0 (MSIE 7.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; Media Center PC 6.0; OfficeLiveConnector.1.4; OfficeLivePatch.1.3; InfoPath.3; FDM; TmstmpExt)
actual: {'family': 'IE', 'major': '7', 'minor': '0', 'patch': None}
string: Outlook-Express/7.0 (MSIE 6.0; Windows NT 5.1; SV1; GTB6.3; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; InfoPath.2; .NET CLR 3.0.04506.648; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; OfficeLiveConnector.1.3; OfficeLivePatch.0.0; TmstmpExt)
actual: {'family': 'IE', 'major': '6', 'minor': '0', 'patch': None}
string: Outlook-Express/7.0 (MSIE 8; Windows NT 5.1; Trident/4.0; GTB7.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; TmstmpExt)
actual: {'family': 'Other', 'major': None, 'minor': None, 'patch': None}
string: Outlook-Express/7.0 (MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.1; TmstmpExt)
actual: {'family': 'IE', 'major': '8', 'minor': '0', 'patch': None}
string: Outlook-Express/7.0 (MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; HPDTDF; .NET4.0C; BRI/2; AskTbLOL/5.12.5.17640; TmstmpExt)
actual: {'family': 'IE', 'major': '9', 'minor': '0', 'patch': None}
expected: {'family': 'Windows Live Mail', 'major': None, 'minor': None, 'patch': None}
(for all user-agents above, not sure about a version)
string: Microsoft Office/12.0 (Windows NT 6.1; Microsoft Office Outlook 12.0.6739; Pro)
actual: {'family': 'Other', 'major': None, 'minor': None, 'patch': None}
expected: {'family': 'Outlook', 'major': '2007', 'minor': None, 'patch': None}
string: Microsoft Office/14.0 (Windows NT 6.1; Microsoft Outlook 14.0.5128; Pro)
actual: {'family': 'Other', 'major': None, 'minor': None, 'patch': None}
expected: {'family': 'Outlook', 'major': '2010', 'minor': None, 'patch': None}
string: Microsoft Office/16.0 (Microsoft Outlook Mail 16.0.6525; Pro)
actual: {'family': 'Other', 'major': None, 'minor': None, 'patch': None}
expected: {'family': 'Outlook', 'major': '2016', 'minor': None, 'patch': None}
string: Microsoft Office/16.0 (Windows NT 10.0; Microsoft Outlook 16.0.6326; Pro)
actual: {'family': 'Other', 'major': None, 'minor': None, 'patch': None}
expected: {'family': 'Outlook', 'major': '2016', 'minor': None, 'patch': None}
string: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; WOW64; Trident/8.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; Microsoft Outlook 16.0.6366; ms-office; MSOffice 16)
actual: {'family': 'IE', 'major': '7', 'minor': '0', 'patch': None}
expected: {'family': 'Outlook', 'major': '2016', 'minor': None, 'patch': None}
string: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.2pre) Gecko/2009031117 Spicebird/0.7.1
actual: {'family': 'Other', 'major': None, 'minor': None, 'patch': None}
expected: {'family': 'Spicebird', 'major': '0', 'minor': '7', 'patch': '1'}
string: Mozilla/5.0 (Windows NT 10.0; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 Lightning/4.0.5.1
actual: {'family': 'Lightning', 'major': '4', 'minor': '0', 'patch': '5'}
expected: {'family': 'Thunderbird', 'major': '38', 'minor': '5', 'patch': '1'}
string: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.9pre) Gecko/20100209 Shredder/3.0.2pre
actual: {'family': 'Other', 'major': None, 'minor': None, 'patch': None}
expected: {'family': 'Shredder', 'major': '3', 'minor': '0', 'patch': '2pre'}
string: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.9pre) Gecko/20100308 Lightning/1.0b1 Shredder/3.0.4pre
actual: {'family': 'Lightning', 'major': '1', 'minor': '0', 'patch': 'b1'}
expected: {'family': 'Shredder', 'major': '3', 'minor': '0', 'patch': '4pre'}
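The Thunderbird/Lightning results above look like a rule-ordering effect: as far as I understand it, the parser walks an ordered list of regexes and the first match wins. A minimal sketch of that idea, using two hypothetical rules (not the actual uap-core regexes), shows how putting the mail client before the extension gives the expected family:

```python
import re

# Hypothetical rules, checked in order; the first match wins.
# These are NOT the real uap-core regexes, only an illustration.
RULES = [
    ("Thunderbird", re.compile(r"Thunderbird/(\d+)\.(\d+)(?:\.(\d+))?")),
    ("Lightning", re.compile(r"Lightning/(\d+)\.(\d+)(?:\.(\d+))?")),
]


def parse_family(ua_string):
    """Return the first matching rule's family and version parts."""
    for family, pattern in RULES:
        match = pattern.search(ua_string)
        if match:
            return {
                "family": family,
                "major": match.group(1),
                "minor": match.group(2),
                "patch": match.group(3),
            }
    return {"family": "Other", "major": None, "minor": None, "patch": None}


ua = ("Mozilla/5.0 (Windows NT 10.0; rv:38.0) Gecko/20100101 "
      "Thunderbird/38.5.1 Lightning/4.0.5.1")
result = parse_family(ua)
```

Swapping the two entries in RULES would reproduce the 'Lightning' result, because both patterns match this string.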
Snyk is flagging uap-python as insecure because it refers to an outdated version of uap-core with a known ReDoS vulnerability. The submodule needs to be updated. https://snyk.io/advisor/python/ua-parser#security
Hi Guys,
In November 2015 AWS Redshift added Python support for user-defined functions (UDFs). I am new to Python and would like to add this library to Redshift, but could use some guidance on how. This could become an alternative setup script for others. According to the following link, AWS requires the source to be packaged in a zip file and uploaded to an S3 bucket.
http://docs.aws.amazon.com/redshift/latest/dg/udf-python-language-support.html
Once the zip is prepared, the library can be created, and functions can then be created based on it. As best I can tell, UAP has a core that requires pre-processing. Any suggestions on how to write an alternative setup script for the Python project to include all of that in a zip?
For context, see: getsentry/sentry#1987
I notice this is being translated from yaml to json inside the sdist. Why not go one step further and generate the .py version instead?
tl;dr,
It takes 1.2 seconds to import the module on my computer through parsing the yaml file and converting it at module import time, but 219 microseconds after it's compiled to a py file.
Before:
$ python -m timeit -s 'from sentry.utils.ua_parser import parser' 'reload(parser)'
10 loops, best of 3: 1.2 sec per loop
After:
$ python -m timeit -s 'from sentry.utils.ua_parser import parser' 'reload(parser)'
1000 loops, best of 3: 219 usec per loop
I can submit a pull request for this if you're on board.
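To make the proposal concrete, here is a rough sketch of the build step I have in mind: take the already-parsed regex data and write it out as Python source, so importing it costs no YAML parsing. The function name and the sample data are made up for illustration; the real data would come from loading regexes.yaml at build time.

```python
import pprint


def render_regex_module(parsers):
    """Render already-parsed regex data as the text of a Python module."""
    lines = ["# Generated file (sketch); do not edit by hand."]
    for name, value in sorted(parsers.items()):
        # pprint.pformat emits valid Python literals for lists, dicts and strings.
        lines.append("%s = %s" % (name, pprint.pformat(value)))
    return "\n".join(lines) + "\n"


# Stand-in for what yaml.load(regexes.yaml) would produce at build time.
sample = {
    "USER_AGENT_PARSERS": [{"regex": r"(Spicebird)/(\d+)\.(\d+)\.(\d+)"}],
    "OS_PARSERS": [{"regex": r"(Android) (\d+)", "os_replacement": None}],
}
source = render_regex_module(sample)

# Importing the generated module would do the same work as this exec().
namespace = {}
exec(source, namespace)
```

The setup script would write `source` to a .py file inside the package, and the parser module would import the constants from it instead of reading YAML.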
I need to use the changes from #28, but can't just pip install the tarball from GitHub because those tarballs don't include the submodule. I'd really appreciate a PyPI release.
z = user_agent_parser.ParseDevice('Mozilla/5.0 (iPhone; CPU iPhone OS 12_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/70.0.3538.75 Mobile/15E148 Safari/605.1')
print(z)
{'family': 'iPhone', 'brand': 'Apple', 'model': 'iPhone'}
Even though the information is there, the exact model is not picked up, unfortunately.
Thank you for the work you've put into this project btw :)
A user-agent of Edge is
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36 Edge/15.15063
The parsed browser version is {'major': 15, 'minor': 15063}, but when I parsed the same UA on the whatismybrowser website, it returned Edge 40.
I found that 15 is the layout engine version and 40 is the software version; returning the software version might be better.
Forgive my poor English, and thanks for the contribution.
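Until something like this exists in the parser, a caller could translate the engine version after parsing. The sketch below is hypothetical: the table name and helper are made up, and only the 15 to 40 pair comes from the report above, so any other entries would need to be filled in from a trusted source.

```python
# Illustrative lookup; only the 15 -> 40 pair is taken from the report above.
EDGEHTML_TO_EDGE = {"15": "40"}


def edge_software_version(parsed_user_agent):
    """Map the EdgeHTML major version to the Edge software version, if known."""
    major = parsed_user_agent.get("major")
    # Fall back to the parsed value when the engine version is not in the table.
    return EDGEHTML_TO_EDGE.get(str(major), major)


software_version = edge_software_version(
    {"family": "Edge", "major": "15", "minor": "15063"}
)
```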
The release is now 0.6.1, but the version in __init__.py is still 0.5.1.
I was confused by this user agent so tried the library...
>>> a='Mozilla/5.0 (Linux; Android 6.1; iPhone 7PLUS Build/KOT49H) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/30.0.0.0 Mobile Safari/537.36'
>>> a
'Mozilla/5.0 (Linux; Android 6.1; iPhone 7PLUS Build/KOT49H) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/30.0.0.0 Mobile Safari/537.36'
>>> from ua_parser import user_agent_parser
>>> import pprint
>>> pp = pprint.PrettyPrinter(indent=4)
>>> parsed_string = user_agent_parser.Parse(a)
>>> pp.pprint(parsed_string)
{   'device': {'brand': 'Apple', 'family': 'iPhone', 'model': 'iPhone'},
    'os': {   'family': 'Android',
              'major': '6',
              'minor': '1',
              'patch': None,
              'patch_minor': None},
    'string': 'Mozilla/5.0 (Linux; Android 6.1; iPhone 7PLUS Build/KOT49H) '
              'AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 '
              'Chrome/30.0.0.0 Mobile Safari/537.36',
    'user_agent': {   'family': 'Chrome Mobile',
                      'major': '30',
                      'minor': '0',
                      'patch': '0'}}
>>>
Apparently this is an iPhone running Android!
Feeding the same string into https://developers.whatismybrowser.com/useragents/parse/#parse-useragent tells me this is Chrome 30 on iOS.
Apologies for the non-issue issue.
We use ua-parser heavily in Sentry and I'm curious about the status of 0.4.0.
Right now we're at a point where we either will be forking the old ua-parser (and possibly vendoring it), or helping push this through.
Mostly the reason for this is we need to get updated support for things like Microsoft Edge.
Attempting to install ua-parser-0.5 on Python 3.5.0 on Mac OS X 10.11.1 and getting the following error:
(MyNewLeaf)MyNewLeaf|store-admin $ pip install --upgrade ua-parser
Collecting ua-parser
Using cached ua-parser-0.5.0.tar.gz
Requirement already up-to-date: pyyaml in ./lib/python3.5/site-packages (from ua-parser)
Building wheels for collected packages: ua-parser
Running setup.py bdist_wheel for ua-parser
Complete output from command /Users/coltonprovias/Development/MyNewLeaf/bin/python3.5 -c "import setuptools;__file__='/private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" bdist_wheel -d /var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/tmpe4yexrdlpip-wheel-:
running bdist_wheel
running build
running build_py
creating build
creating build/lib
creating build/lib/ua_parser
copying ./ua_parser/__init__.py -> build/lib/ua_parser
copying ./ua_parser/user_agent_parser.py -> build/lib/ua_parser
copying ./ua_parser/user_agent_parser_test.py -> build/lib/ua_parser
running egg_info
writing dependency_links to ua_parser.egg-info/dependency_links.txt
writing top-level names to ua_parser.egg-info/top_level.txt
writing requirements to ua_parser.egg-info/requires.txt
writing ua_parser.egg-info/PKG-INFO
warning: manifest_maker: standard file '-c' not found
reading manifest file 'ua_parser.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'ua_parser.egg-info/SOURCES.txt'
copying ./ua_parser/regexes.yaml -> build/lib/ua_parser
copying ./ua_parser/regexes.json -> build/lib/ua_parser
installing to build/bdist.macosx-10.11-x86_64/wheel
running install
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser/setup.py", line 83, in <module>
'Programming Language :: Python :: Implementation :: PyPy',
File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/distutils/dist.py", line 955, in run_commands
self.run_command(cmd)
File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/Users/coltonprovias/Development/MyNewLeaf/lib/python3.5/site-packages/wheel/bdist_wheel.py", line 211, in run
self.run_command('install')
File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser/setup.py", line 43, in run
install_regexes()
File "/private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser/setup.py", line 16, in install_regexes
'Unable to find regexes.yaml, should be at %r' % yaml_src)
RuntimeError: Unable to find regexes.yaml, should be at '/private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser/uap-core/regexes.yaml'
Copying regexes.yaml to package directory...
----------------------------------------
Failed building wheel for ua-parser
Failed to build ua-parser
Installing collected packages: ua-parser
Found existing installation: ua-parser 0.4.1
Uninstalling ua-parser-0.4.1:
Successfully uninstalled ua-parser-0.4.1
Running setup.py install for ua-parser
Complete output from command /Users/coltonprovias/Development/MyNewLeaf/bin/python3.5 -c "import setuptools, tokenize;__file__='/private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-m56ggvi8-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/coltonprovias/Development/MyNewLeaf/bin/../include/site/python3.5/ua-parser:
running install
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser/setup.py", line 83, in <module>
'Programming Language :: Python :: Implementation :: PyPy',
File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/distutils/dist.py", line 955, in run_commands
self.run_command(cmd)
File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser/setup.py", line 43, in run
install_regexes()
File "/private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser/setup.py", line 16, in install_regexes
'Unable to find regexes.yaml, should be at %r' % yaml_src)
RuntimeError: Unable to find regexes.yaml, should be at '/private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser/uap-core/regexes.yaml'
Copying regexes.yaml to package directory...
----------------------------------------
Rolling back uninstall of ua-parser
Command "/Users/coltonprovias/Development/MyNewLeaf/bin/python3.5 -c "import setuptools, tokenize;__file__='/private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-m56ggvi8-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/coltonprovias/Development/MyNewLeaf/bin/../include/site/python3.5/ua-parser" failed with error code 1 in /private/var/folders/zl/0vgdj8l96ss0_w20j9k0t7bm0000gn/T/pip-build-x3p3uu52/ua-parser
This is being done on a Windows machine in a PowerShell session. See repro and error below:
$env:UA_PARSER_YAML = "C:\\Temp\\regexes.yaml"
>>> from ua_parser import user_agent_parser
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\mark.depalma\Desktop\Projects\Intune\Scripts\DynamicRedirector\uatestenv\lib\site-packages\ua_parser\user_agent_parser.py", line 481, in <module>
regexes = yaml.load(fp, Loader=SafeLoader)
File "C:\Users\mark.depalma\Desktop\Projects\Intune\Scripts\DynamicRedirector\uatestenv\lib\site-packages\yaml\__init__.py", line 81, in load
return loader.get_single_data()
File "C:\Users\mark.depalma\Desktop\Projects\Intune\Scripts\DynamicRedirector\uatestenv\lib\site-packages\yaml\constructor.py", line 49, in get_single_data
node = self.get_single_node()
File "yaml\_yaml.pyx", line 673, in yaml._yaml.CParser.get_single_node
File "yaml\_yaml.pyx", line 687, in yaml._yaml.CParser._compose_document
File "yaml\_yaml.pyx", line 731, in yaml._yaml.CParser._compose_node
File "yaml\_yaml.pyx", line 845, in yaml._yaml.CParser._compose_mapping_node
File "yaml\_yaml.pyx", line 729, in yaml._yaml.CParser._compose_node
File "yaml\_yaml.pyx", line 806, in yaml._yaml.CParser._compose_sequence_node
File "yaml\_yaml.pyx", line 731, in yaml._yaml.CParser._compose_node
File "yaml\_yaml.pyx", line 845, in yaml._yaml.CParser._compose_mapping_node
File "yaml\_yaml.pyx", line 694, in yaml._yaml.CParser._compose_node
File "yaml\_yaml.pyx", line 858, in yaml._yaml.CParser._parse_next_event
File "yaml\_yaml.pyx", line 867, in yaml._yaml.input_handler
File "C:\Program Files\Python39\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 461: character maps to <undefined>
When manually stepping through code I saw that the file was being opened using 'cp1252' encoding by default. Adjusting the code so that the open handle is created as binary resolves the issue.
Line 480 of user_agent_parser.py with fix:
with open(UA_PARSER_YAML, 'rb') as fp:
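An alternative to binary mode is to pass the encoding explicitly, which keeps the file handle in text mode. The snippet below reproduces the failure with a throwaway file and shows the explicit-encoding variant; it assumes regexes.yaml is UTF-8, and the file content is made up for the demonstration.

```python
import os
import tempfile

# "č" (U+010D) is encoded as the bytes C4 8D in UTF-8, and 0x8D is
# undefined in cp1252, which is what triggers the 'charmap' error.
text = "family_replacement: 'č'\n"

with tempfile.NamedTemporaryFile("wb", suffix=".yaml", delete=False) as tmp:
    tmp.write(text.encode("utf-8"))
    path = tmp.name

try:
    try:
        # What the platform default can amount to on Windows.
        with open(path, encoding="cp1252") as fp:
            fp.read()
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True

    # An explicit encoding behaves the same on every platform.
    with open(path, encoding="utf-8") as fp:
        decoded = fp.read()
finally:
    os.remove(path)
```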
Looking at this project over here: https://github.com/tobie/ua-parser
Should I be looking at the code in this repo for the python version? Or should I be continuing to look at the code within the other repo: https://github.com/tobie/ua-parser/tree/master/py/ua_parser
Is the plan to split the existing tobie/ua-parser into the ua-parser org with a bunch of sub repos?
I only ask so I know where I'm supposed to log and discuss issues.
uap-python/ua_parser/user_agent_parser.py
Line 219 in 6247d69
Instead of using the built-in cache of 20 values, which gets cleared out when it grows too big, we are using an LRU cache of 1000 values. This has reduced the time user_agent_parser takes to run (tested over 100,000 real user agents) by about 3x, and it appears to use less than 1 MB of RAM.
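For anyone wanting to try the same thing from the outside, this is roughly the shape of it with the standard library's functools.lru_cache. slow_parse below is only a stand-in for the real user_agent_parser.Parse call, and the names are made up for the example.

```python
from functools import lru_cache

calls = []


def slow_parse(ua_string):
    """Stand-in for the real, regex-heavy parse; records each real call."""
    calls.append(ua_string)
    return {"string": ua_string, "family": "Other"}


@lru_cache(maxsize=1000)
def cached_parse(ua_string):
    # Least recently used entries are evicted one at a time once 1000
    # distinct strings are cached, instead of clearing the whole cache.
    return slow_parse(ua_string)


for ua in ["UA-1", "UA-2", "UA-1"]:
    cached_parse(ua)
info = cached_parse.cache_info()
```

One caveat: the cached dict is shared between callers, so it should be treated as read-only (or copied) before being modified.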
Hello there.
We have a problem: new user agents are not parsed correctly in Python.
After investigating the regexes.yaml file, we see that the Python library points to a 10-month-old commit of uap-core, while the most recent version of uap-core is 10 days old.
Is there a reason why the old uap-core is used, or can we use the new one?
When parsing the UA string HTTPie/0.9.9, user_agent_parser reports it as a "Generic Feature Phone", even though it's obviously not a phone.
See also the file size on the pypi page for wheel and egg:
https://pypi.python.org/pypi/ua-parser
Here is the User Agent header in question:
Mozilla/5.0 (Linux; U; Android 4.4.3; zh-cn; ่ถๅ9999 Build/KOT49H) AppleWebKit/533.1 (KHTML, like Gecko)Version/4.0 MQQBrowser/5.4 TBS/025469 Mobile Safari/533.1 MicroMessenger/6.2.2.54_rec1912d.581 NetType/3gnet Language/zh_CN
And here is the traceback:
File "/usr/lib/python2.7/site-packages/user_agents-1.0.1-py2.7.egg/user_agents/parsers.py", line 232, in parse
return UserAgent(user_agent_string)
File "/usr/lib/python2.7/site-packages/user_agents-1.0.1-py2.7.egg/user_agents/parsers.py", line 126, in __init__
ua_dict = user_agent_parser.Parse(user_agent_string)
File "/usr/lib/python2.7/site-packages/ua_parser/user_agent_parser.py", line 232, in Parse
'device': ParseDevice(user_agent_string, **jsParseBits),
File "/usr/lib/python2.7/site-packages/ua_parser/user_agent_parser.py", line 315, in ParseDevice
device, brand, model = deviceParser.Parse(user_agent_string)
File "/usr/lib/python2.7/site-packages/ua_parser/user_agent_parser.py", line 203, in Parse
model = self.MultiReplace(self.model_replacement, match)
File "/usr/lib/python2.7/site-packages/ua_parser/user_agent_parser.py", line 184, in MultiReplace
_string = re.sub(r'\$(\d)', _repl, string)
File "/usr/lib64/python2.7/re.py", line 151, in sub
return _compile(pattern, flags).sub(repl, string, count)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 0: ordinal not in range(128)
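The traceback ends in the `$1`-style substitution inside MultiReplace, where Python 2 mixes a byte string with a unicode pattern. Below is a rough Python 3 reconstruction of that substitution step, where every str is Unicode and the problem does not arise. It is a sketch based on the one line visible in the traceback, not the library's actual code, and the device name in the example is a made-up non-ASCII string.

```python
import re


def multi_replace(template, match):
    """Replace $1..$9 in template with the corresponding groups of match."""
    def _repl(placeholder):
        index = int(placeholder.group(1))
        if 1 <= index <= match.re.groups:
            return match.group(index) or ""
        return ""  # placeholder refers to a group the pattern does not have
    return re.sub(r"\$(\d)", _repl, template).strip()


# Hypothetical device rule and a made-up non-ASCII model name.
pattern = re.compile(r"Android [\d.]+; [\w-]+; (.+?) Build/")
ua = "Mozilla/5.0 (Linux; U; Android 4.4.3; zh-cn; 测试9999 Build/KOT49H)"
model = multi_replace("$1", pattern.search(ua))
```

On Python 2 the equivalent fix would be to decode the incoming user agent to unicode before parsing.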
Mozilla/5.0 (Linux; Android 6.0.1; ATH-AL00 Build/HONORATH-AL00; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/53.0.2785.124 Mobile Safari/537.36 JsKit/1.0 (Android) SohuNews/5.7.3 BuildCode/113
Output:
{ 'device': { 'family': 'ATH-AL00 Build/HONORATH-AL00; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/53.0.2785.124 Mobile Safari/537.36 JsKit/1.0 (Android) SohuNews/5.7.3'},
'os': { 'family': 'Android',
'major': '6',
'minor': '0',
'patch': '1',
'patch_minor': None},
'string': 'Mozilla/5.0 (Linux; Android 6.0.1; ATH-AL00 Build/HONORATH-AL00; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/53.0.2785.124 Mobile Safari/537.36 JsKit/1.0 (Android) SohuNews/5.7.3 BuildCode/113',
'user_agent': { 'family': u'Chrome Mobile',
'major': '53',
'minor': '0',
'patch': '2785'}}
Are there plans to add release tags to this repo? I'd like to add this repo as a subtree and/or submodule, but there are no tags to point at, only commits on master.
Mozilla/5.0 (Linux; U; Android 2.3.5; zh-cn; HTC_IncredibleS_S710e Build/GRJ90) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1
Returns:
{'user_agent': {'family': 'Android', 'major': '2', 'minor': '3', 'patch': '5'}, 'device': {'family': 'HTC IncredibleS S710e', 'brand': 'HTC', 'model': 'IncredibleS S710e'}, 'os': {'family': 'Android', 'major': '2', 'patch_minor': None, 'minor': '3', 'patch': '5'}, 'string': 'Mozilla/5.0 (Linux; U; Android 2.3.5; zh-cn; HTC_IncredibleS_S710e Build/GRJ90) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1'}
http://www.useragentstring.com/index.php marks the browser as "4.0", but here it's 2.3.5, which is the Android version, not the browser version.
@mattrobenolt - unfortunately I missed an issue in uap-core that left my fix for Pingdom bot detection incomplete. #53 includes that fix. I'd appreciate another patch release of uap-python.
Thanks!
It seems ua-parser 0.5.0 does not correctly parse out the version number in later versions of Windows.
$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:38)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from ua_parser import user_agent_parser
>>> import pprint
>>> pp = pprint.PrettyPrinter(indent=4)
>>> ua_string = 'Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'
>>> parsed_string = user_agent_parser.ParseOS(ua_string)
>>> pp.pprint(parsed_string)
{ 'family': u'Windows 8.1',
'major': None,
'minor': None,
'patch': None,
'patch_minor': None}
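What I expected is something closer to a family of 'Windows' with the release split into version fields. A small sketch of that idea follows; the table and function are hypothetical and cover only a few NT versions, they are not how ua-parser works today.

```python
import re

# NT kernel version -> (major, minor) of the marketing release.
NT_TO_RELEASE = {
    "6.1": ("7", None),
    "6.2": ("8", None),
    "6.3": ("8", "1"),
    "10.0": ("10", None),
}


def parse_windows_os(ua_string):
    """Return a family/major/minor dict for the Windows NT token, if known."""
    match = re.search(r"Windows NT (\d+\.\d+)", ua_string)
    if not match or match.group(1) not in NT_TO_RELEASE:
        return {"family": "Other", "major": None, "minor": None}
    major, minor = NT_TO_RELEASE[match.group(1)]
    return {"family": "Windows", "major": major, "minor": minor}


ua_string = ("Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 "
             "(KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36")
windows_os = parse_windows_os(ua_string)
```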
{
'user_agent': {
'family': 'Firefox Mobile',
'major': '41',
'minor': '0',
'patch': None
},
'os': {
'family': 'Android',
'major': '4',
'minor': '4',
'patch': None,
'patch_minor': None
},
'device': {
'family': 'Generic Tablet',
'brand': 'Generic',
'model': 'Tablet'
},
'string': 'Mozilla/5.0 (Android 4.4; Tablet; rv:41.0) Gecko/41.0 Firefox/41.0'
}
{
'user_agent': {
'family': 'Firefox Mobile',
'major': '41',
'minor': '0',
'patch': None
},
'os': {
'family': 'Android',
'major': '4',
'minor': '4',
'patch': None,
'patch_minor': None
},
'device': {
'family': 'rv:41.0',
'brand': 'Generic_Android',
'model': 'rv:41.0'
},
'string': 'Mozilla/5.0 (Android 4.4; Tablet; rv:41.0) Gecko/41.0 Firefox/41.0'
}
I'll see if I can submit a pull request for this. :)
The device part of the dictionary is off.
Actually, this is a user agent for the Instagram iOS app:
Mozilla/5.0 (iPhone; CPU iPhone OS 12_0_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/16A404 Instagram 72.0.0.20.101 (iPad6,11; iOS 12_0_1; ru_RU; ru-RU; scale=2.00; gamut=normal; 750x1334; 131642248)
It includes the app name, but the result of user_agent_parser.Parse(ua_string) for this is:
{
'family': 'Mobile Safari UI/WKWebView',
'major': '12',
'minor': '0',
'patch': '1'
}
But 'family': 'Instagram' is expected, because uap-core has a regex for the Instagram UA.
It seems that the uap-core submodule is outdated.
Hello,
I tried to install ua-parser using 'pip install ua-parser' and a manual install (python setup.py install); both produced the same error: ModuleNotFoundError: No module named 'ua_parser._regexes'. Below is the detailed error message after installing ua-parser with pip.
Python 3.6.1 |Anaconda 4.4.0 (64-bit)| (default, May 11 2017, 13:25:24) [MSC v.
900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
from ua_parser import user_agent_parser
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "E:\ua\uap-python-master\uap-python-master\ua_parser\user_agent_parser.py", line 552, in <module>
from ._regexes import USER_AGENT_PARSERS, DEVICE_PARSERS, OS_PARSERS
ModuleNotFoundError: No module named 'ua_parser._regexes'
My OS is Windows 7 64-bit. Is there something I missed?
Thanks for your attention.
Please enable Travis CI in project settings. Thanks.
Thanks for a useful library!
I noticed that a lot has happened in this repo since the last PyPI release (April 2018). Would it be possible to cut a new PyPI release?
user_agent_parser.Parse seems to hang forever on certain malformed UA strings.
Repro:
from ua_parser import user_agent_parser
user_agent_parser.Parse("Mozilla/5.0 (Linux; Android 9; SM-G975U Build/PPR1.180610.011; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/77.0.3865.92 Mobile Safari/537.36 (Mobile; afma-sdk-a-v14300000.14300000.0)")
(Note the large amount of whitespace.)
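As a stopgap on the caller's side, collapsing whitespace runs and capping the length before parsing should keep inputs like this away from the expensive regexes. This is a workaround sketch, not a fix in the library; the function name and the 1024-character cap are arbitrary choices for the example.

```python
import re

MAX_UA_LENGTH = 1024  # assumed cap; tune for your own traffic


def sanitize_user_agent(ua_string, max_length=MAX_UA_LENGTH):
    """Collapse whitespace runs and truncate overly long user agents."""
    collapsed = re.sub(r"\s+", " ", ua_string).strip()
    return collapsed[:max_length]


padded = "Safari/537.36" + " " * 5000 + "(Mobile; afma-sdk-a-v14300000.14300000.0)"
cleaned = sanitize_user_agent(padded)
```

The cleaned string would then be passed to user_agent_parser.Parse in place of the raw header value.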
Take the following User Agent instance:
'Mozilla/5.0 (Linux; Android 5.0.2; SM-T535 Build/LRX22G; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/50.0.2661.86 Safari/537.36 [FB_IAB/FB4A;FBAV/77.0.0.20.66;]'
Place that in a variable ua, then:
user_agent_parser.ParseUserAgent(ua)
says that the browser family is 'Facebook'.
I have never heard of such a browser family.
Is this behavior correct?
Hello!
It would be nice if the python package version changed when the linked uap-core submodule was updated. uap-core has been updated several times in master since 0.8.0 has been released. Is there an intention to 'release' a new version of uap-python?
I just noticed that version 0.5.0 is available on PyPI, so I came here looking for a changelog. I cannot find one, and the code in the repository still seems to be version 0.4.1, which is the version I currently use in my project.
I looked in uap-core and found version 0.5.0 under releases, but I was confused that there was no version 0.4.1 (only 0.4.0), and I still can't really find a good description of what has changed. Of course I can read the commit messages, but those are not very easy to read.
All in all, I am hesitant to upgrade, because I am unsure of what has happened.
Python version: Python 3.7.1
ua-parser version: ua-parser==0.8.0
REPL:
>>> from ua_parser import user_agent_parser as uap
>>> uap.Parse('Mozilla/5.0 (Linux; Android 9; SM-G960F Build/PPR1.180610.011; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/74.0.3729.136 Mobile Safari/537.36 PKeyAuth/1.0')
Output:
{'user_agent': {'family': 'Chrome Mobile WebView', 'major': '74', 'minor': '0', 'patch': '3729'}, 'os': {'family': 'Android', 'major': None, 'minor': None, 'patch': None, 'patch_minor': None}, 'device': {'family': 'Samsung SM-G960F', 'brand': 'Samsung', 'model': 'SM-G960F'}, 'string': 'Mozilla/5.0 (Linux; Android 9; SM-G960F Build/PPR1.180610.011; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/74.0.3729.136 Mobile Safari/537.36 PKeyAuth/1.0'}
Expected:
{'user_agent': {'family': 'Chrome Mobile WebView', 'major': '74', 'minor': '0', 'patch': '3729'}, 'os': {'family': 'Android', 'major': '9', 'minor': None, 'patch': None, 'patch_minor': None}, 'device': {'family': 'Samsung SM-G960F', 'brand': 'Samsung', 'model': 'SM-G960F'}, 'string': 'Mozilla/5.0 (Linux; Android 9; SM-G960F Build/PPR1.180610.011; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/74.0.3729.136 Mobile Safari/537.36 PKeyAuth/1.0'}
The user agent string generated by Android 9 devices doesn't follow the format {major}.{minor}.{patch} (e.g. Android 9.0); instead it just gives Android 9, which leaves the parser unable to recognise the version.
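One way to handle this would be to make the minor and patch parts optional in the regex. The pattern below is a sketch of that idea written for this report, not the regex uap-core actually uses:

```python
import re

# Minor and patch are optional, so "Android 9" and "Android 6.0.1" both match.
ANDROID_VERSION = re.compile(r"Android (\d+)(?:\.(\d+))?(?:\.(\d+))?")


def parse_android_version(ua_string):
    """Return the Android version parts, or None if there is no Android token."""
    match = ANDROID_VERSION.search(ua_string)
    if not match:
        return None
    return {
        "family": "Android",
        "major": match.group(1),
        "minor": match.group(2),
        "patch": match.group(3),
    }


android_9 = parse_android_version(
    "Mozilla/5.0 (Linux; Android 9; SM-G960F Build/PPR1.180610.011; wv)"
)
```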
I'm wondering what the process is for updating uap-python with the latest rules from uap-core. It looks like @mattrobenolt usually does this. Do you want update requests as an issue like this, or as a PR to merge?
The git submodule for uap-core currently points at a commit not associated with a tag in uap-core, which suggests that we don't need to wait for a "release" from uap-core to update uap-python. It seems like it would be preferable to point at a release, but it makes sense that we'd want to avoid that level of coordination.
I'm asking because I recently submitted ua-parser/uap-core#216, which was merged to master over the weekend, and I would like to pull it into uap-python so I can use it.
It would be helpful to update the README with instructions on how to make these updates.
Thanks!
As per the subject: neither the PyPI sdist tarball nor the repository contains a license file.
Even if you forked, you still need to ship a license file (in this case, I guess, Apache-2.0) within the repo and the archive.
Hey @elsigh!
So I got a report that python-user-agents is detecting MS Edge wrong and after some digging it looks like it's using uap-python. I believe MS Edge detection was in ua-parser but wanted to check. My python-fu is pretty weak so I'd need some assistance on that front.
Upgrading from 0.4.0 to 0.7.1, I was surprised to find everything coming back as bytes rather than str. I would have found this out more easily with a changelog to read.
Sample Useragent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) brave/0.7.11 Chrome/47.0.2526.110 Brave/0.36.5 Safari/537.36
website: https://brave.com
make Makefile
make: Nothing to be done for `Makefile'.
python setup.py install
>>> from ua_parser import user_agent_parser
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "ua_parser/user_agent_parser.py", line 460, in <module>
yamlFile = open(yamlPath)
IOError: [Errno 2] No such file or directory: 'ua_parser/regexes.yaml'
The sub-module hash for uap-core comes from November.
For those of you who are interested: if you have ua-parser and want to get the latest regexes, you will need to run the Makefile or 'python setup.py sdist'.
The reason is that you need the 'json' files as well as the 'yaml'.
git clone https://github.com/ua-parser/uap-python
cd uap-python
git submodule update --init
python setup.py sdist
Then check 'regexes.yaml' for an item that only exists in the latest copy. For me, that was 'Opera Coast', which was the main reason I wanted to update my ua-parser.
Thank you for this great utility! I wanted to share some feedback and offer to take on the contribution to work on it if you're open to that.
I did a quick search of prior issues and didn't see anything related. Please feel free to link if it is a duplicate
I recently found that lower case user agent strings led to issues where data was not extracted/parsed properly. The below illustrates this:
>>> from ua_parser import user_agent_parser
>>> import pprint
>>> ua_string = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36'
>>> normal_case_parsed = user_agent_parser.Parse(ua_string)
>>> pprint.pprint(normal_case_parsed)
{'device': {'brand': 'Apple', 'family': 'Mac', 'model': 'Mac'},
'os': {'family': 'Mac OS X',
'major': '10',
'minor': '9',
'patch': '4',
'patch_minor': None},
'string': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 '
'(KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36',
'user_agent': {'family': 'Chrome',
'major': '41',
'minor': '0',
'patch': '2272'}}
>>> lower_case_parsed = user_agent_parser.Parse(ua_string.lower())
>>> pprint.pprint(lower_case_parsed)
{'device': {'brand': None, 'family': 'Other', 'model': None},
'os': {'family': 'Other',
'major': None,
'minor': None,
'patch': None,
'patch_minor': None},
'string': 'mozilla/5.0 (macintosh; intel mac os x 10_9_4) applewebkit/537.36 '
'(khtml, like gecko) chrome/41.0.2272.104 safari/537.36',
'user_agent': {'family': 'Other', 'major': None, 'minor': None, 'patch': None}}
Let me know of any other details to provide, otherwise I will take a look for where a patch could be applied and see if I can contribute one.
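To make the case sensitivity concrete, here is a minimal sketch using only Python's `re` module. The pattern below is a hypothetical stand-in written for illustration, not one copied from regexes.yaml. Whether compiling every uap-core regex with `re.IGNORECASE` would be safe is an open question that this sketch does not answer.

```python
import re

# Hypothetical pattern for illustration only; not taken from regexes.yaml.
CHROME_PATTERN = r"(Chrome)/(\d+)\.(\d+)"

case_sensitive = re.compile(CHROME_PATTERN)
case_insensitive = re.compile(CHROME_PATTERN, re.IGNORECASE)

ua = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36"
)


def major_version(pattern, ua_string):
    """Return the captured major version, or None when the pattern does not match."""
    match = pattern.search(ua_string)
    return match.group(2) if match else None


print(major_version(case_sensitive, ua))
print(major_version(case_sensitive, ua.lower()))
print(major_version(case_insensitive, ua.lower()))
```

The case-sensitive pattern stops matching once the string is lowercased, which is consistent with the 'Other' results in the output above.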
Some user agent strings are causing the user agent parser to be really slow.
import time
from ua_parser import user_agent_parser
ua_str = "Mozilla/5.0 (iPhone; CPU iPhone OS 7_5 like Mac OS X) AppleWebKit/1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890 (KHTML, like Gecko) Version/5.1 Mobile/9334 Safari/7548.320"
s = time.time()
print(user_agent_parser.ParseUserAgent(ua_str))
print('took ', time.time() - s)
AppleWebKit/<400 digits>
The notable change was updating the uap-core git submodule from ce89c7637eeeb07b3464dbf40645bb3973723c0a to fc570f378e41063bad3bdf0532967743efc75b4b
After digging around, I've found that the issue is due to the regex addition in uap-core:
The latest regexes.yaml altered the regex, and that appears to have fixed the slowness issue.
The PR for #70 appears to use the new regex, which helps with the issue.
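The exact uap-core regex involved is not shown here, so the following is only a sketch of one general way a long digit run can slow down a backtracking regex engine. Both patterns are hypothetical. When an unanchored pattern begins with an unbounded `\d+` and the rest of the pattern fails, the engine can retry from every digit position, so the work can grow roughly with the square of the run length. Bounding the quantifier caps the work per starting position.

```python
import re

# Hypothetical patterns for illustration; not the actual uap-core regexes.
UNBOUNDED = re.compile(r"(\d+)\.(\d+) Safari")
BOUNDED = re.compile(r"(\d{1,10})\.(\d{1,10}) Safari")


def find_version(pattern, text):
    """Return the captured groups as a tuple, or None when nothing matches."""
    match = pattern.search(text)
    return match.groups() if match else None


# A 400-digit run with no following dot, similar in shape to the report above.
long_ua = "AppleWebKit/" + "1234567890" * 40 + " (KHTML, like Gecko)"
normal_ua = "Version/5.1 Safari/7548"

print(find_version(UNBOUNDED, normal_ua), find_version(BOUNDED, normal_ua))
print(find_version(UNBOUNDED, long_ua), find_version(BOUNDED, long_ua))
```

Note that the bounded form is not a drop-in replacement: for a number longer than ten digits it would capture only a trailing part of it, so it trades accuracy on unusual input for predictable cost.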
This user agent string fails to parse:
user_agent_parser.Parse('Mozilla/5.0 (Linux; U; Android 4.1.1; en-us; Build/JRO03C) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Safari/534.30')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\ua_parser-0.4.0-py2.7.egg\ua_parser\user_agent_parser.py", line 212, in Parse
'device': ParseDevice(user_agent_string, **jsParseBits),
File "C:\Python27\lib\site-packages\ua_parser-0.4.0-py2.7.egg\ua_parser\user_agent_parser.py", line 291, in ParseDevice
device, brand, model = deviceParser.Parse(user_agent_string)
File "C:\Python27\lib\site-packages\ua_parser-0.4.0-py2.7.egg\ua_parser\user_agent_parser.py", line 187, in Parse
device = match.group(1)
IndexError: no such group
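The traceback suggests that a device regex matched but defined no capture group, so `match.group(1)` raised. As a minimal sketch of a defensive alternative (not the fix applied upstream), a hypothetical helper can check how many groups the compiled pattern defines before reading one. The two patterns below are made up for illustration.

```python
import re


def group_or_none(match, index):
    """Return match.group(index), or None if there is no match or no such group."""
    if match is None or index > match.re.groups:
        return None
    return match.group(index)


# Hypothetical patterns: one without a capture group, one with a single group.
no_groups = re.compile(r"Android")
one_group = re.compile(r"Android (\d+)")

ua = "Mozilla/5.0 (Linux; U; Android 4.1.1; en-us; Build/JRO03C)"

print(group_or_none(no_groups.search(ua), 1))
print(group_or_none(one_group.search(ua), 1))
```

Here `match.re.groups` is the number of capturing groups in the compiled pattern, so the helper never asks for a group that the pattern cannot supply.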
Shouldn't the version number in ua_parser/__init__.py be kept up to date?
Right now the whole API uses CamelCase, which is extremely unpythonic. It would be nice if you could add snake_case alternatives, or even deprecate the old names and eventually remove them.
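One way such a transition could look is sketched below, assuming the snake_case function becomes the real implementation and the CamelCase name stays as a thin wrapper that emits a `DeprecationWarning`. The `parse_user_agent` stand-in and the `deprecated_alias` helper are hypothetical and not part of ua-parser.

```python
import functools
import warnings


def parse_user_agent(user_agent_string):
    """Hypothetical stand-in parser; a real one would return parsed fields."""
    return {"string": user_agent_string}


def deprecated_alias(new_func, old_name):
    """Wrap new_func so that calling it under old_name warns, then delegates."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {new_func.__name__} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_func(*args, **kwargs)

    wrapper.__name__ = old_name
    return wrapper


# The old CamelCase name keeps working but warns on every call.
ParseUserAgent = deprecated_alias(parse_user_agent, "ParseUserAgent")
```

Existing callers keep working while the warning points them to the new name, and the alias can be removed in a later release.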
A fix for detecting Edge (Chromium) as Chrome is in version 0.6.10. Would it be possible to update the submodule and cut a new release please?
Hi,
I was using this library for a project recently, and was really surprised that ParseUserAgent was significantly slower than Parse. After digging a bit into the code, I think this is because ParseUserAgent is not using the internal cache, whereas Parse is.
As both methods are documented in the README, I was expecting both to have similar performance, with ParseUserAgent being a bit faster because it has to parse less. Currently, both methods seem to be recommended for regular use, but ParseUserAgent seems significantly worse than Parse.
In my opinion it might be helpful to document this behavior in the README or change it, because it is currently faster to use Parse(ua_string)['user_agent'] instead of ParseUserAgent.
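Until the library behavior changes, a caller-side cache is one possible workaround. The sketch below uses `functools.lru_cache` around a hypothetical stand-in parser so that it runs without ua-parser installed. With the library available, the stand-in could be swapped for `user_agent_parser.ParseUserAgent`. The cache size of 1024 is an arbitrary assumption.

```python
from functools import lru_cache

# Counts how often the stand-in parser actually runs.
calls = {"count": 0}


def slow_parse_user_agent(user_agent_string):
    """Hypothetical stand-in for an uncached parser call."""
    calls["count"] += 1
    return {"family": "Other", "string": user_agent_string}


@lru_cache(maxsize=1024)
def cached_parse_user_agent(user_agent_string):
    """Parse once per distinct string; repeated strings are served from the cache."""
    return slow_parse_user_agent(user_agent_string)


first = cached_parse_user_agent("ExampleAgent/1.0")
second = cached_parse_user_agent("ExampleAgent/1.0")
print(calls["count"], first is second)
```

Because the cache hands back the same dict object on every hit, callers should treat the result as read-only or copy it before modifying it.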