dlint-py / dlint
Dlint is a tool for encouraging best coding practices and helping ensure Python code is secure.
License: BSD 3-Clause "New" or "Revised" License
defusedxml is not capable of creating Element or SubElement
Whitelisting in bad_xml_use.py
@property
def whitelisted_modules(self):
    return [
        'xml.sax.saxutils',
        'xml.etree.ElementTree.Element',
        'xml.etree.ElementTree.SubElement',
    ]
would help solve this.
A workaround could then be used as follows:
from xml.etree.ElementTree import Element, SubElement
import defusedxml.ElementTree as ET
ET.Element = _ElementType = Element
ET.SubElement = SubElement
Note that we still disallow
from xml.etree.ElementTree import parse
What is the relationship between dlint and bandit?
What I can see:
Could somebody maybe point out reasons to use one or the other? Do you maybe use both together? Is there an overlap between the communities?
Alternation detection currently compares different expression types (e.g. literals, not literals, negate literals, ranges, dots) to ranges. It doesn't check all combinations of aforementioned types. E.g.
$ python -m dlint.redos -p '(a|[a-c])+'
('(a|[a-c])+', True)
$ python -m dlint.redos -p '(a|aa)+'
('(a|aa)+', False)
Though a and aa clearly overlap. We should check all combinations of previously mentioned expression types against each other. This will likely require a refactor to make the comparisons easier.
Some thoughts on a refactor: my initial thoughts are to turn each expression type into a list of ranges it supports, e.g.
a becomes range a through a
[a-zA-Z] becomes range a through z and range A through Z
\w becomes the range of printable characters
We'll probably need to support some kind of negation operation on our ranges to correctly/easily handle not literals (e.g. [^a]) and negate literals (e.g. [^abc]).
After we've created a unified type that can be instantiated from each expression type, then performing comparisons will be much easier. We'll simply have to iterate over the 2-combination of all ranges and see if there's any overlap.
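A minimal sketch of that unified range type, assuming each alternation branch can be reduced to a list of inclusive code point ranges. The helper names here (literal_ranges, class_ranges, alternation_overlaps) are hypothetical and not part of Dlint, and multi-character literals such as aa would still need to be reduced to their leading character first:

```python
from itertools import combinations


def literal_ranges(char):
    """A single literal such as 'a' becomes the range a through a."""
    return [(ord(char), ord(char))]


def class_ranges(*pairs):
    """A class such as [a-zA-Z] becomes one range per (start, end) pair."""
    return [(ord(low), ord(high)) for low, high in pairs]


def ranges_overlap(left, right):
    """True if any range in `left` shares a code point with a range in `right`."""
    return any(
        l_low <= r_high and r_low <= l_high
        for l_low, l_high in left
        for r_low, r_high in right
    )


def alternation_overlaps(branches):
    """Check every 2-combination of branches for overlap."""
    return any(ranges_overlap(a, b) for a, b in combinations(branches, 2))


# (a|[a-c]) overlaps, (a|[b-c]) does not.
overlapping = alternation_overlaps([literal_ranges("a"), class_ranges(("a", "c"))])
disjoint = alternation_overlaps([literal_ranges("a"), class_ranges(("b", "c"))])
```

Negation is left out of this sketch; one option would be to store the complement as a list of ranges over the full code point space so the same overlap check applies.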
In order for catastrophic backtracking to occur there must be a character that forces backtracking to occur. E.g.
>>> re.match('(a+)+', 'a' * 64 + 'c')
<_sre.SRE_Match at 0x7f9106d32430>
>>> re.match('(a+)+b', 'a' * 64 + 'c')
...Spins...
The first expression has nested quantifiers, so Dlint will detect it, but since there's nothing after the nested quantifier to backtrack the match no catastrophic backtracking will occur. Similarly:
$ python -m dlint.redos -p '(a+)+'
('(a+)+', True)
$ python -m dlint.redos -p '(a+)+b'
('(a+)+b', True)
We should avoid detecting the first expression because catastrophic backtracking cannot occur.
Hi,
I have a small improvement suggestion for the contents of the dlint python package distributed on PyPI. This is a minor quality of life suggestion from the end user perspective.
When I install dlint on a fresh Python virtual environment, the installation also comes with the tests package:
Python 3.10.12
miikama$ python3 -m venv env
miikama$ source env/bin/activate
(env) miikama$ ls env/lib/python3.10/site-packages/
_distutils_hack pip pkg_resources setuptools-59.6.0.dist-info
distutils-precedence.pth pip-22.0.2.dist-info setuptools
(env) miikama$ python3 -m pip install dlint
Collecting dlint
Downloading dlint-0.14.1-py3-none-any.whl (77 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 77.7/77.7 KB 1.4 MB/s eta 0:00:00
Collecting flake8>=3.6.0
Downloading flake8-7.0.0-py2.py3-none-any.whl (57 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.6/57.6 KB 6.1 MB/s eta 0:00:00
Collecting pycodestyle<2.12.0,>=2.11.0
Downloading pycodestyle-2.11.1-py2.py3-none-any.whl (31 kB)
Collecting pyflakes<3.3.0,>=3.2.0
Downloading pyflakes-3.2.0-py2.py3-none-any.whl (62 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.7/62.7 KB 7.2 MB/s eta 0:00:00
Collecting mccabe<0.8.0,>=0.7.0
Downloading mccabe-0.7.0-py2.py3-none-any.whl (7.3 kB)
Installing collected packages: pyflakes, pycodestyle, mccabe, flake8, dlint
Successfully installed dlint-0.14.1 flake8-7.0.0 mccabe-0.7.0 pycodestyle-2.11.1 pyflakes-3.2.0
(env) miikama$ ls env/lib/python3.10/site-packages/
__pycache__ flake8 pip-22.0.2.dist-info pyflakes-3.2.0.dist-info
_distutils_hack flake8-7.0.0.dist-info pkg_resources setuptools
distutils-precedence.pth mccabe-0.7.0.dist-info pycodestyle-2.11.1.dist-info setuptools-59.6.0.dist-info
dlint mccabe.py pycodestyle.py **tests**
dlint-0.14.1.dist-info pip pyflakes
(env) miikama$ pip show dlint
Name: dlint
Version: 0.14.1
Summary: Dlint is a tool for encouraging best coding practices and helping ensure Python code is secure.
Home-page: https://github.com/dlint-py/dlint
Author:
Author-email:
License: BSD-3-Clause
Location: env/lib/python3.10/site-packages
Requires: flake8
Required-by:
It would be nice if dlint did not introduce a package tests into the global python distribution site-packages directory. Afterwards, if you run from tests import ..., this package is sometimes picked up by accident.
My suggestion: do not ship the tests folder with the dlint package, so that installing dlint no longer creates the directory env/lib/python3.10/site-packages/tests/.
Have a great day! :)
The following pyyaml calls should be safe:
yaml.load(..., Loader=yaml.SafeLoader)
yaml.load(..., Loader=yaml.CSafeLoader)
I believe this is equivalent to using safe_load, but I've encountered a few false positives in the wild using this code.
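A rough sketch of how a check could skip these calls, using only the standard ast module. The function names are hypothetical and this is not Dlint's actual linter; it only recognizes a Loader keyword argument and a module imported under the name yaml:

```python
import ast

SAFE_LOADERS = {"SafeLoader", "CSafeLoader"}


def is_safe_yaml_load(call):
    """Return True if an ast.Call passes Loader=yaml.SafeLoader or CSafeLoader."""
    for keyword in call.keywords:
        if keyword.arg != "Loader":
            continue
        value = keyword.value
        # Accept both `yaml.SafeLoader` (Attribute) and a bare `SafeLoader` (Name).
        if isinstance(value, ast.Attribute):
            name = value.attr
        else:
            name = getattr(value, "id", None)
        return name in SAFE_LOADERS
    return False


def unsafe_yaml_loads(source):
    """Return line numbers of yaml.load calls without a safe Loader keyword."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "load"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "yaml"
            and not is_safe_yaml_load(node)
        ):
            flagged.append(node.lineno)
    return sorted(flagged)


SAMPLE = """\
import yaml
yaml.load(data)
yaml.load(data, Loader=yaml.SafeLoader)
yaml.load(data, Loader=yaml.CSafeLoader)
yaml.load(data, Loader=yaml.Loader)
"""
flagged = unsafe_yaml_loads(SAMPLE)
```

A loader passed positionally, e.g. yaml.load(data, yaml.SafeLoader), would still be flagged by this sketch.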
Per Preventing Regular Expression Denial of Service (ReDoS), Dlint currently supports alternation and nested quantifiers. Redos can also occur via quantifiers in sequence. Let's add support for detecting this redos avenue to Dlint.
resolve_entities defaults to True, probably is what allows quadratic blowup and local entity expansion attacks (that lxml is not subject to billion laughs I expect means it doesn't do recursive entity expansion), it might be a bit brutal to require it being disabled however it's an option and lxml's FAQ provides a recipe for restricted entity expansion
no_network defaults to True (no network lookup) and protects against external entity expansion and DTD retrieval, disabling it should probably be flagged
huge_tree protects against xml bombs by default, enabling it should probably be flagged
Finally xpath and xslt are a bit more complicated as they're "legit and safe" in the same sense e.g. database APIs are, running a "static" query should be safe (and lxml's xpath API supports parametrisation) but untrusted xpath and xslt injection / untrusted execution has similar issues to sql.
Bad:
input("Password: ")
Good:
getpass.getpass("Password: ")
See getpass.
We can probably just look for input calls that contain the string literal 'password'.
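Something along these lines might be enough as a first pass. It is a sketch built on the standard ast module, the function name is hypothetical, and the case-insensitive match is an assumption:

```python
import ast


def password_input_calls(source):
    """Return line numbers of input(...) calls whose prompt mentions 'password'."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "input"
        ):
            continue
        for arg in node.args:
            if (
                isinstance(arg, ast.Constant)
                and isinstance(arg.value, str)
                and "password" in arg.value.lower()
            ):
                flagged.append(node.lineno)
                break
    return sorted(flagged)


SAMPLE = '''\
name = input("Username: ")
secret = input("Password: ")
'''
flagged = password_input_calls(SAMPLE)
```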
This is a built-in flag that affirms the code is not using md5 for security-related purposes. In my code's case, I'm manually calculating a checksum as required by the AWS S3 API.
Would it be possible to remove the upper bound on flake8 requirement?
https://iscinumpy.dev/post/bound-version-constraints/
If not, can we get that bumped up to flake8<7 please?
DUO108 is outdated in the following ways:
raw_input which is also deprecated in py3
Anything else?
Flake8 5 came out a couple of days ago.
Does anything here need to change other than requirements.txt?
The following expression doesn't ReDoS, but Dlint detects it:
re.search(r'(\n.*)+a', '\n' * 64 + 'b')
However, this expression does ReDoS:
re.search(r'(\n.*)+a', '\n' * 64 + 'b', re.DOTALL)
Fixing this requires a large amount of work for little gain in reducing false positives. The first example doesn't seem very common. We don't currently analyze the flags passed to re functions, so adding this functionality would take considerable work.
Please upgrade dlint to support flake8>=4. There don't appear to be any breaking changes that will affect dlint.
https://flake8.pycqa.org/en/latest/release-notes/4.0.0.html#backwards-incompatible-changes
dlint detects redos issues only if the regex is hardcoded into the place where the re function is called. If the regex is stored in a constant, it doesn't catch it.
For example, this code would be flagged:
text = re.sub(r"""(?i)\b((?:https?:(?:/{1,3}|[a-z0-9%])|[a-z0-9.\-]+[.](?:com|net|org|edu|gov|mil|aero|asia|biz|cat|coop|info|int|jobs|mobi|museum|name|post|pro|tel|travel|xxx|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|cr|cs|cu|cv|cx|cy|cz|dd|de|dj|dk|dm|do|dz|ec|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|ps|pt|pw|py|qa|re|ro|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|Ja|sk|sl|sm|sn|so|sr|ss|st|su|sv|sx|sy|sz|tc|td|tf|tg|th|tj|tk|tl|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw)/)(?:[^\s()<>{}\[\]]+|\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\))+(?:\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\)|[^\s`!()\[\]{};:\'\'.,<>?«»“”‘’])|(?:(?<!@)[a-z0-9]+(?:[.\-][a-z0-9]+)*[.](?:com|net|org|edu|gov|mil|aero|asia|biz|cat|coop|info|int|jobs|mobi|museum|name|post|pro|tel|travel|xxx|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|cr|cs|cu|cv|cx|cy|cz|dd|de|dj|dk|dm|do|dz|ec|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|ps|pt|pw|py|qa|re|ro|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|Ja|sk|sl|sm|sn|so|sr|ss|st|su|sv|sx|sy|sz|tc|td|tf|tg|th|tj|tk|tl|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw)\
b/?(?!@)))""", "", text)
But this would not:
URL_REGEX = r"""(?i)\b((?:https?:(?:/{1,3}|[a-z0-9%])|[a-z0-9.\-]+[.](?:com|net|org|edu|gov|mil|aero|asia|biz|cat|coop|info|int|jobs|mobi|museum|name|post|pro|tel|travel|xxx|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|cr|cs|cu|cv|cx|cy|cz|dd|de|dj|dk|dm|do|dz|ec|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|ps|pt|pw|py|qa|re|ro|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|Ja|sk|sl|sm|sn|so|sr|ss|st|su|sv|sx|sy|sz|tc|td|tf|tg|th|tj|tk|tl|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw)/)(?:[^\s()<>{}\[\]]+|\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\))+(?:\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\)|[^\s`!()\[\]{};:\'\'.,<>?«»“”‘’])|(?:(?<!@)[a-z0-9]+(?:[.\-][a-z0-9]+)*[.](?:com|net|org|edu|gov|mil|aero|asia|biz|cat|coop|info|int|jobs|mobi|museum|name|post|pro|tel|travel|xxx|ac|ad|ae|af|ag|ai|al|am|an|ao|aq|ar|as|at|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|cr|cs|cu|cv|cx|cy|cz|dd|de|dj|dk|dm|do|dz|ec|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gg|gh|gi|gl|gm|gn|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|io|iq|ir|is|it|je|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|mv|mw|mx|my|mz|na|nc|ne|nf|ng|ni|nl|no|np|nr|nu|nz|om|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|ps|pt|pw|py|qa|re|ro|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|Ja|sk|sl|sm|sn|so|sr|ss|st|su|sv|sx|sy|sz|tc|td|tf|tg|th|tj|tk|tl|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw)\b/
?(?!@)))""" # noqa E501
text = re.sub(URL_REGEX, "", text)
In the original code where I discovered this, URL_REGEX was defined in a different module, then imported. It would be great to handle this as well (i.e. the regex is in a variable in the same or in a different module than the re call).
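For the same-module case, one possible approach is to collect simple module-level string assignments first and substitute them when a re call receives a name. This is a sketch with hypothetical helper names, not Dlint's implementation, and it does not follow imports into other modules:

```python
import ast

RE_FUNCTIONS = {
    "compile", "match", "fullmatch", "search", "sub", "subn",
    "split", "findall", "finditer",
}


def module_string_constants(tree):
    """Map NAME -> string for simple module-level `NAME = "literal"` assignments."""
    constants = {}
    for node in tree.body:
        if (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        ):
            constants[node.targets[0].id] = node.value.value
    return constants


def regex_patterns(source):
    """Return (lineno, pattern) for re.* calls given a literal or a known constant."""
    tree = ast.parse(source)
    constants = module_string_constants(tree)
    found = []
    for node in ast.walk(tree):
        if not (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "re"
            and node.func.attr in RE_FUNCTIONS
            and node.args
        ):
            continue
        first = node.args[0]
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            found.append((node.lineno, first.value))
        elif isinstance(first, ast.Name) and first.id in constants:
            # The pattern lives in a module-level constant: resolve it.
            found.append((node.lineno, constants[first.id]))
    return sorted(found)


SAMPLE = '''\
import re
PATTERN = r"(a+)+b"
re.sub(r"(x+)+y", "", text)
re.sub(PATTERN, "", text)
'''
patterns = regex_patterns(SAMPLE)
```

The resolved pattern strings could then be handed to the existing redos detection. Constants that are reassigned or built at runtime are out of scope for this sketch.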
In the BadKwargUseLinter
helper we can observe the following behavior:
def func(a, b='foo'):
    # ...
    return a * b
func(1, b='bar') # Caught with BadKwargUseLinter
func(2, 'bar') # Uncaught with BadKwargUseLinter
That is, kwargs passed as positional arguments will not be caught by BadKwargUseLinter. We could remedy this by supporting an additional piece of configuration in the BadKwargUseLinter.kwargs property. If we took another, optional configuration option like position we could detect this behavior.
This is somewhat of a fringe use-case. It may not be worth it to try and maintain the position of function kwargs since they could easily change out from underneath us.
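To make the idea concrete, here is a sketch of what a position-aware check could look like. The rule dictionary and its keys are hypothetical and do not reflect the real shape of BadKwargUseLinter.kwargs:

```python
import ast

# Hypothetical rule: `position` is the optional addition discussed above.
RULE = {"function": "func", "kwarg": "b", "position": 1, "bad_value": "bar"}


def bad_kwarg_calls(source, rule=RULE):
    """Return line numbers where the kwarg has the bad value, by name or by position."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == rule["function"]
        ):
            continue
        # Values passed by keyword, e.g. func(1, b='bar').
        values = [kw.value for kw in node.keywords if kw.arg == rule["kwarg"]]
        # Values passed positionally, e.g. func(2, 'bar').
        position = rule.get("position")
        if position is not None and len(node.args) > position:
            values.append(node.args[position])
        if any(
            isinstance(value, ast.Constant) and value.value == rule["bad_value"]
            for value in values
        ):
            flagged.append(node.lineno)
    return sorted(flagged)


SAMPLE = '''\
func(1, b='bar')
func(2, 'bar')
func(3, 'ok')
'''
flagged = bad_kwarg_calls(SAMPLE)
```

Calls that use *args unpacking would shift positions and are not handled here, which is part of why maintaining positions may not be worth it.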
Installing the package using pip with the --no-binary flag fails, because no sdist is available on PyPI:
$ python -m pip install dlint --no-binary :all:
ERROR: Could not find a version that satisfies the requirement dlint (from versions: none)
ERROR: No matching distribution found for dlint
TODO: find all examples of insecure usage
Python 3.6 was deprecated 8 months ago so it would be good to drop it. Locations to update:
sys.version_info that needs to be updated accordingly
Malformed expressions currently cause Dlint to raise an exception:
$ pipenv run python -m dlint.redos -p '(foo'
Traceback (most recent call last):
...
sre_constants.error: missing ), unterminated subpattern at position 0
Since we'll be running Dlint across many files and don't want it to crash we should handle this error gracefully. We can simply ignore malformed expressions.
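A minimal sketch of ignoring malformed expressions. It uses re.compile as a stand-in for whatever parsing entry point Dlint uses internally (the traceback above is elided, so that part is an assumption), and parse_or_skip is a hypothetical name:

```python
import re


def parse_or_skip(pattern):
    """Return the compiled pattern, or None if the expression is malformed."""
    try:
        return re.compile(pattern)
    except re.error:
        # Malformed expressions cannot be analyzed, so skip them instead of crashing.
        return None


malformed = parse_or_skip("(foo")
valid = parse_or_skip("(foo)")
```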
Under python 3.11 and flake8 6.0 with dlint 0.14.0, pkgutil.iter_modules is called via get_plugin_linter_classes once per module. On my Mac it takes over 100ms:
In [2]: %timeit list(pkgutil.iter_modules())
122 ms ± 4.51 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
If running flake8 over several hundred files this adds up to tens of seconds of extra wait time.
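One option would be to do the scan once per process and reuse the result. A sketch, where cached_module_names is a hypothetical helper and not the existing get_plugin_linter_classes:

```python
import functools
import pkgutil


@functools.lru_cache(maxsize=None)
def cached_module_names():
    """Scan importable modules once per process and reuse the result."""
    return tuple(module.name for module in pkgutil.iter_modules())


first = cached_module_names()
second = cached_module_names()  # Served from the cache, no second scan.
```

This assumes the set of installed modules does not change during a single flake8 run.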
Similar to these community actions: https://github.com/sdras/awesome-actions
We should build an action for Dlint. Bonus points for running Dlint against Dlint :)
Currently, when running the flake8 plugin, we run every linter against the AST that flake8 passes us and flake8 does rule filtering of results based on what's enabled. So we run all linters despite what's set via --select or --ignore. Instead we should only run linters if they're enabled. This will speed up Dlint's run time.
From a brief perusal of the codebase, it looks like there's still some code designed for python2 compatibility (such as unnecessary __future__ imports). It would be good to remove this, to avoid confusion for contributors.
The Python Authlib library provides various authentication functionality. There are some potential insecurities surrounding its crypto usage. These include:
JsonWebEncryption(algorithms=['RSA1_5'])
JsonWebSignature(algorithms=['RS256'])
JsonWebSignature(algorithms=['RS384'])
JsonWebSignature(algorithms=['RS512'])
These algorithms make use of RSA PKCSv1.5 which has some security considerations. Let's write a linter to check for this usage.
functools.lru_cache was recently added to dlint.namespace which provided a great speed up. We should output functools.lru_cache.cache_info information when benchmarking.
We should be able to:
linter.namespace.illegal_module_imported.cache_info
linter.namespace.name_imported.cache_info
asname_to_name as well
Outputting cache_info will tell us if we're efficiently caching information, and can allow for greater profiling and speed ups.
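As an illustration of what the benchmark output could contain, here is a sketch that summarizes cache_info for any number of lru_cache-wrapped functions. The name_imported function below is a stand-in with an illustrative body, not the real dlint.namespace method:

```python
import functools


@functools.lru_cache(maxsize=None)
def name_imported(name):
    # Stand-in for a cached namespace lookup; the body is illustrative only.
    return name.startswith("os")


def report_cache_info(*cached_functions):
    """Return one summary line per lru_cache-wrapped function."""
    lines = []
    for func in cached_functions:
        info = func.cache_info()
        total = info.hits + info.misses
        ratio = info.hits / total if total else 0.0
        lines.append(
            f"{func.__name__}: hits={info.hits} misses={info.misses} hit_ratio={ratio:.2f}"
        )
    return lines


for name in ["os.path", "os.path", "sys", "os.path"]:
    name_imported(name)

report = report_cache_info(name_imported)
```

A low hit ratio in the benchmark output would suggest the cache key is too specific to be reused.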
E.g. Django tests/admin_views/tests.py
Dlint:
python -m flake8 --select=DUO --isolated django/tests/admin_views/tests.py
...
real 0m27.674s
user 0m27.638s
sys 0m0.037s
Regular Flake8:
python -m flake8 --isolated django/tests/admin_views/tests.py
...
real 0m1.715s
user 0m1.679s
sys 0m0.036s
Per the OWASP API Security Top 10, broken function level authorization is a big security concern. Adding a linter to detect this would be very useful. Most Python web application frameworks use decorators on function-level API routes (e.g. rest_framework.decorators.api_view in Django REST framework, flask_login.login_required in Flask-Login).
One way I can envision implementing this would be looking for decorator anomalies in Python files that look like they contain API routes. E.g.
@api.route("/users")
@login_required
def users(request):
    ...
@api.route("/groups")
@login_required
def groups(request):
    ...
@api.route("/settings")
def settings(request):
    # Oops, did we forget @login_required?
    ...
@api.route("/jobs")
@login_required
def jobs(request):
    ...
If XX% of API routes in a file are missing what looks like an authentication decorator, we can flag the function missing the decorator. Another common one for authorization might look something like:
@app.route("/users", roles=[User.Admin])
def users(request):
    ...
@api.route("/groups", roles=[User.Admin])
def groups(request):
    ...
@api.route("/settings", roles=[User.Regular])
def settings(request):
    # Oops, can all users access this sensitive endpoint?
    ...
This may seem trivial, but it gets more difficult as you have many different authentication methods, authorization schemes, and user roles.
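A sketch of the decorator-anomaly idea for the first example, using the standard ast module. The helper names, the ".route" suffix heuristic, the auth_decorators default, and the 0.5 threshold are all assumptions for illustration:

```python
import ast


def decorator_name(node):
    """Best-effort dotted name for a decorator, ignoring call arguments."""
    if isinstance(node, ast.Call):
        node = node.func
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
    return ".".join(reversed(parts))


def routes_missing_auth(source, auth_decorators=("login_required",), threshold=0.5):
    """Flag route functions without an auth decorator when most routes have one."""
    routes = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        names = [decorator_name(decorator) for decorator in node.decorator_list]
        if any(name.endswith(".route") for name in names):
            has_auth = any(name in auth_decorators for name in names)
            routes.append((node.name, has_auth))
    protected = sum(1 for _, has_auth in routes if has_auth)
    # Only flag outliers when most routes in the file are protected.
    if not routes or protected / len(routes) <= threshold:
        return []
    return sorted(name for name, has_auth in routes if not has_auth)


SAMPLE = '''\
@api.route("/users")
@login_required
def users(request):
    ...

@api.route("/groups")
@login_required
def groups(request):
    ...

@api.route("/settings")
def settings(request):
    ...

@api.route("/jobs")
@login_required
def jobs(request):
    ...
'''
missing = routes_missing_auth(SAMPLE)
```

The roles-based example would need a separate comparison of keyword arguments across routes, which this sketch does not attempt.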
This will probably involve some of the following:
login_required).
There's also some low-hanging fruit here, like just searching for existing "security-off" switches for web framework routes, like:
django.views.decorators.csrf.csrf_exempt
flask_wtf.csrf.exempt
rest_framework.permissions.AllowAny or permission_classes = []
Hi!
Section Correct code at https://github.com/dlint-py/dlint/blob/master/docs/linters/DUO121.md#correct-code seems overly complicated to me. Could you elaborate why tempfile.mkstemp is being avoided?
Thanks and best, Sebastian
Redos alternation detection currently supports dot (e.g. (.|...)), literals (e.g. (abc|...)), not literals (e.g. ([^a]|...)), negate literals (e.g. ([^abc]|...)), and ranges (e.g. ([a-z]|...)). Let's add support for category alternation detection (e.g. (\w|...)).
FYI: os.EX_OK isn't available on Windows and fails when running flake8 --print-dlint-linters
Not sure what the appropriate substitute is, but I'd be happy to test any changes
Line 46 in 828a156
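One possible substitute, as a sketch: fall back to 0, the conventional success status, when os.EX_OK is missing. EXIT_OK is a hypothetical name:

```python
import os

# os.EX_OK only exists on Unix; 0 is the conventional success status elsewhere.
EXIT_OK = getattr(os, "EX_OK", 0)
```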
Many of Dlint's checks looking for code execution bugs due to user input aren't insecure if their argument is a constant string. E.g. eval("2+2"). There are a few linters where this is the case:
$ rg 'constant string'
docs/linters/DUO120.md
36:* Code may be safe if data passed to `marshal` is a constant string
docs/linters/DUO119.md
36:* Code may be safe if data passed to `shelve` is a constant string
docs/linters/DUO106.md
41:* Code may be safe if data passed to `os.system` is a constant string
docs/linters/DUO103.md
36:* Code may be safe if data passed to `pickle` is a constant string
docs/linters/DUO104.md
35:* Code may be safe if data passed to `eval` is a constant string
docs/linters/DUO105.md
42:* Code may be safe if data passed to `exec` is a constant string
docs/linters/DUO110.md
43:* Code may be safe if data passed to `compile` is a constant string and limits data size
We should consider adding logic to these linters to prevent false positives when the argument is a constant string.
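A sketch of the constant-string exemption for eval, using the standard ast module. The function names are hypothetical and the same test could in principle be reused for the other linters listed above:

```python
import ast


def is_constant_string(node):
    """True for plain string or bytes literals (f-strings are not constants)."""
    return isinstance(node, ast.Constant) and isinstance(node.value, (str, bytes))


def risky_eval_calls(source):
    """Return line numbers of eval calls whose first argument is not a literal."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "eval"
            and not (node.args and is_constant_string(node.args[0]))
        ):
            flagged.append(node.lineno)
    return sorted(flagged)


SAMPLE = '''\
eval("2+2")
eval(user_input)
'''
flagged = risky_eval_calls(SAMPLE)
```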
DUO101 is Python 3.3 only, and these versions are no longer supported by dlint.
Other than removing the md file in docs and the linter itself plus its tests, what are the other deprecation procedures?
Most file IO should be using with statements to automatically close files, but there still may be instances of manual open and close calls (or lack thereof). See Reading and Writing Files.
There are lots of ways a file can be opened:
open
os.open
io.open (alias for builtin open)
tempfile.TemporaryFile|NamedTemporaryFile|SpooledTemporaryFile
tarfile.open
ZipFile.open
My first idea for an implementation is tracking variable instantiation of the above methods, then checking for a lack of a close call in the same scope.
Also would be good to check for a lack of closing a connection after opening one (e.g. in SQLAlchemy). There's probably lots of opportunity for this in DB connection libraries.
It's happening... To continue supporting only officially supported versions of Python we will be removing 2.7 support in January 2020. This will allow the project to remove 2-3 compatibility code (e.g. __future__ imports, etc).
Before dropping 2.7 support there will be one final release of Dlint that can be used with 2.7. This means future enhancements, bug fixes, etc will not be released as 2.7-compatible. Because we haven't released a major version of Dlint yet we can simply bump the minor version (if we'd already released a major version this would probably warrant a semver major version bump).
The xmltodict library is a widely used XML parsing module. We should check for insecure use of this library. A couple checks come to mind:
disable_entities kwarg.
defusedexpat is installed. The library checks for this like so:
try:
    from defusedexpat import pyexpat as expat
except ImportError:
    from xml.parsers import expat
This means if it's not installed then the library is wide open to various XML attacks similar to those prevented in defusedxml. Further, the defusedexpat library itself appears to be unmaintained, so there may be some insecurities we could search for there as well.
Per the Python documentation:
Changed in version 3.7.1: The SAX parser no longer processes general external entities by default to increase security. Before, the parser created network connections to fetch remote files or loaded local files from the file system for DTD and entities. The feature can be enabled again with method setFeature() on the parser object and argument feature_external_ges.
We should look for explicit enabling of the following features:
Enabling these features allows for XML XXE including DTD retrieval. We should detect usage of these features.
"A group that contains a token with a quantifier must not have a quantifier of its own unless the quantified token inside the group can only be matched with something else that is mutually exclusive with it." (Nested Quantifiers)
Dlint does not currently eliminate safe regular expressions that have nested quantifiers but they're mutually exclusive. Consider the example from the above link:
$ python -m dlint.redos -p '(x\w{1,10})+y'
('(x\\w{1,10})+y', True)
Dlint finds the nested quantifier. But it flags the corrected code as well:
$ python -m dlint.redos -p '(x[a-wyz0-9_]{1,10})+y'
('(x[a-wyz0-9_]{1,10})+y', True)
This example is okay because there's no character overlap inside the nested quantifier. We should fix this false positive.
Per the Python 3.8 release notes:
ast classes Num, Str, Bytes, NameConstant and Ellipsis are considered deprecated and will be removed in future Python versions. Constant should be used instead. (Contributed by Serhiy Storchaka in bpo-32892.)
ast.NodeVisitor methods visit_Num(), visit_Str(), visit_Bytes(), visit_NameConstant() and visit_Ellipsis() are deprecated now and will not be called in future Python versions. Add the visit_Constant() method to handle all constant nodes. (Contributed by Serhiy Storchaka in bpo-36917.)
This isn't immediately necessary, but it's worth tracking. It's not clear when this change will happen, so we don't need a definite timeline. If we can wait until Python 2.7 support is dropped (#17) then this change will be a bit easier since many of these classes/methods are used in 2.7. For example, Dlint makes use of ast.Str and ast.NameConstant.
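For reference, a small sketch of what the migration looks like on Python 3.8+, where a single visit_Constant method covers what visit_Str and visit_NameConstant handled before. ConstantCollector is a hypothetical class for illustration:

```python
import ast


class ConstantCollector(ast.NodeVisitor):
    """Collect string literals and None/True/False via visit_Constant."""

    def __init__(self):
        self.strings = []
        self.name_constants = []

    def visit_Constant(self, node):
        # Replaces visit_Str: string literals arrive as ast.Constant.
        if isinstance(node.value, str):
            self.strings.append(node.value)
        # Replaces visit_NameConstant: True, False and None.
        elif node.value is None or isinstance(node.value, bool):
            self.name_constants.append(node.value)


collector = ConstantCollector()
collector.visit(ast.parse("x = 'abc'\ny = None\nz = True\nn = 1\n"))
```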
I.e. Context managers.
The BadNameAttributeUseLinter helper currently supports normal variable instantiation of an object, e.g.
def func():
    foo = Foo()
    bar = foo.bad_function() # Caught by BadNameAttributeUseLinter
    print(bar)
However, the helper does not support instantiation via the with statement, e.g.
def func():
    with Foo() as foo:
        bar = foo.bad_function() # Not caught by BadNameAttributeUseLinter, yet
        print(bar)
This pattern is not as common as normal variable instantiation, however, it is worth detecting. One of our initial reasons for adding this helper was to catch insecure behavior in tarfile and zipfile, and both of these libraries have a common pattern of instantiation via the with statement.
Let's add context manager variable instantiation support to BadNameAttributeUseLinter.
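A sketch of how the name tracking could treat both forms the same way, using the standard ast module. The instantiated_names function is hypothetical and only handles direct calls such as Foo(), not dotted names like tarfile.open:

```python
import ast


def instantiated_names(source, class_name):
    """Names bound to `class_name()` by assignment or by a with statement."""

    def is_instantiation(node):
        return (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == class_name
        )

    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign) and is_instantiation(node.value):
            # foo = Foo()
            names.update(t.id for t in node.targets if isinstance(t, ast.Name))
        elif isinstance(node, (ast.With, ast.AsyncWith)):
            # with Foo() as foo:
            for item in node.items:
                if is_instantiation(item.context_expr) and isinstance(
                    item.optional_vars, ast.Name
                ):
                    names.add(item.optional_vars.id)
    return names


SAMPLE = '''\
def func():
    foo = Foo()
    with Foo() as bar:
        bar.bad_function()
'''
names = instantiated_names(SAMPLE, "Foo")
```

Once both bindings end up in the same set of tracked names, the existing attribute-use check could run unchanged over either form.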
Flake8 allows for custom formatters: Developing a Formatting Plugin for Flake8.
I think there's value in having an output mode where things are very dense, and there's one finding per line. This is the current formatter for flake8. It'd also be great if there was a verbose, multi-line mode that included additional information. E.g. the physical line that fired the rule and/or some amount of surrounding lines, a link to the rule's documentation, suggestions for a fix, etc.
I'm envisioning that we develop a flake8 custom formatter that keys off a rule's code and "pretty prints" some or all of the above information to the output buffer.
TODO: find all insecure example usage