asottile / classify-imports Goto Github PK
View Code? Open in Web Editor NEWUtilities for refactoring imports in python-like syntax.
License: MIT License
Utilities for refactoring imports in python-like syntax.
License: MIT License
There is a special case logic gate in classify.py:
# special case pypy3 bug(?)
elif not os.path.exists(spec.origin):
return True, '(builtin)', True
If you are using a zipped third-party package, then this will falsely identify it as part of the standard library.
Here's a code snippet showing the value of spec.origin
in such a case:
Python 3.7.5 (default, Oct 31 2019, 15:18:51) [MSC v.1916 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.9.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import importlib.util
In [2]: spec = importlib.util.find_spec('py4j')
In [3]: spec
Out[3]: ModuleSpec(name='py4j', loader=<zipimporter object "C:\DevApps\spark\spark-2.4.4-bin\python\lib\py4j-0.10.7-src.zip">, origin='C:\\DevApps\\spark\\spark-2.4.4-bin\\python\\lib\\py4j-0.10.7-src.zip\\py4j\\__init__.py', submodule_search_locations=['C:\\DevApps\\spark\\spark-2.4.4-bin\\python\\lib\\py4j-0.10.7-src.zip\\py4j'])
In [4]: spec.origin
Out[4]: 'C:\\DevApps\\spark\\spark-2.4.4-bin\\python\\lib\\py4j-0.10.7-src.zip\\py4j\\__init__.py'
That spec.origin
doesn't exist on the filesystem, so os.path.exists(spec.origin)
is False
.
I realize we're using %PythonPath%
which isn't really supported, but I do believe this would also be an issue in other scenarios. (Sourcing py4j
from a zip included with a Spark distribution is actually a fairly common example I believe.)
Would it be reasonable to get another logic gate in there to identify the zipped library scenario? I'd be happy to try and make the addition if you like.
Thanks for the great tool. We're fine with mixing our third-party and first-party imports due to using %PythonPath%
. We really appreciate the black
compatibility (and the focus on clean diffs).
I am using this, and I noticed the relative import sorted incorrect. Here is the code
import pprint
from aspy.refactor_imports.import_obj import FromImport
objs = sorted([
FromImport.from_str('from a import foo'),
FromImport.from_str('from a.b import baz'),
FromImport.from_str('from a import bar'),
FromImport.from_str('from a import bar as buz'),
FromImport.from_str('from a import bar as baz'),
FromImport.from_str('from .a import bar as baz'),
FromImport.from_str('from . import bar as baz'),
FromImport.from_str('from .aa.bb import bar as baz'),
])
pprint.pprint([i.to_text() for i in objs])
Excepted:
[u'from a import bar\n',
u'from a import bar as baz\n',
u'from a import bar as buz\n',
u'from a import foo\n',
u'from a.b import baz\n',
u'from .aa.bb import bar as baz\n',
u'from .a import bar as baz\n',
u'from . import bar as baz\n',
]
Got:
[u'from . import bar as baz\n',
u'from .a import bar as baz\n',
u'from .aa.bb import bar as baz\n',
u'from a import bar\n',
u'from a import bar as baz\n',
u'from a import bar as buz\n',
u'from a import foo\n',
u'from a.b import baz\n']
I'm using python3.8.
from aspy.refactor_imports.classify import classify_import
print(classify_import("distutils"))
Expected:
BUILTIN
Actual:
THIRD_PARTY
Upon a closer look, I see that you specifically force distutils
to be classified as THIRD_PARTY
though I don't understand the reasoning given it still seems to be part of the standard library or at least seems to be documented as such. Also, I don't see functionality such as distutils.dir_util.copy_tree
having moved into setuptools
.
# force distutils to be "third party" after being gobbled by setuptools
elif base == 'distutils':
return ImportType.THIRD_PARTY
something something case insensitive filesystem something
On Windows, under certain circumstances, an application can be misclassified as a builtin, due to case-sensitive path comparison on a case-insensitive filesystem.
The function _module_path_is_local_and_is_not_symlinked
in the module aspy.refactor_imports.classify
uses str.startswith
to check if the module path is located within an application directory. On a case-insensitive filesystem, this may produce a false negative if the representation of the paths differs in case.
The same function compares the absolute path of the module to its real path to check for the absence of symlinks, using string equality. This check will also produce a false negative if the paths differ in case.
As a consequence of this bug, reorder-python-imports
can sort imports differently on Windows than on other platforms, moving an application import to the top of the imports.
Reproduction
I create a repository to reproduce the issue here. You can see the outcomes for different operating systems in the GitHub Actions run here. The job marked windows-latest - requirements.txt
uses the latest release of aspy.refactor_imports, the job marked windows-latest - requirements-fix.txt
uses a patched version, see the associated PR #53 for this issue.
I was able to reproduce the behavior on the following platforms:
Here's how to reproduce the issue from scratch:
src
layout.python -m venv venv
inside the packagepip install -e .
reorder-python-imports
on the test module.Some further details:
The _get_module_info
helper uses importlib.util.find_spec
to determine the module path. The representation of this path is determined by the matching entry in sys.path
, which is lower-case in the repro example.
I get the following assertion error while running tox -e linting
on the pytest repo locally, which runs pre-commit, which runs reorder-python-imports:
Reorder python imports...................................................Failed
- hook id: reorder-python-imports
- exit code: 1
Traceback (most recent call last):
File "/home/ran/.cache/pre-commit/repoye77qfmi/py_env-python3.10/bin/reorder-python-imports", line 8, in <module>
sys.exit(main())
File "/home/ran/.cache/pre-commit/repoye77qfmi/py_env-python3.10/lib/python3.10/site-packages/reorder_python_imports.py", line 907, in main
retv |= _fix_file(filename, args)
File "/home/ran/.cache/pre-commit/repoye77qfmi/py_env-python3.10/lib/python3.10/site-packages/reorder_python_imports.py", line 475, in _fix_file
new_contents = fix_file_contents(
File "/home/ran/.cache/pre-commit/repoye77qfmi/py_env-python3.10/lib/python3.10/site-packages/reorder_python_imports.py", line 456, in fix_file_contents
partitioned = step(partitioned)
File "/home/ran/.cache/pre-commit/repoye77qfmi/py_env-python3.10/lib/python3.10/site-packages/reorder_python_imports.py", line 366, in apply_import_sorting
sorted_blocks = sort(import_obj_to_partition.keys(), **sort_kwargs)
File "/home/ran/.cache/pre-commit/repoye77qfmi/py_env-python3.10/lib/python3.10/site-packages/aspy/refactor_imports/sort.py", line 98, in sort
imports_partitioned[classify_func(import_obj)].append(import_obj)
File "/home/ran/.cache/pre-commit/repoye77qfmi/py_env-python3.10/lib/python3.10/site-packages/aspy/refactor_imports/sort.py", line 72, in classify_func
tp = classify_import(obj.import_statement.module, **kwargs)
File "/home/ran/.cache/pre-commit/repoye77qfmi/py_env-python3.10/lib/python3.10/site-packages/aspy/refactor_imports/classify.py", line 135, in classify_import
found, module_path, is_builtin = _get_module_info(
File "/home/ran/.cache/pre-commit/repoye77qfmi/py_env-python3.10/lib/python3.10/site-packages/aspy/refactor_imports/classify.py", line 103, in _get_module_info
assert spec.submodule_search_locations is not None
AssertionError
reorder_python_imports
is configured to run like this:
- repo: https://github.com/asottile/reorder_python_imports
rev: v2.6.0
hooks:
- id: reorder-python-imports
args: ['--application-directories=.:src', --py37-plus]
This is on Python 3.10. I will try to add more details later if needed.
unclassifiable_application_modules
Under Python 3.10+, the system builtins are checked for the import before anything else.
Under Python <3.10, unclassifiable_application_modules
is checked before anything else.
As a result, if a builtin module is specified in unclassifiable_application_modules
, it will be reported as an application module under Python <3.10 and a builtin under Python 3.10+.
I'm not sure what the "correct" behavior for these are, but it does seem that both should be consistent among Python versions.
application_directories
Furthermore, the difference in find_stdlib
under Python <3.10 is different from the base in sys.stdlib_module_names
check under 3.10+. If the former is configured with an empty set of application_directories
, it will identify an arbitrary module in the current directory as a builtin, while sys.stdlib_module_names
does not. Here the behavior seems more clearly correct under Python 3.10+.
Noticed this here: Yelp/beans@b4eabf0
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.