fpgmaas / deptry

Find unused, missing and transitive dependencies in a Python project.

Home Page: https://deptry.com/

License: MIT License

Makefile 0.52% Jupyter Notebook 2.65% Python 89.85% Rust 6.99%
dependencies poetry python cicd pep621 rust

deptry's Introduction

Hi there 👋

Hi, I'm Florian, a freelance data/ML engineer. I live in The Hague, the Netherlands.

Find me online

fpgmaas.com, Stack Overflow, LinkedIn

deptry's People

Contributors

akeeman, baggiponte, edgarrmondragon, fpgmaas, joao-vitor-souza, mbyrnepr2, mkniewallner, renovate[bot], rrauenza

deptry's Issues

Add option to only include certain directories, instead of excluding many.

Is your feature request related to a problem? Please describe.

Right now, by default every directory is scanned, and directories can be ignored with --ignore-directories. This can be cumbersome when there are many directories with py files.

Describe the solution you would like

It would be nice to give the user the option to instead use an argument to include some directories, and have all others ignored by default.

Simplify repetition of command line arguments

Is your feature request related to a problem? Please describe.

Currently, to ignore multiple dependencies in the same category, or to exclude multiple directories, the flag has to be reused, e.g.

deptry . --exclude foo --exclude bar --exclude foobar

This can be shortened to

deptry . --e foo --e bar --e foobar

At the cost of clarity.

Describe the solution you would like

Similar to black and flake8, allow the passing of a list of comma separated values:

deptry . --exclude foo,bar,foobar

This is both short and clear.
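A minimal sketch of how repeated flags and comma-separated values could be merged into a single list. The helper name split_csv_values is hypothetical and not part of deptry; it only illustrates the parsing step.

```python
from typing import Iterable, List


def split_csv_values(values: Iterable[str]) -> List[str]:
    """Flatten repeated and comma-separated option values into one list."""
    result: List[str] = []
    for value in values:
        # Each occurrence of the flag may itself hold several comma-separated items.
        for item in value.split(","):
            item = item.strip()
            if item:  # ignore empty items such as a trailing comma
                result.append(item)
    return result


# Both invocation styles produce the same list of names.
repeated = split_csv_values(["foo", "bar", "foobar"])
combined = split_csv_values(["foo,bar,foobar"])
```

Accepting both forms keeps existing invocations working while allowing the shorter one.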

Add the option to check the current version

Is your feature request related to a problem? Please describe.

Because deptry has many released versions, users need to confirm the current version number frequently.

Describe the solution you would like

An option to check the current version, such as deptry --version, would be helpful.

Additional context

At this time, we need to check the current version in poetry.lock or a similar file.

Alternate file encodings throw UnicodeDecodeError

Describe the bug

Python files that declare an alternate encoding throw a UnicodeDecodeError:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position n: invalid continuation byte

To Reproduce

Steps to reproduce the behavior:

  1. For a file under consideration, add a declaration like # -*- coding: iso-8859-15 -*-
  2. Add a line such as my_string = 'é'
  3. Run deptry

Expected behavior

These files should be parsed correctly.
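One possible approach, sketched here under the assumption that files are currently read as UTF-8, is to let the standard library detect the declared encoding. tokenize.open honors a PEP 263 coding declaration and falls back to UTF-8. The function name read_python_source is hypothetical.

```python
import ast
import tempfile
import tokenize
from pathlib import Path


def read_python_source(path: Path) -> str:
    """Read a Python file using the encoding it declares, defaulting to UTF-8."""
    with tokenize.open(path) as handle:
        return handle.read()


# Reproduce the report: an ISO-8859-15 encoded file containing the byte 0xe9.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "latin9_module.py"
    sample.write_bytes(b"# -*- coding: iso-8859-15 -*-\nimport os\nmy_string = '\xe9'\n")
    source = read_python_source(sample)  # decodes without raising UnicodeDecodeError
    tree = ast.parse(source)  # the decoded text parses as usual
```

Reading the same file with an explicit encoding="utf-8" would raise the UnicodeDecodeError shown above.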

Have the --exclude (and potentially --include when it's added) arguments work with regex.

Is your feature request related to a problem? Please describe.

Currently, the --exclude argument only checks for directories and files that start with elements of the list that is provided.

Describe the solution you would like

It would be great to have this also work with pattern matching. e.g. the user can provide --exclude test* and all directories that match that pattern will be excluded.
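A minimal sketch of glob-style matching, assuming a pattern such as test* should apply to any single path component. The function name is_excluded is hypothetical, and full regular expressions (as the title suggests) would use the re module instead of fnmatch.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Iterable


def is_excluded(path: str, patterns: Iterable[str]) -> bool:
    """Return True if any component of `path` matches one of the glob patterns."""
    parts = PurePosixPath(path).parts
    patterns = list(patterns)
    # fnmatchcase gives the same case-sensitive result on every platform.
    return any(fnmatchcase(part, pattern) for part in parts for pattern in patterns)


candidates = ["tests/test_cli.py", "src/main.py", "src/testing/util.py", "src/contest.py"]
kept = [path for path in candidates if not is_excluded(path, ["test*"])]
```

Matching per component means test* excludes tests/ and src/testing/ but not src/contest.py.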

Additional context

Similar for --include, if it were to be added (#49)

Should we add support for other dependency specifications than pyproject.toml, such as requirements.txt?

deptry detects issues with dependencies, such as obsolete, missing or transitive dependencies. Right now, it only supports dependencies and development dependencies specified in pyproject.toml.

It would be relatively straightforward to make deptry work with other file formats than pyproject.toml, such as requirements.txt. The only poetry-specific element right now is the parsing of dependencies from pyproject.toml. It should be relatively simple to also support reading from e.g. requirements.txt.

I already created a first working version with for example a method to identify which dependency management file-format is used, and a method to extract dependencies from requirements.txt. This proves that the changes required to support requirements.txt are minimal.
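As an illustration only, the extraction step for requirements.txt might look like the sketch below. It is not the working version mentioned above; the function name parse_requirement_names is hypothetical, and the sketch deliberately skips option lines (such as -r other.txt) and URL requirements.

```python
import re
from typing import List

# A project name starts with a letter or digit and may contain ., _ and -.
_NAME = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)")


def parse_requirement_names(text: str) -> List[str]:
    """Extract bare package names from the contents of a requirements file."""
    names: List[str] = []
    for raw_line in text.splitlines():
        # A comment starts with # at the beginning of a line or after whitespace.
        line = re.sub(r"(^|\s)#.*$", "", raw_line).strip()
        if not line or line.startswith("-") or "://" in line:
            continue  # blank line, pip option, or URL requirement
        match = _NAME.match(line)
        if match:
            names.append(match.group(1))
    return names


example = """\
requests>=2.0
# a comment
-r other.txt
scikit-learn==1.0  # pinned
uvicorn[standard]~=0.18
"""
names = parse_requirement_names(example)
```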


This raises the question; should a utility like this want to support other tools than poetry and pyproject.toml? The initial changes required might seem simple, but it might lead to a lot of unforeseen bugs and increased effort to support the other formats and tools. With the many methods there are to specify project dependencies (environment.yml, setup.py, requirements.txt, pyproject.toml) would it not be better to specialize in one area, rather than trying to generalize to all?

High priority: Add support for python 3.7

Currently, only python 3.8, 3.9, and 3.10 are supported. Supporting Python 3.7 in the current setup does not work, due to this error:

So, because deptry depends on both flake8 (^4.0.1) and mkdocs (^1.3.0), version solving failed.

Probably mkdocs can be moved to a separate dependency group. This will require changes in the github actions workflow.

Revise the unit testing framework

The current unit tests depend on a lot of external dependencies. For example, some unit tests depend on the package matplotlib having mpl_toolkits in top_level.txt.

This is bad practice. It would be a lot better not to have these external dependencies in our unit tests, e.g. by creating mock '.venv' directories.

development dependencies that should be regular dependencies marked as transitive dependencies

Describe the bug

Transitive dependencies now catch multiple issues:

  • Transitive dependencies, as intended. Package A depends on B, and B is imported, but only A is declared as a dependency.
  • Simple user misconfiguration: files that should be run with dev dependencies are scanned. This can happen e.g. when the user overrides the ignore_directories configuration, but does not add tests to the list.
  • A dependency that is wrongly marked as a development dependency (i.e. it is imported within the codebase).

Expected behavior

We need to either catch these separate issues, or improve logging to reflect the above, so the user knows what to do. It would be relatively simple to also read the dev requirements and warn the user if the above happens.

cachecontrol is incorrectly marked as obsolete

When running deptry check . on poetry itself, cachecontrol is marked as an obsolete dependency. However, the metadata for cachecontrol is found, and its name is CacheControl. When installing it according to the documentation, it's also spelled CacheControl. In pyproject.toml however, it's spelled as cachecontrol.

So the casing causes the dependency to be marked as obsolete (CacheControl is used to import from, however cachecontrol is the dependency).

This seems strange, since both

poetry add cachecontrol 
poetry add CacheControl 

create a record in pyproject.toml that specifies CacheControl as the dependency. So this seems a flaw with poetry's pyproject.toml.

Potential solution; convert everything to lowercase?
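A sketch of the lowercase idea, using the name normalization rule from PEP 503 (lowercase, with runs of -, _ and . collapsed to a single -). Whether deptry should compare names this way is an assumption; the function name canonicalize_name is chosen for illustration.

```python
import re


def canonicalize_name(name: str) -> str:
    """Normalize a distribution name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


# The metadata name and the pyproject.toml spelling compare equal once normalized.
same_package = canonicalize_name("CacheControl") == canonicalize_name("cachecontrol")
```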

Add support for .ipynb files

Currently, deptry only scans for imports in .py files. Since .ipynb files are used in projects quite often, it would be good to also scan those for imports. The best option would probably be to convert them to .py files in a temporary directory, and then simply add those to the list of .py files to be scanned.
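A rough sketch of the conversion step, assuming the notebook follows the usual JSON layout with a top-level cells list. It returns the code as a string rather than writing a temporary .py file, and the function name notebook_code is hypothetical. IPython magics and shell escapes are dropped because they are not valid Python.

```python
import ast
import json


def notebook_code(notebook_json: str) -> str:
    """Concatenate the source of all code cells in a notebook."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The cell source is stored either as one string or as a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        lines = [
            line for line in text.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        chunks.append("\n".join(lines))
    return "\n".join(chunks)


SAMPLE = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["%matplotlib inline\n", "import numpy as np\n"]},
        {"cell_type": "code", "source": "from pandas import DataFrame"},
    ]
})
code = notebook_code(SAMPLE)
tree = ast.parse(code)  # the extracted code can be scanned like any .py file
```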

Google cloud libraries are always marked as obsolete

Whenever Google cloud libraries like google-cloud-bigquery are used, the following will happen:

Warning: Failed to find corresponding package name for import google
pyproject.toml contains obsolete dependencies: ['google-cloud-bigquery',]

This is because when these packages are installed, they create a file structure in site-packages like this:

google
google_cloud_bigquery-2.34.4-py3.10-nspkg.pth
google_cloud_bigquery-2.34.4.dist-info

and the import statement usually looks something like this:

from google.cloud import bigquery

The parser will see google as the module, and google has no associated metadata.

CI integration (--check option)

Could we have an option for CI?
Example: deptry --check .
If this already exists, where can I find an example of CI integration?

Downgrade click to (>=8.0.0,<9.0.0)

Hello,

Could you safely downgrade click to (>=8.0.0,<9.0.0)?

The required click version is too recent, which makes it impossible to install deptry in a project whose pyproject.toml pins older packages (from 2021).

relative imports erroneously marked as missing

Describe the bug

For example, an __init__.py containing a line such as from .foo import bar, next to a file foo.py, will report foo as a missing import.
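A small sketch of how relative imports can be told apart with the ast module: an ast.ImportFrom node has level > 0 for relative imports. The function name absolute_import_roots is hypothetical and does not describe deptry's actual parser.

```python
import ast
from typing import Set


def absolute_import_roots(source: str) -> Set[str]:
    """Collect top-level module names of absolute imports, skipping relative ones."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # level == 0 means absolute; "from .foo import bar" has level 1.
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


roots = absolute_import_roots(
    "from .foo import bar\n"
    "from . import baz\n"
    "import os.path\n"
    "from requests import get\n"
)
```

Here foo and baz are not collected, so they can no longer be reported as missing.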

deptry does not work when installed globally

Describe the bug

Whenever deptry is installed globally, it does not have access to the metadata of the packages in the virtual environment, even if that virtual environment is activated.

I will see if I can either

  • solve this, which I think will be difficult, or...
  • state clearly in the documentation that deptry should be installed within the virtual environment to be tested and optionally log this warning to the console whenever one or more dependencies are not found in the environment.

To Reproduce

Install deptry globally with pip install deptry outside of the virtualenv. Then activate a virtualenv and run deptry .

Optional dependencies are both obsolete and missing when not installed.

Describe the bug

See e.g. https://github.com/p1c2u/openapi-schema-validator/blob/master/pyproject.toml;
if dependencies are specified like

{version = "*", optional = true}

They will be both missing and obsolete:

-----------------------------------------------------

pyproject.toml contains obsolete dependencies:

        attrs
        isodate
        rfc3339-validator
        strict-rfc3339

Consider removing them from your projects dependencies. If a package is used for development purposes, you should add
it to your development dependencies instead.

-----------------------------------------------------

There are dependencies missing from pyproject.toml:

        isodate
        rfc3339_validator
        strict_rfc3339

Consider adding them to your project's dependencies. 

To Reproduce

Run deptry on https://github.com/p1c2u/openapi-schema-validator/blob/master/pyproject.toml

Expected behavior

They should not be obsolete. Should they be missing? Open question. We could use the same approach as mentioned in #68, with fuzzy string matching.

OR we could just specify that deptry needs to be run in an environment with all optional dependencies installed.

scikit-learn/sklearn erroneously marked as obsolete

sklearn can be installed in two ways:

poetry add sklearn
poetry add scikit-learn

Both install the package scikit-learn, which is imported with import sklearn.

See also here on StackOverflow.

pyproject.toml:

scikit-learn = ">=0.24,<1.1"

Output of deptry:

Corresponding package name for imported module `sklearn` is `sklearn`.
pyproject.toml contains obsolete dependencies: ['scikit-learn']

Add arguments to the docstrings

Currently, docstrings only explain what the functions and classes do, but not what the expected arguments are and how they are used. This should be fixed.

Add more unit tests

The ticket that you will always see in new projects. I have added some unit tests, but not every function is properly tested yet. Room for improvement.

Revise the way that configuration is handled.

Priority should be as follows, from lowest to highest:

  • default
  • pyproject.toml
  • cli arguments

The way this is currently handled, with the creation of a cli_arguments dictionary in cli.py, is not very nice. There is room for improvement there.
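A minimal sketch of layered configuration, assuming the list above is ordered from lowest to highest precedence, so that CLI arguments override pyproject.toml, which overrides the defaults. The option names and the resolve_config helper are hypothetical.

```python
from typing import Any, Dict, Optional

# Hypothetical defaults, for illustration only.
DEFAULTS: Dict[str, Any] = {"exclude": [".venv", "tests"], "ignore_obsolete": []}


def resolve_config(
    defaults: Dict[str, Any],
    pyproject: Dict[str, Any],
    cli: Dict[str, Optional[Any]],
) -> Dict[str, Any]:
    """Merge configuration layers; later layers override earlier ones."""
    merged = dict(defaults)
    merged.update(pyproject)
    # A CLI value of None means the flag was not passed, so it must not override.
    merged.update({key: value for key, value in cli.items() if value is not None})
    return merged


config = resolve_config(
    DEFAULTS,
    pyproject={"exclude": ["docs"], "ignore_obsolete": ["black"]},
    cli={"exclude": None, "ignore_obsolete": ["isort"]},
)
```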

Issue when parsing `docs/.venv/lib/python3.9/site-packages/packaging/tags.py`

Describe the bug

Accidentally ran deptry over .venv, which returned an error:

docs/.venv/lib/python3.9/site-packages/packaging/tags.py

I expect this has to do with the statement from . import ...

To Reproduce

Install the docs/.venv, and in pyproject.toml remove the configuration.

Expected behavior

Should parse successfully.

Add argument 'directory' to CLI

Currently deptry is called as

deptry check

It would be nice to change this to:

deptry check .

So one could also specify other directories.

Should the CLI arguments be relative to the current working dir or to the [DIRECTORY] argument?

Question

See also; #91

The first argument to deptry is currently the project's root for which to run deptry. Since deptry is installed within the virtual environment, there should almost never be a reason to enter anything other than . (the current directory). The other arguments are all relative to this directory. This can lead to confusion.

Let's assume we have a directory structure:

my-proj/
├─ my_proj/
│  ├─ main.py
├─ tests/
│  ├─ test.py
├─ req/
│  ├─ req-prod.txt

Right now it would be called as:

deptry . --exclude tests --requirements-txt req/req-prod.txt

We have multiple options:

1. Don't change anything

Keep the arguments as is. The argument seems a little bit useless, since the project's virtual environment needs to be active for deptry to scan a project, and thus deptry will almost always be called as deptry . The fact that this argument is there might lead people to incorrectly believe they can also use it as 'code to be scanned' and try to run e.g. deptry src to scan only the code in the src directory, which will fail.

2. Remove the first argument [DIRECTORY]

a) Remove the argument and always run deptry within the current working directory.

deptry --exclude tests --requirements-txt req/req-prod.txt

b) Alternatively, make it optional.

3. Change the meaning of the first argument [DIRECTORY]

Change the meaning from 'directory to run in' to 'code to be scanned'. This will lead to some CLI options being relative (--exclude, --extend-exclude), and some being absolute (--requirements-txt)

deptry my_proj --requirements-txt req/req-prod.txt

In this case, we don't need to specify the exclude option. However:

  • This becomes a bit complicated if we have multiple directories that we want to scan within my-proj. Also, I think it's better to explicitly exclude than to explicitly include, since we are looking for obsolete dependencies.
  • If within my_proj there is a file we want to ignore we would run:
deptry my_proj --requirements-txt req/req-prod.txt --exclude the_file.py

Now one of the arguments is relative, and the other is absolute. This seems quite confusing.

Enable tox unit tests for Python 3.11

Describe the bug

Unit tests for Python 3.11 ran successfully: https://github.com/fpgmaas/deptry/actions/runs/3020905009

However, when the branch was merged, the same pipeline failed: https://github.com/fpgmaas/deptry/actions/runs/3020926227

      value = _canonicalize_regex.sub("-", name).lower()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  TypeError: expected string or bytes-like object, got 'NoneType'

Seems related to this issue; pypa/pip#11352 and this issue; python-poetry/poetry#6389

For now, disabled the tox unit tests for Python 3.11. These should be enabled again later.

Packages that share a top-level module name are never marked as obsolete if one of them is used.

If packages share a top-level module name, they are never marked as obsolete if one of them is used.

An example; google-cloud-bigquery and google-cloud-auth share a top-level name; google, which is found in their metadata's top_level.txt. They are imported as google.cloud.bigquery and google.cloud.auth.

deptry recognizes that there is an imported module google but it has no metadata. Therefore, for each of the google-cloud-* packages, it will see if any of the top-level module names are imported. Each has top-level module name google, so if anywhere in the code there is anything imported from google, all google-cloud-* packages are marked as not obsolete.

See also #5 for the prior issue with google packages.

deptry does not handle conditional dependencies very well

Describe the bug

deptry .

on the current branch py311, returns:

-----------------------------------------------------

pyproject.toml contains obsolete dependencies:

        importlib-metadata

Consider removing them from your projects dependencies. If a package is used for development purposes, you should add
it to your development dependencies instead.

-----------------------------------------------------

There are dependencies missing from pyproject.toml:

        importlib_metadata

Consider adding them to your project's dependencies. 

-----------------------------------------------------

The dependency looks as follows:

importlib-metadata = { version = "*", python = "<=3.7" }

And we use Python 3.9 to develop locally. So the package is not installed.

The error is because the module name can not be found from the dependency (since it's not installed), so the dependency will be marked as obsolete.

Similarly, no matching dependency that provides the module importlib_metadata can be found, since it's not installed in the environment.

Add functionality that recognises imports within if/else blocks

When deptry is run on poetry itself, it marks importlib-metadata as an obsolete dependency. However, it is used:

if sys.version_info < (3, 10):
    # compatibility for python <3.10
    import importlib_metadata as metadata
else:
    from importlib import metadata

This is probably an issue with our use of ast within our ImportParser.
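To illustrate the suspected cause, the sketch below contrasts inspecting only the top-level statements of a module with walking the whole tree. Whether ImportParser actually inspects only top-level statements is an assumption; both function names are hypothetical.

```python
import ast
from typing import Iterable, Set

SOURCE = """
import sys

if sys.version_info < (3, 10):
    import importlib_metadata as metadata
else:
    from importlib import metadata
"""


def _import_roots(nodes: Iterable[ast.AST]) -> Set[str]:
    """Collect top-level module names from the import nodes in `nodes`."""
    roots: Set[str] = set()
    for node in nodes:
        if isinstance(node, ast.Import):
            roots.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            roots.add(node.module.split(".")[0])
    return roots


def top_level_imports(source: str) -> Set[str]:
    # Only direct children of the module: imports inside if/else are missed.
    return _import_roots(ast.parse(source).body)


def all_imports(source: str) -> Set[str]:
    # ast.walk visits every node, including those nested in if/else blocks.
    return _import_roots(ast.walk(ast.parse(source)))


shallow = top_level_imports(SOURCE)
deep = all_imports(SOURCE)
```

Only the second variant sees importlib_metadata.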

requirements.txt must be in the same directory that is being scanned

Is your feature request related to a problem? Please describe.
requirements.txt must be in the same directory as specified by the positional "directory" argument.

Describe the solution you would like
Since there is a command line option to specify the requirements text file, I would assume that the program uses that path for the file, but that doesn't appear to be the case when looking at the source code.

Missing development dependency raises an error if package.name is None

Describe the bug

MisplacedDevDependenciesFinder uses the Module.is_dev_dependency flag to mark a dependency as a misplaced development dependency and then adds module.Name to the list. However, it's possible that module.Name is None if the dependency's metadata was not found.

To Reproduce

Steps to reproduce the behavior:

Add a fake dependency to pyproject.toml's dev dependencies:

[tool.poetry.dev-dependencies]
foo = "foo"

Then import the fake dependency somewhere:

if False:
    import foo

Output:

  File "....../deptry/result_logger.py", line 61, in _log_misplaced_develop_dependencies
    f"There are imported modules from development dependencies detected:\n{sep}{sep.join(sorted(dependencies))}\n"
TypeError: sequence item 0: expected str instance, NoneType found
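One defensive option, sketched under the assumption that entries without a name can simply be left out of the message, is to filter None before sorting and joining. The helper name format_dependency_list is hypothetical; falling back to the imported module name would be another option.

```python
from typing import Iterable, Optional


def format_dependency_list(names: Iterable[Optional[str]], sep: str = "\n\t") -> str:
    """Join dependency names for logging, ignoring entries without a name."""
    known = sorted(name for name in names if name is not None)
    if not known:
        return ""
    return sep + sep.join(known)


# A None entry no longer breaks the join.
message = format_dependency_list(["foo", None, "bar"])
```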

setuptools marked as a transitive dependency

Describe the bug

setuptools is marked as a transitive dependency if it is imported in the code. Is this correct, or should it always be ignored, since it is in a poetry/virtualenv environment by default?

Add the option to check for obsolete dev-dependencies

Currently, only regular dependencies are checked. dev-dependencies could also be checked, but they are often packages like jupyterlab, black, isort, flake8, which are not imported in the code. So the logic becomes a lot more fuzzy, should we e.g. check makefiles & ci/cd files for the mentioning of these dependencies?

Alternatively, those can be added to an argument dev-dependencies-to-ignore in pyproject.toml (or something similar), and regular python files can be scanned for imports. This can still help to find obsolete dependencies; e.g. a package might have been used in the development of unit tests, but it's no longer actually used.

Add mypy

Mypy turned off for initial development. We should add it again.

Improved handling of conditional dependencies

Is your feature request related to a problem? Please describe.

Follow-up of #59

Right now, if a dependency is conditional, e.g. importlib-metadata = { version = "*", python = "<=3.7" }, it is ignored during the checking of obsolete dependencies. This is because there is no way for deptry to link the module to the dependency if it is not installed.

Describe the solution you would like

We could improve this and check the python string for the conditional dependency against the active python version. In this case, if Python>3.7, importlib_metadata is not installed and should not be considered for the obsolete check. If Python<=3.7, the package is installed and thus can be checked for being obsolete.

Conditional dependencies marked as missing

Describe the bug

If a conditional dependency like importlib-metadata = { version = "*", python = "<=3.7" } is not installed, it will be marked as missing:

-----------------------------------------------------

There are dependencies missing from pyproject.toml:

        importlib_metadata

Consider adding them to your project's dependencies. 

-----------------------------------------------------

This is because there is no way for deptry to link the module to the dependency if it is not installed.

There is a potential solution that should work in 95% of the cases: if a missing module is found AND there are conditional dependencies, do a similarity match between the dependency name and the module name. For example, we could replace _ with - and check for equality, or use e.g. the Levenshtein distance.
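Both ideas can be sketched in a few lines. The function names are hypothetical, and any distance threshold used to accept a match would be an additional assumption.

```python
def names_match(dependency: str, module: str) -> bool:
    """Compare a dependency name and a module name, treating - and _ as equal."""
    return dependency.lower().replace("-", "_") == module.lower().replace("-", "_")


def levenshtein(a: str, b: str) -> int:
    """Compute the edit distance between two strings with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


exact = names_match("importlib-metadata", "importlib_metadata")
distance = levenshtein("importlib-metadata", "importlib_metadata")
```

The equality check covers the importlib-metadata case directly; the edit distance is only needed when the names differ by more than the separator.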
