Comments (5)
The most convenient way to find files under a directory and exclude some based on gitignore patterns is to treat the files as sets and perform set operations from there:
import pathspec
import pathspec.util
ignore_spec = pathspec.GitIgnoreSpec.from_lines([
    "*.txt",
    # ...
])
all_files = set(pathspec.util.iter_tree_files("/some/directory"))
ignore_files = set(ignore_spec.match_files(all_files))
keep_files = all_files - ignore_files
I think your 3rd suggestion of adding a negate parameter to PathSpec.match_tree_files() would be the safest way to make this easier. Gitignore has peculiar edge cases that I don't think would behave properly if every match was simply negated. Implementing the logic at the tree level would allow using the file iterator intelligently without having to consume the whole thing in 2 or 3 sets.
from python-pathspec.
Some ideas on how it could be achieved:
- Add a method to PathSpec to negate the whole spec and return a spec with everything flipped. A bit similar to python-pathspec/pathspec/pathspec.py, line 80 in 933dd7d.
- Add a method to PathSpec to negate the whole spec and return a wrapper object that has a negated match_file method. (Note: until this functionality is in the library, making such a wrapper myself seems quite feasible.)
- Add a "negate" parameter to match_tree_files (https://github.com/cpburnz/python-pathspec/blob/933dd7da982551300a584c98570993402a56bc27/pathspec/pathspec.py#L267-L268C19), pass it on to match_files, and negate the condition accordingly (python-pathspec/pathspec/pathspec.py, line 213 in 933dd7d).
- Add a "negate" parameter to from_lines which flips the include of each parsed line.
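The wrapper-object idea above could look roughly like the following sketch. `NegatedSpec` and `SupportsMatchFile` are hypothetical names and not part of pathspec; the only assumption about the wrapped object is that it has a `match_file(path)` method, as `PathSpec` does. `_TxtSpec` is a stand-in so the example runs without pathspec installed.

```python
from typing import Iterable, Iterator, Protocol


class SupportsMatchFile(Protocol):
    """Anything exposing match_file(path), e.g. a pathspec.PathSpec."""

    def match_file(self, file: str) -> bool: ...


class NegatedSpec:
    """Hypothetical wrapper (not part of pathspec) that inverts match results."""

    def __init__(self, spec: SupportsMatchFile) -> None:
        self._spec = spec

    def match_file(self, file: str) -> bool:
        # The wrapper "matches" exactly when the wrapped spec does not.
        return not self._spec.match_file(file)

    def match_files(self, files: Iterable[str]) -> Iterator[str]:
        # Lazily yield the paths the wrapped spec does NOT match.
        return (f for f in files if self.match_file(f))


class _TxtSpec:
    """Stand-in for a real spec: matches any path ending in .txt."""

    def match_file(self, file: str) -> bool:
        return file.endswith(".txt")


negated = NegatedSpec(_TxtSpec())
keep = list(negated.match_files(["a.txt", "b.py", "docs/c.md"]))
```

Because the inversion happens on the final result of `match_file` rather than on each pattern, any `!` re-include patterns inside a real spec are still evaluated by the library before the result is flipped.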
Yes, a negate parameter seems to be the right approach, and indeed I was running into all kinds of edge cases otherwise.
The current approach of scanning the directory tree twice does not seem optimal to me :(
Meanwhile I also realized that there is a decently convenient way in the current state as well, without pulling in too many internals:
from typing import Iterable
import pathspec
import pathspec.util

def match_files_negated(exclude: pathspec.PathSpec, root: str) -> Iterable[str]:
    for path in pathspec.util.iter_tree_files(root):
        if not exclude.match_file(path):
            yield path
The negate parameter for PathSpec.match_files() and related methods has been implemented and is in the new v0.11.2 release.