
Extended cleanup tool for JFrog Artifactory

License: MIT License


Artifactory cleanup

artifactory-cleanup is an extended and flexible cleanup tool for JFrog Artifactory.

The tool uses a simple YAML-defined cleanup configuration and can be extended with your own rules written in Python. Everything is code, even the cleanup policies!


Installation

As simple as one command!

# docker
docker pull devopshq/artifactory-cleanup
docker run --rm devopshq/artifactory-cleanup artifactory-cleanup --help

# python (later we call it 'cli')
python3 -mpip install artifactory-cleanup
artifactory-cleanup --help

Usage

Suppose you want to remove all artifacts older than N days from the reponame repository. Take the following steps:

  1. Install artifactory-cleanup (see above)
  2. Create a configuration file artifactory-cleanup.yaml:
# artifactory-cleanup.yaml
artifactory-cleanup:
  server: https://repo.example.com/artifactory
  # $VAR is auto populated from environment variables
  user: $ARTIFACTORY_USERNAME
  password: $ARTIFACTORY_PASSWORD

  policies:
    - name: Remove all files from repo-name-here older than 7 days
      rules:
        - rule: Repo
          name: "reponame"
        - rule: DeleteOlderThan
          days: 7
  3. Run the command to SHOW (not remove) the artifacts that would be deleted. By default, artifactory-cleanup runs in "dry run" mode.
# Set the credentials with delete permissions
export ARTIFACTORY_USERNAME=usernamehere
export ARTIFACTORY_PASSWORD=password

# docker
docker run --rm -v "$(pwd)":/app -e ARTIFACTORY_USERNAME -e ARTIFACTORY_PASSWORD devopshq/artifactory-cleanup artifactory-cleanup

# cli
artifactory-cleanup
  4. Verify that the right artifacts will be removed, then add the --destroy flag to actually REMOVE them:
# docker
docker run --rm -v "$(pwd)":/app -e ARTIFACTORY_USERNAME -e ARTIFACTORY_PASSWORD devopshq/artifactory-cleanup artifactory-cleanup --destroy

# cli
artifactory-cleanup --destroy

Looking for more examples? Check the examples folder!

Notes

  • Pin artifactory-cleanup to an exact version in automation so upgrades are deliberate:
# docker
docker pull devopshq/artifactory-cleanup:1.0.0
docker run --rm devopshq/artifactory-cleanup:1.0.0 artifactory-cleanup --version

# python (later we call it 'cli')
python3 -mpip install artifactory-cleanup==1.0.0
artifactory-cleanup --version
  • Use CI servers or cron-like utilities to run artifactory-cleanup every day (or every hour). TeamCity and GitHub Actions have built-in support and get dedicated log formatting
  • Do not save credentials in the configuration file, use environment variables.
  • Use the --ignore-not-found flag to suppress errors when a repository is not found. This is useful when one configuration covers multiple repositories and some of them may be missing.
  • Use --worker-count=<WORKER_NUM> to increase the number of workers (1 by default). This is useful when you have many artifacts and want to speed up the process.
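The daily-run advice above can be sketched as a GitHub Actions workflow. This is an illustrative fragment, not part of the project: the file path, schedule, and secret names are assumptions.

```yaml
# .github/workflows/artifactory-cleanup.yml (illustrative sketch)
name: artifactory-cleanup
on:
  schedule:
    - cron: "0 3 * * *"   # every day at 03:00 UTC
jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # checks out the repo with artifactory-cleanup.yaml
      - run: |
          docker run --rm -v "$(pwd)":/app \
            -e ARTIFACTORY_USERNAME -e ARTIFACTORY_PASSWORD \
            devopshq/artifactory-cleanup artifactory-cleanup --destroy
        env:
          ARTIFACTORY_USERNAME: ${{ secrets.ARTIFACTORY_USERNAME }}
          ARTIFACTORY_PASSWORD: ${{ secrets.ARTIFACTORY_PASSWORD }}
```

Drop the --destroy flag first to confirm the dry-run output before letting the schedule delete anything.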

Commands

# Dry run (default) - only prints the artifacts; nothing is deleted
artifactory-cleanup

# Dry run for the policytestname policy only
artifactory-cleanup --policy-name policytestname

# To actually REMOVE artifacts, pass --destroy
artifactory-cleanup --destroy

# Or enable destroy mode via an environment variable
export ARTIFACTORY_CLEANUP_DESTROY=True
artifactory-cleanup

# Specify config filename
artifactory-cleanup --config artifactory-cleanup.yaml

# Or specify the config filename using an environment variable
export ARTIFACTORY_CLEANUP_CONFIG_FILE=artifactory-cleanup.yaml
artifactory-cleanup

# Look into the future - show what the tool WILL remove in 10 days
artifactory-cleanup --days-in-future=10

# Not satisfied with the built-in rules? Write your own rules in Python and plug them in!
artifactory-cleanup --load-rules=myrule.py
docker run -v "$(pwd)":/app devopshq/artifactory-cleanup artifactory-cleanup --load-rules=myrule.py

# Save the table summary in a file
artifactory-cleanup --output=myfile.txt

# Save the summary to a JSON file
artifactory-cleanup --output=myfile.txt --output-format=json

Rules

Common

  • Repo - Apply the rule to a single repository. If no name is specified, it is taken from the policy name (in the CleanupPolicy definition)
- rule: Repo
  name: reponame
# OR - if you have a single policy for the repo - you can name the policy as reponame
# Both configurations are equal
policies:
  - name: reponame
    rules:
      - rule: Repo
  • RepoList - Apply the policy to a list of repositories.
- rule: RepoList
  repos:
    - repo1
    - repo2
    - repo3
  • RepoByMask - Apply the rule to all repositories matching the mask
- rule: RepoByMask
  mask: "*.banned"
  • PropertyEq - Delete only artifacts that carry the specified property value (property_key is the property name, property_value is the value)
- rule: PropertyEq
  property_key: key-name
  property_value: 1
  • PropertyNeq - Delete artifacts only if the property value differs from the one specified, or the property is missing entirely. This lets you protect artifacts with a flag such as do_not_delete = 1
- rule: PropertyNeq
  property_key: key-name
  property_value: 1
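The PropertyNeq selection logic can be illustrated with a small standalone sketch (hypothetical artifact data, not the tool's internals):

```python
# Sketch of PropertyNeq semantics with property_key="do_not_delete" and
# property_value="1": artifacts whose property differs from the protected
# value, or that lack the property entirely, are candidates for deletion.
artifacts = [
    {"name": "a.jar", "properties": {"do_not_delete": "1"}},  # protected
    {"name": "b.jar", "properties": {"do_not_delete": "0"}},  # differs
    {"name": "c.jar", "properties": {}},                      # missing
]

to_delete = [a["name"] for a in artifacts
             if a["properties"].get("do_not_delete") != "1"]
print(to_delete)  # ['b.jar', 'c.jar']
```

Only a.jar survives: its property matches the protected value exactly.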

Delete

  • DeleteOlderThan - deletes artifacts that are older than N days
- rule: DeleteOlderThan
  days: 1
  • DeleteWithoutDownloads - deletes artifacts that have never been downloaded (DownloadCount=0). Best combined with the DeleteOlderThan rule
- rule: DeleteWithoutDownloads
  • DeleteOlderThanNDaysWithoutDownloads - deletes artifacts that are older than N days and have not been downloaded
- rule: DeleteOlderThanNDaysWithoutDownloads
  days: 1
  • DeleteNotUsedSince - deletes artifacts that have not been downloaded for N days, or that were never downloaded and were created more than N days ago
- rule: DeleteNotUsedSince
  days: 1
  • DeleteEmptyFolders - Clean up empty folders in the given repositories
- rule: DeleteEmptyFolders
  • DeleteByRegexpName - delete artifacts whose name matches the specified regexp
- rule: DeleteByRegexpName
  regex_pattern: "\d"
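To illustrate the regexp semantics with a standalone sketch (the artifact names are made up; DeleteByRegexpName itself works on Artifactory query results):

```python
import re

# A pattern like "\d" selects any artifact whose name contains a digit.
names = ["release-notes.txt", "build-42.zip", "readme.md"]
pattern = re.compile(r"\d")

matched = [n for n in names if pattern.search(n)]
print(matched)  # only build-42.zip contains a digit
```

Note that the rule uses a search-style match, so anchor the pattern (e.g. `^\d+$`) if you need whole-name matches.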

Keep

  • KeepLatestNFiles - Keeps the N most recent files (by creation time), NOT taking subfolders into account
- rule: KeepLatestNFiles
  count: 1
  • KeepLatestNFilesInFolder - Keeps the N most recent files (by creation time) in each folder
- rule: KeepLatestNFilesInFolder
  count: 1
  • KeepLatestVersionNFilesInFolder - Keeps the latest N files (by version) in each folder. The version is extracted with a regexp; by default it parses semver-like versions using ([\d]+\.[\d]+\.[\d]+)
- rule: KeepLatestVersionNFilesInFolder
  count: 1
  custom_regexp: "[^\\d][\\._]((\\d+\\.)+\\d+)"
  • KeepLatestNupkgNVersions - Keeps the latest N NuGet packages (adds a *.nupkg filter) per release/feature build
- rule: KeepLatestNupkgNVersions
  count: 1
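The version-aware ordering behind these Keep rules can be sketched as follows, using the documented default pattern (the file names are illustrative; this is not the tool's actual code):

```python
import re

# The documented default pattern for parsing semver-like versions.
SEMVER = re.compile(r"(\d+\.\d+\.\d+)")

files = ["app-1.2.10.jar", "app-1.2.9.jar", "app-1.10.0.jar"]

def version_key(name: str) -> list:
    # Compare versions numerically, component by component.
    return [int(part) for part in SEMVER.search(name).group(1).split(".")]

ordered = sorted(files, key=version_key)
# Numeric comparison puts 1.10.0 last (newest), even though a plain
# string sort would place "1.10.0" before "1.2.9".
print(ordered)  # ['app-1.2.9.jar', 'app-1.2.10.jar', 'app-1.10.0.jar']
```

Keeping the latest N files then amounts to keeping the tail of this ordering.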

Docker

  • DeleteDockerImagesOlderThan - Delete docker images that are older than N days
- rule: DeleteDockerImagesOlderThan
  days: 1
  • DeleteDockerImagesOlderThanNDaysWithoutDownloads - Deletes docker images that are older than N days and have not been downloaded
- rule: DeleteDockerImagesOlderThanNDaysWithoutDownloads
  days: 1
  • DeleteDockerImagesNotUsed - Removes Docker images that have not been downloaded for N days
- rule: DeleteDockerImagesNotUsed
  days: 1
  • IncludeDockerImages - Apply to docker images with the specified names and tags
- rule: IncludeDockerImages
  masks: "*singlemask*"
- rule: IncludeDockerImages
  masks:
    - "*production*"
    - "*release*"
  • ExcludeDockerImages - Exclude Docker images by name and tags.
- rule: ExcludeDockerImages
  masks:
    - "*production*"
    - "*release*"
  • KeepLatestNVersionImagesByProperty(count=N, custom_regexp='some-regexp', number_of_digits_in_version=X) - Keeps the N latest Docker images within each version group. The default regexp (^\d+\.\d+\.\d+$) matches semver tags such as 1.1.1, and by default images are grouped by major version (number_of_digits_in_version=1); set it to 2 to group by major.minor, or 3 for major.minor.patch. Semver tags prefixed with v are supported by making the v optional in the regexp, e.g. (^v?\d+\.\d+\.\d+$).
- rule: KeepLatestNVersionImagesByProperty
  count: 1
  custom_regexp: "[^\\d][\\._]((\\d+\\.)+\\d+)"
  • DeleteDockerImageIfNotContainedInProperties(docker_repo='docker-local', properties_prefix='my-prop', image_prefix=None, full_docker_repo_name=None) - Remove a Docker image if it is not found in the properties of the artifact repository.

  • DeleteDockerImageIfNotContainedInPropertiesValue(docker_repo='docker-local', properties_prefix='my-prop', image_prefix=None, full_docker_repo_name=None) - Remove a Docker image if it is not found in the property values of the artifact repository.
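The grouping idea behind KeepLatestNVersionImagesByProperty can be sketched like this (the tags and keep-count are made up, and this is a standalone illustration, not the tool's actual code):

```python
import re
from collections import defaultdict

# Default pattern and number_of_digits_in_version=1: group tags by major.
VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")
tags = ["1.0.1", "1.2.0", "1.3.0", "2.0.0", "2.1.0"]
count = 1  # keep the latest tag per major version

groups = defaultdict(list)
for tag in tags:
    major = VERSION.match(tag).group(1)
    groups[major].append(tag)

kept = []
for major, versions in groups.items():
    # Numeric sort within the group; the newest N are kept.
    versions.sort(key=lambda v: [int(p) for p in v.split(".")])
    kept.extend(versions[-count:])
print(sorted(kept))  # ['1.3.0', '2.1.0']
```

With number_of_digits_in_version=2 the grouping key would be major.minor, producing one survivor per minor series instead.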

Filters

  • IncludePath - Apply to artifacts by path / mask.
- rule: IncludePath
  masks: "*production*"
- rule: IncludePath
  masks:
   - "*production*"
   - "*develop*"
  • IncludeFilename - Apply to artifacts by name/mask
- rule: IncludeFilename
  masks:
   - "*production*"
   - "*develop*"
  • ExcludePath - Exclude artifacts by path/mask
- rule: ExcludePath
  masks:
   - "*production*"
   - "*develop*"
  • ExcludeFilename - Exclude artifacts by name/mask
- rule: ExcludeFilename
  masks:
    - "*.tar.gz"
    - "*.zip"

Create your own rule

If you want to create your own rule, you can do it!

The basic flow of how the tool calls Rules:

  1. Rule.check(*args, **kwargs) - verify that the rule is configured correctly; call other services to gather more information.
  2. Rule.aql_add_filter(filters) - add Artifactory Query Language expressions
  3. Rule.aql_add_text(aql) - add text to the resulting AQL query
  4. artifactory-cleanup calls Artifactory with the AQL query and passes the result to the next step
  5. Rule.filter(artifacts) - filter the artifacts. The method returns the artifacts that will be removed!
    • To keep artifacts use artifacts.keep(artifact) method

Create a myrule.py file in the same folder as artifactory-cleanup.yaml:

# myrule.py
from typing import List

from artifactory_cleanup import register
from artifactory_cleanup.rules import Rule, ArtifactsList


class MySimpleRule(Rule):
    """
    This doc string is used as rule title

    For more methods look at Rule source code
    """

    def __init__(self, my_param: str, value: int):
        self.my_param = my_param
        self.value = value

    def aql_add_filter(self, filters: List) -> List:
        print(f"Today is {self.today}")
        print(self.my_param)
        print(self.value)
        return filters

    def filter(self, artifacts: ArtifactsList) -> ArtifactsList:
        """I'm here just to print the list"""
        print(self.my_param)
        print(self.value)
        # You can make requests to artifactory by using self.session:
        # url = f"/api/storage/{self.repo}"
        # r = self.session.get(url)
        # r.raise_for_status()
        return artifacts


# Register your rule in the system
register(MySimpleRule)

Use rule: MySimpleRule in the configuration:

# artifactory-cleanup.yaml
- rule: MySimpleRule
  my_param: "Hello, world!"
  value: 42

Pass --load-rules to the command:

# docker
docker run -v "$(pwd)":/app devopshq/artifactory-cleanup artifactory-cleanup --load-rules=myrule.py

# cli
artifactory-cleanup --load-rules=myrule.py

How to

How to connect self-signed certificates for docker?

If your Artifactory instance uses self-signed certificates, place all certificates of the chain of trust into a certificates folder and add an additional volume mount to the command:

docker run -v "$(pwd)":/app -v "$(pwd)/certificates":/mnt/self-signed-certs/ devopshq/artifactory-cleanup artifactory-cleanup

How to clean up Conan repository?

We can handle Conan's metadata by creating two policies:

  1. The first removes files but keeps all metadata.
  2. The second looks at folders and, if a folder contains only metadata files, removes it (because the files the metadata described are gone).

The idea came from #47

# artifactory-cleanup.yaml
artifactory-cleanup:
  server: https://repo.example.com/artifactory
  user: $ARTIFACTORY_USERNAME
  password: $ARTIFACTORY_PASSWORD

  policies:
    - name: Conan - delete files older than 60 days
      rules:
        - rule: Repo
          name: "conan-testing"
        - rule: DeleteNotUsedSince
          days: 60
        - rule: ExcludeFilename
          masks:
            - ".timestamp"
            - "index.json"
    - name: Conan - delete empty folders (to fix the index)
      rules:
        - rule: Repo
          name: "conan-testing"
        - rule: DeleteEmptyFolders
        - rule: ExcludeFilename
          masks:
            - ".timestamp"
            - "index.json"

How to keep latest N docker images?

We can combine the Docker rules with the usual "file" rules!

The idea came from #61

# artifactory-cleanup.yaml
artifactory-cleanup:
  server: https://repo.example.com/artifactory
  user: $ARTIFACTORY_USERNAME
  password: $ARTIFACTORY_PASSWORD

  policies:
    - name: Remove docker images, but keep last 3
      rules:
        # Select repo
        - rule: Repo
          name: docker-demo
        # Delete docker images older than 30 days
        - rule: DeleteDockerImagesOlderThan
          days: 30
        # Keep these tags for all images
        - rule: ExcludeDockerImages
          masks:
            - "*:latest"
            - "*:release*"
        # Exclude these docker tags
        - rule: ExcludePath
          masks: "*base-tools*"
        # Keep 3 docker tags for all images
        - rule: KeepLatestNFilesInFolder
          count: 3

Release

In order to provide a new release of artifactory-cleanup, there are three steps involved.

  1. Bump the version in setup.py
  2. Bump the version in __init__.py
  3. Create a Git release tag (in format 1.0.1) by creating a release on GitHub


artifactory-cleanup's Issues

Stabilize and document API for Rules

Hi, I followed the existing rules in the codebase and created a rather convoluted one for my company's requirements. It's working quite well but I'd like to have an API that is documented and reasonably stable and some unit testing in place.

Is this a direction you're interested in for this project?

How do we handle protected images?

Hi Team,

Need your help!!! @allburov @jetersen @Tim55667757 @fishhead108 @Boltyk

I am trying to filter out artifacts which have retention-interval=forever.
I am using the rule rules.property_neq('retention-interval','forever'),
but when I execute it, the output includes artifacts that have the property value retention-interval=forever. How do I filter them out? These artifacts should not be deleted.

my reponame.py looks like this

from artifactory_cleanup import rules
from artifactory_cleanup.rules import CleanupPolicy

RULES = [

# ------ REPOS --------#
CleanupPolicy(
   'Show artifacts older than 300 days which can be deleted but not protected images',
    rules.repo('repo-name'),
    rules.delete_older_than(days=300),
    rules.property_neq('retention-interval','forever')
),

]

Apply to artifacts by name/mask not showing any data.

Hi Team,

We have a few artifacts with the names below which we need to clean up:
2018.11.0.7-SNAPSHOT
2019.02.0.1-local

When we tried to apply the rule rules.include_filename('-SNAPSHOT') in reponame.py, we got no results; the file count is 0.

Can someone help us here and let us know what pattern we need to follow?

Support list for Repo rule

Hi,

It would be great to be able to pass a list of repositories to a Repo rule for common cleanup rules.
For example:

  policies:
    - name: Remove all empty folders from [repo1, repo2, repo3, repo4]
      rules:
        - rule: Repo
          name:
            - repo1
            - repo2
            - repo3
            - repo4
        - rule: DeleteEmptyFolders

As of now, artifactory-cleanup searches for a repository named [repo1, repo2, repo3, repo4].

artifactory-cleanup version : 1.0.1

********************************************************************************
Verbose MODE
Checking '['repo1', 'repo2', 'repo3', 'repo4']' repository exists.
404 Client Error:  for url: https://example.com/artifactory/api/storage/%5B'repo1',%20'repo2',%20'repo3',%20'repo4'%5D
The ['repo1', 'repo2', 'repo3', 'repo4'] repository does not exist

Thanks
Alex

Problems with trashcan and conan

Using the example policy from the README fails if the trashcan is enabled in Artifactory:

  policies:
    - name: Conan - delete empty folders (to fix the index)
      rules:
        - rule: Repo
          name: "conan-testing"
        - rule: DeleteEmptyFolders
        - rule: ExcludeFilename
          masks:
            - ".timestamp"
            - "index.json"

This looks to me like https://jfrog.atlassian.net/browse/RTFACT-11898 (but that issue is already marked as resolved)

Checking 'ops-conan-dev-local' repository exists.
The ops-conan-dev-local repository exists.
Add AQL Filter - rule: Repo - Apply the policy to one repository.
Before AQL query: []
After AQL query: [{'repo': {'$eq': 'ops-conan-dev-local'}}]
Add AQL Filter - rule: DeleteEmptyFolders - Remove empty folders.
Before AQL query: [{'repo': {'$eq': 'ops-conan-dev-local'}}]
After AQL query: [{'repo': {'$eq': 'ops-conan-dev-local'}}, {'path': {'$match': '**'}, 'type': {'$eq': 'any'}}]
Add AQL Filter - rule: ExcludeFilename - Exclude artifacts by filename.
Before AQL query: [{'repo': {'$eq': 'ops-conan-dev-local'}}, {'path': {'$match': '**'}, 'type': {'$eq': 'any'}}]
After AQL query: [{'repo': {'$eq': 'ops-conan-dev-local'}}, {'path': {'$match': '**'}, 'type': {'$eq': 'any'}}, {'$and': [{'name': {'$nmatch': '.timestamp'}}, {'name': {'$nmatch': 'index.json'}}]}]
Add AQL Text - rule: Repo - Apply the policy to one repository.
Add AQL Text - rule: DeleteEmptyFolders - Remove empty folders.
Add AQL Text - rule: ExcludeFilename - Exclude artifacts by filename.
********************************************************************************
Result AQL Query:
items.find({"$and": [{"repo": {"$eq": "ops-conan-dev-local"}}, {"path": {"$match": "**"}, "type": {"$eq": "any"}}, {"$and": [{"name": {"$nmatch": ".timestamp"}}, {"name": {"$nmatch": "index.json"}}]}]}).include("*", "property", "stat")
********************************************************************************
Found 75638 artifacts
Filter artifacts - rule: Repo - Apply the policy to one repository.
Filter artifacts - rule: DeleteEmptyFolders - Remove empty folders.
Before count: 75638
After count: 6976
Filter artifacts - rule: ExcludeFilename - Exclude artifacts by filename.
Found 6976 artifacts AFTER filtering
DESTROY MODE - delete 'ops-conan-dev-local/Demo - 0B'
Traceback (most recent call last):
  File "/usr/local/bin/artifactory-cleanup", line 8, in <module>
    sys.exit(ArtifactoryCleanupCLI())
  File "/usr/local/lib/python3.8/dist-packages/plumbum/cli/application.py", line 177, in __new__
    return cls.run()
  File "/usr/local/lib/python3.8/dist-packages/plumbum/cli/application.py", line 634, in run
    retcode = inst.main(*tailargs)
  File "/usr/local/lib/python3.8/dist-packages/artifactory_cleanup/cli.py", line 164, in main
    for summary in cleanup.cleanup(
  File "/usr/local/lib/python3.8/dist-packages/artifactory_cleanup/artifactorycleanup.py", line 59, in cleanup
    policy.delete(artifact, destroy=self.destroy)
  File "/usr/local/lib/python3.8/dist-packages/artifactory_cleanup/rules/base.py", line 313, in delete
    r.raise_for_status()
  File "/usr/local/lib/python3.8/dist-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error:  for url: https://artifactory.mysecretdomain.de/artifactory/ops-conan-dev-local/Demo

The file is already in the trashcan, so something is not correct with the AQL query, or the query itself is wrong.

I tried something like this, but with no success:

        - rule: PropertyNeq
          property_key: module.artifact.item.repo
          property_value: auto-trashcan

Support for jfrog cloud

Hi,

Thanks a lot for this awesome tool!
I tried running it for our team's jfrog cloud artifactory (ybdb.jfrog.io), but the tool is adding /storage between the URL and the repo name.
E.g. if the repo name is yugabyte and the URL for the repo is https://ybdb.jfrog.io/artifactory/yugabyte/, then the tool tries the URL https://ybdb.jfrog.io/artifactory/api/storage/yugabyte, which doesn't exist. The error thrown is below.

[bmukheja@devserver-bkumar-2 ~]$ artifactory-cleanup --destroy
********************************************************************************
Destroy MODE
Checking 'yugabyte' repository exists.
401 Client Error:  for url: https://ybdb.jfrog.io/artifactory/api/storage/yugabyte
The yugabyte repository does not exist!

Any help here?

Wrongly removes conan metadata files

Thank you for your awesome tool!

We would like to use it for cleaning up a conan repository.

When the conan client deploys a package, it uploads:

  • conan_package.tgz

At the same time, the server side generates some metadata (see also conan-io/conan#8418)

  • .timestamp
  • index.json

When downloading a conan package, the client only downloads conan_package.tgz, but not the additional meta files.

If I now run the rules.delete_not_used_since(days=60) rule on packages which were deployed more than 60 days ago but are still regularly downloaded, it will delete the meta files but keep the conan_package.tgz file. A conan package download does not bump the download counter on the metadata files.

And this will invalidate the whole conan package (as also mentioned here jfrog/artifactory-user-plugins#319).

One solution would be to use rules.exclude_filename(['.timestamp', 'index.json']) (also suggested in jfrog/artifactory-user-plugins#399).
This would then only delete the package, but not the metadata, and therefore the metadata is never cleaned up... so also not a perfect solution.

Do you have any suggestions on how to handle this?
I guess that to solve this we would need dedicated rules just for conan, similar to the docker rules?

There may also be other package types affected by this (e.g., npm, pypi, conda, debian, ...), see also jfrog/artifactory-user-plugins#319 (comment)

Can this project support custom sorting?

Currently the artifacts are sorted by path:
artifacts = sorted(artifacts, key=lambda x: x["path"])

I wonder if we can support custom sorting rules, such as:
artifacts = sorted(artifacts, key=lambda x: (x['path'], x['created']))

For example, the log4j-web artifacts sorted by path put 2.6.2 at the end, but I want 2.6.2 to come first when I execute the cleanup.
So I think sorting by path and created will solve the problem.
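A standalone sketch of the difference (the artifact records are hypothetical):

```python
# Two hypothetical artifacts in the same folder, created on different dates.
artifacts = [
    {"path": "log4j-web", "name": "log4j-web-2.10.0.jar", "created": "2017-11-18"},
    {"path": "log4j-web", "name": "log4j-web-2.6.2.jar", "created": "2016-07-05"},
]

# Current behaviour: sort by path only; equal paths keep insertion order.
by_path = sorted(artifacts, key=lambda x: x["path"])

# Proposed behaviour: path, then creation date, so the oldest artifact
# in each folder comes first regardless of how its name sorts.
by_path_and_created = sorted(artifacts, key=lambda x: (x["path"], x["created"]))
print(by_path_and_created[0]["name"])  # log4j-web-2.6.2.jar
```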


KeepLatestNFilesInFolder not working properly when more than one repository is selected

Hi all,

When filtering by RepoByMask instead of Repo, multiple repositories could be selected.

In this case, the rule KeepLatestNFilesInFolder won't work properly because it doesn't use the repository name when grouping the artifacts: https://github.com/devopshq/artifactory-cleanup/blob/master/artifactory_cleanup/rules/keep.py#L134

My proposal is to modify it to something like this:

        for artifact in result_artifact:
            path = '{repo}/{path}'.format(**artifact)
            artifacts_by_path[path].append(artifact)

It would be nice to add the repo name to this print too: https://github.com/devopshq/artifactory-cleanup/blob/master/artifactory_cleanup/rules/keep.py#L146

Thanks for this awesome tool!

run.sh permissions are not set correctly

Bug:

At the moment Kubernetes CronJobs are failing due to misconfigured permissions on run.sh:

/bin/sh: 1: /app/docker/run.sh: Permission denied

Fix:

The permissions should be extended with +x.

Failed to run artifactory-cleanup-0.4.1 command

Hi,

When I try to run the command artifactory-cleanup --help or artifactory-cleanup --user myuser --password mypassword --artifactory-server https://myrepo.com/artifactory --config myreponame.py, I get the following error.

Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/artifactory-cleanup", line 5, in <module>
    from artifactory_cleanup.artifactorycleanup import ArtifactoryCleanupCLI
ImportError: cannot import name 'ArtifactoryCleanupCLI' from 'artifactory_cleanup.artifactorycleanup' (/home/ubuntu/.local/lib/python3.8/site-packages/artifactory_cleanup/artifactorycleanup.py)

I'm not sure if I missed something; could you give me some advice?

Here is our local environment:

$ pip show artifactory-cleanup
Name: artifactory-cleanup
Version: 0.4.1
Summary: Rules and cleanup policies for Artifactory
Home-page: https://github.com/devopshq/artifactory-cleanup
Author: Alexey Burov
Author-email: [email protected]
License: MIT
Location: /home/ubuntu/.local/lib/python3.8/site-packages
Requires: attrs, dohq-artifactory, hurry.filesize, plumbum, prettytable, teamcity-messages, treelib
Required-by: 

$ python3 -V
Python 3.8.10

The behaviour of KeepLatestNVersionImagesByProperty is unclear and possibly incorrect

I want to use KeepLatestNVersionImagesByProperty as a safeguard to ensure that a few versions are always retained. I've been debugging the method and I'm still not sure what the intended behaviour is.

Assume the input is (with importance to the ordering):

result_artifact = [
    {"properties":{"docker.manifest":"0.1.100"},"path":"foobar/0.1.100"}, # refer to below as A
    {"properties":{"docker.manifest":"0.1.200"},"path":"foobar/0.1.200"}, # refer to below as B
    {"properties":{"docker.manifest":"0.1.99"},"path":"foobar/0.1.99"}    # refer to below as C
]

and that the rule has been called like so:

KeepLatestNVersionImagesByProperty(count=2, custom_regexp='(^ \d*\.\d*\.\d*.\d+$) ', number_of_digits_in_version=1)

The final result ends up being that nothing is deleted. I would have expected that A and B would have been filtered out and that C remains for potential deletion.

If I then alter this line (https://github.com/devopshq/artifactory-cleanup/blob/master/artifactory_cleanup/rules/docker.py#L183) from:

key = artifact["path"] + "/" + version_splitted[0]

to

key = artifact["path"].split("/")[0] + "/" + version_splitted[0]

then B and C are filtered out and A remains for potential deletion. This is because the three image versions are able to be grouped together. Previously, the version would be part of the grouping which seems incorrect to me. This still doesn't meet what I expect the result to be.

From here, it seems like the sorting isn't right (assuming what I said above is correct). https://github.com/devopshq/artifactory-cleanup/blob/master/artifactory_cleanup/rules/docker.py#L189

key=lambda x: [int(x) for x in x[0].split(".")]
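A standalone sketch of the numeric sort that lambda performs, using the versions from the example above:

```python
versions = ["0.1.200", "0.1.99", "0.1.100"]

# Each dotted version becomes a list of ints, so components compare
# numerically: 0.1.100 sorts after 0.1.99, which a plain string sort
# would get wrong ("0.1.100" < "0.1.99" lexicographically).
ordered = sorted(versions, key=lambda v: [int(p) for p in v.split(".")])
print(ordered)  # ['0.1.99', '0.1.100', '0.1.200']
```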

Include filters do not work due to the incorrect logical operator

Given a policy that looks like this:

image_list = [
    'image-a:*',
    'image-b:*'
]

RULES = [
    # ------ SPECIFIC IMAGES ORIGINATING FROM MASTER BRANCH --------
    CleanupPolicy(
       'Delete images older than 60 days',
        rules.include_docker_images(image_list),
        rules.delete_docker_images_not_used(days=60),
    )
]

the tool will find 0 matches, as the underlying query ends up being:

items.find({"$and": [{"$and": [{"path": {"$match": "image-a/*"}}, {"path": {"$match": "image-b/*"}}]}, {"name": {"$match": "manifest.json"}, "$or": [{"stat.downloaded": {"$lte": "2022-06-20"}}, {"$and": [{"stat.downloads": {"$eq": null}}, {"created": {"$lte": "2022-06-20"}}]}]}]}).include("*", "property", "stat")

I guess the issue is this line:

aql_query_list.append({"$and": rule_list})

It should be $and for exclusion rules but $or for inclusion rules.
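A standalone sketch of the difference between the two compositions (masks taken from the example above):

```python
import json

masks = ["image-a/*", "image-b/*"]
terms = [{"path": {"$match": m}} for m in masks]

# $and demands that a single path match every mask at once, which can
# never happen for disjoint image names - the query returns nothing.
and_query = {"$and": terms}

# $or matches a path against any of the masks, which is the intended
# semantics for an include filter.
or_query = {"$or": terms}
print(json.dumps(or_query))
```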

ExcludeDockerImages fails with missing 1 required positional argument: 'masks'

The following config fails with

Check failed for rule 'ExcludeDockerImages' in policy 'Remove docker images, but keep last 10':
check() missing 1 required positional argument: 'masks'

Relevant part of artifactory-cleanup.yaml

  policies:
    - name: Remove docker images, but keep last 10
      rules:
        # Select repo
        - rule: Repo
          name: docker-local
        # Keep these tags for all images
        - rule: ExcludeDockerImages
          masks:
            - "*:latest"
            - "*:release*"

Repo name doesn't allow for "/" paths

When using a repo name that contains a path, like "repo/my/folder", it is specifically rejected in the init of the Repo rule.
In the check function I can see that a /api/storage/ request is made, and that one works fine with a "/" repo path.
How is one supposed to clean up a subfolder using this lib?

user and password are required parameters, even when specified from env

I tried to set the ARTIFACTORY_USER and ARTIFACTORY_PASSWORD env variables (per #9), but it won't run because the switch has mandatory = True. Maybe we need to change that somehow? I changed it to False for testing, but then some other check needs to be added to ensure the value has been set one way or the other.

(.venv) [tmcneely@den3l1cliqa74077 artifactory-cleanup]$ export ARTIFACTORY_USER='tommy'
(.venv) [tmcneely@den3l1cliqa74077 artifactory-cleanup]$ export ARTIFACTORY_PASSWORD='NotMyPassword!'
(.venv) [tmcneely@den3l1cliqa74077 artifactory-cleanup]$ export ARTIFACTORY_SERVER='https://artifactory.company.com/artifactory'
(.venv) [tmcneely@den3l1cliqa74077 artifactory-cleanup]$ artifactory-cleanup --config artifactory-docker-rules.py 
Error: Switch --user is mandatory

Did I do it wrong?

Multiple paths not working

Hi

I was trying to write some cleanup policies that make use of the multiple-path feature, but it does not seem to work with the latest Artifactory version.

Artifactory version : 7.41.12 rev 74112900
Artifactory-cleanup version : 0.4.2

Note that I obfuscated the repo, microservice, and URL of our Artifactory.

Cleanup Policies

from artifactory_cleanup import rules
from artifactory_cleanup import CleanupPolicy
RULES = [
  CleanupPolicy(
  'Test',
  rules.Repo('npm-release-local'),
  rules.IncludePath(['*@scope/microservice*', '*microservice*']),
  rules.IncludeFilename('*-test*'),
  rules.KeepLatestNFiles(count=5),
  ),
]

Output

********************************************************************************
Verbose MODE
Simulating cleanup actions that will occur on 2022-09-13
Add AQL Filter - rule: Repo - Apply the rule to one repository.
Get from npm-release-local
Checking the existence of the npm-release-local repository
The npm-release-local repository exists
Add AQL Filter - rule: IncludePath - Apply to artifacts by path / mask.
Add AQL Filter - rule: IncludeFilename - Apply to artifacts by name/mask.
Add AQL Filter - rule: KeepLatestNFiles - Leaves the last (by creation time) files in the amount of N pieces. WITHOUT accounting subfolders
********************************************************************************
AQL Query:
Add AQL Text - rule: Repo - Apply the rule to one repository.
Add AQL Text - rule: IncludePath - Apply to artifacts by path / mask.
Add AQL Text - rule: IncludeFilename - Apply to artifacts by name/mask.
Add AQL Text - rule: KeepLatestNFiles - Leaves the last (by creation time) files in the amount of N pieces. WITHOUT accounting subfolders
Before AQL text: items.find({"$and": [{"repo": {"$eq": "npm-release-local"}}, {"path": {"$match": ["*@scope/microservice*", "*microservice*"]}}, {"name": {"$match": "*-test*"}}]}).include("*", "property", "stat")
After AQL text: items.find({"$and": [{"repo": {"$eq": "npm-release-local"}}, {"path": {"$match": ["*@scope/microservice*", "*microservice*"]}}, {"name": {"$match": "*-test*"}}]}).include("*", "property", "stat").sort({"$asc" : ["created"]})

items.find({"$and": [{"repo": {"$eq": "npm-release-local"}}, {"path": {"$match": ["*@scope/microservice*", "*microservice*"]}}, {"name": {"$match": "*-test*"}}]}).include("*", "property", "stat").sort({"$asc" : ["created"]})
********************************************************************************
Add AQL Text - rule: Repo - Apply the rule to one repository.
Add AQL Text - rule: IncludePath - Apply to artifacts by path / mask.
Add AQL Text - rule: IncludeFilename - Apply to artifacts by name/mask.
Add AQL Text - rule: KeepLatestNFiles - Leaves the last (by creation time) files in the amount of N pieces. WITHOUT accounting subfolders
Before AQL text: items.find({"$and": [{"repo": {"$eq": "npm-release-local"}}, {"path": {"$match": ["*@scope/microservice*", "*microservice*"]}}, {"name": {"$match": "*-test*"}}]}).include("*", "property", "stat")
After AQL text: items.find({"$and": [{"repo": {"$eq": "npm-release-local"}}, {"path": {"$match": ["*@scope/microservice*", "*microservice*"]}}, {"name": {"$match": "*-test*"}}]}).include("*", "property", "stat").sort({"$asc" : ["created"]})

Traceback (most recent call last):
  File "/home/alex/.local/bin/artifactory-cleanup", line 8, in <module>
    sys.exit(ArtifactoryCleanupCLI())
  File "/home/alex/.local/lib/python3.10/site-packages/plumbum/cli/application.py", line 178, in __new__
    return cls.run()
  File "/home/alex/.local/lib/python3.10/site-packages/plumbum/cli/application.py", line 624, in run
    retcode = inst.main(*tailargs)
  File "/home/alex/.local/lib/python3.10/site-packages/artifactory_cleanup/cli.py", line 114, in main
    for summary in cleanup.cleanup(
  File "/home/alex/.local/lib/python3.10/site-packages/artifactory_cleanup/artifactorycleanup.py", line 54, in cleanup
    artifacts = policy.get_artifacts()
  File "/home/alex/.local/lib/python3.10/site-packages/artifactory_cleanup/rules/base.py", line 214, in get_artifacts
    r.raise_for_status()
  File "/usr/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error:  for url: https://selfhosted.artifactory.com/artifactory/api/search/aql

And taking a look into our Artifactory logs we get this

2022-09-13T16:49:56.667Z [jfrt ] [ERROR] [247cfbc55f7e33d4] [o.a.r.r.a.AqlResource:101     ] [ttp-nio-8081-exec-26] - Fail to execute query: items.find({"$and": [{"repo": {"$eq": "npm-release-local"}}, 
{"path": {"$match": ["*@scoped/microservice*", "*microservice*"]}}, {"name": {"$match": "*-test*"}}]}).include("*", "property", "stat").sort({"$asc" : ["created"]}): 

Failed to parse query: items.find({"$and": [{"repo": {"$eq": "npm-release-local"}}, {"path": {"$match": ["*@scoped/microservice*", "*microservice*"]}}, {"name": {"$match": "*-test*"}}]}).include("*", "property", "stat").sort({"$asc" : ["created"]}), 
it looks like there is syntax error near the following sub-query: ["*@scoped/microservice*", "*microservice*"]}}, {"name": {"$match": "*-test*"}}]}).include("*", "property", "stat").sort({"$asc" : ["created"]})

We get the same error when using multiple items in IncludeFilename; if we write two separate cleanup policies instead, they work just fine.

Thanks
Alex
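For anyone hitting the same parse error: the server log suggests AQL's $match takes a single pattern rather than an array. A hedged sketch of a workaround (not the library's actual fix) is to expand multiple masks into an explicit $or clause before building the query:

```python
def path_filter(masks):
    """Expand a list of path masks into an AQL-friendly clause.
    $match rejects arrays, so each mask gets its own sub-clause."""
    if len(masks) == 1:
        return {"path": {"$match": masks[0]}}
    return {"$or": [{"path": {"$match": m}} for m in masks]}

clause = path_filter(["*@scope/microservice*", "*microservice*"])
print(clause)
# {'$or': [{'path': {'$match': '*@scope/microservice*'}},
#          {'path': {'$match': '*microservice*'}}]}
```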

KeepLatestVersionNFilesInFolder is not working as expected

Problem Explanation

I have been looking into the KeepLatestVersionNFilesInFolder rule to clean up our Docker images, as we want to keep the N highest versions. I think I found an issue with this rule's filter function. I tested this with both a Docker registry and a Maven repository in Artifactory.

What I am seeing is that when the regex is set to something that will find my version numbers (the default won't work for me), the run explodes with the following error:

Traceback (most recent call last):
  File "/usr/local/bin/artifactory-cleanup", line 8, in <module>
    sys.exit(ArtifactoryCleanupCLI())
  File "/usr/local/lib/python3.9/site-packages/plumbum/cli/application.py", line 178, in __new__
    return cls.run()
  File "/usr/local/lib/python3.9/site-packages/plumbum/cli/application.py", line 624, in run
    retcode = inst.main(*tailargs)
  File "/usr/local/lib/python3.9/site-packages/artifactory_cleanup/cli.py", line 121, in main
    for summary in cleanup.cleanup(
  File "/usr/local/lib/python3.9/site-packages/artifactory_cleanup/artifactorycleanup.py", line 53, in cleanup
    artifacts_to_remove = policy.filter(artifacts)
  File "/usr/local/lib/python3.9/site-packages/artifactory_cleanup/rules/base.py", line 284, in filter
    artifacts = rule.filter(artifacts)
  File "/usr/local/lib/python3.9/site-packages/artifactory_cleanup/rules/keep.py", line 181, in filter
    artifacts.keep(good_artifacts)
  File "/usr/local/lib/python3.9/site-packages/artifactory_cleanup/rules/base.py", line 32, in keep
    return self.remove(artifacts)
  File "/usr/local/lib/python3.9/site-packages/artifactory_cleanup/rules/base.py", line 42, in remove
    print(f"Filter package {artifact['path']}/{artifact['name']}")
TypeError: list indices must be integers or slices, not str

Looks like a value being sent to keep from filter does not match the expected object type.

Debugging

Looking at the code artifactory_cleanup/rules/keep.py#L140 I see that good_artifacts is being used as an argument for artifacts.keep.

Looking into how that variable's value gets set, it comes from artifacts_by_path_and_name.values(), which is populated by artifacts_by_path_and_name[key].append(artifactory_with_version) in the for loop over the artifacts given to filter. That is where I noticed that artifactory_with_version = [version_str, artifact] is a list containing a string and a dictionary.

That means artifacts_by_path_and_name.values() returns a list of lists shaped like ['String', 'Dict']. For example --

[ [ 'Version', { ArtifactDict } ], [ 'Version', { ArtifactDict } ] ]

Based on what I can tell, the keep function expects either an artifact dictionary, i.e. {ArtifactDict}, or a list of artifact dictionaries, i.e. [ { ArtifactDict } ]. So passing it a list like the one above causes it to error out.

Potential fix

I believe this is fixed by pulling the artifact dictionary out of the result of artifactory_with_version[good_artifact_count:] and then calling keep with that value.

Something like --

# Replacement for https://github.com/devopshq/artifactory-cleanup/blob/master/artifactory_cleanup/rules/keep.py#L180
good_sets = artifactory_with_version[good_artifact_count:]

good_artifacts = []

for good_set in good_sets:
    good_artifacts.append(good_set[1])

I am more than happy to put in a PR but wanted to submit an issue here first to see if I am missing something before doing so.
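The proposed extraction can be exercised with plain data structures (the pair layout below mirrors artifactory_with_version as described above):

```python
# (version_string, artifact_dict) pairs, as built inside keep.py's filter.
pairs = [
    ["1.0", {"path": "app", "name": "app-1.0.jar"}],
    ["1.1", {"path": "app", "name": "app-1.1.jar"}],
    ["1.2", {"path": "app", "name": "app-1.2.jar"}],
]
good_artifact_count = 2

# Pull the artifact dict (index 1) out of each surplus pair, so that
# keep() receives plain dictionaries instead of [version, dict] lists.
good_artifacts = [pair[1] for pair in pairs[good_artifact_count:]]
print(good_artifacts)  # [{'path': 'app', 'name': 'app-1.2.jar'}]
```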

Cleanup plugin issue

Hi,

I have the plugin installed on our Artifactory server, and when I run the curl command, the script never returns to the prompt and nothing seems to be executing. Please advise.

curl POST -X -v -user admin:password "http://localhost:8081/artifactory/api/plugins/execute/cleanup?params=timeUnit=month;timeInterval=1;repos=testbuilder-maven-dev-local;dryRun=true;paceTimeMS=2000;disablePropertiesSupport=true"

Getting error:

Invoke-WebRequest : A parameter cannot be found that matches parameter name 'X'.

At line:1 char:6
+ curl -X POST -v -user admin:password "http://localhost:8082/artifacto ...
+      ~~
    + CategoryInfo          : InvalidArgument: (:) [Invoke-WebRequest], ParameterBindingException
    + FullyQualifiedErrorId : NamedParameterNotFound,Microsoft.PowerShell.Commands.InvokeWebRequestCommand

When I remove -X from the command, I'm getting this error:

Invoke-WebRequest : A positional parameter cannot be found that accepts argument 'http://localhost:8081/artifactory/api/plugins/execute/cleanup?params=timeUnit=month;timeInterval=1;repos=testbuilder-maven-dev-local;dryRun=true;paceTimeMS=2000;disablePropertiesSupport=true'.
At line:1 char:1
+ curl POST -v -user admin:password "http://localhost:8081/artifactory/ ...
    + CategoryInfo          : InvalidArgument: (:) [Invoke-WebRequest], ParameterBindingException
    + FullyQualifiedErrorId : PositionalParameterNotFound,Microsoft.PowerShell.Commands.InvokeWebRequestCommand

KeepLatestNVersionImagesByProperty does not work for multiple images

It seems that it works when only one image is involved. When I include multiple images, the logic only seems to be applied to one of them:

My policy looks like this:

- name: my-policy
  rules:
    - rule: Repo
      name: "my-repo"
    - rule: IncludeDockerImages
      masks:
        - "my-image-1:*"
        - "my-image-2:*"
    - rule: DeleteDockerImagesNotUsed
      days: 60
    - rule: KeepLatestNVersionImagesByProperty
      count: 2
      number_of_digits_in_version: 1

Both images have not been used for more than 60 days. Running with either image alone results in the two latest image versions being filtered out as expected.

Originally posted by @kenny-monster in #60 (comment)

Formatting Question - RepoList

Hello! Thank you for your hard work on this tool!
I have a question regarding RepoList: I can't seem to get it to work, unlike the normal Repo & name rule.

The instructions read as follows:

- rule: RepoList
  repos:
    - repo1
    - repo2
    - repo3

my code reads as follows:

      - rule: RepoList
          repos: 
            - repo1
            - repo2
        - rule: DeleteEmptyFolders

And i am getting the following error:

'NoneType' object has no attribute 'get'

Add option KeepLatestNImages

When cleaning images, it is possible to use the KeepLatestNFiles option, but that looks at Docker tags, not Docker images.

Example:

If you have 3 images, each tagged with 2 tags, you have 6 tags in total. If you run artifactory-cleanup with KeepLatestNFiles count=2, you may end up with only a single image left (with 2 tags), because each tag is counted as a file.

Proposal:

What I propose is an option, KeepLatestNImages, that only looks at the image digest and, for the images that are older, deletes all of their tags. So in the example above you would be left with 2 images (still with 2 tags each, 4 in total).
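A minimal sketch of the proposed behaviour, assuming hypothetical tag records with sha256 and created fields (the real Artifactory payload differs): group tags by digest, keep the N digests with the newest tags, and mark every tag of the older digests for deletion.

```python
from collections import defaultdict

def tags_to_delete(tags, keep):
    """Group Docker tags by image digest and return the tags of all
    but the `keep` most recently created digests."""
    by_digest = defaultdict(list)
    for tag in tags:
        by_digest[tag["sha256"]].append(tag)
    # Newest digest first, judged by its most recent tag.
    ordered = sorted(
        by_digest.values(),
        key=lambda group: max(t["created"] for t in group),
        reverse=True,
    )
    return [t for group in ordered[keep:] for t in group]

tags = [
    {"name": "app:1.0", "sha256": "aaa", "created": 1},
    {"name": "app:stable-1", "sha256": "aaa", "created": 2},
    {"name": "app:2.0", "sha256": "bbb", "created": 3},
    {"name": "app:latest", "sha256": "bbb", "created": 4},
    {"name": "app:3.0", "sha256": "ccc", "created": 5},
]
doomed = sorted(t["name"] for t in tags_to_delete(tags, keep=2))
print(doomed)  # ['app:1.0', 'app:stable-1']
```

Both tags of the oldest digest go, so the two surviving images keep all of their tags, as proposed.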

where do we need to run the command ?

Hi, I'm new to Artifactory.
I'm using Windows, and when I try to run the command it shows:
artifactory-cleanup: The term 'artifactory-cleanup' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.

Please help me.

Provide JSON Schema for config file

To enhance the developer experience and facilitate validation of the config in CI/CD pipelines, it would be nice to have a JSON Schema to validate the config against. Given a schema, the config file could be validated with e.g. yajsv:

yajsv -s artifactory-cleanup-schema.yaml artifactory-cleanup.yaml

Users of VSCode with the YAML extension (or IntelliJ users, natively) could add this at the top of their config to get auto-completion and documentation:

# yaml-language-server: $schema=https://raw.githubusercontent.com/devopshq/artifactory-cleanup/master/artifactory-cleanup-schema.yaml

(see blog-post about this extension for more info)

Here is a start for such a JSON Schema:
---
"$schema": http://json-schema.org/draft-07/schema#
type: object
properties:
  artifactory-cleanup:
    type: object
    properties:
      server:
        type: string
        format: uri
      user:
        type: string
      password:
        type: string
      policies:
        type: array
        items:
          type: object
          properties:
            name:
              type: string
            rules:
              type: array
              description:
                See all rules in
                https://github.com/devopshq/artifactory-cleanup#rules
              items:
                type: object
                properties:
                  rule:
                    type: string
                  name:
                    type: string
                  days:
                    type: integer
                  count:
                    type: integer
                  property_key:
                    type:
                      - integer
                      - string
                  property_value:
                    type:
                      - integer
                      - string
                  custom_regexp:
                    type: string
                  masks:
                    type: array
                    items:
                      type: string
                required:
                  - rule
                additionalProperties: true
          required:
            - name
            - rules
    required:
      - server
      - user
      - password
      - policies
required:
  - artifactory-cleanup

_collect_docker_size is *very* slow

We have been trying to use the artifactory-cleanup tool to clean up our Docker images repo (which is admittedly a whole street of dumpster fires). We needed to develop the ability to filter on the presence of a specific set of properties (Docker labels), so we were looking deep into the code, trying to figure out what takes so long after the "filter results" log entry. We finally found it was the _collect_docker_size method, specifically this N*M loop:

for artifact in new_result:
    artifact['size'] = sum([docker_layer['size'] for docker_layer in artifacts_list if
                            docker_layer['path'] == '{}/{}'.format(artifact['path'], artifact['name'])])

We are working on optimizations for this, but if anyone has ideas (we are not super-awesome python devs) for optimizing that process, we would be grateful.

Stats:

  • Our "real" Docker repo has ~100,000 Docker images, and the "artifacts_list" has ~1.5M entries. It has been running for over 19 hours and has not finished processing yet.
  • It takes several minutes to process our "test-docker" repo with only 5,000 images and ~100K layers.
  • If you want to apply more than one filter, such as "last downloaded over 30 days ago" and "property value contains", it runs the size collection twice (this should be fairly easy to fix).

Ostensibly, this is just for show anyhow, for the table at the end. We may start by making the size calculation optional.

~tommy
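One idea for the N*M loop above: bucket the layer sizes by path once, then look each image up in the bucket, making the pass O(N + M). A sketch under the assumption that layers and images carry path/name/size keys as in the snippet above:

```python
from collections import defaultdict

def collect_docker_size(images, artifacts_list):
    """Sum layer sizes per image in O(N + M) instead of O(N * M):
    accumulate every layer's size under its path once, then look
    each image up by its "path/name" key."""
    size_by_path = defaultdict(int)
    for layer in artifacts_list:
        size_by_path[layer["path"]] += layer["size"]
    for image in images:
        image["size"] = size_by_path["{}/{}".format(image["path"], image["name"])]
    return images

images = [{"path": "repo/app", "name": "1.0"}]
layers = [
    {"path": "repo/app/1.0", "size": 100},
    {"path": "repo/app/1.0", "size": 250},
    {"path": "repo/other/2.0", "size": 999},
]
print(collect_docker_size(images, layers)[0]["size"])  # 350
```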

fail to delete maven-metadata.xml with 404 after pom was deleted

As described here, Artifactory deletes maven-metadata.xml automatically after the .pom file is deleted.

The cleanup tool tries to delete maven-metadata.xml afterwards and fails with a 404 because it is missing.

DESTROY MODE - delete 'my_artifact/maven-metadata.xml - 833B'
    sys.exit(ArtifactoryCleanupCLI())
  File "/usr/local/lib/python3.9/site-packages/plumbum/cli/application.py", line 177, in __new__
    return cls.run()
  File "/usr/local/lib/python3.9/site-packages/plumbum/cli/application.py", line 634, in run
    retcode = inst.main(*tailargs)
  File "/usr/local/lib/python3.9/site-packages/artifactory_cleanup/cli.py", line 121, in main
    for summary in cleanup.cleanup(
  File "/usr/local/lib/python3.9/site-packages/artifactory_cleanup/artifactorycleanup.py", line 59, in cleanup
    policy.delete(artifact, destroy=self.destroy)
  File "/usr/local/lib/python3.9/site-packages/artifactory_cleanup/rules/base.py", line 313, in delete
    r.raise_for_status()
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error:  for url: https://artifactory-url/artifactory/my_artifact/maven-metadata.xml

"or" for rules

As I understand it (please correct me if I'm wrong), when several rules are specified they are combined as an intersection (logical "and"). For example, combining "DeleteNotUsedSince" and "DeleteOlderThanNDaysWithoutDownloads" takes the intersection, although logically I would want a union (logical "or") here.

Maybe there is a workaround that I haven't thought of. Splitting it into two nearly identical stages is very inconvenient to debug, because the same files are checked twice and deleted twice.
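Until rules support a logical "or", one workaround is to compute each rule's candidate set separately and merge them before deleting, so files are checked and removed only once. A sketch of the dedup step (the (repo, path, name) key is my assumption of what identifies an artifact):

```python
def union_artifacts(*result_sets):
    """Merge several rule results, keeping each artifact only once.
    Artifacts are identified by (repo, path, name)."""
    seen = {}
    for results in result_sets:
        for artifact in results:
            key = (artifact["repo"], artifact["path"], artifact["name"])
            seen.setdefault(key, artifact)
    return list(seen.values())

not_used = [{"repo": "r", "path": "a", "name": "x.jar"}]
no_downloads = [
    {"repo": "r", "path": "a", "name": "x.jar"},
    {"repo": "r", "path": "b", "name": "y.jar"},
]
merged = union_artifacts(not_used, no_downloads)
print(len(merged))  # 2
```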

_collect_docker_size queries for all items in the registry

Running an AQL query to get all items is very slow on large repositories. I also use object storage for the binary store, which likely contributes to slower queries.

Example rule combination that I'm trying to use:

    - name: Example
      rules:
        - rule: Repo
          name: "docker"
        - rule: IncludePath
          masks: "app/*"
        - rule: DeleteDockerImagesOlderThan
          days: 14

args = ["items.find", {"$or": [{"repo": repo} for repo in docker_repos]}]

I tested replacing this line with args = ["items.find", {"$or": [{'path': {'$match': 'app/*'}}]}] and it is significantly faster while retaining the size info.

Happy to attempt to contribute a fix. I thought about two potential options: disabling the size lookup, or accepting a mask on DeleteDockerImagesOlderThan.

CI pipeline branching strategy

Hi guys,

Is it possible to trigger the PyPI publish pipeline only when a new release is created?
That way we could make sure to only release to PyPI on new releases.
We could then add multiple features (i.e. PRs to master) within one release and only publish the finished thing.

Delete artifacts that are older than N days but never been downloaded

If someone deployed artifacts or pushed Docker images several days ago and nobody has downloaded or pulled them since, those artifacts and Docker images can be deleted.

CleanupPolicy(
    'Delete artifacts that are older than 30 days but never been downloaded',
    rules.repo('team-technology-maturity-locator'),
    rules.delete_older_than_n_days_without_downloads(days=30),
),
CleanupPolicy(
    'Delete images that are older than 30 days but never been downloaded',
    rules.repo('team-docker-maturity-locator'),
    rules.delete_docker_images_older_than_n_days_without_downloads(days=30),
)

404 when destroy flag is set

Using the "devopshq/artifactory-cleanup:latest" Docker image, I get a 404 error.

Command:

artifactory-cleanup --destroy --user admin --password SOME_PASSWORD --artifactory-server https://SOME_URL --config rules.py

Output:

Filter artifacts - rule: repo - Apply the rule to one repository.
Filter artifacts - rule: exclude_docker_images - Exclude Docker images by name and tags.
Filter artifacts - rule: delete_docker_images_not_used - Removes Docker image not downloaded days days
Found 1534 artifacts AFTER filtering
DESTROY MODE - delete docker-local/SOME_IMAGE
Traceback (most recent call last):
  File "/usr/local/bin/artifactory-cleanup", line 8, in <module>
    sys.exit(ArtifactoryCleanup())
  File "/usr/local/lib/python3.9/site-packages/plumbum/cli/application.py", line 178, in __new__
    return cls.run()
  File "/usr/local/lib/python3.9/site-packages/plumbum/cli/application.py", line 624, in run
    retcode = inst.main(*tailargs)
  File "/usr/local/lib/python3.9/site-packages/artifactory_cleanup/artifactorycleanup.py", line 179, in main
    cleanup_rule.delete(artifact, destroy=self._destroy)
  File "/usr/local/lib/python3.9/site-packages/artifactory_cleanup/rules/base.py", line 208, in delete
    r.raise_for_status()
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://SOME_URL/docker-local/SOME_IMAGE

Proposal: YAML format for rules

Engineers who support Artifactory sometimes don't know python but almost everyone is familiar with YAML.

Right now artifactory-cleanup imports rules only from Python, but I think the library's adoption is limited because it requires knowing how to write a Python script and being familiar with the syntax.

For me, if I find a useful tool in a different language and it's configurable via YAML / JSON / TOML, that's fine; I'll use it.
But if I find a tool that requires me to write a few lines of Ruby code to configure it, I'll think twice and try to find another solution.

The basic idea behind using Python in the first place was that it gives a lot of comfort to people who know Python: IDE suggestions, highlighting, and a flexible mechanism for importing from different files.

I think we should move our configuration to the YAML format, as follows:


artifactory-cleanup.yaml (can be changed in CLI)

artifactory-cleanup:
  server: https://repo.example.com/artifactory
  # $VAR is auto populated from environment variables
  user: $ARTIFACTORY_USERNAME
  password: $ARTIFACTORY_PASSWORD

  policies:
    - name: Remove all .tmp files older than 7 days
      rules:
        - rule: RepoByMask
          mask: "*.tmp"
        - rule: DeleteOlderThan
          days: 7

    - name: My Docker Cleanup Policies
      rules:
        - rule: RepoByMask
          mask: "docker-*-tmp"
        - rule: DeleteOlderThan
          days: 7

    - name: My Docker Cleanup Policies
      rules:
        - rule: RepoByMask
          mask: "docker-*-tmp"
        - rule: ExcludeDockerImages
          masks:
            - '*:latest'
            - '*:release*'
        - rule: DeleteOlderThan
          days: 7

How to set retention policy in jfrog Artifactory

Looking for a way to set a retention period in JFrog Artifactory which will remove SNAPSHOT versions older than 100 days.

If any team needs particular files kept in Artifactory forever, some paths in that repo need to be excluded from the retention policy, while the other directories are removed per the retention policy.

DeleteDockerImagesOlderThanNDaysWithoutDownloads not useable due to incorrect dict merge

When trying to clean a repository's Docker images, filtering by their last download date, an error occurs that indicates non-standard dict usage.


It seems that dicts are being used as members of a set literal, which is not possible since dicts are not hashable and therefore cannot be set members.

filter_ = {
    {"stat.downloads": {"$eq": None}},
    {"created": {"$lte": last_day.isoformat()}},
}
filters.append(filter_)

I will create a pull request which implements a fix updating the filter list properly.
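The root cause is reproducible in a few lines: braces around dict literals create a set, and dicts are unhashable, so the filters have to be accumulated in a list instead (dates below are placeholders):

```python
filters = []

# Buggy form: {...} around dict literals builds a *set*, and dicts
# are unhashable, so this raises TypeError at construction time.
try:
    filter_ = {
        {"stat.downloads": {"$eq": None}},
        {"created": {"$lte": "2022-01-01"}},
    }
except TypeError as exc:
    print(exc)  # unhashable type: 'dict'

# Fixed form: extend the filter list with the individual dicts.
filters.extend([
    {"stat.downloads": {"$eq": None}},
    {"created": {"$lte": "2022-01-01"}},
])
print(len(filters))  # 2
```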

Formatting Question - KeepLatestNVersionImagesByProperty

Thank you for developing this wonderful tool; it has been very useful.
That said, I am having some issues getting some of the Docker rules to work.

For example, my yaml file looks like this:

artifactory-cleanup:
  server: $SERVER
  user: $USER
  password: $PASS

  policies:
     - name: Test
       rules:
         - rule: RepoList
           repos:
             - docker-repo
         - rule: KeepLatestNVersionImagesByProperty
           count: 2

and I'm getting the following error after the result queries:

********************************************************************************
Found 1457 artifacts
Filter artifacts - rule: RepoList - Apply the policy to a list of repositories.
Filter artifacts - rule: KeepLatestNVersionImagesByProperty - Leaves ``count`` Docker images with the same major.
Traceback (most recent call last):
  File "/Users/nferrovia/Library/Python/3.8/bin/artifactory-cleanup", line 8, in <module>
    sys.exit(ArtifactoryCleanupCLI())
  File "/Users/nferrovia/Library/Python/3.9/lib/python/site-packages/plumbum/cli/application.py", line 177, in __new__
    return cls.run()
  File "/Users/nferrovia/Library/Python/3.9/lib/python/site-packages/plumbum/cli/application.py", line 634, in run
    retcode = inst.main(*tailargs)
  File "/Users/nferrovia/Library/Python/3.9/lib/python/site-packages/artifactory_cleanup/cli.py", line 121, in main
    for summary in cleanup.cleanup(
  File "/Users/nferrovia/Library/Python/3.9/lib/python/site-packages/artifactory_cleanup/artifactorycleanup.py", line 53, in cleanup
    artifacts_to_remove = policy.filter(artifacts)
  File "/Users/nferrovia/Library/Python/3.9/lib/python/site-packages/artifactory_cleanup/rules/base.py", line 284, in filter
    artifacts = rule.filter(artifacts)
  File "/Users/nferrovia/Library/Python/3.9/lib/python/site-packages/artifactory_cleanup/rules/docker.py", line 235, in filter
    grouped = pydash.group_by(artifacts, iteratee=_groupby)
  File "/Users/nferrovia/Library/Python/3.9/lib/python/site-packages/pydash/collections.py", line 397, in group_by
    key = cbk(value)
  File "/Users/nferrovia/Library/Python/3.9/lib/python/site-packages/artifactory_cleanup/rules/docker.py", line 232, in _groupby
    return self.get_version(artifact)[: self.number_of_digits_in_version]
  File "/Users/nferrovia/Library/Python/3.9/lib/python/site-packages/artifactory_cleanup/rules/docker.py", line 222, in get_version
    raise ValueError(f"Can not find version in '{artifact}'")
ValueError: Can not find version in '{'repo': ............

The error leads me to believe it is some mistake on my part in defining the versions, but I'm not sure what I'm doing wrong.
Our Artifactory directory structure looks like this:

Artifactory
     docker-repo
           microservice1
                  v2.31.0 (dockerimage)
                        manifest.json
                        layer1
                        layer2
                        layer....
                  v2.32.0
                       ....
                       ....
                  v2.....
           microservice2
           microservice....
     other-repo1
     other-repo2

It would help me a lot if I could get some guidance on the proper syntax! Thank you.

KeepLatestNFilesInFolder & KeepLatestNFiles keeps oldest files, not newest

I'm not sure if it's a bug or if I misunderstand how the program works, but with the policy written this way, the oldest artifacts are kept, not the newest ones.

This is my example:

  policies:
    - name: example-name
      rules:
        - rule: Repo
          name: example-repo
        - rule: DeleteDockerImagesNotUsed
          days: 200       
        - rule: KeepLatestNFiles
          count: 15
        - rule: KeepLatestNFilesInFolder
          count: 10
        - rule: ExcludeDockerImages
          masks:
            - "*:latest"
            - "*:release*"
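For reference, "keep the newest N" should amount to sorting by creation time descending and deleting everything past the first count entries; a self-contained sketch of that expectation (not the library's code):

```python
def split_keep_newest(artifacts, count):
    """Sort by creation time, newest first; keep the first `count`
    and return the remainder as deletion candidates."""
    ordered = sorted(artifacts, key=lambda a: a["created"], reverse=True)
    return ordered[:count], ordered[count:]

artifacts = [
    {"name": "a", "created": "2022-01-01"},
    {"name": "b", "created": "2022-06-01"},
    {"name": "c", "created": "2022-03-01"},
]
keep, delete = split_keep_newest(artifacts, count=2)
print([a["name"] for a in keep])    # ['b', 'c']
print([a["name"] for a in delete])  # ['a']
```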

Upload artifactory-cleanup to Pypi

Hi,
I appreciate that you have created a great project; it satisfies our cleanup requirements.

Do you have a plan to upload this package to PyPI?

Here is one scenario:
a remote PyPI repository is set up in Artifactory, and
I'd like to install this package on an internal network.

I know we could work around this in many ways, like packaging it in a container or uploading it to a local repository.

formatting question - KeepLatestVersionNFilesInFolder

I'm wondering if I may not be able to use this tool due to our versioning scheme. Each repo has a "Packages" subfolder that I'm including, but each then contains 30+ different file names. I've written the equivalent number of rules for each repo, e.g.:

        - rule: KeepLatestVersionNFilesInFolder
          count: 2
          custom_regexp: "bootstrap-gui-([\d]+\.[\d]+\-[\d]+).*\.rpm"
        - rule: KeepLatestVersionNFilesInFolder
          count: 2
          custom_regexp: "bootstrap-influxdb-([\d]+\.[\d]+\-[\d]+).*\.rpm"
        - rule: KeepLatestVersionNFilesInFolder
          count: 2
          custom_regexp: "bootstrap-ingest-([\d]+\.[\d]+\-[\d]+).*\.rpm"
        - rule: KeepLatestVersionNFilesInFolder
          count: 2

etc. & so forth, to cover each of the different file names. However, when I attempt to run the cleanup, the script gives me this error:

ValueError: invalid literal for int() with base 10: '6-4'

I have tested each individual regex to make sure they match the version numbers for each file name. I'm pretty sure I'm just not formatting it the way the script expects. I'm guessing the rule splits the digits on dots? That won't work for us, since our last digit is separated by a dash.

Any guidance would be appreciated.
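For illustration, the failure and a dash-tolerant parse can be reproduced locally (this is my guess at the cause, not a confirmed reading of the rule's code):

```python
import re

captured = "1.6-4"  # what a regexp like ([\d]+\.[\d]+\-[\d]+) captures

# int() on a dash-joined component is exactly the reported failure:
try:
    int("6-4")
except ValueError as exc:
    print(exc)  # invalid literal for int() with base 10: '6-4'

# Splitting on either separator yields comparable integer tuples:
version = tuple(int(part) for part in re.split(r"[.-]", captured))
print(version)  # (1, 6, 4)
```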

Possibility to use Regex

It would be interesting if it were possible to use regex when configuring a rule in the artifactory-cleanup.yaml file, as in the example below:

- rule: IncludeFilename
  masks:
    - "*/production-[0-9]+"
    - "*.develop-[0-9]+"

That way I believe you would have more control over what you look for in a repository.

Missing docs/RULES in master

I can't find the docs/RULES directory with the available rules in the master branch. Is this a bug or a feature? ;-)
