oras-project / oras-py

ORAS Python SDK

Home Page: https://oras-project.github.io/oras-py/

License: Apache License 2.0

Dockerfile 1.39% Makefile 0.27% Python 97.19% Shell 1.15%
oci oci-registry-as-storage oras packages registry

oras-py's Issues

Disable tgz extraction on pull

It seems that client.pull() automatically extracts archives, which is a nice feature. However, I don't need it, so is it possible to disable it?

I've also noticed that the result from the pull is an invalid path. The actual path is the path to the extracted folder; however, the returned path includes the tgz, which is in fact in another folder (albeit one with a random name).

[temporary] chunked upload disabled (could not push size > 1024)

Hi,

Whenever I try to push something larger than 1023, it fails as follows:

root@yuval:~# python3 -c 'open("foo", "wb").write(bytes(1024*"a", "ascii"))'
root@yuval:~# oras-py push -i --disable-path-validation registry.container-registry.svc.cluster.local:5000/test/foo:123 ~/foo 
blob upload invalid
Issue with http://registry.container-registry.svc.cluster.local:5000/v2/test/foo/blobs/uploads/fcda98d7-1086-48f2-bd8a-a3f41212f2b1?_state=bMUNu4I6M65BujbiZ80Qd1CnpzS8eEdNXwQkX1Pko0x7Ik5hbWUiOiJ0ZXN0L2ZvbyIsIlVVSUQiOiJmY2RhOThkNy0xMDg2LTQ4ZjItYmQ4YS1hM2Y0MTIxMmYyYjEiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMDktMjJUMTY6MjQ6NDkuNDEwNTYzNTY5WiJ9&digest=sha256%3A2edc986847e209b4016e141a6dc8716d3207350f416969382d431539bf292e4a:
Not Found
root@yuval:~# 

This is the log on the registry's side:

time="2022-09-22T16:25:13.641318415Z" level=info msg="response completed" go.version=go1.11.2 http.request.contenttype="application/octet-stream" http.request.host="registry.container-registry.svc.cluster.local:5000" http.request.id=29d94599-bd34-473e-97f1-299a9eeea5e6 http.request.method=POST http.request.remoteaddr="10.1.96.136:54946" http.request.uri="/v2/test/foo/blobs/uploads/" http.request.useragent="python-requests/2.28.1" http.response.duration=12.64265ms http.response.status=202 http.response.written=0 
10.1.96.136 - - [22/Sep/2022:16:25:13 +0000] "POST /v2/test/foo/blobs/uploads/ HTTP/1.1" 202 0 "" "python-requests/2.28.1"
time="2022-09-22T16:25:13.658398142Z" level=info msg="response completed" go.version=go1.11.2 http.request.contenttype="application/octet-stream" http.request.host="registry.container-registry.svc.cluster.local:5000" http.request.id=b193e768-3131-4d4c-8c2f-dd52023512e4 http.request.method=PATCH http.request.remoteaddr="10.1.96.136:54946" http.request.uri="/v2/test/foo/blobs/uploads/ab44b745-df06-485e-a6e0-1457ebfe117b?_state=6bmyqPyB_aox0XrXFHE-SvHnbvRZyO2fTU4bZOklkQR7Ik5hbWUiOiJ0ZXN0L2ZvbyIsIlVVSUQiOiJhYjQ0Yjc0NS1kZjA2LTQ4NWUtYTZlMC0xNDU3ZWJmZTExN2IiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMDktMjJUMTY6MjU6MTMuNjMxNDM0NzEyWiJ9" http.request.useragent="python-requests/2.28.1" http.response.duration=13.321217ms http.response.status=202 http.response.written=0 
10.1.96.136 - - [22/Sep/2022:16:25:13 +0000] "PATCH /v2/test/foo/blobs/uploads/ab44b745-df06-485e-a6e0-1457ebfe117b?_state=6bmyqPyB_aox0XrXFHE-SvHnbvRZyO2fTU4bZOklkQR7Ik5hbWUiOiJ0ZXN0L2ZvbyIsIlVVSUQiOiJhYjQ0Yjc0NS1kZjA2LTQ4NWUtYTZlMC0xNDU3ZWJmZTExN2IiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMDktMjJUMTY6MjU6MTMuNjMxNDM0NzEyWiJ9 HTTP/1.1" 202 0 "" "python-requests/2.28.1"
time="2022-09-22T16:25:13.662331644Z" level=error msg="upload resumed at wrong offest: 1024 != 0" go.version=go1.11.2 http.request.host="registry.container-registry.svc.cluster.local:5000" http.request.id=fca33ad6-2372-4ba1-b953-d0d7b70871eb http.request.method=PUT http.request.remoteaddr="10.1.96.136:54946" http.request.uri="/v2/test/foo/blobs/uploads/ab44b745-df06-485e-a6e0-1457ebfe117b?_state=6bmyqPyB_aox0XrXFHE-SvHnbvRZyO2fTU4bZOklkQR7Ik5hbWUiOiJ0ZXN0L2ZvbyIsIlVVSUQiOiJhYjQ0Yjc0NS1kZjA2LTQ4NWUtYTZlMC0xNDU3ZWJmZTExN2IiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMDktMjJUMTY6MjU6MTMuNjMxNDM0NzEyWiJ9&digest=sha256%3A2edc986847e209b4016e141a6dc8716d3207350f416969382d431539bf292e4a" http.request.useragent="python-requests/2.28.1" vars.name="test/foo" vars.uuid=ab44b745-df06-485e-a6e0-1457ebfe117b 
10.1.96.136 - - [22/Sep/2022:16:25:13 +0000] "PUT /v2/test/foo/blobs/uploads/ab44b745-df06-485e-a6e0-1457ebfe117b?_state=6bmyqPyB_aox0XrXFHE-SvHnbvRZyO2fTU4bZOklkQR7Ik5hbWUiOiJ0ZXN0L2ZvbyIsIlVVSUQiOiJhYjQ0Yjc0NS1kZjA2LTQ4NWUtYTZlMC0xNDU3ZWJmZTExN2IiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMDktMjJUMTY6MjU6MTMuNjMxNDM0NzEyWiJ9&digest=sha256%3A2edc986847e209b4016e141a6dc8716d3207350f416969382d431539bf292e4a HTTP/1.1" 404 76 "" "python-requests/2.28.1"
time="2022-09-22T16:25:13.666498169Z" level=error msg="response completed with error" err.code="blob upload invalid" err.message="blob upload invalid" go.version=go1.11.2 http.request.host="registry.container-registry.svc.cluster.local:5000" http.request.id=fca33ad6-2372-4ba1-b953-d0d7b70871eb http.request.method=PUT http.request.remoteaddr="10.1.96.136:54946" http.request.uri="/v2/test/foo/blobs/uploads/ab44b745-df06-485e-a6e0-1457ebfe117b?_state=6bmyqPyB_aox0XrXFHE-SvHnbvRZyO2fTU4bZOklkQR7Ik5hbWUiOiJ0ZXN0L2ZvbyIsIlVVSUQiOiJhYjQ0Yjc0NS1kZjA2LTQ4NWUtYTZlMC0xNDU3ZWJmZTExN2IiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMDktMjJUMTY6MjU6MTMuNjMxNDM0NzEyWiJ9&digest=sha256%3A2edc986847e209b4016e141a6dc8716d3207350f416969382d431539bf292e4a" http.request.useragent="python-requests/2.28.1" http.response.contenttype="application/json; charset=utf-8" http.response.duration=4.788669ms http.response.status=404 http.response.written=76 vars.name="test/foo" vars.uuid=ab44b745-df06-485e-a6e0-1457ebfe117b 

Looks like a problem with _chunked_upload. When I modify the code to use _put_upload, it works.

Any ideas?
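
For reference, the registry log above complains about the offset at which the upload resumed. The sketch below is not the oras-py implementation and not a diagnosis of this failure; it only illustrates how per-chunk headers for the PATCH requests of a chunked upload can be computed, assuming the OCI distribution convention that Content-Range carries inclusive start and end offsets. The helper name `iter_chunks` and the 1024-byte default are hypothetical.

```python
from typing import Dict, Iterator, Tuple


def iter_chunks(data: bytes, chunk_size: int = 1024) -> Iterator[Tuple[Dict[str, str], bytes]]:
    """Yield (headers, chunk) pairs, one per PATCH request of a chunked upload."""
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        end = start + len(chunk) - 1  # the end offset is inclusive
        headers = {
            "Content-Type": "application/octet-stream",
            "Content-Length": str(len(chunk)),
            "Content-Range": f"{start}-{end}",
        }
        yield headers, chunk


# Example: a 2500-byte blob is split into three ranges
for headers, chunk in iter_chunks(bytes(2500)):
    print(headers["Content-Range"], len(chunk))
```

Each chunk would be sent in order, and the final PUT carries the digest, so the offsets the client reports have to line up with what the registry has already received.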

OrasClient.push fails on Windows as empty Manifest Config uses /dev/null which is Linux specific

My code runs fine on Linux. On Windows it fails with an error.

Here's my code:

        client = OrasClient(hostname=acr_url)
        
        client.login(
            username=self._manifest_credentials["username"],
            password=self._manifest_credentials["acr_token"],
        )
        target = (
                f"{client.remote.hostname.replace('https://', '')}/testartifact:1.0.0"
            )
        client.push(
                files=["testartifact.json"],
                target=target
            )
 File "<>\oras\client.py", line 131, in push
    return self.remote.push(*args, **kwargs)
  File "<>\oras\provider.py", line 718, in push
    response = self.upload_blob(config_file, container, conf)
  File "<>\oras\provider.py", line 217, in upload_blob
    response = self.put_upload(blob, container, layer)
  File "<>\oras\provider.py", line 474, in put_upload
    with open(blob, "rb") as fd:
OSError: [Errno 22] Invalid argument: 'C:\\dev\\null'

I have made a fix for this and will submit a PR. The fix is in oci.py ManifestConfig:

        if platform.system() == "Windows":
            path = "nul"
        else:
            path = "/dev/null"
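
As an alternative to branching on platform.system(), Python's standard library exposes the null device path as os.devnull ("/dev/null" on POSIX, "nul" on Windows). The following is a minimal sketch of that idea, not the fix that was submitted; the helper name `empty_config_path` is hypothetical.

```python
import os


def empty_config_path() -> str:
    """Return the null device path for the current platform."""
    return os.devnull


# Reading from the null device yields an empty payload, which is what an
# empty manifest config needs.
with open(empty_config_path(), "rb") as fd:
    payload = fd.read()
```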

Enforce branch policies on the repository

To improve the security of the ORAS project we need to enforce the branch policies for this repository. I propose that we enforce the policies as follows:

  • Use the following rules for main and release/* branches:
    • Require PR before merging
      • Require 3 approvals
      • Dismiss stale PR approvals when new commits are pushed
      • Require review from Code Owners
      • Require status checks to pass before merging
      • Require conversation resolution before merging
      • Require signed commits
      • Do not allow bypassing the above settings

Please add your comments and proposals for additional changes to this issue.

Image Index support

As of 0.1.25 it doesn't appear to me this supports Image Indexes.

The lift looks small - maybe as little as adding a media type to defaults.py, a schemas.py definition, and minor work to support it in the Registry object

Sample code might be

class Registry(oras.provider.Registry):
    """
    Oras registry with support for image indexes.
    """

    @decorator.ensure_container
    def get_image_index(self, container, allowed_media_type=None):
        """
        Get an image index as a manifest.

        This is basically Registry.get_manifest with the following changes

        - different default allowed_media_type
        - no JSON schema validation
        """
        if not allowed_media_type:
            default_image_index_media_type = "application/vnd.oci.image.index.v1+json"  # TODO: need this in defaults.py
            allowed_media_type = [default_image_index_media_type]

        headers = {"Accept": ";".join(allowed_media_type)}

        manifest_url = f"{self.prefix}://{container.manifest_url()}"
        response = self.do_request(manifest_url, "GET", headers=headers)
        self._check_200_response(response)
        manifest = response.json()
        # TODO: jsonschema.validate(manifest, schema=...)
        return manifest

If there's interest please let me know and I can put in a PR. In that case I ask if a new method like get_image_index is desired or if something else would be preferred.

In any case, thank you for the project!

Create issue template for oras-py

Hi maintainers,

I think it would be better to replicate the issue template from the ORAS repo in oras-py, since it may help issue creators shape their issue content into a standard structure and add related labels automatically. Following a content structure makes it convenient for issue creators to provide enough context, so that maintainers can triage issues more easily.

Directory mismatch on pull

When I pull an image, the directory specified by "outdir" is repeated. This causes the extracted files to be saved to the wrong directory:

For example, with the following code, this is my result:

import oras.client
from oras.logger import logger

client = oras.client.OrasClient()
print(client.version())

# Read the registry password (a JSON key) from disk
with open("secret.json", "r") as f:
    password = f.read()

image = "<image name:tag>"  # placeholder: replace with the real reference


def pull_to_oci_layout(username, password, outdir, reference):
    try:
        # The registry hostname is the first component of the reference
        hostname = reference.split("/")[0]
        result = client.login(username, password, hostname=hostname)
        print(result)
        client.pull(
            allowed_media_type=[],
            overwrite=True,
            outdir=outdir,
            target=reference,
            debug=True,
        )
    except Exception as e:
        logger.exit(str(e))


pull_to_oci_layout(
    "_json_key",
    password,
    "tmp",
    image,
)

This yields the directory structure:

tmp
└── tmp
    ├── Cargo.lock
    ├── Cargo.toml
    ├── config.json
    ├── loader.wasm
    └── src
        ├── foundation.rs
        ├── foundation_callbacks.rs
        ├── gas.rs
        ├── getters.rs
        ├── internal.rs
        ├── lib.rs
        ├── owner.rs
        ├── owner_callbacks.rs
        ├── tests
        │   └── test_utils.rs
        └── types.rs

I've been trying to figure out why this is occurring but haven't figured it out yet. Has anyone else seen this?

Documentation preview with Netlify?

The first version of the docs isn't merged yet, but I did some messing around tonight and I have a second idea I like much better for our docs! Instead of a mkdocs jekyll template I'm going to use a material mkdocs sphinx template, and then be able to render the Python API / docstrings into it, always with an automated build on merge into main. If appropriate we can also use sphinx gallery for tiny code examples.

Ping @jdolitsky because when we do this refactor it would be nice to have a netlify build to render them! We don't need it yet because the deploy strategy is going to change (jekyll to sphinx) and it would just be unnecessary extra work.

Issue retrieving session url

I'm using the sample code from https://github.com/oras-project/oras-py/blob/main/examples/simple/push.py with Harbor (https://demo.goharbor.io/harbor/projects):

#!/usr/bin/env python3


# This shows an example client. You might need to modify the underlying client
# or provider for your use case. See the oras/client.py (client here) and
# oras.provider.py for what is used here (and you might want to customize
# these classes for your needs).

import argparse
import os

import oras.client
import oras.utils
from oras.logger import logger, setup_logger


def load_manifest_annotations(annotation_file, annotations):
    """
    Disambiguate annotations.
    """
    annotations = annotations or []
    if annotation_file and not os.path.exists(annotation_file):
        logger.exit(f"Annotation file {annotation_file} does not exist.")
    if annotation_file:
        lookup = oras.utils.read_json(annotation_file)

        # not allowed to define both, mirroring oras-go
        if "$manifest" in lookup and lookup["$manifest"]:
            raise ValueError(
                "`--annotation` and `--annotation-file` with $manifest cannot be both specified."
            )

    # Finally, parse the list of annotations
    parsed = {}
    for annot in annotations:
        if "=" not in annot:
            logger.exit(
                f"Annotation {annot} invalid format, needs to be key=value pair."
            )
        key, value = annot.split("=", 1)
        parsed[key.strip()] = value.strip()
    return parsed


def main(args):
    """
    A wrapper around an oras client push.
    """
    manifest_annotations = load_manifest_annotations(
        args.annotation_file, args.annotation
    )
    client = oras.client.OrasClient(insecure=args.insecure)
    try:
        if args.username and args.password:
            client.set_basic_auth(args.username, args.password)
        client.push(
            config_path=args.config,
            disable_path_validation=args.disable_path_validation,
            files=args.filerefs,
            manifest_config=args.manifest_config,
            annotation_file=args.annotation_file,
            manifest_annotations=manifest_annotations,
            quiet=args.quiet,
            target=args.target,
        )
    except Exception as e:
        logger.exit(str(e))


def get_parser():
    parser = argparse.ArgumentParser(
        description="OCI Registry as Storage Python SDK example push client",
        formatter_class=argparse.RawTextHelpFormatter,
    )

    parser.add_argument(
        "--quiet",
        dest="quiet",
        help="suppress additional output.",
        default=False,
        action="store_true",
    )

    parser.add_argument(
        "--version",
        dest="version",
        help="Show the oras version information.",
        default=False,
        action="store_true",
    )

    parser.add_argument("--annotation-file", help="manifest annotation file")
    parser.add_argument(
        "--annotation",
        help="single manifest annotation (e.g., key=value)",
        action="append",
    )
    parser.add_argument("--manifest-config", help="manifest config file")
    parser.add_argument(
        "--disable-path-validation",
        help="skip path validation",
        default=False,
        action="store_true",
    )
    parser.add_argument("target", help="target")
    parser.add_argument("filerefs", help="file references", nargs="+")

    # Debug is added on the level of the command
    parser.add_argument(
        "--debug",
        dest="debug",
        help="debug mode",
        default=False,
        action="store_true",
    )

    parser.add_argument(
        "-c",
        "--config",
        dest="config",
        help="auth config path",
        action="append",
    )

    parser.add_argument("-u", "--username", dest="username", help="registry username")
    parser.add_argument(
        "-p",
        "--password",
        dest="password",
        help="registry password or identity token",
    )
    parser.add_argument(
        "-i",
        "--insecure",
        dest="insecure",
        help="allow connections to SSL registry without certs",
        default=False,
        action="store_true",
    )
    return parser


if __name__ == "__main__":
    parser = get_parser()
    args, _ = parser.parse_known_args()
    setup_logger(quiet=args.quiet, debug=args.debug)
    main(args)

I get the error below:

Issue retrieving session url: {'errors': [{'code': 'BLOB_UPLOAD_UNKNOWN', 'message': 'blob upload unknown to registry'}]} raise ValueError(f"Issue retrieving session url: {r.json()}")

py test-oras.py 
{'Status': 'Login Succeeded'}
Traceback (most recent call last):
  File "/Users/ali/Documents/code/oras/test-oras.py", line 5, in <module>
    client.push(files=["repo.txt"], target="demo.goharbor.io/oras/test:latest")
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/oras/client.py", line 112, in push
    return self.remote.push(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/oras/provider.py", line 592, in push
    response = self.upload_blob(blob, container, layer)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/oras/provider.py", line 214, in upload_blob
    response = self.put_upload(blob, container, layer)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/oras/provider.py", line 377, in put_upload
    raise ValueError(f"Issue retrieving session url: {r.json()}")
ValueError: Issue retrieving session url: {'errors': [{'code': 'BLOB_UPLOAD_UNKNOWN', 'message': 'blob upload unknown to registry'}]}

I am able to push the same file via the oras CLI:

oras push demo.goharbor.io/oras/test:latest repo.txt
Uploading fe7d9872a34c repo.txt
Uploaded  fe7d9872a34c repo.txt
Pushed demo.goharbor.io/oras/test:latest
Digest: sha256:a1c6906c2ba07....

Is the Harbor registry not supported by oras-py?
Is there a working example of pushing a tar file to Harbor?

Provider decision to generalize or simplify

The current decision to have a provider Registry with generic push/pull and args/kwargs is because I wasn't sure whether the design would move toward the "copy" way, where we have a generic idea of a provider that has the same interactions. Moving forward, we have two choices (and it would be good to discuss to come to a decision):

  • extend the current client "provider" to be able to select between different kinds of providers, akin to oras-go, so the current Registry provider might be extended to have one for a filesystem etc.
  • decide that most people just want push/pull to and from a registry, with implied local provider a filesystem and implied remote the Registry.

I haven't seen enough convincing use cases for the first, but we may just not have them in the HPC/national lab/whatever I am in communities. My preference would be to choose the second and then remove the unnecessary client middle layer, and interact with the provider directly. And then we can simplify further (if this is truly the only one) and just call it Oras so interactions are more intuitive, eg.:

from oras.client import Oras
cli = Oras(...)
cli.pull(....)

as opposed to how they are now! I'm putting this note/issue so I don't forget that we need to discuss, when there is a more established group of maintainers.

Bug: Session URL concatenates two url's into one bad one

The following section of code incorrectly concatenates the URL into a bad session URL.

Example:

original session_url: 'https://registry.gitlab.foo.com/v2/project/path/blobs/uploads/3e7bf2a9-xxxx-xxx-xxx-xxxxxxxx?_state=I8fblah'
prefix: 'https://registry.gitlab.foo.com:443'

Result:

session_url: 'https://registry.gitlab.foo.com:443https://registry.gitlab.foo.com/v2/project/path/blobs/uploads/3e7bf2a9-xxxx-xxx-xxx-xxxxxxxx?_state=I8fblah'

Because the registry prefix has the port embedded in it, the session_url.startswith check fails and the prefix is prepended to the already valid session URL.

oras-py/oras/provider.py

Lines 496 to 499 in c817740

prefix = f"{self.prefix}://{container.registry}"
if not session_url.startswith(prefix):
    session_url = f"{prefix}{session_url}"
return session_url

Expected actions:

Parse off the port from the URL when assigning the prefix to match.
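
One way to sidestep the prefix comparison entirely is to check whether the returned session URL is already absolute, and only prepend the prefix when it is relative. This is a sketch under that assumption rather than the fix adopted in oras-py; the function name `resolve_session_url` is hypothetical, and the values mirror the example above.

```python
from urllib.parse import urlparse


def resolve_session_url(prefix: str, session_url: str) -> str:
    """Return an absolute session URL without duplicating the registry prefix."""
    # An absolute URL (one with a scheme) is already usable as-is
    if urlparse(session_url).scheme:
        return session_url
    # Otherwise treat it as a path relative to the registry prefix
    return f"{prefix.rstrip('/')}/{session_url.lstrip('/')}"


prefix = "https://registry.gitlab.foo.com:443"
absolute = "https://registry.gitlab.foo.com/v2/project/path/blobs/uploads/3e7bf2a9?_state=I8fblah"
relative = "/v2/project/path/blobs/uploads/3e7bf2a9?_state=I8fblah"

print(resolve_session_url(prefix, absolute))
print(resolve_session_url(prefix, relative))
```

With this approach the port never has to be parsed off, because an absolute Location value is returned untouched.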

Return the manifest digest in `Registry.push()`

It's often important to correctly identify a recently uploaded artifact. The current Registry.push() method only returns an empty response with the following headers:

{
  'Location': '/v2/XXXX//blobs/sha256:ee7dafd624ee5b0e5a8346730bdb319002aef13190489ffe7832c65816f9e63b',
  'Docker-Distribution-Api-Version': 'registry/2.0',
  'Docker-Content-Digest': 'sha256:ee7dafd624ee5b0e5a8346730bdb319002aef13190489ffe7832c65816f9e63b',
  'X-GUploader-UploadID': 'XXXXXX',
  'Content-Length': '0',
  'Date': 'Mon, 10 Jun 2024 21:35:38 GMT',
  'Server': 'UploadServer',
  'Content-Type': 'text/html; charset=UTF-8',
  'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'
}

The digest returned in the headers does not match the digest of the manifest in the registry though, making it impossible to accurately locate it there.
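
Until push() returns it, one possible client-side workaround is to hash the manifest yourself, assuming you have the exact bytes that were sent to the registry. A manifest's digest is computed over its serialized bytes, so re-serializing the JSON with different whitespace changes the result. The helper name `manifest_digest` is hypothetical.

```python
import hashlib
import json


def manifest_digest(manifest_bytes: bytes) -> str:
    """Compute the sha256 digest string for the exact manifest bytes."""
    return "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()


# The same JSON document serialized two ways yields two different digests
manifest = {"schemaVersion": 2, "layers": []}
compact = json.dumps(manifest).encode("utf-8")
pretty = json.dumps(manifest, indent=2).encode("utf-8")

print(manifest_digest(compact))
print(manifest_digest(pretty))
```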

oras-py doesn't work with unauthenticated pull

Works

oras pull ghcr.io/ngcloudsec/vdb:v1 -o /tmp/vdb

Performing the same action via the python library results in an authentication error with get_manifest.

oras-py pull ghcr.io/ngcloudsec/vdb:v1 --output /tmp/vdb1
This endpoint requires a token. Please set oras.provider.Registry.set_basic_auth(username, password) first or use oras-py login to do the same.
authentication required
Issue with https://ghcr.io/v2/ngcloudsec/vdb/manifests/v1:
Unauthorized

Dinosaur Todos

The development branch (with un-authenticated push and pull working!) is at https://github.com/oras-project/oras-py/tree/design-two, and this is my list of TODO before it can be considered for review (e.g., the first PR to main).

  • add same views with auth
  • docstrings
  • release workflow #9
  • pretty (branded) documentation #9
  • logout command
  • add testing #6
  • isort, mypy, pyflakes, black #6
  • ensure we can push/pull directory, should be tar-d and extracted #8
  • add typing
  • add schemas for manifest, annotations, etc. #8
  • we should have common function to parse errors in json 'errors' -> list -> message #8
  • container bases and automated builds #7

And these need further discussion about design, etc, but I don't think are necessarily blockers for getting a community version for people to start contributing to!

  • add example (custom) GitHub client (talking with a colleague about this - it will be the first "here is how to use the API" example and I'll write docs for that then).
  • difference between plain_http and insecure (I just used the latter)
  • we haven't added path traversal, or cacheRoot to pull
  • environment variables like ORAS_CACHE - what others?
  • refactor internals to be more like oras-go (e.g., provider, copy?) I think I want to be convinced this abstraction is useful.
  • should there be a tags function?
  • need to have git commit, state, added to defaults on install/release. See here. - I'd rather not have this, I don't like when version/install relies on pinging git.

For some background, I made the unwise decision the first time around to mimic the Go client, and that was a bad idea :) This time around I'm implementing everything straight forward (e.g, push and pull) and then I'll add abstraction to make it easy to extend. I am curious about what cases of "copy" have been useful for users or developers? Conceptually it's much more clear with push and pull (and I suspect what 80% of users want!)

Add retry to requests

We will eventually hit cases where a retry is warranted for some upload (I have with oras-go at least). I'm not adding it yet because I haven't hit this case here yet, but when the time comes here is a simple way I implemented it with a wrapper to oras:

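# Requires at module level: import os, random, time
def push(self, uri, push_file, content_type=None, retry=3, sleep=1):
    """
    Push an oras artifact to an OCI registry, retrying with backoff
    """
    tries = 0
    content_type = content_type or pakages.defaults.content_type
    logger.info("Pushing oras {0}".format(uri))
    with pakages.utils.workdir(os.path.dirname(push_file)):
        while tries < retry:
            try:
                return self._push(uri, push_file, content_type)
            except Exception:
                # Re-raise on the final attempt instead of silently returning None
                if tries == retry - 1:
                    raise
                time.sleep(sleep)
                # Grow the wait and add jitter before the next attempt
                sleep = sleep * 2**tries + random.uniform(0, 1)
                tries += 1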

We'd want a decorator in decorators.py that can wrap the main function to do the request. If you hit this issue, please comment so I know to work on it soon!
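
A decorator along those lines might look like the sketch below. Nothing here exists in oras-py: the name `retry`, its parameters, and the injectable `sleeper` (handy for tests) are all hypothetical, and the backoff simply doubles the wait and adds up to one second of jitter.

```python
import functools
import random
import time


def retry(attempts=3, delay=1.0, sleeper=time.sleep):
    """Retry the wrapped function with exponential backoff and jitter."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # Out of attempts: let the last error propagate
                    if attempt == attempts:
                        raise
                    sleeper(wait + random.uniform(0, 1))
                    wait *= 2

        return wrapper

    return decorator


calls = {"count": 0}


@retry(attempts=3, delay=0.0, sleeper=lambda seconds: None)
def flaky_request():
    """Fail twice, then succeed, to exercise the retry path."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = flaky_request()
```

In practice the except clause would likely be narrowed to network-related errors so that programming mistakes are not retried.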

Push fails on Windows if filepath has drive in it

blob is C:\Users\blah\test.json
The command failed with an unexpected error. Here is the traceback:
C does not exist.
Traceback (most recent call last):
File "C:\Users\sunnycarter.azure\cliextensions\aosm\oras\client.py", line 131, in push
return self.remote.push(*args, **kwargs)
File "C:\Users\sunnycarter.azure\cliextensions\aosm\oras\provider.py", line 647, in push
raise FileNotFoundError(f"{blob} does not exist.")

I believe this is because various places in the oras code split on ':', expecting it to separate the file path from the media type. However, ':' also appears in paths on Windows. I'll submit a PR to fix this.
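
To illustrate the idea (this is a sketch, not the PR that was submitted), a splitter can set aside a leading Windows drive prefix such as C:\ before looking for the ':' that separates the path from the media type. The function name `split_path_and_media_type` is hypothetical.

```python
import re
from typing import Optional, Tuple


def split_path_and_media_type(ref: str) -> Tuple[str, Optional[str]]:
    """Split "<path>[:<media type>]" while tolerating a Windows drive letter."""
    drive = ""
    rest = ref
    # A single letter followed by ":\" or ":/" is a drive prefix, not a separator
    if re.match(r"^[A-Za-z]:[\\/]", ref):
        drive, rest = ref[:2], ref[2:]
    if ":" in rest:
        path, media_type = rest.split(":", 1)
        return drive + path, media_type
    return drive + rest, None


print(split_path_and_media_type(r"C:\Users\blah\test.json"))
print(split_path_and_media_type(r"C:\Users\blah\test.json:application/json"))
print(split_path_and_media_type("test.json:application/json"))
```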

Add `get_digest(container)` method on provider.Registry

It can often be quite useful to resolve a tag to the digest where an artifact is stored in the registry. To solve this I duplicated the code of Registry.get_manifest and computed the hash client side:

    @decorator.ensure_container
    def get_digest(self, container):
        self.load_configs(container)
        allowed_media_type = [defaults.default_manifest_media_type]
        headers = {"Accept": ";".join(allowed_media_type)}
        headers.update(self.headers)
        get_manifest = f"{self.prefix}://{container.manifest_url()}"  # type: ignore
        response = self.do_request(get_manifest, "GET", headers=headers)
        self._check_200_response(response)
        return f"sha256:{hashlib.sha256(response.content).hexdigest()}"

You cannot do a pull directly from AWS ECR (namespace problem)

I am trying to perform the following pull with oras-py:

oras-py pull ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/registry-name:latest

And I get the following error:

You are minimally required to include a <namespace>/<repository>

But when I try it directly with the oras CLI:

oras pull ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/registry-name:latest
Works!

Downloading 827a4b2f6d63 .
Downloaded  827a4b2f6d63 .

A namespace is not mandatory in ECR.
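
For illustration, a reference parser that does not insist on a namespace could look like the sketch below. It is not oras-py's container parser; the function name `parse_reference` and the returned dictionary shape are hypothetical. It treats everything before the first '/' as the registry (so a port stays with the host) and accepts a repository with or without a namespace.

```python
from typing import Dict, Optional


def parse_reference(reference: str) -> Dict[str, Optional[str]]:
    """Parse "<registry>/<repository>[:<tag>][@<digest>]" without requiring a namespace."""
    registry, sep, remainder = reference.partition("/")
    if not sep or not remainder:
        raise ValueError(f"{reference} must look like <registry>/<repository>[:<tag>]")
    name, _, digest = remainder.partition("@")
    tag = None
    # Only a colon in the last path segment can introduce a tag
    if ":" in name.rsplit("/", 1)[-1]:
        name, tag = name.rsplit(":", 1)
    return {
        "registry": registry,
        "repository": name,
        "tag": tag,
        "digest": digest or None,
    }


print(parse_reference("ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/registry-name:latest"))
```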

Login on failure with Nexus with basic auth

Hello,

I have a Nexus repository with basic auth enabled (not anonymous pull). I see two issues when pulling:

  1. Login fails even with docker/config.json having credentials properly set
  2. oras-py expects the realm of WWW-Authenticate to be a URL, which is not what Nexus does

Note that pulling the same artifact with Go oras works fine.

I have a question about how basic auth is handled in provider.py when using config.json:

  1. In _load_auth if a matching token is found, it sets _basic_auth but does not change the headers like set_basic_auth does. Shouldn't it be:
            # Case 2: no auth there (wonky file)
            elif not auth:
                return False
            self._basic_auth = auth
            self.set_header("Authorization", "Basic %s" % self._basic_auth)# <<< add
            return True
  2. get_manifest does not load self.headers when doing the query, while a lot of other methods do. Shouldn't it be something like:
        headers = {"Accept": ";".join(allowed_media_type)}
        headers.update(self.headers) # <<< add
        url = f"{self.prefix}://{container.get_manifest_url()}"  # type: ignore
        response = self.do_request(url, "GET", headers=headers)

If I make these two changes, it works but I'm not sure whether this is entirely correct. If it is, I can submit an MR.

*** .venv/lib/python3.10/site-packages/oras/provider.py     2023-03-24 18:54:38.717625157 +0100
--- provider.py 2023-03-24 19:00:13.727639413 +0100
***************
*** 110,115 ****
--- 110,116 ----
              elif not auth:
                  return False
              self._basic_auth = auth
+             self.set_header("Authorization", "Basic %s" % self._basic_auth)
              return True
          return False
  
***************
*** 694,699 ****
--- 695,701 ----
          if not allowed_media_type:
              allowed_media_type = [oras.defaults.default_manifest_media_type]
          headers = {"Accept": ";".join(allowed_media_type)}
+         headers.update(self.headers)
          url = f"{self.prefix}://{container.get_manifest_url()}"  # type: ignore
          response = self.do_request(url, "GET", headers=headers)
          self._check_200_response(response)
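
For context on what the added Authorization header contains: HTTP basic auth sends the base64 encoding of "username:password", which is also the form typically stored in the auth field of a Docker config.json. The sketch below builds that header by hand; the helper name `basic_auth_header` is hypothetical and the credentials are dummy values.

```python
import base64
from typing import Dict


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Build an HTTP Basic Authorization header from a username and password."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = {"Accept": "application/vnd.oci.image.manifest.v1+json"}
headers.update(basic_auth_header("user", "pass"))
print(headers["Authorization"])
```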

Update contributors locally, discussion about workflow

I can run the contributors update locally so we can have the full set of people that have touched this repository represented, but we also need to have discussion for how to deal with #18.

  • Should we allow unsigned commits from actions?
  • How do other oras project repos deal with this?
  • If not, what is a potential workaround?

I'm hoping we can have some automation to handle this so the contributors can be front and center.

Features that we like in conda_oci_mirror

Hi @vsoch,

awesome work on the initial implementation. I think I have some feedback based on our experience implementing something like this for conda-oci-mirror: https://github.com/mamba-org/conda_oci_mirror

We currently push three layers per "conda"-package to the OCI registry. The first layer is the .tar.bz2 package file. The second layer is the index.json file, containing the metadata and dependencies of the package (useful for creating an index over the whole repo). The third file is a .tar.gz of the info/... directory that every conda package contains (that directory contains a list of all files of the package and more interesting metadata that would be nice to access without downloading the whole package).

In our client, we added some easy-access functions for the latter two objects. For the index.json we first fetch the manifest and look for the layer with the appropriate mediaType (vnd.conda.index+json or something like that) and then pull only that layer blob and immediately parse it to json using res.json() from requests.

For the .tar.gz file we pass it on to the tarfile reader to have access to all files inside in a nice way.
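
The two helpers described above can be sketched roughly as follows. This is not the conda_oci_mirror code: the function names are hypothetical, the media type string is a placeholder (the text above only says "vnd.conda.index+json or something like that"), and fetching the blob itself is left out.

```python
import io
import tarfile
from typing import List, Optional

# Placeholder media type, assumed for illustration only
INDEX_MEDIA_TYPE = "application/vnd.conda.index+json"


def find_layer(manifest: dict, media_type: str) -> Optional[dict]:
    """Return the first layer descriptor with the given mediaType, if any."""
    for layer in manifest.get("layers", []):
        if layer.get("mediaType") == media_type:
            return layer
    return None


def list_tar_members(blob: bytes) -> List[str]:
    """List the file names inside a .tar.gz blob held in memory."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return tar.getnames()


manifest = {
    "layers": [
        {"mediaType": "application/octet-stream", "digest": "sha256:aaa"},
        {"mediaType": INDEX_MEDIA_TYPE, "digest": "sha256:bbb"},
    ]
}
index_layer = find_layer(manifest, INDEX_MEDIA_TYPE)
```

Only the digest of the matching layer is then needed to request that single blob, so the full package never has to be downloaded.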

Maybe exposing the get_blob without an underscore could be cool?

I thought having a Layer class (with a .to_json() function) might be nicer than the NewLayer free function that returns a dict (this is just a nitpick, I guess)

We also have a function to retrieve all tags for a given container. This is necessary for efficient mirroring and could also fit well into oras-py I think.

Anyways, this looks super promising, thanks for your work!!

Client request only authorized for same repository: unauthorized when changing repo name

Hello.

We have started using the oras SDK but we are seeing a strange behaviour. This is the scenario:

Scenario

  • artifacts pushed to exampleregistry - repository quefacemos
  • artifacts pushed to the same exampleregistry - repository quefacemos2
  • authenticate oras client successfully with an access token from the az acr registry

pseudocode

Actual behaviour

  • tags_quefacemos = cli.get_tags("exampleregistry/quefacemos") successful request.
  • tags_quefacemos2 = cli.get_tags("exampleregistry/quefacemos2") fails and returns Unauthorized.
  • If we add a new line to login again between the get_tags call, it works!

fail

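import oras.client

# whatever and accessToken are placeholders for real credentials
cli = oras.client.OrasClient()
cli.login(username=whatever, password=accessToken)
tags_quefacemos = cli.get_tags("exampleregistry/quefacemos")
tags_quefacemos2 = cli.get_tags("exampleregistry/quefacemos2")  # returns Unauthorized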

ok

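import oras.client

# whatever and accessToken are placeholders for real credentials
cli = oras.client.OrasClient()
cli.login(username=whatever, password=accessToken)
tags_quefacemos = cli.get_tags("exampleregistry/quefacemos")
cli.login(username=whatever, password=accessToken)  # logging in again makes the next call succeed
tags_quefacemos2 = cli.get_tags("exampleregistry/quefacemos2")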

Expected behaviour

  • Running the same commands from the ORAS Cli works:
oras repo tags exampleregistry/quefacemos

OK

oras repo tags exampleregistry/quefacemos2

OK

Is this expected? Why does the authentication seem to disappear or stop being valid after changing the repository?

Authentication against gitlab registry fails

Authentication against the Container Registry in GitLab appears to fail. The cause is that the authentication endpoint expects the service parameter to be a request parameter, not a header.

A fix here is very easy in authenticate_request (https://github.com/oras-project/oras-py/blob/main/oras/provider.py#L691)

        params = {}  # define further up

        h = oras.auth.parse_auth_header(authHeaderRaw)
        if h.service:
            params["service"] = h.service  # added
            headers.update(
                {
                    "Service": h.service,
                    "Accept": "application/json",
                    "User-Agent": "oras-py",
                }
            )

I know this works: I've patched it in my local branch and it works for both of my private GitLab container registries. However, I don't know whether this may introduce compatibility issues with other container registries. I assume they would ignore unused params, but I don't know this for sure.

Output Details

When I run the script below I get the following console output:

> registry = Registry(hostname="registry.gitlab.company.com:443")
> client = oras.client.OrasClient(registry=registry)
> res = client.pull(target="registry.gitlab.us.lmco.com:443/magelisk/artifact-testing/transfer-manifest.yaml:0.0.1", outdir="./TEMP")

authentication required
Issue with https://registry.gitlab.company.com:443/v2/magelisk/artifact-testing/transfer-manifest.yaml/manifests/0.0.1:
Unauthorized

The originalResponse.headers that goes into the authenticate_request function is:

{'Content-Type': 'application/json', 'Date': 'Wed, 26 Oct 2022 03:08:21 GMT', 'Docker-Distribution-Api-Version': 'registry/2.0', 'Server': 'nginx', 'Www-Authenticate': 'Bearer realm="https://gitlab.company.com/jwt/auth",service="container_registry",scope="repository:magelisk/artifact-testing/transfer-manifest.yaml:pull"', 'X-Content-Type-Options': 'nosniff', 'Content-Length': '197', 'Connection': 'keep-alive'}

The service of container_registry is definitely correct, but not in the headers.

The way I actually figured out what was wrong was by comparing the ORAS CLI debug output. It gets the same 401, and you can see the resulting request URL in the last line with the param set:

$ oras pull registry.gitlab.us.lmco.com:443/magelisk/artifact-testing/transfer-manifest.yaml:0.0.1 --debug

DEBU[0000]  Request URL: "https://registry.gitlab.company.com:443/v2/magelisk/artifact-testing/transfer-manifest.yaml/manifests/0.0.1"
DEBU[0000]  Request method: "GET"
DEBU[0000]  Request headers:
DEBU[0000]    "Accept": "application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, application/vnd.cncf.oras.artifact.manifest.v1+json"
DEBU[0000]    "User-Agent": "oras/0.15.0"
DEBU[0000]  Response Status: "401 Unauthorized"
DEBU[0000]  Response headers:
DEBU[0000]    "Content-Type": "application/json"
DEBU[0000]    "Connection": "keep-alive"
DEBU[0000]    "Www-Authenticate": "Bearer realm=\"https://gitlab.company.com/jwt/auth\",service=\"container_registry\",scope=\"repository:magelisk/artifact-testing/transfer-manifest.yaml:pull\""
DEBU[0000]    "X-Content-Type-Options": "nosniff"
DEBU[0000]    "Server": "nginx"
DEBU[0000]    "Content-Length": "184"
DEBU[0000]    "Docker-Distribution-Api-Version": "registry/2.0"
DEBU[0000]    "Date": "Wed, 26 Oct 2022 03:12:22 GMT"
DEBU[0000]  Request URL: "https://gitlab.company.com/jwt/auth?scope=repository%3Amagelisk%2Fartifact-testing%2Ftransfer-manifest.yaml%3Apull&service=container_registry"
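
As a rough illustration of what the CLI does in that last request, the snippet below parses the Www-Authenticate value shown above and places service and scope in the query string of the realm URL. This is a minimal sketch using only the standard library; `parse_challenge` and `token_url` are hypothetical helper names, not oras-py functions.

```python
import re
from urllib.parse import urlencode


def parse_challenge(header):
    """Split a Www-Authenticate value into its scheme and key="value" pairs."""
    scheme, _, rest = header.partition(" ")
    return scheme, dict(re.findall(r'(\w+)="([^"]*)"', rest))


def token_url(header):
    """Build the token request URL with scope and service as query parameters."""
    scheme, fields = parse_challenge(header)
    if scheme.lower() != "bearer":
        raise ValueError(f"not a Bearer challenge: {scheme}")
    query = {key: fields[key] for key in ("scope", "service") if key in fields}
    return fields["realm"] + "?" + urlencode(query)


header = (
    'Bearer realm="https://gitlab.company.com/jwt/auth",'
    'service="container_registry",'
    'scope="repository:magelisk/artifact-testing/transfer-manifest.yaml:pull"'
)
print(token_url(header))
```

The printed URL has the same form as the final request URL in the debug output, with service sent as a parameter rather than as a header.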

Schema Validation Error on Pull from artefact push with Oras 1.1.0

I get an error when pulling this artifact: ghcr.io/mariusbertram/oci_test:latest

The image was pushed with ORAS 1.1.0, installed via brew:
oras push ghcr.io/mariusbertram/oci_test:latest IdeaProjects/oci/test/:application/vnd.acme.rocket.docs.layer.v1+tar

This is the Error Output:

`File "/opt/homebrew/lib/python3.12/site-packages/oras/provider.py", line 866, in get_manifest
jsonschema.validate(manifest, schema=oras.schemas.manifest)
File "/opt/homebrew/lib/python3.12/site-packages/jsonschema/validators.py", line 1312, in validate
raise error
jsonschema.exceptions.ValidationError: Additional properties are not allowed ('artifactType' was unexpected)

Failed validating 'additionalProperties' in schema:
    {'$schema': 'http://json-schema.org/draft-07/schema',
     'additionalProperties': False,
     'properties': {'annotations': {'type': ['object', 'null', 'array']},
                    'config': {'properties': {'annotations': {'type': ['object',
                                                                       'null',
                                                                       'array']},
                                              'digest': {'type': 'string'},
                                              'mediaType': {'type': 'string'},
                                              'size': {'type': 'number'}},
                               'type': 'object'},
                    'layers': {'items': {'properties': {'annotations': {'type': ['object',
                                                                                 'null',
                                                                                 'array']},
                                                        'digest': {'type': 'string'},
                                                        'mediaType': {'type': 'string'},
                                                        'size': {'type': 'number'}},
                                         'type': 'object'},
                               'type': 'array'},
                    'mediaType': {'type': 'string'},
                    'schemaVersion': {'type': 'number'},
                    'subject': {'type': ['null', 'object']}},
     'required': ['schemaVersion', 'config', 'layers'],
     'title': 'Manifest Schema',
     'type': 'object'}

On instance:
    {'annotations': {'org.opencontainers.image.created': '2024-03-18T08:15:46Z'},
     'artifactType': 'application/vnd.unknown.artifact.v1',
     'config': {'data': 'e30=',
                'digest': 'sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a',
                'mediaType': 'application/vnd.oci.empty.v1+json',
                'size': 2},
     'layers': [{'annotations': {'io.deis.oras.content.digest': 'sha256:097fb7596f38dd92c6be6a3ea188233ba1021c99af143dcd3b3059b1c6c965c5',
                                 'io.deis.oras.content.unpack': 'true',
                                 'org.opencontainers.image.title': 'IdeaProjects/oci/test'},
                 'digest': 'sha256:3f0f2965ee017a5b26e83e9e64b6f218e75f9c4047b62f663db61623c2502065',
                 'mediaType': 'application/vnd.oci.image.layer.v1.tar+gzip',
                 'size': 193}],
     'mediaType': 'application/vnd.oci.image.manifest.v1+json',
     'schemaVersion': 2}

`
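
The failure comes down to the top-level `additionalProperties: False` in the schema combined with a manifest that carries `artifactType`. The sketch below reproduces only that top-level check with plain Python (it does not use jsonschema), and then shows one possible fix, which is an assumption on my side: declaring `artifactType` as an allowed string property. `unexpected_properties` is a hypothetical helper.

```python
def unexpected_properties(instance, schema):
    """Top-level keys that a schema with additionalProperties=False would reject."""
    if schema.get("additionalProperties", True) is not False:
        return []
    allowed = set(schema.get("properties", {}))
    return sorted(key for key in instance if key not in allowed)


# Top-level property names copied from the schema in the error output.
strict_schema = {
    "additionalProperties": False,
    "properties": {
        "annotations": {},
        "config": {},
        "layers": {},
        "mediaType": {},
        "schemaVersion": {},
        "subject": {},
    },
}

# Trimmed version of the manifest from the error output.
manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "artifactType": "application/vnd.unknown.artifact.v1",
    "config": {},
    "layers": [],
}

rejected = unexpected_properties(manifest, strict_schema)

# Possible fix (assumption): allow artifactType as a string property.
relaxed_schema = {
    **strict_schema,
    "properties": {**strict_schema["properties"], "artifactType": {"type": "string"}},
}
rejected_after_fix = unexpected_properties(manifest, relaxed_schema)
```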

Missing verify argument in provider.py

On line 906 of https://github.com/oras-project/oras-py/blob/main/oras/provider.py, the argument verify=self._tls_verify has been omitted. As a result, if you are using tls_verify=False in a dev environment and there is a non-TLS-related error connecting to the registry, you will only receive a TLS error. Apologies if this is not the correct place to report this bug; I did not see an alternative in the contributing guidelines. I would make a PR myself, but I found this at work and cannot do so.

Consider HEAD to blob before upload

Currently we just try pushing a blob and don't check if it already exists. I tried adding this to #32 but hit some bugs that I didn't have bandwidth to work through (and it was adding too much complexity / too many variables to the PR), so I'm creating an issue as a reminder this is worth trying again.
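
For reference, the existence check could look roughly like this. It is a minimal sketch that assumes the registry follows the OCI distribution convention of answering HEAD /v2/<name>/blobs/<digest> with 200 when the blob is present and 404 when it is not. Authentication is not handled, and `blob_url`, `head_status` and `blob_exists` are hypothetical names rather than existing oras-py functions.

```python
import urllib.error
import urllib.request


def blob_url(registry, name, digest):
    """URL of a blob, e.g. https://host/v2/<name>/blobs/<digest>."""
    return f"{registry}/v2/{name}/blobs/{digest}"


def head_status(url, headers=None):
    """Send a HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, headers=headers or {}, method="HEAD")
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; report their code instead.
        return err.code


def blob_exists(registry, name, digest, head=head_status):
    """True if the registry reports the blob as present (HTTP 200)."""
    return head(blob_url(registry, name, digest)) == 200
```

The `head` parameter is there so the check can be exercised in tests with a stub instead of a real registry, which may also help with the question of whether tests should be allowed to push.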

We probably also need to set up some basic tests that are allowed to push to a GitHub packages registry to test that, not sure how others feel about allowing a push during a test (but we can discuss!)

Remove redundant docs

I have about 1-2 days' more worth of checks, but after that we can remove redundant docs from oras-py here, as they are represented under https://github.com/oras-project/oras-www. I will likely tweak the Python page there (retrieved from here) so it has links to the more detailed developer docs/notes here.

[Feature Request] Add "copy" functionality

The copy function is core to many of us using the ORAS CLI. Theoretically, one could re-create the functionality of copy leveraging a combination of the existing push and pull functions. This adds complexity when the desire is to push related artifacts and, due to the nature of docker push, raises concerns that oras push + oras pull may behave differently than oras copy. As such, the CLI is being invoked within a Python application and the SDK is not used.

Success criteria:

  • The ORAS Python client would have a copy function that behaves like the ORAS CLI and includes the recursive copy option.

Note: I haven't spent too much time looking at the code, but I imagine this could be a large effort. And, for us, the work-around of invoking the CLI in code is acceptable. Admittedly, it just isn't desirable.

Error due to default value of manifest_config

If manifest_config is not specified when pushing by oras-py, the implementation currently treats manifest_config as a completely empty string.

oras-py/oras/oci.py

Lines 130 to 136 in 02c5bda

    if not path or not os.path.exists(path):
        path = os.devnull
        conf = {
            "mediaType": media_type or oras.defaults.unknown_config_media_type,
            "size": 0,
            "digest": oras.defaults.blank_hash,
        }

I was using google artifact registry and confirmed that push fails if manifest_config is an empty string.

The current workaround (or maybe this is the right way) is to create a file with only "{}" and read that file to successfully execute the push.

If manifest_config was not specified, why not change it so that the contents of manifest_config are treated as "{}"?
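
To make the difference concrete, here is a small sketch that computes the size and digest of an empty config and of a config containing only "{}". `describe` is a hypothetical helper, not part of oras-py. The two results correspond to the two "Preparing config" descriptors in the log below: size 0 for the failing push and size 2 for the successful one.

```python
import hashlib


def describe(content):
    """Size and sha256 digest of a config blob, in descriptor form."""
    return {
        "size": len(content),
        "digest": "sha256:" + hashlib.sha256(content).hexdigest(),
    }


empty = describe(b"")         # what is sent when manifest_config is not given
empty_json = describe(b"{}")  # what is sent with the workaround file
print(empty)
print(empty_json)
```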

code

import oras.client
import oras.logger
from pathlib import Path
from tempfile import TemporaryDirectory
import os
import json

oras.logger.setup_logger(debug=True, quiet=False)

pwd = Path(__file__).parent.resolve()

client = oras.client.OrasClient()
client.login(username="oauth2accesstoken", password="password")

with TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.json")
    with open(config_path, "w") as f:
        json.dump({}, f)
    res = client.push(files=[str(pwd / "hoge.txt")], target="asia-northeast1-docker.pkg.dev/example-repository/tmp/oras-test:v2", manifest_config=config_path)
print("sucess push with config file")

res = client.push(files=[str(pwd / "hoge.txt")], target="asia-northeast1-docker.pkg.dev/example-repository/tmp/oras-test:v2")
print("success push")

log

Preparing layer {'mediaType': 'application/vnd.oci.image.layer.v1.tar', 'size': 13, 'digest': 'sha256:8f8ad85c91228f6b241b95ecca626a4e9701d7c072ca0c6c001677d797a4af02', 'annotations': {'org.opencontainers.image.title': 'hoge.txt'}}
Service: asia-northeast1-docker.pkg.dev
Scope: repository:example-repository/tmp/oras-test:pull,push
Preparing config {'mediaType': 'application/vnd.unknown.config.v1+json', 'size': 2, 'digest': 'sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a'}
Service: asia-northeast1-docker.pkg.dev
Scope: repository:example-repository/tmp/oras-test:pull,push
Service: asia-northeast1-docker.pkg.dev
Scope: repository:example-repository/tmp/oras-test:pull,push
Successfully pushed asia-northeast1-docker.pkg.dev/example-repository/tmp/oras-test:v2
sucess push with config file
Preparing layer {'mediaType': 'application/vnd.oci.image.layer.v1.tar', 'size': 13, 'digest': 'sha256:8f8ad85c91228f6b241b95ecca626a4e9701d7c072ca0c6c001677d797a4af02', 'annotations': {'org.opencontainers.image.title': 'hoge.txt'}}
Service: asia-northeast1-docker.pkg.dev
Scope: repository:example-repository/tmp/oras-test:pull,push
Preparing config {'mediaType': 'application/vnd.unknown.config.v1+json', 'size': 0, 'digest': 'sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'}
Service: asia-northeast1-docker.pkg.dev
Scope: repository:example-repository/tmp/oras-test:pull,push
Service: asia-northeast1-docker.pkg.dev
Scope: repository:example-repository/tmp/oras-test:pull,push
Trying with provided Basic Authorization...
failed to read config blob: sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
Traceback (most recent call last):
  File "/Users/linsho/repos/oras-py/tmp/./tmp_log.py", line 22, in <module>
    res = client.push(files=[str(pwd / "hoge.txt")], target="asia-northeast1-docker.pkg.dev/example-repository/tmp/oras-test:v2")
  File "/Users/linsho/repos/oras-py/tmp/venv/lib/python3.10/site-packages/oras/client.py", line 132, in push
    return self.remote.push(*args, **kwargs)
  File "/Users/linsho/repos/oras-py/tmp/venv/lib/python3.10/site-packages/oras/provider.py", line 755, in push
    self._check_200_response(self.upload_manifest(manifest, container))
  File "/Users/linsho/repos/oras-py/tmp/venv/lib/python3.10/site-packages/oras/provider.py", line 593, in _check_200_response
    raise ValueError(f"Issue with {response.request.url}: {response.reason}")
ValueError: Issue with https://asia-northeast1-docker.pkg.dev/v2/example-repository/tmp/oras-test/manifests/v2: Not Found

environment

attrs==23.1.0
certifi==2023.7.22
charset-normalizer==3.3.1
idna==3.4
jsonschema==4.19.1
jsonschema-specifications==2023.7.1
oras==0.1.25
referencing==0.30.2
requests==2.31.0
rpds-py==0.10.6
urllib3==2.0.7

Authenticate Request Error For Nexus Registry

Hello! Thanks for all the hard work you've put into this project, it's really cool!

Setup

  • Mac
  • python 3.8
  • oras-py: 0.1.23

Error

So I was following the tutorial to upload a file, and I got an error when I did this with a Nexus Repository. This repository is private, btw.

import oras.client

host = "http://my-docker.my-ip.com"
client = oras.client.OrasClient(hostname=host)
client.login(username="my-username", password="my-password", hostname=host, insecure=True)
client.push(files=["requirements.txt"], target=f"{host}/artifact:v1")

My issue is that I was able to log in, but the push command errors out against the wrong host.

urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='sonatype%20nexus%20repository%20manager', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa32854a230>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

When I added some print statements to provider.py, I found that the www-authenticate header param was being set to that Sonatype realm. However in the Via field, I found my hostname.

 {
...
 'WWW-Authenticate': 'BASIC realm="Sonatype Nexus Repository Manager"',
 'Via': '1.1 my-docker.my-ip.com', 
...
}

So, it seems there's a proxy. I was going to write a workaround for my project that inherits the registry and checks for the Via, but I figured this was a fix that would be useful to the community and wanted to report it.
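
My guess at what goes wrong, sketched below with the standard library only: the challenge here uses the Basic scheme, where the realm is a human-readable label, whereas a Bearer challenge carries a realm that is the URL of a token endpoint. Treating the label as a URL would explain the host name in the error above. `challenge_scheme` and `realm_is_url` are hypothetical helpers, not oras-py functions.

```python
from urllib.parse import urlparse


def challenge_scheme(header):
    """Return the lower-cased auth scheme of a WWW-Authenticate value."""
    return header.split(" ", 1)[0].lower()


def realm_is_url(realm):
    """True only if the realm looks like an http(s) URL with a host."""
    parsed = urlparse(realm)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


header = 'BASIC realm="Sonatype Nexus Repository Manager"'
realm = header.split('realm="', 1)[1].split('"', 1)[0]

if challenge_scheme(header) == "basic" or not realm_is_url(realm):
    # No token endpoint to call: retry the original request with Basic credentials.
    action = "use basic auth"
else:
    action = "request token from realm"
print(action)
```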

Bug: when layer media_type provided, not actually set

  File "/home/runner/.local/lib/python3.8/site-packages/oras/oci.py", line 104, in NewLayer
    return Layer(blob_path=blob_path, media_type=media_type, is_dir=is_dir).to_dict()
  File "/home/runner/.local/lib/python3.8/site-packages/oras/oci.py", line 83, in to_dict
    "mediaType": self.media_type,
AttributeError: 'Layer' object has no attribute 'media_type'
Error: Process completed with exit code 1.
