
pulp_python


A Pulp plugin to support hosting your own pip compatible Python packages.

For more information, please see the documentation or the Pulp project page.

pulp_python's People

Contributors

asmacdo, barnabycourt, beav, bfahr, bmbouter, byoungb, codeheeler, daviddavis, dkliban, dralley, eorochena, fao89, fdobrovolny, gephelps, gerrod3, goosemania, ichimonji10, ipanova, jeremycline, jortel, lubosmj, mdellweg, mhrivnak, mikedep333, pavelpicka, pcreech, pulpbot, rochacbruno, seandst, werwty


pulp_python's Issues

Improve Publications functional tests

Original Pulp Redmine Issue: https://pulp.plan.io/issues/4748

The tests written during the feature change are a stopgap, and have some problems:
#242

The tests need to be broken up. Currently they are fragile and fully dependent on each other to pass. Ideally, the preparation should be moved into each class's setup, but this could slow the tests, since many steps must be in place before a publication can be created. Please also have a look at the utilities; they probably need to be refactored.

Expand coverage for pip install test

Original Pulp Redmine Issue: https://pulp.plan.io/issues/6838

The pip install test uses only the sdist (.tar.gz) package type when checking that it can install 'shelf-reader'. However, the repository also contains a wheel (.whl) build that could be installed but is never checked.

Currently the default behavior of pip install (version 20.1.1 on Python 3.7) appears to fall back to the sdist when the 'wheel' package is not installed (see the pip output below). Discussion on the pip forums about how pip selects a distribution appears to be ongoing, so this behavior could change.

Once the behavior of pip install is documented, the pip install test should be expanded to cover installing from the different package types.

$ pip install --trusted-host localhost -i http://localhost:24816/pulp/content/4352fcae-4ca3-4df1-ab0f-199adc3cec8c/simple/ shelf-reader
Looking in indexes: http://localhost:24816/pulp/content/4352fcae-4ca3-4df1-ab0f-199adc3cec8c/simple/
Collecting shelf-reader
  Downloading http://localhost:24816/pulp/content/4352fcae-4ca3-4df1-ab0f-199adc3cec8c/shelf-reader-0.1.tar.gz (19 kB)
Using legacy setup.py install for shelf-reader, since package 'wheel' is not installed.
Installing collected packages: shelf-reader
    Running setup.py install for shelf-reader: started
    Running setup.py install for shelf-reader: finished with status 'done'
Successfully installed shelf-reader-0.1

As a user, I can express how many old versions of a package to keep during sync

Migrated from Pulp Redmine, original link: https://pulp.plan.io/issues/138

Similar to the yum importer's --retain-old-count feature, we should have a way for users to express to us which versions of a package to keep during syncs. There are a few options we could consider:

  1. We could do exactly as --retain-old-count does, allowing the user to specify an integer of how many packages to keep with 0 meaning keep them all.

  2. We could instead allow the user to express ranges of versions they would like to sync with the package names. Something along the lines of "1.7<=Django<1.8" could express to sync and keep all Django versions in that range.

  3. We could also potentially accomplish both of the above.

I considered option 1 because PyPI is quite a bit different than a yum repository. It gets all versions that a package ever goes through, with sudden API changes as they occur. Yum repositories typically stick to a particular version of a package, and they patch it. Users may be interested in expressing the packages they want to keep in terms of API stability. Option 2 would allow them to do that while also limiting the number of compatible versions they keep. If we decide to go with the last option, we may decide to break this into two stories if that makes sense.

Deliverables:

  • Provide some way for users to limit how many versions of a package they wish to keep via the CLI
  • Modify the importer to honor this new importer setting
  • Write a documentation example of how to use the setting with the CLI
  • Write release notes
  • Document the new importer setting in the importer technical documentation
  • Write unit tests

References:
PEP-440[0] contains the specifications for Python versions, and PyPA has created a reference library[1] for this specification.

[0] https://www.python.org/dev/peps/pep-0440/#version-specifiers
[1] https://packaging.pypa.io/en/latest/specifiers/
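Option 2's range syntax maps naturally onto a half-open version comparison. A minimal stdlib-only sketch of the idea (function names are hypothetical; a real implementation should use PEP 440 semantics via the PyPA reference library cited above rather than this naive parser):

```python
def parse_version(version):
    """Naive parser: handles plain dotted release segments only.

    Pre-releases, epochs, and local versions from PEP 440 are ignored;
    packaging.specifiers handles those correctly.
    """
    return tuple(int(part) for part in version.split("."))

def keep_versions(versions, lower, upper):
    """Keep versions v with lower <= v < upper, as in "1.7<=Django<1.8"."""
    lo, hi = parse_version(lower), parse_version(upper)
    return [v for v in versions if lo <= parse_version(v) < hi]
```

For example, syncing Django with the range "1.7<=Django<1.8" would keep 1.7.0 and 1.7.11 but drop 1.6.9 and 1.8.0.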

Change remote and publisher name

Original Pulp Redmine Issue: https://pulp.plan.io/issues/4377

Since we changed app_label from pulp_python to python, remotes are now
pulp/api/v3/remotes/python/python/

and publishers are
pulp/api/v3/publisher/python/python/

Not pretty. Let's change this to be more descriptive. I suggest "warehouse" or "pypi".

dalley comment:
Let's go with "pypi"

Note, with the Pulp-CLI now supporting python commands, a separate PR would be required to close this issue.

Implement Last Serial for Python Repositories

Original Pulp Redmine Issue: https://pulp.plan.io/issues/7491

When syncing with PyPI, the field last_serial is checked to see whether an update is needed. Currently this field defaults to 0 in the sync operation, so the update is always performed; the optimization can be enabled by adding a last_serial field to each Python repository. This field would be passed to the sync operation and updated after each sync. The value is currently hard-coded to 1000000000 in the PyPI live API when Pulp distributes Python content, but it can be switched to the repository's value once the field exists.
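The skip logic described above is simple to state. A hedged sketch using a plain dict in place of the repository model (function and key names are hypothetical, not the plugin's actual API):

```python
def sync_if_needed(repo, remote_serial, do_sync):
    """Skip the sync when the remote index's last_serial hasn't advanced.

    `repo` is a plain dict standing in for the Python repository model;
    a missing or zero last_serial (today's hard-coded default) always
    triggers a sync, preserving current behavior.
    """
    if remote_serial <= repo.get("last_serial", 0):
        return False
    do_sync()
    repo["last_serial"] = remote_serial  # persist after a successful sync
    return True
```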

Update docs for adding content to a repository

Using the latest pulp 3 container image and pulp-cli 0.8.0 I noticed that the syntax to add content to a repository has changed from repository add --name foo to repository content add --repository foo. This is for the python plugin version 3.2.0.

Remote storage backends don't support twine upload

When performing a twine upload to pulp_python using a remote storage backend like S3 the upload task will fail with the traceback:

pulpcore.tasking.pulpcore_worker:INFO:   File "/usr/local/lib/python3.9/site-packages/pulpcore/tasking/pulpcore_worker.py", line 268, in _perform_task
    result = func(*args, **kwargs)
 
  File "/usr/local/lib/python3.9/site-packages/pulp_python/app/tasks/upload.py", line 43, in upload
    content_to_add = pre_check or create_content(artifact_sha256, filename)
 
  File "/usr/local/lib/python3.9/site-packages/pulp_python/app/tasks/upload.py", line 105, in create_content
    shutil.copy2(artifact.file.path, temp_path)
 
  File "/usr/local/lib/python3.9/site-packages/django/db/models/fields/files.py", line 58, in path
    return self.storage.path(self.name)
 
  File "/usr/local/lib/python3.9/site-packages/django/core/files/storage.py", line 116, in path
    raise NotImplementedError("This backend doesn't support absolute paths.")
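The failure is shutil.copy2 asking the storage backend for a filesystem path, which S3-backed Django storages do not provide. One possible direction, sketched in plain Python (function name is hypothetical; the real fix would live in pulp_python/app/tasks/upload.py and use the artifact's storage-aware file handle):

```python
import os
import shutil
import tempfile

def copy_artifact_to_temp(fileobj, filename):
    """Stream an artifact to a local temp path via the file API.

    shutil.copy2(artifact.file.path, ...) raises NotImplementedError on
    backends like S3 that have no absolute paths; streaming through an
    open file object works for any Django storage backend.
    """
    temp_dir = tempfile.mkdtemp()
    temp_path = os.path.join(temp_dir, filename)
    with open(temp_path, "wb") as dest:
        shutil.copyfileobj(fileobj, dest)
    return temp_path
```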

Revise Pulp to Pulp syncing to use https

Original Pulp Redmine Issue: https://pulp.plan.io/issues/7552

With the new Bandersnatch integration, Pulp Python now requires the sync URL to use https. This requirement causes the Pulp-to-Pulp syncing feature to fail, since Pulp content is hosted over http. Revise Python content hosting to use https so this feature and its corresponding test can be enabled.

Change uniqueness constraint for packages to their sha256

Currently packages are unique by filename, so Pulp cannot hold custom builds of a package it already has. By changing the uniqueness constraint to the sha256 digest, one user can keep a package from PyPI in their repository while another user uploads their own custom build of that package, with no conflict in Pulp. At the repository level, the filename should remain the uniqueness constraint, since the simple API does not allow an index to contain two packages with the same name.
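The proposed uniqueness key is just the file's sha256 digest. A self-contained sketch of computing it (helper name is hypothetical; pulpcore computes artifact digests itself):

```python
import hashlib

def artifact_sha256(path, chunk_size=1024 * 1024):
    """Compute the sha256 digest that would serve as the uniqueness key.

    Reads in chunks so large wheels/sdists don't need to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Two files with the same name but different contents then get distinct keys, while byte-identical uploads deduplicate.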

Publications publish more content than is in the repository

The publish task creates the simple API files by iterating over each package in the repository and creating a page with links to each of its releases. The publish task currently includes every release of a package, even releases that are not part of the repository. Changing the query to check that each release is in the repository's content will fix this.

pulp to pulp syncs fail with 405 on /pypi

Trying to sync one pulp server from another (or even one remote from another on the same instance), results in a 405 when accessing /pypi

"POST /pulp/content/Default_Organization/Library/custom/python/python2/pypi HTTP/1.1" 405 226 "-" "bandersnatch 4.4.0 aiohttp XML-RPC client

The sync doesn't error, but no content is synced.

Test installation of very large lazily-synced package

Original Pulp Redmine Issue: https://pulp.plan.io/issues/7402

Some of the largest packages, such as TensorFlow, can be as large as 340 megabytes in size [0]. Some may even be larger.

If such packages are lazily-synced, and a user tries to install them from Pulp, we should ensure that any delays incurred by Pulp in the process of downloading these packages in the background does not interfere with the operation of the client making the request to Pulp.

A manual test would be acceptable, but an automated test that is not part of the standard CI suite would be better.

[0] https://pypi.org/project/tensorflow/2.3.0/#files

Support pull-through caching

I have a use case for creating a local pypi cache which acts as a pull-through (as defined by pulpcore), not requiring the specification of includes for a remote, and not requiring the downloading of all pypi metadata for an on_demand remote.

PythonDistribution extends Distribution, which supports a direct remote link for a pull-through cache implementation, but the serializer will have to be updated to include the field in the remote definition.

https://github.com/pulp/pulpcore/blob/354383883032277e7a1f7dc7ddf2dc0a5bc40fad/pulpcore/app/serializers/publication.py#L202

Discussing this with @gerrod3, he expressed concern with this implementation strategy alone, because it does not store package artifacts for packages that are requested and resolved - the download is streamed directly from pypi every time.

@gerrod3 pointed out that helper methods for the Remote model would need to be updated to store an artifact on_demand in addition to pull-through caching behavior via Distribution remotes.

https://github.com/pulp/pulpcore/blob/9043a56dfd7f66a243f6ffc9c149d22c53fef14b/pulpcore/app/models/repository.py#L388

As a user, I can choose which package types to sync

Original Pulp Redmine Issue: https://pulp.plan.io/issues/2040

A user should be able to disable syncing of wheels or sdists, and possibly restrict the compression formats (tar.gz, tar.bz2, .zip) used for them. The primary use case is a user who mainly consumes sdists: since there can be many wheels for each sdist, syncing them all can waste considerable resources.

Add RBAC to views

Adding RBAC to plugin views: https://docs.pulpproject.org/pulpcore/plugins/plugin-writer/concepts/index.html#role-based-access-control

Specific Permissions for PyPI views

From PyPI:

There are two possible roles for collaborators:

Maintainer
Can upload releases for a package. Cannot invite collaborators. Cannot delete files, releases, or the project.
Owner
Can upload releases. Can invite other collaborators. Can delete files, releases, or the entire project.

These permissions need to be added to distributions (our indexes) per project, and be auto-assigned when uploading content to the index. The first time a package from a project is uploaded, that user becomes the owner of the project. Subsequent package uploads for that project require the owner or maintainer role. Content that is already present in the index, or that is added through Pulp (syncing or the content API), will need a default owner (the creator of the distribution seems like a good choice). These permission checks will only apply to the /legacy/ and POST /simple/ endpoints for now. All other PyPI APIs will remain viewable by unauthenticated users.
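The role model above can be sketched with a plain role table (all names hypothetical; the real implementation would use pulpcore's RBAC machinery, not dicts):

```python
# Mirrors the PyPI collaborator split quoted above.
ROLE_PERMS = {
    "owner": {"upload", "invite", "delete"},
    "maintainer": {"upload"},
}

def record_upload(roles, user, project):
    """First uploader of a previously unseen project becomes its owner."""
    if not any(p == project for (_, p) in roles):
        roles[(user, project)] = "owner"

def can(roles, user, project, action):
    """Check whether a user's role on a project permits an action."""
    role = roles.get((user, project))
    return role is not None and action in ROLE_PERMS[role]
```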

Publications are broken on S3

Publications don't work on S3 for two reasons:

  • Incorrect Content-Type Header: https://pulp.plan.io/issues/9216
  • Simple index page links are relative which makes the links break when the files are hosted on S3.

Using live repositories on distributions is a simple workaround (and preferred method) for setups that use S3.

As a user, I can upload a python package to a repository from twine

Original Pulp Redmine Issue: https://pulp.plan.io/issues/2887

In pulp_python for Pulp 2, uploads come in over the standard "uploads" REST API. Because of this, the python plugin uses the twine utility to inspect the uploaded Python distribution and parse the appropriate metadata. Rather than continue to depend on twine, or carry identical code to inspect packages, it makes sense to receive those uploads directly from twine [0].

This story will be complete when a user can upload a package to pulp_python using twine upload. This story does not include the use of the standard Pulp 3 upload API.

This story will also be complete when a user can use twine to extract metadata from a Python package (of any type) and create a ContentUnit via the REST API.

dalley comment:
(All of this info applies to legacy PyPI, not Warehouse)

This is the code which twine uses to do uploads [0]

The URL is just the unmodified PyPI endpoint url, e.g. https://upload.pypi.org/legacy/, https://test.pypi.org/legacy/. Nothing special going on there.

At the point where it is shoved into the multipart encoder [1], this is what the raw set of tuples looks like

(sdist)

[('name', 'pulp-python'), ('version', '3.0.0a1.dev0'), ('filetype', 'sdist'), ('pyversion', None), ('metadata_version', '1.0'), ('summary', 'pulp-python plugin for the Pulp Project'), ('home_page', 'http://www.pulpproject.org'), ('author', 'Pulp Project Developers'), ('author_email', '[email protected]'), ('maintainer', None), ('maintainer_email', None), ('license', 'GPLv2+'), ('description', None), ('keywords', None), ('platform', 'UNKNOWN'), ('download_url', None), ('comment', None), ('md5_digest', '1488f866e0a86455e3a90ed8152167bb'), ('sha256_digest', '41b0233eb20db4324c0285720f3c206b78df3219d6e636c09e85cfa22751d857'), ('blake2_256_digest', '9445dbe404dd962f9af7bd915b7e8bd92bd602fa032ec42c5ac33a9c5f6d4cc2'), ('requires_python', None), (':action', 'file_upload'), ('protcol_version', '1'), ('content', ('pulp-python-3.0.0a1.dev0.tar.gz', <_io.BufferedReader name='dist/pulp-python-3.0.0a1.dev0.tar.gz'>, 'application/octet-stream'))]

(bdist_wheel)

[('name', 'pulp-python'), ('version', '3.0.0a1.dev0'), ('filetype', 'bdist_wheel'), ('pyversion', 'py3'), ('metadata_version', '2.0'), ('summary', 'pulp-python plugin for the Pulp Project'), ('home_page', 'http://www.pulpproject.org'), ('author', 'Pulp Project Developers'), ('author_email', '[email protected]'), ('maintainer', None), ('maintainer_email', None), ('license', 'GPLv2+'), ('description', 'UNKNOWN\n\n\n'), ('keywords', None), ('platform', 'UNKNOWN'), ('download_url', None), ('comment', None), ('md5_digest', 'e7589d3c306f46003bcbb90107b16421'), ('sha256_digest', '880c97d59ec6a94a5e35ef49cfeb3be7161d503dc7a7a283894b61bb4b5aacc5'), ('blake2_256_digest', '8848709ab5c62da72825b76477483073e2179e1681c41c8fa9545b18bf7ef93d'), ('requires_dist', 'pulpcore-plugin'), ('requires_python', None), (':action', 'file_upload'), ('protcol_version', '1'), ('content', ('pulp_python-3.0.0a1.dev0-py3-none-any.whl', <_io.BufferedReader name='dist/pulp_python-3.0.0a1.dev0-py3-none-any.whl'>, 'application/octet-stream'))]

The raw bytes of the encoder content are attached as files: one upload with sdist content and one with a bdist_wheel. The request differs slightly between the two. This is the closest I could get to the actual HTTP request, since PyPI is HTTPS-only, which prevents intercepting the real request (e.g. with Wireshark). It's good enough for our purposes.

The POST request is created by the requests library with these options [2]. The Content-Type header is

multipart/form-data; boundary=b9762ceb450a48d98e81d94d363782c0

Where "boundary" is a uuid generated here [3]

Looks like there is also a defect in this code... [4] "protocol" is misspelled, and it is spelled correctly in the register code above. I submitted a PR to twine for this.

[0] https://github.com/pypa/twine/blob/a90be8f57f02630c25cbb7e9f3d9a89578122f6c/twine/repository.py#L120-L172
[1] https://github.com/pypa/twine/blob/a90be8f57f02630c25cbb7e9f3d9a89578122f6c/twine/repository.py#L133
[2] https://github.com/pypa/twine/blob/a90be8f57f02630c25cbb7e9f3d9a89578122f6c/twine/repository.py#L145
[3] https://github.com/requests/toolbelt/blob/master/requests_toolbelt/multipart/encoder.py#L83
[4] https://github.com/pypa/twine/blob/master/twine/repository.py#L127
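For a server-side view, the captured tuples can be reassembled from metadata plus the file entry. A hedged sketch (the helper name is hypothetical; it also uses the corrected "protocol_version" spelling rather than the "protcol_version" typo noted above, which a real endpoint would need to tolerate):

```python
def build_upload_fields(metadata, filename, fileobj):
    """Assemble the form-field tuples twine posts to a /legacy/ endpoint.

    `metadata` holds pairs like ('name', 'pulp-python'); the final
    'content' field carries the distribution file itself, matching the
    multipart layout captured above.
    """
    fields = list(metadata.items())
    fields.append((":action", "file_upload"))
    fields.append(("protocol_version", "1"))
    fields.append(("content", (filename, fileobj, "application/octet-stream")))
    return fields
```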

amacdona comment:
Since the twine uploads go to an "unmodified PyPI endpoint url" this one depends on making a python repository as part of a live API. Once the live API story is written, this one needs to be related to it.

+1, this is a cool issue.

We can't do this just yet, because currently publications are served by an associated distribution, which is a pulpcore feature. To allow this, the python plugin will need to field requests that are namespaced by distribution.base_path. I'm not sure if this can be done while leaving the existing code, or if we would need to alter the workflows for pulp_python to avoid the pulpcore distributions. If this is the way that we have to go, we would need to implement a live API endpoint for upload and somehow also handle the workload of distributions. I imagine this could be done in two ways, 1) we could serve static files (publications) or 2) we could implement a live API for GET requests for the JSON metadata like PyPI does itself.

As a user, I can specify Remote.includes with a requirements.txt

Original Pulp Redmine Issue: https://pulp.plan.io/issues/4711

This will fill out the common workflow:

  1. pip install something from pypi. (pip will do dependency resolution)
  2. pip freeze > requirements.txt (will include packages and all deps)
  3. Create a warehouse remote, with requirements.txt used to populate the includes

This will significantly simplify the workflow, preventing the user from having to hand-parse their own includes.
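The translation from a frozen requirements file to an includes list is mostly line filtering. A minimal sketch (helper name is hypothetical; real requirements files also allow options, markers, and continuations that pip's own parser handles):

```python
def includes_from_requirements(text):
    """Turn `pip freeze` output into a Remote.includes list.

    Drops comments and blank lines; keeps each 'name==version' pin
    verbatim so the sync is limited to exactly the frozen set.
    """
    includes = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip trailing comments
        if line:
            includes.append(line)
    return includes
```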

Add missing fields to PyPi live API info

Original Pulp Redmine Issue: https://pulp.plan.io/issues/7492

The following fields are missing from the info dictionary in the PyPI live API created by python_content_to_info():

bugtrack_url, description_content_type, docs_url, package_url, project_urls {Download, Homepage}, release_url, yanked, yanked_reason

These fields are typically left blank or hold a common default value and are not used very often, but they could be captured when creating PythonPackageContent so that the PyPI live API is complete.

unable to sync python packages via proxy

remote configuration:

$ .local/bin/pulp python remote list
[
  {
    "pulp_href": "/pulp/api/v3/remotes/python/python/c29cd792-70ee-4685-93a4-93406c0b9fb8/",
    "pulp_created": "2021-08-30T18:52:26.441534Z",
    "name": "pypi",
    "url": "https://pypi.org/",
    "ca_cert": null,
    "client_cert": null,
    "tls_validation": true,
    "proxy_url": "http://my-web-proxy.local:8080",
    "pulp_labels": {},
    "pulp_last_updated": "2021-08-30T19:24:11.665512Z",
    "download_concurrency": null,
    "max_retries": null,
    "policy": "on_demand",
    "total_timeout": null,
    "connect_timeout": null,
    "sock_connect_timeout": null,
    "sock_read_timeout": null,
    "headers": null,
    "rate_limit": null,
    "includes": [
      "ansible",
      "ansible-base",
      "ansible-lint"
    ],
    "excludes": [],
    "prereleases": false,
    "package_types": [],
    "keep_latest_packages": 0,
    "exclude_platforms": []
  }
]

error:

$ podman logs --follow pulp
pulp [e4be013a1fb34a829098328c8648769f]: bandersnatch.package:ERROR: Timeout error for ansible (0) not updating. Giving up.

Is it possible to enable verbose logging to see the URL that bandersnatch tries to fetch?

base_url and reverse proxy, LB or else

With pulpcore and other plugins we can set CONTENT_ORIGIN and CONTENT_PATH_PREFIX, which make it possible to use the base_url in a complex environment where the local FQDN is not reachable from outside.

Is it possible to alter the current base_url in the python plugin according to these parameters?

Python module synced from PyPI is not found and cannot be installed if it has "." in name

Running the commands to create a PyPI repository, create a remote (including the module "jaraco.collections"), sync the remote, and create a distribution all complete without issue. However, there is some issue when the module name contains a full stop (".").

pulp python repository create --name test_repo2
pulp python remote create --name test_remote2 --includes '["jaraco-collections"]' --url https://pypi.org/
pulp python repository sync --name test_repo2 --remote test_remote2
pulp python distribution create --name test_dist2 --base-path test_dist2 --repository test_repo2

When attempting to install "jaraco.collections", it is not found in the distribution and the install fails. Our pulp server is named "pulp-server".

(venv) [tom@tom-pc /tmp/venv $ pip install jaraco.collections --index-url https://pulp-server/pypi/test_dist2/simple/ --trusted-host pulp-server
Looking in indexes: https://pulp-server/pypi/test_dist2/simple/
ERROR: Could not find a version that satisfies the requirement jaraco.collections (from versions: none)
ERROR: No matching distribution found for jaraco.collections
(venv) [tom@tom-pc /tmp/venv $ pip install jaraco-collections --index-url https://pulp-server/pypi/test_dist2/simple/ --trusted-host pulp-server
Looking in indexes: https://pulp-server/pypi/test_dist2/simple/
ERROR: Could not find a version that satisfies the requirement jaraco-collections (from versions: none)
ERROR: No matching distribution found for jaraco-collections
(venv) [tom@tom-pc /tmp/venv $ 

Examining the distribution on our pulp server, strange things can be seen regarding the module synced. It is confirmed that "jaraco.collections" has been included.

(screenshot: dist_summary)

Clicking the "jaraco.collections" link (NOTE: dot "." in the name) from this page, we are taken to a page that should have references to the actual module files, but it is empty, and the server displays the name of the module as "jaraco-collections" (NOTE: dash "-" in the name).

(screenshot: click_link_dot)

Tweaking the URL from "jaraco-collections" -> "jaraco.collections", we are taken to a page that does successfully show the module files.

(screenshot: tweak_url_dot)

I am therefore convinced that the "jaraco.collections" module is available on our pulp server for install, but the server is confused by the "-" vs "." in the module name.
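The dot/dash confusion comes from PEP 503 name normalization: the simple API must serve each project page under its normalized name, and a compliant server treats "jaraco.collections" and "jaraco-collections" as the same project. The normalization rule from the spec:

```python
import re

def normalize(name):
    """PEP 503 project-name normalization used by the simple API:
    runs of '-', '_', and '.' collapse to a single '-', lowercased."""
    return re.sub(r"[-_.]+", "-", name).lower()
```

The bug report above suggests pulp_python is normalizing the URL it generates but not the content it serves under that URL (or vice versa), so the two never match.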

Packages from remote unavailable

I'm not sure if this is an issue with the documentation, my method, or a bug. This started as an issue seen on our local pulp repository running a slightly older version; I've now tested it on the latest docker images and am still having no joy.

I have a package, "package", with a dependency on "dependency", both available on PyPI, and I want to be able to install "package" like this:

$ pip install --index-url https://pulp/pulp/content/foo/simple/ package

What happens is an error something like this:

...
  Looking in indexes: https://pulp/pulp/content/foo/simple/
  ERROR: Could not find a version that satisfies the requirement dependency>=1.0.0 (from versions: none)
  ERROR: No matching distribution found for dependency>=1.0.0
...
ERROR: Could not find a version that satisfies the requirement package (from versions: 1.0.0)
ERROR: No matching distribution found for package

Note that pip finds (and downloads) package successfully - it's only the dependencies (which are not in the repository foo) which cause an issue.
The documentation here is not especially clear. My understanding is that it should be sufficient to perform the following:

$ pulp python remote create --name pypi-excludes --url https://pypi.org/ --excludes '["package"]'
$ pulp python repository sync --name foo --remote pypi-excludes

Once the task completes, I notice that no new repository version is created. Nevertheless, something happened: if I inspect the docker volumes, they now contain information from the index. I see that the remote has been assigned to the repository (I also tried to do this by hand, but the docs say the sync does it):

$ pulp python repository show --name foo                                                                                                                                   
{
  "pulp_href": "/pulp/api/v3/repositories/python/python/6f6d94b0-9906-4b34-ba90-2973ccb767d9/",
  "pulp_created": "2021-05-27T11:27:29.503126Z",
  "versions_href": "/pulp/api/v3/repositories/python/python/6f6d94b0-9906-4b34-ba90-2973ccb767d9/versions/",
  "pulp_labels": {},
  "latest_version_href": "/pulp/api/v3/repositories/python/python/6f6d94b0-9906-4b34-ba90-2973ccb767d9/versions/2/",
  "name": "foo",
  "description": null,
  "remote": "/pulp/api/v3/remotes/python/python/051db320-ad11-4e1d-ac84-8d9002f13720/"
}

and here is the repository version:

$ pulp python repository version show --repository foo
{
  "pulp_href": "/pulp/api/v3/repositories/python/python/6f6d94b0-9906-4b34-ba90-2973ccb767d9/versions/2/",
  "pulp_created": "2021-05-27T17:43:42.798016Z",
  "number": 2,
  "base_version": null,
  "content_summary": {
    "added": {
      "python.python": {
        "count": 1,
        "href": "/pulp/api/v3/content/python/packages/?repository_version_added=/pulp/api/v3/repositories/python/python/6f6d94b0-9906-4b34-ba90-2973ccb767d9/versions/2/"
      }
    },
    "removed": {},
    "present": {
      "python.python": {
        "count": 2,
        "href": "/pulp/api/v3/content/python/packages/?repository_version=/pulp/api/v3/repositories/python/python/6f6d94b0-9906-4b34-ba90-2973ccb767d9/versions/2/"
      }
    }
  }
}

(There are 2 things present because I added a second version of the package to see if one had to have the remote already assigned to the repository when adding a new content unit for it to work)

I shall skip the response for the remote since it is long, but it has excludes: ["package"] and policy: "on_demand" as I expected.

Maybe there is some issue with the way I have created the remote, or something else?

By the way, there is a definite documentation bug regarding the policy: the description is inconsistent about the possible values. There is also a common issue throughout the auto-generated API documentation where only the types are included. Without already being fully familiar with the system and API, it is often unclear what object a pulp_href refers to, since all objects have hrefs. In this specific case, the meaning of the different policies is not explained beyond the enum names, but there are many other examples.

pull-through-cache ignores proxy setting

Version
Installed through pip, from pypi:
pulpcore: 3.20.0
pulp_python: 3.7.1

Describe the bug
Pulp is behind an HTTP proxy. Trying to set up a pull-through cache.
Function pull_through_package_simple ignores my remote's proxy setting and tries to establish a direct connection to pypi.

To Reproduce

pulp python remote create --name pypi --url https://pypi.org/ --proxy-url http://10.20.30.40:8080
pulp distribution create --base-path pypi --remote pypi --name pypi

Go to http://yourpulpserver/pypi/pypi/simple/pulp-python/

Expected behavior
Pulp should connect through the http proxy and display the available packages.

Additional context

Quick+dirty fix. Not sending a pull request because I'm not familiar with pulp source code and how things are done around here.

diff --git a/pulp_python/app/pypi/views.py b/pulp_python/app/pypi/views.py
index 4bb9906..5371c83 100644
--- a/pulp_python/app/pypi/views.py
+++ b/pulp_python/app/pypi/views.py
@@ -193,15 +193,19 @@ class SimpleView(ViewSet, PackageUploadMixin):
     def pull_through_package_simple(self, package, path, remote):
         """Gets the package's simple page from remote."""
         def parse_url(link):
             parsed = urlparse(link.url)
             digest, _, value = parsed.fragment.partition('=')
             stripped_url = urlunsplit(chain(parsed[:3], ("", "")))
             redirect = f'{path}/{link.text}?redirect={stripped_url}'
             d_url = urljoin(BASE_CONTENT_URL, redirect)
             return link.text, d_url, value if digest == 'sha256' else ''

         url = remote.get_remote_artifact_url(f'simple/{package}/')
-        response = requests.get(url, stream=True)
+        kwargs = {}
+        if remote.proxy_url:
+            kwargs["proxies"] = {"http": remote.proxy_url, "https": remote.proxy_url}
+
+        response = requests.get(url, stream=True, **kwargs)
         links = parse_links_stream_response(response)
         packages = (parse_url(link) for link in links)
         return StreamingHttpResponse(write_simple_detail(package, packages, streamed=True))
