secure-systems-lab / peps

This project is a fork of python/peps.

Python Enhancement Proposals

Home Page: https://www.python.org/dev/peps/

Languages: Makefile 0.75%, Python 91.52%, C 2.90%, CSS 4.67%, Shell 0.15%

peps's People

Contributors

1st1, abalkin, akuchling, ambv, benjaminp, birkenfeld, brettcannon, dholth, dstufft, ericsnowcurrently, ericvsmith, freddrake, goodger, gvanrossum, ilevkivskyi, larryhastings, loewis, mariatta, ncoghlan, ned-deily, nnorwitz, pitrou, pjeby, rhettinger, rosuav, tim-one, tiran, trishankatdatadog, vstinner, warsaw

Forkers

mnm678, jhdalek55

peps's Issues

Metadata scalability: compression

Caveat: I'm reviewing the PEP as a non-expert in an attempt to help improve clarity; I hope these comments are useful.

The PEP suggests that PyPI SHOULD provide a compressed version of the snapshot metadata. It would be valuable to include more specific recommendations and guidance on how to go about doing so, particularly with regard to the grave warnings about the perils of compression in TAP 10.
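
For illustration, here is a minimal sketch of one way to publish a compressed copy alongside the canonical file; the filenames, the choice of gzip, and the decision to keep hashes over the uncompressed bytes are assumptions for this example, not anything the PEP currently specifies:

```python
import gzip
import hashlib

# Hypothetical layout: snapshot.json.gz published next to snapshot.json.
# The hash recorded in timestamp metadata still covers the uncompressed
# bytes, so clients decompress first and then verify; one way to
# sidestep the compression pitfalls that TAP 10 warns about.
with open("snapshot.json", "rb") as f:
    raw = f.read()

with gzip.open("snapshot.json.gz", "wb") as gz:
    gz.write(raw)

print("sha256 of uncompressed snapshot:", hashlib.sha256(raw).hexdigest())
```

Whatever scheme is chosen, the PEP should spell out which form (compressed or uncompressed) the recorded hashes and lengths refer to, since ambiguity there appears to be the kind of peril TAP 10 warns about.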

Specify hashing algorithms

We do say SHA-2 when it comes to consistent snapshots of targets, but let's also specify that we mean SHA-256 and SHA-512 when it comes to hashes of metadata and targets in the first place. This is to stave off comments from folks who prefer versioned protocols instead of "crypto agility."
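
For reference, a minimal sketch of computing both digests with the standard library (the filename is a placeholder):

```python
import hashlib

# Compute the two digests the PEP would name explicitly (SHA-256 and
# SHA-512) over a target file. "example-1.0.tar.gz" is hypothetical.
with open("example-1.0.tar.gz", "rb") as f:
    data = f.read()

print("sha256:", hashlib.sha256(data).hexdigest())
print("sha512:", hashlib.sha512(data).hexdigest())
```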

Transaction size and upload process

dstufft:

PyPI currently does not allow anything more than a single file per transaction; are we going to be required to change that for this PEP?

Currently on PyPI all of this is handled in-band during the upload, and each file uploaded is treated as a separate transaction. Does this PEP require that all files for a release be uploaded within the same transaction? Are we going to have to move to an out-of-band process for dealing with releases and have the upload API just queue a release?

Clean up references

As we've deleted sections and updated the text, several of the listed references are no longer referred to in the document content. We should clean these up before the updated PEP is merged.

Clarify what "multiple target files" mean

@di said: If I understand correctly, this means that PyPI's on-disk size for target files must grow 3x: once for the existing files (to maintain backwards compatibility), once for the SHA2-256 hash, and once for the SHA2-512 hash. Correct? If so, I think this should probably be noted in the "What Additional Repository Files are Required on PyPI?" section above.

I think it makes sense to also address in the same PR something else @di said: I think "target file" could stand to be defined here. We usually refer to these as "release files" or "artifacts" when working on PyPI/pip, so drawing the parallel here would be helpful.
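
For context, the 3x figure follows from TUF's consistent-snapshot convention of storing each target file under hash-prefixed names in addition to its original path. A rough sketch of the resulting paths (the artifact name is a placeholder):

```python
import hashlib

# "Target file" in TUF terms == "release file"/"artifact" in PyPI/pip terms.
filename = "example-1.0.tar.gz"  # hypothetical artifact
with open(filename, "rb") as f:
    data = f.read()

# Three on-disk names: the original (for backwards compatibility) plus
# one hash-prefixed name per supported algorithm.
paths = [
    filename,
    f"{hashlib.sha256(data).hexdigest()}.{filename}",
    f"{hashlib.sha512(data).hexdigest()}.{filename}",
]
print("\n".join(paths))
```

Note that the hash-prefixed names could plausibly be hard links or storage-backend aliases rather than full copies, in which case actual disk growth would be far less than 3x; whether that option is available seems worth stating in the section @di points to.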

Add a transition plan

Might be wise to discuss a transition plan to slowly but surely migrate pip users to PEP 458 instead of turning it on full-blast all at once. This plan should also discuss how users can temporarily disable using PEP 458 in case of unforeseen issues (such as fixing a bug that requires downloading and installing an update of pip in the first place).

Replace "--no-pep-458" with something stronger

@di said: I think that the example we give here has a high likelihood of becoming the actual flag that pip implements. The current example (--no-pep-458) is obscure enough that a user could be convinced to use it without realizing the effect it has. I think we should use an example flag that is more clear about the outcome, such as --unsafely-disable-package-verification, or something similar.

Clarify snapshot process

@ewdurbin:

If I understand this correctly, this means that for a given release, PyPI must wait for all files for that release to exist before signing metadata? That may be an issue for two reasons:

  • files are uploaded one-by-one currently
  • PyPI allows for upload of additional files for a release at any point in the future
Are these cases allowable?

Use a single hash function

dstufft:

It feels to me like using both sha256 and sha512 here is needless extra cost. We don't know what kinds of attacks are going to be available in the future and which hash functions they're going to affect. I think it would be a better overall idea to have a single hashed location, plus the unhashed location.

CDN

Can we use relative filenames for packages, or do we need to point to absolute locations?

Why doesn't targets delegate directly to bin-n roles?

Why doesn't targets delegate directly to bin-n roles? What is the benefit of having an intermediate bins role? The only purpose I can see is semantic. Delegating from targets directly to bin-n would reduce the complexity of this PEP.
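
For readers comparing the two layouts, a simplified sketch of the delegation shapes in question; the role names follow the PEP, but the structure below is illustrative pseudodata, not the actual TUF metadata format:

```python
BIN_COUNT = 4  # tiny count, for illustration only

# Two-level delegation as the PEP proposes:
#   targets (offline) -> bins (offline) -> bin-0 .. bin-n (online)
proposed = {
    "targets": {"key": "offline", "delegates_to": ["bins"]},
    "bins": {"key": "offline",
             "delegates_to": [f"bin-{i}" for i in range(BIN_COUNT)]},
    **{f"bin-{i}": {"key": "online", "delegates_to": []}
       for i in range(BIN_COUNT)},
}

# The flatter alternative being asked about:
#   targets (offline) -> bin-0 .. bin-n (online)
flat = {
    "targets": {"key": "offline",
                "delegates_to": [f"bin-{i}" for i in range(BIN_COUNT)]},
    **{f"bin-{i}": {"key": "online", "delegates_to": []}
       for i in range(BIN_COUNT)},
}
```

One plausible benefit of the intermediate role, inferred rather than quoted from the PEP: the large hashed-bin delegation list lives in the bins metadata instead of in top-level targets, so the rarely-touched targets role (and its offline key) almost never needs re-signing. Whether that justifies the extra role is the question this issue raises.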

Hash algorithm transition plan

dstufft:

In the future, if/when an attack against the current hash materializes, we'll need a plan to transition to a new hash (which I think we need documented anyway, even with 2 hashes, because the weakness could affect both of them).

Metadata scalability: more detail on hashed bins

Caveat: I'm reviewing the PEP as a non-expert in an attempt to help improve clarity; I hope these comments are useful.

Having read the "Metadata Scalability" section of the PEP and the "Delegate to Hashed Bins" section of the TUF tutorial, I can't help but feel there's some detail missing from the hashed bins proposal.

As a reader looking to implement the PEP, I'd like to better understand:

  • when should the number of bins increase?
  • are changes to the number of bins transparent to the client? i.e. does changing the number of bins require some kind of synchronisation between PyPI and its clients? (See the sketch below this list.)
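
For concreteness, a minimal sketch of the usual hashed-bin assignment in TUF; the bin count and naming here are assumptions for illustration, not values from the PEP:

```python
import hashlib

NUM_BINS = 1024  # hypothetical bin count

def bin_for(target_path: str) -> str:
    """Assign a target to a bin by hashing its path, not its contents."""
    digest = hashlib.sha256(target_path.encode("utf-8")).hexdigest()
    return f"bin-{int(digest, 16) % NUM_BINS}"

print(bin_for("example/example-1.0.tar.gz"))  # hypothetical target path
```

Since the mapping is a pure function of the path and the bin count, resizing moves most targets to different bins, which means regenerating and re-signing the bins delegation and every bin-n file. Clients would then simply follow the updated delegations on their next refresh, so no out-of-band synchronisation seems necessary; but that reasoning is inferred from how TUF delegations work, and is exactly the detail the PEP could state.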

Version number scalability

dstufft:

Does this have to be an incrementing integer? My concern here is simply that an ever-increasing integer can run into scalability problems, particularly if people don't take into account its ever-increasing nature (for example, needing a different integer size in a database).

My second question is, do we have to guarantee these are never reused, or can they be reused after some time has passed?

My gut tells me we'd be better off using a UUID for a version number, or generating a random string of characters. The biggest downside I see to that is the extremely remote chance of generating the same version number twice, but that can be solved by recording all versions ever used (which might be good to do anyway, to return a 410 instead of a 404).
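
Some back-of-the-envelope arithmetic may help frame the concern; the snapshot rate below is an assumption, not a PyPI measurement:

```python
# How long would a signed 64-bit version counter last at one new
# snapshot per second (a deliberately aggressive assumption)?
seconds_per_year = 60 * 60 * 24 * 365
max_versions = 2 ** 63
print(max_versions / seconds_per_year)  # roughly 2.9e11 years
```

So overflow is only a practical risk for columns sized smaller than 64 bits; the sharper questions are the ones about reuse and about clients that assume the number is monotonically increasing.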

Appendix C appears to be unreferenced

Is "Appendix C: PEP 470 and Projects Hosted Externally" still useful? It doesn't appear to be referenced anywhere in the body of PEP 458.

If not, I'll open a PR, or update #35, to remove it.

PyPI and Key Requirements: clarify recommendations around digital signature algorithm

Note: I'm reviewing the PEP as a non-expert in an attempt to help improve clarity; I hope these comments are useful.

The introduction to the section "PyPI and Key Requirements" states that:

Nevertheless, we do NOT recommend any particular digital signature algorithm in this PEP because there are a few important constraints: first, cryptography changes over time; and second, TUF recommends diversity of keys for certain applications.

This contradicts the rest of the PEP, which now feels like it is recommending Ed25519, and is likely to be confusing to an implementer of the PEP.

Do we feel comfortable explicitly recommending Ed25519? Should we be recommending different key types for different roles?
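
For orientation, generating and using an Ed25519 key is nearly a one-liner with the pyca/cryptography package; a minimal sketch (the library choice is ours for illustration, not a recommendation of the PEP):

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Generate an Ed25519 signing key and sign some metadata bytes.
private_key = Ed25519PrivateKey.generate()
message = b"example TUF metadata bytes"  # placeholder payload
signature = private_key.sign(message)

# Verification raises InvalidSignature on failure.
private_key.public_key().verify(signature, message)
```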

Disambiguate bin-n metadata even more

  • Why doesn't targets delegate directly to bin-n roles? What is the benefit of having an intermediate bins role? The only purpose I can see is semantic. Delegating from targets directly to bin-n would reduce the complexity of this PEP.

  • What is the difference between revoking trust in projects and revoking it in versions? If I understand correctly, in both cases the target files corresponding to that project or version need to be removed from the bin-n metadata they are listed in, which then needs to be re-signed with the online key. If that's accurate, why differentiate?

  • Also, is there a case where the bins role (offline key) needs to be re-signed?

  • Disambiguate bins and bin-n in table.

  • Clarify what "projects targets metadata" is (paragraph under the table).

  • Maybe re-think the table. There are multiple instances that describe how
    snapshot, timestamp, and/or bin-n roles need to cooperate for an attacker
    to be successful, which seems less meaningful when these roles share the
    same online key, as is recommended in this PEP.

Rethink the security analysis table

Maybe re-think the table. There are multiple instances that describe how snapshot, timestamp, and/or bin-n roles need to cooperate for an attacker to be successful, which seems less meaningful when these roles share the same online key, as is recommended in this PEP.

Managing Keys: demystify the offline ceremony

Note: I'm reviewing the PEP as a non-expert in an attempt to help improve clarity; I hope these comments are useful.

Between the term "ceremony" and the lack of reasoning for some of the steps, the management of offline keys as described feels mysterious.

For example, why should we "Print and save cryptographic hashes of new TUF metadata"? Under what circumstances will those printouts be used?

Option to disable PEP 458 when necessary

It may be possible that some bug might prevent package managers like pip from using TUF to install or update packages, including pip itself. Therefore, it would be wise to add an undocumented option such as --no-pep-458 to temporarily disable PEP 458 support.

Number of bins

How many bins do we need for the number of targets we expect a year from now?
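
A rough sizing sketch; both inputs are placeholders, not PyPI measurements:

```python
# Hypothetical inputs; substitute real PyPI counts before concluding anything.
projected_targets = 3_000_000  # assumed number of target files a year out
num_bins = 16_384              # assumed bin count

print(projected_targets / num_bins)  # ~183 targets listed per bin-n file
```

The trade-off runs in both directions: fewer bins mean larger individual bin-n files for clients to fetch, while more bins mean a larger delegation list in the bins metadata.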
