
Comments (19)

eryksun commented on September 27, 2024

This is a straightforward fix. Add the statement config['user_site_directory'] = 0 in the following section:

cpython/Modules/getpath.py

Lines 760 to 764 in 69b3e8e

if pth:
    config['isolated'] = 1
    config['use_environment'] = 0
    config['site_import'] = 0
    config['safe_path'] = 1

from cpython.

eryksun commented on September 27, 2024
  • sitecustomize gets imported only after adding the system and user site-packages directories and processing their .pth files. So using sitecustomize to remove the user site-packages directory from the module search path doesn't isolate your application from their effects.
  • Running without isolated mode enabled means your application can be affected by various Python environment variables, including PYTHONHOME, PYTHONSTARTUP, PYTHONPATH, PYTHONIOENCODING, PYTHONUTF8, etc. You'd have to depend on running with the -I or -E command-line arguments.
  • If an empty python._pth file is placed beside "python.exe" on Windows, then it sets home in the "getpath.py" code, and thus sets sys.prefix. The "Lib" directory will have to be co-located. Isolated mode will not be enabled in this case. The environment is ignored with respect to the code in "getpath.py", but not in general (i.e. sys.flags.ignore_environment will be 0). Getting isolated mode, safe-path mode, and (once fixed) disabling the user site-packages, is contingent on the ._pth file being non-empty, in which case its contents determine the initial sys.path.
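To check which of these modes are actually in effect for a given launch, one option is to read sys.flags at startup. This is only a minimal sketch: the helper name isolation_report is hypothetical, and safe_path is read with getattr because that flag only exists on Python 3.11 and later.

```python
import sys


def isolation_report():
    """Return the startup flags discussed above as a plain dict."""
    flags = sys.flags
    return {
        "isolated": flags.isolated,                      # -I
        "ignore_environment": flags.ignore_environment,  # -E (implied by -I)
        "no_user_site": flags.no_user_site,              # -s (implied by -I)
        "no_site": flags.no_site,                        # -S
        # safe_path was added in Python 3.11; older versions report None here.
        "safe_path": getattr(flags, "safe_path", None),
    }


if __name__ == "__main__":
    for name, value in isolation_report().items():
        print(f"{name}: {value}")
```

Running this once with and once without the ._pth file (or with -I) should make it visible whether the environment is really being ignored for the whole interpreter, not just inside getpath.py.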

eryksun commented on September 27, 2024

> It could be another esoteric piece of information that I lack, but I noted that when using sitecustomize on macOS, even though import site is defined in the ._pth file, the site-packages directory, e.g. /lib/python3.11/site-packages, is not added to sys.path by default. On Windows it is. I could of course manually add it in the ._pth file, but that feels hacky.

Maybe the value of sys.prefix is wrong. I don't use macOS, so I can't test this.

tanzim commented on September 27, 2024

While reproducing this, I noted something else on macOS - this is assuming Python 3.11 adds support for ._pth file on macOS:

  1. Placing a ._pth file named after the interpreter executable in the interpreter directory (e.g. bin) works.
  2. Enabling site has the same issue as above.
  3. However, no matter how much I tried, I couldn't get a .pth file to work, no matter where I placed it.

eryksun commented on September 27, 2024

> However, no matter how much I tried, I couldn't get a .pth file to work, no matter where I placed it.

On Windows, the site module executes .pth files found in the prefix directory; the site-packages directory "{prefix}\Lib\site-packages"; and, if enabled, the user site-packages directory. On POSIX, it checks the site-packages directory and, if enabled, the user site-packages directory, but not the prefix directory. The path of the site-packages directory on POSIX is usually of the form "{prefix}/lib/pythonX.Y/site-packages", unless a distribution customizes it.

After adding the site-packages and user site-packages directories, and executing their .pth files, the site module tries to import a sitecustomize module. This may be simpler for you to manage since it doesn't matter where sitecustomize is located.
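To see which directories the site module will actually use on a given platform, it can help to dump the relevant values. The following is a rough sketch, not part of the original discussion; the helper name site_report is hypothetical, and it reads site.USER_SITE directly, which may be None if the site module has not computed it.

```python
import site
import sys


def site_report():
    """Collect the values that decide where the site module looks for .pth files."""
    return {
        "prefix": sys.prefix,
        # Directories derived from sys.prefix / sys.exec_prefix.
        "site_packages": site.getsitepackages(),
        # True if enabled, False if disabled by the user (-s, PYTHONNOUSERSITE),
        # None if disabled for security reasons or by an administrator.
        "user_site_enabled": site.ENABLE_USER_SITE,
        "user_site": site.USER_SITE,
        # sitecustomize is imported by site only after the steps above.
        "sitecustomize_loaded": "sitecustomize" in sys.modules,
    }


if __name__ == "__main__":
    for name, value in site_report().items():
        print(f"{name}: {value}")
```

If the reported site-packages directory is not where the .pth file was placed, the file will not be picked up.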

tanzim commented on September 27, 2024

> On POSIX, it checks the site-packages directory and, if enabled, the user site-packages directory, but not the prefix directory.

That explains why the .pth file didn't work for me the same way as on Windows, even though I tried placing it in prefix, prefix/bin and site-packages as above. I can confirm sitecustomize works. I have to say, though, the various initialization logic is pretty convoluted!

Thanks again for the help.

tanzim commented on September 27, 2024

It could be another esoteric piece of information that I lack, but I noted that when using sitecustomize on macOS, even though import site is defined in the ._pth file, the site-packages directory, e.g. /lib/python3.11/site-packages, is not added to sys.path by default. On Windows it is. I could of course manually add it in the ._pth file, but that feels hacky.

tanzim commented on September 27, 2024

You guessed right. It's pointing to python/bin/lib/python3.11/site-packages and sys.prefix is set to python/bin when it should be one level up.

I'm pretty sure this is a side effect of colocating the ._pth file with the python executable in the bin folder. If I remove the ._pth file, sys.prefix works as expected and site.getsitepackages() returns the correct value.

eryksun commented on September 27, 2024

> I'm pretty sure this is a side effect of colocating the ._pth file with the python executable in the bin folder.

If a ._pth file is found, its parent directory gets set as home in "Modules/getpath.py". This sets sys.prefix [1]. If you need it to be a level up, can you run from a "python" symlink that's created a level up? A ._pth file can be located beside a symlink to the executable or beside the real executable.

Footnotes

  1. Except when running from the build directory on POSIX, in which case the default prefix is used.
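As a rough illustration of why site-packages ends up under bin, here is a simplified model of the rule described above. It is not the actual getpath.py code; the function names and the python/bin/python._pth layout are hypothetical, and posixpath is used so the result is the same on any platform.

```python
import posixpath


def prefix_from_pth(pth_file):
    """Model: the directory holding the ._pth file becomes home, hence sys.prefix."""
    return posixpath.dirname(pth_file)


def posix_site_packages(prefix, version="3.11"):
    """Usual POSIX layout: {prefix}/lib/pythonX.Y/site-packages."""
    return posixpath.join(prefix, "lib", f"python{version}", "site-packages")


prefix = prefix_from_pth("python/bin/python._pth")
print(prefix)                       # python/bin
print(posix_site_packages(prefix))  # python/bin/lib/python3.11/site-packages
```

This reproduces the path reported earlier in the thread: with the ._pth file in bin, the computed site-packages directory is one level too deep.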

zooba commented on September 27, 2024

Yeah, having ._pth directly dictate the prefix isn't ideal, but it's also a difficult one to generalise the rules for.

Fundamentally, the idea is that your ._pth file should prevent/limit the Python runtime from being exploitable through path hijacks, but as soon as we start adding rules like "Sometimes sys.prefix is the directory above the file" or deriving it from user-provided/environment data like normal, we break the security rule.

I think the right fix here is to add the site-packages directory to the ._pth file if you want it. The main scenario is for embedding, when you probably aren't using (and certainly don't need) the site-packages convention at all. Still, it's a bit of a shame that it is deriving from sys.prefix rather than the stdlib path (maybe we don't even calculate this though...?).
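For illustration only, a ._pth file that lists site-packages explicitly could be generated along these lines. The helper render_pth and the relative paths are hypothetical and assume the file sits in bin/ with the standard library one level up in lib/; this layout has not been tested here and would need adjusting to the real installation.

```python
def render_pth(paths, import_site=True):
    """Build the text of a ._pth file: one path per line, optionally 'import site'."""
    lines = ["# Paths are relative to the directory containing this file."]
    lines.extend(paths)
    if import_site:
        # Without this line the site module is not imported at startup.
        lines.append("import site")
    return "\n".join(lines) + "\n"


# Hypothetical layout: python._pth sits in bin/, the stdlib one level up in lib/.
text = render_pth([
    "../lib/python3.11",
    "../lib/python3.11/lib-dynload",
    "../lib/python3.11/site-packages",
])
print(text, end="")
```

Listing the directory this way only puts it on sys.path; it does not change sys.prefix, so anything that derives paths from the prefix would still see the directory containing the ._pth file.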

tanzim commented on September 27, 2024

> I'm pretty sure this is a side effect of colocating the ._pth file with the python executable in the bin folder.
>
> If a ._pth file is found, its parent directory gets set as home in "Modules/getpath.py". This sets sys.prefix [1]. If you need it to be a level up, can you run from a "python" symlink that's created a level up? A ._pth file can be located beside a symlink to the executable or beside the real executable.
>
> Footnotes
>
> 1. Except when running from the build directory on POSIX, in which case the default prefix is used.

Unfortunately, due to some constraints, symlinks are not an option for us at the moment.

tanzim commented on September 27, 2024

> Yeah, having ._pth directly dictate the prefix isn't ideal, but it's also a difficult one to generalise the rules for.
>
> Fundamentally, the idea is that your ._pth file should prevent/limit the Python runtime from being exploitable through path hijacks, but as soon as we start adding rules like "Sometimes sys.prefix is the directory above the file" or deriving it from user-provided/environment data like normal, we break the security rule.
>
> I think the right fix here is to add the site-packages directory to the ._pth file if you want it. The main scenario is for embedding, when you probably aren't using (and certainly don't need) the site-packages convention at all. Still, it's a bit of a shame that it is deriving from sys.prefix rather than the stdlib path (maybe we don't even calculate this though...?).

My only concern with co-locating the ._pth file with the real executable on POSIX (e.g. bin/python) is the side effects on other tools/packages that assume sys.prefix is a root directory and calculate things based on it, which, as we witnessed above, can break down easily.

I totally agree it's a shame, and I think it's a genuine issue, as the prefix calculation should be uniform across the board, whether or not a ._pth file is present.

zooba commented on September 27, 2024

> side effects on other tools/packages that have some form of assumption on sys.prefix being in a root dir

This applies to anything in the stdlib for sure, but outside of that we really have to leave it up to the person adding the ._pth to validate. The default layout is just that - a default - and it's totally possible to rearrange everything and have it work. There's no reason to assume there'll always be a bin or a lib directory.

eryksun commented on September 27, 2024

Note that if a custom site directory contains .pth files that need to be processed, it can be added via site.addsitedir(), which can be called in sitecustomize. For example, if it's the same directory that contains "sitecustomize.py", the latter could include the statement site.addsitedir(os.path.dirname(__file__)).
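A small self-contained sketch of what site.addsitedir() does with a .pth file follows. It uses a temporary directory in place of a real site directory; the helper name demo_addsitedir and the file name demo.pth are made up for the demonstration.

```python
import os
import site
import sys
import tempfile


def demo_addsitedir():
    """Show that site.addsitedir() adds a directory and processes its .pth files."""
    with tempfile.TemporaryDirectory() as sitedir:
        extra = os.path.join(sitedir, "extra")
        os.mkdir(extra)
        # A .pth line naming an existing path (relative to sitedir) is added to sys.path.
        with open(os.path.join(sitedir, "demo.pth"), "w", encoding="utf-8") as f:
            f.write("extra\n")

        before = list(sys.path)
        site.addsitedir(sitedir)
        added = [p for p in sys.path if p not in before]
        sys.path[:] = before  # undo the change so the demo has no lasting effect
        return sitedir, extra, added


if __name__ == "__main__":
    sitedir, extra, added = demo_addsitedir()
    print(added)
```

In a real sitecustomize.py the temporary directory would be replaced by the directory that holds the .pth files, for example os.path.dirname(__file__) as suggested above.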

tanzim commented on September 27, 2024

Some additional findings - this is based on trying to simply use sitecustomize as mentioned in site documentation

sitecustomize.py

import site
import sys

print("run sitecustomize")

if site.ENABLE_USER_SITE and site.USER_SITE:
    try:
        sys.path.remove(site.getusersitepackages())
    except ValueError:
        # The user site-packages directory was not on sys.path.
        pass

Python 3.10/3.11 (macOS)

Simply placing sitecustomize.py in the site directory, e.g. lib/python3.NN/site-packages, does the trick. Site customization is run, in line with what the docs state. I didn't need to place a ._pth file, which avoids the sys.prefix issue altogether. As a bonus, relative module imports work with no issue.

Python 3.10 (Windows)

Same behavior as macOS. Simply placing sitecustomize.py in the site directory, e.g. Lib\site-packages does the trick.

Python 3.11 (Windows)

Placing sitecustomize.py in the site directory works. However, if no ._pth file is present, both user-specific and globally installed (e.g. all users) Python paths are added to sys.path, clearly from a registry lookup:

  • <username>\AppData\Local\Programs\Python\Python3.NN\Dll
  • <username>\AppData\Local\Programs\Python\Python3.NN\Lib
  • <global>\Python3.NN\Dll
  • <global>\Python3.NN\Lib

I looked at the code in Modules/getpath.py and decided to run a small experiment:

  • Placed an empty python._pth file. I got the same behavior as on macOS (3.10/3.11) and with 3.10 on Windows. This works because getpath.py has a code path that simply enables isolated mode based on the presence of the ._pth file.
  • If I execute python with the -Esu parameters, it all works and sitecustomize is run.

So question - can I reliably depend on Modules/getpath.py to retain the behavior for enabling isolated mode when an empty ._pth file is present on Windows?

The code and comment seem to suggest yes. See Modules/getpath.py

zooba commented on September 27, 2024
> • Getting isolated mode, safe-path mode, and (once fixed) disabling the user site-packages, is contingent on the ._pth file being non-empty, in which case its contents determine the initial sys.path.

Yeah, this is the intended use. An empty ._pth ought to just fail - not entirely obvious why it doesn't right now - because it should result in an empty sys.path. I guess there are some additional checks in getpath.py that look for empty lists...

tanzim commented on September 27, 2024
> • Getting isolated mode, safe-path mode, and (once fixed) disabling the user site-packages, is contingent on the ._pth file being non-empty, in which case its contents determine the initial sys.path.
>
> Yeah, this is the intended use. An empty ._pth ought to just fail - not entirely obvious why it doesn't right now - because it should result in an empty sys.path. I guess there are some additional checks in getpath.py that look for empty lists...

Because https://github.com/python/cpython/blob/main/Modules/getpath.py#L669 and https://github.com/python/cpython/blob/main/Modules/getpath.py#L723

It's baked-in Windows behavior.

eryksun commented on September 27, 2024

> Yeah, this is the intended use. An empty ._pth ought to just fail - not entirely obvious why it doesn't right now - because it should result in an empty sys.path. I guess there are some additional checks in getpath.py that look for empty lists...

An empty ._pth file sets prefix, but it doesn't preclude the normal code that sets up the initial module search path in pythonpath, except to skip anything from the environment/registry because use_environment is 0 [1]. The way it's designed is that if the pth variable is non-empty from the ._pth file, then its contents replace the calculated value of pythonpath, in addition to setting config values that isolate the application.

Footnotes

  1. But not config['use_environment'], so other Python environment variables still affect the application.
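The difference between an empty and a non-empty ._pth file can be sketched as a small model. This is a simplification for illustration, not the real getpath.py logic: the function apply_pth and the dictionary of defaults are hypothetical, and the user_site_directory line reflects the fix proposed in this thread rather than confirmed current behavior.

```python
def apply_pth(config, pth_lines, calculated_path):
    """Simplified model of how a ._pth file's contents affect startup config."""
    config = dict(config)
    if pth_lines:
        # Non-empty file: isolate the application and take sys.path from the file.
        config.update(isolated=1, use_environment=0, site_import=0, safe_path=1)
        config["user_site_directory"] = 0  # the fix proposed in this thread
        paths = []
        for line in pth_lines:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            if line == "import site":
                config["site_import"] = 1
            else:
                paths.append(line)
        return config, paths
    # Empty file: config is left alone and the normally calculated path is kept.
    return config, list(calculated_path)


defaults = {"isolated": 0, "use_environment": 1, "site_import": 1,
            "safe_path": 0, "user_site_directory": 1}
print(apply_pth(defaults, [], ["stdlib"]))
print(apply_pth(defaults, [".", "import site"], ["stdlib"]))
```

In this model the empty file leaves every isolation setting at its default, which is why other Python environment variables can still affect the application in that case.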

zooba commented on September 27, 2024

By "intended" I meant what I intended when I originally added it, not when it got translated into getpath.py 😄

But it does seem reasonable to allow it to generate a relatively default path if it's empty. It's very easy to avoid that by just not making it empty, which is going to be necessary most of the time anyway.
