Comments (24)

djhoese commented on June 22, 2024

@snowman2 I had some luck with the vispy project (which builds wheels with a single Cython extension) by forcing the pre-release of numpy to be installed. I see you're already doing that, but for some reason it looks like it is trying to build numpy from source. Here are vispy's cibuildwheel env vars:

https://github.com/vispy/vispy/blob/a1a639f33c59a16c8af2bf605a23b55210569f5e/.github/workflows/wheels.yml#L38-L39

If needed I can try to take a look tomorrow, but no guarantees on time. I also don't build 32-bit wheels for vispy (or any of my packages) and it looks like you do...how do you handle that with numpy not providing 32-bit wheels?

from pyproj.

snowman2 commented on June 22, 2024

https://pypi.org/project/pyproj/3.6.1/

djhoese commented on June 22, 2024

@snowman2 could you point me to the release instructions? In your opinion what is the hardest part? Would you consider pyproj's release process much harder than other python packages you maintain (ex. rioxarray) given how tightly it is tied to the PROJ library? Or are there other reasons?

snowman2 commented on June 22, 2024

For most projects, a release is fairly simple, as you mention. However, because it depends on maintainers' free time, building wheels, and testing with downstream Linux release managers, it is not quite as simple for pyproj.

The next release of pyproj depends on a numpy release with Python 3.12 compatibility updates to ensure stability.

We welcome assistance preparing pyproj for the next release with Python 3.12 wheels.

gwerbin commented on June 22, 2024

@snowman2 thanks for the reply. Is it a matter of someone running through the release instructions here? Or is there more to it? I'm happy to dedicate some time to the patch release, unless the next real release of Pyproj is right around the corner.

snowman2 commented on June 22, 2024

The blocker for the next release is #1330. Once that is ready, the release will follow. That likely depends on the next release of numpy, but there may be a way to get it to work now.

The instructions you linked to are correct, though slightly out of date. The wheels are uploaded automatically, except those from Cirrus CI, which currently need to be uploaded to PyPI manually.

snowman2 commented on June 22, 2024

I think that it is the pypy wheels and win32 wheels that are building from source. Depending on the win32 failures, that one would be okay to disable.

snowman2 commented on June 22, 2024

There are suggestions for workarounds in the numpy issues linked from #1330.

djhoese commented on June 22, 2024

My kid had to stay home sick today so I'm not really getting any work done today and probably not tomorrow.

snowman2 commented on June 22, 2024

No worries @djhoese. I hope that your kid feels better soon.

snowman2 commented on June 22, 2024

Coming soon ... #1344

gwerbin commented on June 22, 2024

Thanks @snowman2! I would still be happy to help with patch or "post" releases for older versions for the sake of any users who can't upgrade for whatever reason. In my case, updating to the newest version is fine.

snowman2 commented on June 22, 2024

> I would still be happy to help with patch or "post" releases for older versions for the sake of any users who can't upgrade for whatever reason.

Contributions to help with releases are welcome 👍

snowman2 commented on June 22, 2024

> could you point me to the release instructions?

https://github.com/pyproj4/pyproj/blob/main/HOW_TO_RELEASE.md

> In your opinion what is the hardest part?

The wheels.

The matrix of Python versions (3.9-3.12), operating systems (Windows, macOS, Linux), architectures (x86, i686, x86_64, amd64), and Python implementations (CPython, PyPy) makes for long build times and a high potential for failures. Because of the long build times, debugging is slow, as the CI/CD process takes a long time to complete. It is a never-ending game of whack-a-mole to keep it stable.

Most of the wheels are built using GitHub Actions. However, due to the need to support macOS arm64 and Linux aarch64, those wheels are built on Cirrus CI and Travis CI. With Travis CI, we have a limited amount of credits, so reducing the frequency of releases helps to stretch the credits further.
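To get a feel for how quickly that matrix grows, here is a rough sketch that enumerates the combinations with `itertools.product`. The Python versions, operating systems, and implementations are the ones listed above; the per-OS architecture mapping is a simplifying assumption, and the real set of wheels differs because not every combination is actually built.

```python
from itertools import product

PYTHON_VERSIONS = ["3.9", "3.10", "3.11", "3.12"]
IMPLEMENTATIONS = ["cpython", "pypy"]

# Assumed, simplified mapping of OS -> architectures (not pyproj's actual config).
ARCHITECTURES = {
    "Windows": ["x86", "amd64"],
    "macOS": ["x86_64", "arm64"],
    "Linux": ["i686", "x86_64", "aarch64"],
}


def build_matrix():
    """Return every (os, arch, implementation, python) combination."""
    targets = [
        (os_name, arch)
        for os_name, archs in ARCHITECTURES.items()
        for arch in archs
    ]
    return [
        (os_name, arch, impl, py)
        for (os_name, arch), impl, py in product(
            targets, IMPLEMENTATIONS, PYTHON_VERSIONS
        )
    ]


matrix = build_matrix()
# 7 OS/arch targets * 2 implementations * 4 Python versions
print(len(matrix))
```

Even with this simplified mapping, each new Python version or architecture multiplies the number of wheels that can fail.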

> Would you consider pyproj's release process much harder than other python packages you maintain (ex. rioxarray) given how tightly it is tied to the PROJ library?

Essentially, yes. PROJ makes it necessary to provide wheels. With rioxarray, I can make a new release in 1 minute and it is all automatically uploaded to PyPI. I make a lot of rioxarray releases as soon as features are added (release early, release often 😄). With pyproj, it sucks up hours of my time, both preparing for the release and making sure the builds complete properly (which rarely happens the first time around). So, I try to limit the number of pyproj releases to reduce the amount of time I have to dedicate to making releases.

Additionally, there are downstream Linux distribution package managers who are kind enough to run tests on pyproj before each release. I usually try to limit the number of pyproj releases to be respectful of their time.

In general, a release every 4-6 months is the current cadence of pyproj releases.

djhoese commented on June 22, 2024

A couple thoughts:

  1. Is cirrus CI used for arm64 for "parallel" wheel generation? Unless I'm forgetting something about my own projects, it is possible to make macos aarch in github actions.
  2. Why are the wheels on cirrus CI manually uploaded?
  3. Have you had users request PyPy wheels? If they build successfully then great, but if they're a burden to maintain then maybe we drop them. In the linked numpy issue about PyPy they're talking about dropping PyPy 3.9 wheels now anyway and then doing PyPy 3.10 wheels later.
  4. I've never had a slow enough build process (even in my Cython-based projects) to justify this, but what about splitting the cibuildwheel builds up into multiple github actions environments/jobs? So one for linux 64-bit python 3.9 and 3.10, one for 64-bit linux 3.11 and 3.12, one for 32-bit linux...and so on. I'm not sure how github actions feels about that many environments but theoretically if any of them run in parallel it should be faster than what is happening now.
  5. Looking at the last release's wheel building, I see that for CPython wheels it takes about 70s to build the wheel and about 120s to test it. Additionally, the PROJ build that runs before all the wheel processes takes 10 minutes. For PyPy wheels it's 235 seconds (about 4 minutes) to test the wheel. This point is more about the timing info than it is about suggesting anything.
  6. Is/can the PROJ build be cached? Looking at the `proj-compile-wheels.sh` script, it doesn't seem too complicated as far as depending on pyproj's code state. The hardest part to me seems to be that it needs to (should?) run on the same docker image as the one the wheels are built on. This PROJ build could maybe even be its own set of docker images, based on the upstream PyPA images, that are pulled in at wheel-building time and only get updated when the `proj-compile-wheels.sh` script gets updated. I see some amount of caching being done on Windows, but it looks like macOS can do that too: https://cibuildwheel.readthedocs.io/en/stable/setup/#macos-windows-builds. Otherwise we could build docker images like I said and specify them with https://cibuildwheel.readthedocs.io/en/stable/options/#linux-image
  7. The tests aren't failing when they should. The last release had failures in multiple spots but didn't die. Shapely didn't have a wheel for the platform/python version so it tried to build from source, couldn't find the geos library, and then failed to install. Wheel testing continued though and failed in some spots including failing to import shapely.
  8. What are your thoughts on identifying a specific set of tests and marking them with a pytest mark and only running those tests (possibly with a reduced set of dependencies?) for wheel tests?
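As a back-of-the-envelope sketch for points 4 to 6, the timings from point 5 can be plugged into a small model of how long a job takes when wheels are split across parallel jobs, with and without a cached PROJ build. The number of wheels and the assumption that jobs run fully in parallel are illustrative, not measurements.

```python
import math

# Timings quoted in point 5 (seconds).
PROJ_BUILD_S = 10 * 60
WHEEL_BUILD_S = 70
WHEEL_TEST_S = 120


def job_seconds(n_wheels: int, proj_cached: bool = False) -> int:
    """Estimated duration of one CI job that builds and tests n_wheels CPython wheels."""
    proj = 0 if proj_cached else PROJ_BUILD_S
    return proj + n_wheels * (WHEEL_BUILD_S + WHEEL_TEST_S)


def wall_clock_seconds(n_wheels: int, n_jobs: int, proj_cached: bool = False) -> int:
    """Estimated wall-clock time if wheels are spread evenly over parallel jobs."""
    wheels_per_job = math.ceil(n_wheels / n_jobs)
    return job_seconds(wheels_per_job, proj_cached)


# Hypothetical example: four CPython wheels (3.9 to 3.12) for one platform.
for jobs in (1, 2):
    for cached in (False, True):
        print(jobs, cached, wall_clock_seconds(4, jobs, proj_cached=cached))
```

In this simplified model, every split job pays for its own PROJ build unless it is cached, so caching alone saves more time than splitting alone.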

djhoese commented on June 22, 2024

I should have maybe started with: this is just me brainstorming and not suggesting that you alone should tackle these things. So as a project, what should pyproj do to speed this up and what has been tried/avoided in the past?

Oh:

  1. How possible would it be to automate the downstream linux distribution testing? For example, something in the github action triggers their builds and there is a known URL to look for whether it passed or not and see the log. That way no one on their side of things needs to do much if anything and you/we don't have to passively wait for something to happen? Or...depending on their update cycle, maybe we ignore their build success until it is a problem and then come out with bug fix releases?

snowman2 commented on June 22, 2024

> 1. Is cirrus CI used for arm64 for "parallel" wheel generation? Unless I'm forgetting something about my own projects, it is possible to make macos aarch in github actions.

It is used so the wheels generated can be tested. You can generate the wheels on Actions, but cannot test them. In my experience, it is a bad idea to release something you haven't tested.

> 2. Why are the wheels on cirrus CI manually uploaded?

Automating it is on the TODO list.

> 3. Have you had users request PyPy wheels? If they build successfully then great, but if they're a burden to maintain then maybe we drop them. In the linked numpy issue about PyPy they're talking about dropping PyPy 3.9 wheels now anyway and then doing PyPy 3.10 wheels later.

IIRC someone requested them a while back...

> 4. I've never had a slow enough build process (even in my Cython-based projects) to justify this, but what about splitting the cibuildwheel builds up into multiple github actions environments/jobs? So one for linux 64-bit python 3.9 and 3.10, one for 64-bit linux 3.11 and 3.12, one for 32-bit linux...and so on. I'm not sure how github actions feels about that many environments but theoretically if any of them run in parallel it should be faster than what is happening now.

I am open to this idea.

> 5. Looking at the last release's wheel building, I see that for CPython wheels it takes about 70s to build the wheel and about 120s to test it. Additionally, the PROJ build that runs before all the wheel processes takes 10 minutes. For PyPy wheels it's 235 seconds (about 4 minutes) to test the wheel. This point is more about the timing info than it is about suggesting anything.

> 6. Is/can the PROJ build be cached? Looking at the `proj-compile-wheels.sh` script, it doesn't seem too complicated as far as depending on pyproj's code state. The hardest part to me seems to be that it needs to (should?) run on the same docker image as the one the wheels are built on. This PROJ build could maybe even be its own set of docker images, based on the upstream PyPA images, that are pulled in at wheel-building time and only get updated when the `proj-compile-wheels.sh` script gets updated. I see some amount of caching being done on Windows, but it looks like macOS can do that too: https://cibuildwheel.readthedocs.io/en/stable/setup/#macos-windows-builds. Otherwise we could build docker images like I said and specify them with https://cibuildwheel.readthedocs.io/en/stable/options/#linux-image

That is definitely something to look into.
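One way to think about "only rebuild PROJ when the script changes" is a cache key derived from the inputs of the build. The following is a minimal sketch, not pyproj's CI code: the function name, the choice of inputs (script contents, image tag, PROJ version), and the example values are all assumptions for illustration.

```python
import hashlib


def proj_cache_key(script: bytes, image_tag: str, proj_version: str) -> str:
    """Build a cache key that changes only when one of the build inputs changes."""
    digest = hashlib.sha256()
    digest.update(script)
    # NUL separators keep ("ab", "c") and ("a", "bc") from hashing identically.
    digest.update(b"\0" + image_tag.encode() + b"\0" + proj_version.encode())
    return f"proj-{digest.hexdigest()[:16]}"


# Illustrative inputs; in CI the bytes would come from reading proj-compile-wheels.sh.
key = proj_cache_key(b"#!/bin/bash\n# build PROJ\n", "manylinux2014_x86_64", "9.3.0")
print(key)
```

A key like this could name a cached directory or tag a prebuilt docker image, so the 10-minute PROJ build is skipped whenever the key already exists.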

> 7. The tests aren't failing when they should. The last release had failures in multiple spots but didn't die. Shapely didn't have a wheel for the platform/python version so it tried to build from source, couldn't find the geos library, and then failed to install. Wheel testing continued though and failed in some spots including failing to import shapely.

Those are optional test dependencies. If it works, great. If not, not a big deal.

> 8. What are your thoughts on identifying a specific set of tests and marking them with a pytest mark and only running those tests (possibly with a reduced set of dependencies?) for wheel tests?

numpy is the only bottleneck at the moment. Not sure I would want to release without testing using numpy.

> How possible would it be to automate the downstream linux distribution testing? For example, something in the github action triggers their builds and there is a known URL to look for whether it passed or not and see the log. That way no one on their side of things needs to do much if anything and you/we don't have to passively wait for something to happen? Or...depending on their update cycle, maybe we ignore their build success until it is a problem and then come out with bug fix releases?

Sounds worth looking into.

djhoese commented on June 22, 2024

> It is used so the wheels generated can be tested. You can generate the wheels on Actions, but cannot test them. In my experience, it is a bad idea to release something you haven't tested.

I can understand that. Usually the temptation is too great and I just end up not testing for the hard-to-test platforms. In my usual cases though these are just simple Cython extensions with simple for loops over some numpy arrays so the compatibility usually leans on Cython and numpy's testing so I'm a lot less scared about my own libraries doing something incompatible.

My only worry with two separate build systems is if one uploads to PyPI but the other fails and you get this weird partial release. I suppose that is one advantage to the non-automatic cirrus wheels.
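A guard against that kind of partial release could be a pre-upload check that every expected platform has a wheel. This is only a sketch: the helper names and the example filenames are hypothetical. It relies on the standard wheel naming scheme, where the filename ends in `-{python tag}-{abi tag}-{platform tag}.whl`.

```python
def platform_tags(wheel_filename: str) -> set[str]:
    """Extract the platform tag(s) from a wheel filename.

    A compressed platform tag joins several tags with dots, for example
    ``manylinux_2_17_x86_64.manylinux2014_x86_64``.
    """
    stem = wheel_filename.removesuffix(".whl")
    return set(stem.rsplit("-", 1)[-1].split("."))


def missing_platforms(wheel_filenames, expected_platforms):
    """Return the expected platform tags that no wheel in the list covers."""
    found = set()
    for name in wheel_filenames:
        found |= platform_tags(name)
    return set(expected_platforms) - found


# Hypothetical filenames, as if only one of the two CI systems had finished.
wheels = [
    "pyproj-3.6.1-cp312-cp312-win_amd64.whl",
    "pyproj-3.6.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
]
expected = {"win_amd64", "manylinux2014_x86_64", "macosx_11_0_arm64"}
print(missing_platforms(wheels, expected))
```

If the returned set is non-empty, the upload step could stop before anything reaches PyPI.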

> Those are optional test dependencies. If it works, great. If not, not a big deal.

Hm, what I saw was a doctest failure. I guess if that's integrated with your pytest and is an xfail or whatever then 👍

> numpy is the only bottleneck at the moment. Not sure I would want to release without testing using numpy.

I guess I was thinking more about breaking up the categories of tests. Basic Python-heavy functionality probably isn't going to break between platforms, but if those tests are fast then we may as well include them. The heavy PROJ compatibility tests are probably always needed. I guess it depends on which tests take the longest. Is test execution time evenly distributed, or are there some tests that take something like 20 seconds each and could be skipped?

Otherwise...

it was brought to my attention today that pykdtree and pyresample wheel building are not in the modern times and I need to overhaul them. I'm working on that now and if they have any decent amount of build time maybe I'll try that splitting per-platform thing and see how github actions feels about it. I could then port that to pyproj.

djhoese commented on June 22, 2024

Oh about not testing on GA, you're talking about not being able to test on the emulated platforms?

gwerbin commented on June 22, 2024

Re: PyPy wheels, I actually first ran into this problem because I was trying to install Pyproj on PyPy, and there wasn't a wheel available for my particular platform, so it tried to build from source and failed.

Given that problems like the current one are very rare, I don't think it's such a bad idea to drop wheel builds that are an undue maintenance burden.

snowman2 commented on June 22, 2024

> Oh about not testing on GA, you're talking about not being able to test on the emulated platforms?

That sounds about right.

snowman2 commented on June 22, 2024

> I guess I was thinking more about breaking up the categories of tests.

The tests are pretty quick, so I wouldn't spend too much time optimizing those. The main bottleneck is dependencies.

djhoese commented on June 22, 2024

Of the 28m45s of Ubuntu wheel building for the last release, testing makes up 11+ minutes (~37%). If I can get the PROJ build cached, testing becomes ~58% of the build time. I also suggest we drop PyPy 3.9, given that numpy will likely drop it:

numpy/numpy#24728

And if that's dropped then that does take ~5 minutes off of the total build time. PyPy tests take most of that time because they try to build shapely from source. So yeah the dependencies being installed don't help, but I'm not sure there is much that can be done there without caching them...which they might be already internal to cibuildwheel...I'll look at that too.
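For reference, the percentages above can be roughly reproduced with a few lines of arithmetic. This sketch assumes testing takes exactly 11 minutes and the PROJ build exactly 10 minutes (the figure quoted earlier in the thread), so the results only land near the quoted values.

```python
def share(part_s: float, total_s: float) -> float:
    """Return part_s as a percentage of total_s."""
    return 100 * part_s / total_s


TOTAL_S = 28 * 60 + 45  # 28m45s for the Ubuntu wheel job
TEST_S = 11 * 60        # "11+ minutes" of testing, taken as exactly 11 (assumption)
PROJ_S = 10 * 60        # PROJ build, about 10 minutes (from earlier in the thread)

print(round(share(TEST_S, TOTAL_S)))           # 38
print(round(share(TEST_S, TOTAL_S - PROJ_S)))  # 59
```

Both values sit within a point or two of the ~37% and ~58% quoted above; the small gap presumably comes from the rounded inputs.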

djhoese commented on June 22, 2024

And I've decided against splitting the environments based on python version. If I get caching working for PROJ then I'll reconsider it. The main downside though is that updating cibuildwheel doesn't automatically get you wheels for new versions of Python because you're likely explicitly setting what versions of Python to build.

Edit: ...and you're already splitting on platform/arch because of the GA versus cirrus versus appveyor split.
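The downside about pinning Python versions can be illustrated with glob-style build selectors. The sketch below imitates that kind of selection with `fnmatch`; it is not cibuildwheel's own code, and the build identifiers and patterns are illustrative examples.

```python
from fnmatch import fnmatch


def selected(identifiers, patterns):
    """Return the build identifiers matched by at least one glob pattern."""
    return [
        ident
        for ident in identifiers
        if any(fnmatch(ident, pattern) for pattern in patterns)
    ]


# Identifiers a newer build tool might offer (illustrative).
available = [
    "cp311-manylinux_x86_64",
    "cp312-manylinux_x86_64",
    "cp313-manylinux_x86_64",
]

pinned = ["cp311-*", "cp312-*"]  # one job per explicit set of Python versions
open_ended = ["cp3*-*"]          # picks up whatever CPython versions are offered

print(selected(available, pinned))
print(selected(available, open_ended))
```

With the pinned patterns, the `cp313` build is silently skipped until someone edits the job definitions, which is the maintenance cost described above.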
