Comments (19)

michaeleisel commented on September 17, 2024

I believe the timestamp issue can be solved in Apple's codesign with --timestamp=none. Reproducible code signing is important for Bazel, and there may be a few more tricks in https://github.com/bazelbuild/rules_apple/blob/master/apple/internal/codesigning_support.bzl. But even if we couldn't get bit-for-bit identical signing, at least we could limit the differences to specific, documented parts of the output.

from apple-platform-rs.

michaeleisel commented on September 17, 2024

It's been a minute since I reviewed it, but I just like it in general for learning about code signing.

indygreg commented on September 17, 2024

Making digests parallel within Mach-O should be a change of only a handful of lines: plug in rayon's parallel iterators (https://docs.rs/rayon/latest/rayon/iter/index.html). The magic of Rust :)

Do you have examples of large/slow signing operations to help measure against?

michaeleisel commented on September 17, 2024

Large app binaries can be >1GB. I think if you make a small test app, but add the linker flag -Wl,-sectcreate,__DATA,GIANT_SECTION,/path/to/some_giant_file, you can make an artificially large binary to sign with. Or download an App Store .ipa such as Spotify with https://github.com/majd/ipatool; the binary is smaller, but you could extrapolate from it.

michaeleisel commented on September 17, 2024

*Large debug app binaries can be >1GB (they lack size optimization and include additional debug metadata)

indygreg commented on September 17, 2024

The 0.19 release performed today contains a few improvements:

  • Commit 18c1db8 eliminated a double computation of code digests during signing.
  • Commit a1df303 introduced parallel digest computation for binaries >64 MB in size. There should be a ~linear speedup with number of CPU cores.

I haven't measured the impact. But it should be noticeable for large binaries. Please post numbers if this makes a difference for you!
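
To illustrate the parallel digest idea described above, here is a minimal Python sketch. It is not rcodesign's actual Rust implementation: the 4096-byte page size, the function names, and the worker count are assumptions made for illustration, and only the 64 MB cutoff comes from the release notes above. It relies on CPython's hashlib being able to release the GIL while hashing larger buffers, so threads can overlap.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 4096                        # assumed page size for this sketch
PARALLEL_THRESHOLD = 64 * 1024 * 1024   # mirrors the 64 MB cutoff mentioned above


def _hash_range(view: memoryview, start: int, end: int) -> list[bytes]:
    """Hash each page inside view[start:end], one after another."""
    return [
        hashlib.sha256(view[off:min(off + PAGE_SIZE, end)]).digest()
        for off in range(start, end, PAGE_SIZE)
    ]


def page_digests(data: bytes, workers: int = 4,
                 threshold: int = PARALLEL_THRESHOLD) -> list[bytes]:
    """Return per-page SHA-256 digests, hashing in parallel for large inputs."""
    view = memoryview(data)
    if len(data) <= threshold or workers < 2:
        return _hash_range(view, 0, len(data))
    # Split the input into page-aligned spans, roughly one per worker.
    pages = -(-len(data) // PAGE_SIZE)          # ceiling division
    span = -(-pages // workers) * PAGE_SIZE
    bounds = [(s, min(s + span, len(data))) for s in range(0, len(data), span)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda b: _hash_range(view, *b), bounds)
    # pool.map preserves input order, so the digests stay in page order.
    return [digest for part in parts for digest in part]
```

Because the spans are page-aligned, the parallel path should yield exactly the same digest list as the sequential one, which is the property that keeps the signature deterministic.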

michaeleisel commented on September 17, 2024

Thank you for doing this. Unfortunately, when I tried to measure the speed on an internal test app, I got the following: Error: bundle Info.plist does not define CFBundleIdentifier: MyApp.app/SomeBundle.bundle/Base.lproj/SomeStoryboard.storyboardc/Info.plist. So, I think there is some error that I'd have to deal with before I can measure the performance.

michaeleisel commented on September 17, 2024

Another couple things that would be helpful, aside from resolving that bug:

  • Getting byte-for-byte identical signings between rcodesign and some recent version of Apple's codesign (this would go a long way towards convincing people that the risk of bugs is sufficiently low)
  • Making rcodesign, or some simple wrapper of it (e.g. rcodesign sign $@), a drop-in replacement for Apple's codesign, which would make it easier to integrate into existing build systems

These aren't entirely necessary, but I think they would help the adoption rate substantially.

roblabla commented on September 17, 2024

Getting a bit-accurate signature seems impossible: signatures include timestamps and other pieces of random data. Even running codesign twice on the exact same binary gives different binary output, AFAICT.
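
One way to check how far apart two signed outputs really are is to list the byte ranges where they differ, which would also help confine differences to specific, documented parts of the output. The following is a minimal Python sketch; the function name is hypothetical, and the byte-by-byte loop is meant for clarity rather than speed on multi-gigabyte binaries.

```python
def differing_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open [start, end) byte ranges where a and b differ.

    If the lengths differ, the extra tail is reported as a differing range
    (merged with the preceding range when the two are contiguous).
    """
    ranges: list[tuple[int, int]] = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i          # a new run of differences begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        longest = max(len(a), len(b))
        if ranges and ranges[-1][1] == common:
            ranges[-1] = (ranges[-1][0], longest)
        else:
            ranges.append((common, longest))
    return ranges
```

Running this over two signings of the same input would show whether the differences stay confined to a few small regions (for example, where a timestamp lives) or are spread across the file.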

indygreg commented on September 17, 2024

In a former professional life, I was a maintainer of Firefox's build system and filed the tracking bug to make builds deterministic and reproducible: https://bugzilla.mozilla.org/show_bug.cgi?id=885777. So you don't have to convince me that bit-for-bit reproducibility is important :)

The Rust code today should be deterministic (aside from timestamps). I consider determinism table stakes for most software I author, especially this one.

Bit-for-bit identical output with Apple tooling is a goal. But I'm unsure how achievable it is in practice.

One area that will likely give us fits is the size of the code signature data in the __LINKEDIT segment. When a timestamp token is in play, you don't know the size until you get it. And you need to obtain the timestamp token after you've produced the CMS signature. And the CMS signature encapsulates the Mach-O headers containing the size of the __LINKEDIT segment and the embedded code signature. It's a nice circular dependency, meaning we have to estimate the final code signature size and pad with empty bytes. Apple and we have different size estimation logic, and I'm unsure how easily we can get the two implementations to agree.
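
The estimate-then-pad approach described above can be sketched roughly as follows. This is a Python illustration with hypothetical function names and a made-up alignment; it reflects neither Apple's nor rcodesign's actual estimation logic, only the general shape of breaking the circular dependency by fixing the size up front.

```python
def estimate_signature_size(code_directory_len: int, cms_estimate: int,
                            align: int = 16) -> int:
    """Guess an upper bound for the embedded signature, rounded up to `align`.

    `cms_estimate` must be generous enough to cover a timestamp token whose
    real size is unknown until the timestamp server replies.
    """
    total = code_directory_len + cms_estimate
    return -(-total // align) * align  # ceiling to a multiple of `align`


def fit_into_reserved(signature: bytes, reserved: int) -> bytes:
    """Zero-pad the final signature to the size that was fixed before signing."""
    if len(signature) > reserved:
        raise ValueError(
            f"signature is {len(signature)} bytes but only {reserved} were reserved"
        )
    return signature + b"\x00" * (reserved - len(signature))
```

Two implementations that pick different estimates will emit different padding and different sizes in the headers, which is why the outputs can diverge even when the digests themselves agree.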

What I'm trying to say is there will be limits to bit-for-bit identical parity with Apple tooling. But determinism for this crate should mostly be in place today and relied on indefinitely. (We should document this design goal and ideally shore up testing to ensure this guarantee is achieved.)

I'll file a new issue to track a codesign-like CLI to make this tool a drop-in replacement. There's definitely compelling value in having that.

indygreg commented on September 17, 2024

A few minutes ago I pushed 3576f92 to make resource digesting streaming instead of loading the full file in memory first.
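
For readers unfamiliar with the difference, streaming a digest looks roughly like the Python sketch below. It is an illustration only, not the code from that commit; the function name and the 1 MiB chunk size are assumptions.

```python
import hashlib
from typing import BinaryIO


def stream_digest(fh: BinaryIO, algorithm: str = "sha256",
                  chunk_size: int = 1024 * 1024) -> bytes:
    """Digest a file-like object chunk by chunk instead of reading it whole."""
    hasher = hashlib.new(algorithm)
    # Peak memory stays near chunk_size regardless of the file's size.
    while chunk := fh.read(chunk_size):
        hasher.update(chunk)
    return hasher.digest()
```

The result is identical to hashing the whole file at once; only the memory footprint changes.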

michaeleisel commented on September 17, 2024

Very cool. You may also be interested in a project I released this spring, https://github.com/michaeleisel/AutoPen, which makes the existing codesign tool faster.

michaeleisel commented on September 17, 2024

Parallel signing on mmap'd files (just the main binary for now) is done with https://github.com/michaeleisel/AutoPen/blob/main/libautopen/swapper.hpp#L75 and populateDigests

indygreg commented on September 17, 2024

> Provide a flag to only sign the binary, and not any of the resources (doable in codesign with a flag that points to a separate metadata file)

I don't see this option in codesign provided on macOS 14.0. When attempting to validate some iOS bundles apparently using custom CodeResources rules, codesign now complains about the use of custom rules. e.g.

$ codesign -vvvv YouTube.app
YouTube.app: resource envelope is obsolete (custom omit rules)

Which codesign option are you referring to? And do you have man page output saying how it worked?

indygreg commented on September 17, 2024

@michaeleisel I'm sympathetic to the request to make rcodesign as fast as possible. (In my day job I work for my company's Infrastructure Performance Team and I know a thing or two about software performance.)

But it is difficult for me to prioritize performance over other features/bugs without knowing a few things:

  1. Why does code signing need to be fast? What's the cost to not making it fast?
  2. Which specific workflows/operations need to be fast?
  3. Can you provide example numbers demonstrating timings for different workflows/tooling? e.g. Exactly how large are the inputs and on your machine what are the timing differences? Please be as specific as possible.

michaeleisel commented on September 17, 2024

The codesign option I'm referring to is --resource-rules, which I discuss in my blog post linked below. It does indeed cause Apple to make a fuss in certain ways, but as far as I know there are no functional differences when actually running an app. Likewise for using SHA-1.

  1. Code signing needs to be fast for the iOS development use case because signing has to get re-run on every single incremental build for even the smallest change. Signing also takes the same amount of time regardless of the size of the change, and this time increases with the size of the app. So, you have a sort of "clean build" step even for an incremental change. This is also true of linking, which is why signing and linking are the two things that tend to dominate the incremental build time. The only exception to this is that, depending on what APIs an app is using, you may be able to get away with not signing when running it on the simulator, injecting entitlements into the binary as necessary. You could also make signing more incremental, e.g. not re-computing hashes for unchanged resources, but this isn't currently done as of my last tests on it.
  2. As mentioned, I'm primarily concerned with incremental builds of iOS apps.
  3. It can take in the ballpark of 1-10 seconds for very large apps. With debug info included, there are some large-scale apps out there whose main binary alone can exceed 2GB (which broke my usage of write in my linker at one point, since that size exceeds a signed 32-bit int). It actually used to take far longer, from 30 seconds up to several minutes, but thanks to most devs having ARM Macs, along with the SHA-1 and resource rules tricks that I showed people, it's much better now.
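
As an aside on the >2GB write problem mentioned in point 3: assuming the failure came from a single write call being asked for more bytes than fit in a signed 32-bit int, the usual workaround is to cap each call and loop. Here is a minimal Python sketch of that pattern; the function name is hypothetical and the original linker code is not shown in this thread.

```python
import os

MAX_WRITE = 2**31 - 1  # largest byte count that fits in a signed 32-bit int


def write_all(fd: int, data: bytes, max_chunk: int = MAX_WRITE) -> int:
    """Write all of `data` to `fd`, requesting at most `max_chunk` bytes per call."""
    view = memoryview(data)  # slicing a memoryview avoids copying the payload
    written = 0
    while written < len(view):
        # os.write may also write fewer bytes than requested, so always
        # advance by the count it actually reports.
        written += os.write(fd, view[written:written + max_chunk])
    return written
```

The same loop also handles short writes, which a single unchecked write call would silently ignore.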

More info on injecting entitlements, the SHA-1 trick, and the resource rules trick is in my blog post: https://eisel.me/signing

My personal opinion is that with the release of AutoPen, signing performance essentially became a solved problem for iOS development with no more major bottlenecks left to fix. Large-scale apps where it matters a lot have people that can get their systems to use AutoPen by default for all devs. That having been said, I'm always a fan of people making things fast :)

Note: one thing that working on that project showed me is how hard it is to get Apple to use a custom binary for code signing, unlike e.g. the linker which is easy to specify. You can see in the README the sort of gymnastics required for devs using Xcode. So, that would be something to consider when quantifying the benefits of getting rcodesign to support codesign's flags.

indygreg commented on September 17, 2024

Thank you for the detailed write-up @michaeleisel! I don't do any iOS development and you enlightened me about some pains that now appear obvious in hindsight.

I agree there are optimizations we can do to improve performance:

  • Caching resource digests (based on file mtime?) to avoid I/O + digest overhead on re-signs.
  • Only computing SHA-1 to avoid SHA-256 overhead.
  • Parallel I/O + digest computation for non-Mach-O resource files.
  • Not reading / writing the full Mach-O when signing. (Although you may need to read it for digest computations on re-signs.)
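
The first item, caching resource digests keyed on file metadata, could look roughly like the Python sketch below. The class name and the choice of (size, mtime) as the cache key are assumptions made for illustration; nothing like this is confirmed to exist in rcodesign.

```python
import hashlib
import os


class DigestCache:
    """Remembers file digests and reuses them while size and mtime are unchanged."""

    def __init__(self) -> None:
        # path -> (size, mtime_ns, digest)
        self._entries: dict[str, tuple[int, int, bytes]] = {}
        self.misses = 0  # number of times a file actually had to be hashed

    def digest(self, path: str) -> bytes:
        st = os.stat(path)
        key = (st.st_size, st.st_mtime_ns)
        cached = self._entries.get(path)
        if cached is not None and cached[:2] == key:
            return cached[2]  # metadata unchanged: skip the I/O and hashing
        self.misses += 1
        hasher = hashlib.sha256()
        with open(path, "rb") as fh:
            while chunk := fh.read(1024 * 1024):
                hasher.update(chunk)
        value = hasher.digest()
        self._entries[path] = (*key, value)
        return value
```

The trade-off is that a file rewritten with the same size and the same mtime would be served a stale digest, which is the flakiness risk that comes with trusting mtime.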

FWIW I wasn't aware of that __entitlements Mach-O section bit. I don't think I've ever come across that! Will need to do more research. If you have any info on it, a new issue with details would be very much appreciated!

As for getting Xcode to invoke rcodesign, yeah, that's tricky. There's an open issue on having rcodesign expose a codesign-compatible CLI. But if you can't get Xcode to invoke our executable, it may be all for nothing. I'm sure there's a way to employ PATH-based tricks here. There's got to be, right? But the performance improvements have merit even if it is a pain to get Xcode to use rcodesign.

Thanks again for the detailed write-up!

michaeleisel commented on September 17, 2024

Yeah, it's tricky getting it to use a custom path for codesign. In my build logs, I see it explicitly writes out /usr/bin/codesign ..., so I don't think the PATH will help there. And of course with SIP, it's hard to mess with that particular path. You can still use a custom version if you follow the guide in that README, but it either needs a custom build system like Bazel, or you have to make a build step that gets run every time regardless of whether anything has changed. The latter is not the worst thing, and maybe you can have some caching system that avoids unnecessary work anyway. After all, people usually only run a build when something has meaningfully changed.

mtime is out of style for avoiding build tasks in the iOS community, as it has proven flaky. Xcode may still use it, but Bazel is all about hashes. Maybe a faster hash function like XXH3 would be preferable to check if anything needs changing (one could even cache by, say, 16 KB chunks, as is done for code signing), or maybe there could be a --take-some-mtime-risks sort of flag.
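
The chunk-level idea could be sketched as follows in Python. XXH3 is not in the standard library, so this sketch swaps in a short BLAKE2b digest as a stand-in fingerprint; the function names are hypothetical and the 16 KB chunk size is simply the figure floated above.

```python
import hashlib

CHUNK = 16 * 1024  # 16 KB chunks, as suggested above


def chunk_fingerprints(data: bytes, chunk: int = CHUNK) -> list[bytes]:
    """Cheap per-chunk fingerprints (BLAKE2b-64 here, standing in for XXH3)."""
    return [
        hashlib.blake2b(data[i:i + chunk], digest_size=8).digest()
        for i in range(0, len(data), chunk)
    ]


def changed_chunks(old: list[bytes], new: list[bytes]) -> list[int]:
    """Indices of chunks in `new` that are missing from or differ from `old`."""
    return [i for i, fp in enumerate(new) if i >= len(old) or old[i] != fp]
```

Only the chunks reported by changed_chunks would need their signing digests recomputed; the fingerprints from the previous build would be stored alongside the cached digests.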

Overall though, all of the things you suggest sound very reasonable. There's also room for improvement with codesign_allocate that AutoPen never messed with. Note that SHA-1, as with resource rules, is essentially deprecated and Apple could someday remove support for running SHA-1 signed binaries.

I don't have any particularly good docs on __entitlements, aside from chapter 5 of book 3 of *OS Internals, which covers code signing in general in great depth. If you want, I have a copy of the book that I'm happy to lend for a few months (https://www.amazon.com/MacOS-iOS-Internals-III-Insecurity/dp/0991055535/).

roblabla commented on September 17, 2024

> I don't have any particularly good docs on __entitlements, aside from chapter 5 of book 3 of *OS internals, which goes into code signing in general very in-depth.

I have MOXiI vol3, and I couldn't find anything about __entitlements in it (all the references to entitlements are talking about the XML/DER in the CodeSigning blob). Do you have a specific section in mind?
