Comments (2)

mysterymath commented on June 2, 2024

To me, this seems like a consequence of the fundamental tension between the way ELF is intended to be used and the way CP/M-65 is using it.

What CP/M-65 is doing is termed "load-time relocation": the executable contains relocations that a dynamic loader (CP/M-65) performs as part of the loading of that executable into memory.
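As a rough illustration of what "load-time relocation" means here, the following Python sketch patches a tiny 6502 image that was linked as if it started at address 0. The function name, the list of relocation offsets, and the whole-word patching are assumptions made for illustration; this is not CP/M-65's actual relocation format.

```python
def load_with_relocations(image: bytes, reloc_offsets: list[int], load_base: int) -> bytearray:
    """Copy `image` and add `load_base` to each 16-bit little-endian word
    whose offset is listed in `reloc_offsets` (hypothetical format)."""
    loaded = bytearray(image)
    for off in reloc_offsets:
        value = int.from_bytes(loaded[off:off + 2], "little")
        value = (value + load_base) & 0xFFFF  # 6502 addresses are 16 bits wide
        loaded[off:off + 2] = value.to_bytes(2, "little")
    return loaded


# JMP $0003 (opcode 0x4C) linked at base 0, followed by RTS (0x60).
image = bytes([0x4C, 0x03, 0x00, 0x60])
# The operand of the JMP sits at offset 1 and must follow the image.
loaded = load_with_relocations(image, [1], 0x0800)
```

After loading at 0x0800 the jump targets 0x0803, which is still the RTS, now at its new location.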

In typical ELF usage, load-time relocation uses the symbol and reloc information in the dynamic sections. The main ELF relocation and symbol sections are instead intended for relocatable objects used as part of a final link, eventually ending up as either a shared library (which is really a type of ELF executable) or an executable. Thus, it's expected that there are no outstanding non-dynamic relocations in an executable; any present are informative, not normative.

Typically, load-time relocation can only be done for shared libraries, although position-independent executables (-fpie and friends) can also have dynamic relocations. At present, though, PIE tells the compiler to generate position-independent code; this typically necessitates a GOT, PLT, PC-relative addressing, all that jazz. That isn't intrinsic; you could instead imagine the compiler producing position-dependent code, then the linker turning it into an executable with dynamic relocs, as if it were a load-time relocatable shared library. That would be what you'd ideally want to take as input in CP/M-65; it would have a set of relocations (the dynamic ones) specifically called out by the linker as being the right ones to fix up at load time.

However, I'm wildly unsure whether the linker can do anything like the above at present. I'd had a mind to look into this for my own hobby OS project, but I hadn't actually spent much time on it. It may take some linker work, but that isn't exactly an obstacle, it's just something that adds to the latency of a real solution for this.

That being said, I do want both load-time relocatable executables and real PIC with a GOT and PLT to eventually be artifacts that llvm-mos can produce, independently of CP/M-65. It's come up enough times in enough contexts that it seems like a good idea. I've just no idea when I'd get around to it. It's also very researchy; I handwaved a lot of the above, such as whether the linker really has enough information to tell what should be relocated and what shouldn't. It really feels like it should know though; I can't see a meaningful difference between shared libraries or PIC executables and the CP/M-65 case.

from llvm-mos-sdk.

mysterymath commented on June 2, 2024

I had a chat with Roland McGrath (look him up, but I didn't ping him here, since hopefully he has better things to do ;) about this, and he had a ton of useful historical perspective.

Apparently there is actually a lot of prior art for doing this kind of load-time relocation in ELF executables in the UNIX world. 32-bit x86 shares the 6502's difficulty in emitting PC-relative addressing, so it was common for contemporaneous linkers to emit "TEXTREL" text relocations to point code at a PLT or GOT. These would be stored in the dynamic sections and be fixed up at load time, but they point at the .text section, rather than .got or .plt. This is precisely equivalent to what CP/M-65 is doing.
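For reference, an ELF file advertises text relocations through its dynamic section, either with a DT_TEXTREL entry or with the DF_TEXTREL bit in DT_FLAGS. The Python sketch below scans raw dynamic-section bytes for either marker. It assumes a 32-bit little-endian ELF (Elf32_Dyn entries of two 32-bit words), and `has_textrel` is a made-up helper name; only the tag and flag values come from the ELF specification.

```python
import struct

# Constants from the ELF gABI.
DT_NULL = 0
DT_TEXTREL = 22
DT_FLAGS = 30
DF_TEXTREL = 0x4


def has_textrel(dynamic: bytes) -> bool:
    """Return True if the raw .dynamic contents mark the file as having
    text relocations. Assumes Elf32_Dyn layout, little-endian."""
    for d_tag, d_val in struct.iter_unpack("<iI", dynamic):
        if d_tag == DT_NULL:  # DT_NULL terminates the dynamic array
            break
        if d_tag == DT_TEXTREL:
            return True
        if d_tag == DT_FLAGS and d_val & DF_TEXTREL:
            return True
    return False


old_style = struct.pack("<iIiI", DT_TEXTREL, 0, DT_NULL, 0)
new_style = struct.pack("<iIiI", DT_FLAGS, DF_TEXTREL, DT_NULL, 0)
no_textrel = struct.pack("<iI", DT_NULL, 0)
```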

The biggest revelation from this discussion was that I was modelling how this worked incorrectly: there's absolutely nothing in the compiler that enables this. -fPIC and -fPIE purely have to do with requesting the compiler emit GOTs and PLTs; if you've already decided that you want load time relocation instead, then the compiler already emits a relocatable object: that's just a regular .o file!

So, it's the linker that needs a feature to support turning a collection of relocatable .o files into a load-time relocatable executable using TEXTREL. Fangrui (LLD maintainer) generally hates TEXTRELs, and they're generally dispreferred due to disabling sharing between executables and making more pages of the executable writable than need be, which harms security. So I doubt LLD has any special handling for this today, but I also doubt it would be particularly difficult to add.

That being said, when I think about how that feature might actually work, it seems very similar to what you're already doing with the linker today... which suggests that there are facets of our SDK that are hostile to this working. The existence of symbols like __zp_data_size is one, just as you've highlighted. Notably, UNIXen use __start and __stop symbols for this purpose, and now I think I understand why. In a relocatable binary, symbol values are taken to refer to addresses that are relative to the ELF base: the ELF base is in turn the smallest VMA of any PHDR. We could take this to be zero by fiat on the 6502, and indeed I believe our linker backend is actually set up that way.

That also means that it isn't generally possible to encode an absolute address reference into a load-time relocatable binary at all; after all, how could the linker know anything about where the program will be loaded? Instead, everything in the program image would need to be either relative to 0 (program start) or an undefined dynamic symbol reference. The former addresses can have an addend added by the loader, and the latter have their actual absolute addresses substituted in by the loader.
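A minimal Python sketch of those two kinds of fixup follows. The `Fixup` record, the `BDOS` symbol name, and the addresses are all hypothetical; the point is only that image-relative words get the load base added, while undefined symbol references are replaced with an address the loader supplies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fixup:
    offset: int                   # where in the image the 16-bit word lives
    symbol: Optional[str] = None  # None means "relative to program start"
    addend: int = 0               # only used for symbol references


def apply_fixups(image: bytes, fixups: list[Fixup], load_base: int,
                 loader_symbols: dict[str, int]) -> bytearray:
    out = bytearray(image)
    for f in fixups:
        if f.symbol is None:
            # Zero-relative address: add the load base to the stored value.
            stored = int.from_bytes(out[f.offset:f.offset + 2], "little")
            value = stored + load_base
        else:
            # Undefined dynamic symbol: substitute the loader's address.
            value = loader_symbols[f.symbol] + f.addend
        out[f.offset:f.offset + 2] = (value & 0xFFFF).to_bytes(2, "little")
    return out


# JSR <BDOS placeholder>; JMP $0006; RTS, linked at base 0.
image = bytes([0x20, 0x00, 0x00, 0x4C, 0x06, 0x00, 0x60])
fixups = [Fixup(offset=1, symbol="BDOS"), Fixup(offset=4)]
loaded = apply_fixups(image, fixups, 0x0200, {"BDOS": 0xFF00})
```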

So, @davidgiven, I wanted to specifically ask: Would making sure that everything in the process image is "zero relative" work for CP/M-65? My intuition says yes; programs should already have undefined references to the BDOS fixed up by the loader, right?

The one snag I could see is the separation between zero page and the regular image; but that would be relatively straightforward to encode: just treat addresses 0-100 as the zero page, by fiat. The loader would then just need to add a different base to small symbol values than to large ones.
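The "different base for small values" idea could look something like the Python sketch below. The cutoff of 0x100 is an assumption about what "0-100" means, and `relocate_value`, `zp_base`, and `mem_base` are illustrative names rather than anything from CP/M-65 or the SDK.

```python
ZP_LIMIT = 0x100  # assumption: link-time values below this are zero-page addresses


def relocate_value(value: int, zp_base: int, mem_base: int) -> int:
    """Pick the base by the size of the link-time value: small values are
    zero-page addresses, everything else belongs to the main image."""
    if value < ZP_LIMIT:
        return (zp_base + value) & 0xFF
    return (mem_base + value) & 0xFFFF


# A zero-page variable linked at $10 and a routine linked at $0120.
zp_addr = relocate_value(0x0010, zp_base=0x80, mem_base=0x0700)
code_addr = relocate_value(0x0120, zp_base=0x80, mem_base=0x0700)
```

One consequence of choosing by magnitude is that no main-image address may fall below the cutoff at link time, so the regular image would have to be linked to start at or above it.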

If this would work out, I could start looking at what in common actually violates this property. I do want both PIC (someday) and load-time relocation to work, so I think our common should broadly maintain this property, so long as it's actually sufficient to make operating systems like CP/M-65 work.

With that repaired, I'd expect that the existing -q based solution used by CP/M-65 would work again. But it would also open the door to a much more standard dynamic symbol version of the tool in the linker; this would remove the need for any shenanigans, and the resulting binaries would be absolutely bog-standard ELF, analyzable with any of the flotilla of usual tools. The ELF to CPM tool would then just need to consume the .dynamic and dynamic relocation sections instead of the usual relocation ones.
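To give a feel for what consuming dynamic relocations would involve, here is a Python sketch that decodes raw relocation entries. It assumes 32-bit little-endian Elf32_Rela records (offset, info, addend); whether the linker would actually emit Rela or Rel entries for this target is an assumption, and `parse_rela32` is a hypothetical helper, not part of any existing tool.

```python
import struct


def parse_rela32(data: bytes) -> list[dict]:
    """Decode a run of Elf32_Rela entries (12 bytes each, little-endian)."""
    entries = []
    for r_offset, r_info, r_addend in struct.iter_unpack("<IIi", data):
        entries.append({
            "offset": r_offset,     # location to patch
            "sym": r_info >> 8,     # ELF32_R_SYM: index into the symbol table
            "type": r_info & 0xFF,  # ELF32_R_TYPE: target-specific reloc kind
            "addend": r_addend,     # constant added to the symbol value
        })
    return entries


# One entry: patch offset 0x10 against symbol 3, reloc type 1, addend 4.
raw = struct.pack("<IIi", 0x10, (3 << 8) | 1, 4)
relocs = parse_rela32(raw)
```

A converter would then walk these entries and translate each one into whatever relocation encoding the CP/M-65 loader expects.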
