
Support interrupts · llvm-mos · 18 comments · Closed

llvm-mos commented on June 6, 2024
Support interrupts

Comments (18)

beholdnec commented on June 6, 2024

When I thought about this problem in my llvm-m6502 project, I determined that there are three "threads" in a 6502-based system: RESET (the main thread), NMI, and IRQ.

A function can have a static stack if only one instance of it is ever active at a time (I call these singleton functions). A function is guaranteed to be a singleton if it:

  • Never recurses
  • Is only called in one thread

Proving that a function is a singleton may be difficult, but perhaps it would suffice to allow programmer annotations, such as __attribute__((interrupt(reset))) and __attribute__((interrupt(irq))), or __attribute__((interrupt(*))) for functions that can be called in any thread. A rule could be enforced that says a function can only call other functions within its own thread. For example, an interrupt(irq) function can only call other interrupt(irq) functions.
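
A minimal sketch of what such annotations might look like, assuming the attribute spelling proposed above (none of this is an existing llvm-mos interface):

```c
/* Hypothetical spelling of the proposed annotations; none of this is an
 * existing llvm-mos interface. */
__attribute__((interrupt(irq))) void play_music(void) { /* ... */ }

__attribute__((interrupt(irq))) void irq_service(void) {
    play_music();    /* OK under the proposed rule: callee is in the same thread */
    /* update_score();   would be rejected if update_score were annotated
                         interrupt(reset), i.e. main-thread only */
}
```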

mysterymath commented on June 6, 2024

The problem is worse than I thought. If, for example, a function that is directly callable by an interrupt is sequestered in its own module, then the conservative list of functions it could possibly call is every externally visible function, including main. Being conservative here would disable static stacks entirely.

"Might be called in an interrupt handler" is a globally viral property. If any such function could possibly call anything, it infects all possible callees with the same property. Thus, without extremely detailed caller/callee tracking, even through assembly, there's no way to limit the infection's spread.

mysterymath commented on June 6, 2024

Thinking about this another way, the problem isn't too different from the present situation with zero-page registers. The static stack regions are essentially just "caller-saved memory locations". The interrupt handler prologue and epilogue could technically save all static stack regions to the soft stack. This satisfies the C99 standard, since the behavior is "as if" static stack were maintained on a per-invocation basis. This would suck for the exact same reason as the zero page registers: there's too much memory, and ISRs need to be tight.

It shouldn't actually be too difficult to write that; it's just a matter of emitting the static stacks to a dedicated section, then using the section size and address to do the copy in ASM. The same solution as for ZP registers would then solve both problems: if you want a signal handler to be fast, you need to restrict both its zero page register usage and its static stack usage. This allows the save and restore code to be sufficiently tight. The mechanism for doing so would be separate compilation of the signal handler, with flags to decrease imaginary register usage and disable static stack usage.
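
A rough sketch of what that fallback could look like at the C level, assuming hypothetical linker-provided section bounds (the symbol names and the size cap are invented for illustration):

```c
#include <stddef.h>
#include <string.h>

/* Assumed, not real: linker-provided bounds of a dedicated section that
 * holds every function's static stack region. */
extern char __static_stack_start[];
extern char __static_stack_end[];

#define STATIC_STACK_MAX 64   /* illustrative cap; assumes the section fits */

void irq_handler(void) {      /* entered with interrupts disabled */
    char saved[STATIC_STACK_MAX];               /* buffer lives on the soft stack */
    size_t n = (size_t)(__static_stack_end - __static_stack_start);
    memcpy(saved, __static_stack_start, n);     /* prologue: save all static stacks */
    /* ... the actual interrupt work ... */
    memcpy(__static_stack_start, saved, n);     /* epilogue: restore them */
}
```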

mysterymath commented on June 6, 2024

Hey, thanks for the advice!

This does work, but it does restrict the generality of C in a way that might not be obvious. It's not uncommon to divide an interrupt handler (say, just for IRQ) into fast and slow portions, with only the fast portion running with interrupts disabled. In that case, another IRQ could come in while the slow portion of a previous one is still running. This is particularly typical of, say, preemptive operating systems, where all the different processes are essentially "the slow portions of the clock interrupt", to be overly reductive.
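
For concreteness, a sketch of that split in plain C; the helper names are stand-ins, not a real API:

```c
/* Fast/slow IRQ split; the helpers are illustrative stubs. */
volatile unsigned char work_pending;     /* handed from the fast part to the slow part */

static void acknowledge_irq(void)   { /* stub: clear the device's interrupt flag */ }
static void enable_interrupts(void) { /* stub: CLI on a real 6502 */ }
static void slow_part(void)         { /* stub: lengthy work, runs with IRQs enabled */ }

void irq_handler(void) {                 /* entered with the I flag set (IRQs masked) */
    acknowledge_irq();
    work_pending = 1;                    /* fast part: bookkeeping only */
    enable_interrupts();                 /* from here on, another IRQ can arrive... */
    slow_part();                         /* ...while this slow part is still running */
}
```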

Accordingly, we'd need a more general mechanism to allow programmers to build and model guarantees of non-simultaneity throughout the program. I do recall spending a bit of time along these lines, but not much, so there might be some fertile ground there somewhere.

johnwbyrd commented on June 6, 2024

@beholdnec, you remember that I spent a good bit of time going through your work a few years back. I'd love to see you jump into this project and help improve the state of things. There is still a lot to be done. We could use another hand that understands LLVM internals.

mysterymath commented on June 6, 2024

Alright, I've spent some more time thinking about this, and I think I have a reasonably workable plan.

We can add two target-specific function attributes: __attribute__((interrupt)) and __attribute__((recursive_interrupt)). Both indicate that the function can be the entry point of a call graph that can interrupt other currently-active call stacks (either the one rooted at main, or one rooted at an interrupt).

The difference between the two is that the first is declared not to be able to "interrupt itself": that is, if one invocation of an "interrupt" function is active, then that same interrupt function cannot be activated again until the first activation finishes. Typically this is ensured by leaving interrupts disabled for the duration of the handler.

Recursive interrupts don't have this guarantee; they can be interrupted "by themselves". Typically this is the case if interrupts are re-enabled shortly after the handler is entered.
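
Spelled out, the two attributes as proposed here would look something like this (the spelling is this proposal's, not necessarily a final API):

```c
/* Keeps interrupts masked for its whole body, so it can never nest. */
__attribute__((interrupt)) void irq_handler(void);

/* Re-enables interrupts early (or is driven by the unmaskable NMI), so a new
 * activation can begin before the previous one returns. */
__attribute__((recursive_interrupt)) void nmi_handler(void);
```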

The compiler would take stock of these attributes by tracing the call graph starting from each possible root of the program: main and each of the interrupt functions. It would maintain two bits for each function: whether it can use the static stack, and whether it can use the zero page registers. Both would start true.

The compiler would start with main, disabling static stacks for each recursive function. This is what it does today.

Next, it would trace each regular interrupt's call graph. All visited functions would have their register usage disabled, and any recursive functions or functions already visited in a previous sweep would have their static stacks disabled.

Finally, each recursive interrupt's call graph would be traced, and each function visited would have both its register usage and static stacks summarily disabled.
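
A sketch of that three-phase sweep in C, as an illustration of the idea rather than the actual LLVM pass (the data structure and flag names are invented):

```c
#include <stdbool.h>
#include <stddef.h>

struct Func {
    bool can_use_static_stack;   /* starts true */
    bool can_use_zp_registers;   /* starts true */
    bool is_recursive;           /* assumed input: part of a call-graph cycle */
    int  last_sweep;             /* 0 = never visited by any sweep */
    struct Func **callees;
    size_t num_callees;
};

static void sweep(struct Func *f, int sweep_id, bool from_interrupt, bool from_recursive) {
    if (f->last_sweep == sweep_id)
        return;                               /* already handled from this root */
    if (f->last_sweep != 0)
        f->can_use_static_stack = false;      /* also reachable from an earlier root */
    f->last_sweep = sweep_id;

    if (f->is_recursive || from_recursive)
        f->can_use_static_stack = false;      /* activations may overlap in time */
    if (from_interrupt || from_recursive)
        f->can_use_zp_registers = false;      /* keep the interrupt save set small */

    for (size_t i = 0; i < f->num_callees; i++)
        sweep(f->callees[i], sweep_id, from_interrupt, from_recursive);
}

/* Usage: sweep(main_fn, 1, false, false); then each interrupt root with
 * from_interrupt = true; then each recursive_interrupt root with both flags
 * true, bumping sweep_id for every root. */
```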

I think we'll be able to get away with this kind of tracing now that we're forcing LTO for the compiler to work. We can get a degree of introspection into the assembly parts of the call graph by examining symbol references by section. It's not perfect, but it'll probably be quite good, and there are easy ways for users to fix inaccuracies in its reckoning. It can also be conservative enough to always generate correct code.

The motivation for this is to have interrupt routines have low constant startup overhead. Any zero page registers used by an interrupt would always need to be saved before the interrupt can begin; it's a good idea to keep that set small. Static stack regions, on the other hand, might not conflict between interrupt and non-interrupt functions; it depends on whether or not the same function is reachable from more than one context. Hence we trace whether interrupts are recursive or non-recursive and examine their call graphs, which should allow us to determine whether or not each function is callable from more than one context.

We can't entirely eliminate zero page usage, nor would we want to. Instead, the zero page register disable would switch routines over to a calling convention with a small fixed number of zero page registers, with that number configurable by the user. This would ensure a small, relatively fast interrupt entry sequence; you'd only need to save 2 or 4 zero page registers, A, X, and Y, and the 5 compiler temporary locations.

Having few zero page registers will slow the calling sequence down quite a bit; large arguments will have to go through the soft stack. However, once we have a whole-program optimization to do so, we should be able to pass and return arguments via static stack for routines that aren't accessed via function pointer. This makes the function calling sequence within interrupts very similar to what is typically done by hand: memory is reserved for each function, and care is taken by the programmer to keep the interrupt and non-interrupt call sequences separate.

mysterymath commented on June 6, 2024

And this, and any solution like it, breaks down in the presence of function pointers. Can't have different functions with silently different calling conventions. So sub out the calling convention stuff: there's only one calling convention (for now), and both interrupt and non-interrupt functions would use it.

This still isn't workable. Sigh... more thought is required.

mysterymath commented on June 6, 2024

For now, let's just provide a way to disable static stacks and save the zero-page registers. We already have a way to limit the number used, so this is the lowest-effort way to support the standard C semantics.

From there, we can start looking at targeted optimizations to restore the MOS-ness of the interrupt world, but we'll always need a baseline to fall back to.

beholdnec commented on June 6, 2024

Ignore the details and focus on the top-level problem for a moment. We need a way to determine IsSingleton for every function in the program, where a singleton function is as defined in my previous message.

Let's say we have a list of functions that act as call-graph heads (main(), irq(), nmi(), etc.). Is it feasible to annotate every function in the program as IsSingleton=Yes/No/Maybe? When function pointers are involved, does LLVM provide enough information to do this?

mysterymath commented on June 6, 2024

If we required users to hand-annotate every single function (or not) with isSingleton, then yes, LLVM would be able to know whether a function was a singleton.

There's really, really no way to tell what a function pointer points to, though; you can take a function pointer, turn it into an int, run AES on it with a key, and stick it in a table. Then decrypt the pointer with AES and call the resulting function. What does it point to? Who knows! All you can know is that it's some function with the given signature, and one that was either externally visible to the linker or had its address taken.

Thus, any external function or function that has its address taken has to have the same calling convention. It doesn't necessarily have to use the same set of internal registers though.
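
A toy example of that laundering argument in (mostly) portable C; the integer round-trip is implementation-defined, which is exactly why the compiler can't see through it:

```c
#include <stdint.h>

static int handler(int x) { return x + 1; }

int call_laundered(void) {
    uintptr_t bits = (uintptr_t)&handler ^ 0xA5A5u;    /* obfuscate and stash the pointer */
    int (*fp)(int) = (int (*)(int))(bits ^ 0xA5A5u);   /* recover it much later */
    return fp(41);   /* all the compiler can assume: some int(int) whose address escaped */
}
```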

Right now, a function will happily use 80 or so zero page registers for its arguments. It doesn't matter whether or not a function is a singleton; if an interrupt preempts that function, and the interrupt's own code uses its own 80 registers, the interrupt has to save all 80. There's no way of knowing which function will be interrupted, so it has to save and restore every register that anything could possibly be using. This would generally be too slow without tightly restricting the number of zero page registers used for arguments.

This is why, say, ARM has only 32 registers, even though internally the processors actually have 800 or so. You need to limit the number of registers that you need to save and restore in the interrupt entry routine. We'd need to do so too.

beholdnec commented on June 6, 2024

I found LLVM's "callees" annotation
https://llvm.org/docs/LangRef.html#callees-metadata
This could be helpful but it's a false hope. Clang doesn't ever seem to generate it.

mysterymath commented on June 6, 2024

We do, and we use it, but it only occurs if there are 3 or fewer possible callees. Very specific optimization, that.

mysterymath commented on June 6, 2024

That's the rub: there is a lot of information we can obtain and use about how functions are used, but all of it is imperfect.

We first need to figure out an acceptable baseline: this is what we fall back to any time our more specific rules fail.

Then, we need to build towards our desired state, one rule at a time. Each rule needs a known set of conditions under which it applies, a way to know, 100% for certain, whether it applies, and a way to fall back when it doesn't.

The existing static stack analysis plays within these rules; it's based on the control flow graph analysis in LLVM, which is conservative, so it won't activate unless it's sure it's safe. The big obvious broken case, interrupts, we explicitly exclude from consideration; they "can't happen", or it's "undefined behavior."

There's a ton of stuff we can do really easily if we don't care what happens in edge and corner cases, but that's not the game!

mysterymath commented on June 6, 2024

Okay, so say we start with the "Standard C" baseline: only 5 or so zero page pointers, and the 5 compiler temporary locations need to be saved on interrupt. That's 11 locations to save to the hard stack, since we don't have to save callee-saved registers. Not bad, not great. Nothing uses static stacks.

From there, the interrupt and recursive_interrupt attributes are enough to restore static stacks to this scenario. All we need for static stacks is the isSingleton bit, which can be conservatively determined by walking the call graphs as described above.

I'll think more about how to safely expand zero page usage.

mysterymath commented on June 6, 2024

From here, we can promote byte and word static stack locations from absolute memory to zero page memory. Also, just as with the static stack, whenever we can prove that two functions cannot be simultaneously active, they can share the same static stack and zero page locations.

We'd want to prioritize interrupts, then leaf functions, as they're usually the hottest parts of a program.

This doesn't do anything to improve the calling convention, though.

mysterymath commented on June 6, 2024

Say the compiler observes a direct, unambiguous call from one function to another. In such cases, we could customize the calling convention based on the specific callee. However, other callers still need to be able to reach the callee even when it's ambiguous which function is being called (e.g., through a function pointer).

We can split the callee in two: one version used by unambiguous calls, and the other used by ambiguous calls. The latter would retain the name of the function, while the former would be mangled. The latter would call the former after translating from the uniform calling convention to the callee-specific one.

The overhead of this approach would be an extra JMP per ambiguous function call. Definitely worthwhile.
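
Written out by hand, the split might look roughly like this; the names and the static-stack slots are invented for illustration:

```c
/* Per-function "static stack" slots for the specialized entry (invented here;
 * a later pass might promote them to zero page). */
static unsigned char draw_x, draw_y;

/* Specialized body: direct, unambiguous callers would target this (under a
 * mangled name) and pass arguments by storing straight into the slots above. */
static void draw_sprite_impl(void) {
    /* ... uses draw_x and draw_y directly ... */
    (void)draw_x;
    (void)draw_y;
}

/* The public symbol keeps the uniform calling convention so indirect and
 * external callers still work; it just translates and forwards (the extra JMP). */
void draw_sprite(unsigned char x, unsigned char y) {
    draw_x = x;
    draw_y = y;
    draw_sprite_impl();
}
```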

This leaves the question of what calling convention we'd want to assign to each function.

mysterymath commented on June 6, 2024

If a function cannot use static stacks, then the default C calling convention is actually pretty good. It's assumed recursive one way or another, so most of the arguments and locals really will need to end up on the soft stack.

If a function can use static stacks, then it would want its arguments on the static stack.

Thus, either way, "all the arguments go on the stack"; the only difference is whether that's the dynamic stack or a per-function static stack. You can only use the latter if you know the callee for certain; if you don't, you call a wrapper that copies from the dynamic stack to the static stack. This overhead would only be incurred on indirect calls.

Since static stack locations would now often be placed in the zero page, this restores a largely zero page calling convention in the presence of interrupts. This also seems to mesh well with how I've seen program and OS code laid out on target systems, which is a good sign.

mysterymath commented on June 6, 2024

I've split this issue into #66, #67, and #68. The first is the only one I'd personally consider in-scope for a v1; it'll get us a sort-of OK interrupt solution that has the rough shape of a final one.
