
Comments (20)

int-index commented on May 10, 2024

I agree that an STG interpreter would be a better fit for this than a Core interpreter. And, quite excitingly, @csabahruska has developed one (with an entirely different motivation!)

I’d love to see Csaba’s work go upstream and applied to solve the problem of making TH available in more scenarios (e.g. in stage1 GHC, so that GHC itself could use TH, that would be terrific).

Furthermore, I think this would help DTs. There’s one thing common between TH and DTs: compile-time evaluation. Currently, type-level computation is done with type families, and their performance is quite bad. Instead, what we should aim for is that promoted functions have the same space/time complexity at the type level as they do at the term-level. At least when they are applied to plain data (i.e. no type variables). And one way to achieve that would be to compile them to STG, and then use an STG interpreter. We could even do that before DTs, to speed up type families.

from ghc-proposals.

chrisdone commented on May 10, 2024

You're in for an adventure.

I made a start on a "pure-Haskell" STG interpreter a year ago: https://github.com/chrisdone/prana It could run some simple functions.

As Simon suggested, it interpreted everything all the way down: base, ghc-prim's primops, etc. Otherwise you have to "FFI" from your interpreter to native GHC's runtime representation.

If you're looking for a "feasibility study" of how hard this would be to do, you can consider that project a modern (ish) analysis. It also doesn't do any fancy optimizations, every step is the dumbest thing possible. I even put tasks in a table with my appraisal of how difficult each was or was expected to be. Exceptions, for example, raise an interesting design challenge. Another open challenge is FFI support.

I would recommend interpreting STG, not Core. Core is full of assumptions and missing things. STG is the actual "ready to interpret" type. I tried Core first and hit a dead end. STG is also simpler: all thunks are explicit, there are only lets, no lambdas (ostensibly, though I've seen 'em), etc. Quoting myself from 2019:

I think this mailing list thread turns out to be an x/y problem. I need STG.

I ended up doing lots of cleaning steps to generate more well-formed Core. I learned that the Core that comes from the desugarer is incomplete, and realized today that I was duplicating work done by the transformation that converts Core to STG. I'd initially avoided STG, thinking that Core would suffice, but it seems that STG is the AST I was converging on anyway. In STG, all class methods are generated, wrappers are generated (and inlined as appropriate), primops and FFI calls are saturated, there are no types or coercions, etc. It even has info about whether closures can be updated and how.

Just running the "lambda calculus with thunks" is one thing, but if you want to genuinely interpret GHC Haskell it's a much broader surface area.
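
To make the contrast concrete, here is a minimal sketch of the shape that makes STG "ready to interpret": function arguments are atoms (variables or literals), and every allocation is an explicit let. All names and types here are hypothetical and vastly simplified (no updates, constructors, or primops), nothing like GHC's real AST:

```haskell
import qualified Data.Map as M

-- Arguments are atoms only: a variable or a literal, never a nested expression.
data Atom = Var String | Lit Int deriving Show

data Expr
  = App String [Atom]   -- application of a known name to atoms
  | Let String Lam Expr -- explicit allocation of a closure
  | AtomE Atom
  deriving Show

data Lam = Lam [String] Expr deriving Show

type Env = M.Map String Val
data Val = IntV Int | Clo Env Lam

eval :: Env -> Expr -> Val
eval env (AtomE a)     = atom env a
eval env (Let x lam e) = eval (M.insert x (Clo env lam) env) e
eval env (App f args)  =
  case M.lookup f env of
    Just (Clo cenv (Lam ps body)) ->
      -- Bind parameters to already-evaluated atoms; left-biased union shadows.
      eval (M.union (M.fromList (zip ps (map (atom env) args))) cenv) body
    _ -> error ("unbound or non-closure: " ++ f)

atom :: Env -> Atom -> Val
atom _   (Lit n) = IntV n
atom env (Var x) = maybe (error x) id (M.lookup x env)

main :: IO ()
main = do
  -- let id = \x -> x in id 42
  let prog = Let "id" (Lam ["x"] (AtomE (Var "x"))) (App "id" [Lit 42])
  case eval M.empty prog of
    IntV n -> print n
    _      -> error "expected an Int"
```

Because arguments are pre-flattened to atoms and closures are allocated only at lets, the evaluator never has to decide where thunks live; that decision was made by the Core-to-STG pass.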

from ghc-proposals.

goldfirere commented on May 10, 2024

Apologies if I'm being dense, but I really do not understand what this is proposing. What is --target? What is the "ABI compatibility problem"? How do --target and external interpreters help? What don't they do? How would a Core interpreter help? What's "highly hostile" about today's setup?

from ghc-proposals.

angerman commented on May 10, 2024

@goldfirere, you need to look at this through the cross compilation lenses.

  • -target refers to ghc eventually being able to produce code for multiple targets from the same compiler, by passing the relevant -target flag to ghc, similar to what clang allows.

  • -fexternal-interpreter runs a secondary process (called iserv) as the interpreter. This allows a stage1 GHC to evaluate TH splices via the external interpreter process.

TH in its current (unrestricted) form is a major hurdle to get cross compilation working with GHC.

from ghc-proposals.

Ericson2314 commented on May 10, 2024

Granted, I did write super tersely without explaining anything.

The "ABI compatibility problem" is that we must currently limit the way GHC is bootstrapped so that things work correctly if the ABI changes. GHC's "introspection" way of doing things for TH and GHCi requires that the ABI of the loaded code and GHC itself match. This is only the case when the newly-built GHC has the same ABI as the compiler that built it.

-target helps with cross, by allowing the same GHC to build native code suitable to be loaded for TH, and cross code to actually be installed. But for this to work, -target with the native platform must produce the same ABI as GHC itself, getting us right back where we started. In short, making GHC multi-target doesn't make it multi-ABI. [A multi-ABI RTS would be a real nightmare to maintain.]

External interpreters actually get us closer. If the goal is for TH to be usable everywhere, then let's assume it is used pervasively throughout GHC. The stage1 GHC with the new ABI cannot be used to directly build the stage2 GHC. But if it can talk to the stage0 iserv and use that for the TH, then it can build a stage2 GHC which will produce code using the same ABI it itself is built with, getting us through the dire straits.

Still, relying on external processes and mixing compiler versions is also a huge maintenance burden (too much IO, TH must be conservative in various ways, the protocol must evolve slowly). The golden solution is TH via the clean interpreter. GHC is completely in control of the run-time representations used by the interpreter, and those definitions, writable without taking RTS internals into account, are "stateless" with respect to ABI changes: only the one ABI of GHC itself is relevant, and nothing cares deeply about it anyway. The only restriction is that the template-haskell library provided for TH must match GHC's expectation for the host-guest interpreter IO to work. But this is no more onerous than ghc-prim matching the actual back-end.

from ghc-proposals.

Ericson2314 commented on May 10, 2024

@angerman tells me about the source plugins new to GHC 8.6. These are very cool, and I think they better embody the current implementation strategy. A plugin by nature accesses far too vast an API to be worth the manual virtualization and marshalling the interpreter method uses. Using the old GHC (worst case) to make code ABI-compatible with the new GHC is the necessary price to access all the bells and whistles of the compiler library.

To borrow industry buzzwords, source plugins are good for large, overwrought "waterfall" language extensions, while TH is good for ad-hoc, on-the-fly "agile" sugar.

from ghc-proposals.

simonmar commented on May 10, 2024

Fun fact: GHCi was originally implemented as a Core interpreter. But that implementation never made it as far as a release, because we couldn't get it to fully work. The sticking point was unboxed types. It would be possible if the interpreter didn't have to interact directly with compiled code, but for GHCi it does. Without that constraint the interpreter could choose its own representation for unboxed values (i.e. it could use boxed ones).

But I presume the idea here is to interpret everything, including the base package. For that you would need some way to persist the Core when compiling modules, and load it into the interpreter.
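
The representation freedom Simon describes can be sketched as follows. These are illustrative types only, not GHC's; the point is that an interpreter that never enters compiled code can give even "unboxed" values one uniform boxed representation, so no calling convention ever needs to be agreed on:

```haskell
-- Illustrative stand-ins, not GHC's actual runtime types.
data Value
  = VInt Int          -- stands in for Int#
  | VDouble Double    -- stands in for Double#
  | VTuple [Value]    -- stands in for an unboxed tuple (# ... #)
  deriving (Eq, Show)

-- A primop like (+#) works directly on the boxed stand-ins; nothing
-- native is ever entered, so no register convention is involved.
plusIntHash :: Value -> Value -> Value
plusIntHash (VInt a) (VInt b) = VInt (a + b)
plusIntHash _ _ = error "plusIntHash: ill-typed arguments"

main :: IO ()
main = print (VTuple [plusIntHash (VInt 40) (VInt 2), VDouble 3.14])
```

The moment such an interpreter must pass a `VInt` to compiled code expecting a raw `Int#` in a register, this uniformity breaks down, which is exactly the GHCi constraint Simon mentions.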

from ghc-proposals.

Ericson2314 commented on May 10, 2024

Yes, persisting the Core would be the eventual goal. How different is Core from the "unfoldings" we already persist for downstream inlining?

But as a stopgap, we can implement multi-package source loading. That is much wanted for GHCi and a more incremental ghc --make anyway.

from ghc-proposals.

goldfirere commented on May 10, 2024

It seems this conversation is useful to the cognoscenti in this area. However, unless I'm in the minority here, I'll need quite a bit more text (i.e. a full proposal) to understand this when it comes to the committee. I appreciate the effort in #162 (comment), but I still feel the need for concrete examples showing where, exactly, the current story founders.

If this preliminary conversation is useful, no need to spend lots of time writing it up now, but (as a committee member) I'd very much like to see this idea in the context of a full proposal.

from ghc-proposals.

treeowl commented on May 10, 2024

@simonmar, could you explain what makes unboxed tuples so very hard for an interpreter?

from ghc-proposals.

simonmar commented on May 10, 2024

@treeowl I'm not sure I can recall exactly where we ran into trouble. But consider that you need to `unsafeCoerce` things all over the place because the interpreter (in Haskell) needs to invoke native compiled functions and pass arguments and collect results in the correct way. If you don't need to invoke native compiled functions from the interpreter then it's much easier, because you can choose your own representations.

from ghc-proposals.

Ericson2314 commented on May 10, 2024

And yes, it's precisely in avoiding calling "native compiled functions" that we escape the ABI restrictions that complicate bootstrapping.

from ghc-proposals.

alexbiehl commented on May 10, 2024

could you explain what makes unboxed tuples so very hard for an interpreter?

IIRC the hard part is lifting return values from native functions back into the interpreter. The interpreter operates on the stack only. To call a native function, it pushes a "return frame" onto the stack, which pushes the result of the native function (which is in R1, L1, F1, D1, ...) onto the stack before re-entering the interpreter. For this you need only a handful of predefined "return frames", basically one for each PrimRep (I think they are currently defined in StgMiscClosures.cmm).

Now, unboxed tuples can have not one but an arbitrary number of return values of different types - which are in registers and possibly on the stack. For this you would need some kind of "generic return frame" which takes a mapping from register/stack slot to stack slot:

For an unboxed tuple (# Int, Double# #) a mapping could look like:

  • take R1 and put it into stack slot x
  • take D1 and put it into stack slot x + 1

It gets even trickier once you exceed the number of available registers: you would have to shuffle things on the stack... and all of that programmed out in our lovely Cmm. I guess this is a place where JIT compilation has its advantages, as you could just generate code for this translation directly.
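
For illustration, the register-to-slot mapping could be modelled like this. This is a hypothetical sketch in plain Haskell, not GHC's actual Cmm machinery; the register and value names are invented for the example:

```haskell
import qualified Data.Map as M

-- A few of the STG return registers (names per the discussion above).
data Reg = R1 | F1 | D1 | L1 deriving (Eq, Ord, Show)

-- Tagged stand-in for whatever a register may hold.
data RegVal = I Int | D Double deriving (Eq, Show)

-- A "generic return frame": for each component of the returned unboxed
-- tuple, which register it arrives in and which stack slot it goes to.
type ReturnFrame = [(Reg, Int)]

-- Copy each returned component from its register to its stack slot.
applyFrame :: M.Map Reg RegVal -> ReturnFrame -> M.Map Int RegVal
applyFrame regs frame =
  M.fromList [ (slot, regs M.! reg) | (reg, slot) <- frame ]

main :: IO ()
main = do
  -- A native call returning (# Int, Double# #): pointer in R1, double in D1.
  let regs  = M.fromList [(R1, I 7), (D1, D 3.14)]
      frame = [(R1, 0), (D1, 1)]   -- take R1 -> slot 0, take D1 -> slot 1
  print (applyFrame regs frame)
```

The catch described above is that each distinct return convention needs a distinct frame, so a statically predefined set can never cover all unboxed tuple shapes.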

from ghc-proposals.

simonmar commented on May 10, 2024

Right, so I think the problem is that you need an infinite number of different return frames because there are an infinite number of return conventions for functions returning unboxed tuples. You could always generate them at runtime - rather like the way we generate info tables for constructors, which also have a little bit of code - but the return frames would be significantly more complex, and you have to do it for each platform we support.

from ghc-proposals.

Ericson2314 commented on May 10, 2024

There is a similar problem calling foreign code, but at least that doesn't evolve in lockstep with GHC's ABI. And hopefully libffi would make that easy enough.

from ghc-proposals.

mchakravarty commented on May 10, 2024

Isn't the simplest way to think about this to say that we add a new target called core (which is always available)? It never calls native code (or rather, Core is the native code for that target) and you cannot link it — hence, it is only useful for being loaded for GHCi and TH.

from ghc-proposals.

howtonotwin commented on May 10, 2024

@mchakravarty Are you suggesting that core be an actual target, like arm or amd64? If so, then I can think of multiple issues that may need addressing:

  1. What platform will we be "faking"? I don't think "platform-agnostic" Core actually exists today, as e.g. certain primitives exist only on certain targets. If you just have "Core" as a target, you don't specify what these things should behave like. The sensible default is the system running the interpreter; maxInt in the interpreted Core would mean the same thing as maxInt in the interpreter's Core. However, this seems like a tripping point for cross.
  2. How do we get Core in a reasonable amount of time? GHC produces Core when compiling for any target, but GHC doesn't support multiple targets just yet. This means that if I want to produce Core versions of my libraries, I would have to go out and compile myself a whole new GHC that is the same as the one I already have, but less. Perhaps we could jump the gun and add -target early, and just have it support only -target core and -target native or something.
  3. This doesn't seem to work in the "imagine TH is everywhere" scenario. In this scenario, TH is prevalent within GHC itself, and we are trying to cross-compile GHC itself. That is, we have a host system, a target platform, a set of base libraries compiled for the host system, and a GHC compiled to run on and target the host linked against those libraries, and we're going to compile, first, a new set of base libraries that are compiled for the host system and a GHC linked against them that runs on the host system but targets the target, and then a set of base libraries compiled for the target and a GHC linked against them that runs on and targets the target. If Core is a target, then, configuring the above build to it produces 1) a set of new native base libraries 2) a runnable GHC that compiles to Core (useful) 3) the base libraries, but all in Core (also useful) and 4) a GHC that is in Core and produces Core (good for testing???). Further, if I am compiling for a real foreign target, then I get 1) a set of new native base libraries and 2) a host->target GHC. I then get stuck, as Step 3, the target libraries, may need TH, and that TH needs to refer to stuff from the target libraries, which means I need the Core version of the target libraries, but I don't have a Core compiler to get them from.

A better idea may be treating Core as a way. It actually seems quite similar to the dynamic way. Importantly, just like we have -dynamic-too, maybe we can have -core-too.

Issue 1 is now fixed: the Core compiler's underlying target is the containing GHC's actual target. Issue 2 is solved by -core-too and the fact that my cross-compiler is automatically my Core compiler. Issue 3 is also solved. When cross-compiling GHC to a new target, assuming TH is everywhere, I have a clean path to get 1) the new set of native libraries, 2) the host->target+(target-Core) compiler, 3) the target+(target-Core) libraries (the libraries built both to actually go on the target and their interpretable Core versions), and 4) a target+(target-Core)->target+(target-Core) compiler (two GHCs: one built to run on the target, and the Core version of the former, which runs in a target-Core interpreter; both can compile to both the target and target-Core).

Steps 3 and 4 require -core-too (and the two GHCs are mandatory), because if some file uses TH, it can refer to definitions within itself or its dependencies, so we need everything in Core to account for that. Building the stage2 compiler in two ways doesn't bother me the way issue 2 does, both because -core-too means it should be much faster than two separate compilations, and because the cross-full-stage2 build is only important for porting GHC, not for cross-compiling applications.

from ghc-proposals.

Ericson2314 commented on May 10, 2024

Yes, Core as a "way" does strike me as better for those reasons. Regardless, the actual work here is not producing Core (we can already do that!) but evaluating it, so that nothing beyond Core is needed.

from ghc-proposals.

Ericson2314 commented on May 10, 2024

Makes sense. I wish we could do an optional typed STG, and then have STG lint too.

More on topic, it's occurred to me that the problem is less that the existing bytecode interpreter is unfit for this purpose, and more the "FFI" to compiled code that @chrisdone mentions.

If in conjunction with https://gitlab.haskell.org/ghc/ghc/-/issues/18954 we can get enough core/stg/whatever that we're never mixing compiled and interpreted code, maybe the existing bytecode is fine. CC @JoshMeredith.

(Also N.B. @luite is making the bytecode generation itself work from STG.)

from ghc-proposals.

Ericson2314 commented on May 10, 2024

https://www.patreon.com/posts/external-stg-49857800 it looks like we might have this now! CC @csabahruska

from ghc-proposals.
