xdslproject / xdsl


A Python Compiler Design Toolkit

License: Other

Python 73.85% MLIR 26.05% sed 0.03% Makefile 0.06% Nix 0.01%

xdsl's People


joejiong martin-luecke stoltzstrop anh-ng-21 ingomueller-net chenglong92 purvi-h georgemitenkov tobiasgrosser meshtag georgebisbas pingshiyu miccio-dk kaixuan13 papychacal moxinilian karelpeeters fergtic goens afd jubileetitusjm mohsincsv hatsunespica ed741 groverkss shaolunwang jossevandelm lukamac taobi22 muke101 eymay bringlein adutilleul cigarichard ajrichins lucjaulmes ris-bali astralsorcerer compor maerhart boneyard93501 2510551378 ego lerenhua anniezfy quarub vmiheer kfafsp tarinduj iviarcio tavakkoliamirmohammad emilysillars gglin001 knickish alexshuang lopkop watermelonwolverine gabrielrodcanal

xdsl's Issues

Print types after instructions, not before the equal sign

These types tend to get long. For readability, it is useful to bring the interesting things to the front.

Standardise the verify names

Currently, Attributes and Operations name their verification method either verify or verify_. This inconsistency can cause confusion when one tries to implement a custom verifier.

Add support for rewriting with block arguments

Currently, the rewrite walker creates new blocks for each existing block but does not introduce BlockArguments. Furthermore, this should be added to the bookkeeping part.

I cannot assign this to anyone, but I'll look into that as I require this feature.

I'm not sure if we really want terminators in xdsl, but they might be necessary for stronger verification passes.
We do not enforce terminators in ChocoPy so far, which leads to issues while lowering.

Minor list of items to improve overall structure and automate logging

Just a small list of minor upgrades that can help improve the structure of the xdsl repo.
I am not assigning this to anyone; I am just listing things that would help automate minor chores in the long term.
Happy to discuss.

https://github.com/orgs/xdslproject/projects/2

Field shadowing and parametrized attributes (op.name/attr.name)

The following fails with the exception "1 parameters expected, got 0":

@irdl_attr_definition
class Foo(ParametrizedAttribute):
  name: str = "bar.foo"
  param = ParameterDef(AnyAttr())

# Try to construct
Foo([*SomeAttr*])

Removing the type annotation for the name field fixes it:

@irdl_attr_definition
class Foo(ParametrizedAttribute):
  name = "bar.foo"
  param = ParameterDef(AnyAttr())

# Try to construct
Foo([*SomeAttr*])

I think we should change the API so that names are not defined as fields, which would make it less error-prone. Maybe the name of the attribute could be passed to the decorator instead, e.g. irdl_attr_definition("bar.foo")?
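
A minimal sketch of the decorator-argument idea, written without xDSL itself. The names attr_definition and Foo below are stand-ins invented for illustration, not the real irdl_attr_definition or ParametrizedAttribute API:

```python
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def attr_definition(name: str) -> Callable[[Type[T]], Type[T]]:
    """Hypothetical decorator factory: the attribute name is passed as an
    argument instead of being declared as a class field."""

    def decorate(cls: Type[T]) -> Type[T]:
        # The name is attached here and never appears in the class body,
        # so it cannot be mistaken for a parameter definition.
        cls.name = name
        return cls

    return decorate


@attr_definition("bar.foo")
class Foo:
    # Toy stand-in for a parametrized attribute.
    def __init__(self, params: list) -> None:
        self.params = params


foo = Foo(["some_attr"])
print(Foo.name, len(foo.params))
```

With this shape, an annotated or unannotated name field can no longer change how the parameters are counted, because the class body holds parameters only.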

IRDL: operand_segment_sizes no longer reflects MLIR's functionality.

It seems that MLIR changed the way they store information in the *_segment_sizes attributes. In the case of cond_br they expect an entry for all the operand "kinds", even for non-variadic ones.

Rewrite verification without exceptions

Unit and optional attributes

For some operations/types it would be nice to have optional attributes or unit attributes/parameters.

For example, MLIR's memref.global has a unit attribute constant which, when present, indicates that this global is constant.
Another example is IntegerType, which can have a "signedness" parameter indicating that the type is a signed integer.

One could also add boolean attributes but this might clutter the output a lot.

Cleanup the Affine dialect

The affine dialect still uses new_op instead of actual classes. Furthermore, it is nowhere near feature-complete, as IntegerSets are not yet supported.
Either we change the existing operations to use IRDL or we drop the dialect completely. What do you think, @math-fehr?

Update the documentation to the latest API

Parsing/Printing: do not expect type annotation in operator arguments

The default parser expects a type annotation on every operand of an operation. This seems unnecessarily verbose:

Instead of the code:

  %0 : !riscv_ssa.reg = riscv_ssa.li() ["immediate" = 42 : !i64]
  %1 : !riscv_ssa.reg = riscv_ssa.mul(%0 : !riscv_ssa.reg , %0 : !riscv_ssa.reg)

I would prefer to write

  %0 : !riscv_ssa.reg = riscv_ssa.li() ["immediate" = 42 : !i64]
  %1 : !riscv_ssa.reg = riscv_ssa.mul(%0, %0)

MLIRConverter: fix for CF

The MLIRConverter currently struggles with the CF dialect on some things related to operand_segment_sizes.

Update release to make sure xdsl.printer is available in pip repositories

IntAttr vs IntegerAttr

Currently, IntAttr is defined in builtin, but there is also an IntegerAttr. I think we should either remove IntAttr or rename it to something else.
This caused quite some confusion, e.g., https://github.com/compiling-techniques/ChocoPyCompiler/pull/75/files

Ensure deleted operations are not used in any way

Right now, after deletion, operations can still be accessed.
It would be nice to raise exceptions when deleted operations are accessed.
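
One possible way to get this behaviour, sketched on a toy class rather than on xDSL's real Operation. The ErasedOperationError name and the _erased flag are assumptions made for the example:

```python
class ErasedOperationError(Exception):
    """Raised when an erased operation is accessed (hypothetical name)."""


class Operation:
    # Toy stand-in for an IR operation.
    def __init__(self, name: str) -> None:
        self.name = name
        self._erased = False

    def erase(self) -> None:
        self._erased = True

    def __getattribute__(self, item: str):
        # Private and dunder lookups stay available for bookkeeping;
        # every public attribute is guarded once the operation is erased.
        if not item.startswith("_") and object.__getattribute__(self, "_erased"):
            raise ErasedOperationError(
                f"access to '{item}' on an erased operation")
        return object.__getattribute__(self, item)


op = Operation("arith.addi")
print(op.name)
op.erase()
try:
    op.name
except ErasedOperationError as err:
    print(err)
```

Guarding in __getattribute__ adds a check to every attribute read, so a real implementation might prefer to enable it only in a debug mode.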

Do not print the types in the operand list

Printing and requiring types after an operand use makes the IR unnecessarily verbose and hard to read. MLIR does not do this either. Hence, I wonder, should we maybe just not do this?

Refactor parsing and printing

Currently, parsing and printing are slightly broken.
First, as raised in #3, we cannot print an operation or module to a string, although simply writing print(op) would be nice.

However, this raises multiple issues:

How should we print an operation that refers to an external SSA value? Should we just give such values a name like UNK0, to show explicitly that the value is defined outside?
Then, how should we print a list of operations? If the second operation uses a result of the first, we don't want to print an UNK, for instance. I believe that in that case we might have to use the Printer explicitly in some way.

Improved diagnostics for SSAValue.erase()

SSAValue.erase() raises a generic exception "Attempting to delete SSA value that still has uses."

It would be more useful if it also printed out the specific operation.

This can be achieved by:

def erase(self, safe_erase: bool = True) -> None:
    """
    Erase the value.
    If safe_erase is True, then check that no operations use the value anymore.
    If safe_erase is False, then replace its uses by an ErasedSSAValue.
    """
    if safe_erase and len(self.uses) != 0:
        raise Exception(
            f"Attempting to delete SSA value that still has uses: {self}")
    self.replace_by(ErasedSSAValue(self.typ))
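
The snippet above adds the value itself to the message. To also name the operations that still use it, the message could be built from the use list. The sketch below uses toy Op, Use, and Value classes; the operation and index fields on Use are assumptions about what a use records, not the confirmed xDSL data model:

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    name: str


@dataclass
class Use:
    operation: Op
    index: int  # position of the value in the user's operand list


@dataclass
class Value:
    uses: list = field(default_factory=list)

    def erase(self) -> None:
        if self.uses:
            # List every remaining user so the offending operation is visible.
            users = ", ".join(
                f"{u.operation.name} (operand {u.index})" for u in self.uses)
            raise Exception(
                f"Attempting to delete SSA value that still has uses: {users}")


value = Value([Use(Op("riscv_ssa.mul"), 0), Use(Op("riscv_ssa.mul"), 1)])
try:
    value.erase()
except Exception as err:
    print(err)
```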

Named SSAValue

Implement named SSAValues (i.e. SSAValues get an optional name, which they keep when being printed)

Refactor std into std and arith dialects.

MLIR changed parts of their dialect structures by introducing an arith dialect. We should reflect this new dialect as well to ensure we stay compatible with MLIR.

Parent Pointers

Currently, we just chain parent.parent.parent lookups. Add parent_op, parent_region, and parent_block.
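
A rough sketch of what such helpers could look like, using toy Operation, Block, and Region classes that each hold a single parent field. Only the three helper names come from the issue; the class layout is an assumption:

```python
from typing import Optional


class Region:
    def __init__(self, parent: Optional["Operation"] = None) -> None:
        self.parent = parent  # the operation that owns this region


class Block:
    def __init__(self, parent: Optional[Region] = None) -> None:
        self.parent = parent  # the region that contains this block


class Operation:
    def __init__(self, parent: Optional[Block] = None) -> None:
        self.parent = parent  # the block that contains this operation

    def parent_block(self) -> Optional[Block]:
        return self.parent

    def parent_region(self) -> Optional[Region]:
        block = self.parent_block()
        return block.parent if block is not None else None

    def parent_op(self) -> Optional["Operation"]:
        region = self.parent_region()
        return region.parent if region is not None else None


func = Operation()
body = Region(func)
entry = Block(body)
add = Operation(entry)
print(add.parent_op() is func)
```

Each helper returns None as soon as a link is missing, so callers do not have to guard every step of a parent.parent.parent chain themselves.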

Add support for errors

Currently, we just raise an exception when there is an error in the input program, or when a verifier fails.
It would be nice to have a way to print back the program, with the error pointing to the right place.

Printer error handling

If one currently forgets to add a referenced operation to a region, i.e., if it isn't present in the output, the printer fails with a very nasty error message.

A small check around https://github.com/xdslproject/xdsl/blob/main/src/xdsl/printer.py#L76-L79 might already be sufficient.

Parsing/Printing: print result type at the end of an operation

I would prefer to parse the type after an assignment to be more in line with what MLIR uses. I also find the delayed type easier to read as the interesting information, the operation that we call, is frontloaded.

Instead of:

%0 : !riscv_ssa.reg = riscv_ssa.li() ["immediate" = 42 : !i64]
%1 : !riscv_ssa.reg = riscv_ssa.mul(%0 : !riscv_ssa.reg, %0 : !riscv_ssa.reg)

I would expect to read:

%0 = riscv_ssa.li() ["immediate" = 42 : !i64] : !riscv_ssa.reg 
%1 = riscv_ssa.mul(%0 : !riscv_ssa.reg, %0 : !riscv_ssa.reg) : !riscv_ssa.reg

Highlighting and Arrows

On my thesis project I have many prints of a module as it is rewritten (see below), where the module is output with printer.print_op(module).

Following a meeting with Tobias and Mathieu, we think it would be incredibly helpful if the printer could implement functionality for highlighting and arrows. Highlighting could simply be done by changing the output colour, and arrows could use a string such as "->".

If anyone has any input on this, or any additional printer features they would like added, I would love to hear!

Also, sorry if I have posted this in the wrong place.

Print "MISSING_SSA_VALUE" instead of aborting after exception

Currently when the printer encounters an SSAValue that is not attached to the IR, e.g. a missing operand, it throws an exception and fails. For debugging purposes it would be very helpful if it could still print the whole IR with a string similar to MISSING_SSA_VALUE instead of the missing value where it fails and throw the Exception afterwards. This can be done similarly in MLIR and, seeing the whole context, helps with debugging.

This is the part I am referring to

op_type_rewrite_pattern does not seem to work on external packages

It seems that code outside of the xdsl project cannot use op_type_rewrite_pattern. The decorator triggers an error.
From what I saw, the reason is that the decorator does not get the hint as a class, but a string.
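
The issue does not say why the hint arrives as a string. One plausible cause, stated here as an assumption, is that the external package writes its annotations as strings or enables postponed annotations. The standalone sketch below shows the difference between reading the raw annotation and resolving it with typing.get_type_hints; AddOp and match are made-up names:

```python
import inspect
from typing import get_type_hints


class AddOp:
    # Toy operation class.
    pass


def match(op: "AddOp") -> None:
    # The annotation above is stored as the string "AddOp".
    pass


# Reading the signature directly yields the unresolved string.
raw = inspect.signature(match).parameters["op"].annotation

# get_type_hints evaluates the string in the function's module namespace.
resolved = get_type_hints(match)["op"]

print(type(raw).__name__, resolved.__name__)
```

A decorator that resolves hints this way would receive the class in both cases, provided the class is importable from the module that defines the pattern.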

Do not clash names between regions

Currently, if we have two regions, we can't reuse names between regions.
For instance:

func.func () {
  %a = ...
}

func.func () {
  %a = ... // This will trigger an error : `SSAValue "a" is already defined`
}

I believe this would also be a problem for basic block names.
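
One way to avoid the clash is to keep a separate name table per region. The NameScopes helper below is hypothetical and only illustrates the bookkeeping; whether a nested region may see names from its enclosing region is a design decision this sketch simply assumes to be allowed:

```python
class NameScopes:
    """Stack of name tables, one per region being parsed (hypothetical helper)."""

    def __init__(self) -> None:
        self._scopes = [{}]

    def push(self) -> None:
        # Called when the parser enters a region.
        self._scopes.append({})

    def pop(self) -> None:
        # Called when the parser leaves a region.
        self._scopes.pop()

    def define(self, name: str, value: object) -> None:
        current = self._scopes[-1]
        if name in current:
            raise KeyError(f'SSAValue "{name}" is already defined')
        current[name] = value

    def lookup(self, name: str) -> object:
        # Search from the innermost scope outwards.
        for scope in reversed(self._scopes):
            if name in scope:
                return scope[name]
        raise KeyError(f'SSAValue "{name}" is not defined')


scopes = NameScopes()
for region in ("first func.func", "second func.func"):
    scopes.push()
    scopes.define("a", region)  # same name, different region: no error
    print(scopes.lookup("a"))
    scopes.pop()
```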

Add something like MLIROptMain to xdsl

Moving a lot of the choco-opt boilerplate code into xDSL might help to construct other opt-like tools for filecheck tests.

Parent attribute and block factories

The parent attribute does not get set for a lot of operations. This seems to happen because Blocks are constructed by hand, which does not set the parent attribute. As far as I understand, since the region factory function internally uses the block factory, even constructing regions does not set the parent attribute.

Capture BlockArg changes in RewriteAction

Currently, the walker simply duplicates the BlockArguments of the Blocks it traverses, independent of any changes a pattern might have made. When a pattern modifies the arguments, the walker does not notice and thus does not record the change.

I guess one could somehow integrate this into the RewriteAction, but it might be costly to detect these changes. On the other hand, informing the RewriteAction manually about every change might be cumbersome and error-prone.

Note: This might be solvable by introducing some kind of Rewriter, as MLIR has, which either replaces or changes operations. In MLIR, the rewriter must be used to perform any modification of the IR, which in turn allows the framework to introduce arbitrary hooks, as all actions go through it.

Only check for unique block ids in the same region.

Func dialect: Verify that types match the block arguments

Actually write a README

Add support for comments

We currently do not support comments when parsing xdsl. Our parser should gain support to ignore comments.
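
A minimal sketch of comment stripping on a single line. It assumes line comments start with // (the style used in the region example earlier on this page) and that string literals are double-quoted with backslash escapes; strip_line_comment is a made-up helper, not part of the xDSL parser:

```python
def strip_line_comment(line: str, marker: str = "//") -> str:
    """Drop a trailing comment, ignoring markers inside double-quoted strings."""
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string and ch == "\\":
            i += 2  # skip the escaped character
            continue
        if ch == '"':
            in_string = not in_string
        elif not in_string and line.startswith(marker, i):
            # Found a comment outside any string: cut it off.
            return line[:i].rstrip()
        i += 1
    return line


print(strip_line_comment("%a = foo() // define a"))
print(strip_line_comment('["url" = "http://example"]'))
```

Tracking whether the scan is inside a string matters because attribute values such as URLs can legitimately contain the comment marker.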

Cleanup tests

Many of the pytests are quite old and were not updated cleanly. Some of them can be migrated to filecheck tests, while others do not require such complicated handling of dialect instances.

Make IRs with missing operations not verify

As discussed in #60, we might want to emit an error while verifying in this case, not just when printing.

Custom Parser and Printer

It would be very cool to have easy support for defining custom parsers and printers. In particular, some of the tree-based code becomes very ugly, and defining custom syntax like the following would be very helpful there:

op_with_regions()
with first_region = {
  *region content*
}
with second_region = {
  *region content*
}

I might just be out of the loop on these things, is there some support for this already, @math-fehr ?

Add support for referencing "forward-declared" attributes

We would like to write something like:

@irdl_attr_definition
class ChocoListTypeData:
    name = "choco.list_type_data"
    # The following is a reference to itself
    type = AttributeDef(AnyOf([ChocoTypeData, ChocoListTypeData]))
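
One common way to allow such self-references is to defer the lookup until the constraint is checked. The sketch below uses a toy AnyOf that accepts either classes or zero-argument callables returning a class; this is an illustration of lazy resolution under that assumption, not the real IRDL AnyOf or AttributeDef:

```python
class AnyOf:
    """Toy constraint: accepts instances of any listed class.
    A class may also be given lazily as a zero-argument callable."""

    def __init__(self, alternatives) -> None:
        self._alternatives = alternatives

    def verify(self, value) -> bool:
        # Resolve lazy entries only now, when all classes exist.
        classes = tuple(
            alt if isinstance(alt, type) else alt()
            for alt in self._alternatives)
        return isinstance(value, classes)


class ChocoTypeData:
    pass


class ChocoListTypeData:
    # The lambda defers the lookup of ChocoListTypeData until verification,
    # by which point the class is fully defined.
    elem = AnyOf([ChocoTypeData, lambda: ChocoListTypeData])


print(ChocoListTypeData.elem.verify(ChocoListTypeData()))
print(ChocoListTypeData.elem.verify(42))
```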

Nicer error prints for jupyter notebooks

Some of the Jupyter notebooks currently contain parts that trigger huge error prints, since the build API checks for the number of results and so on. Maybe update these so that the error prints are nicer.

Incorporate visitor into xDSL

ChocoPy has a visitor which is quite useful for working through the IR, and a copy of it is at https://github.com/xdslproject/psy/blob/main/util/visitor.py . Whilst the ChocoPy visitor imported the choco dialects, this isn't actually needed, and the visitor is dialect independent.

Therefore it might be a good idea to bring into xDSL proper as it seems like generic functionality. I guess there are two questions related to this:

We can already visit the IR with PatternRewriteWalker, so I guess that sort of does the job of the visitor. From the naming, however, it seems like the PatternRewriter should be rewriting the IR, whereas the visitor can be collecting information only. Is the visitor therefore lighter-weight?
Currently the visitor matches methods based on their snake_case name versus the dialect's CamelCase class name. This works fine, but I think it isn't ideal from a generality point of view, and we might want to match on the type of the method's IR node argument instead (like the PatternRewriteWalker does).
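
The name-based dispatch described above can be sketched as follows. This is a toy reconstruction, not the code from the linked visitor.py: the visit_ prefix, the children field, and the node classes are assumptions made for the example.

```python
import re


def camel_to_snake(name: str) -> str:
    # Insert an underscore before every capital except the first one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


class Visitor:
    def traverse(self, node) -> None:
        # FuncDef is dispatched to visit_func_def, if such a method exists.
        method = getattr(
            self, "visit_" + camel_to_snake(type(node).__name__), None)
        if method is not None:
            method(node)
        for child in getattr(node, "children", []):
            self.traverse(child)


class Module:
    def __init__(self, children) -> None:
        self.children = children


class FuncDef:
    def __init__(self, name: str) -> None:
        self.name = name
        self.children = []


class NameCollector(Visitor):
    # Collects information only; the IR is never modified.
    def __init__(self) -> None:
        self.names = []

    def visit_func_def(self, node: FuncDef) -> None:
        self.names.append(node.name)


collector = NameCollector()
collector.traverse(Module([FuncDef("f"), FuncDef("g")]))
print(collector.names)
```

Dispatching on the annotated type of the node argument instead would remove the dependency on naming conventions, at the cost of inspecting each method's signature.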

Add a clean interface to the MLIR converter

Currently the mlir_converter.py is a mess. When we want to use it from the ChocoPy project, we have to rewrite parts of it to make it possible to wrap it into a tool.

Match on specific tree structures

Devito uses a multitude of tree visitors that only trigger matches on specific tree structures.
For example, they match only on Iteration nodes that do not contain any child nodes of type Iteration.

Supporting such more complex patterns out of the box might make many Devito transforms easier to implement.
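
To make the example concrete, here is a small sketch of such a structural match on a toy tree. The Node, Iteration, and Expression classes are invented for illustration and are not Devito's or xDSL's types; the sketch also interprets "children" as all nested descendants, which is an assumption:

```python
class Node:
    def __init__(self, *children) -> None:
        self.children = list(children)


class Iteration(Node):
    pass


class Expression(Node):
    pass


def walk(node):
    # Pre-order traversal of the node and all of its descendants.
    yield node
    for child in node.children:
        yield from walk(child)


def contains_iteration(node) -> bool:
    # True if any strict descendant of the node is an Iteration.
    return any(
        isinstance(desc, Iteration)
        for child in node.children
        for desc in walk(child))


def innermost_iterations(root):
    return [
        node for node in walk(root)
        if isinstance(node, Iteration) and not contains_iteration(node)]


inner = Iteration(Expression())
outer = Iteration(inner, Expression())
print(len(innermost_iterations(outer)))
```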

Filecheck: add support for invalid tests.

Currently, the lit tool marks a test as failed as soon as one of the commands in the pipeline terminates with a non-zero exit code. This makes it impossible to check for certain diagnostics/exceptions.

This could be solved by improving diagnostics support and adding a kind of verify-diagnostics flag (as mlir-opt does) to then print diagnostics or exception traces while still exiting with zero. I'm not yet sure how this can be integrated nicely into the opt tool though.

Note: there is already a hack in choco-opt to capture exceptions to then print the message. This is not clean in my opinion. Maybe we could define certain exception types that will be ignored with a certain flag.

Improve indentation of regions when printing ASTs

Support user-specified out-of-tree MLIR dialects

It may be useful to use xDSL for out-of-tree MLIR dialects in user-defined rewritings. One might think that one can simply import mlir_mydialect in addition to import mlir but this is fundamentally impossible.

The problem is that out-of-tree MLIR dialects are compiled as native (C++) libraries, which are exposed to Python as native modules (e.g., mlir_mydialect) that are incompatible with the in-tree MLIR module (mlir). The underlying reason is that some objects in MLIR are per-library singletons, so different libraries each have their own singleton instances. The recommended way to compile out-of-tree Python modules is therefore to include all of MLIR core such that all MLIR-related dependencies are included in the out-of-tree Python module. Users of such a library then have to import mlir_mydialect instead of import mlir everywhere in their code.

In xDSL, this means that the main xDSL Python package (i.e., the one from this repository) needs to import a different Python package for mlir depending on which dialects the user of the package wants to use. In other words, xdsl needs to import <desire-mlir-package> based on what the code that does import xdsl desires. The import keyword, however, does not allow importing packages dynamically (i.e., based on the value of a variable), so there is no trivial way to achieve the desired goal.

This issue is an attempt to express the problem clearly and provide a place to find the best solution to the problem. #109 and #110 are two initial attempts.
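
For reference, the standard library's importlib.import_module does accept a package name as a string, unlike the import statement. The sketch below only shows that mechanism; load_mlir_backend is a hypothetical helper, the stdlib json module stands in for an MLIR package, and the sketch does not address how xdsl would learn which package the user wants, nor the per-library singleton problem described above:

```python
import importlib


def load_mlir_backend(package_name: str = "mlir"):
    """Import the package named by a runtime value (hypothetical helper)."""
    return importlib.import_module(package_name)


# "json" is only a stand-in so the example runs without MLIR installed.
backend = load_mlir_backend("json")
print(backend.__name__)
```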

Adapt vector and tensor structures to mirror MLIR

Currently, we have a VectorAttr, but MLIR calls them DenseIntOrFPElementsAttr and parametrizes them with either a VectorType or a TensorType. As we require this flexibility for the MLIR conversion, we have to reflect this structure in our implementation as well.

Print to string

It would be nice if the printer could print to a string.
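
A common pattern for this is to let the printer write to any text stream and to pass an in-memory buffer when a string is wanted. The Printer below is a toy with a hypothetical stream parameter, and it prints plain text instead of real operations:

```python
import io


class Printer:
    def __init__(self, stream=None) -> None:
        # None makes print() fall back to sys.stdout.
        self.stream = stream

    def print_op(self, op_text: str) -> None:
        print(op_text, file=self.stream)


def op_to_string(op_text: str) -> str:
    # Capture the printer's output in memory instead of on stdout.
    buffer = io.StringIO()
    Printer(stream=buffer).print_op(op_text)
    return buffer.getvalue()


text = op_to_string("%0 = riscv_ssa.li()")
print(repr(text))
```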

Add support for signed integers

Currently, builtin's IntegerType has no signedness information, so all integers are treated as unsigned, which limits our lowering to MLIR.
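
As a sketch of what a signedness parameter could look like: MLIR distinguishes signless (i32), signed (si32), and unsigned (ui32) integer types. The Signedness enum and the IntegerType dataclass below are illustrative stand-ins, not the xDSL builtin definitions:

```python
from dataclasses import dataclass
from enum import Enum


class Signedness(Enum):
    # The values are the prefixes MLIR uses when printing integer types.
    SIGNLESS = "i"
    SIGNED = "si"
    UNSIGNED = "ui"


@dataclass(frozen=True)
class IntegerType:
    width: int
    signedness: Signedness = Signedness.SIGNLESS

    def __str__(self) -> str:
        return f"{self.signedness.value}{self.width}"


print(IntegerType(32), IntegerType(8, Signedness.SIGNED))
```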