pschachte / wybe

A programming language supporting most of both declarative and imperative programming

License: MIT License

Makefile 0.30% Haskell 92.18% C 0.28% Prolog 0.28% TeX 4.80% Shell 0.55% Python 1.61%

declarative-programming imperative-programming programming-language

wybe's People

Contributors

Stargazers

Watchers

Forkers

caviarchen rudolphalmeida sushi6006 jimbxb darmawanalbert neon64 wnats

wybe's Issues

Add a usable string type to the standard library

The current Wybe string type is just a C string, which is convenient for character constants and to use C I/O, and that's about it. The string type should support efficient concatenation and substring. Ideally, when a string is not aliased, it should be able to act like a buffer by supporting destructive update.

Wybe compiler gives an internal error when compiling hello_world.wybe

What I tried to do:

% cat hello_world.wybe
!println("Hello, World!")

% wybemk hello_world
Internal error: Invalid Wybe object file /usr/local/lib/wybe/wybe.o:  module data missing
CallStack (from HasCallStack):
  error, called at src/AST.hs:2784:17 in main:AST

Seems like the function loadModuleFromObjFile failed when loading the standard library:

wybe/src/Builder.hs

Line 492 in 6350c56

++ ": module data missing"

All of the above was done on Ubuntu 18.04.2.

Documentation in WYBE.md has wrong code in it

Every print and println call from here on should have an exclamation mark in front of it.

Better data types needed to make good use of Haskell's type system

For example, we store LPVM instructions using this data structure, which is very error-prone.

wybe/src/AST.hs

Line 1948 in f68d0f5

| ForeignCall Ident Ident [Ident] [Placed Exp]

If we make an incompatible change to the LPVM instructions, it won't cause a compile error, and in some cases it won't even cause a runtime error.

Most of the current code just matches on strings and makes assumptions about list lengths etc., and the compiler can't verify any of it.
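
To illustrate the concern outside the compiler, here is a minimal Python sketch. It is not Wybe compiler code: the instruction classes `Alloc` and `Mutate`, their fields, and both `describe` functions are hypothetical names chosen for illustration. The string-keyed version only fails when it happens to index past the end of the argument list, while the structured version rejects a malformed instruction as soon as it is constructed.

```python
from dataclasses import dataclass
from typing import List, Union


def describe_untyped(lang: str, op: str, args: List[str]) -> str:
    """String-keyed form: nothing ties the operation name to its arity."""
    if lang == "lpvm" and op == "mutate":
        # Silently assumes at least two arguments are present.
        return f"mutate {args[0]} -> {args[1]}"
    return f"{lang} {op}"


@dataclass(frozen=True)
class Alloc:
    """Hypothetical structured 'alloc' instruction."""
    size: int
    result: str


@dataclass(frozen=True)
class Mutate:
    """Hypothetical structured 'mutate' instruction."""
    source: str
    result: str
    offset: int
    destructive: bool


Instr = Union[Alloc, Mutate]


def describe(instr: Instr) -> str:
    """Structured form: every field is named and required."""
    if isinstance(instr, Alloc):
        return f"alloc {instr.size} -> {instr.result}"
    if isinstance(instr, Mutate):
        return f"mutate {instr.source} -> {instr.result}"
    raise TypeError(f"unknown instruction: {instr!r}")
```

In Haskell the same idea would presumably take the form of one constructor per instruction, so that a change in arity becomes a compile error rather than a construction-time error.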

When compiling a statement like foo(?x, ..., x) the wrong '#version' suffix is added to the x

Wybe program runs slower on system with more cores

This is a bit odd: the same program can become slower when the system has more CPU cores.
I ran a VM on the same physical machine, allocated a different number of CPU cores to it each time, and ran the same Wybe program. Here are the results:

Number of cores    Time used
1                  54s
2                  58s
6                  128s
12                 220s

I suspect that the issue is caused by the GC, because:

The situation is worse for a program that allocates more memory.
A Wybe program creates as many threads as there are cores in the system.

After a quick look at the BDW GC documentation, it seems that the GC should be single-threaded, with incremental mode disabled, by default. I am not sure why this happens and will keep investigating.

Type checker doesn't raise an error when the type of a variable is changed.

test.wybe

?x = 1
?x = 1.0

output

Error detected during type checking of module(s) test
Internal error: type error in move: {foreign llvm move(1.0 @test:2:9, ?x @test:2:1)}
CallStack (from HasCallStack):
  error, called at src/AST.hs:2786:17 in main:AST

Incremental build may generate an object file with outdated code

main.wybe

use lib

!foo()

lib.wybe

pub def foo() use !io {
    !println("test message A")
}

Steps to reproduce:

Build main.
Change foo() in lib.wybe (e.g. change the string in println).
Build main again and this new main will behave the same as the previous one.

The current incremental build works as follows:

wybe/src/Builder.hs

Lines 327 to 343 in 5cb70a8

    
           ModuleSource (Just srcfile) (Just objfile) Nothing Nothing -> do
               -- both source and object exist:  rebuild if necessary
               srcDate <- (liftIO . getModificationTime) srcfile
               dstDate <- (liftIO . getModificationTime) objfile
               if force || srcDate > dstDate
                 then do
                   unless force (extractModules objfile)
                   buildModule modspec srcfile
                   return True
                 else do
                   loaded <- loadModuleFromObjFile modspec objfile
                   unless loaded $ do
                       -- Loading failed, fallback on source building
                       logBuild $ "Falling back on building " ++
                           showModSpec modspec
                       buildModule modspec srcfile
                   return $ not loaded

During the bottom-up pass, if the object file is newer than the source code file (based on last modified time) then the compiler will re-use the lpvm module read from the object file.

wybe/src/Builder.hs

Lines 949 to 959 in 5cb70a8

    
           ModuleSource (Just srcfile) (Just objfile) _ _ -> do
               -- TODO: Multiple specialization make this part not working because
               -- the object file can change (new specz requirement) even if the
               -- source code is not changed. Need something better. Also that the
               -- original version doesn't work anyway. Due to inlining, a object
               -- file can be changed if its dependencies changes.
               -- srcDate <- (liftIO . getModificationTime) srcfile
               -- dstDate <- (liftIO . getModificationTime) objfile
               -- return $ srcDate > dstDate
               return True

After the top-down pass, the same logic is used to determine whether the object file needs to be updated (it is disabled by PR #50).

The main issue is that incremental build only considers each module itself and ignores the changes of its dependencies. Inlining and alias analysis that cross multiple modules will break.

Supporting incremental build during the top-down pass requires this issue to be fixed first. I'm planning to discuss this in the coming meeting.
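
As a rough illustration of what "considering dependencies" could mean, here is a minimal Python sketch. It is not the builder's logic: the function `stale_modules`, the timestamp dictionaries and the dependency map are hypothetical inputs. It assumes, conservatively, that a module must be rebuilt whenever anything it depends on (directly or transitively) is rebuilt, since inlining and alias analysis can cross module boundaries.

```python
from typing import Dict, List, Set


def stale_modules(src_mtime: Dict[str, float],
                  obj_mtime: Dict[str, float],
                  deps: Dict[str, List[str]]) -> Set[str]:
    """Return the modules whose object files need rebuilding.

    A module is stale if its object file is missing or older than its
    source, or if any module it depends on is stale.
    """
    # Modules that are stale purely on their own timestamps.
    stale = {m for m in src_mtime
             if m not in obj_mtime or src_mtime[m] > obj_mtime[m]}

    # Reverse edges: for each module, the modules that use it.
    dependents: Dict[str, Set[str]] = {}
    for mod, used in deps.items():
        for d in used:
            dependents.setdefault(d, set()).add(mod)

    # Propagate staleness to dependents until nothing changes.
    worklist = list(stale)
    while worklist:
        mod = worklist.pop()
        for user in dependents.get(mod, ()):
            if user not in stale:
                stale.add(user)
                worklist.append(user)
    return stale


# The example above: lib.wybe was edited after main.o was built.
print(stale_modules({"main": 1.0, "lib": 5.0},
                    {"main": 2.0, "lib": 3.0},
                    {"main": ["lib"]}))
```

With timestamps alone, only lib would be rebuilt here; propagating through the reverse dependency edges also marks main as stale.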

Incremental build may cause the builder to call compileModSCC with an invalid SCC.

The diagram below shows the target main and its dependencies.

The problem is related to the circular dependency of ca and cb.
If there isn't any object file that can be reused, then compileModSCC [ca, cb] will be executed, which is correct.
If there are up-to-date object files for ca and cb, then compileModSCC won't be executed, which is correct.
If only the object file of ca is missing (or outdated), then compileModSCC [ca] will be executed, which is wrong.
If only the object file of cb is missing (or outdated), then compileModSCC won't be executed, leaving cb uncompiled, which is wrong.

The cause is that loadModuleFromObjFile inserts modules directly into the Compiler State and doesn't call things like exitModule, so the topological sort is effectively broken. In the example above, cb is deferred first. Then the compiler visits ca and realizes that ca and cb are in an SCC. But if cb is loaded directly, it won't be marked as deferred, and if ca is loaded directly, the deferred cb won't be picked up.

Overall, I feel that the builder is a bit messy and we need to fix this before working on #66.
In my opinion, a module has two stages while building (shown in the diagram below).

Currently, the builder handles the first part correctly. A stage 1 module depends only on its source code, so using the lastModifiedTime is correct.
However, a stage 2 module (generated by compileModSCC) also depends on other stage 2 modules and on its stage 1 module, so we can't let the decision from the first part affect this part. We therefore need to visit every SCC, whether or not it was loaded from an object file. For each SCC, we need to check whether anything has changed (such as the check that will be added to fix #66); if so, compileModSCC should be executed with the whole SCC.
Then there is another problem: currently, we only store the stage 2 modules in the object file, so we need a way to get the stage 1 module if we need to rebuild it. I don't like the idea of getting it from the source code, since that basically makes the object file not portable at all. Maybe we could store the list of tokens?
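
The "visit every SCC and recompile it as a whole" idea can be sketched as follows. This is a minimal Python illustration, not the builder's code: `sccs_to_recompile` is a hypothetical name, the SCC list is assumed to be given in bottom-up order (dependencies first), and the edge from main to ca is an assumption, since the diagram is not reproduced here. The sketch also assumes that any recompiled dependency forces its users to be recompiled; a real check could be finer-grained.

```python
from typing import Dict, List, Set


def sccs_to_recompile(sccs: List[List[str]],
                      deps: Dict[str, List[str]],
                      changed: Set[str]) -> List[List[str]]:
    """Select whole SCCs to pass to a compileModSCC-like step.

    sccs    -- SCCs in bottom-up order (dependencies before users)
    deps    -- module -> modules it depends on
    changed -- modules whose stage 1 form is new or outdated
    """
    recompiled: Set[str] = set()
    selected: List[List[str]] = []
    for scc in sccs:
        members = set(scc)
        # Dependencies that live outside this SCC.
        external = {d for m in scc for d in deps.get(m, [])
                    if d not in members}
        if members & changed or external & recompiled:
            # Never split an SCC: all members are compiled together.
            selected.append(list(scc))
            recompiled |= members
    return selected


deps = {"ca": ["cb"], "cb": ["ca"], "main": ["ca"]}
sccs = [["ca", "cb"], ["main"]]
print(sccs_to_recompile(sccs, deps, {"cb"}))
```

Whether only ca or only cb is outdated, the whole SCC [ca, cb] is selected, which avoids both of the wrong cases listed above.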

Support higher order procs and funcs

Also support anonymous procs/functions using notation like @ to refer to the next lambda argument and @2 to refer to the second lambda argument. This covers both lambda and partial application.

Linktime dead code elimination only works on macOS

We added support for link-time dead code elimination in #4, when Wybe only worked on macOS.
It turns out that -dead_strip only exists in the macOS system linker, so it doesn't work on Ubuntu.

I tried -Wl,--gc-sections and it doesn't work; I believe it requires every function to be in its own ELF section.

Compiler tries to inline uncompiled code.

There is a case where

wybe/src/Expansion.hs

Lines 198 to 201 in 47a01b2

    
           def <- lift $ lift $ getProcDef pspec
           case procImpln def of
             ProcDefSrc _ -> shouldnt $ "uncompiled proc: " ++ show pspec
             ProcDefPrim proto body _ _ ->

is reached with procImpln def being ProcDefSrc instead of ProcDefPrim.

A sample case is test-cases/final-dump/multi_specz_cyclic_lib2.wybe.
It actually belongs to a test case with 3 files:

test-cases/final-dump/multi_specz_cyclic_exe.wybe
test-cases/final-dump/multi_specz_cyclic_lib.wybe
test-cases/final-dump/multi_specz_cyclic_lib2.wybe

Compiling them as a whole succeeds without any error but compiling multi_specz_cyclic_lib2.o individually causes this problem.

Allow a resource to be defined as a collection (ie, union) of other resources

Add a for loop to Wybe

The for loop should allow iteration over integer sequences, lists, and arrays. It should allow iteration over more than one sequence in lock step.

The mode/determinism checker does not understand that break and continue interrupt forward execution

... so if a break appears in one branch of a conditional, then variables assigned in the other branch(es) will be assigned in subsequent statements.

Support generic types

The Wybe compiler fails to generate an executable file

Example:

> cat helloworld.wybe 
!println("Hello World!")

> wybemk helloworld

> ./helloworld 
zsh: no such file or directory: ./helloworld

Inconsistent code logic between LLVM IR and executable

Using bigloop.wybe as an example, building an executable with the command:
wybemk bigloop
produces an executable that does exactly what the code says: it loops over i and adds 2 to sum each time. In contrast,
wybemk bigloop.ll
produces LLVM IR that is fully optimized, without any looping.

As a side note, when disassembling the bigloop executable I also noticed that the final executable includes all standard types and functions, including those it never uses at runtime.

$ wybemk --version
wybemk 0.1 (git 40f3b16)

Add an array type

Arrays are always tricky in a declarative language, because copying the whole array to change one element is impractical. CTGC may make it workable locally, but we need interprocedural CTGC for it to really be practical.

The wybe foreign interface needs to be documented in the user guide.

The compiler allows changes to the io state to be thrown away

In the following code:

def main use io {
    !println("hello world!")
}
!main

the compiler allows changes to the io state to be thrown away. This happens because the use clause says use io instead of use !io. Then the compiler sees that the main proc does not produce any output, so it optimises it away.

To fix this, we must introduce a restriction that io states cannot be copied -- we need linear types of some sort, which will be a significant effort.

Support multiple specialisation

This allows generation of specialised versions of procedures based on how they will be called, and allows the compiler to select the best available version of a procedure for each call.

Support optional and named arguments

That means overloading based on arity will not be supported.

Unexpected behaviour of alias analysis 2

Code:

pub type intlist { pub [] | [|](head:int, tail:intlist) }

pub def list_print(x:intlist) use !io {
    if { x = [ ?h | ?t] :: 
        !print(h)
        !print(" ")
        !list_print(t)
    }
}

pub def list_println(x:intlist) use !io {
    !list_print(x)
    !println("")
}

?x = [1,2,3,4,5,6]
?y = [999]
!list_println(x)
!list_println(y)

if { x = [ ?h | ?t] ::
    if { tail(!t, y) ::
        !print("t: ")
        !list_println(t)
    }
}

!list_println(x)

Output:

1 2 3 4 5 6 
999 
t: 2 999 
1 2 999

Expected output:

1 2 3 4 5 6 
999 
t: 2 999 
1 2 3 4 5 6

The LPVM code is wrong:

*main* > public (0 calls)
0: (argc#0:wybe.int, [?argc#0:wybe.int], argv#0:wybe.int, [?argv#0:wybe.int], exit_code#0:wybe.int, [?exit_code#0:wybe.int], io#0:wybe.phantom, ?io#5:wybe.phantom): AliasPairs: []
    foreign lpvm alloc(16:wybe.int, ?tmp$13#0:error_case.intlist)
    foreign lpvm mutate(~tmp$13#0:error_case.intlist, ?tmp$14#0:error_case.intlist, 0:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 6:wybe.int)
    foreign lpvm mutate(~tmp$14#0:error_case.intlist, ?tmp$15#0:error_case.intlist, 8:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 0:error_case.intlist)
    foreign lpvm alloc(16:wybe.int, ?tmp$18#0:error_case.intlist)
    foreign lpvm mutate(~tmp$18#0:error_case.intlist, ?tmp$19#0:error_case.intlist, 0:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 5:wybe.int)
    foreign lpvm mutate(~tmp$19#0:error_case.intlist, ?tmp$20#0:error_case.intlist, 8:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, ~tmp$15#0:error_case.intlist)
    foreign lpvm alloc(16:wybe.int, ?tmp$23#0:error_case.intlist)
    foreign lpvm mutate(~tmp$23#0:error_case.intlist, ?tmp$24#0:error_case.intlist, 0:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 4:wybe.int)
    foreign lpvm mutate(~tmp$24#0:error_case.intlist, ?tmp$25#0:error_case.intlist, 8:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, ~tmp$20#0:error_case.intlist)
    foreign lpvm alloc(16:wybe.int, ?tmp$28#0:error_case.intlist)
    foreign lpvm mutate(~tmp$28#0:error_case.intlist, ?tmp$29#0:error_case.intlist, 0:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 3:wybe.int)
    foreign lpvm mutate(~tmp$29#0:error_case.intlist, ?tmp$30#0:error_case.intlist, 8:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, ~tmp$25#0:error_case.intlist)
    foreign lpvm alloc(16:wybe.int, ?tmp$33#0:error_case.intlist)
    foreign lpvm mutate(~tmp$33#0:error_case.intlist, ?tmp$34#0:error_case.intlist, 0:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 2:wybe.int)
    foreign lpvm mutate(~tmp$34#0:error_case.intlist, ?tmp$35#0:error_case.intlist, 8:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, ~tmp$30#0:error_case.intlist)
    foreign lpvm alloc(16:wybe.int, ?tmp$38#0:error_case.intlist)
    foreign lpvm mutate(~tmp$38#0:error_case.intlist, ?tmp$39#0:error_case.intlist, 0:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 1:wybe.int)
    foreign lpvm mutate(~tmp$39#0:error_case.intlist, ?tmp$40#0:error_case.intlist, 8:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, tmp$35#0:error_case.intlist)
    foreign lpvm alloc(16:wybe.int, ?tmp$43#0:error_case.intlist)
    foreign lpvm mutate(~tmp$43#0:error_case.intlist, ?tmp$44#0:error_case.intlist, 0:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 999:wybe.int)
    foreign lpvm mutate(~tmp$44#0:error_case.intlist, ?tmp$45#0:error_case.intlist, 8:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 0:error_case.intlist)
    error_case.list_print<0>(tmp$40#0:error_case.intlist, ~#io#0:wybe.phantom, ?tmp$48#0:wybe.phantom) @error_case:12:6
    foreign c print_string("":wybe.string, ~tmp$48#0:wybe.phantom, ?tmp$49#0:wybe.phantom) @wybe:100:39
    foreign c putchar('\n':wybe.char, ~tmp$49#0:wybe.phantom, ?#io#1:wybe.phantom) @wybe:86:26
    error_case.list_print<0>(tmp$45#0:error_case.intlist, ~#io#1:wybe.phantom, ?tmp$52#0:wybe.phantom) @error_case:12:6
    foreign c print_string("":wybe.string, ~tmp$52#0:wybe.phantom, ?tmp$53#0:wybe.phantom) @wybe:100:39
    foreign c putchar('\n':wybe.char, ~tmp$53#0:wybe.phantom, ?#io#2:wybe.phantom) @wybe:86:26
    foreign llvm icmp ne(tmp$40#0:error_case.intlist, 0:wybe.int, ?tmp$55#0:wybe.bool)
    case ~tmp$55#0:wybe.bool of
    0:
        error_case.list_print<0>(~tmp$40#0:error_case.intlist, ~#io#2:wybe.phantom, ?tmp$58#0:wybe.phantom) @error_case:12:6
        foreign c print_string("":wybe.string, ~tmp$58#0:wybe.phantom, ?tmp$59#0:wybe.phantom) @wybe:100:39
        foreign c putchar('\n':wybe.char, ~tmp$59#0:wybe.phantom, ?#io#5:wybe.phantom) @wybe:86:26

    1:
        foreign llvm move(~tmp$35#0:error_case.intlist, ?t#0:error_case.intlist)
        foreign llvm icmp ne(%t#0:error_case.intlist, 0:wybe.int, ?tmp$58#0:wybe.bool)
        case ~tmp$58#0:wybe.bool of
        0:
            foreign llvm move(~%t#0:error_case.intlist, ?%t#1:error_case.intlist)
            error_case.list_print<0>(~tmp$40#0:error_case.intlist, ~#io#2:wybe.phantom, ?tmp$61#0:wybe.phantom) @error_case:12:6
            foreign c print_string("":wybe.string, ~tmp$61#0:wybe.phantom, ?tmp$62#0:wybe.phantom) @wybe:100:39
            foreign c putchar('\n':wybe.char, ~tmp$62#0:wybe.phantom, ?#io#5:wybe.phantom) @wybe:86:26

        1:
            # !!!! It should be non-destructive here.
            foreign lpvm mutate noalias(~%t#0:error_case.intlist, ?%t#1:error_case.intlist, 8:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, ~tmp$45#0:error_case.intlist)
            foreign c print_string("t: ":wybe.string, ~#io#2:wybe.phantom, ?#io#3:wybe.phantom) @wybe:100:39
            error_case.list_print<0>(~t#1:error_case.intlist, ~#io#3:wybe.phantom, ?tmp$63#0:wybe.phantom) @error_case:12:6
            foreign c print_string("":wybe.string, ~tmp$63#0:wybe.phantom, ?tmp$64#0:wybe.phantom) @wybe:100:39
            foreign c putchar('\n':wybe.char, ~tmp$64#0:wybe.phantom, ?#io#4:wybe.phantom) @wybe:86:26
            error_case.list_print<0>(~tmp$40#0:error_case.intlist, ~#io#4:wybe.phantom, ?tmp$67#0:wybe.phantom) @error_case:12:6
            foreign c print_string("":wybe.string, ~tmp$67#0:wybe.phantom, ?tmp$68#0:wybe.phantom) @wybe:100:39
            foreign c putchar('\n':wybe.char, ~tmp$68#0:wybe.phantom, ?#io#5:wybe.phantom) @wybe:86:26

Unexpected behaviour of alias analysis 3

Code:

pub type intlist { pub [] | [|](head:int, tail:intlist) }

pub def list_print(x:intlist) use !io {
    if { x = [ ?h | ?t] :: 
        !print(h)
        !print(" ")
        !list_print(t)
    }
}

pub def list_println(x:intlist) use !io {
    !list_print(x)
    !println("")
}

?x = [1,2,3]
?y = [0]
!println("----------")
!list_println(x)
!list_println(y)

if { tail(!y, x) ::
    !println("----------")
    !print("x: ")
    !list_println(x)
    !print("y: ")
    !list_println(y)
}

if { head(!x, 999) ::
    !println("----------")
    !print("x: ")
    !list_println(x)
    !print("y: ")
    !list_println(y)
}

Output:

----------
1 2 3 
0 
----------
x: 1 2 3 
y: 0 1 2 3 
----------
x: 999 2 3 
y: 0 999 2 3

Expected output:

----------
1 2 3 
0 
----------
x: 1 2 3 
y: 0 1 2 3 
----------
x: 999 2 3 
y: 0 1 2 3

The problem is that the output of mutate is always considered unaliased.
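
A minimal Python sketch of the rule this points to, under stated assumptions: it is not the compiler's alias analysis, and `AliasSets`, `record_mutate` and `can_update_in_place` are hypothetical names. The assumption is that the output of a mutate should share an alias set with its input and with any pointer value stored into it, so a later update of that value cannot be destructive while the output is still live.

```python
from typing import Dict, Iterable, Optional


class AliasSets:
    """Union-find over variable names; each set may share memory."""

    def __init__(self) -> None:
        self.parent: Dict[str, str] = {}

    def find(self, var: str) -> str:
        self.parent.setdefault(var, var)
        while self.parent[var] != var:
            # Path halving keeps the trees shallow.
            self.parent[var] = self.parent[self.parent[var]]
            var = self.parent[var]
        return var

    def union(self, a: str, b: str) -> None:
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_a] = root_b

    def aliased(self, a: str, b: str) -> bool:
        return self.find(a) == self.find(b)


def record_mutate(aliases: AliasSets, src: str, out: str,
                  stored_pointer: Optional[str] = None) -> None:
    """Record out = mutate(src, field, stored_pointer)."""
    aliases.union(src, out)
    if stored_pointer is not None:
        # The new structure now points into the stored value.
        aliases.union(out, stored_pointer)


def can_update_in_place(aliases: AliasSets, var: str,
                        live: Iterable[str]) -> bool:
    """Destructive update is safe only if no other live variable aliases var."""
    return not any(v != var and aliases.aliased(v, var) for v in live)


aliases = AliasSets()
# tail(!y, x): the new y shares its tail with x.
record_mutate(aliases, "y#0", "y#1", stored_pointer="x#0")
# head(!x, 999) must not be destructive while y#1 is still live.
print(can_update_in_place(aliases, "x#0", ["x#0", "y#1"]))
```

Under this rule the final head(!x, 999) in the example would copy x rather than update it in place, leaving y as 0 1 2 3.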

Fix LLVM code generation to use FastCC instead of CCC for calling Wybe code

Currently, LLVM code generation uses CCC for everything, which means it does not perform tail call/last call optimisation. We need to use CCC for foreign calls, but should use FastCC for Wybe code. Or perhaps TailCC, which is like FastCC except that it guarantees tail calls where possible.

Auto-generated proc for loop has duplicate parameters

I am working on another test case for global CTGC (using an explicit setter).
Here is the program:


type drone_info { pub drone_info(x:int, y:int, z:int, count:int) }

def drone_init():drone_info = drone_info(0, 0, 0, 0)

def print_info(d:drone_info) use !io {
    !print("(")
    !print(x(d))
    !print(", ")
    !print(y(d))
    !print(", ")
    !print(z(d))
    !print(") #")
    !print(count(d))
    !nl
}

def do_action(!d:drone_info, action:char, ?success:bool) {
    ?success = true
    if { action = 'n' ::
        y(!d, y(d)-1)
    | action = 's' ::
        y(!d, y(d)+1)
    | action = 'w' ::
        x(!d, x(d)-1)
    | action = 'e' ::
        x(!d, x(d)+1)
    | action = 'u' ::
        z(!d, z(d)+1)
    | action = 'd' ::
        z(!d, z(d)-1)
    | otherwise ::
        ?success = false
    }
    if { success ::
        count(!d, count(d)+1)
    }
}

?d = drone_init()
do {
    !read(?ch:char)
    until ch = eof
    if { ch /= ' ' && ch /= '\n' ::
        if { ch = 'p' ::
            !print_info(d)
        | otherwise ::
            do_action(!d, ch, ?success)
            if { success = false ::
                !println("invalid action!")
            }
        }
    }
}

!malloc_count(?mc)
!println(mc)

Note that the variable d should be unaliased. However, the prototype of the generated procedure for the loop is like this:

0: gen$1(argc#0:wybe.int, argv#0:wybe.int, d#0:drone.drone_info, exit_code#0:wybe.int, io#0:wybe.phantom, tmp$0#0:drone.drone_info, ?argc#1:wybe.int, ?argv#1:wybe.int, ?exit_code#1:wybe.int, ?io#3:wybe.phantom):

The d#0 and tmp$0#0 in it are the same thing, according to the call site:

    foreign lpvm alloc(32:wybe.int, ?tmp$13#0:drone.drone_info)
    foreign lpvm mutate(~tmp$13#0:drone.drone_info, ?tmp$14#0:drone.drone_info, 0:wybe.int, 1:wybe.int, 32:wybe.int, 0:wybe.int, 0:wybe.int)
    foreign lpvm mutate(~tmp$14#0:drone.drone_info, ?tmp$15#0:drone.drone_info, 8:wybe.int, 1:wybe.int, 32:wybe.int, 0:wybe.int, 0:wybe.int)
    foreign lpvm mutate(~tmp$15#0:drone.drone_info, ?tmp$16#0:drone.drone_info, 16:wybe.int, 1:wybe.int, 32:wybe.int, 0:wybe.int, 0:wybe.int)
    foreign lpvm mutate(~tmp$16#0:drone.drone_info, ?tmp$0#0:drone.drone_info, 24:wybe.int, 1:wybe.int, 32:wybe.int, 0:wybe.int, 0:wybe.int)
    drone.gen$1<0>(~argc#0:wybe.int, ~argv#0:wybe.int, ~tmp$0#0:drone.drone_info, ~exit_code#0:wybe.int, ~io#0:wybe.phantom, ~tmp$0#0:drone.drone_info, ?argc#1:wybe.int, ?argv#1:wybe.int, ?exit_code#1:wybe.int, ?io#1:wybe.phantom) #1 @drone:42:1

This prevents the specialized version of gen$1<0> from being used.

"adjustNth refers beyond list end" raised in Callers.hs

int_list.wybe

pub type int_list { pub [] | [|](head:int, tail:int_list) }


pub def print(x:int_list) use !io {
    if { x = [ ?h | ?t] :: 
        !print(h)
        !print(" ") # the problem seems related to this one.
        !print(t)
    }
}

pub def println(x:int_list) use !io {
    !print(x)
    !nl
}

?x = [1,2,3]
!println(x)

> wybemk int_list
adjustNth refers beyond list end
CallStack (from HasCallStack):
  error, called at src/Callers.hs:79:20 in main:Callers

Unexpected behaviour of alias analysis

Program:
position.wybe

pub type position { pub position(x:int, y:int) }

pub def printPosition(pos:position) use !io {
    !print(" (")
    !print(pos.x)
    !print(",")
    !print(pos.y)
    !println(")")
}

main.wybe

use position

?pos = position(100,150)
?pos2 = pos
!println("-------")
x(!pos, 200) 
!printPosition(pos)
!printPosition(pos2)
!println("-------")
x(!pos2, 90) 
!printPosition(pos)
!printPosition(pos2)

Expected output:

-------
 (200,150)
 (100,150)
-------
 (200,150)
 (90,150)

Actual output:

-------
 (200,150)
 (200,150)
-------
 (90,150)
 (90,150)

Final dump:

    foreign lpvm alloc(16:wybe.int, ?tmp$3#0:position.position)
    foreign lpvm mutate(~tmp$3#0:position.position, ?tmp$4#0:position.position, 0:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 100:wybe.int)
    foreign lpvm mutate(~tmp$4#0:position.position, ?tmp$0#0:position.position, 8:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 150:wybe.int)
    foreign c print_string("-------":wybe.string, ~#io#0:wybe.phantom, ?tmp$7#0:wybe.phantom) @wybe:100:39
    foreign c putchar('\n':wybe.char, ~tmp$7#0:wybe.phantom, ?#io#1:wybe.phantom) @wybe:86:26
    foreign lpvm mutate noalias(tmp$0#0:position.position, ?%pos#1:position.position, 0:wybe.int, 0:wybe.int, 16:wybe.int, 0:wybe.int, 200:wybe.int)
    position.printPosition<0>(pos#1:position.position, ~#io#1:wybe.phantom, ?#io#2:wybe.phantom) @alias_des2:7:2
    position.printPosition<0>(tmp$0#0:position.position, ~#io#2:wybe.phantom, ?#io#3:wybe.phantom) @alias_des2:8:2
    foreign c print_string("-------":wybe.string, ~#io#3:wybe.phantom, ?tmp$12#0:wybe.phantom) @wybe:100:39
    foreign c putchar('\n':wybe.char, ~tmp$12#0:wybe.phantom, ?#io#4:wybe.phantom) @wybe:86:26
    foreign lpvm mutate noalias(~tmp$0#0:position.position, ?%pos2#1:position.position, 0:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, 90:wybe.int)
    position.printPosition<0>(~pos#1:position.position, ~#io#4:wybe.phantom, ?#io#5:wybe.phantom) @alias_des2:11:2
    position.printPosition<0>(~pos2#1:position.position, ~#io#5:wybe.phantom, ?#io#6:wybe.phantom) @alias_des2:12:2

Initialisation in resource declaration should make the resource available in top level code

See test-cases/use_resource.wybe. Compiling to .o works fine, but generating an executable gets this error:

Error detected during resource checking of module(s)
Unknown resource use_resource.count

Compiler optimization gives wrong LLVM output

Wybe code:

!println("hello world!")

correctly prints hello world!
while:

#test.wybe
def main() use io {
    !println("hello world!")
}
!main()

prints nothing.

wybemk test.ll gives an almost empty text file:

; ModuleID = 'test'
source_filename = "....../test.wybe"

wybemk -V:

wybemk 0.1 (git 40f3b16)
Built Wed Apr 22 13:37:07 AEST 2020
Using library directories:
    /usr/local/lib/wybe

Determinacy checking of procs does not check that proc declared `terminal` actually is.

For example, this code compiles when it shouldn't, because the compiler cannot verify that the C exit is terminal:

def terminal my_exit(code:int) {
    foreign c exit(code)
}

Nested loops are optimized away

loop.wybe.txt

A single do loop like the one in the above attached file executes as expected.

nested_loop.wybe.txt

However, using a nested do or for loop optimizes the inner loop away and inlines only a single iteration of the outer loop.
Expected output:

Outer
Inner
Inner
...

Actual output:

Outer

The inner loop is optimized away somewhere between Clause.hs and Expansion.hs.

wybemk is unable to locate cbits.so

The current Makefile builds wybemk and copies it to "~/.local/bin". However, cbits.so is built and left in the wybe project folder.
When wybemk is used to compile a program, it seems to look for cbits.so in the working directory and fails with the internal error "clang: error: no such file or directory: 'cbits.so'".

The compiler also tries to use "./wybelibs/wybe.wybe" in the current working directory.

Incorrect int comparison

?res = 10 > 5
!println(res)
?res = 10 < 5
!println(res)

gives output:

false
false

and for this program

!read(?a:int)
!read(?b:int)
if { a > b ::
    !println("a > b")
| otherwise ::
    !println("a <= b")
}

we have:

> ./test 
5 10
a <= b
> ./test 
10 5
a > b
> ./test 
5 -10
a <= b
> ./test 
-10 5
a > b

Compiler goes into an infinite loop when code contains cyclic dependencies

f1.wybe

use f2

pub def xxx(v:int, n:int) use !io {
    ?n = n - 1
    if { n < 0 :: 
            !println(v)
        | otherwise ::
            !yyy(v, n)
    }
}

!xxx(99, 10)

f2.wybe

use f1

pub def yyy(v:int, n:int) use !io {
    ?n = n - 1
    if { n < 0 :: 
            !println(v)
        | otherwise ::
            !xxx(v, n)
    }
}

Then:
wybemk f1

I suspect the problem is in orderedDependencies called from

wybe/src/Builder.hs

Line 776 in d7fcff1

depends <- orderedDependencies targetMod
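
For reference, a dependency ordering that terminates on cycles only needs to remember which modules it has already entered. The following is a minimal Python sketch under that assumption; it is not the actual orderedDependencies from Builder.hs, and the function name and dictionary-based module graph are stand-ins for illustration.

```python
from typing import Dict, List, Set


def ordered_dependencies(target: str, deps: Dict[str, List[str]]) -> List[str]:
    """List target and its transitive dependencies, dependencies first.

    Modules are marked as visited on entry, so a cycle such as
    f1 -> f2 -> f1 is cut instead of being followed forever.
    """
    order: List[str] = []
    visited: Set[str] = set()

    def visit(mod: str) -> None:
        if mod in visited:
            return
        visited.add(mod)
        for dep in deps.get(mod, []):
            visit(dep)
        order.append(mod)

    visit(target)
    return order


# The two modules above import each other.
print(ordered_dependencies("f1", {"f1": ["f2"], "f2": ["f1"]}))
# prints ['f2', 'f1']
```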

Improve determinism analysis

Wybe issues an error message if you call a test procedure or function outside of a test context. So code like

if { lst = [] :: ?z = y
    | else :: ?z = [head(lst) | append(tail(lst), y)]
    }

causes an error, because the calls to head and tail are tests. However, because they appear in a context in which we know lst is not [], they can actually be guaranteed to succeed.

Because determinism errors are detected and reported during type/mode analysis, this must be addressed prior to transformation to LPVM, and since LPVM code is strictly deterministic, determinism errors need to be handled at the AST stage, which makes the analysis messier.

I believe the easiest way to handle this would be to maintain sets of mutually exclusive calls, and sets of exhaustive tests. Initially, all the constructors of each type would be recorded as both mutually exclusive and exhaustive. Additionally, we should allow users to declare these sets. For example, the int type would include declarations that <= and > are mutually exclusive and exhaustive, likewise <, =, and >.

Given this information, when compiling code that follows success of a test, we can know that none of the calls that are mutually exclusive with it can succeed. When compiling code that follows failure of a test, we can know that one of the calls it is exhaustive with must succeed. When that set is a singleton, we can record that it must succeed. Thus, in the example code above, because we know that lst = [] and head(lst,?tmp) are exhaustive, and because lst = [] has failed in the else branch, the call to head must succeed, and similarly for tail.
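
The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the proposal, not an implementation in the compiler: tests are represented as plain strings, and the function names and the example sets are hypothetical.

```python
from typing import FrozenSet, Iterable, Set


def known_to_fail(succeeded: str,
                  exclusive_sets: Iterable[FrozenSet[str]]) -> Set[str]:
    """After a test succeeds, everything mutually exclusive with it fails."""
    ruled_out: Set[str] = set()
    for group in exclusive_sets:
        if succeeded in group:
            ruled_out |= group - {succeeded}
    return ruled_out


def known_to_succeed(failed: str,
                     exhaustive_sets: Iterable[FrozenSet[str]]) -> Set[str]:
    """After a test fails, a sole remaining alternative must succeed."""
    guaranteed: Set[str] = set()
    for group in exhaustive_sets:
        if failed in group:
            rest = group - {failed}
            if len(rest) == 1:  # singleton: no other possibility is left
                guaranteed |= rest
    return guaranteed


exhaustive = [
    frozenset({"lst = []", "head(lst, ?tmp)"}),
    frozenset({"lst = []", "tail(lst, ?tmp)"}),
    frozenset({"a < b", "a = b", "a > b"}),
]

# In the else branch, lst = [] has failed, so head and tail must succeed.
print(known_to_succeed("lst = []", exhaustive))
# A failed a < b leaves two alternatives, so nothing is guaranteed.
print(known_to_succeed("a < b", exhaustive))
```

A failed test only yields a guarantee when exactly one alternative remains, which is the singleton condition mentioned above.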

The type checker can't handle unary negation

test.wybe

?x = -1.3
!println(x)

output:

Error detected during type checking of module(s) test
./test.wybe:1:7: Type error in call to -, argument 1
./test.wybe:1:7: Type error in call to -, argument 1
./test.wybe:1:7: Toplevel call to - with 2 arguments, expected 3
./test.wybe:1:7: Toplevel call to - with 2 arguments, expected 3
./test.wybe:1:7: Toplevel call to - with 2 arguments, expected 3
./test.wybe:1:7: Toplevel call to - with 2 arguments, expected 3
./test.wybe:1:7: Toplevel call to - with 2 arguments, expected 3
./test.wybe:1:7: Toplevel call to - with 2 arguments, expected 3

Assertion failed: (i < getNumArgOperands() && "Out of bounds!"), function getArgOperand,

Compilation aborts with this message for any source file that defines a new data structure

Incorrect behavior of aliasMap.

There is a performance degradation in the procedure range in the complex test case testcase_multi_specz-int_list, caused by #87.

The original procedure:

pub def range(start:int, stop:int, step:int, ?result:int_list) {
    ?result = []
    do {
        while start < stop
        ?result = [start | result]
        ?start = start + step
    }
    reverse(!result)
}

I rewrote it as below to avoid that issue:

pub def range(start:int, stop:int, step:int, ?result:int_list) {
    ?result = []
    range_loop(start, stop, step, !result)
    reverse(!result)
}

pub def range_loop(start:int, stop:int, step:int, !result:int_list) {
    if { start < stop ::
        ?result = [start | result]
        range_loop(start+step, stop, step, !result)
    }
}

and got the LPVM code below:

range > public (0 calls)
0: range(start#0:wybe.int, stop#0:wybe.int, step#0:wybe.int, ?result#2:int_list.int_list):
 AliasPairs: []
 InterestingCallProperties: []
    int_list.range_loop<0>(~start#0:wybe.int, ~stop#0:wybe.int, ~step#0:wybe.int, 0:int_list.int_list, ?%result#1:int_list.int_list) #1 @int_list:26:5
    int_list.reverse_helper<0>(~%result#1:int_list.int_list, 0:int_list.int_list, ?%result#2:int_list.int_list) #3 @int_list:155:42


range_loop > public (2 calls)
0: range_loop(start#0:wybe.int, stop#0:wybe.int, step#0:wybe.int, result#0:int_list.int_list, ?result#2:int_list.int_list):
 AliasPairs: [(result#0,result#2)]
 InterestingCallProperties: []
    foreign llvm icmp slt(start#0:wybe.int, stop#0:wybe.int, ?tmp$2#0:wybe.bool) @wybe:31:35
    case ~tmp$2#0:wybe.bool of
    0:
        foreign llvm move(~result#0:int_list.int_list, ?result#2:int_list.int_list)

    1:
        foreign lpvm alloc(16:wybe.int, ?tmp$7#0:int_list.int_list)
        foreign lpvm mutate(~tmp$7#0:int_list.int_list, ?tmp$8#0:int_list.int_list, 0:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, start#0:wybe.int)
        foreign lpvm mutate(~tmp$8#0:int_list.int_list, ?tmp$9#0:int_list.int_list, 8:wybe.int, 1:wybe.int, 16:wybe.int, 0:wybe.int, ~result#0:int_list.int_list)
        foreign llvm add(~start#0:wybe.int, step#0:wybe.int, ?tmp$1#0:wybe.int) @wybe:20:34
        int_list.range_loop<0>(~tmp$1#0:wybe.int, ~stop#0:wybe.int, ~step#0:wybe.int, ~tmp$9#0:int_list.int_list, ?%result#2:int_list.int_list) #3 @int_list:33:9

The alias analysis fails to handle the case where the first int_list argument in the call to range_loop in range is a constant, so it concludes that the return value result#1 is aliased.

While trying to fix it, I found something odd.
The aliasMap after the call to range_loop looks like this:

current aliasMap: fromList [fromList [LiveVar lst#0,LiveVar result#1],fromList [LiveVar result#2,MaybeAliasByParam result#2],fromList [MaybeAliasByParam start#0],fromList [MaybeAliasByParam step#0],fromList [MaybeAliasByParam stop#0]]

lst#0 isn't even a variable in the procedure range. It seems the alias analysis mixes up the parameters of the callee with the arguments of the caller.
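One way to keep callee names out of the caller's alias map is to translate the callee's alias pairs through an explicit parameter-to-argument table and drop any pair that involves a constant argument. The following is only a sketch of that idea: the Arg type and renameAliasPairs are hypothetical names for illustration and are not taken from the Wybe sources.

```haskell
import qualified Data.Map as Map
import Data.Maybe (mapMaybe)

-- | Hypothetical argument representation for this sketch only:
-- a caller-side variable name or a constant.
data Arg = ArgVar String | ArgConst Integer

-- | Translate alias pairs stated over callee parameter names into pairs
-- over caller variable names. A pair is dropped when either side is bound
-- to a constant (or is not a parameter at all), so no callee name can
-- leak into the result.
renameAliasPairs :: [String] -> [Arg] -> [(String, String)] -> [(String, String)]
renameAliasPairs params args = mapMaybe rename
  where
    -- Only parameters bound to caller variables get an entry.
    table = Map.fromList [ (p, v) | (p, ArgVar v) <- zip params args ]
    rename (p, q) = (,) <$> Map.lookup p table <*> Map.lookup q table
```

Applied to the call above, the callee pair (result#0, result#2) is dropped because result#0 is bound to the constant 0, so result#1 is not reported as aliased.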

Makefile does not escape spaces in user-controlled paths

This was tested in an Ubuntu environment.
For example, when the path of the wybe compiler's git repo is something like:
/home/wybe compiler/
make all fails to copy the wybemk file from stack's tmp directory to the wybe compiler's root directory.
The exact command causing the issue is cp `stack path --local-install-root`/bin/$@ $@ on line 43 of the Makefile.
A simple fix would be to add quotes to the command:
cp "`stack path --local-install-root`/bin/$@" $@
I am not familiar enough with macOS to know whether this would break there, though.
The same issue also occurs if INSTALLBIN or INSTALLLIB contains spaces.

Builder doesn't handle imports in sub-modules

I believe the cause is here:

wybe/src/Builder.hs

Lines 435 to 460 in 9bfffb1

    
compileModule :: FilePath -> ModSpec -> Maybe [Ident] -> [Item] -> Compiler ()
compileModule source modspec params items = do
    logBuild $ "===> Compiling module " ++ showModSpec modspec
    enterModule source modspec (Just modspec) params
    -- Hash the parse items and store it in the module
    let hashOfItems = hashItems items
    logBuild $ "HASH: " ++ hashOfItems
    updateModule (\m -> m { itemsHash = Just hashOfItems })
    -- verboseMsg 1 $ return (intercalate "\n" $ List.map show items)
    -- XXX This means we generate LPVM code for a module before
    -- considering dependencies.  This will need to change if we
    -- really need dependency info to do initial LPVM translation, or
    -- if it's too memory-intensive to load all code to be compiled
    -- into memory at once.  Note that this does not do a proper
    -- top-down traversal, because we may not visit *all* users of a
    -- module before handling the module.  If we decide we need to do
    -- that, then we'll need to handle the full dependency
    -- relationship explicitly before doing any compilation.
    Normalise.normalise items
    stopOnError $ "preliminary processing of module " ++ showModSpec modspec
    loadImports
    stopOnError $ "handling imports for module " ++ showModSpec modspec
    mods <- exitModule -- may be empty list if module is mutually dependent
    logBuild $ "<=== finished compling module " ++ showModSpec modspec
    logBuild $ "     module dependency SCC: " ++ showModSpecs mods
    compileModSCC mods

loadImports only handles the imports of the root module.

If #71 is merged, then we also need to fix the corresponding part for loading from an object file.

I believe the builder needs a major refactor. For more detail, see the XXX comments added by PR #71.

Move the documentation into a separate markdown file, out of the README.md.

The Scanner doesn't handle float literals properly.

err.wybe

?x = 0.01
!println(x)

output:

Syntax Error: "./err.wybe" (line 1, column 7):
unexpected TokSymbol "." "./err.wybe" (line 1, column 7)
expecting simple expression terms, relational expressions or end of input

I believe the faulty part is here: a number that starts with 0 is treated as octal.

wybe/src/Scanner.hs

Lines 113 to 115 in db7242e

    
| isDigit c = let (tok,rest,newpos) =
                    scanNumberToken (if c=='0' then 8 else 10)
                    (fromIntegral $ digitToInt c)

Also, according to the current documentation, the "order of magnitude" part of a float in scientific notation can't be negative. That's a bit odd.
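A possible direction is to pick the radix only after looking past the integer digits, so that a leading 0 followed by a fractional part is read as a decimal float. The sketch below is standalone and hypothetical: NumTok and scanNumber are illustrative names, not the types or functions in Scanner.hs, and exponents are left out.

```haskell
import Data.Char (digitToInt, isDigit, isOctDigit)

-- | Hypothetical token type for this sketch; not the one in Scanner.hs.
data NumTok = IntTok Integer | FloatTok Double
    deriving (Eq, Show)

-- | Scan a number at the start of the input, returning the token and the
-- unconsumed rest. A leading 0 selects octal only when the digits are not
-- followed by a fractional part.
scanNumber :: String -> Maybe (NumTok, String)
scanNumber input =
    case span isDigit input of
        ("", _) -> Nothing
        -- digits '.' digits: always a decimal float, even with a leading 0
        (whole, '.':rest)
            | (frac@(_:_), rest') <- span isDigit rest ->
                Just (FloatTok (read (whole ++ "." ++ frac)), rest')
        -- leading 0 followed only by octal digits: octal integer
        ('0':digits@(_:_), rest)
            | all isOctDigit digits ->
                Just (IntTok (fromDigits 8 digits), rest)
        -- anything else: decimal integer
        (whole, rest) -> Just (IntTok (fromDigits 10 whole), rest)
  where
    fromDigits radix =
        foldl (\acc d -> acc * radix + fromIntegral (digitToInt d)) 0
```

With this ordering, the input 0.01 from err.wybe is scanned as a single float token rather than an octal 0 followed by an unexpected ".".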

Put more examples in the documentation

bitwidth warning compiling cbits.c

When compiling cbits.c, I get these warning messages:

wybelibs/cbits.c:16:35: warning: format specifies type 'long' but the argument
      has type 'int64_t' (aka 'long long') [-Wformat]
    return (int64_t)printf("%ld", x);
                            ~~~   ^
                            %lld
wybelibs/cbits.c:28:42: warning: format specifies type 'long' but the argument
      has type 'int64_t' (aka 'long long') [-Wformat]
  return (int64_t)fprintf(stderr, "%ld", x);
                                   ~~~   ^
                                   %lld
wybelibs/cbits.c:51:18: warning: format specifies type 'long *' but the argument
      has type 'int64_t *' (aka 'long long *') [-Wformat]
    scanf("%ld", &x);
           ~~~   ^~
           %lld

Add subtype support to Wybe

Allow a type to be a subtype of another type. For a type a to be a subtype of b, it would be required for there to be a total function named b from a to b. Then it would be allowed to pass a value x:a anywhere a b was expected; the compiler would automatically pass b(x) instead.

This permits multiple inheritance. It also permits bi-directional conversion, in cases where two types are effectively semantically equivalent, though perhaps implemented differently.

For now, don't search for a conversion route between types: do the conversion from a to b if there's a function named b from a to b; otherwise, the conversion is not supported.

Allow the conversion function to be defined in any module. E.g., module c can import both a and b, which are not related, but if c defines or imports a function b from a to b, then a is a subtype of b and c can pass an a anywhere a b is expected.
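The "no route search" rule can be illustrated with a small model of the check, written here in Haskell. This is a sketch under assumptions: the visible conversion functions are modelled as a set of (from, to) type-name pairs, and TypeName and isSubtype are hypothetical names rather than anything in the compiler.

```haskell
import qualified Data.Set as Set

type TypeName = String

-- | Hypothetical model: each visible conversion function is recorded as
-- (argument type, result type); by the proposal it is named after the
-- result type. Only a direct conversion counts, so the relation is
-- deliberately not transitive.
isSubtype :: Set.Set (TypeName, TypeName) -> TypeName -> TypeName -> Bool
isSubtype convs a b = a == b || Set.member (a, b) convs
```

Under this model, having conversions from a to b and from b to c does not make a a subtype of c unless a function c from a to c is also visible.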

Unable to compile use_resource.wybe (one of the test cases)

When trying to compile the test cases in /test-cases/final-dump/, use_resource.wybe gives an error related to the use of resources.

$wybemk use_resource
gives the following error:
Unknown resource use_resource.count
while
$wybemk use_resource.o
gives no error

The current implementation of Union-Find seems to be wrong.

I was trying to fix #21, which requires merging multiple alias maps, and I found a case where the result is wrong.

Util.combineUf (fromList [(a#0,a#0),(b#0,b#0),(r#0,a#0)]) (fromList [(a#0,a#0),(b#0,b#0),(r#0,b#0)]) returns fromList [(a#0,a#0),(b#0,b#0),(r#0,a#0)]

Util.uniteUf (fromList [(a#0,a#0),(b#0,b#0),(r#0,b#0)]) (r#0) (a#0) returns fromList [(a#0,a#0),(b#0,b#0),(r#0,a#0)].

I don't know how to fix it because the current implementation doesn't make sense to me.
My questions are:

What is the "weight" for?

wybe/src/Util.hs

Line 162 in f5cae83

weight :: Int

If I understand it correctly, each node records its root directly instead of its parent.

wybe/src/Util.hs

Lines 273 to 282 in f5cae83

    
| weight rootP <= weight rootQ ->
    let
        -- update p's root to q's root
        ip' = ip {root = rq}
        -- append p's weight to q's root's weight
        rootQ' = rootQ {weight = weight rootQ + weight ip}
        currMap1 = Map.insert p ip' currMap
        -- also make p's children point to new root (ip')
        currMap2 = _updateRoot p ip' currMap1
    in Map.insert rq rootQ' currMap2

Each update requires visiting all nodes in the map and updating the old root to the new root.

wybe/src/Util.hs

Lines 334 to 339 in f5cae83

    
_updateRoot cond newInfo =
    Map.foldrWithKey (\k i uf ->
        if root i == cond
        then Map.insert k newInfo uf
        else Map.insert k i uf
        ) Map.empty

I don't think it should be considered a union-find, since the union operation takes O(n) time instead of O(log n). Any naive approach to this problem runs in O(n) (list of lists, list of sets, map from item to group id, etc.).
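For comparison, here is a minimal sketch of a parent-pointer union-find with union by size, assuming a persistent Data.Map representation. The names UF, emptyUF, find, union and connected are hypothetical and are not the ones in Util.hs. Each node stores its parent rather than its root, so union rewrites one link instead of the whole map, and linking the smaller tree under the larger keeps every find path to O(log n) links. Path compression is omitted for brevity.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

-- | Hypothetical representation for this sketch: parent links plus the
-- size of each root's set. A key absent from 'parent' is its own root;
-- a root absent from 'size' has size 1.
data UF a = UF { parent :: Map a a, size :: Map a Int }

emptyUF :: UF a
emptyUF = UF Map.empty Map.empty

-- | Follow parent links until reaching a key with no (other) parent.
find :: Ord a => UF a -> a -> a
find uf x =
    case Map.lookup x (parent uf) of
        Just p | p /= x -> find uf p
        _               -> x

-- | Merge the sets containing x and y by linking the root of the smaller
-- set under the root of the larger one.
union :: Ord a => a -> a -> UF a -> UF a
union x y uf
    | rx == ry  = uf
    | sx <= sy  = link rx ry
    | otherwise = link ry rx
  where
    rx = find uf x
    ry = find uf y
    sx = Map.findWithDefault 1 rx (size uf)
    sy = Map.findWithDefault 1 ry (size uf)
    link child root = UF
        { parent = Map.insert child root (parent uf)
        , size   = Map.insert root (sx + sy) (size uf)
        }

-- | Two keys are in the same set when they share a root.
connected :: Ord a => UF a -> a -> a -> Bool
connected uf x y = find uf x == find uf y
```

In the scenario from this issue, starting with r in b's set and then uniting r with a leaves a, b and r in one set, because the link is made between the two roots rather than by re-pointing r alone.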

Implement automatic parameter/argument globalisation

Using some suitable heuristic, automatically pass some procedure input and output arguments in global variables. This will be beneficial for arguments passed around widely through the program, and probably detrimental for arguments that change frequently.

It would probably be best to actually store all these in allocated heap memory, so that each thread can have its own set, when we support multithreading.
