Coder Social home page Coder Social logo

laythe-lang / laythe Goto Github PK

View Code? Open in Web Editor NEW
64.0 4.0 4.0 32.34 MB

A gradually typed language originally based on the crafting interpreters series

License: MIT License

Rust 97.74% Python 0.83% Ruby 0.71% HTML 0.20% JavaScript 0.51%
virtual-machine crafting-interpreters programming-language laythe gradual-typing interpreter

laythe's People

Contributors

dependabot[bot] avatar jonnyboyc avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

laythe's Issues

Migrate all collections to GC

Fundamental our current strategy for tracking VM size is extremely error prone. The Allocator methods .grow and shrink are poor and slows ways to attempt to capture resizing for structures that are simply not owned by the VM.

I think we should approach this in a few steps.

  • Remove .grow and .shrink and stop pretending that we are actually capturing heap resizing for collections outside the stack
  • Replace usages of boxed slices with the existing GcArray
  • Create replacements for Vec, HashSet and HashMap. It will be easier to correctly track size by simply preventing resizing instead requesting it from the GC.

SetIndex fails on multiple assign

Problem

Currently SetIndex does not correctly support multiple assignments like SetLocal and SetProperty. Here is an example:

laythe:> let a = [1, 2]; a[0] = a[1] = 5;

script.lay
  0000    1 List         
  0001    | Constant          1 1
  0003    | Constant          2 2
  0005    | ListInit          2
  0008    | DefineGlobal      0 'a'
  0011    | GetGlobal         0 'a'
  0014    | Constant          3 0
  0016    | GetGlobal         0 'a'
  0019    | Constant          1 1
  0021    | Constant          4 5
  0023    | SetIndex     
  0024    | SetIndex     
  0025    | Drop         
  0026    | Nil          
  0027    | Return       
Local Stack:  [ nil ]
  0000    1 List         
Local Stack:  [ nil ][ [] ]
  0001    | Constant          1 1
Local Stack:  [ nil ][ [] ][ 1 ]
  0003    | Constant          2 2
Local Stack:  [ nil ][ [] ][ 1 ][ 2 ]
  0005    | ListInit          2
Local Stack:  [ nil ][ [1, 2] ]
  0008    | DefineGlobal      0 'a'
Local Stack:  [ nil ]
  0011    | GetGlobal         0 'a'
Local Stack:  [ nil ][ [1, 2] ]
  0014    | Constant          3 0
Local Stack:  [ nil ][ [1, 2] ][ 0 ]
  0016    | GetGlobal         0 'a'
Local Stack:  [ nil ][ [1, 2] ][ 0 ][ [1, 2] ]
  0019    | Constant          1 1
Local Stack:  [ nil ][ [1, 2] ][ 0 ][ [1, 2] ][ 1 ]
  0021    | Constant          4 5
Local Stack:  [ nil ][ [1, 2] ][ 0 ][ [1, 2] ][ 1 ][ 5 ]
  0023    | SetIndex     
Local Stack:  [ nil ][ [1, 5] ][ 0 ][ [1, 5] ]
  0024    | SetIndex     

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)

Currently we leave a copy of the list on top of the stack instead of 5. I also want to transition away from an explicit list and map functionality in GetIndex and SetIndex. Instead I want these methods to define a [] and []= where SetIndex manipulates the stack to get the desired result.

Iterator skip & take

After doing more implementing in Laythe the common iterator methods .take and .skip are becoming more of a pain point. I believe like essentially all iterator methods that don't produces a single value like .reduce .all or .any these methods should be lazy. Below shows what I believe the behavior should be

let l1 = [1, 2, 3].iter();
let s = l1.skip(2);
print(List.collect(s));
// [3]
// l1 is now exhausted

let l2 = [1, 2, 3].iter();
let t = l2.take(2);
print(List.collect(t));
// [1, 2]
// l2 can still yield 3

Duplicate Drop

For loop jumps, such as continue or break we have two instances of drops.

while true {
  let a = 10;
  // <-- drop a
  break;
  // <-- drop a
}

While the second invocation of a will never be hit there are two issues.

  • Instruction bloat. We could be more compacting representing the code for instruction that can't be hit
  • The real issue is we're not accounting for the max number of function slots correctly. Below is an example
fn cool() {
  while true {
    let a = 10;
    // <-- drop a
    break;
    // <-- drop a
  }

  let b = 10; 
  let c = 10; 
  let d = 10; 
}

In this example we decimate the count by 2 for the dropping of a. Which would lead us to undercount the slots by run which could lead to a runtime panic. Additionally we could decimate the counter such that we underflow usize during compliation

GC Bug

Summary

It currently appears that at some point I introduced a non deterministic GC bug. Using gc_stress feature flag we frequently get crashes. We'll need to hunt down this issue.

Setup CI

Summary

As I continue to work on this project setting up CI will be of greater benefit. Want want to look into setting up a docker image.

Extra values in vm stack when exception is caught in the same function

Summary

Currently there is issue with exception handling when the catch block is handled in the current function. See this script snippet

let list = [];
try {
  list.insert();
  assert(false);
} catch {
  assert(true);
}

The following snippet will throw an exception on the line list.insert() because not enough arguments were provided. This currently generates this bytecode

0000    0 List            
0001    | DefineGlobal         0 'list'
0004    | GetGlobal            0 'list'
0007    2 Invoke               0 (1 args) 'insert'  // <- will throw on this instruction
0011    | Drop            
0012    | GetGlobal            2 'assert'
0015    3 False           
0016    | Call                 1
0018    | Drop            
0019    | Jump                19 -> 29
0022    4 GetGlobal            2 'assert'
0025    5 True            
0026    | Call                 1
0028    | Drop                   
0039    | Nil             
0030   13 Return    

This produces the following debug information

Stack:        
0000    0 List            
Stack:        [ [] ]
0001    | DefineGlobal         0 'list'
Stack:        
0004    | GetGlobal            0 'list'
Stack:        [ [] ]
0007    2 Invoke               0 (1 args) 'insert'
Exception popped 0000 frames caught by: script
Stack:        [ [] ]
0022    4 GetGlobal            2 'assert'
Stack:        [ [] ][ <native assert> ]
0025    5 True            
Stack:        [ [] ][ <native assert> ][ true ]
0026    | Call                 1

Because of the proceeding GetGlobal instruction we have an empty list on the stack when the exception is thrown and then caught. The program terminates correctly with the correct output. It not clear if this can produce an incorrect behavior or if this simply wastes a small amount of stack space. Eventually if I continue to look to wren for ideas it may make sense for the compiler to be more aware of stack effects of each instruction. It may then be possible to calculate while compiling how many values should be on the stack at the end of the try catch block.

Investigate adding std::any::Any to instance

I'm thinking there may be some benefit to adding some generic fields to the Instance struct. Something like

#[derive(PartialEq, Clone)]
pub struct Instance {
  pub class: Managed<Class>,
  fields: Box<[Value]>,
  native_data: Option<Managed<ManagedAny>>
} 

I think this gives us some options to store extra data related to native functionality without need several new types to values. The example in mind is the beginnings of the regexp module. Here I'm primarily just wrapping the regex crate. For each of the three test methods I have test, match and captures I have currently construct a new Regex struct each time this is invoked which is quite wasteful. Instead I could make a new trait ManagedAny something like

trait ManagedAny : Manage + any::Any {}

This way I can store essentially arbitrary extra data on an instance such as a Regex struct and implement the normal Manage members. Inside methods on my Regexp class I should be able to call.

let instance = this.unwrap().to_instance();
let regex = match instance.native_data.downcast_ref::<Regex>().expect("Expected Regex")
// use regex

Native functions need to indicate if they are stackless

Summary

Currently how native function are implemented assumes that these functions are stackless and immediately return. This is left over from the original crafting interpreters implementation when all native functions were stackless. Specifically the only native function was clock() which returned a single double.

Now Laythe implements several methods, specially on iterators, that can take any callable value. Here the stack from can be extended beyond the original native call. This leads to issues where values are popped from that stack that shouldn't be. From initially probing I think Native functions need to make the distinction between stackful and stackless functions. In the case of stackful function I believe we need to push a frame onto the stack unify the logic and to increase debugibility.

Implement class / static methods

Summary

Add the ability to have methods attached to a class directly without an instance. To accommodate this two changes will need to occur.

  1. Rename this to self cuz
  2. Syntax change to methods to take self as the first parameter of instance methods
  3. either add a class class or an additional field on the class object to hold class methods
  4. Add super method to class class to retrieve a reference to the super class

Example

Here is an example of what I want to syntax to look like after these changes.

class Base {
  init(self) {
    self.a = 10;
  }

  instance(self) {
    return self.a;
  }

  cls() {
    return "cat";
  }
}

class Extended < Base {
  init(self) {
    super.init();
  }

  get(self) {
    return super.cls();
  }

  cls() {
    return true;
  }
}

let e = Extended();

e.get(); // "cat"
e.cls() // true
Base.cls() // "cat"
Extended.cls() // true

Eagerly Compile Imported Code

As of today Laythe compiles code just before it attempts to run it. In most cases this will occur nearly immediately but as an example say we have these two modules

main.lay

10000.times().iter().each(() => {
  // slow operation
});

import self.lib:{foo}

lib.lay

let bar = 10;

export let foo = bard + 1; // compile error unknown symbol

In laythe today we would indicate the compiler error once we hit the import after the slow operation. Ideally we want to warn of these issues as quickly as possible.

Exception Handler Popping issue

Problem

Right now there is an issue with pushing and popping exception handlers. Right now we only really handle two cases

try {
  // stuff 
  // <- pop here
} catch {
  // <- pop here
  // stuff
}

At first this seems fine we pop the handler at the end of the try catch block and we pop it at the begin of the catch if an exception is handled. The issue is we can exit this block in other ways

try {
  // stuff 
  return 10;
  // <- pop here
} catch { 
  // <- pop here
  // stuff
}

In this case we return prior to popping the exception handler.

Additional we have a similar case with continue and break

while true {
  try {
    // stuff 
    continue; // or break
    // <- pop here
  } catch { 
    // <- pop here
    // stuff
  }
}

Here continue is worse because we can build up a large number of exception handlers that never get popped. This issue materialized because we had a case where an inner exception handler wasn't popped because of a return. Later an exception was through inside the scope of another handler but the first handler was used and attempted to index out of bounds.

ByteCode definition leading to large "bytes" ?

Hello! First I would like to thank you for having this repository. It has been invaluable as I work through crafting interpreters with rust.

In your definition for bytecode you allow some enums to have values, but this will cause all the enums to have that size:
https://github.com/jonnyboyC/spacelox/blob/0808b0138e399539d28e30feef9baef0cc5c34e4/src/chunk.rs#L46

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=0a47a8a24b4c8c33e01c93001bd05ecf

Was this intentional?

Expression Conditional

Problem

Currently the language has no mechanism to do conditionals as expressions which as I test out actually developing in it quickly becomes a pain. The only real mechanism right now is something similar to IIF in JavaScript or using short circuit logical operators

let myValue = || {
  if true {
    return 10;
  } else {
    return 8
  }
}();

let myValue = true and 10 or 8;

Obviously this a lot of code compounded by the additional lack of implicit return. On top of that we pay for the function call which is probably 2 or 3 times more expensive than the branch.

I haven't decided which way I'd like to go but likely I'd go with how Js or rust does conditional

const myValue = true ? 10 : 8;
let myValue = if true { 10 } else { 8 };

The javascript / C style is obviously very compact with the downside of not being able to use statements. The rust style paired with implicit return is were I currently lean. Mostly because I need to actually get block statements working so I've already committed to a similar instruction

Implement Minimal TypeChecker for basic class shape

Summary

Right now we don't have any type checking in Laythe. So far this has been fine but we have a synergy now with another problem. Specifically we don't have a good way for a user to subclass a "native" class. Today our native classes are built in the rust code but because of that our regular compiler code doesn't know which fields or method are available on the class. Today we use the ClassAttributes structs to track fields but these are only present in user provided code.

In order to solve this we could produce a minimal type checker that really only handles classes and primitives. Once we have this in place we can build a type for each native function, constant and class. In the user provided code we could then use this type signature to extend

Shorthand for .iter() and .into(<Class>.collect)

Problem

While working on advent of code 2021 I've noticed there is a quite a bit of boilerplate with both .iter() and .into(<Class>.collect). Essentially converting a container too and from an iterator. I'd like to consider add some sort of sugar make this explicit still but shorter.

let a = [1,2, 3]$.map(|x| x).toList()

Move Allocator up to laythe_vm

Summary

I keep running into issue where new GC bugs pop up as I progress through these refactors. In general I'd like to be able to run cargo test in ci with the gc_stress on to gc on every allocation to minimize the chance of introducing one of these bugs.

Overall I think to achieve this I need to do something along the lines of thing

  • Move the current allocator to laythe_vm crate
  • Implement a new dummy allocator in laythe_core. Likely we can just have it never GC to simplify the implementation
  • Route essentially all current allocator call through GcHooks to simplify the interface where we can have a test context that uses our dummy allocator
  • Move the crate feature up to laythe_vm so we can have it be a feature when we call cargo test

Add Channel Support

Originally I thinking of following Wren's direction by offer fiber communication through a yield keyword. I think there are some restrictions here the biggest being distributing work among many fibers. I think here I want to go with approach more similar to go where I have another first class value as a channel.

As some potential syntax we could have some like this

fn printFile(chan: channel) -> string {
  // I think like go we should have a conveniences
  // keyword around channels that effectively reads
  // out of the channel until it's closed
  for path in drain channel {

    // I think we also want the std lib methods to yield during 
    print(File.readallText(path));
  }
}

let paths = [
  // a bunch of file paths
]

// I think I'm leaning towards have 
// channel creation look like an instance
// creation. I'm not sure if it makes sense
// to use a keyword like chan
let chan = Channel();

for _ in 4.times() {
  // I'm between spawn and run. I'd probably
  // want to avoid go since that will probably
  // be somewhat confusing with go itself
  spawn printFile(chan);
}

for path in paths {
  // push value into channel
  // some fiber will read the value
  chan <- path
}

// close the channel once we're done
chan.close();

I think there are a few new pieces of syntax that may need to be introduced.

// push a value into the channel
chan <- value

// read a value from the channel
let value <- chan

// drain keyword to treat a channel like an iterator
for value in drain chan {

}

// I think something like go's select will also be a good idea
// I think since I'd ideally like to have a match statement similar to rust
// I'll have this select somewhat mirror that

// here is the non blocking case
select {
  chan1(msg) => {
    print('msg')
  }
  chan2(msg) => {
    print('msg')
  }
  _ => {
    print('example');
  }
}

// here is the blocking case
select {
  chan1(msg) => {
    print('msg')
  }
  chan2(msg) => {
    print('msg')
  }
}

Some methods that may make sense on channels

// I think we can optionally take a parameter
// to the channel constructor to indicate if it should be buffered
// I think as an mvp we only have unbuffered channels
let chan = Channel(5);

chan.isClose();
// boolean value is this channel closed

chan.bufferLen()
// how many values will this channel buffer

// restrict the interface of the channel to be
// read or write only
// I think under the hood this should only change the
// class of the channel but otherwise should be the same data
let writeOnlyChan = chan.writeOnly();
let readOnlyChan = chan.readOnly();

I'll need to gather some thoughts on how actually implement this.

Calling Methods in HOF doesn't work correctly

Currently the follow script will report an incorrect runtime error

class A {
  foo(x) { x / 2 }
}

let iter = [1, 2, 3, 4].iter().map(A().foo);

assertEq(iter.next(), true);
// RuntimeError: Iter is not callable.

The stack for the script is getting incorrectly setup.

Bug with static methods passed to HOF

The following function will case and error in the interpreter.

// boom
let numbers = '1 2 4 25 103'.split(' ').map(Number.parse).into(List.collect);

class Test {
  static addOne(x) {
    return x + '1'
  }
}

// this is fine
let addedOne = '1 2 4 25 103'.split(' ').map(Test.addOne).into(List.collect);

It appears this is an issue with static method when they are placed inside a Method object

Add User Module Support for nested files

With #99 we added our initial implementation of user level modules. The biggest doesn't side is we're currently restricted to just the directory. The next iteration of user modules should allow importing files of arbitrary nesting.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.