laythe-lang / laythe Goto Github PK
View Code? Open in Web Editor NEWA gradually typed language originally based on the crafting interpreters series
License: MIT License
A gradually typed language originally based on the crafting interpreters series
License: MIT License
Fundamental our current strategy for tracking VM size is extremely error prone. The Allocator
methods .grow
and shrink
are poor and slows ways to attempt to capture resizing for structures that are simply not owned by the VM.
I think we should approach this in a few steps.
.grow
and .shrink
and stop pretending that we are actually capturing heap resizing for collections outside the stackVec
, HashSet
and HashMap
. It will be easier to correctly track size by simply preventing resizing instead requesting it from the GC.Currently SetIndex
does not correctly support multiple assignments like SetLocal
and SetProperty
. Here is an example:
laythe:> let a = [1, 2]; a[0] = a[1] = 5;
script.lay
0000 1 List
0001 | Constant 1 1
0003 | Constant 2 2
0005 | ListInit 2
0008 | DefineGlobal 0 'a'
0011 | GetGlobal 0 'a'
0014 | Constant 3 0
0016 | GetGlobal 0 'a'
0019 | Constant 1 1
0021 | Constant 4 5
0023 | SetIndex
0024 | SetIndex
0025 | Drop
0026 | Nil
0027 | Return
Local Stack: [ nil ]
0000 1 List
Local Stack: [ nil ][ [] ]
0001 | Constant 1 1
Local Stack: [ nil ][ [] ][ 1 ]
0003 | Constant 2 2
Local Stack: [ nil ][ [] ][ 1 ][ 2 ]
0005 | ListInit 2
Local Stack: [ nil ][ [1, 2] ]
0008 | DefineGlobal 0 'a'
Local Stack: [ nil ]
0011 | GetGlobal 0 'a'
Local Stack: [ nil ][ [1, 2] ]
0014 | Constant 3 0
Local Stack: [ nil ][ [1, 2] ][ 0 ]
0016 | GetGlobal 0 'a'
Local Stack: [ nil ][ [1, 2] ][ 0 ][ [1, 2] ]
0019 | Constant 1 1
Local Stack: [ nil ][ [1, 2] ][ 0 ][ [1, 2] ][ 1 ]
0021 | Constant 4 5
Local Stack: [ nil ][ [1, 2] ][ 0 ][ [1, 2] ][ 1 ][ 5 ]
0023 | SetIndex
Local Stack: [ nil ][ [1, 5] ][ 0 ][ [1, 5] ]
0024 | SetIndex
thread 'main' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
Currently we leave a copy of the list on top of the stack instead of 5. I also want to transition away from an explicit list and map functionality in GetIndex
and SetIndex
. Instead I want these methods to define a []
and []=
where SetIndex manipulates the stack to get the desired result.
After doing more implementing in Laythe the common iterator methods .take
and .skip
are becoming more of a pain point. I believe like essentially all iterator methods that don't produces a single value like .reduce
.all
or .any
these methods should be lazy. Below shows what I believe the behavior should be
let l1 = [1, 2, 3].iter();
let s = l1.skip(2);
print(List.collect(s));
// [3]
// l1 is now exhausted
let l2 = [1, 2, 3].iter();
let t = l2.take(2);
print(List.collect(t));
// [1, 2]
// l2 can still yield 3
Today we only allow user level fiber suspension for via Sends and Receives. If we want to enable async IO we'll need to do similar in native code
For loop jumps, such as continue
or break
we have two instances of drops.
while true {
let a = 10;
// <-- drop a
break;
// <-- drop a
}
While the second invocation of a will never be hit there are two issues.
fn cool() {
while true {
let a = 10;
// <-- drop a
break;
// <-- drop a
}
let b = 10;
let c = 10;
let d = 10;
}
In this example we decimate the count by 2 for the dropping of a. Which would lead us to undercount the slots by run which could lead to a runtime panic. Additionally we could decimate the counter such that we underflow usize
during compliation
It currently appears that at some point I introduced a non deterministic GC bug. Using gc_stress
feature flag we frequently get crashes. We'll need to hunt down this issue.
As I continue to work on this project setting up CI will be of greater benefit. Want want to look into setting up a docker image.
Currently there is issue with exception handling when the catch block is handled in the current function. See this script snippet
let list = [];
try {
list.insert();
assert(false);
} catch {
assert(true);
}
The following snippet will throw an exception on the line list.insert()
because not enough arguments were provided. This currently generates this bytecode
0000 0 List
0001 | DefineGlobal 0 'list'
0004 | GetGlobal 0 'list'
0007 2 Invoke 0 (1 args) 'insert' // <- will throw on this instruction
0011 | Drop
0012 | GetGlobal 2 'assert'
0015 3 False
0016 | Call 1
0018 | Drop
0019 | Jump 19 -> 29
0022 4 GetGlobal 2 'assert'
0025 5 True
0026 | Call 1
0028 | Drop
0039 | Nil
0030 13 Return
This produces the following debug information
Stack:
0000 0 List
Stack: [ [] ]
0001 | DefineGlobal 0 'list'
Stack:
0004 | GetGlobal 0 'list'
Stack: [ [] ]
0007 2 Invoke 0 (1 args) 'insert'
Exception popped 0000 frames caught by: script
Stack: [ [] ]
0022 4 GetGlobal 2 'assert'
Stack: [ [] ][ <native assert> ]
0025 5 True
Stack: [ [] ][ <native assert> ][ true ]
0026 | Call 1
Because of the proceeding GetGlobal instruction we have an empty list on the stack when the exception is thrown and then caught. The program terminates correctly with the correct output. It not clear if this can produce an incorrect behavior or if this simply wastes a small amount of stack space. Eventually if I continue to look to wren for ideas it may make sense for the compiler to be more aware of stack effects of each instruction. It may then be possible to calculate while compiling how many values should be on the stack at the end of the try catch block.
I'm thinking there may be some benefit to adding some generic fields to the Instance
struct. Something like
#[derive(PartialEq, Clone)]
pub struct Instance {
pub class: Managed<Class>,
fields: Box<[Value]>,
native_data: Option<Managed<ManagedAny>>
}
I think this gives us some options to store extra data related to native functionality without need several new types to values. The example in mind is the beginnings of the regexp
module. Here I'm primarily just wrapping the regex
crate. For each of the three test methods I have test
, match
and captures
I have currently construct a new Regex
struct each time this is invoked which is quite wasteful. Instead I could make a new trait ManagedAny
something like
trait ManagedAny : Manage + any::Any {}
This way I can store essentially arbitrary extra data on an instance such as a Regex
struct and implement the normal Manage
members. Inside methods on my Regexp
class I should be able to call.
let instance = this.unwrap().to_instance();
let regex = match instance.native_data.downcast_ref::<Regex>().expect("Expected Regex")
// use regex
Currently how native function are implemented assumes that these functions are stackless and immediately return. This is left over from the original crafting interpreters implementation when all native functions were stackless. Specifically the only native function was clock()
which returned a single double.
Now Laythe implements several methods, specially on iterators, that can take any callable value. Here the stack from can be extended beyond the original native call. This leads to issues where values are popped from that stack that shouldn't be. From initially probing I think Native functions need to make the distinction between stackful and stackless functions. In the case of stackful function I believe we need to push a frame onto the stack unify the logic and to increase debugibility.
Add the ability to have methods attached to a class directly without an instance. To accommodate this two changes will need to occur.
this
to self
cuzself
as the first parameter of instance methodssuper
method to class class to retrieve a reference to the super classHere is an example of what I want to syntax to look like after these changes.
class Base {
init(self) {
self.a = 10;
}
instance(self) {
return self.a;
}
cls() {
return "cat";
}
}
class Extended < Base {
init(self) {
super.init();
}
get(self) {
return super.cls();
}
cls() {
return true;
}
}
let e = Extended();
e.get(); // "cat"
e.cls() // true
Base.cls() // "cat"
Extended.cls() // true
As of today Laythe compiles code just before it attempts to run it. In most cases this will occur nearly immediately but as an example say we have these two modules
10000.times().iter().each(() => {
// slow operation
});
import self.lib:{foo}
let bar = 10;
export let foo = bard + 1; // compile error unknown symbol
In laythe today we would indicate the compiler error once we hit the import after the slow operation. Ideally we want to warn of these issues as quickly as possible.
Right now there is an issue with pushing and popping exception handlers. Right now we only really handle two cases
try {
// stuff
// <- pop here
} catch {
// <- pop here
// stuff
}
At first this seems fine we pop the handler at the end of the try catch block and we pop it at the begin of the catch if an exception is handled. The issue is we can exit this block in other ways
try {
// stuff
return 10;
// <- pop here
} catch {
// <- pop here
// stuff
}
In this case we return prior to popping the exception handler.
Additional we have a similar case with continue
and break
while true {
try {
// stuff
continue; // or break
// <- pop here
} catch {
// <- pop here
// stuff
}
}
Here continue is worse because we can build up a large number of exception handlers that never get popped. This issue materialized because we had a case where an inner exception handler wasn't popped because of a return
. Later an exception was through inside the scope of another handler but the first handler was used and attempted to index out of bounds.
Hello! First I would like to thank you for having this repository. It has been invaluable as I work through crafting interpreters with rust.
In your definition for bytecode you allow some enums to have values, but this will cause all the enums to have that size:
https://github.com/jonnyboyC/spacelox/blob/0808b0138e399539d28e30feef9baef0cc5c34e4/src/chunk.rs#L46
Was this intentional?
Currently the language has no mechanism to do conditionals as expressions which as I test out actually developing in it quickly becomes a pain. The only real mechanism right now is something similar to IIF in JavaScript or using short circuit logical operators
let myValue = || {
if true {
return 10;
} else {
return 8
}
}();
let myValue = true and 10 or 8;
Obviously this a lot of code compounded by the additional lack of implicit return. On top of that we pay for the function call which is probably 2 or 3 times more expensive than the branch.
I haven't decided which way I'd like to go but likely I'd go with how Js or rust does conditional
const myValue = true ? 10 : 8;
let myValue = if true { 10 } else { 8 };
The javascript / C style is obviously very compact with the downside of not being able to use statements. The rust style paired with implicit return is were I currently lean. Mostly because I need to actually get block statements working so I've already committed to a similar instruction
Right now we don't have any type checking in Laythe. So far this has been fine but we have a synergy now with another problem. Specifically we don't have a good way for a user to subclass a "native" class. Today our native classes are built in the rust code but because of that our regular compiler code doesn't know which fields or method are available on the class. Today we use the ClassAttributes
structs to track fields but these are only present in user provided code.
In order to solve this we could produce a minimal type checker that really only handles classes and primitives. Once we have this in place we can build a type for each native function, constant and class. In the user provided code we could then use this type signature to extend
While working on advent of code 2021 I've noticed there is a quite a bit of boilerplate with both .iter()
and .into(<Class>.collect)
. Essentially converting a container too and from an iterator. I'd like to consider add some sort of sugar make this explicit still but shorter.
let a = [1,2, 3]$.map(|x| x).toList()
I keep running into issue where new GC bugs pop up as I progress through these refactors. In general I'd like to be able to run cargo test
in ci with the gc_stress
on to gc on every allocation to minimize the chance of introducing one of these bugs.
Overall I think to achieve this I need to do something along the lines of thing
GcHooks
to simplify the interface where we can have a test context that uses our dummy allocatorcargo test
Originally I thinking of following Wren's direction by offer fiber communication through a yield keyword. I think there are some restrictions here the biggest being distributing work among many fibers. I think here I want to go with approach more similar to go where I have another first class value as a channel.
As some potential syntax we could have some like this
fn printFile(chan: channel) -> string {
// I think like go we should have a conveniences
// keyword around channels that effectively reads
// out of the channel until it's closed
for path in drain channel {
// I think we also want the std lib methods to yield during
print(File.readallText(path));
}
}
let paths = [
// a bunch of file paths
]
// I think I'm leaning towards have
// channel creation look like an instance
// creation. I'm not sure if it makes sense
// to use a keyword like chan
let chan = Channel();
for _ in 4.times() {
// I'm between spawn and run. I'd probably
// want to avoid go since that will probably
// be somewhat confusing with go itself
spawn printFile(chan);
}
for path in paths {
// push value into channel
// some fiber will read the value
chan <- path
}
// close the channel once we're done
chan.close();
I think there are a few new pieces of syntax that may need to be introduced.
// push a value into the channel
chan <- value
// read a value from the channel
let value <- chan
// drain keyword to treat a channel like an iterator
for value in drain chan {
}
// I think something like go's select will also be a good idea
// I think since I'd ideally like to have a match statement similar to rust
// I'll have this select somewhat mirror that
// here is the non blocking case
select {
chan1(msg) => {
print('msg')
}
chan2(msg) => {
print('msg')
}
_ => {
print('example');
}
}
// here is the blocking case
select {
chan1(msg) => {
print('msg')
}
chan2(msg) => {
print('msg')
}
}
Some methods that may make sense on channels
// I think we can optionally take a parameter
// to the channel constructor to indicate if it should be buffered
// I think as an mvp we only have unbuffered channels
let chan = Channel(5);
chan.isClose();
// boolean value is this channel closed
chan.bufferLen()
// how many values will this channel buffer
// restrict the interface of the channel to be
// read or write only
// I think under the hood this should only change the
// class of the channel but otherwise should be the same data
let writeOnlyChan = chan.writeOnly();
let readOnlyChan = chan.readOnly();
I'll need to gather some thoughts on how actually implement this.
Currently the follow script will report an incorrect runtime error
class A {
foo(x) { x / 2 }
}
let iter = [1, 2, 3, 4].iter().map(A().foo);
assertEq(iter.next(), true);
// RuntimeError: Iter is not callable.
The stack for the script is getting incorrectly setup.
The following function will case and error in the interpreter.
// boom
let numbers = '1 2 4 25 103'.split(' ').map(Number.parse).into(List.collect);
class Test {
static addOne(x) {
return x + '1'
}
}
// this is fine
let addedOne = '1 2 4 25 103'.split(' ').map(Test.addOne).into(List.collect);
It appears this is an issue with static method when they are placed inside a Method
object
With #99 we added our initial implementation of user level modules. The biggest doesn't side is we're currently restricted to just the directory. The next iteration of user modules should allow importing files of arbitrary nesting.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.