mochivm's Introduction

MochiVM

Stack-based, some-batteries-included virtual machine supporting delimited continuations via effect handlers

mochivm's People

Contributors

robertkleffner

mochivm's Issues

Better build scripts

The current Makefile process leaves some things to be desired.

  • Optionally build tests
  • Optionally build some dependencies
  • Pass some #defines via compiler parameters

Add permissions functionality to vm

Programmers should be able to request permissions and jump to offsets based on whether a permission is enabled.

TODO: Should the set of permissions be cross-platform?

  • PERM_QUERY
  • PERM_REQUEST
  • PERM_REVOKE
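
As a rough illustration of how these three instructions might behave, here is a minimal sketch that models permissions as bits in a 32-bit set. The PermState struct, the 'grantable' host policy mask, and every function name are assumptions made for this example, not part of MochiVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-VM permission state: one bit per permission id. */
typedef struct {
    uint32_t enabled;   /* permissions currently held by the program */
    uint32_t grantable; /* permissions the host is willing to grant (assumed policy) */
} PermState;

/* PERM_QUERY: is the permission currently enabled? */
bool permQuery(const PermState* p, unsigned id) {
    assert(id < 32);
    return ((p->enabled >> id) & 1u) != 0;
}

/* PERM_REQUEST: enable the permission if the host allows it; report the result. */
bool permRequest(PermState* p, unsigned id) {
    assert(id < 32);
    p->enabled |= p->grantable & (UINT32_C(1) << id);
    return permQuery(p, id);
}

/* PERM_REVOKE: drop the permission unconditionally. */
void permRevoke(PermState* p, unsigned id) {
    assert(id < 32);
    p->enabled &= ~(UINT32_C(1) << id);
}

/* Pick the next instruction index from two offsets, depending on the permission. */
size_t permBranch(const PermState* p, unsigned id, size_t ip,
                  size_t ifEnabled, size_t ifDisabled) {
    return ip + (permQuery(p, id) ? ifEnabled : ifDisabled);
}
```

A fixed-width bit set keeps queries cheap, but it caps the number of permissions at 32 and leaves the cross-platform question above open.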

Add option to disable runtime object tagging

Since Boba is a statically typed language, we generally don't need the object type tag on every single object.

However, it is very useful during VM development to have runtime type information enabled. So we should place runtime type information behind a compile tag, which will remove the field from the struct. A few key areas may need to change as well, such as printing and asserting object type.
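
One way this could look is sketched below. The flag name MOCHIVM_RUNTIME_TYPE_TAGS, the helper macros, and the reduced Obj layout are all assumptions for illustration; the real struct has different fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag: build with -DMOCHIVM_RUNTIME_TYPE_TAGS=0 to strip the tag. */
#ifndef MOCHIVM_RUNTIME_TYPE_TAGS
#define MOCHIVM_RUNTIME_TYPE_TAGS 1
#endif

typedef enum { OBJ_LIST, OBJ_CLOSURE } ObjType; /* illustrative subset */

typedef struct Obj {
#if MOCHIVM_RUNTIME_TYPE_TAGS
    ObjType type;       /* present only when runtime tagging is enabled */
#endif
    struct Obj* next;   /* stand-in for the other header fields */
} Obj;

/* Call sites that set or check the tag compile to nothing when it is disabled. */
#if MOCHIVM_RUNTIME_TYPE_TAGS
#define OBJ_INIT_TYPE(obj, t)          ((obj)->type = (t))
#define ASSERT_OBJ_TYPE(obj, expected) assert((obj)->type == (expected))
#else
#define OBJ_INIT_TYPE(obj, t)          ((void)0)
#define ASSERT_OBJ_TYPE(obj, expected) ((void)0)
#endif
```

Routing tag accesses through macros like these would keep the printing and assertion code compiling in both configurations.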

Primitive linked list type

This should be a new descendant of Obj, providing fast access to the head and the foot of a list. Also implement instructions to create and modify list values.

Move instruction tracing out of interpreter loop

There's a big block of text in vm.c that does some debug instruction tracing. All that stuff should get its own function, which is then called from that #if inside the interpreter function.
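
A minimal sketch of the intended shape, assuming a flag named MOCHIVM_DEBUG_TRACE and a much simplified interpreter step (both hypothetical; the real loop in vm.c tracks far more state):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#ifndef MOCHIVM_DEBUG_TRACE /* hypothetical flag name */
#define MOCHIVM_DEBUG_TRACE 1
#endif

/* All tracing output lives here; returns the number of characters written. */
int traceInstruction(FILE* out, const uint8_t* code, size_t offset) {
    assert(out != NULL && code != NULL);
    return fprintf(out, "%04zu  op=0x%02x\n", offset, code[offset]);
}

/* Simplified interpreter step: the #if now contains a single call. */
int interpretOne(const uint8_t* code, size_t* ip) {
#if MOCHIVM_DEBUG_TRACE
    traceInstruction(stderr, code, *ip);
#endif
    uint8_t instruction = code[(*ip)++];
    return instruction; /* a real loop would dispatch on the opcode here */
}
```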

Constructor objects and functionality

To support basic algebraic data type functionality, we need another custom object type and a few instructions for constructing, extracting, and 'pattern-matching' on those custom objects.

  • CONSTRUCT
  • DESTRUCT
  • IS_STRUCT
  • OFFSET_STRUCT
  • JUMP_STRUCT
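
To make the first three instructions more concrete, here is a small sketch of a constructor object with an identifier and inline fields. The ObjStruct layout, the StructId type, the function names, and the use of double as a stand-in for Value are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef double Value;      /* stand-in for the VM's Value type */
typedef uint32_t StructId; /* hypothetical constructor identifier */

typedef struct {
    StructId id;    /* which constructor built this object */
    size_t count;   /* number of fields */
    Value fields[]; /* fields stored inline */
} ObjStruct;

/* CONSTRUCT: copy `count` values (e.g. popped from the stack) into a new object. */
ObjStruct* structConstruct(StructId id, const Value* values, size_t count) {
    ObjStruct* obj = malloc(sizeof(ObjStruct) + count * sizeof(Value));
    if (obj == NULL) return NULL;
    obj->id = id;
    obj->count = count;
    if (count > 0) memcpy(obj->fields, values, count * sizeof(Value));
    return obj;
}

/* IS_STRUCT: does the object carry the expected constructor id? */
bool structIs(const ObjStruct* obj, StructId id) {
    assert(obj != NULL);
    return obj->id == id;
}

/* DESTRUCT: copy the fields back out (e.g. onto the stack); returns the field count. */
size_t structDestruct(const ObjStruct* obj, Value* out) {
    assert(obj != NULL);
    if (obj->count > 0) memcpy(out, obj->fields, obj->count * sizeof(Value));
    return obj->count;
}
```

OFFSET_STRUCT and JUMP_STRUCT would presumably combine the structIs check with a relative or absolute jump, which is what gives the 'pattern-matching' behaviour.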

Implement offset instruction

Straightforward relative-location instruction pointer adjustment. Similar to tail-call, but the new location will be decided relative to the current instruction pointer, not the global start of the byte code.
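
The following sketch shows the relative adjustment under two assumptions that the issue does not specify: the operand is a signed 32-bit big-endian value, and the offset is applied to the instruction pointer after the operand has been read. Function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a signed 32-bit big-endian operand (the encoding is an assumption). */
int32_t readOffsetOperand(const uint8_t* bytes) {
    uint32_t raw = ((uint32_t)bytes[0] << 24) | ((uint32_t)bytes[1] << 16) |
                   ((uint32_t)bytes[2] << 8)  |  (uint32_t)bytes[3];
    /* Convert without relying on implementation-defined narrowing. */
    if (raw <= (uint32_t)INT32_MAX) return (int32_t)raw;
    return (int32_t)(raw - UINT32_C(0x80000000)) + INT32_MIN;
}

/* OFFSET: new location is relative to the current instruction pointer.
   `operandIndex` is the index of the first operand byte. */
size_t applyOffset(const uint8_t* code, size_t codeLen, size_t operandIndex) {
    int32_t rel = readOffsetOperand(code + operandIndex);
    int64_t target = (int64_t)(operandIndex + 4) + rel;
    assert(target >= 0 && (uint64_t)target <= codeLen);
    return (size_t)target;
}
```

A tail call would instead use the decoded operand directly as an index from the start of the byte code.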

Each thread should have its own LibUV event loop

Currently only the default event loop is used throughout the entire VM. This may not be desirable, so we should at least provide the option for each thread 'fiber' to have its own event loop, which is initialized when the fiber starts and destroyed when the fiber completes/dies.

Does this come with improvements, or is one loop good enough?

Implement hash tables

Maybe these can support both the built-in dictionary type and the built-in scoped record type.

Implement shared mutable store (heap)

Need a new object type ObjRef, which will contain a pointer to another object.

Need a VM-level collection of ObjRef to represent the shared (between threads) state.

Three instructions to manipulate the store:

  • NEW_REF
  • PUT_REF
  • GET_REF
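
A minimal sketch of such a store, assuming references are addressed by index into a growable VM-level array. The RefStore type and function names are hypothetical, and the sketch ignores the synchronization that sharing between threads would require.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Obj { int payload; } Obj; /* stand-in for the VM's Obj */

typedef struct { Obj* target; } ObjRef;  /* a pointer to another object */

typedef struct {
    ObjRef* refs;
    size_t count;
    size_t capacity;
} RefStore; /* hypothetical VM-level collection */

/* NEW_REF: add a reference and return its index, or (size_t)-1 on allocation failure. */
size_t refStoreNew(RefStore* s, Obj* initial) {
    if (s->count == s->capacity) {
        size_t cap = s->capacity ? s->capacity * 2 : 8;
        ObjRef* grown = realloc(s->refs, cap * sizeof *grown);
        if (grown == NULL) return (size_t)-1;
        s->refs = grown;
        s->capacity = cap;
    }
    s->refs[s->count].target = initial;
    return s->count++;
}

/* PUT_REF: overwrite the object a reference points to. */
void refStorePut(RefStore* s, size_t id, Obj* value) {
    assert(id < s->count);
    s->refs[id].target = value;
}

/* GET_REF: read the object a reference points to. */
Obj* refStoreGet(const RefStore* s, size_t id) {
    assert(id < s->count);
    return s->refs[id].target;
}

void refStoreFree(RefStore* s) {
    free(s->refs);
    s->refs = NULL;
    s->count = s->capacity = 0;
}
```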

Ability to load and run from bytecode file

Requirements

  • Byte code filename as parameter to executable binary from CLI
  • In debug, full listing can be optionally dumped for inspection
  • Most code not reliant on byte code filename, so we can continue to supply arbitrary byte code in tests

Tests for closures

Basic first-order function tests. Verify that closures can be created, called, and can close over other values in the frame stack.

Async operation tests with LibUV battery

Make sure the VM can handle IO-bound evented calls properly. This is also a good stress test for handlers and for calling MochiVM closures (or at least setting up the calls) from C code.

Rename 'mark frames' to 'handle frames'

Marking as a concept fits better in the GC. This will also allow us to rename the markId field in the mark frame struct as handleId or something that explicitly denotes the association with 'effect handling' contexts.

Set up GitHub Actions

Actions I'm currently interested in:

  • Security scanning
  • Code quality scanning
  • Automatic formatting
  • Automatic test runs
  • Automatic builds

Add C unit testing framework

Would be nice to have tests that don't reside in/are not driven by main.c. It is likely that the completion of this issue would allow the current hand-made testing framework to be deprecated and removed.

Primitive array and slice types

Arrays will be constant-time indexable, and provide operations for mutations. In Boba, these operations will probably come with linear types.

Slices will be lightweight index-offset 'windows' into a subsequence of an array. Their operations will be similar to those provided for arrays.
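
As a sketch of the 'window' idea, a slice could hold a pointer to its source array plus a start index and a length. The struct layouts, function names, and the use of double for Value below are assumptions, not the VM's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef double Value; /* stand-in for the VM's Value type */

typedef struct {
    Value* elems;
    size_t count;
} ObjArray;

typedef struct {
    ObjArray* source; /* array being viewed; no elements are copied */
    size_t start;     /* index of the first visible element */
    size_t count;     /* number of visible elements */
} ObjSlice;

/* Create a window [start, start + count) over `source`; fails if out of range. */
bool sliceInit(ObjSlice* out, ObjArray* source, size_t start, size_t count) {
    if (start > source->count || count > source->count - start) return false;
    out->source = source;
    out->start = start;
    out->count = count;
    return true;
}

/* Constant-time indexing relative to the start of the window. */
Value sliceGet(const ObjSlice* slice, size_t index) {
    assert(index < slice->count);
    return slice->source->elems[slice->start + index];
}
```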

Add labels to disassembly

Pre-req is probably #12 so we can re-use that data structure to store, manage, and query label -> bytecode index maps.

The label set should always be initialized to a valid hash table, but should default to empty.

This would help make CALL, TAILCALL, CLOSURE, and other instructions that use direct locations more readable in the disassembly.

Each instruction that might reference a direct location needs to handle the case where the label set has no entry for that location.

Add a CONTRIBUTING file

This would be helpful to have sooner rather than later, and should definitely be added before announcing the project on any socials/forums or IRL.

See GitHub docs about this as a starting point.

Compact byte array object

We have a special object type for potentially polymorphic Arrays that just store the base Value type. While this object is highly re-usable, it is not at all space efficient for byte arrays.

Can we just re-use the string object for this? Or should the string object become a byte array object with some extra logic surrounding it in runtime operations?

Reduce duplication in action/continuation call instruction

Candidates that have a lot of overlap:

  • ESCAPE + REACT (can probably be implemented by one if branch with a few lines)
  • CALL_CONTINUATION + TAILCALL_CONTINUATION (basically tail call just pops the previous frame and has a different return pointer)

For ESCAPE and REACT, it may be better to unify them into one function. Maybe we could introduce a three-mode execution style for handlers, doing specific things for no-resume, single-shot, and multi-resume capabilities. The default here would assume multi-resume, but compiler writers could choose to mark action handlers as no-resume or single-shot for some performance gains.

Support UTF8 strings and Unicode code points

Large development effort here, but probably nothing that hasn't been done before elsewhere. Probably reasonable to stand on the shoulders of giants for this one.

Consider utf8proc for implementation or at least inspiration, as it is managed by the folks at JuliaLang and likely to be a good pattern or candidate for other managed-memory languages.

Finish pointer tagging representation

Then we'll have three basic data representations to use for benchmarking. Now that NaN tagging has been fleshed out to support some values-as-objects, this should be straightforward.

Add in-place instructions

Some operations don't need to allocate an immutable clone if the type is unique. Adding these core operations will allow the programmer to take advantage of efficiency gains provided by guaranteed uniqueness.

  • RECORD_EXTEND_INPLACE
  • RECORD_UPDATE_INPLACE

Tests for handlers

Lots of tests verifying that handlers and continuations work as expected. Need to cover non-resuming, single-shot, and multi-resuming actions, after-closures, action parameters, handlers in actions, handler nesting, handler injecting.

Implement LibUV 'battery'

Basically a nice, easy-to-integrate wrapper for various async LibUV functions that also integrates the LibUV event loop into the runtime.

Should implement #13 concurrently or before this issue.

Documentation for LibUV.

  • File system functions & types
  • Networking functions & types
  • Thread functions & types
  • Process functions & types
  • Timer functions & types
  • Runtime library loading

Add root stack to fibers instead of VM

Currently the 'temporary GC root' feature is not thread safe since it is attached to the overarching VM data structure rather than to each 'fiber', some of which will eventually correspond with threads. This means that one fiber can push a root, and another can pop that root when it was expecting to pop the root below, due to non-deterministic order of execution.

Each fiber should have a small root collection, rather than the VM struct having a single root collection.
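
A minimal sketch of a per-fiber root collection, assuming a small fixed-size stack. The Fiber struct, the limit of 8 roots, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define FIBER_MAX_TEMP_ROOTS 8 /* hypothetical limit */

typedef struct Obj { int unused; } Obj; /* stand-in for the VM's Obj */

typedef struct {
    Obj* tempRoots[FIBER_MAX_TEMP_ROOTS]; /* roots owned by this fiber only */
    int numTempRoots;
} Fiber;

void fiberInitRoots(Fiber* fiber) {
    fiber->numTempRoots = 0;
}

/* Protect `obj` from collection until the matching pop on the same fiber. */
void fiberPushRoot(Fiber* fiber, Obj* obj) {
    assert(fiber->numTempRoots < FIBER_MAX_TEMP_ROOTS);
    fiber->tempRoots[fiber->numTempRoots++] = obj;
}

/* Pops the most recent root pushed by this fiber, regardless of other fibers. */
Obj* fiberPopRoot(Fiber* fiber) {
    assert(fiber->numTempRoots > 0);
    return fiber->tempRoots[--fiber->numTempRoots];
}
```

Because each fiber only touches its own array, interleaved pushes and pops from different fibers can no longer disturb one another; the GC would then need to walk every fiber's roots when marking.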

Implement generic shuffle instruction

First argument - number of items to pop
Second argument - number N of items to push
Next N arguments - index of popped item to push

Example:

shuffle ab-ba = swap = SHUFFLE 2 2 1 0
shuffle a-aa = dup = SHUFFLE 1 2 0 0
shuffle ab- = drop2 = SHUFFLE 2 0
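
The examples above can be reproduced with a sketch like the following. It assumes index 0 refers to the deepest popped item (which is the reading that makes SHUFFLE 2 2 1 0 a swap), one byte per operand, and a caller that guarantees enough stack capacity; the function name and the limit of 16 popped items are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHUFFLE_MAX_POP 16 /* hypothetical limit on popped items */

typedef double Value; /* stand-in for the VM's Value type */

/* operands = { popCount, pushCount, index_0, ..., index_{pushCount-1} }.
   Returns the new stack height. */
size_t shuffle(Value* stack, size_t height, const uint8_t* operands) {
    uint8_t popCount = operands[0];
    uint8_t pushCount = operands[1];
    assert(popCount <= SHUFFLE_MAX_POP && popCount <= height);

    /* Save the popped items so pushes can reference them in any order. */
    Value popped[SHUFFLE_MAX_POP];
    size_t base = height - popCount;
    memcpy(popped, stack + base, popCount * sizeof(Value));

    for (uint8_t i = 0; i < pushCount; i++) {
        uint8_t index = operands[2 + i];
        assert(index < popCount);
        stack[base + i] = popped[index];
    }
    return base + pushCount;
}
```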

Add tuple support

Can we re-use some existing objects or instructions?

  • GATHER
  • SPREAD

Improve README

As the entry point of the project for most developers (coming in through GitHub especially), the README should be informative, good-looking, and inspire confidence and excitement to try out the Mochi VM.

  • Maybe a picture of a mascot or symbol for branding.
  • Links to relevant documentation and supporting materials.
  • Current status of project and basic build/getting started instructions.

Implement all Value numerics

Tedious but adds a lot of usefulness to the VM. Helpful to read Laurence Tratt's Static Integer Types for concerns and context. The requirements of this issue may change after reading it more closely, and determining just how many fixed-size values Boba should support. Will also take a look at other VMs to see how they do this in their byte codes.

Current plan:

  • U8
  • U32
  • I32
  • U64
  • I64
  • ISize
  • USize
  • Half
  • Single
  • Double

Possible:

  • I8
  • I16
  • U16
  • Complex
