
thakeenathees / pocketlang

A lightweight, fast embeddable scripting language.

Home Page: https://thakeenathees.github.io/pocketlang/

License: MIT License

Languages: C 94.21%, Python 3.82%, Batchfile 0.82%, Lua 0.49%, JavaScript 0.28%, Makefile 0.21%, Ruby 0.16%

Topics: bytecode-compiler, c, functional, interpreter, language, programming-language, scripting-language, vm

pocketlang's People

Contributors

alexcpatel, andrea321123, billy4479, comonadd, ekinbarut, lukeed, naurelius, ntnlabs, rhnsharma, takashiidobe, thakeenathees, tiagocavalcante, timgates42, timwi, tsujp, xsavitar

pocketlang's Issues

^D starts an endless loop

I wanted to exit pocket with a ^D in the terminal. Instead pocket printed out >>> in an endless loop:

$ pocket
PocketLang 0.1.0 (https://github.com/ThakeeNathees/pocketlang/)
Copyright(c) 2020 - 2021 ThakeeNathees.
Free and open source software under the terms of the MIT license.

>>> ^D
>>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>>^C
$ 

This is commit 2b4d3af

$ uname -omp
x86_64 unknown GNU/Linux

Bitwise operators implementations

TODO: bitwise operators

The bitwise operators (except for &) aren't implemented yet. Evaluating an expression like a | b crashes with a TODO: message, so they need to be implemented.

Here is a simplified example of how a | b is compiled into pocketlang bytecode

PUSH a  ; push value 'a' on the stack
PUSH b  ; push value 'b' on the stack
BIT_OR  ; pop the top 2 values on the stack, perform bitwise or and push the result.

And here is a simplified "pseudo-code" of the pocketlang VM

void runVM() {

  while (!finished) {
    instruction = nextInstruction();

    switch (instruction) {
      ...
      case PUSH:
        push(value()); // value a and b
        break;

      case BIT_AND: {
        Var b = pop(); // from stack.
        Var a = pop(); // from stack.
        Var result = varBitAnd(a, b); // a & b;
        push(result);
        break;
      }

      case BIT_OR:
      case BIT_XOR:
        TODO;
    }
  }
}

Which you can find here

How to implement

  • Use BIT_AND (here) as a reference to implement BIT_OR, BIT_XOR, ..
  • Use varBitAnd (here) as a reference to implement varBitOr, varBitXor.

Finally

Once you've implemented them, write some tests at tests/lang/basics.pk
(note that pocketlang currently doesn't support binary or hex literals)

assert( 1 |  2 ==  3)
assert(10 | 12 == 14)
assert(-1 | 10 == -1)

lua is so fast

my lua version is 5.3.4

# python3 benchmarks.py                       
----------------------------------
 CHECKING FOR INTERPRETERS 
----------------------------------
Searching for pocketlang -- found
Searching for wren       -- not found (skipped)
Searching for python     -- found
Searching for ruby       -- not found (skipped)
Searching for lua        -- found
Searching for javascript -- not found (skipped)
----------------------------------
 factors 
----------------------------------
pocketlang : 4.000000s
python     : 5.265634s
lua        : 0.600000s
----------------------------------
 fib 
----------------------------------
pocketlang : 3.040000s
python     : 3.029289s
lua        : 0.950000s
----------------------------------
 list 
----------------------------------
pocketlang : 2.880000s
python     : 2.594415s
lua        : 0.790000s
----------------------------------
 loop 
----------------------------------
pocketlang : 1.230000s
python     : 2.204718s
lua        : 0.390000s
----------------------------------
 primes 
----------------------------------
pocketlang : 3.740000s
python     : 2.913168s
lua        : 0.550000s

More math module functions implementations

TODO: add cos(), tan(), asin(), ...

Currently, we have a bunch of functions in the math module (here) yet we're still missing some essential functions (cos, tan, log, etc).

All native (C) functions are bound to a module through the void (*pkNativeFn)(PKVM* vm) function pointer, which is then called by the opcode OP_CALL.

How to implement

  • The implementation is straightforward. Use the sin() function implementation as a reference for cos and tan.
  • Register the function here; see how the sine function is registered there (just above the line) for reference.

Finally

And don't forget to write some tests at tests/lang/core.pk.

from math import PI, sin, cos, tan

assert(sin(0) == 0)
assert(sin(PI/2) == 1)
assert(cos(PI/3) == 0.5) ## TODO: ".5" is not valid number YET

for i in 0..1000
  assert(sin(i) / cos(i) - tan(i) < 0.00001)
end

Allow the position range (in `str_sub()`) to be 0 <= pos <= str->length

Followup

Sorry, I missed this in the last review. We should allow the position range to be 0 <= pos <= str->length.
When pos == str->length the user can still get a substring of length zero.

if (pos < 0 || str->length < pos) RET_ERR(...);

The position should also be added in the length check: if the substring starting at pos extends past the end of the string, it's an error (see below).

if (str->length < pos + len) RET_ERR(...);
pos = 5
len = 7
                 v- str->length
"f o o b a r b a z"
           ^-----------^
           pos         pos+len
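The two checks above can be combined into one standalone helper. This is only a sketch: subRangeIsValid() is a hypothetical name, it works on a plain C string instead of pocketlang's String object, and rejecting a negative len is an assumption that the issue doesn't state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

// Returns true if the range [pos, pos + len) lies inside `str`.
// Hypothetical helper for illustration; not part of pocketlang.
bool subRangeIsValid(const char* str, int64_t pos, int64_t len) {
  assert(str != NULL);
  int64_t length = (int64_t)strlen(str);

  // 0 <= pos <= length, so pos == length is allowed (empty substring).
  if (pos < 0 || length < pos) return false;

  // Assumption: a negative length is always an error.
  if (len < 0) return false;

  // Same test as `length < pos + len`, written so it cannot overflow.
  if (length - pos < len) return false;

  return true;
}
```

With the string from the diagram, subRangeIsValid("foobarbaz", 5, 7) is false because pos + len is 12 while the length is 9, and subRangeIsValid("foobarbaz", 9, 0) is true.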

TODO:
I'm not expecting it from this PR, but it's worth mentioning.
Make the length argument optional so that users can get the substring from pos to the rest of the string.

[RfC] Extract FIXME/TODO from codebase that's beginner friendly

@ThakeeNathees, could you curate a couple of beginner-friendly tasks from the codebase's FIXME or TODO sections? That way I can start making code contributions in a direction that makes sense rather than just attempting some changes.

From your point of view, a couple of issues that I can start with would be awesome. Let me know if this is something you'd like to do so I can start helping. Thank you!

native api requests

Hello,

Please finish the native interface:

  • _vecAdd() function in tests/native/example2.c (native object from slot)
  • example how to call pk function from c/c++ with parameters (pkRunString() is not enough)
  • list api

thank you

[ref] Lexing

The pocketlang compiler will read the source (a string) and generate bytecode (an array of bytes) which can be interpreted by the pocket VM.

The compilation is a 2 step process

  • Lexing: Make tokens from the source string (see below)
  • Parsing: Generate parse tree (in multi-pass compilers) or bytecode (in single pass compilers) according to the language grammar (this will be updated in a new ref issue).

Pocketlang's compiler is single-pass, but for simplicity we'll look at lexing and parsing separately, even though we parse immediately after a token is lexed.

Tokens

Tokens are the result of the lexing process. A token has a type (TokenType) and optionally a value. Here is the simplified pocketlang token type and token declaration. (the source reference).

enum TokenType {
  TK_LPARAN, // '('   -- no token value
  TK_RPARAN, // ')'   -- no token value
  TK_NAME,   // foo   -- token value = name of the identifier
  TK_STRING, // "foo" -- token value = the string literal
  ...
};
struct Token {
  TokenType type;
  Var value;
};

Lexing

The compiler will read the source string (sequence of characters) and make a sequence of corresponding tokens. These tokens then will be used to generate bytecode by the compiler.

The tokens can be classified into

  • separators - ( ) { } [] . .. , ;
  • operators - + - * / = & | ^
  • literals - 42 false "foo" 0b10110
  • keywords - def import if break
  • names - foo bar print input

Except for keywords, any identifier is tokenized as a name token (TK_NAME); the parser then determines whether it's a built-in function, a variable, an imported module, etc. Names that aren't defined are semantic errors reported by the parser. The lexer only cares whether it can make a valid token out of the input. Here are some lexing errors.

  • Non terminated string x = "foo
  • Invalid literals 0b123456abc
  • And every other place where the lexError() function is called.

Each different classification of token types will be tokenized by various lexer helper functions (source). Those functions are encapsulated by lexToken(Compiler* compiler) function. (source)
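To make the process concrete, here is a minimal, self-contained lexer sketch that turns a source string into tokens one at a time. It is not pocketlang's lexer: the TokType enum, the Tok struct and lex_next() are hypothetical names, and it only knows parentheses, names and decimal integers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Hypothetical token types for this sketch.
typedef enum {
  TOK_LPAREN, TOK_RPAREN, TOK_NAME, TOK_NUMBER, TOK_ERROR, TOK_EOF
} TokType;

typedef struct {
  TokType type;
  const char* start; // First character of the token in the source.
  size_t length;     // Number of characters in the token.
} Tok;

// Lexes one token starting at *cursor and advances the cursor past it.
Tok lex_next(const char** cursor) {
  assert(cursor != NULL && *cursor != NULL);

  const char* c = *cursor;
  while (*c == ' ' || *c == '\t' || *c == '\n') c++; // Skip whitespace.

  Tok tok = { TOK_EOF, c, 0 };
  if (*c == '\0') {
    *cursor = c;
    return tok;
  }

  if (*c == '(' || *c == ')') {
    tok.type = (*c == '(') ? TOK_LPAREN : TOK_RPAREN;
    c++;
  } else if (isalpha((unsigned char)*c) || *c == '_') {
    // Keywords would be separated from names here with a table lookup.
    tok.type = TOK_NAME;
    while (isalnum((unsigned char)*c) || *c == '_') c++;
  } else if (isdigit((unsigned char)*c)) {
    tok.type = TOK_NUMBER;
    while (isdigit((unsigned char)*c)) c++;
  } else {
    tok.type = TOK_ERROR; // The lexer cannot make a token out of this.
    c++;
  }

  tok.length = (size_t)(c - tok.start);
  *cursor = c;
  return tok;
}
```

Calling lex_next() repeatedly on "print(42)" yields a name, a left parenthesis, a number, a right parenthesis and then the end-of-input token. Whether print is actually defined is not the lexer's problem, which matches the split between lexing errors and semantic errors described above.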

Use this thread to discuss the lexing process of pocketlang

Map contains check (elem in map) implementation

We have the in test (i.e. elem in container) since commit 64bf276, but it currently only supports lists. The map implementation would be fairly simple.

How to implement

  • You can find the TODO here
  • Use the mapGet() function to check if the element exists.
  • Return true if it exists.

Finally

Don't forget to write some tests at tests/lang/basics.pk

assert('key' in {'key':'value'})
assert(!('foo' in {'bar':'baz'}))

UTF-8 LEXER SUPPORT

Hi,

I want to add a russian syntax but I am getting the following error.

What path should I follow?

Add |= and ^= operator tokens (use &= as a reference)

TODO:

Currently, we're missing the &= |= and ^= tokens, and using them causes a compile-time error.
The expression a &= b will be compiled as a = a & b and we already have the functionality for that. Adding the tokens for the operators is sufficient.

How to implement

  • I've made a walkthrough on how to do it in #69; use it as a reference.
  • See its diff and complete the TODOs.

Finally

Don't forget to add some tests (at basics.pk).

Numeric literal starts with decimal point (ex: .5)

TODO:

A numeric literal like .5 isn't valid in pocketlang yet; it's still a TODO. Currently, when the lexer sees a dot (.), it makes a TK_DOT token, or a TK_DOTDOT token if the next character is also a dot (here). To implement this, if the next character is a digit ([0-9], use the utilIsDigit() function), the lexer should make a TK_NUMBER token using the eatNumber() function.

Read #93 to learn how the lexer works in pocketlang.

How to implement

  • You can find the todo here

  • Check if the next char is a digit and consume a number if it is.

     case '.':
-       setNextTwoCharToken(compiler, '.', TK_DOT, TK_DOTDOT);
+       if (matchChar(compiler, '.')) {
+         setNextToken(compiler, TK_DOTDOT);  // '..'
+       } else if (utilIsDigit(peekChar(compiler))) {
+         eatChar(compiler); // Consume the decimal point.
+         eatNumber(compiler);    // '.5'
+       } else {
+         setNextToken(compiler, TK_DOT);    // '.'
+       }
       return;
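The three-way decision in the diff above can be tried in isolation with the sketch below. DotKind and classify_dot() are hypothetical names used only here; the real change goes through the compiler helpers shown in the diff.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Hypothetical token kinds for this sketch.
typedef enum { DOT_TOKEN, DOTDOT_TOKEN, NUMBER_TOKEN } DotKind;

// Classifies the token that starts at a '.' character and stores the
// number of characters the token spans in *length.
DotKind classify_dot(const char* c, int* length) {
  assert(c != NULL && length != NULL && c[0] == '.');

  if (c[1] == '.') { // '..'
    *length = 2;
    return DOTDOT_TOKEN;
  }

  if (isdigit((unsigned char)c[1])) { // '.5'
    int n = 1; // Start after the decimal point.
    while (isdigit((unsigned char)c[n])) n++;
    *length = n;
    return NUMBER_TOKEN;
  }

  *length = 1; // A plain '.'
  return DOT_TOKEN;
}
```

The '..' check comes first, as in the diff, so the range operator never gets mistaken for the start of a number.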

In addition (Optional)

  • We use strtod to convert the string to a double, which is locale-dependent, but we should always use the dot '.' as the decimal separator. Remove strtod and write our own version of it. See here to learn how we currently tokenize binary and hex literals with our own algorithm.
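A locale-independent replacement could look roughly like the sketch below. parse_decimal() is a hypothetical name, and the sketch only handles plain unsigned decimals: it has no exponent support and loses precision on very long literals, so it is a starting point and not a drop-in strtod.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Parses an unsigned decimal literal such as "42", "3.25" or ".5".
// Always treats '.' as the decimal separator, whatever the C locale is.
// Returns the number of characters consumed, or 0 if no digits were found.
int parse_decimal(const char* text, double* out) {
  assert(text != NULL && out != NULL);

  double digits = 0.0; // All digits read so far, as one integer value.
  double scale = 1.0;  // 10 raised to the number of fractional digits.
  int i = 0, digit_count = 0;

  // Integer part.
  for (; isdigit((unsigned char)text[i]); i++, digit_count++) {
    digits = digits * 10.0 + (text[i] - '0');
  }

  // Fractional part, only if a digit follows the dot (so "1..5" stops at 1).
  if (text[i] == '.' && isdigit((unsigned char)text[i + 1])) {
    for (i++; isdigit((unsigned char)text[i]); i++, digit_count++) {
      digits = digits * 10.0 + (text[i] - '0');
      scale *= 10.0;
    }
  }

  if (digit_count == 0) return 0;

  *out = digits / scale;
  return i;
}
```

Dividing once at the end, instead of multiplying each fractional digit by 0.1, keeps the rounding error to a single operation for short literals.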

Finally

Write some tests at tests/lang/basics.pk.

add `exit` keyword

Similar to Ruby, having an exit keyword would be nice. I don't think we should bring over the bang variant exit!, which skips cleanup, though.

I'd be happy to implement this if some pointers (pun?) were given.

[ref] ByteBuffer implementation detail

How a ByteBuffer works in pocketlang

Use this thread to discuss how pocketlang's buffer works

  • First of all, there are 2 implementations of the byte buffer in pocketlang. Both have the same function "interface" and behave the same way, but they differ in how they allocate memory.
    1. The first one is defined in src/pk_buffers.h and uses the VM's allocator for allocations.
    2. The second one is defined in cli/utils.h and uses malloc and realloc for allocations.

A byte buffer is a heap allocated array of bytes (uint8_t). Here is the declaration of it.

typedef struct {
  uint8_t* data;     // Pointer to the heap allocated array.
  uint32_t count;    // Number of elements in the array.
  uint32_t capacity; // The allocated (reserved) size.
} ByteBuffer;

Every time we write a byte into the buffer, the value will be "appended" at the end of the buffer and the count will increase.

              Un initialized memory
                  .---------.
        [ 42 12 65 ? ? ? ? ? ]
The count is 3 --^         ^-- The capacity is 8

If the buffer is full, it'll resize itself to a larger size (by default it doubles its size).

                     Un initialized memory
                        .---------------.
[ 42 12 65 78 10 2 55 68 ? ? ? ? ? ? ? ? ]
      The count is 8 --^               ^-- The capacity is 16

Here are the functions for the byte buffer defined in cli/utils.h

// Initialize a new byte buffer instance.
void byteBufferInit(ByteBuffer* buffer);

// Clears the allocated elements from the VM's realloc function.
void byteBufferClear(ByteBuffer* buffer);

// Ensure the capacity is greater than [size], if not resize.
void byteBufferReserve(ByteBuffer* buffer, size_t size);

// Fill the buffer at the end of it with provided data if the capacity
// isn't enough using VM's realloc function.
void byteBufferFill(ByteBuffer* buffer, uint8_t data, int count);

// Write to the buffer with provided data at the end of the buffer.
void byteBufferWrite(ByteBuffer* buffer, uint8_t data);

// Add all the characters to the buffer, byte buffer can also be used as a
// buffer to write string (like a string stream). Note that this will not
// add a null byte '\0' at the end.
void byteBufferAddString(ByteBuffer* buffer, const char* str, uint32_t length);

These comments explain what each function does; for how it does it, read the comments inside the function implementations in cli/utils.c. If you have any questions, suggestions or improvements, feel free to use this thread to discuss.
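For readers who want to experiment with the growth behaviour without opening the sources, here is a minimal sketch of a malloc/realloc backed byte buffer. The function names mirror the declarations listed above, but the bodies are written for this thread and are not the code in cli/utils.c; the MIN_CAPACITY of 8 is an assumption chosen to match the diagram.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MIN_CAPACITY 8 // Assumed initial capacity, matches the diagram.

typedef struct {
  uint8_t* data;     // Pointer to the heap allocated array.
  uint32_t count;    // Number of elements in the array.
  uint32_t capacity; // The allocated (reserved) size.
} ByteBuffer;

void byteBufferInit(ByteBuffer* buffer) {
  assert(buffer != NULL);
  buffer->data = NULL;
  buffer->count = 0;
  buffer->capacity = 0;
}

void byteBufferClear(ByteBuffer* buffer) {
  assert(buffer != NULL);
  free(buffer->data);
  byteBufferInit(buffer);
}

// Grows the buffer by doubling until it can hold at least [size] bytes.
void byteBufferReserve(ByteBuffer* buffer, size_t size) {
  assert(buffer != NULL);
  if (buffer->capacity >= size) return;

  size_t capacity = buffer->capacity ? buffer->capacity : MIN_CAPACITY;
  while (capacity < size) capacity *= 2;

  uint8_t* data = realloc(buffer->data, capacity);
  if (data == NULL) abort(); // Out of memory; a real program may recover.
  buffer->data = data;
  buffer->capacity = (uint32_t)capacity;
}

// Appends one byte at the end of the buffer, growing it if needed.
void byteBufferWrite(ByteBuffer* buffer, uint8_t data) {
  byteBufferReserve(buffer, (size_t)buffer->count + 1);
  buffer->data[buffer->count++] = data;
}
```

Writing three bytes leaves count at 3 and capacity at 8, and the ninth write doubles the capacity to 16, which is the situation drawn in the two diagrams.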

[Question] What is the intended way to pass variables between C and Pocketlang?

I was playing around with Pocketlang and it looks really cool!
This is my first time using an embedded language and I got stuck when I had to read a global variable from my script and use it in C.

My solution was creating a native function that will have to be called from the script using the variable as argument. Something like this:

a = 10
variableToC(a)

I ran into a similar issue when I had to set a global variable from C. I saw that pkModuleAddGlobal exists but I wasn't able to create a *PkHandle from, for example, an int.
I also was able to solve this using native functions and returning the value I wanted.

Are those the intended ways?
I think it would be cool if there was a more straightforward way, something like *PkHandle pkCreateNumber(PKVM* vm, const char* name, double value) for creating new variables and something like bool pkGetGlobalNumber(PKVM* vm, const char* name, double* value) to get a value from the VM.

I saw here that creating variables is in the TODO list. Are we referring to the same thing?
As I said, this is my first time working with embedded languages so it's possible that I'm doing something wrong.

Add `log` function to pocketlang math module

Our math module is still missing some necessary math functions, and you can contribute by adding more of them. This issue is a reference on how to add the log function to the math module.

How to implement

PR #154 has an example of how to add log10 to the math module, which you can use as a reference to add the log (base 2) function.

Finally

And don't forget to add some tests to tests/lang/core.pk.

List addition implementation

TODO: list addition

Currently, adding 2 lists ([1, 2] + [3, 4]) causes a crash with a TODO message, so this needs to be implemented. (here)

How to implement

List* l1, *l2 are defined above the TODO statement in the source.

  • Create a new list using newList(vm, size) function, where (initial reserved) size is the total length of l1 and l2.
    length of l1 = l1->elements.count
  • Iterate over each list and append its elements to the new list.
for (uint32_t i = 0; i < l1->elements.count; i++) {
  listAppend(vm, list, l1->elements.data[i]);
}
  • Return the list as a var using the VAR_OBJ(list) macro.

Finally

Once you've implemented it, write some tests at tests/lang/

Add start, end attribute to range

Currently there is no way of getting the start and end value of a range object

>>> r = 1..100
[Range:1..100]
>>> r.start
Error: 'Range' object has no attribute named 'start'
  $(SourceBody)() [line:1]
>>>

In the above, it's expected to print 1 for the .start attribute

How to implement

  • Here is the switch statement that matches the attribute name and returns its value.
    case OBJ_RANGE:
    {
      Range* range = (Range*)obj;
      SWITCH_ATTRIB(attrib->data) {

        CASE_ATTRIB("as_list", 0x1562c22) :
          return VAR_OBJ(rangeAsList(vm, range));

        default:
          ERR_NO_ATTRIB(vm, on, attrib);
          return VAR_NULL;
      }

      UNREACHABLE();
    }
  • Add attributes named "start" and "end" (their corresponding values are range->from and range->to). The second argument of the CASE_ATTRIB macro is the hash value of the string, which you can get like this.
>>> from math import hash
>>> print(hex(hash("as_list")))
0x1562c22
>>>

A better phrase to credit Wren

The language is written using Wren Language and their wonderful book Crafting Interpreters as a reference.

The phrase in the readme (above) seems to confuse several people into thinking that pocket is written in Wren. But it's written in pure C99.

We need a better phrase to clarify this and credit Wren.

Doc string for core modules

TODO: add docstring for core modules

Most of our native functions are documented with PK_DOC() macro (example)

PK_DOC(
  "input([msg:var]) -> string\n"
  "Read a line from stdin and returns it without the line ending. Accepting "
  "an optional argument [msg] and prints it before reading.",
static void coreInput(PKVM* vm)) {
   // The implementation.
}

which can then be used to get the help text of the function. (Note that help isn't supported yet.)

>>> x = input
[Func:input]
>>> help(x)
input([msg:var]) -> string
Read a line from stdin and returns it without the line ending.
Accepting an optional argument [msg] and prints it before reading.
>>>

How to document

  • Add the PK_DOC macro to functions that aren't documented yet (here).

  • An example:

static void foo(PKVM* vm) {
  bar();
}
PK_DOC(
  "foo(i:num) -> list\n"
  "The description of the function, it's return value and parameters "
  "what it does and what the user has to consider when calling.",
static void foo(PKVM* vm)) {
  bar();
}

Seamless compile-time FFI

Any plans to have first-class support for compile-time FFI (e.g. to C functions & structs)?

Or more generally - any plans to stand on the shoulders of some of the huge existing language ecosystems (C, Python, C++, Ruby, Rust, Java, ...)?

With compile-time I mean to not need to manually generate (or worse write) bindings, but use some language feature (e.g. special syntax like prefix C. resulting in C.printf( "%s%d", ... )).

Write a better Makefile

Our current Makefile is straightforward: a single compile command for all source files. Even though the project is small and it takes no time to rebuild everything, we need to do it "the right way". Keep track of the files that have changed and only compile those. Set src/*.c to be compiled as a static library, which can then be integrated with the cli.

However, even if the project grows, it'll always be possible to compile the source with the single command below.

gcc -o pocket cli/*.c src/*.c -Isrc/include -lm -Wno-int-to-pointer-cast

Refactor the read line function

Re-implement readLine() function

Both pocketlang's input() function and the cli's input reading call the read_line() function defined in cli/repl.c, which I wrote as a temporary solution. It can read at most 1024 characters and isn't safe. It needs to be refactored to read one character at a time and write it to a buffer until we reach a newline or EOF.

Refer to #78 for ByteBuffer implementation details.

How to implement.

  • Create another function named readLine that takes a byte buffer and reads the chars into that buffer. Don't refactor the read_line function, because it's used by the VM's IO callbacks; refactoring it is for the future. (Here is some untested code.)
static void readLine(ByteBuffer* buff) {
  do {
    // fgetc() returns an int so that EOF can be told apart from any char.
    int c = fgetc(stdin);
    if (c == EOF || c == '\n') break;

    byteBufferWrite(buff, (uint8_t)c);
  } while (true);

  byteBufferWrite(buff, '\0');
}
  • Don't forget to write the comments documenting the function.

    • What it does, above the function's declaration.
    • How it does it, inside the function at the required places (but here we don't need any).
  • Create a buffer to store the line that'll be read from stdin.

+ // A buffer to store a line read from stdin.
+  ByteBuffer line;
+  byteBufferInit(&line);

   // A buffer to store lines read from stdin.
   ByteBuffer lines;
   byteBufferInit(&lines);
  • Remove the read_line() call and change it to readLine(&line)

  • Replace every use of line with (const char*)line.data

  • Change all the free((void*)line); with byteBufferClear(&line);

Finally

Unit tests for the REPL haven't been implemented yet, so you have to run some tests yourself to make sure everything works fine before making the PR. (ex:)

>>> def hello()
...     print('hello world')
... end
>>>
>>> hello()
hello world
>>>

Parse command line args

We don't have an argparser for the command line yet, which you could help us implement. It would be easier to use a small library (preferably a single header, or at most 2 source files) than to write one from scratch.

https://github.com/cofyc/argparse seems good to me (I haven't used or tested it before), or something else you know might be great, but for License compatibility, we're only using MIT licensed or public domain libraries in pocketlang.

Here is an expected example usage message (feel free to come up with your own one).

usage: pocket [option] ... [-c cmd | file | -] [arg] ...
  -c cmd          Evaluate and run the passed string.
  -d, --debug     Compile and run a debug version.
  -h, --help      Prints this help message and exit.
  -q, --quiet     Don't print version and copyright statement on REPL startup
  -v, --version   Prints the pocketlang version number and exit.

Update the copyright statement in the source files.

The current copyright statement in the file headers doesn't mention the pocketlang authors. It should be updated like this.

Before

/*
 *  Copyright (c) 2020-2021 Thakee Nathees
 *  Distributed Under The MIT License
 */

After

/*
 *  Copyright (c) 2020-2021 Thakee Nathees
 *  Copyright (c) 2021 The pocketlang Authors
 *  Distributed Under The MIT License
 */

And thanks for being part of this 🎉

Include Nodejs in Benchmarks

Hey! It looks like you started implementing some JS for the benchmarks and stopped. There's a note in loop.js about V8's performance optimizations/benefits – I'm not sure if this deterred you from including/finishing JS or not, but I think it's still important to include them for context.

Pocketlang doesn't have to be the fastest in a graph in order for it to be considered :) It's a cool little language and I think it has more merit than speed.

I added the JS variants for all benchmarks and will include a sample of their results on my machine. Happy to PR the benchmark items if interested

Node v14.15.3
Python v3.9.5
Pocket (master : 6d434db) -> make release

benchmarks/fib
  node   0.090436s
  python 2.506929s
  pocket 1.380002s

benchmarks/list
  node   0.143674s
  python 2.351552s
  pocket 0.884381s

benchmarks/loop
  node   0.325279s
  python 1.946506s
  pocket 0.54898s

benchmarks/primes
  node   0.143707s
  python 2.6421330000000003s
  pocket 1.400831s

Again, I don't think this should be discouraging at all. It's amazing so far!

Set stack max limit to prevent stack-overflow crash the VM

Infinite recursion crashes the VM; for the code below this happens when the depth is around 65565. We should set a maximum recursion depth (which the user could change), the way Python defaults to a max depth of 1000.

def f
  f()
end
f()

However, the code below will run all day without any trouble, even with the max depth limit, because of the tail call optimization.

def f
  return f()
end
f()
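The depth limit proposed at the top of this issue could be sketched as a counter that is checked on every call. Everything here is hypothetical (CallStack, call_enter(), call_leave() and recurse_forever() are not pocketlang names); it only shows the idea of failing cleanly instead of overflowing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical call stack bookkeeping for this sketch.
typedef struct {
  int depth;     // Number of frames currently on the call stack.
  int max_depth; // User configurable limit.
} CallStack;

// Tries to push a frame. Returns false if the limit would be exceeded,
// which a VM would report as a runtime error instead of crashing.
bool call_enter(CallStack* stack) {
  assert(stack != NULL);
  if (stack->depth >= stack->max_depth) return false;
  stack->depth++;
  return true;
}

// Pops a frame when a call returns.
void call_leave(CallStack* stack) {
  assert(stack != NULL && stack->depth > 0);
  stack->depth--;
}

// Simulates a function that calls itself unconditionally.
// Returns the depth at which the limit stopped the recursion.
int recurse_forever(CallStack* stack) {
  if (!call_enter(stack)) return stack->depth; // Limit reached.
  int reached = recurse_forever(stack);
  call_leave(stack);
  return reached;
}
```

A tail call that reuses the current frame would skip call_enter() altogether, which is why the second snippet above is unaffected by the limit.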

Bitwise Not Implementation

TODO: Bitwise Not Operator.

Currently the bitwise not operator ~ isn't implemented in pocketlang, and trying to use it crashes with a TODO message.

>>> ~1
Assertion failed: TODO: It hasn't implemented yet.
        at runFiber() (C:\dev\pocketlang\src\pk_vm.c:1220)

You can find the TODO Here

How to implement

  • Here is a similar issue for the other bitwise operators (& | ^ << >>): #54
  • Here are the implementations of the other bitwise operators: #59 and #63.

Finally

Once you've implemented it, write some tests at tests/lang/basics.pk

[portability] Report us ANY compiler error, warning you encounter.

We're trying to make pocketlang sources as portable as possible and get rid of any and all compiler warnings.

Compile the pocketlang source with your preferred C99 compiler (gcc -o pocket cli/*.c src/*.c -Isrc/include -lm see here for more) with the -Wall and -Wextra flags, and report whatever errors or warnings it shows by opening an issue.

And if you like to resolve those warnings yourself, we're happy to have your PR.

Remove find dependency in Makefile

Currently, our makefile depends on the Unix find command, which makes it harder for others to compile on Windows.

Here is what they have to do

To run make file on windows with mingw, you require make and find unix tools in your path. Which you can get from msys2 or cygwin. Run set PATH=<path-to-env/usr/bin/>;%PATH% && make, this will override the system find command with unix find for the current session, and run the make script.

But we want the makefile to be portable and independent of the system, so find should be removed from the makefile.

function redefinition

Hello.
Pocketlang looks really nice. I'd like to be able to redefine functions, which would make REPL-style usage more practical.

Parallelism & concurrency along with synchronous & asynchronous programming

Any plans for parallelism & concurrency?

And an orthogonal question - any plans to support seamless synchronous & asynchronous programming?

With "seamless" I don't mean how majority of languages implement it nowadays (C#, Javascript, Rust, Python, ...) because that's insane ("async" is infectious - once you use it you need to use it all the way to the root; in addition to that one needs 2 variants of each stdlib function) pretty much the same level of insanity as async everything.

With "seamless" I mean something closer to what I described here (just to clarify: ECS is a fully synchronous pipeline - therefore a good example to demonstrate the "buffer events at the beginning of the synchronous pipeline and defer all from-this-pipeline-emitted events to the end of this pipeline and first then dispatch them").

Add `BUILD_STRING` opcode to optimize string interpolation (like python)

https://bugs.python.org/issue27078

I benchmarked some f'' strings against .format, and surprisingly f'' was slower
than .format in about all the simple cases I could think of. I guess this is
because f'' strings implicitly use `''.join([])`.

The byte code for f'foo is {foo}' currently is

  1           0 LOAD_CONST               1 ('')
              3 LOAD_ATTR                0 (join)
              6 LOAD_CONST               2 ('foo is ')
              9 LOAD_GLOBAL              1 (foo)
             12 FORMAT_VALUE             0
             15 BUILD_LIST               2
             18 CALL_FUNCTION            1 (1 positional, 0 keyword pair)

It was mentioned here https://bugs.python.org/issue24965 but since I came up
with the idea when inspecting this, I'd propose again here that a new opcode
be added for f'' strings - BUILD_STRING n, with which f'foo is {foo}' could
be compiled to 

              0 LOAD_CONST               2 ('foo is ')
              3 LOAD_GLOBAL              1 (foo)
              6 FORMAT_VALUE             0
              9 BUILD_STRING             2

Python 3.7.4

>>> def foo(name):
...     return f"hello {name}"
...
>>> from dis import dis
>>> dis(foo)
  2           0 LOAD_CONST               1 ('hello ')
              2 LOAD_FAST                0 (name)
              4 FORMAT_VALUE             0
              6 BUILD_STRING             2
              8 RETURN_VALUE

Pocketlang (current)

>>> def foo(name)
...     return "hello $name"
... end
>>> from lang import disas
>>> print(disas(foo))
Instruction Dump of function 'foo' "$(REPL)"
     2:     0  PUSH_BUILTIN_FN    14 [Fn:list_join]
            2  PUSH_LIST           2
            5  PUSH_CONSTANT       1 "hello "
            8  LIST_APPEND
            9  PUSH_LOCAL_0          (param:0)
           10  LIST_APPEND
           11  CALL                1 (argc)
           13  RETURN
     3:    14  POP
           15  RETURN
           16  END

Move fiber functions into its own module

Pocketlang supports coroutines via fibers. The functions to create and interact with fibers are builtin functions (see here).
This needs to be moved into its own module named Fiber to organize everything and have a clean interface.

This is the current fiber interface.

fb = fiber_new(fn)  ## creating a fiber
fiber_run(fb, args...) ## start running
fiber_resume(fb, val) ## resume once yielded
fiber_is_done(fb) ## check if it's done
fiber_get_func(fb) ## get the underlying function

They need to be refactored into

import Fiber
fb = Fiber.new(fn)  ## creating a fiber
Fiber.run(fb, args...) ## start running
Fiber.resume(fb, val) ## resume once yielded
fb.is_done ## check if it's done
fb.function ## get the underlying function 

How to implement

  • This is how the math module is defined; use it as a reference (see here)
Script* math = newModuleInternal(vm, "math");
moduleAddFunctionInternal(vm, math, "floor", stdMathFloor,   1);
moduleAddFunctionInternal(vm, math, "ceil",  stdMathCeil,    1);
  • Create a Fiber module and add its functions (just below the math module). You have to rename coreFiberNew to stdFiberNew to indicate it's a module function and not a builtin function.
Script* Fiber = newModuleInternal(vm, "Fiber");
moduleAddFunctionInternal(vm, Fiber, "new", stdFiberNew,    1);
moduleAddFunctionInternal(vm, Fiber, "run", stdFiberRun,   -1);
  • Remove the coreFiberIsDone and coreFiberGetFunc functions, because we're changing them to fiber attributes. (delete these lines)

  • Now add "is_done" and "function" attribute to fibers (here). replace the ??? with the name's hash value (see below). (use coreFiberIsDone and coreFiberGetFunc for the return value here)

  case OBJ_FIBER:
    {
      Fiber* fb = (Fiber*)obj;
      SWITCH_ATTRIB(attrib->data) {

        CASE_ATTRIB("is_done", ???) :
          // TODO: return is done

        CASE_ATTRIB("function", ???) :
          // TODO: return function

      CASE_DEFAULT:
        ERR_NO_ATTRIB(vm, on, attrib);
        return VAR_NULL;
      }
      UNREACHABLE();
    }

The second argument of the CASE_ATTRIB macro is the hash value of the string, which you can get like this. (Run pocketlang from the command line and evaluate this, or you could use this Python function.)

>>> from math import hash
>>> print(hex(hash("foo")))
0xa9f37ed7
>>>
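For reference, the value printed above for "foo" (0xa9f37ed7) matches the 32-bit FNV-1a hash of that string. Assuming pocketlang's string hash is plain FNV-1a (an inference from that one value, not something confirmed here), a minimal C sketch that produces the constant for CASE_ATTRIB could look like this. The names fnv1aHash and printAttribHash are hypothetical helpers, not part of pocketlang.

```c
#include <stdint.h>
#include <stdio.h>

// 32-bit FNV-1a over a NUL-terminated string.
// ASSUMPTION: pocketlang's string hash is FNV-1a; this is inferred only from
// the hash("foo") value shown in the REPL session above.
uint32_t fnv1aHash(const char* str) {
  uint32_t hash = 2166136261u;  // FNV offset basis.
  for (const char* c = str; *c != '\0'; c++) {
    hash ^= (uint8_t)*c;        // Mix in the next byte.
    hash *= 16777619u;          // Multiply by the FNV prime (wraps mod 2^32).
  }
  return hash;
}

// Hypothetical helper: print a ready-to-paste CASE_ATTRIB line for a name.
void printAttribHash(const char* name) {
  printf("CASE_ATTRIB(\"%s\", 0x%08x) :\n", name, (unsigned)fnv1aHash(name));
}
```

Calling printAttribHash("is_done") and printAttribHash("function") would print candidate lines for the two ??? placeholders. Confirm them with tests/check.py before relying on them.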

Finally

Make sure everything works by running the unit test script at tests/tests.py, and run tests/check.py to check that each hash value matches its string.

Add a .clang-format

This is a config file for clang-format, a tool that is already used by many editors and adopted by many projects to format the code and ensure that the style is consistent.
It could even be used as a git hook or in the CI/CD workflow to ensure that new patches respect the coding style.
A config can derive from other preexisting configs (like Google's, LLVM's or Mozilla's) so it's not a big effort to write one.
It is not the number one priority but I think that's a great thing to have.
More information at LLVM's website.

[ref] Garbage Collection

Garbage Collection

Use this thread to discuss how GC works, suggestions and improvements.

Here is pocketlang's Object type (see), the "abstract base class" for "struct inheritance". (How it's implemented isn't related to this discussion.)

struct Object {
  ObjectType type;  //< Type of the object (OBJ_STRING, OBJ_LIST, OBJ_FUNC, ...).
  bool is_marked;   //< Set during garbage collection's marking phase.
  Object* next;     //< Next object in the heap allocated linked list.
};

Heap Allocations

[Figure: heap allocation]

vmRealloc() is the pocket VM's allocator, where all the "GC magic" happens. All heap allocation calls invoke that function, and every allocated object is added to a linked list in the VM; Object* next points to the next object in that list.

The pocket VM keeps track of the number of bytes allocated, and when that number reaches a threshold value, garbage collection is triggered.

The method we're using for GC is called mark-and-sweep.
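The bookkeeping described above can be sketched in a few lines of C. This is a toy model, not pocketlang's vmRealloc(): ToyVM, ToyObject, toyAllocObject and toyCollectGarbage are hypothetical names, and the collector is a stub that only counts how often it was triggered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical, stripped-down object header (no type tag).
typedef struct ToyObject {
  bool is_marked;
  struct ToyObject* next;  // Next object in the VM's linked list.
} ToyObject;

// Hypothetical VM state holding only what the allocator needs.
typedef struct {
  ToyObject* first;        // Head of the list of all allocated objects.
  size_t bytes_allocated;  // Running total of allocated bytes.
  size_t next_gc;          // Threshold that triggers a collection.
  int gc_runs;             // How many times the (stub) collector ran.
} ToyVM;

// Stand-in for the real collector: it only records that it was called.
static void toyCollectGarbage(ToyVM* vm) { vm->gc_runs++; }

// Allocate an object of `size` bytes and link it into the VM's list.
ToyObject* toyAllocObject(ToyVM* vm, size_t size) {
  assert(size >= sizeof(ToyObject));
  vm->bytes_allocated += size;
  // Crossing the threshold triggers a collection before allocating.
  if (vm->bytes_allocated > vm->next_gc) toyCollectGarbage(vm);

  ToyObject* obj = (ToyObject*)malloc(size);
  if (obj == NULL) return NULL;
  obj->is_marked = false;
  obj->next = vm->first;  // Push onto the front of the list.
  vm->first = obj;
  return obj;
}
```

Because every object is pushed onto the front of the list, the most recently allocated object is always vm->first, and the sweep phase can reach every object by following next.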

Marking Phase

The VM's roots are objects that we have a direct reference to and don't want garbage collected, such as stack values, temp references, handles, the running fiber, the current compiler, etc.

In this phase, we perform a reachability analysis by doing a graph traversal and marking the objects that are reachable from the roots. (Note that the darker green objects below are marked, i.e. is_marked = true.)

[Figure: marking phase]

  • First, we mark all the root objects and push them to the working_set of the VM.

  • Next, we pop the top object from the working set, then mark and push every object it references. (ex: a list has references to all of its elements, and a fiber has references to its stack values and its function object.)

  • We repeat the above step until the working set becomes empty.

  • Once we're done with the working_set, any object that isn't marked is garbage.
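The steps above can be sketched as a small, self-contained C program fragment. It is a toy model under stated assumptions: Node, WorkingSet, markAndPush and markReachable are hypothetical names, each node holds at most MAX_REFS references, and the working set is a fixed-size stack rather than whatever growable buffer the real VM uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS 4          // Hypothetical cap on references per node.
#define MAX_WORKING_SET 64  // Hypothetical fixed capacity of the work stack.

typedef struct Node {
  bool is_marked;
  struct Node* refs[MAX_REFS];  // Outgoing references; NULL means unused.
} Node;

typedef struct {
  Node* items[MAX_WORKING_SET];
  size_t count;
} WorkingSet;

// Mark a node and push it, unless it is NULL or was already marked.
// Marking before pushing ensures each node is pushed at most once, which
// also makes the traversal terminate on cyclic graphs.
static void markAndPush(WorkingSet* ws, Node* node) {
  if (node == NULL || node->is_marked) return;
  node->is_marked = true;
  assert(ws->count < MAX_WORKING_SET);
  ws->items[ws->count++] = node;
}

// Mark every node reachable from the given roots.
void markReachable(Node** roots, size_t root_count) {
  WorkingSet ws;
  ws.count = 0;

  // Step 1: mark and push all the roots.
  for (size_t i = 0; i < root_count; i++) markAndPush(&ws, roots[i]);

  // Steps 2 and 3: pop a node, mark and push what it references, repeat.
  while (ws.count > 0) {
    Node* node = ws.items[--ws.count];
    for (size_t j = 0; j < MAX_REFS; j++) markAndPush(&ws, node->refs[j]);
  }
  // Step 4: anything still unmarked is unreachable, i.e. garbage.
}
```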

Sweeping Phase

At this phase, we'll iterate through the VM's object linked list, and if an object isn't marked we can free the object and remove it from the list. (see here)
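A minimal sketch of such a sweep over a singly linked list, assuming a stripped-down object header. SweepObj, newSweepObj and sweepList are hypothetical names, and clearing the mark on survivors so the next cycle starts clean is an assumption of this sketch, not something stated above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical, stripped-down object header.
typedef struct SweepObj {
  bool is_marked;
  struct SweepObj* next;
} SweepObj;

// Helper for building a list: allocate an object in front of `next`.
SweepObj* newSweepObj(SweepObj* next, bool is_marked) {
  SweepObj* obj = (SweepObj*)malloc(sizeof(SweepObj));
  if (obj == NULL) return NULL;
  obj->is_marked = is_marked;
  obj->next = next;
  return obj;
}

// Free every unmarked object in the list and unlink it. Survivors get their
// mark cleared. Returns the number of objects freed.
size_t sweepList(SweepObj** head) {
  size_t freed = 0;
  SweepObj** link = head;  // The pointer that points at the current object.
  while (*link != NULL) {
    SweepObj* obj = *link;
    if (!obj->is_marked) {
      *link = obj->next;   // Unlink: the previous link now skips obj.
      free(obj);
      freed++;
    } else {
      obj->is_marked = false;
      link = &obj->next;   // Keep obj and advance to the next link.
    }
  }
  return freed;
}
```

Walking with a pointer to the link (rather than a pointer to the object) lets the head and interior nodes be removed by the same code path.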

[Figure: sweeping phase]

Finally, we'll adjust the allocation threshold value depending on the number of bytes we're left with now (see here). See vmCollectGarbage(PKVM* vm) for the implementation. And here is how you can trigger GC manually in pocketlang

>>> from lang import gc
>>> gc() ## Returns the amount of bytes cleaned.
1520
>>> 

Reference:

A warm thanks to Bob Nystrom (@munificent) for these wonderful resources.

operator %= Implementation

Currently the modulo assignment operator %= isn't implemented in pocketlang, and evaluating an expression with it causes a syntax error.
The expression a %= b will be compiled as a = a % b, and we already have the functionality for that. Adding the token for the operator is sufficient.
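To illustrate the desugaring only (this is not pocketlang's compiler code), here is a toy C sketch in which a compound-assignment token is mapped to its binary operator and then applied as target = target op value. All names (AssignToken, BinaryOp, binaryOpFor, compoundAssign) are hypothetical, and integers stand in for pocketlang values.

```c
#include <assert.h>

// Hypothetical operator and token sets for the sketch.
typedef enum { BIN_ADD, BIN_SUB, BIN_MOD } BinaryOp;
typedef enum { TOK_PLUS_EQ, TOK_MINUS_EQ, TOK_MOD_EQ } AssignToken;

// Map a compound-assignment token to the binary operator it expands to.
BinaryOp binaryOpFor(AssignToken token) {
  switch (token) {
    case TOK_PLUS_EQ:  return BIN_ADD;
    case TOK_MINUS_EQ: return BIN_SUB;
    case TOK_MOD_EQ:   return BIN_MOD;
  }
  assert(0 && "unknown compound-assignment token");
  return BIN_ADD;
}

// Evaluate a binary operator on two integers.
int applyBinary(BinaryOp op, int a, int b) {
  switch (op) {
    case BIN_ADD: return a + b;
    case BIN_SUB: return a - b;
    case BIN_MOD: assert(b != 0); return a % b;
  }
  assert(0 && "unknown binary operator");
  return 0;
}

// `target op= value` behaves exactly like `target = target op value`.
void compoundAssign(int* target, AssignToken token, int value) {
  *target = applyBinary(binaryOpFor(token), *target, value);
}
```

With this mapping in place, supporting a new compound operator only requires a new token and one more case in binaryOpFor, which mirrors the point that adding the token is sufficient.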

How to implement

  • I've made a walkthrough on how to implement &= in #69; use it as a reference.
  • #70 is another example, for |= and ^=, which was done in #71.

Finally

Don't forget to add some tests (at test/lang/basics.pk).

include lua benchmarks

Similar to #124, I have the benchmarks converted into Lua, which is more of a direct point of comparison than Node.js is.

Can PR if you'd like :) With performance in mind, I think it'd be a really good language to measure & compare against.

Rename elif to elsif

Discussion

Currently, our "else if" keyword is elif, which is something I've derived from Python. However:

  • Our language is syntactically very much like Ruby, so elif would confuse users (especially those with a Ruby background) and make it feel like learning a new language (since elsif is Ruby's way).
  • On the other hand, if someone has learned pocketlang, it would be easier for them to continue with Ruby (so we'll keep it as close to Ruby as possible).
  • We can also reuse any of Ruby's syntax highlighters if we ever have to use one.
  • Reading elsif sounds more like "else if" than elif does.

If you have any suggestions regarding the renaming, let me know.

How to rename

  • The name is defined at src/pk_compiler.c (here is the line).
  • Just renaming "elif" to "elsif" is sufficient, but for consistency also rename its token from TK_ELIF to TK_ELSIF, and replace any occurrences of elif in the comments too.

Rename builtin and core module functions

BREAKING CHANGE -- Until the first release, anything is likely to change in ways that break compatibility.

Discussion

Currently, pocketlang doesn't have any naming convention, and every symbol is snake_case. We need to enforce a naming convention for consistency across the builtin functions and core modules. And I'd like it to match our C implementation.

  • Function names are camelCase

  • Variable names/attributes are snake_case

  • Class names are PascalCase

  • Modules intended to be used as a collection of functions (ex: math, path) - snake_case

  • Modules intended to be used as a type wrapper (ex: File, Fiber) - PascalCase

How to rename

  • You can find all the builtin functions, the core modules and their functions at src/pk_core.c (here). (ex: to_string -> toString)

  • The additional cli modules at cli/modules.c

Finally

  • This requires you to rename the functions in all the test suites in the tests/ directory for the tests to pass.
  • Check the docs at docs/pages/Getting-Started/learn-in-15-minutes.md to see if any functions need renaming.

If you have any suggestions regarding the renaming, let me know.
