asteria's Introduction

The Asteria Programming Language

Compiler    Category
GCC 7       🥇 Primary
Clang 11    🥈 Secondary

Asteria is a procedural, dynamically typed programming language that is highly inspired by JavaScript but has been designed to address its issues.

How to Build

First, you need to install some dependencies and an appropriate compiler, which can be done with

# For Debian, Ubuntu, Linux Mint:
# There is usually an outdated version of meson in the system APT source. Do
# not use it; instead, install the latest one from pip.
sudo apt-get install ninja-build python3 python3-pip pkgconf g++  \
        libpcre2-dev libssl-dev zlib1g-dev libedit-dev uuid-dev
sudo pip3 install meson
# For MSYS2 on Windows:
# The `iconv_open()` etc. functions are provided by libiconv. Only the MSYS
# shell is supported. Do not try building in the MINGW64 or UCRT64 shell.
pacman -S meson gcc pkgconf pcre2-devel openssl-devel zlib-devel  \
        libiconv-devel libedit-devel libutil-linux-devel
# For macOS:
# The `gcc` command actually denotes Clang, so ask for a specific version
# explicitly.
brew install meson pkgconf gcc@10 pcre2 openssl@3 zlib libedit
export CXX='g++-10'

Then build as usual:

meson setup build_debug
meson compile -Cbuild_debug

Finally, launch the REPL:

./build_debug/asteria

If you need only the library and don't want to build the REPL, you may omit libedit from the dependencies above, and pass -Denable-repl=false to meson.

License

BSD 3-Clause License

asteria's People

Contributors

frankhb, lhmouse, lixiayu, maskray, omimakhare, peaceshi, wanghenshui

asteria's Issues

compound assignment branches shall not have been TCO'd

var a = 1;
func two() { return 2;  }
func check() { return a &&= two();  }
check();
std.debug.logf("a = $1\n", a);

gives

======================
** ASSERTION FAILED **
======================
        Condition: status == air_status_next
        Location:  asteria/src/runtime/air_node.cpp:121
        Message:
======================
terminate called without an active exception
Aborted

Standard I/O library

  1. There are two standard streams: the standard input stream and the standard output stream. All input operations apply to standard input. All output operations apply to standard output.
  2. Each stream may be byte-oriented or text-oriented. A byte-oriented stream consists of a sequence of bytes. A text-oriented stream consists of a sequence of Unicode code points encoded in UTF-8.
  3. Upon start, neither stream is byte-oriented or text-oriented. A byte operation that has been applied to a stream makes the stream byte-oriented. A text operation that has been applied to a stream makes the stream text-oriented.
  4. If a byte operation is applied to a text-oriented stream or vice versa, the operation fails.
  5. Successful byte input operations shall not return partial UTF code points. Successful byte output operations shall not deliver invalid UTF-8 sequences.
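Rules 3 and 4 above amount to a small state machine. The following is a minimal sketch of it, not Asteria's implementation: the names `Orientation` and `Stream_Orientation` are hypothetical, and reporting a failed operation by throwing an exception is an assumption (the proposal only says that the operation fails).

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical names; not part of Asteria's actual API.
enum class Orientation { none, byte, text };

class Stream_Orientation {
  public:
    // Records an operation of kind `wanted` (byte or text).
    // The first operation fixes the orientation (rule 3); a later
    // operation of the other kind fails (rule 4).
    void require(Orientation wanted)
      {
        assert(wanted != Orientation::none);
        if(m_current == Orientation::none)
          m_current = wanted;
        else if(m_current != wanted)
          throw std::logic_error("stream orientation mismatch");
      }

    Orientation current() const noexcept { return m_current; }

  private:
    Orientation m_current = Orientation::none;
};
```

Each standard stream would own one such object and call `require()` at the start of every byte or text operation.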

Don't use `rcptr`s for function and opaque types

At the moment, G_function is an alias for rcptr<Abstract_Function> and G_opaque is an alias for rcptr<Abstract_Opaque>.

Although the constructors and assignment operators of Value are able to check for null pointers and convert them to null values, I would like some other specific classes for them, especially G_function.

Simple_Binding_Wrapper is itself derived from Abstract_Function and therefore requires dynamic memory. However, all standard library functions are stateless, so the encapsulated Value member turns out to be over-engineered. The description strings are also all static, so they could be stored as const char* rather than cow_string. Together, these changes would eliminate dynamic allocation for all standard library functions.

Implement single-step hooks in `Abstract_Hooks`

This idea arose because it is currently possible to write for(;;); in the REPL and get stuck in that loop, since we don't want Ctrl-C to terminate the program. One solution is a hook that is called before every statement to check whether an interrupt has occurred and, if so, throws an exception to break out of the loop. This feature will also be helpful when we add debugger support in the future.
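A minimal sketch of the idea follows. Everything here is illustrative: `Step_Hooks`, `Interrupt_Hooks`, `note_interrupt` and `run_until_interrupted` are hypothetical names, the real `Abstract_Hooks` interface may look different, and the loop merely stands in for the interpreter executing `for(;;);`.

```cpp
#include <cassert>
#include <csignal>
#include <stdexcept>

// Set by the signal handler; polled by the hook.
static volatile std::sig_atomic_t g_interrupted = 0;

extern "C" void note_interrupt(int) { g_interrupted = 1; }

// Hypothetical hook interface.
struct Step_Hooks {
    virtual ~Step_Hooks() = default;
    // Called by the interpreter before each statement is executed.
    virtual void on_single_step() { }
};

struct Interrupt_Hooks : Step_Hooks {
    void on_single_step() override
      {
        if(g_interrupted) {
          g_interrupted = 0;
          throw std::runtime_error("interrupted");
        }
      }
};

// Stand-in for the interpreter running an infinite loop. Simulates
// Ctrl-C before step number `interrupt_at` and returns the number of
// steps that completed before the hook threw.
long run_until_interrupted(Step_Hooks& hooks, long interrupt_at)
  {
    assert(interrupt_at >= 0);
    long steps = 0;
    try {
      for(;;) {
        if(steps == interrupt_at)
          std::raise(SIGINT);
        hooks.on_single_step();
        ++steps;
      }
    }
    catch(const std::runtime_error&) {
      // The exception broke the loop; control returns to the REPL.
    }
    return steps;
  }
```

The handler must be installed first, for example with `std::signal(SIGINT, note_interrupt)`. The handler only sets a flag, which keeps it async-signal-safe; the exception is thrown later from ordinary code.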

Optimization techniques

Optimizer Wish List

  • Tail and Sibling Call Optimization (commonly but inaccurately abbreviated as TCO)
    Note that proper tail calls are a core language feature that cannot be disabled.

  • Dead Code Elimination (abbr. DCE)
    This had been implemented naively but was withdrawn. It can be added back easily. But before that, rather than having individual interfaces for each optimization technique, I would prefer to have a uniform interface for cow_vector<AIR_Node>, which may be invoked recursively.

  • Constant Folding
    Ideally this pass should precede DCE. We may also support constant expressions (such as 1+2*3) in the future. Arithmetic overflows (such as 1<<100), if not folded, result in exceptions, so we must not fold such expressions.

  • #137
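The constraint in the Constant Folding item can be sketched as a fold that gives up whenever the operation would overflow, so that the unfolded expression still throws at run time. This is only an illustration: the function names are hypothetical, integers are assumed to be 64-bit, and `__builtin_mul_overflow` and `__builtin_add_overflow` are GCC/Clang builtins rather than standard C++.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Folds `lhs * rhs` only when the product fits in a 64-bit integer.
// Returns nullopt on overflow, in which case the caller keeps the
// original expression so that the exception is raised at run time.
inline std::optional<std::int64_t> try_fold_mul(std::int64_t lhs, std::int64_t rhs)
  {
    std::int64_t result;
    if(__builtin_mul_overflow(lhs, rhs, &result))
      return std::nullopt;
    return result;
  }

// Same as above, for `lhs + rhs`.
inline std::optional<std::int64_t> try_fold_add(std::int64_t lhs, std::int64_t rhs)
  {
    std::int64_t result;
    if(__builtin_add_overflow(lhs, rhs, &result))
      return std::nullopt;
    return result;
  }

// Usage: folds `1 + 2 * 3` but refuses an overflowing product.
inline void fold_examples()
  {
    auto product = try_fold_mul(2, 3);
    assert(product && try_fold_add(1, *product) == 7);
    assert(!try_fold_mul(504000000000000000, 800));
  }
```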

Simplify `Simple_Binding_Wrapper`

At the moment we have this in Simple_Binding_Wrapper:

using Prototype = Reference (const Value& opaque,
                             const Global_Context& global,
                             Reference&& self,
                             cow_vector<Reference>&& args);

The parameter list is horrible and not Simple at all:

  1. Seldom do any bindings need the opaque parameter. In reality, no standard library function has ever used it so far.
  2. The global parameter is used only by bindings that wish to call user-provided functions.
  3. The self reference is used only by member functions.
  4. There are a few bindings that take no argument.

Based on these facts, the parameter list should be reordered so that the parameters least likely to be useful come last.

I propose a prototype as follows:

using Prototype = Reference (cow_vector<Reference>&& args,
                             Reference&& self,
                             const Global_Context& global,
                             const Value& opaque);

As all parameters are references, it is possible to cast a user-defined function from Reference (*)(cow_vector<Reference>&& args) to Prototype* then call it. According to the C++ standard this is undefined behavior, but a lot of POSIX functions have been doing this (calling functions taking fewer parameters than arguments) for decades, and Itanium ABI already states that references are passed as if they were pointers, so I think it is safe.

De-uglify `Argument_Reader`

*[](Reference& self, cow_vector<Reference>&& args, Global_Context& /*global*/) -> Reference&
  {
    Argument_Reader reader(::rocket::sref("std.array.find"), ::rocket::cref(args));
    Argument_Reader::State state;
    // Parse arguments.
    V_array data;
    Value target;
    if(reader.I().v(data).S(state).o(target).F()) {
      Reference_root::S_temporary xref = { std_array_find(::std::move(data), 0, nullopt,
                                                          ::std::move(target)) };
      return self = ::std::move(xref);
    }
    V_integer from;
    if(reader.L(state).v(from).S(state).o(target).F()) {
      Reference_root::S_temporary xref = { std_array_find(::std::move(data), from, nullopt,
                                                          ::std::move(target)) };
      return self = ::std::move(xref);
    }
    optV_integer length;
    if(reader.L(state).o(length).o(target).F()) {
      Reference_root::S_temporary xref = { std_array_find(::std::move(data), from, length,
                                                          ::std::move(target)) };
      return self = ::std::move(xref);
    }
    // Fail.
    reader.throw_no_matching_function_call();
  }
  • Use macros for definitions and return statements.
  • Remove unnecessary function descriptions.
  • De-curry the reader.L(state).o(length).o(target).F() thing.

Missing high-level plan of optimization

There seems to be no normative reference for the overall optimization plan, nor any stated criteria by which an optimization-related issue can be closed.

For example, #114 calls for a kind of local CSE. CSE is a common and general class of classical optimizations. When should such a class of optimizations be introduced?

array subscript 9 is above array bounds of ‘const std::type_info* const [9]’

If you build asteria with -O2 (which is the default for autotools) using GCC 6, 7 or 8 you might get this error:

  CXX      asteria/src/value.lo
In file included from ../asteria/src/precompiled.hpp:40,
                 from ../asteria/src/value.cpp:4:
../asteria/src/rocket/variant.hpp: In member function ‘void Asteria::Value::enumerate_variables(const Asteria::Abstract_variable_callback&) const’:
../asteria/src/rocket/variant.hpp:336:35: error: array subscript 9 is above array bounds of ‘const std::type_info* const [9]’ [-Werror=array-bounds]
         return *(s_table_type_info[this->m_index]);
                 ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
../asteria/src/rocket/variant.hpp: In member function ‘void Asteria::Value::dump(std::ostream&, Asteria::Size, Asteria::Size) const’:
../asteria/src/rocket/variant.hpp:336:35: error: array subscript 9 is above array bounds of ‘const std::type_info* const [9]’ [-Werror=array-bounds]
         return *(s_table_type_info[this->m_index]);
                 ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
Makefile:1182: recipe for target 'asteria/src/value.lo' failed

I filed a bug report but it still must be worked around.

spurious backtrace frame of proper sibling calls

func two() {
  throw "boom";
}

func one() {
  try {
    two();
  }
  catch(e) {
    throw 123;
  }
}

try {
  one();
}
catch(e) {
  std.debug.dump(__backtrace);
}

Note the 10th frame:

          9 = object(5) {
            "offset" = integer -1;
            "file" = string(9) "[unknown]";
            "line" = integer -1;
            "frame" = string(10) "  function";
            "value" = string(0) "";
          };

implement bash-style formatting functions in Rocket

Examples

rocket::format("hello $1 $2", "world", '!')  // returns "hello world !"
rocket::format("${1} + $1 = ${2}", 1, 2)  // returns "1 + 1 = 2"
rocket::format("literal $$")  // returns "literal $"
rocket::format("funny $0 string")  // returns "funny funny $0 string string"
rocket::format("$123", 'x')  // returns "x23"
rocket::format("${12}3", 'x')  // returns "<error>3"
  1. $$ in the format string is replaced with a literal $.
  2. $N where N is a numeral within 1 and 9 is replaced with the N-th argument, converted to a string.
  3. $0 is replaced with the format string itself verbatim.
  4. ${NNN} can be used to reference the NNN-th argument. At most three numerals are allowed.
  5. References whose corresponding arguments do not exist are replaced with <error>.
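The five rules above can be sketched as follows. This is an illustrative implementation in a hypothetical `sketch` namespace, not the actual `rocket::format`. Arguments are converted with stream insertion, and the treatment of cases that the rules leave open (a trailing `$`, a `$` followed by any other character, a malformed `${...}`) is an assumption: the `$` is copied through literally.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {

// Converts one argument to a string using stream insertion.
template<typename T>
std::string stringify(const T& value)
  {
    std::ostringstream os;
    os << value;
    return os.str();
  }

// Expands `$$`, `$0`, `$N` and `${NNN}` placeholders in `fmt`.
inline std::string vformat(const std::string& fmt, const std::vector<std::string>& args)
  {
    // Looks up the n-th argument; `$0` denotes the format string itself.
    auto lookup = [&](std::size_t n) -> std::string {
      assert(n <= 999);
      if(n == 0) return fmt;
      if(n > args.size()) return "<error>";
      return args[n - 1];
    };

    std::string out;
    std::size_t i = 0;
    while(i < fmt.size()) {
      char c = fmt[i++];
      if(c != '$' || i == fmt.size()) {
        out += c;  // ordinary character, or a trailing `$`
        continue;
      }
      char next = fmt[i];
      if(next == '$') {
        out += '$';
        ++i;
      }
      else if(next >= '0' && next <= '9') {
        out += lookup(static_cast<std::size_t>(next - '0'));
        ++i;
      }
      else if(next == '{') {
        // Parse one to three digits followed by `}`.
        std::size_t j = i + 1, n = 0, digits = 0;
        while(j < fmt.size() && digits < 3 && fmt[j] >= '0' && fmt[j] <= '9') {
          n = n * 10 + static_cast<std::size_t>(fmt[j] - '0');
          ++j;
          ++digits;
        }
        if(digits != 0 && j < fmt.size() && fmt[j] == '}') {
          out += lookup(n);
          i = j + 1;
        }
        else
          out += '$';  // malformed; keep the `$` literally
      }
      else
        out += '$';  // not a placeholder
    }
    return out;
  }

template<typename... Ts>
std::string format(const std::string& fmt, const Ts&... values)
  {
    return vformat(fmt, { stringify(values)... });
  }

}  // namespace sketch
```

Note that `$N` consumes exactly one digit, which is why `"$123"` expands the first argument and leaves `23` as literal text.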

Rewrite AVMC queue

  1. Make it resizable.
  2. Tidy code generation in AIR node.
  3. Add dead code elimination.

Segfault

var data = [];
data[0] = data;

std.system.gc_collect();

[RFC] Module system

We may take Node.js as an example.

Considerations

  1. What will happen if modules reference each other recursively and infinitely?

`std.numeric.parse_real()` doesn't handle underflows correctly

The documentation says:

std.numeric.parse_real(text, [saturating])
...
If the absolute value of the result is too small to fit in a real, a signed zero is returned.
...

But at the moment a null is returned:

#1:1> std.numeric.parse_real("1e-999")
* result #1: null
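For comparison, here is a minimal sketch of the documented underflow behavior built on `std::strtod()`. It is not the library's parser: `parse_real_sketch` is a hypothetical name, the `saturating` parameter is not modeled, and returning no value on overflow is an assumption. Note also that `std::strtod()` accepts some inputs, such as leading whitespace, that the real function may reject.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>
#include <cstdlib>
#include <optional>
#include <string>

// Parses `text` as a real number. Returns nullopt if `text` is not a
// number or if the result overflows. On underflow the zero (or
// subnormal) produced by strtod() is returned instead of null.
inline std::optional<double> parse_real_sketch(const std::string& text)
  {
    errno = 0;
    char* end = nullptr;
    double value = std::strtod(text.c_str(), &end);
    assert(end != nullptr);
    if(text.empty() || end != text.c_str() + text.size())
      return std::nullopt;  // no digits, or trailing characters

    if(errno == ERANGE && std::isinf(value))
      return std::nullopt;  // overflow (assumed to yield no value here)

    // ERANGE with a finite value means underflow; keep the value.
    return value;
  }
```

With this sketch, `parse_real_sketch("1e-999")` yields a zero rather than no value.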

Any Documentation?

Hello,

I've searched for a language with a syntax like Asteria's for months, but are there any example files or good documentation? And what about sockets / networking or HTTP support?

:)

'undeclared identifier' error in nested functions

func two() {
  func one() {
    return typeof two;
  }
  return one();
}
two();

This program causes the following error:

! runtime error: do_push_global_reference: undeclared identifier `two`
        [thrown from native code at 'asteria/src/runtime/air_node.cpp:925']
        -- backtrace:
         * [0] <native code>:-1 (native code) -- string(121) "do_push_global_reference: undeclared identifier `two`\n[thrown from native code at \'asteria/src/runtime/air_node.cpp:925\']"
         * [1] <stdin>:6 (function) -- string(5) "two()"
         * [2] <stdin>:8 (function) -- string(12) "<file scope>"
        -- end of backtrace

`opaque` and non-copyable classes

One of the core principles of Asteria is that values may be copied and destroyed with no side effects. The current implementation relies heavily on copy-on-write. For extensibility and interoperability, we added the opaque type, but:

  1. If an opaque object is copied via COW, will the user observe the difference?
  2. If a COW'd object is modified, will other references to the same object observe the modification (note functions are immutable)?
  3. If the user really wishes to perform a deep copy, when should the object be copied?

Bad assignment to references

var obj = { };
ref x -> obj.meow;
x = x ?? 42;
assert obj.meow == 42;

gives

! error: asteria runtime error: apply_const_opt: String subscript applied to non-object (parent `integer 42`, key `meow`)
[thrown from native code at 'asteria/src/runtime/reference_modifier.cpp:45']
[backtrace frames:
  #0 native code at '<unknown>:-1:-1': string(166) "apply_const_opt: String subscript applied to non-object (parent `integer 42`, key `meow`)\n[thrown from native code at \'asteria/src/runtime/reference_modifier.cpp:45\']"
  #1   frame at '<stdin>:4:16': string(0) ""
  #2   function at '<stdin>:0:0': string(12) "<file scope>"
  -- end of backtrace frames]

Build failure on GCC 7.3, due to -Wstrict-overflow=2

In file included from asteria/src/rocket/cow_string.hpp:21:0,
                 from asteria/src/rocket/insertable_streambuf.hpp:9,
                 from asteria/src/rocket/insertable_streambuf.cpp:4:
asteria/src/rocket/cow_string.hpp: In destructor ‘rocket::basic_insertable_streambuf<charT, traitsT, allocatorT>::~basic_insertable_streambuf() [with charT = char; traitsT = std::char_traits<char>; allocatorT = std::allocator<char>]’:
asteria/src/rocket/assert.hpp:19:62: error: assuming signed overflow does not occur when simplifying conditional to constant [-Werror=strict-overflow]
 #  define ROCKET_DETAILS_ASSERT(expr_, str_, m_)    ((expr_) ? (void)0 : ROCKET_UNREACHABLE())
                                                     ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
asteria/src/rocket/assert.hpp:22:43: note: in expansion of macro ‘ROCKET_DETAILS_ASSERT’
 #define ROCKET_ASSERT(expr_)              ROCKET_DETAILS_ASSERT(expr_, #expr_, "")
                                           ^~~~~~~~~~~~~~~~~~~~~
asteria/src/rocket/cow_string.hpp:166:15: note: in expansion of macro ‘ROCKET_ASSERT’
               ROCKET_ASSERT(old > 0);
               ^~~~~~~~~~~~~
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)

Double buffering in `variant`

Our current implementation of variant uses the double buffering technique, which wastes 50% of the space of a Value. Although this might sound minor, another level of nested variant would waste 75% of the space (of a Reference), and yet another level would waste 87.5% (of an Xpnode). This could ultimately incur a performance penalty due to cache misses.
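The percentages follow from halving the usable space at each level of nesting. A small sketch of that arithmetic, assuming every level can use exactly half of its storage at any time (`wasted_fraction` is a hypothetical helper, not project code):

```cpp
#include <cassert>

// Fraction of storage left unused after `levels` nested double-buffered
// variants, assuming each level halves the usable space.
inline double wasted_fraction(int levels)
  {
    assert(levels >= 0);
    double used = 1.0;
    for(int i = 0; i < levels; ++i)
      used /= 2;
    return 1.0 - used;
  }
```

One, two and three levels give 0.5, 0.75 and 0.875, matching the figures for Value, Reference and Xpnode.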

standard streams and locales

At the moment a number of features (e.g. std.debug.dump() and exception messages) depend on rocket::insertable_ostream, which implements std::ostream. As it copies the global locale object upon construction, which can be altered by the user with std::locale::global(), it is not safe to rely on its behavior. An example where this matters is that in many European locales (e.g. German and French), the decimal separator is a comma (,) rather than a period (.). A permanent solution to this issue is needed.

P.S. Only output streams have this issue. The only functions in Asteria that take a std::istream as input are the constructor of Simple_Source_File and Simple_Source_File::reload(), which operate on the streambuf directly without regard to the locale object stored in the stream.
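One possible mitigation, shown here only as a sketch with standard streams, is to imbue each output stream with the classic "C" locale before writing. The names `Comma_Punct`, `format_real_classic` and `demo_locale_independence` are hypothetical; the facet merely imitates a locale whose decimal separator is a comma, so the example does not depend on any locale being installed.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// A facet that mimics locales whose decimal separator is a comma.
struct Comma_Punct : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
};

// Formats a number independently of the global locale.
inline std::string format_real_classic(double value)
  {
    std::ostringstream os;
    os.imbue(std::locale::classic());
    os << value;
    return os.str();
  }

// Usage: a freshly constructed stream picks up the altered global
// locale, whereas the imbued stream does not.
inline void demo_locale_independence()
  {
    std::locale::global(std::locale(std::locale::classic(), new Comma_Punct));
    std::ostringstream plain;
    plain << 1.5;
    assert(plain.str() == "1,5");
    assert(format_real_classic(1.5) == "1.5");
  }
```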

[RFC] Standard library module: process

  1. std.process.spawn(path, [argv], [envp])
    Wraps posix_spawn() and waitpid().
  2. std.process.daemonize()
    Wraps daemon().
  3. umask() ?
  4. getrlimit() and setrlimit() ?
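A rough sketch of what item 1 might wrap is given below. It is POSIX-only, `spawn_and_wait` is a hypothetical name, and the child simply inherits the parent's environment instead of taking an `envp` argument. Throwing on abnormal termination is an assumption as well.

```cpp
#include <cassert>
#include <cerrno>
#include <spawn.h>
#include <stdexcept>
#include <string>
#include <sys/wait.h>
#include <system_error>
#include <vector>

extern "C" char** environ;

// Spawns `path` with the arguments `argv` (including argv[0]), waits for
// the child, and returns its exit status.
inline int spawn_and_wait(const std::string& path, const std::vector<std::string>& argv)
  {
    assert(!argv.empty());

    // Build the null-terminated pointer array that posix_spawn() expects.
    std::vector<char*> ptrs;
    for(const auto& arg : argv)
      ptrs.push_back(const_cast<char*>(arg.c_str()));
    ptrs.push_back(nullptr);

    pid_t pid;
    int err = ::posix_spawn(&pid, path.c_str(), nullptr, nullptr, ptrs.data(), environ);
    if(err != 0)
      throw std::system_error(err, std::generic_category(), "posix_spawn");

    // Retry if the wait is interrupted by a signal.
    int status;
    while(::waitpid(pid, &status, 0) < 0)
      if(errno != EINTR)
        throw std::system_error(errno, std::generic_category(), "waitpid");

    if(!WIFEXITED(status))
      throw std::runtime_error("child terminated abnormally");
    return WEXITSTATUS(status);
  }
```

For example, spawning `/bin/sh` with the arguments `sh`, `-c`, `exit 7` returns 7.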

RFC: Deblockification

The program

var i = 1;
{
  i += 2;
  i *= 3;
}
std.io.putf("i = $1\n", i);  // 9

can be transformed to

var i = 1;
// {
i += 2;
i *= 3;
// }
std.io.putf("i = $1\n", i);  // 9

However, this requires rewriting the IR, as follows:

var i = 1;
{
  i += 2;   // depth of local reference `i` is 1
  i *= 3;
}
std.io.putf("i = $1\n", i);  // 9

becomes

var i = 1;
// {
i += 2;    // depth of local reference `i` is now 0
i *= 3;
// }
std.io.putf("i = $1\n", i);  // 9

This can be performed during either code generation or IR solidification. We also have to apply the same transformation to not only plain blocks, but if branches, switch clauses, while bodies, etc. (note due to type completeness limitation, these cannot be Statements).
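The depth rewriting can be illustrated on a toy IR. This sketch is far simpler than the real AIR and rests on a loud assumption: the blocks being removed declare no variables of their own. The `Node` type and both function names are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy IR node: either a reference to a local variable or a plain block.
struct Node {
    enum Kind { local_ref, block } kind;
    int depth = 0;            // local_ref: number of scopes to walk outwards
    std::vector<Node> body;   // block: nested nodes
};

// Decrements the depth of every reference that points outside the block
// being removed. `level` is the nesting depth relative to that block.
inline void shift_refs(std::vector<Node>& nodes, int level)
  {
    assert(level >= 0);
    for(auto& node : nodes) {
      if(node.kind == Node::local_ref) {
        if(node.depth > level)
          --node.depth;
      }
      else
        shift_refs(node.body, level + 1);
    }
  }

// Replaces each plain block in `nodes` with its contents, recursively.
// Assumes that no block declares variables of its own.
inline std::vector<Node> deblockify(std::vector<Node> nodes)
  {
    std::vector<Node> out;
    for(auto& node : nodes) {
      if(node.kind != Node::block) {
        out.push_back(std::move(node));
        continue;
      }
      shift_refs(node.body, 0);
      for(auto& inner : deblockify(std::move(node.body)))
        out.push_back(std::move(inner));
    }
    return out;
  }
```

In the example above, the two references to `i` inside the block start at depth 1 and end up at depth 0 once the block is removed.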

Wish List

I am considering a demo release, so I would like to hear your opinions: what you wish for, what you would like, and what you propose, either as a core language feature or as a library component.

If you have any suggestions, please don't hesitate to let me know.

[RFC] Exceptions or error codes?

Prologue: We don't take std::bad_alloc etc. into account here.

  1. std.filesystem.get_information() a.k.a. stat()
    Suggested solution: Return null on failure. Never throw exceptions.
    Rationale: Security consideration.
  2. std.filesystem.move_from() a.k.a. rename()
    Suggested solution: Always throw an exception on failure.
  3. std.filesystem.remove_recursive()
    Suggested solution: Return 0 if the path does not exist. Throw an exception otherwise.
  4. std.filesystem.directory_list()
    Suggested solution: Return null if the path does not exist. Throw an exception otherwise.
  5. std.filesystem.directory_create() a.k.a. mkdir()
    Suggested solution: Return false if a directory already exists. Throw an exception otherwise.
  6. std.filesystem.directory_remove() a.k.a. rmdir()
    Suggested solution: Return false if the path does not exist. Throw an exception otherwise.
  7. std.filesystem.file_read() a.k.a. read() and pread()
    Suggested solution: Return null if the path does not exist. Return an empty string if the end-of-file has been reached. Throw an exception otherwise.
  8. std.filesystem.file_stream() a.k.a. read() and pread()
    Suggested solution: Return null if the path does not exist. Return 0 if the end-of-file has been reached. Throw an exception otherwise.
  9. std.filesystem.file_write() a.k.a. write() and pwrite()
    Suggested solution: Always throw an exception on failure.
  10. std.filesystem.file_append() a.k.a. write() and pwrite()
    Suggested solution: Always throw an exception on failure.
  11. std.filesystem.copy_from()
    Suggested solution: Always throw an exception on failure.
  12. std.filesystem.file_remove()
    Suggested solution: Return false if the path does not exist. Throw an exception otherwise.
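As an illustration of the mixed convention, item 12 could be sketched on top of POSIX `unlink()` as follows. `file_remove_sketch` is a hypothetical name, and the choice of `std::system_error` as the exception type is an assumption.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <system_error>
#include <unistd.h>

// Removes a file. Returns true if it was removed and false if the path
// does not exist; throws std::system_error for any other failure.
inline bool file_remove_sketch(const std::string& path)
  {
    assert(!path.empty());
    if(::unlink(path.c_str()) == 0)
      return true;
    if(errno == ENOENT)
      return false;
    throw std::system_error(errno, std::generic_category(), "unlink");
  }
```

A caller can then treat "already gone" as an ordinary result while still being told about permission problems and similar failures.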

Use of a variable being declared inside its own initializer

Synopsis

In languages that distinguish definitions from assignments and have variables with scopes, what would you expect the following pseudo code to output?

DEFINE a AS INTEGER = 1
IF true THEN
  DEFINE a AS INTEGER = a + 1    # should name lookup find the first or second `a`?
  PRINT "inner a = ", a
END IF
PRINT "outer a = ", a

Examples

Node.js

let a = 1;
if(true) {
  let a = a + 1;
  console.log("inner: a = ", a);
}
console.log("outer: a = ", a);

gives

/dev/pts/0:3
  let a = a + 1;
          ^

ReferenceError: a is not defined
    at Object.<anonymous> (/dev/pts/0:3:11)
    at Module._compile (internal/modules/cjs/loader.js:778:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
    at Module.load (internal/modules/cjs/loader.js:653:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:593:12)
    at Function.Module._load (internal/modules/cjs/loader.js:585:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:831:12)
    at startup (internal/bootstrap/node.js:283:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:623:3)

Lua

local a = 1
if true then
  local a = a + 1
  print("inner: a = ", a)
end
print("outer: a = ", a)

gives

inner: a =      2
outer: a =      1

Perl

my $a = 1;
if(true) {
  my $a = $a + 1;
  print("inner: $$a = $a\n");
}
print("outer: $$a = $a\n");

gives

inner:  = 2
outer:  = 1

Debug information?

var x = 100;
x *= 200;
x *= 300;
x *= 400;
x *= 500;
x *= 600;
x *= 700;
x *= 800;
x *= 900;

gives

! runtime error: do_operator_MUL: integer multiplication overflow (operands were `504000000000000000` and `800`)
        [thrown from native code at 'asteria/src/runtime/air_node.cpp:1071']
        [backtrace:
          #0 <native> at '<native code>:-1': string(164) "do_operator_MUL: integer multiplication overflow (operands were `504000000000000000` and `800`)\n[thrown from native code at \'asteria/src/runtime/air_node.cpp:1071\']"
          #1 <function> at '<stdin>:0': string(11) "<top level>"
          -- end of backtrace]
        [exception class `N7Asteria13Runtime_ErrorE`]

I expect a detailed source location in the backtrace, instead of just saying the exception was thrown from native code, where no contextual information is available.
