jumcc (Jasper's UM C Compiler)

jumcc is a compiler for um-C, a small C-like language I created. It targets the umasm language used to teach assembly programming to students in Tufts University's CS40 course.

Installation

To compile jumcc from source, install Stack.

Then change to the jumcc source directory and issue the following commands:

stack setup
stack install

stack setup downloads the GHC compiler if you don't already have it, and stack install installs the jumcc executable to ~/.local/bin, which you should add to your PATH.

Usage

By default, jumcc takes the name of a primary um-C source file [name].umc and any number of additional source files, and writes its output to a file called out.ums.

jumcc src.umc ...

To specify an output file name, use the -o option.

jumcc -o out.ums src.umc ...

To generate an executable .um binary, link the resulting .ums file with umcrtn.ums and umcrt1.ums, both found in the umcrt directory, using umasm, the UM Macro Assembler (as far as I know, umasm is only available to those with a Tufts EECS account). The sample directory also contains a toy standard library, which you can compile, that implements simple versions of puts and gets.

umasm umcrtn.ums stdlib.ums [your-file].ums ... umcrt1.ums

I've included two additional sample programs in the sample directory that you can compile and assemble/link. Both depend on stdlib.umc.

um-C

um-C is intended to be a strict subset of C. It is statically typed, and jumcc enforces this with a type checker. I haven't written a grammar yet, but here is a general overview of the language:

The include directive instructs the preprocessor to insert the contents of another file into the source code at that point. The file name must be enclosed in double quotes:

#include "stdlib.umc"

The preprocessor also supports single-line and multi-line comments:

// I am a single-line comment

/*
 * I am a multi-
 * line comment
 */

There are two primitive types: char and int. Both are unsigned and stored in memory as 32-bit values, but char values are truncated to the range 0 to 255 when accessed. They can be declared like so:

int a;
int b = 1 + 2;
char c;
char d = 'a' + b;
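
The truncation rule can be modelled in ordinary C. The sketch below is not taken from jumcc: the names umc_word and umc_char_read are hypothetical, and it assumes that "truncated" means keeping the low 8 bits of the stored 32-bit word.

```c
#include <stdint.h>

/* Hypothetical model, not jumcc code: every um-C value occupies one
 * unsigned 32-bit word. */
typedef uint32_t umc_word;

/* Reading a char keeps only the low 8 bits, giving a value in 0..255.
 * Masking is an assumption about how the truncation is done. */
static umc_word umc_char_read(umc_word stored) {
    return stored & 0xFFu;
}
```

Under this model, char d = 'a' + b with b equal to 3 reads back as 'd', and a stored value of 256 reads back as 0.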

To use a value of one type as another without being yelled at by the type checker, use a cast expression:

int a = 1;
int *a_but_a_ptr = (int *) a;

um-C supports 1-dimensional arrays of primitive types which can be declared like so:

int[10] a;
char[5] str = "four";
int[3] ints = {1, 2, 3};

The size of an array must be specified with an integer constant.

Pointers are also supported, and a variable can have a type of pointer to pointer, pointer to array, or pointer to primitive type. They are declared like so:

int *a;
char *b;
int *c[];

Programs consist of a series of function definitions and declarations, written as in C. Functions may only be called in your code after they have been declared and/or defined:

char return_a();
int func(int a) {
    return_a();
    return a;
}
char return_a() {
    return 'a';
}

Functions must terminate with a return statement.
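
The declare-before-use rule matters most when two functions call each other. The sketch below uses only constructs described in this document and is also valid C; is_even and is_odd are hypothetical names, and it assumes um-C supports recursion and treats n as non-negative.

```c
/* Declared first so that is_even can call it before its definition. */
char is_odd(int n);

/* Hypothetical example: returns 1 when n is even, otherwise 0. */
char is_even(int n) {
    char result = 1;
    if (n != 0) {
        result = is_odd(n - 1);
    }
    return result;
}

/* Hypothetical example: returns 1 when n is odd, otherwise 0. */
char is_odd(int n) {
    char result = 0;
    if (n != 0) {
        result = is_even(n - 1);
    }
    return result;
}
```

Each function ends with a single return statement, in line with the rule above. Without the declaration of is_odd on the first line, the call inside is_even would refer to a function that has not yet been declared.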

The main function is the entry point into a program, and there can be only one across all the .umc files you intend to compile and link together to create your program.

int main() {
    ...
    return 0;
}

Functions contain a series of statements. They can be variable declarations as seen before or:

Function calls:

puts("hello world");
sum(1, 2);
gets(str, 10);

Assignment:

*str = 'a';
arr[5] = (10 + b);

Return:

return 100;
return b[5];

Or one of the two control flow structures in um-C:

While loops:

while (x > 10) {
    x = x - 1;
    puts("thats crazy\n");
}

If statements:

if (x == 10) {
    puts("x is 10\n");
}

The outb I/O primitive:

outb('a');
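
The statement forms above can be combined in one function. The sketch below is a hypothetical example (count_evens is not part of jumcc or its samples) that is also valid C; nesting an if inside a while is assumed to work in um-C as it does in C.

```c
/* Hypothetical example: counts the even numbers in 0 .. n-1. */
int count_evens(int n) {
    int count = 0;   /* variable declarations with initializers */
    int i = 0;
    while (i < n) {
        if ((i % 2) == 0) {
            count = count + 1;   /* assignment */
        }
        i = i + 1;
    }
    return count;    /* the function terminates with a return */
}
```

Calling count_evens(5) visits 0, 1, 2, 3, and 4 and counts the three even values.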

Available for use in expressions are the following:

The inb I/O primitive

char c = inb();

Function calls

int a = sum(1, 2) + 3;

Relops <, <=, >, >=, !=, ==, &&, ||. These all function the same way as they do in C. Note that compound expressions on either side of a && or || operator must be in parentheses.

char tru = 1 && 1;

Binops +, -, *, /, &, |, ^, %.

int inplusone = inb() + 1;

Unary (prefix) operators -, !, ~, *, &

*(str + 3) = *(str + 2);

Note that only pointer and array type values may be dereferenced.

Unary (postfix) operator [] used for array access.

str[3] = str[2];
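
A short sketch of the operator rules above, written so that it is also valid C. The function names are hypothetical and not part of jumcc or its samples. That the pointer form and the index form touch the same element is C's behavior, which um-C is assumed to share, and passing a char pointer as a parameter is likewise assumed to be supported.

```c
/* Hypothetical example: returns 1 when 10 <= x <= 20, otherwise 0.
 * Each operand of && is a compound expression, so each is parenthesized. */
char in_range(int x) {
    char ok = (x >= 10) && (x <= 20);
    return ok;
}

/* Copies element 2 into element 3 using the prefix * operator. */
int copy_with_deref(char *str) {
    *(str + 3) = *(str + 2);
    return 0;
}

/* The same copy using the postfix [] operator. */
int copy_with_index(char *str) {
    str[3] = str[2];
    return 0;
}
```

Applied to a five-element buffer holding "four", either copy function leaves it holding "fouu".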

Notes

  • The default stack size for a um-C program is 100000 32-bit words, but you can increase this by changing the value in umcrtn.ums.

License

MIT (c) Jasper Geer
