zsaleeba / picoc
A very small C interpreter
Hi Zik. I've been working to fix bugs in picoc and one of them is an Out of Memory error. It happens in LexTokenise in this code:
int ReserveSpace = (int) (Lexer->End - Lexer->Pos) * 4 + 16;
void *TokenSpace = HeapAllocStack(pc, ReserveSpace);
Lexer->End is pointing to the end of the source file, and Lexer->Pos is at the beginning. So if the source file is large, like some of the unit tests in the test/csmith directory (e.g. rand0.c is 41K), it tries to allocate 4 times the file size on the stack.
Can you explain the thinking behind this? Why allocate 4 times the size of the text we're lexing?
Thanks,
Bob Alexander
Hi, I am making my own programming language. How can I modify this interpreter to read my language?
My target is an ARM processor with 16 KB of RAM and 128 KB of flash ROM, and there is no filesystem. Is it possible to use picoc on it?
Thank you, it is a very nice project.
Do you have some examples built using picoc?
$ cat main.c
void main() {}
$ valgrind --leak-check=yes --track-origins=yes ./picoc main.c
==30986== Memcheck, a memory error detector
==30986== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==30986== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==30986== Command: ./picoc main.c
==30986==
==30986== Conditional jump or move depends on uninitialised value(s)
==30986== at 0x4194A7: VariableScopeBegin (variable.c:177)
==30986== by 0x414D3B: ParseBlock (parse.c:519)
==30986== by 0x41516D: ParseStatement (parse.c:659)
==30986== by 0x413638: ParseStatementMaybeRun (parse.c:30)
==30986== by 0x413BCE: ParseFunctionDefinition (parse.c:146)
==30986== by 0x414484: ParseDeclaration (parse.c:336)
==30986== by 0x41564A: ParseStatement (parse.c:772)
==30986== by 0x415E85: PicocParse (parse.c:966)
==30986== by 0x404207: PicocPlatformScanFile (platform_unix.c:131)
==30986== by 0x4161BD: main (picoc.c:54)
==30986== Uninitialised value was created by a stack allocation
==30986== at 0x415D58: PicocParse (parse.c:937)
==30986==
==30986== Conditional jump or move depends on uninitialised value(s)
==30986== at 0x4194A7: VariableScopeBegin (variable.c:177)
==30986== by 0x414D3B: ParseBlock (parse.c:519)
==30986== by 0x41516D: ParseStatement (parse.c:659)
==30986== by 0x40FB6A: ExpressionParseFunctionCall (expression.c:1545)
==30986== by 0x40EDBD: ExpressionParse (expression.c:1251)
==30986== by 0x415123: ParseStatement (parse.c:653)
==30986== by 0x415E85: PicocParse (parse.c:966)
==30986== by 0x416523: PicocCallMain (platform.c:77)
==30986== by 0x416242: main (picoc.c:57)
==30986== Uninitialised value was created by a stack allocation
==30986== at 0x415D58: PicocParse (parse.c:937)
==30986==
==30986==
==30986== HEAP SUMMARY:
==30986== in use at exit: 0 bytes in 0 blocks
==30986== total heap usage: 125 allocs, 125 frees, 136,140 bytes allocated
==30986==
==30986== All heap blocks were freed -- no leaks are possible
==30986==
==30986== For counts of detected and suppressed errors, rerun with: -v
==30986== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Solution (lex.c):
void LexInitParser(struct ParseState *Parser, Picoc *pc, const char *SourceText, void *TokenSource, char *FileName, int RunIt, int EnableDebugger)
{
Parser->ScopeID = 0; /* HERE */
Parser->pc = pc;
Could PicocResetState() be implemented? The dumb approach is a PicocCleanup/PicocInitialise pair, but...
PicocPlatformScanFile( &pc, "file.1.c" );
PicocCallMain( &pc );
/* --- ??? --- */
PicocResetState( &pc );
/* --- ??? --- */
PicocPlatformScanFile( &pc, "file.2.c" );
PicocCallMain( &pc );
The following input in interactive mode produces a failed assertion:
starting picoc v2.2
picoc> int a,b,c,d;
picoc> a=b=c=d=0;
picoc: lex.c:574: LexTokenise: Assertion `ReserveSpace >= MemUsed' failed.
[1] 6395 abort (core dumped) picoc -i
However, doing the same with two or three variables works fine, e.g.:
// This works fine
picoc> int a,b,c;
picoc> a=b=c=0;
According to the IncludeFile() definition, the built-in include headers (such as stdio.h) are protected against multiple inclusion, so the following code runs with no problems:
void main(void) {
while (1) {
#include <stdio.h>
}
}
I agree it does not make sense in itself, but read on to see why I need it. As per PicocPlatformScanFile(), no such protection is implemented for user-defined includes, so the following code will sooner or later run out of memory when interpreted by PicoC:
void main(void) {
while (1) {
#include "my-include.h"
}
}
with my-include.h containing arbitrary (but valid) C code, e.g.:
int i;
At the same time, the same source compiled with gcc runs in constant memory. I would be grateful for an explanation of the logic behind this. What use case could take advantage of multiple inclusions of the same file?
For my usage scenario such a design is lethal, since I am trying to parametrize the program's behaviour by defining the body of a function in a separate file, which could be modified by the user, e.g.:
void my_func(void) {
#include "my-include.c"
}
The function itself would contain assignments and logical operations on input variables that modify output variables and would be called in a loop by the main program.
Could you make a 2.1 release here on GitHub?
That would make v2.1 fetchable again from FreeBSD ports.
http://www.freshports.org/lang/picoc
Thank you.
Character constants in C should be of the int type (see C99 §6.4.4.4 paragraph 10), yet using picoc the following snippet produces '1' in cases where it should produce '0' (e.g. when sizeof(int) == 4 and sizeof(char) == 1):
#include <stdio.h>
int main(void) {
char input = 'A';
printf("%d\n", sizeof(input) == sizeof('A'));
return 0;
}
When I try to "make" picoc on Ubuntu, this is the result:
wbhart@hilbert:~/picoc$ make
gcc -Wall -pedantic -g -DUNIX_HOST -DVER=\"`svnversion -n`\" -c -o picoc.o picoc.c
gcc: error: directory": No such file or directory
make: *** [picoc.o] Error 1
Here is the output of svnversion -n:
wbhart@hilbert:~/picoc$ svnversion -n
Unversioned directorywbhart@hilbert:~/picoc$
Your library seems great, and we are struggling to add it to an Arduino program.
Maybe you can help us understand how to even start porting it:
Is there any documentation, or has anybody tried it with Arduino?
Thank you.
I have tried to run the 42_function_pointer.c example,
but it gives me an error about the function pointer:
bad type declaration <-- int (*f)(int) = &fred;
#include <stdio.h>
int fred(int p) {
printf("yo %d\n", p);
return 42;
}
int (*f)(int) = &fred;
int main()
{
printf("%d\n", (*f)(24));
return 0;
}
When running the tests, the typedef test case fails.
$ make test
...
Test: 63_typedef...
error in test 63_typedef
--- 63_typedef.expect 2015-06-02 08:03:21.363460115 +0200
+++ 63_typedef.output 2015-06-02 08:25:12.031715146 +0200
@@ -1,11 +1,11 @@
-104<1>
-17768<2>
-19088744<4>
--1250999861249<8>
+-1164378113<4>
152<1>
47768<2>
4275878552<4>
-280223976849407<8>
+3130589183<4>
-19088744
(1, 3)
(1, 2)
Makefile:70: recipe for target '63_typedef.test' failed
I'm running the test on an i686 installation.
$ uname -a
Linux pelagic-joth 3.19.0-18-generic #18-Ubuntu SMP Tue May 19 18:30:59 UTC 2015 i686 i686 i686 GNU/Linux
Let's add something simple that makes sure pull requests are tested and do not break the project.
A first Travis integration could show that the build passes on 64-bit platforms but fails on 32-bit ones. Starting from there, we could use pull requests to fix these issues.
Hi,
Thanks for the great job developing PicoC in the first place!
When playing around with the interpreter I noticed that using static variables triggers a memory leak. To reproduce the bug run the following minimal working example:
void main(void) {
static int i;
}
(saved e.g. in static.c) under valgrind (picoc compiled with -O0 -g):
$ valgrind --leak-check=yes --track-origins=yes ./picoc static.c
On my Linux 3.16.0 with gcc 4.8.4 and valgrind 3.10.0.SVN it outputs:
==10829== HEAP SUMMARY:
==10829== in use at exit: 40 bytes in 1 blocks
==10829== total heap usage: 130 allocs, 129 frees, 136,416 bytes allocated
==10829==
==10829== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==10829== at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10829== by 0x40F0DC: HeapAllocMem (heap.c:138)
==10829== by 0x410B10: VariableAlloc (variable.c:73)
==10829== by 0x410B87: VariableAllocValueAndData (variable.c:91)
==10829== by 0x4119A1: VariableDefinePlatformVar (variable.c:377)
==10829== by 0x4116D0: VariableDefineButIgnoreIdentical (variable.c:332)
==10829== by 0x408709: ParseDeclaration (parse.c:345)
==10829== by 0x409852: ParseStatement (parse.c:772)
==10829== by 0x408FD9: ParseBlock (parse.c:536)
==10829== by 0x409375: ParseStatement (parse.c:659)
==10829== by 0x40EBA5: ExpressionParseFunctionCall (expression.c:1545)
==10829== by 0x40DDF8: ExpressionParse (expression.c:1251)
==10829==
==10829== LEAK SUMMARY:
==10829== definitely lost: 40 bytes in 1 blocks
==10829== indirectly lost: 0 bytes in 0 blocks
==10829== possibly lost: 0 bytes in 0 blocks
==10829== still reachable: 0 bytes in 0 blocks
==10829== suppressed: 0 bytes in 0 blocks
The problem vanishes when the static qualifier is removed. The issue becomes much more serious when static variables are used within a function called periodically, e.g.:
void my_func(void) {
static int i;
}
void main(void) {
while (1) {
my_func();
}
}
The code above acquires heap memory pretty quickly (which you can see e.g. with top under Linux) and gets killed by the kernel as soon as it exhausts the physical memory, e.g. on my embedded device:
Out of memory: Kill process 918 (picoc) score 865 or sacrifice child
Killed process 918 (picoc) total-vm:227516kB, anon-rss:224908kB, file-rss:408kB
My quick and dirty fix would be to put all global variables allocated with VariableDefinePlatformVar on the stack instead of the heap, i.e. replace line 377 of variable.c:
struct Value *SomeValue = VariableAllocValueAndData(pc, NULL, 0, IsWritable, NULL, TRUE);
with:
struct Value *SomeValue = VariableAllocValueAndData(pc, NULL, 0, IsWritable, NULL, FALSE);
This way valgrind reports no memory leaks on my static.c above, and the infinite loop calling my_func runs in constant memory. Moreover, all the regression tests keep passing with no errors.
Could you advise whether this thinking makes any sense? Or should the call to VariableDefinePlatformVar be context sensitive, so that only local-scope copies of static variables are allocated on the stack?
Best regards,
Lukasz
Reading the source, I saw how to run a program whose source is in a file, either calling main() or not calling it (as a script). This is in picoc.c.
Also in picoc.c, I found how to run a program stored in a string rather than in a file, in what's called "surveyor host" mode, but that runs it without calling main().
I want both things: to run a program that is already stored in a string, and to start its execution at main().
My guess is the following code snippet, but is it correct? I'm afraid the program would run twice this way (first when calling PicocParse() and again when calling PicocCallMain()), but I don't know how else to achieve it. I didn't find any docs explaining it.
Is the following guess correct? How should I do it?
int picocrun(const char *src, int argc, char **argv)
{
Picoc pc;
int StackSize = 128*1024; /* space for the stack */
PicocInitialise(&pc, StackSize);
if (PicocPlatformSetExitPoint(&pc))
{
PicocCleanup(&pc);
return pc.PicocExitValue;
}
PicocParse(&pc, "nofile", src, strlen(src), TRUE, FALSE, TRUE, TRUE);
PicocCallMain(&pc, argc, argv);
PicocCleanup(&pc);
return pc.PicocExitValue;
}
Is it possible to pass a function pointer (e.g. via argv) to main() and call the function from inside the C 'script'?
I tested successfully with a struct pointer, but I haven't found a solution for a function pointer.
Or is there another way to access data of the main program through the API?
Thanks and best regards,
Jörg
The following snippet outputs '10' rather than the expected value of '1', because macro expansion is handled incorrectly in the interpreter:
#include <stdio.h>
#define NUM_ONE 0
#define NUM_TWO 1
#define NUM_THREE NUM_ONE + NUM_TWO
int main(void) {
printf("%d\n", 10 * NUM_THREE);
return 0;
}
typedef void (*pTestFunc)();
Got the error message: "bad type declaration"
I've done a superficial timing test of the two versions using an ad-hoc, long, empty for loop.
The result was that ver. 2.2 is approximately 3 times slower than ver. 2.1!
Could you tell me what I am doing wrong?
Step-by-step:
I got ver. 2.1 from Google Code and ver. 2.2 from GitHub.
I made the necessary cosmetic changes to unistd in order to get them compiled by the MinGW32 toolkit.
I compiled both (no optimisations) with the TDM-GCC 4.8.1 32-bit release on Win 7.
I ran a very simple script containing an empty for loop that iterates 3 million times.
Thank you in advance.
Hi!
Have you done any performance comparisons with common scripting languages (Perl 5, CPython)?
On my machine the following snippet outputs '1' from its second printf, despite the unsigned value of status being reported as '4294967295' and C's usual arithmetic conversions indicating that status should be converted to unsigned int for the comparison:
#include <stdio.h>
int main(void) {
int status = -1;
unsigned int value = 1;
printf("%u\n", status);
printf("%d\n", status < value);
return 0;
}
Excuse the amateurish question, but how can I fetch the value of variables declared in the source code that I pass to ParseStatement through its ParseState parameter?
For example, if I run ParseStatement using a ParseState that I prepared by passing int x = 5; as the source parameter of LexInitParser, how can I get the value of x within the scope where I run ParseStatement?
I have tried iterating through pc->GlobalHashTable and pc->StringHashTable and searching for my variable by checking entry->p.v.Key and entry->p.v.Val, but couldn't find it.
Many thanks in advance.