ctod's People

Contributors

dkorpel, schveiguy

ctod's Issues

non-initial float member of union shouldn't be default initialized to 0

In a struct, floats should be initialized to 0 to prevent surprises.

However, in a union, D does not permit setting the default value of members that aren't the first.

So the following doesn't work:

typedef union
{
    stbir_uint32 u;
    float f;
} stbir__FP32;

=>

union _Stbir__FP32 {
    stbir_uint32 u;
    float f = 0; // error
}alias stbir__FP32 = _Stbir__FP32;

Let's talk about macros

I feel like the current macro translation situation is poor in ctod. Probably not ctod's fault, and again, we are creeping towards a full compiler but...

Just ran into this:

#define MIN(a,b) (((a)<(b))?(a):(b))
pixels[y*image->width + x+1].r = MIN((int)pixels[y*image->width + x+1].r + (int)((float)rError*7.0f/16), 0xff);

Translates to:

enum string MIN(string a,string b) = ` (((a)<(b))?(a):(b))`;
pixels[y*image.width + x+1].r = MIN(cast(int)pixels[y*image.width + x+1].r + cast(int)(cast(float)rError*7.0f/16), 0xff);

Lots of problems here:

  1. The enum doesn't actually do string substitution!
  2. The call doesn't translate to using strings instead of the values themselves.
  3. The call really should require a mixin!

I get that ctod has to do something here. But this isn't very useful. I get that recognizing that MIN is a macro call, and therefore converting the argument expressions to strings, would be difficult to do automatically. But I'd almost rather have a nested function call than an enum here.

Can we explore other options?
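
One option, sketched here as a manual rewrite of the C source and not as something ctod does today, is to turn the function-like macro into a static inline function before translating. The names min_int and clamp_channel are hypothetical, and the sketch assumes every call site passes int-compatible values, as the pixel example above does.

```c
/* Hypothetical replacement for the MIN macro; assumes int operands. */
static inline int min_int(int a, int b)
{
    /* Unlike the macro, each argument is evaluated exactly once. */
    return (a < b) ? a : b;
}

/* Clamp a colour channel to 0xff, mirroring the call site above. */
static inline unsigned char clamp_channel(int value)
{
    return (unsigned char)min_int(value, 0xff);
}
```

A plain function like this should translate to an ordinary D function, so no string enum or mixin is involved. The trade-off is that the generic macro has to be split per operand type.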

using sizeof with multiplication causes confusion

Another weird one. I originally thought it was related to #4, but it happens without the unsigned keyword.

void foo(void) {
    size_t a = sizeof(unsigned char) * 5;
    size_t b = sizeof(unsigned char);
    size_t c = sizeof(int) * 5;
    size_t d = sizeof(int);
}

=>

void foo() {
    size_t a = sizeofcast(ubyte) * 5;
    size_t b = ubyte.sizeof;
    size_t c = sizeofcast(int) * 5;
    size_t d = int.sizeof;
}

sizeof(...) * something is used a lot in malloc calls, so this is an important one.
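
For context, this is the kind of allocation where the pattern shows up. The sketch below is plain C with hypothetical names (alloc_ints, alloc_ints_alt). It only shows that the two spellings are equivalent in C, because sizeof(int) * n means (sizeof(int)) * n and not sizeof applied to a cast of *n. Whether putting the count first avoids the mistranslation in ctod is an untested assumption.

```c
#include <stdlib.h>

/* Common pattern: sizeof(type) * count inside a malloc call. */
int *alloc_ints(size_t n)
{
    return malloc(sizeof(int) * n);
}

/* Same size with the count first, so no '*' follows the parenthesized type. */
int *alloc_ints_alt(size_t n)
{
    return malloc(n * sizeof(int));
}
```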

Usage of struct tag results in odd translation

This is likely a somewhat uncommon occurrence, as most code will typedef structs into a symbol, but referring to a struct by its tag results in some odd code.

struct S {
    int x;
};

struct T {
    struct S s;
};

void foo(struct T t);

=>

struct S {
    int x;
}

struct T {
    struct S ;S s;
}

struct T ;void foo(T t);

I have a file that uses structs without typedefs, and it doesn't translate well.

casts using custom types and parentheses don't turn into cast statements

typedef unsigned char X;

void main()
{
   unsigned char c = 5;
   c = (unsigned char)(c + 5);

   c = (X)(c + 5);  
}

=>

alias X = ubyte;

int main() {
   ubyte c = 5;
   c = cast(ubyte)(c + 5);

   c = (X)(c + 5);  
}

The second assignment should also become a cast. It may not be as easily detectable, but there is a lot of code that uses typedefs and casts.

Remove the parentheses around the expression, and it's recognized as a cast.
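
A likely reason this is hard to detect: without knowing that X is a typedef, (X)(c + 5) has the same shape as a call through a parenthesized function name. A minimal C sketch of the two readings follows; Y, cast_result and call_result are hypothetical names, and the expected values assume 8-bit unsigned char.

```c
typedef unsigned char X;

/* Y is a function, so (Y)(expr) is a call, not a cast. */
static int Y(int v)
{
    return v;
}

/* (X)(c + 10) is a cast: 260 is truncated to an 8-bit value. */
int cast_result(void)
{
    unsigned char c = 250;
    return (X)(c + 10);
}

/* (Y)(c + 10) has the same shape but calls Y with 260. */
int call_result(void)
{
    unsigned char c = 250;
    return (Y)(c + 10);
}
```

Telling the two apart needs the set of typedef names seen so far, which a purely syntactic pass does not have.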

versions vs. enums

In the project I'm working on (raylib), many #defines are specified in a config.h file, and many are specified by the makefile. Some way to distinguish between them would be helpful:

e.g.:

#ifdef PLATFORM_DESKTOP // specified by the makefile
#ifdef SUPPORT_IMAGE_EXPORT // specified by the config.h

I'd like some option of translation for these. Some I want to be version statements, some I want to be enums/static if:

version(PLATFORM_DESKTOP) {
static if(SUPPORT_IMAGE_EXPORT) {

I'm not sure how to envision this. Maybe a configuration file for ctod? I'm not sure if there would be a way to infer the right usage from the existing file. Especially since a lot of the config options are commented out in the config file, so ctod won't even see how they are defined.

varargs calls would be nice to translate

In C, the va_arg macro does some funky stuff with a type name. You use it like:

va_arg(v, int);

This comes out untouched on the D side, where it is obviously invalid syntax.

This should translate to:

va_arg!int(v);

This translation isn't critical, I can do a search/replace, but it would be nice to have. Probably not a huge problem, as not many functions are actually varargs.
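
For reference, a complete C function using the macro looks like the sketch below (sum_ints is a hypothetical name). The second operand of va_arg is a type name and not an expression, which is why the call cannot be carried over to D unchanged.

```c
#include <stdarg.h>

/* Sum `count` int arguments passed after the fixed parameter. */
int sum_ints(int count, ...)
{
    va_list v;
    int total = 0;

    va_start(v, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(v, int); /* second operand is a type, not a value */
    }
    va_end(v);
    return total;
}
```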

lib-tree-sitter-src/makefile is empty

Not sure if this was intentional. To build on macOS, I used the build from the original tree-sitter source, so I don't technically need this one. But I did expect the makefile that is there to work, only to find it's empty.

ifndef with definition

#ifndef foo
  #define foo bar
#endif

=>

version (foo) {} else {
  enum foo = bar;
}

Somewhat nonsensical. Though I get how this happens. Just bringing it up in case there's any better way to handle this.

unsigned without extra type doesn't get copied properly

typedef struct S {
   unsigned x[10];
   unsigned y;
   unsigned int z[10];
} S;

void foo(void)
{
    unsigned x = 5;
}

=>

struct S {
   [10] x;
    y;
   uint[10] z;
}

void foo() {
     x = 5;
}

I believe unsigned without a further type is unsigned int.
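
That matches the C standard, which lists unsigned as another spelling of unsigned int. A small check using C11 _Generic (the function name is hypothetical):

```c
/* Returns 1 when a variable declared as bare `unsigned` has type unsigned int. */
int bare_unsigned_is_unsigned_int(void)
{
    unsigned x = 5;
    return _Generic(x, unsigned int: 1, default: 0);
}
```

So the expected D output for the struct above would use uint for all three members.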

adding integer to C array

C:

void foo(void) {
    int x[10];
    int *ptr = x + 5;
}

D:

void foo() {
    int[10] x;
    int* ptr = x + 5; // should be x.ptr + 5
}

Not sure if this is solvable in the general case, but you seem to be able to sniff out pointer usage in other cases when it's a static array.
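
The C side works because an array used in an expression decays to a pointer to its first element, which D static arrays do not do. A minimal sketch of what x + 5 means in C (the function name is hypothetical):

```c
#include <stddef.h>

/* Returns the element offset of `x + 5` from the start of the array. */
ptrdiff_t offset_of_x_plus_5(void)
{
    int x[10] = {0};
    int *ptr = x + 5;   /* x decays to &x[0], so this is &x[5] */
    return ptr - &x[0];
}
```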

`typedef enum { ... } E;` should provide aliases for values

In C, when you define an enum type, the members are accessible without the namespace.

This needs to be reproduced in D for equivalent code to compile.

e.g.:

enum X
{
    A,
    B
};

int x = A;

Current conversion:

enum X
{
    A,
    B
}

int x = A;

Proposed conversion:

enum X
{
    A,
    B
}
alias A = X.A;
alias B = X.B;

int x = A;

What to do with linkage definitions?

In a file I'm translating, I have this (this is common for Windows systems):

// Function specifiers in case library is build/used as a shared library (Windows)
// NOTE: Microsoft specifiers to tell compiler that symbols are imported/exported from a .dll
#if defined(_WIN32)
    #if defined(BUILD_LIBTYPE_SHARED)
        #define RAYGUIAPI __declspec(dllexport)     // We are building the library as a Win32 shared library (.dll)
    #elif defined(USE_LIBTYPE_SHARED)
        #define RAYGUIAPI __declspec(dllimport)     // We are using the library as a Win32 shared library (.dll)
    #endif
#endif

// Function specifiers definition
#ifndef RAYGUIAPI
    #define RAYGUIAPI       // Functions defined as 'extern' by default (implicit specifiers)
#endif

Then things are defined like:

RAYGUIAPI void GuiEnable(void);

But when passed via ctod it comes out like:

RAYGUIAPI GuiEnable();

Which somehow swallows the return type. I can work around it by just removing RAYGUIAPI in all cases, but this seems like something that might need addressing.

No rush of course on this, I'm not building DLLs here.

Some possible thoughts -- I don't see how you can correctly translate this to D, as it doesn't allow such a string replacement as the C preprocessor allows. But, what if you could just define direct string replacements? Like, just say, ctod --redefine RAYGUIAPI=export or ctod --redefine RAYGUIAPI=?

C code seems to have degenerate parsing time for some construct

I was playing with transforming neomutt/nntp source code and it seemed to hang. I didn't home in on the exact construct that is causing the parsing issue. :/

The attached newsrc.txt file is a slightly reduced version. It is about 100 lines and takes 20 seconds to translate. Delete a few lines and it drops to 9 seconds; delete the right lines and it's under 1 second. I'm not sure this is still a valid reproduction case, as I've deleted enough arbitrarily that it likely isn't valid C anymore either.

newsrc.txt

What to do with `char x[] = "str"`

So in my code base, I have something like:

char header[] = "LOTS OF TEXT...";
// sometime later
foo(header, sizeof(header)-1);

This gets translated using ctod to:

char * header = "LOTS OF TEXT...";
// sometime later
foo(header, sizeof(header)-1);

It's clear from this that we don't want the size of the pointer minus 1, but the number of bytes (minus the null character).

A couple of problems here:

  1. The correct "type" for this really is char[n].
  2. If that were the translation, then sizeof(header) - 1 would strip off the last character, not the zero terminator!
  3. If header had been declared as const char header[], then this would have compiled and done exactly the wrong thing!

So what to do?

One of the worst things ctod can do is to translate the code into something that compiles, but does the wrong thing. Because nobody is going to scrutinize this.

The sizeof call is obviously wrong, so at least it's flagged by the compiler. But I'm wondering if that was an accident, because other sizeof calls are properly translated.

But really I wonder if this kind of pattern should be recognized and changed to char[N] header = "LOTS OF TEXT...\0";, where N is detected by ctod, to at least make the sizeof calculation accurate?
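
To make the C semantics concrete, here is a minimal sketch contrasting the array and pointer forms. The function names are hypothetical and the string is the placeholder from above, which is 15 characters long.

```c
#include <stddef.h>
#include <string.h>

/* Array form: sizeof counts every byte, including the terminating '\0'. */
size_t array_text_length(void)
{
    char header[] = "LOTS OF TEXT...";
    return sizeof(header) - 1;
}

/* Pointer form: sizeof would give the pointer size, so strlen is needed. */
size_t pointer_text_length(void)
{
    const char *header = "LOTS OF TEXT...";
    return strlen(header);
}
```

Both functions return the same length, which is the value a translation of sizeof(header) - 1 has to preserve.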

For reference, the real code is here:
https://github.com/schveiguy/draylib/blob/acb0b099169d73ac2fc4c11ddf00776bdf0aaa40/raylibc/external/stb_image_write.h#L770

static array parameters

If I have a function in C that takes a sized array, and a call with that same type, the translated D code will build, but won't be equivalent.

e.g.:

#include <stdio.h>

void foo(unsigned short arr[2]) {
    arr[0] = 5;
}

int main() {
    // nested array needed to trick ctod into not putting a .ptr on it
    unsigned short arr[4][2] = { 0 };
    foo(arr[0]);
    printf("arr[0] is %d\n", arr[0][0]);
    return 0;
}

=>

module test;
@nogc nothrow:
extern(C): __gshared:
public import core.stdc.stdio;

void foo(ushort[2] arr) {
    arr[0] = 5;
}

int main() {
    // nested array needed to trick ctod into not putting a .ptr on it
    ushort[2][4] arr = 0;
    foo(arr[0]);
    printf("arr[0] is %d\n", arr[0][0]);
    return 0;
}

The C code prints 5, while the D code prints 0.

My recommendation is probably to use a pointer instead of the static array for the parameters. Or else, use ref. The former is more likely to compile with correct code without modification.
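
The pointer option lines up with what C itself does: an array parameter is adjusted to a pointer, so the size in the brackets is not part of the function type. A minimal sketch, reusing foo from above with a hypothetical helper in place of main:

```c
/* An array parameter is adjusted to a pointer, so these two
   declarations name the same function type. */
void foo(unsigned short arr[2]);
void foo(unsigned short *arr);

void foo(unsigned short *arr)
{
    arr[0] = 5; /* writes through to the caller's storage */
}

/* Mirrors the main() above and returns the value it would print. */
unsigned short call_foo_on_first_row(void)
{
    unsigned short arr[4][2] = { 0 };
    foo(arr[0]);
    return arr[0][0];
}
```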

Sometimes structs aren't being copied

In a file I'm translating, I have:

struct sdefl_freq {
  unsigned lit[SDEFL_SYM_MAX];
  unsigned off[SDEFL_OFF_MAX];
};
struct sdefl_code_words {
  unsigned lit[SDEFL_SYM_MAX];
  unsigned off[SDEFL_OFF_MAX];
};
struct sdefl_lens {
  unsigned char lit[SDEFL_SYM_MAX];
  unsigned char off[SDEFL_OFF_MAX];
};

In the D file I get:

sdefl_freq;
sdefl_code_words;
sdefl_lens;

Not sure why this is happening.

Reference file is: https://github.com/schveiguy/draylib/blob/0a7b3d1ada6ce4daedd95ed7fee0d34422b1782b/raylib/external/sdefl.h#L138

ifndef with else results in bad version construct

C:

#ifndef foo
   int x;
#else
   long x;
#endif

D:

version (foo) {} else {
   int x;
} else {
   c_long x;
}

What needs to happen, unfortunately, is the else branch needs to be copied into the first brace set. Not sure if this is easy to do.
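
In the meantime, a manual workaround on the C side is to invert the test so that no negated branch carries an else. The sketch below is the same declaration pair with #ifdef and swapped branches; x_is_int is a hypothetical helper that reports which branch was compiled. That ctod turns this form into a single version (foo) { ... } else { ... } is an untested assumption.

```c
/* Same declarations as above with the test inverted: #ifdef instead of
   #ifndef, and the two branches swapped. `foo` is left undefined here. */
#ifdef foo
   long x;
#else
   int x;
#endif

/* Reports which branch was compiled in: 1 for int, 0 otherwise. */
int x_is_int(void)
{
    return _Generic(x, int: 1, default: 0);
}
```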

M1(arm) macOS support?

It looks like some libtree-sitter and libc-parser objects need to be added to make it run.
Can you please give some hints on how to build it?

I can build it for the arm architecture so that you can add it to the repo.

bad static array translation

int foo[5]= {0,1,2,3,4};
int bar[5]= {1,2,3,4,5};

=>

int[5] foo = 0;
int[5] bar = [1,2,3,4,5];

The key is it has to be a static array, and the initializer values have to start with a 0.

This took me forever to figure out because I'm translating stb_image, which is a giant nest of bit manipulation and lookup tables, and there are some static tables in the Huffman decoding that started with 0! So basically, the Huffman decoding was failing, and I couldn't figure out why.

Now that I have found this, I have it building and working ;)
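
A guess at the cause, based on the output and not on ctod's source, is that the initializer is being matched against the C idiom {0}, which zero-fills an array and for which = 0 is a fair D translation. The sketch below shows why the two initializers must be kept apart; zeros and sum5 are hypothetical names.

```c
/* {0} zero-fills the whole array; this is the case `= 0` in D matches. */
int zeros[5] = {0};

/* A full list that merely starts with 0 must keep every element. */
int foo[5] = {0, 1, 2, 3, 4};

/* Sum of a 5-element int array, used to tell the two initializers apart. */
int sum5(const int *a)
{
    int total = 0;
    for (int i = 0; i < 5; i++) {
        total += a[i];
    }
    return total;
}
```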

What am I doing wrong?

I got it to build on macos.

I ran it on my first c file, here: https://github.com/schveiguy/draylib/blob/0a7b3d1ada6ce4daedd95ed7fee0d34422b1782b/raylib/rmodels.c

After running, I got a rmodels.d. But the diff is:

0a1,4
> module rmodels;
> @nogc nothrow:
> extern(C): __gshared:
> 
106,108c110,111
< #ifndef MAX_MATERIAL_MAPS
<     #define MAX_MATERIAL_MAPS       12    // Maximum number of maps supported
< #endif
---
>  
>     
5041c5044
< #endif
---
> #endif
\ No newline at end of file

It's almost like it's giving up early or something. Does it deal properly with header files? Would it be best to translate preprocessed files?

Incorrect translation of array of structs

str.c

struct S { double x; int y; }
Sarray[2] = {
	{1.5, 2},
	{2.5, 3}
};

That produces

module str;
@nogc nothrow:
extern(C): __gshared:
struct S { double x = 0; int y; }S[2] Sarray = [
	[1.5, 2],
	[2.5, 3]
];

Which results in the error

str.d(5): Error: cannot implicitly convert expression `[1.5, 2.0]` of type `double[]` to `S`
str.d(6): Error: cannot implicitly convert expression `[2.5, 3.0]` of type `double[]` to `S`

The fix is simple, it should instead generate

struct S { double x = 0; int y; }S[2] Sarray = [
	{1.5, 2},
	{2.5, 3}
];

Weird translation of extern "C" closing guard

This:

#ifdef __cplusplus
}
#endif

translates to this:

version(none) {
}
}

Which doesn't work... The initial header translates to:

#ifdef __cplusplus
extern "C" {
//! #endif

Which isn't great, but at least it is obviously wrong, and it still has the __cplusplus statement there instead of the unrelated version(none).

Distinguish struct / array initializers

C:

int x[3] = {10, 20};

typedef struct {
    int x;
    int y;
} S;

S y = {10, 20};

D:

int[3] x = [10, 20];

struct _S {
    int x;
    int y;
}alias _S S;

S y = [10, 20];

The struct initializer should not use [] brackets.
