Using another compiler's code generator

25 comments, last by Aardvajk 10 years, 1 month ago

When writing a compiler for a language, it may be relatively easy (or relatively hard) to write the part that produces straightforward assembly (or some more abstract bytecode). But then comes the part that optimizes it, and since x86 is very complex, I am afraid that can be hard (maybe much harder than the first step, I'm not sure). So I wonder whether it is an option to write only the first, crucial part for a language of my own invention and reuse an existing code generator/optimizer (for example that of GCC, which is probably a good one)?

You should look into the LLVM backend:

http://llvm.org/docs/tutorial/
http://llvm.org/docs/CodeGenerator.html

It's the most popular thing right now for what you want to do.

Edit: sorry, the first link was not what you wanted; changed now.

Hi,

Trust me, I've worked in that domain for 6 years: writing the middle end and back end of an optimizing compiler is quite a complex task, especially if you want decent performance.

LLVM is a very good framework. It is very modular and has a lot of state-of-the-art optimization and analysis passes. But it's also quite a complex beast to start with ;)

If you plan to generate native x86 code, have you considered writing a translator from your language to C or C++? This approach lets you focus quickly on developing your language and lets your favorite compiler do the nasty stuff. With this first prototype, you could then analyze the generated code and think about opportunities for further optimization based on your language's characteristics (e.g., aliasing rules, loop parallelism, vectorization and so on). Then you can switch to a compilation framework such as LLVM to implement those specific optimizations.
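To make the translate-to-C idea concrete, here is a minimal sketch of such a translator's last stage. It assumes a hypothetical toy AST with only integer literals and binary operators; the names `Node`, `Out`, `emit_expr` and `emit_function` are invented for illustration and do not come from any real tool.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical toy AST: integer literals and binary operators only. */
typedef enum { NODE_NUM, NODE_BINOP } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;                /* used when kind == NODE_NUM */
    char op;                  /* '+', '-', '*' or '/' when kind == NODE_BINOP */
    const struct Node *lhs;
    const struct Node *rhs;
} Node;

/* Fixed-size output buffer; a real translator would grow it or write a file. */
typedef struct {
    char text[256];
    size_t len;
} Out;

/* Append formatted text to the buffer; overflow is treated as a bug here. */
void out_printf(Out *o, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(o->text + o->len, sizeof o->text - o->len, fmt, ap);
    va_end(ap);
    assert(n >= 0 && (size_t)n < sizeof o->text - o->len);
    o->len += (size_t)n;
}

/* Emit the C spelling of an expression. Every binary operation is
   parenthesized, so the source language's precedence rules never leak. */
void emit_expr(Out *o, const Node *n)
{
    assert(o != NULL && n != NULL);
    if (n->kind == NODE_NUM) {
        out_printf(o, "%d", n->value);
    } else {
        out_printf(o, "(");
        emit_expr(o, n->lhs);
        out_printf(o, " %c ", n->op);
        emit_expr(o, n->rhs);
        out_printf(o, ")");
    }
}

/* Wrap an expression in a C function that the C compiler can then optimize. */
void emit_function(Out *o, const char *name, const Node *body)
{
    out_printf(o, "int %s(void) { return ", name);
    emit_expr(o, body);
    out_printf(o, "; }\n");
}
```

Feeding it the tree for (2 + 3) * 4 produces one line of C that any C compiler can optimize; all of the register allocation and instruction selection is then somebody else's problem.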

----

www.winthatwar.com

Download our latest demo here.

www.insaneunity.com

@InsaneUnity

@DTR666


Generating C code is ugly. I think I will manage to make my own compiler (the full path from source to binary .obj files, maybe not necessarily going very deep), but I consider it roughly impossible to do decent optimization myself. Well, maybe simple, basic, straightforward non-optimizing compilation would suffice, and if the language became popular, maybe people would add the optimizer part...

Ok, just a suggestion ;)

In that case, LLVM is a good choice. You "just" have to write your front end and generate LLVM IR.

That can be a bit tricky, but LLVM will produce the .obj for you, and you will have many choices of optimization passes to apply.
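As a rough illustration of what "generating LLVM IR" can mean at its simplest, a front end can write the textual form of the IR itself. The sketch below does that for a two-argument integer function; the helper name `emit_binary_fn` is made up for this example, and real front ends more commonly build IR through LLVM's C or C++ API instead of printing text.

```c
#include <assert.h>
#include <stdio.h>

/* Write textual LLVM IR for a function taking two i32 values and returning
   the result of a single arithmetic instruction. `opcode` is an LLVM
   instruction name such as "add", "sub" or "mul".
   Returns the number of characters written. */
int emit_binary_fn(char *buf, size_t cap, const char *name, const char *opcode)
{
    assert(buf != NULL && name != NULL && opcode != NULL);
    int n = snprintf(buf, cap,
        "define i32 @%s(i32 %%a, i32 %%b) {\n"
        "entry:\n"
        "  %%r = %s i32 %%a, %%b\n"
        "  ret i32 %%r\n"
        "}\n",
        name, opcode);
    assert(n >= 0 && (size_t)n < cap);  /* sketch: truncation is a bug */
    return n;
}
```

Saved to a file such as add2.ll, text like this is the kind of input that `clang -c add2.ll` or `llc -filetype=obj add2.ll` turns into an object file, with the optimizer and the x86 back end supplied by LLVM.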

If you choose that path, don't hesitate to PM me if you need help!

In any case, I hope you will keep us informed of your progress ;)



Alright.

PS: you asked what kind of language it would be. It is basically C with some changes to make it more the way I would like. Some of the main features would be:

1)

Something a,b,c,x,y,z;
x, y, z = f(a,b,c)

This passes structures implicitly by address (not by value) in a read-only way, and also returns values through addresses to the enclosing scope.

2) A big dose of syntactic sugar, like

a,b,c = x,y,z;

(also reducing the number of symbols required, especially making ; optional, etc.), and free-form symbol names like

int `this is some free name` = 10;

void `some function name`()
{
}

3) The ability to reach the static data of another function:

f()
{
    static x=1;
}

g()
{
    f.x = 10;
}

4) Hardware support for SSE-related types: float4, double2, double4, etc.

float4 x = 10, 20, 30, 40
float4 y = 10, 20, 30, 20
float4 z = x*y

5) Better support for modular programming (by storing type info for symbols in the .obj, so no declarations are needed to use a symbol, only a module name):

module myProgram;
reaches window, audio, bitmap;

void main()
{
    window init(10,10, "some window");
    window show();
    audio play("some.mp3"); //calls the function play() in module audio.obj
}

6) Support for full arrays (with two pointers, begin and end, not half arrays as in C today, with only a begin pointer)

7) Reallocatable arrays:

int tab[1000];
realloc tab[3000];

8) yet more
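For comparison, feature 1 can be approximated in today's C by spelling out the pointers by hand. This is only a sketch of one possible lowering of `x, y, z = f(a, b, c)`: the `value` field of `Something` and the body of `f` (it just rotates its inputs) are assumptions made up for the example.

```c
#include <assert.h>

/* Stand-in for the `Something` type above; the field is an assumption. */
typedef struct {
    int value;
} Something;

/* Inputs arrive through read-only pointers (no struct copies); results are
   written through pointers into the caller's scope. */
void f(const Something *a, const Something *b, const Something *c,
       Something *x, Something *y, Something *z)
{
    assert(a && b && c && x && y && z);
    x->value = b->value;
    y->value = c->value;
    z->value = a->value;
}
```

A call then reads `f(&a, &b, &c, &x, &y, &z);`, which is exactly the noise the proposed syntax would hide.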

I have been working on it for almost 10 years now (the ideas mentioned may seem easy here, but it took years of deep philosophical thinking to uncover them).

I would like to get it running some day.
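Features 6 and 7 also have a rough equivalent in plain C, which may help when deciding what the compiler would have to generate. The sketch below is one assumed representation, not the poster's design: `IntArray`, `int_array_len` and `int_array_resize` are illustrative names.

```c
#include <assert.h>
#include <stdlib.h>

/* A "full array": both ends are stored, so the length travels with the data. */
typedef struct {
    int *begin;
    int *end;   /* one past the last element */
} IntArray;

/* Number of elements; only meaningful once the array has been allocated. */
size_t int_array_len(const IntArray *a)
{
    assert(a != NULL && a->begin != NULL);
    return (size_t)(a->end - a->begin);
}

/* Rough equivalent of `int tab[1000];` followed by `realloc tab[3000];`.
   Returns 1 on success, 0 on failure (the array is then left unchanged).
   Existing elements are preserved; new ones are uninitialized. */
int int_array_resize(IntArray *a, size_t new_len)
{
    assert(a != NULL && new_len > 0);
    int *p = realloc(a->begin, new_len * sizeof *p);
    if (p == NULL)
        return 0;
    a->begin = p;
    a->end = p + new_len;
    return 1;
}
```

Starting from `IntArray tab = { NULL, NULL };`, a first `int_array_resize(&tab, 1000)` allocates the storage and a later `int_array_resize(&tab, 3000)` grows it.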

I second the LLVM recommendation. Good, well documented compiler back end that's worked great in my own similar project.


3) ability to reach to static data of another function

f()
{
    static x=1;
}

g()
{
    f.x = 10;
}
What's the motivation for this one? Seems like it A) opens a new possible bug with function static variables being accessed before they are initialized, B) undermines the mechanism of restricting variables to a bounded scope, and C) tries to improve a feature that is usually better to avoid.

If you just want to modify C slightly, don't bother writing your own whole frontend to LLVM. Just modify Clang, the C/C++/Objective-C/OpenCL frontend that's already part of LLVM and already includes a production-grade parser, IR generator, and diagnostic toolset. Writing your own toy parser can be a long and arduous task even when you already know what you're doing, and getting it to a truly usable state is a multi-man-decade project that you could never accomplish by yourself.

Some of the things you're asking for are even already a part of Clang's extensions, such as hardware SIMD support. Your notion of "full arrays" is also a part of the Cyclone derivative of C which you could perhaps work on adding to Clang for upstream submission.
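As a small illustration of the SIMD point, the vector extension that GCC and Clang both accept already gives something close to the proposed float4. The type name `float4` and the helper `float4_mul` are chosen here to mirror the thread; the initializer syntax uses braces rather than the bare comma list proposed above.

```c
#include <assert.h>

/* GCC/Clang vector extension: four packed floats in 16 bytes. */
typedef float float4 __attribute__((vector_size(16)));

static_assert(sizeof(float4) == 16, "float4 should occupy one SSE register");

/* Element-wise multiply; on x86 with SSE this typically becomes one mulps. */
float4 float4_mul(float4 x, float4 y)
{
    return x * y;
}
```

With `float4 x = {10, 20, 30, 40};` and `float4 y = {10, 20, 30, 20};`, the call `float4_mul(x, y)` yields the lanes 100, 400, 900 and 800, readable as `z[0]` through `z[3]`.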

Sean Middleditch – Game Systems Engineer – Join my team!



The motivation is to give programmers a usable tool. It could be used for some optimizations, where you set the statics (and reuse them) instead of passing values through arguments. It could also be used in other ways: for example, you could have a static int err and check its value after the call (only if you want to; no need to pass it), etc.

This can be used to optimize C programs (and it is very hard to optimize C, which is a fuck*n efficient language, so I am very proud of it, especially because this could really optimize C, by some microseconds, but still).

This is all in a kind of old-school, low-level spirit, but I am a worshipper of the old-school C (and B) spirit.

A) I don't know anything about this; statics are initialized before program execution starts.

B/C) You could avoid that as a matter of your own decision about which style you consider good (as you avoid goto, etc.). It can also be restricted: you could mark statics as 'accessible' or, as today, non-accessible with some keywords (if this were done, and it probably should be, because not all statics have such external usability, but some can).
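For reference, here is roughly what the err example could lower to in today's C. In C, a function-level static needs a constant initializer and is set up before main() starts, so hoisting it to file scope changes only its visibility, not when it is initialized. All names below (`parse_digit`, `parse_digit_err`) are hypothetical; with the proposed syntax the caller would presumably write `parse_digit.err` instead.

```c
/* The static is hoisted out of the function and given a prefixed name so
   that other functions can reach it. */
int parse_digit_err = 0;

/* Convert one character to its digit value. On bad input it returns 0 and
   records the failure in parse_digit_err instead of using an out-parameter. */
int parse_digit(char ch)
{
    if (ch < '0' || ch > '9') {
        parse_digit_err = 1;   /* the caller may check this, or ignore it */
        return 0;
    }
    parse_digit_err = 0;
    return ch - '0';
}
```

The trade-off raised earlier in the thread still applies: the error flag is shared state, so it is only valid until the next call and is not safe to use from several threads without extra care.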

(The functionname.x syntax is not quite decided yet and may change a bit; the details of the keywords for restricting access are not chosen yet either, and the syntax of all the improvements may still change a little. I have been working on this new language (codenamed c2) for nearly 10 years. I even sent a mail to D. Ritchie one day (about 5 years ago), but I'm not sure whether he read it.)

(Another very important thing worth noticing is point 1 on this list: it is a big improvement over today's C style and makes the source look much cleaner.)

PS: if someone would like to help write the compiler, that would be welcome, but I would have to have the absolutely deciding voice on every slightest detail of the language. (I don't think I will find anybody, and I will probably need to do it myself some day.)

fir
