Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


gcc - Build AST from C code

How can I build an AST (abstract syntax tree) from C code with gcc, so that I can make some modifications (such as converting some int variables to float) and then regenerate C source code from it?

Actually, for the moment, the only functionality I really need is to extract a table of variables and their types from a C program consisting of a few lines. I assume there is a simple parser that can do this.

I have some variables like:

int var_bss;
float var_f_bss;
int var_data = 4;
float var_f_data = 5;

And a function:

int Foo() {
   /* some local variables */
}

The code is in a single C file.

I want to present all the variables to the end user and let them choose a source type in a specific memory segment, e.g. int variables in .data. The user can then convert those variables to float. Finally, I generate the same code for the user, but with the new variable types they have chosen.


Fight the dragon too long and you become the dragon yourself; gaze too long into the abyss and the abyss will gaze back…

1 Reply


First, this is a difficult task, because the abstract syntax tree of C is much more complex than you might expect. Read the C11 standard (draft n1570) for details, and see this website. Also look at tinyCC or nwcc (at least for inspiration).
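For the very narrow case shown in the question (one scalar int or float declaration per line), a toy line scanner is enough to list names and types. The following is only a minimal sketch under that assumption. The names `var_info`, `parse_simple_decl` and `collect_vars` are hypothetical, made up for this illustration. It is not a C parser: pointers, arrays, several declarators in one declaration, typedefs, macros, comments and scopes all defeat it, which is exactly the complexity mentioned above.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one variable found in the source text. */
struct var_info {
    char type[16];   /* "int" or "float" */
    char name[64];   /* identifier */
    int  has_init;   /* 1 if the declaration has an initializer */
};

/* Recognize one line of the form `<int|float> <identifier> [= <init>];`.
   Returns 1 and fills *out on success, 0 for anything else. */
int parse_simple_decl(const char *line, struct var_info *out)
{
    char type[16], name[64];
    int consumed = 0;
    const char *rest;

    /* Read the type word and the identifier; %n records how far we got. */
    if (sscanf(line, " %15s %63[A-Za-z0-9_] %n", type, name, &consumed) != 2)
        return 0;
    if (consumed <= 0)
        return 0;
    if (strcmp(type, "int") != 0 && strcmp(type, "float") != 0)
        return 0;
    if (isdigit((unsigned char)name[0]))
        return 0;                      /* identifiers cannot start with a digit */

    rest = line + consumed;
    if (*rest == ';') {
        out->has_init = 0;             /* e.g. `int var_bss;` */
    } else if (*rest == '=' && strchr(rest, ';') != NULL) {
        out->has_init = 1;             /* e.g. `int var_data = 4;` */
    } else {
        return 0;                      /* e.g. `int Foo(){` is not a variable */
    }
    strcpy(out->type, type);
    strcpy(out->name, name);
    return 1;
}

/* Scan a whole source buffer line by line; returns the number of entries
   stored in vars (at most max). Lines of 256 characters or more are skipped. */
int collect_vars(const char *src, struct var_info *vars, int max)
{
    int count = 0;
    while (*src != '\0' && count < max) {
        size_t len = strcspn(src, "\n");
        char line[256];
        if (len < sizeof line) {
            memcpy(line, src, len);
            line[len] = '\0';
            if (parse_simple_decl(line, &vars[count]))
                count++;
        }
        src += len;
        if (*src == '\n')
            src++;
    }
    return count;
}
```

Calling `collect_vars` on the four declarations from the question plus the `Foo` function should yield four entries, with `has_init` set only for `var_data` and `var_f_data`. Whether an initializer really places a variable in .data rather than .bss depends on the compiler and on the initial value, so `has_init` is only a rough hint.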

Then, if you are using a recent GCC (e.g. 4.7 or 4.8), I strongly suggest customizing GCC itself, e.g. with a MELT extension (or your own GCC plugin).

I don't claim it is a simple task: you will very probably need to understand the details of GCC's internal representations (at least GIMPLE).

BTW, MELT is (was) a domain-specific language for extending GCC, designed exactly for the kind of task you are dreaming about. With MELT you would be able to transform the internal GCC representations (GIMPLE and Trees). Today, in 2020, MELT is no longer being worked on because of lack of funding.

The advantage of working inside GCC (or inside some other compiler like Clang/LLVM) is that you don't have to emit C code again (which is actually much more difficult than you think); you just transform the internal compiler representation and, perhaps most importantly, you take advantage "gratis" of the many things a compiler always does: all kinds of optimizations such as constant folding, inlining, common-subexpression elimination, etc.
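One reason the int-to-float conversion is more than a textual substitution is that the type of a variable decides the meaning of every expression that uses it. As a small illustration (the function names `scaled_int` and `scaled_float` are made up for this sketch), the two functions below differ only in the declared type of `var_data`, yet they return different values:

```c
/* Original code: var_data is an int, so `/` is integer division. */
double scaled_int(void)
{
    int var_data = 4;
    return 10 / var_data;    /* 10 / 4 is computed as int, then converted */
}

/* Same body after changing only the declared type to float:
   `/` is now floating-point division. */
double scaled_float(void)
{
    float var_data = 4;
    return 10 / var_data;    /* 10 is converted to float before dividing */
}
```

A tool that rewrites declarations therefore has to look at every use of the variable as well. Other uses, such as `var_data % 3` or `array[var_data]`, stop compiling altogether once the type is float.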

In 2020, you could also consider using the libgccjit framework in recent GCC 10, and read this draft report (related to Bismon; see also RefPerSys, which shares some ideas but no code with Bismon). Perhaps also try the Clang static analyzer and/or Frama-C.

