Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
89 views
in Technique[技术] by (71.8m points)

clang++ - Clang: How to get the macro name used for size of a constant size array declaration

TL;DR;

How to get the macro name used for size of a constant size array declaration, from a callExpr -> arg_0 -> DeclRefExpr.

Detailed Problem statement:

Recently I started working on a challenge which requires source to source transformation tool for modifying specific function calls with an additional argument. Reasearching about the ways i can acheive introduced me to this amazing toolset Clang. I've been learning how to use different tools provided in libtooling to acheive my goal. But now i'm stuck at a problem, seek your help here.

Considere the below program (dummy of my sources), my goal is to rewrite all calls to strcpy function with a safe version of strcpy_s and add an additional parameter in the new function call i.e - destination pointer maximum size. so, for the below program my refactored call would be like strcpy_s(inStr, STR_MAX, argv[1]);

I wrote a RecursiveVisitor class and inspecting all function calls in VisitCallExpr method, to get max size of the dest arg i'm getting VarDecl of the first agrument and trying to get the size (ConstArrayType). Since the source file is already preprocessed i'm seeing 2049 as the size, but what i need is the macro STR_MAX in this case. how can i get that? (Creating replacements with this info and using RefactoringTool replacing them afterwards)

#include <stdio.h>
#include <string.h>
#include <stdlib.h> 

#define STR_MAX 2049

int main(int argc, char **argv){
  char inStr[STR_MAX];

  if(argc>1){
    //Clang tool required to transaform the below call into strncpy_s(inStr, STR_MAX, argv[1], strlen(argv[1]));
    strcpy(inStr, argv[1]);
  } else {
    printf("
 not enough args");
    return -1;
  }

  printf("got [%s]", inStr);

  return 0;
}
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

As you noticed correctly, the source code is already preprocessed and it has all the macros expanded. Thus, the AST will simply have an integer expression as the size of array.

A little bit of information on source locations

NOTE: you can skip it and proceed straight to the solution below

The information about expanded macros is contained in source locations of AST nodes and usually can be retrieved using Lexer (Clang's lexer and preprocessor are very tightly connected and can be even considered one entity). It's a bare minimum and not very obvious to work with, but it is what it is.

As you are looking for a way to get the original macro name for a replacement, you only need to get the spelling (i.e. the way it was written in the original source code) and you don't need to carry much about macro definitions, function-style macros and their arguments, etc.

Clang has two types of different locations: SourceLocation and CharSourceLocation. The first one can be found pretty much everywhere through the AST. It refers to a position in terms of tokens. This explains why begin and end positions can be somewhat counterintuitive:

// clang::DeclRefExpr
//
//  ┌─ begin location
foo(VeryLongButDescriptiveVariableName);
//  └─ end location
// clang::BinaryOperator
//
//           ┌─ begin location
int Result = LHS + RHS;
//                 └─ end location

As you can see, this type of source location points to the beginning of the corresponding token. CharSourceLocation on the other hand, points directly to the characters.

So, in order to get the original text of the expression, we need to convert SourceLocation's to CharSourceLocation's and get the corresponding text from the source.

The solution

I've modified your example to show other cases of macro expansions as well:

#define STR_MAX 2049
#define BAR(X) X

int main() {
  char inStrDef[STR_MAX];
  char inStrFunc[BAR(2049)];
  char inStrFuncNested[BAR(BAR(STR_MAX))];
}

The following code:

// clang::VarDecl *VD;
// clang::ASTContext *Context;
auto &SM = Context->getSourceManager();
auto &LO = Context->getLangOpts();
auto DeclarationType = VD->getTypeSourceInfo()->getTypeLoc();

if (auto ArrayType = DeclarationType.getAs<ConstantArrayTypeLoc>()) {
  auto *Size = ArrayType.getSizeExpr();

  auto CharRange = Lexer::getAsCharRange(Size->getSourceRange(), SM, LO);
  // Lexer gets text for [start, end) and we want him to grab the end as well
  CharRange.setEnd(CharRange.getEnd().getLocWithOffset(1));

  auto StringRep = Lexer::getSourceText(CharRange, SM, LO);
  llvm::errs() << StringRep << "
";
}

produces this output for the snippet:

STR_MAX
BAR(2049)
BAR(BAR(STR_MAX))

I hope this information is helpful. Happy hacking with Clang!


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...