Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
330 views
in Technique[技术] by (71.8m points)

possible to share work when parsing multiple files with libclang?

If I have multiple files in a large project, all of which share a large number of included header files, is there any way to share the work of parsing the header files? I had hoped that creating one Index and then adding multiple translationUnits to it could cause some work to be shared - however even code along the lines of (pseudocode)

index = clang_createIndex();
clang_parseTranslationUnit(index, "myfile");
clang_parseTranslationUnit(index, "myfile");

seems to take the full amount of time for each call to parseTranslationUnit, performing no better than

index1 = clang_createIndex();
clang_parseTranslationUnit(index1, "myfile");
index2 = clang_createIndex();
clang_parseTranslationUnit(index2, "myfile");

I am aware that there are specialized functions for reparsing the exact same file; however what I really want is that parsing "myfile1" and "myfile2" can share the work of parsing "myheader.h", and reparsing-specific functions won't help there.

As a sub-question, is there any meaningful difference between reusing an index and creating a new index for each translation unit?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

One way of doing this consists in creating Precompiled Headers (PCH file) from the shared header in your project.

Something along these lines seems to work (you can see the whole example here):

  auto Idx = clang_createIndex (0, 0);
  CXTranslationUnit TU;
  Timer t;

  {
    char const *args[] = { "-xc++", "foo.hxx" };
    int nargs = 2;

    t.reset();
    TU = clang_parseTranslationUnit(Idx, 0, args, nargs, 0, 0, CXTranslationUnit_ForSerialization);
    std::cerr << "PCH parse time: " << t.get() << std::endl;
    displayDiagnostics (TU);
    clang_saveTranslationUnit (TU, "foo.pch", clang_defaultSaveOptions(TU));
    clang_disposeTranslationUnit (TU);
  }

  {
    char const *args[] = { "-include-pch", "foo.pch", "foo.cxx" };
    int nargs = 3;

    t.reset();
    TU = clang_createTranslationUnitFromSourceFile(Idx, 0, nargs, args, 0, 0);
    std::cerr << "foo.cxx parse time: " << t.get() << std::endl;
    displayDiagnostics (TU);
    clang_disposeTranslationUnit (TU);
  }

  {
    char const *args[] = { "-include-pch", "foo.pch", "foo2.cxx" };
    int nargs = 3;

    t.reset();
    TU = clang_createTranslationUnitFromSourceFile(Idx, 0, nargs, args, 0, 0);
    std::cerr << "foo2.cxx parse time: " << t.get() << std::endl;
    displayDiagnostics (TU);
    clang_disposeTranslationUnit (TU);
  }

yielding the following output:

PCH parse time: 5.35074
0 diagnostics

foo1.cxx parse time: 0.158232
0 diagnostics

foo2.cxx parse time: 0.143654
0 diagnostics

I did not find much information about libclang and precompiled headers in the API documentation, but here are a few pages where the keyword appears: CINDEX and TRANSLATION_UNIT

Please note that this solution is not optimal by any ways. I'm looking forward to seeing better answers. In particular:

  1. each source file can have at most one precompiled header
  2. nothing here is libclang-specific ; this is the exact same strategy that is used for build time optimization using the standard clang command lines.
  3. it is not really automated, in that you have to explicitly create the precompiled header (and must thus know the name of the shared header file)
  4. I don't think using different CXIndex objects would have made any difference here

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...