Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c - Redefining function from standard library

Context: In a recent conversation, the question "does gcc/clang do strlen("static string") at compile time?" came up. After some testing, the answer seems to be yes, regardless of the optimization level. I was a bit surprised to see this done even at -O0, so I experimented further and eventually arrived at the following code:

#include <stdio.h>

unsigned long strlen(const char* s) {
  return 10;
}

unsigned long f() {
  return strlen("abcd");
}

unsigned long g(const char* s) {
  return strlen(s);
}

int main() {
  printf("%lu %lu\n", f(), g("abcd"));
  return 0;
}

To my surprise, it prints 4 10 and not 10 10. I tried compiling with gcc and clang, with various flags (-pedantic, -O0, -O3, -std=c89, -std=c11, ...), and the behavior is the same in every case.

Since I didn't include string.h, I expected my definition of strlen to be used. But the assembly code indeed shows that strlen("abcd") was replaced by the equivalent of return 4, which matches what I observe when running the program.

Also, the compilers print no warnings with -Wall -Wextra (more precisely, none related to the issue: they still warn that parameter s is unused in my definition of strlen).

Two related questions arise (close enough, I think, to be asked together):
- Is it allowed to redefine a standard function in C when the header declaring it isn't included?
- Does this program behave as it should? If so, what exactly happens?


1 Reply

Per C 2011 (draft N1570) 7.1.3, paragraphs 1 and 2:

All identifiers with external linkage in any of the following subclauses … are always reserved for use as identifiers with external linkage.

If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.

The “following subclauses” specify the standard C library, including strlen. Your program defines strlen, so its behavior is undefined.
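For contrast, here is a minimal sketch of the same experiment with a non-reserved name. The name my_strlen and the demo function are hypothetical, chosen only for illustration. Since the compiler attaches no built-in meaning to my_strlen, both calls should reach the user's definition, and the behavior is well defined:

```c
#include <assert.h>
#include <stdio.h>

/* my_strlen is a hypothetical, non-reserved name chosen for this sketch. */
unsigned long my_strlen(const char *s) {
  (void)s;   /* deliberately ignored, as in the question's experiment */
  return 10;
}

unsigned long f(void) {
  return my_strlen("abcd");
}

unsigned long g(const char *s) {
  return my_strlen(s);
}

/* Both calls go through my_strlen, because the compiler has no
   built-in knowledge tied to that name. */
void demo(void) {
  assert(f() == 10);
  assert(g("abcd") == 10);
  printf("%lu %lu\n", f(), g("abcd"));
}
```

Calling demo() from main should print 10 10, which is the output the question expected from the original program.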

What is happening in the case you observe is:

  • The compiler knows how strlen is supposed to behave, regardless of your definition, so when it translates strlen("abcd") in f, it evaluates the call at compile time and produces 4.
  • In g("abcd"), the compiler does not recognize that, given the definition of g, the call is equivalent to strlen("abcd"), so it does not evaluate it at compile time. Instead, it emits a call to g, compiles g to call strlen, and also compiles your definition of strlen. As a result, g("abcd") calls your strlen, which returns 10.
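The difference between the two cases can be illustrated in standard C without relying on any compiler built-in. This is only an analogy, not a description of what gcc or clang do internally: the length of a string literal is available as a constant expression through sizeof, while the length of a string reached through a pointer parameter generally has to be computed by a call. The helper names below are hypothetical, and the sketch assumes a C11 compiler for _Static_assert:

```c
#include <assert.h>
#include <string.h>

/* sizeof counts the terminating '\0', so subtract one. This is an
   integer constant expression, checked here at compile time. */
_Static_assert(sizeof("abcd") - 1 == 4, "literal length is a constant");

/* Hypothetical helper names for this sketch. */
unsigned long literal_length(void) {
  return sizeof("abcd") - 1;   /* known without any function call */
}

unsigned long runtime_length(const char *s) {
  return strlen(s);            /* the standard strlen from <string.h> */
}

void demo_lengths(void) {
  assert(literal_length() == 4);
  assert(runtime_length("abcd") == 4);
}
```

With the real strlen both paths agree on 4. The question's program sees different results only because its own strlen definition disagrees with what the compiler assumes about that name.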

Because the behavior is undefined, the C standard would also allow the compiler to discard your definition of strlen entirely, so that g returned 4 as well. That said, a good compiler should warn that your program defines a reserved identifier.

