Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c - Is memcpy of a pointer the same as assignment?

Introduction: This question is part of my collection of C and C++ (and C/C++ common subset) questions regarding the cases where pointer objects with strictly identical byte-wise representations are allowed to have different "values", that is, to behave differently for some operation (including having defined behavior on one object and undefined behavior on the other).

Following another question which caused much confusion, here is a question about pointer semantics that will hopefully clear things up:

Is this program valid in all cases? The only interesting part is in the "pa1 == pb" branch.

#include <stdio.h>
#include <string.h>

int main() {
    int a[1] = { 0 }, *pa1 = &a[0] + 1, b = 1, *pb = &b;
    if (memcmp (&pa1, &pb, sizeof pa1) == 0) {
        int *p;
        printf ("pa1 == pb\n"); // interesting part
        memcpy (&p, &pa1, sizeof p); // make a copy of the representation
        memcpy (&pa1, &p, sizeof p); // pa1 is a copy of the bytes of pa1 now
        // and the bytes of pa1 happen to be the bytes of pb
        *pa1 = 2; // does pa1 legally point to b?
    }
    else {
        printf ("pa1 != pb\n"); // failed experiment, nothing to see
        pa1 = &a[0]; // ensure well defined behavior in printf
    }
    printf ("b = %d *pa1 = %d\n", b, *pa1);
    return 0;
 }

I would like an answer based on standard quotes.

EDIT

By popular demand, here is what I want to know:

  • is a pointer's semantic "value" (its behavior according to the specification) determined only by its numerical value (the numerical address it contains), for a pointer of a given type?
  • if not, is it possible to copy only the physical address contained in a pointer while leaving out the associated semantics?

Here, let's say that some one-past-the-end pointer happens to accidentally point to another object; how can I use such a one-past-the-end pointer to access the other object?

I have the right to do anything, except use a copy of the address of the other object. (It's a game to understand pointers in C.)

IOW, I try to recycle dirty money just like the mafia. But I recycle a dirty pointer by extracting its value representation. Then it looks like the clean money, I mean pointer. Nobody can tell the difference, no?


1 Reply


The question was:

Is this program valid in all cases?

The answer is "no, it is not".


The only interesting part of the program is what happens within the block guarded by the if statement. It is somewhat difficult to guarantee that the controlling expression is true, so I've modified the program somewhat by moving the variables to file scope. The same question remains: is this program always valid?

#include <stdio.h>
#include <string.h>

static int a[1] = { 2 };
static int b = 1;
static int *pa1 = &a[0] + 1;
static int *pb = &b;

int main(void) {
    if (memcmp (&pa1, &pb, sizeof pa1) == 0) {
        int *p;
        printf ("pa1 == pb\n"); // interesting part
        memcpy (&p, &pa1, sizeof p); // make a copy of the representation
        memcpy (&pa1, &p, sizeof p); // pa1 is a copy of the bytes of pa1 now
        // and the bytes of pa1 happen to be the bytes of pb
        *pa1 = 2; // does pa1 legally point to b?
    }
}

Now the guarding expression is true with my compiler. (Of course, because these objects have static storage duration, a compiler cannot really prove that they are not modified by something else in the interim...)

The pointer pa1 points just past the end of the array a. It is a valid pointer, but it must not be dereferenced, i.e. *pa1 has undefined behaviour given that value. The claim being tested is that copying this value to p and back again would make the pointer valid for dereferencing.
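To illustrate what a one-past-the-end pointer is legitimately for, here is a minimal sketch. The function name sum_ints is made up for this example and is not part of the question's code. The end pointer is formed and compared against, but never dereferenced:

```c
#include <stddef.h>

/* Hypothetical helper: sums n ints starting at first.
   The pointer first + n is formed and compared, never dereferenced. */
int sum_ints(const int *first, size_t n)
{
    const int *end = first + n;   /* one past the last element: valid to form */
    int total = 0;
    for (const int *p = first; p != end; ++p)
        total += *p;              /* p != end here, so this dereference is fine */
    return total;
}
```

With static int a[1] = { 2 }; as in the program above, sum_ints(a, 1) reads a[0] only and uses a + 1 purely as a loop bound.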

The answer is no, this is still not valid, but it is not spelt out very explicitly in the standard itself. The committee response to C standard defect report DR 260 says this:

If two objects have identical bit-pattern representations and their types are the same they may still compare as unequal (for example if one object has an indeterminate value) and if one is an indeterminate value attempting to read such an object invokes undefined behavior. Implementations are permitted to track the origins of a bit-pattern and treat those representing an indeterminate value as distinct from those representing a determined value. They may also treat pointers based on different origins as distinct even though they are bitwise identical.

I.e. even if pa1 and pb are pointers of the same type and memcmp (&pa1, &pb, sizeof pa1) == 0 is true, you cannot conclude that pa1 == pb necessarily holds, let alone that copying the bit pattern of the non-dereferenceable pointer pa1 to another object and back again would make pa1 valid.
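As a rough sketch of the two different questions being asked here, the helpers below (same_bytes and same_value are names invented for this illustration) keep the representation check and the value check separate. On typical flat-memory implementations they agree for two pointers to the same object, but per the DR 260 response quoted above, a true result from the first does not by itself guarantee the second:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if the two pointer objects hold identical bytes. */
bool same_bytes(int *x, int *y)
{
    return memcmp(&x, &y, sizeof x) == 0;
}

/* Hypothetical helper: true if the two pointers compare equal as values. */
bool same_value(int *x, int *y)
{
    return x == y;
}
```

Only same_value answers the question the language actually defines for pointers; same_bytes inspects the representation, which an implementation that tracks origins is not obliged to treat as the whole story.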

The response continues:

Note that using assignment or bitwise copying via memcpy or memmove of a determinate value makes the destination acquire the same determinate value.

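i.e. it confirms that memcpy (&p, &pa1, sizeof p); will cause p to acquire the same value as pa1, which it didn't have before. That value is still the one-past-the-end pointer of a, which must not be dereferenced; the copy does not turn it into a pointer to b.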
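For contrast, here is a small sketch of the case that the committee note does bless: the source pointer holds a determinate value that legitimately points to an object, so the byte-wise copy may be dereferenced. The function name store_via_copied_pointer is an assumption made for this example:

```c
#include <string.h>

/* Hypothetical helper: copies the bytes of a pointer that legitimately
   points to *target into a fresh pointer object, then writes through it. */
void store_via_copied_pointer(int *target, int value)
{
    int *copy;
    memcpy(&copy, &target, sizeof copy);  /* copy acquires target's determinate value */
    *copy = value;                        /* fine: the source pointed to *target */
}
```

Calling store_via_copied_pointer(&b, 2) sets b to 2. The difference from the program in the question is only in where the source pointer came from: here it was derived from the object being written to.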


This is not just a theoretical problem - compilers are known to track pointer provenance. For example the GCC manual states that

When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. That is, one may not use integer arithmetic to avoid the undefined behavior of pointer arithmetic as proscribed in C99 and C11 6.5.6/8.

i.e. were the program written as:

// uintptr_t requires <stdint.h>
int a[1] = { 0 }, *pa1 = &a[0] + 1, b = 1, *pb = &b;
if (memcmp (&pa1, &pb, sizeof pa1) == 0) {
    uintptr_t tmp = (uintptr_t)&a[0]; // address of a[0] as an integer
    tmp += sizeof (a[0]); // integer value of the address of a[1]
    pa1 = (int *)tmp;
    *pa1 = 2; // pa1 still would have the bit pattern of pb,
              // hold a valid pointer just past the end of array a,
              // but not legally point to b
}

the GCC manual points out that this is explicitly not legal.
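By comparison, the round trip that the quoted GCC rule allows is one where the resulting pointer references the same object as the original. A minimal sketch, assuming the implementation provides the optional type uintptr_t (round_trip is a name made up for this example):

```c
#include <stdint.h>

/* Hypothetical helper: converts a pointer to an integer and back.
   No arithmetic is done on the integer, so the result refers to the
   same object as the original pointer. */
int *round_trip(int *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;  /* pointer -> integer */
    return (int *)(void *)bits;             /* integer -> pointer, same object */
}
```

Given int b = 1;, writing *round_trip(&b) = 2; modifies b, because the pointer that comes back was derived from &b rather than from arithmetic on the address of a different object.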

