Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
850 views
in Technique[技术] by (71.8m points)

performance - In what cases should I use memcpy over standard operators in C++?

When can I get better performance using memcpy or how do I benefit from using it? For example:

float a[3]; float b[3];

is code:

memcpy(a, b, 3*sizeof(float));

faster than this one?

a[0] = b[0];
a[1] = b[1];
a[2] = b[2];
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Efficiency should not be your concern.
Write clean maintainable code.

It bothers me that so many answers indicate that the memcpy() is inefficient. It is designed to be the most efficient way of copy blocks of memory (for C programs).

So I wrote the following as a test:

#include <algorithm>

extern float a[3];
extern float b[3];
extern void base();

int main()
{
    base();

#if defined(M1)
    a[0] = b[0];
    a[1] = b[1];
    a[2] = b[2];
#elif defined(M2)
    memcpy(a, b, 3*sizeof(float));    
#elif defined(M3)
    std::copy(&a[0], &a[3], &b[0]);
 #endif

    base();
}

Then to compare the code produces:

g++ -O3 -S xr.cpp -o s0.s
g++ -O3 -S xr.cpp -o s1.s -DM1
g++ -O3 -S xr.cpp -o s2.s -DM2
g++ -O3 -S xr.cpp -o s3.s -DM3

echo "=======" >  D
diff s0.s s1.s >> D
echo "=======" >> D
diff s0.s s2.s >> D
echo "=======" >> D
diff s0.s s3.s >> D

This resulted in: (comments added by hand)

=======   // Copy by hand
10a11,18
>   movq    _a@GOTPCREL(%rip), %rcx
>   movq    _b@GOTPCREL(%rip), %rdx
>   movl    (%rdx), %eax
>   movl    %eax, (%rcx)
>   movl    4(%rdx), %eax
>   movl    %eax, 4(%rcx)
>   movl    8(%rdx), %eax
>   movl    %eax, 8(%rcx)

=======    // memcpy()
10a11,16
>   movq    _a@GOTPCREL(%rip), %rcx
>   movq    _b@GOTPCREL(%rip), %rdx
>   movq    (%rdx), %rax
>   movq    %rax, (%rcx)
>   movl    8(%rdx), %eax
>   movl    %eax, 8(%rcx)

=======    // std::copy()
10a11,14
>   movq    _a@GOTPCREL(%rip), %rsi
>   movl    $12, %edx
>   movq    _b@GOTPCREL(%rip), %rdi
>   call    _memmove

Added Timing results for running the above inside a loop of 1000000000.

   g++ -c -O3 -DM1 X.cpp
   g++ -O3 X.o base.o -o m1
   g++ -c -O3 -DM2 X.cpp
   g++ -O3 X.o base.o -o m2
   g++ -c -O3 -DM3 X.cpp
   g++ -O3 X.o base.o -o m3
   time ./m1

   real 0m2.486s
   user 0m2.478s
   sys  0m0.005s
   time ./m2

   real 0m1.859s
   user 0m1.853s
   sys  0m0.004s
   time ./m3

   real 0m1.858s
   user 0m1.851s
   sys  0m0.006s

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...