Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
484 views
in Technique[技术] by (71.8m points)

gcc - execute binary machine code from C

following this instructions I have managed to produce only 528 bytes in size a.out (when gcc main.c gave me 8539 bytes big file initially).

main.c was:

int main(int argc, char** argv) {

    return 42;
}

but I have built a.out from this assembly file instead:

main.s:

; tiny.asm
  BITS 64
  GLOBAL _start
  SECTION .text
  _start:
                mov     eax, 1
                mov     ebx, 42  
                int     0x80

with:

me@comp# nasm -f elf64 tiny.s
me@comp# gcc -Wall -s -nostartfiles -nostdlib tiny.o
me@comp# ./a.out ; echo $?
42
me@comp# wc -c a.out
528 a.out

because I need machine code I do:

objdump -d a.out

a.out:     file format elf64-x86-64


Disassembly of section .text:

00000000004000e0 <.text>:
  4000e0:   b8 01 00 00 00          mov    $0x1,%eax
  4000e5:   bb 2a 00 00 00          mov    $0x2a,%ebx
  4000ea:   cd 80                   int    $0x80

># objdump -hrt a.out

a.out:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
 0 .note.gnu.build-id 00000024  00000000004000b0  00000000004000b0  000000b0 2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 1 .text         0000000c  00000000004000e0  00000000004000e0  000000e0 2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
SYMBOL TABLE:
no symbols

file is in little endian convention:

me@comp# readelf -a a.out
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x4000e0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          272 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         2
  Size of section headers:           64 (bytes)
  Number of section headers:         4
  Section header string table index: 3

now I want to execute this like this:

#include <unistd.h>
 // which version is (more) correct?
 // this might be related to endiannes (???)
char code[] = "x01xb8x00x00xbbx00x00x2ax00x00x80xcdx00";
char code_v1[] = "xb8x01x00x00x00xbbx2ax00x00x00xcdx80x00";

int main(int argc, char **argv)
{
/*creating a function pointer*/
int (*func)();
func = (int (*)()) code;
(int)(*func)();

return 0;
}

however I get segmentation fault. My question is: is this section of text

  4000e0:   b8 01 00 00 00          mov    $0x1,%eax
  4000e5:   bb 2a 00 00 00          mov    $0x2a,%ebx
  4000ea:   cd 80                   int    $0x80

(this machine code) all I really need? What I do wrong (endiannes??), maybe I just need to call this in different way since SIGSEGV?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The code must be in a page with execute permission. By default, stack and read-write static data (like non-const globals) are in pages mapped without exec permission, for security reasons.

The simplest way is to compile with gcc -z execstack, which links your program such that stack and global variables (static storage) get mapped in executable pages, and so do allocations with malloc.


Another way to do it without making everything executable is to copy this binary machine code into an executable buffer.

#include <unistd.h>
#include <sys/mman.h>
#include <string.h>

char code[] = {0x55,0x48,0x89,0xe5,0x89,0x7d,0xfc,0x48,
    0x89,0x75,0xf0,0xb8,0x2a,0x00,0x00,0x00,0xc9,0xc3,0x00};
/*
00000000004004b4 <main> 55                          push   %rbp
00000000004004b5 <main+0x1>  48 89 e5               mov    %rsp,%rbp
00000000004004b8 <main+0x4>  89 7d fc               mov    %edi,-0x4(%rbp)
00000000004004bb <main+0x7>  48 89 75 f0            mov    %rsi,-0x10(%rbp)
'return 42;'
00000000004004bf <main+0xb>  b8 2a 00 00 00         mov    $0x2a,%eax
'}'
00000000004004c4 <main+0x10> c9                     leaveq 
00000000004004c5 <main+0x11> c3                     retq 
*/

int main(int argc, char **argv) { 
    void *buf;

    /* copy code to executable buffer */    
    buf = mmap (0,sizeof(code),PROT_READ|PROT_WRITE|PROT_EXEC,
                MAP_PRIVATE|MAP_ANON,-1,0);
    memcpy (buf, code, sizeof(code));
    __builtin___clear_cache(buf, buf+sizeof(code)-1);  // on x86 this just stops memcpy from optimizing away as a dead store

    /* run code */
    int i = ((int (*) (void))buf)();
    printf("get this done. returned: %d", i);

    return 0;
}

output:

get this done. returned: 42

RUN SUCCESSFUL (total time: 57ms)

Without __builtin___clear_cache, this could break with optimization enabled because gcc would think the memcpy was a dead store and optimize it away. When compiling for x86, __builtin___clear_cache does not actually clear any cache; there are zero extra instructions; it just marks the memory as "used" so stores to it aren't considered "dead". (See the gcc manual.)


Another option would be to mprotect the page containing the char code[] array, giving it PROT_READ|PROT_WRITE|PROT_EXEC. This works whether it's a local array (on the stack) or global in the .data.

Or if it's const char code[] in the .rodata section, you might just give it PROT_READ|PROT_EXEC.

(In versions of binutils ld from before about 2019, the .rodata got linked as part of the same segment as .text, and was already mapped executable. But recent ld gives it a separate segment so it can be mapped without exec permission so const char code[] doesn't give you an executable array anymore, but it used to so you may this old advice in other places.)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...