I'm learning Assembly as part of a malware analysis project and trying to use a few Node.js libraries to scrape executables from GitHub and disassemble them.
Specifically I'm focusing on x86-64 PE.
But a disassembler, such as the one I chose, isn't necessarily designed to locate the instructions inside a particular executable format such as PE.
Besides needing to know where the instructions start in the file, I realized once I began using the disassembler that I also need to give it the RIP value the program starts at. I don't fully understand why different programs start at different memory addresses; supposedly it lets other cooperating processes place memory in the same block, or something like that.
So my goal is to know:
- the correct starting value for the RIP
- the file offset of the first instruction, beyond the header.
So I used a library to read the metadata, like so:
let metaData = await executableMetadata.getMetadataObjectFromExecutableFilePath_Async(execPath);
Which when passed an exe with a header like this:
0: 4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00
16: b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00
32: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
48: 00 00 00 00 00 00 00 00 00 00 00 00 80 00 00 00
64: 0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68
80: 69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f
96: 74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20
112: 6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00
128: 50 45 00 00 4c 01 03 00 91 3f 9a ef 00 00 00 00
144: 00 00 00 00 e0 00 22 00 0b 01 30 00 00 12 00 00
tells us:
{
  format: 'PE',
  pe_header_offset_16le: 128,
  machine_type: 332,
  machine_type_object: {
    constant: 'IMAGE_FILE_MACHINE_I386',
    description: 'Intel 386 or later processors and compatible processors'
  },
  number_of_sections: 3,
  timestamp: -275103855,
  coff_symbol_table_offset: 0,
  coff_number_of_symbol_table_entries: 0,
  size_of_optional_header: 224,
  characteristics_bitflag: 34,
  characteristics_bitflags: [
    {
      constant: 'IMAGE_FILE_EXECUTABLE_IMAGE',
      description: 'Image only. This indicates that the image file is valid and can be run. If this flag is not set, it indicates a linker error.',
      flag_code: 2
    },
    {
      constant: 'IMAGE_FILE_LARGE_ADDRESS_AWARE',
      description: 'Application can handle > 2-GB addresses.',
      flag_code: 32
    }
  ],
  object_type_code: 267,
  object_type: 'PE32',
  linker: { major_version: 48, minor_version: 0 },
  size_of_code: 4608,
  size_of_initialized_data: 2048,
  size_of_uninitialized_data: 0,
  address_of_entry_point: 12586,
  base_of_code: 8192,
  windows_specific: {
    image_base: 4194304,
    section_alignment: 8192,
    file_alignment: 512,
    major_os_version: 4,
    minor_os_version: 0,
    major_image_version: 0,
    minor_image_version: 0,
    major_subsystem_version: 6,
    minor_subsystem_version: 0,
    win32_version: 0,
    size_of_image: 32768,
    size_of_headers: 512,
    checksum: 0,
    subsystem: {
      constant: 'IMAGE_SUBSYSTEM_WINDOWS_CUI',
      description: 'The Windows character subsystem',
      subsystem_code: 3
    },
    dll_characteristics: 34144,
    dll_characteristic_flags: [ [Object], [Object], [Object], [Object], [Object] ]
  },
  base_of_data: 16384
}
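As a sanity check on the library, the first few fields can be read straight out of the hex dump with Node's built-in Buffer. This is only a sketch: the field offsets follow the standard PE/COFF header layout, the variable names are mine, and it covers just the 160 bytes shown above rather than the whole file.

```javascript
// Rebuild the first 160 bytes of the file from the hex dump above.
const dump = `
4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00
b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 80 00 00 00
0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68
69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f
74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20
6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00
50 45 00 00 4c 01 03 00 91 3f 9a ef 00 00 00 00
00 00 00 00 e0 00 22 00 0b 01 30 00 00 12 00 00
`;
const bytes = Buffer.from(dump.replace(/\s+/g, ''), 'hex');

// The DOS header stores the offset of the "PE\0\0" signature at 0x3C.
const peOffset = bytes.readUInt32LE(0x3c);
const signature = bytes.toString('latin1', peOffset, peOffset + 2);

// The COFF file header follows the 4-byte signature.
const coff = peOffset + 4;
const header = {
  pe_header_offset: peOffset,
  signature: signature,
  machine_type: bytes.readUInt16LE(coff),
  number_of_sections: bytes.readUInt16LE(coff + 2),
  timestamp: bytes.readInt32LE(coff + 4), // read as signed, like the library output
  size_of_optional_header: bytes.readUInt16LE(coff + 16),
  characteristics_bitflag: bytes.readUInt16LE(coff + 18),
  // The optional header starts right after the 20-byte COFF header.
  object_type_code: bytes.readUInt16LE(coff + 20),
  size_of_code: bytes.readUInt32LE(coff + 24),
};

console.log(header);
```

The values this prints agree with the library's output for the same fields, so the metadata object at least appears to be decoding the header correctly.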
From this, I think I may have found the two pieces of info I need:
- First instruction byte: windows_specific.size_of_headers (512)
- RIP starting value: address_of_entry_point (12586)
But I'm basically guessing. Could anyone more familiar with this metadata explain which properties I should actually look at to get the info I need?
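For reference, here is a minimal sketch of the arithmetic as the PE format describes it, which differs from my guess. In the PE format, address_of_entry_point is an RVA (an offset from image_base once the file is loaded), not a file offset, and translating an RVA to a file offset requires the section table. The section table is not part of the output above, so the single `.text` entry below is HYPOTHETICAL, chosen only to be consistent with base_of_code, size_of_code and size_of_headers; the real values have to be read from the section headers that follow the optional header. The function and property names are also my own.

```javascript
// Values copied from the metadata object above.
const imageBase = 4194304;   // windows_specific.image_base (0x400000)
const entryPointRva = 12586; // address_of_entry_point (0x312A)

// HYPOTHETICAL section table: these numbers are an assumption, not taken
// from the file. Real entries come from the PE section headers.
const sections = [
  {
    name: '.text',
    virtualAddress: 0x2000,   // assumed equal to base_of_code
    virtualSize: 0x1200,      // assumed equal to size_of_code
    pointerToRawData: 0x200,  // assumed equal to size_of_headers
    sizeOfRawData: 0x1200,
  },
];

// Translate an RVA to a file offset by finding the section that contains it.
function rvaToFileOffset(rva, sectionTable) {
  for (const s of sectionTable) {
    const span = Math.max(s.virtualSize, s.sizeOfRawData);
    if (rva >= s.virtualAddress && rva < s.virtualAddress + span) {
      return rva - s.virtualAddress + s.pointerToRawData;
    }
  }
  return null; // the RVA is not backed by any section in the file
}

// Address to hand to the disassembler as the starting instruction pointer,
// assuming the image is loaded at its preferred base (no relocation).
const entryPointVa = imageBase + entryPointRva;

// Byte position in the file where the entry-point instruction is stored.
const entryPointFileOffset = rvaToFileOffset(entryPointRva, sections);

console.log('entry VA: 0x' + entryPointVa.toString(16));
console.log('entry file offset:', entryPointFileOffset);
```

Under those assumed section values, size_of_headers (512) would be where the first section's raw data begins, which is not necessarily where execution starts.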