Wednesday, March 23, 2011

Windows and its PE file structure

I'll start this post off by asking a question: WTF is a PE file? A PE file is something we use on a day-to-day basis when we use our computer systems. The files that have the ".exe" and ".dll" extensions are what we refer to as PE (Portable Executable) files. A PE file contains one of the most complex file structures that I've ever seen, and it's very important to understand most, if not all, of it if you want to modify a binary file or become a reverse engineer. Because there are so many structures, I can't go through them all (I don't even understand 50% of them), but I will try to focus on the most common ones.

For a visual of what the structure looks like, go to Google Images and search "PE file format".
Here is one that I found and usually reference: link



[MZ header] - "Hex bytes: 4d 5a"
[DOS stub] - "This program cannot be run in DOS mode"
[PE header] - "Hex bytes: 50 45 00 00"
[Optional header]
[Data directory] - "Structure of important locations such as import table, export table, etc."
[Section table header] - "Array of structures describing the properties of each section."
[Section 1]
[Section 2]
[Section n]

Every PE file should contain the above information. The very first two bytes of the file should be "4d 5a", which is "MZ" in ASCII. This indicates the start of the DOS header. At position 0x3c in the DOS header is a dword (4 bytes) that holds the file offset of the start of the PE header. Directly after the DOS header should be the DOS stub, a small program that basically prints a string saying that this program cannot be run in DOS mode, or something similar.
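To make the MZ check and the 0x3c lookup concrete, here is a minimal Python sketch. The function name read_pe_offset is my own, and it assumes the whole file has already been read into a bytes object.

```python
import struct

def read_pe_offset(data: bytes) -> int:
    """Return the file offset of the PE header, read from 0x3c in the DOS header."""
    # The DOS header is 0x40 bytes long; the dword at 0x3c is its last field.
    if len(data) < 0x40:
        raise ValueError("file too small to hold a DOS header")
    # The first two bytes must be 4d 5a ("MZ").
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # "<I" = little-endian unsigned 32-bit value (a dword).
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return pe_offset
```

You would call it as read_pe_offset(open(path, "rb").read()), where path points at any .exe or .dll.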

Following the dword offset at position 0x3c should take you to the start of the PE header, which should contain the hex bytes "50 45 00 00" ("PE" followed by two null bytes). Other useful information contained in here includes the machine type (i386, x64, etc.), the number of sections and the size of the optional header.
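As a rough sketch of reading those fields, the snippet below checks the signature and unpacks the 20-byte file header that follows it. The function name, the dictionary keys and the small MACHINE_NAMES table are my own choices; the table only covers the two most common machine values.

```python
import struct

# Only two common machine values are listed here; many more exist.
MACHINE_NAMES = {0x014C: "i386", 0x8664: "x64"}

def read_file_header(data: bytes, pe_offset: int) -> dict:
    """Parse the PE signature and the file header that follows it."""
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # Fields in order: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader,
    # Characteristics (20 bytes in total).
    (machine, num_sections, _timestamp, _sym_ptr, _num_syms,
     opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "number_of_sections": num_sections,
        "size_of_optional_header": opt_size,
        "characteristics": characteristics,
    }
```

The pe_offset argument is the value stored at 0x3c in the DOS header.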

The optional header starts 24 bytes from the start of the PE header (4 bytes of signature plus the 20-byte file header). This structure is in every PE executable and isn't really optional, as the name may suggest. It contains many relevant fields that the Windows loader needs in order to load the file correctly into memory.
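Here is a small sketch that pulls three of those fields out of the optional header: the magic value (which says whether the file is 32-bit PE32 or 64-bit PE32+), the entry point and the image base. The function name and the returned keys are made up for this example, and the field offsets are the ones I know from the PE32 and PE32+ layouts.

```python
import struct

def read_optional_header(data: bytes, pe_offset: int) -> dict:
    """Read a few loader-relevant fields from the optional header."""
    opt = pe_offset + 24  # 4-byte signature + 20-byte file header
    (magic,) = struct.unpack_from("<H", data, opt)
    # AddressOfEntryPoint sits at offset 16 in both layouts.
    (entry_point,) = struct.unpack_from("<I", data, opt + 16)
    if magic == 0x10B:    # PE32: 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
        fmt = "PE32"
    elif magic == 0x20B:  # PE32+: 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
        fmt = "PE32+"
    else:
        raise ValueError("unexpected optional header magic: " + hex(magic))
    return {"format": fmt, "entry_point_rva": entry_point, "image_base": image_base}
```

The entry point is stored relative to the image base, so the address where execution starts in memory is the sum of the two.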

The data directory is a listing of the locations and sizes of important data such as the import table (when you use functions from Windows DLLs, you have to import them) and the export table.
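A hedged sketch of walking the data directory follows. Each entry is a pair of dwords (an address relative to the image base, and a size), and the array lives at the end of the optional header. The function name is hypothetical, and the offsets 92/96 (PE32) and 108/112 (PE32+) are the ones I know for the two layouts.

```python
import struct

def read_data_directories(data: bytes, pe_offset: int) -> list:
    """Return the data directory as a list of (rva, size) tuples."""
    opt = pe_offset + 24
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:      # PE32: entry count at 92, entries start at 96
        count_off = 92
    elif magic == 0x20B:    # PE32+: entry count at 108, entries start at 112
        count_off = 108
    else:
        raise ValueError("unexpected optional header magic: " + hex(magic))
    (count,) = struct.unpack_from("<I", data, opt + count_off)
    count = min(count, 16)  # guard against a bogus count; 16 is the usual number
    first = opt + count_off + 4
    # Index 0 is the export table, index 1 is the import table.
    return [struct.unpack_from("<II", data, first + i * 8) for i in range(count)]
```

An entry of (0, 0) means the file simply doesn't have that table; for example, most .exe files have an empty export entry.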

The section header is an array of structures containing the properties of each section. This information includes its name, its size on disk and in memory, and its location.
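The section table can be read with a short loop like the one sketched below. It starts right after the optional header, and each entry is 40 bytes. The function name and dictionary keys are my own; the number of sections and the size of the optional header are re-read from the file header so the snippet stands on its own.

```python
import struct

def read_section_headers(data: bytes, pe_offset: int) -> list:
    """Return one dict per entry in the section table."""
    # NumberOfSections and SizeOfOptionalHeader live in the file header.
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    table = pe_offset + 24 + opt_size  # section table follows the optional header
    sections = []
    for i in range(num_sections):
        # Name[8], VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData,
        # three relocation/line-number pointers and counts, then Characteristics.
        (name, virtual_size, virtual_address, raw_size, raw_offset,
         _p_reloc, _p_lines, _n_reloc, _n_lines, characteristics) = \
            struct.unpack_from("<8sIIIIIIHHI", data, table + i * 40)
        sections.append({
            "name": name.rstrip(b"\x00").decode("ascii", errors="replace"),
            "virtual_size": virtual_size,        # size in memory
            "virtual_address": virtual_address,  # location in memory (relative)
            "raw_size": raw_size,                # size on disk
            "raw_offset": raw_offset,            # location on disk
            "characteristics": characteristics,
        })
    return sections
```

Typical names you will see are ".text" (code), ".data" and ".rsrc", but the names are only a convention.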

The rest of the file houses the individual sections referenced in the section header. You can use the information in the section header to find the relevant offset and size of each section.
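Pulling a section's bytes out of the file is then just a slice. The sketch below assumes you already have the section's on-disk offset and size from its section header entry; the function name is made up for the example.

```python
def section_bytes(data: bytes, raw_offset: int, raw_size: int) -> bytes:
    """Return the on-disk contents of one section."""
    if raw_offset + raw_size > len(data):
        raise ValueError("section extends past the end of the file")
    # The section's data is the raw_size bytes starting at raw_offset.
    return data[raw_offset:raw_offset + raw_size]
```

For example, a section whose header says offset 0x400 and size 0x200 occupies file bytes 0x400 through 0x5ff.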