A file with the .pdb extension is a program database file that contains debugging information for a compiled executable (EXE/DLL). PDB files are generated by Microsoft compilers when an application is compiled in debug mode. The presence of a PDB file can help in reverse engineering an executable, because it contains significant information about all the symbols in the modules. It is for this reason that these files are kept separate from the final executable. Microsoft’s DbgHelp API can open a PDB file to obtain information such as publics and exports, global symbols, local symbols, type data, source files and line numbers.
PDB File Format
PDB is Microsoft’s proprietary file format and has not been fully documented officially. However, some introductory documentation is available and can be used as a reference.
PDB files consist of multiple streams, where each stream acts as an individual virtual file and holds one kind of information. PDB file writers can write to these streams, and the file is finalized only after an explicit commit is issued. A compiler can therefore keep writing to a PDB file and commit only if all user code compiles successfully. A PDB file consists of the following streams:
Pdb (header): Version information, and information to connect this PDB to the EXE.
Tpi (Type manager): All the types used in the executable.
Dbi (Debug information): Holds section contributions and the list of ‘Mods’.
NameMap: Holds a hashed string table.
n Mods (Module information): Each Mod stream holds symbols and line numbers for one compiland.
Global symbol hash: An index that allows searching in global symbols by name.
Public symbol hash: An index that allows searching in public symbols by address.
Symbol records: The actual symbol records of global and public symbols.
Type hash: Hash used by the TPI stream.
Each stream in a PDB file comprises several pages, which are not necessarily consecutively numbered.
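The following is a minimal sketch of how a reader might reassemble one stream from scattered pages. The function name read_stream, the toy 4-byte page size and the sample data are assumptions made for illustration only; they are not taken from any Microsoft tool or from a real PDB file.

```python
from typing import Sequence


def read_stream(file_bytes: bytes, page_size: int,
                pages: Sequence[int], length: int) -> bytes:
    """Concatenate the listed pages in order, then trim to the stream length."""
    chunks = []
    for page_no in pages:
        start = page_no * page_size
        chunks.append(file_bytes[start:start + page_size])
    # The last page is usually only partly used, so cut off the padding.
    return b"".join(chunks)[:length]


# Hypothetical toy "file" made of four 4-byte pages.
PAGE_SIZE = 4
toy_file = b"HDR0" + b"WXYZ" + b"----" + b"ABCD"

# A 6-byte stream stored on page 3 followed by page 1 (not consecutive).
stream = read_stream(toy_file, PAGE_SIZE, pages=[3, 1], length=6)
print(stream)
```

The page list, not the position in the file, defines the byte order of the stream, which is why the pages can sit anywhere in the file.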
A PDB file begins with a header that contains a signature for identifying and validating the specific format. The length of the signature depends on the PDB format. The header may be longer than a single page.
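As a rough illustration of signature checking, the sketch below compares the first bytes of a file against two signature strings. The byte strings are assumptions drawn from publicly available descriptions of the 2.0 and 7.0 container formats, not from an official specification, and detect_pdb_format is a hypothetical helper name.

```python
# Assumed signatures (not from an official spec): note the different lengths.
PDB_SIGNATURES = {
    "PDB 2.0": b"Microsoft C/C++ program database 2.00\r\n\x1aJG\x00\x00",
    "PDB 7.0": b"Microsoft C/C++ MSF 7.00\r\n\x1aDS\x00\x00\x00",
}


def detect_pdb_format(data: bytes):
    """Return the name of the matching format, or None if no signature matches."""
    for name, magic in PDB_SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


# Usage with in-memory bytes; a real caller would read the start of a .pdb file.
sample = PDB_SIGNATURES["PDB 7.0"] + b"\x00" * 16
print(detect_pdb_format(sample))
print(detect_pdb_format(b"MZ\x90\x00"))
```

Because the two signatures differ in length, a reader has to identify the format before it knows where the remaining header fields start.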
The PDB metadata identifies all of the component streams and gives the length and the sequence of pages for each one. Streams are numbered consecutively, starting with 0. There is also an unnumbered root stream, which contains some of the metadata.
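A possible shape for this metadata is sketched below. The layout used here (a stream count, then one length per stream, then each stream's page numbers, all as little-endian 32-bit integers) is an assumption based on public descriptions of the 7.0 container format; the function name and sample values are hypothetical, and real files have extra cases that this sketch ignores.

```python
import struct


def parse_stream_directory(root: bytes, page_size: int):
    """Decode per-stream lengths and page lists from the root stream bytes."""
    (num_streams,) = struct.unpack_from("<I", root, 0)
    sizes = struct.unpack_from(f"<{num_streams}I", root, 4)
    offset = 4 + 4 * num_streams
    streams = []
    for size in sizes:
        n_pages = -(-size // page_size)  # ceiling division
        pages = struct.unpack_from(f"<{n_pages}I", root, offset)
        offset += 4 * n_pages
        streams.append({"length": size, "pages": list(pages)})
    return streams


# Hypothetical root stream: 2 streams, lengths 6 and 3, with 4-byte pages.
# Stream 0 uses pages 3 and 1; stream 1 uses page 2.
root = struct.pack("<I2I3I", 2, 6, 3, 3, 1, 2)
for index, info in enumerate(parse_stream_directory(root, page_size=4)):
    print(index, info)
```

The list index in the result plays the role of the stream number, which matches the consecutive numbering from 0 described above.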