How to analyze the API address acquisition process

Overview

In the previous article, we obtained the base address of a loaded DLL. In this article, I will explain how to find the address of a specific API starting from that DLL address, and how to analyze a program that does this. The technique is widely known and is used by many malware families, so a malware analyst needs to be familiar with it.

A DLL contains an area called “IMAGE_NT_HEADERS” that starts with the magic “PE”. Information about the DLL is stored in its “OptionalHeader” member (type: “IMAGE_OPTIONAL_HEADER32”). The export function information is described by “DataDirectory [0]” (type: “IMAGE_DATA_DIRECTORY”), which is a member of “IMAGE_OPTIONAL_HEADER32”. The “VirtualAddress” member of that “IMAGE_DATA_DIRECTORY” entry holds the offset of an “IMAGE_EXPORT_DIRECTORY” structure, and from that structure we can find the address of each API.

*Remarks: This document covers 32-bit (Win32) applications. “Microsoft’s official documentation” here refers to the MSDN website.

1st Step — IMAGE_DOS_HEADER

The DLL base address (DLL start address) points to the “IMAGE_DOS_HEADER” structure. We must understand the “IMAGE_DOS_HEADER” structure to proceed to the next step. I searched Microsoft’s official documentation for the “IMAGE_DOS_HEADER” structure, but couldn’t find it. However, the structure is defined in “winnt.h”, which is part of the Windows SDK installed with Microsoft Visual Studio.

I also referred to other documentation about “IMAGE_DOS_HEADER”.

struct IMAGE_DOS_HEADER
https://www.nirsoft.net/kernel_struct/vista/IMAGE_DOS_HEADER.html

We want to know the address of the “IMAGE_NT_HEADERS” structure, which starts with the magic “PE”. Its location is stored in the “e_lfanew” member, which is located at offset “0x3C” in the “IMAGE_DOS_HEADER” structure. “e_lfanew” is an offset from the base address of the DLL, so we add it to the DLL base address to get the memory address of “IMAGE_NT_HEADERS”.
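As a minimal sketch of this step, the Python snippet below reads “e_lfanew” from a byte buffer and checks the “PE” magic. The buffer is a synthetic stand-in for the memory at the DLL base address; the function name `nt_headers_offset` and the value 0x80 are assumptions made only for illustration.

```python
import struct

E_LFANEW_OFFSET = 0x3C  # position of e_lfanew inside IMAGE_DOS_HEADER


def nt_headers_offset(image: bytes) -> int:
    """Return the offset of IMAGE_NT_HEADERS from the start of the image."""
    if image[:2] != b"MZ":
        raise ValueError("IMAGE_DOS_HEADER does not start with 'MZ'")
    # e_lfanew is a 4-byte little-endian LONG.
    (e_lfanew,) = struct.unpack_from("<i", image, E_LFANEW_OFFSET)
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("IMAGE_NT_HEADERS does not start with 'PE'")
    return e_lfanew


# Synthetic image (hypothetical values): "MZ" header with e_lfanew = 0x80.
image = bytearray(0x100)
image[0:2] = b"MZ"
struct.pack_into("<i", image, E_LFANEW_OFFSET, 0x80)
image[0x80:0x84] = b"PE\x00\x00"

dll_base = 0x76A70000  # DLL base address from the example later in this article
nt_headers_address = dll_base + nt_headers_offset(bytes(image))
print(hex(nt_headers_address))
```

In disassembled malware, the same step usually appears as a read from [base + 0x3C] followed by an addition of the base address.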

2nd Step — IMAGE_NT_HEADERS and IMAGE_OPTIONAL_HEADER

“IMAGE_NT_HEADERS” is the area of the structure that starts with the magic of “PE”.

IMAGE_NT_HEADERS32 structure (winnt.h)
https://docs.microsoft.com/ja-jp/windows/win32/api/winnt/ns-winnt-image_nt_headers32

Its members are “Signature”, which contains the magic “PE”, “FileHeader” of type “IMAGE_FILE_HEADER”, and “OptionalHeader” of type “IMAGE_OPTIONAL_HEADER32”. To find the address of a DLL API, refer to “OptionalHeader”. “Signature” is 4 bytes and “IMAGE_FILE_HEADER” is 0x14 bytes, so the address of “OptionalHeader” is the address of “IMAGE_NT_HEADERS” + 0x18.

The “IMAGE_OPTIONAL_HEADER32” type is a structure that contains version and size information.

IMAGE_OPTIONAL_HEADER32 structure
https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-image_optional_header32

Its last member is “DataDirectory []”, an array of type “IMAGE_DATA_DIRECTORY”. Information on the export functions is stored in “DataDirectory [0]”. The address of “DataDirectory [0]” is the address of “IMAGE_OPTIONAL_HEADER32” + 0x60.

Therefore, the address of “DataDirectory [0]” is the address of “IMAGE_NT_HEADERS” + 0x78 (0x18 + 0x60).
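The offsets 0x18, 0x60, and 0x78 can be checked by declaring the structures and asking for the field offsets. The sketch below does this with Python's `ctypes`. The field lists are transcribed from “winnt.h” as I understand it, so treat them as an assumption to verify against the header, not as an authoritative definition.

```python
import ctypes

WORD, DWORD, BYTE = ctypes.c_uint16, ctypes.c_uint32, ctypes.c_uint8


class IMAGE_FILE_HEADER(ctypes.Structure):
    _fields_ = [
        ("Machine", WORD), ("NumberOfSections", WORD),
        ("TimeDateStamp", DWORD), ("PointerToSymbolTable", DWORD),
        ("NumberOfSymbols", DWORD), ("SizeOfOptionalHeader", WORD),
        ("Characteristics", WORD),
    ]


class IMAGE_DATA_DIRECTORY(ctypes.Structure):
    _fields_ = [("VirtualAddress", DWORD), ("Size", DWORD)]


class IMAGE_OPTIONAL_HEADER32(ctypes.Structure):
    _fields_ = [
        ("Magic", WORD),
        ("MajorLinkerVersion", BYTE), ("MinorLinkerVersion", BYTE),
        ("SizeOfCode", DWORD), ("SizeOfInitializedData", DWORD),
        ("SizeOfUninitializedData", DWORD), ("AddressOfEntryPoint", DWORD),
        ("BaseOfCode", DWORD), ("BaseOfData", DWORD), ("ImageBase", DWORD),
        ("SectionAlignment", DWORD), ("FileAlignment", DWORD),
        ("MajorOperatingSystemVersion", WORD),
        ("MinorOperatingSystemVersion", WORD),
        ("MajorImageVersion", WORD), ("MinorImageVersion", WORD),
        ("MajorSubsystemVersion", WORD), ("MinorSubsystemVersion", WORD),
        ("Win32VersionValue", DWORD), ("SizeOfImage", DWORD),
        ("SizeOfHeaders", DWORD), ("CheckSum", DWORD),
        ("Subsystem", WORD), ("DllCharacteristics", WORD),
        ("SizeOfStackReserve", DWORD), ("SizeOfStackCommit", DWORD),
        ("SizeOfHeapReserve", DWORD), ("SizeOfHeapCommit", DWORD),
        ("LoaderFlags", DWORD), ("NumberOfRvaAndSizes", DWORD),
        ("DataDirectory", IMAGE_DATA_DIRECTORY * 16),
    ]


class IMAGE_NT_HEADERS32(ctypes.Structure):
    _fields_ = [
        ("Signature", DWORD),
        ("FileHeader", IMAGE_FILE_HEADER),
        ("OptionalHeader", IMAGE_OPTIONAL_HEADER32),
    ]


optional_header_offset = IMAGE_NT_HEADERS32.OptionalHeader.offset
data_directory_offset = IMAGE_OPTIONAL_HEADER32.DataDirectory.offset
export_entry_offset = optional_header_offset + data_directory_offset

print(hex(optional_header_offset))  # 0x18
print(hex(data_directory_offset))   # 0x60
print(hex(export_entry_offset))     # 0x78
```

These values apply only to the 32-bit optional header covered in this article. The 64-bit optional header has a different layout, so 0x60 should not be reused there.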

3rd Step — IMAGE_EXPORT_DIRECTORY

The “IMAGE_DATA_DIRECTORY” type has only two members, “VirtualAddress” and “Size”.

IMAGE_DATA_DIRECTORY structure (winnt.h)
https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-image_data_directory

“VirtualAddress” contains the offset of “IMAGE_EXPORT_DIRECTORY” from the DLL base address. The official Microsoft documentation may not make the meaning of the “VirtualAddress” value clear. Because “VirtualAddress” is the first member of “IMAGE_DATA_DIRECTORY”, its address is the same as the address of “DataDirectory [0]” in “IMAGE_OPTIONAL_HEADER32”.

Therefore, the offset value of “IMAGE_EXPORT_DIRECTORY” is stored in the address of “IMAGE_NT_HEADERS” + 0x78.
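The following sketch reads that value and converts it into a memory address. It assumes `memory` is a snapshot of the mapped DLL in which index 0 corresponds to the DLL base address. The function name and the sample values 0x80, 0x1000, and 0x40 are hypothetical.

```python
import struct

EXPORT_ENTRY_OFFSET = 0x78  # IMAGE_NT_HEADERS -> DataDirectory[0].VirtualAddress


def export_directory_address(memory: bytes, dll_base: int, nt_offset: int) -> int:
    """Return the memory address of IMAGE_EXPORT_DIRECTORY.

    memory is a snapshot of the mapped DLL, so memory[0] is the byte at dll_base.
    """
    # IMAGE_DATA_DIRECTORY is two little-endian DWORDs: VirtualAddress, Size.
    virtual_address, size = struct.unpack_from(
        "<II", memory, nt_offset + EXPORT_ENTRY_OFFSET
    )
    if virtual_address == 0 or size == 0:
        raise ValueError("this image has no export directory")
    # VirtualAddress is an offset from the DLL base, not a pointer.
    return dll_base + virtual_address


# Synthetic snapshot (hypothetical values): IMAGE_NT_HEADERS at offset 0x80,
# export directory at offset 0x1000 with size 0x40.
memory = bytearray(0x200)
nt_offset = 0x80
struct.pack_into("<II", memory, nt_offset + EXPORT_ENTRY_OFFSET, 0x1000, 0x40)

address = export_directory_address(bytes(memory), 0x76A70000, nt_offset)
print(hex(address))
```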

I couldn’t find the “IMAGE_EXPORT_DIRECTORY” type in Microsoft’s official documentation. However, the “IMAGE_EXPORT_DIRECTORY” structure is defined in “winnt.h”, which is part of the Windows SDK installed with Microsoft Visual Studio.

IMAGE_EXPORT_DIRECTORY (Structures)
http://pinvoke.net/default.aspx/Structures.IMAGE_EXPORT_DIRECTORY

“AddressOfFunctions” is the offset of an array that contains the offset of each API function.

“AddressOfNames” is the offset of an array that contains the offset of each API name string.

“AddressOfNameOrdinals” is the offset of an array of ordinal numbers. The entry at the index of an API name gives the position of that API in the “AddressOfFunctions” array.

By referring to these, we can find the API address from the API name.
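To make the layout concrete, here is a small sketch that unpacks an “IMAGE_EXPORT_DIRECTORY” from raw bytes. The member order follows the “winnt.h” definition as I understand it. The names `ExportDirectory` and `parse_export_directory` and all sample values are hypothetical.

```python
import struct
from collections import namedtuple

# Member order of IMAGE_EXPORT_DIRECTORY (40 bytes in total).
ExportDirectory = namedtuple(
    "ExportDirectory",
    "Characteristics TimeDateStamp MajorVersion MinorVersion Name Base "
    "NumberOfFunctions NumberOfNames "
    "AddressOfFunctions AddressOfNames AddressOfNameOrdinals",
)

# DWORD, DWORD, WORD, WORD, then seven DWORDs, all little-endian.
EXPORT_DIRECTORY_FORMAT = "<IIHHIIIIIII"


def parse_export_directory(data: bytes, offset: int = 0) -> ExportDirectory:
    """Unpack one IMAGE_EXPORT_DIRECTORY located at data[offset:]."""
    fields = struct.unpack_from(EXPORT_DIRECTORY_FORMAT, data, offset)
    return ExportDirectory._make(fields)


# Synthetic directory (hypothetical values): 4 functions, 3 of them named.
raw = struct.pack(
    EXPORT_DIRECTORY_FORMAT,
    0, 0, 0, 0,   # Characteristics, TimeDateStamp, MajorVersion, MinorVersion
    0x3000,       # Name: offset of the DLL name string
    1,            # Base: first ordinal number
    4, 3,         # NumberOfFunctions, NumberOfNames
    0x2000,       # AddressOfFunctions      (at +0x1C)
    0x2100,       # AddressOfNames          (at +0x20)
    0x2200,       # AddressOfNameOrdinals   (at +0x24)
)

export_dir = parse_export_directory(raw)
print(hex(export_dir.AddressOfFunctions),
      hex(export_dir.AddressOfNames),
      hex(export_dir.AddressOfNameOrdinals))
```

When reading assembly, accesses at +0x1C, +0x20, and +0x24 from the export directory address are a strong hint that the code is walking these three arrays.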

4th Step — Search API address by API Name

We use “AddressOfFunctions”, “AddressOfNames”, and “AddressOfNameOrdinals” to find the API address. These values are offsets, so we need to add the DLL base address to each.

First, search for the target API name through “AddressOfNames”. The names are NULL-terminated ASCII strings. Note the position at which the desired API name is found, counting from 0. This is the “API name index”.

As an example, let’s search for the “AssocQueryKey” API. The “AssocQueryKey” API is fourth, so the index is “3”.

Next, get the “Ordinal number” from “AddressOfNameOrdinals” using the “API name index”. Note that each entry is 2 bytes, so the ordinal number is located at “API name index” * 2 from the start of the array.

As an example, the index of “AssocQueryKey” is “3”, so the ordinal number is “0x1F6”.

Finally, get the value of “AddressOfFunctions” in the ordinal position. This is the “API offset address”. Add the DLL base address to the “API offset address” to get the API address.
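The three lookups in this step can be sketched as follows. This is a simplified illustration and not code taken from any particular malware: `memory` is assumed to be a snapshot of the mapped DLL in which index 0 is the DLL base address, and the table offsets, names, and function offsets are all made up.

```python
import struct


def read_cstring(memory: bytes, offset: int) -> bytes:
    """Read a NULL-terminated ASCII string starting at offset."""
    end = memory.index(b"\x00", offset)
    return memory[offset:end]


def find_export_offset(memory, names, ordinals, functions, number_of_names, target):
    """Return the API offset for target, or None if the name is not exported.

    names, ordinals and functions are the offsets stored in AddressOfNames,
    AddressOfNameOrdinals and AddressOfFunctions.
    """
    for name_index in range(number_of_names):
        # 1. AddressOfNames: 4-byte offsets to the name strings.
        (name_offset,) = struct.unpack_from("<I", memory, names + 4 * name_index)
        if read_cstring(memory, name_offset) != target:
            continue
        # 2. AddressOfNameOrdinals: 2-byte entries, same index as the name.
        (ordinal,) = struct.unpack_from("<H", memory, ordinals + 2 * name_index)
        # 3. AddressOfFunctions: 4-byte API offsets, indexed by the ordinal.
        (api_offset,) = struct.unpack_from("<I", memory, functions + 4 * ordinal)
        return api_offset
    return None


# Synthetic tables (hypothetical layout and values).
memory = bytearray(0x200)
struct.pack_into("<3I", memory, 0x40, 0x100, 0x106, 0x10B)      # name offsets
struct.pack_into("<3H", memory, 0x60, 2, 0, 1)                  # ordinals
struct.pack_into("<3I", memory, 0x80, 0x1110, 0x1220, 0x1330)   # API offsets
memory[0x100:0x111] = b"Alpha\x00Beta\x00Gamma\x00"

dll_base = 0x76A70000
api_offset = find_export_offset(bytes(memory), 0x40, 0x60, 0x80, 3, b"Beta")
print(hex(dll_base + api_offset))
```

Here “Beta” has name index 1, the ordinal table maps index 1 to position 0, and position 0 of the function table holds the API offset 0x1110.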

As an example, the ordinal number of “AssocQueryKey” is “0x1F6”. Each address entry is 4 bytes, so the offset is 0x1F6 * 4 = 0x7D8. The API offset is therefore stored at AddressOfFunctions (= 0x76AA6238) + 0x7D8 = 0x76AA6A10.

The value stored there, the offset address of the API, is “0x22E40”. The DLL base address is “0x76A70000”, so the API address is 0x76A70000 + 0x22E40 = “0x76A92E40”. If we look at the code at that address, we can confirm that it is correct.
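The arithmetic of this example can be double-checked with a few lines of Python. All constants below are the values quoted in the example above; only the variable names are mine.

```python
DLL_BASE = 0x76A70000               # base address of the DLL
ADDRESS_OF_FUNCTIONS = 0x76AA6238   # AddressOfFunctions after adding the base
NAME_INDEX = 3                      # "AssocQueryKey" is the fourth name
ORDINAL = 0x1F6                     # value read from AddressOfNameOrdinals
API_OFFSET = 0x22E40                # value read from the function table

ordinal_entry_offset = NAME_INDEX * 2       # ordinal entries are 2 bytes each
function_entry_offset = ORDINAL * 4         # function entries are 4 bytes each
function_entry_address = ADDRESS_OF_FUNCTIONS + function_entry_offset
api_address = DLL_BASE + API_OFFSET

print(hex(ordinal_entry_offset))    # 0x6
print(hex(function_entry_offset))   # 0x7d8
print(hex(function_entry_address))  # 0x76aa6a10
print(hex(api_address))             # 0x76a92e40
```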

Conclusion

I showed how to find the address of “IMAGE_EXPORT_DIRECTORY” by following the header information from the DLL base address, and how to find the API address using its members “AddressOfFunctions”, “AddressOfNames”, and “AddressOfNameOrdinals”. Combined with the method in the previous article, you can find the address of any API exported by a loaded DLL. This method obtains the API address without calling the Windows API, which is why malware uses it to make code analysis difficult.

Malware also often uses hash values to conceal API names. The only difference is that it does not compare API names as ASCII strings. Instead, it calculates the hash value of each API name and compares the hashes. This allows malware authors to keep the API names they use out of the code.
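As an illustration of the hash variant, the sketch below uses a rotate-right-by-13 additive hash, a scheme often seen in shellcode. Which hash a given sample uses varies, so the algorithm here is an assumption and must be recovered from the sample itself. Apart from “AssocQueryKey”, the export names in the list are placeholders.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name: bytes) -> int:
    """Hash an API name: rotate right by 13, then add the next character."""
    value = 0
    for byte in name:
        value = (ror32(value, 13) + byte) & 0xFFFFFFFF
    return value


def find_name_index_by_hash(names, target_hash):
    """Return the index of the first name whose hash matches, or None."""
    for index, name in enumerate(names):
        if ror13_hash(name) == target_hash:
            return index
    return None


# Placeholder export names; in a real DLL these come from AddressOfNames.
export_names = [b"ExportOne", b"ExportTwo", b"ExportThree", b"AssocQueryKey"]

# The malware stores only this constant, never the string itself.
target_hash = ror13_hash(b"AssocQueryKey")

name_index = find_name_index_by_hash(export_names, target_hash)
print(name_index, hex(target_hash))
```

For an analyst, the practical approach is to reimplement the sample's hash function like this, hash every export name of the candidate DLLs, and look up the constants found in the code.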

This is a basic technique, but I hope it helps you analyze malware.

Security Analyst in Japan. GIAC GREM (Gold)