VMProtect trace parser

Sep 13, 2024

Abstract

In my previous article, I explained several features of VMProtect. While the obfuscation techniques employed by VMProtect are not particularly advanced, its extremely redundant mechanisms still take significant time to analyze. As analysts, we often need to identify where Windows API calls are made and which APIs are being called. To address this, I created a simple tool that extracts API calls from trace data produced by the well-known disassembler IDA. The tool identifies the addresses of the API calls made while the process runs, along with the names of the called APIs.

Restrictions:

This is an article about the analysis of a packer, believed to be VMProtect, found in “PrivateLoader”. There remains the possibility that this malware disguised the packer it is using.
The version of the packer is unknown. Newer or older versions may behave differently.

Malware hash:

MD5: F6570495946923AA4D1467FDBAFBC2F6
SHA1: A0390712FE78C98DB97DC7CCAEA6E0929F548C95

Tool (on GitHub):

https://github.com/Sachiel-archangel/VMProtect_TraceParser

Tool overview

When analyzing a packer that appears to be VMProtect, static analysis faces several challenges. In this packer, the main code runs in the .text section. However, in the packed executable the .text section is filled with 0x00, and the decrypted code is written into it only during the initial processing at run time. Additionally, Windows API calls made from the .text section are routed through the .vmp0 section for obfuscation. As a result, the import table is not used, making it unclear which Windows APIs are being called from the .text section. For more details on these techniques, please refer to the previous article.

I considered dynamic analysis to be a good approach to solve this issue. Instead of going through the trouble of analyzing the decryption logic and creating an unpacker, it would be better to let the software decrypt the code itself. One method of dynamic analysis is to execute the software and obtain trace logs. This approach resolves the problems of encryption and Windows API obfuscation. However, as mentioned in the previous article, this packer inserts many redundant instructions, which contaminates the trace logs. Therefore, I came up with a method to extract Windows API calls from the trace logs. The Windows API calls are characterized by starting with a call instruction in the .text section and going through the code in the .vmp0 section to call the Windows API. By identifying this pattern, we can discover pairs of call instructions in the .text section and the corresponding Windows APIs being called.
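The pattern described above can be sketched as a small state machine. This is only an illustration, not the code of the tool itself: the TraceEvent record, its fields, and the function name extract_api_calls are assumptions made for this sketch, and a real IDA trace log would first have to be parsed into such records.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class TraceEvent:
    """Hypothetical, pre-parsed trace record (not IDA's actual log format)."""
    segment: str   # e.g. ".text", ".vmp0", or a DLL name such as "kernel32.dll"
    address: int   # address where the event was logged
    text: str      # mnemonic for code events, API name for DLL events


def extract_api_calls(
    events: Iterable[TraceEvent],
    code_seg: str = ".text",
    stub_seg: str = ".vmp0",
) -> List[Tuple[int, str]]:
    """Return (call address in .text, API name) pairs.

    A pair is reported only when a call in .text is followed by events
    in .vmp0 and then by an event outside both sections (the API).
    """
    results: List[Tuple[int, str]] = []
    candidate: Optional[int] = None  # pending call address in .text
    in_stub = False                  # True once the call has entered .vmp0

    for ev in events:
        if ev.segment == code_seg:
            # A new call in .text starts a candidate; anything else resets.
            candidate = ev.address if ev.text.startswith("call") else None
            in_stub = False
        elif ev.segment == stub_seg:
            if candidate is not None:
                in_stub = True
        else:
            # Outside .text and .vmp0: assumed to be a Windows API.
            if candidate is not None and in_stub:
                results.append((candidate, ev.text))
            candidate = None
            in_stub = False
    return results


sample = [
    TraceEvent(".text", 0x401000, "call"),
    TraceEvent(".vmp0", 0x8A1000, "call"),
    TraceEvent(".vmp0", 0x8A2000, "retn"),
    TraceEvent("kernel32.dll", 0x76000000, "GetProcAddress"),
    TraceEvent(".text", 0x401005, "mov"),
    TraceEvent(".text", 0x401010, "call"),
    TraceEvent(".text", 0x402000, "retn"),
]
pairs = extract_api_calls(sample)
```

In this sample, only the first call is reported, because the second call stays inside the .text section and never passes through .vmp0.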

This time, I used IDA to output the trace. IDA is one of the major tools in reverse engineering. Additionally, IDA offers a Free version and a more affordable Home edition. If you want to learn malware reverse engineering, I recommend adding this tool to your toolkit.

How to Use: Step-by-Step Guide

Load the malware that uses VMProtect into IDA. Then perform the following steps:

Set the debugger to stop at the entry point.
Run debugging and stop at the entry point.
Set up the trace and start tracing.
Resume debugging.
Stop at any point and obtain a trace log file.
Extract Windows API calls with the tool.

1. Set the debugger to stop at the entry point

In the “Debugger” menu, open the “Debugger setup” dialog and check “Suspend on process entry point” so that the debugger stops at the entry point. If clicking the “Debugger” menu shows “Select debugger” and you are unsure what to choose, I recommend selecting “Local Windows debugger”.

2. Run debugging and stop at the entry point

Select “Debugger” → “Start process” to start debugging. The process stops at the entry point.

3. Set up the trace and start tracing

To open the “Tracing options” dialog, select the menu item “Debugger” → “Tracing” → “Tracing options”. In this dialog, you must at least set the output file name for the trace and check the “Log return instructions” checkbox. For other parameters, see the figure.

After setting up the tracing, start tracing. Execute “Debugger” → “Tracing” → “Function tracing” from the menu.

4. Resume debugging

Resume the malware process after starting the trace, either from the toolbar or with the “Debugger” → “Continue process” menu. Running VMProtect-packed malware with tracing enabled takes time.

5. Stop at any point and obtain a trace log file

At any point, stop the process by clicking the toolbar button or selecting “Debugger” → “Suspend process” from the menu. The trace is then available in the file whose name was set in “Tracing options”.

6. Extract Windows API calls with the tool

Analyze the trace log file with the tool downloaded from GitHub. The tool is run from the command prompt. There are two main options for the command:

-d input (trace) file name (default: data.log)
-o output file name (default: output.txt)
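If you prefer to drive the tool from a script, the two options above could be assembled as follows. This is a minimal sketch: the executable path is a placeholder you must supply yourself (the built binary's name is not stated here), and the helper names build_command and run_parser are made up for this example.

```python
import subprocess
from typing import List


def build_command(exe_path: str,
                  trace_path: str = "data.log",
                  output_path: str = "output.txt") -> List[str]:
    """Assemble the argument list using the -d and -o options.

    exe_path is an assumption: pass the path of the tool you built.
    The defaults mirror the defaults listed for the two options.
    """
    return [exe_path, "-d", trace_path, "-o", output_path]


def run_parser(exe_path: str, trace_path: str, output_path: str) -> None:
    """Run the tool and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(exe_path, trace_path, output_path), check=True)


# Placeholder executable name, used only to show the resulting arguments.
cmd = build_command("path/to/parser.exe", "trace.log", "apis.txt")
```

Passing the arguments as a list avoids shell quoting problems when the trace file path contains spaces.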

Verification and use of analyzed trace logs

Let us examine how the trace log is analyzed. The acquired trace looks like the example shown in the diagram. When the code in the .text section calls a Windows API, it actually calls a function in the .vmp0 section instead. After several obfuscated calls, control is transferred to the Windows API by a retn instruction. The tool identifies this characteristic and picks up each pair of a call from the .text section and the corresponding Windows API name.

The results analyzed by the tool are shown in the diagram below. The red boxes highlight the locations picked up, corresponding to the areas marked with red lines in the diagram above. Subsequent code sections are also picked up in a similar manner. This allows us to understand the relationship between the call instructions in the code unpacked into the .text section and the Windows APIs that were called. In addition, if an address is expressed as an addition formula, it has been modified to display the result of the addition.
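The address arithmetic mentioned above can be illustrated with a short sketch. The exact form in which the trace expresses such sums is an assumption here: the example treats an address as hexadecimal terms joined by "+", with an optional table of known symbol addresses. The function name resolve_address and the symbol table are hypothetical and are not taken from the tool.

```python
from typing import Dict, Optional


def resolve_address(expr: str, symbols: Optional[Dict[str, int]] = None) -> int:
    """Evaluate an address written as a sum, e.g. "401000+1A".

    Each term is looked up in `symbols` first; otherwise it is parsed
    as a hexadecimal number (with or without a 0x prefix). A term that
    is neither a known symbol nor valid hex raises ValueError.
    """
    symbols = symbols or {}
    total = 0
    for term in expr.split("+"):
        term = term.strip()
        if term in symbols:
            total += symbols[term]
        else:
            total += int(term, 16)
    return total


plain = resolve_address("401000+1A")
named = resolve_address("sub_401000+1A", {"sub_401000": 0x401000})
```

Both calls yield 0x40101A, which is the kind of single, resolved address that is easier to search for in the disassembly.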

Reflecting these results in the statically analyzed code can aid malware analysis. The figure below shows the result of editing function names in IDA (red underlines). Users can edit function names in IDA using the ’n’ key. Changing the function names to API names makes the code easier to understand.
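The manual renaming could also be scripted with IDAPython. The following is a rough sketch under several assumptions: the input is taken to be lines of the form "hex_address api_name", which is a simplification and not necessarily the layout of the tool's output; the "vmp_" name prefix is arbitrary; and the unpacked code must already be present in the database (for example during the debugging session). The idc functions used (set_cmt, get_operand_value, set_name) are standard IDAPython calls, but whether operand 0 holds the .vmp0 function address depends on the form of the call instruction.

```python
from typing import Dict, Iterable


def parse_pairs(lines: Iterable[str]) -> Dict[int, str]:
    """Parse "hex_address api_name" lines into {call_address: api_name}.

    The line format is an assumption for this sketch; lines that do not
    match it are skipped.
    """
    pairs: Dict[int, str] = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        addr_text, api_name = parts
        try:
            pairs[int(addr_text, 16)] = api_name
        except ValueError:
            continue
    return pairs


def annotate_in_ida(pairs: Dict[int, str]) -> None:
    """Comment each call site and try to rename the called .vmp0 function.

    Must be run inside IDA, because idc exists only there.
    """
    import idc

    for call_ea, api_name in pairs.items():
        # Always leave a comment at the call site in .text.
        idc.set_cmt(call_ea, api_name, 0)
        # For a direct call, operand 0 is the target address.
        target = idc.get_operand_value(call_ea, 0)
        # Renaming fails quietly if the name is already used elsewhere.
        idc.set_name(target, "vmp_" + api_name, idc.SN_NOWARN)


example = parse_pairs(["00401000 GetProcAddress", "garbage", "zz LoadLibraryA"])
```

Outside IDA, only parse_pairs can be exercised. Inside IDA, you would read the tool's output file, adapt the parsing to its real layout, and pass the result to annotate_in_ida.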

Conclusion

The tool I created might be somewhat basic. It relies on IDA’s tracing functionality, and this method of tracing takes time. If you use IDA, you might be able to develop a feature that automatically renames functions based on the tool's output using Python code. By understanding the mechanism of the packer, users can gain various ideas for tools. I attempted to create one such tool. If you find value in it, please feel free to use it. I would also be happy if my article helps you develop better analysis tools.