The Obfuscation used in “Emotet” in the fall of 2020

Jan 17, 2021

Overview

I am analyzing one of the “Emotet” specimens found in the fall of 2020. I chose it because “Emotet” is one of the most widely abused malware families in Japan, and because I am studying how malware tactics change over time. This article describes techniques that differ from those used by “Emotet” in the spring of 2020, and how to analyze them.

I previously wrote about “Function-based encryption” and “Heaven’s Gate”, which were used in “Emotet” specimens discovered in the spring of 2020. Neither technique appears in the specimens discovered in the fall of 2020. Instead, I found several different techniques. This article introduces some of them and explains how to analyze them.

Target malware hash value:
MD5:46A69F3FBCD669B05BD2FB2D82E85A57
SHA1:D101F7C5642F4BA5BBDDA2EE85979FF97B8CFD13

Outline of techniques

There are three obfuscation techniques introduced in this article:

1. Obfuscation due to useless processing
2. Concealment of the using API
3. Obfuscation by looping processing

No.1 reduces readability by mixing unnecessary processing into the real processing.
No.2 avoids the import table so that the APIs in use stay hidden. It also avoids APIs such as “LoadLibrary” and “GetProcAddress”, which analysts could otherwise trace. In addition, it identifies libraries and APIs by 32-bit hash values rather than by name. Because the hash is one-way, the analyst cannot read the library or API names directly from the binary.
These techniques are also found in many other malware families.

No.3 obfuscates sequential processing with loops and conditional branches. This makes the process flow difficult for analysts to follow, so they need an efficient way to analyze it.

Obfuscation due to useless processing

This is a very simple method. It makes the code harder to read by burying it in useless processing. The code that is really needed is only a few steps, and analysts have to pick out those steps. Alternatively, analysts can observe the important parameters at the exit of the routine. In the case of Figure 1, we only need to focus on 0x001F99C5 and 0x001F99CA.
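The idea can be sketched in Python. This is only an illustration and is not taken from the specimen: the function names, the constants, and the junk arithmetic are all hypothetical. It shows that the junk lines never influence the returned value, so comparing the values at the exit is enough to recover the real behavior.

```python
def decode_clean(value: int) -> int:
    """The real logic: only two meaningful steps."""
    result = value ^ 0x5A
    return result & 0xFF


def decode_with_junk(value: int) -> int:
    """Same logic, buried in useless processing (hypothetical example)."""
    junk = 0x1234                            # junk: never reaches the result
    junk = (junk * 3 + value) & 0xFFFFFFFF   # junk
    result = value ^ 0x5A                    # meaningful step 1
    junk ^= (result << 4) & 0xFFFFFFFF       # junk: reads result, never writes it
    junk = (junk >> 7) | 0x11                # junk
    result &= 0xFF                           # meaningful step 2
    junk += result                           # junk
    return result                            # only `result` leaves the function


if __name__ == "__main__":
    # Observing the value at the exit shows that both versions agree.
    for v in (0x00, 0x5A, 0xFF, 0x1234):
        print(hex(v), hex(decode_clean(v)), hex(decode_with_junk(v)))
```

In a debugger the same approach applies: rather than reading every instruction, break at the end of the routine and inspect the few registers or memory locations that carry the result.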

Concealment of the using API

API concealment is an obstacle to static analysis. When analysts look at the code, they cannot tell whether it is calling an API. Even if they can, they do not know which API is being called. This hinders analysis of the program flow.
On the other hand, malware must obtain the address of an API in order to call it, and to do that it must somehow identify the library and the API by name. If the names were present in the binary, analysts could learn which APIs are used. However, the malware author is cunning: the binary stores only one-way hashes of the library and API names. Therefore, analysts cannot recover the names directly from the stored values.

The malware obtains the addresses of libraries and APIs by relying on knowledge of Windows internals rather than on the standard lookup APIs. Therefore, it is difficult to prevent this technique. For analysis, it is effective to monitor the functions that resolve libraries and APIs, and to observe how the resolved addresses are used.
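A minimal sketch of hash-based resolution and of the monitoring approach follows. Everything in it is an assumption made for illustration: the hash function is a generic 32-bit string hash and not the algorithm used by Emotet, the export table is a small dictionary with made-up addresses, and the function names are hypothetical.

```python
def name_hash(name: str) -> int:
    """Hypothetical 32-bit one-way hash of a name (NOT Emotet's algorithm)."""
    h = 0
    for byte in name.encode("ascii"):
        h = (h * 0x1003F + byte) & 0xFFFFFFFF
    return h


# Stand-in for a loaded module's export table: name -> address (made-up values).
EXPORTS = {
    "CreateFileW": 0x76001000,
    "WriteFile": 0x76002000,
    "CloseHandle": 0x76003000,
}


def resolve_by_hash(target_hash: int, exports=EXPORTS):
    """Malware side: walk the exports and compare hashes, never plain names."""
    for name, address in exports.items():
        if name_hash(name) == target_hash:
            return address
    return None


def monitored_resolve(target_hash: int, log: list):
    """Analyst side: hook the resolver and record what it returns."""
    address = resolve_by_hash(target_hash)
    # The returned address can be mapped back to a name from the export table.
    name = next((n for n, a in EXPORTS.items() if a == address), None)
    log.append((target_hash, address, name))
    return address


if __name__ == "__main__":
    log = []
    stored_hash = name_hash("WriteFile")  # the binary would contain only this number
    monitored_resolve(stored_hash, log)
    for h, addr, name in log:
        print(f"hash={h:#010x} -> address={addr:#010x} ({name})")
```

The stored hash alone does not reveal the name, but the resolver's return value does: once the address is known, the API is known. This is why watching the resolver is more productive than trying to invert the hash.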

Obfuscation by looping processing

This technique changes the apparent order of the program. As a result, analysts have difficulty determining the order in which the code actually executes. Therefore, this technique is an obstacle to static analysis.

Code is usually written sequentially. However, this technique uses loops and conditionals so that the order in which blocks appear differs from the order in which they run. Furthermore, the way a return value is evaluated is complicated. Normally, a condition is tested and followed by a conditional jump such as “jz” or “jnz”. In this technique, however, a value is calculated from the return value and used as the loop’s branching condition. Analysts cannot follow the flow without working out the value calculated from the return value. An effective way to analyze this technique is to trace the code.

Figure 3. Setting condition value by return value

Figure 4. Calculation of condition value by return value

Figure 5. Determining the order of processing

In Figure 3, the return value is checked at 0x001F64A2. Then “cmovnz” selects the value of eax depending on whether the result is 0.
Figure 4 calculates a value from the return value. At 0x001F149D, eax holds the return value of the function called at 0x001F1491. If eax is 1, the result of 0x001F149D is 0xFFFFFFFF; if eax is 0, the result is 0x00000000. At 0x001F14A1, if eax is 0xFFFFFFFF, the result is 0x1EDE78AD; if eax is 0x00000000, the result is 0x00000000. Therefore, the result calculated at 0x001F14A6 changes depending on the return value of 0x001F1491.

As a result of Figures 3 and 4, eax holds the parameter that selects the next function to execute. Figure 5 checks that value and transfers control to the next process.
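The arithmetic described for Figure 4 can be modeled in a few lines of Python. This is a sketch under stated assumptions: 0x1EDE78AD is the constant quoted above, but BASE_STATE and the final addition are hypothetical, because the text does not say which operation is performed at 0x001F14A6. The function name is also made up.

```python
MASK32 = 0xFFFFFFFF
MAGIC = 0x1EDE78AD        # constant quoted in the description of Figure 4
BASE_STATE = 0x0A11CE00   # hypothetical value combined in the last step


def next_state(return_value: int) -> int:
    """Derive the next condition value from a 0/1 return value, without branching."""
    # Step 1: 1 -> 0xFFFFFFFF, 0 -> 0x00000000 (an all-ones or all-zeros mask).
    mask = (-return_value) & MASK32
    # Step 2: the mask keeps or clears the constant.
    selected = mask & MAGIC
    # Step 3 (assumed operation): combine with a base value to get the final state.
    return (selected + BASE_STATE) & MASK32


if __name__ == "__main__":
    for ret in (0, 1):
        print(ret, hex(next_state(ret)))
```

No conditional jump appears anywhere, yet the two possible return values produce two different condition values. This is why reading the branches alone does not reveal the flow, and why computing or tracing the value is necessary.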

Figure 7. Obfuscation by looping processing like C language

Figures 6 and 7 show examples in C-like notation. Both perform the same processing, but in Figure 7 the order is difficult to follow because of the loop and the condition checks.
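The same contrast can be sketched in Python. The code below is not a transcription of Figures 6 and 7: the step functions and the state constants are hypothetical. The first function runs three steps in order. The second runs the same steps inside a loop driven by a state value, with the blocks deliberately listed out of order. An optional trace list records the states as they are visited, which is the tracing approach recommended above.

```python
def step_a(ctx): ctx.append("a"); return 1
def step_b(ctx): ctx.append("b"); return 1
def step_c(ctx): ctx.append("c"); return 1


def run_sequential():
    """Plain version: the source order is the execution order."""
    ctx = []
    if step_a(ctx) and step_b(ctx):
        step_c(ctx)
    return ctx


# Hypothetical state constants; real samples use arbitrary 32-bit values.
S_A, S_B, S_C, S_END = 0x1EDE78AD, 0x0B5C3F12, 0x2A9910E4, 0x00000000


def run_flattened(trace=None):
    """Obfuscated version: a loop picks the next block from a state value."""
    ctx = []
    state = S_A
    while state != S_END:
        if trace is not None:
            trace.append(state)          # analyst's trace of visited states
        if state == S_C:                 # appears first, runs last
            step_c(ctx)
            state = S_END
        elif state == S_A:               # appears second, runs first
            state = S_B if step_a(ctx) else S_END
        elif state == S_B:               # appears last, runs second
            state = S_C if step_b(ctx) else S_END
    return ctx


if __name__ == "__main__":
    trace = []
    print(run_sequential())
    print(run_flattened(trace))
    print([hex(s) for s in trace])
```

Reading the flattened version top to bottom suggests the order c, a, b. The recorded trace shows the real order, which matches the sequential version.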

Conclusion

These techniques are used to thwart static analysis. Their purpose is to evade security solutions that perform static analysis and to hinder manual analysis by analysts.

What’s interesting is that these techniques are very different from those Emotet used in the spring of 2020.

“Why did the attacker change the obfuscation technique?”

Are the spring 2020 techniques no longer usable? Or has the actor changed?
… I think that further malware analysis will bring us more insights.