Remcos RAT - Malware Analysis Lab
Overview
Part 1: Preliminary Static Analysis of Starting Binary
We start with a malicious executable that VirusTotal categorises as a trojan under the names "MSIL/AgentTesla" and "TR/AD.Remcos", and explore it further.
- Source: MalwareBazaar
- Source: VirusTotal.com
First, obtain the sample with the following SHA256 hash:
Starting IOC (SHA256): 7a1bb4fe0f62425fdd2e163ea17d84465323c4f2df8aabb8a50b1433e7d42a9f
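Before analysing anything, it can be worth confirming that the downloaded file really is the sample named by the IOC. The following is a minimal Python sketch of that check; the function names are illustrative and any file path passed in is a placeholder chosen by the analyst, not something from the original analysis.

```python
import hashlib

# Starting IOC taken from the write-up.
EXPECTED_SHA256 = "7a1bb4fe0f62425fdd2e163ea17d84465323c4f2df8aabb8a50b1433e7d42a9f"


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large samples are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_ioc(path, expected=EXPECTED_SHA256):
    """Compare the file's hash with the published IOC, ignoring letter case."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_ioc()` with the path of the downloaded sample should return True if the right file was obtained.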
Analysis in pestudio reveals this is a 32-bit .NET executable with timestomped debugger and compiler timestamps. It also had an original name of "tocehi.exe" during development; however, this may also have been tampered with.
Examining the resources section, there is a resource with exceptionally high entropy, which indicates it likely contains compressed or encrypted data. There's also a repeating sequence of bytes spelling out "PAD", which may indicate junk data has been added as padding to make the binary more challenging to analyse.
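For context on the entropy figure, the sketch below shows how Shannon entropy per byte is commonly calculated. It is an illustration of the general measure, not pestudio's own implementation, and the function name is an assumption. Values close to 8 bits per byte suggest compressed or encrypted data, while repeated filler such as "PAD" scores far lower.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Running this over an extracted resource and over a block of padding gives a quick feel for why the two stand out so differently in the tool.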
Examining the imports suggests the binary is likely loading an assembly into memory and invoking it dynamically, in addition to possibly performing string reversal operations.
Part 2: Decompiling Binary
Opening the executable in dnSpyEx, right-clicking it and using "Go to Entry Point" takes us to the start of the binary, where it runs a new instance of Form1.
Examining this shows what looks to be decoy code and an instance of the form component being initialized. Of interest is that this form uses the System.Reflection namespace, which is unusual and signifies reflective loading of code will likely occur.
Examining this shows a lot of form initialization which seems innocuous at first; however, at one point it gets a string stored within a resource object called "CFD" within the class "Form11", replaces all instances of "$" with "E", and reverses it before decoding it from base16 (hex) and loading the result as a raw assembly.
By copying the resource CFD into CyberChef and performing these operations, it's revealed this is likely a PE file being loaded into memory.
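The same CyberChef recipe can be scripted. This is a minimal Python sketch of the three operations described above (swap "$" for "E", reverse the string, decode from hex). It assumes the CFD resource has been exported as plain text; the function names are illustrative and do not come from the malware.

```python
def decode_cfd(resource_text: str) -> bytes:
    """Undo the loader's string obfuscation and return the raw bytes."""
    restored = resource_text.strip().replace("$", "E")  # "$" stands in for "E"
    hex_text = restored[::-1]                           # the string is stored reversed
    return bytes.fromhex(hex_text)


def looks_like_pe(blob: bytes) -> bool:
    """A PE file starts with the DOS 'MZ' signature."""
    return blob[:2] == b"MZ"
```

If `looks_like_pe(decode_cfd(text))` returns True, the decoded bytes can be written to disk and hashed as the next stage.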
Saving this to a file, a new SHA256 hash can be obtained which has been seen by VirusTotal and is flagged as a "spreader" and "injector".
IOC (SHA256): 280001013946838a651abbdee890fa4a4d49c382b7b5e78b7805caef036304e2
Of interest is that when this instance is created it is passed an object array called "ext" which is defined from Form1.EXT. This is defined as a string array containing the entries "71474C547242", "69786F", and "DoanHQTCSDL", all 3 of which are essential for later analysis.
Part 3: Examining Embedded 1st Stage Payload (Pend.dll)
Opening this up in pestudio shows it had an internal name of "Pend.dll" during development, is likely obfuscated using the SmartAssembly .NET obfuscator, and was compiled on May 3rd, 2023 at 04:45:48 UTC, which seems far more plausible given the time this sample was found in the wild.
Using de4dot the binary can be deobfuscated automatically.
Looking at the main method of this binary shows more decoy code and then a call to a method named "xp", passing in 3 strings: string_0, string_1, and string_2.
It should be noted that these 3 strings are the 3 strings seen in our original binary which were passed to this DLL upon instantiation.
This method creates an instance of what's returned from the "oJ" method, specifically of type "Munoz.Himentater".
Examining this reveals an overly large byte array which is being GZip decompressed back into a MemoryStream to be loaded.
By switching the decompiler view to Common Intermediate Language (IL) and examining method "oJ" again, the full array bytes can be located within a defined structure, and the instructions seem similar to what was seen before, specifically defining a byte array of size 14340.
By copying this array into CyberChef, removing whitespace and line breaks, and converting from hex, it is then identified as a Gzip file.
Saving this and using a tool such as 7-Zip allows it to be decompressed, noting that there's data appended to this memory stream.
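The hex-to-Gzip steps can also be done in one pass with a script. The sketch below assumes the array was copied out as hex byte pairs separated by spaces or line breaks, which may not match every decompiler's export format. It uses a zlib decompression object rather than a one-shot call so that any bytes appended after the Gzip stream are returned separately instead of causing an error. The function name is illustrative.

```python
import zlib


def gunzip_hex_dump(hex_dump: str):
    """Convert a whitespace-separated hex dump to bytes and gunzip it.

    Returns (decompressed_payload, trailing_bytes_after_the_gzip_stream).
    """
    raw = bytes.fromhex("".join(hex_dump.split()))  # drop spaces and line breaks
    if raw[:2] != b"\x1f\x8b":
        raise ValueError("data does not start with the gzip magic bytes")
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
    payload = decompressor.decompress(raw)
    return payload, decompressor.unused_data
```

The second return value holds whatever followed the compressed stream, which is a convenient way to inspect the appended data noted above.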
Part 3: Examining Embedded 2nd Stage Payload (Cruiser.dll)
Examining this new binary in pestudio reveals it is also likely obfuscated using the SmartAssembly .NET obfuscator, and had a name of "Cruiser.dll". It was also likely compiled on Monday April 10th, 2023 at 10:01:02 UTC, which seems plausible given it's before the injector binary was compiled.
IOC (SHA256): 40C050C20D957D26B932FAF690F9C2933A194AA6607220103EC798F46AC03403
Examining this on VirusTotal, it is flagged as a trojan with the name "tedy/vsntdh23".
Repeating the same process with de4dot and decompiling this shows that it has the namespace "Munoz", which contains the class "Himentater" amongst others, which is specifically what we're looking for. If we consider the first stage of this malware which ran the method "xp", we can see that it was using the method "CasualitySource", first passing in string_0 ("71474C547242") and then string_1 ("69786F").
Examining CasualitySource reveals it is a simple string operation which converts the given hex into its raw ASCII form, resulting in the values "qGLTrB" and "ixo".
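The conversion is easy to reproduce outside the debugger. The following Python sketch is an equivalent of the hex-to-ASCII behaviour described above, not the decompiled CasualitySource code itself; the function name is illustrative.

```python
def hex_to_ascii(hex_text: str) -> str:
    """Decode a string of hex byte pairs into ASCII text."""
    return bytes.fromhex(hex_text).decode("ascii")


print(hex_to_ascii("71474C547242"))  # qGLTrB
print(hex_to_ascii("69786F"))        # ixo
```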
Part 3: Examining Embedded Steganography Binary
The next part of the code involves looking at multiple binaries together and gets a bit involved. The chain of events is as follows:
- Return a bitmap through a method called "jR" via a class called "sZ" from within the namespace "TN" which is inside of the 1st stage payload. The variables string_0 (qGLTrB) and string_2 (DoanHQTCSDL) are used to make up the name of the targeted resource (DoanHQTCSDL.Properties.Resources.qGLTrB), and this is stored into a byte array after subtracting 150 pixels from its height and width.
- Convert the returned bitmap into a byte array through a method called "sZ" via a class called "sZ" from within the namespace "TN" which is inside of the 1st stage payload.
- Perform operations using the "SearchResult" method from within the 2nd stage payload to convert the byte array using the variable string_1 (ixo).
- Load the deobfuscated byte array as an assembly into memory.
Although these operations can be manually reversed, it's a bit tedious and complicated. Instead we can run the original binary in dnSpyEx to get the final product. To do this:
- Open the original executable in dnSpyEx and create a breakpoint on mscorlib.dll within the Sleep function statement checking if the AppDomainPauseManager is paused.
- Run the binary until the breakpoint is hit and use Step Out (Shift + F11) to land at the start of the decompiled 1st stage binary (Pend.dll).
- Create breakpoints at the operations which are retrieving the 3rd stage payload and observe the Local variable window to see the modifications occurring.
- At the third breakpoint observe the 3rd stage binary in memory which can now be saved to disk.
Part 4: Examining 3rd Stage Payload (Discompard.dll)
This new binary crashes pestudio, and examining it in dnSpyEx shows it is posing as software from the company "Citroen", has the name "Plant Scientist", and is apparently copyrighted to the 2004 Citroen C5…righteo then. The binary hasn't been seen by VirusTotal either; however, we still have luck using de4dot, which detects that an unknown obfuscator has been used, cleans it up, and gives us something pestudio can analyse.
IOC (SHA256): ACB4301D445B5C125A8CEDD00427D6F89EC89A1F01A9D7D4E7CC183D017F984D
The binary appears to have had an internal name of "Discompard.dll", and was possibly compiled on May 3rd, 2023 at 06:40:48 UTC, almost 2 hours after the 1st stage was compiled.
Examining this in dnSpyEx shows a number of unknown methods and types without any clear entry point.
Locating Method To Be Invoked
Despite this we know that the binary was being dynamically loaded into memory through reflection and that it was looking for the 20th element in the returned assembly types. Reflectively loading this module into memory we can store this in an object and examine it.
$malware=$([System.Reflection.Assembly]::Load(([byte[]]@(Get-Content "C:\Users\Barry\Downloads\stage3.dll" -Encoding byte))).GetTypes()[20])
$malware.GetMethods()[29]
In the above we have an issue. Although it looks like a correct method to invoke has been pulled, the method isn't marked static, so it can't be invoked in the way seen when examining the 1st stage malware. Looking at the methods available, there are 9 different ones, which can be found with the following:
$malware.GetMethods() | Where-Object { $_.IsStatic -and $_.IsPublic } | Select-Object -ExpandProperty Name
Based on what was seen in the 1st stage malware, there are 2 arguments being passed to the Invoke method: the object the method is run on, and the parameters to pass to the method. As the methods are static, the object argument is ignored. The parameters argument, on the other hand, is not, so this tells us that no parameters are to be passed to the method being invoked. By cross-correlating the methods that were seen to be static with those shown in dnSpyEx, it's seen that there are only 2 methods this could be: "T4Z5pBPufA" or "uqH5vT69wm".
A glance at "T4Z5pBPufA" shows a single line which sets an integer to 0 and nothing more, whereas "uqH5vT69wm" seems far more promising. In particular, "uqH5vT69wm" has a reference to the ApplicationData directory which is suspicious, isn't called by any other method (unlike "T4Z5pBPufA"), and to top it all off, it's method number 29 in the dnSpyEx hierarchy, which is the number that was being invoked in the 1st stage payload.
It's currently unclear why the order in dnSpyEx shows correctly, but when dynamically loading into memory using PowerShell this order is scrambled and fails. It's likely due to multiple methods being retrieved from other DLLs upon reflectively loading, but at least we're on the right path again.
Debugging The Binary
The binary itself is highly obfuscated and has a number of string building operations, which makes manually analysing every component of it tedious; however, the main functions can be seen by stepping through it in the dnSpyEx debugger. Breaking at line 360 of class "TVmkvjWsJHbPcmFjgw" shows the method "uqH5vT69wm" evaluating what looks to be the original binary being run, and a hardcoded executable name in the Roaming AppData folder, which may indicate a copy of the malware is going to be placed there.
Breaking at line 14 of the class "KIpLLvYdUNjv6s5VFsq" shows the method "J4GMnwe6t" returning deobfuscated instructions which confirm this suspicion, as shown clearly in variables on the stack. This is also reflected in the local variables when breaking on the "Copy" method inside of the "System.IO.File" class.
Breaking at line 109 of the class "TVmkvjWsJHbPcmFjgw" shows the method "zJ45fVOtms" setting permissions and file attributes on the malware which was copied into the AppData folder. Specifically, it obtains the user details of who ran the binary and sets it so they only have Read, ReadAndExecute, and ReadData permissions to the malware on disk. It also sets the malware to not be indexed by Windows, and marks it as Hidden and as a critical system binary to make it even more hidden by default.
At the end of this method it is seen that the permissions are successfully applied to the malware.
Breaking at line 13 of the class "CRSlWTd5bfbCGtYKOR" shows the method "J4GMnwe6t" returning a base64 string. It's important to note that there's a large number of classes, each with a different method called "J4GMnwe6t" used for deobfuscating strings.
Breaking at line 132 of the class "TVmkvjWsJHbPcmFjgw" shows the method "Qle573GMuP" having base64 decoded the string in local variables. We can also base64 decode it ourselves in something like CyberChef to show an XML configuration schema for a scheduled task. Of note is that the UserId field is set as [USERID] and isn't filled in.
Breaking at line 135 of the same class shows the same method having now deobfuscated the user identity to be used. It also has built a string to a temporary file location.
Stepping through a couple more instructions shows that a file is written to the identified temporary file containing the complete scheduled task XML. It should be noted this has a hardcoded IOC of the scheduled task registration time being spoofed.
IOC Scheduled Task Date: 2014-10-25 14:27:44:8929027
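The decoded task definition can also be inspected with a short script instead of CyberChef. The Python sketch below base64 decodes a string and pulls a few fields out of the XML. It assumes the decoded data follows the standard Windows Task Scheduler schema and its usual element names (RegistrationInfo/Date, Principals/Principal/UserId, Actions/Exec/Command); the function name and returned keys are illustrative.

```python
import base64
import xml.etree.ElementTree as ET

# Namespace used by Windows Task Scheduler XML definitions.
TASK_NS = {"t": "http://schemas.microsoft.com/windows/2004/02/mit/task"}


def summarise_task(b64_text: str) -> dict:
    """Base64 decode a task definition and return a few fields of interest."""
    xml_bytes = base64.b64decode(b64_text)
    root = ET.fromstring(xml_bytes)  # the parser honours the XML encoding declaration
    return {
        "date": root.findtext("t:RegistrationInfo/t:Date", namespaces=TASK_NS),
        "user_id": root.findtext("t:Principals/t:Principal/t:UserId", namespaces=TASK_NS),
        "command": root.findtext("t:Actions/t:Exec/t:Command", namespaces=TASK_NS),
    }
```

Run against the string captured at the first breakpoint, the user_id field would be expected to still hold the [USERID] placeholder, while the file written to the temporary location should have it filled in. Fields that are absent come back as None.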
Stepping over a few more functions shows strings building out the value "schtasks.exe". Breaking at line 145 of the class "TVmkvjWsJHbPcmFjgw" shows the method "Qle573GMuP" returning a command line which will be used with schtasks.exe to register a scheduled task and establish persistence with the task name "Updates\ALKgmyycVaEjJx".
Breaking at line 64 of the class "iY9VXx99XrYI4WaMs1" shows the method "kHZWDDrKNw" returning yet another binary, which is retrieved and deobfuscated from a resource and can be saved for analysis.
Stepping through a little further, it appears that this is being injected into a surrogate process in method "Er95CrvjJc" of class "TVmkvjWsJHbPcmFjgw", so we can now move onto stage 4.
Part 5: Examining 4th Stage Payload (Remcos RAT)
The 4th stage payload is the most promising yet. Examining it in pestudio shows it was likely compiled on December 20th, 2022 at 21:35:57 UTC, and that it is written in C++, a completely different language to the previous .NET injectors (3 injected DLLs plus the initial injector wrapper).
IOC SHA256: 94a4e5c7a3524175c0306c5748c719a940a7bfbe778c5a16627193a684fa10f0
Checking this binary on VirusTotal, it has been categorised as a trojan with the name "remcos" and has a significant detection rate. This means we've likely finally hit the final stage payload of Remcos. Further to this, examining the resources section shows there is a "SETTINGS" resource, which is a known indicator of Remcos RAT. It also has a high entropy level, indicating it is likely compressed or encrypted.
Decrypting the Remcos RAT Configuration Resource
A post from the team at Morphisec, who analysed a different sample in the past, highlights how Remcos RAT uses RC4 encryption on the "SETTINGS" resource to encrypt the malware configuration. Specifically, the first byte in this resource defines the key length, the following bytes up to that key length are the key, and the rest is the encrypted data.
Using this in CyberChef provides what looks to be a configured C2 server and port, in addition to what may be a unique identifier and a number of other fields.
IOC C2 Domain: gdyhjjdhbvxgsfe[.]gotdns[.]ch
IOC Port: 2718
IOC Host Identifier: Rmc-JQX1JF
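To make the resource layout concrete, here is a minimal Python sketch that splits an extracted SETTINGS resource into its key and ciphertext, following the layout described above (one length byte, then the key, then the encrypted data). The function name is illustrative, and the sketch only separates the two parts; the ciphertext still needs to be RC4 decrypted with the returned key, for example in CyberChef.

```python
def split_settings_resource(blob: bytes):
    """Split a SETTINGS resource into (rc4_key, ciphertext).

    Layout assumed: byte 0 = key length, next key_length bytes = key,
    everything after that = RC4-encrypted configuration.
    """
    if not blob:
        raise ValueError("empty resource")
    key_length = blob[0]
    key = blob[1:1 + key_length]
    if len(key) != key_length:
        raise ValueError("resource is shorter than the declared key length")
    ciphertext = blob[1 + key_length:]
    return key, ciphertext
```

Printing `key.hex()` gives a value that can be pasted straight into CyberChef's RC4 operation as a hex passphrase.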
A publicly available decoder by kevthehermit provides extra insight into this configuration file and what data it may contain.
Rebasing and Dynamically Resolving Imported Functions
Given the RAT is created in C++, leveraging x32dbg and Ghidra is a great way to uncover how it works at different parts of the program. Specifically, if the base address between these is synced then it'll make the analysis process that much smoother. After opening in x32dbg the base address can be seen and copied in the Memory Map.
Opening the memory map in Ghidra, this can be rebased by using the house icon.
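If the two tools are not rebased to the same address, an address seen in one can still be translated to the other by keeping the offset from the image base constant. The Python sketch below shows that arithmetic; the function name is illustrative and the second base address in the usage note is a made-up value, not one observed in this analysis.

```python
def rebase(address: int, old_base: int, new_base: int) -> int:
    """Translate an address from one image base to another via its offset."""
    offset = address - old_base
    if offset < 0:
        raise ValueError("address is below the image base")
    return new_base + offset
```

For example, with an image base of 0x00400000 in Ghidra and a hypothetical base of 0x00A50000 in the debugger, `hex(rebase(0x0040E4E3, 0x00400000, 0x00A50000))` gives 0xa5e4e3, which keeps the same 0xE4E3 offset.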
A quick and dirty way of getting context on what may be resolved from or sent to a particular API is to create a conditional breakpoint which logs the address information of all registers whenever an API call of interest is made. Breaking on LoadLibraryExW and LoadLibraryA in x32dbg by running the commands below can be used as a starting point.
bp LoadLibraryExW
bp LoadLibraryA
These can then be edited to include "Log Text" similar to the below.
LoadLibraryExW: eax:{a:eax} ebx:{a:ebx} ecx:{a:ecx} edx:{a:edx} ebp:{a:ebp} esp:{a:esp} esi:{a:esi} edi:{a:edi}
LoadLibraryA: eax:{a:eax} ebx:{a:ebx} ecx:{a:ecx} edx:{a:edx} ebp:{a:ebp} esp:{a:esp} esi:{a:esi} edi:{a:edi}
By running the program, every time the breakpoint is hit, x32dbg will log the registers. Although it may seem noisy, this approach can quickly yield useful information. In this sample "LoadLibraryA" appears to be used to get a handle on and load a number of functions at run time, such as "GetComputerNameExW" and "GetSystemTimes", which are not explicitly imported by the RAT.
Examining Entry and Editing WinMain Function in Ghidra Decompiler
Examining the entry point in Ghidra shows 2 functions, "__security_init_cookie" and "__scrt_common_main_seh", which are part of C and C++ initialisation code.
The main Remcos function exists inside of "__scrt_common_main_seh" and can be examined. Using Ghidra, the function call trees can be shown, which at a high level reveals 2 custom functions of interest; however, only one, "FUN_0040db10", is longer than a single line and has substantial subfunctions.
Glancing at this function in Ghidra makes it apparent that this is the WinMain function given it is running all subfunctions; however, the Ghidra decompiler has failed to identify this, and it is reporting only 3 parameters being passed to the function: "0x400000", "0", and "pcVar7". By right-clicking and editing the function, this can be cleaned up.
Going through each datatype, this function can be fixed to more accurately replicate the WinMain function signature shown below.
int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, PSTR lpCmdLine, int nCmdShow)
The end result is as follows:
Saving this and returning to the decompiled code shows something which makes a lot more sense. The instance is getting a handle to the current executable's DOS header, passed parameters are being retrieved by "__get_narrow_winmain_command_line", and whether or not to show the application window is being retrieved from "__get_show_window_mode".
A quick look at what's retrieved for the show window value shows this will always return 0.
Initial Analysis of WinMain Function in Ghidra
Combing over WinMain shows a large number of functions are present. The first function, "FUN_0041ae1a", at a glance looks like it is dynamically importing libraries to be used at runtime based on its API calls.
Examining the function confirms these suspicions, and also provides more context to the imported functions which were seen during dynamic analysis.
This can be renamed to something more meaningful such as "FUN_Load_Imports".
Decrypting SETTINGS Resource in Ghidra and x32dbg
Examining the next function, "FUN_0040e4a3", shows it immediately makes a call to "FUN_004199a9", which appears to be getting the SETTINGS resource using FindResourceA and LoadResource, and is storing this into a byte stream to be used.
There's also mention of FID_conflict, which comes from Ghidra's "Function ID" analyser having found multiple functions which match a computed hash during analysis. This can be resolved by using a script such as Andrew Strelsky's ResolveFidScript.
Using x32dbg, a breakpoint can be set at the address "0040E4E3" (offset 0xE4E3). Once run, it's shown in the memory dump of register EBX that this was in fact retrieving the RC4 key from the SETTINGS resource.
At this point the Base Pointer Register (EBP) is also set to 76 which is the RC4 key length.
Finally, the entire contents of the SETTINGS resource are stored in memory referenced by the Destination Index Register (EDI).
"FUN_004199a9" can now be renamed to "FUN_Load_Config" in Ghidra. Creating breakpoints on each function call and running the program in x32dbg gives some idea of later functions. Specifically, only minor operations occur until "FUN_0040644c" at address "0040E53B" (offset 0xE53B). At this call the Source Index Register (ESI) points to memory containing the RC4 encrypted content, and EBX contains the RC4 key.
Looking into this function, it runs a couple of other functions, but of particular interest is function "FUN_004063b0", which is performing some sort of iterative looping operation with a noted array of 256 integers having been defined which is being used in subsequent XOR operations.
Knowing how RC4 works makes identifying this function as the pseudo-random generation algorithm (PRGA) much easier. Comparing the pseudocode to Ghidra's decompiled output, these operations can mostly be seen.
The function immediately prior to the PRGA function (FUN_0040632b) is part of the necessary key-scheduling algorithm (KSA) that takes place during RC4 encryption and decryption, and can be recognised by the use of 0x100 (256), which is the maximum key length and the size of the array used in "FUN_004063b0". At a glance it's more difficult to determine what is occurring here based solely on Ghidra's decompiled interpretation; however, the function graph helps to see common looping trends.
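As a reference point when reading the decompiled output, the following is a minimal Python sketch of textbook RC4 split into its two phases. It is the standard algorithm, not a transcription of the Remcos functions, and the function names are illustrative. The things to look for in Ghidra are the same: a 256-entry state array initialised to 0..255, a swap loop driven by the key (KSA), and a second loop that swaps, indexes the array, and XORs the result with each data byte (PRGA).

```python
def rc4_ksa(key: bytes) -> list:
    """Key-scheduling algorithm: permute a 256-entry state array using the key."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    return state


def rc4_prga(state: list, data: bytes) -> bytes:
    """Pseudo-random generation algorithm: XOR each data byte with the keystream."""
    state = state[:]  # work on a copy so the caller's state is untouched
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def rc4(key: bytes, data: bytes) -> bytes:
    """RC4 is symmetric, so the same call encrypts and decrypts."""
    return rc4_prga(rc4_ksa(key), data)
```

Because the cipher is symmetric, calling `rc4()` with the key and the encrypted portion of the SETTINGS resource should yield the same plaintext configuration seen on the stack in the debugger.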
Jumping over to x32dbg, a breakpoint can be placed at address "0040643A" (offset 0x643A) to see how this impacts the RC4 encrypted data on the stack. On first run it can be seen that the first byte changes to a "g" in ASCII.
Breaking outside of this loop at address "0040642B" (offset 0x642B) shows the decrypted content on the stack.
From this, "FUN_004063b0" can be renamed to "FUN_RC4_PRGA", "FUN_0040632b" can be renamed to "FUN_RC4_KSA", and "FUN_0040e4a3" can be renamed to "FUN_Load_Decrypt_Config" in Ghidra.
TBA
The next major operation which occurs is a string comparison at address 0040DB76 (offset 0xDB76), checking whether "-l" is being passed to Remcos.
MORE TBA