Neel Mehta, Billy Leonard, Shane Huntley
Google Security Team
Version: 1.0
Published: September 5, 2014
TLP Green

Table of Contents

Introduction
Sofacy Analysis
Sofacy Persistence Mechanisms
Sofacy Functionality
Network Communications
X-Agent Analysis
X-Agent Identifiers
X-Agent Internals
Persistence
Network Communications
Air-Gapped Operations
Appendix A
Appendix B

Introduction

Many sophisticated state-sponsored attackers use multi-stage malware toolkits. First-stage implants are widely distributed, easily discovered, and serve as a simple beachhead. In contrast, complex second-stage implants are typically used sparingly, on only the most interesting systems, after determining there is limited risk of detection by security products. As such, a first-stage tool exists primarily to limit the exposure of second-stage tools, extending their usable shelf life.

This analysis describes one family of malware: a first-stage tool, Sofacy, and an associated second-stage tool, X-Agent. Sofacy is an antivirus industry name, while X-Agent was named by the malware authors. Together, these tools are used by a sophisticated state-sponsored group targeting primarily former Soviet republics, NATO members, and other Western European countries. This information has been determined from VirusTotal submissions.

Antivirus detection for both Sofacy and X-Agent is subpar, with plenty of room for improvement. Based on VirusTotal data, detection for Sofacy was roughly 36.6%. Detection for X-Agent was lower, at only 34.2%. Our goal in releasing this analysis is to improve antivirus detection for both. Consequently, recipients of this paper are free to share it with interested parties in the security community.
Acknowledgements

This analysis was compiled with the tireless help and extensive expertise of the Google Security Team, especially Heather Adkins, Daniel White, Joachim Metz, Andrew Lyons, Liam Murphy, Elizabeth Schweinsberg, Matty Pellegrino, Kristinn Guðjónsson, Cory Altheide, Armon Bakhshi, and Mike Wiacek.

VirusTotal Submissions By Country

Analysis of VirusTotal submissions for Sofacy and X-Agent yields insights into the attackers' operations. As a first-stage tool, Sofacy is used relatively indiscriminately against potential targets. X-Agent is reserved for high-priority targets. This is borne out by the data: VirusTotal submissions show that Sofacy was three times more common than X-Agent in the wild, with over 600 distinct samples in the data set.

Proportional differences in the geographical distribution of submissions of the first-stage tool, Sofacy, and the second-stage tool, X-Agent, provide some interesting insights. For example, the Republic of Georgia represents only 3.5% of Sofacy submissions, but makes up 28.9% of all X-Agent submissions, more than any other country. This suggests that, at one point, Georgia was a high-priority target for the attackers. This ratio is likely a lagging indicator of attacker interest (attackers must first be caught, and compromise can go undetected for years). The same comparison shows attacker interest in Ukraine, Germany, Poland, Denmark, and also Russia. X-Agent submissions from the United States and Canada were proportionally smaller than Sofacy submissions.

[Figure: Sofacy submissions to VirusTotal by country. Legible values: USA 22.6%, Ukraine 7.1%, Georgia 3.5%, Israel 2.7%. Other labeled slices: Belgium, Austria, Germany, China, Others.]

[Figure: X-Agent submissions to VirusTotal by country. Legible values: Georgia 28.9%, Poland 3.8%, Denmark 1.9%, Romania 1.9%, Japan 1.9%. Other labeled slices: Russia, Germany, Canada.]

[Figure: Submission share ratio of X-Agent to Sofacy in VirusTotal by country. Countries shown: Georgia, Romania, Russia, Denmark, Poland, Germany, Ukraine, Vietnam, Canada, Korea, USA. The ratio axis extends to 9x.]

Sofacy Analysis

Sofacy is an above average first-stage implant. Early variants are more technically complex than recent ones. For example, older samples feature the ability to move seamlessly between processes and harvest credentials. Newer variants are more mature and focused in their design. They provide functionality to detect personal security products, survey infected machines, and install a second-stage tool, all without exposing techniques such as lateral process movement.

Dropped By Boring-Looking Exploits

Sofacy is often delivered by Microsoft Word exploits as RTF, DOC or DOCX files (CVE-2012-0158, CVE-2010-3333). It is occasionally delivered by Adobe Acrobat PDF reader exploits. These exploits are often first used by Chinese attackers, but have been repurposed by the actors responsible for Sofacy. To achieve this, the Sofacy executable is swapped in for the original exploit's payload, leaving other parts intact, including shellcode.

Sofacy Internals

Position-Independent Code Development

The authors of Sofacy use clever compiler tricks to produce a binary with no dependence on imports, relocations, or initial code position. This allows it to be copied into another process and executed, without any additional dependencies or setup requirements.

Assembly language development is slower and more expensive than development in higher-level languages. The ease of development in a higher-level language, however, comes at the cost of new dependencies on OS-specific loaders, which complicate cross-process injection. The authors of Sofacy have found an elegant middle ground, which at first glance might appear to be hand-written assembly, but is consistent with the register allocation and pipelining of Microsoft Visual C++.

The entry point is passed two arguments: a pointer to an address in kernel32.dll and a base address to identify where the code is in memory. Function calls and global variables are then accessed as an offset from this base address.
Here are two examples:

Function Call:

    seg000:0019DBF2 lea eax, [esi+193721h]
    seg000:0019DBF8 call eax

Global Variable Access:

    lea eax, [esi+194?85h]
    mov eax

Each function is passed the code base address as its first argument, and this is used consistently, like a calling convention. Thus, the authors have produced a position-independent binary with no external imports, by utilizing preprocessor macros, or a similar mechanism, to do pointer math for each function call or global variable access.

In order to use system functions, the start function first walks back in memory to find the start of the kernel32 module. Then the code manually walks the export table, hashing the function names to resolve the required imports. The import hash table (imported_function_hashes, starting at seg000:0018B1B1) covers the functions below. Hash values are shown in parentheses where they are legible next to the function name.

    CloseHandle, CopyFileW, CreateDirectoryW, CreateEventW (30C48297h),
    CreateFileA, CreateFileMappingW, CreateFileW, CreateMailslotW (6411920Bh),
    CreateMutexW, CreatePipe, CreateProcessW, CreateRemoteThread, CreateThread,
    DeleteFileW, ExitProcess, ExitThread, FindClose (235459?8h),
    FindFirstFileA (6306C065h), FindFirstFileW, FindNextFileA, FindNextFileW,
    FreeLibrary, GetCommandLineW, GetCurrentProcess, GetCurrentProcessId,
    GetEnvironmentVariableW, GetExitCodeProcess, GetExitCodeThread,
    GetFileInformationByHandle (6C0i32B93h), GetFileSize, GetFileTime,
    GetLocalTime, GetModuleFileNameW (4586608Ch), GetPrivateProfileStringA,
    GetPrivateProfileStringW, GetProcAddress, GetStartupInfoW, GetTickCount,
    GetTimeZoneInformation, GetVersionExW, GetVolumeInformationW, GlobalAlloc,
    GlobalFree, HeapCreate, HeapDestroy, IsBadReadPtr, LoadLibraryW,
    MapViewOfFile, MultiByteToWideChar, OpenMutexW, OpenProcess, PeekNamedPipe,
    Process32FirstW, ReadFile, ReadProcessMemory, ReleaseMutex, ResumeThread,
    SetCurrentDirectoryW, SetEndOfFile, SetFileAttributesW, SetFilePointer,
    SetFileTime, SetThreadPriority, SetThreadPriorityBoost, Sleep,
    TerminateProcess, TerminateThread, UnmapViewOfFile, VirtualAlloc,
    VirtualAllocEx, VirtualFree, VirtualFreeEx, WaitForSingleObject,
    WideCharToMultiByte, WriteFile, WritePrivateProfileStringA,
    WriteProcessMemory

Hash values in the listing that cannot be matched to a function name:

    038E52902h 89076100h 51268313h 8A324136h 6E824142h 0C3?34651h
    000434733h 000434751h 0053992A4h 579013E9h 56F73980h 0E1159330h
    003204930h 78353983h 030016F89h 032089259h 4363076Ch

Loader Functionality

Sofacy persists on infected machines as an obfuscated and compressed payload, appended to a small loader executable file. The loader de-obfuscates the payload by permuting a 32-bit key and XORing each byte with the lowest eight bits. The last byte of the payload is left untouched.
The 32-bit key is initialized with a literal value in the loader's main function:

    .text:00401025 mov

This value is then modified, often using MMX instructions:

    .text:0040107F movd mm0, [ebp+var_28]
    .text:00401083 mm0, 2
    .text:00401087 movd mm0
    .text:0040113D mov eax, [ebp+var_28]
    .text:00401140 eax, 4
    .text:00401143 inc eax
    .text:00401144 mov eax

Finally, the key is passed to the function:

    .text:0040117A push [ebp+var_28]
    .text:0040117D push 7D68h
    .text:00401182 push [ebp+var_8_buffer]
    .text:00401185 call

Here is the equivalent code in C:

    void (unsigned char* payload, size_t len, unsigned int key, unsigned char* out)
    {
        for (size_t i = 0; i < (len - 1); i++) {
            unsigned char x = (key A 1) 8 8 0xff;
            out[i] = payload[i] ^ x;
            key 0xea61;
            key 0x24142871;
        }
        /* last byte is not obfuscated. */
        out[i] = payload[i];
    }

LZSS Decompression

The payload contains a decompression stub, which implements a simple Lempel-Ziv variant commonly used in malware. Malware analysts will likely recognize the decompression code, with one small change. In addition to the first layer, each compressed input byte is XORed with a permutation of a hardcoded 32-bit key:

    unsigned int key
    unsigned char next_byte = input_byte ^ (key & 0xff);
    /* Equivalent of x86 'ror' instruction */
    key = rotate_right(key,

Dynamic Dependency Resolution

The loader invokes the entry point of the position-independent code blob. It must then first resolve dynamic dependencies, before doing anything else. Sofacy identifies dynamic dependencies by iterating through a list of files in %windir%\system32\, hashing file names, and comparing those hashes to a list of hashes for needed DLLs. Ultimately, it depends on at least the following system DLLs:

    kernel32.dll
    user32.dll
    ws2_32.dll
    shlwapi.dll
    advapi32.dll
    iphlpapi.dll
    pstorec.dll
    inetmib1.dll
    snmpapi.dll
    wininet.dll
    setupapi.dll
    shell32.dll
    ole32.dll

String Obfuscation

Contemporary variants of Sofacy obfuscate strings using an algorithm that resembles RC5. The loader contains three distinct obfuscated blobs of data:

1. Dynamic dependencies, configuration, and C2 servers.
2. A list of antivirus and personal security products to detect.
3. The actual implant binary.

Each variation of RC5 permutes an 8-byte block of data with an 8-byte key, using a single round. A 4-byte window of the key is used to obfuscate each byte of input data. The first byte of input is processed using the first 4 bytes of the key, and the second byte using key bytes 2 through 5. Eventually, the window wraps at the 6th byte of input, using the last 3 bytes and the first byte of the key. The four key bytes, along with an 8-bit representation of the input position, are combined to generate an 8-bit value that is XORed with the input byte. Each algorithm is a variation on this theme:

    unsigned char *input;
    size_t input_length;
    unsigned char *key;

    for (size_t i = 0; i < input_length; i++) {
        unsigned int x, y;
        unsigned char a, b, c, d;
        unsigned char input_index_char = i & 0xff;
        size_t block_index = i & 7;

        a = key[i & 7];
        b = key[(i + 1) & 7];
        c = key[(i + 2) & 7];
        d = key[(i + 3) & 7];

        [permute values - get an 8-bit value to xor with input byte]

        input[i] ^= x;
    }

There are at least 6 variations of the permutation algorithm. For most Sofacy samples, one of these six variations can be used to decode the three blobs of data.

    Variation 1: input_index_char; 4; b; input_index_char;
    Variation 2: input_index_char; 903(- ll input_index_char; block_index; b;
    Variation 3: input_index_char; A: 9a input_index_char; 7; A: 12 0xff;
    Variation 4: a; +2 input_index_char; block_index; d; input_index_char; 0xff; I: c; b; y;
    Variation 5: a; input_index_char; 4; 0xff; b; input_index_char; d: 0xff;
    Variation 6: a; input_index_char; block_index; 0xff; b; d; input_index_char; 3! 8F C: y; fof;

Parameter Store

Recent Sofacy droppers obfuscate configuration data and store it in a registry key. This key hangs off HKLM if the dropper has permissions to write there, otherwise HKCU, and is located at:

The configuration data is stored in a proprietary key/value format.
It starts with a 6-byte key, followed by 20 bytes of UINT8 lengths. The remainder of the data is configuration values. Configuration values are identified by their index into the length table. The parameter store allows run-time updates to the configuration, and serves to separate it from the implant binary. For example, the C2 servers cannot be found in the implant binary, and may only be recovered statically from a dropper, or by decoding the data from the parameter store.

Keystroke Logging

Sofacy's keystroke logger attaches its input processing methods to those of the active foreground window. It polls the foreground window, detecting changes as the user switches applications. It also captures process context, such as executable paths and arguments. Captured keystrokes are normalized to Unicode, taking into account the active keyboard layout.

Inter-Instance Communication Via Mailslots

Sofacy communicates with itself over a mailslot such as:

    \\.\Mailslot\LSAMailSlot

As an example, the keystroke logger uses this mailslot to communicate with the main Sofacy process. As it receives keystrokes, it sends them back over the mailslot as serialized HTML. Another instance of the implant, running in a different process, will read the keystroke log data from the mailslot, encrypt it, and re-transmit it over the C2 network connection.

Persistence Mechanisms

Persistence Via LNK Shortcuts

Sofacy may persist via changes to an existing LNK file in a shell startup folder. This LNK file is invoked each time the user logs in. Sofacy adds a "Shell Item" to the end of the .LNK file.

The shell startup folder locations are determined by reading the following registry keys:

    Folders\Startup
    Folders\Desktop
    Folders\Common Startup
    Folders\Common Desktop

Sofacy scans the startup folders for an appropriate, pre-existing LNK file. The LNK file's original timestamps are captured and a small change to the file is made.
After modification, the Windows API SetFileTime function is called to restore the file's "creation", "last access", and "last write" times. An example LNK file is included in Appendix A.

Persistence Via Windows Shell

Sofacy is also known to persist via Quick Launch folders, Shell Icon Overlay Handlers, and Shell Service Objects.

Older versions of Sofacy may drop themselves into one of the following Quick Launch folders:

    Data\Microsoft\Internet Explorer\Quick Launch
    Data\Microsoft\Internet Explorer\Quick Launch

Shell Icon Overlay Handlers are COM objects that implement the IShellIconOverlayIdentifier interface to show icon overlays (where one icon is displayed on top of another). Icon Overlay handlers are loaded in the context of explorer.exe when each user logs in. This mechanism is used by legitimate applications such as TortoiseSVN. Sofacy registers itself as a Shell Icon Overlay Handler by setting the appropriate registry key to the GUID of its registered COM object:

Observed names for the Icon Overlay Value are:

    AdvancedStorageShell

The icon overlay handler key points to a registered COM object, a Sofacy DLL:

Sofacy can also persist as a Shell Service Object, another class of COM objects that load on user login. They are registered in the following key:

Observed names for Sofacy Shell Service Objects are:

    netids

The shell service object CLSID used is:

Sofacy Functionality

Disabling Error Reporting

To avoid detection, Sofacy systematically disables crash reporting, logging, and post-mortem debugging each time it starts. There are several reasons it may crash. It is delivered via memory corruption exploits, which are inherently unpredictable. Also, Sofacy performs complicated inter-process inspection and code injection. Finally, the code may have bugs. Any of these factors may lead to crashes which, if logged, are likely to be noticed.
Sofacy disables crash and PC health reporting by changing the following registry DWORD values to 0:

It suppresses system hard error message display by setting the following registry DWORD value to 2:

It also disables Dr. Watson (or other post-mortem debuggers) by deleting the following registry key:

    NT\CurrentVersion\AeDebug

Interest in the Physical Location of the Machine

Sofacy tries to read a value, PhysicalLocation_Name, from the system administrative template file:

    \Windows\inf\system.adm

Administrators, especially in large organizations, will populate this field with the physical location of the system. Sofacy gathers this information as part of its machine survey. It is sent back to the malware operator, adding context that may inform operator interest.

Email Credential Harvesting

Sofacy recovers cached email credentials from several sources. Specifically, it can recover saved credentials from Outlook, The Bat, Eudora, and Becky.

Local Output Queue

Sofacy temporarily queues data it gathers on disk. This data is LZSS-compressed and encrypted. The location of the queue file is configurable and specified in a registry key. The registry key is subject to frequent change, as is the location of the queue file. In one sample, the queue file location was stored in this key:

    cense

Network Communications

Impersonating Legitimate Processes For Network Communication

When communicating with the C2 server, Sofacy will scan a list of running processes, looking for a running web browser or email client. When one is found, it will clone the process arguments exactly, then create a new instance of the process. The main thread of the cloned process is started in a suspended state, and the implant is injected into the new process address space. The implant is started in place of the original program. Sofacy will pick a C2 protocol that matches the cloned process: HTTP, SMTP, or POP3.
By doing this, Sofacy mimics legitimate user processes, making it difficult to discern that network traffic originated from malware, not user actions.

Asymmetric Encryption of Session Keys

Sofacy uses the Windows API to create session keys for C2 communications. It creates an ephemeral RC4 session key and seals it with a hardcoded 1024-bit RSA public key. This sealed session key is included with the data transmitted to the C2 server. As such, only a recipient with the matching private key can decode traffic.

Proxy Awareness

Sofacy will detect proxies configured for WinINet and Firefox. It will then use the correct proxies when connecting outbound to C2 servers.

Sofacy Indicators

Known mailslots (for IPC):

    \\.\Mailslot\LSAMailSlot

Representative Sample Hashes

Example Signatures

The following ClamAV and Yara signatures can be used to detect Sofacy:

    9451072da?8bc75f5e5dc20c00

    rule
    strings$sucmset?configuby?numw?[0?16
    condition: 1 of them

    rule
    strings
    condition: 1 of them

Known C2 Servers

C2 domains:

    securitypractic.com
    checkmalware.org
    adawareblock.com
    checkmalware.info
    scanmalware.info
    updatepc.org
    updatesoftware24.com
    testservice24.net
    symanttec.org
    microsofi.org
    microsof-update.com

IP Addresses:

    123.100.229.59
    200.74.244.118
    74.52.115.178
    88.198.55.146
    67.18.1?2.18
    203.117.68.58

X-Agent Analysis

X-Agent is a second-stage toolkit complementing Sofacy. Portions of the X-Agent code base can be found in malware dating back to at least 2004. Somewhere down the line, X-Agent became the internal name for this tool.

The features of X-Agent demonstrate its sophistication. For example, it can operate in an air-gapped environment via an ad-hoc pseudo-network of USB flash drives.

X-Agent is multi-platform capable. With minor changes to platform-specific code, X-Agent will run on Linux instead of Windows. It can also be repackaged in different forms, for example as a DLL, by the addition of a single module. This analysis applies to X-Agent on two known platforms: Linux and Windows.
X-Agent Identifiers

Windows PE File Resource Locale IDs

Windows Portable Executable resources are localized and include the locale ID of the Windows installation running on the build system. As such, the locale ID may reveal the origin of malware. The locale ID field can be faked, but is often overlooked in malware build environments.

PE resources are organized into a 4-level deep tree, with the third level specifying the locale ID of the resource. This is different from a code page, such as Windows-1251, and is more specific. The Windows resource compiler (Rcdll.dll) uses the default locale ID.

Of 113 X-Agent PE samples observed in VT's dataset, 68 had PE resources. Three unique locale IDs were found in these samples:

    0409  en-US (English US)
    0419  ru-RU (Russian)
    0000  NULL (invalid)

Of the 68 samples that contained PE resources, the most common locale ID was ru-RU (Russian).

[Table: number of samples per locale ID. The row values are not legible; the last row is labeled "ru-RU and NULL".]

Program Database File Paths

Microsoft's Visual C++ compiler may include a fully-qualified path to a program database (PDB) file, so that a debugger can locate symbols. This build-time artifact can provide information about the systems used to build the malware. The following PDB paths have been observed in X-Agent samples:

    C:\Documents and Visual Studio 2005\Projects\NET\Mail 1.1\Mail 1.1\obj\Release\rundll32.pdb
    C:\WORK\SOFT\Joiner\joiner 0.1\Release\joiner.pdb
    C:\WORK\SOFT\Joiner\joiner 0.2\Release\joiner.pdb
    d:\Shared DATA\spec_ver\

X-Agent Internals

X-Agent Framework

The X-Agent framework is a set of components, communicating over well-defined methods. Each component is a module, and modules communicate over channels. Individual instances of X-Agent are termed agents. Each agent is assigned a unique ID (agent ID), calculated from a hash of the MAC addresses of all network interfaces on the machine.

The X-Agent framework uses the term controller to refer to the software running on the C2 server.
Each X-Agent agent communicates with its controller over a C2 channel.

Kernel

The core module in the X-Agent framework is the agent kernel, a small user-mode microkernel. This microkernel can register other modules and communication channels, as well as handle thread management. It has a generic interface to storage and configuration data.

Implant Initialization and Lifetime

On startup, X-Agent's main() function registers relevant modules and an external channel. It then starts a channel controller thread, which handles message distribution and channel selection. Finally, X-Agent starts a worker thread for each module. X-Agent continues to run until all these workers terminate, or until operator commands instruct it to exit or uninstall.

Parameter Storage

X-Agent, like Sofacy, can maintain a parameter store that contains C2 servers and other configurable parameters. This would be initialized by the dropper, separating the configuration from the implant binary on disk. It also allows for runtime configuration changes. For unknown reasons, most X-Agent builds do not use the parameter store in practice.

Windows

The Windows Registry provides the underlying datastore for the parameter store on Windows and can be found at:

Individual parameters are keyed off their registry value name, a hexadecimal number string.

Linux

On Linux, the parameter storage is held in a SQLite database, located in

Each row in the database contains an id column which serves as the key. Each parameter is then stored as a binary or dword value.

Channels

X-Agent uses channels to structure communication and connections. Channels are used for local communication and C2. Multiple channels are multiplexed over a single network connection. External channels are used to communicate with the controller, abstracting the network C2 protocol from higher-level channels.
The following channel types have been found in X-Agent samples:

    HTTP Channel    0x2101, 0x2102
    Mail Channel    0x2302
    Local Channel   0x2301

Channel Controller

The X-Agent channel controller is responsible for passing module messages between external channels and local modules. The channel controller is unaware of any specific C2 protocols. These are abstracted and entirely the responsibility of the external channel.

The channel controller also passes controller-generated (inbound) module messages to local modules. It queues these messages in memory as a vector, and passes them to the target module.

The channel controller's final responsibility is to control which channels are used for communication, through a channel-changing mechanism exposed via a module command. An operator sitting at a remote console can switch from one external channel to another, switching C2 protocols on the fly. For example, X-Agent might switch from communicating over HTTP to email protocols.

External Channels

External channels are used to multiplex messages from modules to the controller. X-Agent agents must register at least one external channel with the kernel. External channels imitate legitimate network activity, such as web browsing, or sending and receiving email.

Local Channels

X-Agent contains a local channel implementation that uses a hidden file for module message I/O. This local channel is used in conjunction with the Net Flash module in air-gapped environments (see below).

The X-Agent kernel will selectively intercept messages to load and unload modules before they are passed to the channel controller. This is conceptually similar to a local channel.

Modules

Each X-Agent component is a module, including the kernel. The modules register with the kernel, and are identified by a unique 16-bit ID. The following modules have been observed in X-Agent binaries:

    Kernel                 0x0002
    Remote Key Logger      0x1002
    Process Retranslator   0x1302
    DLL                    0x?602
    Net Flash              0x120?
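The routing role described above, where inbound messages are queued in memory and handed to the module whose 16-bit ID they name, can be illustrated with a minimal Python sketch. This is not X-Agent code: the class names, the `receive` method, and the choice to drop messages for unregistered IDs are all assumptions made for illustration. Only the two module IDs used in the usage note come from the list above.

```python
from collections import deque


class Module:
    """Hypothetical stand-in for a registered module; holds an ID and an inbox."""

    def __init__(self, module_id):
        self.module_id = module_id & 0xFFFF  # module IDs are 16-bit
        self.inbox = []

    def receive(self, payload):
        # The module takes ownership of the inbound payload.
        self.inbox.append(payload)


class ChannelController:
    """Hypothetical router: queues inbound messages, delivers them by module ID."""

    def __init__(self):
        self.modules = {}
        self.pending = deque()  # in-memory queue of (module_id, payload)

    def register(self, module):
        self.modules[module.module_id] = module

    def enqueue_inbound(self, module_id, payload):
        self.pending.append((module_id, payload))

    def dispatch(self):
        """Deliver every queued message; return how many reached a module."""
        delivered = 0
        while self.pending:
            module_id, payload = self.pending.popleft()
            target = self.modules.get(module_id)
            if target is None:
                continue  # assumption: unknown destinations are silently dropped
            target.receive(payload)
            delivered += 1
        return delivered
```

For example, registering `Module(0x1002)` and `Module(0x1302)`, enqueuing one message for each, and calling `dispatch()` places each payload in the matching module's inbox. The controller never inspects the payload, which mirrors the protocol-agnostic role the text describes.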
Module classes are derived from a common base class, and accessed over the same basic abstract interface. X-Agent modules may override five methods in the module base class. In a compiled X-Agent binary, they appear in the following order in a module vtable:

1. A take message method. This method passes inbound module messages to the modules, which take ownership of them.
2. A give message method, by which the module gives up ownership of outbound module messages, to send them to the controller.
3. A get module ID method, that returns the 16-bit module ID.
4. A set module ID method, that sets the module ID.
5. A worker run method, which is the main function for the module. It is invoked in a dedicated thread, started by the kernel.

Module Messages

Module messages are X-Agent's internal message representation. A module message contains the agent ID, a module ID, a command number, a priority, and an opaque data field and size.

The module ID on outbound messages specifies the module that created the message. On inbound messages, the module ID specifies which module should receive the message. These inbound messages are called questions, and come from the controller. The destination module will receive these questions, and may choose to answer them with a response. Responses are also constructed as module messages.

Some modules will generate messages autonomously. For example, the keystroke logger module will generate module messages containing logged keystrokes.

Module Message Serialization

X-Agent serializes module messages starting with a simple header, followed by an opaque field:

    struct module_message
        UINT16LE module_id;
        UINT8 command_number;
        UINT8

X-Agent serializes each module message by wrapping it in a raw packet (see Appendix B). That raw packet is then sent over the network to the C2 controller. The size of the C2 message specifies the raw packet size, and subsequently the module message size.
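As a rough illustration of the serialized header described above, the following Python sketch splits a module message into its little-endian 16-bit module ID, its 8-bit command number, and the remaining bytes. It assumes the opaque field starts immediately after those two header fields and runs to the end of the buffer, since the message size is implied by the enclosing raw packet. The raw packet wrapper itself (Appendix B) is not handled, and the function and type names are hypothetical.

```python
import struct
from typing import NamedTuple

# UINT16LE module_id followed by UINT8 command_number, as in the struct above.
HEADER = struct.Struct("<HB")


class ModuleMessage(NamedTuple):
    module_id: int
    command_number: int
    data: bytes  # opaque field; assumed to run to the end of the buffer


def parse_module_message(raw: bytes) -> ModuleMessage:
    """Split an already-unwrapped module message into header fields and data."""
    if len(raw) < HEADER.size:
        raise ValueError("module message shorter than its 3-byte header")
    module_id, command_number = HEADER.unpack_from(raw, 0)
    return ModuleMessage(module_id, command_number, raw[HEADER.size:])


def serialize_module_message(msg: ModuleMessage) -> bytes:
    """Inverse of parse_module_message, under the same layout assumption."""
    return HEADER.pack(msg.module_id, msg.command_number) + msg.data
```

Under this layout, the bytes `02 10 64` followed by a body parse as module 0x1002 (the Remote Key Logger in the module list) and command 0x64.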
The protocol design does not include sequence numbers and behaves like an unreliable transport mechanism. Statefulness is tracked completely in response module messages. For example, when X-Agent receives a command to read a file, it responds with a log message that says it read a specific file, followed by the file's contents.

HTML Log Messages

X-Agent log messages are written as HTML and color coded, perhaps to make it easier for human operators to read. Error messages tend to be colored red:

    process is exist
    File don't

Persistence

On Windows, X-Agent will persist via a Registry Run key, using rundll32.exe to invoke its publicly exported init method. The Run key may be named after its DLL filename on disk, such as:

With a registry key value of:

    rundll32.exe

Alternatively, X-Agent may persist as a Windows service, or as a Shell Icon Overlay Handler, like Sofacy.

Linux deployments of X-Agent may persist via a .desktop file located in .config/autostart/. When installed as root, the X-Agent binary may persist via run level scripts such as rc.local.

Network Communications

Packet Queues

X-Agent uses packet queues to buffer C2 traffic when passed between the kernel and channel controller. The inbound message queue is a vector in memory. Outbound messages are buffered in two local queue files on disk, one each for high and normal priority messages. Each queue holds module messages, prefixed by a UINT32LE

Observed names for the queue files are:

    .edg6E85F986
    .edg6EF885E2
    edg6E85F98675.tmp
    zdg6E85F98675.tmp
    edg6EF885E2.tmp
    zdg6EF885E2.tmp

These queue files are most often located in the /tmp directory on Linux and the path returned by GetTempPath() on Windows. Some DLL builds of X-Agent will put these queue files in their working directory instead, although this is less common.

After reading queue files, the channel controller deletes them. They are not securely wiped from disk, and may be recoverable.
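Since deleted queue files may be recoverable, an analyst might want to split a recovered file back into individual module messages. The Python sketch below does this under one loudly stated assumption: that the UINT32LE value preceding each message is the byte length of that message. The text above does not say what the prefix holds, so treat this purely as an illustration of length-prefixed framing; the function name is hypothetical.

```python
import struct


def iter_queue_records(blob: bytes):
    """Yield each record from a buffer of length-prefixed records.

    Assumption (not confirmed by the analysis): every record is preceded by a
    little-endian 32-bit unsigned integer giving the record's size in bytes.
    """
    offset = 0
    while offset + 4 <= len(blob):
        (size,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        if offset + size > len(blob):
            break  # truncated trailing record, e.g. from a partial recovery
        yield blob[offset:offset + size]
        offset += size
```

Stopping at the first truncated record, rather than raising, is a design choice suited to carved or partially overwritten files, where the tail of the data is often incomplete.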
External Channels and Encryption

X-Agent HTTP traffic is clear-text. SMTP and POP3 channels use TLS, and are more challenging to detect on the network.

HTTP External Channel

X-Agent's HTTP external channel is commonly used to talk to the controller. POST requests are used to send messages, while GET requests retrieve inbound messages. An example HTTP external channel session has been provided as a text file and is available via VirusTotal with a SHA-256 hash of:

All HTTP messages include a magic token value. POST messages also include a request body containing an encoded module message.

HTTP URI Generation

The full URI for HTTP requests is randomly generated, according to a template implicitly agreed upon by both agent and controller.

The base URI for GET and POST requests is generated by selecting a random string from a list. Since this base is ignored by the controller, it is not unusual for it to change between X-Agent versions. In one X-Agent sample the following list of base URIs was observed:

    /watch/?
    /search/?
    /find/?
    /results/?
    /open/?
    /search/?
    /close/?

Parameters for the URI are chosen from a list and appended to the base URI. The following parameter name choices have been observed:

    text=
    from=
    ai=
    age=
    oe=
    btnG=
    Oprnd=
    ai=
    utm=
    channel=

One of these parameters is agreed upon (by the agent and the controller) to encode the agent ID, and is henceforth referred to as the HTTP agent ID token. This is used by the controller to track sessions. In the representative sample, the chosen parameter was ai=. All other URI parameters appear to contain meaningless, randomly-generated base64-like data.

Older X-Agent samples used a static URI for HTTP channel requests. This ends with a hardcoded session tracking parameter name, ai=. The HTTP agent ID token was simply appended to this base URI:

HTTP Agent ID Token Format and Encoding

The controller will extract the HTTP agent ID token from the correct URI parameter.
It is then decoded to identify which agent is communicating. The HTTP agent ID token is base64 encoded data, using the web-safe alphabet (see Appendix B). The encoded string is padded with a 5-byte random prefix so that it looks like valid base64 data. As binary data, the HTTP agent ID token starts with a 4-byte XOR key, followed by a 7 or 20-byte magic token value, and the agent ID:

[xor_key[4]] [magic_token[7 or 20]] [agent_id]

The XOR key is repeated and extended out to a length of 11 or 24 bytes, then XORed with the magic token and agent ID fields. The 7-byte magic token for HTTP data, when XOR decoded, should be

Older versions of X-Agent use a 20-byte ASCII magic token value:

The following steps may be used to decode an HTTP agent ID token:

1. Discard the 5 bytes of prefix data.
2. Base64 decode using the web-safe alphabet (see Appendix B).
3. De-obfuscate by XORing with the repeated XOR key.

The following example demonstrates the decoding operation.

Client (agent) request:

GET m=j byi . 1
Accept: Accept-Language: gzip, deflate
User-Agent: Mozilla/5.0 Gecko/20100101 Firefox/20.0
Host: windows-updater.com

Server (controller) response:

. 200 OK
Date: Thu, 12 Jun 2014 22:18:27 GMT
Server: Apache
Content-Length: 3
Connection: Close
Content-Type: text/plain; charset=UTF-8

400

In this example, the HTTP agent ID token is in the ai= URI parameter:

ai=oedQJ3vMSQ6j9N7oleYALu8C

To decode, discard the 5 bytes of prefix data, leaving:

This data must be base64 decoded using the web-safe alphabet (see Appendix B). The result is

The first 4 bytes of this data are the XOR key. To continue decoding, XOR with the repeated key, giving a result of

The first 7 bytes are the expected HTTP agent ID token

The remaining 4 bytes are the agent ID, as a 32-bit little-endian integer: 43 f0 1c 10

The agent ID in this case was 0x101cf043. In some situations, the high 8 bits of the agent ID may be zero, causing only 3 bytes of the 32-bit agent ID to be base64 encoded.
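The three decoding steps above can be sketched in Python. This is a minimal sketch rather than the controller's actual code: the function name `decode_agent_id_token` is hypothetical, and it assumes that X-Agent's web-safe alphabet matches the RFC 4648 URL-safe alphabet implemented by Python's `base64.urlsafe_b64decode`, which this section does not confirm. The magic token is returned instead of validated, because its expected value is not reproduced here.

```python
import base64


def decode_agent_id_token(param_value: str, magic_len: int = 7):
    """Decode an HTTP agent ID token taken from a URI parameter value.

    Assumptions (not confirmed by the text): the web-safe alphabet is the
    RFC 4648 URL-safe one, and any '=' padding was stripped before sending.
    Returns (xor_key, magic_token, agent_id).
    """
    # Step 1: discard the 5 bytes of random prefix data.
    encoded = param_value[5:]
    # Restore padding so the base64 decoder accepts the string.
    encoded += "=" * (-len(encoded) % 4)
    # Step 2: base64 decode with the (assumed) web-safe alphabet.
    raw = base64.urlsafe_b64decode(encoded)
    if len(raw) < 4 + magic_len:
        raise ValueError("token too short")
    # Step 3: the first 4 bytes are the XOR key; repeat it over the rest.
    key, payload = raw[:4], raw[4:]
    plain = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    magic, id_bytes = plain[:magic_len], plain[magic_len:]
    # The agent ID is little-endian; read whatever bytes remain (4, or 3).
    agent_id = int.from_bytes(id_bytes, "little")
    return key, magic, agent_id
```

Because the agent ID is read from whatever bytes follow the magic token, a token that carries only 3 agent ID bytes still decodes to the correct integer in this sketch. Pass `magic_len=20` for the older 20-byte magic token.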
The decoded output for such tokens will look truncated, missing the last byte. This is likely unintended.

HTTP Message Format and Encoding

HTTP channel messages are encoded in a format common to both inbound and outbound messages. Inbound messages are responses to GET requests, and outbound messages are contained in POST request bodies.

The encoding of HTTP channel messages is similar to that of HTTP agent ID tokens. To decode, a 5-byte junk prefix should be discarded, and the remaining data base64 decoded with the web-safe alphabet (see Appendix B). The result will be binary data, starting with an 11-byte header, containing the following fields:

[xor_key[4]] [magic_token[7]]

The following steps will decode an HTTP channel message:

1. Discard the 5-byte prefix from the body.
2. Decode the remainder with the web-safe base64 alphabet.
3. Retrieve the 4-byte XOR key (the first 4 bytes of decoded data).
4. XOR the next 11 bytes of the message with the XOR key. This includes the HTTP magic token and the agent ID.
5. Validate that the 7-byte magic token in the header has the expected value:
6. Discard the magic token bytes.

The result of this decoding is a raw packet message, encoded in the previously described format. An example POST request for X-Agent's HTTP channel is available via VirusTotal:

The final output is a serialized module message from module 0x1002, command 0x64, with an opaque message body whose contents have a hash of:

Mail External Channel

The SMTP and POP3 channels together make up a common mail channel. The SMTP channel is used to send messages, and the POP3 channel is used to receive them. These channels are an alternative to the HTTP channel, which can both send and retrieve messages. The oldest versions of X-Agent exclusively used mail protocols for C2 communication.

Incongruous Mail Subject Fields - Hardcoded Values or Encoding

X-Agent sends SMTP messages to the controller with specific magic values in the Subject line.
The presence of these values is enforced by the C2 controller and by X-Agent when fetching messages via POP3. The most common Subject line observed contains "piradi nomeri", which refers to a Georgian government-issued citizen identification number, similar to a US Social Security Number.

Other versions of X-Agent expect the Subject line to contain an encoded token for session management, much like the HTTP agent ID token. This data is encoded using an encoding method called the P2Scheme.

The P2Scheme encodes binary data using the standard base64 alphabet (see Appendix B). The binary data starts with a random 5-byte XOR key, followed by a 7-byte magic subject token, and 4 bytes for the agent ID, as a

[xor_key[5]] [magic_token[7]] [agent_id[4]]

The 5-byte XOR key is repeated, extended out to 11 bytes, covering the magic subject token and the agent ID. The magic subject token, when XOR-decoded, should have the following value

other words, followed by the ASCII string "china". The choice of magic token values, using Georgian phrases and the word "china", seems incongruous.

Mail Message Format and Encoding

The mail channel sends and receives messages as multipart MIME email. The first message part contains a 7-bit UTF-8 representation of "gamarjoba", which is Georgian for "hello." The second message part is a base64 encoded attachment with the filename detaluri.dat. Alternatively the file may be named where %s is a string representation of the current time. "Detaluri" means "detailed" in Georgian. The file may also be called winmail.dat.

The attachment contents are a single raw packet message (see Appendix B).

Air-Gapped Operations

Some versions of X-Agent are designed to operate in an environment without an Internet connection, such as an air-gapped network. In this situation, X-Agent relies on human intervention to carry commands and data in and out via writable external media, such as USB flash drives.
X-Agent will register a local channel for external communication, and use a module called Net Flash. The Net Flash module receives notifications from the OS when a new file system on writable external media is mounted. The Net Flash module then checks for incoming module messages in the following locations:

\System Volume    High priority incoming messages
\System Volume Information\sys logs\data\*    Normal priority incoming messages
\System Volume Information\sys logs\com\*    Outbound messages

If these folders do not exist, they are created as hidden system directories. Inbound message files are deleted after they are read.

The X-Agent microkernel contains a message shim for the Net Flash module. When Net Flash is active, this shim intercepts all outbound messages, rerouting them before they reach an external channel. Linux versions of X-Agent also contain this shim, but a Linux version of the Net Flash module has not been observed. This architecture indicates that the X-Agent kernel was designed or specifically adapted to work in air-gapped environments.

Autorun Infection

Perhaps to support infection in air-gapped networks, X-Agent has the ability to spread via autorun invocation on USB flash drives. Some samples have been observed with residual strings from an autorun.inf file:

[autorun] open: shell\open=Explore Volume Information\USBGuard.exe? install shell\open\Default=l

X-Agent Indicators

Known mutexes:

Known mailslots (for

Packet queue file names:

edg6E85F98675.tmp
edg6EF885E2.tmp
zdg6E85F98675.tmp
zdg6EF885E2.tmp

Representative Sample Hashes

Signatures

The following Yara signatures can be used to detect X-Agent:

rule ecksumAlgo rithm stringscondition: 1 of them rule XAG strings: $s_uniq1 wide $s_uniq2 ascii $s_unic;3 ascii $s_uniq4 wide $swuniq5 wide $swuniq6 "engE85F98675.trnp" wide $s_uniq7 wide "4font size=4 coior=red>comm isn't wide 6 "com 6 is success" ascii "com 7 is success" ascii "com isn't success" ascii EXC: - Cannot create Post Channel!" asci?
EXC: - Cannot create Get Channei!" ascii Cannot create ascii Cannot create ExtChannetToProcessTitread!" ascii Cannot create ProcTo Ext Pipe!" ascii Cannot create ExtToProc PipeE? ascii Cannot create Process!" ascii "Calloc 3 error!" ascii wide ?{autoru Volume nformation\\USBGuard.exe\" ascii size=4 colorzred>comm" wide "comm" wide ?
" wide 35 width=800 height=500 ascii 2 "fiie is blocked another process
? wide "Calloc 1 error! Packet lost!? ascii "Error Broken Pipe!" ascii condition: 1 of ($swuniqt?) or 8 of them . rule strings: $s_uniq1 "WRlTE FILE IS NOT ascii 34 $shuniq2 "
? ascii $s?uniq3 "Terminal don?t started" ascii $s_uniq6 ".configldbus-notifier" ascii ascii 2 "rm -f ~f.configiautostartl? ascii "mkdir ascii "11AgentKemei" ascii 55 "12EAgentModule" ascii $:'W3ResavedApWasdi "BFSModule" ascii ?i of or 6 of them

SMTP and POP3 Servers and Accounts

When the mail channel is active, the following SMTP and POP3 servers and accounts have been observed being used for C2. X-Agent binaries contain hard-coded credentials for free webmail providers or presumably compromised accounts.

SMTP and POP3 Servers:

smtp.mail.ru
pop.mail.ru
smtp.yandex.ru
smtp.bk.ru
smtp.gmail.com
smtp.mia.gov.ge
mail.mia.gov.ge

SMTP and POP3 accounts:

arkadmo@mail.ru
roe.xichard@yandex.ru
john.dory@mail.ru
Colin.mcrae1968@gmail.com
devil.666.666.13@gmail.com
interppol@gmail.com
:obert.fastand@gmail.com
jose.karreras@bk.ru
kar1.fridrikh@yandex.ru
sarah.nyassa@gmail.com
i1ya.kasatonov@list.ru
zurab.razmadze11@gmail.com
albertborough@yahoo.com
ahmedOmed@outlook.com
shjanashvili@mia.gov.ge
u.kakhidze@mia.gov.ge
r.gvarjaladze@mia.gov.ge
maia.otxmezuri@mia.gov.ge
1.maghradze@mia.gov.ge

C2 Servers and Domains

The following C2 domains and IP addresses are the ones most often observed in use by the external channel.

Domain names:

hotfix-update.com
adobeincorp.com
check-fix.com
secnetcontrol.com
checkwinframe.com
testsnetcontrol.com
azureon-line.com
windows-updater.com

IP addresses:

62.205.175.96
63.247.82.242
63.247.82.243
64.92.172.221
64.92.172.222
67.18.172.18
70.85.221.10
74.52.115.118
80.94.84.21
80.94.84.22
81.177.20.109
81.112.20.110
82.103.128.81
82.103.128.82
82.103.132.81
82.103.132.82
83.102.136.86
88.198.55.146
94.23.254.109
201.218.236.26
203.117.68.58
216.244.65.34

Appendix A

Sofacy LNK Persistence File

The following LNK file shows how Sofacy creates persistence using this method.
This can also be found in VirusTotal with a hash of: Windows Shortcut Contains Contains Contains Contains Contains COntains Contains information: link target identifier description string working directory string command line arguments string a a a relative path string a a an icon location string an icon location block Link information: Creation Modifica Access File si: File att Drive ty Drive se Volume Local pa Descript time Jan 06, 2011 21:30:40.983625000 UTC tion time Aug 14, 2007 02:43:56.000000000 UTC ime Jan 07, 2011 06:47:58.593750000 UTC 622080 bytes ribute flags 0x00000020 Should be archived pe Fixed rial number 0xec6d8bll abel th C:\Program Files\Internet Explorer\iexplore.exe ion Relative path Working Command Icon loc Link target iden Shell it Shell it Shell it Shell it Extensio directory line arguments ation tifier: em list Number of items em: 1 Class type Shell folder identifier Shell folder name em: 2 Class type Volume name em: 3 Class type Name Modification time File attribute flags Is directory block: 1 Signature Long name Creation time Access time Users\hpplication C:\Program Files\Internet Explorer "C:\Program Files\Internet Explorer\iexplore.exe" iProgramFiles \Internet Explorer\iexplore.exe {Root folder! 
My Computer 0x2f {Volume} 0331 {File entry: Directory) Documents and Settings Not set (0) 0x00000010 0xbeef0004 (File entry extension) Documents and Settings Not set (0) Not set 37 Shell item: 4 Class type Name Modification time File attribute flags Is directory Extension block: 1 Signature Long name Creation time time Shell item: 5 Class type Name Modification time File attribute flags Is directory Extension block: 1_ Signature Long name Creation time Access time Shell item: 6 Class type Name Modification time File attribute flags is directory Extension block: 1 Signature Long name Creation time Access time Shell item: 7 Class type ame Modification time File attribute flags Is directory Extension block: Signature Long name Creation time Access time Shell item: 8 Class Name Modification time File attribute flags type 1 0x31 {File entry: All Users Not set 0200000010 Directory) 0xbeef0004 (File entry extension) All Users Not set (0) Not set (0) x31 (File entry: Directory} Application Data Not set 0x00000010 0xbeef0004 {File entry extension} Application Data Not set Not set (0) 0x31 (File entry: Microsoft Not set (0) 0300000010 Directory) 0xbeef0004 Microsoft Not set (0) Not set {File entry extension} 0x31 {File entry: MediaPlayer Not set 0x00000010 Directory) 0xbeef0004 {File entry extension) MediaPlayer Not set (0) Not set (0) 0x32 (File entry: File) service.exe Not set 0x00000020 Should be archived Extension block: 1 Signature Long name Creation time Access time Distributed link tracking data: Machine identifier Droid volume identifier Droid file identifier Birth droid volume identifier Birth droid file identifier 0xbeef0004 [File entry extension) service.exe Not set Not set (0) xp 38 Appendix X-Agent CZ Raw Packet Decoding Base64 Alphabets X-Agent uses two base64 alphabets during message encoding. 
The first is a standard base64 alphabet, used for mail messages (SMTP and POP3):

HTTP messages are encoded with a different web-safe base64 alphabet:

Raw Packet Message Format

Raw packets are a generic container and packet format, used to transmit module messages over external channels such as HTTP, SMTP, or POP3. Raw packets are transmitted one by one, each in its own external channel message. For example, the SMTP mail channel sends each raw packet message as a mail attachment file. The size of the raw packet message is the size of the decoded attachment. Raw packets include the following fields

[agent_id] [crc[2]]

The raw packet message format was meant to be abstracted from the external channel, but there is one implementation inconsistency. The HTTP external channel XORs the agent ID field with an XOR key intended to obfuscate the previous header. The mail channels do not do this, and it is likely an unintentional oversight.

Raw Packet Message CRC Checking

A CRC is calculated over the data and session key fields and then sent as two UINT16LE fields in the packet. The first is a polynomial seed for the CRC-16 algorithm, followed by the calculated (good) CRC value. Here is an implementation of the CRC check functionality in

unsigned short crcl6(const unsigned char* input. size_t len, unsigned short poly_seed) unsigned short result for (size_t i 0; i len; unsigned char input[i]; For (int 0; 8: if (result 0xff)) i) result 1; result poly_seed; else result 1; 1; return result; bool input) unsigned char header[4]; unsigned short seed, expected_crc, actual_crc; if 4) return false; memcpy(header, seed headerEG] (headerEiJ expected_crc headerEZ] (header[3] actual_crc unsigned 4), 4, seed); return (actual_crc expected_crc);

Raw Packet Message Decryption

Raw packet messages are encrypted using a key built by concatenating a static private key with a public key that changes each packet. A few simple steps can be used to decrypt a raw packet message:

1. Retrieve the agent ID (first 4 bytes of the message) as a little-endian 32-bit integer. Discard these message bytes from the stream.
2. Retrieve the CRC-16 polynomial seed value, and the expected CRC-16 value, as the next two UINT16LEs (immediately following the agent ID). Discard the CRC bytes (4 in total) from the stream.
3. Calculate the actual CRC of the remaining packet bytes, seeding the CRC with the correct polynomial seed. This should match the expected value.
4. Create the full RC4 key for the message, which starts with a 50-byte static private RC4 key. Then append the last 4 bytes of the message (the public key) to create the full RC4 key. Finally, discard the last 4 bytes of the stream (the public key).
5. Decrypt the remainder of the message stream using the full RC4 key.
6. Check that the last 11 bytes of the message are the magic token bytes

Discard these bytes. The result is a clear-text, serialized module message.
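The decoding steps above can be sketched in Python. This is an illustrative sketch under stated assumptions, not a verified decoder: the names `crc16`, `rc4`, and `decode_raw_packet` are hypothetical, and the static RC4 key and magic token are passed in as parameters because their values are not reproduced in this copy. The CRC routine is one plausible reading of the damaged C listing (bitwise, least-significant bit first, with an assumed initial value of 0), so it may not match real samples. The RC4 routine is the standard algorithm.

```python
import struct


def crc16(data: bytes, poly_seed: int) -> int:
    """Bitwise LSB-first CRC-16 using a caller-supplied polynomial.

    Assumption: the register starts at 0; the source listing is too
    damaged to confirm the initial value.
    """
    result = 0
    for byte in data:
        for _ in range(8):
            if (result ^ byte) & 1:
                result = (result >> 1) ^ poly_seed
            else:
                result >>= 1
            byte >>= 1
    return result & 0xFFFF


def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4: key scheduling, then keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for b in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(b ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def decode_raw_packet(packet: bytes, static_key: bytes, magic: bytes):
    """Return (agent_id, serialized_module_message) for one raw packet."""
    if len(packet) < 8 + 4 + len(magic):
        raise ValueError("packet too short")
    # Steps 1-2: agent ID (UINT32LE), then CRC seed and expected CRC (UINT16LE).
    agent_id, seed, expected_crc = struct.unpack_from("<IHH", packet, 0)
    body = packet[8:]
    # Step 3: the CRC covers everything that remains, public key included.
    if crc16(body, seed) != expected_crc:
        raise ValueError("CRC mismatch")
    # Step 4: full key = static private key + trailing 4-byte public key.
    ciphertext, public_key = body[:-4], body[-4:]
    # Step 5: decrypt the remainder with the full RC4 key.
    plain = rc4(static_key + public_key, ciphertext)
    # Step 6: the plaintext must end with the magic token; strip it.
    if not plain.endswith(magic):
        raise ValueError("magic token mismatch")
    return agent_id, plain[: len(plain) - len(magic)]
```

For packets taken from the HTTP channel, the agent ID field would first need the channel's XOR de-obfuscation described earlier; the mail channels pass it through unchanged.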