Neel Mehta, Billy Leonard, Shane Huntley
Google Security Team
Version: 1.0
Published: September 5, 2014
TLP Green
Table of Contents
Introduction
Sofacy Analysis
Sofacy Internals
Persistence Mechanisms
Sofacy Functionality
Network Communications
Sofacy Indicators
X-Agent Analysis
X-Agent Identifiers
X-Agent Internals
Persistence
Network Communications
Air-Gapped Operations
X-Agent
Appendix A
Appendix B
Introduction
Many sophisticated state-sponsored attackers use multi-stage
malware toolkits. First-stage implants are widely distributed,
easily discovered, and serve as a simple beachhead. In
contrast, complex second-stage implants are typically used
sparingly on only the most interesting systems, after
determining there is limited risk of detection by security
products. As such, a first-stage tool exists primarily to limit
the exposure of second-stage tools, extending their usable
shelf life.
This analysis describes one family of malware: a first-stage
tool, Sofacy, and an associated second-stage tool, X-Agent.
Sofacy is an antivirus industry name, while X-Agent was
named by the malware authors. Together, these tools are
used by a sophisticated state-sponsored group targeting
primarily former Soviet republics, NATO members, and
other Western European countries. This information has
been determined from VirusTotal submissions.
Antivirus detection for both Sofacy and X-Agent is subpar,
with plenty of room for improvement. Antivirus detection
for Sofacy, based on VirusTotal data, was roughly 36.6%.
Detection for X-Agent was lower, at only 34.2%. Our goal in
releasing this analysis is to improve antivirus detection for
both. Consequently, recipients of this paper are free to share
it with interested parties in the security community.
This analysis is
Acknowledgements
This analysis was compiled with the tireless help and extensive expertise of the
Google Security Team, especially Heather Adkins, Daniel White, Joachim Metz,
Andrew Lyons, Liam Murphy, Elizabeth Schweinsberg, Matty Pellegrino, Kristinn
Guðjónsson, Cory Altheide, Armon Bakhshi, and Mike Wiacek.
VirusTotal Submissions By Country
Analysis of VirusTotal submissions for Sofacy and X-Agent yields insights into
the attackers' operations.
As a first-stage tool, Sofacy is used relatively indiscriminately against potential
targets. X-Agent is reserved for high-priority targets. This is borne out by the
data. VirusTotal submissions show that Sofacy was three times more common
than X-Agent in the wild, with over 600 distinct samples in the data set.
Proportional differences in the geographical distribution of submissions of the
first-stage tool, Sofacy, and the second-stage tool, X-Agent, provide some
interesting insights.
For example, the Republic of Georgia represents only 3.5% of Sofacy
submissions, but makes up 28.9% of all X-Agent submissions, more than any
other country. This suggests that, at one point, Georgia was a high priority
target for the attackers.
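The comparison above is a ratio of two shares. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of the original analysis.

```c
#include <assert.h>

/* Ratio of a country's share of X-Agent submissions to its share of
 * Sofacy submissions. Both arguments are percentages of their own data set. */
double submission_share_ratio(double xagent_share_pct, double sofacy_share_pct)
{
    assert(sofacy_share_pct > 0.0);  /* a zero share has no meaningful ratio */
    return xagent_share_pct / sofacy_share_pct;
}
```

For Georgia, submission_share_ratio(28.9, 3.5) gives a ratio a little above 8, meaning Georgia's share of X-Agent submissions is roughly eight times its share of Sofacy submissions.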
This ratio is likely a lagging indicator of attacker interest (attackers must first be
caught, and compromise can go undetected for years).
The same comparison shows attacker interest in Ukraine, Germany, Poland,
Denmark, and also Russia.
X-Agent submissions from the United States and Canada were proportionally
smaller than Sofacy submissions.
[Figure: Sofacy submissions to VirusTotal, by country (pie chart). Legible shares: USA 22.6%, Ukraine 7.1%, Georgia 3.5%, Israel 2.7%.]
[Figure: X-Agent submissions to VirusTotal, by country (pie chart). Legible shares: Georgia 28.9%, Poland 3.8%, Romania 1.9%.]
[Figure: Submission share ratio of X-Agent to Sofacy in VirusTotal, by country (bar chart). Countries shown: Georgia, Romania, Russia, Denmark, Poland, Germany, Ukraine, Vietnam, Canada, S. Korea, USA.]
Sofacy Analysis
Sofacy is an above-average first-stage implant. Early variants are more
technically complex than recent ones. For example, older samples feature the
ability to move seamlessly between processes and harvest credentials. Newer
variants are more mature and focused in their design. They provide
functionality to detect personal security products, survey infected machines,
and install a second-stage tool, all without exposing techniques such as lateral
process movement.
Dropped By Boring-Looking Exploits
Sofacy is often delivered by Microsoft Word exploits as RTF, DOC or DOCX files
(CVE-2012-0158, CVE-2010-3333). It is occasionally delivered by Adobe
Acrobat PDF reader exploits. These exploits are often first used by Chinese
attackers, but have been repurposed by the actors responsible for Sofacy. To
achieve this, the Sofacy executable is swapped in for the original exploit's
payload, leaving other parts intact, including shellcode.
Sofacy Internals
Position-Independent Code Development
The authors of Sofacy use clever compiler tricks to produce a binary with no
dependence on imports, relocations, or initial code position. This allows it to be
copied into another process and executed, without any additional dependencies
or setup requirements.
Assembly language development is slower and more expensive than
development in higher-level languages. The ease of development in higher-level
languages comes at the cost of new dependencies on OS-specific loaders, which
complicate cross-process injection. The authors of Sofacy have found an elegant
middle ground, which at first glance might appear to be hand-written assembly, but is
consistent with the register allocation and pipelining of Microsoft Visual C++.
The entry point is passed two arguments: a pointer to an address in
kernel32.dll and a base address to identify where the code is in memory.
Function calls and global variables are then accessed as an offset from this base
address.
Here are two examples:
Function Call:
seg000:0019DBF2 lea eax, [esi+193721h]
seg000:0019DBF8 call eax
Global Variable Access:
lea eax, [esi+194?85h]
mov eax
Each function is passed the code base address as its first argument, and this is
used consistently, like a calling convention. Thus, the authors have produced
a position-independent binary with no external imports, by utilizing
preprocessor macros, or a similar mechanism, to do pointer math for each
function call or global variable access.
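The base-plus-offset access pattern can be sketched in C as follows. This is an illustration under stated assumptions, not recovered Sofacy source: the struct layout, the macro name BLOB_GLOBAL, and the function bump_counter are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout of the blob's data area; not taken from Sofacy. */
struct blob_globals {
    unsigned int  session_counter;
    unsigned char flags;
};

/* Resolve a "global" as base + offset, the way the listing's
 * lea eax, [esi+offset] does with the base address held in esi. */
#define BLOB_GLOBAL(base, type, offset) \
    ((type *)((unsigned char *)(base) + (offset)))

/* Every function takes the code base as its first argument. */
unsigned int bump_counter(void *base)
{
    unsigned int *counter;

    assert(base != NULL);
    counter = BLOB_GLOBAL(base, unsigned int,
                          offsetof(struct blob_globals, session_counter));
    *counter += 1;
    return *counter;
}
```

Because nothing here refers to an absolute address, the same code works wherever the blob is copied, provided the caller passes the new base.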
In order to use system functions, the start function first walks back in memory
to find the start of the kernel32 module. Then the code manually walks the
export table, hashing the function names to resolve the required imports. A full
list of hashes for the imports is shown below.
imported_function_hashes (function names in listing order; hash values shown where legible):
CloseHandle
CopyFileW
CreateDirectoryW
CreateEventW (30C48297h)
CreateFileA
CreateFileMappingW
CreateFileW
CreateMailslotW (6411920Bh)
CreateMutexW
CreatePipe
CreateProcessW
CreateRemoteThread
CreateThread
DeleteFileW
ExitProcess
ExitThread
FindClose
FindFirstFileA (6306C065h)
FindFirstFileW
FindNextFileA
FindNextFileW
FreeLibrary
GetCommandLineW
GetCurrentProcess
GetCurrentProcessId
GetEnvironmentVariableW
GetExitCodeProcess
GetExitCodeThread
GetFileInformationByHandle
GetFileSize
GetFileTime
GetLocalTime
GetModuleFileNameW (4586608Ch)
GetPrivateProfileStringA
GetPrivateProfileStringW
GetProcAddress
GetStartupInfoW
GetTickCount
GetTimeZoneInformation
GetVersionExW
GetVolumeInformationW
GlobalAlloc
GlobalFree
HeapCreate
HeapDestroy
IsBadReadPtr
LoadLibraryW
MapViewOfFile
MultiByteToWideChar
OpenMutexW
OpenProcess
PeekNamedPipe
Process32FirstW
ReadFile
ReadProcessMemory
ReleaseMutex
ResumeThread
SetCurrentDirectoryW
SetEndOfFile
SetFileAttributesW
SetFilePointer
SetFileTime
SetThreadPriority
SetThreadPriorityBoost
Sleep
TerminateProcess
TerminateThread
UnmapViewOfFile
VirtualAlloc
VirtualAllocEx
VirtualFree
VirtualFreeEx
WaitForSingleObject
WideCharToMultiByte
WriteFile
WritePrivateProfileStringA
WriteProcessMemory
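Resolving imports by hash, as described for the export-table walk, can be sketched as below. The hash function is a hypothetical stand-in (rotate right by 13, then add), because the text does not describe Sofacy's actual hash; the function names are illustrative, and a plain array of names stands in for a real PE export table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name hash (rotate right by 13, then add each character).
 * Sofacy's real hash function is not described in the text. */
uint32_t name_hash(const char *name)
{
    uint32_t h = 0;

    assert(name != NULL);
    while (*name != '\0') {
        h = (h >> 13) | (h << 19);      /* 32-bit rotate right by 13 */
        h += (unsigned char)*name++;
    }
    return h;
}

/* Walk a list of exported names and return the index whose hash matches,
 * or -1 if none does. A real resolver would walk the PE export table. */
int find_export_by_hash(const char *const *names, size_t count, uint32_t wanted)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (name_hash(names[i]) == wanted)
            return (int)i;
    }
    return -1;
}
```

Storing only hashes means the binary carries no readable API names, which is why a listing like the one above has to be annotated by hand.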
Loader Functionality
Sofacy persists on infected machines as an encrypted and compressed payload,
appended to a small loader executable file. The loader decrypts the payload by
permuting a 32-bit key and XORing each byte with the lowest eight bits. The last
byte of the payload is left untouched.
The 32-bit key is initialized with a literal value in the loader's main function:
.text:00401025 mov
This value is then modified, often using MMX instructions:
.text:0040107F movd mm0, [ebp+var_28]
.text:00401083 mm0, 2
.text:00401087 movd mm0
.text:0040113D mov eax, [ebp+var_28]
.text:00401140 eax, 4
.text:00401143 inc eax
.text:00401144 mov eax
Finally, the key is passed to the function:
.text:0040117A push [ebp+var_28]
.text:0040117D push 7D68h
.text:00401182 push [ebp+var_8_buffer]
.text:00401185 call
Here is the equivalent code in C:
void
unsigned char* payload, size_t len, unsigned int key,
unsigned char* out)
{
    size_t i;
    for (i = 0; i < (len - 1); i++) {
        unsigned char x = (key A 1) 8 8 0xff;
        out[i] = payload[i] ^ x;
        key 0xea61;
        key 0x24142871;
    }
    // last byte is not obfuscated.
    out[i] = payload[i];
}
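The general shape of this routine, a rolling-key XOR that skips the final byte, can be sketched as follows. The key update is an assumption: the two constants appear in the listing, but the operators applied to them are not legible, so the multiply-and-add used here is hypothetical, as are the function names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key update. The constants appear in the C listing, but the
 * operators applied to them are an assumption, not recovered from Sofacy. */
static uint32_t next_key(uint32_t key)
{
    return key * 0xea61u + 0x24142871u;
}

/* XOR every byte except the last with the low eight bits of a key that is
 * permuted after each byte. Running it twice with the same key restores
 * the input, so one routine both obfuscates and de-obfuscates. */
void rolling_xor(const unsigned char *in, size_t len, uint32_t key,
                 unsigned char *out)
{
    size_t i;

    assert(in != NULL && out != NULL);
    if (len == 0)
        return;
    for (i = 0; i + 1 < len; i++) {
        out[i] = in[i] ^ (unsigned char)(key & 0xff);
        key = next_key(key);
    }
    out[i] = in[i];   /* the last byte is copied unchanged */
}
```

Whatever the real permutation is, the structure is symmetric: the same call with the same starting key undoes itself.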
LZSS Decompression
The payload contains a decompression stub, which implements a
simple Lempel-Ziv variant commonly used in malware. Malware analysts
will likely recognize the decompression code, with one small change. In
addition to the first layer, each compressed input byte is XORed with
a permutation of a hardcoded 32-bit key:
unsigned int key
unsigned char next_byte = input_byte ^ (key & 0xff);
// Equivalent of x86 "ror" instruction
key = rotate_right(key,
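A minimal sketch of the rotate helper and the per-byte layer it supports is shown below. The starting key and the rotation count are not legible in the snippet above, so both are caller-supplied placeholders here, and strip_xor_layer is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit rotate right, the C equivalent of the x86 "ror" instruction. */
uint32_t rotate_right(uint32_t value, unsigned int count)
{
    count &= 31;
    if (count == 0)
        return value;
    return (value >> count) | (value << (32 - count));
}

/* Strip the XOR layer from a compressed buffer before LZSS decoding.
 * The key and rotation count are placeholders supplied by the caller;
 * the real values are not legible in the source. */
void strip_xor_layer(unsigned char *buf, size_t len, uint32_t key,
                     unsigned int rotate_count)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        buf[i] ^= (unsigned char)(key & 0xff);
        key = rotate_right(key, rotate_count);
    }
}
```

The explicit count == 0 branch avoids a shift by 32 bits, which is undefined behaviour in C.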
Dynamic Dependency Resolution
The loader invokes the entry point of the position-independent code blob. It
must then first resolve dynamic dependencies, before doing anything else.
Sofacy identifies dynamic dependencies by iterating through a list of files in
%windir%\system32\, hashing file names, and comparing those hashes to a
list of hashes for needed DLLs. Ultimately, it depends on at least the following
system DLLs:
kernel32.dll
user32.dll
ws2_32.dll
shlwapi.dll
advapi32.dll
iphlpapi.dll
pstorec.dll
inetmib1.dll
snmpapi.dll
wininet.dll
setupapi.dll
shell32.dll
ole32.dll
String
Contemporary variants of Sofacy encrypt strings using an algorithm that
resembles RC5. The loader contains three distinct blobs of data:
1. Dynamic dependencies, configuration, and C2 servers.
2. A list of antivirus and personal security products to detect.
3. The actual implant binary.
Each variation of RC5 permutes an 8-byte block of data with an 8-byte key,
using a single round. A 4-byte window of the key is used to decrypt each byte of
input data. For example, the first byte of input is decrypted using the first 4
bytes of the key. The second byte is decrypted using bytes 2 through 5.
Eventually, the window wraps at the 6th byte of input, using the last 3 bytes and
the first byte of the key. The four key bytes, along with an 8-bit representation of
the input position, are combined to generate an 8-bit value that is XORed with the
input byte.
Each algorithm is a variation on this theme:
unsigned char *input;
size_t input_length;
unsigned char *key;
for (size_t i = 0; i < input_length; i++) {
    unsigned int x, y;
    unsigned char a, b, c, d;
    unsigned char input_index_char = i & 0xff;
    size_t block_index = i & 7;
    a = key[i % 8];
    b = key[(i + 1) % 8];
    c = key[(i + 2) % 8];
    d = key[(i + 3) % 8];
    [permute values - get an 8-bit value to xor with input byte]
    input[i] ^= x;
}
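The sliding key window can be sketched as below. The window selection follows the description above; the mixing step is a hypothetical stand-in and is not one of the six Sofacy variations, and all function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mixing step standing in for the per-variation permutation;
 * it is not one of the six Sofacy variations. */
static unsigned char mix(unsigned char a, unsigned char b, unsigned char c,
                         unsigned char d, unsigned char index_char)
{
    return (unsigned char)((a ^ b) + (c ^ d) + index_char);
}

/* Select the 4-byte window of the 8-byte key used for input position i. */
void key_window(const unsigned char key[8], size_t i, unsigned char window[4])
{
    size_t k;

    assert(key != NULL && window != NULL);
    for (k = 0; k < 4; k++)
        window[k] = key[(i + k) % 8];   /* wraps from key[7] back to key[0] */
}

/* XOR each input byte with a value derived from its key window and position.
 * Applying the function twice with the same key restores the input. */
void window_xor(unsigned char *input, size_t input_length,
                const unsigned char key[8])
{
    size_t i;

    for (i = 0; i < input_length; i++) {
        unsigned char w[4];
        key_window(key, i, w);
        input[i] ^= mix(w[0], w[1], w[2], w[3], (unsigned char)(i & 0xff));
    }
}
```

At the 6th input byte (i = 5) the window is key[5], key[6], key[7], key[0], which is the wrap-around the text describes.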
There are at least 6 variations of the permutation algorithm. For most Sofacy
samples, one of these six variations can be used to decrypt the three
blobs of data.
Variation 1:
input_index_char;
4;
b;
input_index_char;
Variation 2:
input_index_char;
903(-
ll
input_index_char;
block_index;
b;
Variation 3:
input_index_char;
A:
9a
input_index_char;
7;
A:
12
0xff;
Variation 4:
a;
+2 input_index_char;
block_index;
d;
input_index_char;
0xff;
I: c;
b;
y;
Variation 5:
a;
input_index_char;
4;
0xff;
b;
input_index_char;
d;
0xff;
Variation 6:
a;
input_index_char;
block_index;
0xff;
b;
d;
input_index_char;
3! 8F C:
y;
fof;
Parameter Store
Recent Sofacy droppers encrypt configuration data and store it in a registry key.
This key hangs off HKLM if the dropper has permissions to write there,
otherwise HKCU, and is located at:
The configuration data is stored in a proprietary key/value format. It starts with
a 6-byte key, followed by 20 bytes of UINT8 lengths. The remainder of the data
is configuration values. Configuration values are identified by their
index into the length table.
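Looking up a value through the length table can be sketched as follows. This assumes the blob has already been decoded and that values are packed back to back in length-table order, which the text implies but does not state outright; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define STORE_KEY_SIZE     6   /* leading key, per the text */
#define STORE_LENGTH_COUNT 20  /* one UINT8 length per value */

/* Locate configuration value `index` in an already-decoded store blob.
 * Assumes values are packed back to back in length-table order.
 * Returns a pointer to the value and stores its length, or NULL if the
 * blob is too short or the index is out of range. */
const unsigned char *store_value(const unsigned char *blob, size_t blob_len,
                                 size_t index, size_t *value_len)
{
    const size_t header = STORE_KEY_SIZE + STORE_LENGTH_COUNT;
    const unsigned char *lengths;
    size_t offset = header;
    size_t i;

    assert(blob != NULL && value_len != NULL);
    if (blob_len < header || index >= STORE_LENGTH_COUNT)
        return NULL;
    lengths = blob + STORE_KEY_SIZE;
    for (i = 0; i < index; i++)
        offset += lengths[i];            /* skip the earlier values */
    if (offset + lengths[index] > blob_len)
        return NULL;
    *value_len = lengths[index];
    return blob + offset;
}
```

With UINT8 lengths, each value is at most 255 bytes, and an unused slot is simply a zero in the length table.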
The parameter store allows run-time updates to the configuration, and serves to
separate it from the implant binary. For example, the C2 servers cannot be
found in the implant binary, and may only be recovered statically from a
dropper, or by decrypting the data from the parameter store.
Keystroke Logging
Sofacy's keystroke logger attaches its input processing methods to those of the
active foreground window. It polls the foreground window, detecting changes as
the user switches applications.
It also captures process context, such as executable paths and arguments.
Captured keystrokes are normalized to Unicode, taking into account the active
keyboard layout.
Inter-Instance Communication Via Mailslots
Sofacy communicates with itself over a mailslot such as:
\\.\Mailslot\LSAMailSlot
As an example, the keystroke logger uses this mailslot to communicate with the
main Sofacy process. As it receives keystrokes, it sends them back over the
mailslot as serialized HTML. Another instance of the implant, running in a
different process, will read the keystroke log data from the mailslot, encrypt it,
and re-transmit it over the C2 network connection.
Persistence Mechanisms
Persistence Via LNK Shortcuts
Sofacy may persist via changes to an existing LNK file in a shell startup folder.
This LNK file is invoked each time the user logs in. Sofacy adds a "Shell Item" to
the end of the .LNK file.
The shell startup folder locations are determined by reading the following
registry keys:
Folders\Startup
Folders\Desktop
Folders\Common Startup
Folders\Common Desktop
Sofacy scans the startup folders for an appropriate, pre-existing LNK file. The
LNK file's original timestamps are captured and a small change to the file is
made. After modification, the Windows API SetFileTime function is called
to restore the file's "creation", "last access", and "last write" times. An example
LNK file is included in Appendix A.
Persistence Via Windows Shell
Sofacy is also known to persist via Quick Launch folders, Shell Icon Overlay
Handlers and Shell Service Objects.
Older versions of Sofacy may drop itself into one of the following Quick Launch
folders:
Data\Microsoft\Internet Explorer\Quick Launch
Data\Microsoft\Internet Explorer\Quick Launch
Shell Icon Overlay Handlers are COM objects that implement the
IShellIconOverlayIdentifier interface to show icon overlays (where
one icon is displayed on top of another). Icon Overlay handlers are loaded in the
context of explorer.exe when each user logs in. This is used by legitimate
applications such as TortoiseSVN.
Sofacy registers itself as a Shell Icon Overlay Handler by setting the appropriate
registry key to the UID of its registered COM object:
Observed names for the Icon Overlay Value are: AdvancedStorageShell
The icon overlay handler key points to a registered COM object, a Sofacy DLL:
Sofacy can also persist as a Shell Service Object, another class of COM objects
that load on user login. They are registered in the following key:
Observed names for Sofacy Shell Service Objects are: netids
The shell service object CLSID used is:
Sofacy Functionality
Disabling Error Reporting
To avoid detection, Sofacy systematically disables crash reporting, logging and
post-mortem debugging each time it starts. It is delivered via memory
corruption exploits, which are inherently unpredictable. Also, Sofacy performs
complicated inter-process inspection and code injection. Finally, the code may
have bugs. Any of these factors may lead to crashes, which if logged are likely to
be noticed.
Sofacy disables crash and PC health reporting by changing the following registry
DWORD values to 0:
It suppresses system hard error message display by setting the following
registry DWORD value to 2:
It also disables Dr. Watson (or other post-mortem debuggers) by deleting the
following registry key:
NT\CurrentVersion\AeDebug
Interest in the Physical Location of the Machine
Sofacy tries to read a value PhysicalLocation_Name from the system
administrative template file: \Windows\inf\system.adm
Administrators, especially in large organizations, will populate this field with the
physical location of the system in the field.
Sofacy gathers this information as part of its machine survey. It is sent back to
the malware operator, adding context that may inform operator interest.
Email Credential Harvesting
Sofacy recovers cached email credentials from several sources. Specifically, it
can recover saved credentials from Outlook, The Bat, Eudora, and Becky.
Local Output Queue
Sofacy temporarily queues data it gathers on disk. This data is LZSS-compressed
and encrypted. The location of the queue file is configurable and specified in a
registry key. The registry key is subject to frequent change, as is the location of
the queue file. In one sample, the queue file location was stored in this key:
cense
Network Communications
Impersonating Legitimate Processes For Network Communication
When communicating with the C2 server, Sofacy will scan a list of running
processes, looking for a running web browser or email client. When one is
found, it will clone the process arguments exactly, then create a new instance of
the process. The main thread of the cloned process is started in a suspended
state, and the implant is injected into the new process address space. The
implant is started instead of the original program. Sofacy will pick a C2 protocol that
matches the cloned process: HTTP, SMTP, or POP3.
By doing this, Sofacy mimics legitimate user processes, making it difficult to
discern that network traffic originated from malware, not user actions.
Asymmetric Encryption of Session Keys
Sofacy uses the Windows API to create session keys for C2
communications. It creates an ephemeral RC4 session key and seals it with a
hardcoded 1024-bit RSA public key. This sealed session key is included with the
data transmitted to the C2 server. As such, only a recipient with the matching
private key can decode traffic.
Proxy Awareness
Sofacy will detect proxies configured for WinINet and Firefox. It will then use the
correct proxies when connecting outbound to C2 servers.
Sofacy Indicators
Known mailslots (for IPC):
\\.\Mailslot\LSAMailSlot
Representative Sample Hashes
Example Signatures
The following ClamAV and Yara signatures can be used to
detect Sofacy:
9451072da?8bc75f5e5dc20c00
rule
strings:
$sucmset?configuby?numw?[0?16
condition:
1 of them
rule
strings:
condition:
1 of them
Known C2 Servers
C2 domains:
securitypractic.com
checkmalware.org
adawareblock.com
checkmalware.info
scanmalware.info
updatepc.org
updatesoftware24.com
testservice24.net
symanttec.org
microsofi.org
microsof-update.com
IP Addresses:
123.100.229.59
200.74.244.118
74.52.115.178
88.198.55.146
67.18.1?2.18
203.117.68.58
X-Agent Analysis
X-Agent is a second-stage toolkit complementing Sofacy. Portions of the
X-Agent code base can be found in malware dating back to at least 2004.
Somewhere down the line, X-Agent became the internal name for this tool. The
features of X-Agent demonstrate its sophistication. For example, it can operate
in an air-gapped environment via an ad-hoc pseudo-network of USB flash
drives.
X-Agent is multi-platform capable. With minor changes to platform-specific
code, X-Agent will run on Linux instead of Windows. It can also be repackaged in
different forms, for example as a DLL, by the addition of a single module. This
analysis applies to X-Agent on two known platforms: Linux and Windows.
X-Agent Identifiers
Windows PE File Resource Locale IDs
Windows Portable Executable resources are localized and include the
locale ID of Windows running on build systems. As such, it may reveal the
origin of malware. The locale ID field can be faked, but is often overlooked in
malware build environments.
PE resources are organized into a 4-level deep tree, with the third level
specifying the locale ID of the resource. This is different from a code
page, such as Windows-1251, and is more specific.
The Windows resource compiler (Rcdll.dll) uses the default locale ID
Of 113 X-Agent PE samples observed in VT's dataset, 68 had PE resources.
Three unique locale IDs were found in these samples:
0409 en-US (English US)
0419 ru-RU (Russian)
0000 NULL (invalid)
Of the 68 samples that contained PE resources, the most common locale ID was
ru-RU (Russian).
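The locale IDs listed above are 16-bit Windows language identifiers, which pack a primary language and a sublanguage. A short sketch of splitting them follows; the bit split mirrors the Windows SDK macros PRIMARYLANGID and SUBLANGID, while the function names here are illustrative.

```c
#include <assert.h>

/* Split a Windows language identifier into its primary language (low 10
 * bits) and sublanguage (high 6 bits), mirroring the PRIMARYLANGID and
 * SUBLANGID macros from the Windows SDK. */
unsigned int primary_lang(unsigned int langid) { return langid & 0x3ffu; }
unsigned int sub_lang(unsigned int langid)     { return (langid >> 10) & 0x3fu; }

/* Name the three locale IDs reported above; anything else is "unknown". */
const char *locale_name(unsigned int langid)
{
    assert(langid <= 0xffffu);   /* language identifiers are 16-bit */
    switch (langid) {
    case 0x0409: return "en-US";
    case 0x0419: return "ru-RU";
    case 0x0000: return "NULL";
    default:     return "unknown";
    }
}
```

For 0x0419 the primary language is 0x19 (Russian) and the sublanguage is 1, which is why the value is more specific than a code page.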
Locale IDs    Number of Samples
ru-RU and NULL 1
1
Program Database File Paths
Microsoft's Visual C++ compiler may include a fully-qualified path to a program
database (PDB) file to help a debugger locate symbols. This build-time
artifact can provide information about the systems used to build the malware.
The following PDB paths have been observed in X-Agent samples:
C:\Documents and
Visual Studio 2005\Projects\NET\Mail 1.1\
Mail 1.1\obj\Release\rundll32.pdb
C:\WORK\SOFT\Joiner\joiner 0.1\Release\joiner.pdb
C:\WORK\SOFT\Joiner\joiner 0.2\Release\joiner.pdb
d:\Shared DATA\spec_ver\
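PDB paths like these can be pulled out of a sample by locating its CodeView record. The sketch below is a brute-force scan for the PDB 7.0 "RSDS" record (4-byte signature, 16-byte GUID, 4-byte age, then the NUL-terminated path). It does not parse the PE debug directory and does not handle the older "NB10" record; find_pdb_path is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Find the first CodeView "RSDS" record in a buffer and return a pointer to
 * its NUL-terminated PDB path, or NULL if none is found. The record layout
 * is: 4-byte "RSDS" signature, 16-byte GUID, 4-byte age, then the path.
 * This is a brute-force scan; it does not parse the PE debug directory. */
const char *find_pdb_path(const unsigned char *buf, size_t len)
{
    const size_t header = 4 + 16 + 4;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i + header < len; i++) {
        if (memcmp(buf + i, "RSDS", 4) != 0)
            continue;
        /* Accept the record only if the path is terminated inside buf. */
        if (memchr(buf + i + header, '\0', len - i - header) != NULL)
            return (const char *)(buf + i + header);
    }
    return NULL;
}
```

A scan like this can produce false positives on arbitrary data, so a careful tool would confirm the hit against the debug directory entry.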
X-Agent Internals
X-Agent Framework
The X-Agent framework is a set of components, communicating over
well-defined methods. Each component is a module, and they communicate
over channels.
Individual instances of X-Agent are termed agents. Each agent is assigned a
unique ID (agent ID), calculated from a hash of the MAC addresses of all network
interfaces on the machine.
The X-Agent framework uses the term controller to refer to the software running
on the C2 server. Each X-Agent agent communicates with its controller over a C2
channel.
Kernel
The core module in the X-Agent framework is the agent kernel, a small
user-mode microkernel. This microkernel can register other modules and
communication channels, as well as handle thread management.
It has a generic interface to storage and
configuration data.
Implant Initialization and Lifetime
On startup, X-Agent's main() function registers relevant modules and an
external channel. It then starts a channel controller thread, which handles
message distribution and channel selection. Finally, X-Agent starts a worker
thread for each module. X-Agent continues to run until all these workers
terminate, or until operator commands instruct it to exit or uninstall.
Parameter Storage
X-Agent, like Sofacy, can maintain a parameter store that contains C2 servers
and other configurable parameters. This would be initialized by the dropper,
separating the configuration from the implant binary on disk. It also
allows for runtime configuration changes. For unknown reasons, most X-Agent
builds do not use the parameter store in practice.
Windows: The Windows Registry provides the underlying datastore for the
parameter store on Windows and can be found at:
Individual parameters are keyed off their registry value name, a hexadecimal
number string.
Linux: On Linux, the parameter storage is held in a SQLite database, located in
Each row in the database contains an id column which serves as
the key. Each parameter is then stored as a binary or dword value.
Channels
X-Agent uses channels to structure communication and connections. Channels
are used for and C2. Multiple channels are multiplexed over a single
network connection. External channels are used to communicate with the
controller, abstracting the network C2 protocol from higher-level channels.
The following channel types have been found in X-Agent samples:
HTTP Channel 0x2101, 0x2102
Mail Channel 0x2302
Local Channel 0x2301
Channel Controller
The X-Agent channel controller is responsible for passing module messages
between external channels and local modules. The channel controller is unaware
of any specific C2 protocols. These are abstracted and entirely the responsibility
of the external channel.
The channel controller also passes controller-generated (inbound) module
messages to local modules. It queues these messages in memory as a
vector, and passes them to the target module.
The channel controller's final responsibility is to control which channels are used
for communication, through a channel-changing mechanism exposed via a
module command. An operator sitting at a remote console can switch from one
external channel to another, switching C2 protocols on the fly. For example, X-Agent might
switch from communicating over HTTP to email protocols.
External Channels
External channels are used to multiplex messages from modules to the
controller. X-Agent agents must register at least one external channel with the
kernel. They imitate legitimate network activity, such as web browsing, or
sending and receiving email.
Local Channels
X-Agent contains a local channel implementation that uses a hidden file for
module message I/O. This local channel is used in conjunction with the Net Flash
module in air-gapped environments (see below).
The X-Agent kernel will selectively intercept messages to load and unload
modules before they are passed to the channel controller. This is conceptually
similar to a local channel.
Modules
Each X-Agent component is a module, including the kernel. The modules
register with the kernel, and are identified by a unique 16-bit module ID.
The following modules have been observed in X-Agent binaries:
Kernel 0x0002
Remote Key Logger 0x1002
Process Retranslator 0x1302
DLL 0x?602
Net Flash 0x120?
Module classes are derived from a common base class, and accessed over the
same basic abstract interface. X-Agent modules may override five methods in
the module base class. In a compiled X-Agent binary, they appear in the
following order in a module vtable:
1. A take message method. This method passes inbound module
messages to the modules, which take ownership of them.
2. A give message method, by which the module gives up ownership of
outbound module messages, to send them to the controller.
3. A get module ID method, that returns the 16-bit module ID.
4. A set module ID method, that sets the module ID.
5. A worker run method, which is the main function for the module.
It is invoked in a dedicated thread, started by the kernel.
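The five-method interface listed above maps naturally onto a table of function pointers. The sketch below is an illustration in C, with the methods in the same order as the list; every type and function name is hypothetical, and the single pending slot is a stand-in for whatever queueing the real modules use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct module;
struct module_message;   /* opaque here */

/* The five overridable methods, in the vtable order listed above.
 * All names are illustrative. */
struct module_vtable {
    void     (*take_message)(struct module *self, struct module_message *msg);
    struct module_message *(*give_message)(struct module *self);
    uint16_t (*get_module_id)(const struct module *self);
    void     (*set_module_id)(struct module *self, uint16_t id);
    void     (*worker_run)(struct module *self);
};

struct module {
    const struct module_vtable *vtable;
    uint16_t id;
    struct module_message *pending;   /* single-slot stand-in for a queue */
};

static void base_take(struct module *self, struct module_message *msg)
{
    self->pending = msg;
}

static struct module_message *base_give(struct module *self)
{
    struct module_message *msg = self->pending;
    self->pending = NULL;
    return msg;
}

static uint16_t base_get_id(const struct module *self) { return self->id; }
static void base_set_id(struct module *self, uint16_t id) { self->id = id; }
static void base_run(struct module *self) { (void)self; }

static const struct module_vtable base_vtable = {
    base_take, base_give, base_get_id, base_set_id, base_run
};

/* Initialize a module with the base-class behaviour. */
void module_init(struct module *m, uint16_t id)
{
    assert(m != NULL);
    m->vtable = &base_vtable;
    m->pending = NULL;
    m->vtable->set_module_id(m, id);
}
```

A derived module would supply its own table with some entries replaced, which is how the fixed method order becomes visible in a compiled binary.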
Module Messages
Module messages are X-Agent's internal message representation. A module
message contains the agent ID, a module ID, a command number, a priority, and
an opaque data field and size.
The module ID on outbound messages specifies the module that created the
message.
On inbound messages, the module ID specifies which module should receive the
message. These messages are called questions, and come from the controller.
The destination module will receive these questions, and may choose to answer
them with a response. Responses are also constructed as module messages.
Some modules will generate messages autonomously. For example, the
keystroke logger module will generate module messages containing logged
keystrokes.
Module Message Serialization
X-Agent serializes module messages starting with a simple header, followed by
an opaque field:
struct module_message
UINT16LE module_id;
UINT8 command_number;
UINT8
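Serializing the legible part of this header can be sketched as follows. Only module_id and command_number are readable in the struct above, so this sketch writes those two fields and then the opaque data; any header fields lost from the listing are omitted, which means the byte layout here is an assumption rather than the wire format. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write the two legible header fields followed by the opaque data.
 * Layout assumed here: module_id (UINT16LE), command_number (UINT8), data.
 * Any header fields lost from the listing are omitted.
 * Returns the number of bytes written, or 0 if `out` is too small. */
size_t serialize_module_message(uint16_t module_id, uint8_t command_number,
                                const unsigned char *data, size_t data_len,
                                unsigned char *out, size_t out_size)
{
    const size_t total = 3 + data_len;

    assert(out != NULL && (data != NULL || data_len == 0));
    if (out_size < total)
        return 0;
    out[0] = (unsigned char)(module_id & 0xff);         /* low byte first */
    out[1] = (unsigned char)((module_id >> 8) & 0xff);
    out[2] = command_number;
    if (data_len > 0)
        memcpy(out + 3, data, data_len);
    return total;
}
```

Writing the bytes individually keeps the output little-endian regardless of the host's byte order.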
X-Agent serializes each module message by wrapping it in a raw packet (see
Appendix B). That raw packet is then sent over the network to the C2 controller.
The size of the C2 message specifies the raw packet size, and subsequently the
module message size.
The protocol design does not include sequence numbers and behaves like an
unreliable transport mechanism. Statefulness is tracked completely in response
module messages. For example, when X-Agent receives a command to read a
file, it responds with a log message that says it read a specific file, followed by
the file's contents.
HTML Log Messages
X-Agent log messages are written as HTML and color coded, perhaps to make it
easier for human operators to read. Error messages tend to be colored red:
process is exist
File don't
Persistence
On Windows, X-Agent will persist via a Registry Run key, using rundll32.exe
to invoke its publicly exported init method. The Run key may be named
after its DLL filename on disk, such as:
With a registry key value of:
rundll32.exe
Alternatively, X-Agent may persist as a Windows service, or as a Shell Icon
Overlay Handler, like Sofacy.
Linux deployments of X-Agent may persist via a .desktop file located in
.config/autostart/. When installed as root, the X-Agent binary may be
installed as and persist via run level scripts such as rc.local.
Network Communications
Packet Queues
X-Agent uses packet queues to buffer C2 traffic when passed between the kernel
and channel controller. The inbound message queue is a vector in
memory, accessed
Outbound messages are buffered in two local queue files on disk, one each for
high and normal priority messages. Each queue holds module
messages, prefixed by a UINT32LE
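Walking such a queue file can be sketched as below. The text is cut off after "UINT32LE", so this sketch assumes the prefix is a byte length for the record that follows; that is an assumption, not something the source confirms, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value regardless of host byte order. */
static uint32_t read_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Count the records in a queue buffer, assuming each record is a UINT32LE
 * byte length followed by that many bytes. Returns -1 on a truncated record. */
int count_queue_records(const unsigned char *buf, size_t len)
{
    size_t pos = 0;
    int count = 0;

    assert(buf != NULL || len == 0);
    while (pos < len) {
        uint32_t record_len;

        if (len - pos < 4)
            return -1;
        record_len = read_u32le(buf + pos);
        pos += 4;
        if (record_len > len - pos)
            return -1;
        pos += record_len;
        count++;
    }
    return count;
}
```

The bounds checks matter for forensic use, since a recovered queue file may be cut short.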
Observed names for the queue files are:
.edg6E85F986
.edg6EF885E2
edg6E85F98675.tmp
zdg6E85F98675.tmp
edg6EF885E2.tmp
zdg6EF885E2.tmp
These queue files are most often located in the /tmp directory on Linux and the
path returned by GetTempPath() on Windows. Some DLL builds of X-Agent
will put these queue files in their working directory instead, although this is less
common.
After reading queue files, the channel controller deletes them. They are not
securely wiped from disk, and may be recoverable.
External Channels and
X-Agent HTTP traffic is clear-text. SMTP and POP3 channels use TLS, and are
more challenging to detect on the network.
HTTP External Channel
X-Agent's HTTP external channel is commonly used to talk to the controller.
POST requests are used to send messages while GET requests retrieve inbound
messages.
An example HTTP external channel session has been provided as a text file and
is available via VirusTotal with a SHA-256 hash of:
All HTTP messages include a magic token value in the URI. POST messages also
include a request body containing an encoded module message.
HTTP URI Generation
The full URI for HTTP requests is randomly generated, according to a template
implicitly agreed upon by both agent and controller.
The base URI for GET and POST requests is generated by selecting a random
string from a list. Since this base URI is ignored by the controller it is not
unusual for it to change between X-Agent versions. In one X-Agent sample the
following list of base URIs was observed:
/watch/?
/search/?
/find/?
/results/?
/open/?
/searchl?
/close/?
Parameters for the URI are chosen from a list and appended to the base URI.
The following parameter name choices have been observed:
text=
from=
aim
age=
oem
btnG=
Oprnd=
ai=
utm=
channel=
One of these parameters is agreed upon (by the agent and the controller) to
encode the agent ID, and is henceforth referred to as the HTTP agent ID token.
This is used by the controller to track sessions. In the representative sample, the
chosen parameter was ai=. All other URI parameters appear to contain
meaningless, randomly-generated base64-like data.
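As a rough illustration of how the agent ID token might be located in a request, the following C++ sketch pulls one named parameter out of a URI. The function name extract_query_param is hypothetical, and the conventional "?name=value&name=value" layout is an assumption, since the exact URI template is only implied by the observed traffic.

```cpp
#include <string>

// Return the value of query parameter `name` from `uri`, or "" if it is absent.
// Assumption: parameters follow the usual "?name=value&name=value" layout.
std::string extract_query_param(const std::string& uri, const std::string& name)
{
    std::string::size_type pos = uri.find('?');
    if (pos == std::string::npos)
        return "";
    ++pos;  // first character after '?'
    while (pos < uri.size()) {
        std::string::size_type end = uri.find('&', pos);
        if (end == std::string::npos)
            end = uri.size();
        std::string::size_type eq = uri.find('=', pos);
        // Match only when the text before '=' is exactly the requested name.
        if (eq != std::string::npos && eq < end &&
            uri.compare(pos, eq - pos, name) == 0)
            return uri.substr(eq + 1, end - eq - 1);
        pos = end + 1;
    }
    return "";
}
```

Calling extract_query_param(uri, "ai") on a request from the representative sample would yield the still-encoded HTTP agent ID token; other builds may use a different parameter name.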
Older X-Agent samples used a static URI for HTTP channel requests. This ends
with a hardcoded session tracking parameter named ai=. The HTTP agent
ID token was simply appended to this base URI:
HTTP Agent ID Token Format and Encoding
The controller will extract the HTTP agent ID token from the correct URI
parameter. It is then decoded to identify which agent is communicating.
The HTTP agent ID token is base64 encoded data, using the web-safe alphabet
(see Appendix B). The encoded string is padded with a 5-byte random prefix so
that it looks like valid base64 data.
When encoded as binary data, the HTTP agent ID token starts with a 4-byte XOR
key, followed by a 7 or 20-byte magic token value, and the agent ID:
[xor_key[4]] [magic_token[7 or 20]] [agent_id]
The XOR key is repeated and extended out to a length of 11 or 24 bytes, then
XORed with the magic token and agent ID fields.
The 7-byte magic token for HTTP data, when XOR decoded, should be:
Older versions of X-Agent use a 20-byte ASCII magic token value:
The following steps may be used to decode an HTTP agent ID token:
1. Discard the 5 bytes of prefix data.
2. Base64 decode using the web-safe alphabet (see Appendix B).
3. De-obfuscate by XORing with the repeated XOR key.
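The steps above can be sketched in C++ as follows. This is a minimal illustration, not code recovered from X-Agent: the helper names are hypothetical, and the web-safe alphabet shown is the common RFC 4648 one ("-" and "_" in place of "+" and "/"), which is assumed here to match the alphabet listed in Appendix B. The key is also assumed to restart at its first byte immediately after the key field.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumption: RFC 4648 web-safe alphabet; substitute the Appendix B alphabet if it differs.
static const std::string kWebSafe =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Decode base64 text with the given alphabet. Padding is not required;
// decoding stops at the first character that is not in the alphabet.
std::vector<uint8_t> b64_decode(const std::string& text, const std::string& alphabet)
{
    std::vector<uint8_t> out;
    uint32_t acc = 0;
    int bits = 0;
    for (char ch : text) {
        std::string::size_type v = alphabet.find(ch);
        if (v == std::string::npos)
            break;
        acc = (acc << 6) | static_cast<uint32_t>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<uint8_t>((acc >> bits) & 0xff));
        }
    }
    return out;
}

// Steps 1-3: drop the 5-byte prefix, base64 decode, then XOR with the repeated
// 4-byte key. Returns the bytes after the key (magic token, then agent ID),
// or an empty vector if the input is too short.
std::vector<uint8_t> decode_agent_id_token(const std::string& param_value)
{
    const std::string::size_type kPrefixLen = 5;
    const std::size_t kKeyLen = 4;
    if (param_value.size() <= kPrefixLen)
        return {};
    std::vector<uint8_t> raw = b64_decode(param_value.substr(kPrefixLen), kWebSafe);
    if (raw.size() <= kKeyLen)
        return {};
    std::vector<uint8_t> plain;
    for (std::size_t i = kKeyLen; i < raw.size(); ++i)
        plain.push_back(static_cast<uint8_t>(raw[i] ^ raw[(i - kKeyLen) % kKeyLen]));
    return plain;
}
```

For a 7-byte magic token the returned vector would normally hold 11 bytes; the caller still has to compare the first 7 against the expected magic value before trusting the agent ID.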
The following example demonstrates the decoding operation.
Client (agent) request:
GET
m=j byi . 1
Accept:
Accept-Language:
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 Gecko/20100101 Firefox/20.0
Host: windows-updater.com
Server (controller) response:
. 200 OK
Date: Thu, 12 Jun 2014 22:18:27 GMT
Server: Apache
Content-Length: 3
Connection: Close
Content-Type: text/plain; charset=UTF-8
400
In this example, the HTTP agent ID token is in the ai= URI parameter:
ai=oedQJ3vMSQ6j9N7oleYALu8C
To decode, discard the 5 bytes of prefix data, leaving:
This data must be base64 decoded using the web-safe alphabet (see Appendix
B). The result is:
The first 4 bytes of this data are the XOR key. To continue decoding, XOR with
the repeated key, giving a result of:
The first 7 bytes are the expected HTTP agent ID token:
The remaining 4 bytes are the agent ID, as a 32-bit little-endian integer:
43 f0 1C 10
The agent ID in this case was 0x101cf043.
In some situations, the high 8 bits of the agent ID may be zero, causing only 3
bytes of the 32-bit agent ID to be base64 encoded. The decoded output for such HTTP
agent ID tokens will look truncated, missing the last byte. This is likely
unintended.
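A small C++ sketch of the final step, reading the agent ID as a little-endian integer, is shown below. It also accepts the truncated 3-byte form described above and treats the missing high byte as zero. The function name read_agent_id is illustrative only.

```cpp
#include <cstddef>
#include <cstdint>

// Read an agent ID from 3 or 4 little-endian bytes. A 3-byte input models the
// truncated case, where the absent high byte is taken to be zero.
// Returns false for any other length or for null pointers.
bool read_agent_id(const uint8_t* bytes, std::size_t len, uint32_t* agent_id)
{
    if (bytes == nullptr || agent_id == nullptr || (len != 3 && len != 4))
        return false;
    uint32_t value = 0;
    for (std::size_t i = 0; i < len; ++i)
        value |= static_cast<uint32_t>(bytes[i]) << (8 * i);
    *agent_id = value;
    return true;
}
```

With the bytes 43 f0 1c 10 from the example, this yields 0x101cf043.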
HTTP Message Format and Encoding
HTTP channel messages are encoded in a format common to both inbound and
outbound messages. Inbound messages are responses to GET requests, and
outbound messages are contained in POST request bodies.
The encoding of HTTP channel messages is similar to that of HTTP agent ID
tokens. To decode, a 5-byte junk prefix should be discarded, and the remaining
data base64 decoded with the web-safe alphabet (see Appendix B).
The result will be binary data, starting with an 11-byte header, containing the
following fields:
[xor_key[4]] [magic_token[7]]
The following steps will decode an HTTP channel message:
1. Discard the 5-byte prefix from the body.
2. Decode the remainder with the web-safe base64 alphabet.
3. Retrieve the 4-byte XOR key (the first 4 bytes of decoded data).
4. XOR the next 11 bytes of the message with the XOR key. This
includes the HTTP magic token and the agent ID.
5. Validate that the 7-byte magic token in the header has the expected value:
6. Discard the magic token bytes.
The result of this decoding is a raw packet message, encoded in the
previously-described format.
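The header handling in steps 3 to 6 might look like the following C++ sketch, which operates on data that has already had its prefix removed and been base64 decoded. The function name strip_http_header is hypothetical, the key is assumed to restart at its first byte immediately after the key field, and the expected magic token is passed in by the caller rather than hardcoded.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// `decoded` is the message after steps 1-2 (prefix dropped, base64 decoded).
// `expected_magic` is the 7-byte magic token, supplied by the caller.
// On success `packet` receives the raw packet message, which begins with the
// de-obfuscated 4-byte agent ID.
bool strip_http_header(const std::vector<uint8_t>& decoded,
                       const std::vector<uint8_t>& expected_magic,
                       std::vector<uint8_t>* packet)
{
    const std::size_t kKeyLen = 4, kMagicLen = 7, kXorLen = 11;
    if (packet == nullptr || expected_magic.size() != kMagicLen ||
        decoded.size() < kKeyLen + kXorLen)
        return false;
    // Steps 3-4: the first 4 bytes are the key; XOR it, repeated, over the
    // next 11 bytes (magic token plus agent ID). Later bytes are untouched.
    std::vector<uint8_t> body(decoded.begin() + kKeyLen, decoded.end());
    for (std::size_t i = 0; i < kXorLen; ++i)
        body[i] = static_cast<uint8_t>(body[i] ^ decoded[i % kKeyLen]);
    // Step 5: the magic token must match exactly.
    for (std::size_t i = 0; i < kMagicLen; ++i)
        if (body[i] != expected_magic[i])
            return false;
    // Step 6: discard the magic token; the rest is the raw packet message.
    packet->assign(body.begin() + kMagicLen, body.end());
    return true;
}
```

Because the agent ID is de-obfuscated here, the output can be handed to the same raw packet decoder that the mail channel uses.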
An example POST request for X-Agent's HTTP channel is available via VirusTotal:
The final output is a serialized module message from module
0x1002, command 0x64, with an opaque message body whose contents have a
hash of:
Mail External Channel
The SMTP and POP3 channels together make up a common mail channel. The
SMTP channel is used to send messages, and the POP3 channel is used to
receive them. These channels are an alternative to the HTTP channel, which can
both send and retrieve messages. The oldest versions of X-Agent exclusively
used mail protocols for C2 communication.
Incongruous Mail Subject Fields - Hardcoded Values or Encoding
X-Agent sends SMTP messages to the controller with specific magic values in the
Subject line. The presence of these values is enforced by the C2 controller and
by X-Agent when fetching messages via POP3.
The most common Subject line observed contains "piradi nomeri", which refers
to a Georgian government-issued citizen identification number, similar to a US
Social Security Number.
Other versions of X-Agent expect the Subject line to contain an encoded token
for session management, much like the HTTP agent ID token. This data is
encoded using an encoding method called the P2Scheme.
The P2Scheme encodes binary data using the standard base64 alphabet (see
Appendix B). The binary data starts with a random 5-byte XOR key, followed by
a 7-byte magic subject token, and 4 bytes for the agent ID:
[xor_key[5]] [magic_token[7]] [agent_id[4]]
The 5-byte XOR key is repeated, extended out to 11 bytes, covering the magic
subject token and the agent ID.
The magic subject token, when XOR-decoded, should have the following value:
In other words, followed by the ASCII string "china".
The choice of magic token values, using Georgian phrases and the word
"china", seems incongruous.
Mail Message Format and Encoding
The mail channel sends and receives messages as multipart MIME email. The
first message part contains a 7-bit UTF-8 representation of "gamarjoba", which is
Georgian for "hello."
The second message part is a base64 encoded attachment with the filename
detaluri.dat. Alternatively the file may be named
where %s is a string representation of the current time. "Detaluri" means
"detailed" in Georgian. The file may also be called winmail.dat.
The attachment contents are a single raw packet message (see Appendix B).
Air-Gapped Operations
Some versions of X-Agent are designed to operate in an environment without an
Internet connection, such as an air-gapped network. In this situation, X-Agent
relies on human intervention to carry commands and data in and out via
writable external media, such as USB flash drives.
X-Agent will register a local channel for external communication, and use a
module called Net Flash.
The Net Flash module receives notifications from the OS when a new file-system
on writable external media is mounted. The Net Flash module then checks for
incoming module messages, in the following locations:
\System Volume High priority incoming messages
\System Volume Information\sys Normal priority incoming messages
logs\data\*
\System Volume Information\sys Outbound messages
logs\com\*
If these folders do not exist, they are created as hidden system directories.
Inbound message files are deleted after they're read.
The X-Agent microkernel contains a message shim for the Net Flash module.
When Net Flash is active, this shim intercepts all outbound messages, rerouting
them before they reach an external channel. Linux versions of X-Agent also
contain this shim, but a Linux version of the Net Flash module has not been
observed.
This architecture indicates that the X-Agent kernel was designed or specifically
adapted to work in air-gapped environments.
Autorun Infection
Perhaps to support infection in air-gapped networks, X-Agent has the ability to
spread via autorun invocation on USB flash drives. Some samples have been
observed with residual strings from an autorun.inf file:
[autorun]
open:
shell\open=Explore
Volume Information\USBGuard.exe?
install
shell\open\Default=1
X-Agent Indicators
Known mutexes:
Known mailslots (for
Packet queue file names:
edg6E85F98675.tmp
edg6EF885E2.tmp
zdg6E85F98675.tmp
zdg6EF885E2.tmp
Representative Sample Hashes
Signatures
The following Yara signatures can be used to detect X-Agent:
rule ecksumAlgo rithm
stringscondition:
1 of them
rule XAG
strings:
$s_uniq1 wide
$s_uniq2 ascii
$s_uniq3 ascii
$s_uniq4 wide
$s_uniq5 wide
$s_uniq6 "edg6E85F98675.tmp" wide
$s_uniq7 wide
"<font size=4 color=red>comm isn't wide
6 "com 6 is success" ascii
"com 7 is success" ascii
"com isn't success" ascii
EXC: - Cannot create Post Channel!" ascii
EXC: - Cannot create Get Channel!" ascii
Cannot create ascii
Cannot create ExtChannelToProcessThread!" ascii
Cannot create ProcToExtPipe!" ascii
Cannot create ExtToProcPipe!" ascii
Cannot create Process!" ascii
"Calloc 3 error!" ascii
wide
?{autoru
Volume Information\\USBGuard.exe\" ascii
size=4 color=red>comm" wide
"comm" wide
?
" wide
35 width=800 height=500 ascii
2 "file is blocked another process
? wide
"Calloc 1 error! Packet lost!" ascii
"Error Broken Pipe!" ascii
condition:
1 of ($s_uniq*) or 8 of them
.
rule
strings:
$s_uniq1 "WRITE FILE IS NOT ascii
$s_uniq2 "
? ascii
$s_uniq3 "Terminal don't started" ascii
$s_uniq6
".config/dbus-notifier" ascii
ascii
2 "rm -f ~/.config/autostart/? ascii
"mkdir ascii
"11AgentKernel" ascii
55 "12IAgentModule" ascii
$:'W3ResavedApWasdi
"BFSModule" ascii
1 of ($s_uniq*) or 6 of them
SMTP and POP3 Servers and Accounts
When the mail channel is active, the following SMTP and POP3 servers and
accounts have been observed being used for C2. X-Agent binaries contain
hard-coded credentials for free webmail providers or presumably compromised
accounts.
SMTP and POP3 Servers:
smtp.mail.ru
pop.mail.ru
smtp.yandex.ru
smtp.bk.ru
smtp.gmail.com
smtp.mia.gov.ge
mail.mia.gov.ge
SMTP and POP3 accounts:
arkadmo@mail.ru
roe.xichard@yandex.ru
john.dory@mail.ru
Colin.mcrae1968@gmail.com
devil.666.666.13@gmail.com
interppol@gmail.com
:obert.fastand@gmail.com
jose.karreras@bk.ru
kar1.fridrikh@yandex.ru
sarah.nyassa@gmail.com
i1ya.kasatonov@list.ru
zurab.razmadze11@gmail.com
albertborough@yahoo.com
ahmedOmed@outlook.com
shjanashvili@mia.gov.ge
u.kakhidze@mia.gov.ge
r.gvarjaladze@mia.gov.ge
maia.otxmezuri@mia.gov.ge
1.maghradze@mia.gov.ge
C2 Servers and Domains
The following observed C2 domains and IP addresses are most used by the
external channel.
Domain names:
hotfix-update.com
adobeincorp.com
check-fix.com
secnetcontrol.com
checkwinframe.com
testsnetcontrol.com
azureon-line.com
windows-updater.com
IP addresses:
62.205.175.96
63.247.82.242
63.247.82.243
64.92.172.221
64.92.172.222
67.18.172.18
70.85.221.10
74.52.115.118
80.94.84.21
80.94.84.22
81.177.20.109
81.112.20.110
82.103.128.81
82.103.128.82
82.103.132.81
82.103.132.82
83.102.136.86
88.198.55.146
94.23.254.109
201.218.236.26
203.117.68.58
216.244.65.34
Appendix A
Sofacy LNK Persistence File
The following LNK file shows how Sofacy creates persistence using this method.
This can also be found in VirusTotal with a hash of:
Windows Shortcut information:
    Contains a link target identifier
    Contains a description string
    Contains a relative path string
    Contains a working directory string
    Contains a command line arguments string
    Contains an icon location string
    Contains an icon location block
Link information:
    Creation time               : Jan 06, 2011 21:30:40.983625000 UTC
    Modification time           : Aug 14, 2007 02:43:56.000000000 UTC
    Access time                 : Jan 07, 2011 06:47:58.593750000 UTC
    File size                   : 622080 bytes
    File attribute flags        : 0x00000020
        Should be archived
    Drive type                  : Fixed
    Drive serial number         : 0xec6d8b11
    Volume label                :
    Local path                  : C:\Program Files\Internet Explorer\iexplore.exe
    Description                 :
    Relative path               : Users\hpplication
    Working directory           : C:\Program Files\Internet Explorer
    Command line arguments      : "C:\Program Files\Internet Explorer\iexplore.exe"
    Icon location               : %ProgramFiles%\Internet Explorer\iexplore.exe
Link target identifier:
    Shell item list
        Number of items         :
    Shell item: 1
        Class type              : (Root folder)
        Shell folder identifier :
        Shell folder name       : My Computer
    Shell item: 2
        Class type              : 0x2f (Volume)
        Volume name             :
    Shell item: 3
        Class type              : 0x31 (File entry: Directory)
        Name                    : Documents and Settings
        Modification time       : Not set (0)
        File attribute flags    : 0x00000010
            Is directory
        Extension block: 1
            Signature           : 0xbeef0004 (File entry extension)
            Long name           : Documents and Settings
            Creation time       : Not set (0)
            Access time         : Not set (0)
    Shell item: 4
        Class type              : 0x31 (File entry: Directory)
        Name                    : All Users
        Modification time       : Not set (0)
        File attribute flags    : 0x00000010
            Is directory
        Extension block: 1
            Signature           : 0xbeef0004 (File entry extension)
            Long name           : All Users
            Creation time       : Not set (0)
            Access time         : Not set (0)
    Shell item: 5
        Class type              : 0x31 (File entry: Directory)
        Name                    : Application Data
        Modification time       : Not set (0)
        File attribute flags    : 0x00000010
            Is directory
        Extension block: 1
            Signature           : 0xbeef0004 (File entry extension)
            Long name           : Application Data
            Creation time       : Not set (0)
            Access time         : Not set (0)
    Shell item: 6
        Class type              : 0x31 (File entry: Directory)
        Name                    : Microsoft
        Modification time       : Not set (0)
        File attribute flags    : 0x00000010
            Is directory
        Extension block: 1
            Signature           : 0xbeef0004 (File entry extension)
            Long name           : Microsoft
            Creation time       : Not set (0)
            Access time         : Not set (0)
    Shell item: 7
        Class type              : 0x31 (File entry: Directory)
        Name                    : MediaPlayer
        Modification time       : Not set (0)
        File attribute flags    : 0x00000010
            Is directory
        Extension block: 1
            Signature           : 0xbeef0004 (File entry extension)
            Long name           : MediaPlayer
            Creation time       : Not set (0)
            Access time         : Not set (0)
    Shell item: 8
        Class type              : 0x32 (File entry: File)
        Name                    : service.exe
        Modification time       : Not set (0)
        File attribute flags    : 0x00000020
            Should be archived
        Extension block: 1
            Signature           : 0xbeef0004 (File entry extension)
            Long name           : service.exe
            Creation time       : Not set (0)
            Access time         : Not set (0)
Distributed link tracking data:
    Machine identifier          : xp
    Droid volume identifier     :
    Droid file identifier       :
    Birth droid volume identifier :
    Birth droid file identifier :
Appendix B
X-Agent C2 Raw Packet Decoding
Base64 Alphabets
X-Agent uses two base64 alphabets during message encoding. The first is a
standard base64 alphabet, used for mail messages (SMTP and POP3):
HTTP messages are encoded with a different web-safe base64 alphabet:
Raw Packet Message Format
Raw packets are a generic container and packet format, used to transmit
module messages over external channels such as HTTP, SMTP, or
POP3.
Raw packets are transmitted one-by-one, each in its own external channel
message. For example, the SMTP mail channel sends each raw packet message
as a mail attachment file. The size of the raw packet message is the size of the
decoded attachment.
Raw packets include the following fields
agent_id]
crc[2]]
The raw packet message format was meant to be abstracted from the external
channel, but there is one implementation inconsistency. The HTTP external
channel XORs the agent ID field with an XOR key intended to obfuscate the
previous header. The mail channels do not do this, and it is likely an
unintentional oversight.
Raw Packet Message CRC Checking
A CRC is calculated over the data and session key fields and then sent
as two UINT16LE fields in the packet. The first is a polynomial seed for the
CRC-16 algorithm, followed by the calculated (good) CRC value.
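Here is an implementation of the CRC check functionality in C++:
#include <cstddef>
#include <cstring>
#include <string>
unsigned short crc16(const unsigned char* input, size_t len, unsigned short poly_seed)
{
    unsigned short result = 0;  // initial value is not legible in the source; 0 is assumed
    for (size_t i = 0; i < len; i++) {
        unsigned char c = input[i];
        for (int j = 0; j < 8; j++) {
            // Compare the low bit of the running CRC with the low bit of the input byte.
            if (((result ^ c) & 0xff) & 1) {
                result >>= 1;
                result ^= poly_seed;
            } else {
                result >>= 1;
            }
            c >>= 1;
        }
    }
    return result;
}
// Expects the packet with the agent ID already removed: a 2-byte seed and a
// 2-byte expected CRC (both little-endian), followed by the checked data.
// The function name is a placeholder; the original name is not legible.
bool check_crc(const std::string& input)
{
    unsigned char header[4];
    unsigned short seed, expected_crc, actual_crc;
    if (input.size() < 4)
        return false;
    memcpy(header, input.data(), 4);
    seed = header[0] | (header[1] << 8);
    expected_crc = header[2] | (header[3] << 8);
    actual_crc = crc16((const unsigned char*)input.data() + 4,
                       input.size() - 4, seed);
    return (actual_crc == expected_crc);
}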
Raw Packet Message
Raw packet messages are encrypted with RC4, using a key built by concatenating a
static private key with a public key that changes each packet.
A few simple steps can be used to decrypt a raw packet message:
1. Retrieve the agent ID (first 4 bytes of the message) as a little-endian
32-bit integer. Discard these message bytes from the stream.
2. Retrieve the CRC-16 polynomial seed value, and the expected CRC-16 value,
as the next two UINT16LEs (immediately following the agent ID). Discard the
CRC bytes (4 in total) from the stream.
3. Calculate the actual CRC of the remaining packet bytes, seeding the CRC with
the correct polynomial seed. This should match the expected value.
4. Create the full RC4 key for the message, which starts with a 50-byte static
private RC4 key:
Then append the last 4 bytes of the message (the public key) to create the
full RC4 key. Finally, discard the last 4 bytes of the stream (the public key).
5. Decrypt the remainder of the message stream using the full RC4 key.
6. Check that the last 11 bytes of the message are the magic token bytes:
Discard these bytes.
The result is a clear-text, serialized module message.