Theodosius - JIT Linker, Mapper, Mutator, and Obfuscator

Theodosius (Theo for short) is a JIT linker created entirely for obfuscation and mutation of both code and code flow. The project is extremely modular in design and supports both kernel and usermode projects. Since Theo inherits HMDM (highly modular driver mapper), any vulnerable driver that exposes arbitrary MSR writes or physical memory read/write can be used with this framework to map unsigned code into the kernel. This is possible because HMDM inherits VDM (vulnerable driver manipulation) and MSREXEC (elevation of arbitrary MSR writes to kernel execution).

Since Theo is a JIT linker, unexported symbols can be JIT linked. Resolving such symbols is open ended: the programmer using this framework decides how symbols get resolved. More on this later (check out the example projects).

Linking - Dynamic And Static

Object Files

If you define a C++ file called "main.cpp", the compiler will generate an object file by the name of "main.obj". When you refer to data or code defined in another C/C++ file, the linker uses a symbol table to resolve the address of said code/data. In this situation I am the linker and I resolve all of your symbols :).

What Is A Linker

A linker is a program which takes object files produced by a compiler and generates a final executable native to the operating system. A linker interfaces not only with object files but also with static libraries ("lib" files). What is a "lib" file? Well, a lib file is just an archive of objs. You can envision it as a zip/rar without any compression, just a concatenation of said object files.

Theo is a JIT linker, which means it links objs together and maps them into memory all at once. For usability, however, instead of handling individual object files, Theo can parse entire lib files and extract the objects out of the lib.
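To make the "archive of objs" idea concrete, here is a minimal sketch that walks the member headers of an ar-style archive, the container format used by lib files. The names ArchiveMember and list_archive_members are made up for illustration and are not part of Theo. A real lib also starts with linker and long-name members, which this sketch simply lists like any other member instead of interpreting them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper type: one archive member (raw name field and payload size).
struct ArchiveMember {
    std::string name;   // raw 16-byte name field, space padded
    std::size_t size;   // payload size in bytes
};

// Walks the member headers of an ar-style archive held in memory.
// Returns an empty vector when the signature or a member header is malformed.
std::vector<ArchiveMember> list_archive_members(const std::vector<unsigned char>& lib) {
    static const char kMagic[] = "!<arch>\n";            // 8-byte archive signature
    const std::size_t kMagicLen = 8, kHeaderLen = 60;
    std::vector<ArchiveMember> members;

    if (lib.size() < kMagicLen || std::memcmp(lib.data(), kMagic, kMagicLen) != 0)
        return members;

    std::size_t pos = kMagicLen;
    while (pos + kHeaderLen <= lib.size()) {
        const char* hdr = reinterpret_cast<const char*>(lib.data() + pos);
        // Header layout: name[16] date[12] uid[6] gid[6] mode[8] size[10] end[2]
        if (hdr[58] != '`' || hdr[59] != '\n')
            return {};
        const std::string name(hdr, 16);
        const std::string size_field(hdr + 48, 10);      // decimal, space padded
        const std::size_t size = std::strtoull(size_field.c_str(), nullptr, 10);
        if (pos + kHeaderLen + size > lib.size())
            return {};
        members.push_back({name, size});
        pos += kHeaderLen + size;
        pos += pos & 1;                                  // members start on even offsets
        assert(pos % 2 == 0);
    }
    return members;
}
```

Each payload found this way is one object file (or a bookkeeping member), which is the granularity a linker works with.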

Static Linking

Static linking is when the linker links entire routines not created by you into your code. For example, memcpy (if it's not inlined) will be statically linked from the CRT. Static linking also allows your code to be more independent, since you bring all the code you need with you. However, with Theo you cannot link static libraries which are not compiled with mcmodel=large. Theo supports actual static linking, in other words, using multiple static libraries at the same time.

Dynamic Linking

Dynamic linking is when external symbols are resolved at runtime. This is done through the imports and exports of DLLs (dynamically linked libraries). Theo supports "dynamic linking", or in better terms, linking against exported routines. You can see examples of this in both the usermode and kernelmode examples.

RIP Relative Addressing

In order to allow a routine to be scattered throughout a 64-bit address space, RIP relative addressing must not be used. To facilitate this, a very special version of clang-cl is used which supports mcmodel=large. It generates instructions which do not use RIP relative addressing when referencing symbols outside of the routine in which the instruction itself resides. The only exception to this is JCC instructions, also known as branching instructions (call is not among them). Take this C++ code for an example:

ObfuscateRoutine 
extern "C" int ModuleEntry()
{
	MessageBoxA(0, "Demo", "Hello From Obfuscated Routine!", 0);
	UsermodeMutateDemo();
	UsermodeNoObfuscation();
	return 0;
}

The UsermodeNoObfuscation routine called above, compiled by clang-cl with mcmodel=large, consists of the following instructions:

0x00:                               ; void UsermodeNoObfuscation(void)
0x00:                                               public ?UsermodeNoObfuscation@@YAXXZ
0x00:                               ?UsermodeNoObfuscation@@YAXXZ proc near ; CODE XREF: ModuleEntry+42↓p
0x00:                               var_4           = dword ptr -4
0x00: 48 83 EC 28                                   sub     rsp, 28h
0x04: C7 44 24 24 00 00 00 00                       mov     [rsp+28h+var_4], 0
0x0C:                               loc_C:
0x0C: 83 7C 24 24 05                                cmp     [rsp+28h+var_4], 5
0x11: 0F 83 38 00 00 00                             jnb     loc_4F
0x17: 31 C0                                         xor     eax, eax
0x19: 48 BA 28 01 00 00 00 00 00 00                 mov     rdx, offset ??_C@_04DKDMNOEB@Demo?$AA@ ; "Demo"
0x23: 49 B8 00 01 00 00 00 00 00 00                 mov     r8, offset ??_C@_0CD@JEJKPGNA@Hello?5... ; "Hello From Non-Obfuscated Routine!"
0x2D: 48 B8 A0 01 00 00 00 00 00 00                 mov     rax, offset MessageBoxA
0x37: 45 31 C9                                      xor     r9d, r9d        ; uType
0x3A: 44 89 C9                                      mov     ecx, r9d        ; hWnd
0x3D: FF D0                                         call    rax ; MessageBoxA
0x3F: 8B 44 24 24                                   mov     eax, [rsp+28h+var_4]
0x43: 83 C0 01                                      add     eax, 1
0x46: 89 44 24 24                                   mov     [rsp+28h+var_4], eax
0x4A: E9 BD FF FF FF                                jmp     loc_C
0x4F:                               loc_4F:
0x4F: 48 83 C4 28                                   add     rsp, 28h
0x53: C3                                            retn
0x53:                               ?UsermodeNoObfuscation@@YAXXZ endp

As you can see from the code above (sorry for the terrible syntax highlighting), references to strings and calls to functions are done by first loading the absolute address of the symbol into a register and then using the symbol through that register.

0x2D: 48 B8 A0 01 00 00 00 00 00 00                 mov     rax, offset MessageBoxA
; ...
0x3D: FF D0                                         call    rax ; MessageBoxA
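As a rough illustration of why this pattern is position independent, the sketch below hand-encodes the same two instructions for an arbitrary 64-bit target. The function name encode_absolute_call is hypothetical and is not part of Theo.

```cpp
#include <array>
#include <cstdint>

// Encodes `mov rax, imm64` followed by `call rax` (12 bytes in total).
// The target is an absolute address, so the bytes work wherever they are placed.
std::array<std::uint8_t, 12> encode_absolute_call(std::uint64_t target) {
    std::array<std::uint8_t, 12> code{};
    code[0] = 0x48;                        // REX.W prefix
    code[1] = 0xB8;                        // mov rax, imm64
    for (int i = 0; i < 8; ++i)            // immediate is stored little-endian
        code[2 + i] = static_cast<std::uint8_t>(target >> (8 * i));
    code[10] = 0xFF;                       // call rax
    code[11] = 0xD0;
    return code;
}
```

Calling encode_absolute_call(0x1A0) reproduces the 48 B8 A0 01 00 00 00 00 00 00 and FF D0 byte patterns shown in the listing.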

Each of these instructions can be anywhere in virtual memory and it would not affect code execution one bit. However, this is not the case with routines which have conditional branches. Take the following C++ code for example.

ObfuscateRoutine 
void LoopDemo()
{
    for (auto idx = 0u; idx < 10; ++idx)
		DbgPrint("> Loop Demo: %d\n", idx);
}

This C++ function, compiled by clang-cl with mcmodel=large, will generate a routine with the following instructions:

0x58                               ; void LoopDemo(void)
0x58                                               public ?LoopDemo@@YAXXZ
0x58                               ?LoopDemo@@YAXXZ proc near
0x58                               var_4           = dword ptr -4
0x58                               
0x58 48 83 EC 28                                   sub     rsp, 28h
0x5C C7 44 24 24 00 00 00 00                       mov     [rsp+28h+var_4], 0
0x64                               loc_64:
0x64 83 7C 24 24 0A                                cmp     [rsp+28h+var_4], 0Ah
0x69 0F 83 2A 00 00 00                             jnb     loc_99
0x6F 8B 54 24 24                                   mov     edx, [rsp+28h+var_4]
0x73 48 B9 60 01 00 00 00 00 00 00                 mov     rcx, offset ??_C@_0BB@HGKDPLMC@?$.... ; "> Loop Demo: %d\n"
0x7D 48 B8 38 02 00 00 00 00 00 00                 mov     rax, offset DbgPrint
0x87 FF D0                                         call    rax ; DbgPrint
0x89 8B 44 24 24                                   mov     eax, [rsp+28h+var_4]
0x8D 83 C0 01                                      add     eax, 1
0x90 89 44 24 24                                   mov     [rsp+28h+var_4], eax
0x94 E9 CB FF FF FF                                jmp     loc_64
0x99                               loc_99:
0x99 48 83 C4 28                                   add     rsp, 28h
0x9D C3                                            retn
0x9D                               ?LoopDemo@@YAXXZ endp

Uh oh, jnb loc_99? That's RIP relative! In order to handle branching operations, a "jump table" is generated by the explicit constructor of obfuscation::obfuscate. Instead of branching to the RIP relative code, the instruction branches to an inline jump (JMP [RIP+0x0]). As demonstrated below, the branching operation is altered to branch to an absolute jump.

ffff998b`c5369e60 0f830e000000    jnb     ffff998b`c5369e74
ffff998b`c5369e66 ff2500000000    jmp     qword ptr [ffff998b`c5369e6c]
...
ffff998b`c5369e74 ff2500000000    jmp     qword ptr [ffff998b`c5369e7a]

The linker gets the address of the branch target by taking the relative displacement encoded in the branching instruction, which is a signed number, and adding it to the byte offset of the instruction within the current routine plus the size of the branching instruction. For example, the jnb sits at LoopDemo@17 and is six bytes long, and its signed displacement is 0x2A (42). 17 + 6 + 42 gives us LoopDemo@65, which is correct: the branch goes to add rsp, 28h in the above example.
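The same arithmetic can be written down as a small sketch. The helper names read_rel32 and branch_target_offset are made up for illustration and are not Theo's actual routines. The code assumes the branch stores its displacement in its last four bytes, as the jnb and jmp encodings in the LoopDemo listing do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads the signed 32-bit displacement stored in the last four bytes of a
// rel32 branch (for example 0F 83 2A 00 00 00 or E9 CB FF FF FF).
std::int32_t read_rel32(const std::uint8_t* insn, std::size_t insn_size) {
    assert(insn_size >= 5);
    const std::uint8_t* p = insn + insn_size - 4;
    std::uint32_t raw = 0;
    for (int i = 0; i < 4; ++i)                      // little-endian decode
        raw |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    return static_cast<std::int32_t>(raw);
}

// Offset of the branch target, measured from the start of the routine:
// instruction offset + instruction size + signed displacement.
std::int64_t branch_target_offset(std::int64_t insn_offset,
                                  std::size_t insn_size,
                                  std::int32_t displacement) {
    return insn_offset + static_cast<std::int64_t>(insn_size) + displacement;
}
```

With the jnb from LoopDemo (offset 17, six bytes, displacement 0x2A) this yields 65. The backwards jmp at offset 60 (five bytes, displacement -53) yields 12, which is loc_64.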

Usage - Using Theodosius

Theodosius uses the same class structure as HMDM does. It's a highly modular format which allows for a wide range of uses, supporting almost every idea one might have. In order to use Theo, you must first define three lambdas: theo::memcpy_t, a method of copying memory; theo::malloc_t, a method to allocate executable memory; and lastly theo::resolve_symbol_t, a lambda to resolve external symbols.

theo::memcpy_t - copy memory lambda

This lambda is used to write memory; it is never used to read memory. An example of this lambda using VDM could be:

theo::memcpy_t _kmemcpy =
[&](void* dest, const void* src, std::size_t size) -> void*
{
	static const auto kmemcpy =
		reinterpret_cast<void*>(
			utils::kmodule::get_export(
				"ntoskrnl.exe", "memcpy"));

	return vdm.syscall<decltype(&memcpy)>(kmemcpy, dest, src, size);
};

This uses VDM to syscall into memcpy exported by ntoskrnl... If you want to do something in usermode, you can proxy memcpy to WriteProcessMemory or any other method of writing memory.

theo::memcpy_t _memcpy =
[&](void* dest, const void* src, std::size_t size) -> void*
{
	SIZE_T bytes_handled;
	if (!WriteProcessMemory(phandle, dest, src, size, &bytes_handled))
	{
		std::printf("[!] failed to write process memory...\n");
		exit(-1);
	}
	return dest;
};

theo::malloc_t - allocate executable memory

This lambda is used to allocate executable memory. Any method will do as long as the memcpy lambda can write to the allocated memory. An MSREXEC example for this lambda is defined below.

theo::malloc_t _kalloc = [&](std::size_t size) -> void*
{
	void* alloc_base = nullptr;
	msrexec.exec
	(
		[&](void* krnl_base, get_system_routine_t get_kroutine) -> void
		{
			using ex_alloc_pool_t =
				void* (*)(std::uint32_t, std::size_t);

			const auto ex_alloc_pool =
				reinterpret_cast<ex_alloc_pool_t>(
					get_kroutine(krnl_base, "ExAllocatePool"));

			alloc_base = ex_alloc_pool(0 /* NonPagedPool */, size);
		}
	);
	return alloc_base;
};

This lambda uses MSREXEC to allocate kernel memory via ExAllocatePool. However, how you do this is completely open ended: you can allocate your memory in discarded sections, in another address space, etc... It's extremely modular.

Another simple usermode example for this lambda is defined below.

theo::malloc_t _alloc = [&](std::size_t size) -> void*
{
	return VirtualAllocEx
	(
		phandle,
		nullptr,
		size,
		MEM_COMMIT | MEM_RESERVE,
		PAGE_EXECUTE_READWRITE
	);
};

theo::resolve_symbol_t - resolve external symbols
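Going by the earlier description of this lambda (a lambda to resolve external symbols), a resolver can be thought of as a name-to-address lookup. The sketch below is only that. The alias resolve_symbol_sketch_t is a hypothetical stand-in and may not match the real theo::resolve_symbol_t signature, and make_table_resolver is likewise made up. A real resolver would more likely look names up in loaded modules than in a fixed table.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for theo::resolve_symbol_t; the real alias may differ.
using resolve_symbol_sketch_t = std::function<std::uintptr_t(const char*)>;

// Builds a resolver backed by a table of known symbol addresses.
// Unknown names resolve to 0 so the caller can detect the failure.
resolve_symbol_sketch_t make_table_resolver(
    std::unordered_map<std::string, std::uintptr_t> table) {
    return [table = std::move(table)](const char* name) -> std::uintptr_t {
        const auto it = table.find(name);
        return it != table.end() ? it->second : 0;
    };
}
```

Because resolution is just a callable, the same shape can be backed by exports, a symbol file, or anything else the programmer prefers.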

Obfuscation

The word obfuscation in this project is used to describe any change made to code, including code flow. obfuscation::obfuscate, a base class which is inherited and expanded upon by obfuscation::mutation, obfuscates code flow by inserting JMP [RIP+0x0] instructions after every single instruction. This allows a routine to be broken up into unique allocations of memory and thus provides more canvas room for creative ideas.

Obfuscate - Base Class

The base class, as described in the above section, contains a handful of util routines and a single explicit constructor which is the cornerstone of the class. The constructor fixes JCC relative virtual addresses so that if the condition is met, instead of jumping instruction pointer relatively, it will jump to an additional jmp (JMP [RIP+0x0]).

Neither LEAs nor CALLs are RIP relative, even for symbols defined inside of the routine in which the instruction is compiled. In other words, branching instructions (JCC) are the only instruction pointer relative instructions that are generated.

instruction
jmp next instruction


instruction
jmp next instruction


instruction
jmp next instruction
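The layout above, an instruction followed by a jump to the next instruction, can be sketched by hand-encoding JMP [RIP+0x0] with the absolute destination stored right behind it. The names encode_absolute_jmp and make_piece are hypothetical and only illustrate the byte layout. They are not the code Theo itself uses.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Encodes `jmp qword ptr [rip+0]` followed by the 8-byte absolute destination
// that the instruction reads (14 bytes in total).
std::array<std::uint8_t, 14> encode_absolute_jmp(std::uint64_t destination) {
    std::array<std::uint8_t, 14> stub{0xFF, 0x25, 0x00, 0x00, 0x00, 0x00};
    for (int i = 0; i < 8; ++i)            // destination is stored little-endian
        stub[6 + i] = static_cast<std::uint8_t>(destination >> (8 * i));
    return stub;
}

// One relocated piece: the original instruction bytes plus the trailing jump
// to wherever the next instruction was placed.
std::vector<std::uint8_t> make_piece(const std::vector<std::uint8_t>& insn,
                                     std::uint64_t next_insn_addr) {
    std::vector<std::uint8_t> piece(insn);
    const auto stub = encode_absolute_jmp(next_insn_addr);
    piece.insert(piece.end(), stub.begin(), stub.end());
    return piece;
}
```

Since every piece carries the full 64-bit address of its successor, the pieces can live in unrelated allocations.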

Mutation - Inherits Obfuscation

This class inherits from obfuscate and adds additional code, or "mutation". It is a small example of how to use inheritance with the obfuscate base class. It generates a stack push/pop palindrome. The state of the stack is restored before the routine's actual instruction is executed. The assembly will now look like this in memory:

push gp
push gp
push gp
...
pop gp
pop gp
pop gp
exec routine instruction
jmp next instruction

push gp
push gp
push gp
push gp
push gp
...
pop gp
pop gp
pop gp
pop gp
pop gp
exec routine instruction
jmp next instruction

push gp
push gp
push gp
...
pop gp
pop gp
pop gp
exec routine instruction
jmp next instruction

Again, this is just a demo/POC of how you can inherit from obfuscate. It also shows an example of how to use asmjit.
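For readers who want to see the palindrome as bytes, here is a minimal sketch that hand-encodes the pushes and mirrored pops. Note that it does not use asmjit, which the mutation class relies on. The function name make_push_pop_palindrome is hypothetical, and limiting the registers to rax through rdi (minus rsp) is a simplification made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Emits `push reg` for every register, then `pop reg` in mirrored order, so
// both the stack and the registers end up exactly as they started.
// Register numbers are the x86-64 encodings 0..7 (0 = rax, 1 = rcx, 2 = rdx, ...).
std::vector<std::uint8_t> make_push_pop_palindrome(const std::vector<std::uint8_t>& regs) {
    std::vector<std::uint8_t> code;
    for (const auto reg : regs) {
        assert(reg < 8 && reg != 4);       // no REX-extended registers, no rsp
        code.push_back(static_cast<std::uint8_t>(0x50 + reg));   // push r64
    }
    for (auto it = regs.rbegin(); it != regs.rend(); ++it)
        code.push_back(static_cast<std::uint8_t>(0x58 + *it));   // pop r64
    return code;
}
```

The routine's actual instruction and its trailing jmp would follow these bytes, matching the layout shown above.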