# Theodosius - Jit linker, Mapper, Mutator, and Obfuscator

Theodosius (Theo for short) is a jit linker created entirely for obfuscation and mutation of both code and code flow. The project is extremely modular in design and supports both kernel and usermode projects. Since Theo inherits HMDM (highly modular driver mapper), any vulnerable driver that exposes arbitrary MSR writes or physical memory read/write can be used with this framework to map unsigned code into the kernel. This is possible since HMDM inherits VDM (vulnerable driver manipulation) and MSREXEC (elevation of arbitrary MSR writes to kernel execution).

Since Theo is a jit linker, unexported symbols can be jit linked. Symbol resolution is open ended: the programmer using this framework decides how symbols are resolved. More on this later (check out the example projects).

# RIP Relative Addressing

In order to allow a routine to be scattered throughout a 64bit address space, RIP relative addressing must not be used. To facilitate this, a very special version of clang-cl is used which can use `mcmodel=large`. This generates instructions which do not use RIP relative addressing when referencing symbols outside of the routine in which the instruction itself resides. The only exception to this is JCC instructions, also known as branching instructions (CALL excluded).
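
To make the difference concrete, here is a minimal sketch (not part of Theo) that decodes the target of an absolute `mov rax, imm64` and of a relative near `jcc rel32`. The helper names `ReadLe`, `MovAbsTarget`, and `JccRel32Target` are hypothetical and exist only for this illustration. The byte layouts are the standard x86-64 encodings.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reads an n-byte little-endian value (n <= 8).
std::uint64_t ReadLe(const std::uint8_t* p, int n) {
    assert(n >= 0 && n <= 8);
    std::uint64_t v = 0;
    for (int i = n - 1; i >= 0; --i) v = (v << 8) | p[i];
    return v;
}

// `mov rax, imm64` is encoded as 48 B8 followed by the 8-byte immediate.
// The immediate is the absolute address, so the result does not depend on
// where the instruction itself is placed.
std::uint64_t MovAbsTarget(const std::uint8_t* insn) {
    return ReadLe(insn + 2, 8);
}

// A near `jcc rel32` is encoded as 0F 8x followed by a 4-byte displacement.
// The displacement is added to the address of the next instruction
// (insn_addr + 6), so the target moves whenever the instruction moves.
std::uint64_t JccRel32Target(const std::uint8_t* insn, std::uint64_t insn_addr) {
    const auto raw = static_cast<std::uint32_t>(ReadLe(insn + 2, 4));
    const auto rel = static_cast<std::int64_t>(static_cast<std::int32_t>(raw));
    return insn_addr + 6 + static_cast<std::uint64_t>(rel);
}
```

For instance, `MovAbsTarget` on the bytes `48 B8 A0 01 00 00 00 00 00 00` yields `0x1A0` no matter where the instruction lives. `JccRel32Target` on `0F 83 38 00 00 00` yields `0x4F` when the instruction sits at `0x11`, but `0x104F` when the same bytes are moved to `0x1011`.
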
Take this c++ code for an example:

```cpp
ObfuscateRoutine
extern "C" int ModuleEntry()
{
    MessageBoxA(0, "Demo", "Hello From Obfuscated Routine!", 0);
    UsermodeMutateDemo();
    UsermodeNoObfuscation();
    return 0; // added: the function is declared to return int
}
```

Compiled by clang-cl with `mcmodel=large`, the `UsermodeNoObfuscation` routine called above is generated with the following instructions:

```nasm
0x00:                                ; void UsermodeNoObfuscation(void)
0x00:                                public ?UsermodeNoObfuscation@@YAXXZ
0x00:                                ?UsermodeNoObfuscation@@YAXXZ proc near ; CODE XREF: ModuleEntry+42↓p
0x00:                                var_4 = dword ptr -4
0x00: 48 83 EC 28                    sub     rsp, 28h
0x04: C7 44 24 24 00 00 00 00        mov     [rsp+28h+var_4], 0
0x0C:                                loc_C:
0x0C: 83 7C 24 24 05                 cmp     [rsp+28h+var_4], 5
0x11: 0F 83 38 00 00 00              jnb     loc_4F
0x17: 31 C0                          xor     eax, eax
0x19: 48 BA 28 01 00 00 00 00 00 00  mov     rdx, offset ??_C@_04DKDMNOEB@Demo?$AA@ ; "Demo"
0x23: 49 B8 00 01 00 00 00 00 00 00  mov     r8, offset ??_C@_0CD@JEJKPGNA@Hello?5... ; "Hello From Non-Obfuscated Routine!"
0x2D: 48 B8 A0 01 00 00 00 00 00 00  mov     rax, offset MessageBoxA
0x37: 45 31 C9                       xor     r9d, r9d        ; uType
0x3A: 44 89 C9                       mov     ecx, r9d        ; hWnd
0x3D: FF D0                          call    rax             ; MessageBoxA
0x3F: 8B 44 24 24                    mov     eax, [rsp+28h+var_4]
0x43: 83 C0 01                       add     eax, 1
0x46: 89 44 24 24                    mov     [rsp+28h+var_4], eax
0x4A: E9 BD FF FF FF                 jmp     loc_C
0x4F:                                loc_4F:
0x4F: 48 83 C4 28                    add     rsp, 28h
0x53: C3                             retn
0x53:                                ?UsermodeNoObfuscation@@YAXXZ endp
```

As you can see from the code above (sorry for the terrible syntax highlighting), references to strings and calls to functions are done by first loading the address of the symbol into a register and then interfacing with the symbol.

```nasm
0x2D: 48 B8 A0 01 00 00 00 00 00 00  mov     rax, offset MessageBoxA
; ...
0x3D: FF D0                          call    rax             ; MessageBoxA
```

Each of these instructions can be anywhere in virtual memory and it would not affect code execution one bit. However, this is not the case with routines which have conditional branches. Take the following c++ code for example.
```cpp
ObfuscateRoutine
void LoopDemo()
{
    for (auto idx = 0u; idx < 10; ++idx)
        DbgPrint("> Loop Demo: %d\n", idx);
}
```

This c++ function, compiled by clang-cl with `mcmodel=large`, will generate a routine with the following instructions:

```nasm
0x58                                ; void LoopDemo(void)
0x58                                public ?LoopDemo@@YAXXZ
0x58                                ?LoopDemo@@YAXXZ proc near
0x58                                var_4 = dword ptr -4
0x58
0x58 48 83 EC 28                    sub     rsp, 28h
0x5C C7 44 24 24 00 00 00 00        mov     [rsp+28h+var_4], 0
0x64                                loc_64:
0x64 83 7C 24 24 0A                 cmp     [rsp+28h+var_4], 0Ah
0x69 0F 83 2A 00 00 00              jnb     loc_99
0x6F 8B 54 24 24                    mov     edx, [rsp+28h+var_4]
0x73 48 B9 60 01 00 00 00 00 00 00  mov     rcx, offset ??_C@_0BB@HGKDPLMC@?$.... ; "> Loop Demo: %d\n"
0x7D 48 B8 38 02 00 00 00 00 00 00  mov     rax, offset DbgPrint
0x87 FF D0                          call    rax             ; DbgPrint
0x89 8B 44 24 24                    mov     eax, [rsp+28h+var_4]
0x8D 83 C0 01                       add     eax, 1
0x90 89 44 24 24                    mov     [rsp+28h+var_4], eax
0x94 E9 CB FF FF FF                 jmp     loc_64
0x99                                loc_99:
0x99 48 83 C4 28                    add     rsp, 28h
0x9D C3                             retn
0x9D                                ?LoopDemo@@YAXXZ endp
```

# Obfuscation

The word obfuscation in this project is used to describe any change made to code, including code flow. `obfuscation::obfuscate`, a base class which is inherited and expanded upon by `obfuscation::mutation`, obfuscates code flow by inserting `JMP [RIP+0x0]` instructions after every single instruction. This allows a routine to be broken up into unique allocations of memory and thus provides more canvas room for creative ideas.

### Obfuscation - Base Class

The base class, as described in the above section, contains a handful of util routines and a single explicit constructor which is the cornerstone of the class. The constructor fixes JCC relative virtual addresses so that if the condition is met, instead of jumping instruction pointer relatively, it will jump to an additional jmp (`JMP [RIP+0x0]`). Neither LEA nor CALL is RIP relative, even for symbols defined inside of the routine in which the instruction is compiled into.
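
The JCC fix-up described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Theo's actual implementation: the helper names `MakeAbsJmp` and `RelocateJcc` are hypothetical, and so is the exact layout (the `jcc`, then a fall-through jump, then a taken-branch jump). What is standard is the encoding of `JMP [RIP+0x0]`: `FF 25 00 00 00 00`, immediately followed by the 8-byte absolute destination.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: emits `jmp qword ptr [rip+0]` (FF 25 00 00 00 00)
// followed by the 8-byte little-endian absolute address that the jump reads.
std::array<std::uint8_t, 14> MakeAbsJmp(std::uint64_t dest) {
    std::array<std::uint8_t, 14> stub{0xFF, 0x25, 0x00, 0x00, 0x00, 0x00};
    for (int i = 0; i < 8; ++i)
        stub[6 + i] = static_cast<std::uint8_t>(dest >> (8 * i));
    return stub;
}

// Hypothetical helper: rewrites a 6-byte near `jcc rel32` so that neither
// path depends on where the instruction ends up. Assumed layout:
//   +0   jcc +14                     ; condition met: skip the next stub
//   +6   jmp [rip+0] -> fallthrough  ; condition not met
//   +20  jmp [rip+0] -> taken        ; the "additional jmp"
std::vector<std::uint8_t> RelocateJcc(const std::uint8_t* jcc,
                                      std::uint64_t fallthrough,
                                      std::uint64_t taken) {
    // Only the two-byte-opcode form 0F 80..8F (jcc rel32) is handled here.
    assert(jcc[0] == 0x0F && (jcc[1] & 0xF0) == 0x80);
    std::vector<std::uint8_t> out{jcc[0], jcc[1], 14, 0, 0, 0};
    const auto ft = MakeAbsJmp(fallthrough);
    const auto tk = MakeAbsJmp(taken);
    out.insert(out.end(), ft.begin(), ft.end());
    out.insert(out.end(), tk.begin(), tk.end());
    return out;
}
```

With this layout the only relative displacement left is the fixed `+14` inside the block itself, so the block can be placed in its own allocation while both destinations stay absolute.
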
In other words, JCC instructions are the only instruction pointer relative instructions that are generated.

### Mutation - Inherits Obfuscation