LLO - Low Level Obfuscation (Framework)
Purpose For This Document
This is a design draft for the possible LLO framework. This document contains theoretical ideas at a drafting stage. This document is being released publicly for the purpose of obtaining outsider input and is entirely subject to change based upon this input.
Introduction - What Is LLO?
LLO is a low level obfuscation framework designed to be file format agnostic, whilst targeting the Intel x86 ISA, as well as AMD64. LLO's purpose is to take the heavy burden of code analysis and file format deconstruction and reconstruction off the backs of those interested in designing a code obfuscation algorithm targeted for x86 and x86_64 specifically. The design of LLO will allow for already compiled binaries to be obfuscated, thus no recompilation will be required, nor any specific compilers, linkers, or compiler configurations.
It's important to state that LLO's end goal is not to provide obfuscation itself; it is to provide a framework that allows one to create their own obfuscation.
LLVM Optimization Pass For Obfuscation vs. LLO (Comparing Against LLVM)
LLVM optimization passes repurposed for obfuscation are useful when compiling to multiple targets. However, this level of code obfuscation does not allow for instruction level mutation. In addition, the code must be compiled with the optimization pass in order for obfuscated code to be produced. This requires compiling or recompiling the code, so without the source code the software cannot be mutated unless it is first lifted to LLVM IR and then obfuscated.
LLVM "obfuscation passes" cannot perform simple, instruction level code obfuscation such as adding a segment override to a memory operand, or adding a scale factor and displacement to it. In 64-bit mode, for example, the following four instructions all load from the same address:
mov rax, [rax]
mov rax, [rax*1+0]
mov rax, ss:[rax]
mov rax, ss:[rax*1+0]
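The operand rewrites above can be sketched as small transformations on a memory operand model. This is only an illustration: `MemOperand`, `to_scaled_index`, `add_segment`, and `effective_address` are hypothetical names, not part of any LLO API, and the sketch assumes 64-bit mode, where the SS segment base is treated as zero.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class MemOperand:
    """Hypothetical model of an x86-64 memory operand: seg:[base + index*scale + disp]."""
    base: Optional[str] = None
    index: Optional[str] = None
    scale: int = 1
    disp: Optional[int] = None      # None means no displacement is written out
    segment: Optional[str] = None   # None means the default segment


def render(op: MemOperand) -> str:
    """Print the operand in the same style as the listing above."""
    terms = []
    if op.base is not None:
        terms.append(op.base)
    if op.index is not None:
        terms.append(f"{op.index}*{op.scale}")
    body = "+".join(terms)
    if op.disp is not None:
        body += f"{op.disp:+d}" if body else str(op.disp)
    prefix = f"{op.segment}:" if op.segment else ""
    return f"{prefix}[{body}]"


def to_scaled_index(op: MemOperand) -> MemOperand:
    """Rewrite [base] as [base*1+0]; rsp is skipped because it cannot be an index."""
    if op.base is None or op.index is not None or op.base == "rsp":
        return op
    return replace(op, base=None, index=op.base, scale=1, disp=op.disp or 0)


def add_segment(op: MemOperand, segment: str = "ss") -> MemOperand:
    """Add an explicit segment override (assumed to have a zero base)."""
    return replace(op, segment=segment)


def effective_address(op: MemOperand, regs: Dict[str, int]) -> int:
    """Compute the address, treating every segment base as zero."""
    base = regs[op.base] if op.base is not None else 0
    index = regs[op.index] * op.scale if op.index is not None else 0
    return base + index + (op.disp or 0)
```

Applying `to_scaled_index` and `add_segment` to `MemOperand(base="rax")` reproduces the three variants in the listing, and `effective_address` returns the same value for all of them.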
However, the design of LLVM is a major inspiration for LLO, as class inheritance, passes, and other code design features will be applied to LLO. LLVM "obfuscation passes" are fantastic if you want IR level code obfuscation rather than native instruction level code obfuscation.
LLO will consist of three major stages. The first performs native file format deconstruction and internal file format construction, and also determines where functions begin and end. The second stage applies obfuscation passes to specified functions. The third and final stage reconstructs the obfuscated routines back into a native file format and fixes RIP relative instructions.
Stage One Overview
Stage one and stage three are currently the least settled stages of the framework, as each detail still needs more public discussion. As a given, however, stage one will take a native file, for example a PE file, and generate an internal representation called LLOIFF, the "Low Level Obfuscation Intermediate File Format". The format itself has not been defined yet because, again, more discussion and design work is needed. This internal format will contain a series of sections. Each section will record its page protections, virtual memory size, physical (on-disk) size, and number of symbols. There may also be a structure for a symbol, since a symbol could mean many things: a function, a variable in a data section, an import descriptor, and so on. These are the details that need to be mulled over, but this should explain clearly enough what stage one is meant to do.
Relocations must be handled at this stage as well. The internal file format will contain a place to hold relocations.
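To make the description above more concrete, here is a minimal sketch of what the in-memory structures might look like. Since LLOIFF has not been defined, every name here (`Image`, `Section`, `Symbol`, `Relocation`, `Protection`, `SymbolKind`) and every field layout is an assumption made for discussion, not a proposal of the final format.

```python
from dataclasses import dataclass, field
from enum import Enum, Flag, auto
from typing import List


class Protection(Flag):
    """Page protections shared by the native formats (hypothetical)."""
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()


class SymbolKind(Enum):
    """A symbol can mean several things, as noted above."""
    FUNCTION = auto()
    DATA = auto()
    IMPORT = auto()


@dataclass
class Symbol:
    name: str
    kind: SymbolKind
    offset: int  # offset from the start of the owning section
    size: int    # for functions, this marks where the function ends


@dataclass
class Relocation:
    section: str  # name of the section containing the patched location
    offset: int   # offset of the patched location inside that section
    target: str   # name of the symbol the location refers to


@dataclass
class Section:
    name: str
    protection: Protection
    virtual_size: int
    raw_size: int  # size on disk
    symbols: List[Symbol] = field(default_factory=list)

    @property
    def symbol_count(self) -> int:
        return len(self.symbols)


@dataclass
class Image:
    """The whole in-memory representation produced by stage one (hypothetical)."""
    entry_point: str
    sections: List[Section] = field(default_factory=list)
    relocations: List[Relocation] = field(default_factory=list)
```

Keeping relocations on the image rather than on each section is an arbitrary choice in this sketch; either placement satisfies the requirement that the format has a place to hold them.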
Stage Two Overview
Stage two is responsible for dispatching obfuscation passes on specified symbols. Not every symbol can be, or needs to be, run through an obfuscation pass. This is really a logistical stage that lets the user fine-tune which passes run on which symbols. A CLI and a GUI would expose the features of this stage.
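A minimal sketch of this dispatching idea follows. It assumes a pass is simply a callable that mutates a function in place and that the user supplies a plan mapping symbol names to passes; `Function`, `insert_nops`, and `run_passes` are hypothetical names, and instructions are plain strings purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Function:
    """Hypothetical function symbol; instructions are plain text for illustration."""
    name: str
    instructions: List[str]


# A pass is assumed to be any callable that rewrites a function in place.
Pass = Callable[[Function], None]


def insert_nops(fn: Function) -> None:
    """Toy pass: place a nop after every instruction."""
    padded: List[str] = []
    for instruction in fn.instructions:
        padded.append(instruction)
        padded.append("nop")
    fn.instructions = padded


def run_passes(functions: List[Function], plan: Dict[str, List[Pass]]) -> None:
    """Run only the passes selected for each function; unlisted functions are untouched."""
    for fn in functions:
        for obfuscation_pass in plan.get(fn.name, []):
            obfuscation_pass(fn)


functions = [
    Function("main", ["mov rax, [rax]", "ret"]),
    Function("helper", ["xor eax, eax", "ret"]),
]
run_passes(functions, {"main": [insert_nops]})
```

After this runs, `main` has been padded with `nop` instructions while `helper` is left exactly as it was, which is the kind of per-symbol control a CLI or GUI would expose.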
Stage Three Overview
Stage three is responsible for fixing RIP relative instructions. Code obfuscation changes the spacing between instructions, so RIP relative instructions need their displacements recalculated or, in slang terms, "fixed". This stage is also responsible for converting LLOIFF back to a native file format.
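The recalculation itself is simple arithmetic: a RIP relative operand is resolved against the address of the next instruction, so the target is the instruction address plus its length plus the displacement. The sketch below uses the hypothetical helpers `rip_target` and `refit_disp` and assumes a 32-bit signed displacement whose encoded length does not change.

```python
def rip_target(address: int, length: int, disp: int) -> int:
    """Address referenced by a RIP relative operand.

    RIP points at the next instruction while the current one executes,
    hence address + length.
    """
    return address + length + disp


def refit_disp(new_address: int, length: int, new_target: int) -> int:
    """Displacement that makes the moved instruction reach the moved target."""
    disp = new_target - (new_address + length)
    if not -(2 ** 31) <= disp < 2 ** 31:
        raise OverflowError("target is out of range for a 32-bit displacement")
    return disp


# Before obfuscation: a 7 byte `lea rax, [rip+0x100]` located at 0x1000.
old_target = rip_target(0x1000, 7, 0x100)

# After obfuscation the instruction sits at 0x1010 and its target at 0x1200.
new_disp = refit_disp(0x1010, 7, 0x1200)
```

Here `old_target` is `0x1107` and `new_disp` is `0x1E9`. A real implementation would first need to know where every instruction and every referenced symbol ended up, which is why this step belongs to the final stage.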
LLOIFF - Low Level Obfuscation Intermediate File Format
LLOIFF is the intermediate file format of LLO, and it is what allows the framework to be file format agnostic. All stages interface with this format in one way or another, so its design must be considered deeply. Currently this format will not be serializable, meaning it exists only in memory and cannot be sent over the network or written to disk. The reasons for this are ease of use, the lack of any need for serialization, and faster development.
LLOIFF takes the similarities between ELF and PE and abstracts them. For example, both file formats (and almost all others) have sections. Each of these sections has a size (both virtual and on disk), protections or characteristics, a name, and so on. Both PE and ELF have import and export directories, so LLOIFF would have an abstracted concept of these. Both ELF and PE have entry points, so LLOIFF would too. This should be enough to show what we are going for; the details are still being considered, but for the time being it should be enough to make sense of how the format is used.
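As one example of this kind of abstraction, section protections can be normalized into a single flag set. The numeric constants below are the standard PE section characteristics and ELF section header flags; the `Protection` type and the two conversion functions are hypothetical, and treating an `SHF_ALLOC` section as readable is a simplifying assumption of this sketch.

```python
from enum import Flag, auto

# PE section characteristics (IMAGE_SECTION_HEADER.Characteristics)
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000

# ELF section header flags (sh_flags)
SHF_WRITE = 0x1
SHF_ALLOC = 0x2
SHF_EXECINSTR = 0x4


class Protection(Flag):
    """Format-neutral page protections (hypothetical LLOIFF type)."""
    NONE = 0
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()


def protection_from_pe(characteristics: int) -> Protection:
    prot = Protection.NONE
    if characteristics & IMAGE_SCN_MEM_READ:
        prot |= Protection.READ
    if characteristics & IMAGE_SCN_MEM_WRITE:
        prot |= Protection.WRITE
    if characteristics & IMAGE_SCN_MEM_EXECUTE:
        prot |= Protection.EXECUTE
    return prot


def protection_from_elf(sh_flags: int) -> Protection:
    prot = Protection.NONE
    if sh_flags & SHF_ALLOC:
        # Assumption: a section that is loaded into memory is readable.
        prot |= Protection.READ
    if sh_flags & SHF_WRITE:
        prot |= Protection.WRITE
    if sh_flags & SHF_EXECINSTR:
        prot |= Protection.EXECUTE
    return prot
```

With this mapping, a typical PE `.text` section (characteristics `0x60000020`) and a typical ELF `.text` section (`SHF_ALLOC | SHF_EXECINSTR`) both become `READ | EXECUTE`, so later stages never need to know which native format the section came from.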
Obfuscation Passes - Overview
Obfuscation passes are a modular concept that allows a programmer to interface with the LLO framework without needing to know the underlying file format, deconstruct the native file format, or find symbols. Obfuscation passes will be modular enough to support both instruction virtualization and instruction obfuscation.
Obfuscation passes will take the form of native libraries. No other details have been decided yet.