Frida Doc (#515)

* draft
* add
* more newlines
2022-02-08 07:39:53 +09:00 · 914bcd5c47
commit 914bcd5c47
parent 98fbe83c15
2 changed files with 91 additions and 6 deletions
--- a/docs/src/advanced_features/frida/frida.md
+++ b/docs/src/advanced_features/frida/frida.md
@@ -0,0 +1,74 @@
+# Binary-only Fuzzing with Frida
+LibAFL supports binary-only fuzzing with Frida, the dynamic instrumentation tool.
+
+In this section, we'll talk about some of the components involved in fuzzing with `libafl_frida`.
+You can take a look at a working example in our `fuzzers/frida_libpng` folder.
+
+# Dependencies
+If you are on Linux or macOS, you'll need [libc++](https://libcxx.llvm.org/) for `libafl_frida` in addition to LibAFL's dependencies.
+If you are on Windows, you'll need to install the LLVM tools.
+
+
+# Harness & Instrumentation
+LibAFL uses Frida's [__Stalker__](https://frida.re/docs/stalker/) to trace the execution of your program and instrument your harness.
+Thus, you have to compile your harness to a dynamic library. Frida instruments your PUT (program under test) after dynamically loading it.
+
+For instance, in our `frida_libpng` example, we load the dynamic library and look up the harness symbol as follows:
+```rust
+        let lib = libloading::Library::new(module_name).unwrap();
+        let target_func: libloading::Symbol<
+            unsafe extern "C" fn(data: *const u8, size: usize) -> i32,
+        > = lib.get(symbol_name.as_bytes()).unwrap();
+```
+
+
+# `FridaInstrumentationHelper` and Runtimes
+To use the functionality that Frida offers, we first need to obtain a `Gum` object by calling `Gum::obtain()`.
+
+In LibAFL, we use the struct `FridaInstrumentationHelper` to manage everything related to Frida. `FridaInstrumentationHelper` is a key component that sets up the [__Transformer__](https://frida.re/docs/stalker/#transformer) that is used to generate the instrumented code. It also initializes the `Runtimes` that offer various kinds of instrumentation.
+
+We have `CoverageRuntime`, which tracks edge coverage, `AsanRuntime` for address sanitizer, `DrCovRuntime`, which uses [__DrCov__](https://dynamorio.org/page_drcov.html) for coverage collection, and `CmpLogRuntime` for cmplog instrumentation. All these runtimes can be used by slotting them into `FridaInstrumentationHelper`.
+
+Combined with any `Runtime` you'd like to use, you can initialize the `FridaInstrumentationHelper` like this:
+```rust
+
+        let gum = Gum::obtain();
+        let frida_options = FridaOptions::parse_env_options();
+        let coverage = CoverageRuntime::new();
+        let mut frida_helper = FridaInstrumentationHelper::new(
+            &gum,
+            &frida_options,
+            module_name,
+            modules_to_instrument,
+            tuple_list!(coverage),
+        );
+```
+
+# Run the fuzzer
+After setting up the `FridaInstrumentationHelper`, you can obtain the pointer to the coverage map by calling `map_ptr_mut()`.
+```rust
+        let edges_observer = HitcountsMapObserver::new(StdMapObserver::new_from_ptr(
+            "edges",
+            frida_helper.map_ptr_mut().unwrap(),
+            MAP_SIZE,
+        ));
+```
+You can link this observer to `FridaInProcessExecutor`,
+```rust
+        let mut executor = FridaInProcessExecutor::new(
+            &gum,
+            InProcessExecutor::new(
+                &mut frida_harness,
+                tuple_list!(
+                    edges_observer,
+                    time_observer,
+                    AsanErrorsObserver::new(&ASAN_ERRORS)
+                ),
+                &mut fuzzer,
+                &mut state,
+                &mut mgr,
+            )?,
+            &mut frida_helper,
+        );
+```
+and finally you can run the fuzzer.
--- a/docs/src/core_concepts/executor.md
+++ b/docs/src/core_concepts/executor.md
@@ -18,11 +18,14 @@ A common pattern when creating an Executor is wrapping an existing one, for inst
 ## InProcessExecutor
 Let's begin with the base case: `InProcessExecutor`.
 This executor uses [_SanitizerCoverage_](https://clang.llvm.org/docs/SanitizerCoverage.html) as its backend; you can find the related code in `libafl_targets/src/sancov_pcguards`. Here we allocate a map called `EDGES_MAP`, and our compiler wrapper then compiles the harness to write the coverage into this map.
+
 When you want to execute the harness as fast as possible, you will most probably want to use this `InProcessExecutor`.
+
 One thing to note here: when your harness is likely to have heap corruption bugs, you want to use another allocator so that a corrupted heap does not affect the fuzzer itself (for example, we adopt MiMalloc in some of our fuzzers). Alternatively, you can compile your harness with address sanitizer to make sure you catch these heap bugs.

 ## ForkserverExecutor
 Next, we'll take a look at the `ForkserverExecutor`. In this case, it is `afl-cc` (from AFLplusplus/AFLplusplus) that compiles the harness code, and therefore, we can't use `EDGES_MAP` anymore. Fortunately, we have [_a way_](https://github.com/AFLplusplus/AFLplusplus/blob/2e15661f184c77ac1fbb6f868c894e946cbb7f17/instrumentation/afl-compiler-rt.o.c#L270) to tell the forkserver which map to record the coverage in.
+
 As you can see from the forkserver example,
 ```rust,ignore
 //Coverage map shared between observer and executor
@@ -34,16 +37,23 @@ let mut shmem_buf = shmem.as_mut_slice();
 Here we create a shared memory region, `shmem`, and write its ID to the environment variable `__AFL_SHM_ID`. Then the instrumented binary, or the forkserver, finds this shared memory region (via the aforementioned environment variable) and records its coverage there. On your fuzzer side, you can pass this shmem map to your `Observer` to obtain coverage feedback, combined with any `Feedback`.

 Another feature of the `ForkserverExecutor` worth mentioning is shared memory testcases. Normally, the mutated input is passed between the forkserver and the instrumented binary via a `.cur_input` file. You can improve your forkserver fuzzer's performance by passing the input through shared memory instead.
+
 See AFL++'s [_documentation_](https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/README.persistent_mode.md#5-shared-memory-fuzzing) or the fuzzer example in `forkserver_simple/src/program.c` for reference.
+
 It is very simple: when you call `ForkserverExecutor::new()` with `use_shmem_testcase` set to `true`, the `ForkserverExecutor` sets things up and your harness can just fetch the input from `__AFL_FUZZ_TESTCASE_BUF`.
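
 As a rough illustration of what travels through that shared memory, the sketch below assumes a layout of a 4-byte native-endian length header followed by the input bytes. The layout, the constant `SHMEM_FUZZ_HDR_SIZE`, and the helpers `write_testcase` and `read_testcase` are assumptions made for this example, not LibAFL's API; a plain byte slice stands in for the real shared memory region.

 ```rust
 // Assumed header size: one u32 holding the testcase length.
 const SHMEM_FUZZ_HDR_SIZE: usize = 4;

 /// Fuzzer side (hypothetical helper): store the length, then the input bytes.
 /// Returns false if the input does not fit into the region.
 fn write_testcase(shmem: &mut [u8], input: &[u8]) -> bool {
     let end = match SHMEM_FUZZ_HDR_SIZE.checked_add(input.len()) {
         Some(end) => end,
         None => return false,
     };
     if input.len() > u32::MAX as usize || end > shmem.len() {
         return false;
     }
     let len = (input.len() as u32).to_ne_bytes();
     shmem[..SHMEM_FUZZ_HDR_SIZE].copy_from_slice(&len);
     shmem[SHMEM_FUZZ_HDR_SIZE..end].copy_from_slice(input);
     true
 }

 /// Harness side (hypothetical helper): read the length header, then borrow
 /// exactly that many bytes. Returns None if the header is inconsistent.
 fn read_testcase(shmem: &[u8]) -> Option<&[u8]> {
     let mut len = [0u8; SHMEM_FUZZ_HDR_SIZE];
     len.copy_from_slice(shmem.get(..SHMEM_FUZZ_HDR_SIZE)?);
     let len = u32::from_ne_bytes(len) as usize;
     let end = SHMEM_FUZZ_HDR_SIZE.checked_add(len)?;
     shmem.get(SHMEM_FUZZ_HDR_SIZE..end)
 }
 ```

 Under these assumptions, the fuzzer side would call something like `write_testcase` before each run, and the harness would read the same bytes back without touching the file system.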

 ## InProcessForkExecutor
 Finally, we'll talk about the `InProcessForkExecutor`.
 `InProcessForkExecutor` has only one difference from `InProcessExecutor`: it forks before running the harness, and that's it.
+
 But why do we want to do so? Well, under some circumstances, you may find your harness pretty unstable, or it may wreak havoc on global state. In this case, you want to fork before executing the harness, so that the harness runs in the child process and doesn't break things.
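
 The idea can be sketched without LibAFL. The following is a minimal, Unix-only illustration that declares `fork`, `waitpid`, and `_exit` by hand (assuming 32-bit `pid_t` and `int`, as on common Linux and macOS targets). `run_forked`, `RunResult`, and `toy_harness` are hypothetical names for this example; the real `InProcessForkExecutor` does considerably more.

 ```rust
 use std::os::unix::process::ExitStatusExt;
 use std::process::ExitStatus;

 // Hand-written declarations so the sketch needs no extra crates.
 // Assumption: a Unix-like target where `pid_t` and `int` are 32-bit.
 extern "C" {
     fn fork() -> i32;
     fn waitpid(pid: i32, status: *mut i32, options: i32) -> i32;
     fn _exit(code: i32) -> !;
 }

 /// How one isolated harness run ended, as seen from the parent.
 #[derive(Debug, PartialEq)]
 enum RunResult {
     Exited(i32),
     Signaled(i32),
 }

 /// Toy harness (hypothetical): aborts on inputs starting with `!`.
 fn toy_harness(data: &[u8]) -> i32 {
     if data.first() == Some(&b'!') {
         std::process::abort();
     }
     0
 }

 /// Fork, run the harness in the child, and report the outcome in the parent.
 fn run_forked(harness: fn(&[u8]) -> i32, input: &[u8]) -> Option<RunResult> {
     let pid = unsafe { fork() };
     if pid < 0 {
         return None; // fork failed
     }
     if pid == 0 {
         // Child: whatever the harness breaks stays in this process.
         let code = harness(input);
         unsafe { _exit(code) }
     }
     // Parent: wait for the child and decode its wait status.
     let mut status: i32 = 0;
     let waited = unsafe { waitpid(pid, &mut status, 0) };
     if waited != pid {
         return None;
     }
     let status = ExitStatus::from_raw(status);
     match (status.code(), status.signal()) {
         (Some(code), _) => Some(RunResult::Exited(code)),
         (None, Some(signal)) => Some(RunResult::Signaled(signal)),
         _ => None,
     }
 }
 ```

 In this sketch, a crashing input only kills the child: `run_forked(toy_harness, b"!")` reports a `Signaled` result while the parent keeps running.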
+
 However, we have to take care of the shared memory: it's the child process that runs the harness code and writes the coverage to the map.
+
 We have to make the map shared between the parent process and the child process, so we'll use shared memory again. You should compile your harness with the `pointer_maps` feature (of `libafl_targets`) enabled. This way, we have a pointer, `EDGES_MAP_PTR`, that can point to any coverage map.
+
 On your fuzzer side, you can allocate a shared memory region and make the `EDGES_MAP_PTR` point to your shared memory.
+
 ```rust,ignore
 let mut shmem;
 unsafe{
@@ -54,4 +64,5 @@ unsafe{
    EDGES_MAP_PTR = shmem_buf.as_ptr();
 }
 ```
+
 Again, you can pass this shmem map to your `Observer` and `Feedback` to obtain coverage feedback.
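
 To see why a pointer indirection is enough, here is a self-contained sketch. An `AtomicPtr` named `EDGES_MAP_PTR` stands in for the target-side pointer, `record_edge` mimics what instrumented code does at each edge, and `set_edges_map` plays the fuzzer side. All of these, including the tiny `MAP_SIZE`, are simplified stand-ins for illustration, not the definitions in `libafl_targets`.

 ```rust
 use std::sync::atomic::{AtomicPtr, Ordering};

 // Deliberately tiny map size for the example.
 const MAP_SIZE: usize = 16;

 // Stand-in for the target-side pointer; null until the fuzzer sets it.
 static EDGES_MAP_PTR: AtomicPtr<u8> = AtomicPtr::new(std::ptr::null_mut());

 /// Fuzzer side (hypothetical helper): redirect coverage writes to `map`.
 ///
 /// Safety: `map` must be null or point to at least `MAP_SIZE` writable bytes
 /// that stay valid for as long as `record_edge` may be called.
 unsafe fn set_edges_map(map: *mut u8) {
     EDGES_MAP_PTR.store(map, Ordering::Relaxed);
 }

 /// Target side (hypothetical helper): bump the counter of one edge in
 /// whatever map the pointer currently refers to.
 fn record_edge(edge_id: usize) {
     let map = EDGES_MAP_PTR.load(Ordering::Relaxed);
     if map.is_null() {
         return; // no map installed yet
     }
     unsafe {
         let slot = map.add(edge_id % MAP_SIZE);
         *slot = (*slot).wrapping_add(1);
     }
 }
 ```

 If the buffer passed to `set_edges_map` lives in shared memory, writes made by `record_edge` in a forked child land in the same region the parent's `Observer` reads from.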