Reworked Book, add missing files (#888)

* Added missing links to docs

* Reworked docs

* Remove empty file

* remove Launcher info (that moved to spawn_instances)

* ignore more
Dominik Maier 2022-11-10 13:08:35 +01:00 committed by GitHub
parent 893f284482
commit 977415cad2
27 changed files with 180 additions and 111 deletions


@ -24,6 +24,7 @@
- [Design](./design/design.md) - [Design](./design/design.md)
- [Architecture](./design/architecture.md) - [Architecture](./design/architecture.md)
- [Metadata](./design/metadata.md) - [Metadata](./design/metadata.md)
- [Migrating from LibAFL <0.9 to 0.9](./design/migration-0.9.md)
- [Message Passing](./message_passing/message_passing.md) - [Message Passing](./message_passing/message_passing.md)
- [Spawning Instances](./message_passing/spawn_instances.md) - [Spawning Instances](./message_passing/spawn_instances.md)
@ -33,5 +34,7 @@
- [Introduction](./tutorial/intro.md) - [Introduction](./tutorial/intro.md)
- [Advanced Features](./advanced_features/advanced_features.md) - [Advanced Features](./advanced_features/advanced_features.md)
- [Concolic Tracing & Hybrid Fuzzing](./advanced_features/concolic/concolic.md) - [Binary-Only Fuzzing with `Frida`](./advanced_features/frida.md)
- [LibAFL in `no_std` environments (Kernels, Hypervisors, ...)](./advanced_features/no_std/no_std.md) - [Concolic Tracing & Hybrid Fuzzing](./advanced_features/concolic.md)
- [LibAFL in `no_std` environments (Kernels, Hypervisors, ...)](./advanced_features/no_std.md)
- [Snapshot Fuzzing in Nyx](./advanced_features/nyx.md)


@ -1,3 +1,4 @@
# Advanced Features # Advanced Features
In addition to core building blocks for fuzzers, LibAFL also has features for more advanced/niche fuzzing techniques. In addition to core building blocks for fuzzers, LibAFL also has features for more advanced/niche fuzzing techniques.
The following sections are dedicated to these features. The following sections are dedicated to some of these features.


@ -6,7 +6,9 @@ Then, we'll go through the relationship of SymCC and LibAFL concolic tracing.
Finally, we'll walk through building a basic hybrid fuzzer using LibAFL. Finally, we'll walk through building a basic hybrid fuzzer using LibAFL.
## Concolic Tracing by Example ## Concolic Tracing by Example
Suppose you want to fuzz the following program: Suppose you want to fuzz the following program:
```rust ```rust
fn target(input: &[u8]) -> i32 { fn target(input: &[u8]) -> i32 {
match &input { match &input {
@ -19,6 +21,7 @@ fn target(input: &[u8]) -> i32 {
} }
} }
``` ```
A simple coverage-maximizing fuzzer that generates new inputs somewhat randomly will have a hard time finding an input that triggers the fictitious crashing input. A simple coverage-maximizing fuzzer that generates new inputs somewhat randomly will have a hard time finding an input that triggers the fictitious crashing input.
Many techniques have been proposed to make fuzzing less random and more directly attempt to mutate the input to flip specific branches, such as the ones involved in crashing the above program. Many techniques have been proposed to make fuzzing less random and more directly attempt to mutate the input to flip specific branches, such as the ones involved in crashing the above program.
@ -27,6 +30,7 @@ In principle, concolic tracing works by observing all executed instructions in a
To understand what this entails, we'll run an example with the above program. To understand what this entails, we'll run an example with the above program.
First, we'll simplify the program to simple if-then-else-statements: First, we'll simplify the program to simple if-then-else-statements:
```rust ```rust
fn target(input: &[u8]) -> i32 { fn target(input: &[u8]) -> i32 {
if input.len() == 4 { if input.len() == 4 {
@ -56,8 +60,10 @@ fn target(input: &[u8]) -> i32 {
} }
} }
``` ```
Next, we'll trace the program on the input `[]`. Next, we'll trace the program on the input `[]`.
The trace would look like this: The trace would look like this:
```rust,ignore ```rust,ignore
Branch { // if input.len() == 4 Branch { // if input.len() == 4
condition: Equals { condition: Equals {
@ -74,6 +80,7 @@ Branch { // if input.len() == 0
taken: true // This condition turned out to be true! taken: true // This condition turned out to be true!
} }
``` ```
Using this trace, we can easily deduce that we can force the program to take a different path by having an input of length 4 or having an input with non-zero length. Using this trace, we can easily deduce that we can force the program to take a different path by having an input of length 4 or having an input with non-zero length.
We do this by negating each branch condition and analytically solving the resulting 'expression'. We do this by negating each branch condition and analytically solving the resulting 'expression'.
In fact, we can create these expressions for any computation and give them to an [SMT](https://en.wikipedia.org/wiki/Satisfiability_modulo_theories)-Solver that will generate an input that satisfies the expression (as long as such an input exists). In fact, we can create these expressions for any computation and give them to an [SMT](https://en.wikipedia.org/wiki/Satisfiability_modulo_theories)-Solver that will generate an input that satisfies the expression (as long as such an input exists).
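To make the negate-and-solve step concrete, here is a minimal sketch. It assumes the simplified target checks `input.len() == 4` first and `input.len() == 0` second, as in the trace for `[]` above. The names `Condition`, `Branch`, `trace`, `holds`, and `solve_flipped` are hypothetical and are not LibAFL or SymCC APIs, and a brute-force search over small lengths stands in for a real SMT solver.

```rust
/// A branch condition over the input length, as it would appear in a trace.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Condition {
    LenEquals(usize),
}

/// One recorded branch: the condition and whether it held on this run.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Branch {
    condition: Condition,
    taken: bool,
}

/// Evaluate a condition for a concrete input length.
fn holds(condition: Condition, len: usize) -> bool {
    match condition {
        Condition::LenEquals(n) => len == n,
    }
}

/// Record the length checks of the (assumed) simplified target.
fn trace(input: &[u8]) -> Vec<Branch> {
    let mut branches = Vec::new();
    let is_four = input.len() == 4;
    branches.push(Branch {
        condition: Condition::LenEquals(4),
        taken: is_four,
    });
    // The second check is only reached when the first one was not taken.
    if !is_four {
        branches.push(Branch {
            condition: Condition::LenEquals(0),
            taken: input.is_empty(),
        });
    }
    branches
}

/// Toy "solver": find the smallest length (up to 8) for which the branch
/// would go the other way. A real setup would hand this to an SMT solver.
fn solve_flipped(branch: Branch) -> Option<usize> {
    let wanted = !branch.taken;
    (0..=8usize).find(|&len| holds(branch.condition, len) == wanted)
}
```

For the input `[]`, `solve_flipped` suggests length 4 for the first branch and length 1 for the second, matching the deduction above. A real solver would also conjoin the conditions of all earlier branches on the path, which this sketch skips.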
@ -81,13 +88,16 @@ In fact, we can create these expressions for any computation and give them to an
In hybrid fuzzing, we combine this tracing + solving approach with more traditional fuzzing techniques. In hybrid fuzzing, we combine this tracing + solving approach with more traditional fuzzing techniques.
## Concolic Tracing in LibAFL, SymCC and SymQEMU ## Concolic Tracing in LibAFL, SymCC and SymQEMU
The concolic tracing support in LibAFL is implemented using SymCC. The concolic tracing support in LibAFL is implemented using SymCC.
SymCC is a compiler plugin for clang that can be used as a drop-in replacement for a normal C or C++ compiler. SymCC is a compiler plugin for clang that can be used as a drop-in replacement for a normal C or C++ compiler.
SymCC will instrument the compiled code with callbacks into a runtime that can be supplied by the user. SymCC will instrument the compiled code with callbacks into a runtime that can be supplied by the user.
These callbacks allow the runtime to construct a trace that is similar to the previous example. These callbacks allow the runtime to construct a trace that is similar to the previous example.
### SymCC and its Runtimes ### SymCC and its Runtimes
SymCC ships with 2 runtimes: SymCC ships with 2 runtimes:
* a 'simple' runtime that attempts to solve any branches it comes across using [Z3](https://github.com/Z3Prover/z3/wiki) and * a 'simple' runtime that attempts to solve any branches it comes across using [Z3](https://github.com/Z3Prover/z3/wiki) and
* a [QSym](https://github.com/sslab-gatech/qsym)-based runtime, which does a bit more filtering on the expressions and also solves using Z3. * a [QSym](https://github.com/sslab-gatech/qsym)-based runtime, which does a bit more filtering on the expressions and also solves using Z3.
@ -96,15 +106,18 @@ This crate allows you to easily build a custom runtime out of the built-in build
Check out the `symcc_runtime` docs for more information on how to build your own runtime. Check out the `symcc_runtime` docs for more information on how to build your own runtime.
### SymQEMU ### SymQEMU
[SymQEMU](https://github.com/eurecom-s3/symqemu) is a sibling project to SymCC. [SymQEMU](https://github.com/eurecom-s3/symqemu) is a sibling project to SymCC.
Instead of instrumenting the target at compile-time, it inserts instrumentation via dynamic binary translation, building on top of the [`QEMU`](https://www.qemu.org) emulation stack. Instead of instrumenting the target at compile-time, it inserts instrumentation via dynamic binary translation, building on top of the [`QEMU`](https://www.qemu.org) emulation stack.
This means that using SymQEMU, any (x86) binary can be traced without the need to build in instrumentation ahead of time. This means that using SymQEMU, any (x86) binary can be traced without the need to build in instrumentation ahead of time.
The `symcc_runtime` crate supports this use case and runtimes built with `symcc_runtime` also work with SymQEMU. The `symcc_runtime` crate supports this use case and runtimes built with `symcc_runtime` also work with SymQEMU.
## Hybrid Fuzzing in LibAFL ## Hybrid Fuzzing in LibAFL
The LibAFL repository contains an [example hybrid fuzzer](https://github.com/AFLplusplus/LibAFL/tree/main/fuzzers/libfuzzer_stb_image_concolic). The LibAFL repository contains an [example hybrid fuzzer](https://github.com/AFLplusplus/LibAFL/tree/main/fuzzers/libfuzzer_stb_image_concolic).
There are three main steps involved with building a hybrid fuzzer using LibAFL: There are three main steps involved with building a hybrid fuzzer using LibAFL:
1. Building a runtime, 1. Building a runtime,
2. choosing an instrumentation method and 2. choosing an instrumentation method and
3. building the fuzzer. 3. building the fuzzer.
@ -113,11 +126,13 @@ Note that the order of these steps is important.
For example, we need to have a runtime ready before we can do instrumentation with SymCC. For example, we need to have a runtime ready before we can do instrumentation with SymCC.
### Building a Runtime ### Building a Runtime
Building a custom runtime can be done easily using the `symcc_runtime` crate. Building a custom runtime can be done easily using the `symcc_runtime` crate.
Note that a custom runtime is a separate shared object file, which means that we need a separate crate for our runtime. Note that a custom runtime is a separate shared object file, which means that we need a separate crate for our runtime.
Check out the [example hybrid fuzzer's runtime](https://github.com/AFLplusplus/LibAFL/tree/main/fuzzers/libfuzzer_stb_image_concolic/runtime) and the [`symcc_runtime` docs](https://docs.rs/symcc_runtime/0.1/symcc_runtime) for inspiration. Check out the [example hybrid fuzzer's runtime](https://github.com/AFLplusplus/LibAFL/tree/main/fuzzers/libfuzzer_stb_image_concolic/runtime) and the [`symcc_runtime` docs](https://docs.rs/symcc_runtime/0.1/symcc_runtime) for inspiration.
### Instrumentation ### Instrumentation
There are two main instrumentation methods to make use of concolic tracing in LibAFL: There are two main instrumentation methods to make use of concolic tracing in LibAFL:
* Using a **compile-time** instrumented target with **SymCC**. * Using a **compile-time** instrumented target with **SymCC**.
This only works when the source is available for the target and the target is reasonably easy to build using the SymCC compiler wrapper. This only works when the source is available for the target and the target is reasonably easy to build using the SymCC compiler wrapper.
@ -127,6 +142,7 @@ It should be noted, however, that the 'quality' of the generated expressions can
Therefore, it is recommended to use SymCC over SymQEMU when possible. Therefore, it is recommended to use SymCC over SymQEMU when possible.
#### Using SymCC #### Using SymCC
The target needs to be instrumented ahead of fuzzing using SymCC. The target needs to be instrumented ahead of fuzzing using SymCC.
How exactly this is done does not matter. How exactly this is done does not matter.
However, the SymCC compiler needs to be made aware of the location of the runtime that it should instrument against. However, the SymCC compiler needs to be made aware of the location of the runtime that it should instrument against.
@ -139,11 +155,13 @@ The [`symcc_libafl` crate](https://docs.rs/symcc_libafl) contains helper functio
Make sure you satisfy the [build requirements](https://github.com/eurecom-s3/symcc#readme) of SymCC before attempting to build it. Make sure you satisfy the [build requirements](https://github.com/eurecom-s3/symcc#readme) of SymCC before attempting to build it.
#### Using SymQEMU #### Using SymQEMU
Build SymQEMU according to its [build instructions](https://github.com/eurecom-s3/symqemu#readme). Build SymQEMU according to its [build instructions](https://github.com/eurecom-s3/symqemu#readme).
By default, SymQEMU looks for the runtime in a sibling directory. By default, SymQEMU looks for the runtime in a sibling directory.
Since we don't have a runtime there, we need to let it know the path to your runtime by setting the `--symcc-build` argument of the `configure` script to the path of your runtime. Since we don't have a runtime there, we need to let it know the path to your runtime by setting the `--symcc-build` argument of the `configure` script to the path of your runtime.
### Building the Fuzzer ### Building the Fuzzer
No matter the instrumentation method, the interface between the fuzzer and the instrumented target should now be consistent. No matter the instrumentation method, the interface between the fuzzer and the instrumented target should now be consistent.
The only difference between using SymCC and SymQEMU should be the binary that represents the target: The only difference between using SymCC and SymQEMU should be the binary that represents the target:
In the case of SymCC it will be the binary that was built with instrumentation and with SymQEMU it will be the emulator binary (e.g. `x86_64-linux-user/symqemu-x86_64`), followed by your uninstrumented target binary and arguments. In the case of SymCC it will be the binary that was built with instrumentation and with SymQEMU it will be the emulator binary (e.g. `x86_64-linux-user/symqemu-x86_64`), followed by your uninstrumented target binary and arguments.
@ -152,6 +170,7 @@ You can use the [`CommandExecutor`](https://docs.rs/libafl/0.6.0/libafl/executor
When configuring the command, make sure you set the `SYMCC_INPUT_FILE` environment variable to the input file path if your target reads input from a file (instead of standard input). When configuring the command, make sure you set the `SYMCC_INPUT_FILE` environment variable to the input file path if your target reads input from a file (instead of standard input).
#### Serialization and Solving #### Serialization and Solving
While it is perfectly possible to build a custom runtime that also performs the solving step of hybrid fuzzing in the context of the target process, the intended use of the LibAFL concolic tracing support is to serialize the (filtered and pre-processed) branch conditions using the [`TracingRuntime`](https://docs.rs/symcc_runtime/0.1/symcc_runtime/tracing/struct.TracingRuntime.html). While it is perfectly possible to build a custom runtime that also performs the solving step of hybrid fuzzing in the context of the target process, the intended use of the LibAFL concolic tracing support is to serialize the (filtered and pre-processed) branch conditions using the [`TracingRuntime`](https://docs.rs/symcc_runtime/0.1/symcc_runtime/tracing/struct.TracingRuntime.html).
This serialized representation can be deserialized in the fuzzer process for solving using a [`ConcolicObserver`](https://docs.rs/libafl/0.6.0/libafl/observers/concolic/struct.ConcolicObserver.html) wrapped in a [`ConcolicTracingStage`](https://docs.rs/libafl/0.6.0/libafl/stages/concolic/struct.ConcolicTracingStage.html), which will attach a [`ConcolicMetadata`](https://docs.rs/libafl/0.6.0/libafl/observers/concolic/struct.ConcolicMetadata.html) to every [`TestCase`](https://docs.rs/libafl/0.6.0/libafl/corpus/testcase/struct.Testcase.html). This serialized representation can be deserialized in the fuzzer process for solving using a [`ConcolicObserver`](https://docs.rs/libafl/0.6.0/libafl/observers/concolic/struct.ConcolicObserver.html) wrapped in a [`ConcolicTracingStage`](https://docs.rs/libafl/0.6.0/libafl/stages/concolic/struct.ConcolicTracingStage.html), which will attach a [`ConcolicMetadata`](https://docs.rs/libafl/0.6.0/libafl/observers/concolic/struct.ConcolicMetadata.html) to every [`TestCase`](https://docs.rs/libafl/0.6.0/libafl/corpus/testcase/struct.Testcase.html).
@ -161,5 +180,5 @@ The [`SimpleConcolicMutationalStage`](https://docs.rs/libafl/0.6.0//libafl/stage
It will attempt to solve all branches, like the original simple backend from SymCC, using Z3. It will attempt to solve all branches, like the original simple backend from SymCC, using Z3.
### Example ### Example
The example fuzzer shows how to use the [`ConcolicTracingStage` together with the `SimpleConcolicMutationalStage`](https://github.com/AFLplusplus/LibAFL/blob/main/fuzzers/libfuzzer_stb_image_concolic/fuzzer/src/main.rs#L203) to build a basic hybrid fuzzer.
The example fuzzer shows how to use the [`ConcolicTracingStage` together with the `SimpleConcolicMutationalStage`](https://github.com/AFLplusplus/LibAFL/blob/main/fuzzers/libfuzzer_stb_image_concolic/fuzzer/src/main.rs#L222) to build a basic hybrid fuzzer.


@ -1,36 +1,43 @@
# Binary-only Fuzzing with Frida # Binary-only Fuzzing with Frida
LibAFL supports binary-only fuzzing with Frida; the dynamic instrumentation tool.
In this section, we'll talk about some of the components in fuzzing with `libafl_frida`. LibAFL supports different instrumentation engines for binary-only fuzzing.
You can take a look at a working example in our `fuzzers/frida_libpng` folder. A potent cross-platform (Windows, MacOS, Android, Linux, iOS) option for binary-only fuzzing is Frida; the dynamic instrumentation tool.
In this section, we will talk about the components in fuzzing with `libafl_frida`.
You can take a look at a working example in our [`fuzzers/frida_libpng`](https://github.com/AFLplusplus/LibAFL/tree/main/fuzzers/frida_libpng) folder for Linux, and [`fuzzers/frida_gdiplus`](https://github.com/AFLplusplus/LibAFL/tree/main/fuzzers/frida_gdiplus) for Windows.
## Dependencies
# Dependencies
If you are on Linux or OSX, you'll need [libc++](https://libcxx.llvm.org/) for `libafl_frida` in addition to libafl's dependencies. If you are on Linux or OSX, you'll need [libc++](https://libcxx.llvm.org/) for `libafl_frida` in addition to libafl's dependencies.
If you are on Windows, you'll need to install llvm tools. If you are on Windows, you'll need to install llvm tools.
# Harness & Instrumentation ## Harness & Instrumentation
LibAFL uses Frida's [__Stalker__](https://frida.re/docs/stalker/) to trace the execution of your program and instrument your harness. LibAFL uses Frida's [__Stalker__](https://frida.re/docs/stalker/) to trace the execution of your program and instrument your harness.
Thus you have to compile your harness to a dynamic library. Frida instruments your PUT after dynamically loading it. Thus, you have to compile your harness to a dynamic library. Frida instruments your PUT after dynamically loading it.
For example in our `frida_libpng` example, we load the dynamic library and find the symbol to harness as follows: For example in our `frida_libpng` example, we load the dynamic library and find the symbol to harness as follows:
```rust
```rust,ignore
let lib = libloading::Library::new(module_name).unwrap(); let lib = libloading::Library::new(module_name).unwrap();
let target_func: libloading::Symbol< let target_func: libloading::Symbol<
unsafe extern "C" fn(data: *const u8, size: usize) -> i32, unsafe extern "C" fn(data: *const u8, size: usize) -> i32,
> = lib.get(symbol_name.as_bytes()).unwrap(); > = lib.get(symbol_name.as_bytes()).unwrap();
``` ```
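The symbol loaded above has a plain C signature, so calling it from Rust means passing a pointer and a length. The following is a minimal sketch of that calling convention only. `HarnessFn`, `demo_harness`, and `run_harness` are hypothetical names, and `demo_harness` is a stand-in defined in Rust rather than a symbol loaded from a real library.

```rust
/// The harness signature used in the snippet above.
type HarnessFn = unsafe extern "C" fn(data: *const u8, size: usize) -> i32;

/// Stand-in harness: returns 1 if the input starts with b'a', otherwise 0.
/// A real harness would come from the dynamically loaded library.
unsafe extern "C" fn demo_harness(data: *const u8, size: usize) -> i32 {
    if data.is_null() || size == 0 {
        return 0;
    }
    // The caller promises that `data` points to `size` readable bytes.
    let bytes = unsafe { std::slice::from_raw_parts(data, size) };
    i32::from(bytes[0] == b'a')
}

/// Call a harness with a Rust byte slice as input.
fn run_harness(target: HarnessFn, input: &[u8]) -> i32 {
    // The pointer and length come from a valid slice, so the harness
    // may read exactly `input.len()` bytes.
    unsafe { target(input.as_ptr(), input.len()) }
}
```

With a symbol obtained through `libloading`, the dereferenced symbol could be passed in place of `demo_harness`, assuming its type matches `HarnessFn`.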
## `FridaInstrumentationHelper` and Runtimes
# `FridaInstrumentationHelper` and Runtimes
To use functionalities that Frida offers, we'll first need to obtain `Gum` object by `Gum::obtain()`. To use functionalities that Frida offers, we'll first need to obtain `Gum` object by `Gum::obtain()`.
In LibAFL, We use struct `FridaInstrumentationHelper` to manage all the stuff related to Frida. `FridaInstrumentationHelper` is a key component that sets up the [__Transformer__](https://frida.re/docs/stalker/#transformer) that is used to to generate the instrumented code. It also initializes the `Runtimes` that offers various instrumentation. In LibAFL, we use the `FridaInstrumentationHelper` struct to manage frida-related state. `FridaInstrumentationHelper` is a key component that sets up the [__Transformer__](https://frida.re/docs/stalker/#transformer) that is used to generate the instrumented code. It also initializes the `Runtimes` that offer various instrumentation.
We have `CoverageRuntime` that has tracks the edge coverage, `AsanRuntime` for address sanitizer, `DrCovRuntime` that uses [__DrCov__](https://dynamorio.org/page_drcov.html) for coverage collection, and `CmpLogRuntime` for cmplog instrumentation. All these runtimes can be used by slotting them into `FridaInstrumentationHelper` We have `CoverageRuntime` that can track the edge coverage, `AsanRuntime` for address sanitizer, `DrCovRuntime` that uses [__DrCov__](https://dynamorio.org/page_drcov.html) for coverage collection (to be imported in coverage tools like Lighthouse, bncov, dragondance,...), and `CmpLogRuntime` for cmplog instrumentation.
All of these runtimes can be slotted into `FridaInstrumentationHelper` at build time.
Combined with any `Runtime` you'd like to use, you can initialize the `FridaInstrumentationHelper` like this: Combined with any `Runtime` you'd like to use, you can initialize the `FridaInstrumentationHelper` like this:
```rust
```rust,ignore
let gum = Gum::obtain(); let gum = Gum::obtain();
let frida_options = FridaOptions::parse_env_options(); let frida_options = FridaOptions::parse_env_options();
@ -44,17 +51,21 @@ Combined with any `Runtime` you'd like to use, you can initialize the `FridaInst
); );
``` ```
# Run the fuzzer ## Running the Fuzzer
After setting up the `FridaInstrumentationHelper`, you can obtain the pointer to the coverage map by calling `map_ptr_mut()`. After setting up the `FridaInstrumentationHelper`, you can obtain the pointer to the coverage map by calling `map_ptr_mut()`.
```rust
```rust,ignore
let edges_observer = HitcountsMapObserver::new(StdMapObserver::new_from_ptr( let edges_observer = HitcountsMapObserver::new(StdMapObserver::new_from_ptr(
"edges", "edges",
frida_helper.map_ptr_mut().unwrap(), frida_helper.map_ptr_mut().unwrap(),
MAP_SIZE, MAP_SIZE,
)); ));
``` ```
You can link this observer to `FridaInProcessExecutor`,
```rust You can then link this observer to `FridaInProcessExecutor` as follows:
```rust,ignore
let mut executor = FridaInProcessExecutor::new( let mut executor = FridaInProcessExecutor::new(
&gum, &gum,
InProcessExecutor::new( InProcessExecutor::new(
@ -71,4 +82,6 @@ You can link this observer to `FridaInProcessExecutor`,
&mut frida_helper, &mut frida_helper,
); );
``` ```
and finally you can run the fuzzer.
And, finally you can run the fuzzer.
See the `frida_` examples in [`./fuzzers`](https://github.com/AFLplusplus/LibAFL/tree/main/fuzzers/) for more information and, for linux or full-system, play around with `libafl_qemu`, another binary-only tracer.


@ -8,7 +8,8 @@ You can simply add LibAFL to your `Cargo.toml` file:
libafl = { path = "path/to/libafl/", default-features = false} libafl = { path = "path/to/libafl/", default-features = false}
``` ```
Then build your project e.g. for `aarch64-unknown-none` using Then build your project e.g. for `aarch64-unknown-none` using:
```sh ```sh
cargo build --no-default-features --target aarch64-unknown-none cargo build --no-default-features --target aarch64-unknown-none
``` ```
@ -18,18 +19,22 @@ cargo build --no-default-features --target aarch64-unknown-none
The minimum amount of input LibAFL needs for `no_std` is a monotonically increasing timestamp. The minimum amount of input LibAFL needs for `no_std` is a monotonically increasing timestamp.
For this, anywhere in your project you need to implement the `external_current_millis` function, which returns the current time in milliseconds. For this, anywhere in your project you need to implement the `external_current_millis` function, which returns the current time in milliseconds.
// Assume this a clock source from a custom stdlib, which you want to use, which returns current time in seconds.
```c ```c
// Assume this a clock source from a custom stdlib, which you want to use, which returns current time in seconds.
int my_real_seconds(void) int my_real_seconds(void)
{ {
return *CLOCK; return *CLOCK;
} }
``` ```
and here we use it in Rust. `external_current_millis` is then called from LibAFL.
Note that it needs to be `no_mangle` in order to get picked up by LibAFL at linktime. Here, we use it in Rust. `external_current_millis` is then called from LibAFL.
Note that it needs to be `no_mangle` in order to get picked up by LibAFL at linktime:
```rust,ignore ```rust,ignore
#[no_mangle] #[no_mangle]
pub extern "C" fn external_current_millis() -> u64 { pub extern "C" fn external_current_millis() -> u64 {
unsafe { my_real_seconds()*1000 } unsafe { my_real_seconds()*1000 }
} }
``` ```
See [./fuzzers/baby_no_std](https://github.com/AFLplusplus/LibAFL/tree/main/fuzzers/baby_no_std) for an example.
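The C clock above returns an `int`, while `external_current_millis` returns a `u64`. If `my_real_seconds` is declared in Rust as returning a C `int`, the multiplication needs an explicit conversion before it type-checks. A minimal sketch of that conversion, assuming the C `int` maps to `i32`; the `seconds_to_millis` helper is hypothetical and not part of LibAFL:

```rust
/// Convert a seconds reading from a C-style clock into milliseconds.
/// Negative readings are clamped to zero so the result never wraps around.
fn seconds_to_millis(seconds: i32) -> u64 {
    let seconds = seconds.max(0) as u64;
    // Saturate instead of overflowing; not reachable for i32 input,
    // but it keeps the helper safe if the input type is widened later.
    seconds.saturating_mul(1000)
}
```

Inside `external_current_millis`, the body could then read `seconds_to_millis(unsafe { my_real_seconds() })`.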


@ -1,20 +1,18 @@
# target setup # Snapshot Fuzzing in Nyx
## instruction
NYX supports both source-based and binary-only fuzzing. NYX supports both source-based and binary-only fuzzing.
Currently, NYX only supports [afl++](https://github.com/AFLplusplus/AFLplusplus)'s instruction. To install it, you can use `sudo apt install aflplusplus`. Or compile from the source: Currently, `libafl_nyx` only supports [afl++](https://github.com/AFLplusplus/AFLplusplus)'s instruction. To install it, you can use `sudo apt install aflplusplus`. Or compile from the source:
``` ```bash
git clone https://github.com/AFLplusplus/AFLplusplus git clone https://github.com/AFLplusplus/AFLplusplus
cd AFLplusplus cd AFLplusplus
make all # this will not compile afl's additional extension make all # this will not compile afl's additional extension
``` ```
Then you should compile the target with afl's compiler, like `afl-clang-fast` or `afl-clang-lto`: Then you should compile the target with the afl++ compiler wrapper:
``` ```bash
export CC=afl-clang-fast export CC=afl-clang-fast
export CXX=afl-clang-fast++ export CXX=afl-clang-fast++
# the following line depends on your target # the following line depends on your target
@ -22,20 +20,20 @@ export CXX=afl-clang-fast++
make make
``` ```
For binary-only fuzzing, nyx uses intel-PT(Intel® Processor Trace). You can find the supported CPU at https://www.intel.com/content/www/us/en/support/articles/000056730/processors.html. For binary-only fuzzing, Nyx uses intel-PT(Intel® Processor Trace). You can find the supported CPU at <https://www.intel.com/content/www/us/en/support/articles/000056730/processors.html>.
## prepare nyx working directory ## Preparing Nyx working directory
This step is used to pack the target into Nyx's kernel. Don't worry, we have a template shell script in our [example](https://github.com/AFLplusplus/LibAFL/blob/main/fuzzers/nyx_libxml2_parallel/setup_libxml2.sh): This step is used to pack the target into Nyx's kernel. Don't worry, we have a template shell script in our [example](https://github.com/AFLplusplus/LibAFL/blob/main/fuzzers/nyx_libxml2_parallel/setup_libxml2.sh):
the parameter's meaning is listed below: the parameter's meaning is listed below:
``` ```bash
git clone https://github.com/nyx-fuzz/packer git clone https://github.com/nyx-fuzz/packer
python3 "./packer/packer/nyx_packer.py" \ python3 "./packer/packer/nyx_packer.py" \
./libxml2/xmllint \ # your target binary ./libxml2/xmllint \ # your target binary
/tmp/nyx_libxml2 \ # the nyx work directory /tmp/nyx_libxml2 \ # the nyx work directory
afl \ # instrction type afl \ # instruction type
instrumentation \ instrumentation \
-args "/tmp/input" \ # the args of the program, means that we will run `xmllint /tmp/input` in each run. -args "/tmp/input" \ # the args of the program, means that we will run `xmllint /tmp/input` in each run.
-file "/tmp/input" \ # the input will be generated in `/tmp/input`. If no `--file`, then input will be passed through stdin -file "/tmp/input" \ # the input will be generated in `/tmp/input`. If no `--file`, then input will be passed through stdin
@ -45,28 +43,26 @@ python3 "./packer/packer/nyx_packer.py" \
Then, you can generate the config file: Then, you can generate the config file:
``` ```bash
python3 ./packer/packer/nyx_config_gen.py /tmp/nyx_libxml2/ Kernel || exit python3 ./packer/packer/nyx_config_gen.py /tmp/nyx_libxml2/ Kernel || exit
``` ```
# LibAFL's code ## Standalone fuzzing
## standalone fuzzing
For the [example fuzzer](https://github.com/AFLplusplus/LibAFL/blob/main/fuzzers/nyx_libxml2_standalone/src/main.rs), you first need to run `./setup_libxml2.sh`. It will prepare your target and create your nyx work directory in `/tmp/libxml2`. After that, you can start writing your code. For the [example fuzzer](https://github.com/AFLplusplus/LibAFL/blob/main/fuzzers/nyx_libxml2_standalone/src/main.rs), you first need to run `./setup_libxml2.sh`. It will prepare your target and create your nyx work directory in `/tmp/libxml2`. After that, you can start writing your code.
First to create `Nyxhelper`: First, to create `Nyxhelper`:
```rust ```rust,ignore
let share_dir = Path::new("/tmp/nyx_libxml2/"); let share_dir = Path::new("/tmp/nyx_libxml2/");
let cpu_id = 0; // use first cpu let cpu_id = 0; // use first cpu
let parallel_mode = false; // close parallel_mode let parallel_mode = false; // close parallel_mode
let mut helper = NyxHelper::new(share_dir, cpu_id, true, parallel_mode, None).unwrap();// we don't need last parmeter in standalone mode, so just use None let mut helper = NyxHelper::new(share_dir, cpu_id, true, parallel_mode, None).unwrap(); // we don't need to set the last parameter in standalone mode, so we just use None here
``` ```
Then fetch `trace_bits`, create observer and `NyxExecutor`: Then, fetch `trace_bits`, create an observer and the `NyxExecutor`:
```rust ```rust,ignore
let trace_bits = unsafe { std::slice::from_raw_parts_mut(helper.trace_bits, helper.map_size) }; let trace_bits = unsafe { std::slice::from_raw_parts_mut(helper.trace_bits, helper.map_size) };
let observer = StdMapObserver::new("trace", trace_bits); let observer = StdMapObserver::new("trace", trace_bits);
let mut executor = NyxExecutor::new(&mut helper, tuple_list!(observer)).unwrap(); let mut executor = NyxExecutor::new(&mut helper, tuple_list!(observer)).unwrap();
@ -74,19 +70,19 @@ let mut executor = NyxExecutor::new(&mut helper, tuple_list!(observer)).unwrap()
Finally, use them as normal and pass them into `fuzzer.fuzz_loop(&mut stages, &mut executor, &mut state, &mut mgr)` to start fuzzing. Finally, use them as normal and pass them into `fuzzer.fuzz_loop(&mut stages, &mut executor, &mut state, &mut mgr)` to start fuzzing.
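The `trace_bits` step above turns a raw pointer and a length into a mutable slice. The sketch below shows that pattern in isolation, with a `Vec` standing in for the coverage map. `map_as_slice` and `count_hit_entries` are hypothetical helpers and not part of `libafl_nyx`.

```rust
/// View `len` bytes starting at `ptr` as a mutable slice.
///
/// # Safety
/// `ptr` must point to `len` valid, writable bytes that stay alive and are
/// not accessed through any other reference while the slice is in use.
unsafe fn map_as_slice<'a>(ptr: *mut u8, len: usize) -> &'a mut [u8] {
    unsafe { std::slice::from_raw_parts_mut(ptr, len) }
}

/// Count how many map entries were hit (are non-zero).
fn count_hit_entries(map: &[u8]) -> usize {
    map.iter().filter(|&&entry| entry != 0).count()
}
```

In the fuzzer, the pointer and length would come from `helper.trace_bits` and `helper.map_size` instead of a local buffer.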
## parallel fuzzing ## Parallel fuzzing
For the [example fuzzer](https://github.com/AFLplusplus/LibAFL/blob/main/fuzzers/nyx_libxml2_parallel/src/main.rs), you first need to run `./setup_libxml2.sh` as described before. For the [example fuzzer](https://github.com/AFLplusplus/LibAFL/blob/main/fuzzers/nyx_libxml2_parallel/src/main.rs), you first need to run `./setup_libxml2.sh` as described before.
parallel fuzzing relies on `Launcher`, so spawn logic should be written in the scope of anonymous function `run_client`: Parallel fuzzing relies on [`Launcher`](../message_passing/spawn_instances.md), so spawn logic should be written in the scope of anonymous function `run_client`:
```rust ```rust,ignore
let mut run_client = |state: Option<_>, mut restarting_mgr, _core_id: usize{} let mut run_client = |state: Option<_>, mut restarting_mgr, _core_id: usize| {}
``` ```
In `run_client`, you need to create `NyxHelper` first: In `run_client`, you need to create `NyxHelper` first:
```rust ```rust,ignore
let share_dir = Path::new("/tmp/nyx_libxml2/"); let share_dir = Path::new("/tmp/nyx_libxml2/");
let cpu_id = _core_id as u32; let cpu_id = _core_id as u32;
let parallel_mode = true; let parallel_mode = true;
@ -102,7 +98,7 @@ let mut helper = NyxHelper::new(
Then you can fetch the trace_bits and create an observer and `NyxExecutor` Then you can fetch the trace_bits and create an observer and `NyxExecutor`
```rust ```rust,ignore
let trace_bits = let trace_bits =
unsafe { std::slice::from_raw_parts_mut(helper.trace_bits, helper.map_size) }; unsafe { std::slice::from_raw_parts_mut(helper.trace_bits, helper.map_size) };
let observer = StdMapObserver::new("trace", trace_bits); let observer = StdMapObserver::new("trace", trace_bits);
@ -111,7 +107,7 @@ let mut executor = NyxExecutor::new(&mut helper, tuple_list!(observer)).unwrap()
Finally, open a `Launcher` as normal to start fuzzing: Finally, open a `Launcher` as normal to start fuzzing:
```rust ```rust,ignore
match Launcher::builder() match Launcher::builder()
.shmem_provider(shmem_provider) .shmem_provider(shmem_provider)
.configuration(EventConfig::from_name("default")) .configuration(EventConfig::from_name("default"))

View File

@ -378,4 +378,4 @@ Bye!
As you can see, after the panic message, the `objectives` count of the log increased by one and you will find the crashing input in `crashes/`. As you can see, after the panic message, the `objectives` count of the log increased by one and you will find the crashing input in `crashes/`.
The complete code can be found in `./fuzzers/baby_fuzzer`. The complete code can be found in [`./fuzzers/baby_fuzzer`](https://github.com/AFLplusplus/LibAFL/tree/main/fuzzers/baby_fuzzer) alongside other `baby_` fuzzers.

View File

@ -1,4 +1,5 @@
# More Examples # More Examples
Examples can be found under `./fuzzers`. Examples can be found under `./fuzzers`.
|fuzzer name|usage| |fuzzer name|usage|

View File

@ -27,6 +27,7 @@ When you want to execute the harness as fast as possible, you will most probably
Next, we'll take a look at the `ForkserverExecutor`. In this case, it is `afl-cc` (from AFLplusplus/AFLplusplus) that compiles the harness code, and therefore, we can't use `EDGES_MAP` anymore. Fortunately, we have [_a way_](https://github.com/AFLplusplus/AFLplusplus/blob/2e15661f184c77ac1fbb6f868c894e946cbb7f17/instrumentation/afl-compiler-rt.o.c#L270) to tell the forkserver which map to record the coverage in. Next, we'll take a look at the `ForkserverExecutor`. In this case, it is `afl-cc` (from AFLplusplus/AFLplusplus) that compiles the harness code, and therefore, we can't use `EDGES_MAP` anymore. Fortunately, we have [_a way_](https://github.com/AFLplusplus/AFLplusplus/blob/2e15661f184c77ac1fbb6f868c894e946cbb7f17/instrumentation/afl-compiler-rt.o.c#L270) to tell the forkserver which map to record the coverage in.
As you can see from the forkserver example, As you can see from the forkserver example,
```rust,ignore ```rust,ignore
//Coverage map shared between observer and executor //Coverage map shared between observer and executor
let mut shmem = StdShMemProvider::new().unwrap().new_shmem(MAP_SIZE).unwrap(); let mut shmem = StdShMemProvider::new().unwrap().new_shmem(MAP_SIZE).unwrap();
@ -34,6 +35,7 @@ let mut shmem = StdShMemProvider::new().unwrap().new_shmem(MAP_SIZE).unwrap();
shmem.write_to_env("__AFL_SHM_ID").unwrap(); shmem.write_to_env("__AFL_SHM_ID").unwrap();
let mut shmem_buf = shmem.as_mut_slice(); let mut shmem_buf = shmem.as_mut_slice();
``` ```
Here we make a shared memory region, `shmem`, and write it to the environment variable `__AFL_SHM_ID`. Then the instrumented binary, or the forkserver, finds this shared memory region (from the aforementioned env var) to record its coverage. On your fuzzer side, you can pass this shmem map to your `Observer` to obtain coverage feedback in combination with any `Feedback`. Here we make a shared memory region, `shmem`, and write it to the environment variable `__AFL_SHM_ID`. Then the instrumented binary, or the forkserver, finds this shared memory region (from the aforementioned env var) to record its coverage. On your fuzzer side, you can pass this shmem map to your `Observer` to obtain coverage feedback in combination with any `Feedback`.
Another feature of the `ForkserverExecutor` to mention is the shared memory testcases. In normal cases, the mutated input is passed between the forkserver and the instrumented binary via `.cur_input` file. You can improve your forkserver fuzzer's performance by passing the input with shared memory. Another feature of the `ForkserverExecutor` to mention is the shared memory testcases. In normal cases, the mutated input is passed between the forkserver and the instrumented binary via `.cur_input` file. You can improve your forkserver fuzzer's performance by passing the input with shared memory.
@ -43,6 +45,7 @@ See AFL++'s [_documentation_](https://github.com/AFLplusplus/AFLplusplus/blob/st
It is very simple: when you call `ForkserverExecutor::new()` with `use_shmem_testcase` set to true, the `ForkserverExecutor` sets things up and your harness can just fetch the input from `__AFL_FUZZ_TESTCASE_BUF`. It is very simple: when you call `ForkserverExecutor::new()` with `use_shmem_testcase` set to true, the `ForkserverExecutor` sets things up and your harness can just fetch the input from `__AFL_FUZZ_TESTCASE_BUF`.
## InprocessForkExecutor ## InprocessForkExecutor
Finally, we'll talk about the `InProcessForkExecutor`. Finally, we'll talk about the `InProcessForkExecutor`.
`InProcessForkExecutor` has only one difference from `InProcessExecutor`: it forks before running the harness, and that's it. `InProcessForkExecutor` has only one difference from `InProcessExecutor`: it forks before running the harness, and that's it.

View File

@ -3,7 +3,7 @@
The Feedback is an entity that classifies the outcome of an execution of the program under test as interesting or not. The Feedback is an entity that classifies the outcome of an execution of the program under test as interesting or not.
Typically, if an execution is interesting, the corresponding input used to feed the target program is added to a corpus. Typically, if an execution is interesting, the corresponding input used to feed the target program is added to a corpus.
Most of the times, the notion of Feedback is deeply linked to the Observer, but they are different concepts. Most of the time, the notion of Feedback is deeply linked to the Observer, but they are different concepts.
The Feedback, in most of the cases, processes the information reported by one or more observers to decide if the execution is interesting. The Feedback, in most of the cases, processes the information reported by one or more observers to decide if the execution is interesting.
The concept of "interestingness" is abstract, but typically it is related to a novelty search (i.e. interesting inputs are those that reach a previously unseen edge in the control flow graph). The concept of "interestingness" is abstract, but typically it is related to a novelty search (i.e. interesting inputs are those that reach a previously unseen edge in the control flow graph).
@ -13,6 +13,14 @@ As an example, given an Observer that reports all the sizes of memory allocation
In terms of code, the library offers the [`Feedback`](https://docs.rs/libafl/0/libafl/feedbacks/trait.Feedback.html) and the [`FeedbackState`](https://docs.rs/libafl/0/libafl/feedbacks/trait.FeedbackState.html) traits. In terms of code, the library offers the [`Feedback`](https://docs.rs/libafl/0/libafl/feedbacks/trait.Feedback.html) and the [`FeedbackState`](https://docs.rs/libafl/0/libafl/feedbacks/trait.FeedbackState.html) traits.
The first is used to implement functors that, given the state of the observers from the last execution, tell if the execution was interesting. The second is tied to `Feedback` and it is the state of the data that the feedback wants to persist in the fuzzer's state, for instance the cumulative map holding all the edges seen so far in the case of a feedback based on edge coverage. The first is used to implement functors that, given the state of the observers from the last execution, tell if the execution was interesting. The second is tied to `Feedback` and it is the state of the data that the feedback wants to persist in the fuzzer's state, for instance the cumulative map holding all the edges seen so far in the case of a feedback based on edge coverage.
Multiple Feedbacks can be combined into boolean formula, considering for instance an execution as interesting if it triggers new code paths or execute in less time compared to the average execution time using [`feedback_or`](https://docs.rs/libafl/0/libafl/macro.feedback_or.html). Multiple Feedbacks can be combined into a boolean formula, considering for instance an execution as interesting if it triggers new code paths or executes in less time compared to the average execution time, using [`feedback_or`](https://docs.rs/libafl/*/libafl/macro.feedback_or.html).
TODO objective feedbacks and fast feedback logic operators On top, logic operators like `feedback_or` and `feedback_and` have a `_fast` option (e.g., `feedback_or_fast`), where the second feedback will not be evaluated if the first part already answers the `interestingness` question, to save precious performance.
Using `feedback_and_fast` in combination with [`ConstFeedback`](https://docs.rs/libafl/*/libafl/feedbacks/enum.ConstFeedback.html#method.new), certain feedbacks can be disabled dynamically.
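The short-circuit behavior of the `_fast` operators described above can be illustrated with a minimal, self-contained sketch. It does not use LibAFL's real `Feedback` trait or macros: `CountingFeedback`, `or_eager`, `or_fast`, and `and_fast` are hypothetical names invented for this illustration, and the sketch assumes that the non-fast variants always evaluate both sides.

```rust
/// Hypothetical, simplified stand-in for a feedback (NOT LibAFL's `Feedback` trait):
/// it returns a fixed answer and counts how often it was evaluated.
struct CountingFeedback {
    answer: bool,
    evaluations: usize,
}

impl CountingFeedback {
    fn new(answer: bool) -> Self {
        Self { answer, evaluations: 0 }
    }

    fn is_interesting(&mut self) -> bool {
        self.evaluations += 1;
        self.answer
    }
}

/// Eager OR (assumed behavior of the non-fast variant): both sides are always evaluated.
fn or_eager(a: &mut CountingFeedback, b: &mut CountingFeedback) -> bool {
    let first = a.is_interesting();
    let second = b.is_interesting();
    first || second
}

/// Fast OR: the second feedback is skipped if the first already says "interesting".
fn or_fast(a: &mut CountingFeedback, b: &mut CountingFeedback) -> bool {
    a.is_interesting() || b.is_interesting()
}

/// Fast AND: the second feedback is skipped if the first already says "not interesting".
/// A constant `false` on the left therefore disables the right-hand feedback entirely.
fn and_fast(a: &mut CountingFeedback, b: &mut CountingFeedback) -> bool {
    a.is_interesting() && b.is_interesting()
}

fn main() {
    let mut first = CountingFeedback::new(true);
    let mut second = CountingFeedback::new(false);
    let interesting = or_fast(&mut first, &mut second);
    println!(
        "or_fast: interesting={interesting}, second evaluated {} times",
        second.evaluations
    );

    let mut gate = CountingFeedback::new(false);
    let mut expensive = CountingFeedback::new(true);
    let interesting = and_fast(&mut gate, &mut expensive);
    println!(
        "and_fast: interesting={interesting}, expensive evaluated {} times",
        expensive.evaluations
    );
}
```

The `and_fast` case mirrors the idea of gating a feedback behind a constant one: as long as the gate answers `false`, the gated feedback is never evaluated.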
## Objectives
While feedbacks are commonly used to decide if an [`Input`](https://docs.rs/libafl/*/libafl/inputs/trait.Input.html) should be kept for future mutations, they serve a double-purpose, as so-called `Objective Feedbacks`.
In this case, the `interestingness` of a feedback indicates whether an `Objective` has been hit.
Commonly, these would be a crash or a timeout, but they can also be used to find specific parts of the program, for sanitization, or a differential fuzzing success.

View File

@ -1,6 +1,6 @@
# Input # Input
Formally, the input of a program is the data taken from external sources that affect the program behaviour. Formally, the input of a program is the data taken from external sources that affect the program behavior.
In our model of an abstract fuzzer, we define the Input as the internal representation of the program input (or a part of it). In our model of an abstract fuzzer, we define the Input as the internal representation of the program input (or a part of it).
@ -10,4 +10,6 @@ But it is not always the case. A program can expect inputs that are not byte arr
In case of a grammar fuzzer for instance, the Input is generally an Abstract Syntax Tree because it is a data structure that can be easily manipulated while maintaining the validity, but the program expects a byte array as input, so just before the execution, the tree is serialized to a sequence of bytes. In case of a grammar fuzzer for instance, the Input is generally an Abstract Syntax Tree because it is a data structure that can be easily manipulated while maintaining the validity, but the program expects a byte array as input, so just before the execution, the tree is serialized to a sequence of bytes.
In the Rust code, an [`Input`](https://docs.rs/libafl/0/libafl/inputs/trait.Input.html) is a trait that can be implemented only by structures that are serializable and have only owned data as fields. In the Rust code, an [`Input`](https://docs.rs/libafl/*/libafl/inputs/trait.Input.html) is a trait that can be implemented only by structures that are serializable and have only owned data as fields.
While most fuzzers use a normal `BytesInput`, more advanced inputs include special inputs for grammar fuzzing ([GramatronInput](https://docs.rs/libafl/*/libafl/inputs/gramatron/struct.GramatronInput.html) or `NautilusInput` on nightly), as well as the token-level [EncodedInput](https://docs.rs/libafl/*/libafl/inputs/encoded/struct.EncodedInput.html).
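To make the grammar-fuzzing case concrete, here is a minimal sketch of an input that is kept as a tree internally and only serialized to bytes right before execution. It is not one of LibAFL's input types and does not implement the `Input` trait: the `Expr` enum and its methods are hypothetical, and the toy arithmetic grammar is an assumption made purely for illustration.

```rust
/// Hypothetical AST for a toy arithmetic grammar (not a LibAFL type).
/// It only has owned fields, in line with the requirement stated above.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(u32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Render the tree as text in the toy grammar.
    fn render(&self) -> String {
        match self {
            Expr::Num(n) => n.to_string(),
            Expr::Add(l, r) => format!("({}+{})", l.render(), r.render()),
            Expr::Mul(l, r) => format!("({}*{})", l.render(), r.render()),
        }
    }

    /// Serialize the tree to the byte sequence the target program expects.
    fn to_bytes(&self) -> Vec<u8> {
        self.render().into_bytes()
    }

    /// A structure-preserving mutation: turn every addition into a multiplication.
    /// The result is still a valid sentence of the grammar.
    fn add_to_mul(&self) -> Expr {
        match self {
            Expr::Num(n) => Expr::Num(*n),
            Expr::Add(l, r) | Expr::Mul(l, r) => {
                Expr::Mul(Box::new(l.add_to_mul()), Box::new(r.add_to_mul()))
            }
        }
    }
}

fn main() {
    let input = Expr::Add(
        Box::new(Expr::Num(1)),
        Box::new(Expr::Mul(Box::new(Expr::Num(2)), Box::new(Expr::Num(3)))),
    );
    let mutated = input.add_to_mul();
    println!("{}", String::from_utf8_lossy(&input.to_bytes()));
    println!("{}", String::from_utf8_lossy(&mutated.to_bytes()));
}
```

Mutating the tree instead of the serialized bytes is what keeps the output syntactically valid, which is the motivation given above for using an Abstract Syntax Tree as the Input.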

View File

@ -1,10 +0,0 @@
# Launcher
Launcher is used to launch multiple fuzzer instances in parallel in one click. On `Unix` systems, Launcher will use `fork` if the `fork` feature is enabled. Else, it will start subsequent nodes with the same command line, and will set special `env` variables accordingly.
To use launcher, first you need to write an anonymous function `let mut run_client = |state: Option<_>, mut mgr, _core_id|{}`, which uses three parameters to create individual fuzzer. Then you can specify the `shmem_provider`,`broker_port`,`monitor`,`cores` and other stuff through `Launcher::builder()`:
1. To connect multiple nodes together via TCP, you can use the `remote_broker_addr`. this requires the `llmp_bind_public` compile-time feature for `LibAFL`.
2. To use multiple launchers for individual configurations, you can set `spawn_broker` to `false` on all but one.
3. Launcher will not select the cores automatically, so you need to specify the `cores` that you want.
For more examples, you can check out `qemu_launcher` and `libfuzzer_libpng_launcher` in `./fuzzers/`.

View File

@ -2,8 +2,8 @@
The Mutator is an entity that takes one or more Inputs and generates a new derived one. The Mutator is an entity that takes one or more Inputs and generates a new derived one.
Mutators can be composed and they are generally linked to a specific Input type. Mutators can be composed, and they are generally linked to a specific Input type.
There can be, for instance, a Mutator that applies more than a single type of mutation on the input. Consider a generic Mutator for a byte stream: bit flip is just one of the possible mutations, but not the only one; there is also, for instance, the random replacement of a byte or the copy of a chunk. There can be, for instance, a Mutator that applies more than a single type of mutation on the input. Consider a generic Mutator for a byte stream: bit flip is just one of the possible mutations, but not the only one; there is also, for instance, the random replacement of a byte or the copy of a chunk.
In LibAFL, [`Mutator`](https://docs.rs/libafl/0/libafl/mutators/trait.Mutator.html) is a trait. In LibAFL, [`Mutator`](https://docs.rs/libafl/*/libafl/mutators/trait.Mutator.html) is a trait.
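The composition idea can be sketched with a few lines of standalone Rust. This is not LibAFL's `Mutator` trait, whose signature differs: `ByteMutator`, `BitFlip`, `ByteReplace`, `ChunkCopy`, `Stacked`, and the small `XorShift` generator are all hypothetical names introduced for this illustration only.

```rust
/// Tiny deterministic xorshift generator, so the sketch needs no external crates.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Pseudo-random index in `0..n` (`n` must be non-zero).
    fn below(&mut self, n: usize) -> usize {
        (self.next() % n as u64) as usize
    }
}

/// Hypothetical, simplified mutator interface for byte inputs.
trait ByteMutator {
    fn mutate(&self, input: &mut Vec<u8>, rng: &mut XorShift);
}

/// Flips a single random bit.
struct BitFlip;
impl ByteMutator for BitFlip {
    fn mutate(&self, input: &mut Vec<u8>, rng: &mut XorShift) {
        if input.is_empty() {
            return;
        }
        let idx = rng.below(input.len());
        let bit = rng.below(8);
        input[idx] ^= 1u8 << bit;
    }
}

/// Replaces a random byte with a random value.
struct ByteReplace;
impl ByteMutator for ByteReplace {
    fn mutate(&self, input: &mut Vec<u8>, rng: &mut XorShift) {
        if input.is_empty() {
            return;
        }
        let idx = rng.below(input.len());
        input[idx] = rng.next() as u8;
    }
}

/// Copies a random chunk of the input over another position.
struct ChunkCopy;
impl ByteMutator for ChunkCopy {
    fn mutate(&self, input: &mut Vec<u8>, rng: &mut XorShift) {
        if input.len() < 2 {
            return;
        }
        let len = 1 + rng.below(input.len() / 2);
        let src = rng.below(input.len() - len + 1);
        let dst = rng.below(input.len() - len + 1);
        input.copy_within(src..src + len, dst);
    }
}

/// A composed mutator: applies `rounds` randomly chosen sub-mutators in a row.
struct Stacked {
    mutators: Vec<Box<dyn ByteMutator>>,
    rounds: usize,
}
impl ByteMutator for Stacked {
    fn mutate(&self, input: &mut Vec<u8>, rng: &mut XorShift) {
        if self.mutators.is_empty() {
            return;
        }
        for _ in 0..self.rounds {
            let pick = rng.below(self.mutators.len());
            self.mutators[pick].mutate(input, rng);
        }
    }
}

fn main() {
    let mut rng = XorShift(0x1234_5678);
    let stacked = Stacked {
        mutators: vec![Box::new(BitFlip), Box::new(ByteReplace), Box::new(ChunkCopy)],
        rounds: 4,
    };
    let mut input = b"hello fuzzer".to_vec();
    stacked.mutate(&mut input, &mut rng);
    println!("{input:?}");
}
```

Because `Stacked` implements the same trait as its parts, composed mutators can themselves be nested, which is the sense in which Mutators compose.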

View File

@ -1,12 +1,14 @@
# Observer # Observer
An Observer, or Observation Channel, is an entity that provides an information observed during the execution of the program under test to the fuzzer. An Observer is an entity that provides information observed during the execution of the program under test to the fuzzer.
The information contained in the Observer is not preserved across executions. The information contained in the Observer is not preserved across executions, but it may be serialized and passed on to other nodes if an `Input` is considered `interesting`, and added to the `Corpus`.
As an example, the coverage shared map filled during the execution to report the executed edges used by fuzzers such as AFL and HonggFuzz can be considered an Observation Channel. As an example, the coverage map, filled during the execution to report the executed edges used by fuzzers such as AFL and `HonggFuzz`, can be considered an observation. Another `Observer` can be the time spent executing a run, the program output, or a more advanced observation, like the maximum stack depth at runtime.
This information is not preserved across runs and it is an observation of a dynamic property of the program. This information is not preserved across runs, and it is an observation of a dynamic property of the program.
In terms of code, in the library this entity is described by the [`Observer`](https://docs.rs/libafl/0/libafl/observers/trait.Observer.html) trait. In terms of code, in the library this entity is described by the [`Observer`](https://docs.rs/libafl/0/libafl/observers/trait.Observer.html) trait.
In addition to holding the volatile data connected with the last execution of the target, the structures implementing this trait can define some execution hooks that are executed before and after each fuzz case. In this hooks, the observer can modify the fuzzer's state. In addition to holding the volatile data connected with the last execution of the target, the structures implementing this trait can define some execution hooks that are executed before and after each fuzz case. In these hooks, the observer can modify the fuzzer's state.
The fuzzer will act based on these observers through a [`Feedback`](./feedback.md), which reduces the observation to the decision whether a testcase is `interesting` for the fuzzer or not.
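The split between volatile observations and persistent feedback state can be sketched as follows. This is a simplified model, not LibAFL's `Observer` or `Feedback` traits: the trait and struct names, the 16-entry map, and the toy `harness` are all assumptions made for this illustration.

```rust
/// Hypothetical, simplified observer interface with pre/post execution hooks.
trait Observer {
    fn pre_exec(&mut self);
    fn post_exec(&mut self);
}

/// Observes which (toy) edges were hit during the last run only.
struct EdgeMapObserver {
    map: [u8; 16],
}

impl EdgeMapObserver {
    fn new() -> Self {
        Self { map: [0; 16] }
    }
}

impl Observer for EdgeMapObserver {
    /// The observation is volatile: it is reset before every execution.
    fn pre_exec(&mut self) {
        self.map = [0; 16];
    }

    fn post_exec(&mut self) {}
}

/// Feedback that persists all edges seen so far and reports novelty.
struct NoveltyFeedback {
    seen: [bool; 16],
}

impl NoveltyFeedback {
    fn new() -> Self {
        Self { seen: [false; 16] }
    }

    /// Reduces the observation to a yes/no answer and updates the persistent state.
    fn is_interesting(&mut self, observer: &EdgeMapObserver) -> bool {
        let mut novel = false;
        for (seen, &hits) in self.seen.iter_mut().zip(observer.map.iter()) {
            if hits > 0 && !*seen {
                *seen = true;
                novel = true;
            }
        }
        novel
    }
}

/// Toy harness: every input byte "hits" the edge `byte % 16`.
fn harness(input: &[u8], map: &mut [u8; 16]) {
    for &byte in input {
        let edge = (byte % 16) as usize;
        map[edge] = map[edge].saturating_add(1);
    }
}

/// One fuzz case: run the hooks around the harness.
fn execute(input: &[u8], observer: &mut EdgeMapObserver) {
    observer.pre_exec();
    harness(input, &mut observer.map);
    observer.post_exec();
}

fn main() {
    let mut observer = EdgeMapObserver::new();
    let mut feedback = NoveltyFeedback::new();
    for input in [&b"ab"[..], &b"ab"[..], &b"c"[..]] {
        execute(input, &mut observer);
        println!("{:?} interesting: {}", input, feedback.is_interesting(&observer));
    }
}
```

Running the same input twice is only interesting the first time, because the novelty lives in the feedback's state while the observer starts from a clean map on every run.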

View File

@ -6,4 +6,4 @@ For instance, a Mutational Stage, given an input of the corpus, applies a Mutato
A stage can also be an analysis stage, for instance, the Colorization stage of Redqueen that aims to introduce more entropy in a testcase or the Trimming stage of AFL that aims to reduce the size of a testcase. A stage can also be an analysis stage, for instance, the Colorization stage of Redqueen that aims to introduce more entropy in a testcase or the Trimming stage of AFL that aims to reduce the size of a testcase.
There are several stages in the LibAFL codebases implementing the [`Stage`](https://docs.rs/libafl/0/libafl/stages/trait.Stage.html) trait. There are several stages in the LibAFL codebase implementing the [`Stage`](https://docs.rs/libafl/*/libafl/stages/trait.Stage.html) trait.

View File

@ -2,14 +2,14 @@
The LibAFL architecture is built around some entities to allow code reuse and low-cost abstractions. The LibAFL architecture is built around some entities to allow code reuse and low-cost abstractions.
Initially, we started thinking to implement LibAFL in an Object Oriented language, such C++. When we landed to Rust, we immediately changed our idea as we realized that, while Rust allows a sort of OOP pattern, we can build the library using a more sane approach like the one described in [this blogpost](https://kyren.github.io/2018/09/14/rustconf-talk.html) about game design in Rust. Initially, we started thinking about implementing LibAFL in a traditional Object-Oriented language, like C++. When we switched to Rust, we immediately changed our mind, as we realized that we could build the library using a more Rust-y approach, namely the one described in [this blogpost](https://kyren.github.io/2018/09/14/rustconf-talk.html) about game design in Rust.
The LibAFL code reuse mechanism is based on components rather than sub-classes, but there are still some OOP patterns in the library. The LibAFL code reuse mechanism is based on components, rather than sub-classes, but there are still some OOP patterns in the library.
Thinking about similar fuzzers, you can observe that most of the times the data structures that are modified are the ones related to testcases and the fuzzer global state. Thinking about similar fuzzers, you can observe that most of the time the data structures that are modified are the ones related to testcases and the fuzzer global state.
Beside the entities previously described, we introduce the [`Testcase`](https://docs.rs/libafl/0.6/libafl/corpus/testcase/struct.Testcase.html) and [`State`](https://docs.rs/libafl/0.6/libafl/state/struct.StdState.html) entities. The Testcase is a container for an Input stored in the Corpus and its metadata (so, in the implementation, the Corpus stores Testcases) and the State contains all the metadata that are evolved while running the fuzzer, Corpus included. Beside the entities previously described, we introduce the [`Testcase`](https://docs.rs/libafl/0.6/libafl/corpus/testcase/struct.Testcase.html) and [`State`](https://docs.rs/libafl/0.6/libafl/state/struct.StdState.html) entities. The Testcase is a container for an Input stored in the Corpus and its metadata (so, in the implementation, the Corpus stores Testcases) and the State contains all the metadata that are evolved while running the fuzzer, Corpus included.
The State, in the implementation, contains only owned objects that are serializable and it is serializable itself. Some fuzzers may want to serialize its state when pausing or just, when doing in-process fuzzing, serialize on crash and deserialize in the new process to continue to fuzz with all the metadata preserved. The State, in the implementation, contains only owned objects that are serializable, and it is serializable itself. Some fuzzers may want to serialize their state when pausing or, when doing in-process fuzzing, serialize on crash and deserialize in the new process to continue to fuzz with all the metadata preserved.
Additionally, we group the entities that are "actions", like the CorpusScheduler and the Feedbacks, in a common place, the [`Fuzzer'](https://docs.rs/libafl/0.6.1/libafl/fuzzer/struct.StdFuzzer.html). Additionally, we group the entities that are "actions", like the `CorpusScheduler` and the `Feedbacks`, in a common place, the [`Fuzzer`](https://docs.rs/libafl/*/libafl/fuzzer/struct.StdFuzzer.html).

View File

@ -1,6 +1,6 @@
# Metadata # Metadata
A metadata in LibAFL is a self contained structure that holds associated data to the State or to a Testcase. A metadata in LibAFL is a self-contained structure that holds associated data to the State or to a Testcase.
In terms of code, a metadata can be defined as a Rust struct registered in the SerdeAny register. In terms of code, a metadata can be defined as a Rust struct registered in the SerdeAny register.

View File

@ -1,6 +1,6 @@
# Migrating from libafl <0.9 to 0.9 # Migrating from LibAFL <0.9 to 0.9
Internal APIs of libafl have changed in version 0.9 to prefer associated types in cases where components were "fixed" to Internal APIs of LibAFL have changed in version 0.9 to prefer associated types in cases where components were "fixed" to
particular versions of other components. As a result, many existing custom components will not be compatible between particular versions of other components. As a result, many existing custom components will not be compatible between
versions prior to 0.9 and version 0.9. versions prior to 0.9 and version 0.9.
@ -11,9 +11,9 @@ result, everywhere where consistency across generic types was required to implem
and explicitly constrained at every point. This led to `impl`s which were at best difficult to debug and, at worst, and explicitly constrained at every point. This led to `impl`s which were at best difficult to debug and, at worst,
incorrect and caused confusing bugs for users. incorrect and caused confusing bugs for users.
For example, consider the MapCorpusMinimizer implementation (from <0.9) below: For example, consider the `MapCorpusMinimizer` implementation (from <0.9) below:
```rust ```rust,ignore
impl<E, I, O, S, TS> CorpusMinimizer<I, S> for MapCorpusMinimizer<E, I, O, S, TS> impl<E, I, O, S, TS> CorpusMinimizer<I, S> for MapCorpusMinimizer<E, I, O, S, TS>
where where
E: Copy + Hash + Eq, E: Copy + Hash + Eq,
@ -47,7 +47,7 @@ and that the input will necessarily be the same over every implementation for th
Below is the same code, but with the associated types changes (note that some generic names have changed): Below is the same code, but with the associated types changes (note that some generic names have changed):
```rust ```rust,ignore
impl<E, O, T, TS> CorpusMinimizer<E> for MapCorpusMinimizer<E, O, T, TS> impl<E, O, T, TS> CorpusMinimizer<E> for MapCorpusMinimizer<E, O, T, TS>
where where
E: UsesState, E: UsesState,
@ -82,8 +82,9 @@ are all present as associated types for `E`. Additionally, we don't even need to
## Scope ## Scope
You are affected by this change if: You are affected by this change if:
- You specified explicit generics for a type (e.g., `MaxMapFeedback::<_, (), _>::new(...)`)
- You implemented a custom component (e.g., `Mutator`, `Executor`, `State`, `Fuzzer`, `Feedback`, `Observer`, etc.) - You specified explicit generics for a type (e.g., `MaxMapFeedback::<_, (), _>::new(...)`)
- You implemented a custom component (e.g., `Mutator`, `Executor`, `State`, `Fuzzer`, `Feedback`, `Observer`, etc.)
If you did neither of these, congrats! You are likely unaffected by these changes. If you did neither of these, congrats! You are likely unaffected by these changes.
@ -105,9 +106,9 @@ In many scenarios, Input, Observers, and State generics have been moved into tra
straightforward to implement. In a majority of cases, you will have generics on your custom implementation or a fixed straightforward to implement. In a majority of cases, you will have generics on your custom implementation or a fixed
type to implement this with. Thankfully, Rust will let you know when you need to implement this type. type to implement this with. Thankfully, Rust will let you know when you need to implement this type.
As an example, InMemoryCorpus before 0.9 looked like this: As an example, `InMemoryCorpus` before 0.9 looked like this:
```rust ```rust,ignore
#[derive(Default, Serialize, Deserialize, Clone, Debug)] #[derive(Default, Serialize, Deserialize, Clone, Debug)]
#[serde(bound = "I: serde::de::DeserializeOwned")] #[serde(bound = "I: serde::de::DeserializeOwned")]
pub struct InMemoryCorpus<I> pub struct InMemoryCorpus<I>
@ -129,7 +130,7 @@ where
After 0.9, all `Corpus` implementations are required to implement `UsesInput` and `Corpus` no longer has a generic for After 0.9, all `Corpus` implementations are required to implement `UsesInput` and `Corpus` no longer has a generic for
the input type (as it is now provided by the UsesInput impl). The migrated implementation is shown below: the input type (as it is now provided by the UsesInput impl). The migrated implementation is shown below:
```rust ```rust,ignore
#[derive(Default, Serialize, Deserialize, Clone, Debug)] #[derive(Default, Serialize, Deserialize, Clone, Debug)]
#[serde(bound = "I: serde::de::DeserializeOwned")] #[serde(bound = "I: serde::de::DeserializeOwned")]
pub struct InMemoryCorpus<I> pub struct InMemoryCorpus<I>

View File

@ -8,6 +8,8 @@ A crate is an individual library in Rust's Cargo build system, that you can use
libafl = { version = "*" } libafl = { version = "*" }
``` ```
## Crate List
For LibAFL, each crate has its self-contained purpose, and the user may not need to use all of them in their project. For LibAFL, each crate has its self-contained purpose, and the user may not need to use all of them in their project.
Following the naming convention of the folders in the project's root, they are: Following the naming convention of the folders in the project's root, they are:
@ -19,13 +21,13 @@ This crate has a number of feature flags that enable and disable certain aspects
The features can be found in [LibAFL's `Cargo.toml`](https://github.com/AFLplusplus/LibAFL/blob/main/libafl/Cargo.toml) under "`[features]`", and are usually explained with comments there. The features can be found in [LibAFL's `Cargo.toml`](https://github.com/AFLplusplus/LibAFL/blob/main/libafl/Cargo.toml) under "`[features]`", and are usually explained with comments there.
Some features worthy of remark are: Some features worthy of remark are:
- `std` enables the parts of the code that use the Rust standard library. Without this flag, LibAFL is `no_std` compatible. This disables a range of features, but allows us to use LibAFL in embedded environments, read [the `no_std` section](../advanced_features/no_std/no_std.md) for further details. - `std` enables the parts of the code that use the Rust standard library. Without this flag, LibAFL is `no_std` compatible. This disables a range of features, but allows us to use LibAFL in embedded environments, read [the `no_std` section](../advanced_features/no_std.md) for further details.
- `derive` enables the usage of the `derive(...)` macros defined in libafl_derive from libafl. - `derive` enables the usage of the `derive(...)` macros defined in libafl_derive from libafl.
- `rand_trait` allows you to use LibAFL's very fast (*but insecure!*) random number generator wherever compatibility with Rust's [`rand` crate](https://crates.io/crates/rand) is needed. - `rand_trait` allows you to use LibAFL's very fast (*but insecure!*) random number generator wherever compatibility with Rust's [`rand` crate](https://crates.io/crates/rand) is needed.
- `llmp_bind_public` makes LibAFL's LLMP bind to a public TCP port, over which other fuzzer nodes can communicate with this instance. - `llmp_bind_public` makes LibAFL's LLMP bind to a public TCP port, over which other fuzzer nodes can communicate with this instance.
- `introspection` adds performance statistics to LibAFL. - `introspection` adds performance statistics to LibAFL.
You can chose the features by using `features = ["feature1", "feature2", ...]` for LibAFL in your `Cargo.toml`. You can choose the features by using `features = ["feature1", "feature2", ...]` for LibAFL in your `Cargo.toml`.
Out of this list, by default, `std`, `derive`, and `rand_trait` are already set. Out of this list, by default, `std`, `derive`, and `rand_trait` are already set.
You can choose to disable them by setting `default-features = false` in your `Cargo.toml`. You can choose to disable them by setting `default-features = false` in your `Cargo.toml`.
@ -64,10 +66,9 @@ To understand it deeper, look through the tutorials and examples.
### libafl_frida ### libafl_frida
This library bridges LibAFL with Frida as instrumentation backend. This library bridges LibAFL with Frida as instrumentation backend.
With this crate, you can instrument targets on Linux/macOS/Windows/Android for coverage collection. With this crate, you can instrument targets on Linux/macOS/Windows/Android for coverage collection.
Additionally, it supports CmpLog, and AddressSanitizer instrumentation and runtimes for aarch64. Additionally, it supports CmpLog, and AddressSanitizer instrumentation and runtimes for aarch64.
See further information, as well as usage instructions, [later in the book](../advanced_features/frida.md).
### libafl_qemu ### libafl_qemu
@ -75,3 +76,13 @@ This library bridges LibAFL with QEMU user-mode to fuzz ELF cross-platform binar
It works on Linux and can collect edge coverage without collisions! It works on Linux and can collect edge coverage without collisions!
It also supports a wide range of hooks and instrumentation options. It also supports a wide range of hooks and instrumentation options.
### libafl_nyx
[Nyx](https://nyx-fuzz.com/) is a KVM-based snapshot fuzzer. `libafl_nyx` adds these capabilities to LibAFL. There is a specific section explaining usage of libafl_nyx [later in the book](../advanced_features/nyx.md).
### libafl_concolic
Concolic fuzzing is the combination of fuzzing and a symbolic execution engine.
This can reach greater depth than normal fuzzing, and is exposed in this crate.
There is a specific section explaining usage of libafl_concolic [later in the book](../advanced_features/concolic.md).

View File

@ -1,5 +1,5 @@
# Getting Started # Getting Started
To get started with LibAFL, there are some initial steps to do. To get started with LibAFL, there are some initial steps to take.
In this chapter, we discuss how to download and build LibAFL, using Rust's `cargo` command. In this chapter, we discuss how to download and build LibAFL, using Rust's `cargo` command.
We also describe the structure of LibAFL's components, so-called crates, and the purpose of each individual crate. We also describe the structure of LibAFL's components, so-called crates, and the purpose of each individual crate.

View File

@ -22,7 +22,7 @@ $ git clone git@github.com:AFLplusplus/LibAFL.git
You can alternatively, on a UNIX-like machine, download a compressed archive and extract it with: You can alternatively, on a UNIX-like machine, download a compressed archive and extract it with:
```sh ```sh
$ wget https://github.com/AFLplusplus/LibAFL/archive/main.tar.gz $ wget -O LibAFL-main.tar.gz https://github.com/AFLplusplus/LibAFL/archive/main.tar.gz
$ tar xvf LibAFL-main.tar.gz $ tar xvf LibAFL-main.tar.gz
$ rm LibAFL-main.tar.gz $ rm LibAFL-main.tar.gz
$ ls LibAFL-main # this is the extracted folder $ ls LibAFL-main # this is the extracted folder

View File

@ -18,6 +18,7 @@ Be it a specific target, a particular instrumentation backend, or a custom mutat
LibAFL gives you many of the benefits of an off-the-shelf fuzzer, while being completely customizable. LibAFL gives you many of the benefits of an off-the-shelf fuzzer, while being completely customizable.
Some highlight features currently include: Some highlight features currently include:
- `multi platform`: LibAFL works pretty much anywhere you can find a Rust compiler for. We already used it on *Windows*, *Android*, *MacOS*, and *Linux*, on *x86_64*, *aarch64*, ... - `multi platform`: LibAFL works pretty much anywhere you can find a Rust compiler for. We already used it on *Windows*, *Android*, *MacOS*, and *Linux*, on *x86_64*, *aarch64*, ...
- `portable`: `LibAFL` can be built in `no_std` mode. - `portable`: `LibAFL` can be built in `no_std` mode.
This means it does not require a specific OS-dependent runtime to function. This means it does not require a specific OS-dependent runtime to function.

@ -5,6 +5,5 @@ The chapter describes how to run nodes with different configurations
in one fuzzing cluster. in one fuzzing cluster.
This allows, for example, a node compiled with ASAN, to know that it needs to rerun new testcases for a node without ASAN, while the same binary/configuration does not. This allows, for example, a node compiled with ASAN, to know that it needs to rerun new testcases for a node without ASAN, while the same binary/configuration does not.
> ## Under Construction! Fuzzers with the same configuration can exchange Observers for new testcases and reuse them without rerunning the input.
> This section is under construction. A different configuration indicates that only the raw input can be exchanged; it must be rerun on the other node to capture relevant observations.
> Please check back later (or open a PR)
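
The rule described above (same configuration: reuse the sender's observers; different configuration: rerun the raw input) can be sketched as follows. This is only an illustration under stated assumptions: `NodeConfig`, `IncomingTestcase`, `Action`, and `plan` are hypothetical names made up for this sketch and are not LibAFL types or functions.

```rust
/// Hypothetical stand-in for a node's configuration (not a LibAFL type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct NodeConfig {
    name: String,
}

/// Hypothetical new testcase as received from another node.
struct IncomingTestcase {
    /// The raw input bytes.
    input: Vec<u8>,
    /// Serialized observers recorded by the sender, if it shared them.
    observers: Option<Vec<u8>>,
    /// Configuration of the node that found this testcase.
    sender: NodeConfig,
}

/// What the receiving node should do with the testcase.
#[derive(Debug, PartialEq, Eq)]
enum Action<'a> {
    /// Same configuration: the sender's observations are valid here too.
    ReuseObservers(&'a [u8]),
    /// Different configuration (or no observers sent): rerun the raw input.
    RerunInput(&'a [u8]),
}

/// Decide between reusing observers and rerunning the input.
fn plan<'a>(own: &NodeConfig, tc: &'a IncomingTestcase) -> Action<'a> {
    match &tc.observers {
        Some(obs) if tc.sender == *own => Action::ReuseObservers(obs.as_slice()),
        _ => Action::RerunInput(tc.input.as_slice()),
    }
}
```

In this sketch, a node compiled with ASAN that receives a testcase from a non-ASAN node falls into the `RerunInput` branch, since the sender's observations were captured under a different configuration.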

@ -36,6 +36,7 @@ It then sends the information needed to map the newly-allocated page in connecte
Once the receiver maps the new page, flags it as safe for unmapping from the sending process (to avoid race conditions if we have more than a single EOP in a short time), and then continues to read from the new `ShMem`. Once the receiver maps the new page, flags it as safe for unmapping from the sending process (to avoid race conditions if we have more than a single EOP in a short time), and then continues to read from the new `ShMem`.
The schema for client's maps to the broker is as follows: The schema for client's maps to the broker is as follows:
```text ```text
[client0] [client1] ... [clientN] [client0] [client1] ... [clientN]
| | / | | /

@ -17,11 +17,13 @@ Launching nodes manually has the benefit that you can have multiple nodes with d
While it's called "restarting" manager, it uses `fork` on Unix operating systems as optimization and only actually restarts from scratch on Windows. While it's called "restarting" manager, it uses `fork` on Unix operating systems as optimization and only actually restarts from scratch on Windows.
## Launcher
## Automated, with Launcher
The Launcher is the lazy way to do multiprocessing. The Launcher is the lazy way to do multiprocessing.
You can use the Launcher builder to create a fuzzer that spawns multiple nodes, all using restarting event managers. You can use the Launcher builder to create a fuzzer that spawns multiple nodes with one click, all using restarting event managers and the same configuration.
An example may look like this:
To use the Launcher, first write a closure `let mut run_client = |state: Option<_>, mut mgr, _core_id|{}`, which takes three parameters and creates an individual fuzzer. Then you can specify the `shmem_provider`, `broker_port`, `monitor`, `cores`, and other options through `Launcher::builder()`:
```rust,ignore ```rust,ignore
Launcher::builder() Launcher::builder()
@ -42,8 +44,16 @@ The value is a string indicating the cores to bind to, for example, `0,2,5` or `
For each client, `run_client` will be called. For each client, `run_client` will be called.
On Windows, the Launcher will restart each client, while on Unix, it will use `fork`. On Windows, the Launcher will restart each client, while on Unix, it will use `fork`.
Advanced use-cases:
1. To connect multiple nodes together via TCP, you can use the `remote_broker_addr`. This requires the `llmp_bind_public` compile-time feature for `LibAFL`.
2. To use multiple launchers for individual configurations, you can set `spawn_broker` to `false` on all but one.
3. Launcher will not select the cores automatically, so you need to specify the `cores` that you want.
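
Since the cores must be given explicitly, it may help to see how a core string such as `0,2,5` or `0-3` expands into individual core ids. LibAFL parses this value itself; the `parse_cores` helper below is a hypothetical, self-contained sketch of the idea and is not the library's implementation.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper (not LibAFL's API): expand a core list such as
/// "0,2,5" or "0-3" into sorted, de-duplicated core ids.
fn parse_cores(spec: &str) -> Result<Vec<usize>, String> {
    let mut cores = BTreeSet::new();
    for part in spec.split(',') {
        let part = part.trim();
        if part.is_empty() {
            return Err(format!("empty entry in core list {spec:?}"));
        }
        match part.split_once('-') {
            // A range like "0-3" covers both ends.
            Some((lo, hi)) => {
                let start: usize = lo
                    .trim()
                    .parse()
                    .map_err(|e| format!("bad core id {lo:?}: {e}"))?;
                let end: usize = hi
                    .trim()
                    .parse()
                    .map_err(|e| format!("bad core id {hi:?}: {e}"))?;
                if start > end {
                    return Err(format!("reversed range {part:?}"));
                }
                cores.extend(start..=end);
            }
            // A single id like "5".
            None => {
                let id: usize = part
                    .parse()
                    .map_err(|e| format!("bad core id {part:?}: {e}"))?;
                cores.insert(id);
            }
        }
    }
    Ok(cores.into_iter().collect())
}
```

With this sketch, `parse_cores("0,2,5")` yields the ids 0, 2, and 5, and `parse_cores("0-3")` yields 0 through 3; one client would then be spawned per listed core.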
For more examples, you can check out `qemu_launcher` and `libfuzzer_libpng_launcher` in [`./fuzzers/`](https://github.com/AFLplusplus/LibAFL/tree/main/fuzzers).
## Other ways ## Other ways
The LlmpEventManager family is the easiest way to spawn instances, but for obscure targets, you may need to come up with other solutions. The `LlmpEventManager` family is the easiest way to spawn instances, but for obscure targets, you may need to come up with other solutions.
LLMP is even, in theory, `no_std` compatible, and even completely different EventManagers can be used for message passing. LLMP is even, in theory, `no_std` compatible, and even completely different EventManagers can be used for message passing.
If you are in this situation, please either read through the current implementations and/or reach out to us. If you are in this situation, please either read through the current implementations and/or reach out to us.

@ -1,5 +1,8 @@
# Introduction # Introduction
> ## Under Construction! > ## Under Construction!
>
> This section is under construction. > This section is under construction.
> Please check back later (or open a PR) > Please check back later (or open a PR)
>
> In the meantime, find the final Lain-based fuzzer in [the fuzzers folder](https://github.com/AFLplusplus/LibAFL/tree/main/fuzzers/tutorial)