================
MemorySanitizer
================

.. contents::
   :local:

Introduction
============

MemorySanitizer is a detector of uninitialized reads. It consists of a
compiler instrumentation module and a run-time library.

The typical slowdown introduced by MemorySanitizer is **3x**.

How to build
============

Build LLVM/Clang with `CMake <https://llvm.org/docs/CMake.html>`_.

Usage
=====

Compile and link your program with the ``-fsanitize=memory`` flag.
The MemorySanitizer run-time library should be linked to the final
executable, so make sure to use ``clang`` (not ``ld``) for the final
link step. When linking shared libraries, the MemorySanitizer run-time
is not linked, so ``-Wl,-z,defs`` may cause link errors (don't use it
with MemorySanitizer). To get reasonable performance, add ``-O1`` or
higher. To get meaningful stack traces in error messages, add
``-fno-omit-frame-pointer``. To get perfect stack traces you may need
to disable inlining (just use ``-O1``) and tail call elimination
(``-fno-optimize-sibling-calls``).

.. code-block:: console

    % cat umr.cc
    #include <stdio.h>

    int main(int argc, char** argv) {
      int* a = new int[10];
      a[5] = 0;
      if (a[argc])
        printf("xx\n");
      return 0;
    }

    % clang -fsanitize=memory -fno-omit-frame-pointer -g -O2 umr.cc

If a bug is detected, the program will print an error message to
stderr and exit with a non-zero exit code.

.. code-block:: console

    % ./a.out
    WARNING: MemorySanitizer: use-of-uninitialized-value
        #0 0x7f45944b418a in main umr.cc:6
        #1 0x7f45938b676c in __libc_start_main libc-start.c:226

By default, MemorySanitizer exits on the first detected error. If you
find the error report hard to understand, try enabling
:ref:`origin tracking <msan-origins>`.

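One way to resolve a report like the one above is to initialize the memory
before it is read. The following is a minimal sketch, assuming a zero default
is acceptable for the program. The function name ``element_is_set`` is
hypothetical and not part of the original example.

```cpp
#include <cstdio>

// Hypothetical variant of the umr.cc logic with the array initialized.
// `new int[10]()` value-initializes the elements, so every read of a[i]
// for i in [0, 10) sees a defined value (zero).
bool element_is_set(int index) {
  int* a = new int[10]();    // note the trailing (): all ten ints start at 0
  a[5] = 0;
  bool set = a[index] != 0;  // defined read, nothing uninitialized is used
  delete[] a;
  if (set)
    printf("xx\n");
  return set;
}
```

Calling ``element_is_set(argc)`` from ``main`` mirrors the original program
while branching only on initialized memory.
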
``__has_feature(memory_sanitizer)``
------------------------------------

In some cases one may need to execute different code depending on
whether MemorySanitizer is enabled. :ref:`\_\_has\_feature
<langext-__has_feature-__has_extension>` can be used for this purpose.

.. code-block:: c

    #if defined(__has_feature)
    #  if __has_feature(memory_sanitizer)
    // code that builds only under MemorySanitizer
    #  endif
    #endif

``__attribute__((no_sanitize("memory")))``
-----------------------------------------------

Some code should not be checked by MemorySanitizer. One may use the function
attribute ``no_sanitize("memory")`` to disable uninitialized checks in a
particular function. MemorySanitizer may still instrument such functions to
avoid false positives. This attribute may not be supported by other compilers,
so we suggest using it together with ``__has_feature(memory_sanitizer)``.

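The suggestion above can be sketched as a small wrapper macro. The macro name
``NO_SANITIZE_MEMORY`` and the function ``sum_first_n`` are hypothetical and
chosen only for illustration. The macro expands to the attribute when
MemorySanitizer is enabled and to nothing otherwise, so the same source builds
with other compilers.

```cpp
// Expand to the attribute only when building with MemorySanitizer.
#if defined(__has_feature)
#  if __has_feature(memory_sanitizer)
#    define NO_SANITIZE_MEMORY __attribute__((no_sanitize("memory")))
#  endif
#endif
#ifndef NO_SANITIZE_MEMORY
#  define NO_SANITIZE_MEMORY  // no-op for builds without MemorySanitizer
#endif

// Hypothetical function whose reads should not be checked.
NO_SANITIZE_MEMORY
int sum_first_n(const int* data, int n) {
  int total = 0;
  for (int i = 0; i < n; ++i)
    total += data[i];
  return total;
}
```

Keeping the feature test in one macro avoids repeating the nested ``#if``
at every function that needs the attribute.
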
Blacklist
---------

MemorySanitizer supports ``src`` and ``fun`` entity types in
:doc:`SanitizerSpecialCaseList`, which can be used to relax MemorySanitizer
checks for certain source files and functions. All "Use of uninitialized value"
warnings will be suppressed and all values loaded from memory will be
considered fully initialized.

Report symbolization
====================

MemorySanitizer uses an external symbolizer to print files and line numbers in
reports. Make sure that the ``llvm-symbolizer`` binary is in ``PATH``,
or set the environment variable ``MSAN_SYMBOLIZER_PATH`` to point to it.

.. _msan-origins:

Origin Tracking
===============

MemorySanitizer can track origins of uninitialized values, similar to
Valgrind's ``--track-origins`` option. This feature is enabled by the
Clang option ``-fsanitize-memory-track-origins=2`` (or simply
``-fsanitize-memory-track-origins``). With a slightly modified version of
the example above:

.. code-block:: console

    % cat umr2.cc
    #include <stdio.h>

    int main(int argc, char** argv) {
      int* a = new int[10];
      a[5] = 0;
      volatile int b = a[argc];
      if (b)
        printf("xx\n");
      return 0;
    }

    % clang -fsanitize=memory -fsanitize-memory-track-origins=2 -fno-omit-frame-pointer -g -O2 umr2.cc
    % ./a.out
    WARNING: MemorySanitizer: use-of-uninitialized-value
        #0 0x7f7893912f0b in main umr2.cc:7
        #1 0x7f789249b76c in __libc_start_main libc-start.c:226

      Uninitialized value was stored to memory at
        #0 0x7f78938b5c25 in __msan_chain_origin msan.cc:484
        #1 0x7f7893912ecd in main umr2.cc:6

      Uninitialized value was created by a heap allocation
        #0 0x7f7893901cbd in operator new[](unsigned long) msan_new_delete.cc:44
        #1 0x7f7893912e06 in main umr2.cc:4

By default, MemorySanitizer collects both allocation points and all
intermediate stores the uninitialized value went through. Origin
tracking has proved to be very useful for debugging MemorySanitizer
reports. It slows down program execution by a factor of 1.5x-2x on top
of the usual MemorySanitizer slowdown and increases memory overhead.

The Clang option ``-fsanitize-memory-track-origins=1`` enables a slightly
faster mode in which MemorySanitizer collects only allocation points but
not intermediate stores.

Use-after-destruction detection
===============================

You can enable experimental use-after-destruction detection in MemorySanitizer.
After invocation of the destructor, the object will be considered no longer
readable, and using the underlying memory will lead to error reports at run
time.

This feature is still experimental. To enable it, you need to:

#. Pass the additional Clang option ``-fsanitize-memory-use-after-dtor`` during
   compilation.
#. Set the environment variable ``MSAN_OPTIONS=poison_in_dtor=1`` before running
   the program.

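The kind of bug this mode targets can be illustrated with a minimal sketch.
The type ``Counter`` and the function ``read_value`` are hypothetical. The
sketch assumes the object is built with placement new so that its storage
outlives the explicit destructor call. The faulty read only happens when
``after_destroy`` is true.

```cpp
#include <new>

// Hypothetical type with a user-provided (non-trivial) destructor.
struct Counter {
  int value;
  explicit Counter(int v) : value(v) {}
  ~Counter() {}
};

// Returns the stored value. With after_destroy == true the member is read
// after the destructor has run, which is the access this mode is meant to
// report.
int read_value(bool after_destroy) {
  alignas(Counter) unsigned char storage[sizeof(Counter)];
  Counter* c = new (storage) Counter(7);  // construct in local storage
  int before = c->value;                  // valid read: object is alive
  c->~Counter();                          // object lifetime ends here
  if (after_destroy)
    return c->value;                      // bug: use after destruction
  return before;
}
```

Assuming the program is compiled with ``-fsanitize=memory
-fsanitize-memory-use-after-dtor`` and run with
``MSAN_OPTIONS=poison_in_dtor=1``, a call to ``read_value(true)`` is expected
to produce a report, while ``read_value(false)`` is expected to run cleanly.
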
Handling external code
======================

MemorySanitizer requires that all program code is instrumented. This
also includes any libraries that the program depends on, even libc.
Failing to achieve this may result in false reports.
For the same reason you may need to replace all inline assembly code that
writes to memory with pure C/C++ code.

Full MemorySanitizer instrumentation is very difficult to achieve. To
make it easier, the MemorySanitizer run-time library includes 70+
interceptors for the most common libc functions. They make it possible
to run MemorySanitizer-instrumented programs linked with
uninstrumented libc. For example, the authors were able to bootstrap
a MemorySanitizer-instrumented Clang compiler by linking it with
a self-built instrumented libc++ (as a replacement for libstdc++).

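When a small amount of uninstrumented code cannot be avoided, one possible
workaround is to tell the run-time library that a buffer has been initialized.
This is not covered in the text above. The sketch below assumes the
``__msan_unpoison`` function from ``<sanitizer/msan_interface.h>``, which is
part of the sanitizer run-time's public interface. The macro ``MSAN_UNPOISON``
and the function ``legacy_fill`` are hypothetical, and ``legacy_fill`` only
stands in for a routine from an uninstrumented library.

```cpp
#include <cstddef>
#include <cstring>

// Use the run-time interface only when building with MemorySanitizer.
#if defined(__has_feature)
#  if __has_feature(memory_sanitizer)
#    include <sanitizer/msan_interface.h>
#    define MSAN_UNPOISON(p, n) __msan_unpoison((p), (n))
#  endif
#endif
#ifndef MSAN_UNPOISON
#  define MSAN_UNPOISON(p, n) ((void)(p), (void)(n))  // no-op elsewhere
#endif

// Hypothetical stand-in for a function from an uninstrumented library.
// In this sketch it is compiled together with the caller, so it would be
// instrumented too; a real case would link it from external code.
extern "C" void legacy_fill(unsigned char* out, std::size_t n) {
  std::memset(out, 0x2a, n);
}

int read_first_byte() {
  unsigned char buf[16];
  legacy_fill(buf, sizeof buf);
  // Mark the bytes written by the external code as initialized.
  MSAN_UNPOISON(buf, sizeof buf);
  return buf[0];
}
```

Unpoisoning should cover only the bytes that the external code is known to
have written. Marking more than that would hide genuine uninitialized reads.
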
Supported Platforms
===================

MemorySanitizer is supported on the following operating systems:

* Linux
* NetBSD
* FreeBSD

Limitations
===========

* MemorySanitizer uses 2x more real memory than a native run, 3x with
  origin tracking.
* MemorySanitizer maps (but does not reserve) 64 Terabytes of virtual
  address space. This means that tools like ``ulimit`` may not work as
  expected.
* Static linking is not supported.
* Older versions of MSan (LLVM 3.7 and older) didn't work with
  non-position-independent executables, and could fail on some Linux
  kernel versions with disabled ASLR. Refer to the documentation for those
  versions for more details.
* MemorySanitizer might be incompatible with position-independent executables
  from FreeBSD 13. A run-time check detects this case and emits a warning.

Current Status
==============

MemorySanitizer is known to work on large real-world programs
(like Clang/LLVM itself) that can be recompiled from source, including all
dependent libraries.

More Information
================

`<https://github.com/google/sanitizers/wiki/MemorySanitizer>`_