Uncontrolled memory consumption is a kind of critical software security weaknesses. It can also become a security-critical vulnerability when attackers can take control of the input to consume a large amount of memory and launch a Denial-of-Service attack. However, detecting such vulnerability is challenging, as the state-of-the-art fuzzing techniques focus on the code coverage but not memory consumption. To this end, we propose a memory usage guided fuzzing technique, named MemLock, to generate the excessive memory consumption inputs and trigger uncontrolled memory consumption bugs. The fuzzing process is guided with memory consumption information so that our approach is general and does not require any domain knowledge. We perform a thorough evaluation for MemLock on 14 widely-used real-world programs. Our experiment results show that MemLock substantially outperforms the state-of-the-art fuzzing techniques, including AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM, in discovering memory consumption bugs. During the experiments, we discovered many previously unknown memory consumption bugs and received 26 new CVEs.
This website contains the following pages:
As the definition in CWE 400, "The software does not properly control the allocation and maintence of a limited resource thereby enabling an actor to influence the amount of resources consumed, eventually leading to the exhaustion of available resources". If an attacker can trigger the allocation of these limited resources, but the number or size of the resources is not controlled, then the attacker could cause a denial of service that consumes all available resources. This would prevent valid users from accessing the software, and it could potentially have an impact on the surrounding environment. For example, a memory exhaustion attack against an application could slow down the application as well as its host operating system.
However, memory consumption bugs are essentially different from memory corruption vulnerabilities, because they are primarily in relation to the problem of uncontrolled memory resource consumption. In our work, we focus on three types of memory consumption related bugs: uncontrolled-recursion, uncontrolled-memory-allocation, and memory leak.
CWE-674: Uncontrolled Recursion: The product does not properly control the amount of recursion that takes place, which consumes excessive resources, such as allocated memory or the program stack.
CWE-789: Uncontrolled Memory Allocation: The product allocates memory based on an untrusted size value, but it does not validate or incorrectly validates the size, allowing arbitrary amounts of memory to be allocated.
CWE-401: Memory Leak: The software does not sufficiently track and release allocated memory after it has been used, which slowly consumes remaining memory.
We present MemLock to enhance grey-box fuzzing to find memory consumption bugs. The coverage information guides to explore different program paths, and the memory consumption information guides the program path to consume more memory. This input can be further mutated so that the newly generated input consumes more and more memory. After some mutations, MemLock is expected to generate an input whereby the memory consumption exceeds the available memory.
The Figure above shows the workflow of MemLock, which contains two main components: static analysis and fuzzing loop. In particular, the static analysis takes the program source code as the input, and analyzes three kinds of information: control flow graph, call graph, and memory operations. The static analysis in MemLock helps decide where to instrument and what to instrument. The control flow graph information is used to collect the branch coverage; the call graph information aids to instrument the function call entries and returns. Based on the memory operation statements, MemLock instruments the locations of memory allocation and free.
Once the program is instrumented, MemLock enters the continuous fuzzing loop to detect memory consumption bugs. Given the initial seeds, MemLock selects a seed s from the seed pool. As for the seed s, MemLock uses the mutation strategy to generate the new inputs (test cases). MemLock then runs the generated inputs against the instrumented program, and collects their memory consumption information and branch coverage information. If the generated seeds consume more memory or have new branch coverage, they are retained as interesting seeds. MemLock prioritizes these interesting seeds and adds them into the seed pool, again. MemLock repeats this process until reaching time or resource budget limits.
Consider the following simple example, the program has only one path. It requests memory resources in line 8 through malloc() function. The size of allocated memory is determined by the user's input. If the allocation in line 8 fails, the return value of buf will be NULL, leading to the program crash (Null-pointer deference in line 9). In the following, we will use this example to describe how Memlock's seed updating scheme works.
let us assume the fuzzer holds a initial seed input() = 10
in line 7. After some mutations, the fuzzers may generate input () = 20
, input () = 1000
, input () = 10000
, etc., which make the allocation size in line 8 increase. AFL, SlowFuzz and PerfFuzz will discard these mutants, since these mutants do not increase the coverage or the number of instructions. It is clearly that input () = 20
consumes much more memory than input () = 10
and is closer to running out of memory. Memlock uses a novel seed updating scheme to replace the original seed input () = 20
with the generated input input () = 10
, instead of adding input () = 20
to the seed pool directly. We propose a simple yet effective **seed updating scheme** that allows MemLock to keep on generating new seeds with increased peak memory consumption for each already-covered path. This is essentially different from Slowfuzz and PerfFuzz.
The following figure describes the workflow of MemLock's seed updating scheme.
Read the MemLock paper for more details.
With MemLock, we have responsibly disclosed several previously unknown security-critical vulnerabilities. These vulnerabilities were not previously reported and rated as a medium-security risk. We informed the maintainers, and Mitre assigned 26 CVEs for these issues. Among these 26 CVEs, 18 CVEs are uncontrolled-recursion vulnerabilities, 6 CVEs are vulnerabilities due to uncontrolled-memory-allocation issues, and 2 CVEs are about memory leak vulnerabilities. An attacker might leverage these vulnerabilities to launch an attack, by providing well-conceived inputs that trigger excessive memory consumption. Our report made these vulnerable programs being patched. At the time of writing, 12 of these vulnerabilities have been patched.
Our bug report has received wide acclaim in the community. We are confident that MemLock is effective and viable in practice.
In a word, MemLock is well received by the community, and we are confident that MemLock is effective and viable in practice.
CVEs | Vulnerability Type | Vulnerability Description |
---|---|---|
CVE-2020-36375 | CWE-674: Uncontrolled Recursion | Stack overflow vulnerability in parse_equality Cesanta MJS 1.20.1, allows remote attackers to cause a Denial of Service (DoS) via a crafted file. |
CVE-2020-36374 | CWE-674: Uncontrolled Recursion | Stack overflow vulnerability in parse_comparison Cesanta MJS 1.20.1, allows remote attackers to cause a Denial of Service (DoS) via a crafted file. |
CVE-2020-36373 | CWE-674: Uncontrolled Recursion | Stack overflow vulnerability in parse_shifts Cesanta MJS 1.20.1, allows remote attackers to cause a Denial of Service (DoS) via a crafted file. |
CVE-2020-36372 | CWE-674: Uncontrolled Recursion | Stack overflow vulnerability in parse_plus_minus Cesanta MJS 1.20.1, allows remote attackers to cause a Denial of Service (DoS) via a crafted file. |
CVE-2020-36371 | CWE-674: Uncontrolled Recursion | Stack overflow vulnerability in parse_mul_div_rem Cesanta MJS 1.20.1, allows remote attackers to cause a Denial of Service (DoS) via a crafted file. |
CVE-2020-36370 | CWE-674: Uncontrolled Recursion | Stack overflow vulnerability in parse_unary Cesanta MJS 1.20.1, allows remote attackers to cause a Denial of Service (DoS) via a crafted file. |
CVE-2020-36369 | CWE-674: Uncontrolled Recursion | Stack overflow vulnerability in parse_statement_list Cesanta MJS 1.20.1, allows remote attackers to cause a Denial of Service (DoS) via a crafted file. |
CVE-2020-36368 | CWE-674: Uncontrolled Recursion | Stack overflow vulnerability in parse_statement Cesanta MJS 1.20.1, allows remote attackers to cause a Denial of Service (DoS) via a crafted file. |
CVE-2020-36367 | CWE-674: Uncontrolled Recursion | Stack overflow vulnerability in parse_block Cesanta MJS 1.20.1, allows remote attackers to cause a Denial of Service (DoS) via a crafted file. |
CVE-2020-36366 | CWE-674: Uncontrolled Recursion | Stack overflow vulnerability in parse_value Cesanta MJS 1.20.1, allows remote attackers to cause a Denial of Service (DoS) via a crafted file. |
CVE-2020-18392 | CWE-674: Uncontrolled Recursion | Stack overflow vulnerability in parse_array Cesanta MJS 1.20.1, allows remote attackers to cause a Denial of Service (DoS) via a crafted file. |
CVE-2019-6293 | CWE-674: Uncontrolled Recursion | An issue was discovered in the function mark_beginning_as_normal in nfa.c in flex 2.6.4. There is a stack exhaustion problem caused by the mark_beginning_as_normal function making recursive calls to itself in certain scenarios involving lots of '*' characters. Remote attackers could leverage this vulnerability to cause a denial-of-service. |
CVE-2019-6292 | CWE-674: Uncontrolled Recursion | An issue was discovered in singledocparser.cpp in yaml-cpp (aka LibYaml-C++) 0.6.2. Stack Exhaustion occurs in YAML::SingleDocParser, and there is a stack consumption problem caused by recursive stack frames: HandleCompactMap, HandleMap, HandleFlowSequence, HandleSequence, HandleNode. Remote attackers could leverage this vulnerability to cause a denial-of-service via a cpp file. |
CVE-2019-6291 | CWE-674: Uncontrolled Recursion | An issue was discovered in the function expr6 in eval.c in Netwide Assembler (NASM) through 2.14.02. There is a stack exhaustion problem caused by the expr6 function making recursive calls to itself in certain scenarios involving lots of '!' or '+' or '-' characters. Remote attackers could leverage this vulnerability to cause a denial-of-service via a crafted asm file. |
CVE-2019-6290 | CWE-674: Uncontrolled Recursion | An infinite recursion issue was discovered in eval.c in Netwide Assembler (NASM) through 2.14.02. There is a stack exhaustion problem resulting from infinite recursion in the functions expr, rexp, bexpr and cexpr in certain scenarios involving lots of '{' characters. Remote attackers could leverage this vulnerability to cause a denial-of-service via a crafted asm file. |
CVE-2018-18701 | CWE-674: Uncontrolled Recursion | An issue was discovered in cp-demangle.c in GNU libiberty, as distributed in GNU Binutils 2.31. There is a stack consumption vulnerability resulting from infinite recursion in the functions next_is_type_qual() and cplus_demangle_type() in cp-demangle.c. Remote attackers could leverage this vulnerability to cause a denial-of-service via an ELF file, as demonstrated by nm. |
CVE-2018-18700 | CWE-674: Uncontrolled Recursion | An issue was discovered in cp-demangle.c in GNU libiberty, as distributed in GNU Binutils 2.31. There is a stack consumption vulnerability resulting from infinite recursion in the functions d_name(), d_encoding(), and d_local_name() in cp-demangle.c. Remote attackers could leverage this vulnerability to cause a denial-of-service via an ELF file, as demonstrated by nm. |
CVE-2018-18484 | CWE-674: Uncontrolled Recursion | An issue was discovered in cp-demangle.c in GNU libiberty, as distributed in GNU Binutils 2.31. Stack Exhaustion occurs in the C++ demangling functions provided by libiberty, and there is a stack consumption problem caused by recursive stack frames: cplus_demangle_type, d_bare_function_type, d_function_type. |
CVE-2018-17985 | CWE-674: Uncontrolled Recursion | An issue was discovered in cp-demangle.c in GNU libiberty, as distributed in GNU Binutils 2.31. There is a stack consumption problem caused by the cplus_demangle_type function making recursive calls to itself in certain scenarios involving many 'P' characters. |
CVE-2019-7704 | CWE-789: Uncontrolled Memory Allocation | wasm::WasmBinaryBuilder::readUserSection in wasm-binary.cpp in Binaryen 1.38.22 triggers an attempt at excessive memory allocation, as demonstrated by wasm-merge and wasm-opt. |
CVE-2019-7698 | CWE-789: Uncontrolled Memory Allocation | An issue was discovered in AP4_Array |
CVE-2019-7148 | CWE-789: Uncontrolled Memory Allocation | An attempted excessive memory allocation was discovered in the function read_long_names in elf_begin.c in libelf in elfutils 0.174. Remote attackers could leverage this vulnerability to cause a denial-of-service via crafted elf input, which leads to an out-of-memory exception. NOTE: The maintainers believe this is not a real issue, but instead a "warning caused by ASAN because the allocation is big. By setting ASAN_OPTIONS=allocator_may_return_null=1 and running the reproducer, nothing happens." |
CVE-2018-20652 | CWE-789: Uncontrolled Memory Allocation | An attempted excessive memory allocation was discovered in the function tinyexr::AllocateImage in tinyexr.h in tinyexr v0.9.5. Remote attackers could leverage this vulnerability to cause a denial-of-service via crafted input, which leads to an out-of-memory exception. |
CVE-2018-18483 | CWE-789: Uncontrolled Memory Allocation | The get_count function in cplus-dem.c in GNU libiberty, as distributed in GNU Binutils 2.31, allows remote attackers to cause a denial of service (malloc called with the result of an integer-overflowing calculation) or possibly have unspecified other impact via a crafted string, as demonstrated by c++filt. |
CVE-2018-20657 | CWE-401: Memory Leak | The demangle_template function in cplus-dem.c in GNU libiberty, as distributed in GNU Binutils 2.31.1, has a memory leak via a crafted string, leading to a denial of service (memory consumption), as demonstrated by cxxfilt, a related issue to CVE-2018-12698. |
CVE-2018-20002 | CWE-401: Memory Leak | The _bfd_generic_read_minisymbols function in syms.c in the Binary File Descriptor (BFD) library (aka libbfd), as distributed in GNU Binutils 2.31, has a memory leak via a crafted ELF file, leading to a denial of service (memory consumption), as demonstrated by nm. |
We investigated an uncontrolled recursive vulnerability we identified. Users can directly control the depth of recursive function calls. Then, the developers also confirm the existence of this vulnerability and recognizes that uncontrolled recursion is bad behavior. And the developers reply to us, "Thanks for presenting your analysis of the defect. It is trivially handled by the user as you say bug a default recursion limit would be nice. Do you plan to implement that feature to fight off the CVE hounds? Thanks". Finally, We help developers fix this vulnerability by restricting recursion.
In another example, we found an uncontrolled memory allocation vulnerability through MemLock. The conditions for triggering this vulnerability are very harsh, and a large value is required to fail the memory allocation. After the memory allocation failed, the program crashed directly. We worked with the author to artificially analyze the vulnerability. Finally, we draw some conclusions. The program would crash under two situations: 1. try to allocate zero sized data; 2. try to allocate 1TB or more memory. The autore reply, "I have added a workaround for the latter case, but it is recommended to check the retrun value of malloc. You can Contribute PR".
Other Examples
We also provide a repo containing the benchmarks, CVE information, as well as the inital seeds to replicate our experiments.
MemLock's evaluation Data Set: https://github.com/ICSE2020-MemLock/MemLock_Benchmark
To evaluate the effectiveness of fuzzers, a direct measurement is the number of unique crashes found by different fuzzers. It is believed that, more unique crashes usually indicate higher chances of covering more unique vulnerabilities.
The table below shows the number of unique crashes found by 7 different fuzzers within 24 hours in the benchmark programs. It is worth noting that, we only count the unique crashes related to memory consumption. Out of the 17 groups of experiments, MemLock performs best in the 10 (58.8%) group of experiments among 7 different fuzzers, as shown in column MemLock. In total, MemLock finds 2009 unique memory consumption-related crashes in the benchmark programs, improving by 59.2%, 70.5%, 76.9%, 98.1%, 40.5% and 66.7% respectively, compared to state-of-the-art fuzzers AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM. Especially, MemLock is able to find unique crashes in all benchmark programs, while other 6 state-of-the-art fuzzers may not find unique crashes in some benchmark programs. For example, other 5 state-of-the-art fuzzers cannot find any unique crashes in the program flex, however, MemLock can find 61 unique crashes within 24 hours.
Following Klees' recommendation, we also conduct the statistic test for the results. The A12 statistic measures the probability that one fuzzer outperforms another fuzzer, as shown in columns with the heading A12. Further, we use Mann-Whitney U to measure the statistical significance of performance gain. When significant, we mark the corresponding A12 values in the bold. Out of 102 A12 values in the table, 72 (70.6%) A12 values are bold and exceeding the conventionally large effect size (0.71). Thus, we can conclude that MemLock significantly outperforms other 5 state-of-the-art fuzzers in most benchmark programs.
As MemLock focuses on the space complexity issues, it prefers the longer test cases and thus may degrade the capability to find other types of vulnerabilities. For example, longer test cases tend to trigger uncontrolled-recursion issues. Therefore, we also evaluate the capability to find other types of crashes. In the benchmark programs, the state-of-the-art fuzzers MemLock, AFL, AFLfast, PerfFuzz, FairFuzz and QSYM find 77, 239, 228, 189, 276 and 236 other types of unique crashes, respectively.
Program | Raw_Data | Vulnerability | Unique Crashes | Statistical Test | ||||||||||||||||||||||||
MemLock | AFL | AFLfast | PerfFuzz | fairFuzz | Angora | QSYM | AFL | AFLfast | PerfFuzz | FairFuzz | Angora | QSYM | ||||||||||||||||
MemCrash | Other | MemCrash | Other | MemCrash | Other | MemCrash | Other | MemCrash | Other | MemCrash | Other | MemCrash | Other | p-value | A12 | p-value | A12 | p-value | A12 | p-value | A12 | p-value | A12 | p-value | A12 | |||
mjs | Download | uncontrolled-recursion | 114 | 16 | 36 | 184 | 31 | 172 | 88 | 106 | 12 | 219 | 0 | 0 | 30 | 181 | 0.0152 | 1.00 | 0.0152 | 1.00 | 0.0204 | 0.96 | 0.0152 | 1.00 | 0.0152 | 1.00 | 0.0152 | 1.00 |
cxxfilt | Download | uncontrolled-recursion | 448 | 0 | 373 | 0 | 304 | 0 | 401 | 0 | 39 | 0 | 0 | 0 | 327 | 0 | 0.0152 | 1.00 | 0.0152 | 1.00 | 0.0561 | 0.88 | 0.0152 | 1.00 | 0.0152 | 1.00 | 0.0152 | 1.00 |
nm | Download | uncontrolled-recursion | 127 | 0 | 12 | 0 | 21 | 0 | 17 | 0 | 0 | 0 | 0 | 0 | 20 | 0 | 0.0147 | 1.00 | 0.0152 | 1.00 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0142 | 1.00 |
nasm | Download | uncontrolled-recursion | 132 | 0 | 6 | 0 | 4 | 0 | 40 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 0.0147 | 1.00 | 0.0147 | 1.00 | 0.0147 | 1.00 | 0.0101 | 1.00 | 0.0101 | 1.00 | 0.0142 | 1.00 |
flex | Download | uncontrolled-recursion | 61 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0101 | 1.00 | 0.0101 | 1.00 | 0.0101 | 1.00 | 0.0101 | 1.00 | 0.0101 | 1.00 | 0.0101 | 1.00 |
yaml-cpp | Download | uncontrolled-recursion | 4 | 0 | 0 | 0 | 1 | 0 | 3 | 0 | 0 | 0 | 0 | 296 | 0 | 0 | 0.0105 | 1.00 | 0.0142 | 1.00 | 0.4404 | 0.56 | 0.0142 | 1.00 | 0.0105 | 1.00 | 0.0105 | 1.00 |
libsass | Download | uncontrolled-recursion | 23 | 49 | 6 | 39 | 4 | 41 | 23 | 73 | 11 | 44 | 26 | 22 | 7 | 33 | 0.0142 | 1.00 | 0.0142 | 1.00 | 0.5000 | 0.53 | 0.0551 | 0.88 | 0.0101 | 0.25 | 0.0142 | 1.00 |
yara | Download | uncontrolled-recursion | 156 | 0 | 34 | 0 | 33 | 0 | 65 | 0 | 0 | 0 | 13 | 0 | 31 | 1 | 0.0152 | 1.00 | 0.0152 | 1.00 | 0.0303 | 0.94 | 0.0147 | 1.00 | 0.0152 | 1.00 | 0.0147 | 1.00 |
readelf | Download | uncontrolled-memory-allocation | 273 | 0 | 104 | 0 | 110 | 0 | 54 | 0 | 181 | 0 | 0 | 11 | 114 | 0 | 0.0061 | 1.00 | 0.0061 | 1.00 | 0.0060 | 1.00 | 0.0301 | 0.88 | 0.0152 | 1.00 | 0.0060 | 1.00 |
exiv2 | Download | uncontrolled-memory-allocation | 10 | 5 | 11 | 8 | 11 | 8 | 6 | 9 | 15 | 6 | 13 | 6 | 8 | 14 | 0.0352 | 0.14 | 0.0675 | 0.20 | 0.0226 | 0.90 | 0.0058 | 0.00 | 0.0443 | 0.16 | 0.5000 | 0.52 |
openjpeg | Download | uncontrolled-memory-allocation | 16 | 0 | 8 | 0 | 5 | 0 | 0 | 0 | 7 | 0 | 0 | 0 | 5 | 0 | 0.0694 | 0.80 | 0.0058 | 1.00 | 0.0036 | 1.00 | 0.4569 | 0.46 | 0.0152 | 1.00 | 0.0706 | 0.80 |
bento4 | Download | uncontrolled-memory-allocation | 5 | 1 | 2 | 2 | 2 | 0 | 2 | 0 | 1 | 1 | 189 | 4 | 1 | 1 | 0.0014 | 1.00 | 0.0063 | 0.98 | 0.0036 | 1.00 | 0.0047 | 1.00 | 0.0038 | 0.00 | 0.0036 | 1.00 |
memory leak | 145 | 78 | 72 | 61 | 125 | 290 | 74 | 0.0060 | 1.00 | 0.0056 | 1.00 | 0.0056 | 1.00 | 0.0061 | 1.00 | 0.0061 | 0.00 | 0.0060 | 1.00 | |||||||||
libming | Download | uncontrolled-memory-allocation | 18 | 0 | 20 | 0 | 18 | 0 | 17 | 0 | 20 | 0 | 3 | 0 | 16 | 0 | 0.0651 | 0.20 | 0.3342 | 0.60 | 0.2926 | 0.62 | 0.0116 | 0.20 | 0.0626 | 1.00 | 0.0060 | 0.80 |
memory leak | 264 | 336 | 324 | 324 | 371 | 87 | 354 | 0.0061 | 0.00 | 0.0060 | 0.00 | 0.0061 | 0.00 | 0.0061 | 0.00 | 0.0060 | 1.00 | 0.0061 | 0.00 | |||||||||
jasper | Download | uncontrolled-memory-allocation | 3 | 6 | 2 | 6 | 3 | 7 | 0 | 1 | 3 | 6 | 2 | 4 | 2 | 6 | 0.0279 | 0.84 | 0.4102 | 0.56 | 0.0037 | 1.00 | 0.4106 | 0.56 | 0.0060 | 1.00 | 0.0124 | 0.92 |
memory leak | 210 | 234 | 235 | 35 | 216 | 820 | 212 | 0.0177 | 0.08 | 0.0177 | 0.08 | 0.0059 | 1.00 | 0.3381 | 0.40 | 0.0061 | 0.00 | 0.4583 | 0.46 | |||||||||
Total Number of Unique Crashes | 2009 | 77 | 1262 | 239 | 1178 | 228 | 1136 | 189 | 1014 | 276 | 1430 | 343 | 1205 | 236 | ||||||||||||||
Improvement | - | +59.19% | +70.54% | +76.85% | +98.13% | +40.49% | +66.72% |
To better compare different fuzzers, we also use the plots to depict the performance over time in some benchmark programs, as shown in Figure below. It shows that MemLock has a steady and strong growth trend in finding unique crashes, and MemLock is also the first fuzzer that reported crashes.
Although there is a weak correlation between the number of unique crashes and the number of vulnerabilities found during the fuzzing process. A more important measurement is the capability to find real-world vulnerabilities, as suggested by Klees. Thus, we compare the capability of MemLock to find unique vulnerabilities against baseline fuzzers in this section.
The table below shows the statistic results in 7 different fuzzers. The benchmark programs totally contain 34 vulnerabilities, out of which MemLock performs best in the 25 vulnerabilities among 6 state-of-the-art fuzzers, as shown in column MemLock. MemLock averagely takes about 5.4 hours to find each unique vulnerability, which is 2.15, 2.15, 2.20, 2.69, 3.76, 2.07 times faster than the state-of-the-art fuzzers AFL, AFlfast, PerfFuzz, FairFuzz, Angora and QSYM respectively. In particular, MemLock find 33 out of 34 unique vulnerabilities within 24 hours, while other fuzzers AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM only find 26, 28, 20, 17, 6 and 25, respectively. The three unique vulnerabilities (i.e., issue#106, CVE-2018-18701 and CVE-2019-6293) in mjs, nm, and flex can found by only MemLock within 24 hours. Therefore, it is proved that our memory-consumption guided strategy is very effective to find memory consumption-related bugs.
In addition, we also conduct the statistic test for real-world vulnerability evaluation. Out of 204 A12 values in the table, 139 (68.1%) A12 values are bold and exceeding the conventionally large effect size (0.71). Thus, MemLock significantly outperforms other 6 state-of-the-art fuzzers in finding unique vulnerabilities.
Program | Vulnerability | Type | Exposing Time (hours) | Statistical Test (p-value & A12) | |||||||||||||||||
MemLock | AFL | AFLfast | PerfFuzz | FairFuzz | Angora | QSYM | AFL | AFLfast | PerfFuzz | FairFuzz | Angora | QSYM | |||||||||
mjs | issue#58 | uncontrolled-recursion | 0.5 | 0.3 | 0.4 | 0.2 | 0.4 | T/O | 0.3 | 0.1169 | 0.25 | 0.1561 | 0.25 | 0.0344 | 0.13 | 0.1547 | 0.25 | 0.0101 | 1.00 | 0.1168 | 0.22 |
issue#106 | uncontrolled-recursion | 13.7 | T/O | T/O | T/O | T/O | T/O | T/O | 0.0019 | 1.00 | 0.0019 | 1.00 | 0.0019 | 1.00 | 0.0019 | 1.00 | 0.0019 | 1.00 | 0.0019 | 1.00 | |
cxxfilt | CVE-2018-9138 | uncontrolled-recursion | 0.3 | 7.2 | 10.1 | 0.5 | T/O | T/O | 3.3 | 0.0147 | 1.00 | 0.0147 | 1.00 | 0.0902 | 0.81 | 0.0101 | 1.00 | 0.0101 | 1.00 | 0.0147 | 1.00 |
CVE-2018-9996 | uncontrolled-recursion | T/O | 16.5 | T/O | T/O | T/O | T/O | T/O | 0.0105 | 0.00 | nan | 0.50 | nan | 0.50 | nan | 0.50 | nan | 0.50 | nan | 0.50 | |
CVE-2018-17985 | uncontrolled-recursion | 0.2 | 1.1 | 4.5 | 0.2 | 1.9 | T/O | 1.4 | 0.0089 | 1.00 | 0.0132 | 1.00 | 0.2266 | 0.63 | 0.0127 | 1.00 | 0.0127 | 1.00 | 0.0132 | 1.00 | |
CVE-2018-18484 | uncontrolled-recursion | 0.2 | 1 | 4.5 | 0.2 | 8 | T/O | 1.4 | 0.0132 | 1.00 | 0.0132 | 1.00 | 0.2266 | 0.63 | 0.0105 | 1.00 | 0.0132 | 1.00 | 0.0132 | 1.00 | |
CVE-2018-18700 | uncontrolled-recursion | 0.2 | 1.2 | 4.6 | 0.3 | 12.6 | T/O | 1.4 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0929 | 0.75 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0105 | 1.00 | |
nm | CVE-2018-12641 | uncontrolled-recursion | 2.6 | 19.1 | 12.6 | 12.2 | T/O | T/O | 12.8 | 0.0132 | 1.00 | 0.0151 | 1.00 | 0.0561 | 0.88 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0331 | 0.88 |
CVE-2018-17985 | uncontrolled-recursion | 10.4 | 18.2 | 11.9 | T/O | T/O | T/O | 13.3 | 0.0281 | 0.81 | 0.4426 | 0.56 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.3315 | 0.63 | |
CVE-2018-18484 | uncontrolled-recursion | 9.9 | 16.4 | 17.1 | T/O | T/O | T/O | 14 | 0.0448 | 0.84 | 0.0448 | 0.84 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.4422 | 0.75 | |
CVE-2018-18700 | uncontrolled-recursion | 9.6 | 14.9 | 17.8 | T/O | T/O | T/O | T/O | 0.3315 | 0.63 | 0.0461 | 0.88 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0105 | 1.00 | |
CVE-2018-18701 | uncontrolled-recursion | 13.9 | T/O | T/O | T/O | T/O | T/O | T/O | 0.0019 | 1.00 | 0.0019 | 1.00 | 0.0019 | 1.00 | 0.0019 | 1.00 | 0.0019 | 1.00 | 0.0019 | 1.00 | |
CVE-2019-9070 | uncontrolled-recursion | 18.4 | 15.6 | 13.9 | T/O | T/O | T/O | 15.8 | 0.4419 | 0.56 | 0.4419 | 0.44 | 0.0105 | 1.00 | 0.0101 | 1.00 | 0.0101 | 1.00 | 0.4419 | 0.56 | |
CVE-2019-9071 | uncontrolled-recursion | 12.4 | T/O | 14 | T/O | T/O | T/O | T/O | 0.0344 | 0.88 | 0.2352 | 0.69 | 0.0344 | 0.88 | 0.0105 | 0.88 | 0.0344 | 1.00 | 0.0344 | 0.88 | |
nasm | CVE-2019-6290, (CVE-2018-1000886) | uncontrolled-recursion | 0.9 | T/O | 19 | 9 | T/O | T/O | 17.6 | 0.0105 | 1.00 | 0.0152 | 1.00 | 0.0152 | 1.00 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0147 | 1.00 |
CVE-2019-6291 | uncontrolled-recursion | 1.5 | 9 | 14 | 8.7 | T/O | T/O | 7.5 | 0.0303 | 0.94 | 0.0152 | 1.00 | 0.0152 | 1.00 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0152 | 1.00 | |
flex | CVE-2019-6293 | uncontrolled-recursion | 5.4 | T/O | T/O | T/O | T/O | T/O | T/O | 0.0036 | 1.00 | 0.0036 | 1.00 | 0.0036 | 1.00 | 0.0036 | 1.00 | 0.0036 | 1.00 | 0.0036 | 1.00 |
prase | CVE-2019-6292, (CVE-2019-6285) | uncontrolled-recursion | 0.4 | T/O | 18.4 | 0.9 | T/O | T/O | T/O | 0.0101 | 1.00 | 0.0127 | 1.00 | 0.0956 | 0.81 | 0.0101 | 1.00 | 0.0101 | 1.00 | 0.0101 | 1.00 |
CVE-2018-20573, (CVE-2018-20574) | uncontrolled-recursion | 6.1 | T/O | T/O | 12.4 | T/O | T/O | T/O | 0.0334 | 0.88 | 0.0334 | 0.84 | 0.0843 | 0.84 | 0.0334 | 0.84 | 0.0105 | 1.00 | 0.0334 | 0.84 | |
sassc | CVE-2018-19837 | uncontrolled-recursion | 1.6 | 13.3 | 10.5 | 1.8 | 8.5 | T/O | 5 | 0.0451 | 0.88 | 0.0461 | 0.88 | 0.3325 | 0.63 | 0.0461 | 0.88 | 0.0105 | 1.00 | 0.0769 | 0.81 |
CVE-2018-20821 | uncontrolled-recursion | 0.1 | 5.7 | 6.5 | 0.1 | 9.5 | T/O | 7.4 | 0.0105 | 1.00 | 0.0105 | 1.00 | nan | 0.50 | 0.0105 | 1.00 | 0.0105 | 1.00 | 0.0105 | 1.00 | |
CVE-2018-20822 | uncontrolled-recursion | 15.6 | 14.3 | 19.5 | 14.6 | 0.92 | 11.3 | 10.5 | 0.4388 | 0.50 | 0.4388 | 0.56 | 0.5000 | 0.47 | 0.4388 | 0.56 | 0.0105 | 0.00 | 0.4408 | 0.44 | |
yara | CVE-2017-9438, (CVE-2017-9304) | uncontrolled-recursion | 0.2 | 0.9 | 0.61 | 4.3 | 5.3 | T/O | 0.8 | 0.0128 | 1.00 | 0.0132 | 1.00 | 0.0325 | 0.91 | 0.0132 | 1.00 | 0.0105 | 1.00 | 0.0132 | 1.00 |
readelf | CVE-2017-15996 | uncontrolled-memory-allocation | 0.2 | 0.3 | 0.2 | 0.5 | 0.3 | T/O | 0.3 | 0.0326 | 0.86 | 0.1821 | 0.68 | 0.0144 | 0.92 | 0.183 | 0.68 | 0.0105 | 1.00 | 0.0091 | 0.96 |
exiv2 | CVE-2018-4868 | uncontrolled-memory-allocation | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | nan | 0.50 | nan | 0.50 | nan | 0.50 | nan | 0.50 | nan | 0.50 | nan | 0.50 |
mp42hls | CVE-2018-20186 | uncontrolled-memory-allocation | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 | 0.1 | 0.4 | nan | 0.50 | nan | 0.50 | nan | 0.50 | nan | 0.50 | 0.0105 | 0.00 | nan | 0.50 |
CVE-2019-7698, (CVE-2019-6966) | uncontrolled-memory-allocation | 14.6 | T/O | T/O | T/O | T/O | 0.5 | T/O | 0.0190 | 1.00 | 0.0190 | 1.00 | 0.0190 | 1.00 | 0.0190 | 1.00 | 0.0105 | 0.00 | 0.0190 | 1.00 | |
listswf | CVE-2019-7581, (CVE-2018-7876) | uncontrolled-memory-allocation | 0.6 | 0.8 | 1.4 | 2 | 0.4 | T/O | 1.6 | 0.2009 | 0.68 | 0.0712 | 0.80 | 0.0296 | 0.88 | 0.2648 | 0.36 | 0.0105 | 1.00 | 0.0712 | 0.80 |
CVE-2019-7582, (CVE-2018-13251) | uncontrolled-memory-allocation | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | nan | 0.50 | nan | 0.50 | nan | 0.50 | nan | 0.50 | nan | 0.50 | nan | 0.50 | |
issue#155 | uncontrolled-memory-allocation | 1.4 | 1 | 1.3 | 1.4 | 1.2 | T/O | 1.6 | 0.1728 | 0.30 | 0.2654 | 0.36 | 0.338 | 0.40 | 0.3766 | 0.42 | 0.0105 | 1.00 | 0.2654 | 0.64 | |
opj_decompress | CVE-2019-6988 | uncontrolled-memory-allocation | 7.8 | 15.1 | 11.1 | T/O | T/O | T/O | 15.3 | 0.0326 | 0.86 | 0.0417 | 0.84 | 0.0075 | 1.00 | 0.0075 | 1.00 | 0.0075 | 1.00 | 0.0449 | 0.81 |
CVE-2017-12982 | uncontrolled-memory-allocation | 4.5 | 11.4 | 10 | T/O | 11.9 | T/O | 10 | 0.0147 | 0.72 | 0.2016 | 0.60 | 0.0075 | 1.00 | 0.2648 | 0.64 | 0.0075 | 1.00 | 0.3566 | 0.50 | |
jasper | CVE-2016-8886 | uncontrolled-memory-allocation | 4.1 | 17 | 22.3 | T/O | 10.3 | T/O | 18.2 | 0.0285 | 0.88 | 0.0059 | 1.00 | 0.0075 | 1.00 | 0.5000 | 0.52 | 0.0075 | 1.00 | 0.0296 | 0.88 |
issue#207 | uncontrolled-memory-allocation | 1.7 | 2.2 | 3.6 | T/O | 2.2 | 15.9 | 4 | 0.338 | 0.62 | 0.5 | 0.68 | 0.0075 | 1.00 | 0.4582 | 0.68 | 0.0075 | 1.00 | 0.2654 | 0.64 | |
umber of vulnerabilities revealed | 33 | 26 | 28 | 20 | 17 | 6 | 25 | ||||||||||||||
Detectability Improvement | - | +26.92% | +17.86% | +65.00% | +94.12% | +450.00% | +32.00% | ||||||||||||||
Average Time Use | 5.4 | 11.6 | 11.6 | 11.9 | 14.5 | 20.3 | 11.2 | ||||||||||||||
Time Improvement | - | 2.15X | 2.15X | 2.20X | 2.69X | 3.76X | 2.07X |
Memory leak bugs are a little different from uncontrolled-recursion and uncontrolled-memory-allocation bugs, because they may not lead to program crashes immediately. Only enough memory is leaked, it would produce Denial-of-Service (DoS) attack, for example, in a long time running programs (e.g., banking service). To evaluate the effectiveness of fuzzers in finding memory leak, we look into the number of total bytes leaked within 24 hours during 6 different state-of-the-art fuzzers.
The table shows the number of total bytes leaked for each fuzzer on different programs. We can see that \toolname shows an obvious advantage over other baseline fuzzers. The number of bytes leaked is improved by from 257% to 49633%, compared to other baseline fuzzers. This is because MemLock tries to maximize each allocation and generates inputs with high memory consumption. When the memory leak happens, those memory-consuming inputs will often cause more-bytes memory leakage. In particular, after manual analysis, we have found that the total leak size of AFL depends too much on the choice of the initial seeds. It lacks the capability of breaking through the limitation caused by the initial seed inputs.
Since MemLock seeks to generate test inputs that consume more and more memory. In this experiment, we evaluate the test input distribution according to memory consumption for each fuzzer we have tested. A fuzzer that maintains a seed pool with a larger proportion of high memory consumption inputs is considered to have a better chance of detecting memory consumption-related bugs.
The figure shows the input distribution based on memory consumption. In general, we can clearly see that MemLock can generate more seeds with higher memory consumption. This is because the guidance mechanisms in ExMen help to gradually add more and more memory consuming inputs into the seed pool. In particular, for the uncontrolled-recursion bugs (nm, nasm, flex and yara), MemLock generates a large number of inputs that hold more than 30,000 function calls in the call stack, while PerfFuzz generates only a few and AFL/AFLfast can hardly generate inputs that hold more than 10,000 function calls. The pattern is similar for uncontrolled-memory-allocation bugs (readelf, openjpeg, jasper and libming). MemLock can generate a considerable amount of inputs with high memory consumption while the inputs of the other fuzzers concentrate on the low memory consumption region. The results clearly demonstrate the effectiveness of the strategies of MemLock in generating inputs with high memory consumption.
To determine the runtime overhead caused by the guidance mechanisms and static analysis required for MemLock, we compare the code coverage, the execution speed and regular crashes for each baseline fuzzer.
The result of our coverage measurements is shown in the Figure. All we can see, in most of the cases, AFL consistently outperforms MemLock, but the size of the effect varies substantially between target applications. For instance, on nasm and readelf, MemLock and AFL perform similarly well after several hours. But on nm and jasper, AFL covers more edges than MemLock. Our result shows that the coverage of MemLock is still comparable.
Since MemLock intentionally keeps seeds that increase memory consumption, we hypothesize that the test execution over time will take longer. The number of executions achieved by MemLock, AFLfast, PerfFuzz, and FairFuzz are shown in the Figure. MemLock can achieve between 20% and 84% of the executions achieved by AFL, while PerfFuzz can achieve between 9% and 78%. Intuition is that PerfFuzz would much slow down the execution speed, as it is a dedicated fuzzer for maximizes the number of instructions executed. But another interesting observation is that increasing the consumed memory could also much slow down the execution speed. For instance, the execution speed of MemLock is lower than PerfFuzz in exiv2 and listswf. This indicates that memory consumption is related not only to the space issue, but also to the time issue.
As shown in the Other column in the table above, MemLock discover fewer regular crashes than other baseline fuzzers. But our effort is still worth it, as MemLock has a different objective and can expose more memory consumption bugs. As discussed above, the main reason would be the side effect (e.g., coverage, execution speed) caused by our approach.
As an external command, additional overhead may incur for perf mem ’s interaction with the fuzzer. According to our experiments, perf mem would introduce an overhead of 71.3%, perf trace would introduce an overhead of 601% comparing with MemLock's instrumentation.
The detail experimental result can be seen in Table 1. We quickly perform each experiment for 12 hours and calculate the average execution speed. Note that we perfrom perf mem with command line afl-fuzz ... -- perf mem -D record /path/to/fuzzed_app [ ... ]
, and perf trace with command line afl-fuzz ... -- perf trace --call-graph fp /path/to/fuzzed_app [ ... ]
. The average execution speed of the program with MemLock's instrumentation is 680/sec, while the perf trace is 97/sec.
Vulnerable code in flex/src/nfa.c
To demonstrate the reason behind MemLock's superiority, we present the case of CVE-2019-6293, where can found by only MemLock in our experiments. It is an uncontrolled-recursion vulnerability in flex, which is a lexical analyzer generator. This CVE is triggered by the mark_beginning_as_normal function making recursive calls to itself in certain scenarios involving lots of '*' characters, resulting in stack-overflow. We can directly look at the snippet code of nfa.c, as shown in the Figure. Intuitively, in some cases, function mark_beginning_as_normal shows the behavior of recursive calls.
As a brief explanation, This function mark each "beginning" state in a machine as being a "normal" states, and the "beginning" states are the epsilon closure of the first state. transchar represents the transition character of NFA state, trans1 is the state that NFA state arrives after accepting tranchar, and trans2 is the state of NFA reachable through epsilon. To make this function call to itself, firstly, the input should be valid and can success to reach mark_beginning_as_normal function at line 3. Second, the execution should get into the true branch of the if condition on line 10. Third, to reach line 15, this requires the current state to reach another state through epsilon. To satisfy these three requirements, rules with many '*' characters need to be constructed in the input. AFL discard those input that can make mark_beginning_as_normal functions to call itself multiple times, as those inputs do not increase coverage. We can see the peak length of call stack in the Figure. AFL does not retain any seed over 5000 lengths. Comparing to AFL, MemLock intentionally keeps seeds that increase the peak length of call stack, and finally triggering stack-overflow. This explains the reason why MemLock can found the vulnerability, while other fuzzers can not detect it in all 5 runs.
More interestingly, MemLock takes only 5.4 hours on average to discover this vulnerability (CVE-2019-6293), while other fuzzers all fail. We investigate MemLock's mutation history and identify a key mutation step. After a large number of mutation for initial seed, the execution successfully entry the mark_beginning_as_normal function. Once an input causes a recursive call to itself many times, MemLock update such input in Seed Queue. It increases the recursive depth of mark_beginning_as_normal function by adding some '*' characters into the input context, through havoc operation. Usually MemLock retains some seeds from different paths to mark_beginning_as_normal function. MemLock retains a seed for each path that can produce the maximum peak length of call stack. If the splice operation take two inputs with a large number of '*' characters, there is a certain probability that the the peak length of call stack can be doubled instantaneously. After 5 hours, MemLock success in generating inputs that make the peak length of call stack exceed 25000, and finally triggering stack-overflow.
Vulnerable code in cp-demangle.c in Binutils v2.31
We illustrate the limitations of existing coverage-based grey-box fuzzing techniques for detecting memory consumption bugs with two examples summarized from real-world vulnerabilities. We use the vulnerability CVE-2018-17985 in the Figure to demonstrate a uncontrolled-recursion bug.
In the Figure, the function cplus_demangle_type recursively calls itself in line 12 when the input contains the character 'P'. The depth of recursion depends on the number of character 'P's in the input. With a sufficiently large recursive depth, the execution would run out of stack memory, causing stack overflow. To trigger a stack overflow, the fuzzer would need to generate inputs containing a large number of character 'P's.
However, existing coverage-based grey-box fuzzers do not have enough awareness about the change in recursive depth and solely use coverage information to retain interesting inputs. Take AFL as an example, it is aware of repeatedly executed CFG edges but only in a coarse manner. To be specific, AFL adopts the concept of "loop bucket" to retain interesting inputs. The loop bucket cannot tell the fine-grained change in recursive depth. Specially, it does not differentiate the change when the recursive depth is greater than $255$. Nevertheless, this number is still very far from causing stack exhaustion, which normally requires tens of thousands of recursive depth.
Therefore, to expose uncontrolled-recursion effectively, grey-box fuzzers need to have precise awareness about the stack memory consumption of the target program when executing an input.
Vulnerable code in jp2image.cpp in Exiv2 v0.26
The Figure demonstrates an uncontrolled-memory-allocation problem in exiv2. At line 11-12, when the program parses a subBox in readMetadata(), a length is extracted from the user inputs. Then the length is fed directly into DataBuf() at line 13. Finally, this value is used as the size of a memory allocation request at line 4. Note that the program does not check the size before allocating memory. By carefully handcrafting the input, an adversary can provide arbitrarily large values for subBox.length, leading to program crash (i.e., std::bad_alloc) or running out of memory.
To trigger this problem, the fuzzer would need to generate inputs with a large subBox.length. For this purpose, the fuzzer needs to collect information about the value of subBox.length to retain the interesting inputs that can incur a large memory consumption.
However, existing coverage-based grey-box fuzzers lack awareness about the value of subBox.length. Therefore, they cannot effectively generate inputs causing subBox.length to become larger. Take AFL as an example, let us assume AFL now holds a seed input a which incurs the subBox.length of 100 and causes the function to enter the while at line 11 and eventually return at line 16. After some mutations, AFL may generate another input b which incurs the subBox.length of 10000 and also causes the function to enter the while at line 11 and return at line 16. We can clearly see that comparing with a, b consumes much more memory and is closer to running out of memory. However, AFL will discard input $b$ and will not retain it as a seed because b does not bring new branch coverage. Consequently, AFL cannot detect this uncontrolled-memory-allocation problem effectively.
Therefore, to expose uncontrolled-memory-allocation effectively, grey-box fuzzers also need to have precise awareness about the amount of consumed heap memory of the target program when executing an input.
MemLock's Website:
https://wcventure.github.io/MemLock/
MemLock's Github:
https://github.com/wcventure/MemLock-Fuzz
MemLock's Benchmark:
MemLock's Publication: