Code coverage metrics for libFuzzer #41

JulianVolodia · 2020-05-08T08:35:28Z

Hi!

I would like to understand how experienced people measure coverage for fuzzing nowadays.
There was quite a nice method using sancov and the libFuzzer -dump_coverage=1 flag in older libFuzzer versions, but it is now deprecated.
I saw that 15 months and 2 years ago @kcc was involved in it, so maybe you know what should be done instead?

I haven't managed to get Clang Coverage working with the libxml2 fuzzing example from lesson 8 of Dor1s/libfuzzer-workshop, so could you tell me:

1. What is the 'rule of thumb' for managing code coverage now?
2. Is there an example of Clang Coverage used with a complex library and fuzzer, so I can see how it was done and learn from it?
3. Which libFuzzer version is used in the OSS-Fuzz project?

Best regards!


kcc · 2020-05-11T23:36:35Z

Hi Volodia,

On Fri, May 8, 2020 at 1:35 AM Volodia ***@***.***> wrote:

> Hi! I would like to understand how experienced people measure coverage for fuzzing nowadays.

By "measure", do you mean measuring for automated purposes (like corpus expansion during fuzzing) or for visualization and tracking for human consumption?

> There was quite a nice method using sancov and the libFuzzer -dump_coverage=1 flag in older libFuzzer versions, but it is now deprecated. I saw that 15 months and 2 years ago @kcc was involved in it, so maybe you know what should be done instead? I haven't managed to get Clang Coverage working with the libxml2 fuzzing example from lesson 8 of Dor1s/libfuzzer-workshop, so could you tell me:
> 1. What is the 'rule of thumb' for managing code coverage now?

OSS-Fuzz maintains a separate build of all fuzz targets with Clang Coverage and provides the coverage dashboard produced by that build. This is the way we recommend since Clang Coverage has very good visualization.

> 2. Is there an example of Clang Coverage used with a complex library and fuzzer, so I can see how it was done and learn from it?

All OSS-Fuzz projects (maybe with some minor exceptions) use Clang Coverage. Perhaps you can send us a description of the problems you are having?

> 3. Which libFuzzer version is used in the OSS-Fuzz project?

Current head (maybe with a few weeks' delay).

…

--kcc


JulianVolodia · 2020-09-07T19:31:55Z

I think I am lacking some basic experience and understanding. Until I dig deeper I don't think I can get through it. Give me one more week; after that I would be very happy if you could help me with my doubts. Still, thanks for the info you gave me, @kcc.

JulianVolodia · 2020-09-07T19:33:01Z

If you have any resources worth reading about this, or could send me a link to an example that works well and generates that lovely graph, that would be awesome.

Dor1s · 2020-09-10T20:26:02Z

The https://clang.llvm.org/docs/SourceBasedCodeCoverage.html page has instructions on how to generate a code coverage report for a single file.

If you want to generate a code coverage report for a fuzz target linked with some library (e.g. libxml), you need to make sure that all files are compiled with -fprofile-instr-generate -fcoverage-mapping.
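A minimal sketch of one way to get the flags onto every file, assuming the library uses a configure/make style build that honors CC, CXX, CFLAGS, CXXFLAGS and LDFLAGS (that is an assumption; other build systems need their own mechanism). The function name coverage_build_env is made up for illustration. The flags are also added to LDFLAGS because the link step needs them to pull in the profile runtime.

```shell
# Sketch only: prepare the environment so a typical configure/make build
# passes the coverage flags to every compile step and to the link step.
# coverage_build_env is a hypothetical helper name, not an existing tool.
coverage_build_env() {
    cov_flags="-fprofile-instr-generate -fcoverage-mapping"
    export CC=clang
    export CXX=clang++
    # Append to existing flags instead of overwriting them.
    export CFLAGS="${CFLAGS:+$CFLAGS }$cov_flags"
    export CXXFLAGS="${CXXFLAGS:+$CXXFLAGS }$cov_flags"
    # The linker invocation needs the same flags as the compiler.
    export LDFLAGS="${LDFLAGS:+$LDFLAGS }$cov_flags"
}
```

Typical use would be to call coverage_build_env, then run the library's usual configure and make steps, and finally link the fuzz target against the resulting library with the same flags plus -fsanitize=fuzzer.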

damgut · 2022-02-01T10:34:48Z

I have the same problem. I would like to have a visual coverage report, like the one the tool "gcovr" produces for gcc (e.g. in HTML).
As @Dor1s wrote, I can generate a simple coverage report by doing:

clang -fprofile-instr-generate -fcoverage-mapping hello.c
LLVM_PROFILE_FILE="coverage.profraw" ./a.out # this command creates file "coverage.profraw"
llvm-profdata-10 merge -sparse coverage.profraw -o coverage.profdata # this command creates file "coverage.profdata"
llvm-cov-10 show --format=html ./a.out -instr-profile=coverage.profdata > coverage.html

But the problem is that if I use "-fprofile-instr-generate -fcoverage-mapping" together with "-fsanitize=address,fuzzer", no ".profraw" file is created after the execution stops (crash or exit). I guess the reason is that the sanitizer aborts the program before ".profraw" is written.

Any ideas?

maflcko · 2022-02-01T10:39:01Z

You don't need the address sanitizer enabled to create coverage for your source code. -fsanitize=fuzzer together with the coverage flags should be enough and works around any sanitizer issues. That said, I recommend addressing the address sanitizer reports regardless.

damgut · 2022-02-01T11:50:17Z

Hi @MarcoFalke !
I need address sanitizer. But nevertheless this is not the problem. If I use only "-fsanitize=fuzzer" the result is the same.
Here is the code "hello.cc":

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

//int main() {
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    printf("Hello World\n");
    exit(0);
}

And this is the compilation and execution:

# Compilation:
> clang -fprofile-instr-generate -fcoverage-mapping -g -fsanitize=fuzzer hello.cc

# Execution:
> ./a.out -print_coverage=1
INFO: Seed: 1320401412
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
Hello World
==106162== ERROR: libFuzzer: fuzz target exited
    #0 0x4af000 in __sanitizer_print_stack_trace (.../a.out+0x4af000)
    #1 0x45b308 in fuzzer::PrintStackTrace() (.../a.out+0x45b308)
    #2 0x44050c in fuzzer::Fuzzer::ExitCallback() (.../a.out+0x44050c)
    #3 0x7f167468ca26 in __run_exit_handlers /build/glibc-eX1tMB/glibc-2.31/stdlib/exit.c:108:8
    #4 0x7f167468cbdf in exit /build/glibc-eX1tMB/glibc-2.31/stdlib/exit.c:139:3
    #5 0x4af310 in LLVMFuzzerTestOneInput .../hello.cc:8:5
    #6 0x441b11 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (.../a.out+0x441b11)
    #7 0x44384a in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__Fuzzer::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) (.../a.out+0x44384a)
    #8 0x443ed9 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) (.../a.out+0x443ed9)
    #9 0x432bae in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (.../a.out+0x432bae)
    #10 0x45b9f2 in main (.../a.out+0x45b9f2)
    #11 0x7f167466a0b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16
    #12 0x40794d in _start (.../a.out+0x40794d)

SUMMARY: libFuzzer: fuzz target exited
MS: 0 ; base unit: 0000000000000000000000000000000000000000
artifact_prefix='./'; Test unit written to ./crash-da39a3ee5e6b4b0d3255bfef95601890afd80709
Base64:

COVERAGE:
UNCOVERED_FUNC: hits: 0 edges: 0/1 LLVMFuzzerTestOneInput .../hello_fuzzer.cc:5

⚠️ The last 2 lines show a primitive coverage text report, perhaps someone can tell me how to convert it to a friendly format (.gcda .gcno or .html)?

Anyway, the file "default.profraw" is created but with size 0:

> ls -l
-rwxrwxr-x 1 developer developer 1356920 Feb  1 12:36 a.out
-rw-rw-r-- 1 developer developer       0 Feb  1 12:36 crash-da39a3ee5e6b4b0d3255bfef95601890afd80709
-rw-rw-r-- 1 developer developer       0 Feb  1 12:36 default.profraw
-rw-rw-r-- 1 developer developer     206 Feb  1 12:30 hello.cc

If I remove "-fsanitize=fuzzer" (and use main() in hello.cc), then the file "default.profraw" is created (and I can see the coverage):

> clang -fprofile-instr-generate -fcoverage-mapping -g  hello.cc
> ./a.out
Hello World
> ls -l
total 92
-rwxrwxr-x 1 developer developer 78968 Feb  1 12:43 a.out*
-rw-rw-r-- 1 developer developer   152 Feb  1 12:44 default.profraw
-rw-rw-r-- 1 developer developer   206 Feb  1 12:43 hello.cc
> file default.profraw
default.profraw: LLVM raw profile data, version 5

Dor1s · 2022-02-02T07:23:24Z

@damgut you need to exclude any crash inputs when generating a code coverage report. You're right that something breaks the program execution before the .profraw is dumped: it is a fuzzer crash. To get a good coverage report for your fuzzer, let it run for a while and then use the generated corpus for code coverage generation. It will also be faster to do so.

1. Run a normal fuzzer build (without coverage instrumentation):

./fuzzer -any_other_runtime_flags=1 ./your_corpora_directory

2. Minimize the corpora:

mkdir corpora_minimized
./fuzzer -merge=1 ./corpora_minimized ./your_corpora_directory

3. Run the coverage-instrumented build over the minimized corpora:

./fuzzer -runs=0 ./corpora_minimized

The .profraw generated at the last step will have the most accurate code coverage that your fuzzer was able to achieve while fuzzing in step 1.
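The steps after the fuzzing run can be sketched as a small shell helper. This is only an illustration under assumptions: the two binaries are passed in separately (the normal build and the coverage-instrumented build), the LLVM tools are on PATH as llvm-profdata and llvm-cov (distributions may add a version suffix), and the names coverage_from_corpus, run and DRY_RUN are made up for this example. Step 1, the actual fuzzing, is left out because it runs for as long as you choose.

```shell
# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# coverage_from_corpus NORMAL_FUZZER COVERAGE_FUZZER CORPUS_DIR [MINIMIZED_DIR]
coverage_from_corpus() {
    fuzzer=$1        # normal build, no coverage instrumentation
    fuzzer_cov=$2    # build with -fprofile-instr-generate -fcoverage-mapping
    corpus=$3
    minimized=${4:-corpora_minimized}

    run mkdir -p "$minimized" || return 1
    # Minimize the corpus with the normal build.
    run "$fuzzer" -merge=1 "$minimized" "$corpus" || return 1
    # Replay the minimized corpus once with the instrumented build;
    # -runs=0 executes the corpus inputs without generating new ones.
    run env LLVM_PROFILE_FILE=coverage.profraw "$fuzzer_cov" -runs=0 "$minimized" || return 1
    # Convert the raw profile and print a per-file summary table.
    run llvm-profdata merge -sparse coverage.profraw -o coverage.profdata || return 1
    run llvm-cov report "$fuzzer_cov" -instr-profile=coverage.profdata
}
```

Calling it with DRY_RUN=1 set, for example with arguments ./fuzzer ./fuzzer_cov ./your_corpora_directory, prints the commands without running them, which is a cheap way to check the paths first.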

damgut · 2022-02-02T14:59:45Z

Thanks @Dor1s, I tried what you proposed and it works. That means 2 runs: a first run until the fuzzer crashes, and a second run that uses the corpus files to generate the coverage in .profraw. The only thing missing from the coverage will be the last path if the fuzz test crashes, but this is not terrible.

Nevertheless I still have difficulty generating .profraw in a big project. I see that I can also get a text coverage report by passing the option -print_coverage=1 to the executable. The output goes to the console and looks like this:

COVERAGE:
UNCOVERED_FUNC: hits: 0 edges: 0/3 init foo.cpp:97
UNCOVERED_FUNC: hits: 0 edges: 0/1 start foo.cpp:0
UNCOVERED_FUNC: hits: 0 edges: 0/3 open() foo.h:162
COVERED_FUNC: hits: 5 edges: 4/7 bla(int a) foo.cpp:118
UNCOVERED_PC: foo.cpp:0
UNCOVERED_PC: foo.cpp:119
....

Do you know if there is a way to convert this text output into a friendly format (like gcovr, which produces HTML output)?

Dor1s · 2022-02-05T06:40:44Z

-print_coverage=1 has nothing to do with LLVM Source-based coverage instrumentation (which generates .profraw files). What are the difficulties you're having with a big project? Are you sure all the files were instrumented with -fprofile-instr-generate -fcoverage-mapping? Is the application you're running single process or does it spawn multiple processes?

damgut · 2022-02-07T21:26:59Z

Thanks @Dor1s and everybody for the fast answers!

My last problem was that I had omitted the options -fprofile-instr-generate -fcoverage-mapping when calling the linker. Now everything works as expected.

Summary

I've found 3 different ways to get coverage when using -fsanitize=address together with fuzzing:

Using -print_coverage=1

Compile and link with the -fprofile-instr-generate -fcoverage-mapping options. Then call the executable with the option -print_coverage=1. After the execution has finished (even by abort or crash), a very long list is printed to the console indicating which lines were reached. Unfortunately I didn't find any tool which can parse this info and display it in a friendly manner.
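In the absence of a ready-made tool, a rough summary can be extracted with awk. The sketch below assumes the line format shown in the sample output earlier in this thread (COVERED_FUNC / UNCOVERED_FUNC lines whose fifth field is covered/total edges); that format may differ between libFuzzer versions, and the function name summarize_print_coverage is made up for illustration.

```shell
# Sketch only: summarize libFuzzer -print_coverage=1 output read from stdin.
# Assumes lines shaped like:
#   COVERED_FUNC: hits: 5 edges: 4/7 bla(int a) foo.cpp:118
summarize_print_coverage() {
    awk '
        $1 == "COVERED_FUNC:" || $1 == "UNCOVERED_FUNC:" {
            # Field 5 is "covered/total" edges for this function.
            split($5, e, "/")
            covered += e[1]
            total += e[2]
            funcs++
            if ($1 == "COVERED_FUNC:") hit++
        }
        END {
            printf "functions: %d/%d\n", hit, funcs
            printf "edges: %d/%d\n", covered, total
        }
    '
}
```

A possible invocation is ./a.out -print_coverage=1 -runs=0 ./corpora_minimized 2>&1 | summarize_print_coverage; the 2>&1 is there because libFuzzer normally writes its log to stderr. This gives totals only, not an annotated source view.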

Using .profraw file

Compile and link with the -fprofile-instr-generate -fcoverage-mapping options. When using -fsanitize=address, no .profraw will be written on crash or abort, so once the fuzz test has finished, a second run is needed that passes only the files in the corpus, as @Dor1s proposed above: ./fuzzer -runs=0 ./corpora_minimized
Then to generate an html view I've used:

# create "coverage.profdata"
llvm-profdata-10 merge -sparse coverage.profraw -o coverage.profdata
# Generate output
llvm-cov-10 show --format=html ./a.out -instr-profile=coverage.profdata > coverage.html

The disadvantage here is that coverage.html is a single big file which contains a list of files. There is no summary or statistics.
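For what it is worth, llvm-cov can also write one HTML page per source file plus an index page when it is given -output-dir, and llvm-cov report prints a per-file summary table. A minimal sketch, assuming the tools are on PATH without a version suffix; the names html_coverage_dir, run and DRY_RUN are made up for this example.

```shell
# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# html_coverage_dir BINARY PROFDATA [OUTPUT_DIR]
html_coverage_dir() {
    binary=$1
    profdata=$2
    outdir=${3:-coverage_html}

    # Write an index page plus one HTML page per source file into outdir.
    run llvm-cov show "$binary" -instr-profile="$profdata" \
        -format=html -output-dir="$outdir" || return 1
    # Print a per-file summary table to the terminal.
    run llvm-cov report "$binary" -instr-profile="$profdata"
}
```

With the files from the commands above, html_coverage_dir ./a.out coverage.profdata should leave the report under coverage_html, with index.html as the entry point.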

Using gcovr

This is my favorite since the generated HTML consists of several files, one for each source file, together with a summary and nice statistics. Two runs are needed here as well:

1. Do a first run to generate the corpus files (no specific coverage options are needed for this). You can start as many runs as you want to fill the corpus.
2. Compile again using the option --coverage. I would not recommend using this option for the first run, since generating the .gcno and .gcda files adds overhead (in my measurement the execution took 35% longer). Since the coverage run is not done frequently, and normally with few files in the corpus, this additional time is not critical.
3. Generate the coverage:

./fuzzer -runs=0 ./corpora_minimized
mkdir coverage
gcovr --gcov-executable "llvm-cov-10 gcov" --html --html-details \
--object-directory=[directory where .o .gcno .gcda are located] \
-r [root directory, normally .] \
-f [filter for source files as regex, for example .*src/.*] \
-o coverage/coverage.html

See also: https://stackoverflow.com/questions/60840386/how-do-i-produce-a-graphical-code-profile-report-for-c-code-compiled-with-clan

chinggg · 2022-07-03T16:33:05Z

Thanks to everyone involved in the discussion! I find this issue really helpful, since there seems to be no official documentation about generating code coverage reports for libFuzzer. Just FYI, I found another tool for getting an HTML coverage overview for libFuzzer: https://github.com/vanhauser-thc/libfuzzer-cov

vors · 2023-02-28T06:21:55Z

Hi friends! I am having trouble with empty coverage. I tried running @damgut's simple example (thank you for documenting it) from #41 (comment) in the latest clang Docker container, and it doesn't produce the COVERAGE output.

Here is the repro (based on @damgut 's post).

create hello.cc file

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

//int main() {
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    printf("Hello World\n");
    exit(0);
}

Run docker container with latest clang

docker run -v$(pwd):/app -it silkeh/clang:15-bullseye

Compile in it and run

clang -fprofile-instr-generate -fcoverage-mapping -g -fsanitize=fuzzer hello.cc
./a.out -print_coverage=1

Output that I'm getting

root@bed059cbaae9:/app# ./a.out -print_coverage=1
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 1659063642
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
Hello World
==12== ERROR: libFuzzer: fuzz target exited
    #0 0x561dd2bf6784 in __sanitizer_print_stack_trace (/app/a.out+0x68784) (BuildId: d8b0663adf9142cf63d5e69ce28e312bd0471ab9)
    #1 0x561dd2bcdb37 in fuzzer::PrintStackTrace() (/app/a.out+0x3fb37) (BuildId: d8b0663adf9142cf63d5e69ce28e312bd0471ab9)
    #2 0x561dd2bb3eac in fuzzer::Fuzzer::ExitCallback() (/app/a.out+0x25eac) (BuildId: d8b0663adf9142cf63d5e69ce28e312bd0471ab9)
    #3 0x7f3b0aaba4d6  (/lib/x86_64-linux-gnu/libc.so.6+0x3b4d6) (BuildId: b503275bf9fee51581fdceef97533b194035b4f7)
    #4 0x7f3b0aaba679 in exit (/lib/x86_64-linux-gnu/libc.so.6+0x3b679) (BuildId: b503275bf9fee51581fdceef97533b194035b4f7)
    #5 0x561dd2bf6a96 in LLVMFuzzerTestOneInput /app/hello.cc:8:5
    #6 0x561dd2bb5512 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/app/a.out+0x27512) (BuildId: d8b0663adf9142cf63d5e69ce28e312bd0471ab9)
    #7 0x561dd2bb67d0 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/app/a.out+0x287d0) (BuildId: d8b0663adf9142cf63d5e69ce28e312bd0471ab9)
    #8 0x561dd2bb6e93 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/app/a.out+0x28e93) (BuildId: d8b0663adf9142cf63d5e69ce28e312bd0471ab9)
    #9 0x561dd2ba51f2 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/app/a.out+0x171f2) (BuildId: d8b0663adf9142cf63d5e69ce28e312bd0471ab9)
    #10 0x561dd2bce462 in main (/app/a.out+0x40462) (BuildId: d8b0663adf9142cf63d5e69ce28e312bd0471ab9)
    #11 0x7f3b0aaa2d09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09) (BuildId: b503275bf9fee51581fdceef97533b194035b4f7)
    #12 0x561dd2b99cc9 in _start (/app/a.out+0xbcc9) (BuildId: d8b0663adf9142cf63d5e69ce28e312bd0471ab9)

SUMMARY: libFuzzer: fuzz target exited
MS: 0 ; base unit: 0000000000000000000000000000000000000000


artifact_prefix='./'; Test unit written to ./crash-da39a3ee5e6b4b0d3255bfef95601890afd80709
Base64: 
COVERAGE:

Notice nothing is printed at the end.

Dor1s · 2023-03-06T05:31:32Z

@vors you have three options:

1. Remove inputs that trigger exit() from your corpus.
2. If you really need exit() to be invoked as part of the expected behavior, call __llvm_profile_dump before exiting the program.
3. On Mac, you can try using the %c pattern in the LLVM_PROFILE_FILE value (see https://clang.llvm.org/docs/SourceBasedCodeCoverage.html#running-the-instrumented-program, which also has a bit more context about program exits / crashes).

vors · 2023-03-06T07:37:04Z

@Dor1s oh interesting, thank you for the reply! FWIW this doesn't affect the .profraw flow. I just realized that I was still able to create this data file.

Fuzzing binary now searches for environment variable `FUZZ_CAMPAIGN_MINUTES` to automatically limit, halt execution, and dump gcov data once X minutes have elapsed. This was required to extract gcov data from a fuzzing binary as under normal circumstances manually aborting the execution did not produce any gcov data. google/fuzzing#41

