
Compared

Martin Pool edited this page Jul 31, 2022 · 3 revisions

Mutations vs Fuzz testing

Fuzz testing primarily detects cases where bad input could crash your program or cause a security problem. Mutation testing primarily detects code that's not well tested; finding missing tests might indirectly help find bugs.

Fuzz testing randomly varies the input to the program and sees if the program will crash. Mutation testing deterministically varies the program (in ways that will probably still compile) and sees if your tests detect the change.

When cargo-mutants makes systematic edits to a copy of your program, such as deleting the body of a function, those edits are probably incorrect. If no test notices the bug, that might indicate a gap in test coverage. (Or it might be that the edit is semantically harmless, or that it changes something that is very hard to test hermetically, such as making something slower…)
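As a rough illustration of a mutant that goes undetected, here is a minimal sketch. The function `is_valid_port`, its mutant, and the two checks are hypothetical names invented for this example, and the replacement body (`true`) is only an assumption about what a mutation might look like, not the exact output of cargo-mutants.

```rust
/// Hypothetical function under test: accepts ports 1 through 65535.
fn is_valid_port(port: u32) -> bool {
    (1..=65_535).contains(&port)
}

/// What a mutant might look like: the whole body replaced by a constant.
/// (Assumed shape for illustration only.)
fn is_valid_port_mutant(_port: u32) -> bool {
    true
}

/// A weak check: it only tries an input that should be accepted,
/// so it cannot tell the original from the mutant.
fn weak_check(f: fn(u32) -> bool) -> bool {
    f(8080)
}

/// A stronger check: it also tries inputs that must be rejected,
/// so it fails when given the mutant.
fn strong_check(f: fn(u32) -> bool) -> bool {
    f(8080) && !f(0) && !f(70_000)
}

#[test]
fn weak_check_misses_the_mutant() {
    assert!(weak_check(is_valid_port));
    assert!(weak_check(is_valid_port_mutant)); // the gap: mutant not detected
}

#[test]
fn strong_check_catches_the_mutant() {
    assert!(strong_check(is_valid_port));
    assert!(!strong_check(is_valid_port_mutant));
}
```

If the test suite contained only the weak check, the mutant would survive, which points to the missing test for rejected inputs.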

Fuzz testing is generally used to tell you that your program will crash/hang/spin on bad input (i.e. there are bugs in your code, if you can’t trust the input).

Fuzz testing (e.g. with cargo-fuzz for Rust) requires that you have an interface to “run the code on this blob of bytes”, e.g. “parse this blob of bytes as an MP3 file”. Mutation testing doesn’t make any assumptions about the structure and doesn’t need you to write a harness: you can just run it on any existing crate with hermetic tests.
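To make the "run the code on this blob of bytes" interface concrete, here is a small sketch using only the standard library. The record format, `parse_record`, and `fuzz_one` are all hypothetical; in a real cargo-fuzz setup the body of `fuzz_one` would typically sit inside the `fuzz_target!` macro from the `libfuzzer-sys` crate instead of a plain function.

```rust
/// Hypothetical parser under test: a 2-byte big-endian length prefix
/// followed by that many payload bytes. Returns `None` on malformed input
/// rather than panicking.
fn parse_record(data: &[u8]) -> Option<&[u8]> {
    let len_bytes = data.get(..2)?;
    let len = u16::from_be_bytes([len_bytes[0], len_bytes[1]]) as usize;
    data.get(2..2 + len)
}

/// Harness-shaped entry point: takes arbitrary bytes and feeds them to the
/// code under test. A fuzzer would call this repeatedly with generated
/// inputs and report any panic or hang.
fn fuzz_one(data: &[u8]) {
    // The result is deliberately ignored: the fuzzer only cares that this
    // call returns without crashing.
    let _ = parse_record(data);
}

#[test]
fn harness_accepts_arbitrary_bytes() {
    fuzz_one(&[]);
    fuzz_one(&[0xff]);
    fuzz_one(&[0xff, 0xff, 1, 2, 3]);
}
```

Mutation testing needs no such entry point, because it reuses whatever tests the crate already has.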

Fuzz testing is necessarily sampling-based because there are astronomically many possible input files once you get beyond a couple of bytes of input: 256^n_bytes possibilities for an input of n_bytes bytes. Mutation testing can in principle test every mutation it can generate. (Although for large crates that might be a large number, it’s still probably only in the thousands to millions, not astronomical.)
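To put numbers on the size of the input space: each byte has 256 possible values, so there are 256 to the power n distinct inputs of exactly n bytes. The helper below is a hypothetical sketch that computes this count and shows how quickly it outgrows even a 128-bit integer.

```rust
/// Number of distinct inputs that are exactly `n_bytes` long, or `None`
/// if the count no longer fits in a `u128`.
fn input_space(n_bytes: u32) -> Option<u128> {
    256u128.checked_pow(n_bytes)
}

#[test]
fn input_space_grows_exponentially() {
    assert_eq!(input_space(1), Some(256));
    assert_eq!(input_space(2), Some(65_536));
    assert_eq!(input_space(4), Some(4_294_967_296));
    // 256^16 = 2^128, which is one more than u128::MAX.
    assert_eq!(input_space(16), None);
}
```

Four bytes already give over four billion inputs, and sixteen bytes overflow a `u128`, whereas the number of mutants grows roughly with the size of the source code.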

Mutations vs Coverage

Both mutation testing and coverage measurement aim to tell you something about gaps in the test suite of your program. But they can give you somewhat different information, so in my opinion both are useful.

cargo-mutants should work the same way on any platform that can build Rust, with no need for any other tools or setup, and no modifications to the source tree. Many coverage tools need OS-specific setup, only work on some platforms, require setup of external tools or third-party services, or can be fiddly to get working. (Although great work is being done in the Rust community to address these issues and to provide easier-to-use tooling.)

cargo-mutants output is, in my opinion, more immediately actionable: a list of functions whose return values could be replaced without any test noticing, each of which you can fix, skip, or ignore. Examining coverage output tends to require scrolling through annotated files to see which parts might need to be covered.

Some coverage tools report that some lines are not covered for mysterious reasons connected to the heuristic mapping between source code and the binary instructions that are actually measured. It can be harder, in my opinion, to understand whether a coverage gap is something you should fix.

Coverage tools run the tests just once to measure coverage, but cargo-mutants runs them once for each mutant, and so it's probably slower.

Coverage tools tell you that the line (or in some cases, branch) was executed during a test. Mutation testing tells you that a test actually depends on the results of executing a function. A function whose result is ignored or overwritten by later code will be marked as covered, but flagged by mutation testing.
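The difference between "executed" and "depended on" can be sketched as follows. All names here (`checksum`, `checksum_mutant`, `stored_len`) are hypothetical, and the constant-returning replacement is an assumed shape for a mutant rather than the exact output of any tool.

```rust
/// Hypothetical helper: sums the bytes of the input.
fn checksum(data: &[u8]) -> u32 {
    data.iter().map(|&b| u32::from(b)).sum()
}

/// Mutant-style replacement: same signature, body replaced by a constant.
fn checksum_mutant(_data: &[u8]) -> u32 {
    0
}

/// Calls the checksum function but never uses its result. Every line of the
/// checksum function runs, so line coverage reports it as covered.
fn stored_len(data: &[u8], checksum_fn: fn(&[u8]) -> u32) -> usize {
    let _unused = checksum_fn(data);
    data.len()
}

#[test]
fn test_passes_with_original_and_with_mutant() {
    // The same assertion holds for both, so this test does not depend on
    // what the checksum function returns.
    assert_eq!(stored_len(b"abc", checksum), 3);
    assert_eq!(stored_len(b"abc", checksum_mutant), 3);
}
```

A coverage report would show the checksum code as fully exercised, while a mutation run would flag it because swapping in the mutant changes no test outcome.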

Coverage works at the granularity of a line, whereas mutation typically works at a larger granularity. Currently, cargo-mutants replaces entire functions.

It is possible to use either coverage measures or mutation testing as a pre-submit check, and some people do.

cargo-mutants vs Mutagen

There's an existing Rust mutation testing tool called Mutagen.

Some differences are:

  • Mutagen requires changes to the source tree: each function to be mutated must be marked with an attribute. cargo-mutants can work with any unmodified tree.

  • Mutagen needs a nightly compiler. cargo-mutants can be built with any recent stable (or nightly) compiler and can be used with very old compilers.

  • Mutagen builds the tree only once; cargo-mutants does an incremental build for each mutation.

    On the upside, building for each mutation gives cargo-mutants the freedom to try mutations it's not sure will compile.

    Typically the incremental builds are relatively cheap compared to the time to run the tests.

Please let me know anything else that should be added or corrected.
