Xonsh + Mut.py: Filtering Mut.py’s Output

Mut.py is a useful tool for performing mutation testing on Python programs. If you want to learn more about that, see my blog post over on the PyBites blog. In short, mutation testing helps us test our tests to make sure that they cover the program completely by making small changes to the code and then rerunning the tests for each change.

That’s useful information, but sometimes the output can be a bit too much. As an example, let’s set up a small program and some tests.

#example.py
import math

def check_prime(number):
    if number < 2:
        return False

    for i in range(2, int(math.sqrt(number)) + 1):
        if number % i == 0:
            break
    else:
        return True

    return False

#test_example.py
import pytest

from example import check_prime

def test_prime():
    assert check_prime(7)

The test passes! However, it should be obvious that simply making sure that our function can tell that 7 is prime isn’t enough to cover all its functionality. Let’s see what mut.py has to say about this.

$ mut.py --target example --unit-test test_example --runner pytest                                                                                          
[*] Start mutation process:
   - targets: example
   - tests: test_example
[*] 1 tests passed:
   - test_example [0.55317 s]
[*] Start mutants generation and execution:
   - [#   1] AOR example: [0.06451 s] survived
   - [#   2] AOR example: [0.05991 s] survived
   - [#   3] BCR example: [0.06143 s] survived
   - [#   4] COI example: [0.11853 s] killed by test_example.py::test_prime
   - [#   5] COI example: [0.11193 s] killed by test_example.py::test_prime
   - [#   6] CRP example: [0.06105 s] survived
   - [#   7] CRP example: [0.06164 s] survived
   - [#   8] CRP example: [0.06224 s] survived
   - [#   9] CRP example: [0.11637 s] killed by test_example.py::test_prime
   - [#  10] ROR example: [0.11332 s] killed by test_example.py::test_prime
   - [#  11] ROR example: [0.06272 s] survived
   - [#  12] ROR example: [0.11869 s] killed by test_example.py::test_prime
[*] Mutation score [1.67688 s]: 41.7%
   - all: 12
   - killed: 5 (41.7%)
   - survived: 7 (58.3%)
   - incompetent: 0 (0.0%)
   - timeout: 0 (0.0%)

Looks like there’s still a lot of work to be done. Seven out of the twelve mutants survived the test, but this doesn’t tell us anything about where the coverage is lacking. Let’s try adding the -m flag to mut.py.

$ mut.py --target example --unit-test test_example --runner pytest -m
[*] Start mutation process:
   - targets: example
   - tests: test_example
[*] 1 tests passed:
   - test_example [0.57480 s]
[*] Start mutants generation and execution:
   - [#   1] AOR example: 
--------------------------------------------------------------------------------
   2: 
   3: def check_prime(number):
   4:     if number < 2:
   5:         return False
-  6:     for i in range(2, int(math.sqrt(number)) + 1):
+  6:     for i in range(2, int(math.sqrt(number)) - 1):
   7:         if number % i == 0:
   8:             break
   9:     else:
  10:         return True
--------------------------------------------------------------------------------
[0.07507 s] survived
   - [#   2] AOR example: 
--------------------------------------------------------------------------------
   3: def check_prime(number):
   4:     if number < 2:
   5:         return False
   6:     for i in range(2, int(math.sqrt(number)) + 1):
-  7:         if number % i == 0:
+  7:         if number * i == 0:
   8:             break
   9:     else:
  10:         return True
  11:     
--------------------------------------------------------------------------------
[0.17719 s] survived
   - [#   3] BCR example: 
--------------------------------------------------------------------------------
   4:     if number < 2:
   5:         return False
   6:     for i in range(2, int(math.sqrt(number)) + 1):
   7:         if number % i == 0:
-  8:             break
+  8:             continue
   9:     else:
  10:         return True
  11:     
  12:     return False
--------------------------------------------------------------------------------
[0.05884 s] survived
   - [#   4] COI example: 
--------------------------------------------------------------------------------
   1: import math
   2: 
   3: def check_prime(number):
-  4:     if number < 2:
+  4:     if not (number < 2):
   5:         return False
   6:     for i in range(2, int(math.sqrt(number)) + 1):
   7:         if number % i == 0:
   8:             break
--------------------------------------------------------------------------------
[0.11854 s] killed by test_example.py::test_prime
   - [#   5] COI example: 
--------------------------------------------------------------------------------
   3: def check_prime(number):
   4:     if number < 2:
   5:         return False
   6:     for i in range(2, int(math.sqrt(number)) + 1):
-  7:         if number % i == 0:
+  7:         if not (number % i == 0):
   8:             break
   9:     else:
  10:         return True
  11:     
--------------------------------------------------------------------------------
[0.11615 s] killed by test_example.py::test_prime
[SNIP!]

Suddenly, it’s too much information!

We get the mutation and the context, which can help us pinpoint where tests need to be improved, but since even mutations that were killed show up here, it’s hard to tell at a glance which ones are safe to ignore and which ones to pay attention to.
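
To see why a survivor points at a gap in the tests, it can help to run one by hand. The sketch below copies mutant #1 (the + 1 changed to - 1) into a hypothetical function named check_prime_mutant_1 and compares it with the original. The choice of 9 as a composite input is an assumption made for illustration, not something mut.py suggests.

```python
import math

def check_prime(number):
    if number < 2:
        return False
    for i in range(2, int(math.sqrt(number)) + 1):
        if number % i == 0:
            break
    else:
        return True
    return False

def check_prime_mutant_1(number):
    # Hypothetical hand-made copy of mutant #1 (AOR): "+ 1" became "- 1".
    if number < 2:
        return False
    for i in range(2, int(math.sqrt(number)) - 1):
        if number % i == 0:
            break
    else:
        return True
    return False

# With 7 the two agree, so test_prime cannot tell them apart: the mutant survives.
same_for_7 = check_prime(7) == check_prime_mutant_1(7)

# With 9 they disagree: the mutant's range is empty, so it never tries the divisor 3.
differs_for_9 = check_prime(9) != check_prime_mutant_1(9)

print(same_for_7, differs_for_9)
```

A test that asserts a composite number is not prime would therefore kill this mutant, which is the kind of hint the diff context is meant to give.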

Let’s pause for a moment here and talk about xonsh’s Callable Aliases. Xonsh, like Bash, has the ability to add aliases for common commands. Unlike Bash, xonsh’s aliases are also the method we can use to access Python functions from subprocess mode.

Aliases are stored in a mapping similar to a dictionary called, aptly, aliases. So we can add an alias by setting a key.

aliases["lt"] = "ls --human-readable --size -1 -S --classify"

Callable aliases extend this idea to form a bridge between a Python function and subprocess mode. Normally, using anything from Python in subprocess mode requires special syntax such as @(...), which is useful but limited.

We can define a callable alias just like any Python function. Since our goal is to filter out some of the noise in mut.py’s output, let’s get started on that.

A callable alias can be passed the arguments from the command (as a list of strings), stdin, stdout, and a couple of other, more obscure values. Our function needs stdin, which means args must be defined too: xonsh decides which values to pass in based on argument position, not name.

Here’s how to register the alias with xonsh:

#~/.xonshrc
def _filter_mutpy(args, stdin=None):
    if not stdin:
        return "No input to filter"

aliases["filter_mutpy"] = _filter_mutpy

$ filter_mutpy
No input to filter

Success! When called with no stdin, there’s nothing for our function to parse. Xonsh accepts a string as the return value, which is appended to stdout. There are two more optional values that could also be used: stderr and a return code. To use them, just return a tuple like (stdout, stderr) or (stdout, stderr, return code).

Now that we have our alias configured in xonsh, it’s time to add the functionality we want: taming mut.py’s output.

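def _filter_mutpy(args, stdin=None):
    # Expects the output of `mut.py -m` on stdin.
    if stdin is None:
        return "No input to filter"

    mutant = []             # lines of the mutant currently being read
    collect_mutant = False  # True while we are inside a mutant block
    for line in stdin:
        if " s] " in line and collect_mutant:
            # The timing line ("[0.07507 s] survived") ends the block.
            collect_mutant = False
            mutant.append(line)
            if "incompetent" in line or "killed" in line:
                # Already dealt with: keep only the header and the result.
                print(mutant[0], end="")
                print(mutant[-1], end="")
            else:
                # Survived (or timed out): show the full diff.
                print("".join(mutant), end="")
            mutant = []
        elif "- [#" in line and not collect_mutant:
            # A header such as "- [#   1] AOR example:" starts a block.
            collect_mutant = True
            mutant.append(line)
        elif collect_mutant:
            mutant.append(line)
        else:
            # Anything outside a mutant block passes straight through.
            print(line, end="")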

aliases["filter_mutpy"] = _filter_mutpy
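
Because the alias is an ordinary function, its logic can also be tried outside xonsh. The sketch below repeats the function so the block runs on its own, feeds it a made-up, shortened sample shaped like mut.py -m output, and captures what it prints. It assumes io.StringIO is a close enough stand-in for the stdin object xonsh passes in, since all the function does is iterate over it line by line.

```python
import io
from contextlib import redirect_stdout

def _filter_mutpy(args, stdin=None):
    # Same logic as the alias, repeated so this block is self-contained.
    if stdin is None:
        return "No input to filter"
    mutant = []
    collect_mutant = False
    for line in stdin:
        if " s] " in line and collect_mutant:
            collect_mutant = False
            mutant.append(line)
            if "incompetent" in line or "killed" in line:
                print(mutant[0], end="")
                print(mutant[-1], end="")
            else:
                print("".join(mutant), end="")
            mutant = []
        elif "- [#" in line and not collect_mutant:
            collect_mutant = True
            mutant.append(line)
        elif collect_mutant:
            mutant.append(line)
        else:
            print(line, end="")

# Made-up, shortened sample in the shape of `mut.py -m` output.
SAMPLE = """\
[*] Start mutants generation and execution:
   - [#   1] AOR example:
-  7:         if number % i == 0:
+  7:         if number * i == 0:
[0.17719 s] survived
   - [#   2] COI example:
-  4:     if number < 2:
+  4:     if not (number < 2):
[0.11854 s] killed by test_example.py::test_prime
"""

# Capture what the alias prints so it can be inspected.
captured = io.StringIO()
with redirect_stdout(captured):
    _filter_mutpy([], stdin=io.StringIO(SAMPLE))
filtered = captured.getvalue()
print(filtered, end="")
```

The surviving mutant keeps its diff, while the killed one is reduced to its header and result line.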

Now we can pipe mut.py into our alias and get this result (from a run after I added a second test, test_not_prime):

$ mut.py --target example --unit-test test_example --runner pytest -m | filter_mutpy                                                                        
[*] Start mutation process:
   - targets: example
   - tests: test_example
[*] 2 tests passed:
   - test_example [0.52779 s]
[*] Start mutants generation and execution:
   - [#   1] AOR example: 
[0.12564 s] killed by test_example.py::test_not_prime
   - [#   2] AOR example: 
[0.12044 s] killed by test_example.py::test_not_prime
   - [#   3] BCR example: 
[0.12248 s] killed by test_example.py::test_not_prime
   - [#   4] COI example: 
[0.12042 s] killed by test_example.py::test_prime
   - [#   5] COI example: 
[0.11927 s] killed by test_example.py::test_prime
   - [#   6] CRP example: 
--------------------------------------------------------------------------------
   1: import math
   2: 
   3: def check_prime(number):
-  4:     if number < 2:
+  4:     if number < 3:
   5:         return False
   6:     for i in range(2, int(math.sqrt(number)) + 1):
   7:         if number % i == 0:
   8:             break
--------------------------------------------------------------------------------
[0.06259 s] survived
   - [#   7] CRP example: 
[0.12793 s] killed by test_example.py::test_not_prime
   - [#   8] CRP example: 
--------------------------------------------------------------------------------
   2: 
   3: def check_prime(number):
   4:     if number < 2:
   5:         return False
-  6:     for i in range(2, int(math.sqrt(number)) + 1):
+  6:     for i in range(2, int(math.sqrt(number)) + 2):
   7:         if number % i == 0:
   8:             break
   9:     else:
  10:         return True
--------------------------------------------------------------------------------
[0.06272 s] survived
   - [#   9] CRP example: 
[0.12543 s] killed by test_example.py::test_prime
   - [#  10] ROR example: 
[0.12325 s] killed by test_example.py::test_prime
   - [#  11] ROR example: 
--------------------------------------------------------------------------------
   1: import math
   2: 
   3: def check_prime(number):
-  4:     if number < 2:
+  4:     if number <= 2:
   5:         return False
   6:     for i in range(2, int(math.sqrt(number)) + 1):
   7:         if number % i == 0:
   8:             break
--------------------------------------------------------------------------------
[0.06679 s] survived
   - [#  12] ROR example: 
[0.12549 s] killed by test_example.py::test_prime
[*] Mutation score [2.04439 s]: 75.0%
   - all: 12
   - killed: 9 (75.0%)
   - survived: 3 (25.0%)
   - incompetent: 0 (0.0%)
   - timeout: 0 (0.0%)

Awesome! Every code snippet is now related to a mutant that survived, so we can see at a glance which ones are important. I used that to improve the tests, which is why the run above shows two tests passing and more mutants killed than before.
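
The three remaining survivors (#6, #8, and #11) all change how the smallest inputs are handled. One possible next step, offered as a sketch and not necessarily the test used for any run shown here, is a test for the number 2. The names make_check_prime and test_two_is_prime are hypothetical, and the mutants are hand-made copies used only to check the claim.

```python
import math

def make_check_prime(too_small=lambda n: n < 2, stop_offset=1):
    # Builds check_prime; the defaults reproduce the original from example.py.
    def check_prime(number):
        if too_small(number):
            return False
        for i in range(2, int(math.sqrt(number)) + stop_offset):
            if number % i == 0:
                break
        else:
            return True
        return False
    return check_prime

check_prime = make_check_prime()

# Hand-made copies of the three surviving mutants from the filtered output.
survivors = {
    "#6 CRP (number < 3)": make_check_prime(too_small=lambda n: n < 3),
    "#8 CRP (+ 2)": make_check_prime(stop_offset=2),
    "#11 ROR (number <= 2)": make_check_prime(too_small=lambda n: n <= 2),
}

def test_two_is_prime():
    # Hypothetical extra test for test_example.py.
    assert check_prime(2)

test_two_is_prime()  # passes against the original

# Each mutant wrongly reports that 2 is not prime, so the new test would kill it.
killed = {name: not mutant(2) for name, mutant in survivors.items()}
print(killed)
```

A single boundary case covers all three here because each mutation only changes behavior for the number 2.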

This is a relatively simple example of xonsh’s power, but remember that the entire Python standard library and ecosystem are available to parse, filter, and act on the output of any command-line interface.

I’m looking forward to discovering more ways to use callable aliases in my work. Got any ideas?