qtest
— Quick Unit Tests for OCaml —
version 2.0*

Vincent Hugot & the Batteries team

1 Introduction

In a nutshell, qtest is a small program which reads .ml and .mli source files and extracts inline oUnit unit tests from them. It is used internally by the OCaml Batteries project, and is shipped with it as of version 2.0, but it does not depend on it and can be compiled and used independently.

Browse its code in the Batteries GitHub repository.

This document is available as either one big page or several smaller pages. Contents are the same.

2 Using qtest : a Quick, Simple Example

Say that you have a file foo.ml, which contains the implementation of your new, shiny function foo.

let rec foo x0 f = function
  [] -> 0 | x::xs -> f x (foo x0 f xs)

Maybe you don’t feel confident about that code; or maybe you do, but you know that the function might be re-implemented less trivially in the future and want to prevent potential regressions. Or maybe you simply think unit tests are good practice anyway. In any case, you feel that building a separate test suite for this would be overkill. Using qtest, you can immediately put simple unit tests in comments near foo, for instance:

(*$T foo
  foo  0 ( + ) [1;2;3] = 6
  foo  0 ( * ) [1;2;3] = 0
  foo  1 ( * ) [4;5]   = 20
  foo 12 ( + ) []      = 12
*)

The syntax is simple: (*$ introduces a qtest "pragma", such as T in this case. T is by far the most common pragma and represents a "simple" unit test. T expects a "header", which is most of the time simply the name of the function under test, here foo. Following that, each line is a "statement", which must evaluate to true for the test to pass. Furthermore, foo must appear in each statement.

Now, in order to execute those tests, you need to extract them; this is done with the qtest executable. The command

$ qtest -o footest.ml extract foo.ml
Target file: `footest.ml'. Extraction : `foo.ml' Done.

will create a file footest.ml; it’s not terribly human-readable, but you can see that it contains your tests as well as some oUnit boilerplate. Now you need to compile the tests, for instance with ocamlbuild, and assuming oUnit was installed for ocamlfind.

$ ocamlbuild -cflags -warn-error,+26 -use-ocamlfind -package oUnit \
    footest.native
Finished, 10 targets (1 cached) in 00:00:00.

Note that the -cflags -warn-error,+26 is not indispensable but strongly recommended. Its function will be explained in more detail in the more technical sections of this documentation, but roughly it makes sure that if you write a test for foo, via (*$T foo for instance, then foo is actually tested by each statement – the tests won’t compile if not.

Important note: in order for this to work, ocamlbuild must know where to find foo.ml; if footest.ml is not in the same directory, you must make provisions to that effect. If foo.ml needs some specific flags in order to compile, they must also be passed.

Now there only remains to run the tests:

$ ./footest.native
..FF
==============================================================================
Failure: qtest:0:foo:3:foo.ml:10

OUnit: foo.ml:10::>  foo 12 ( + ) [] = 12
------------------------------------------------------------------------------
==============================================================================
Failure: qtest:0:foo:2:foo.ml:9

OUnit: foo.ml:9::>  foo 1 ( * ) [4;5] = 20
------------------------------------------------------------------------------
Ran: 4 tests in: 0.00 seconds.
FAILED: Cases: 4 Tried: 4 Errors: 0 Failures: 2 Skip:0 Todo:0

Oops, something’s wrong... either the tests are incorrect or foo is. Finding and fixing the problem is left as an exercise for the reader. When this is done, you get the expected

$ ./footest.native
....
Ran: 4 tests in: 0.00 seconds.
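Both runs can be reproduced by evaluating the four statements as ordinary boolean expressions. The sketch below is plain OCaml, not the code that qtest generates; the names foo_buggy and statements are chosen for this illustration only, and the corrected foo is the one given in Section 4.2.3.

```ocaml
(* foo as first written in Section 2: the empty-list case returns 0. *)
let rec foo_buggy x0 f = function
  | [] -> 0
  | x :: xs -> f x (foo_buggy x0 f xs)

(* Corrected version, as given in Section 4.2.3: the base case returns x0. *)
let rec foo x0 f = function
  | [] -> x0
  | x :: xs -> f x (foo x0 f xs)

(* The four statements of the T test, parameterised by the function under test. *)
let statements foo = [
  foo  0 ( + ) [1;2;3] = 6;
  foo  0 ( * ) [1;2;3] = 0;
  foo  1 ( * ) [4;5]   = 20;
  foo 12 ( + ) []      = 12;
]

(* One boolean per statement, in source order. *)
let before_fix = statements foo_buggy
let after_fix = statements foo
```

Here before_fix has two true values followed by two false ones, which corresponds to the ..FF line of the first run, and after_fix contains only true values.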

Tip: those steps are easy to automate, for instance with a small shell script:

set -e # stop on first error
qtest -o footest.ml extract foo.ml
ocamlbuild -cflags -warn-error,+26 -use-ocamlfind -package oUnit footest.native
./footest.native

3 More qtest Pragmas

3.1 Different Kinds of Tests

3.1.1 Simple Tests: T for "Test"

The most common kind of tests is the simple test, an example of which is given above. It is of the form

(*$T <header>
  <statement>
  ...
*)

where each statement must be a boolean OCaml expression involving the function (or functions, as we will see when we study headers) referenced in the header. The overall test is considered successful if each statement evaluates to true. Note that the "close comment" *) must appear on a line of its own.

Tip: if a statement is a bit too long to fit on one line, it can be broken using a backslash (\), immediately followed by the line break. This also applies to randomised tests.

3.1.2 Equality Tests: =

The vast majority of test cases tend to involve the equality of two expressions; using simple tests, one would write something like:

(*$T foo
  foo 1 ( * ) [4;5] = foo 3 ( * ) [1;5;2]
*)

While this certainly works, the failure report for such a test does not convey any useful information besides the simple fact that the test failed. Wouldn’t it be nice if the report also mentioned the values of the left-hand side and the right-hand side? Yes it would, and specialised equality tests provide such functionality, at the cost of a little bit of boilerplate code. The bare syntax is:

(*$= <header>
  <lhs> <rhs>
  ...
*)

However, used bare, an equality test will not provide much more information than a simple test: just a laconic “not equal”. In order for the values to be printed, a “value printer” must be specified for the test. A printer is a function of type α → string, where α is the type of the expressions on both sides of the equality. To pass the printer to the test, we use parameter injection (cf. Section 4.2.5); equality tests have an optional argument printer for this purpose. In our example, we have α = int, so the test becomes simply:

(*$= foo & ~printer:string_of_int
  (foo 1 ( * ) [4;5]) (foo 3 ( * ) [1;5;2])
*)

The failure report will now be more explicit, saying expected: 20 but got: 30.
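The role of the printer can be sketched in plain OCaml. The helper check_equal below is hypothetical: it belongs neither to qtest nor to oUnit, and only mimics the two messages quoted in this section. The definition of foo is the corrected one from Section 4.2.3.

```ocaml
(* Corrected foo, as given in Section 4.2.3. *)
let rec foo x0 f = function
  | [] -> x0
  | x :: xs -> f x (foo x0 f xs)

(* Hypothetical helper: compare two values and describe a mismatch.
   Without a printer the report is terse; with one, both values are shown. *)
let check_equal ?printer lhs rhs =
  if lhs = rhs then Ok ()
  else
    match printer with
    | None -> Error "not equal"
    | Some p ->
      Error (Printf.sprintf "expected: %s but got: %s" (p lhs) (p rhs))

let bare = check_equal (foo 1 ( * ) [4;5]) (foo 3 ( * ) [1;5;2])

let with_printer =
  check_equal ~printer:string_of_int
    (foo 1 ( * ) [4;5]) (foo 3 ( * ) [1;5;2])
```

The left-hand side evaluates to 20 and the right-hand side to 30, so only the second call can say which values differed.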

3.1.3 Randomized Tests: Q for "Quickcheck"

Quickcheck is a small library useful for randomized unit tests. Using it is a bit more complex, but much more rewarding than simple tests.

(*$Q <header>
  <generator> (fun <generated value> -> <statement>)
  ...
*)

Let us dive into an example straight-away:

(*$Q foo
  Q.small_int (fun i-> foo i (+) [1;2;3] = List.fold_left (+) i [1;2;3])
*)

The Quickcheck module is accessible simply as Q within inline tests; small_int is a generator, yielding a random, small integer. When the test is run, each statement will be evaluated for a large number of random values – 100 by default. Running this test for the above definition of foo catches the mistake easily:

law foo.ml:14::>  Q.small_int (fun i-> foo i (+) [1;2;3]
    = List.fold_left (+) i [1;2;3])
failed for 2

Note that the random value for which the test failed is provided by the error message – here it is 2. It is also possible to generate several random values simultaneously using tuples. For instance

(Q.pair Q.small_int (Q.list Q.small_int)) \
  (fun (i,l)-> foo i (+) l = List.fold_left (+) i l)

will generate both an integer and a list of small integers randomly. A failure will then look like

law foo.ml:15::>  (Q.pair Q.small_int (Q.list Q.small_int))
    (fun (i,l)-> foo i (+) l = List.fold_left (+) i l)
failed for (727, [4; 3; 6; 1; 788; 49])
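What such a randomised run amounts to can be sketched without the Quickcheck module. In the sketch below, small_int and first_counterexample are hypothetical stand-ins written with the standard Random module; the range chosen for small_int is an assumption, and the real Q.small_int may draw from a different one.

```ocaml
(* foo as first written in Section 2, and its correction from Section 4.2.3. *)
let rec foo_buggy x0 f = function
  | [] -> 0
  | x :: xs -> f x (foo_buggy x0 f xs)

let rec foo x0 f = function
  | [] -> x0
  | x :: xs -> f x (foo x0 f xs)

(* Assumed stand-in for Q.small_int: a random integer in [0, 100). *)
let small_int () = Random.int 100

(* Draw up to count values and return the first one that breaks the law. *)
let first_counterexample ?(count = 100) gen law =
  let rec loop n =
    if n = 0 then None
    else
      let v = gen () in
      if law v then loop (n - 1) else Some v
  in
  loop count

(* The statement of the Q test, parameterised by the function under test. *)
let law foo i = foo i ( + ) [1;2;3] = List.fold_left ( + ) i [1;2;3]

let buggy_result = first_counterexample small_int (law foo_buggy)
let fixed_result = first_counterexample small_int (law foo)
```

With the original foo the law only holds for i = 0, so a counterexample is found almost immediately; with the corrected foo no counterexample exists and the result is None.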

Available Generators:

Simple generators:
unit, bool, float, pos_float, neg_float, int, pos_int, small_int, neg_int, char, printable_char, numeral_char, string, printable_string, numeral_string
Structure generators:
list and array. They take one generator as their argument. For instance (Q.list Q.neg_int) is a generator of lists of (uniformly taken) negative integers.
Tuple generators:
pair and triple are respectively binary and ternary. See above for an example of pair.
Size-directed generators:
string, numeral_string, printable_string, list and array all have *_of_size variants that take the size of the structure as their first argument.

Tips:

Duplicate Elements in Lists: When generating lists, avoid Q.list Q.int unless you have a good reason to do so. The reason is that, given the size of the Q.int space, you are unlikely to generate any duplicate elements. If you wish to test your function’s behaviour with duplicates, prefer Q.list Q.small_int.
Changing Number of Tests: If you want a specific test to execute each of its statements a specific number of times (deviating from the default of 100), you can specify it explicitly through parameter injection (cf. Section 4.2.5) using the count : int argument.
Getting a Better Counterexample: By default, a random test stops as soon as one of its generated values yields a failure. This first failure value is probably not the best possible counterexample. You can force qtest to generate and test all count random values regardless, and to display the value which is smallest with respect to a certain measure which you define. To this end, it suffices to use parameter injection to pass argument small : α → β, where α is the type of generated values and β is any totally ordered set (wrt. <). Typically you will take β = int or β = float. Example:
```
let fuz x = x
let rec flu = function
  | [] -> []
  | x :: l -> if List.mem x l then flu l else x :: flu l

(*$Q fuz; flu & ~small:List.length
  (Q.list Q.small_int) (fun x -> fuz x = flu x)
*)
```
The meaning of small:List.length is therefore simply: “choose the shortest list”. For very complicated cases, you can simultaneously increase count to yield an even higher-quality counterexample.
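The selection of the smallest counterexample can be sketched as follows. The function smallest_counterexample is hypothetical and takes an explicit list of candidate values instead of generating them randomly, so that the outcome is reproducible; it is not how qtest is implemented.

```ocaml
let fuz x = x
let rec flu = function
  | [] -> []
  | x :: l -> if List.mem x l then flu l else x :: flu l

(* Test every candidate and keep the failing one that is smallest
   with respect to the measure small. *)
let smallest_counterexample ~small candidates law =
  List.fold_left
    (fun best v ->
       if law v then best
       else
         match best with
         | Some b when small b <= small v -> best
         | _ -> Some v)
    None candidates

(* Candidates chosen by hand for the illustration. *)
let candidates = [ [1;2;3]; [4;4;5;6]; [7;7]; [1;1;1] ]

let best =
  smallest_counterexample ~small:List.length candidates
    (fun x -> fuz x = flu x)
```

Three of the four candidates contain duplicates and therefore fail; the shortest of them, [7; 7], is the one reported.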

3.1.4 Raw oUnit Tests: R for "Raw"

When more specialised test pragmas are too restrictive, for instance if the test is too complex to reasonably fit on one line, then one can use raw oUnit tests.

(*$R <header>
  <raw oUnit test>...
  ...
*)

Here is a small example, with two tests strung together:

(*$R foo
  let thing = foo  1 ( * )
  and li = [4;5] in
  assert_bool "something_witty" (thing li = 20);
  assert_bool "something_wittier" (foo 12 ( + ) [] = 12)
*)

Note that if the first assertion fails, the second will not be executed; so stringing two assertions in that mode is different in that respect from doing so under a T pragma, for instance.

That said, raw tests should only be used as a last resort; for instance you don’t automatically get the source file and line number when the test fails. If T and Q do not satisfy your needs, then it is probably a hint that the test is a bit complex and, maybe, belongs in a separate test suite rather than in the middle of the source code.
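The fact that a failed assertion prevents the following ones from running can be observed with a small stand-in. In the sketch below, assert_bool and Assertion_failed are hypothetical replacements for the oUnit function of the same name, and second_reached exists only to record how far the test body got.

```ocaml
(* foo as first written in Section 2 (the empty-list case returns 0). *)
let rec foo x0 f = function
  | [] -> 0
  | x :: xs -> f x (foo x0 f xs)

(* Hypothetical stand-in for the assert_bool of oUnit: raise on false. *)
exception Assertion_failed of string
let assert_bool msg cond = if not cond then raise (Assertion_failed msg)

(* Records whether execution went past the first assertion. *)
let second_reached = ref false

let raw_test () =
  let thing = foo 1 ( * )
  and li = [4;5] in
  assert_bool "something_witty" (thing li = 20);
  second_reached := true;
  assert_bool "something_wittier" (foo 12 ( + ) [] = 12)

let outcome =
  try raw_test (); "passed"
  with Assertion_failed msg -> "failed: " ^ msg
```

With this definition of foo the first assertion fails, so the second one is never evaluated and only one failure is reported, whereas the same two statements under a T pragma would each be reported separately.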

3.1.5 Exception-Throwing Tests: E for "Exception"

... not implemented yet...

Until then, the usual practice is to use (*$T with the following pattern, for a function foo and an exception Bar:

try ignore (foo x); false with Bar -> true

If your project uses Batteries and no pattern-matching is needed, then you can also use the following, sexier pattern:

Result.(catch foo x |> is_exn Bar)
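The first pattern can be tried in isolation. In the sketch below, the exception Bar and the function foo are hypothetical and serve only to give the pattern something to catch.

```ocaml
exception Bar

(* Hypothetical function under test: rejects negative arguments. *)
let foo x = if x < 0 then raise Bar else 2 * x

(* True exactly when foo x raises Bar; any other exception propagates. *)
let raises_bar x = try ignore (foo x); false with Bar -> true

let on_negative = raises_bar (-1)
let on_positive = raises_bar 3
```

Since the pattern is a boolean expression, it fits on a single statement line of a T test.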

3.2 Manipulation Pragmas

Not all qtest pragmas directly translate into tests; for non-trivial projects, sometimes a little boilerplate code is needed in order to set the tests up properly. The pragmas which do this are collectively called "manipulation pragmas"; they are described in the next section.

3.2.1 Opening Modules: open Pragma <...> and --preamble Option

The tests should have access to the same values as the code under test; however the generated code for foo.ml does not actually live inside that file. Therefore some effort must occasionally be made to synchronise the code’s environment with the tests’. There are three main use cases where you might want to open modules for tests:

Project-Wide Global Open:
It may happen that every single file in your project opens a given module. This is the case for Batteries, for instance, where every module opens Batteries. In that case simply use the --preamble switch. For instance,
```
qtest --preamble "open Batteries;;"  extract mod1.ml mod2.ml ... modN.ml
```
Note that you could insert arbitrary code using this switch.
Global Open in a File:
Now, let’s say that foo.ml opens Bar and Baz; you want the tests in foo.ml to open them as well. Then you can use the open pragma in its global form:
```
(*$< Bar, Baz >*)
```
The modules will be open for every test in the same .ml file, and following the pragma. However, in our example, you will have a duplication of code between the "open" directives of foo.ml, and the open pragma of qtest, like so:
```
open Bar;; open Baz;;
(*$< Bar, Baz >*)
```
It might therefore be more convenient to use the code injection pragma (see next section) for that purpose, so you would write instead:
```
(*${*) open Bar;; open Baz;; (*$}*)
```
The code between that special markup will simply be duplicated into the tests. The two methods are equivalent, and the second one is recommended, because it reduces the chances of an impedance mismatch between modules open for foo.ml and its tests. Therefore, the global form of the open pragma should preferentially be reserved for cases where you want such a mismatch. For instance, if you have special modules useful for tests but useless for the main code, you can easily open them for the tests alone using the pragma.
Local Open for a Submodule:
Let’s say we have the following foo.ml:
```
let outer x = <something>

module Submod = struct
  let inner y = 2*y
  (*$T inner
    inner 2 = 4
  *)
end
```
That seems natural enough... but it won’t work, because qtest is not actually aware that the test is "inside" Submod (and making it aware of that would be very problematic). In fact, so long as you use only test pragmas (i.e. no manipulation pragma at all), the positions and even the order of the tests, relative to definitions or to each other, are unimportant, because the tests do not actually live in foo.ml. So we need to open Submod manually, using the local form of the open pragma:
```
module Submod = struct (*$< Submod *)
  let inner y = 2*y
  (*$T inner
    inner 2 = 4
  *)
end (*$>*)
```
Notice that the <...> have simply been split in two, compared to the global form. The effect of that construct is that Submod will be open for every test between (*$< Submod *) and (*$>*). Of course, you could also forgo that method entirely and do this:
```
module Submod = struct
  let inner y = 2*y
  (*$T &
    Submod.inner 2 = 4
  *)
end
```
... but it is impractical and you are forced to use an empty header because qualified names are not acceptable as headers. The first method is therefore strongly recommended.
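The scoping issue itself is ordinary OCaml and can be seen without qtest. The sketch below places the test expressions outside the module, as extracted tests are; the names via_qualified_name and via_local_open are chosen for the illustration.

```ocaml
module Submod = struct
  let inner y = 2 * y
end

(* From outside Submod, the bare name inner is not in scope. *)

(* Reaching inner through its qualified name. *)
let via_qualified_name = Submod.inner 2 = 4

(* Reaching inner through a local open, in the spirit of the open pragma. *)
let via_local_open = let open Submod in inner 2 = 4
```

Both expressions evaluate to true; the second form is the one that lets the statement keep the unqualified name required by the header.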

3.2.2 Code Injection Pragma: {...}

TODO: ocamldoc comments that define unit tests from the offered examples

4 Technical Considerations and Other Details

What has been said above should suffice to cover at least 90% of use-cases for qtest. This section concerns itself with the remaining 10%.

4.1 Function Coverage

The headers of a test are not just there for decoration; three properties are enforced when a test, say, (*$X foo is compiled, where X is T, R, Q,... :

foo exists; that is to say, it is defined in the scope of the module where the test appears, though one can play with pragmas to relax this condition somewhat. At the very least, it has to be defined somewhere. Failure to conform results in an Error: Unbound value foo.
foo is referenced in each statement of the test: for T and Q, that means "each line". For R, that means "once somewhere in the test’s body". Failure to conform results in a Warning 26: unused variable foo, which will be treated as an error if -warn-error +26 is passed to the compiler. It goes without saying that this is warmly recommended.
the test possesses at least one statement.

Put together, these conditions offer a strong guarantee that, if a function is referenced in a test header, then it is actually tested at least once. The list of functions referenced in the headers of extracted tests is written by qtest into qtest.targets.log. Each line is of the form

foo.ml   42    foo

where foo.ml is the file in which the test appears, as passed to extract, and 42 is the line number where the test pragma appears in foo.ml. Note that the same function can be listed several times for the same source file, if several tests involve it (say, twice if it has both a simple test and a random one). The exact number of statements involving foo in each test is currently not taken into account in the logs.
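A log in this format is easy to post-process, for instance to count how many tests mention a given function. The sketch below assumes that the three fields are separated by spaces or tabs, which is a guess based on the sample line; the type target, the helper names and the sample entries other than the first are hypothetical.

```ocaml
(* One record per log line: source file, line of the pragma, function name. *)
type target = { file : string; line : int; name : string }

(* Split a line on spaces and tabs, dropping empty fields. *)
let fields s =
  String.map (fun c -> if c = '\t' then ' ' else c) s
  |> String.split_on_char ' '
  |> List.filter (fun w -> w <> "")

(* Parse one log line; lines that do not have the expected shape are skipped. *)
let parse_target s =
  match fields s with
  | [file; line; name] ->
    (match int_of_string_opt line with
     | Some line -> Some { file; line; name }
     | None -> None)
  | _ -> None

(* Number of tests whose header mentions the given function. *)
let count_tests name targets =
  List.length (List.filter (fun t -> t.name = name) targets)

(* Sample log content made up for the illustration. *)
let sample = [ "foo.ml   42    foo"; "foo.ml   57    foo"; "foo.ml   37    even" ]
let targets = List.filter_map parse_target sample
let foo_tests = count_tests "foo" targets
```

In this sample foo is listed twice, once per test that references it, which is the situation described above.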

4.2 Headers and Metaheaders

The informal definition of headers given above was actually a simplification. In this section we explore two syntaxes available for headers.

4.2.1 Aliases

Some functions have exceedingly long names. Case in point:

let rec pretentious_drivel x0 f = function [] -> x0
  | x::xs -> pretentious_drivel (f x x0) f xs

(*$T pretentious_drivel
  pretentious_drivel 1 (+) [4;5] = foo 1 (+) [4;5]
  ... pretentious_drivel of this and that...
*)

The constraint that each statement must fit on one line does not play well with very long function names. Furthermore, you know which function is being tested, it’s right there in the header; no need to repeat it a dozen times. Instead, you can define an alias, and write equivalently:

(*$T pretentious_drivel as x
  x 1 (+) [4;5] = foo 1 (+) [4;5]
  ... x of this and that...
*)

... thus saving many keystrokes, thereby contributing to the preservation of the environment. More seriously, aliases have uses beyond just saving a few keystrokes, as we will see in the next sections.
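One way to think about an alias is as a local let binding around the statement. Whether qtest implements aliases exactly this way is an assumption; the sketch below only shows that the aliased statement means the same thing in plain OCaml. Both functions use the definitions with base case x0.

```ocaml
let rec pretentious_drivel x0 f = function
  | [] -> x0
  | x :: xs -> pretentious_drivel (f x x0) f xs

(* Corrected foo, as given in Section 4.2.3. *)
let rec foo x0 f = function
  | [] -> x0
  | x :: xs -> f x (foo x0 f xs)

(* The aliased statement, with the alias read as a local binding of x. *)
let statement =
  let x = pretentious_drivel in
  x 1 ( + ) [4;5] = foo 1 ( + ) [4;5]
```

Under this reading, a statement that never mentions x would leave a let-bound variable unused, which is the situation that compiler warning 26 reports.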

4.2.2 Mutually Tested Functions

Most of the time, a test only pertains to one function. There are times, however, when one wishes to test two functions – or more – at the same time. For instance

let rec even = function 0 -> true
  | n -> odd (pred n)
and odd = function 0 -> false
  | n -> even (pred n)

Let us say that we have the following test:

(*$Q <header>
  Q.small_int (fun n-> odd (abs n+3) = even (abs n))
*)

It involves both even and odd. The question is: "what is a proper header for this test"? One could simply put "even", and thus it would be referenced as being tested in the logs, but odd would not, which is unfair. Putting "odd" is symmetrically unfair. The solution is to put both, separated by a semicolon:

(*$Q even; odd

That way both functions are referenced in the logs:

    foo.ml   37    even
    foo.ml   37    odd

and of course the compiler enforces that both of them are actually referenced in each statement of the test. Of course, each of them can be written under alias, in which case the header could be even as x; odd as y.
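The statement of this test can be checked exhaustively over a small range in plain OCaml, without the Quickcheck module. The names law and holds_everywhere, and the range itself, are chosen for the illustration.

```ocaml
let rec even = function 0 -> true
  | n -> odd (pred n)
and odd = function 0 -> false
  | n -> even (pred n)

(* The statement of the test; abs keeps the arguments non-negative,
   since even and odd do not terminate on negative integers. *)
let law n = odd (abs n + 3) = even (abs n)

(* Check the law for every integer from -100 to 99. *)
let holds_everywhere = List.for_all law (List.init 200 (fun i -> i - 100))
```

The statement mentions both even and odd, which is what the header even; odd requires of it.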

4.2.3 Testing Functions by the Dozen

Let us come back to our functions foo (after correction) and pretentious_drivel, as defined above.

let rec foo x0 f = function
  [] -> x0 | x::xs -> f x (foo x0 f xs)

let rec pretentious_drivel x0 f = function [] -> x0
  | x::xs -> pretentious_drivel (f x x0) f xs

You will not have failed to notice that they bear more than a passing resemblance to one another. If you write tests for one, odds are that the same test could be useful verbatim for the other. This is a very common case when you have closely related functions, or even several implementations of the same function, for instance the old, slow, naïve, trustworthy one and the new, fast, arcane, highly optimised version you have just written. The typical case is sorting routines, of which there are many flavours.

For our example, recall that we have the following test for foo:

(*$Q foo
  (Q.pair Q.small_int (Q.list Q.small_int)) \
    (fun (i,l)-> foo i (+) l = List.fold_left (+) i l)
*)

The same test would apply to pretentious_drivel; you could just copy-and-paste the test and change the header, but it’s not terribly elegant. Instead, you can just add the other function to the header, separating the two by a comma, and defining an alias:

(*$Q foo, pretentious_drivel as x
  (Q.pair Q.small_int (Q.list Q.small_int)) \
  (fun (i,l)-> x i (+) l = List.fold_left (+) i l)
*)

This same test will be run once for x = foo, and once for x = pretentious_drivel. Actually, you need not define an alias: if the header is of the form

(*$Q foo, pretentious_drivel

then it is equivalent to

(*$Q foo, pretentious_drivel as foo

so you do not need to alter the body of the test if you subsequently add new functions. A header which combines more than one "version" of a function in this way is called a metaheader.
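The effect of a metaheader can be sketched as running one law against each function in turn. The sketch below is plain OCaml with hand-picked inputs instead of random ones; law, inputs and results are names chosen for the illustration, and this is not the code that qtest generates.

```ocaml
let rec foo x0 f = function
  | [] -> x0
  | x :: xs -> f x (foo x0 f xs)

let rec pretentious_drivel x0 f = function
  | [] -> x0
  | x :: xs -> pretentious_drivel (f x x0) f xs

(* The statement of the test, with the alias x as a parameter. *)
let law x (i, l) = x i ( + ) l = List.fold_left ( + ) i l

(* Hand-picked inputs standing in for randomly generated pairs. *)
let inputs = [ (0, []); (1, [4;5]); (727, [4;3;6;1;788;49]) ]

(* Run the same law once per function named in the header. *)
let results =
  List.map
    (fun (name, x) -> (name, List.for_all (law x) inputs))
    [ ("foo", foo); ("pretentious_drivel", pretentious_drivel) ]
```

Adding a third implementation to the comparison only means adding one entry to the list, just as it only means adding one name to the header.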

4.2.4 Metaheaders Unleashed

All the constructs above can be combined without constraints: the grammar is as follows:

    Metaheader  ::=   Binding {";" Binding}
    Binding     ::=   Functions [ "as" ID ]
    Functions   ::=   ID {"," ID}
    ID          ::=   (*OCaml lower-case identifier*)
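To make the grammar concrete, here is a minimal sketch of a parser for it. It is not the parser used by qtest: the type binding and the function names are hypothetical, and identifiers are not validated.

```ocaml
(* A binding: the functions it names and an optional alias. *)
type binding = { functions : string list; alias : string option }

(* Split on spaces and drop empty words. *)
let words s =
  String.split_on_char ' ' s |> List.filter (fun w -> w <> "")

(* Binding ::= Functions [ "as" ID ] *)
let parse_binding s =
  let funs_part, alias =
    match List.rev (words s) with
    | id :: "as" :: rest -> (String.concat " " (List.rev rest), Some id)
    | _ -> (s, None)
  in
  (* Functions ::= ID {"," ID} *)
  let functions =
    String.split_on_char ',' funs_part
    |> List.map String.trim
    |> List.filter (fun f -> f <> "")
  in
  { functions; alias }

(* Metaheader ::= Binding {";" Binding} *)
let parse_metaheader s =
  String.split_on_char ';' s |> List.map parse_binding

let example = parse_metaheader "foo, pretentious_drivel as x; even"
```

For this header the result has two bindings: foo and pretentious_drivel sharing the alias x, and even on its own without an alias.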

4.2.5 Header Parameters Injection

coming soon...

4.3 Warnings and Exceptions Thrown by qtest

Fatal error: exception Failure("Unrecognised qtest pragma: ` T foo'")

You have written something like (*$ T foo; there must not be any space between (*$ and the pragma.

Warning: likely qtest syntax error: `(* $T foo'. Done.

Self-explanatory; if $ is the first real character of a comment, it’s likely a mistyped qtest pragma. This is only a warning though.

Fatal error: exception Core.Bad_header_char("M", "Mod.foo")

You have used a qualified name in a header, for instance (*$T Mod.foo. This is not allowed: the name must be unqualified and defined under the local scope. Furthermore, it must be public, unless you have used pragmas to deal with private functions.

Error: Comment not terminated
Fatal error: exception Core.Unterminated_test(_, 0)

Most probably, you forgot the comment-closing *) to close some test.

Fatal error: exception Failure("runaway test body terminator: n))*)")

The comment-closing *) must be on a line of its own; or, put another way, every statement must be ended by a line break.

4.4 qtest Command-Line Options

$ qtest --help

** qtest (qtest)
USAGE: qtest [options] extract <file.mli?>...

OPTIONS:
--output <file.ml>    (-o) def: standard output
  Open or create a file for output; the resulting file will be an OCaml
  source file containing all the tests.

--preamble <string>   (-p) def: empty
  Add code to the tests' preamble; typically this will be an instruction
  of the form 'open Module;;'


--help          Displays this help page and stops

5 Editor Support

coming soon

*

This page was last updated on March 21, 2012, 07:50:48.

At the time of this document’s creation, the last commits to Batteries concerning the documentation were:

commit 41a55fd220cbabd4c16701e3eeffebdd596ef13f Author: Gabriel Scherer
<bluestorm.dylc@gmail.com> Date: Mon Mar 12 06:39:46 2012 +0100

    fix some typos in qtest comments and documentation

commit c1765061189bb56c38e79f6628bde55313ac829a Author: Vincent Hugot
<vincent.hugot@gmail.com> Date: Sun Mar 11 19:26:53 2012 +0100

    qtest: documentation dump

commit b63625c83e9ac78b2636ecc59cef10c9f1f155e4 Author: Vincent Hugot
<vincent.hugot@gmail.com> Date: Wed Feb 8 01:22:40 2012 +0100

    Fix #243: Migrated qtest documentation to LaTeX => HeVeA.

This document was translated from LaTeX by HeVeA.
