docs/tutorial.txt

= FiveAM Tutorial =
Marco Baringer <mb@bese.it>
Fall/Winter 2012
:Author Initials: MB
:toc:
:icons:
:numbered:
:website: http://common-lisp.net/project/fiveam
:stylesheet: fiveam.css
:linkcss:

== Setup ==

Before we even start, we'll need to load FiveAM itself:

--------------------------------
CL-USER> (quicklisp:quickload :fiveam)
To load "fiveam":
  Load 1 ASDF system:
    fiveam
; Loading "fiveam"

(:FIVEAM)
CL-USER> (use-package :5am)
T
--------------------------------

== Failure For Beginners ==

This is a tutorial on the testing framework FiveAM. Over the
course of this tutorial we're going to test an implementation of
link:https://en.wikipedia.org/wiki/Peano_axioms[Peano numbers]
(basically, pretend that Lisp didn't have integers or arithmetic built
in and we wanted to add them in the least efficient way possible). The
first things we need are the constant `0` (returned by a function
`zero`), a function `zero-p` for testing if a number is zero, and a
function `succ` which, given a number `N`, returns its successor (in
other words `N + 1`).

It's still not totally clear to me what the `succ` function should
look like, but the `zero` and `zero-p` functions are easy enough, so
let's define a test for those two functions. We'll start by testing
`zero` as much as we can:

--------------------------------
(def-test zero ()
  (finishes (zero)))
--------------------------------

[NOTE]
Ignore the second argument to `def-test` for now. If it helps, pretend it's filler to make the indentation look better.

Since we don't know, nor really care at this stage, what the function
`zero` returns, we simply use the
link:manual.html#FUNCTION_FINISHES[`FINISHES`] macro to make sure that
the function does in fact return (as opposed to signaling some weird
error). Our `zero-p` test, on the other hand, does actually have
something we can test. Whatever is returned by `zero` should be
`zero-p`:

--------------------------------
(def-test zero-p ()
  (is-true (zero-p (zero))))
--------------------------------

Finally, let's run our tests:

--------------------------------
CL-USER> (run!)
XXf
 Did 2 checks.
    Pass: 0 ( 0%)
    Skip: 0 ( 0%)
    Fail: 2 (100%)

 Failure Details:
 --------------------------------
 ZERO []:
 Unexpected Error: #<UNDEFINED-FUNCTION ZERO {10058AD6F3}>
The function COMMON-LISP-USER::ZERO is undefined..
 --------------------------------
 --------------------------------
 ZERO-P []:
 Unexpected Error: #<UNDEFINED-FUNCTION ZERO {10056FE5A3}>
The function COMMON-LISP-USER::ZERO is undefined..
 --------------------------------

--------------------------------

So, a 100% failure rate, and even an unexpected error... that's bad, but
it's also what we should have been expecting given that we haven't
actually defined `zero-p` or `zero`. So, let's define those two
functions:

--------------------------------
CL-USER> (defun zero () 'zero)
ZERO
CL-USER> (defun zero-p (value) (eql 'zero value))
ZERO-P
--------------------------------

Now let's run our tests again:

--------------------------------
CL-USER> (run!)
..
 Did 2 checks.
    Pass: 2 (100%)
    Skip: 0 ( 0%)
    Fail: 0 ( 0%)
--------------------------------

Much better.

.TEST ALL THE THINGS!
[NOTE]
There's actually a bit of work being done with suites and default
tests and stuff in order to make that `run!` call do what it just did
(call our previously defined tests). If you never create a suite on
your own then you can think of `run!` as being the 'run every test'
function. If you start creating your own suites (and you will
eventually), then you'll want to know that `run!`'s optional
argument is the name of a test or suite to run, but until then just go
with `(run!)`.
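
As a small sketch of that optional argument, the following runs only the `zero-p` test by name. It assumes Quicklisp is available for loading FiveAM and, to stay self-contained, repeats the `zero` and `zero-p` definitions from above; the `5am:` prefixes are only there so the block does not depend on `use-package`.

```lisp
;; Sketch: run one named test instead of every test.
;; Assumption: Quicklisp is installed and can load FiveAM.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :fiveam))

;; Same definitions as in the tutorial text.
(defun zero () 'zero)
(defun zero-p (value) (eql 'zero value))

(5am:def-test zero-p ()
  (5am:is-true (zero-p (zero))))

;; Pass the test's name to RUN! to run just that test and print the report.
(5am:run! 'zero-p)
```

At the REPL, after `(use-package :5am)`, the last form shortens to `(run! 'zero-p)`.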

== More code ==

So, we have zero, and we can test for zero-ness. Wouldn't it be nice
to have the number one too? How about the number two? How about a
billion? I like the number 1 billion. Now, since we thoroughly read
through the wiki page on Peano numbers, we know that there's a function
called `succ` which, given one number, returns the "next" one. In this
implementation we're going to represent numbers as nested lists, so
our `succ` function just wraps its input in another cons cell:

--------------------------------
(defun succ (number)
  (cons number nil))
--------------------------------
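
To make the nested-list representation concrete, here is a minimal sketch that prints the first few numbers. It needs nothing beyond standard Common Lisp, and it repeats the `zero` and `succ` definitions only so that it runs on its own.

```lisp
;; Sketch: what the first few numbers look like in this representation.
;; ZERO and SUCC are the tutorial's own definitions, repeated here.
(defun zero () 'zero)

(defun succ (number)
  (cons number nil))

(print (zero))                ; prints ZERO
(print (succ (zero)))         ; prints (ZERO)
(print (succ (succ (zero))))  ; prints ((ZERO))
```

Each call to `succ` adds one level of nesting, so the nesting depth of a value is the number it stands for.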

Easy enough. That could be right, or it could be wrong; we
don't really have a way to check (yet). We do know one thing though:
the `succ` of any number (even zero) isn't zero. So let's redefine our
`zero-p` test to check that zero plus one isn't zero:

--------------------------------
(def-test zero-p ()
  (is-true  (zero-p (zero)))
  (is-false (zero-p (succ (zero)))))
--------------------------------

and let's run the tests:

--------------------------------
CL-USER> (run!)
...
 Did 3 checks.
    Pass: 3 (100%)
    Skip: 0 ( 0%)
    Fail: 0 ( 0%)
--------------------------------

Nice!

== Elementary, my dear Watson. Run the test. ==

When working interactively like this, we almost always define a
test and then immediately run it. We can tell FiveAM to do that
automatically by setting `*run-test-when-defined*` to `T`:

--------------------------------
CL-USER> (setf *run-test-when-defined* t)
T
--------------------------------

Now if we were to redefine the `zero-p` test (either via the REPL as I'm
doing here or via C-c C-c in a SLIME buffer), we'll see:

--------------------------------
CL-USER> (def-test zero-p ()
  (is-true (zero-p (zero)))
  (is-false (zero-p (succ (zero)))))
..
 Did 2 checks.
    Pass: 2 (100%)
    Skip: 0 ( 0%)
    Fail: 0 ( 0%)
ZERO-P
--------------------------------

Great, at this point it's time we add a function for testing integer
equality (in other words, `cl:=`). Let's try with this:

--------------------------------
CL-USER> (defun equiv (a b)
  (and (zero-p a) (zero-p b)))
EQUIV
--------------------------------

[NOTE]
Since I'm doing everything in the package `common-lisp-user`, I
couldn't use the name `=` (or even `equal`). I don't want to talk
about packages at this point, so we'll just have to live with `equiv`
for now.

And let's test it:

--------------------------------
CL-USER> (def-test equiv () (equiv (zero) (zero)))
 Didn't run anything...huh?
EQUIV
--------------------------------

Well, that's not what I was expecting. I'd forgotten that FiveAM,
unlike other test frameworks, doesn't actually look at the return
value of the test body; it only runs its so-called checks (one of which
is the `is-true` macro we've been using so far). So let's add that
in and try again:

--------------------------------
CL-USER> (def-test equiv ()
           (is-true (equiv (zero) (zero))))
.
 Did 1 check.
    Pass: 1 (100%)
    Skip: 0 ( 0%)
    Fail: 0 ( 0%)

EQUIV
--------------------------------

== Failing, but gently. ==

Nice. Now, finally, we can test that 1 is equal to 1 (or, in our
implementation, that the successor of zero is equal to the successor
of zero):

--------------------------------
CL-USER> (def-test equiv ()
           (is-true (equiv (zero) (zero)))
           (is-true (equiv (succ (zero)) (succ (zero)))))
.f
 Did 2 checks.
    Pass: 1 (50%)
    Skip: 0 ( 0%)
    Fail: 1 (50%)

 Failure Details:
 --------------------------------
 EQUIV []:
 (EQUIV (SUCC (ZERO)) (SUCC (ZERO))) did not return a true value
 --------------------------------

EQUIV
--------------------------------

Oh, cry, baby cry. The important part of that output is this line:

--------------------------------
 EQUIV []:
 (EQUIV (SUCC (ZERO)) (SUCC (ZERO))) did not return a true value
--------------------------------

That means that, in the test `EQUIV`, the form `(EQUIV (SUCC (ZERO))
(SUCC (ZERO)))` evaluated to NIL. I wonder why? It'd be nice to see
what the values evaluated to, that is, what the actual arguments and
return value of `EQUIV` were. There are two things we could do at this
point:

. Set `5am:*debug-on-failure*` to `T`, re-run the test, and dig around
  in the backtrace for the info we need.

. Use the `IS` check macro to get a more informative message in the
  output.

In practice you'll end up using a combination of both. (I prefer
that tests run to completion without hitting the debugger, and this
may have influenced FiveAM a bit, but others prefer working with live
data in a debugger window, and that's an equally valid approach.)
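
The first option could look roughly like the sketch below. It uses only the variable named above plus `run!`, and it repeats the tutorial's definitions (including the still-buggy `equiv`) so that it can be tried in a fresh image. It assumes Quicklisp is available; what the debugger shows once it opens depends on your Lisp implementation, so no output is given here.

```lisp
;; Sketch of option 1: drop into the debugger when a check fails.
;; Assumption: Quicklisp is installed and can load FiveAM.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :fiveam))

;; The tutorial's definitions so far; EQUIV is still the buggy version.
(defun zero () 'zero)
(defun zero-p (value) (eql 'zero value))
(defun succ (number) (cons number nil))
(defun equiv (a b) (and (zero-p a) (zero-p b)))

(5am:def-test equiv ()
  (5am:is-true (equiv (zero) (zero)))
  (5am:is-true (equiv (succ (zero)) (succ (zero)))))

;; Ask FiveAM to enter the debugger on a failed check, then re-run the test.
;; This is interactive: the second check is expected to stop in the debugger.
(setf 5am:*debug-on-failure* t)
(5am:run! 'equiv)

;; Switch back so that later runs go to completion again.
(setf 5am:*debug-on-failure* nil)
```

Resetting the variable afterwards keeps the rest of the session behaving as the tutorial's transcripts show.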

== Tell me what I need to know ==

However, since this is a non-interactive static file, and debuggers are
really interactive and implementation-specific, I'm going to go with
the second option for now. Here's the same test using the `IS` check
instead of `IS-TRUE`:

--------------------------------
CL-USER> (def-test equiv ()
           (is (equiv (zero) (zero)))
           (is (equiv (succ (zero)) (succ (zero)))))
.f
 Did 2 checks.
    Pass: 1 (50%)
    Skip: 0 ( 0%)
    Fail: 1 (50%)

 Failure Details:
 --------------------------------
 EQUIV []:

(SUCC (ZERO)) <1>

 evaluated to

(ZERO) <2>

 which is not

EQUIV <3>

 to

(ZERO) <4>

 --------------------------------

EQUIV
--------------------------------
<1> actual value's source code
<2> actual value's value
<3> comparison operator
<4> expected value

I need to mention something at this point: the `IS-TRUE` and `IS`
macros do not do anything different at run time. They both have some
code, which they run, and if the result is NIL they record a failure
and if not they record a success (which 5am calls a pass). The only
difference is in how they report a failure: the `IS-TRUE` macro
just stores the source form and prints that back, while the `IS` macro
assumes that the form has a specific format:

    (TEST-FUNCTION EXPECTED-VALUE ACTUAL-VALUE)

and generates a failure message based on that. In this case we
evaluated the actual-value form `(succ (zero))` and got the list
`(ZERO)`, then passed this value, along with the value of the
expected-value form (also `(succ (zero))`), to `equiv` and got `NIL`.
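
To illustrate that format, here is a small sketch that uses plain integers and `cl:=` rather than our Peano numbers. The test name `is-format-demo` is made up for this example, and two of its checks fail on purpose so that there is a report to read. It assumes Quicklisp is available for loading FiveAM.

```lisp
;; Sketch: the (TEST-FUNCTION EXPECTED-VALUE ACTUAL-VALUE) shape that IS expects.
;; Assumption: Quicklisp is installed and can load FiveAM.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :fiveam))

;; IS-FORMAT-DEMO is a hypothetical test name used only for this illustration.
(5am:def-test is-format-demo ()
  ;; Passes: the expected value 4 comes first, the actual form (+ 2 2) second.
  (5am:is (= 4 (+ 2 2)))
  ;; Fails on purpose: the report names the actual form, its value,
  ;; the comparison operator and the expected value.
  (5am:is (= 5 (+ 2 2)))
  ;; Fails on purpose too, but supplies its own message via a format string.
  (5am:is (= 5 (+ 2 2)) "2 + 2 should have been ~D" 5))

(5am:run! 'is-format-demo)
```

Putting the expected value first is what lets `IS` label the parts of the failure message correctly; swapping the two arguments still runs, but the report then describes them the wrong way round.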

Now, back to our test. It's actually pretty obvious that our current
implementation of `equiv`:

--------------------------------
(defun equiv (a b)
  (and (zero-p a) (zero-p b)))
--------------------------------

is buggy, so let's fix it and run the test again:

--------------------------------
CL-USER> (defun equiv (a b)
           (if (and (zero-p a) (zero-p b))
               t
               (equiv (car a) (car b))))
EQUIV
CL-USER> (!)
..
 Did 2 checks.
    Pass: 2 (100%)
    Skip: 0 ( 0%)
    Fail: 0 ( 0%)

NIL
--------------------------------

== Again, from the top ==

Great, our tests passed. You'll notice though that this time we used
the `!` function instead of `run!`. The `!` function re-runs the most
recently run test and explains the results.

== Birds of a feather flock together. Horses of a different color stay home. ==

So far we've always defined and run single tests. While it's certainly
possible to continue this way, it gets unwieldy pretty quickly.