Sunday, May 25, 2014

Roman Numerals and FsCheck

"Friends, Romans, countrymen, lend me your ears;
I come to bury Caesar, not to praise him.
"
-- Shakespeare, Julius Caesar
Act III, Scene II, Lines 74-75

Testing is Hard


Lets face it, testing is hard.  In fact be able to test something in which you are unsure even how it is suppose to work is very hard.  Luckily we have a bit more tools today than when people wrote assembler.  One such tool is property based testing.

With property based testing you test for properties of the result of your system under test.  Also unlike traditional unit testing, with property based testing your test data is generated for you.  With this style of testing you can enhance your unit testing by finding more edge cases with property based testing.  I believe an example would be nice about now.

Property of Roman Numerals


The Roman Numerals kata is an interesting exercise in which you take a number in its arabic form and convert it into its roman numeral form.  This kata is fairly realistic if you do a lot of ETL type working in your day-to-day job.  Let's take a look at some of the rules for the Roman Numeral kata.

Given an arabic number it will be converted to a string representing its value as a roman numeral.

roman(   1) => I
roman(   2) => II
roman(   3) => III
roman(   4) => IV
roman(   5) => V
roman(   6) => VI
roman(   7) => VII
roman(   8) => VIII
roman(    9) => IX
roman(  10) => X
...
roman(  29) => XXIX
roman(  30) => XXX
...
roman(  49) => XLIX
roman(  50) => L
...
roman(  99) => XCIX
roman( 100) => C
...
roman( 499) => CDXCIX
roman( 500) => D
...
roman( 999) => CMXCIX
roman(1000) => M

Looking at the examples above we see that we only really need to test 1-10 and special values like 40, 50, 90, 100, 400, 500, 900, and 1000.  We also see that roman numerals have properties like, at most we have one of the values like V, L, and D in the roman numeral string.  We also see that we have a property of only have up to three of the same letter in a row.  With these tests and properties in mind we can write some C# code to provide the functionality and test it with some F# code using xUnit and FsCheck.



Unit Testing


With our Unit Testing we only really need to test 1-10 and special values like 40, 50, 90, 100, 400, 500, 900, and 1000.

We can do this with one simple test (lines 8-38 in the gist above).

module ``Roman Numeral Unit tests`` =
[<Fact>]
let ``For this input it will produce this output`` () =
let testpairs = [
(0, "")
(1, "I")
(2, "II")
(3, "III")
(4, "IV")
(5, "V")
(6, "VI")
(7, "VII")
(8, "VIII")
(9, "IX")
(10, "X")
(20, "XX")
(30, "XXX")
(40, "XL")
(50, "L")
(90, "XC")
(100, "C")
(400, "CD")
(500, "D")
(900, "CM")
(1000, "M")
]
for (number, expected) in testpairs do
let romanizer = RomanNumeraler ()
let actual = romanizer.Translate number
Assert.Equal<string>(expected, actual)
With one simple collection of pairs we can easily add test cases as we add functionality.  This is a big boost for productivity.

I got the idea for this from Scott Wlaschin's blog excellent blog post.

Property Based Testing


Unit Testing is great for driving out what we already know, but how do we really put our code through the wringer?  This is were property based testing comes into play.  For our property based testing we will use FsCheck.

The first property we'll look at is that we have at most one V, L, or D in the roman numeral string.

let ``max number of V is 1`` =
``max number of letter is 1`` "V"
let ``max number of L is 1`` =
``max number of letter is 1`` "L"
let ``max number of D is 1`` =
``max number of letter is 1`` "D"

We see that this uses very similar functionality to test so we come up with a general function.

let ``max number of letter is 1`` letter number =
let romanNumerals = ((RomanNumeraler()).Translate number)
let numberOf = match romanNumerals with
| "" -> 0
| _ -> (romanNumerals.Length - romanNumerals.Replace(letter, "").Length) / romanNumerals.Length
match numberOf with
| 0 | 1 -> true
| _ -> false

We get the count of the number of letters repeating using a function I found on Rosetta Code that I thought was very slick.

Next we see that this property is also true for the values of IV, IX, XL, XC, CD, and CM.

let ``max number of CM is 1`` =
``max number of letter is 1`` "CM"
let ``max number of CD is 1`` =
``max number of letter is 1`` "CD"
let ``max number of XC is 1`` =
``max number of letter is 1`` "XC"
let ``max number of XL is 1`` =
``max number of letter is 1`` "XL"
let ``max number of IX is 1`` =
``max number of letter is 1`` "IX"
let ``max number of IV is 1`` =
``max number of letter is 1`` "IV"

Now we can actual test the properties.

[<Property>]
let ``for all valid inputs there is a maximum of one V`` (x:int) =
normalize x |> ``max number of V is 1``
[<Property>]
let ``for all valid inputs there is a maximum of one L`` (x:int) =
normalize x |> ``max number of L is 1``
[<Property>]
let ``for all valid inputs there is a maximum of one D`` (x:int) =
normalize x |> ``max number of D is 1``
[<Property>]
let ``for all valid inputs there is a maximum of one CM`` (x:int) =
normalize x |> ``max number of CM is 1``
[<Property>]
let ``for all valid inputs there is a maximum of one CD`` (x:int) =
normalize x |> ``max number of CD is 1``
[<Property>]
let ``for all valid inputs there is a maximum of one XC`` (x:int) =
normalize x |> ``max number of XC is 1``
[<Property>]
let ``for all valid inputs there is a maximum of one XL`` (x:int) =
normalize x |> ``max number of XL is 1``
[<Property>]
let ``for all valid inputs there is a maximum of one IX`` (x:int) =
normalize x |> ``max number of IX is 1``
[<Property>]
let ``for all valid inputs there is a maximum of one IV`` (x:int) =
normalize x |> ``max number of IV is 1``

To limit the value which are passed to our tests we can a normalize function which will limit values to be between 1 and 3999.

let normalize x = (abs x % 3999) + 1

Now we can look at testing the max number of repeating letters for I and X.

let ``has max number of repeating letter is 3`` letter number =
let romanizer = RomanNumeraler ()
not ((romanizer.Translate number).Contains(String.replicate 4 letter))
let ``has max number of repeating I is 3`` =
``has max number of repeating letter is 3`` "I"
let ``has max number of repeating X is 3`` =
``has max number of repeating letter is 3`` "X"

[<Property>]
let ``for all valid inputs, there is a max of three repeating I`` (x:int) =
normalize x |> ``has max number of repeating I is 3``
[<Property>]
let ``for all valid inputs, there is a max of three repeating X`` (x:int) =
normalize x |> ``has max number of repeating X is 3``

Last we can look at testing every number which is divisible by 5 ends with a V.

let ``divisible by 5 will have a V`` number =
let romanizer = RomanNumeraler ()
(romanizer.Translate number).Contains "V"

[<Property>]
let ``for all valid inputs divisible by 5, there is a V`` (x:int) =
x % 2 <> 0
==> ``divisible by 5 will have a V`` (abs x * 5)

We can continue in this format but I believe we have a good foundation at this point.