Tuesday, March 19, 2013

Letter Occurrence

"Machines take me by surprise with great frequency."
-- Alan Turning

I do a code kata a day to keep unemployment away.  Well I actually do it to learn F#, but the previous statement sounds better.  Recently I was on Rosetta Code looking for an idea and came across a letter frequency counting problem.

The problem is simple:

Given a string, count how many times each letter occurs in said string.

Not too hard of a problem, but how do we "best" solve this with a functional language?

What I would like to do, would be to group each letter together and then count the number of occurrences of each letter in the group.  To make it easier to test I'll place the results into a Map with the key being the letter and the value being the number of occurrences of the letter in the input string.
In F# if you want to do a Group By you'll have to use Seq.groupby which takes a function (in this case we can use the identity since we are not doing anything with the elements of the string) and a sequence (in this case the string).  It gives you a sequence of keys and the complete sequence of values (e.g. A * seq AAAA).  Once we have that we just need to make a tuple with the value as a string and the number of times the value occurs, we can use Seq.map to do this.  Last we just need to place these tuples into a Map.


let occurs str =
  str 
      |> Seq.groupBy (fun c -> c) 
      |> Seq.map (fun (c, n) -> (c.ToString(), Seq.length n)) 
      |> Map.ofSeq


With this simple function we can now count the number of times a letter occurs in a string (similar to the Hadoop Hello World type program).

Below you'll find the full code from the kata including the unit tests.  For comparison I did the same kata again in Javascript (again and always with unit tests).

In F# (with FsUnit):



open NUnit.Framework
open FsUnit

let occurs str =
  str |> Seq.groupBy (fun c -> c) |> Seq.map (fun (c, n) -> (c.ToString(), Seq.length n)) |> Map.ofSeq

[<Test>]
let ``Validate that "A" has 1 A occurence`` () =
  let result = Map.empty.Add("A", 1)
  occurs "A" |> should equal result

[<Test>]
let ``Validate that "AAA" has 3 A occurence`` () =
  let result = Map.empty.Add("A", 3)
  occurs "AAA" |> should equal result

[<Test>]
let ``Validate that "AB" has 1 A occurence and 1 B occurence`` () =
  let result = Map.empty.Add("A", 1).Add("B", 1)
  occurs "AB" |> should equal result

[<Test>]
let ``Validate that "ABBBCCA" has 2 A occurence, 3 B occurence, and 2 C`` () =
  let result = Map.empty.Add("A", 2).Add("B", 3).Add("C", 2)
  occurs "ABBBCCA" |> should equal result


You can find the F# source code here.

In Javascript (with QUnit):



function occurs(letters) {
          var countletters = {}, index = 0;
         
          return function counter() {
                   if (letters.length <= index) {
                     return countletters;
                   }
                  
                   return function incrument() {
                       var letter = letters[index++];
                             countletters[letter] ? countletters[letter] += 1 : countletters[letter] = 1;
                             return counter();
                   }();
          }();
}

test("Count a string with 1 A", function () {
          var actual, expected = {};
          expected["A"] = 1;
          actual = occurs("A");
          deepEqual(actual, expected);
});

test("Count a string with 2 A", function () {
          var actual, expected = {};
          expected["A"] = 2;
          actual = occurs("AA");
          deepEqual(actual, expected);
});

test("Count a string with 1 A and 1 B", function () {
          var actual, expected = {};
          expected["A"] = 1;
          expected["B"] = 1;
          actual = occurs("AB");
          deepEqual(actual, expected);
});

test("Count a string with 2 A and 2 B", function () {
          var actual, expected = {};
          expected["A"] = 2;
          expected["B"] = 2;
          actual = occurs("ABAB");
          deepEqual(actual, expected);
});

test("Count a string with 2 A and 2 B and 1 C", function () {
          var actual, expected = {};
          expected["A"] = 2;
          expected["B"] = 2;
          expected["C"] = 1;
          actual = occurs("ABCAB");
          deepEqual(actual, expected);
});


You can find the Javascript source code here.