String Calculator Kata using Eta

    " ++ (mkInput separators ns)) `shouldBe` sumList ns

The novelty in this test is the composition of multiple multi-character separators.

Fortunately Haskell treats a string as a list of characters, allowing the code listOf1 customCharacterSeparatorGenerator to produce a generator of non-empty strings.

Passing the previous expression into listOf1 will now produce a generator of non-empty lists of non-empty separator strings.
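The nesting of listOf1 described above can be sketched as follows. This is a minimal illustration that assumes the QuickCheck library is available. The definition of customCharacterSeparatorGenerator is not shown in this section, so the one below is a hypothetical stand-in that picks from a fixed set of characters; the names separatorGenerator and separatorsGenerator are also invented for the example.

```haskell
import Test.QuickCheck

-- Hypothetical stand-in: the real customCharacterSeparatorGenerator is defined
-- elsewhere, so a fixed set of separator characters is assumed here.
customCharacterSeparatorGenerator :: Gen Char
customCharacterSeparatorGenerator = elements ";:%*#@"

-- A String is a list of Char, so listOf1 yields a non-empty String.
separatorGenerator :: Gen String
separatorGenerator = listOf1 customCharacterSeparatorGenerator

-- Applying listOf1 again yields a non-empty list of non-empty Strings.
separatorsGenerator :: Gen [String]
separatorsGenerator = listOf1 separatorGenerator

-- Property: every generated value is non-empty at both levels.
main :: IO ()
main =
  quickCheck $
    forAll separatorsGenerator $ \separators ->
      not (null separators) && all (not . null) separators
```

Running main asks QuickCheck to generate separator lists and check that neither the outer list nor any inner string is empty.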

For all lists of natural numbers and a list of valid multi-character separators, calling add with a single parameter composed of all numbers separated using the custom separators will return the sum of all numbers less than 1001.

add

The implementation of the add function is not at all heroic.

I will however unpack it piece by piece.

    module SC where

    import Data.List
    import Data.List.Split
    import Text.Regex

    add :: String -> Either [Int] Int
    add input = ...

The add function takes a single argument and returns an Either where the left indicates the error condition containing the negative values whilst the right contains the result.

The body of add is simple Haskell relying only on parse.

    add :: String -> Either [Int] Int
    add input =
        let numbers = map read $ parse input
            negatives = filter (< 0) numbers
        in if null negatives
            then Right $ sum $ filter (< 1001) numbers
            else Left negatives

    parse :: String -> [String]
    parse input = ...
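To see the Left and Right outcomes of add without needing the regular expression library, the sketch below applies the same logic to numbers that are already parsed. The names sumOrNegatives and describe are hypothetical helpers introduced only for this illustration and are not part of the SC module.

```haskell
-- Hypothetical helper mirroring the body of add, but taking already-parsed
-- numbers so that no parsing or regex library is required.
sumOrNegatives :: [Int] -> Either [Int] Int
sumOrNegatives numbers =
  let negatives = filter (< 0) numbers
   in if null negatives
        then Right (sum (filter (< 1001) numbers))
        else Left negatives

-- One way a caller might turn the Either into a message (an assumption,
-- the article does not show how the result is consumed).
describe :: Either [Int] Int -> String
describe (Right total)    = "sum: " ++ show total
describe (Left negatives) = "negatives not allowed: " ++ show negatives

main :: IO ()
main = do
  putStrLn (describe (sumOrNegatives [1, 2, 1000, 1001]))  -- sum: 1003
  putStrLn (describe (sumOrNegatives [1, -2, 3, -4]))      -- negatives not allowed: [-2,-4]
```

Values of 1001 and above are ignored in the sum, and any negative value switches the whole result to the Left case carrying every offending number.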

The body of parse uses a guard to separate out the 3 scenarios:

    parse :: String -> [String]
    parse input
        | isPrefixOf "//[" input = ...
        | isPrefixOf "//" input = ...
        | otherwise = ...
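The order of the guards matters, because every input that starts with "//[" also starts with "//". The sketch below makes the three-way split visible by returning a label instead of the parsed numbers. The Scenario type and the classify function are hypothetical names used only for this illustration.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical labels for the three input shapes; not part of the original code.
data Scenario
  = BracketedSeparators   -- e.g. "//[**][%]\n1**2%3"
  | SingleCharSeparator   -- e.g. "//;\n1;2"
  | DefaultSeparators     -- e.g. "1,2\n3"
  deriving (Show, Eq)

classify :: String -> Scenario
classify input
  | "//[" `isPrefixOf` input = BracketedSeparators  -- must be tested before "//"
  | "//"  `isPrefixOf` input = SingleCharSeparator
  | otherwise                = DefaultSeparators

main :: IO ()
main = mapM_ (print . classify) ["//[**][%]\n1**2%3", "//;\n1;2", "1,2\n3"]
```

Swapping the first two guards would send every bracketed header down the single-character path, since guards are tried top to bottom.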

Working backwards the otherwise scenario deals with the comma/newline scenario.

Using regular expressions this implementation is simple.

    splitRegex (mkRegex ",|\n") input

The middle scenario is also simple by virtue of being able to extract out the separator and content based on positions.

    splitRegex (mkRegex $ escapeRe $ take 1 $ drop 2 input) $ drop 4 input

Reference is made to the escapeRe function responsible for marking up text and escaping all of the regular expression characters.

    escapeRe :: String -> String
    escapeRe [] = []
    escapeRe (x:xs)
        | x `elem` "*+.?|$^)({}" = '\\' : x : escapeRe xs
        | otherwise = x : escapeRe xs

The final scenario is the tricky scenario in that it needs to compose the regular expression separator through a number of manipulations.

    let inputs :: [String]
        inputs = splitOn "\n" input

        separator :: Regex
        separator =
            mkRegex $ intercalate "|" $ map escapeRe
                -- longest separators first
                $ sortBy (\a b -> if length a > length b then LT else GT)
                $ splitOn "][" $ init $ drop 3 $ inputs !! 0

        content :: String
        content = inputs !! 1
    in splitRegex separator content

Taking a bit of time to work through each step the manipulation is clear except for the sort step.
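As a worked example of those steps, the sketch below prints the intermediate values for the header line "//[**][%]". It uses only base: splitOnStr is a hypothetical stand-in for splitOn from Data.List.Split, and separatorsOf and patternOf are names invented for this illustration. The sort step is left out here because the text treats it separately.

```haskell
import Data.List (intercalate, isPrefixOf)

-- escapeRe as in the article: prefix regex metacharacters with a backslash.
escapeRe :: String -> String
escapeRe [] = []
escapeRe (x:xs)
  | x `elem` "*+.?|$^)({}" = '\\' : x : escapeRe xs
  | otherwise              = x : escapeRe xs

-- Hypothetical base-only stand-in for Data.List.Split.splitOn
-- (assumes a non-empty separator).
splitOnStr :: String -> String -> [String]
splitOnStr sep = go ""
  where
    go acc [] = [reverse acc]
    go acc rest@(c:cs)
      | sep `isPrefixOf` rest = reverse acc : go "" (drop (length sep) rest)
      | otherwise             = go (c : acc) cs

-- Separator strings taken from a header line such as "//[**][%]".
separatorsOf :: String -> [String]
separatorsOf header = splitOnStr "][" (init (drop 3 header))

-- Regex alternation source built from those separators (sort step omitted).
patternOf :: String -> String
patternOf = intercalate "|" . map escapeRe . separatorsOf

main :: IO ()
main = do
  let header = "//[**][%]"
  print (drop 3 header)         -- "**][%]"
  print (init (drop 3 header))  -- "**][%"
  print (separatorsOf header)   -- ["**","%"]
  putStrLn (patternOf header)   -- \*\*|%
```

Dropping three characters removes "//[", init removes the closing "]", and splitting on "][" leaves the bare separators ready to be escaped and joined with "|".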

The sort is necessary due to an anomaly with regular expression libraries.

The following REPL interaction is useful for explanation:

    Prelude Main> import Text.Regex
    Prelude Text.Regex Main> splitRegex (mkRegex "x|xy") "1x2xy3"
    ["1","2","y3"]
    it2 :: [String]
    Prelude Text.Regex Main> splitRegex (mkRegex "xy|x") "1x2xy3"
    ["1","2","3"]
    it3 :: [String]

The regular expression libraries appear to apply the split using each regular expression left to right.

The result of this strategy is that if a separator is a prefix of a later separator then the prefix will be matched.

In the “x|xy” scenario the “x” is applied before the “xy” resulting in the “1x2xy3” being split into “1”, “2”, “y3”.

The way to resolve this issue is to sort the separators on descending separator length.
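The effect of the ordering can be reproduced without a regular expression library. The sketch below is a simplified model, not the library's actual algorithm: splitOnAny is a hypothetical splitter that, at each position, uses the first separator in the list that matches, which is the behaviour the REPL session suggests. longestFirst is likewise an invented name, and it uses sortOn with Down as one possible way to sort on descending length.

```haskell
import Data.List (isPrefixOf, sortOn)
import Data.Ord (Down (..))

-- Hypothetical model of an ordered alternation: at each position the first
-- non-empty separator in the list that matches is the one consumed.
splitOnAny :: [String] -> String -> [String]
splitOnAny seps = go ""
  where
    go acc [] = [reverse acc]
    go acc rest@(c:cs) =
      case filter (\s -> not (null s) && s `isPrefixOf` rest) seps of
        (sep:_) -> reverse acc : go "" (drop (length sep) rest)
        []      -> go (c : acc) cs

-- Longest separators first, so a prefix never shadows a longer separator.
longestFirst :: [String] -> [String]
longestFirst = sortOn (Down . length)

main :: IO ()
main = do
  print (splitOnAny ["x", "xy"] "1x2xy3")                 -- ["1","2","y3"]
  print (splitOnAny (longestFirst ["x", "xy"]) "1x2xy3")  -- ["1","2","3"]
```

With "x" listed first the "y" of "xy" leaks into the next number, while sorting on descending length lets "xy" be tried before its prefix "x".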

As an aside a number of us have been performing this kata for a number of years and only found this scenario when we used predicate based testing.

This scenario appears in virtually all implementations that use a regular expression strategy.

Closing Thoughts

On a personal note: I am a Java developer and have lived on the JVM from the early days of Java.

To be specific, from Java 1.0.2.

The lens that I look through when using Eta is not “Eta is Haskell on the JVM” but rather “Eta is an additional language on the JVM” so please consider my thoughts within that context.

Eta does not support the JVM idiomatically.

For Eta to fulfil the promise of bringing a non-strict functional language to the JVM it needs to not only interact with JVM code but also fit alongside the idioms.

My Eta tests need to fit alongside my xUnit runner and report accordingly.

I need to be able to separate my code into src and test.

I need to be able to exploit packages.

Functional languages on other environments have taken a more measured approach to libraries.

Elm has created a smaller set of focused libraries.

Elm can get away with that because it is a new language and the community is purposefully keeping its implementation narrow and focused on its chosen domain.

Even though PureScript is a general purpose Haskell flavoured language, they have pulled the libraries together and introduced greater coherency.

Eta positions itself as Haskell on the JVM and inherits a vast number of libraries.

However this does introduce the following issues:

Steep learning curve and lots of searching to find functionality.

For example, a function to escape a regular expression is built into most regular expression libraries.

I am sure that somewhere in the vast Haskell landscape it is there but I could not find it.

Inconsistencies between libraries for things that are functionally similar.

For example splitting an empty string using a character separator or a regular expression separator has different outcomes.

    Prelude Main> import Data.List.Split
    Prelude Data.List.Split Main> splitOn "," ""
    [""]
    it0 :: [[Char]]
    Prelude Main> import Text.Regex
    Prelude Text.Regex Main> splitRegex (mkRegex ",") ""
    []
    it0 :: [String]

Eta has massive promise and it was wonderful to implement “old faithful” String Calculator Kata using Eta on the JVM.

My hope is that the Eta community will embrace the JVM as a first class participant rather than treating it as a runtime for the Haskell community.

Only then can it be successful in bridging between the two communities.
