Scrap your typeclasses, take 2

Yes, and so on…

This algorithm for “type-directed function application” is so simple that it has been implemented in the registry library.

The registry-hedgehog library is just an extension with some combinators specialized for Hedgehog.

With registry-hedgehog you create a “Registry” in which you “register” the functions to use for generation, like the constructors Company, Department and Employee.

You also provide some “base generators” like genInt, or genText.

They are just “values” because they don’t depend on any other generators.

You also need to specify how lists of elements for specific types need to be generated.

This can be done with listOf, one of the library combinators (same thing for maybeOf to generate Maybe of a given type).

All of this takes as many lines as creating Arbitrary instances for each type.

Once you have your registry, you can call make and get a nicely assembled generator for Company without writing any more code.
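For comparison, here is roughly the wiring that make spares you from writing by hand, sketched with plain Hedgehog. The field layout of Company, Department and Employee is an assumption (the data model itself is not shown here), and the ranges are arbitrary.

```haskell
import           Data.Text (Text)
import           Hedgehog (Gen)
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range

-- Assumed shapes for the data model; the real fields may differ.
data EmployeeStatus = Permanent | Temporary Int deriving (Eq, Show)
data Employee = Employee { employeeName :: Text, employeeStatus :: EmployeeStatus } deriving (Eq, Show)
data Department = Department { departmentName :: Text, employees :: [Employee] } deriving (Eq, Show)
data Company = Company { companyName :: Text, departments :: [Department] } deriving (Eq, Show)

-- Base generators: plain values, they depend on no other generator.
genInt :: Gen Int
genInt = Gen.int (Range.linear 0 100)

genText :: Gen Text
genText = Gen.text (Range.linear 1 10) Gen.alpha

genEmployeeStatus :: Gen EmployeeStatus
genEmployeeStatus = Gen.choice [pure Permanent, Temporary <$> genInt]

-- Every generator below is a constructor applied to the generators of its fields.
-- This is the repetitive part that type-directed function application automates.
genEmployee :: Gen Employee
genEmployee = Employee <$> genText <*> genEmployeeStatus

genDepartment :: Gen Department
genDepartment = Department <$> genText <*> Gen.list (Range.linear 0 3) genEmployee

genCompany :: Gen Company
genCompany = Company <$> genText <*> Gen.list (Range.linear 0 3) genDepartment
```

Written this way, every choice (how many departments, which names) is hard-wired into the call graph, which is what makes later overrides awkward.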

In that sense we get as much power as typeclasses in terms of boilerplate reduction, but we can also solve all the other challenges!

(Note: you might have noticed GenIO instead of Gen; we’ll come back to that.)

Create generators for an ADT

This part is a bit more involved for “type-directed function application”.

Each constructor of a data type returns the same type.

When we want to create an EmployeeStatus, which constructor should we take, Permanent or Temporary?

The solution is to “tag” the values generated with each constructor. Then we have a function genEmployeeStatus taking the generators built with both constructors and choosing randomly (with Gen.choice) between them.
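The tagging idea can be sketched with plain Hedgehog as follows. The Tag newtype below is a stand-in defined for this example (the library's own type may differ), and the generator names genPermanent and genTemporary are hypothetical.

```haskell
{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE KindSignatures #-}

import           GHC.TypeLits (Symbol)
import           Hedgehog (Gen)
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range

data EmployeeStatus = Permanent | Temporary Int deriving (Eq, Show)

-- Stand-in tag: a phantom type-level string recording which constructor built the value.
newtype Tag (s :: Symbol) a = Tag { unTag :: a }

-- One generator per constructor; the tags give them distinct result types.
genPermanent :: Gen (Tag "permanent" EmployeeStatus)
genPermanent = pure (Tag Permanent)

genTemporary :: Gen Int -> Gen (Tag "temporary" EmployeeStatus)
genTemporary genInt = Tag . Temporary <$> genInt

-- The selector takes one generator per constructor and picks one at random.
genEmployeeStatus
  :: Gen (Tag "permanent" EmployeeStatus)
  -> Gen (Tag "temporary" EmployeeStatus)
  -> Gen EmployeeStatus
genEmployeeStatus permanent temporary =
  Gen.choice [unTag <$> permanent, unTag <$> temporary]

-- Manual wiring, shown only to make the types concrete.
genStatus :: Gen EmployeeStatus
genStatus = genEmployeeStatus genPermanent (genTemporary (Gen.int (Range.linear 1 12)))
```

Because each tagged generator has its own type, a type-directed lookup can find exactly one candidate for every argument of genEmployeeStatus.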

Again this can be completely automated, and there is a piece of Template Haskell producing the same expression:

$(makeGenerators ''EmployeeStatus)

Important limitation though! This will not work with recursive data types.

Take a typical data type for lambda expressions, Expr. A generator for its App constructor would have the signature:

genApp :: Gen Expr -> Gen Expr -> Gen (Tag "app" Expr)

But the selector generator is:

genExpr :: Gen (Tag "var" Expr) -> Gen (Tag "lam" Expr) -> Gen (Tag "app" Expr) -> Gen Expr

so genApp and genExpr are mutually recursive.

This is, for now, prohibited by the registry library. I think it can be done safely, but it needs more work.

Override a generator

This is where getting away from typeclasses starts to pay off.

We want to generate only Permanent employees for a company.

To do this we “override” the registry with a fixed generator for EmployeeStatus: we add “on top” of the registry a generator for EmployeeStatus always returning the same value.

When the registry library tries to create a generator for Company, it takes the first available generator for EmployeeStatus.
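The “first available wins” rule can be illustrated with a toy registry built on Data.Dynamic. This is only a sketch of the lookup idea, not the registry library's implementation; MiniRegistry, addOnTop and firstOf are hypothetical names, and plain values are stored instead of generators to keep it short.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Maybe (listToMaybe, mapMaybe)
import Data.Typeable (Typeable)

data EmployeeStatus = Permanent | Temporary Int deriving (Eq, Show)

-- A toy registry: an ordered list of values of any type, most recent first.
newtype MiniRegistry = MiniRegistry [Dynamic]

emptyRegistry :: MiniRegistry
emptyRegistry = MiniRegistry []

-- Adding "on top" means putting the new entry in front of the existing ones.
addOnTop :: Typeable a => a -> MiniRegistry -> MiniRegistry
addOnTop a (MiniRegistry entries) = MiniRegistry (toDyn a : entries)

-- Lookup is directed by the requested type and returns the first match.
firstOf :: Typeable a => MiniRegistry -> Maybe a
firstOf (MiniRegistry entries) = listToMaybe (mapMaybe fromDynamic entries)

baseRegistry :: MiniRegistry
baseRegistry = addOnTop (Temporary 1) (addOnTop (42 :: Int) emptyRegistry)

-- The override shadows the earlier EmployeeStatus entry and leaves the rest untouched.
onlyPermanent :: MiniRegistry
onlyPermanent = addOnTop Permanent baseRegistry
```

Here firstOf onlyPermanent yields Permanent when an EmployeeStatus is requested, while the Int entry is still found as before.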

Override a relation

That’s an easy one given the previous section.

If you want more departments in your company, change the function which produces a Gen [Department]. The function being overridden is simply the Gen.list function, which takes another generator as a parameter.

You can do the same for other types of “relations”: Maybe, Set, Map, NonEmpty,… This is super-useful in real-life testing because you can really fine-tune your generation to a specific test scenario.
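As a small illustration of such a “relation” function in plain Hedgehog, the two list builders below differ only in their range. The names genFew and genMany and the chosen bounds are assumptions for this sketch.

```haskell
import           Hedgehog (Gen)
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range

-- A default relation: between 0 and 3 elements.
genFew :: Gen a -> Gen [a]
genFew = Gen.list (Range.linear 0 3)

-- An override for a scenario needing many elements: between 5 and 10.
genMany :: Gen a -> Gen [a]
genMany = Gen.list (Range.linear 5 10)
```

Both have the shape Gen a -> Gen [a], so swapping one for the other changes the size of the generated lists without touching the element generator.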

Override a generator in a specific context

We want to generate only short and upper-cased names for Departments. What do we do?

There’s nothing more to do.

Instead of just adding the new generator for Text on top of the registry (it would be used everywhere in that case) we declare that we only want to use it when creating Departments.
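A generator for such names could look like the following in plain Hedgehog. The name genDepartmentName is the one used in the next section; the maximum length of 5 and the isShortUpper helper are assumptions made for this sketch.

```haskell
import           Data.Char (isUpper)
import           Data.Text (Text)
import qualified Data.Text as T
import           Hedgehog (Gen)
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range

-- Short, upper-cased names: 1 to 5 characters taken from 'A'..'Z'.
genDepartmentName :: Gen Text
genDepartmentName = Gen.text (Range.linear 1 5) Gen.upper

-- The constraint we expect every generated department name to satisfy.
isShortUpper :: Text -> Bool
isShortUpper name = T.length name >= 1 && T.length name <= 5 && T.all isUpper name
```

The generator itself knows nothing about Department; restricting it to that context is the registry's job.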

Compose generation constraints

If you think of setting a new generator as a “Registry modification” and write it as such:

setShortDepartmentNames :: Registry _ _ -> Registry _ _
setShortDepartmentNames = specializeGen @Department genDepartmentName

then it is easy to see that all those modifications can be composed using the simple function composition operator (.):

setupCompany = setShortDepartmentNames . setManyDepartments

That’s a feature which comes for free, thank you Haskell!

We can also see this as “stateful transformations” in a State monad, using specializeGenS instead of specializeGen.

Indeed there are MonadState xxxS combinators in the library for setting generators or specializing them.

We can still compose constraints with the >> operator:

setupCompanyS = setShortDepartmentNamesS >> setManyDepartmentsS

This also means that we can write “stateful” Hedgehog properties, because the property type in Hedgehog is a transformer, PropertyT. At the beginning of the property declaration we pass the registry containing all our generators with runS generators, and inside the property we modify the registry statefully with setSmallCompany or setDepartmentName.
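The two composition styles can be compared on a stand-in for the registry. In this sketch a small Settings record plays the role of the Registry, so the bodies of the setters are hypothetical; only the way they compose mirrors the text. It assumes the mtl package for Control.Monad.State.

```haskell
import Control.Monad.State (State, execState, modify)

-- Stand-in for the registry: just the two knobs touched in this example.
data Settings = Settings
  { maxNameLength  :: Int
  , minDepartments :: Int
  } deriving (Eq, Show)

-- Modifications as plain functions...
setShortDepartmentNames :: Settings -> Settings
setShortDepartmentNames s = s { maxNameLength = 5 }

setManyDepartments :: Settings -> Settings
setManyDepartments s = s { minDepartments = 10 }

-- ...compose with (.)
setupCompany :: Settings -> Settings
setupCompany = setShortDepartmentNames . setManyDepartments

-- The same modifications as stateful actions...
setShortDepartmentNamesS :: State Settings ()
setShortDepartmentNamesS = modify setShortDepartmentNames

setManyDepartmentsS :: State Settings ()
setManyDepartmentsS = modify setManyDepartments

-- ...compose with (>>)
setupCompanyS :: State Settings ()
setupCompanyS = setShortDepartmentNamesS >> setManyDepartmentsS
```

Since the two modifications touch different fields, both styles end in the same Settings; when they overlap, the order of application decides which one wins.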

Create effectful generators

This is another challenge which is quite hard to solve with regular Arbitrary typeclasses or normal Hedgehog generators, but it is essential in practice.

When we generate data types like names, identifiers, nonces, keys,… we very often have a uniqueness constraint on them: “all the departments must have a unique name”.

This means that the generation of department names must be effectful.

You must remember the previously generated departments in order to generate new distinct ones.

This is supported by the registry-hedgehog library with the setDistinct function:

setDistinct :: (Eq a) => Registry ins out -> IO (Registry ins out)

As you can see, we are now doing an effectful transformation of the registry in order to keep track of generated values.

But this is not really an issue since the property type used in Hedgehog is PropertyT IO () which gives us the possibility to use IO.

This makes a “stateful” declaration very straightforward because the signature of setDistinctS is:

setDistinctS :: (Eq a, MonadState (Registry ins out) m, MonadIO m) => m ()

which allows us to blend such a constraint with others in the PropertyT (StateT (Registry _ _) IO) monad.

This also explains why we create generators as:

type GenIO = GenT IO

That’s because they need to produce some effect: in this case, adding a value to the list of created values once it has been generated.
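To make the “remember what was generated” idea concrete, here is a minimal sketch of a distinct-values wrapper using an IORef. It is not the library's setDistinct; distinctWith and makeDistinct are hypothetical names, and the sketch retries forever if the underlying generator runs out of new values. How such effects interact with Hedgehog's shrinking is not addressed here.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Hedgehog (GenT)

type GenIO = GenT IO

-- Run the generator, retrying until it yields a value not seen before,
-- then record that value. Works in any monad that can perform IO.
distinctWith :: (MonadIO m, Eq a) => IORef [a] -> m a -> m a
distinctWith seenRef gen = do
  a <- gen
  seen <- liftIO (readIORef seenRef)
  if a `elem` seen
    then distinctWith seenRef gen
    else do
      liftIO (modifyIORef' seenRef (a :))
      pure a

-- Creating the memory is itself an effect, hence the result in IO.
makeDistinct :: Eq a => GenIO a -> IO (GenIO a)
makeDistinct gen = do
  seenRef <- newIORef []
  pure (distinctWith seenRef gen)
```

The shape of makeDistinct, an IO action returning a generator, is the same shape as the effectful registry transformation described above.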

I think this feature is a “killer feature” for the registry-hedgehog library.

Generating distinct values has always been a bit troublesome and now it can be done in one line of code.

Cycle constructors

Now that we have more control over effects for generating values, we can apply it to the final challenge.

When you have a large ADT with many constructors, you might want to make sure that over 100 tests, which is the default number of tests for a given property execution, you use as many of those constructors as possible.

However, the default method to select constructors in the $(makeGenerators ''EmployeeStatus) Template Haskell code is Gen.choice.

It would be nice to use a function like cycle:

cycle :: [Gen a] -> IO (Gen a)

which goes through all the generators, one by one, and comes back to the beginning of the list.

This is an effectful function because we need to remember which generator was selected last.

Fortunately we can do all of this because makeGenerators does not hard-code Gen.choice as the selection strategy. It requires a parameter, GenIO Chooser, containing the selection strategy. The default Chooser value in the registry uses Gen.choice internally, but nothing prevents us from passing a new one!

This is exactly what the function setCycleChooser is doing. A property using it generates EmployeeStatus values and will repeatedly generate Permanent, Temporary 1, Permanent, Temporary 1, Permanent, Temporary 1, … cycling through the 2 constructors of the EmployeeStatus datatype.
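A round-robin selection of this kind can be sketched with an IORef holding the index of the next generator. This is an illustration of the idea, not the code behind setCycleChooser; cycleWith and cycleGen are hypothetical names, and the list of generators is assumed to be non-empty.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)
import Data.IORef (atomicModifyIORef', newIORef)
import Hedgehog (GenT)

type GenIO = GenT IO

-- Build an action which, each time it runs, picks the next element of the
-- list and wraps around at the end. Assumes a non-empty list.
cycleWith :: MonadIO m => [m a] -> IO (m a)
cycleWith gens = do
  counter <- newIORef (0 :: Int)
  pure $ do
    -- Read the current index and increment it in one step.
    i <- liftIO (atomicModifyIORef' counter (\n -> (n + 1, n)))
    gens !! (i `mod` length gens)

-- The same function specialized to effectful generators.
cycleGen :: [GenIO a] -> IO (GenIO a)
cycleGen = cycleWith
```

With two alternatives the selection alternates strictly between them, which is the Permanent, Temporary 1, Permanent, … pattern described above.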

Conclusion

I hope I have demonstrated today that automating function application is useful, that typeclasses are not the only tool at our disposal to do it, and that getting away from typeclasses gives us unprecedented expressivity to declare how data generation should be done.

The registry-hedgehog library is available on Github and Hackage, please give it a go and tell me what you think!
