Charts nearly form monoids — algebraic plotting with Altair

Quinn Dougherty, Mar 3

Monoid looks like a jargon word if you’ve never seen it before, so you’re authorized to call it “puppies” every time I write it.

[Image: The internet’s one true currency]

But it’s not unfamiliar, because I can readily write down

    1 + 3 + 0

and you can tell me everything I need to know about it.

Evaluation order doesn’t matter. You know that 1 + 3 then 4 + 0 is the same as 3 + 0 then 1 + 3.

Adding zero is identity. You know 0 is neutral, or to speak in plain python, identity = lambda x: x + 0 satisfies identity(x) == x for every number x.

You’re done.

You’re all caught up with the jargon.

That’s all a p̶u̶p̶p̶y̶ monoid is.

Out in the wild, whenever you encounter a p̶u̶p̶p̶y̶ binary, closed operator (where binary means it combines two things, and closed means it lands back in the type where the two things came from), look for the above properties — called associativity and neutrality.
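To make the two laws concrete, here is a minimal sketch that spot-checks associativity and neutrality on a handful of sample values. The helper name `looks_like_monoid` is made up for this illustration, and passing the check on a few samples is evidence, not proof.

```python
from itertools import product
from typing import Callable, Iterable, TypeVar

A = TypeVar("A")


def looks_like_monoid(op: Callable[[A, A], A], neutral: A, samples: Iterable[A]) -> bool:
    """Spot-check the two monoid laws on sample values (hypothetical helper)."""
    xs = list(samples)
    # Associativity: grouping must not matter.
    associative = all(
        op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(xs, repeat=3)
    )
    # Neutrality: combining with the neutral element changes nothing.
    has_neutral = all(op(neutral, a) == a == op(a, neutral) for a in xs)
    return associative and has_neutral


print(looks_like_monoid(lambda x, y: x + y, 0, [1, 3, 0, -7]))   # True: addition with 0
print(looks_like_monoid(lambda s, t: s + t, "", ["pup", "py"]))  # True: string concat with ""
print(looks_like_monoid(lambda x, y: x - y, 0, [1, 3, 0, -7]))   # False: subtraction
```

Subtraction is binary and closed on numbers, yet it fails both laws, which is why it is not a monoid.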

    + : (num, num) -> num    (commonsense addition)
    ⊕ : (a, a) -> a          (ANY binary, closed operation on any type a)

alt.Chart

In data visualization, you rarely see a whole plot in one line of code.

Programming in plt (matplotlib’s pyplot) doesn’t encourage you to treat a plot as a value of some type that you can pass into a function or get back out of one.

    import pandas as pd
    from sklearn import datasets
    import altair as alt

    # Load iris and assemble one DataFrame: the features plus a named target column
    iris_load = datasets.load_iris()
    featnames = iris_load['feature_names']
    X = pd.DataFrame(iris_load.data, columns=featnames)
    y = pd.DataFrame(iris_load.target, columns=['target']).replace(
        {k: flowa for k, flowa in zip(range(3), iris_load.target_names)})
    iris = pd.concat([X, y], axis=1)

    C = alt.Chart(iris)

Instantiate altair’s Chart object with a dataframe. The rest is methods.

    C.mark_point().encode(x=featnames[0], y=featnames[1])

This line is in fact a program of type Chart. When executed it draws a plot on the screen.

You can define a function that takes a Chart and a feature name and returns a Chart.

    def chart_func(C: alt.Chart, y: str) -> alt.Chart:
        return C.encode(y=y)

    sepallength = C.mark_point().encode(x=featnames[0])
    chart_func(sepallength, featnames[1])

If you run this, you get the exact same picture as above.

Input and output types are useful, because you can build Charts through composition.

    def chart_func_two(Ch: alt.Chart, c: str) -> alt.Chart:
        return Ch.encode(color=c)

    chart_func_two(chart_func(sepallength, featnames[1]), featnames[2])

Which is saying “Give me a Chart of iris data. Assign sepal length to the x axis. Then, assign sepal width to the y axis. Finally, assign petal length to color.”

Altair’s monoids

How can you combine Charts besides method chaining?

    sepal = sepallength.encode(color=featnames[1])
    sepal.encode(y=featnames[2]) | sepal.encode(y=featnames[3])

The operation | takes two Charts and concatenates them horizontally.

Charts form three monoid-ish structures of interest to us.

The type signatures look roughly like this:

    | : (Chart, Chart) -> Chart    (horizontal concat)
    & : (Chart, Chart) -> Chart    (vertical concat)
    + : (Chart, Chart) -> Chart    (overlay)

I say roughly because the return isn’t exactly Chart, but something in the top-level objects family, which behaves exactly like Chart in most situations.
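To see what a closed binary operator looks like from the implementer’s side, here is a toy sketch in plain Python. `Panel` is a hypothetical stand-in invented for this illustration (it is not part of Altair); it overloads `|` and `&` in the spirit of the signatures above, and both return another `Panel`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Panel:
    """Toy stand-in for a chart: a block of text rows (not an Altair class)."""
    rows: Tuple[str, ...]

    def __or__(self, other: "Panel") -> "Panel":
        # Horizontal concat: glue rows side by side (assumes equal heights).
        return Panel(tuple(f"{left} {right}" for left, right in zip(self.rows, other.rows)))

    def __and__(self, other: "Panel") -> "Panel":
        # Vertical concat: stack the other panel's rows underneath.
        return Panel(self.rows + other.rows)


a, b, c = Panel(("aa", "aa")), Panel(("bb", "bb")), Panel(("cc", "cc"))

# Closed: combining two Panels lands back in Panel.
assert isinstance(a | b, Panel) and isinstance(a & b, Panel)
# Associative: grouping does not change the result.
assert (a | b) | c == a | (b | c)
assert (a & b) & c == a & (b & c)

print("\n".join(((a | b) & c).rows))
```

Unlike this toy, Altair’s operators hand back a related top-level type rather than Chart itself, which is the “roughly” in the signatures.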

For instance, I can overlay three charts together then apply a color:

    (sepallength.encode(y=featnames[1])
     + sepallength.encode(y=featnames[2])
     + sepallength.encode(y=featnames[3])).encode(color='target:N')

None of the binary operators on Charts is entirely closed, and neither | nor & has a neutral element.

We can sort of give + a neutral, however.

    zero = C.mark_point(opacity=0)

Then for any Chart D built from C, zero + D, D, and D + zero will each look exactly the same.

Review

Altair Charts are combined with associative, binary operators.

They’re not 100% closed, but they give you most of the benefits of closure when you need them (as we’ll see in a minute).

Through method chaining and function composition, Charts of arbitrary complexity can be executed in a single line of code.

Meta-review

When a programmer sees something in the wild, ze wants to know which basic properties or laws it obeys (i.e., Chart combination rules reminded us of ordinary integer addition).

When such patterns are identified, a programmer knows that they can borrow ideas from other things that obey that pattern (i.e., our next section today!).

from functools import reduce

reduce is a higher order function that takes a binary operation and a list to return a single value.

    reduce : ((a,a) -> a), List[a] -> a

Consider the case where the type a is num.

I’ll show you an evaluation.

As you can see, flipping + to * would change the result to 0 , but the evaluation tree would have been exactly the same.
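Here is a sketch of such an evaluation in code, assuming the list is [1, 3, 0] (borrowed from the opening example; the numbers in the original figure may differ). The `traced` wrapper is a hypothetical helper that prints each step of the reduction.

```python
from functools import reduce
import operator


def traced(op, symbol):
    """Wrap a binary operation so each step of the reduction is printed."""
    def step(acc, x):
        result = op(acc, x)
        print(f"({acc} {symbol} {x}) = {result}")
        return result
    return step


xs = [1, 3, 0]  # assumed sample list

total = reduce(traced(operator.add, "+"), xs)    # (1 + 3) = 4, then (4 + 0) = 4
product = reduce(traced(operator.mul, "*"), xs)  # (1 * 3) = 3, then (3 * 0) = 0
print(total, product)  # 4 0
```

Both runs walk the list in the same left-to-right shape; only the operation plugged into each step differs.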

The point is to realize that you use a particular instance of reduce every day in a basic python standard function.

If you see where I’m going with this, pause a minute and try to write it out before reading the next codeblock.

    sum = lambda xs: reduce(lambda x, y: x + y, xs)

This is the linked list or recursive interpretation of sum, not what python actually implements.

reduce is linear in list length, and because it calls a Python lambda at every step it is slower than the built-in sum.

The point is that it allows us to combine a list with respect to any monoid we like.
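One practical payoff of the neutral element: passed as reduce’s optional third argument, it gives a sensible answer for an empty list. The `combine` helper below is a hypothetical name for this sketch.

```python
from functools import reduce
from typing import Callable, List, TypeVar

A = TypeVar("A")


def combine(op: Callable[[A, A], A], neutral: A, xs: List[A]) -> A:
    """Fold a list with a monoid: a binary operation plus its neutral element."""
    # The neutral element seeds the fold, so an empty list returns it unchanged.
    return reduce(op, xs, neutral)


print(combine(lambda x, y: x + y, 0, [1, 3, 0]))        # 4
print(combine(lambda x, y: x * y, 1, [2, 3, 4]))        # 24
print(combine(lambda s, t: s + t, "", ["mo", "noid"]))  # monoid
print(combine(lambda x, y: x + y, 0, []))               # 0

# Without a starting value, reduce has nothing to return for an empty list.
try:
    reduce(lambda x, y: x + y, [])
except TypeError as err:
    print("TypeError:", err)
```

This is also why | and & are a little awkward to reduce over a possibly empty list of Charts: with no neutral Chart, there is nothing to seed the fold with.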

Think about the following row of charts

    C2 = alt.Chart(iris, height=240, width=240).mark_point().encode(x=featnames[3]+':Q', color='target:N')
    C2.encode(y=featnames[0]) | C2.encode(y=featnames[1]) | C2.encode(y=featnames[2])

[Figure: Horizontal concat with `|`]

How much redundant code do you see in the execution call? Can that redundant code be factored out? Pause a minute, and try to write it out if you see where I’m going with this. Don’t worry, the codeblock will still be there when you get back!

    # featnames[:3] covers the same three y features as the hand-written row above
    reduce(lambda D, F: D | F, [C2.encode(y=name) for name in featnames[:3]])

Pairplot

Suppose you show up to the office tomorrow morning and your boss slams a coffee-stained memo on your desk.

“I need a function from matrices to scalars, pronto”.

You look at the memo, and indeed a type signature is scribbled on the bottom

    f : List[List[float]] -> float

“Uh”, you start, unsure if what you’re about to ask is a stupid question. “What does the function, erm, you know, do?”

He looks perplexed for a moment, finally says “oh, just multiply together all the row-sums” and walks off.

You sit and ponder that a moment, waiting for the coffee to drip into the filament of your mind’s lightbulbs.

Before long, “aha!” you exclaim, having just spent the previous night reading a post about monoids and functools.reduce on Medium.

“Since it’s two monoids in one program (plus and times), I can have two reduces — first make a listcomp of sums, then take the product of that list”.

The duck on your desk remains stoic.

As soon as your station fires up, you clackety-clack the function to the screen.

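    from functools import reduce
    from typing import List

    def f(mat: List[List[float]]) -> float:
        # product (outer reduce) of the row sums (inner reduce)
        return reduce(lambda c, d: c * d, [reduce(lambda r, s: r + s, row) for row in mat])

You even jot down a unit test

    assert f([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) == 2160

And crumple up the memo, toss it into the bin over your shoulder.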

Time to get back to work, as you had been writing a report with Seaborn’s pairplot.

pairplot draws a grid of plots, showing each combination of two covariates.

“That’s strange,” you mutter to yourself.

“That Medium post I was reading last night mentioned something about pairplot”, but you don’t remember exactly what.

You left it open in a tab and went to sleep before finishing it.

But as soon as you see the grid this morning, “wait a minute, this grid of charts is sort of like a matrix of charts”, and of course you could write

    def covariates(i: int, j: int) -> alt.Chart:
        # C has no mark yet, so add one before encoding
        return C.mark_point().encode(x=featnames[i], y=featnames[j])

    pairplot = [[covariates(0,0), covariates(0,1), covariates(0,2), covariates(0,3)],
                [covariates(1,0), covariates(1,1), covariates(1,2), covariates(1,3)],
                [covariates(2,0), covariates(2,1), covariates(2,2), covariates(2,3)],
                [covariates(3,0), covariates(3,1), covariates(3,2), covariates(3,3)]]

“This certainly describes each Chart in pairplot, but can I run it and have them render?” you inquire.

The duck continues to be of no help. “I am but a simple duck, I know nothing of compumancy”

Each of those rows must be combined by horizontal concatenation, so for a fixed row i (all four columns)

    reduce(lambda C, D: C | D, [covariates(i, j) for j in range(4)])

That will output a Chart for each i, so if we called that C_row(i), then

    reduce(lambda R, S: R & S, [C_row(i) for i in range(4)])

“Now hold on, this is just the function f from the memo today!”

The duck looks sort of relieved.
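The resemblance can be checked without drawing anything. In the sketch below, plain strings stand in for Charts (an assumption made purely for illustration): `hcat` plays the role of `|`, `vcat` plays the role of `&`, and `fold_grid` is a hypothetical helper that captures the shared reduce-of-reduces shape.

```python
from functools import reduce
from typing import Callable, List, TypeVar

A = TypeVar("A")


def fold_grid(outer: Callable[[A, A], A], inner: Callable[[A, A], A], grid: List[List[A]]) -> A:
    """Combine each row with `inner`, then combine the row results with `outer`."""
    return reduce(outer, [reduce(inner, row) for row in grid])


def row_sum_product(mat: List[List[float]]) -> float:
    """The memo's function: multiply together all the row sums."""
    return fold_grid(lambda c, d: c * d, lambda r, s: r + s, mat)


print(row_sum_product([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # 2160

# The same shape, with strings standing in for Charts.
hcat = lambda left, right: left + " | " + right  # stand-in for Chart | Chart
vcat = lambda top, bottom: top + "\n" + bottom   # stand-in for Chart & Chart
names = ["a", "b"]
grid = [[f"{x}~{y}" for y in names] for x in names]
print(fold_grid(vcat, hcat, grid))
```

Swapping (times, plus) for (vcat, hcat) changes what gets combined but not how the combining is organized.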

We’ll also factor out the helper functions.

    def pairplot(C: alt.Chart, names: List[str]) -> alt.Chart:
        # each row is joined with | (inner reduce), then rows are stacked with & (outer reduce)
        return reduce(lambda R, S: R & S,
                      [reduce(lambda A, B: A | B,
                              [C.encode(x=name_x, y=name_y) for name_y in names])
                       for name_x in names])

    pairplot(alt.Chart(iris, width=115, height=115).mark_point(size=7).encode(color='target'), featnames)

[Figure: Beautiful]

Conclusion

Find the basic patterns in the weird things you encounter, and see if weird things satisfy basic laws (i.e. Chart binary ops satisfy associativity).

Use those laws to find out which ready-made abstractions are in your arsenal for the situation (i.e. reduce on Charts).

appendix:

Don’t typically roll your own basic functions like sum. This is illustrative.

If you want pairplot in altair use Chart.repeat (docs). Again, illustrative.

In Haskell, reduce is called “fold” and it’s not only for monoids, but for a slightly broader class of binary operations.

For this reason, it has foldl and foldr for left- or right-associative binary ops, respectively.

Also notice that when I gave the type signature reduce : ((a,a) -> a), List[a] -> a I lied to you: the real type signature is slightly more general, so that it can accept this broader class.
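To illustrate that broader signature: once a starting value is supplied, the accumulator may have a different type from the list elements, roughly reduce : ((b, a) -> b), List[a], b -> b. Python’s reduce works from left to right, like foldl; the `fold_right` helper below is a hypothetical name for a right-associating counterpart, written only for this sketch.

```python
from functools import reduce
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def fold_right(op: Callable[[A, B], B], xs: List[A], start: B) -> B:
    """Right-associating fold: op(x1, op(x2, ... op(xn, start)))."""
    acc = start
    for x in reversed(xs):
        acc = op(x, acc)
    return acc


# Accumulator (int) differs from the element type (str): total length of the words.
print(reduce(lambda n, word: n + len(word), ["fold", "left"], 0))  # 8

# Subtraction is not associative, so the two directions disagree.
print(reduce(lambda acc, x: acc - x, [1, 2, 3], 0))      # ((0 - 1) - 2) - 3 = -6
print(fold_right(lambda x, acc: x - acc, [1, 2, 3], 0))  # 1 - (2 - (3 - 0)) = 2
```

For a genuine monoid the direction makes no difference to the result, which is why the simplified signature was good enough for Charts.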

—enjoy the notebook.
