Cleaner R Code with Functional Programming

What’s a data analyst to do? The solution is to use applies (also called maps).

Maps take a collection of things, and apply some function to each one of those things.
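As a minimal sketch of that idea, the base R call below applies a squaring function to each element of a small vector. The names square and inputs are made up for illustration; with purrr attached, map_dbl(inputs, square) should give the same result.

```r
# Hypothetical helper and data, for illustration only.
square <- function(x) x^2
inputs <- c(1, 2, 3, 4)

# vapply() is a base R "apply": it calls square() on each element
# and checks that every result is a single number.
squares <- vapply(inputs, square, numeric(1))
squares
# [1]  1  4  9 16
```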

Here’s a diagram taken directly from RStudio’s purrr cheat sheet (Credit: Mara Averick):

[Image: purrr cheat sheet diagram]

Sidenote: The dplyr package actually gets its name from applies.

dplyr = data + apply + R.

The purrr package contains a ridiculous number of maps from which to choose.

Seriously, check out that cheatsheet!

Example, bringing it all together: Suppose I had a vector of strings, and I wanted to extract the longest word in each.

There is no vectorized function that will do this for me.

I will need to split the string by the space character and get the longest word.

For dramatic effect, I also upper-case the strings and paste them back together:

    library(tidyverse)
    library(purrr)  # already attached by tidyverse; harmless to repeat

    sentences <- c(
      "My head is not functional",
      "Programming is hard",
      "Too many rules"
    )

    # Given the words of one sentence, return the longest one
    # (which.max() picks the first in case of a tie).
    getLongestWord <- function(words) {
      word_counts <- str_length(words)
      longest_word <- words[which.max(word_counts)]
      return(longest_word)
    }

    sentences %>%
      toupper() %>%
      str_split(' ') %>%
      map_chr(getLongestWord) %>%
      str_c(collapse = ' ')
    # [1] "FUNCTIONAL PROGRAMMING RULES"

Bonus Step: Know the Lingo

In other languages, some of the lingo of FP is built-in.

Specifically, there are three higher-order functions that make their way into almost every language, functional or not: map (which we’ve already covered), reduce, and filter.

A higher-order function is a function that either takes a function as an argument, returns a function, or both.
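Both halves of that definition can be sketched in a few lines of base R. The names apply_twice and make_multiplier are hypothetical, chosen only for this illustration.

```r
# Takes a function as an argument: apply f to x twice.
apply_twice <- function(f, x) f(f(x))

# Returns a function: build a multiplier for a fixed factor.
make_multiplier <- function(factor) {
  force(factor)  # evaluate now, so the returned function keeps this value
  function(x) x * factor
}

double <- make_multiplier(2)
apply_twice(double, 5)
# [1] 20
```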

Filtering in R is easy.

For data frames, we can use dplyr::filter() from the tidyverse.

For most other things, we can simply use R’s vectorization.
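For a plain vector, the vectorized approach might look like the sketch below: build a logical vector with a comparison, then index with it. The variable names are illustrative.

```r
x <- 1:10  # illustrative data

# A vectorized comparison yields one TRUE/FALSE per element...
is_even <- x %% 2 == 0

# ...and indexing with that logical vector keeps only the TRUE positions.
evens <- x[is_even]
evens
# [1]  2  4  6  8 10
```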

However, when all else fails, base R does indeed have a Filter() function.

Example:

    Filter(function(x) x %% 2 == 0, 1:10)
    # [1]  2  4  6  8 10

Similarly, you probably won’t ever need Reduce() in R.

But just in case, here’s how it works: Reduce() takes a collection and a binary function (i.e., one that takes two parameters) and applies that function successively along the collection, combining the accumulated result with the next element each time.
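To make "successively" and "cumulatively" concrete, here is a small numeric sketch using addition as the binary function. The accumulate argument is part of base R's Reduce(); the variable names are illustrative.

```r
# Sum 1..5 by repeatedly combining the running result with the next element.
total <- Reduce(`+`, 1:5)
total
# [1] 15

# accumulate = TRUE keeps every intermediate result, exposing the steps:
# 1, 1+2, (1+2)+3, ...
running <- Reduce(`+`, 1:5, accumulate = TRUE)
running
# [1]  1  3  6 10 15
```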

Example:

    wrap <- function(a, b) paste0("(", a, " ", b, ")")
    Reduce(wrap, c("A", "B", "C", "D", "E"))
    # [1] "((((A B) C) D) E)"

Another beloved FP topic is that of currying.

Currying is the act of taking a function with many arguments and breaking it out into functions that each take only some of those arguments.

These are sometimes called partial functions or, more precisely, partially applied functions.
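As a rough sketch of what "breaking it out" means, here is a three-argument function rewritten by hand as a chain of one-argument functions. The function volume and the other names are hypothetical, used only for illustration.

```r
# An ordinary three-argument function (hypothetical example).
volume <- function(l, w, h) l * w * h

# A hand-curried version: each call takes one argument and returns
# a function waiting for the next one.
volume_curried <- function(l) function(w) function(h) l * w * h

# Supplying only some arguments leaves a function that remembers them.
base_2x3 <- volume_curried(2)(3)

base_2x3(4)
# [1] 24
volume(2, 3, 4)
# [1] 24
```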

The following example uses a function factory to make partial functions:

    # adder is a "function factory": a function that makes new functions.
    adder <- function(a) {
      return(function(b) a + b)
    }

    # Function factory pumping out new functions.
    add3 <- adder(3)
    add5 <- adder(5)

    add3(add5(1))
    # 9

Do you find this concept hard to follow? You’re not alone.

To make this slightly more readable, the functional library gives you an explicit currying builder:

    library(functional)

    add <- function(a, b) a + b
    add3 <- Curry(add, a = 3)
    add5 <- Curry(add, a = 5)

    add3(add5(1))
    # 9

Sidenote: The verb “currying” comes from Haskell Curry, famed Mathematician/Computer Scientist/Fellow Penn Stater.

Summary

Do you feel smarter? More powerful? Ready to torture your data with some of your new FP skills? Here are some of the big takeaways:

No more loops! Ever! Anytime you want to use a loop, find the appropriate apply/map.

Integrate the Tidyverse into your workflow wherever possible.

Use the pipe (%>%) when applying several functions to one thing (eg, a data frame being manipulated in the Tidyverse).
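As a small illustration of the first takeaway, here is the same computation written as a loop and as a base R apply. Note that nchar() is already vectorized, so in real code you would call it directly; it stands in here for any per-element function. The data is made up.

```r
words <- c("spaghetti", "ravioli", "map")  # illustrative data

# Loop version: pre-allocate, index, assign.
lengths_loop <- integer(length(words))
for (i in seq_along(words)) {
  lengths_loop[i] <- nchar(words[i])
}

# Map version: say what to compute for each element, nothing more.
lengths_map <- vapply(words, nchar, integer(1), USE.NAMES = FALSE)

identical(lengths_loop, lengths_map)
# [1] TRUE
```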

Adhering to these mindsets while coding can greatly reduce ugly, difficult-to-maintain spaghetti code.

Bottling things in functions can leave you with clean, readable, modular ravioli code.

I’ll leave you with a famous quote from John Woods:

Always code as if the [person] who ends up maintaining your code will be a violent psychopath who knows where you live.
