Build smart(er) applications with probabilistic models and AWS Lambda functions

Jacopo Tagliabue, Mar 11

A quick, cheap and serverless way to power endpoints with WebPPL probabilistic programs.

Oops!… I need to deploy a model again

“Probability is the very guide of life.” ― Cicero

While the cool kids today are busy Tensorflow-ing and PyTorch-ing, cooler kids are getting more and more into Probabilistic Programming (PP): see for example Pyro, Figaro, Stan and Infer.NET (the philosophically inclined reader who wants to know more about PP may start from our own opinion piece on concept learning).

If you are a data scientist/data engineer/A.I.-something and you’re considering adding some probabilistic magic to your existing data pipeline, without bothering your devOps team or spending cloud money, you may like the idea of having micro-services exposing models in a quick, cheap and straightforward way.

In this post, we leverage Serverless and AWS Lambda functions to deploy a WebPPL-powered endpoint in minutes.

As we shall see, the newly deployed endpoint, coupled with other serverless code (e.g. a one-page web app), becomes a prototype that can be shared within the organization for quick feedback and rapid iteration.

This is a sequel to the previous posts in the “being-lazy-at-devOps” series, which featured AWS Lambda functions with Tensorflow models and Spark Decision Trees.

Prerequisites

Before diving into the code, please make sure to:

- have a working node environment with npm installed;
- have an AWS account to deploy the code to AWS Lambda;
- set up Serverless (follow the instructions here; remember to set up your AWS credentials);
- if you have a WebPPL model ready to go, great (100 bonus points!); if not, no worries: we are going to ship one together!

As usual, you can find all the code on our GitHub page: let’s get to work.

A (toy) probabilistic app

“All models are wrong, but some are useful.” ― G. Box

If the semantics of programs in, say, JavaScript is a function from programs to executions, the semantics of its PP cousin, WebPPL, is a function from programs to distributions over executions.

In other words, when you run a program that rolls two dice (die1 and die2) and conditions on their sum being 8, the result is a distribution of outcomes, generated by i) enumerating the possible values for the random variables (die1 and die2), and ii) conditioning on a value (8) for the sum of die1 + die2, i.e. rejecting all the runs in which the condition is not met.

Frequency distribution for the value of the first die, conditional on the sum of the two being 8.
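To make the enumerate-then-condition recipe concrete without the WebPPL runtime, here is a minimal sketch in plain JavaScript that does by hand what inference does for the two-dice example. The helper name firstDieGivenSum is made up for this illustration and is not part of WebPPL.

```javascript
// Enumerate every (die1, die2) pair, keep only the runs that satisfy the
// condition, and normalize the surviving counts into a distribution over die1.
// NOTE: firstDieGivenSum is a hypothetical helper, not a WebPPL API.
function firstDieGivenSum(targetSum) {
  const counts = {};
  let kept = 0;
  for (let die1 = 1; die1 <= 6; die1++) {
    for (let die2 = 1; die2 <= 6; die2++) {
      // Conditioning: reject every run in which the sum is not the target.
      if (die1 + die2 !== targetSum) continue;
      counts[die1] = (counts[die1] || 0) + 1;
      kept++;
    }
  }
  const dist = {};
  for (const value of Object.keys(counts)) {
    dist[value] = counts[value] / kept;
  }
  return dist;
}

console.log(firstDieGivenSum(8));
```

For a sum of 8 the first die can only be 2, 3, 4, 5 or 6, and each value gets probability 0.2, which is the flat histogram described above.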

Since this is not an introduction to probabilistic programming as such, we just briefly mention that PP principles have been successfully applied to a wide variety of phenomena, such as skill rating, computer vision and inference over images (see also here and here), program synthesis, decision under uncertainty, natural language semantics and pragmatics, and obviously the thorny task of modeling human reasoning in cognitive sciences.

The non-lazy reader is strongly encouraged to explore the theoretical landscape further and learn about the many virtues of this family of tools (such as, for example, quantifying model uncertainty when serving predictions).

As anticipated, the language we will be using is WebPPL, a purely functional JavaScript subset augmented with probabilistic primitives, such as sample (a function which, unsurprisingly, samples values from a distribution).

To showcase the probabilistic nature of WebPPL, we are going to sample and draw data points from Gaussian Processes (GP): GPs are super cool, incredibly versatile and getting increasing attention in the data science community (obviously, what we discuss is not specific to GPs and can therefore be applied to other types of WebPPL programs, such as, say, inference for computer vision).

Charting priors from GPs with the RBF kernel (mean=0, sigma=1).
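For readers who want to see what "sampling from a GP prior" boils down to, here is a minimal sketch in plain JavaScript (not the WebPPL program used in the project). It assumes a zero mean and an RBF kernel whose signal scale and length scale both default to 1; reading the caption's sigma=1 this way is our assumption, and all function names are illustrative.

```javascript
// RBF (squared-exponential) kernel; sigma and lengthScale are assumed defaults.
function rbf(x1, x2, sigma = 1, lengthScale = 1) {
  const d = x1 - x2;
  return sigma * sigma * Math.exp(-(d * d) / (2 * lengthScale * lengthScale));
}

// Covariance matrix over the input points, with a small jitter on the
// diagonal to keep the Cholesky factorization numerically stable.
function covariance(xs, jitter = 1e-6) {
  return xs.map((a, i) => xs.map((b, j) => rbf(a, b) + (i === j ? jitter : 0)));
}

// Cholesky factorization: lower-triangular L such that L * L^T = K.
function cholesky(K) {
  const n = K.length;
  const L = K.map(() => new Array(n).fill(0));
  for (let i = 0; i < n; i++) {
    for (let j = 0; j <= i; j++) {
      let sum = K[i][j];
      for (let k = 0; k < j; k++) sum -= L[i][k] * L[j][k];
      L[i][j] = i === j ? Math.sqrt(sum) : sum / L[j][j];
    }
  }
  return L;
}

// Standard normal draw via the Box-Muller transform.
function randn() {
  const u = 1 - Math.random(); // in (0, 1], avoids log(0)
  const v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// One draw from the zero-mean GP prior at the points xs: f = L * z.
function sampleGpPrior(xs) {
  const L = cholesky(covariance(xs));
  const z = xs.map(() => randn());
  return L.map(row => row.reduce((acc, lij, j) => acc + lij * z[j], 0));
}

const xs = [-2, -1, 0, 1, 2];
console.log(sampleGpPrior(xs)); // a different curve on every call
```

Each call returns one random function evaluated on the grid; plotting several calls on the same axes gives a chart of prior draws like the one captioned above.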

The intended workflow is as follows:

1. the user requests the web app from the public URL;
2. a Lambda function gets invoked by API Gateway; the response is simply HTML code containing a basic form asking the user for some input variables;
3. the user submits her choices through an HTML button, which uses jQuery to send an AJAX request to the model endpoint;
4. a Lambda function gets invoked by API Gateway; the response is the output of running the chosen WebPPL program (more on that below) on the user input;
5. finally, some Chart.js code maps the endpoint response to a pretty data visualization.

A Lambda-powered web app for prototyping and data viz with WebPPL and standard JS libraries.
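The last step of the workflow can be sketched as a small pure function that turns the endpoint response into a Chart.js configuration object. The response shape used here ({ xs, samples }) and the function name are assumptions made for illustration; the real endpoint may return something different.

```javascript
// Map a (hypothetical) endpoint response of the form
//   { xs: number[], samples: number[][] }
// to a Chart.js line-chart configuration, one dataset per sampled curve.
function toChartConfig(response) {
  return {
    type: 'line',
    data: {
      labels: response.xs,
      datasets: response.samples.map((ys, i) => ({
        label: 'sample ' + (i + 1),
        data: ys,
        fill: false
      }))
    }
  };
}

// In the browser, the result would be handed to Chart.js, e.g.:
//   new Chart(document.getElementById('chart'), toChartConfig(response));
// ('chart' is a made-up canvas id.)
console.log(JSON.stringify(toChartConfig({ xs: [0, 1], samples: [[0.3, -0.1]] })));
```

Keeping the mapping separate from the AJAX call makes it easy to swap in a different chart type when the model changes.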

In the next section, we will see how to go from an idea living on a powerpoint to a working web application in seconds.

Talk is cheap, show me the code!

“Give a man a program, frustrate him for a day. Teach a man to program, frustrate him for a lifetime.” ― M. Waseem

Our project’s functionality depends on just two files:

- handler.js contains the code for the two functions detailed above;
- serverless.yml is a standard yml file describing the AWS infrastructure to run the two functions.

serverless.yml is a plain vanilla Serverless file that exposes Lambda functions as endpoints through API Gateway: nothing special here (the World Wide Web is already full of tutorials covering this setup, including our own)!

handler.js is where all the magic happens.

The structure is indeed fairly straightforward, so we are just going to highlight some key features:

- WebPPL is imported with the webppl = require('webppl') syntax at the top;
- app is a simple function returning a full HTML page, including jQuery functions at the end to add the required input fields and basic interaction; the HTML code is simply stored as a string and returned with the specific headers in the response;
- the model function is the one actually running the WebPPL code: it starts with some simple parameter checking/verification/type casting and then uses webppl.run(CODE) to execute the code specified in the const code variable.
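As a rough sketch of the app function described above (not the actual code from the repo), a Lambda handler behind API Gateway can return a page by sending the HTML string as the response body with a text/html content type. The page content below is a placeholder.

```javascript
// Placeholder page: the real project embeds a form, jQuery and Chart.js here.
const PAGE = '<!DOCTYPE html><html><body><h1>WebPPL playground</h1></body></html>';

// Lambda handler (API Gateway proxy integration): the HTML is just a string
// returned in the body, with a header telling the browser how to render it.
function app(event, context, callback) {
  callback(null, {
    statusCode: 200,
    headers: { 'Content-Type': 'text/html' },
    body: PAGE
  });
}

// In handler.js the function would be exported for Serverless to find it,
// e.g. module.exports.app = app;
```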

Please note that to pass query parameters to the model we use the simplest trick in the book, i.e. building a snippet (as a string) that declares the variables for the script, and making sure code uses the correct names for the input variables.

An obvious alternative would be the global store, but if there are no particular concerns (as in this case), our simple strategy works well enough.
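The string-snippet trick can be sketched as follows. This is a minimal illustration, not the repo's handler: the parameter names (mu, sigma), the toy WebPPL program and the helper names are all made up, and the runner is injected so the example does not depend on the webppl package. We assume the runner is called as run(program, callback), with the callback receiving the store and the program's return value, which is our understanding of webppl.run.

```javascript
// Toy WebPPL program (hypothetical): it refers to mu and sigma, which are
// declared by the generated prefix rather than in the program itself.
const code = 'var z = sample(Gaussian({mu: 0, sigma: 1}));\nmu + sigma * z;';

// Parse a query-string value as a finite number, or throw.
function toNumber(raw, name) {
  if (raw == null || String(raw).trim() === '') {
    throw new Error('Missing parameter: ' + name);
  }
  const n = Number(raw);
  if (!Number.isFinite(n)) throw new Error('Invalid numeric parameter: ' + name);
  return n;
}

// Build the full program: variable declarations first, then the model code.
// Only validated numbers are interpolated, so raw user text never ends up
// inside the program that gets executed.
function buildProgram(query) {
  const mu = toNumber(query.mu, 'mu');
  const sigma = toNumber(query.sigma, 'sigma');
  return 'var mu = ' + mu + ';\nvar sigma = ' + sigma + ';\n' + code;
}

// Create the Lambda handler around a runner.
// Assumption: run(program, (store, value) => ...) as in webppl.run.
function makeModel(run) {
  return function model(event, context, callback) {
    let program;
    try {
      program = buildProgram(event.queryStringParameters || {});
    } catch (err) {
      callback(null, { statusCode: 400, body: JSON.stringify({ error: err.message }) });
      return;
    }
    run(program, function (store, value) {
      callback(null, {
        statusCode: 200,
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ result: value })
      });
    });
  };
}

// In handler.js one would wire in the real runner, e.g.:
//   module.exports.model = makeModel(require('webppl').run);
```

Validating and casting before interpolation matters here: since the parameters become part of the executed code, passing raw strings through would let a caller inject arbitrary WebPPL.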

To see the project in action, just clone the repo, open a terminal, cd into the folder and download the WebPPL dependencies:

    npm install webppl

and then finally type:

    serverless deploy

If Serverless is installed correctly, the first deploy triggers the creation of the required resources on AWS (future deploy commands will be quicker).

After some waiting, you should be greeted by a success message in your terminal:

Congratulations, your lambdas are up and running!

If you point your browser to the /app URL as found in the terminal (see “Endpoints” in the image above), everything should work as expected:

Our fantastic web app is up and running (original video here)!

When you are happy with your prototyping and ready to clean up the resources, just use the remove command to safely remove the stack:

    serverless remove

Next steps

“A journey of a thousand models begins with a single lambda” ― (almost) Lao Tzu

In this brief post we showed how to deploy probabilistic models to AWS and serve predictions on demand through standard GET requests with a few lines of code: this setup is fast, reliable and cheap (if not free, since the first 1M requests/mo are completely on Jeff Bezos).

To adapt the provided template to your own prototyping, just change the WebPPL code in the model function, and then adjust HTML input fields, argument parsing and data visualization accordingly: in a few minutes, you will be able to share with anybody the result of your marvelous Bayesian modeling, or generate insightful visualizations to help with your scientific exploration.

On the visualization side, the WebPPL viz module can be of some inspiration to generalize what is now a simple-and-hacky chart: it shouldn’t take too long to use chart.js/d3 to build a reusable wrapper around the main use cases for WebPPL models [exercise for the non-lazy reader: what would be an interesting and engaging visualization for a linear regression case, such as the Single Regression example in Probmods?].

On the modeling side, we picked GPs to provide an end-to-end probabilistic program that looks good in the browser window, but obviously the world of probabilistic A.I. is great and, like the universe, in constant expansion: if you want some PP modeling ideas, there is a ton of cool stuff here.

Finally, if you really like this setup and you think it is ready for prime time, make sure to add tests, error handling and all the other essential things this tutorial is too short to contain.

Happy (probabilistic) coding!

See you, space cowboys

If you have questions, feedback or comments, please share your story with jacopo.tagliabue@tooso.ai.

Don’t forget to get the latest from Tooso on Linkedin, Twitter and Instagram.

Acknowledgments

None of this would have been possible without Luca, Senior Engineer and Node Wizard at Tooso: Luca built our first WebPPL-to-lambda playground and patiently answered all my (silly) node questions.

Most good ideas in this post are Luca’s, all the remaining mistakes are mine.
