R Shiny in Production

In this situation, the workload should be taken out of Shiny.

Being “taken out” means that the data must be ready and preprocessed before the application accesses it.

Two words in that sentence are essential: preprocessed and access.

An example of a database structure for a Shiny app (only some of the databases supported by R are displayed).

R is a single-threaded process.

This means that everything runs in sequence.

So, doing heavy calculations inside the app code can give you a headache and leave your users tired of waiting for your system’s response.

The figure above shows an example of the structure that could be used in this case, and we can see some of the databases supported by R.

Choose the database that best fits your app’s needs; we’ve previously posted an article about it: The right database for the job.

With this in mind, we now query the database and get “ready” data in response to a user request, instead of doing heavy calculations inside the app.
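Before any querying can happen, the app needs a connection object (the con used in the query example below). Here is a minimal sketch assuming a PostgreSQL database and the DBI and RPostgres packages; the host, database name, and credentials are placeholders:

# global.R - open one connection when the app starts (values are placeholders)
library(DBI)
library(dplyr)
library(dbplyr)

con <- dbConnect(
  RPostgres::Postgres(),
  host     = "db.example.com",
  dbname   = "telephones_db",
  user     = "shiny_app",
  password = Sys.getenv("DB_PASSWORD")  # keep secrets out of the code
)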

Another thing you have to avoid is querying the database without a filter.

Commonly, the data used is loaded in the global.R file:

data <- read_csv("data/my_app_data.csv")

or

data <- readRDS("data/my_app_data.RDS")

If the dataset has millions of observations, bringing all of it into the app will probably crash it.

The query should be filtered by some input or action from the user.

Below is an example from the Telephones by region app.

barPlot_data <- reactive({
  tbl(con, "telephones") %>%
    filter(region == input$region)
})

The object barPlot_data now holds the filtered data, ready to be used for plotting; no further calculation is needed.
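The reactive returns a lazy query, so it can be collected inside a render function when the plot is drawn. A short sketch, assuming the filtered table has year and count columns (the column names are illustrative):

output$barPlot <- renderPlot({
  df <- collect(barPlot_data())  # executes the filtered query and pulls only those rows into R
  barplot(df$count, names.arg = df$year,
          main = input$region, ylab = "Telephones")
})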

You can create indices in your database to boost performance even more, but filtering alone already goes a long way toward keeping latency low and the app responsive.
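For instance, assuming the PostgreSQL connection from earlier, an index on the filtered column can be created once from R (the index name is illustrative):

# Run once, e.g. in a setup script - speeds up lookups filtered by region
dbExecute(con, "CREATE INDEX IF NOT EXISTS idx_telephones_region ON telephones (region)")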

Deploy

Reliability is our primary objective when we deploy services in production, and that includes consistency.

Imagine you are working on an analysis in R and you send your code to a friend.

Your friend runs exactly this code on the same dataset but receives a slightly different result.

The difference can have various causes, such as a different operating system or a different version of an R package.

A solution to this problem is Docker! Docker is a tool designed to make it easier to create, deploy, and run applications by using containers.

In a way, “Docker is a bit like a virtual machine.

But unlike a virtual machine, rather than creating a whole virtual operating system, Docker allows applications to use the same Linux kernel as the system that they’re running on and only requires applications be shipped with things not already running on the host computer.

This gives a significant performance boost and reduces the size of the application.” (Source: https://opensource.com/resources/what-docker)

Using Docker is a great way to deploy a Shiny application.
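One way to picture it: a minimal Dockerfile sketch built on the public rocker/shiny image (the version tag, package list, and paths are illustrative and should be adapted to your app):

# Dockerfile - illustrative sketch, adjust packages and paths to your app
FROM rocker/shiny:4.3.1

# System and R dependencies needed to talk to PostgreSQL
RUN apt-get update && apt-get install -y libpq-dev && \
    R -e "install.packages(c('dplyr', 'dbplyr', 'DBI', 'RPostgres'))"

# Copy the app into the directory shiny-server serves by default
COPY . /srv/shiny-server/myapp

EXPOSE 3838

Building this image and running the container exposes the app on port 3838, served by the shiny-server process the base image already starts.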

Shiny application inside a container, establishing a connection to an external database.

Shiny application and database both containerized.

Above we can see two examples of deploying Shiny apps with Docker.

The app lives in an isolated environment, so there are no problems with packages, R versions, and the many other issues you can face when trying to run your code on another machine, be it a friend’s PC or an AWS EC2 instance.

The first one has only the app inside a container, with a connection to a PostgreSQL database outside the container.

This is the recommended scenario for a production environment: your data lives outside the container, so you don’t need to worry about losing it when the container restarts, or about the extra infrastructure you would need to avoid that.

Think of the database as one shared by your whole team/chapter, fed by ETLs and jobs; the data used by your service can be written there as well.
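As a sketch of that flow, a scheduled job (hypothetical names, run outside the Shiny container) could pre-aggregate the raw records and write the table the app queries:

# etl.R - hypothetical scheduled job, run outside the Shiny container
library(DBI)
library(dplyr)

# Connection details come from environment variables (variable names are illustrative)
con <- dbConnect(
  RPostgres::Postgres(),
  host     = Sys.getenv("DB_HOST"),
  dbname   = Sys.getenv("DB_NAME"),
  user     = Sys.getenv("DB_USER"),
  password = Sys.getenv("DB_PASSWORD")
)

raw <- read.csv("raw_telephone_records.csv")  # illustrative raw extract

by_region <- raw %>%
  group_by(region, year) %>%
  summarise(count = n(), .groups = "drop")  # pre-aggregate once, offline

dbWriteTable(con, "telephones", by_region, overwrite = TRUE)  # the app only reads this table
dbDisconnect(con)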

In the second one, both the Shiny app and the database are containerized.

This architecture speeds up the development flow, making it faster to try out different databases and to test your code and how your app queries the data.

When you’re done, though, all the data will be lost if you didn’t create a volume to store it on your host machine.

Shiny applications are powerful and can be used in production environments, letting Data Scientists develop data products from end to end.

To achieve this, the interaction between Data Scientists and Software Engineers is essential.

Models and data products need the expertise of both professionals to be released, creating services that are reliable and secure.

Are you interested? If you’d like to build context-aware products through location, check out our opportunities.

Also, we’d love to hear from you! Leave a comment and let us know what you would like us to talk about in upcoming posts.

A special thanks to Raíza and Abel.
