Why plumb JSON when you can ProtoBuf?

João Santiago · Mar 19

R is the main language of the Data Science team at Billie, so it is only natural that we use it at every stage of our analytical workflow, including deployment in production.

The amazing Plumber package allows R functions to be served as a RESTful service using only special comments, and does most of the heavy lifting for us. Out of the box, Plumber handles requests in JSON format or as character strings, and responses in a handful of formats including JSON, PDF and JPEG.
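For illustration, here is a minimal Plumber endpoint, adapted from Plumber's own documentation (the route and parameter names are arbitrary):

```r
# plumber.R — special comments turn an ordinary R function
# into a REST endpoint
#* Echo back the input
#* @param msg The message to echo
#* @get /echo
function(msg = "") {
  list(msg = paste0("The message is: '", msg, "'"))
}
```

Running `plumber::plumb("plumber.R")$run(port = 8000)` serves this function, and a GET to `/echo?msg=hi` returns a JSON response.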

This is enough for a lot of applications, but for the volumes of data we deal with, JSON is too slow and inefficient.

Enter Protocol Buffers.

Protocol Buffers and why you should care

Protocol Buffers are a technology developed at Google to serialize structured data, allowing communication between different services in a fast, simple and efficient manner.

Broadly speaking, it replaces JSON or XML as a serialization format. Those are fine for applications where, for example, human readability is necessary; but once you start sending gigabytes of data between APIs, all those brackets and colons suddenly become a huge burden.

Protocol Buffers removes all that cruft, resulting in a lean(er), binary message.

To use it, you start with a schema defining which variables are contained in the message, their type, and their order. You write this definition once for your data and use it on both ends of the wire: the API sending the request encodes the data based on it, and the target API quickly decodes the message.
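As a sketch, a `.proto` schema might look like this (the message and field names are illustrative; `example.TestResponse` matches the messagetype used later in this post):

```proto
// example.proto — an illustrative schema
syntax = "proto3";

package example;

message TestRequest {
  string name = 1;   // field numbers fix each field's place on the wire
  int32 count = 2;
}

message TestResponse {
  string greeting = 1;
}
```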

Additionally, because the schema is statically typed, it works as a contract between the two ends of the wire, and ensures consistency.

No more typecasting JSON strings to the correct types!

Most languages have an implementation of Protocol Buffers, R being no exception in the form of the RProtoBuf package, which has good documentation with useful examples.
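As a quick taste of what RProtoBuf looks like, here is a sketch using the `addressbook.proto` example schema that ships with the package:

```r
library(RProtoBuf)  # loads its bundled addressbook.proto on attach

# Build a typed message from the tutorial.Person descriptor
p <- new(tutorial.Person, name = "João", id = 1)

# Serialize to a compact binary raw vector...
raw_msg <- serialize(p, NULL)

# ...and parse it back, with types intact
p2 <- read(tutorial.Person, raw_msg)
p2$name
```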

To our delight, Plumber also allows setting custom filters (handling inputs) and serializers (handling outputs), which means we could extend Plumber to speak ProtoBuf.

Plumbing everything together

After browsing Stack Overflow for inspiration, and exploring Plumber's source code and docs, we found that:

- Plumber holds the original, raw request in req$rook.input
- req is an environment, which means it can hold our ProtoBuf message, e.g. as req$protobuf

After some experimentation we got information flowing in the correct format, and protopretzel was born: an R package expanding Plumber with a working filter and serializer for Protocol Buffers.

The package is under active development, and even though it is already capable of handling our use-cases, the API might still change, so consider it experimental.

Cool story, now show me the code

This is how you get Protocol Buffers support in Plumber using protopretzel:

First, write a .proto descriptor file. If you already have one, that's also great!

Next, add our new serializer (before the call creating the API object) and our new filter. Tag your functions with #* @serializer Protobuf so they use the new serializer, and modify them so that they return an RProtoBuf object.

Then start your API.
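A sketch of what such a tagged endpoint might look like, assuming an `example.proto` schema has been loaded with `RProtoBuf::readProtoFiles()`; the route, message and field names are illustrative, and protopretzel's filter is assumed to place the decoded message in `req$protobuf`:

```r
# plumber.R — an endpoint returning an RProtoBuf message
library(RProtoBuf)
readProtoFiles("example.proto")  # make the example.* descriptors available

#* Respond with a ProtoBuf-encoded greeting
#* @serializer Protobuf
#* @post /greet
function(req) {
  # protopretzel's filter unserializes the request body into req$protobuf
  msg <- req$protobuf
  new(example.TestResponse, greeting = paste0("Hello, ", msg$name, "!"))
}
```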

Finally, send a request that includes the type of message in the header, so that the filter knows which message type to unserialize. The response header will include the messagetype of the output function in the same format, e.g. Content-Type: application/x-protobuf; messagetype=example.TestResponse.
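For example, with httr (the URL, port, route and message fields here are illustrative and match the hypothetical example.proto schema above):

```r
library(RProtoBuf)
library(httr)

readProtoFiles("example.proto")
req_msg <- new(example.TestRequest, name = "Billie", count = 1)

# POST the binary payload, advertising its type in the Content-Type header
res <- POST(
  "http://localhost:8000/greet",
  body = serialize(req_msg, NULL),
  add_headers("Content-Type" =
    "application/x-protobuf; messagetype=example.TestRequest")
)

# Decode the binary body using the type named in the response header
read(example.TestResponse, content(res, "raw"))
```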

Done! Plumber is now receiving, handling and responding with binary ProtoBuf messages.

For a simple implementation and more examples, take a look at the protopretzel-playground repository.

There is still a lot of room for optimization and improvements, and we very much welcome PRs and issues.

Above all, we hope protopretzel proves useful to a lot of R users out there!

