Uber Has Been Quietly Assembling One of the Most Impressive Open Source Deep Learning Stacks in the Market

Artificial intelligence (AI) has been an atypical technology trend.

In a traditional technology cycle, innovation typically begins with startups trying to disrupt industry incumbents.

In the case of AI, most of the innovation in the space has been coming from the big corporate labs of companies like Google, Facebook, Uber or Microsoft.

Those companies are not only leading impressive tracks of research but also regularly open sourcing new frameworks and tools that streamline the adoption of AI technologies.

In that context, Uber has emerged as one of the most active contributors to open source AI technologies in the current ecosystem.

In just a few years, Uber has regularly open sourced projects across different areas of the AI lifecycle.

Today, I would like to review a few of my favorites.

Uber is a near-perfect playground for AI technologies.

The company combines all the traditional AI requirements of a large scale tech company with a front row seat to AI-first transportation scenarios.

As a result, Uber has been building machine learning and deep learning applications across a wide range of scenarios, from customer classification to self-driving vehicles.

Many of the technologies used by Uber teams have been open sourced and received accolades from the machine learning community.

Let’s look at some of my favorites.

Note: I am not covering technologies like Michelangelo or PyML, which are well documented but have not been open sourced.

Ludwig is a TensorFlow-based toolbox that allows users to train and test deep learning models without the need to write code.

Conceptually, Ludwig was designed around five fundamental principles.

Using Ludwig, a data scientist can train a deep learning model by simply providing a CSV file that contains the training data as well as a YAML file with the inputs and outputs of the model.

Using those two files, Ludwig performs a multi-task learning routine to predict all outputs simultaneously and evaluate the results.
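To make the two inputs concrete, here is a minimal Python sketch that writes a toy CSV and a matching YAML model definition to disk. The column names (`review`, `sentiment`), the rows, the file names, and the helper `write_ludwig_inputs` are all invented for illustration; only the general shape of the YAML (a list of input features and a list of output features, each with a name and a type) follows Ludwig's documented format.

```python
import csv
from pathlib import Path

# Hypothetical toy dataset: column names and rows are invented for illustration.
ROWS = [
    {"review": "great ride, friendly driver", "sentiment": "positive"},
    {"review": "car arrived very late", "sentiment": "negative"},
    {"review": "smooth trip and fair price", "sentiment": "positive"},
]

# Model definition: one text input and one categorical output.
# Feature names must match the CSV column headers.
CONFIG_YAML = """\
input_features:
  - name: review
    type: text
output_features:
  - name: sentiment
    type: category
"""


def write_ludwig_inputs(directory):
    """Write the CSV and YAML files and return their paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    csv_path = directory / "reviews.csv"
    yaml_path = directory / "model_definition.yaml"
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["review", "sentiment"])
        writer.writeheader()
        writer.writerows(ROWS)
    yaml_path.write_text(CONFIG_YAML, encoding="utf-8")
    return csv_path, yaml_path


if __name__ == "__main__":
    csv_file, yaml_file = write_ludwig_inputs("ludwig_demo")
    print(f"Wrote {csv_file} and {yaml_file}")
```

With Ludwig installed, these two files would be handed to its `ludwig train` command line tool. The exact flag names have changed between releases, so check the documentation for the version you are using.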

Under the covers, Ludwig provides a series of deep learning models that are constantly evaluated and can be combined in a final architecture.

The Uber engineering team explains this process by using the following analogy: “if deep learning libraries provide the building blocks to make your building, Ludwig provides the buildings to make your city, and you can choose among the available buildings or add your own building to the set of available ones.”

Pyro is a deep probabilistic programming language (PPL) released by Uber AI Labs.

Pyro is built on top of PyTorch and is based on four fundamental principles: being universal, scalable, minimal, and flexible.

These principles often pull Pyro’s implementation in opposite directions.

Being universal, for instance, requires allowing arbitrary control structure within Pyro programs, but this generality makes it difficult to scale.

However, in general, Pyro achieves a brilliant balance between these capabilities, making it one of the best PPLs for real world applications.
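To see what "arbitrary control structure" means in a probabilistic program, consider a model whose number of random choices is itself random. The sketch below is plain Python using only the standard library, not Pyro code; the function names are invented for illustration. It samples from a geometric distribution by flipping a biased coin until it succeeds.

```python
import random


def geometric(p, rng):
    """Count failed Bernoulli(p) trials before the first success.

    The loop length depends on the random draws themselves, so the model
    cannot be described by a fixed number of random variables.
    """
    failures = 0
    while rng.random() >= p:  # each draw succeeds with probability p
        failures += 1
    return failures


def estimate_mean(p, num_samples, seed=0):
    """Monte Carlo estimate of the expected number of failures."""
    rng = random.Random(seed)
    total = sum(geometric(p, rng) for _ in range(num_samples))
    return total / num_samples


if __name__ == "__main__":
    # The exact mean is (1 - p) / p, which is 3 for p = 0.25.
    print(estimate_mean(0.25, 20_000))
```

In Pyro, each coin flip would be a named `pyro.sample` call, which is what lets the inference machinery reason about programs shaped like this one. Supporting that kind of data-dependent loop is the generality that makes scaling harder.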

Manifold is Uber’s technology for debugging and interpreting machine learning models at scale.

With Manifold, the Uber engineering team set out to accomplish some very tangible goals.

To accomplish those goals, Manifold segments the machine learning analysis process into three main phases: Inspection, Explanation and Refinement.

Uber built the Plato Research Dialogue System (PRDS) to address the challenges of building large scale conversational applications.

Conceptually, PRDS is a framework to create, train and evaluate conversational AI agents in diverse environments.

From a functional standpoint, PRDS is composed of a set of building blocks.

PRDS was designed with modularity in mind in order to incorporate state-of-the-art research in conversational systems as well as continuously evolve every component of the platform.

In PRDS, each component can be trained either online (from interactions) or offline and then incorporated into the core engine.

From the training standpoint, PRDS supports interactions with human and simulated users.

The latter is commonly used to jumpstart conversational AI agents in research scenarios, while the former is more representative of live interactions.

Horovod is one of the Uber ML stacks that has become extremely popular within the community and has been adopted by research teams at AI powerhouses like DeepMind or OpenAI.

Conceptually, Horovod is a framework for running distributed deep learning training jobs at scale.

Horovod leverages Message Passing Interface (MPI) implementations such as Open MPI to enable a training job to run on a highly parallel and distributed infrastructure without any modifications.
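The core idea behind this style of distributed training can be sketched without any cluster at all. The following plain Python simulation is not Horovod code and its function names are invented for illustration: each "worker" holds a replica of the model and a shard of the data, computes a local gradient, and then an allreduce-style step gives every worker the averaged gradient so that all replicas stay in sync.

```python
def local_gradient(weight, shard):
    """Gradient of the mean squared error for y = weight * x on one shard."""
    return sum(2 * (weight * x - y) * x for x, y in shard) / len(shard)


def allreduce_mean(values):
    """Stand-in for an MPI allreduce: every worker receives the average."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def train(shards, steps=200, lr=0.01):
    """Simulate data-parallel SGD with one model replica per shard."""
    weights = [0.0] * len(shards)  # identical initial weights on every worker
    for _ in range(steps):
        grads = [local_gradient(w, s) for w, s in zip(weights, shards)]
        averaged = allreduce_mean(grads)
        weights = [w - lr * g for w, g in zip(weights, averaged)]
    return weights


if __name__ == "__main__":
    # Toy data for y = 3x, split across four simulated workers.
    data = [(float(x), 3.0 * x) for x in range(1, 9)]
    shards = [data[i::4] for i in range(4)]
    print(train(shards))
```

In a real Horovod script the workers are separate processes and the averaging is done by MPI-style collective operations across machines. The user typically triggers it by calling `hvd.init()` and wrapping the optimizer in `hvd.DistributedOptimizer` rather than writing the reduction by hand.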

Running a distributed TensorFlow training job in Horovod takes four simple steps.

Last but not least, we should mention Uber’s active contributions to AI research.

Many of Uber’s open source releases are inspired by their research efforts.

The Uber AI Research website is a phenomenal catalog of papers that highlight Uber’s latest efforts in AI research.

These are some of the contributions of the Uber engineering team that have seen regular adoption by the AI research and development community.

As Uber continues implementing AI solutions at scale, we should see new and innovative frameworks that simplify the adoption of machine learning by data scientists and researchers.

  Original.

Reposted with permission.
