Best of arXiv.org for AI, Machine Learning, and Deep Learning – March 2019

Exact Gaussian Processes on a Million Data Points

Gaussian processes (GPs) are flexible models with state-of-the-art performance on many impactful applications.

However, computational constraints with standard inference procedures have limited exact GPs to problems with fewer than about ten thousand training points, necessitating approximations for larger datasets.

This paper develops a scalable approach for exact GPs that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication.
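As a rough illustration of what "accessing the kernel matrix only through matrix multiplication" can look like, here is a minimal single-machine sketch in NumPy. It is not the authors' implementation: the RBF kernel, the noise level, the block size, and the helper names `rbf_matvec` and `conjugate_gradients` are all assumptions made for this example, and there is no multi-GPU parallelism. The point is only that linear conjugate gradients can solve the GP training system (K + noise * I) alpha = y while seeing K solely through products K @ v, computed block by block so the full matrix is never stored.

```python
import numpy as np

def rbf_matvec(X, v, lengthscale=1.0, noise=0.1, block=256):
    """Return (K + noise * I) @ v for an RBF kernel K, one block of rows at a time.

    Only a (block x n) slice of K exists in memory at any moment.
    Kernel choice and hyperparameters are assumptions for this sketch.
    """
    out = np.empty_like(v)
    sq_norms = np.sum(X**2, axis=1)
    for start in range(0, X.shape[0], block):
        stop = min(start + block, X.shape[0])
        # Squared distances between this block of points and all points.
        d2 = sq_norms[start:stop, None] + sq_norms[None, :] - 2.0 * X[start:stop] @ X.T
        K_rows = np.exp(-0.5 * np.maximum(d2, 0.0) / lengthscale**2)
        out[start:stop] = K_rows @ v
    return out + noise * v

def conjugate_gradients(matvec, b, tol=1e-8, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only v -> A @ v."""
    x = np.zeros_like(b)
    r = b.copy()          # residual b - A @ x with x = 0
    p = r.copy()          # search direction
    rs = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        Ap = matvec(p)
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol * b_norm:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Synthetic toy data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

alpha = conjugate_gradients(lambda v: rbf_matvec(X, v), y)
relative_residual = np.linalg.norm(rbf_matvec(X, alpha) - y) / np.linalg.norm(y)
print(f"relative residual: {relative_residual:.2e}")
```

Because each block of rows is independent, the loop inside `rbf_matvec` is the natural place to distribute work across devices, which is the kind of structure a multi-GPU approach can exploit.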

Yet Another Accelerated SGD: ResNet-50 Training on ImageNet in 74.7 seconds

There has been strong demand for algorithms that can execute machine learning as fast as possible, and the speed of deep learning has increased 30-fold in the past two years alone.

Distributed deep learning with large mini-batches is a key technology for meeting this demand, but it is also a great challenge because it is difficult to achieve high scalability on large clusters without compromising accuracy.

This paper introduces optimization methods applied to this challenge.

The authors achieved a training time of 74.7 seconds using 2,048 GPUs on the ABCI cluster by applying these methods.

The training throughput is over 1.73 million images/sec and the top-1 validation accuracy is 75.08%.
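To put the reported figures in perspective, the short calculation below derives two back-of-the-envelope numbers from them. It assumes, purely for illustration, that the quoted throughput is sustained evenly across all GPUs for the whole run; since the throughput is stated as "over" 1.73 million images/sec, both results are best read as rough lower bounds rather than figures taken from the paper.

```python
# Figures quoted in the summary above.
training_seconds = 74.7
num_gpus = 2048
images_per_second = 1.73e6  # reported as "over" this value

# Assumption: throughput is spread evenly over GPUs and sustained for the full run.
per_gpu_throughput = images_per_second / num_gpus
total_images = images_per_second * training_seconds

print(f"about {per_gpu_throughput:.0f} images/sec per GPU")
print(f"about {total_images / 1e6:.0f} million images processed in total")
```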

Multimodal Emotion Classification

Most NLP and Computer Vision tasks are limited by the scarcity of labelled data.

In social media emotion classification and other related tasks, hashtags have been used as indicators to label data.

With the rapid increase in emoji usage on social media, emojis are used as an additional feature for major social NLP tasks.

However, this is less explored in the case of multimedia posts on social media, where posts are composed of both image and text.

At the same time, we have seen a surge of interest in incorporating domain knowledge to improve machine understanding of text.

In this paper, we investigate whether domain knowledge for emoji can improve the accuracy of the emotion classification task.

Generative Adversarial Networks: recent developments

In traditional generative modeling, a good data representation is very often the basis for a good machine learning model.

This can be linked to good representations encoding more of the explanatory factors that are hidden in the original data.

With the invention of Generative Adversarial Networks (GANs), a subclass of generative models that are able to learn representations in an unsupervised and semi-supervised fashion, we are now able to adversarially learn good mappings from a simple prior distribution to a target data distribution.

This paper presents an overview of recent developments in GANs with a focus on learning latent space representations.

Time Series Imputation

Multivariate time series is a very active topic in the research community, and many machine learning techniques are being used to extract information from this type of data.

However, in real-world problems data has missing values, which can complicate the application of machine learning techniques to extract information.

This paper focuses on the task of imputation of time series.

Many imputation methods for time series are based on regression methods.

Unfortunately, these methods perform poorly when the variables are categorical.

To address this case, the authors propose a new imputation method based on Expectation Maximization over dynamic Bayesian networks.

Accelerating Gradient Boosting Machine

Gradient Boosting Machine (GBM) is an extremely powerful supervised learning algorithm that is widely used in practice.

GBM routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup.

This work proposes Accelerated Gradient Boosting Machine (AGBM) by incorporating Nesterov’s acceleration techniques into the design of GBM.

Solving the Black Box Problem: A General-Purpose Recipe for Explainable Artificial Intelligence

Many of the computing systems developed using machine learning are opaque: it is difficult to explain why they do what they do, or how they work.

The Explainable AI research program aims to develop analytic techniques for rendering such systems transparent, but lacks a general understanding of what it actually takes to do so.

The aim of this paper is to provide a general-purpose recipe for Explainable AI: a series of steps that should be taken to render an opaque computing system transparent.

Accelerating Minibatch Stochastic Gradient Descent using Typicality Sampling

Machine learning, especially deep neural networks, has developed rapidly in fields including computer vision, speech recognition and reinforcement learning.

Although Mini-batch SGD is one of the most popular stochastic optimization methods in training deep networks, it shows a slow convergence rate due to the large noise in gradient approximation.

In this paper, we attempt to remedy this problem by building a more efficient batch selection method based on typicality sampling, which reduces the error of gradient estimation in conventional Minibatch SGD.
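The "error of gradient estimation" mentioned here can be made concrete with a small experiment. The sketch below does not implement typicality sampling, whose details are not given in this summary. It only measures the baseline the paper sets out to improve: how far a uniformly sampled mini-batch gradient lands from the full-data gradient, and how that error shrinks as the batch grows. The synthetic linear regression problem and the function names are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic regression problem.
n, d = 10_000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n)
w = np.zeros(d)  # current parameters at which gradients are evaluated

def gradient(idx):
    """Gradient of the mean squared error over the rows selected by idx."""
    residual = X[idx] @ w - y[idx]
    return 2.0 * X[idx].T @ residual / len(idx)

full_grad = gradient(np.arange(n))

def mean_estimation_error(batch_size, trials=200):
    """Average distance between a uniform mini-batch gradient and the full gradient."""
    errors = []
    for _ in range(trials):
        idx = rng.choice(n, size=batch_size, replace=False)
        errors.append(np.linalg.norm(gradient(idx) - full_grad))
    return float(np.mean(errors))

errors = {b: mean_estimation_error(b) for b in (8, 64, 512)}
for batch_size, err in errors.items():
    print(f"batch size {batch_size:4d}: mean gradient error {err:.3f}")
```

A batch selection scheme that lowers this error at a fixed batch size would give less noisy updates, which is the effect the abstract attributes to typicality sampling.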

Predicting Research Trends From Arxiv

Researchers perform trend detection on two data sets of Arxiv papers, derived from its machine learning (cs.LG) and natural language processing (cs.CL) categories.

The approach is bottom-up: first rank papers by their normalized citation counts, then group top-ranked papers into different categories based on the tasks that they pursue and the methods they use.

Finally, analyze the resulting topics.
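The bottom-up procedure described above can be sketched in a few lines. Everything specific here is an assumption for illustration: the summary does not say how citation counts are normalized, so this example uses a z-score within each publication month, and the paper records, labels, and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical records: (title, publication month, citation count, task label).
papers = [
    ("A", "2018-01", 120, "generation"),
    ("B", "2018-01", 10, "parsing"),
    ("C", "2018-01", 20, "generation"),
    ("D", "2018-06", 30, "reinforcement learning"),
    ("E", "2018-06", 5, "generation"),
    ("F", "2018-06", 4, "parsing"),
]

def normalized_scores(records):
    """Z-score each paper's citations against papers from the same month (assumed scheme)."""
    by_month = defaultdict(list)
    for _, month, cites, _ in records:
        by_month[month].append(cites)
    scores = {}
    for title, month, cites, _ in records:
        mu = mean(by_month[month])
        sigma = pstdev(by_month[month]) or 1.0  # guard against zero spread
        scores[title] = (cites - mu) / sigma
    return scores

def top_topics(records, k):
    """Rank by normalized score, keep the top k, and group them by task label."""
    scores = normalized_scores(records)
    ranked = sorted(records, key=lambda r: scores[r[0]], reverse=True)[:k]
    groups = defaultdict(list)
    for title, _, _, label in ranked:
        groups[label].append(title)
    return dict(groups)

topics = top_topics(papers, k=3)
print(topics)
```

Normalizing within a time window keeps recent papers, which have had less time to accumulate citations, from being systematically outranked by older ones.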

The paper reports that the dominant paradigm in cs.CL revolves around natural language generation problems, while those in cs.LG revolve around reinforcement learning and adversarial principles.

By extrapolation, it is predicted that these topics will remain leading problems/approaches in their fields in the short- and mid-term.
