While previous research had steered the analysis of deep networks toward algorithm-dependent approaches (in order to stick to uniform convergence), this paper argues for developing algorithm-independent techniques that are not restricted to uniform convergence to explain generalization.
We can clearly see why this machine learning research paper has received an award for Outstanding New Directions Paper at NeurIPS 2019.
The researchers have shown that uniform convergence alone is not enough to explain generalization in deep learning.
Also, it is not possible to achieve small bounds that satisfy all five criteria.
This has opened a whole new research area into exploring other tools which might explain generalization.
You can access and read the full paper here.
Other honorable mentions by NeurIPS on the Outstanding New Directions Paper Award are:
Putting An End to End-to-End: Gradient-Isolated Learning of Representations
Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations

Test of Time Award at NeurIPS 2019
Each year, NeurIPS also gives an award to a paper presented at the conference 10 years ago and that has had a lasting impact on the field in its contributions (and is also a widely popular paper).
This year, the aptly named Test of Time award has been given to the “Dual Averaging Method for Regularized Stochastic Learning and Online Optimization” by Lin Xiao.
This research is based on the fundamental concepts which built the very foundations of modern machine learning as we know it.
Dual Averaging Method for Regularized Stochastic Learning and Online Optimization
Let’s break down the four key concepts covered in this awesome paper:
Stochastic Gradient Descent: Stochastic Gradient Descent has been formally established as the method for optimization in machine learning.
This can be achieved by the following stochastic optimization.
Recall SGD and the equation for stochastic optimization for very large samples – here, w is the weight vector and z is the input feature vector.
For t = 0, 1, 2….
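For reference, here is a sketch of the standard forms these sentences point to, written in the notation of the text (w the weight vector, z a sample). The loss f, the step size \alpha_t, and the subgradient g_t are symbols assumed here for illustration, and no regularization term is included.

```latex
\min_{w} \; \mathbb{E}_{z}\, f(w, z)
\qquad\text{and}\qquad
w_{t+1} = w_t - \alpha_t\, g_t, \quad g_t \in \partial_w f(w_t, z_t), \quad t = 0, 1, 2, \ldots
```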
Online Convex Optimisation: Another pathbreaking line of research.
This is framed as a game in which, at each time t, the player predicts a weight vector and the resulting loss is calculated.
The main aim is to minimize this loss; the result is very similar to how we optimize using stochastic gradient descent.
Compressed Learning: This includes Lasso Regression, L1 Regularized Least-Squares, and other mixed regularization schemes.
Proximal Gradient Method: Compared to the earlier techniques, this was a much faster method for reducing the loss while still retaining convexity.

While the previous research developed efficient algorithms with an O(1/t) convergence rate, the sparsity of data was one factor that had been neglected until then.
This paper proposed a new regularizing technique, called the Regularised Dual Averaging Method (RDA) for solving online convex optimization problems.
At that time, methods for solving these convex optimization problems were not efficient, particularly in terms of scalability.
This research proposed a novel method of Batch Optimization.
This means that only some independent samples are made available initially, and the weight vector is calculated based on those samples (at current time t).
The loss with respect to the current weight vector is calculated along with a subgradient.
This is used again in the iteration (at time t+1).
Specifically, in RDA, instead of the current subgradient, the average subgradient is taken into account.
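The role of the averaged subgradient can be made concrete with a minimal sketch of RDA with an L1 regularizer. It assumes the common closed-form update in which any coordinate whose averaged subgradient has magnitude at most lam is set exactly to zero. The names rda_l1_step and rda_l1_logistic, the parameters lam and gamma, and the choice of logistic loss are illustrative assumptions, not details taken from the article.

```python
import numpy as np


def rda_l1_step(g_bar, t, lam, gamma):
    """One l1-RDA update computed from the averaged subgradient g_bar.

    Coordinates with |g_bar| <= lam are set exactly to zero; the rest are
    shrunk by lam and scaled by sqrt(t) / gamma.
    """
    shrunk = np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)
    return -(np.sqrt(t) / gamma) * shrunk


def rda_l1_logistic(X, y, lam=0.01, gamma=1.0):
    """Single pass of l1-RDA over the rows of X with labels y in {-1, +1}."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    g_bar = np.zeros(n_features)  # running average of all subgradients so far
    for t in range(1, n_samples + 1):
        x_t, y_t = X[t - 1], y[t - 1]
        margin = y_t * (x_t @ w)
        # Gradient of log(1 + exp(-margin)) with respect to w,
        # using a tanh form of the sigmoid to avoid overflow.
        g_t = -y_t * x_t * 0.5 * (1.0 - np.tanh(0.5 * margin))
        g_bar += (g_t - g_bar) / t  # incremental mean of subgradients
        w = rda_l1_step(g_bar, t, lam, gamma)
    return w
```

Because the threshold is applied to the average rather than to a single noisy subgradient, raising lam zeroes out more coordinates of w, which is where the sparsity of the solution comes from.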
At that time, this method had achieved much better results for the sparse MNIST dataset than the SGD and other prevalent techniques.
In fact, with an increase in sparsity, the RDA method had demonstrably better results as well.
The reason this paper received the Test of Time award is evident in the later papers that studied the method further, covering topics like manifold identification, accelerated RDA, etc.
You can find the complete paper here.
End Notes
NeurIPS 2019 was an extremely educational and inspiring conference again.
I was especially intrigued by the New Directions Paper award and how it tackled the problem of generalization in deep learning.
Which machine learning research paper caught your eye? Was there any other paper that you would want to try out or that really inspired you? Let me know in the comments section below.
All of the talks, including the spotlights and showcases, were broadcast live by the NeurIPS team.
You can find links to the recorded sessions here.