Does recent progress with neural networks foretell artificial general intelligence?

Trent Eady · Dec 4

I have some thoughts on this talk by Greg Brockman, the CTO of OpenAI. It's a good talk, but it's really about the last six years of progress in AI more than anything else. I wish the talk had been twice as long, with the second half focused on the question in the title: does this progress foretell artificial general intelligence?

Brockman emphasizes the astronomical increases in computing power used to train neural networks, with the implication that this foretells astronomical improvements in AI. But it's also possible that astronomical increases in training compute will lead to only modest performance gains. Facebook trained a neural network on 1 billion images from Instagram and achieved a 14.6% top-1 error rate on the ImageNet benchmark. ("Top-1" means the neural network's single best guess for the type of object in the image.) That's a 1000x increase over the 1 million images in the ImageNet training dataset, but only a 1.26x improvement on the 18.5% top-1 error rate previously achieved by a neural network trained on those 1 million images alone. The Instagram images were labelled only by users' hashtags, which are often inaccurate or irrelevant. When this many training examples are mislabelled, the training signal for a neural network is very weak.

It's not just the sheer volume of training data that matters, but also the quality of the data and the quality of the labelling. Maybe with carefully curated, hand-labelled images, we would see something much closer to a 1:1 ratio between the increase in training data and the increase in performance. With autonomous cars, for instance, it seems like an open question whether 10x more embedded compute, 10x bigger neural networks, 10x more training data, and 10x more backend compute for training would lead to anything like a 10x improvement in neural network performance, or just an incremental one.

Incremental improvements are still worth pursuing. If humans have a 1 in 128 error rate on a perception task like traffic sign recognition, and neural networks have, say, a 1 in 105 error rate, pushing performance up by 1.26x to 1 in 132 could be revolutionary in its practical and economic implications. Or, perhaps more realistically, if you start with a neural network that has a 1 in 14 error rate and improve its accuracy by 1.26x ten times in succession, you'll end up with an error rate of 1 in 141.

Even if each incremental step requires an order-of-magnitude increase in embedded compute, neural network size, training data volume, labelling quality, backend compute, or neural architecture engineers, it's still worth it. Throw crazy amounts of compute at a neural network so it can play 600 years of Dota 2 every day, and through a highly random, Darwinian process of mutation and selection it will eventually learn to do one thing very, very well.
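The error-rate arithmetic above compounds multiplicatively, which is easy to check with a short script. This is just a sketch of the calculation; the 1-in-105 starting point, the 1.26x step size, and the 1-in-14 example are the figures from the text, not measurements:

```python
def improve(error_rate: float, factor: float, steps: int = 1) -> float:
    """Divide an error rate by `factor`, `steps` times.

    Expressed as "1 in N", N grows by `factor` with each step.
    """
    return error_rate / factor ** steps

# Facebook's Instagram experiment: 18.5% -> 14.6% top-1 error,
# an improvement of 18.5 / 14.6, roughly 1.26x, from ~1000x more data.
print(18.5 / 14.6)

# One 1.26x step from a 1-in-105 error rate: 105 * 1.26 = 132.3,
# i.e. roughly 1 in 132, past the human 1-in-128 figure.
print(1 / improve(1 / 105, 1.26))

# Ten successive 1.26x steps from a 1-in-14 error rate:
# 14 * 1.26**10, roughly 1 in 141.
print(1 / improve(1 / 14, 1.26, steps=10))
```

The point the compounding makes is that ten modest 1.26x steps multiply out to a roughly 10x reduction in errors, which is why each individually unimpressive step can still be worth an order-of-magnitude investment.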
