Introducing TAPAS

IBM set out to model similar intuitive criteria in the form of a neural network. The train-less accuracy predictor for architecture search (TAPAS) is a deep neural network accuracy predictor that estimates the performance of a neural network on unseen datasets. TAPAS attempts to algorithmize the intuitions of deep learning experts who predict the performance of specific models. It does so by adapting its predictions based on the complexity of a specific dataset and knowledge of a specific neural network architecture. The main contributions of the TAPAS framework can be summarized as follows:

1) TAPAS is not bound to any specific dataset.

2) TAPAS is able to learn from previous experiments, whatever dataset they involve, improving its predictions with usage.

3) TAPAS makes it possible to run a large-scale architecture search on a single GPU device within a few minutes.

The first version of TAPAS focuses on estimating the performance of convolutional neural networks (CNNs), but the principles can be adapted to other deep learning architectures. TAPAS leverages a compact characterization of the user-provided input dataset, as well as a dynamically growing database of trained neural networks and their associated performance. Although the TAPAS model is incredibly sophisticated, its core functionality is based on three fundamental components, sketched in code after this list:

1) Dataset Characterization (DC): This component estimates the difficulty of a specific dataset using a numeric score called the Dataset Characterization Number (DCN). The DCN score is calculated by training a probe net to obtain a dataset difficulty estimation. Probe nets are modest-sized neural networks designed to characterize the difficulty of an image classification dataset. After the probe net is trained, the DCN is used to filter datasets from the LDE and serves directly as an input score in the TAP training and prediction phases.

2) Lifelong Database of Experiments (LDE): This component is a continuously growing database that ingests every new experiment performed inside the framework. An experiment includes the CNN architecture description, the training hyper-parameters, the employed dataset (with its DCN), and the achieved accuracy.

3) Train-less Accuracy Predictor (TAP): This component predicts the performance of a CNN without a training cycle. TAP leverages the knowledge accumulated through experiments on datasets of similar difficulty, filtered from the LDE based on the DCN. TAP's cascade prediction method is discussed below.
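To make the interplay between these components concrete, here is a minimal Python sketch of the overall flow. Everything in it is an assumption for illustration: the `Experiment` record and the `compute_dcn`, `filter_lde`, and `predict_accuracy` names and signatures are hypothetical, and the mean-of-peers predictor is only a trivial stand-in for TAP's learned model.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical record for one LDE entry; the field names are
# assumptions, not the framework's actual schema.
@dataclass
class Experiment:
    architecture: str   # serialized CNN architecture description
    hyper_params: dict  # training hyper-parameters
    dcn: float          # difficulty score of the dataset used
    accuracy: float     # accuracy achieved after training

def compute_dcn(train_probe_net: Callable[[], float]) -> float:
    """Derive the DCN from a small probe net's performance on the
    dataset. Here we simply reuse the probe net's validation accuracy
    as the difficulty score (an assumption; the paper's exact mapping
    may differ)."""
    return train_probe_net()

def filter_lde(lde: List[Experiment], dcn: float,
               tolerance: float = 0.05) -> List[Experiment]:
    """Keep only experiments whose dataset difficulty is close to the
    target DCN, so the predictor learns from comparable datasets."""
    return [e for e in lde if abs(e.dcn - dcn) <= tolerance]

def predict_accuracy(candidate: str, dcn: float,
                     peers: List[Experiment]) -> float:
    """Stand-in for TAP: a real implementation trains a predictor on
    the filtered experiments; here we return the mean accuracy of the
    peers as a trivial baseline."""
    return sum(e.accuracy for e in peers) / len(peers) if peers else 0.0

# Usage: score a candidate CNN on a new dataset without training it.
lde = [Experiment("conv3x3-pool-fc", {"lr": 0.01}, dcn=0.62, accuracy=0.88),
       Experiment("conv5x5-fc", {"lr": 0.1}, dcn=0.60, accuracy=0.81)]
dcn = compute_dcn(lambda: 0.61)  # pretend the probe net reached 61% accuracy
print(predict_accuracy("candidate-cnn", dcn, filter_lde(lde, dcn)))
```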
At the heart of TAP is a clever cascade method: it analyzes the performance of a specific layer under the assumption that the performance of its sublayers is already known, building up an estimate for the full network layer by layer. Taken together, these components are what allow TAPAS to estimate the performance of a deep neural network without ever training it.
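The paper's exact cascade formulation is not reproduced here, but the general idea can be sketched as follows. The `step_predictor` argument is a hypothetical stand-in for TAP's trained model, and the toy scoring rule is purely illustrative.

```python
from typing import Callable, List

def cascade_predict(layers: List[dict], dcn: float,
                    step_predictor: Callable[[float, dict, float], float]) -> float:
    """Walk the network layer by layer, updating the running accuracy
    estimate under the assumption that the estimate for the preceding
    sublayers is already known."""
    estimate = 0.0  # accuracy estimate for the empty sub-network
    for layer in layers:
        estimate = step_predictor(estimate, layer, dcn)
    return estimate

def toy_step(prev: float, layer: dict, dcn: float) -> float:
    """Toy rule: each layer adds a small gain, capped by a ceiling that
    shrinks as the dataset difficulty (DCN) grows."""
    gain = 0.10 if layer["type"] == "conv" else 0.02
    return min(prev + gain, 1.0 - 0.5 * dcn)

net = [{"type": "conv"}, {"type": "conv"}, {"type": "fc"}]
print(cascade_predict(net, dcn=0.4, step_predictor=toy_step))
```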
TAPAS in Action

IBM benchmarked TAPAS against other state-of-the-art performance prediction methods. The experiments essentially compared the predicted performance of a model against the values obtained after training. To quantify the results, the experiments relied on three main metrics: (i) the mean squared error (MSE), which measures the difference between the estimator and what is estimated, (ii) Kendall's Tau (Tau), which measures the similarity of the ordering between the predictions and the real values, and (iii) the coefficient of determination (R2), which measures the proportion of the variance in the dependent variable that is predictable from the independent variable. The results clearly showed that TAPAS was very accurate at predicting the performance of specific models.
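All three metrics are standard and easy to reproduce. Here is a quick sketch using scipy and scikit-learn, with illustrative numbers rather than results from the paper:

```python
import numpy as np
from scipy.stats import kendalltau
from sklearn.metrics import mean_squared_error, r2_score

# Accuracies obtained after training vs. predictor estimates
# (made-up values for illustration only).
y_true = np.array([0.88, 0.81, 0.92, 0.75, 0.69])
y_pred = np.array([0.86, 0.83, 0.90, 0.77, 0.70])

mse = mean_squared_error(y_true, y_pred)  # lower is better
tau, _ = kendalltau(y_pred, y_true)       # 1.0 = identical ranking
r2 = r2_score(y_true, y_pred)             # 1.0 = perfect fit

print(f"MSE={mse:.4f}  Tau={tau:.2f}  R2={r2:.2f}")
```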
The ideas behind the TAPAS framework can prove incredibly valuable in reducing the time and computational resources spent on training and experimenting with deep learning models. By modeling the human intuition used to estimate the complexity of a dataset and a neural network, TAPAS has shown itself to be remarkably accurate at estimating the performance of a deep neural network.