The initial choice of learning rate is a key part of gradient-based optimization methods and has a strong effect on the performance of deep learning models. This paper studies the behavior of several gradient-based optimization algorithms commonly used in deep learning and compares their performance across a range of learning rates. We observe that popular optimization algorithms are highly sensitive to the choice of learning rate. Our goal is to identify which optimizer has an edge over the others in a given setting. We benchmark on two datasets, MNIST and CIFAR10. The results are surprising and can guide a more informed choice of learning rate.

Link: https://github.com/rehanguha/Learning-Rate-Sensitivity
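For illustration, the sketch below shows one way such a learning-rate sensitivity sweep could be set up in PyTorch on MNIST. The small network, the set of optimizers, the learning-rate grid, and the one-epoch training budget are assumptions made for this example, not the paper's exact experimental setup; see the repository above for the actual benchmarks.

```python
# A minimal sketch of a learning-rate sensitivity sweep, assuming PyTorch
# and MNIST. Model, optimizer set, LR grid, and epoch count are illustrative.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def make_model():
    # Small fully connected classifier; a stand-in for the benchmarked model.
    return nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128),
                         nn.ReLU(), nn.Linear(128, 10))

def train_and_eval(optimizer_cls, lr, epochs=1):
    tfm = transforms.ToTensor()
    train = DataLoader(datasets.MNIST("data", train=True, download=True,
                                      transform=tfm),
                       batch_size=128, shuffle=True)
    test = DataLoader(datasets.MNIST("data", train=False, download=True,
                                     transform=tfm),
                      batch_size=256)
    model = make_model()
    opt = optimizer_cls(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in train:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    # Report test accuracy for this (optimizer, learning rate) pair.
    correct = 0
    with torch.no_grad():
        for x, y in test:
            correct += (model(x).argmax(dim=1) == y).sum().item()
    return correct / len(test.dataset)

# Sweep each optimizer over a grid of learning rates and record accuracy.
optimizers = {"SGD": torch.optim.SGD, "Adam": torch.optim.Adam,
              "RMSprop": torch.optim.RMSprop}
for name, cls in optimizers.items():
    for lr in [1e-4, 1e-3, 1e-2, 1e-1]:
        acc = train_and_eval(cls, lr)
        print(f"{name:8s} lr={lr:.0e} test_acc={acc:.4f}")
```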