Training modern machine learning models involves finding optimal parameter values through iterative optimization algorithms such as stochastic gradient descent (SGD) or its variants.
One of the key challenges in using such algorithms is choosing appropriate step sizes (learning rates), which must strike a balance between convergence speed and stability.
In recent years, adaptive step-size algorithms have gained popularity due to their ability to dynamically adjust learning rates based on the gradient magnitudes encountered during training.
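To make the idea concrete, here is a minimal NumPy sketch of the Adagrad update, in which each coordinate's step size shrinks as its squared gradients accumulate. This is an illustrative sketch, not code from the talk; the function names and the quadratic test problem are our own.

```python
import numpy as np

def adagrad(grad, x0, lr=1.0, eps=1e-8, n_steps=500):
    """Minimize a function via Adagrad: per-coordinate step sizes
    shrink as squared gradients accumulate."""
    x = np.asarray(x0, dtype=float)
    g2_sum = np.zeros_like(x)          # running sum of squared gradients
    for _ in range(n_steps):
        g = grad(x)
        g2_sum += g ** 2
        x -= lr * g / (np.sqrt(g2_sum) + eps)  # adaptive per-coordinate step
    return x

# Example: minimize f(x) = 0.5 * x^T diag(1, 10) x (minimum at the origin)
grad_f = lambda x: np.array([1.0, 10.0]) * x
x_star = adagrad(grad_f, x0=[5.0, 5.0])
```

Note that coordinates with large accumulated gradients automatically receive smaller steps, which is exactly the "dynamic adjustment based on gradient magnitudes" described above.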
This talk explores various adaptive step-size algorithms, such as Adam and Adagrad, and discusses their advantages and limitations for training machine learning models. We then introduce two new adaptive step-size strategies: one based on the Polyak step size and one based on an implicit step size in the variance-reduced method SARAH.
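For readers unfamiliar with the Polyak step size, the classical rule sets eta_t = (f(x_t) - f*) / ||grad f(x_t)||^2, where f* is the (assumed known) optimal value. The sketch below illustrates this rule on a simple quadratic; it is a hypothetical illustration of the classical formula, not the new strategy presented in the talk.

```python
import numpy as np

def polyak_gd(f, grad, x0, f_star=0.0, n_steps=100):
    """Gradient descent with the classical Polyak step size:
    eta_t = (f(x_t) - f_star) / ||grad f(x_t)||^2."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        g = grad(x)
        gnorm2 = np.dot(g, g)
        if gnorm2 == 0.0:               # already at a stationary point
            break
        eta = (f(x) - f_star) / gnorm2  # Polyak step size
        x -= eta * g
    return x

# Example: f(x) = 0.5 ||x||^2 with f* = 0; here eta_t = 0.5,
# so the iterate is halved at every step.
f = lambda x: 0.5 * np.dot(x, x)
grad_f = lambda x: x
x_star = polyak_gd(f, grad_f, x0=[3.0, -4.0])
```

The appeal of the rule is that it adapts the step size to the current suboptimality gap without any tuning, at the cost of requiring (an estimate of) f*.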
Martin Takac is an Associate Professor at the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), UAE. Before joining MBZUAI, he was an Associate Professor in the Department of Industrial and Systems Engineering at Lehigh University, where he had been employed since 2014. He received his B.S. (2008) and M.S. (2010) degrees in Mathematics from Comenius University, Slovakia, and his Ph.D. (2014) in Mathematics from the University of Edinburgh, United Kingdom. He has received several awards, including the Best Ph.D. Dissertation Award from the OR Society (2014), the Leslie Fox Prize (2nd Prize, 2013) from the Institute of Mathematics and its Applications, and the INFORMS Computing Society Best Student Paper Award (runner-up, 2012).

His current research interests include designing and analyzing algorithms for machine learning, AI for science, understanding protein-DNA interactions, and applying ML to energy problems. Martin has received funding from various U.S. National Science Foundation programs (including through a TRIPODS Institute grant awarded to him and his collaborators at Lehigh, Northwestern, and Boston University) and was recently awarded a grant with the Weizmann Institute of Science. He has served as an Associate Editor for Mathematical Programming Computation, Journal of Optimization Theory and Applications, and Optimization Methods and Software, and as an area chair for ICLR and AISTATS. Martin currently serves as an area chair for ICML and NeurIPS.