Bernd Girod, Stanford University
Abstract: With intelligent processing, cameras have great potential to link the real world and the virtual world. We review advances and opportunities for algorithms and applications that retrieve information from large databases using images as queries. For rate-constrained applications, remarkable improvements have been achieved over the course of the MPEG-CDVS (Compact Descriptors for Visual Search) standardization. Beyond CDVS lie applications that query video databases with images, while others continually match video frames against image databases. Exploiting the temporal coherence of video in either case can yield large additional gains. We will look at implementations of example applications, ranging from text recognition to augmented reality, to understand the challenges of building databases for rapid search and scalability, as well as the tradeoffs between processing on a mobile device and in the cloud.
Biography: Bernd Girod is the Robert L. and Audrey S. Hancock Professor of Electrical Engineering at Stanford University, California. Until 1999, he was a Professor in the Electrical Engineering Department of the University of Erlangen-Nuremberg. His research interests are in the area of image, video, and multimedia systems. He has published over 600 conference and journal papers and 6 books, receiving the EURASIP Signal Processing Best Paper Award in 2002, the IEEE Multimedia Communication Best Paper Award in 2007, the EURASIP Image Communication Best Paper Award in 2008, the EURASIP Signal Processing Most Cited Paper Award in 2008, as well as the EURASIP Technical Achievement Award in 2004 and the Technical Achievement Award of the IEEE Signal Processing Society in 2011. As an entrepreneur, Professor Girod has worked with numerous startup ventures, among them Polycom, Vivo Software, 8x8, and RealNetworks. He received an Engineering Doctorate from the University of Hannover, Germany, and an M.S. degree from the Georgia Institute of Technology. Prof. Girod is a Fellow of the IEEE, a EURASIP Fellow, a member of the German National Academy of Sciences (Leopoldina), and a member of the United States National Academy of Engineering. He currently serves Stanford’s School of Engineering as Senior Associate Dean at Large.
Asu Ozdaglar, Massachusetts Institute of Technology
Overview: Motivated by machine learning problems over large data sets and distributed optimization over networks, we consider the problem of minimizing the sum of a large number of convex component functions. We study incremental gradient methods for solving such problems, which use information about a single component function at each iteration. We provide new convergence rate results under some assumptions. We also consider incremental aggregated gradient methods, which compute a single component function gradient at each iteration while using outdated gradients of all component functions to approximate the entire global cost function, and provide new linear rate results.
This is joint work with Mert Gurbuzbalaban and Pablo Parrilo.
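The two method families named in the overview can be illustrated with a minimal sketch. It is not the speaker's algorithm or analysis: the scalar least-squares components f_i(x) = 0.5 * (a_i * x - b_i)^2, the step-size choices, the toy data, and all function names are assumptions made only for illustration. The incremental gradient method takes a step along one component gradient at a time, while the incremental aggregated gradient method refreshes one stored gradient per iteration and steps along the sum of all stored (partly outdated) gradients.

```python
def component_grad(a, b, x):
    """Gradient of the assumed component f_i(x) = 0.5 * (a*x - b)**2."""
    return a * (a * x - b)


def incremental_gradient(data, x0=0.0, epochs=200, step0=0.3):
    """Cyclic incremental gradient: one component gradient per update.

    The step size step0 / (k + 1) is held fixed within pass k and
    shrinks from pass to pass (an illustrative choice).
    """
    x = x0
    for k in range(epochs):
        step = step0 / (k + 1)
        for a, b in data:
            x -= step * component_grad(a, b, x)
    return x


def incremental_aggregated_gradient(data, x0=0.0, epochs=200, step=0.01):
    """Incremental aggregated gradient with a constant step size.

    Each iteration recomputes the gradient of a single component and
    moves along the sum of the stored gradients of all components,
    most of which were evaluated at earlier iterates.
    """
    x = x0
    table = [component_grad(a, b, x) for a, b in data]  # stored gradients
    total = sum(table)                                  # running aggregate
    for _ in range(epochs):
        for i, (a, b) in enumerate(data):
            g = component_grad(a, b, x)
            total += g - table[i]  # swap the outdated entry for the fresh one
            table[i] = g
            x -= step * total
    return x


# Toy data (hypothetical): pairs (a_i, b_i) defining the components.
data = [(1.0, 2.0), (2.0, 3.0), (0.5, 1.0), (1.5, 2.5)]

# Closed-form minimizer of the sum of the quadratic components.
x_star = sum(a * b for a, b in data) / sum(a * a for a, b in data)

x_ig = incremental_gradient(data)
x_iag = incremental_aggregated_gradient(data)
print("minimizer:", x_star)
print("incremental gradient error:", abs(x_ig - x_star))
print("incremental aggregated gradient error:", abs(x_iag - x_star))
```

On this toy problem the aggregated variant can use a constant step size because its search direction approximates the full gradient, whereas the plain incremental method needs a diminishing step size to remove the bias introduced by processing one component at a time.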
Biography: Asu Ozdaglar received the B.S. degree in electrical engineering from the Middle East Technical University, Ankara, Turkey, in 1996, and the S.M. and the Ph.D. degrees in electrical engineering and computer science from the Massachusetts Institute of Technology, Cambridge, in 1998 and 2003, respectively.
She is the Joseph F. and Nancy P. Keithley Professor in the Electrical Engineering and Computer Science Department at the Massachusetts Institute of Technology. She is also the director of the Laboratory for Information and Decision Systems and associate director of the Institute for Data, Systems, and Society. Her research expertise includes optimization theory, with emphasis on nonlinear programming and convex analysis; game theory, with applications in communication, social, and economic networks; distributed optimization and control; and network analysis, with special emphasis on contagious processes, systemic risk, and dynamic control.
Professor Ozdaglar is the recipient of a Microsoft fellowship, the MIT Graduate Student Council Teaching award, the NSF Career award, the 2008 Donald P. Eckman award of the American Automatic Control Council, the Class of 1943 Career Development Chair, the inaugural Steven and Renee Innovation Fellowship, and the 2014 Spira teaching award. She served on the Board of Governors of the Control System Society in 2010 and was an associate editor for IEEE Transactions on Automatic Control. She is currently the area co-editor for a new area of the journal Operations Research, entitled "Games, Information and Networks." She is the co-author of the book entitled “Convex Analysis and Optimization” (Athena Scientific, 2003).
Li Deng, Microsoft Research
Overview: Deep learning has fundamentally changed the landscape of two important areas of artificial intelligence (AI): speech recognition since 2010 and computer vision since 2012. The rapid progress in these AI areas, which pertain to machine perception, has raised high hopes that deep learning will drive new advances in other areas of AI pertaining to the cognitive functions of human intelligence, including language processing, reasoning, attention, memory, knowledge, and decision making.
In this talk I will first reflect on the historical path to the transformative success of deep learning in speech recognition, after providing brief reviews of earlier studies on (shallow) neural networks and on (deep) generative models relevant to the introduction of deep learning methods to speech recognition. Then, an overview will be given of the sweeping achievements of deep learning in speech recognition since its initial success, which have resulted in across-the-board deployment of deep learning in modern speech recognition systems worldwide. The huge impact of deep learning on image recognition and computer vision is also described and analyzed in terms of the same enabling factors as in speech recognition: big compute, big data, and innovations in deep architectures and learning methods.
Next, more challenging application areas of deep learning, including natural language processing, multimodal processing involving text, and deep reinforcement learning for decision making, will be selectively reviewed and analyzed. I will show examples of machine translation, contextual entity search, and automatic image captioning, where fresh ideas from deep learning, continuous-space embedding of natural language text in particular, are revolutionizing these AI application areas. Finally, a number of key issues and future directions of deep learning for AI tasks will be addressed and explored.
Biography: Li Deng received the Ph.D. degree from the University of Wisconsin-Madison. He was a Professor at the University of Waterloo, Ontario, Canada, from 1989 to 1999, and then joined Microsoft Research, Redmond, USA, where he currently leads R&D of deep learning as Partner Research Manager of its Deep Learning Technology Center. He has authored or co-authored 5 books, including the recent Deep Learning: Methods and Applications (2014) and Automatic Speech Recognition: A Deep-Learning Approach (Springer, 2015). He is a Fellow of the Acoustical Society of America, a Fellow of the IEEE, and a Fellow of the International Speech Communication Association. He served on the Board of Governors of the IEEE Signal Processing Society. More recently, he served as Editor-in-Chief of IEEE Signal Processing Magazine and of IEEE/ACM Transactions on Audio, Speech and Language Processing.