Deep Learning Systems
Algorithms, Compilers, and Processors for Large-Scale Production

Synthesis Lectures on Computer Architecture Series

Language: English


63.29 €

In Print (Delivery period: 15 days).

Publication date:
245 p. · 19.1x23.5 cm · Paperback
This book describes deep learning systems: the algorithms, compilers, and processor components to efficiently train and deploy deep learning models for commercial applications. The exponential growth in computational power is slowing at a time when the amount of compute consumed by state-of-the-art deep learning (DL) workloads is rapidly growing. Model size, serving latency, and power constraints are significant challenges in the deployment of DL models for many applications. Therefore, it is imperative to codesign algorithms, compilers, and hardware to accelerate advances in this field with holistic system-level and algorithm solutions that improve performance, power, and efficiency.
Advancing DL systems generally involves three types of engineers: (1) data scientists who use and develop DL algorithms in partnership with domain experts, such as medical, economic, or climate scientists; (2) hardware designers who develop specialized hardware to accelerate the components in the DL models; and (3) performance and compiler engineers who optimize software to run more efficiently on a given hardware target. Hardware engineers should be aware of the characteristics and components of production and academic models likely to be adopted by industry to guide design decisions impacting future hardware. Data scientists should be aware of deployment platform constraints when designing models. Performance engineers should support optimizations across diverse models, libraries, and hardware targets.
The purpose of this book is to provide a solid understanding of (1) the design, training, and applications of DL algorithms in industry; (2) the compiler techniques to map deep learning code to hardware targets; and (3) the critical hardware features that accelerate DL systems. This book aims to facilitate co-innovation for the advancement of DL systems.
It is written for engineers working in one or more of these areas who seek to understand the entire system stack in order to better collaborate with engineers working in other parts of the system stack. The book details advancements and adoption of DL models in industry, explains the training and deployment process, describes the essential hardware architectural features needed for today's and future models, and details advances in DL compilers to efficiently execute algorithms across various hardware targets. Unique to this book are the holistic exposition of the entire DL system stack, the emphasis on commercial applications, and the practical techniques to design models and accelerate their performance. The author is fortunate to work with hardware, software, data science, and research teams across many high-technology companies with hyperscale data centers. These companies employ many of the examples and methods provided throughout the book.
Preface.- Acknowledgments.- Introduction.- Building Blocks.- Models and Applications.- Training a Model.- Distributed Training.- Reducing the Model Size.- Hardware.- Compiler Optimizations.- Frameworks and Compilers.- Opportunities and Challenges.- Bibliography.- Author's Biography.
Andres Rodriguez is a Sr. Principal Engineer and AI Architect in the Data Platform Group at Intel Corporation, where he designs deep learning solutions for Intel’s customers and provides technical leadership across Intel for deep learning hardware and software products. He has 15 years of experience working in AI. Andres received a Ph.D. from Carnegie Mellon University for his research in machine learning. He was the lead instructor of the Coursera course An Introduction to Practical Deep Learning, taken by over 20 thousand students. He has been an invited speaker at several AI events, including AI with the Best, ICML, CVPR, AI Frontiers Conference, Re-Work Deep Learning Summit, TWIML, Startup MLConf, Open Compute Platform Global Summit, AWS re:Invent, Baidu World, Baidu Cloud ABC Inspire Summit, Google Cloud OnAir Webinar, and several Intel events, as well as an invited lecturer at Carnegie Mellon University, Stanford University, UC Berkeley, and Singularity University.