CUDA Fortran for Scientists and Engineers (2nd Ed.)
Best Practices for Efficient CUDA Fortran Programming

Authors: Gregory Ruetsch, Massimiliano Fatica

Language: English
Publication date:
350 p. · Paperback

CUDA Fortran for Scientists and Engineers: Best Practices for Efficient CUDA Fortran Programming shows how high-performance application developers can leverage the power of GPUs using Fortran, the familiar language of scientific computing and supercomputer performance benchmarking. The authors presume no prior parallel computing experience and cover the basics along with best practices for efficient GPU computing using CUDA Fortran. To add CUDA Fortran to existing Fortran codes, they explain how to understand the target GPU architecture, identify the computationally intensive parts of the code, and modify the code to manage data and parallelism and optimize performance, all in Fortran, without having to rewrite in another language. Each concept is illustrated with actual examples so you can immediately evaluate the performance of your code in comparison. This second edition provides much-needed updates on how to efficiently program GPUs in CUDA Fortran. It can be used both as a tutorial on GPU programming in CUDA Fortran and as a reference text.
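To give a flavor of that workflow, here is a minimal CUDA Fortran sketch of a kernel and its host-side launch (the module, array names, sizes, and scaling factor are illustrative assumptions, not an excerpt from the book):

    module scale_mod
    contains
      ! Kernel: each thread scales one array element
      attributes(global) subroutine scale(a, b, c)
        implicit none
        real, intent(in)  :: a(:)
        real, intent(out) :: b(:)
        real, value       :: c
        integer :: i
        i = (blockIdx%x - 1)*blockDim%x + threadIdx%x   ! global thread index
        if (i <= size(a)) b(i) = c*a(i)
      end subroutine scale
    end module scale_mod

    program main
      use cudafor
      use scale_mod
      implicit none
      integer, parameter :: n = 1024*1024
      real :: a(n), b(n)
      real, device :: a_d(n), b_d(n)     ! device copies of the host arrays
      a = 1.0
      a_d = a                            ! host-to-device transfer via assignment
      call scale<<<(n+255)/256, 256>>>(a_d, b_d, 2.0)  ! launch: grid and block sizes in chevrons
      b = b_d                            ! device-to-host transfer
      print *, 'max error: ', maxval(abs(b - 2.0*a))
    end program main

The kernel, the data declarations, and the launch all remain recognizably Fortran; compilation with nvfortran and its -cuda option is assumed here.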

PART I: CUDA Fortran Programming 1. Introduction 2. Correctness, Accuracy, and Debugging 3. Performance Measurements and Metrics 4. Synchronization 5. Optimization 6. Multi-GPU Programming 7. Porting Tips and Techniques 8. Interfacing with CUDA C, OpenACC, and CUDA Libraries PART II: Case Studies 9. Monte Carlo Method 10. Finite Difference Method 11. Applications of the Fast Fourier Transform 12. Ray Tracing

Greg Ruetsch is a Senior Applied Engineer at NVIDIA, where he works on CUDA Fortran and performance optimization of HPC codes. He holds a Bachelor’s degree in mechanical and aerospace engineering from Rutgers University and a Ph.D. in applied mathematics from Brown University. Prior to joining NVIDIA, he held research positions at Stanford University’s Center for Turbulence Research and at Sun Microsystems Laboratories.
Massimiliano Fatica is the manager of the Tesla HPC Group at NVIDIA, where he works in the area of GPU computing (high-performance computing and clusters). He holds a Laurea in Aeronautical Engineering and a Ph.D. in Theoretical and Applied Mechanics from the University of Rome “La Sapienza”. Prior to joining NVIDIA, he was a research staff member at Stanford University, where he worked at the Center for Turbulence Research and the Center for Integrated Turbulent Simulations on applications for the Stanford Streaming Supercomputer.
  • Presents optimization strategies for current hardware, including Hopper generation GPUs
  • Includes discussions of new language and hardware features, including managed memory, tensor cores, shuffle instructions, and new multi-GPU paradigms (a brief managed-memory sketch follows this list)
  • Offers resources and strategies for porting large codes to GPUs, including language features as well as library use
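As a small illustration of one of these features, managed memory lets host and device code share an allocation without explicit transfers; the sketch below pairs it with a CUF kernel directive (the array name, size, and scale factor are assumptions for illustration, not taken from the book):

    program managed_example
      use cudafor
      implicit none
      integer, parameter :: n = 1024
      real, managed, allocatable :: x(:)   ! unified (managed) memory, visible to host and device
      integer :: i, istat
      allocate(x(n))
      x = 1.0
      !$cuf kernel do <<<*,*>>>            ! compiler generates and launches the kernel for this loop
      do i = 1, n
         x(i) = 2.0*x(i)
      end do
      istat = cudaDeviceSynchronize()      ! synchronize before touching managed data on the host
      print *, x(1), x(n)
      deallocate(x)
    end program managed_example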