Description
Programming Massively Parallel Processors (4th Ed.)
A Hands-on Approach
Authors: Hwu Wen-mei W., Kirk David B., El Hajj Izzat
Language: English

Subjects for Programming Massively Parallel Processors:
Keywords
Accessibility; Adjacent (block) synchronization; Algorithm selection; Algorithmic complexity; Amdahl’s law; Applications programming interface (API); Approximation; Associative operators; Asynchronous; Atomic operation; Atomic operations; Barrier; Barrier synchronization; Binary search; Binning; Bitonic sort; Bottom-up sort methods; Boundary condition problems; Breadth-first search; Brent-Kung; Bucketing; Buffering; CUDA C; CUDA streams; Cache; Chain rule; Circular buffer; Co-rank; Coalescing; Collective; Column-major layout; Communication; Commutative operators; Compare-and-swap; Comparison sort; Compression; Computational thinking; Compute to global memory access ratio; Constant cache; Constant memory; Contention; Contiguous partitioning; Control divergence; Convolution; Convolution filters; Convolutional neural network; Cooperating kernels; Corner turning; Counting sort; Critical path analysis; Cross-iteration dependence; CuDNN; Cutoff; DRAM burst; Data characteristics; Data dependence; Data parallelism; Data reuse; Data size scaling; Data transfer; Data-dependent execution behavior; Deadlock; Deep learning; Device code; Device property query; Differential equations; Direction-optimized parallelization; Discretization; Divide and conquer; Domain decomposition; Domain partition; Double-buffering; Driving direction map services; Dynamic input data identification; Dynamic resource partitioning; Dynamic work discovery; Edge; Edge-centric parallelization; Electrostatic potential energy; Embarrassingly parallel applications; Error handling; Execution configuration parameters; Execution resource utilization efficiency; Feature extraction; Finite difference method; Flexibility; Forward propagation; Frontiers; GPU computing; Gather; Gather parallelization; Ghost cells; Golden age of computing; Gradient backpropagation; Graph algorithms; Graph traversal; Graphs; Grid launch; Halo cells; Heterogeneous computing
88.31 €
In Print (Delivery period: 14 days).
580 p. · 19 × 23.4 cm · Paperback
Programming Massively Parallel Processors: A Hands-on Approach shows students and professionals alike the basic concepts of parallel programming and GPU architecture. Concise, intuitive, and practical, it is based on years of road-testing in the authors' own parallel computing courses. Various techniques for constructing and optimizing parallel programs are explored in detail, while case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. The new edition includes updated coverage of CUDA, including newer libraries such as cuDNN. New chapters on frequently used parallel patterns have been added, and case studies have been updated to reflect current industry practices.
Part I Fundamental Concepts
1 Introduction
2 Heterogeneous data parallel computing
3 Multidimensional grids and data
4 Compute architecture and scheduling
5 Memory architecture and data locality
6 Performance considerations
Part II Parallel Patterns
7 Convolution: An introduction to constant memory and caching
8 Stencil
9 Parallel histogram
10 Reduction and minimizing divergence
11 Prefix sum (scan)
12 Merge: An introduction to dynamic input data identification
Part III Advanced patterns and applications
13 Sorting
14 Sparse matrix computation
15 Graph traversal
16 Deep learning
17 Iterative magnetic resonance imaging reconstruction
18 Electrostatic potential map
19 Parallel programming and computational thinking
Part IV Advanced Practices
20 Programming a heterogeneous computing cluster: An introduction to CUDA streams
21 CUDA dynamic parallelism
22 Advanced practices and future evolution
23 Conclusion and outlook
Appendix A: Numerical considerations
David B. Kirk is well recognized for his contributions to graphics hardware and algorithm research. Before beginning his doctoral studies at Caltech, he earned B.S. and M.S. degrees in mechanical engineering from MIT and worked as an engineer for Raster Technologies and Hewlett-Packard's Apollo Systems Division. After receiving his doctorate, he joined Crystal Dynamics, a video-game manufacturing company, as chief scientist and head of technology. In 1997 he became Chief Scientist at NVIDIA, a leader in visual computing technologies, and he is currently an NVIDIA Fellow.
At NVIDIA, Kirk led graphics-technology development for some of today's most popular consumer-entertainment platforms, playing a key role in providing mass-market graphics capabilities previously available only on workstations costing hundreds of thousands of dollars.
- Parallel Patterns: Introduces new chapters on frequently used parallel patterns (stencil, reduction, sorting) and major improvements to previous chapters (convolution, histogram, sparse matrices, graph traversal, deep learning)
- Ampere: Includes a new chapter focused on GPU architecture and draws examples from recent architecture generations, including Ampere
- Systematic Approach: Incorporates major improvements to abstract discussions of problem decomposition strategies and performance considerations, with a new optimization checklist