Intel® Xeon Phi™ Coprocessor Architecture and Tools: The Guide for Application Developers provides developers a comprehensive introduction and in-depth look at the Intel Xeon Phi coprocessor architecture and the corresponding parallel data structure tools and algorithms used in the various technical computing applications for which it is suitable. It also examines the source code-level optimizations that can be performed to exploit the powerful features of the processor.
Xeon Phi is at the heart of world’s fastest commercial supercomputer, which thanks to the massively parallel computing capabilities of Intel Xeon Phi processors coupled with Xeon Phi coprocessors attained 33.86 teraflops of benchmark performance in 2013. Extracting such stellar performance in real-world applications requires a sophisticated understanding of the complex interaction among hardware components, Xeon Phi cores, and the applications running on them.
In this book, Rezaur Rahman, an Intel leader in the development of the Xeon Phi coprocessor and the optimization of its applications, presents and details all the features of Xeon Phi core design that are relevant to the practice of application developers, such as its vector units, hardware multithreading, cache hierarchy, and host-to-coprocessor communication channels. Building on this foundation, he shows developers how to solve real-world technical computing problems by selecting, deploying, and optimizing the available algorithms and data structure alternatives matching Xeon Phi’s hardware characteristics. From Rahman’s practical descriptions and extensive code examples, the reader will gain a working knowledge of the Xeon Phi vector instruction set and the Xeon Phi microarchitecture whereby cores execute 512-bit instruction streams in parallel.
What you’ll learn:
- How to calculate theoretical Gigaflops and bandwidth numbers on the hardware and measure them through code segment
- How to estimate latencies in fetching data from different cache hierarchies, including memory subsystems
- How to measure PCIe bus bandwidth between the host and coprocessor
- How to exploit power management and reliability features built into the hardware
- How to select and manipulate the best tools to tune particular Xeon Phi applications
- Algorithms and data structures for optimizing Xeon Phi performance
- Case studies of real-world Xeon Phi technical computing applications in molecular dynamics and financial simulations