Chip Multiprocessor Architecture [electronic resource] : Techniques to Improve Throughput and Latency / by Kunle Olukotun, Lance Hammond, James Laudon.
- Author: Olukotun, Kunle.
- Other authors: Hammond, Lance; Laudon, James.
- Other title: Synthesis Lectures on Computer Architecture
- Published: Cham : Springer International Publishing : Imprint: Springer, 2007.
- Series: Synthesis Lectures on Computer Architecture
- Subjects: Electronic circuits; Microprocessors; Computer architecture; Electronic Circuits and Systems; Processor Architectures.
- Edition: 1st ed. 2007.
- ISBN: 9783031017209
- System number: 005282062
Summary note
Chip multiprocessors - also called multi-core microprocessors or CMPs for short - are now the only way to build high-performance microprocessors, for a variety of reasons. Large uniprocessors are no longer scaling in performance, because it is only possible to extract a limited amount of parallelism from a typical instruction stream using conventional superscalar instruction issue techniques. In addition, one cannot simply ratchet up the clock speed on today's processors, or the power dissipation will become prohibitive in all but water-cooled systems. Compounding these problems is the simple fact that with the immense numbers of transistors available on today's microprocessor chips, it is too costly to design and debug ever-larger processors every year or two. CMPs avoid these problems by filling up a processor die with multiple, relatively simpler processor cores instead of just one huge core. The exact size of a CMP's cores can vary from very simple pipelines to moderately complex superscalar processors, but once a core has been selected the CMP's performance can easily scale across silicon process generations simply by stamping down more copies of the hard-to-design, high-speed processor core in each successive chip generation. In addition, parallel code execution, obtained by spreading multiple threads of execution across the various cores, can achieve significantly higher performance than would be possible using only a single core. While parallel threads are already common in many useful workloads, there are still important workloads that are hard to divide into parallel threads. The low inter-processor communication latency between the cores in a CMP helps make a much wider range of applications viable candidates for parallel execution than was possible with conventional, multi-chip multiprocessors; nevertheless, limited parallelism in key applications is the main factor limiting acceptance of CMPs in some types of systems. After a discussion of the basic
Contents note
Contents: The Case for CMPs -- Improving Throughput -- Improving Latency Automatically -- Improving Latency using Manual Parallel Programming -- A Multicore World: The Future of CMPs.