Monday, February 14, 2000
4:00 PM (refreshments 3:45)
Edgerton Hall, Room 34-101
EECS Colloquium
Abstract
Increasing focus on multimedia applications has prompted the addition of multimedia extensions to most existing general-purpose microprocessors. These extensions introduce short Single Instruction Multiple Data (SIMD) instructions to the processor's Instruction Set Architecture. Unfortunately, access to these multimedia instructions has been limited to in-line assembly and library calls.
Researchers have proposed using vectorization techniques as a means of automatically exploiting short SIMD instructions. Although vectorization technology is well understood, it is geared toward identifying the massive amounts of parallelism needed by traditional vector processors. Modern multimedia extensions, by contrast, require only a small amount of parallelism. This key insight led us to develop Superword Level Parallelism (SLP), a novel way of viewing the short SIMD parallelism found in many applications. SLP is fundamentally different from loop-level vector parallelism and leads to simple, robust, and general compiler techniques that can automatically exploit multimedia instructions.
We have developed a robust compiler for detecting SLP that targets basic blocks rather than loop nests. Detection uses a simple analysis that locates independent isomorphic statements within a basic block. We can also translate vector parallelism into SLP by unrolling loops. Thus, our SLP compiler exploits parallelism both across loop iterations and within basic blocks, providing excellent performance in several application domains. Experiments on scientific and multimedia benchmarks have yielded an average performance improvement of 84%, with improvements as high as 253%.
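To make the idea of "independent isomorphic statements" concrete, here is a toy sketch in Python. It is not the compiler described in this talk. The `Stmt` record, the `independent` and `pack_isomorphic` helpers, the 4-wide packing limit, and the variable names are all assumptions made for illustration, and "isomorphic" is approximated as "same operator and same number of operands". The sample block stands in for a loop body `a[i] = b[i] + c[i]` after unrolling four times.

```python
from typing import NamedTuple, Tuple


class Stmt(NamedTuple):
    dest: str              # variable written by the statement
    op: str                # operator, e.g. "+" or "*"
    srcs: Tuple[str, ...]  # variables read by the statement


def independent(a: Stmt, b: Stmt) -> bool:
    """True when the two statements share no read/write or write/write conflict."""
    return (a.dest != b.dest
            and a.dest not in b.srcs
            and b.dest not in a.srcs)


def pack_isomorphic(block, width=4):
    """Greedily group statements of one basic block into SIMD-sized packs.

    A statement joins an existing pack only if it has the same shape as the
    pack's first member and is independent of every statement between that
    first member and itself, so hoisting it up to the pack is legal.
    """
    groups = []  # each group is a list of indices into block
    for i, stmt in enumerate(block):
        for group in groups:
            first = group[0]
            if (len(group) < width
                    and block[first].op == stmt.op
                    and len(block[first].srcs) == len(stmt.srcs)
                    and all(independent(stmt, block[j]) for j in range(first, i))):
                group.append(i)
                break
        else:
            # No compatible pack found: the statement starts a new one.
            groups.append([i])
    return [[block[i] for i in group] for group in groups]


# Hypothetical basic block: four unrolled copies of a[i] = b[i] + c[i],
# followed by a multiply that reads two of the results.
unrolled = [
    Stmt("a0", "+", ("b0", "c0")),
    Stmt("a1", "+", ("b1", "c1")),
    Stmt("a2", "+", ("b2", "c2")),
    Stmt("a3", "+", ("b3", "c3")),
    Stmt("s", "*", ("a0", "a1")),
]

packs = pack_isomorphic(unrolled)
for pack in packs:
    print([s.dest for s in pack])
```

In this sketch the four additions land in one pack, which a backend could emit as a single 4-wide SIMD add, while the dependent multiply stays on its own. A real SLP compiler would also need to consider operand alignment, adjacency of memory references, and the cost of packing and unpacking values, none of which is modeled here.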
Modified: Feb 8, 2000