10:30 Asymmetries in Multi-Core Systems – T Gross (ETHZ)
Future exascale systems will be based on multi-core processors, but even
today’s multi-core processors can be asymmetric and exhibit limitations
and bottlenecks that are different from those found on a symmetric
multiprocessor. We investigate the performance of a cluster node based
on the Intel Xeon E5520 quad-core processor and note that despite the
symmetry implied by the programming model, the available memory
bandwidth is not shared equally among all threads. Consequently,
applications experience substantial performance variance and slowdowns
when tasks (threads) are mapped to cores naively. An
operating system scheduler could mitigate these effects by taking into
account the memory system structure, but it needs accurate information from
the performance monitoring unit, as the asymmetry is not directly exposed
in the processor’s instruction set manual. Current performance
monitoring units are quite inflexible and change from one processor to
the next, so higher levels of the software tool chain are discouraged from
using them. The development of portable performance monitoring units will
be crucial if applications are to exploit the performance potential of
exascale systems.
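
The difference between a naive and a memory-aware thread-to-core mapping can be sketched in a few lines of Python. This is an illustration only, not the method used in the talk: the two-socket topology, the core numbering, and the function names (naive_mapping, spread_mapping, pin_current_thread) are assumptions made for the example. Whether the spread mapping actually helps depends on the workload and on where its data is allocated.

```python
import os
from itertools import chain, zip_longest

# Hypothetical topology (an assumption, not taken from the talk):
# socket id -> core ids. Real core numbering differs between machines.
TOPOLOGY = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}


def naive_mapping(num_threads, topology):
    """Fill cores in numeric order, so the first socket is loaded first."""
    cores = sorted(chain.from_iterable(topology.values()))
    return [cores[i % len(cores)] for i in range(num_threads)]


def spread_mapping(num_threads, topology):
    """Alternate between sockets so threads are divided evenly among them."""
    interleaved = [
        core
        for group in zip_longest(*topology.values())
        for core in group
        if core is not None  # sockets may have different core counts
    ]
    return [interleaved[i % len(interleaved)] for i in range(num_threads)]


def pin_current_thread(core):
    """Pin the calling thread to one core; a no-op where unsupported."""
    if hasattr(os, "sched_setaffinity"):  # available on Linux only
        os.sched_setaffinity(0, {core})
```

With four threads on this assumed topology, naive_mapping places all of them on socket 0, while spread_mapping places two on each socket. A scheduler of the kind the abstract describes would choose between such placements using measured memory traffic rather than a fixed rule.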