As published in HPCWire
Anticipating the Fall: Application Performance Has Chased Multicore’s Speed Right Over a Cliff
Wile E. Coyote is doomed. Hanging in space, he is about to fall, and everyone knows it but him. We all saw it coming. Poor Coyote.
Yet strangely, he doesn’t fall right away. According to the alternate-reality rules of cartoon physics, the Coyote must first look down and realize
he is standing in thin air. He then has time to gather his thoughts, issue a final desperate wave, and then finally — poof! — he plummets body
first, leaving his head in the frame for the viewers to witness a comical last-second grimace before that too disappears.
Know what else we saw coming? The crash in HPC application performance that is being brought about by the transition to multicore processors. We’ve
been watching the race, as applications (Codus productivus) desperately chased processors (Waferii siliconium) up the performance mountain. Suddenly
multicore came and — meep! meep! — the CPUs put on a burst of speed and zoomed around a bend, leaving application software headed for a cliff.
HPC users were doomed. Everyone knew it. Poor users.
What’s this? Application performance hasn’t dramatically suffered? Users are satisfied with the performance they’re getting? How is this possible?
The answer: cartoon physics.
According to our most recent research, the reason performance hasn’t plummeted is that users haven’t been forced to deal with the problem yet. Rather
than adding a new layer of parallelism within the socket, most users have responded to multicore by running a separate job on each core. Sure, they’re
buying a lot more memory to do that (configured memory per core is staying relatively stable, so configured memory per socket is skyrocketing),
but at least the application is scaling. For now.
We’ve gone off a cliff; we just don’t know it yet. Because those cores aren’t getting any faster, we will soon have to come to grips with the reality
that new tools or programming models are needed to keep up the race. Look down, everybody. The ground isn’t there. Now is the time to
hold up a little “Oh, no!” sign and wave to the camera.
This is going to hurt, but fear not. The Coyote is resilient, and he always comes up with a new scheme. Soon he’ll be back in the race and chasing
right behind the Road Runner again.
The ISC conference in Dresden is coming up, and the new things I’ll most want to see are tools for improving application performance yield in large-scale,
multicore systems. Acme Application Optimizers, anyone?