Branchless Programming in C++
Have you ever written code like this:
void f(bool b, long x, long& s) { if (b) s += x; }
Probably you have. Would you like me to tell you how much performance you left on the table? With a small change, that function could be made 2.5 times faster. What about this code:
if (a[i] && b[i]) do_something(); else do_something_else();
Would you believe me if I told you that, under some not-so-exotic conditions, this line runs four times slower than it could be? It’s true, and I’m going to show you when and why. This presentation will explain how modern CPUs handle computations to ensure that the hardware is used efficiently (which is how you get high performance from the processor). We will then learn how conditional code disrupts the ideal flow of computations and the countermeasures employed by the CPU designers to retain good performance in the presence of such code. Sometimes these measures are sufficient, often with the help of the compiler. But when they aren’t, it is up to the programmer to recover lost performance by coding with fewer branches. PUBLICATION PERMISSIONS:
CppCon Organizer provided Coding Tech with the permission to republish CppCon tech talks. CREDITS
CppCon YouTube channel: https://www.youtube.com/channel/UCMlGfpWw-RUdWX_JbLCukXg https://www.youtube.com/watch?v=Bh8flzIX5gw