Compilers often use magic to uncover hidden mysteries of your program and optimize it aggressively. But even in their infinite wisdom, there are some things they just can't know, even though they wish they were true. So, tell 'em and you won't be disappointed.
Note: This article is focused on C and C++ and the compilers Clang/LLVM, GCC and MSVC.
Optimizers basically do two steps: Analysis and Transformation. The analysis phase gathers potentially useful information about the program, while the transformation phase uses this info to optimize it. The latter usually takes the form: I want to do X. I can do it only if Y is true. I'll ask the analysis if it knows Y to be true and if yes, I'll do it. Otherwise, tough luck, your code won't be optimized.
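To make that "X only if Y" conversation a bit more concrete, here is a minimal sketch (the function names are made up for this example). The transformation would like to read *n once before the loop (X), which is only legal if no store inside the loop can change *n (Y). In the first version the analysis usually can't prove Y, because a and n point to the same type and might overlap, so the compiler typically re-reads *n on every iteration. In the second version the bound lives in a local variable and the question never comes up.

```c
/* The store to a[i] might overwrite *n (both are unsigned), so the
   compiler typically has to re-read *n on every iteration. */
void zero_reload(unsigned *a, const unsigned *n) {
    for (unsigned i = 0; i < *n; ++i)
        a[i] = 0;
}

/* The bound is copied into a local, so no store in the loop can change
   it and it can simply stay in a register. */
void zero_hoisted(unsigned *a, const unsigned *n) {
    unsigned len = *n;
    for (unsigned i = 0; i < len; ++i)
        a[i] = 0;
}
```

Note that the two functions only behave the same when a and n don't overlap, which is exactly the fact the compiler was missing.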
As you can imagine, the analysis phase is just as important as the transformation. And while analyses get more and more sophisticated, there are things that they can't deduce because some functionality has not been implemented, or we haven't even thought about how it could be implemented or, worse, it's proven that we can't implement it.
Even if the analysis could deduce everything that can be found in your code (I don't even know what that would mean, but anyway), there are still facts that only you know about how you want your code to be compiled. The compiler has to adhere to the rules of the language semantics, the Instruction Set Architecture, or just plain logic / correctness rules. A lot of times, you might not care about it doing that (say, preserving the exact order of floating-point operations).
Here comes your part in this business. Tell the compiler what is true about your code and what you care about (or not) and who knows, it might find some good use. I know that unfortunately most communication between you and the compiler is the compiler telling you that you made a mistake (again). But there are some ways in which you can reply for your own good.
The restrict C99 keyword is a type qualifier that is mainly used in pointer declarations to inform the compiler that this pointer does not alias any other pointer. Pointer aliasing is a somewhat complicated topic (and so is the restrict keyword 1, 2), but the rule of thumb is: if the objects accessed through a pointer won't ever be accessed through any other pointer, declare it as restrict (especially parameters of hot functions).
As you may be thinking, this is true for most pointers in your programs, so the common thing should be to declare pointers as restrict.
I, [insert your name], a PROFESSIONAL or AMATEUR [circle one] programmer recognize that there are limits to what a compiler can do. I certify that, to the best of my knowledge, there are no magic elves or monkeys in the compiler which through the forces of fairy dust can always make code faster. I understand that there are some problems for which there is not enough information to solve. I hereby declare that given the opportunity to provide the compiler with sufficient information, perhaps through some key word, I will gladly use said keyword and not bitch and moan about how "the compiler should be doing this for me."
In this case, I promise that the pointer declared along with the restrict qualifier is not aliased. I certify that writes through this pointer will not effect the values read through any other pointer available in the same context which is also declared as restricted.
* Your agreement to this contract is implied by use of the restrict keyword ;)
- Mike Acton's restrict keyword contract
To observe the difference with and without restrict, take a simple sum function that adds the elements of two arrays and saves them in a third. The generated code with restrict is about half the size of the code without it. With restrict, the compiler does some relatively trivial vectorization. Without it, it has to insert run-time checks (essentially to verify the information that restrict would provide, so that it can branch to the vectorized version) and a fall-back scalar version.
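A sketch of what such a sum function might look like, with and without restrict. The function names and the int element type are my assumptions for illustration, not necessarily the exact code that was measured.

```c
#include <stddef.h>

/* Without restrict: out may overlap a or b, so a vectorizing compiler
   typically emits run-time overlap checks plus a scalar fall-back loop. */
void sum(int *out, const int *a, const int *b, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* With restrict: we promise that the three arrays do not overlap, so the
   compiler is free to go straight to the vectorized loop. */
void sum_restrict(int *restrict out, const int *restrict a,
                  const int *restrict b, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Compiling both at a high optimization level and comparing the assembly is the easiest way to see the difference. Keep in mind that calling sum_restrict with overlapping arrays is undefined behavior; the promise is yours to keep.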
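I should mention that restrict is only a C99 and later keyword. This means that, for starters, it's not available in C++. Fortunately, GCC, Clang and MSVC support it as an extension, each with its own spelling: GCC and Clang accept __restrict__ (and __restrict), while MSVC accepts __restrict.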
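One common way to paper over the different spellings is a small macro. This is only a sketch: RESTRICT and scale are names I made up, and it relies on GCC and Clang accepting __restrict__ and MSVC accepting __restrict as extensions.

```c
/* RESTRICT is a hypothetical macro name used only in this sketch. */
#if defined(_MSC_VER)
#  define RESTRICT __restrict      /* MSVC spelling */
#elif defined(__GNUC__) || defined(__clang__)
#  define RESTRICT __restrict__    /* GCC and Clang spelling */
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#  define RESTRICT restrict        /* plain C99 keyword */
#else
#  define RESTRICT                 /* unknown compiler: promise nothing */
#endif

/* Multiplies every element of in by k and stores the result in out. */
void scale(float *RESTRICT out, const float *RESTRICT in, float k, int n) {
    for (int i = 0; i < n; ++i)
        out[i] = in[i] * k;
}
```

The empty fall-back keeps the code correct everywhere; it just gives up the extra information on compilers we don't recognize.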
The basic idea behind these builtins is you being able to say "hey compiler, assume this is true". The rest is just every compiler being picky about the way they want you to communicate it.
The easiest case is Clang. You just write __builtin_assume(b) like a statement, where b is a boolean expression that the compiler can assume is true. __assume(b) can be used in MSVC / Visual Studio in a similar way.
Lastly, GCC provides the same functionality, but not directly. We have to emulate it. It does not provide an "assume" intrinsic, only an "unreachable" intrinsic, __builtin_unreachable(). That tells the compiler "this code can't be reached". So, you can emulate the "assume b is true" behavior by writing code as if (!b) __builtin_unreachable().
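The three spellings can be hidden behind one macro. A minimal sketch, where ASSUME and half are names I invented for the example; Clang is tested first because it also defines __GNUC__.

```c
/* ASSUME is a hypothetical macro name used only in this sketch. */
#if defined(__clang__)
#  define ASSUME(cond) __builtin_assume(cond)
#elif defined(_MSC_VER)
#  define ASSUME(cond) __assume(cond)
#elif defined(__GNUC__)
#  define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#  define ASSUME(cond) ((void)0)   /* unknown compiler: promise nothing */
#endif

/* Signed division by 2 normally needs a fix-up for negative inputs.
   With x >= 0 promised, the compiler can reduce it to a plain shift. */
int half(int x) {
    ASSUME(x >= 0);
    return x / 2;
}
```

If the condition is ever false at run time, the behavior is undefined, so only assume things you would be willing to assert in a debug build.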
Let's see how combining restrict and "assume"s can have a dramatic effect. Let's say we know that in the previous example, the length of the 3 arrays is a multiple of 4, which just so happens to be a vector width. We tell the compiler this and compare the generated code against the original.
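As a sketch of that combination, assuming GCC or Clang (so the "unreachable" form works on both) and again using a made-up function name and int elements:

```c
#include <stddef.h>

void sum_mult4(int *restrict out, const int *restrict a,
               const int *restrict b, size_t n) {
    /* Promise: n is a multiple of 4. Calling this function with any
       other n is undefined behavior. */
    if (n % 4 != 0)
        __builtin_unreachable();

    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

With this promise the compiler may be able to drop the scalar loop that would otherwise handle leftover elements when n is not a multiple of the vector width. How much actually disappears depends on the compiler version and target.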
Before moving forward, I should say that there are other similar builtins, like __builtin_assume_aligned(). You will find more of them online as you become more interested in providing information to the compiler.
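For instance, here is a small sketch of __builtin_assume_aligned() on GCC or Clang. The function name and the 32-byte figure are assumptions chosen for the example.

```c
#include <stddef.h>

/* Multiplies n floats in place by k. */
void scale_aligned(float *p, float k, size_t n) {
    /* Promise: p is 32-byte aligned. The builtin returns the same
       address; the compiler may assume the alignment for the result. */
    float *ap = __builtin_assume_aligned(p, 32);

    for (size_t i = 0; i < n; ++i)
        ap[i] *= k;
}
```

The promise applies to the returned pointer, so the loop has to use ap rather than p. Passing a pointer that is not actually 32-byte aligned is undefined behavior.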
While the previous methods provided ways to give fine-grained information about the code (e.g. for a specific variable or function), compiler switches usually provide ways to give coarser-grained information, for example: do / don't try to preserve the order of floating-point operations, do / don't care about strict aliasing, assume this architecture or later, etc. Since any modern compiler has thousands of switches, there's no way to cover them all in a single article. That said, I will describe the most "common" ones for GCC and Clang (because this is what I happen to know):
The -ffast-math switch is actually a lot of flags together, which basically tell the compiler not to care about all the weirdness and craziness of floating-point math. You should be careful with this, especially if your code does sensitive, high-precision floating-point calculations. But the truth is, usually your code doesn't do any of that. So, since we usually don't care, let the compiler not care either.
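A small sketch of the kind of "weirdness" involved (the function names are made up). Floating-point addition is not associative, so by default the compiler has to add the elements of sum_floats strictly in order, which usually prevents it from vectorizing the loop. With -ffast-math it is allowed to regroup the additions, at the cost of results that may differ slightly.

```c
/* Sums n floats strictly left to right, as the language requires. */
float sum_floats(const float *x, int n) {
    float total = 0.0f;
    for (int i = 0; i < n; ++i)
        total += x[i];
    return total;
}

/* Returns 1 if two groupings of the same three terms give different
   results, which is the case under strict IEEE evaluation. */
int regrouping_changes_result(void) {
    /* volatile keeps the compiler from folding the arithmetic away */
    volatile float big = 1e20f, one = 1.0f;
    float left  = (big + -big) + one;   /* big cancels first, one survives */
    float right = big + (-big + one);   /* one is absorbed by -big first */
    return left != right;
}
```

Comparing something like gcc -O3 against gcc -O3 -ffast-math on sum_floats is a quick way to see whether the reduction gets vectorized on your setup.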
The -march switch is one of the many switches that give information about the architecture you want to compile for. Most commonly, you want the compiler to compile the code for the machine you're compiling on, so you can use -march=native to tell it to optimize specifically for this machine's characteristics. You can also check the other options for this flag in your compiler's documentation.
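One way to see what an architecture switch changed is to look at the feature macros the compiler predefines. A minimal sketch with a made-up function name: GCC and Clang define __AVX2__ when AVX2 code generation is enabled, for example through -mavx2 or through -march=native on a machine that supports it.

```c
/* Returns 1 if this file was compiled with AVX2 enabled, 0 otherwise. */
int built_with_avx2(void) {
#if defined(__AVX2__)
    return 1;
#else
    return 0;
#endif
}
```

Building the same file with and without -march=native and calling this function shows whether the switch actually unlocked anything on your machine.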
I feel like this is the part of the article that will do more harm than good, so let me just remind you of something important: usually, when the compiler gives a warning, especially when you haven't asked for extra warnings (which you should, for God's sake, you're compiling C/C++ not Haskell), it is almost definitely to your benefit to listen to it.
That said, I understand that some warnings are just irrelevant. For all warnings, both Clang and GCC tell you the flag that enables them, which is always in the form -Wx, where x is the mnemonic for the warning. You can turn it off with the flag -Wno-x.
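As a tiny illustration (the function is made up), the following compiles fine but triggers a warning when built with -Wall on GCC or Clang. The diagnostic ends with [-Wunused-variable], which tells you the flag responsible.

```c
/* 'unused' is never read, which -Wunused-variable (part of -Wall)
   reports. */
int answer(void) {
    int unused = 0;
    return 42;
}
```

Adding -Wno-unused-variable to the command line silences exactly this warning and leaves the rest of -Wall in place. Deleting the variable is, of course, the better fix here.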
Again, there are hundreds of compiler flags, and note that I didn't even talk about Visual Studio / MSVC. If you're interested in GCC and Clang options, I would strongly recommend that you watch this video: Improving Performance Through Compiler Switches. It's a great talk. Also, you can take a look at MSVC's compiler flag documentation.
To wrap this up, I would advise you to generally seek ways to inform the compiler about stuff. It's more about having the attitude "I want to help the compiler help me" than about memorizing specific flags and using them blindly. Because, and that's a topic for a future article, the compiler is not a magic wand (Mike Acton's quote). Realizing that and participating actively in the process of compilation will give you better results in the end.