ODR, libc++ hardening, Profiles and Contracts
What follows is a writeup I did after receiving a question about Contracts, ODR violations, libc++ hardening and their relationships. Since I think it provides a decent explanation of some of the ODR issues we’ve been discussing in WG21, I am sharing it more widely. Thanks to Hui Xie and Anthony Williams for the original communication this is extracted from.
The context is that libc++ provides hardening modes that users can select on a per-TU basis by defining a macro. The hardening mode used in a TU ends up enabling or disabling some assertion macros located in the implementation of libc++, say in vector::operator[]. As such, it faces many of the same ODR challenges that two ongoing proposals are facing:
- contracts: the selection of a contract semantic (enforce, quick_enforce, ignore, etc.) and how things work when different TUs are compiled with different contract semantics
- profiles: how to create a program where different TUs enable different runtime profiles (e.g. one TU enables the library hardening profile, and another TU doesn’t)
As far as their relationship to the ODR is concerned, these three situations are very similar. So here’s the interesting example:
// foo.hpp
#include <vector>
inline void foo(std::vector<int>& x) {
    x.push_back(42);
    x[0] = 1;
}

// a.cpp, hardening mode A
#include "foo.hpp"
void f() {
    std::vector<int> x;
    foo(x);
}

// b.cpp, hardening mode B
#include "foo.hpp"
void g() {
    std::vector<int> x;
    foo(x);
}

// main.cpp
#include "foo.hpp"
extern void f();
extern void g();
int main() {
    f();
    g();
}
The question here is which hardening mode is applicable for the foo function and for vector::operator[].
Specifically, when building this program:
- TU a.cpp contains definitions for inline foo(...), inline vector::operator[], and f()
- TU b.cpp contains definitions for inline foo(...), inline vector::operator[], and g()
When linking the program, the linker will see the two definitions of inline foo(...) and the two definitions of inline vector::operator[], and it will keep exactly one of each of them. It does so by assuming that all definitions of the same function are equivalent, due to the ODR. But in reality, these definitions have different object code, since they are using different hardening modes:
- a.cpp’s inline foo(...) calls vector<int>::operator[] containing checks per hardening mode A
- b.cpp’s inline foo(...) calls vector<int>::operator[] containing checks per hardening mode B
This is technically an ODR violation, and so this program is ill-formed, no diagnostic required (IFNDR). In practice, what happens is that you’ll get one definition of foo() and one definition of vector::operator[], each picked arbitrarily by the linker, so you may end up getting either hardening mode. To help protect against this issue, libc++ uses __attribute__((abi_tag(<hardening-mode>))) on its inline functions, where <hardening-mode> is simply a string like "hardening_mode_A", "hardening_mode_B", etc.
The effect of this ABI tag on a function is that the function’s mangled name will now contain the given string, effectively making the mangled name of operator[] different when the hardening mode is different. So we now have the following picture:
- TU a.cpp contains definitions for inline foo(...), inline vector::operator[]<ABITAG:hardening_mode_A>, and f()
- TU b.cpp contains definitions for inline foo(...), inline vector::operator[]<ABITAG:hardening_mode_B>, and g()
So from the linker’s perspective we don’t have an ODR violation anymore for vector::operator[], since definitions with different hardening modes (and hence different codegen) now have different names as well. However, we still have an ODR violation for inline foo(...), to which nobody applied an ABI tag.
Sadly, this ODR violation is outside the control of libc++. In principle, users would also need to ABI-tag their inline functions in order to safeguard against this problem. In practice though, nobody even knows about the issue so nobody protects against it, and it may end up being a real ODR violation. Note that compiler inlining also comes into the mix: when a call is inlined, no out-of-line copy of the function is needed for that call, which hides many of these ODR violations in practice, but this is not reliable.
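As a rough sketch of what tagging both library and user code by hand could look like, here is a self-contained example. Every name in it (MYLIB_HARDENED, MYLIB_MODE_TAG, MYLIB_ASSERT, MYLIB_ODR_TAG, tiny_array, call_foo) is made up for illustration and is not how libc++ spells things; only the abi_tag attribute itself is the real GCC/Clang extension discussed above.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical per-TU switch standing in for the real hardening macro.
#ifndef MYLIB_HARDENED
#  define MYLIB_HARDENED 0
#endif

#if MYLIB_HARDENED
#  define MYLIB_MODE_TAG "hardening_mode_B"
#  define MYLIB_ASSERT(cond) ((cond) ? (void)0 : std::abort())
#else
#  define MYLIB_MODE_TAG "hardening_mode_A"
#  define MYLIB_ASSERT(cond) ((void)0)
#endif

// GCC/Clang extension: the tag string becomes part of the mangled name.
#define MYLIB_ODR_TAG __attribute__((abi_tag(MYLIB_MODE_TAG)))

// "Library" side: the check depends on the mode, and so does the symbol name.
// (constexpr is used only so the sketch can be checked at compile time.)
template <class T, std::size_t N>
struct tiny_array {
    T data[N];

    MYLIB_ODR_TAG constexpr T& operator[](std::size_t i) {
        MYLIB_ASSERT(i < N);
        return data[i];
    }
};

// "User" side: the tag on operator[] does not propagate to its callers,
// so foo has to be tagged by hand to get a mode-specific symbol name too.
// (A constexpr function is implicitly inline.)
MYLIB_ODR_TAG constexpr int foo(tiny_array<int, 4>& x) {
    x[0] = 1;
    return x[0];
}

// Small driver used to exercise foo.
constexpr int call_foo() {
    tiny_array<int, 4> x{};
    return foo(x);
}
```

If two TUs include this with different values of MYLIB_HARDENED, the emitted symbols for operator[] and foo should carry different tags (inspecting the object files with a tool like nm is one way to confirm this), so the linker no longer merges definitions that were compiled differently. Any untagged inline function that calls foo would still have the original problem.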
In practice, I have not come across any instance where an ODR violation like this one caused problems since we started using ABI tags on libc++ functions. However, these ABI tags were introduced in the first place because we did encounter problematic ODR violations across libc++ functions, so the problem is real. Reducing its scope by protecting libc++’s own functions against it seems to have been an effective mitigation technique, but it’s not a full solution.
At the core, this deficiency is due to the fact that ABI tags do not propagate from callees to callers, just like they don’t propagate from data members to the types containing them (which makes them of limited usefulness to handle ABI breaks like the GCC std::string one). In order to do this properly, the compiler would have to effectively ABI-tag every inline function (and only those, I think) based on the hardening mode / contract assertion semantics / profiles in effect inside a translation unit. From an implementer’s perspective, I believe that introducing some kind of “poisonous ABI tag” attribute (as a compiler extension; this doesn’t need to be standardized) that propagates from callees to callers would make the most sense. That way, the functionality would be available to the library, and the compiler could reuse those same semantics for implementing core language features like Contracts or Profiles. This thought is not fully formed in my head yet; as far as I can tell, solving this is still an open problem.
One important thing I would like to point out is that this “ODR sensitivity” is not specific to Contracts, Profiles or libc++ hardening. The exact same thing happens with any “ODR sensitive” property, which is basically anything that results in different code generation inside an inline function. There are many examples of that:
- if you have an inline function with #if CONDITION and not all translation units see the same value for CONDITION, you also have an ODR violation
- when you compile some translation units with -fno-exceptions and others without it, the same is true, and I have seen this break in practice (this can result in e.g. libc++ calling std::abort() instead of throwing an exception that the caller intended to handle)
- any other compiler flag that affects code generation can have the same result if you mix different values of these flags across translation units
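To make the first bullet concrete, here is a minimal sketch of an ODR-sensitive inline function. The header name, the checked_index function and the way CONDITION is used are assumptions made up for this illustration.

```cpp
// config_sensitive.hpp: hypothetical header included by several TUs.

// Hypothetical configuration macro; each TU may define it differently,
// e.g. with -DCONDITION=1 on the command line.
#ifndef CONDITION
#  define CONDITION 0
#endif

// After preprocessing, the body of this function differs between a TU built
// with CONDITION=0 and one built with CONDITION=1. Both definitions share a
// single mangled name, so the linker keeps whichever copy it happens to pick.
// (constexpr, hence implicitly inline; constexpr is used only so the sketch
// can be checked at compile time.)
constexpr int checked_index(int i, int size) {
#if CONDITION
    // "Checked" variant: clamp out-of-range indices into [0, size - 1].
    return i < 0 ? 0 : (i >= size ? size - 1 : i);
#else
    // "Unchecked" variant: trust the caller.
    return i;
#endif
}
```

A TU compiled with CONDITION=1 that relies on the clamping may silently end up calling the unchecked copy emitted by another TU, which is the same failure mode as the hardening example.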
Libc++ currently uses ABI tags to try protecting against ODR violations involving -fno-exceptions, the hardening mode, and the release version of libc++. There are other properties (like the value of -std=c++XY) that we should probably take into account too. But either way, this safeguarding is limited in effectiveness to libc++’s own functions as it does not propagate to user-defined inline functions. Even outside of any specific WG21 proposal, solving these ODR issues would be greatly desirable as a QOI matter for compilers and Standard Libraries.
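One way to picture how several ODR-sensitive properties can be folded into a single tag is the sketch below. It is only loosely modeled on the idea described above: all MYLIB_* names and the one-letter encoding are hypothetical, and libc++'s actual macros and encoding may well differ. The only real facilities used are the abi_tag attribute and the standard __cpp_exceptions feature-test macro, which is not defined when exceptions are disabled.

```cpp
// Hypothetical per-TU settings; a real library would derive these from its
// own configuration headers.
#ifndef MYLIB_HARDENED
#  define MYLIB_HARDENED 0
#endif
#ifndef MYLIB_VERSION
#  define MYLIB_VERSION 1
#endif

// One token per property: h/n for hardening on/off.
#if MYLIB_HARDENED
#  define MYLIB_SIG_HARDENING h
#else
#  define MYLIB_SIG_HARDENING n
#endif

// e/n for exceptions enabled/disabled in this TU.
#ifdef __cpp_exceptions
#  define MYLIB_SIG_EXCEPTIONS e
#else
#  define MYLIB_SIG_EXCEPTIONS n
#endif

// Paste the pieces into one identifier-like token, then stringify it so the
// attribute receives a single string literal such as "ne1".
#define MYLIB_CAT3_(a, b, c) a##b##c
#define MYLIB_CAT3(a, b, c) MYLIB_CAT3_(a, b, c)
#define MYLIB_STR_(x) #x
#define MYLIB_STR(x) MYLIB_STR_(x)

#define MYLIB_ODR_SIGNATURE \
    MYLIB_STR(MYLIB_CAT3(MYLIB_SIG_HARDENING, MYLIB_SIG_EXCEPTIONS, MYLIB_VERSION))

#define MYLIB_HIDE_FROM_ABI __attribute__((abi_tag(MYLIB_ODR_SIGNATURE)))

// Every inline function of the hypothetical library carries the signature.
MYLIB_HIDE_FROM_ABI inline int answer() { return 42; }

// Exposes the signature so it can be inspected or tested.
constexpr const char* odr_signature() { return MYLIB_ODR_SIGNATURE; }
```

Adding another property, such as the language standard in use, would mean adding one more token to the signature. As with the hardening tag, none of this reaches user-defined inline functions that call into the library.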