Efficient parameter pack indexing
29 Nov 2015
Recently, I’ve been looking at ways to index into parameter packs with as little
compile-time overhead as possible. This is not a new problem, and we know of
several metaprogramming techniques to achieve this, some of which offer pretty
good compile-times. Most of these techniques are also well documented, for
example here and here . So, why write an article about this
well-covered topic? Well, I recently decided to cheat and modified the compiler
to see how fast we could possibly be. This article summarizes what I found.
First technique: recursion
The first implementation technique is the most naive one, and also the one
which is unfortunately encountered the most often in the wild. Basically, we
recursively walk the parameter pack until we find the type at the given index:
template < std :: size_t I , typename T , typename ... Ts >
struct nth_element_impl {
using type = typename nth_element_impl < I - 1 , Ts ... >:: type ;
};
template < typename T , typename ... Ts >
struct nth_element_impl < 0 , T , Ts ... > {
using type = T ;
};
template < std :: size_t I , typename ... Ts >
using nth_element = typename nth_element_impl < I , Ts ... >:: type ;
This is simple and easy to understand, but it can also be very inefficient in
some circumstances. In general, don’t ever do this in your codebase unless
you want to extend your coffee breaks.
Second implementation: multiple inheritance
The second implementation technique is much more clever. It consists in using
multiple inheritance and template argument deduction to trick the compiler into
deducing the n-th element of a parameter pack for us. Here’s how it’s done:
template < std :: size_t I , typename T >
struct indexed {
using type = T ;
};
template < typename Is , typename ... Ts >
struct indexer ;
template < std :: size_t ... Is , typename ... Ts >
struct indexer < std :: index_sequence < Is ... > , Ts ... >
: indexed < Is , Ts > ...
{ };
template < std :: size_t I , typename T >
static indexed < I , T > select ( indexed < I , T > );
template < std :: size_t I , typename ... Ts >
using nth_element = typename decltype ( select < I > (
indexer < std :: index_sequence_for < Ts ... > , Ts ... > {}
)) :: type ;
The code is adapted from this post .
The idea here is that indexer
inherits from indexed<0, T0>
, indexed<1, T1>
,
up to indexed<N, TN>
. Then, when we call select<I>
with a fixed I
, the
only match is select(indexed<I, TI>)
. Hence, the compiler deduces the right
indexed<I, TI>
for us, and all we have to do is fetch the actual TI
,
which is just indexed<I, TI>::type
.
Third implementation: the void*
trick
The third implementation technique is probably the most tricky one to understand.
The idea is that we generate a function template of the form
template < typename T >
T f ( void * , void * , ..., void * , T * , ...);
// ^^^^^^^^^^^^^^^^^^^^^^^^ n-1 of these
Then, when we call that function with pointer arguments, its return type is
always going to be the type pointed to by the n-th argument. Modulo some
trickery, you can probably see that we could get the n-th element of an
arbitrary parameter pack if we had such a contraption. But can we build
such a function for an arbitrary n
? Yes, we can:
template < std :: size_t n , typename = std :: make_index_sequence < n > >
struct nth_element_impl ;
template < std :: size_t n , std :: size_t ... ignore >
struct nth_element_impl < n , std :: index_sequence < ignore ... >> {
template < typename Tn >
static Tn f ( decltype (( void * ) ignore )..., Tn * , ...);
};
Given any n
, nth_element_impl<n>::f
is the function we’ve been searching
for. Now, all that’s left to do is to use f
in the right way to extract the
n-th element of a parameter pack, which is not too difficult:
template < typename T >
struct wrapper { using type = T ; };
template < std :: size_t n , typename ... T >
using nth_element = typename decltype (
nth_element_impl < n >:: f ( static_cast < wrapper < T >*> ( 0 )...)
) :: type ;
Here, we use wrapper
to avoid creating a pointer to each T
, because some
types can’t be pointed-to (like references). Then, since f
’s return type is
the pointee of the n-th argument, the decltype
of the call to f
is just a
wrapper
holding the n-th type of the parameter pack. Finally, we just fetch
that type in the nested ::type
alias. Phew!
Fourth implementation: A compiler builtin!
Now’s the meat of this article. I’ve known about the above techniques for a
while, and I was also aware of the approximate compile-time performance of
each. However, I couldn’t help but wonder how a compiler intrinsic would
compare with these library-based approaches. So I went ahead and implemented
that in the Clang compiler, which is surprinsigly easy to hack on.
Basically, I modified the compiler so that when it sees a specialization called
__nth_element<n, T...>
, it substitutes it for the n-th type of the parameter
pack T...
. Just as if __nth_element
was a template alias doing one of the
tricks that we saw above, except the trick is implemented at the compiler-level.
Here’s how we can use it:
using Nth = __nth_element < 3 , int , char , float , double , void > ;
static_assert ( std :: is_same < Nth , double >:: value , "" );
Simple, huh? Let’s now see how the different techniques compare.
Comparing the techniques
To compare the techniques, we’ll first need to settle on a benchmark that we
can put each of the techniques through. For different values of n
, we’ll
fetch all the elements in a parameter pack of length n+1
(indexing starts
at 0):
using T0 = nth_element < 0 , int , int , ..., int > ;
^^^^^^^^^^^^^^^^^^ n + 1 elements total
using T1 = nth_element < 1 , int , int , ..., int > ;
...
using Tn = nth_element < n , int , int , ..., int > ;
A couple of things are worth noting here. First, we fetch types at all the
indices in the parameter pack instead of fetching a single, arbitrary index.
From experience, we often need to fetch the types at many (or all) the indices
in a parameter pack, so this use case is closest to most real-world use cases.
Also, the second and third techniques have a cost to amortize, i.e. the cost
of initially building the std::index_sequence
. It is therefore more fair to
fetch many indices in order to let the amortizing happen. Finally, fetching at
many different indices means longer compile-times, which is easier to measure
with a decent precision.
Another thing worth justifying is the usage of int
as the type in the
parameter pack. In fact, the choice that was made is to use a homogeneous
parameter pack instead of a heterogeneous parameter pack like
template < int > struct X ;
using T0 = nth_element < 0 , X < 0 > , X < 1 > , ..., X < n + 1 >> ;
The reason for doing so is that creating types takes time to the compiler,
even when those types are not instantiated (like the X<i>
s above). Since
we really want to benchmark the algorithm for getting the n-th element of
the parameter pack, it is better to use an homogeneous parameter pack. Now,
as for any rigorous benchmark, we’ll need a baseline implementation of
nth_element
:
template < std :: size_t n , typename ... Ts >
using nth_element = void ;
Of course, this implementation gives the wrong result, but it doesn’t matter
since it’s just for benchmarking. Also, since we really want to measure the
efficiency of the techniques we presented above and not some constant factors,
we’ll #include
exactly the same headers for all the benchmarks. In other
words, even though the builtin __nth_element
requires no additional headers,
its benchmark will still include <utility>
to be fair with the other techniques
that require it. Finally, note that all benchmarks are on Clang trunk, to which
my patch is added. Enough waiting, here’s the result:
{
"title": {
"text": "Compilation time for accessing a parameter pack"
},
"series": [
{
"name": "recursion",
"data": [[1, 0.08791385299991816], [11, 0.1040869290009141], [21, 0.10760211700107902], [31, 0.12382004098617472], [41, 0.1280566049972549], [50, 0.14985320100095123], [75, 0.22767344501335174], [100, 0.44891875301254913], [125, 0.6296636089973617], [150, 0.8388248109840788], [175, 1.2580508480023127], [200, 1.75584493597853], [225, 2.347315200982848], [250, 3.070959162985673], [275, 3.9935728659911547], [300, 5.04345749298227], [325, 6.411517444008496], [350, 7.570142345997738], [375, 9.334758556011366], [400, 11.364183734986], [425, 13.100518237974029], [450, 15.150793401990086], [475, 18.016470752976602]]
}, {
"name": "multiple inheritance",
"data": [[1, 0.09923712399904616], [11, 0.1010527829930652], [21, 0.09471394898719154], [31, 0.09767348700552247], [41, 0.10303380902041681], [50, 0.11183845900814049], [75, 0.12262416601879522], [100, 0.13599308699485846], [125, 0.1490678550035227], [150, 0.16649107899866067], [175, 0.1929700399923604], [200, 0.24410744599299505], [225, 0.25436611101031303], [250, 0.2929226130072493], [275, 0.33210227798554115], [300, 0.3667355029901955], [325, 0.39734739597770385], [350, 0.44671401599771343], [375, 0.5002171350060962], [400, 0.5487369850161485], [425, 0.6234887770260684], [450, 0.6647616800037213], [475, 0.7309245250071399], [500, 0.7873986509803217], [525, 0.8701764259894844], [550, 0.9605405330075882], [575, 1.0053072549926583], [600, 1.1314099880110007], [625, 1.190395031007938], [650, 1.2666128969867714], [675, 1.3734691129939165], [700, 1.5193456730048638], [725, 1.5714434119872749], [750, 1.8087361850193702], [775, 1.7521372050105128], [800, 2.0012963689805474], [825, 2.524466857983498], [850, 2.1846430029836483], [875, 2.2757716499909293], [900, 2.5220009069889784], [925, 2.8757488050032407], [950, 2.876232857001014], [975, 3.066998501017224], [1000, 3.1187242529995274], [1025, 3.245229969994398], [1050, 3.7422107250022236], [1075, 3.713907941011712], [1100, 3.5173721110040788], [1125, 4.708479123015422], [1150, 5.443987812002888], [1175, 5.246640281024156], [1200, 5.023036516999127], [1225, 4.97608685400337], [1250, 5.379978543001926], [1275, 5.288356656004908], [1300, 6.068725592020201], [1325, 6.058860035991529], [1350, 6.039346348989056], [1375, 6.673429206974106], [1400, 6.59464824697352], [1425, 6.439105322002433], [1450, 7.265887854999164], [1475, 7.10695827100426], [1500, 7.622790062014246], [1525, 7.2972888789954595], [1550, 7.67380925299949], [1575, 7.264323350013001], [1600, 8.271978703996865], [1625, 7.791878398013068], [1650, 9.339439092000248], [1675, 9.755184198991628], [1700, 10.066870843991637], [1725, 10.372837678994983], [1750, 10.309330489981221], [1775, 9.768502388993511], [1800, 9.401901669014478], [1825, 10.241244286007714], [1850, 9.85094714400475], [1875, 10.683431051002117], [1900, 10.967952057020739], [1925, 12.433038909017341], [1950, 12.530010496993782], [1975, 12.624577670998406], [2000, 11.85506974501186]]
}, {
"name": "void* trick",
"data": [[1, 0.10740649900981225], [11, 0.11059235798893496], [21, 0.12323373701656237], [31, 0.12219471001299098], [41, 0.1328271089878399], [50, 0.14716608601156622], [75, 0.19515213900012895], [100, 0.31295377301285043], [125, 0.33367389597697183], [150, 0.5311430190049578], [175, 0.6208896419848315], [200, 0.742176756990375], [225, 0.8497800249897409], [250, 0.9993392080068588], [275, 1.3237761600175872], [300, 1.5975253860233352], [325, 2.0864916009886656], [350, 2.1416680899856146], [375, 2.7329974559834227], [400, 3.1420765760121867], [425, 3.185407273005694], [450, 3.5732148960232735], [475, 4.1299009829817805], [500, 4.569442178006284], [525, 4.898843714006944], [550, 6.166297096002381], [575, 5.742453882005066], [600, 6.27711286599515], [625, 6.7415783570031635], [650, 7.676926622982137], [675, 8.281751938979141], [700, 9.087016478995793], [725, 9.76674667999032], [750, 11.78882013799739], [775, 10.950903388991719], [800, 11.854080150020309], [825, 13.233973357011564], [850, 15.64278328200453], [875, 15.402913261001231], [900, 17.994686827005353], [925, 17.696341240982292], [950, 19.390634357987437]]
}, {
"name": "__nth_element",
"data": [[1, 0.11245947898714803], [11, 0.1085786210023798], [21, 0.10532981299911626], [31, 0.12184113499824889], [41, 0.10490602100617252], [50, 0.11116955100442283], [75, 0.11290367800393142], [100, 0.12389529898064211], [125, 0.1296987759997137], [150, 0.14026946501689963], [175, 0.14848662100848742], [200, 0.16904175697709434], [225, 0.19542713597184047], [250, 0.211196149000898], [275, 0.21766682498855516], [300, 0.2804989920114167], [325, 0.2675095460144803], [350, 0.3391032590006944], [375, 0.32595690799644217], [400, 0.3887825500278268], [425, 0.39993590902304277], [450, 0.4480190910107922], [475, 0.4430583660141565], [500, 0.48500029600108974], [525, 0.5337112010165583], [550, 0.5564779890119098], [575, 0.5852735450025648], [600, 0.6763202070142142], [625, 0.7239769989973865], [650, 0.7201910940057132], [675, 0.776890175009612], [700, 0.7959786260034889], [725, 0.8278058749856427], [750, 0.9787349559774157], [775, 0.991111416980857], [800, 0.9531115650024731], [825, 1.1124548520019744], [850, 1.1440459869918413], [875, 1.1474693639902398], [900, 1.1929215129930526], [925, 1.3329246929788496], [950, 1.320771906990558], [975, 1.400243792013498], [1000, 1.4220198979892302], [1025, 1.4624821589968633], [1050, 1.7917850179946981], [1075, 1.7489220330026], [1100, 1.8088964839989785], [1125, 1.756834443018306], [1150, 1.9329246929846704], [1175, 2.160528142005205], [1200, 2.1670490649994463], [1225, 2.392635665019043], [1250, 2.420430022990331], [1275, 2.360121834994061], [1300, 2.6496885970118456], [1325, 2.4889739899954293], [1350, 2.546395246987231], [1375, 2.7288687319960445], [1400, 2.95728632301325], [1425, 3.1285262529854663], [1450, 3.3501748190028593], [1475, 3.458735363004962], [1500, 3.0745456560107414], [1525, 3.4576181620068382], [1550, 3.3852364490157925], [1575, 3.468985949002672], [1600, 3.739083191991085], [1625, 3.8641978960076813], [1650, 3.942073091981001], [1675, 4.5900268880068325], [1700, 4.179321927018464], [1725, 4.130655788991135], [1750, 4.486190288997022], [1775, 4.261819762003142], [1800, 4.686393445008434], [1825, 4.74440868300735], [1850, 4.688695557997562], [1875, 5.090895264991559], [1900, 4.904464602004737], [1925, 5.021394329989562], [1950, 4.951598395971814], [1975, 5.175259591982467], [2000, 5.444765752996318], [3000, 12.110702848003712]]
}, {
"name": "std::tuple_element",
"data": [[1, 0.09287038102047518], [11, 0.09819490101654083], [21, 0.10967255602008663], [31, 0.12497597699984908], [41, 0.15581403399119154], [50, 0.20737257500877604], [75, 0.4075722369889263], [100, 0.6961136639874894], [125, 1.2459769090055488], [150, 1.9477032349968795], [175, 2.9241388669761363], [200, 4.062343477999093], [225, 5.525791090010898], [250, 7.429152600001544], [275, 9.735833615995944], [300, 12.339807118987665], [325, 15.295822995016351], [350, 18.94190217999858]]
}, {
"name": "baseline",
"data": [[1, 0.09534351801266894], [11, 0.09232071999576874], [21, 0.10048456001095474], [31, 0.0982192869996652], [41, 0.10659951399429701], [50, 0.10047914599999785], [75, 0.10098609799752012], [100, 0.11007569800131023], [125, 0.11086453800089657], [150, 0.12483492901083082], [175, 0.12282644398510456], [200, 0.1368741389887873], [225, 0.1399976889952086], [250, 0.14830340398475528], [275, 0.1735205659933854], [300, 0.18462765400181524], [325, 0.20261039299657568], [350, 0.2079498780076392], [375, 0.2330766140075866], [400, 0.2435350499872584], [425, 0.2761935309972614], [450, 0.28044877099455334], [475, 0.2974146460182965], [500, 0.34834816100192256], [525, 0.35303976098657586], [550, 0.3978775860159658], [575, 0.4169384289998561], [600, 0.45173969701863825], [625, 0.47565915700397454], [650, 0.5098676639900077], [675, 0.5550524260033853], [700, 0.5470710580120794], [725, 0.5713922709983308], [750, 0.5942362000059802], [775, 0.6306585159909446], [800, 0.7168841210077517], [825, 0.7222614150086883], [850, 0.7455230760097038], [875, 0.7977525219903328], [900, 0.8829816509969532], [925, 0.9142037700221408], [950, 0.9537750399904326], [975, 0.9351416190038435], [1000, 1.0214263429807033], [1025, 1.1319507640146185], [1050, 1.1235301730048377], [1075, 1.2404336280014832], [1100, 1.3133715709846001], [1125, 1.3291781340085436], [1150, 1.4689148160105105], [1175, 1.4539184950117487], [1200, 1.6047468119941186], [1225, 1.5771693959832191], [1250, 1.5649027740000747], [1275, 1.607890668004984], [1300, 1.735658618999878], [1325, 1.8082804939767811], [1350, 1.8304610850173049], [1375, 1.8633078320126515], [1400, 2.0342835540068336], [1425, 2.01016613500542], [1450, 2.1453720389981754], [1475, 2.1390185649797786], [1500, 2.2633344949863385], [1525, 2.418157876993064], [1550, 2.588575252011651], [1575, 2.728629636025289], [1600, 2.5235889900068287], [1625, 2.675839487987105], [1650, 2.6967918330046814], [1675, 2.900896061997628], [1700, 2.943891672999598], [1725, 3.0031289469916373], [1750, 3.02798648798489], [1775, 3.1218033979821485], [1800, 3.1856411909975577], [1825, 3.3108235569961835], [1850, 3.4588255060079973], [1875, 3.532556129997829], [1900, 3.566291323979385], [1925, 3.972061591979582], [1950, 4.007646832003957], [1975, 3.9374160630104598], [2000, 3.7607903350144625], [3000, 8.245222013996681], [4000, 14.593654008989688]]
}
]
}
First, we can see that our naive recursive approach is really a no-no. Just
for kicks, I decided to also benchmark std::tuple_element
, which seems to
be using a naive approach, because it blows up quite quickly. Then, perhaps
surprisingly, the void*
trick is also awful. This probably has to do with
template argument deduction for functions of many arguments being a very
expensive process. Hence, it seems like being clever does not always pay!
Then, there’s the multiple inheritance trick and the __nth_element
builtin.
Frankly, I was disappointed. I expected the builtin __nth_element
to have
basically zero cost and to beat the hell out of every other technique. Being
frustrated, I profiled Clang and it turns out that it does have nearly zero
cost, but the problem is that our benchmark produces very long lists of
template arguments, and these seem to be quite expensive to parse for some
reason. Hence, I designed a second benchmark with smaller parameter packs,
which might also represent more accurately real-world use cases. Indeed, most
applications should manipulate decently short parameter packs, but many of
them. Here’s the second benchmark:
template < int >
struct x ;
using T0 = nth_element < 0 % 10 , x < 0 > , x < 0 > , ..., x < 0 >> ;
^^^^^^^^^^^^^^^^^^^^^ 10 elements total
using T1 = nth_element < 1 % 10 , x < 1 > , x < 1 > , ..., x < 1 >> ;
^^^^^^^^^^^^^^^^^^^^^ 10 elements total
...
using Tn = nth_element < n % 10 , x < n > , x < n > , ..., x < n >> ;
^^^^^^^^^^^^^^^^^^^^^ 10 elements total
Now, instead of creating huge parameter packs, we’re creating 10-element
parameter packs, but many of them. For each such parameter pack, we fetch
a single “randomly selected” element inside of it (assuming a uniform
distribution of the indices). Here’s the result:
{
"title": {
"text": "Compilation time for accessing a parameter pack (take 2)"
},
"xAxis": {
"title": { "text": "Number of parameter packs" }
},
"series": [
{
"name": "recursion",
"data": [[1, 0.09546216399758123], [11, 0.10053210199112073], [21, 0.10203463799552992], [31, 0.09694144999957643], [41, 0.10242460202425718], [50, 0.10128581000026315], [75, 0.11142494800151326], [100, 0.11320021498249844], [125, 0.11604463201365434], [150, 0.12825809099012986], [175, 0.13458143200841732], [200, 0.1334242750017438], [225, 0.1417607549810782], [250, 0.1425978770130314], [275, 0.15075135600636713], [300, 0.15558661299291998], [325, 0.16686130198650062], [350, 0.16883271798724309], [375, 0.1748867779970169], [400, 0.18178407798404805], [425, 0.18084629299119115], [450, 0.19229805198847316], [475, 0.20241704399813898], [500, 0.20735692200833], [525, 0.2159531649958808], [550, 0.21777220000512898], [575, 0.22475279701757245], [600, 0.24088511799345724], [625, 0.2382344029902015], [650, 0.23422862100414932], [675, 0.24891721797757782], [700, 0.24859152399585582], [725, 0.252897924015997], [750, 0.26519585799542256], [775, 0.2739419830031693], [800, 0.2721554650051985], [825, 0.2729388040024787], [850, 0.27653093900880776], [875, 0.29092756297905], [900, 0.30301166500430554], [925, 0.29757942899595946], [950, 0.3102994409855455], [975, 0.3152729130233638], [1000, 0.3156762180151418], [1025, 0.3192651009885594], [1050, 0.3262079489941243], [1075, 0.3284809179895092], [1100, 0.34397911399719305], [1125, 0.3444209290028084], [1150, 0.3498887099849526], [1175, 0.3638946379942354], [1200, 0.3646941340120975], [1225, 0.3727168460027315], [1250, 0.3713586179947015], [1275, 0.37220706898369826], [1300, 0.39239601700683124], [1325, 0.39544566700351425], [1350, 0.3858365759951994], [1375, 0.3985497700050473], [1400, 0.40260483900783584], [1425, 0.4431414139980916], [1450, 0.41728350101038814], [1475, 0.4295167609816417], [1500, 0.44971138102118857], [1525, 0.4440774260147009], [1550, 0.45846742700086907], [1575, 0.4521048960159533], [1600, 0.4538642060069833], [1625, 0.4660802519938443], [1650, 0.4696536709961947], [1675, 0.47891058499226347], [1700, 0.4790992380003445], [1725, 0.49025946599431336], [1750, 0.49801503599155694], [1775, 0.4938179950113408], [1800, 0.5024845529987942], [1825, 0.5170660479925573], [1850, 0.5151772559911478], [1875, 0.5197437989991158], [1900, 0.5247153460222762], [1925, 0.5359846579958685], [1950, 0.5461656560073607], [1975, 0.5425802750105504], [2000, 0.5387226490129251], [3000, 0.7951755810063332], [4000, 1.0108453659922816], [5000, 1.2513339669967536], [6000, 1.4941552609961946], [7000, 1.7305793739797082], [8000, 1.9308063049975317], [9000, 2.2092738460050896], [10000, 2.5425733939919155]]
}, {
"name": "multiple inheritance",
"data": [[1, 0.09151444601593539], [11, 0.09865311699104495], [21, 0.10033098098938353], [31, 0.10533868399215862], [41, 0.11535708099836484], [50, 0.12215074599953368], [75, 0.12984100400353782], [100, 0.14441272200201638], [125, 0.16284042201004922], [150, 0.17365362198324874], [175, 0.19496029199217446], [200, 0.2072609159804415], [225, 0.22125396400224417], [250, 0.23301479101064615], [275, 0.255450507014757], [300, 0.27156443998683244], [325, 0.28353306802455336], [350, 0.29450121300760657], [375, 0.3083658030082006], [400, 0.3301475939806551], [425, 0.35266350000165403], [450, 0.35196093498962], [475, 0.37795999902300537], [500, 0.39584094801102765], [525, 0.40609418600797653], [550, 0.41754621799918823], [575, 0.4468166219885461], [600, 0.4448362110124435], [625, 0.46652591400197707], [650, 0.4980369940167293], [675, 0.5082713239826262], [700, 0.5177251249842811], [725, 0.5287353679887019], [750, 0.5390414630237501], [775, 0.550618115987163], [800, 0.5778990149847232], [825, 0.5952765170077328], [850, 0.6174120159994345], [875, 0.6328760939941276], [900, 0.6459776010015048], [925, 0.6528845750144683], [950, 0.6826316170045175], [975, 0.6853330680169165], [1000, 0.72312860598322], [1025, 0.7334286409895867], [1050, 0.7593278499844018], [1075, 0.7583801440196112], [1100, 0.7827245009830222], [1125, 0.7881580829853192], [1150, 0.7922238949977327], [1175, 0.8226729289744981], [1200, 0.8166877999901772], [1225, 0.8405488420103211], [1250, 0.8650902639783453], [1275, 0.8678461930248886], [1300, 0.8847986570035573], [1325, 0.8977372460067272], [1350, 0.9120421389816329], [1375, 0.9351357370032929], [1400, 0.9367919310170691], [1425, 0.9446767119807191], [1450, 0.9818752180144656], [1475, 1.0052662579983007], [1500, 1.0300840190029703], [1525, 1.0455090399773326], [1550, 1.0464601060084533], [1575, 1.2114405519969296], [1600, 1.142153906985186], [1625, 1.2166247009881772], [1650, 1.1413106289983261], [1675, 1.1214883000066038], [1700, 1.1351152490242384], [1725, 1.1640560600208119], [1750, 1.2056733100034762], [1775, 1.1936281240195967], [1800, 1.2217883720004465], [1825, 1.252211951999925], [1850, 1.2353310039907228], [1875, 1.2762424740067217], [1900, 1.2977470860059839], [1925, 1.3223357270180713], [1950, 1.3227329080109484], [1975, 1.3491440149955451], [2000, 1.3347533339983784], [3000, 1.9974134030053392], [4000, 2.653963026998099], [5000, 3.305021703999955], [6000, 3.872916328982683], [7000, 4.586430655996082], [8000, 5.357805745006772], [9000, 5.955259381007636], [10000, 6.504300118016545]]
}, {
"name": "void* trick",
"data": [[1, 0.09203997100121342], [11, 0.09545609698398039], [21, 0.10256834499887191], [31, 0.09706332799396478], [41, 0.0984092199942097], [50, 0.1045367390033789], [75, 0.1082264110154938], [100, 0.11862418300006539], [125, 0.12041148499702103], [150, 0.13339903502492234], [175, 0.13015233498299494], [200, 0.13923106499714777], [225, 0.1430561679881066], [250, 0.14880602600169368], [275, 0.15454588498687372], [300, 0.161727332015289], [325, 0.16069514499395154], [350, 0.16592281201155856], [375, 0.18007649498758838], [400, 0.1896179059986025], [425, 0.18950157598010264], [450, 0.1900415209820494], [475, 0.20248964399797842], [500, 0.20836186001542956], [525, 0.2073231410176959], [550, 0.21623049801564775], [575, 0.2169176310126204], [600, 0.22627796800225042], [625, 0.22758046301896684], [650, 0.23332810201100074], [675, 0.2511284099891782], [700, 0.25519671200891025], [725, 0.2556150059972424], [750, 0.26649533401359804], [775, 0.2623064749932382], [800, 0.2672479720204137], [825, 0.2795683729927987], [850, 0.29297664598561823], [875, 0.282479641988175], [900, 0.29566582600818947], [925, 0.2995607829943765], [950, 0.30768891199841164], [975, 0.3134861010184977], [1000, 0.31936754402704537], [1025, 0.32432024998706765], [1050, 0.3268379910150543], [1075, 0.3339029019989539], [1100, 0.34309613300138153], [1125, 0.34483412001281977], [1150, 0.39213137701153755], [1175, 0.4069843189790845], [1200, 0.37722004001261666], [1225, 0.3748901489889249], [1250, 0.39090636299806647], [1275, 0.4308397780114319], [1300, 0.4910384530085139], [1325, 0.4382038509938866], [1350, 0.4533662829780951], [1375, 0.441017456992995], [1400, 0.4012415350007359], [1425, 0.41599810600746423], [1450, 0.4302890099934302], [1475, 0.42798446799861267], [1500, 0.44604688198887743], [1525, 0.44274365302408114], [1550, 0.4481841569941025], [1575, 0.44917578902095556], [1600, 0.4430997190065682], [1625, 0.4548824009834789], [1650, 0.47848281398182735], [1675, 0.4693559430015739], [1700, 0.48753689500153996], [1725, 0.48132931999862194], [1750, 0.48320488299941644], [1775, 0.4990676899906248], [1800, 0.4927079319895711], [1825, 0.5046479360025842], [1850, 0.5103697620215826], [1875, 0.5145584550045896], [1900, 0.5434839590161573], [1925, 0.5335127039870713], [1950, 0.5518362639995757], [1975, 0.5405141009832732], [2000, 0.54103193100309], [3000, 0.7652243690099567], [4000, 0.9990616499853786], [5000, 1.2758305509923957], [6000, 1.5197779479785822], [7000, 1.7589197579945903], [8000, 1.9899196999904234], [9000, 2.234814929019194], [10000, 2.513137086003553]]
}, {
"name": "__nth_element",
"data": [[1, 0.09258351699099876], [11, 0.09369529000832699], [21, 0.09137631699559279], [31, 0.0931541160098277], [41, 0.09510698998929001], [50, 0.09406216599745676], [75, 0.09517580500687473], [100, 0.1004086040193215], [125, 0.1024101059883833], [150, 0.10546683601569384], [175, 0.11905966998892836], [200, 0.11666212399723008], [225, 0.1126696839928627], [250, 0.12200452599790879], [275, 0.12620002301991917], [300, 0.1234256629832089], [325, 0.12382141500711441], [350, 0.13037488900590688], [375, 0.13426684399018995], [400, 0.14645328299957328], [425, 0.1350236459984444], [450, 0.13826761400559917], [475, 0.14238037599716336], [500, 0.14378828898770735], [525, 0.14977716200519353], [550, 0.1541026189806871], [575, 0.1540961409918964], [600, 0.16704897399176843], [625, 0.16169460298260674], [650, 0.16521404599188827], [675, 0.16363754399935715], [700, 0.17383311697631143], [725, 0.16963031800696626], [750, 0.17304997600149363], [775, 0.17705260097864084], [800, 0.1819289090053644], [825, 0.17856109401327558], [850, 0.19231098299496807], [875, 0.18416340099065565], [900, 0.20071763801388443], [925, 0.1938421549857594], [950, 0.20347351100645028], [975, 0.20492530800402164], [1000, 0.1992722229915671], [1025, 0.20585699399816804], [1050, 0.22756165801547468], [1075, 0.20841177098918706], [1100, 0.21447809599339962], [1125, 0.2205687320092693], [1150, 0.2234987980045844], [1175, 0.22188781001023017], [1200, 0.23698215099284425], [1225, 0.23091710699372925], [1250, 0.23105728600057773], [1275, 0.2436108459951356], [1300, 0.24563979098456912], [1325, 0.24147150097996928], [1350, 0.24175899801775813], [1375, 0.251738799008308], [1400, 0.24432772901491262], [1425, 0.2462646100029815], [1450, 0.25462219599285163], [1475, 0.2559547359996941], [1500, 0.2572521349939052], [1525, 0.26170854599331506], [1550, 0.2638758099928964], [1575, 0.2730573180015199], [1600, 0.2729876170051284], [1625, 0.27134343498619273], [1650, 0.2728070250013843], [1675, 0.28107758102123626], [1700, 0.2876122359884903], [1725, 0.28621940698940307], [1750, 0.2956657790055033], [1775, 0.2888143279997166], [1800, 0.2981316950172186], [1825, 0.30017611797666177], [1850, 0.30495091501506977], [1875, 0.3050569420156535], [1900, 0.3104755629901774], [1925, 0.30851303998497315], [1950, 0.30941488599637523], [1975, 0.3150175769987982], [2000, 0.31527140497928485], [3000, 0.4386807169939857], [4000, 0.5505664100055583], [5000, 0.6861678110144567], [6000, 0.7839790970028844], [7000, 0.8928194290201645], [8000, 0.9890408939972986], [9000, 1.1216794269857928], [10000, 1.2831098659953568]]
}, {
"name": "std::tuple_element",
"data": [[1, 0.08725655399030074], [11, 0.09195292697404511], [21, 0.10007163099362515], [31, 0.10152014100458473], [41, 0.10763215299812146], [50, 0.10880942500079982], [75, 0.12157031500828452], [100, 0.12863950998871587], [125, 0.14376431799610145], [150, 0.15181195698096417], [175, 0.16103118800674565], [200, 0.16904300698661245], [225, 0.18703976599499583], [250, 0.18900810499326326], [275, 0.19913125899620354], [300, 0.21194772102171555], [325, 0.2334577119909227], [350, 0.23001668401411735], [375, 0.2582478020049166], [400, 0.25202805202570744], [425, 0.26054437801940367], [450, 0.2704305560037028], [475, 0.2925830650201533], [500, 0.29654172199661843], [525, 0.3107448320079129], [550, 0.31896207699901424], [575, 0.3372797560004983], [600, 0.3332043020054698], [625, 0.3564894550072495], [650, 0.361033559020143], [675, 0.37833846602006815], [700, 0.3859302310156636], [725, 0.38701714199851267], [750, 0.40831167399301194], [775, 0.408328239020193], [800, 0.42765604399028234], [825, 0.43115721401409246], [850, 0.4403081700147595], [875, 0.44868303200928494], [900, 0.4689236289996188], [925, 0.47045738899032585], [950, 0.4890698030067142], [975, 0.48397234998992644], [1000, 0.4999616479908582], [1025, 0.5227718350070063], [1050, 0.5298947420087643], [1075, 0.5319647330034059], [1100, 0.5504700639867224], [1125, 0.5616040360182524], [1150, 0.5910105789953377], [1175, 0.5768523699953221], [1200, 0.5810605169972405], [1225, 0.6067086260009091], [1250, 0.6255020879907534], [1275, 0.6255923410062678], [1300, 0.624644163006451], [1325, 0.6390568479837384], [1350, 0.6553193780127913], [1375, 0.6619394760055002], [1400, 0.666027354018297], [1425, 0.6866251940082293], [1450, 0.6937769000069238], [1475, 0.7034092230023816], [1500, 0.7233108189830091], [1525, 0.7196284850069787], [1550, 0.7412698229891248], [1575, 0.7457349679898471], [1600, 0.7799992149812169], [1625, 0.765022695006337], [1650, 0.7808330050029326], [1675, 0.7895931460079737], [1700, 0.7948282559809741], [1725, 0.808631477993913], [1750, 0.8358911699906457], [1775, 0.8437636210001074], [1800, 0.8312495099962689], [1825, 0.8549530629825313], [1850, 0.8706723829964176], [1875, 0.8555138139927294], [1900, 0.8877996869850904], [1925, 0.915654448006535], [1950, 0.9087838460109197], [1975, 0.9177905230026226], [2000, 0.9095878500083927], [3000, 1.3705580249952618], [4000, 1.802314425993245], [5000, 2.1767055650125258], [6000, 2.6772592380002607], [7000, 3.0683914969849866], [8000, 3.515081396995811], [9000, 4.004349105991423], [10000, 4.41421711599105]]
}, {
"name": "baseline",
"data": [[1, 0.08836935798171908], [11, 0.09070015599718317], [21, 0.0925242199737113], [31, 0.09414662802009843], [41, 0.10279214099864475], [50, 0.09896241998649202], [75, 0.09976018301676959], [100, 0.09753267097403295], [125, 0.09922220601583831], [150, 0.10532773801242001], [175, 0.10608820701600052], [200, 0.11065703901113011], [225, 0.1123215830011759], [250, 0.1177014849963598], [275, 0.11565389999304898], [300, 0.1241399860009551], [325, 0.12040434402297251], [350, 0.12256316700950265], [375, 0.12604840600397438], [400, 0.13172699700226076], [425, 0.13368127998546697], [450, 0.13814642900251783], [475, 0.13840877899201587], [500, 0.1437560040212702], [525, 0.14108052098890767], [550, 0.1477437140129041], [575, 0.1456049160042312], [600, 0.15583807602524757], [625, 0.15059926302637905], [650, 0.151307149004424], [675, 0.15665120398625731], [700, 0.16021031499258243], [725, 0.16767066597822122], [750, 0.165985533996718], [775, 0.16411666499334387], [800, 0.16883892999612726], [825, 0.16886239597806707], [850, 0.17774291298701428], [875, 0.17477420999784954], [900, 0.1855626500037033], [925, 0.18780808500014246], [950, 0.18344571598572657], [975, 0.18439895001938567], [1000, 0.18803021701751277], [1025, 0.19807562697678804], [1050, 0.20291807799367234], [1075, 0.1958030410169158], [1100, 0.20309493597596884], [1125, 0.20794074499281123], [1150, 0.20179298901348375], [1175, 0.21718914198572747], [1200, 0.20522874599555507], [1225, 0.21356377998017706], [1250, 0.2164742379973177], [1275, 0.221289778011851], [1300, 0.2231813330145087], [1325, 0.2197718029783573], [1350, 0.2396993339934852], [1375, 0.22822035200078972], [1400, 0.22713286799262278], [1425, 0.2279730679874774], [1450, 0.24019850601325743], [1475, 0.23765384801663458], [1500, 0.23472911099088378], [1525, 0.247682524000993], [1550, 0.24636138099594973], [1575, 0.2607763030100614], [1600, 0.24956377601483837], [1625, 0.2559770910011139], [1650, 0.26713756402023137], [1675, 0.26661788098863326], [1700, 0.2613370750041213], [1725, 0.2636273299867753], [1750, 0.2592260919918772], [1775, 0.26603113499004394], [1800, 0.2699959379970096], [1825, 0.2735434710048139], [1850, 0.280390133993933], [1875, 0.272813618008513], [1900, 0.27897772000869736], [1925, 0.28572729899315163], [1950, 0.2820904229884036], [1975, 0.28896554600214586], [2000, 0.30226761000812985], [3000, 0.411518729000818], [4000, 0.4954906789935194], [5000, 0.5890666850027628], [6000, 0.6803854379977565], [7000, 0.8217915959830862], [8000, 0.9064403439988382], [9000, 0.9965935639920644], [10000, 1.0792520420218352]]
}
]
}
This time, the results are pretty different. The multiple inheritance trick
seems to be doing the worse, which surprised me a lot. However, when you think
of it, we have to build up a new indexer
structure each time the parameter
pack changes. Hence, lookup is fast once we have the indexer
set up, but if
we have many different parameter packs, this technique does not scale terribly
well.
The reason why the recursive approach fares so well in that benchmark, I think,
is because we’re fetching elements at “random” indices. Assuming a uniform
distribution of the indices (which is the case in our benchmark), we’re on
average fetching the 5-th element of the parameter pack. Since the recursive
implementation will bail out as soon as the index is reached, we’re effectively
doing only half the work. Just to confirm this, let’s look at a last benchmark,
in which we’ll do the same as before, but we’ll fetch all the elements of each
parameter pack:
template < int >
struct x ;
using T0_0 = nth_element < 0 , x < 0 > , x < 0 > , ..., x < 0 >> ;
... ^^^^^^^^^^^^^^^^^^^^^ 10 elements total
using T0_9 = nth_element < 9 , x < 0 > , x < 0 > , ..., x < 0 >> ;
using T1_0 = nth_element < 0 , x < 1 > , x < 1 > , ..., x < 1 >> ;
... ^^^^^^^^^^^^^^^^^^^^^ 10 elements total
using T1_9 = nth_element < 9 , x < 1 > , x < 1 > , ..., x < 1 >> ;
...
using Tn_0 = nth_element < 0 , x < n > , x < n > , ..., x < n >> ;
... ^^^^^^^^^^^^^^^^^^^^^ 10 elements total
using Tn_9 = nth_element < 9 , x < n > , x < n > , ..., x < n >> ;
This benchmark is like a mix between the first and the second benchmarks. We’re
now fetching the elements at every index in each parameter pack, but we’re still
considering only decently-sized parameter packs. If everything goes well, we
should see a relative improvement of the multiple inheritance technique,
because its large setup cost should be amortized. Here’s the result:
{
"title": {
"text": "Compilation time for accessing a parameter pack (take 3)"
},
"xAxis": {
"title": { "text": "Number of parameter packs" }
},
"series": [
{
"name": "recursion",
"data": [[1, 0.09247852402040735], [11, 0.11355378301232122], [21, 0.13338755400036462], [31, 0.16322578600374982], [41, 0.19732626000768505], [50, 0.1972720960038714], [75, 0.2623697560047731], [100, 0.3152514419925865], [125, 0.3643700699904002], [150, 0.434101582999574], [175, 0.4869558570208028], [200, 0.544020223984262], [225, 0.5982955270155799], [250, 0.6618624980037566], [275, 0.7156725490058307], [300, 0.7903607559856027], [325, 0.8512078409839887], [350, 0.8812316030089278], [375, 0.9448290120053571], [400, 0.9972827429883182], [425, 1.071594877023017], [450, 1.1052856640017126], [475, 1.170036657975288], [500, 1.2377116909774486], [525, 1.2924690379877575], [550, 1.3294718980032485], [575, 1.444631324004149], [600, 1.4712974010035396], [625, 1.549133834982058], [650, 1.6408855409827083], [675, 1.6455992450064514], [700, 1.704104128992185], [725, 1.7532410640269518], [750, 1.8344721080211457], [775, 1.868957368977135], [800, 1.9780673930072226], [825, 1.9907542649889365], [850, 2.1039565069950186], [875, 2.1065325450035743], [900, 2.1516486979962792], [925, 2.2638184810057282], [950, 2.26328834699234], [975, 2.3912347349978518], [1000, 2.381053201010218], [1025, 2.4194702329987194], [1050, 2.5251537870208267], [1075, 2.596071977983229], [1100, 2.627454307017615], [1125, 2.7678066219959874], [1150, 2.8025187570019625], [1175, 2.806375754997134], [1200, 2.8604497209889814], [1225, 3.0076436329982243], [1250, 3.158441492007114], [1275, 3.1114319049811456], [1300, 3.0909016340156086], [1325, 3.1378083919989876], [1350, 3.349280896014534], [1375, 3.4116867010015994], [1400, 3.4529027859971393], [1425, 3.4588574320077896], [1450, 3.504199271003017], [1475, 3.551439056987874], [1500, 3.546112673007883], [1525, 3.6018413730198517], [1550, 3.7930642070132308], [1575, 3.7182173150067683], [1600, 3.8223208590061404], [1625, 3.9356493199884426], [1650, 4.030348346015671], [1675, 4.199211902014213], [1700, 4.188961676991312], [1725, 4.377914400014561], [1750, 4.423989615985192], [1775, 4.435019900993211], [1800, 4.481998608011054], [1825, 4.429894586006412], [1850, 4.487346948008053], [1875, 4.561044814006891], [1900, 4.707669840980088], [1925, 4.713419749983586], [1950, 4.783669797994662], [1975, 4.81688208997366], [2000, 4.895502117986325], [3000, 7.09816518498701], [4000, 9.461126145004528], [5000, 11.703825340984622], [6000, 14.03645402600523], [7000, 16.346084266988328], [8000, 19.191189885983476]]
}, {
"name": "multiple inheritance",
"data": [[1, 0.09198752100928687], [11, 0.12020572600886226], [21, 0.16084843198768795], [31, 0.1745391979929991], [41, 0.20417138500488363], [50, 0.21939924400066957], [75, 0.30302287300582975], [100, 0.35446803001104854], [125, 0.4134082210075576], [150, 0.48780169599922374], [175, 0.5471654980210587], [200, 0.6154608000069857], [225, 0.7023536860069726], [250, 0.7461642109847162], [275, 0.8052385459886864], [300, 0.8926802729838528], [325, 0.9597103749983944], [350, 1.052633125014836], [375, 1.0952334459871054], [400, 1.1717934230109677], [425, 1.2547796389844734], [450, 1.3024881400051527], [475, 1.3623891719907988], [500, 1.456249859998934], [525, 1.513698638009373], [550, 1.5811683869978879], [575, 1.6570780460024253], [600, 1.6956725579802878], [625, 1.8160427490074653], [650, 1.8558584979909938], [675, 1.9304179849859793], [700, 1.9631099890102632], [725, 2.1051817090192344], [750, 2.0835875690099783], [775, 2.1975078520190436], [800, 2.3185115600063], [825, 2.317430933995638], [850, 2.444908547011437], [875, 2.4391568780119997], [900, 2.4993303589872085], [925, 2.5867401850118767], [950, 2.730908234982053], [975, 2.7366971530136652], [1000, 2.8055373259994667], [1025, 2.9939946260128636], [1050, 2.9838134289893787], [1075, 3.0081931989989243], [1100, 3.043275233008899], [1125, 3.227533967990894], [1150, 3.290739656018559], [1175, 3.3836791339854244], [1200, 3.398272559978068], [1225, 3.455187918996671], [1250, 3.5017083669954445], [1275, 3.5853923080139793], [1300, 3.579832189017907], [1325, 3.736169982003048], [1350, 3.891673890000675], [1375, 4.017194060987094], [1400, 4.04635707399575], [1425, 4.10671942299814], [1450, 4.1767789000005], [1475, 4.259794933983358], [1500, 4.4118497730232775], [1525, 4.436610111995833], [1550, 4.64371065699379], [1575, 4.605047772027319], [1600, 4.627138048002962], [1625, 4.738393618987175], [1650, 4.844859117001761], [1675, 4.837418096984038], [1700, 4.923475474002771], [1725, 4.946855489979498], [1750, 5.066053073009243], [1775, 5.219290932000149], [1800, 5.192452216986567], [1825, 5.322330966009758], [1850, 5.35496404801961], [1875, 5.390155083005084], [1900, 5.475375829002587], [1925, 5.597877419990255], [1950, 5.670635850983672], [1975, 5.6966252679994795], [2000, 5.755978784000035], [3000, 8.27490048200707], [4000, 11.060260159982136], [5000, 13.65958705797675], [6000, 16.716019885003334], [7000, 19.106026647990802]]
}, {
"name": "void* trick",
"data": [[1, 0.1073889720137231], [11, 0.1452771039912477], [21, 0.1431081990012899], [31, 0.1559223980002571], [41, 0.20665713999187574], [50, 0.20084173997747712], [75, 0.2946061349939555], [100, 0.34384088098886423], [125, 0.3898407839878928], [150, 0.4640267709910404], [175, 0.5141664699767716], [200, 0.5636051079782192], [225, 0.6370294090011157], [250, 0.7189938000228722], [275, 0.7910375889914576], [300, 0.8944112170138396], [325, 0.919358467013808], [350, 0.9637669200019445], [375, 0.9541001849866007], [400, 1.0643012590007856], [425, 1.2021882720000576], [450, 1.2190341310051735], [475, 1.3122795779781882], [500, 1.3650111590104643], [525, 1.4797079130075872], [550, 1.5735298939980567], [575, 1.656451204995392], [600, 1.6529475140268914], [625, 1.6743415789969731], [650, 1.6548598159861285], [675, 1.7177728010283317], [700, 1.7791876640112605], [725, 1.8364603800000623], [750, 1.844965553988004], [775, 1.9358663269958925], [800, 1.9837099819851574], [825, 2.0275426029984374], [850, 2.112992905982537], [875, 2.16545048498665], [900, 2.5238939360133372], [925, 2.460455580992857], [950, 2.5159078649885487], [975, 2.4227139790018555], [1000, 2.3851212130102795], [1025, 2.792484058008995], [1050, 2.7354417049791664], [1075, 2.538887786999112], [1100, 2.7332587350101676], [1125, 2.9145966520009097], [1150, 3.2307106019870844], [1175, 3.07687999701011], [1200, 3.0401059629803058], [1225, 3.2314461499918252], [1250, 3.421544067998184], [1275, 3.4948609349958133], [1300, 3.5868837120069657], [1325, 3.377412458998151], [1350, 3.640264046989614], [1375, 3.7059665439883247], [1400, 3.432480806019157], [1425, 3.4418949670216534], [1450, 4.289867592975497], [1475, 3.7609424609981943], [1500, 3.733723666984588], [1525, 4.366200033022324], [1550, 4.364277562999632], [1575, 4.378962991992012], [1600, 4.063558651978383], [1625, 4.141913137980737], [1650, 4.008789872983471], [1675, 4.538566064002225], [1700, 4.517647509987], [1725, 4.592582953977399], [1750, 4.158898051013239], [1775, 4.108714434987633], [1800, 4.554175988014322], [1825, 5.482355123007437], [1850, 4.459770114975981], [1875, 4.579241200990509], [1900, 5.776070221007103], [1925, 4.830430870002601], [1950, 4.675362765003229], [1975, 4.989379442995414], [2000, 5.483278057014104], [3000, 7.71909163199598], [4000, 10.052502998994896], [5000, 12.62802253200789], [6000, 14.700998643995263], [7000, 18.014430240989896], [8000, 19.667670697992435]]
}, {
"name": "__nth_element",
"data": [[1, 0.0950568169937469], [11, 0.10422070001368411], [21, 0.11312327400082722], [31, 0.12100902199745178], [41, 0.13229449599748477], [50, 0.1492392419895623], [75, 0.17108788699260913], [100, 0.19836945601855405], [125, 0.23523157901945524], [150, 0.2775640409963671], [175, 0.3322309989889618], [200, 0.3467939769907389], [225, 0.3997763399966061], [250, 0.41092007598490454], [275, 0.4472865370044019], [300, 0.479327780980384], [325, 0.47323612798936665], [350, 0.5430320870073047], [375, 0.5825483509979676], [400, 0.614084303000709], [425, 0.6547560330072884], [450, 0.6912187470006756], [475, 0.7053063539788127], [500, 0.7163041739840992], [525, 0.7697810089739505], [550, 0.925340269022854], [575, 0.9072284449939616], [600, 0.8678634690004401], [625, 1.1857491510163527], [650, 1.013095684000291], [675, 0.9259142770024482], [700, 1.0036134260008112], [725, 1.003436030994635], [750, 1.0211288550053723], [775, 1.0618203879857901], [800, 1.2634732279984746], [825, 1.2032499659981113], [850, 1.1673175980104133], [875, 1.0960878320038319], [900, 1.1092872690060176], [925, 1.1250440779840574], [950, 1.1758719449862838], [975, 1.2136791050143074], [1000, 1.220605345995864], [1025, 1.3250704110250808], [1050, 1.4525212860025931], [1075, 1.4133804920129478], [1100, 1.378123783011688], [1125, 1.5650337359984405], [1150, 1.5272134989791084], [1175, 1.5577035329770297], [1200, 1.5140516629908234], [1225, 1.5116898390115239], [1250, 1.5028544160013553], [1275, 1.5653873319970444], [1300, 1.577734621008858], [1325, 1.817314192012418], [1350, 1.9845920379739255], [1375, 1.9030265510082245], [1400, 1.9168393790023401], [1425, 1.9240890619985294], [1450, 1.9527441280079074], [1475, 2.1423851709987503], [1500, 1.9973142569942866], [1525, 2.6714377719908953], [1550, 1.913854395999806], [1575, 1.9511215070087928], [1600, 1.911259655986214], [1625, 1.9522429139760789], [1650, 2.330047277006088], [1675, 2.376613529981114], [1700, 2.2836325599928387], [1725, 2.320423130004201], [1750, 2.4753193740034476], [1775, 2.434135713003343], [1800, 2.1415978049917612], [1825, 2.187181327986764], [1850, 2.284417407005094], [1875, 2.246651594003197], [1900, 2.3078603349858895], [1925, 2.3660219680168666], [1950, 2.28750070600654], [1975, 2.4328526470053475], [2000, 2.3650536669883877], [3000, 3.4386173709935974], [4000, 4.763912066002376], [5000, 5.8363779580104165], [6000, 6.881287380005233], [7000, 8.100905020022765], [8000, 9.241638478008099], [9000, 10.273312300996622], [10000, 11.25853025802644]]
}, {
"name": "std::tuple_element",
"data": [[1, 0.09337756698369049], [11, 0.13377408098313026], [21, 0.1732819729950279], [31, 0.21852461900562048], [41, 0.2546429369831458], [50, 0.2823179699771572], [75, 0.38543783500790596], [100, 0.4995455569878686], [125, 0.5855343099974561], [150, 0.6879644419823308], [175, 0.7933025999809615], [200, 0.8977979089831933], [225, 0.9996375079790596], [250, 1.1193739840236958], [275, 1.2000017059908714], [300, 1.296332016994711], [325, 1.4149017610179726], [350, 1.4863329239888117], [375, 1.6474636100174394], [400, 1.7066396370064467], [425, 1.8082447120104916], [450, 1.9330964750261046], [475, 2.0098657220078167], [500, 2.2121536310005467], [525, 2.3517623330117203], [550, 2.318767855002079], [575, 2.493857276014751], [600, 2.519583412009524], [625, 2.6586784579849336], [650, 2.7677744689863175], [675, 2.8834427859983407], [700, 2.916233911004383], [725, 3.0219406409887597], [750, 3.1239475379989017], [775, 3.310127389995614], [800, 3.409046381013468], [825, 3.514461916987784], [850, 3.6096187089860905], [875, 3.626689437980531], [900, 3.701576388993999], [925, 3.897961706999922], [950, 3.989045284979511], [975, 4.113302537996788], [1000, 4.287587864993839], [1025, 4.392406702012522], [1050, 4.509951138985343], [1075, 4.472606594004901], [1100, 4.704200700012734], [1125, 4.809550042991759], [1150, 4.782554339995841], [1175, 4.896435742004542], [1200, 5.149643496988574], [1225, 5.155710003018612], [1250, 5.195539377979003], [1275, 5.533276050002314], [1300, 5.524178879975807], [1325, 5.575433574995259], [1350, 5.621228789008455], [1375, 5.7721650179883], [1400, 5.967793115996756], [1425, 5.9824653239920735], [1450, 6.127393369009951], [1475, 6.305587689013919], [1500, 6.308764015993802], [1525, 6.533352591999574], [1550, 6.59228588201222], [1575, 6.635692954994738], [1600, 6.649674073996721], [1625, 6.89821601199219], [1650, 6.896466217993293], [1675, 7.344635518005816], [1700, 7.681716649996815], [1725, 7.35485692799557], [1750, 7.636738499975763], [1775, 8.669315953011392], [1800, 8.188681711006211], [1825, 7.52730417001294], [1850, 8.024651742016431], [1875, 8.761455725005362], [1900, 8.550652243022341], [1925, 9.099231283005793], [1950, 8.020784931984963], [1975, 8.550188498018542], [2000, 9.964308563998202], [3000, 14.280408783000894], [4000, 17.97387029399397]]
}, {
"name": "baseline",
"data": [[1, 0.0895553519949317], [11, 0.09979549501440488], [21, 0.11330437599099241], [31, 0.13849593402119353], [41, 0.137482207006542], [50, 0.14284568399307318], [75, 0.1773306849936489], [100, 0.1884095050045289], [125, 0.21988379300455563], [150, 0.25168353199842386], [175, 0.2868826750200242], [200, 0.308317688992247], [225, 0.33244993499829434], [250, 0.3624091840174515], [275, 0.39794950000941753], [300, 0.40776465900125913], [325, 0.43727376498281956], [350, 0.4728284180164337], [375, 0.49249671801226214], [400, 0.5058450030046515], [425, 0.5444902209856082], [450, 0.5837795360130258], [475, 0.6214615339995362], [500, 0.671573837986216], [525, 0.6467340749804862], [550, 0.6627447769860737], [575, 0.7218813590006903], [600, 0.7234189859882463], [625, 0.7632153170125093], [650, 0.7856548090057913], [675, 0.7986638139991555], [700, 0.8189744959818199], [725, 0.8462159839982633], [750, 0.8851248699938878], [775, 0.9154056039988063], [800, 0.9406576729961671], [825, 1.0103037280205172], [850, 1.0288816959946416], [875, 1.0288725550053641], [900, 1.088915932981763], [925, 1.0910141869971994], [950, 1.131581638008356], [975, 1.1764616339933127], [1000, 1.1502197259978857], [1025, 1.1954374169872608], [1050, 1.201837553002406], [1075, 1.262557398993522], [1100, 1.2741319700144231], [1125, 1.2638344909937587], [1150, 1.3024873409885913], [1175, 1.3896058530081064], [1200, 1.3849343079782557], [1225, 1.426253588986583], [1250, 1.4527283280040137], [1275, 1.4745519419957418], [1300, 1.51767245400697], [1325, 1.6073829960078], [1350, 1.565623925998807], [1375, 1.6282155700027943], [1400, 1.6590890980151016], [1425, 1.6315327839984093], [1450, 1.6618742069986183], [1475, 1.6905794409976806], [1500, 1.7500342380080838], [1525, 1.7808794279990252], [1550, 1.8134323380072601], [1575, 1.7704940969997551], [1600, 1.8733548439922743], [1625, 2.024379991024034], [1650, 1.9723198350111488], [1675, 2.0633506719896104], [1700, 1.9977447040146217], [1725, 2.053775262000272], [1750, 2.001571528002387], [1775, 2.0927021270035766], [1800, 2.054515050025657], [1825, 2.0970026050054003], [1850, 2.12224082701141], [1875, 2.1247854190005455], [1900, 2.1612690539914183], [1925, 2.271123151993379], [1950, 2.2433295069786254], [1975, 2.262846002995502], [2000, 2.4405463490111288], [3000, 3.2802189700014424], [4000, 4.261982837982941], [5000, 5.434160718985368], [6000, 6.526061890006531], [7000, 7.482144627982052], [8000, 8.456365802005166], [9000, 9.528541000996483], [10000, 10.602903638995485]]
}
]
}
Indeed, the multiple inheritance technique is now faring better, but it still
isn’t as good as we’d like it to.
Conclusion
In this article, we outlined a couple of different library-based implementation
techniques for indexing into parameter packs. I also presented a compiler
builtin which allows us to index parameter packs with very little compile-time
overhead. As the benchmarks show, there does not seem to be a silver bullet as
far as library-based techniques go. However, the compiler builtin consistently
provided superior performance.
I would also like to point out how tricky benchmarking can be, especially
compile-time benchmarking. Indeed, several people (including myself) have
already benchmarked those techniques, and the conclusion had always been that
the multiple inheritance based technique was superior. It turns out that this
superiority depends highly on the exact use case, and in fact it might not be
so superior for real-world use cases. Here, I tried to back myself up as much
as possible, but it is also possible that my benchmarks are lying for one
reason or another, so keep this in mind.
The next step for me will be to propose my patch to Clang, and to propose a
matching patch to libc++ for std::tuple_element
. Of course, you can expect
Boost.Hana to take advantage of this builtin if it makes it into Clang.
I will also try to explore how other fundamental metaprogramming operations
can be sped up using compiler builtins, and report on them. I think there’s
hope.
Stay tuned!
Edit:
The patch for Clang has been proposed here , and the one for
libc++ here .
Edit (30/06/2016):
The patch has been merged into Clang. Thus, Clang will now have a
__type_pack_element<n, T...>
intrinsic equivalent to the
__nth_element
presented in this post.