Louis Dionne Programming and Categories, Oh My!

Efficient parameter pack indexing

Recently, I’ve been looking at ways to index into parameter packs with as little compile-time overhead as possible. This is not a new problem, and we know of several metaprogramming techniques to achieve this, some of which offer pretty good compile-times. Most of these techniques are also well documented, for example here and here. So, why write an article about this well-covered topic? Well, I recently decided to cheat and modified the compiler to see how fast we could possibly be. This article summarizes what I found.

First technique: recursion

The first implementation technique is the most naive one, and also the one which is unfortunately encountered the most often in the wild. Basically, we recursively walk the parameter pack until we find the type at the given index:

template <std::size_t I, typename T, typename ...Ts>
struct nth_element_impl {
    using type = typename nth_element_impl<I-1, Ts...>::type;
};

template <typename T, typename ...Ts>
struct nth_element_impl<0, T, Ts...> {
    using type = T;
};

template <std::size_t I, typename ...Ts>
using nth_element = typename nth_element_impl<I, Ts...>::type;

This is simple and easy to understand, but it can also be very inefficient in some circumstances. In general, don’t ever do this in your codebase unless you want to extend your coffee breaks.

Second implementation: multiple inheritance

The second implementation technique is much more clever. It consists in using multiple inheritance and template argument deduction to trick the compiler into deducing the n-th element of a parameter pack for us. Here’s how it’s done:

template <std::size_t I, typename T>
struct indexed {
    using type = T;
};

template <typename Is, typename ...Ts>
struct indexer;

template <std::size_t ...Is, typename ...Ts>
struct indexer<std::index_sequence<Is...>, Ts...>
    : indexed<Is, Ts>...
{ };

template <std::size_t I, typename T>
static indexed<I, T> select(indexed<I, T>);

template <std::size_t I, typename ...Ts>
using nth_element = typename decltype(select<I>(
    indexer<std::index_sequence_for<Ts...>, Ts...>{}
))::type;

The code is adapted from this post.

The idea here is that indexer inherits from indexed<0, T0>, indexed<1, T1>, up to indexed<N, TN>. Then, when we call select<I> with a fixed I, the only match is select(indexed<I, TI>). Hence, the compiler deduces the right indexed<I, TI> for us, and all we have to do is fetch the actual TI, which is just indexed<I, TI>::type.

Third implementation: the void* trick

The third implementation technique is probably the most tricky one to understand. The idea is that we generate a function template of the form

template <typename T>
T f(void*, void*, ..., void*, T*, ...);
//  ^^^^^^^^^^^^^^^^^^^^^^^^ n-1 of these

Then, when we call that function with pointer arguments, its return type is always going to be the type pointed to by the n-th argument. Modulo some trickery, you can probably see that we could get the n-th element of an arbitrary parameter pack if we had such a contraption. But can we build such a function for an arbitrary n? Yes, we can:

template <std::size_t n, typename = std::make_index_sequence<n>>
struct nth_element_impl;

template <std::size_t n, std::size_t ...ignore>
struct nth_element_impl<n, std::index_sequence<ignore...>> {
    template <typename Tn>
    static Tn f(decltype((void*)ignore)..., Tn*, ...);
};

Given any n, nth_element_impl<n>::f is the function we’ve been searching for. Now, all that’s left to do is to use f in the right way to extract the n-th element of a parameter pack, which is not too difficult:

template <typename T>
struct wrapper { using type = T; };

template <std::size_t n, typename ...T>
using nth_element = typename decltype(
    nth_element_impl<n>::f(static_cast<wrapper<T>*>(0)...)
)::type;

Here, we use wrapper to avoid creating a pointer to each T, because some types can’t be pointed-to (like references). Then, since f’s return type is the pointee of the n-th argument, the decltype of the call to f is just a wrapper holding the n-th type of the parameter pack. Finally, we just fetch that type in the nested ::type alias. Phew!

Fourth implementation: A compiler builtin!

Now’s the meat of this article. I’ve known about the above techniques for a while, and I was also aware of the approximate compile-time performance of each. However, I couldn’t help but wonder how a compiler intrinsic would compare with these library-based approaches. So I went ahead and implemented that in the Clang compiler, which is surprinsigly easy to hack on.

Basically, I modified the compiler so that when it sees a specialization called __nth_element<n, T...>, it substitutes it for the n-th type of the parameter pack T.... Just as if __nth_element was a template alias doing one of the tricks that we saw above, except the trick is implemented at the compiler-level. Here’s how we can use it:

using Nth = __nth_element<3, int, char, float, double, void>;
static_assert(std::is_same<Nth, double>::value, "");

Simple, huh? Let’s now see how the different techniques compare.

Comparing the techniques

To compare the techniques, we’ll first need to settle on a benchmark that we can put each of the techniques through. For different values of n, we’ll fetch all the elements in a parameter pack of length n+1 (indexing starts at 0):

using T0 = nth_element<0, int, int, ..., int>;
                          ^^^^^^^^^^^^^^^^^^ n+1 elements total

using T1 = nth_element<1, int, int, ..., int>;
...
using Tn = nth_element<n, int, int, ..., int>;

A couple of things are worth noting here. First, we fetch types at all the indices in the parameter pack instead of fetching a single, arbitrary index. From experience, we often need to fetch the types at many (or all) the indices in a parameter pack, so this use case is closest to most real-world use cases. Also, the second and third techniques have a cost to amortize, i.e. the cost of initially building the std::index_sequence. It is therefore more fair to fetch many indices in order to let the amortizing happen. Finally, fetching at many different indices means longer compile-times, which is easier to measure with a decent precision.

Another thing worth justifying is the usage of int as the type in the parameter pack. In fact, the choice that was made is to use a homogeneous parameter pack instead of a heterogeneous parameter pack like

template <int> struct X;

using T0 = nth_element<0, X<0>, X<1>, ..., X<n+1>>;

The reason for doing so is that creating types takes time to the compiler, even when those types are not instantiated (like the X<i>s above). Since we really want to benchmark the algorithm for getting the n-th element of the parameter pack, it is better to use an homogeneous parameter pack. Now, as for any rigorous benchmark, we’ll need a baseline implementation of nth_element:

template <std::size_t n, typename ...Ts>
using nth_element = void;

Of course, this implementation gives the wrong result, but it doesn’t matter since it’s just for benchmarking. Also, since we really want to measure the efficiency of the techniques we presented above and not some constant factors, we’ll #include exactly the same headers for all the benchmarks. In other words, even though the builtin __nth_element requires no additional headers, its benchmark will still include <utility> to be fair with the other techniques that require it. Finally, note that all benchmarks are on Clang trunk, to which my patch is added. Enough waiting, here’s the result:

{ "title": { "text": "Compilation time for accessing a parameter pack" }, "series": [ { "name": "recursion", "data": [[1, 0.08791385299991816], [11, 0.1040869290009141], [21, 0.10760211700107902], [31, 0.12382004098617472], [41, 0.1280566049972549], [50, 0.14985320100095123], [75, 0.22767344501335174], [100, 0.44891875301254913], [125, 0.6296636089973617], [150, 0.8388248109840788], [175, 1.2580508480023127], [200, 1.75584493597853], [225, 2.347315200982848], [250, 3.070959162985673], [275, 3.9935728659911547], [300, 5.04345749298227], [325, 6.411517444008496], [350, 7.570142345997738], [375, 9.334758556011366], [400, 11.364183734986], [425, 13.100518237974029], [450, 15.150793401990086], [475, 18.016470752976602]] }, { "name": "multiple inheritance", "data": [[1, 0.09923712399904616], [11, 0.1010527829930652], [21, 0.09471394898719154], [31, 0.09767348700552247], [41, 0.10303380902041681], [50, 0.11183845900814049], [75, 0.12262416601879522], [100, 0.13599308699485846], [125, 0.1490678550035227], [150, 0.16649107899866067], [175, 0.1929700399923604], [200, 0.24410744599299505], [225, 0.25436611101031303], [250, 0.2929226130072493], [275, 0.33210227798554115], [300, 0.3667355029901955], [325, 0.39734739597770385], [350, 0.44671401599771343], [375, 0.5002171350060962], [400, 0.5487369850161485], [425, 0.6234887770260684], [450, 0.6647616800037213], [475, 0.7309245250071399], [500, 0.7873986509803217], [525, 0.8701764259894844], [550, 0.9605405330075882], [575, 1.0053072549926583], [600, 1.1314099880110007], [625, 1.190395031007938], [650, 1.2666128969867714], [675, 1.3734691129939165], [700, 1.5193456730048638], [725, 1.5714434119872749], [750, 1.8087361850193702], [775, 1.7521372050105128], [800, 2.0012963689805474], [825, 2.524466857983498], [850, 2.1846430029836483], [875, 2.2757716499909293], [900, 2.5220009069889784], [925, 2.8757488050032407], [950, 2.876232857001014], [975, 3.066998501017224], [1000, 3.1187242529995274], [1025, 3.245229969994398], [1050, 3.7422107250022236], [1075, 3.713907941011712], [1100, 3.5173721110040788], [1125, 4.708479123015422], [1150, 5.443987812002888], [1175, 5.246640281024156], [1200, 5.023036516999127], [1225, 4.97608685400337], [1250, 5.379978543001926], [1275, 5.288356656004908], [1300, 6.068725592020201], [1325, 6.058860035991529], [1350, 6.039346348989056], [1375, 6.673429206974106], [1400, 6.59464824697352], [1425, 6.439105322002433], [1450, 7.265887854999164], [1475, 7.10695827100426], [1500, 7.622790062014246], [1525, 7.2972888789954595], [1550, 7.67380925299949], [1575, 7.264323350013001], [1600, 8.271978703996865], [1625, 7.791878398013068], [1650, 9.339439092000248], [1675, 9.755184198991628], [1700, 10.066870843991637], [1725, 10.372837678994983], [1750, 10.309330489981221], [1775, 9.768502388993511], [1800, 9.401901669014478], [1825, 10.241244286007714], [1850, 9.85094714400475], [1875, 10.683431051002117], [1900, 10.967952057020739], [1925, 12.433038909017341], [1950, 12.530010496993782], [1975, 12.624577670998406], [2000, 11.85506974501186]] }, { "name": "void* trick", "data": [[1, 0.10740649900981225], [11, 0.11059235798893496], [21, 0.12323373701656237], [31, 0.12219471001299098], [41, 0.1328271089878399], [50, 0.14716608601156622], [75, 0.19515213900012895], [100, 0.31295377301285043], [125, 0.33367389597697183], [150, 0.5311430190049578], [175, 0.6208896419848315], [200, 0.742176756990375], [225, 0.8497800249897409], [250, 0.9993392080068588], [275, 1.3237761600175872], [300, 1.5975253860233352], [325, 2.0864916009886656], [350, 2.1416680899856146], [375, 2.7329974559834227], [400, 3.1420765760121867], [425, 3.185407273005694], [450, 3.5732148960232735], [475, 4.1299009829817805], [500, 4.569442178006284], [525, 4.898843714006944], [550, 6.166297096002381], [575, 5.742453882005066], [600, 6.27711286599515], [625, 6.7415783570031635], [650, 7.676926622982137], [675, 8.281751938979141], [700, 9.087016478995793], [725, 9.76674667999032], [750, 11.78882013799739], [775, 10.950903388991719], [800, 11.854080150020309], [825, 13.233973357011564], [850, 15.64278328200453], [875, 15.402913261001231], [900, 17.994686827005353], [925, 17.696341240982292], [950, 19.390634357987437]] }, { "name": "__nth_element", "data": [[1, 0.11245947898714803], [11, 0.1085786210023798], [21, 0.10532981299911626], [31, 0.12184113499824889], [41, 0.10490602100617252], [50, 0.11116955100442283], [75, 0.11290367800393142], [100, 0.12389529898064211], [125, 0.1296987759997137], [150, 0.14026946501689963], [175, 0.14848662100848742], [200, 0.16904175697709434], [225, 0.19542713597184047], [250, 0.211196149000898], [275, 0.21766682498855516], [300, 0.2804989920114167], [325, 0.2675095460144803], [350, 0.3391032590006944], [375, 0.32595690799644217], [400, 0.3887825500278268], [425, 0.39993590902304277], [450, 0.4480190910107922], [475, 0.4430583660141565], [500, 0.48500029600108974], [525, 0.5337112010165583], [550, 0.5564779890119098], [575, 0.5852735450025648], [600, 0.6763202070142142], [625, 0.7239769989973865], [650, 0.7201910940057132], [675, 0.776890175009612], [700, 0.7959786260034889], [725, 0.8278058749856427], [750, 0.9787349559774157], [775, 0.991111416980857], [800, 0.9531115650024731], [825, 1.1124548520019744], [850, 1.1440459869918413], [875, 1.1474693639902398], [900, 1.1929215129930526], [925, 1.3329246929788496], [950, 1.320771906990558], [975, 1.400243792013498], [1000, 1.4220198979892302], [1025, 1.4624821589968633], [1050, 1.7917850179946981], [1075, 1.7489220330026], [1100, 1.8088964839989785], [1125, 1.756834443018306], [1150, 1.9329246929846704], [1175, 2.160528142005205], [1200, 2.1670490649994463], [1225, 2.392635665019043], [1250, 2.420430022990331], [1275, 2.360121834994061], [1300, 2.6496885970118456], [1325, 2.4889739899954293], [1350, 2.546395246987231], [1375, 2.7288687319960445], [1400, 2.95728632301325], [1425, 3.1285262529854663], [1450, 3.3501748190028593], [1475, 3.458735363004962], [1500, 3.0745456560107414], [1525, 3.4576181620068382], [1550, 3.3852364490157925], [1575, 3.468985949002672], [1600, 3.739083191991085], [1625, 3.8641978960076813], [1650, 3.942073091981001], [1675, 4.5900268880068325], [1700, 4.179321927018464], [1725, 4.130655788991135], [1750, 4.486190288997022], [1775, 4.261819762003142], [1800, 4.686393445008434], [1825, 4.74440868300735], [1850, 4.688695557997562], [1875, 5.090895264991559], [1900, 4.904464602004737], [1925, 5.021394329989562], [1950, 4.951598395971814], [1975, 5.175259591982467], [2000, 5.444765752996318], [3000, 12.110702848003712]] }, { "name": "std::tuple_element", "data": [[1, 0.09287038102047518], [11, 0.09819490101654083], [21, 0.10967255602008663], [31, 0.12497597699984908], [41, 0.15581403399119154], [50, 0.20737257500877604], [75, 0.4075722369889263], [100, 0.6961136639874894], [125, 1.2459769090055488], [150, 1.9477032349968795], [175, 2.9241388669761363], [200, 4.062343477999093], [225, 5.525791090010898], [250, 7.429152600001544], [275, 9.735833615995944], [300, 12.339807118987665], [325, 15.295822995016351], [350, 18.94190217999858]] }, { "name": "baseline", "data": [[1, 0.09534351801266894], [11, 0.09232071999576874], [21, 0.10048456001095474], [31, 0.0982192869996652], [41, 0.10659951399429701], [50, 0.10047914599999785], [75, 0.10098609799752012], [100, 0.11007569800131023], [125, 0.11086453800089657], [150, 0.12483492901083082], [175, 0.12282644398510456], [200, 0.1368741389887873], [225, 0.1399976889952086], [250, 0.14830340398475528], [275, 0.1735205659933854], [300, 0.18462765400181524], [325, 0.20261039299657568], [350, 0.2079498780076392], [375, 0.2330766140075866], [400, 0.2435350499872584], [425, 0.2761935309972614], [450, 0.28044877099455334], [475, 0.2974146460182965], [500, 0.34834816100192256], [525, 0.35303976098657586], [550, 0.3978775860159658], [575, 0.4169384289998561], [600, 0.45173969701863825], [625, 0.47565915700397454], [650, 0.5098676639900077], [675, 0.5550524260033853], [700, 0.5470710580120794], [725, 0.5713922709983308], [750, 0.5942362000059802], [775, 0.6306585159909446], [800, 0.7168841210077517], [825, 0.7222614150086883], [850, 0.7455230760097038], [875, 0.7977525219903328], [900, 0.8829816509969532], [925, 0.9142037700221408], [950, 0.9537750399904326], [975, 0.9351416190038435], [1000, 1.0214263429807033], [1025, 1.1319507640146185], [1050, 1.1235301730048377], [1075, 1.2404336280014832], [1100, 1.3133715709846001], [1125, 1.3291781340085436], [1150, 1.4689148160105105], [1175, 1.4539184950117487], [1200, 1.6047468119941186], [1225, 1.5771693959832191], [1250, 1.5649027740000747], [1275, 1.607890668004984], [1300, 1.735658618999878], [1325, 1.8082804939767811], [1350, 1.8304610850173049], [1375, 1.8633078320126515], [1400, 2.0342835540068336], [1425, 2.01016613500542], [1450, 2.1453720389981754], [1475, 2.1390185649797786], [1500, 2.2633344949863385], [1525, 2.418157876993064], [1550, 2.588575252011651], [1575, 2.728629636025289], [1600, 2.5235889900068287], [1625, 2.675839487987105], [1650, 2.6967918330046814], [1675, 2.900896061997628], [1700, 2.943891672999598], [1725, 3.0031289469916373], [1750, 3.02798648798489], [1775, 3.1218033979821485], [1800, 3.1856411909975577], [1825, 3.3108235569961835], [1850, 3.4588255060079973], [1875, 3.532556129997829], [1900, 3.566291323979385], [1925, 3.972061591979582], [1950, 4.007646832003957], [1975, 3.9374160630104598], [2000, 3.7607903350144625], [3000, 8.245222013996681], [4000, 14.593654008989688]] } ] }

First, we can see that our naive recursive approach is really a no-no. Just for kicks, I decided to also benchmark std::tuple_element, which seems to be using a naive approach, because it blows up quite quickly. Then, perhaps surprisingly, the void* trick is also awful. This probably has to do with template argument deduction for functions of many arguments being a very expensive process. Hence, it seems like being clever does not always pay!

Then, there’s the multiple inheritance trick and the __nth_element builtin. Frankly, I was disappointed. I expected the builtin __nth_element to have basically zero cost and to beat the hell out of every other technique. Being frustrated, I profiled Clang and it turns out that it does have nearly zero cost, but the problem is that our benchmark produces very long lists of template arguments, and these seem to be quite expensive to parse for some reason. Hence, I designed a second benchmark with smaller parameter packs, which might also represent more accurately real-world use cases. Indeed, most applications should manipulate decently short parameter packs, but many of them. Here’s the second benchmark:

template <int>
struct x;

using T0 = nth_element<0 % 10, x<0>, x<0>, ..., x<0>>;
                               ^^^^^^^^^^^^^^^^^^^^^ 10 elements total

using T1 = nth_element<1 % 10, x<1>, x<1>, ..., x<1>>;
                               ^^^^^^^^^^^^^^^^^^^^^ 10 elements total
...
using Tn = nth_element<n % 10, x<n>, x<n>, ..., x<n>>;
                               ^^^^^^^^^^^^^^^^^^^^^ 10 elements total

Now, instead of creating huge parameter packs, we’re creating 10-element parameter packs, but many of them. For each such parameter pack, we fetch a single “randomly selected” element inside of it (assuming a uniform distribution of the indices). Here’s the result:

{ "title": { "text": "Compilation time for accessing a parameter pack (take 2)" }, "xAxis": { "title": { "text": "Number of parameter packs" } }, "series": [ { "name": "recursion", "data": [[1, 0.09546216399758123], [11, 0.10053210199112073], [21, 0.10203463799552992], [31, 0.09694144999957643], [41, 0.10242460202425718], [50, 0.10128581000026315], [75, 0.11142494800151326], [100, 0.11320021498249844], [125, 0.11604463201365434], [150, 0.12825809099012986], [175, 0.13458143200841732], [200, 0.1334242750017438], [225, 0.1417607549810782], [250, 0.1425978770130314], [275, 0.15075135600636713], [300, 0.15558661299291998], [325, 0.16686130198650062], [350, 0.16883271798724309], [375, 0.1748867779970169], [400, 0.18178407798404805], [425, 0.18084629299119115], [450, 0.19229805198847316], [475, 0.20241704399813898], [500, 0.20735692200833], [525, 0.2159531649958808], [550, 0.21777220000512898], [575, 0.22475279701757245], [600, 0.24088511799345724], [625, 0.2382344029902015], [650, 0.23422862100414932], [675, 0.24891721797757782], [700, 0.24859152399585582], [725, 0.252897924015997], [750, 0.26519585799542256], [775, 0.2739419830031693], [800, 0.2721554650051985], [825, 0.2729388040024787], [850, 0.27653093900880776], [875, 0.29092756297905], [900, 0.30301166500430554], [925, 0.29757942899595946], [950, 0.3102994409855455], [975, 0.3152729130233638], [1000, 0.3156762180151418], [1025, 0.3192651009885594], [1050, 0.3262079489941243], [1075, 0.3284809179895092], [1100, 0.34397911399719305], [1125, 0.3444209290028084], [1150, 0.3498887099849526], [1175, 0.3638946379942354], [1200, 0.3646941340120975], [1225, 0.3727168460027315], [1250, 0.3713586179947015], [1275, 0.37220706898369826], [1300, 0.39239601700683124], [1325, 0.39544566700351425], [1350, 0.3858365759951994], [1375, 0.3985497700050473], [1400, 0.40260483900783584], [1425, 0.4431414139980916], [1450, 0.41728350101038814], [1475, 0.4295167609816417], [1500, 0.44971138102118857], [1525, 0.4440774260147009], [1550, 0.45846742700086907], [1575, 0.4521048960159533], [1600, 0.4538642060069833], [1625, 0.4660802519938443], [1650, 0.4696536709961947], [1675, 0.47891058499226347], [1700, 0.4790992380003445], [1725, 0.49025946599431336], [1750, 0.49801503599155694], [1775, 0.4938179950113408], [1800, 0.5024845529987942], [1825, 0.5170660479925573], [1850, 0.5151772559911478], [1875, 0.5197437989991158], [1900, 0.5247153460222762], [1925, 0.5359846579958685], [1950, 0.5461656560073607], [1975, 0.5425802750105504], [2000, 0.5387226490129251], [3000, 0.7951755810063332], [4000, 1.0108453659922816], [5000, 1.2513339669967536], [6000, 1.4941552609961946], [7000, 1.7305793739797082], [8000, 1.9308063049975317], [9000, 2.2092738460050896], [10000, 2.5425733939919155]] }, { "name": "multiple inheritance", "data": [[1, 0.09151444601593539], [11, 0.09865311699104495], [21, 0.10033098098938353], [31, 0.10533868399215862], [41, 0.11535708099836484], [50, 0.12215074599953368], [75, 0.12984100400353782], [100, 0.14441272200201638], [125, 0.16284042201004922], [150, 0.17365362198324874], [175, 0.19496029199217446], [200, 0.2072609159804415], [225, 0.22125396400224417], [250, 0.23301479101064615], [275, 0.255450507014757], [300, 0.27156443998683244], [325, 0.28353306802455336], [350, 0.29450121300760657], [375, 0.3083658030082006], [400, 0.3301475939806551], [425, 0.35266350000165403], [450, 0.35196093498962], [475, 0.37795999902300537], [500, 0.39584094801102765], [525, 0.40609418600797653], [550, 0.41754621799918823], [575, 0.4468166219885461], [600, 0.4448362110124435], [625, 0.46652591400197707], [650, 0.4980369940167293], [675, 0.5082713239826262], [700, 0.5177251249842811], [725, 0.5287353679887019], [750, 0.5390414630237501], [775, 0.550618115987163], [800, 0.5778990149847232], [825, 0.5952765170077328], [850, 0.6174120159994345], [875, 0.6328760939941276], [900, 0.6459776010015048], [925, 0.6528845750144683], [950, 0.6826316170045175], [975, 0.6853330680169165], [1000, 0.72312860598322], [1025, 0.7334286409895867], [1050, 0.7593278499844018], [1075, 0.7583801440196112], [1100, 0.7827245009830222], [1125, 0.7881580829853192], [1150, 0.7922238949977327], [1175, 0.8226729289744981], [1200, 0.8166877999901772], [1225, 0.8405488420103211], [1250, 0.8650902639783453], [1275, 0.8678461930248886], [1300, 0.8847986570035573], [1325, 0.8977372460067272], [1350, 0.9120421389816329], [1375, 0.9351357370032929], [1400, 0.9367919310170691], [1425, 0.9446767119807191], [1450, 0.9818752180144656], [1475, 1.0052662579983007], [1500, 1.0300840190029703], [1525, 1.0455090399773326], [1550, 1.0464601060084533], [1575, 1.2114405519969296], [1600, 1.142153906985186], [1625, 1.2166247009881772], [1650, 1.1413106289983261], [1675, 1.1214883000066038], [1700, 1.1351152490242384], [1725, 1.1640560600208119], [1750, 1.2056733100034762], [1775, 1.1936281240195967], [1800, 1.2217883720004465], [1825, 1.252211951999925], [1850, 1.2353310039907228], [1875, 1.2762424740067217], [1900, 1.2977470860059839], [1925, 1.3223357270180713], [1950, 1.3227329080109484], [1975, 1.3491440149955451], [2000, 1.3347533339983784], [3000, 1.9974134030053392], [4000, 2.653963026998099], [5000, 3.305021703999955], [6000, 3.872916328982683], [7000, 4.586430655996082], [8000, 5.357805745006772], [9000, 5.955259381007636], [10000, 6.504300118016545]] }, { "name": "void* trick", "data": [[1, 0.09203997100121342], [11, 0.09545609698398039], [21, 0.10256834499887191], [31, 0.09706332799396478], [41, 0.0984092199942097], [50, 0.1045367390033789], [75, 0.1082264110154938], [100, 0.11862418300006539], [125, 0.12041148499702103], [150, 0.13339903502492234], [175, 0.13015233498299494], [200, 0.13923106499714777], [225, 0.1430561679881066], [250, 0.14880602600169368], [275, 0.15454588498687372], [300, 0.161727332015289], [325, 0.16069514499395154], [350, 0.16592281201155856], [375, 0.18007649498758838], [400, 0.1896179059986025], [425, 0.18950157598010264], [450, 0.1900415209820494], [475, 0.20248964399797842], [500, 0.20836186001542956], [525, 0.2073231410176959], [550, 0.21623049801564775], [575, 0.2169176310126204], [600, 0.22627796800225042], [625, 0.22758046301896684], [650, 0.23332810201100074], [675, 0.2511284099891782], [700, 0.25519671200891025], [725, 0.2556150059972424], [750, 0.26649533401359804], [775, 0.2623064749932382], [800, 0.2672479720204137], [825, 0.2795683729927987], [850, 0.29297664598561823], [875, 0.282479641988175], [900, 0.29566582600818947], [925, 0.2995607829943765], [950, 0.30768891199841164], [975, 0.3134861010184977], [1000, 0.31936754402704537], [1025, 0.32432024998706765], [1050, 0.3268379910150543], [1075, 0.3339029019989539], [1100, 0.34309613300138153], [1125, 0.34483412001281977], [1150, 0.39213137701153755], [1175, 0.4069843189790845], [1200, 0.37722004001261666], [1225, 0.3748901489889249], [1250, 0.39090636299806647], [1275, 0.4308397780114319], [1300, 0.4910384530085139], [1325, 0.4382038509938866], [1350, 0.4533662829780951], [1375, 0.441017456992995], [1400, 0.4012415350007359], [1425, 0.41599810600746423], [1450, 0.4302890099934302], [1475, 0.42798446799861267], [1500, 0.44604688198887743], [1525, 0.44274365302408114], [1550, 0.4481841569941025], [1575, 0.44917578902095556], [1600, 0.4430997190065682], [1625, 0.4548824009834789], [1650, 0.47848281398182735], [1675, 0.4693559430015739], [1700, 0.48753689500153996], [1725, 0.48132931999862194], [1750, 0.48320488299941644], [1775, 0.4990676899906248], [1800, 0.4927079319895711], [1825, 0.5046479360025842], [1850, 0.5103697620215826], [1875, 0.5145584550045896], [1900, 0.5434839590161573], [1925, 0.5335127039870713], [1950, 0.5518362639995757], [1975, 0.5405141009832732], [2000, 0.54103193100309], [3000, 0.7652243690099567], [4000, 0.9990616499853786], [5000, 1.2758305509923957], [6000, 1.5197779479785822], [7000, 1.7589197579945903], [8000, 1.9899196999904234], [9000, 2.234814929019194], [10000, 2.513137086003553]] }, { "name": "__nth_element", "data": [[1, 0.09258351699099876], [11, 0.09369529000832699], [21, 0.09137631699559279], [31, 0.0931541160098277], [41, 0.09510698998929001], [50, 0.09406216599745676], [75, 0.09517580500687473], [100, 0.1004086040193215], [125, 0.1024101059883833], [150, 0.10546683601569384], [175, 0.11905966998892836], [200, 0.11666212399723008], [225, 0.1126696839928627], [250, 0.12200452599790879], [275, 0.12620002301991917], [300, 0.1234256629832089], [325, 0.12382141500711441], [350, 0.13037488900590688], [375, 0.13426684399018995], [400, 0.14645328299957328], [425, 0.1350236459984444], [450, 0.13826761400559917], [475, 0.14238037599716336], [500, 0.14378828898770735], [525, 0.14977716200519353], [550, 0.1541026189806871], [575, 0.1540961409918964], [600, 0.16704897399176843], [625, 0.16169460298260674], [650, 0.16521404599188827], [675, 0.16363754399935715], [700, 0.17383311697631143], [725, 0.16963031800696626], [750, 0.17304997600149363], [775, 0.17705260097864084], [800, 0.1819289090053644], [825, 0.17856109401327558], [850, 0.19231098299496807], [875, 0.18416340099065565], [900, 0.20071763801388443], [925, 0.1938421549857594], [950, 0.20347351100645028], [975, 0.20492530800402164], [1000, 0.1992722229915671], [1025, 0.20585699399816804], [1050, 0.22756165801547468], [1075, 0.20841177098918706], [1100, 0.21447809599339962], [1125, 0.2205687320092693], [1150, 0.2234987980045844], [1175, 0.22188781001023017], [1200, 0.23698215099284425], [1225, 0.23091710699372925], [1250, 0.23105728600057773], [1275, 0.2436108459951356], [1300, 0.24563979098456912], [1325, 0.24147150097996928], [1350, 0.24175899801775813], [1375, 0.251738799008308], [1400, 0.24432772901491262], [1425, 0.2462646100029815], [1450, 0.25462219599285163], [1475, 0.2559547359996941], [1500, 0.2572521349939052], [1525, 0.26170854599331506], [1550, 0.2638758099928964], [1575, 0.2730573180015199], [1600, 0.2729876170051284], [1625, 0.27134343498619273], [1650, 0.2728070250013843], [1675, 0.28107758102123626], [1700, 0.2876122359884903], [1725, 0.28621940698940307], [1750, 0.2956657790055033], [1775, 0.2888143279997166], [1800, 0.2981316950172186], [1825, 0.30017611797666177], [1850, 0.30495091501506977], [1875, 0.3050569420156535], [1900, 0.3104755629901774], [1925, 0.30851303998497315], [1950, 0.30941488599637523], [1975, 0.3150175769987982], [2000, 0.31527140497928485], [3000, 0.4386807169939857], [4000, 0.5505664100055583], [5000, 0.6861678110144567], [6000, 0.7839790970028844], [7000, 0.8928194290201645], [8000, 0.9890408939972986], [9000, 1.1216794269857928], [10000, 1.2831098659953568]] }, { "name": "std::tuple_element", "data": [[1, 0.08725655399030074], [11, 0.09195292697404511], [21, 0.10007163099362515], [31, 0.10152014100458473], [41, 0.10763215299812146], [50, 0.10880942500079982], [75, 0.12157031500828452], [100, 0.12863950998871587], [125, 0.14376431799610145], [150, 0.15181195698096417], [175, 0.16103118800674565], [200, 0.16904300698661245], [225, 0.18703976599499583], [250, 0.18900810499326326], [275, 0.19913125899620354], [300, 0.21194772102171555], [325, 0.2334577119909227], [350, 0.23001668401411735], [375, 0.2582478020049166], [400, 0.25202805202570744], [425, 0.26054437801940367], [450, 0.2704305560037028], [475, 0.2925830650201533], [500, 0.29654172199661843], [525, 0.3107448320079129], [550, 0.31896207699901424], [575, 0.3372797560004983], [600, 0.3332043020054698], [625, 0.3564894550072495], [650, 0.361033559020143], [675, 0.37833846602006815], [700, 0.3859302310156636], [725, 0.38701714199851267], [750, 0.40831167399301194], [775, 0.408328239020193], [800, 0.42765604399028234], [825, 0.43115721401409246], [850, 0.4403081700147595], [875, 0.44868303200928494], [900, 0.4689236289996188], [925, 0.47045738899032585], [950, 0.4890698030067142], [975, 0.48397234998992644], [1000, 0.4999616479908582], [1025, 0.5227718350070063], [1050, 0.5298947420087643], [1075, 0.5319647330034059], [1100, 0.5504700639867224], [1125, 0.5616040360182524], [1150, 0.5910105789953377], [1175, 0.5768523699953221], [1200, 0.5810605169972405], [1225, 0.6067086260009091], [1250, 0.6255020879907534], [1275, 0.6255923410062678], [1300, 0.624644163006451], [1325, 0.6390568479837384], [1350, 0.6553193780127913], [1375, 0.6619394760055002], [1400, 0.666027354018297], [1425, 0.6866251940082293], [1450, 0.6937769000069238], [1475, 0.7034092230023816], [1500, 0.7233108189830091], [1525, 0.7196284850069787], [1550, 0.7412698229891248], [1575, 0.7457349679898471], [1600, 0.7799992149812169], [1625, 0.765022695006337], [1650, 0.7808330050029326], [1675, 0.7895931460079737], [1700, 0.7948282559809741], [1725, 0.808631477993913], [1750, 0.8358911699906457], [1775, 0.8437636210001074], [1800, 0.8312495099962689], [1825, 0.8549530629825313], [1850, 0.8706723829964176], [1875, 0.8555138139927294], [1900, 0.8877996869850904], [1925, 0.915654448006535], [1950, 0.9087838460109197], [1975, 0.9177905230026226], [2000, 0.9095878500083927], [3000, 1.3705580249952618], [4000, 1.802314425993245], [5000, 2.1767055650125258], [6000, 2.6772592380002607], [7000, 3.0683914969849866], [8000, 3.515081396995811], [9000, 4.004349105991423], [10000, 4.41421711599105]] }, { "name": "baseline", "data": [[1, 0.08836935798171908], [11, 0.09070015599718317], [21, 0.0925242199737113], [31, 0.09414662802009843], [41, 0.10279214099864475], [50, 0.09896241998649202], [75, 0.09976018301676959], [100, 0.09753267097403295], [125, 0.09922220601583831], [150, 0.10532773801242001], [175, 0.10608820701600052], [200, 0.11065703901113011], [225, 0.1123215830011759], [250, 0.1177014849963598], [275, 0.11565389999304898], [300, 0.1241399860009551], [325, 0.12040434402297251], [350, 0.12256316700950265], [375, 0.12604840600397438], [400, 0.13172699700226076], [425, 0.13368127998546697], [450, 0.13814642900251783], [475, 0.13840877899201587], [500, 0.1437560040212702], [525, 0.14108052098890767], [550, 0.1477437140129041], [575, 0.1456049160042312], [600, 0.15583807602524757], [625, 0.15059926302637905], [650, 0.151307149004424], [675, 0.15665120398625731], [700, 0.16021031499258243], [725, 0.16767066597822122], [750, 0.165985533996718], [775, 0.16411666499334387], [800, 0.16883892999612726], [825, 0.16886239597806707], [850, 0.17774291298701428], [875, 0.17477420999784954], [900, 0.1855626500037033], [925, 0.18780808500014246], [950, 0.18344571598572657], [975, 0.18439895001938567], [1000, 0.18803021701751277], [1025, 0.19807562697678804], [1050, 0.20291807799367234], [1075, 0.1958030410169158], [1100, 0.20309493597596884], [1125, 0.20794074499281123], [1150, 0.20179298901348375], [1175, 0.21718914198572747], [1200, 0.20522874599555507], [1225, 0.21356377998017706], [1250, 0.2164742379973177], [1275, 0.221289778011851], [1300, 0.2231813330145087], [1325, 0.2197718029783573], [1350, 0.2396993339934852], [1375, 0.22822035200078972], [1400, 0.22713286799262278], [1425, 0.2279730679874774], [1450, 0.24019850601325743], [1475, 0.23765384801663458], [1500, 0.23472911099088378], [1525, 0.247682524000993], [1550, 0.24636138099594973], [1575, 0.2607763030100614], [1600, 0.24956377601483837], [1625, 0.2559770910011139], [1650, 0.26713756402023137], [1675, 0.26661788098863326], [1700, 0.2613370750041213], [1725, 0.2636273299867753], [1750, 0.2592260919918772], [1775, 0.26603113499004394], [1800, 0.2699959379970096], [1825, 0.2735434710048139], [1850, 0.280390133993933], [1875, 0.272813618008513], [1900, 0.27897772000869736], [1925, 0.28572729899315163], [1950, 0.2820904229884036], [1975, 0.28896554600214586], [2000, 0.30226761000812985], [3000, 0.411518729000818], [4000, 0.4954906789935194], [5000, 0.5890666850027628], [6000, 0.6803854379977565], [7000, 0.8217915959830862], [8000, 0.9064403439988382], [9000, 0.9965935639920644], [10000, 1.0792520420218352]] } ] }

This time, the results are pretty different. The multiple inheritance trick seems to be doing the worse, which surprised me a lot. However, when you think of it, we have to build up a new indexer structure each time the parameter pack changes. Hence, lookup is fast once we have the indexer set up, but if we have many different parameter packs, this technique does not scale terribly well.

The reason why the recursive approach fares so well in that benchmark, I think, is because we’re fetching elements at “random” indices. Assuming a uniform distribution of the indices (which is the case in our benchmark), we’re on average fetching the 5-th element of the parameter pack. Since the recursive implementation will bail out as soon as the index is reached, we’re effectively doing only half the work. Just to confirm this, let’s look at a last benchmark, in which we’ll do the same as before, but we’ll fetch all the elements of each parameter pack:

template <int>
struct x;

using T0_0 = nth_element<0, x<0>, x<0>, ..., x<0>>;
...                         ^^^^^^^^^^^^^^^^^^^^^ 10 elements total
using T0_9 = nth_element<9, x<0>, x<0>, ..., x<0>>;


using T1_0 = nth_element<0, x<1>, x<1>, ..., x<1>>;
...                         ^^^^^^^^^^^^^^^^^^^^^ 10 elements total
using T1_9 = nth_element<9, x<1>, x<1>, ..., x<1>>;

...

using Tn_0 = nth_element<0, x<n>, x<n>, ..., x<n>>;
...                         ^^^^^^^^^^^^^^^^^^^^^ 10 elements total
using Tn_9 = nth_element<9, x<n>, x<n>, ..., x<n>>;

This benchmark is like a mix between the first and the second benchmarks. We’re now fetching the elements at every index in each parameter pack, but we’re still considering only decently-sized parameter packs. If everything goes well, we should see a relative improvement of the multiple inheritance technique, because its large setup cost should be amortized. Here’s the result:

{ "title": { "text": "Compilation time for accessing a parameter pack (take 3)" }, "xAxis": { "title": { "text": "Number of parameter packs" } }, "series": [ { "name": "recursion", "data": [[1, 0.09247852402040735], [11, 0.11355378301232122], [21, 0.13338755400036462], [31, 0.16322578600374982], [41, 0.19732626000768505], [50, 0.1972720960038714], [75, 0.2623697560047731], [100, 0.3152514419925865], [125, 0.3643700699904002], [150, 0.434101582999574], [175, 0.4869558570208028], [200, 0.544020223984262], [225, 0.5982955270155799], [250, 0.6618624980037566], [275, 0.7156725490058307], [300, 0.7903607559856027], [325, 0.8512078409839887], [350, 0.8812316030089278], [375, 0.9448290120053571], [400, 0.9972827429883182], [425, 1.071594877023017], [450, 1.1052856640017126], [475, 1.170036657975288], [500, 1.2377116909774486], [525, 1.2924690379877575], [550, 1.3294718980032485], [575, 1.444631324004149], [600, 1.4712974010035396], [625, 1.549133834982058], [650, 1.6408855409827083], [675, 1.6455992450064514], [700, 1.704104128992185], [725, 1.7532410640269518], [750, 1.8344721080211457], [775, 1.868957368977135], [800, 1.9780673930072226], [825, 1.9907542649889365], [850, 2.1039565069950186], [875, 2.1065325450035743], [900, 2.1516486979962792], [925, 2.2638184810057282], [950, 2.26328834699234], [975, 2.3912347349978518], [1000, 2.381053201010218], [1025, 2.4194702329987194], [1050, 2.5251537870208267], [1075, 2.596071977983229], [1100, 2.627454307017615], [1125, 2.7678066219959874], [1150, 2.8025187570019625], [1175, 2.806375754997134], [1200, 2.8604497209889814], [1225, 3.0076436329982243], [1250, 3.158441492007114], [1275, 3.1114319049811456], [1300, 3.0909016340156086], [1325, 3.1378083919989876], [1350, 3.349280896014534], [1375, 3.4116867010015994], [1400, 3.4529027859971393], [1425, 3.4588574320077896], [1450, 3.504199271003017], [1475, 3.551439056987874], [1500, 3.546112673007883], [1525, 3.6018413730198517], [1550, 3.7930642070132308], [1575, 3.7182173150067683], [1600, 3.8223208590061404], [1625, 3.9356493199884426], [1650, 4.030348346015671], [1675, 4.199211902014213], [1700, 4.188961676991312], [1725, 4.377914400014561], [1750, 4.423989615985192], [1775, 4.435019900993211], [1800, 4.481998608011054], [1825, 4.429894586006412], [1850, 4.487346948008053], [1875, 4.561044814006891], [1900, 4.707669840980088], [1925, 4.713419749983586], [1950, 4.783669797994662], [1975, 4.81688208997366], [2000, 4.895502117986325], [3000, 7.09816518498701], [4000, 9.461126145004528], [5000, 11.703825340984622], [6000, 14.03645402600523], [7000, 16.346084266988328], [8000, 19.191189885983476]] }, { "name": "multiple inheritance", "data": [[1, 0.09198752100928687], [11, 0.12020572600886226], [21, 0.16084843198768795], [31, 0.1745391979929991], [41, 0.20417138500488363], [50, 0.21939924400066957], [75, 0.30302287300582975], [100, 0.35446803001104854], [125, 0.4134082210075576], [150, 0.48780169599922374], [175, 0.5471654980210587], [200, 0.6154608000069857], [225, 0.7023536860069726], [250, 0.7461642109847162], [275, 0.8052385459886864], [300, 0.8926802729838528], [325, 0.9597103749983944], [350, 1.052633125014836], [375, 1.0952334459871054], [400, 1.1717934230109677], [425, 1.2547796389844734], [450, 1.3024881400051527], [475, 1.3623891719907988], [500, 1.456249859998934], [525, 1.513698638009373], [550, 1.5811683869978879], [575, 1.6570780460024253], [600, 1.6956725579802878], [625, 1.8160427490074653], [650, 1.8558584979909938], [675, 1.9304179849859793], [700, 1.9631099890102632], [725, 2.1051817090192344], [750, 2.0835875690099783], [775, 2.1975078520190436], [800, 2.3185115600063], [825, 2.317430933995638], [850, 2.444908547011437], [875, 2.4391568780119997], [900, 2.4993303589872085], [925, 2.5867401850118767], [950, 2.730908234982053], [975, 2.7366971530136652], [1000, 2.8055373259994667], [1025, 2.9939946260128636], [1050, 2.9838134289893787], [1075, 3.0081931989989243], [1100, 3.043275233008899], [1125, 3.227533967990894], [1150, 3.290739656018559], [1175, 3.3836791339854244], [1200, 3.398272559978068], [1225, 3.455187918996671], [1250, 3.5017083669954445], [1275, 3.5853923080139793], [1300, 3.579832189017907], [1325, 3.736169982003048], [1350, 3.891673890000675], [1375, 4.017194060987094], [1400, 4.04635707399575], [1425, 4.10671942299814], [1450, 4.1767789000005], [1475, 4.259794933983358], [1500, 4.4118497730232775], [1525, 4.436610111995833], [1550, 4.64371065699379], [1575, 4.605047772027319], [1600, 4.627138048002962], [1625, 4.738393618987175], [1650, 4.844859117001761], [1675, 4.837418096984038], [1700, 4.923475474002771], [1725, 4.946855489979498], [1750, 5.066053073009243], [1775, 5.219290932000149], [1800, 5.192452216986567], [1825, 5.322330966009758], [1850, 5.35496404801961], [1875, 5.390155083005084], [1900, 5.475375829002587], [1925, 5.597877419990255], [1950, 5.670635850983672], [1975, 5.6966252679994795], [2000, 5.755978784000035], [3000, 8.27490048200707], [4000, 11.060260159982136], [5000, 13.65958705797675], [6000, 16.716019885003334], [7000, 19.106026647990802]] }, { "name": "void* trick", "data": [[1, 0.1073889720137231], [11, 0.1452771039912477], [21, 0.1431081990012899], [31, 0.1559223980002571], [41, 0.20665713999187574], [50, 0.20084173997747712], [75, 0.2946061349939555], [100, 0.34384088098886423], [125, 0.3898407839878928], [150, 0.4640267709910404], [175, 0.5141664699767716], [200, 0.5636051079782192], [225, 0.6370294090011157], [250, 0.7189938000228722], [275, 0.7910375889914576], [300, 0.8944112170138396], [325, 0.919358467013808], [350, 0.9637669200019445], [375, 0.9541001849866007], [400, 1.0643012590007856], [425, 1.2021882720000576], [450, 1.2190341310051735], [475, 1.3122795779781882], [500, 1.3650111590104643], [525, 1.4797079130075872], [550, 1.5735298939980567], [575, 1.656451204995392], [600, 1.6529475140268914], [625, 1.6743415789969731], [650, 1.6548598159861285], [675, 1.7177728010283317], [700, 1.7791876640112605], [725, 1.8364603800000623], [750, 1.844965553988004], [775, 1.9358663269958925], [800, 1.9837099819851574], [825, 2.0275426029984374], [850, 2.112992905982537], [875, 2.16545048498665], [900, 2.5238939360133372], [925, 2.460455580992857], [950, 2.5159078649885487], [975, 2.4227139790018555], [1000, 2.3851212130102795], [1025, 2.792484058008995], [1050, 2.7354417049791664], [1075, 2.538887786999112], [1100, 2.7332587350101676], [1125, 2.9145966520009097], [1150, 3.2307106019870844], [1175, 3.07687999701011], [1200, 3.0401059629803058], [1225, 3.2314461499918252], [1250, 3.421544067998184], [1275, 3.4948609349958133], [1300, 3.5868837120069657], [1325, 3.377412458998151], [1350, 3.640264046989614], [1375, 3.7059665439883247], [1400, 3.432480806019157], [1425, 3.4418949670216534], [1450, 4.289867592975497], [1475, 3.7609424609981943], [1500, 3.733723666984588], [1525, 4.366200033022324], [1550, 4.364277562999632], [1575, 4.378962991992012], [1600, 4.063558651978383], [1625, 4.141913137980737], [1650, 4.008789872983471], [1675, 4.538566064002225], [1700, 4.517647509987], [1725, 4.592582953977399], [1750, 4.158898051013239], [1775, 4.108714434987633], [1800, 4.554175988014322], [1825, 5.482355123007437], [1850, 4.459770114975981], [1875, 4.579241200990509], [1900, 5.776070221007103], [1925, 4.830430870002601], [1950, 4.675362765003229], [1975, 4.989379442995414], [2000, 5.483278057014104], [3000, 7.71909163199598], [4000, 10.052502998994896], [5000, 12.62802253200789], [6000, 14.700998643995263], [7000, 18.014430240989896], [8000, 19.667670697992435]] }, { "name": "__nth_element", "data": [[1, 0.0950568169937469], [11, 0.10422070001368411], [21, 0.11312327400082722], [31, 0.12100902199745178], [41, 0.13229449599748477], [50, 0.1492392419895623], [75, 0.17108788699260913], [100, 0.19836945601855405], [125, 0.23523157901945524], [150, 0.2775640409963671], [175, 0.3322309989889618], [200, 0.3467939769907389], [225, 0.3997763399966061], [250, 0.41092007598490454], [275, 0.4472865370044019], [300, 0.479327780980384], [325, 0.47323612798936665], [350, 0.5430320870073047], [375, 0.5825483509979676], [400, 0.614084303000709], [425, 0.6547560330072884], [450, 0.6912187470006756], [475, 0.7053063539788127], [500, 0.7163041739840992], [525, 0.7697810089739505], [550, 0.925340269022854], [575, 0.9072284449939616], [600, 0.8678634690004401], [625, 1.1857491510163527], [650, 1.013095684000291], [675, 0.9259142770024482], [700, 1.0036134260008112], [725, 1.003436030994635], [750, 1.0211288550053723], [775, 1.0618203879857901], [800, 1.2634732279984746], [825, 1.2032499659981113], [850, 1.1673175980104133], [875, 1.0960878320038319], [900, 1.1092872690060176], [925, 1.1250440779840574], [950, 1.1758719449862838], [975, 1.2136791050143074], [1000, 1.220605345995864], [1025, 1.3250704110250808], [1050, 1.4525212860025931], [1075, 1.4133804920129478], [1100, 1.378123783011688], [1125, 1.5650337359984405], [1150, 1.5272134989791084], [1175, 1.5577035329770297], [1200, 1.5140516629908234], [1225, 1.5116898390115239], [1250, 1.5028544160013553], [1275, 1.5653873319970444], [1300, 1.577734621008858], [1325, 1.817314192012418], [1350, 1.9845920379739255], [1375, 1.9030265510082245], [1400, 1.9168393790023401], [1425, 1.9240890619985294], [1450, 1.9527441280079074], [1475, 2.1423851709987503], [1500, 1.9973142569942866], [1525, 2.6714377719908953], [1550, 1.913854395999806], [1575, 1.9511215070087928], [1600, 1.911259655986214], [1625, 1.9522429139760789], [1650, 2.330047277006088], [1675, 2.376613529981114], [1700, 2.2836325599928387], [1725, 2.320423130004201], [1750, 2.4753193740034476], [1775, 2.434135713003343], [1800, 2.1415978049917612], [1825, 2.187181327986764], [1850, 2.284417407005094], [1875, 2.246651594003197], [1900, 2.3078603349858895], [1925, 2.3660219680168666], [1950, 2.28750070600654], [1975, 2.4328526470053475], [2000, 2.3650536669883877], [3000, 3.4386173709935974], [4000, 4.763912066002376], [5000, 5.8363779580104165], [6000, 6.881287380005233], [7000, 8.100905020022765], [8000, 9.241638478008099], [9000, 10.273312300996622], [10000, 11.25853025802644]] }, { "name": "std::tuple_element", "data": [[1, 0.09337756698369049], [11, 0.13377408098313026], [21, 0.1732819729950279], [31, 0.21852461900562048], [41, 0.2546429369831458], [50, 0.2823179699771572], [75, 0.38543783500790596], [100, 0.4995455569878686], [125, 0.5855343099974561], [150, 0.6879644419823308], [175, 0.7933025999809615], [200, 0.8977979089831933], [225, 0.9996375079790596], [250, 1.1193739840236958], [275, 1.2000017059908714], [300, 1.296332016994711], [325, 1.4149017610179726], [350, 1.4863329239888117], [375, 1.6474636100174394], [400, 1.7066396370064467], [425, 1.8082447120104916], [450, 1.9330964750261046], [475, 2.0098657220078167], [500, 2.2121536310005467], [525, 2.3517623330117203], [550, 2.318767855002079], [575, 2.493857276014751], [600, 2.519583412009524], [625, 2.6586784579849336], [650, 2.7677744689863175], [675, 2.8834427859983407], [700, 2.916233911004383], [725, 3.0219406409887597], [750, 3.1239475379989017], [775, 3.310127389995614], [800, 3.409046381013468], [825, 3.514461916987784], [850, 3.6096187089860905], [875, 3.626689437980531], [900, 3.701576388993999], [925, 3.897961706999922], [950, 3.989045284979511], [975, 4.113302537996788], [1000, 4.287587864993839], [1025, 4.392406702012522], [1050, 4.509951138985343], [1075, 4.472606594004901], [1100, 4.704200700012734], [1125, 4.809550042991759], [1150, 4.782554339995841], [1175, 4.896435742004542], [1200, 5.149643496988574], [1225, 5.155710003018612], [1250, 5.195539377979003], [1275, 5.533276050002314], [1300, 5.524178879975807], [1325, 5.575433574995259], [1350, 5.621228789008455], [1375, 5.7721650179883], [1400, 5.967793115996756], [1425, 5.9824653239920735], [1450, 6.127393369009951], [1475, 6.305587689013919], [1500, 6.308764015993802], [1525, 6.533352591999574], [1550, 6.59228588201222], [1575, 6.635692954994738], [1600, 6.649674073996721], [1625, 6.89821601199219], [1650, 6.896466217993293], [1675, 7.344635518005816], [1700, 7.681716649996815], [1725, 7.35485692799557], [1750, 7.636738499975763], [1775, 8.669315953011392], [1800, 8.188681711006211], [1825, 7.52730417001294], [1850, 8.024651742016431], [1875, 8.761455725005362], [1900, 8.550652243022341], [1925, 9.099231283005793], [1950, 8.020784931984963], [1975, 8.550188498018542], [2000, 9.964308563998202], [3000, 14.280408783000894], [4000, 17.97387029399397]] }, { "name": "baseline", "data": [[1, 0.0895553519949317], [11, 0.09979549501440488], [21, 0.11330437599099241], [31, 0.13849593402119353], [41, 0.137482207006542], [50, 0.14284568399307318], [75, 0.1773306849936489], [100, 0.1884095050045289], [125, 0.21988379300455563], [150, 0.25168353199842386], [175, 0.2868826750200242], [200, 0.308317688992247], [225, 0.33244993499829434], [250, 0.3624091840174515], [275, 0.39794950000941753], [300, 0.40776465900125913], [325, 0.43727376498281956], [350, 0.4728284180164337], [375, 0.49249671801226214], [400, 0.5058450030046515], [425, 0.5444902209856082], [450, 0.5837795360130258], [475, 0.6214615339995362], [500, 0.671573837986216], [525, 0.6467340749804862], [550, 0.6627447769860737], [575, 0.7218813590006903], [600, 0.7234189859882463], [625, 0.7632153170125093], [650, 0.7856548090057913], [675, 0.7986638139991555], [700, 0.8189744959818199], [725, 0.8462159839982633], [750, 0.8851248699938878], [775, 0.9154056039988063], [800, 0.9406576729961671], [825, 1.0103037280205172], [850, 1.0288816959946416], [875, 1.0288725550053641], [900, 1.088915932981763], [925, 1.0910141869971994], [950, 1.131581638008356], [975, 1.1764616339933127], [1000, 1.1502197259978857], [1025, 1.1954374169872608], [1050, 1.201837553002406], [1075, 1.262557398993522], [1100, 1.2741319700144231], [1125, 1.2638344909937587], [1150, 1.3024873409885913], [1175, 1.3896058530081064], [1200, 1.3849343079782557], [1225, 1.426253588986583], [1250, 1.4527283280040137], [1275, 1.4745519419957418], [1300, 1.51767245400697], [1325, 1.6073829960078], [1350, 1.565623925998807], [1375, 1.6282155700027943], [1400, 1.6590890980151016], [1425, 1.6315327839984093], [1450, 1.6618742069986183], [1475, 1.6905794409976806], [1500, 1.7500342380080838], [1525, 1.7808794279990252], [1550, 1.8134323380072601], [1575, 1.7704940969997551], [1600, 1.8733548439922743], [1625, 2.024379991024034], [1650, 1.9723198350111488], [1675, 2.0633506719896104], [1700, 1.9977447040146217], [1725, 2.053775262000272], [1750, 2.001571528002387], [1775, 2.0927021270035766], [1800, 2.054515050025657], [1825, 2.0970026050054003], [1850, 2.12224082701141], [1875, 2.1247854190005455], [1900, 2.1612690539914183], [1925, 2.271123151993379], [1950, 2.2433295069786254], [1975, 2.262846002995502], [2000, 2.4405463490111288], [3000, 3.2802189700014424], [4000, 4.261982837982941], [5000, 5.434160718985368], [6000, 6.526061890006531], [7000, 7.482144627982052], [8000, 8.456365802005166], [9000, 9.528541000996483], [10000, 10.602903638995485]] } ] }

Indeed, the multiple inheritance technique is now faring better, but it still isn’t as good as we’d like it to.

Conclusion

In this article, we outlined a couple of different library-based implementation techniques for indexing into parameter packs. I also presented a compiler builtin which allows us to index parameter packs with very little compile-time overhead. As the benchmarks show, there does not seem to be a silver bullet as far as library-based techniques go. However, the compiler builtin consistently provided superior performance.

I would also like to point out how tricky benchmarking can be, especially compile-time benchmarking. Indeed, several people (including myself) have already benchmarked those techniques, and the conclusion had always been that the multiple inheritance based technique was superior. It turns out that this superiority depends highly on the exact use case, and in fact it might not be so superior for real-world use cases. Here, I tried to back myself up as much as possible, but it is also possible that my benchmarks are lying for one reason or another, so keep this in mind.

The next step for me will be to propose my patch to Clang, and to propose a matching patch to libc++ for std::tuple_element. Of course, you can expect Boost.Hana to take advantage of this builtin if it makes it into Clang. I will also try to explore how other fundamental metaprogramming operations can be sped up using compiler builtins, and report on them. I think there’s hope.

Stay tuned!

Edit: The patch for Clang has been proposed here, and the one for libc++ here.

Edit (30/06/2016): The patch has been merged into Clang. Thus, Clang will now have a __type_pack_element<n, T...> intrinsic equivalent to the __nth_element presented in this post.