THUNKS for Mortals
The topic of thunks is advanced², meaning an advanced concept (thunks) within an already advanced feature (LAMBDA). The purpose of this page is to make thunks in Excel usable by mere mortals. I'm going to try to explain them without using advanced computer science terms (because I don't really understand those anyway).
Why use thunks? Mainly to (1) improve calculation efficiency and/or (2) work around limitations of existing formulas, especially by using arrays of thunks.
When you define a function using LAMBDA, but do not call it, the code within the LAMBDA is not evaluated. It is only evaluated when you explicitly call the function using (). This basic property of the LAMBDA function, combined with the fact that Excel can store function definitions in arrays, is what enables the main reasons to use thunks.
What is a thunk?
Definition: In Excel, a thunk is a zero-argument LAMBDA function that (1) delays the evaluation of an expression and (2) is only evaluated when explicitly called with (). A thunk factory is a function that produces thunks programmatically, commonly written as LAMBDA(x,LAMBDA(x)).
The most basic thunk is:
=LAMBDA("hello")
or more generally
=LAMBDA(expression)
This is a LAMBDA that has no arguments. The body of the LAMBDA is only the expression "hello". The only thing it does is define the expression. It does not evaluate it ... yet.
Just like any other lambda =LAMBDA("hello") by itself will show a #CALC! "error" because it has not yet been called. The #CALC! error in this case simply means Excel is displaying a function value that hasn't been called yet.
You evaluate a thunk by using () when calling it like this:
=LAMBDA("hello")()
Result: "hello"
Important: A thunk is not a value. It is a function that produces a value when called.
We can use LET to (a) name the thunk and then (b) call it like this:
=LET(
mythunk, LAMBDA("hello"),
mythunk()
)
If you are familiar with LAMBDA, you may be wondering "What's the big deal? All we've done so far is point out that a thunk seems like the most useless and pointless version of a LAMBDA there could possibly be!" Well, let's move on to the good stuff.
Lazy Evaluation
Improving calculation efficiency is one of the main reasons for using thunks. Sometimes, efficiency can be improved by preventing unused logic branches from evaluating, or by delaying evaluation of an expression. Lazy isn't always better, but let's talk about situations where this property might help.
Most functions in Excel behave as if they are "Eager" in the sense that every expression is evaluated even when their results may not be used. However, a thunk is "Lazy" because it doesn't run until you call it. Remember that a thunk is a function, and any function you define has this same property: a LAMBDA function is not evaluated when defined - only when called.
A thunk is lazy: it defines an expression that is not evaluated until you explicitly call it with ().
Here is an example where the expression RAND() is defined but never called:
=LET( myThunk, LAMBDA( RAND() ), "Nothing random happens" )
Even though RAND() appears in the formula, it is never evaluated because the function named myThunk is never called.
The simplest useful example I can think of is the SWITCH function. The SWITCH function is "Eager" because the expressions for all of the cases are evaluated, even though only one of the cases is returned.
=LET(
sw,SWITCH("A",
"A", 5,
"B", NOW(),
"C", RANDARRAY(10000,10000),
"default"
),
sw
)
In the above example, even though we only want to retrieve the expression for "A" (which is 5), Excel will also evaluate the expressions NOW() and RANDARRAY(10000,10000).
Using RANDARRAY(10000,10000) is an easy test to see if Excel is evaluating a branch because you get the "Excel ran out of resources" error message when Excel tries to evaluate it. If your computer is more powerful than mine, you may need to increase the size of RANDARRAY until you get an error message. Warning, save your file frequently. If Excel crashes and lose your data, don't say I didn't warn you.
Another way to test if an unused branch is evaluated is to create a volatile VBA-based UDF that outputs a datetime to the debug "Immediate" window" - an idea from Diarmuid Early (originating from Charles Williams)
▼ Show WasEvaluated()
Here's the first cool thing about thunks: With very little complexity, we can "THUNK the SWITCH" so that SWITCH does not evaluate every case.
Yes, we are using THUNK as a verb now.
Definition of THUNK (as a verb): Writing the code so that a function returns thunks instead of values.
The following SWITCH has been THUNKED!
EXAMPLE: A THUNKED SWITCH
=LET(
sw,SWITCH("A",
"A", LAMBDA( 5 ),
"B", LAMBDA( NOW() ),
"C", LAMBDA( RANDARRAY(10000,10000) ),
LAMBDA( "default" )
),
sw()
)
Guess what? This code runs without the "Excel ran out of resources" error because RANDARRAY(10000,10000) is never evaluated!
All I did was change the code within SWITCH so that all the expressions it might return are now thunks. Now, because every so we can consider the variable sw to be a named thunk. To evaluate sw we must call it using sw().
Let's summarize the above example
- Each LAMBDA(expression) is a thunk
- SWITCH() returns one of those thunks
- The value assigned to sw is a thunk
- sw() calls the thunk and triggers evaluation
To test your understanding, what would happen if all of the cases in the SWITCH were thunks, except for the default expression? In other words, what's wrong with this:
EXAMPLE: INCORRECTLY THUNKED SWITCH =LET( sw,SWITCH("Q", "A", LAMBDA( 5 ), "B", LAMBDA( NOW() ), "C", LAMBDA( RANDARRAY(10000,10000) ), "default" ), sw() )
Answer: There is no "Q" case, so SWITCH returns "default" which is not a thunk. This means that for the default case, sw is not a thunk which means that sw() will fail with a #VALUE! error.
Here's what we've learned so far
- A thunk is LAMBDA(expression)
- A thunk is lazy - the expression is not evaluated right away
- A thunk becomes a value only when you call it with ()
- You can assign a thunk to a variable, making that variable a thunk
- If code returns thunks instead of values, we say it has been thunked
- Thunks can delay or avoid evaluation, which might improve efficiency
Lazy ≠ Better
Before we move on, I want to mention that even though by definition a thunk is always lazy, when it comes to efficiency, lazy is not always better. For example, if you plan to call myThunk() many times, and it contains an expensive calculation, it may be faster to evaluate it once and reuse the value instead.
Here is a very inefficient use of a thunk:
=LET( myThunk, LAMBDA( SUM(SEQUENCE(10000)) ), myThunk() + myThunk() + myThunk() )
In this example, you end up evaluating SUM(SEQUENCE(10000)) three times.
Arrays of Arrays
Another primary reason for using thunks is as a work-around for things Excel cannot do normally.
Excel does not support arrays whose elements are arrays of different sizes. When you use VSTACK or HSTACK, the resulting array is always rectangular. But, we can simulate an "array of arrays" by creating an array of thunks.
In Excel, a named array can store number values, text values, and even function values. This means that you can create an array of thunks using VSTACK, MAKEARRAY, MAP, BYROW, BYCOL, etc.
It doesn’t matter how the thunk is defined or what it contains. When an array stores a function (a LAMBDA), Excel stores only the definition of the function, not the result of running it.
This means that each thunk in the array can return an array of a different size, or even a completely different type of value.
The thunk=package analogy: A thunk is a single "thing" from Excel's perspective. A thunk is like a package you can pass around without opening, and the package can contain nearly anything. Calling it with () is how you open the package. An array of thunks is just a structured collection of packages that Excel can store, move, and index without forcing the contents to be created, evaluated, or opened.
Here is a very simple example of an array of thunks:
=LET( th_array, VSTACK( LAMBDA( {1,2,3} ), LAMBDA( {4;5} ), LAMBDA( NOW() ) ), res_th,INDEX(th_array,2,1), res_th() )
This demonstrates:
- th_array is a 3x1 array of thunks, created by using VSTACK to stack 3 thunks
- INDEX(th_array,2,1) returns the function (thunk) in row 2 column 1, still unevaluated
- res_th() calls the thunk, which evaluates the expression and returns {4;5}
- NOW() is never evaluated in this example
For some reason, with an array of thunks, you must be explicit in defining both the row and column in the INDEX function. The shortcut INDEX(th_array,2) will not work ("Excel ran out of resources" error). You would need to use INDEX(th_array,2,1) or @INDEX(th_array,2).
This was just a very basic example of an array of thunks. But, what we have demonstrated here is that by storing zero-argument LAMBDA functions as an "array of thunks," we can store thunks representing arrays of different sizes within a single array, manipulate that array, and evaluate the thunk(s) later.
See Owen Price's article for another example of how an array of thunks can contain arrays of different sizes.
Let's look at example where the efficiency gained from using thunks is based on delayed evaluation.
After writing up this article, I asked ChatGPT to come up with an idea for a menu of reports that are defined using thunks, so that you only need to calculate the chosen one rather than calculate everything before choosing the one you want. Here is the example created first as a thunked SWITCH and second as an array of thunks:
/* Choice in cell B1: dropdown: "Sales", "Inventory", "Exceptions" */
=LET(
choice, B1,
report_thunk,
SWITCH(choice,
"Sales", LAMBDA(SORT(FILTER(SalesTbl, SalesTbl[Amount]<>0), 3, -1)),
"Inventory", LAMBDA(SORTBY(InventoryTbl, InventoryTbl[SKU])),
"Exceptions", LAMBDA(FILTER(LogTbl, LogTbl[Status]="ERROR")),
LAMBDA("UNHANDLED")
),
report_thunk()
)
=LET(
choice, B1,
report_names, {"Sales";"Inventory";"Exceptions"},
report_thunks,
VSTACK(
LAMBDA(SORT(FILTER(SalesTbl, SalesTbl[Amount]<>0), 3, -1)),
LAMBDA(SORTBY(InventoryTbl, InventoryTbl[SKU])),
LAMBDA(FILTER(LogTbl, LogTbl[Status]="ERROR"))
),
INDEX(report_thunks, XMATCH(choice, report_names, 0))()
)
Without thunks, the typical approach would be this:
=LET( sales, SORT(FILTER(...)), inv, SORTBY(...), exc, FILTER(...), SWITCH(choice, "Sales", sales, "Inventory", inv, "Exceptions", exc) )
That computes all three large arrays on every recalc. With thunks, it computes exactly one. This is not a micro-optimization; it is the difference between "usable dashboard" and "Excel lagging / spilling / memory pressure".
Thunks as a Work-Around
If delayed evaluation was all that thunks were good for, they would still be valuable. However, there is another powerful reason to use thunks: to work around the limitations of existing Excel functions!
- Each iteration of BYROW or BYCOL normally only lets you return a scalar value (not an array). However, they can also return a thunk, which you can unpack later.
- In some cases, you may want to use SCAN as a "SCAN by ROW" but that doesn't work and you may have to resort to creating an index and then performing inefficient lookups of the original full data set. However, you can pass a array of thunks into SCAN and then unpack them within.
In some cases, the work around can actually lead to substantial efficiency gains (faster calculations). That leads to our next topic: thunk factories.
Thunk Factories
Up to this point, every thunk we've looked at has been written directly as LAMBDA(expression). But how do you process a bunch of data programmatically, to automatically convert structured data into arrays of thunks?
The answer is another lambda pattern that you may see described as a "thunk" but which I will refer to as a "thunk factory." It is this:
=LAMBDA(x, LAMBDA(x))
In Excel terms, this function evaluates the value passed as x and then returns the thunk LAMBDA(x). You can think of this function as converting x into a thunk, a function that wraps (or packages) x in a thunk, or a function that builds thunks.
A thunk factory is a function that creates thunks by packaging values inside zero-argument LAMBDA functions
Here is a very simple example of converting the value 5 into a thunk and then calling the thunk to return 5.
=LET( makeThunk, LAMBDA(x, LAMBDA(x)), th, makeThunk(5), th() )
This type of thunk factory does not delay evaluation of the expression passed to it. We can test this by using our RANDARRAY(10000,10000) trick.
=LET( makeThunk, LAMBDA(x, LAMBDA(x)), th, makeThunk(RANDARRAY(10000,10000)), 10 )
Here, the result should be 10, but makeThunk still tries to evaluate RANDARRAY(10000,10000) resulting in an "Excel ran out of resources" error message.
So, if a thunk factory does not delay evaluation, why is a thunk factory important? Answer: Instead of manually defining each individual array or function, the thunk factory lets you programmatically create a thunk or an array of thunks.
You can pass that array of thunks into functions like SCAN, or create thunks within functions like BYROW, BYCOL, MAP, or MAKEARRAY that would otherwise not be able to return arrays of arrays.
The example in the article "I THUNK; therefore I SCAN" by Craig Hatmaker is a perfect example of using BYCOL to convert rows of an array into individual thunks that are then passed to SCAN, for the purpose of efficiency gains.
SCAN can take an array as input, but it scans through each value one a time (not one row at a time). That typically leads people to using a SEQUENCE-style index for SCAN and then using INDEX(rate,i) and INDEX(flow,i) within the SCAN function to get each incremental value from the original data arrays.
He shows that you can first use BYCOL and a lambda factory to convert the rate and flow arrays into an array of 2x1 thunks. Then, pass that array of thunks to the SCAN function. The result is that within the SCAN function, each use of INDEX is only doing a lookup of a 2x1 array instead of a lookup of the full original array. And that makes all the difference.
IMPORTANT: The dramatic increase of speed in the SCAN example has nothing to do with delayed evaluation, and everything to do with creating and using an array of thunks to do more efficient lookups! 🤯
I used AI to help me come up with another example of the "array of different sizes" and I like this example because it demonstrates a few interesting concepts.
This example is not about calculation efficiency. It is an example that shows thunks used to store arrays of different sizes.
The following data set represents a table named Orders with a Category and an Amount column.
Category Amount A 10 A 15 B 5 C 8 C 12 C 20
I want an array where each element is an array of the amounts filtered by category. This would be { {10,15}; {5}; {8,12,20} }, corresponding to categories {"A"; "B"; "C"}, and Excel cannot store an array of arrays. However, we CAN create an array of thunks where each thunk is a different FILTER result.
=LET(
cats, UNIQUE(Orders[Category]),
thunks,
BYROW(cats,
LAMBDA(cat,
LAMBDA( FILTER(Orders[Amount], Orders[Category]=cat) )
)
),
thunks
)
We wrapped the FILTER(...) expression with a thunk because we need to return a single thing to BYROW. The purple LAMBDA( ) is our thunk: a zero-argument function.
Can we store the results of the FILTER as a thunk, rather than the FILTER expression itself? Of course, we can easily do that. We just need to make sure that the result is packaged as a thunk so BYROW can handle it.
=LET(
cats, UNIQUE(Orders[Category]),
thunks,
BYROW(cats,
LAMBDA(cat,
LET(f, FILTER(Orders[Amount], Orders[Category]=cat),
LAMBDA( f )
)
)
),
thunks
)
Are we still creating an array of thunks? Yes. Using LET, we defined the result of the filter expression as f, and then we wrapped the value of f as a thunk using the purple LAMBDA.
So far, this example just returns an array of thunks, not yet evaluated. So, in Excel this would display as an array of #CALC! errors.
Our next step is to create a report with multiple outputs, including SUM, AVERAGE, and COUNT for each category.
=LET(
cats, UNIQUE(Orders[Category]),
thunks,
BYROW(cats,
LAMBDA(cat,
LET(f, FILTER(Orders[Amount], Orders[Category]=cat),
LAMBDA( f )
)
)
),
sums, MAP(thunks, LAMBDA(th, SUM(th()) )),
avgs, MAP(thunks, LAMBDA(th, AVERAGE(th()) )),
counts, MAP(thunks, LAMBDA(th, ROWS(th()) )),
VSTACK(
{"Category","SUM","AVERAGE","COUNT"},
HSTACK(cats, sums, avgs, counts)
)
)
To generate the array of sums, we are looping through each one of the thunks using MAP and calculating the SUM by calling th(). Likewise for avgs and counts. This means that we are calling each individual thunk 3 times each. If we had used the earlier version, we could actually be running the FILTER(...) expression 9 times. If FILTER was the slow part of the process, that would hurt efficiency.
This particular example is a poor example for demonstrating or proving efficiency because the data set is so small. But again, this example was used to demonstrate how to work with an array of thunks that consists of arrays of different sizes.
Conclusion
It got kind of crazy there in the end, but that's because the use of thunks is an advanced² topic.
We showed how thunks are lazy, and how we can use that to our advantage. One of the uses for thunks is to avoid evaluation of unused logical branches, or to delay evaluation until you explicity need the result.
We've also seen how thunks can be used to work around some limitations of Excel, allowing us to simulate arrays of arrays by creating arrays of thunks.
Performance gains from thunks are not always about laziness. Sometimes they come from decreasing the cost of lookups by changing how data is packaged and delivered. This distinction becomes critical when working with iterative functions like SCAN, where array lookup cost may dominate performance.
Hopefully, this article was a useful primer on the concept of thunks, making them a bit more understandable.
One final parting thunk:
=LET(th, LAMBDA( LAMBDA( LAMBDA( "cheers") ) ), th()()() )
Comments or Suggestions?
Please email me or message me on LinkedIn.
Acknowledgements
Big thanks to Owen Price, Craig Hatmaker, Peter Bartholomew and Diarmuid Early for their feedback and suggestions!
- A generalised Lambda helper function that return arrays of arrays using bisection. - by Peter Bartholomew. One of the many articles by Dr. Bartholomew that includes thunks.
- I THUNK; thefefore, I SCAN! - by Craig Hatmaker. Describes speed boost from passing an array of small thunks to SCAN instead of using INDEX(large_array,i).
- What is a Thunk in Excel? - by Owen Price at flexyourdata.com. Shows the use of LAMBDA(x,LAMBDA(x)) and applies it to array-based patterns, including SCAN.
- Cell-Level Thunks in Excel Using #VALUE Arrays for Range-Reference Encapsulation - by Fredson Alves Pinho. This article shows that thunk behavior does not require LAMBDA. In my article, I focus on what may be called functional thunks - thunks created using zero-argument LAMBDA functions. Fredson’s article demonstrates what might be described as structural thunks, where delayed evaluation is achieved by how data is stored and referenced in the worksheet rather than by functions.
- Thunkit - by Travis Boulden on mrexcel.com. Is this the first use of LAMBDA(x,LAMBDA(x))?
- Thunk at Wikipedia.com
See Also
Create LAMBDA Functions, Candidates for the Library
