
Conversation

@yannham (Member) commented Dec 4, 2025

This PR fixes a useless indirection when storing a closure/thunk as a value (or, before that, as a term). We used to handle this case by storing a `Thunk`, which is itself of the form `Rc<RefCell<ThunkData>>`. However, value blocks are already reference-counted, so it doesn't make much sense to store one rc-ed pointer inside another rc-ed pointer for no obvious benefit.

This PR implements the usual layout for thunks, which is to store the mutable thunk data directly in the value block, removing one indirection and one allocation.

There is the question of how to define `Thunk`, now that it's just one specific case of a `NickelValue` or a `ValueBlockRc`. I started by performing operations on `value::ThunkData := RefCell<lazy::ThunkData>` directly, but some operations require not only the content but also the original `Rc` pointer (typically for cloning or for testing physical equality). In the end, I made `Thunk` a smart constructor/transparent wrapper around `NickelValue`, which encodes at the type level that the value is known to be a block of type thunk.
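
To make the change concrete, here is a rough sketch of the old and new layouts in Rust. The type definitions are simplified stand-ins and not the actual Nickel internals; only the general shape (double `Rc` before, inlined `RefCell` plus a transparent wrapper after) is taken from the description above.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Placeholder for the mutable state of a suspended computation.
struct ThunkData {}

// Old layout (simplified): a thunk stored in a value block added a second
// reference-counted allocation and an extra indirection.
struct OldThunk(Rc<RefCell<ThunkData>>);

// New layout (simplified): the mutable thunk data lives directly inside the
// reference-counted value block...
struct NickelValue(Rc<ValueBlock>);

enum ValueBlock {
    Number(f64),
    // ... other kinds of values elided ...
    Thunk(RefCell<ThunkData>),
}

// ...and `Thunk` is a transparent wrapper which records, at the type level,
// that the wrapped value is known to be a thunk block.
#[repr(transparent)]
struct Thunk(NickelValue);

impl Thunk {
    /// Smart constructor: wraps a value only if it actually is a thunk block.
    fn try_from_value(value: NickelValue) -> Option<Thunk> {
        match &*value.0 {
            ValueBlock::Thunk(_) => Some(Thunk(value)),
            _ => None,
        }
    }
}
```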

/// converse isn't true: two thunks can be equal as expressions but have different UIDs.
///
/// In practice, the UID is currently the underlying `Rc` pointer value.
pub fn uid(&self) -> usize {
@yannham (Member, Author)

`uid()` didn't seem to be used anymore.
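
For reference, with the old `Rc<RefCell<ThunkData>>` representation such a UID boils down to the pointer address of the shared allocation. A minimal sketch (not the exact removed code):

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct ThunkData {}
struct Thunk(Rc<RefCell<ThunkData>>);

impl Thunk {
    /// Returns an identifier that is equal for two thunks iff they share the
    /// same underlying allocation (physical equality).
    pub fn uid(&self) -> usize {
        Rc::as_ptr(&self.0) as usize
    }
}
```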

@github-actions (bot) commented Dec 4, 2025

🐰 Bencher Report

Branch: perf/thunks-inline
Testbed: ubuntu-latest

| Benchmark | Latency (µs) |
| --- | --- |
| diagnostics-benches/inputs/goto-perf.ncl | 10,245.00 |
| diagnostics-benches/inputs/large-record-tree.ncl | 191,570.00 |
| diagnostics-benches/inputs/redis-replication-controller.ncl | 306.61 |
| diagnostics-benches/inputs/small-record-tree.ncl | 426.88 |
| fibonacci 10 | 228.37 |
| foldl arrays 50 | 389.59 |
| foldl arrays 500 | 2,859.00 |
| foldr strings 50 | 3,296.70 |
| foldr strings 500 | 30,834.00 |
| generate normal 250 | 23,384.00 |
| generate normal 50 | 1,222.00 |
| generate normal unchecked 1000 | 1,759.70 |
| generate normal unchecked 200 | 362.90 |
| init-diagnostics-benches/inputs/goto-perf.ncl | 55,665.00 |
| init-diagnostics-benches/inputs/large-record-tree.ncl | 208,840.00 |
| init-diagnostics-benches/inputs/redis-replication-controller.ncl | 48,898.00 |
| init-diagnostics-benches/inputs/small-record-tree.ncl | 48,854.00 |
| pidigits 100 | 1,946.00 |
| pipe normal 20 | 608.73 |
| pipe normal 200 | 4,559.70 |
| product 30 | 274.17 |
| requests-benches/inputs/goto-perf.ncl-000 | 3,110.60 |
| requests-benches/inputs/large-record-tree.ncl-000 | 595,160.00 |
| requests-benches/inputs/large-record-tree.ncl-001 | 87.98 |
| scalar 10 | 517.81 |
| sum 30 | 274.82 |

🐰 View full continuous benchmarking report in Bencher

@yannham (Member, Author) commented Dec 4, 2025

This saves a noticeable chunk of memory, but is around 2% slower on the OPL benchmark. I'm investigating a bit. I can see with cachegrind that the instruction cache has different stats (otherwise there seem to be slightly fewer branches, slightly fewer mispredictions, and slightly fewer data accesses/data cache misses), so maybe it's just noise from the changed layout of the binary 🤷‍♂️ Another possibility is that the improvement is cancelled out by our implementation of `Rc` being less optimal than the stdlib's, as clone/drop for `NickelValue` is probably stressed a lot.

@yannham (Member, Author) commented Dec 5, 2025

OK, after a round of optimizing the reference-counting part of `NickelValue` a bit (mostly copying the stdlib), I got the 2% back, and even gained 2% more. I'll push that optimization in a follow-up PR, but let's move forward with this one independently first.
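
For illustration, "mostly copying the stdlib" for the reference counting could look roughly like the sketch below, assuming a header-based value block layout. The names (`Header`, `inc`, `dec`) are hypothetical and not the actual `NickelValue` internals.

```rust
use std::cell::Cell;
use std::process::abort;

// Hypothetical header stored at the start of a value block.
struct Header {
    ref_count: Cell<usize>,
}

impl Header {
    /// Called on clone: bump the count, keeping the hot path to a single
    /// load/add/store. As in the stdlib `Rc`, an (unrealistic) overflow
    /// aborts instead of going through a recoverable error path.
    #[inline]
    fn inc(&self) {
        let count = self.ref_count.get();
        if count == usize::MAX {
            abort();
        }
        self.ref_count.set(count + 1);
    }

    /// Called on drop: decrement the count and tell the caller whether this
    /// was the last reference, i.e. whether the block must be deallocated.
    #[inline]
    fn dec(&self) -> bool {
        let count = self.ref_count.get() - 1;
        self.ref_count.set(count);
        count == 0
    }
}
```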

@yannham yannham marked this pull request as ready for review December 5, 2025 16:19
@yannham yannham requested a review from jneem December 5, 2025 16:19
