@tschuyebuhl tschuyebuhl commented Nov 7, 2025

Background

At Chorus One, we recently deployed https://github.com/ChorusOne/nebula on Berachain as a remote signing solution that allows for decoupling of the validator node from the signer keys. It seemed to be running fine, but we noticed that our node was failing to propose a block.

After further investigation, we found that beacon-kit spawns a second privval signer backed by a key file on disk, so the node needs access to the key regardless of whether a remote signer is deployed.
Also, the remote signer did not receive any SignBytes requests, which should be required for building blocks (IIRC it's for randao?).

Proposed solution

After starting CometBFT, inject the privval signer it built into the component provided to depinject. I deployed this on a local devnet: the network is progressing, and the signer now receives the SignBytes requests that are currently absent on mainnet:

[2025-11-07T13:15:20Z INFO  nebula::handler] Received request after 1.7734945s: ShowPublicKey
[2025-11-07T13:15:20Z INFO  nebula::handler] Processing request took: 5.25µs
[2025-11-07T13:15:20Z INFO  nebula::handler] Sending the response took: 132.791µs
[2025-11-07T13:15:20Z INFO  nebula::handler] Received request after 28.20575ms: SignBytes(snip)
[2025-11-07T13:15:20Z INFO  nebula::handler] Processing request took: 3.140291ms
[2025-11-07T13:15:20Z INFO  nebula::handler] Sending the response took: 140.083µs
[2025-11-07T13:15:20Z INFO  nebula::handler] Received request after 28.223625ms: SignBytes(snip)
[2025-11-07T13:15:20Z INFO  nebula::handler] Processing request took: 3.629916ms
[2025-11-07T13:15:20Z INFO  nebula::handler] Sending the response took: 136.458µs
[2025-11-07T13:15:20Z INFO  nebula::handler] Received request after 28.14925ms: SignProposal(h:185, r:0, step:32)
[2025-11-07T13:15:20Z DEBUG nebula::handler] waiting for lock
[2025-11-07T13:15:20Z DEBUG nebula::handler] lock acquired, took: 6.541µs
[2025-11-07T13:15:20Z INFO  nebula::safeguards] checking if proposal should be signed, state: ConsensusData 184/0/2, proposal: 185/0/32
[2025-11-07T13:15:20Z INFO  nebula::cluster] replicating state: ConsensusData 185/0/32, leader_id: 1
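The swap described above can be sketched in isolation. This is a minimal illustration, not the beacon-kit code: the `Signer` interface, `fixedSigner`, and `SwappableSigner` are hypothetical names, and only the `SetPrivValidator` method name is borrowed from the PR. The idea is that components get a stable wrapper at wiring time, and the delegate behind it is replaced once the consensus node has built its own signer.

```go
package main

import (
	"fmt"
	"sync"
)

// Signer is a hypothetical stand-in for the signing interface that
// components depend on; it is not the real beacon-kit type.
type Signer interface {
	SignBytes(msg []byte) ([]byte, error)
}

// fixedSigner is a toy Signer that prefixes messages with its name,
// so the output shows which delegate handled the request.
type fixedSigner struct{ name string }

func (f fixedSigner) SignBytes(msg []byte) ([]byte, error) {
	return []byte(f.name + ":" + string(msg)), nil
}

// SwappableSigner forwards to whichever Signer is currently installed.
type SwappableSigner struct {
	mu       sync.RWMutex
	delegate Signer
}

func NewSwappableSigner(initial Signer) *SwappableSigner {
	return &SwappableSigner{delegate: initial}
}

// SetPrivValidator replaces the delegate. A nil argument is ignored
// so the wrapper always has something to forward to.
func (s *SwappableSigner) SetPrivValidator(next Signer) {
	if next == nil {
		return
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.delegate = next
}

// SignBytes reads the current delegate under the lock, then signs outside it.
func (s *SwappableSigner) SignBytes(msg []byte) ([]byte, error) {
	s.mu.RLock()
	d := s.delegate
	s.mu.RUnlock()
	return d.SignBytes(msg)
}

func main() {
	signer := NewSwappableSigner(fixedSigner{name: "file"})
	before, _ := signer.SignBytes([]byte("msg"))

	// Once the node has started, install the remote-backed signer.
	signer.SetPrivValidator(fixedSigner{name: "remote"})
	after, _ := signer.SignBytes([]byte("msg"))

	fmt.Println(string(before))
	fmt.Println(string(after))
}
```

Because consumers hold the wrapper rather than the concrete signer, nothing has to be rewired after the swap, which is what sidesteps the dependency cycle.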

Test setup

Clone the forked repo and check out the privval-for-signing-v2 branch:

git clone --single-branch --branch privval-for-signing-v2 git@github.com:tschuyebuhl/beacon-kit.git forked-beacon-kit
cd forked-beacon-kit

Start beacon-kit:

make start

In a separate terminal, start reth:

make start-reth

Let it produce a few blocks to make sure it works properly, then stop both nodes.

Update CometBFT to use a remote signer:

sed -i '' 's|^priv_validator_laddr *= *".*"|priv_validator_laddr = "tcp://127.0.0.1:26659"|' .tmp/beacond/config/config.toml
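Note that `sed -i ''` is the BSD/macOS form; GNU sed takes `-i` without the separate empty argument. As a portable alternative, here is a minimal Go sketch of the same substitution, using the same pattern as the sed command. The function name `setPrivValidatorLaddr` is made up for illustration, and the example works on an in-memory string rather than the real config file.

```go
package main

import (
	"fmt"
	"regexp"
)

// laddrLine matches a whole `priv_validator_laddr = "..."` assignment,
// mirroring the pattern used in the sed command.
var laddrLine = regexp.MustCompile(`(?m)^priv_validator_laddr *= *".*"`)

// setPrivValidatorLaddr returns config with the listen address replaced.
// Lines that do not match are left untouched.
func setPrivValidatorLaddr(config, addr string) string {
	replacement := fmt.Sprintf("priv_validator_laddr = %q", addr)
	return laddrLine.ReplaceAllLiteralString(config, replacement)
}

func main() {
	// Stand-in for the contents of config.toml.
	config := "moniker = \"devnet\"\npriv_validator_laddr = \"\"\n"
	fmt.Print(setPrivValidatorLaddr(config, "tcp://127.0.0.1:26659"))
}
```

To apply it to the real file, read it with `os.ReadFile`, pass the contents through the function, and write the result back with `os.WriteFile`.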

Grab the private key; it will be supplied to the remote signer:

jq -Mr '.priv_key.value' .tmp/beacond/config/priv_validator_key.json > path/to/private/key/for/nebula
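If jq is not available, the same field can be read with a few lines of Go. This is a sketch that assumes only what the jq filter implies, namely a JSON object with a `priv_key.value` string. The `type` field and the placeholder value in the sample are made up, and `privKeyValue` is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// privValidatorKey models only the field the jq filter reads.
type privValidatorKey struct {
	PrivKey struct {
		Value string `json:"value"`
	} `json:"priv_key"`
}

// privKeyValue extracts priv_key.value from the raw key-file contents.
func privKeyValue(raw []byte) (string, error) {
	var k privValidatorKey
	if err := json.Unmarshal(raw, &k); err != nil {
		return "", fmt.Errorf("parse key file: %w", err)
	}
	if k.PrivKey.Value == "" {
		return "", errors.New("priv_key.value is missing or empty")
	}
	return k.PrivKey.Value, nil
}

func main() {
	// Placeholder document, not a real key.
	sample := []byte(`{"priv_key": {"type": "example", "value": "PLACEHOLDER"}}`)
	value, err := privKeyValue(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(value)
}
```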

Clone the Nebula repo:

git clone git@github.com:ChorusOne/nebula.git nebula && cd nebula && git checkout v0.0.4-bera-hotfix-2

Use this config file to connect to your local beacon-kit node. Please remember to update the private_key_path:

Nebula config inside
chain_id         = "beacond-2061"
version          = "v1_0"
signing_mode     = "native"
log_level = "info"

[raft]
node_id      = 1
bind_addr    = "127.0.0.1:3232"
data_path   = "state"
initial_state_path = ""

[[raft.peers]]
id = 1
addr = "127.0.0.1:3232"

[[connections]]
host              = "127.0.0.1"
port              = 26659

[signing.native]
key_type = "bls12381"
private_key_path = ""
[signing]
bls_dst = "BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_NUL_"

Start the Nebula signer:

cargo run -- start --config bera-nebula.toml

Then start beacon-kit and reth. Remember not to reset the current state, i.e. enter n when prompted.

make start

In a second terminal:

make start-reth

After some time, the Nebula signer will receive requests to sign bytes, prevotes, precommits, and proposals:

Nebula logs inside
[2025-11-27T16:44:06Z INFO  nebula::cluster] replication successful, propagated state: ConsensusData 9/0/2
[2025-11-27T16:44:06Z INFO  nebula::signer] it's a precommit with a non-nil block ID
[2025-11-27T16:44:06Z INFO  nebula::handler] Processing request took: 841.208µs
[2025-11-27T16:44:06Z INFO  nebula::handler] Sending the response took: 108.375µs
[2025-11-27T16:44:06Z INFO  nebula::handler] Received request after 27.318667ms: ShowPublicKey
[2025-11-27T16:44:06Z INFO  nebula::handler] Processing request took: 73.875µs
[2025-11-27T16:44:06Z INFO  nebula::handler] Sending the response took: 103.792µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Received request after 1.930664417s: ShowPublicKey
[2025-11-27T16:44:08Z INFO  nebula::handler] Processing request took: 76.625µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Sending the response took: 107.459µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Received request after 237.625µs: SignBytes([55, 223, 186, 67, 84, 173, 177, 137, 195, 235, 61, 244, 140, 49, 10, 250, 79, 30, 209, 145, 231, 26, 32, 85, 131, 219, 250, 240, 66, 244, 19, 84])
[2025-11-27T16:44:08Z INFO  nebula::handler] Processing request took: 226.958µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Sending the response took: 103.708µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Received request after 334.958µs: SignBytes([125, 130, 61, 157, 175, 3, 91, 177, 49, 8, 174, 209, 252, 159, 142, 188, 151, 231, 44, 140, 220, 105, 202, 175, 237, 42, 177, 160, 66, 76, 7, 69])
[2025-11-27T16:44:08Z INFO  nebula::handler] Processing request took: 223.25µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Sending the response took: 105.667µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Received request after 5.584292ms: SignProposal(h:10, r:0, step:32)
[2025-11-27T16:44:08Z INFO  nebula::safeguards] checking if proposal should be signed, state: ConsensusData 9/0/2, proposal: 10/0/32
[2025-11-27T16:44:08Z INFO  nebula::cluster] replicating state: ConsensusData 10/0/32, leader_id: 1
[2025-11-27T16:44:08Z INFO  nebula::cluster] applying normal entry received from master: ConsensusData 10/0/32, current node state: ConsensusData 9/0/2, node_id: 1
[2025-11-27T16:44:08Z INFO  nebula::cluster] replication successful, propagated state: ConsensusData 10/0/32
[2025-11-27T16:44:08Z INFO  nebula::handler] Processing request took: 624.541µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Sending the response took: 105.792µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Received request after 16.82525ms: SignVote(h:10, r:0, step:1)
[2025-11-27T16:44:08Z INFO  nebula::safeguards] checking if vote should be signed, state: ConsensusData 10/0/32, vote: 10/0/1
[2025-11-27T16:44:08Z INFO  nebula::cluster] replicating state: ConsensusData 10/0/1, leader_id: 1
[2025-11-27T16:44:08Z INFO  nebula::cluster] applying normal entry received from master: ConsensusData 10/0/1, current node state: ConsensusData 10/0/32, node_id: 1
[2025-11-27T16:44:08Z INFO  nebula::cluster] replication successful, propagated state: ConsensusData 10/0/1
[2025-11-27T16:44:08Z INFO  nebula::signer] no vote ext this time
[2025-11-27T16:44:08Z INFO  nebula::handler] Processing request took: 594.167µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Sending the response took: 110.625µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Received request after 9.660541ms: SignVote(h:10, r:0, step:2)
[2025-11-27T16:44:08Z INFO  nebula::safeguards] checking if vote should be signed, state: ConsensusData 10/0/1, vote: 10/0/2
[2025-11-27T16:44:08Z INFO  nebula::cluster] replicating state: ConsensusData 10/0/2, leader_id: 1
[2025-11-27T16:44:08Z INFO  nebula::cluster] applying normal entry received from master: ConsensusData 10/0/2, current node state: ConsensusData 10/0/1, node_id: 1
[2025-11-27T16:44:08Z INFO  nebula::cluster] replication successful, propagated state: ConsensusData 10/0/2
[2025-11-27T16:44:08Z INFO  nebula::signer] it's a precommit with a non-nil block ID
[2025-11-27T16:44:08Z INFO  nebula::handler] Processing request took: 747.541µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Sending the response took: 98.5µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Received request after 27.319209ms: ShowPublicKey
[2025-11-27T16:44:08Z INFO  nebula::handler] Processing request took: 72µs
[2025-11-27T16:44:08Z INFO  nebula::handler] Sending the response took: 101.25µs

This is the full development setup, and the logs show the expected results. We can now replace the key in the beacon-kit repo with a dummy key and consensus will still move forward. On the current main branch, by contrast, building proposal payloads fails because the BlsSigner always uses the local key.
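The `nebula::safeguards` lines in the logs above compare the last signed height/round/step with each incoming request. The following is a rough sketch of that kind of check, not Nebula's actual implementation. It assumes the step values seen in the logs (32 for proposals, 1 for the first vote, 2 for the vote logged as a precommit) and that within a round a proposal comes before both votes. All names are hypothetical, and a real signer may handle repeated requests differently.

```go
package main

import "fmt"

// ConsensusData mirrors the height/round/step triple printed in the logs.
type ConsensusData struct {
	Height int64
	Round  int64
	Step   uint8 // as logged: 32 = proposal, 1 = prevote, 2 = precommit
}

// stepOrder maps the logged step value to its position within a round.
// The numeric values alone do not sort correctly, since 32 comes first.
func stepOrder(step uint8) (int, bool) {
	switch step {
	case 32:
		return 0, true
	case 1:
		return 1, true
	case 2:
		return 2, true
	}
	return 0, false
}

// ShouldSign reports whether req is strictly ahead of the last signed state.
func ShouldSign(last, req ConsensusData) bool {
	lastOrder, okLast := stepOrder(last.Step)
	reqOrder, okReq := stepOrder(req.Step)
	if !okLast || !okReq {
		return false // unknown step: refuse rather than guess
	}
	if req.Height != last.Height {
		return req.Height > last.Height
	}
	if req.Round != last.Round {
		return req.Round > last.Round
	}
	return reqOrder > lastOrder
}

func main() {
	// Sequence from the logs (9/0/2, then 10/0/32, 10/0/1, 10/0/2),
	// followed by a stale replay of 10/0/1.
	state := ConsensusData{Height: 9, Round: 0, Step: 2}
	requests := []ConsensusData{{10, 0, 32}, {10, 0, 1}, {10, 0, 2}, {10, 0, 1}}
	for _, req := range requests {
		ok := ShouldSign(state, req)
		fmt.Printf("state %d/%d/%d, request %d/%d/%d, sign: %v\n",
			state.Height, state.Round, state.Step,
			req.Height, req.Round, req.Step, ok)
		if ok {
			state = req
		}
	}
}
```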

Other notes

Ideally, we'd just change ProvideBlsSigner to return Comet's privval from the get-go, but that caused circular dependency issues: we'd have to provide the CometBFT service to the Signer, which is currently relatively pure and is used by components that are in turn used by CometBFT, closing the loop.
Maybe there's a more sophisticated way to handle this. If so, please let me know! I'm not very well versed in Cosmos SDK / CometBFT development, so there's a good chance this solution isn't optimal.

@tschuyebuhl tschuyebuhl requested a review from a team as a code owner November 7, 2025 13:38
@fridrik01
Contributor

Hi @tschuyebuhl, I am having difficulty fully testing this change locally. Could you update the PR with a test plan that shows the commands you used to get this working?

@tschuyebuhl
Author

hi @fridrik01! Thanks for the reply 💟

Updated the description with the test setup. There are a few manual steps. I'd be happy to integrate this into e.g. Kurtosis if this approach is something you guys would like to go forward with.

Contributor

@fridrik01 fridrik01 left a comment

Excuse the long review time on this...

A few comments:

  • ProvideBlsSigner still requires local key files to exist at startup, so consider skipping that file check when priv_validator_laddr is configured
  • Some CLI commands (validator-keys, show-validator, show-address) load keys directly from disk, so consider adding something like a --validator-public-key flag for remote signer setups (unless it's possible to query the public key from the remote signer?)
  • Regarding Kurtosis, I don't think we need to look into that for now
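The first point could look roughly like the sketch below. It is not the actual ProvideBlsSigner code: `usesRemoteSigner` and `checkLocalKey` are hypothetical helpers, and it assumes that an empty priv_validator_laddr means no remote signer is configured.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// usesRemoteSigner reports whether a remote signer address is configured.
func usesRemoteSigner(privValidatorLaddr string) bool {
	return strings.TrimSpace(privValidatorLaddr) != ""
}

// checkLocalKey returns an error only when a local key file is actually
// needed (no remote signer configured) and cannot be found.
func checkLocalKey(privValidatorLaddr, keyPath string) error {
	if usesRemoteSigner(privValidatorLaddr) {
		return nil // the remote signer holds the key; skip the file check
	}
	if _, err := os.Stat(keyPath); err != nil {
		return fmt.Errorf("validator key file: %w", err)
	}
	return nil
}

func main() {
	missing := "/nonexistent/priv_validator_key.json"
	fmt.Println("remote signer configured:", checkLocalKey("tcp://127.0.0.1:26659", missing))
	fmt.Println("local signer only:", checkLocalKey("", missing))
}
```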

cc @calbera


// The privval has been started, we can now swap the FilePV
if s.privValConsumer != nil {
	s.privValConsumer.SetPrivValidator(s.node.PrivValidator())
}
Contributor


It would be helpful to log here when we swap the PrivValidator
