Blob p2p distribution using Reactor API #2938
Conversation
```go
return BlobFetcherConfig{
	CheckInterval: 10 * time.Second,
	RetryInterval: 1 * time.Minute,
	MaxRetries:    10,
```
I wonder if there should be a cap here. What happens at cap? A node gives up on syncing?
If the cap is reached, the blob request is deleted from disk, meaning that blob won't exist on this node. The node always continues syncing regardless, and other blob requests are processed independently. If/when this happens, we record a `blob_fetcher_requests_expired_total{reason="max_retries"}` metric so operators can detect whether it happens frequently. Alternatively, we could keep this uncapped and instead double `RetryInterval` (exponential backoff) on each failure; eventually the request passes the DA window and is cleaned up.
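A minimal sketch of that uncapped alternative, assuming a hypothetical `blobRequest` queue entry (none of these names are the actual beacon-kit code):

```go
package sketch

import "time"

type blobRequest struct {
	Slot      uint64
	Attempts  uint
	NextRetry time.Time
}

const (
	baseRetryInterval = 1 * time.Minute
	maxRetryInterval  = 1 * time.Hour // cap the doubling so intervals stay sane
)

// scheduleNextRetry doubles the wait after every failed fetch: 1m, 2m, 4m, ...
// The request is never dropped for retry count; it is only cleaned up once
// its slot falls outside the DA availability window.
func scheduleNextRetry(req *blobRequest, now time.Time) {
	backoff := baseRetryInterval << req.Attempts
	if backoff <= 0 || backoff > maxRetryInterval { // guard shift overflow
		backoff = maxRetryInterval
	}
	req.Attempts++
	req.NextRetry = now.Add(backoff)
}
```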
```go
bf.ctx, bf.cancel = context.WithCancel(ctx)
go bf.run()
```
unlike `Stop`, this is not idempotent. Should it be?
Probably not; I kept it simple since `depositCatchupFetcher` isn't idempotent either and is called in the same `Service.Start()` method (https://github.com/berachain/beacon-kit/blob/blobreactor/beacon/blockchain/service.go#L113-L116). Happy to add `sync.Once` to both if we think it's worth it.
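If we did want it, a `sync.Once` guard is the minimal change; a rough sketch with simplified fields, not the actual `BlobFetcher` struct:

```go
package sketch

import (
	"context"
	"sync"
)

type BlobFetcher struct {
	startOnce sync.Once
	ctx       context.Context
	cancel    context.CancelFunc
}

// Start launches the background worker. The sync.Once makes a second call
// a no-op instead of spawning a duplicate run loop.
func (bf *BlobFetcher) Start(ctx context.Context) {
	bf.startOnce.Do(func() {
		bf.ctx, bf.cancel = context.WithCancel(ctx)
		go bf.run()
	})
}

func (bf *BlobFetcher) run() { <-bf.ctx.Done() } // worker loop elided
```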
abi87 left a comment:
a few questions but overall LGTM
issue: #2665
depends on cometbft blob PR: berachain/cometbft#26
Context
Currently, blobs are stored as CometBFT transactions, so a blob is never cleaned up and permanently takes up space even though blobs have a finite lifespan.
This PR addresses this by implementing a complete p2p blob distribution system using the CometBFT Reactor API, which fetches missing blobs from other peers. It introduces a `BlobConsensusEnableHeight` configuration parameter that defines when the chain transitions from storing blobs as transactions to using p2p distribution; the transition requires a hard fork.
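For illustration, such a transition boils down to a height gate; a minimal sketch, assuming a zero-means-disabled convention (the function name is hypothetical):

```go
// blobConsensusEnabled reports whether blobs move via the separate Blob
// field and p2p reactor (true) or still ride along as CometBFT
// transactions (false) at the given height.
func blobConsensusEnabled(height, enableHeight int64) bool {
	return enableHeight > 0 && height >= enableHeight
}
```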
Components

- `BlobReactor` (in `da/blobreactor/`) that handles p2p blob requests/responses with timeout handling and makes sure received blobs pass verification (guarding against potentially byzantine peers).
- `BlobFetcher` background service (in `beacon/blockchain/`) that manages a persistent, filesystem-based retry queue. When a block arrives with missing blobs, the request is queued and the background worker attempts to fetch from peers at a regular interval, up to a max retry count. The queue is crash-safe with atomic writes, includes request deduplication, and automatically cleans up requests that exceed the retry limit or fall outside the DA availability window (see the sketch below).
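The crash-safety claim rests on the standard write-temp-then-rename pattern; a minimal sketch of what such a queue write could look like (file layout and helper name are assumptions, not the actual implementation):

```go
package sketch

import (
	"os"
	"path/filepath"
)

// writeRequestAtomic persists one queued blob request such that a crash
// mid-write never leaves a corrupt entry: readers see either the old file
// or the complete new one, never a partial write.
func writeRequestAtomic(dir, name string, payload []byte) error {
	tmp, err := os.CreateTemp(dir, name+".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // no-op once the rename succeeds
	if _, err := tmp.Write(payload); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil { // flush before rename for crash safety
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// Rename within one directory is atomic on POSIX filesystems.
	return os.Rename(tmp.Name(), filepath.Join(dir, name))
}
```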
Consensus flow (after blob enable height is reached)

- `PrepareProposal`: Blobs included in the block are now returned in the `PrepareProposalResponse` in a separate `Blob` field instead of as CometBFT transactions. This prevents blobs from being persisted in CometBFT's block store.
- `ProcessProposal`: Blobs are now included in the `ProcessProposalRequest` in a separate `Blob` field and are validated synchronously. Proposals are rejected if blobs are missing or invalid. Validated blobs are cached with the state for `FinalizeBlock`.
- `FinalizeBlock`: If the node is in consensus mode, blobs come from the `ProcessProposal` cache and are processed immediately. Otherwise, if the node is syncing (catching up) and reaches a block that should have blobs, it makes an async fetch request to `BlobFetcher`. `FinalizeBlock` returns immediately without blocking, allowing the chain to continue syncing while blobs are fetched in the background (a rough sketch follows at the end).

TODO:
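To make the `FinalizeBlock` branching concrete, here is the sketch referenced above; every type and method name is a hypothetical stand-in, not the actual beacon-kit API:

```go
package sketch

// Hypothetical stand-ins for illustration only.
type Blob struct{}

type Block interface {
	Slot() uint64
	ExpectsBlobs() bool // the block header commits to one or more blobs
}

type blobFetcher interface {
	Enqueue(slot uint64) // queue an async p2p fetch via the reactor
}

// finalizeBlobs mirrors the flow above: a consensus node consumes the
// ProcessProposal cache; a syncing node queues a background fetch and
// never blocks finalization on blob availability.
func finalizeBlobs(fetcher blobFetcher, block Block, cached []Blob) {
	if cached != nil {
		processBlobs(cached) // already validated in ProcessProposal
		return
	}
	if block.ExpectsBlobs() {
		fetcher.Enqueue(block.Slot()) // fetched in the background
	}
}

func processBlobs(blobs []Blob) { /* persist to the DA store */ }
```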