Description
Background and Problem Statement
We've encountered a scenario where our application using BinlogSyncer frequently processes large events. Specifically:
- A single binlog event (e.g., a RowsEvent from a large transaction) can contain rows that sum up to approximately 1MB in size.
- The current BinlogSyncer.EventCache is configured using EventCacheCount (defaulting to 10240), which limits the number of events stored.
- Given the 1MB-per-event scenario, caching 10240 events can require up to about 10 GB of memory (10240 events × 1 MB/event = 10240 MB ≈ 10 GB).
- In extreme cases with multiple large transactions/events, this can easily lead to Out-of-Memory (OOM) errors, because memory usage depends on the highly variable size of each event while the cache limit only counts events.
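The worst-case figure above can be reproduced with a few lines of Go. This is only a back-of-the-envelope sketch: `worstCaseCacheBytes` is a hypothetical helper, not part of go-mysql, and it assumes every cached event is as large as the biggest one we observed (about 1 MiB).

```go
package main

import "fmt"

const mib = 1 << 20

// worstCaseCacheBytes returns an upper bound on the memory held by a cache
// that limits only the number of events, assuming every cached event is
// maxEventBytes large. Hypothetical helper for illustration only.
func worstCaseCacheBytes(eventCacheCount int, maxEventBytes int64) int64 {
	return int64(eventCacheCount) * maxEventBytes
}

func main() {
	// 10240 is the default EventCacheCount mentioned above; 1 MiB is the
	// approximate size of our largest events.
	total := worstCaseCacheBytes(10240, 1*mib)
	fmt.Printf("%d MiB = %d GiB\n", total/mib, total/(1<<30)) // 10240 MiB = 10 GiB
}
```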
Workaround
The only setting available today is EventCacheCount (defaulting to 10240), which limits the number of events stored. Its effect differs between two scenarios:
- Scenario A: Many Small Events: EventCacheCount works well here, controlling the number of individual binlog events.
- Scenario B: Few Large Events (OOM Risk): A single event can be very large (up to ≈1MB in our case). Caching 10240 such events requires up to 10GB of memory, which frequently leads to Out-of-Memory (OOM) errors.
EventCacheCount can be set to a smaller number to safeguard against Scenario B, but doing so severely restricts capacity in Scenario A, because the cache can then hold only a small number of events overall. It is difficult to find a single EventCacheCount value that bounds memory usage for large events while still buffering enough events when the stream consists of many small ones.
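To make the trade-off concrete, here is a rough sketch with made-up numbers. The 512 MiB budget and the 200-byte small event are assumptions chosen for illustration, and `countForBudget` is a hypothetical helper, not a go-mysql function.

```go
package main

import "fmt"

// countForBudget returns the largest event count whose worst case stays
// within budgetBytes when every event is maxEventBytes large.
// Hypothetical helper for illustration only.
func countForBudget(budgetBytes, maxEventBytes int64) int {
	return int(budgetBytes / maxEventBytes)
}

func main() {
	const (
		budget     = 512 << 20 // assumed memory budget: 512 MiB
		largeEvent = 1 << 20   // about 1 MiB, as in Scenario B
		smallEvent = 200       // assumed size of a small event in Scenario A
	)

	// Size the count for the worst case (Scenario B).
	count := countForBudget(budget, largeEvent)
	fmt.Println("EventCacheCount:", count) // 512

	// With that count, a stream of small events (Scenario A) can use only a
	// tiny fraction of the budget before the cache is full.
	fmt.Println("bytes buffered with small events:", int64(count)*smallEvent) // 102400
}
```

Under these assumed numbers, a count that is safe for large events lets small events fill only about 100 KiB of a 512 MiB budget.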
Proposed Solution / Feature Request
To gain more precise control over memory usage and prevent OOM issues, we propose changing the metric used for limiting the cache:
- Current: Limit by number of events (EventCacheCount). Memory usage is too wide-ranging and unpredictable.
- Request: Implement an option to limit the EventCache based on the total number of rows it contains, rather than the number of events.
Limiting by row count would allow users to configure a maximum, stable memory footprint, as the memory cost of a single row is generally more predictable than that of an entire event.
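A minimal sketch of what a row-bounded cache could look like is shown below. It is not the library's implementation and does not use any go-mysql types: `event`, `rowLimitedCache`, `tryPut`, `put`, and `get` are all hypothetical names. It assumes each event knows its row count, counts row-less events as one row so they cannot pile up unbounded, and always admits an event into an empty cache so a single oversized event cannot stall the stream.

```go
package main

import (
	"fmt"
	"sync"
)

// event is a hypothetical stand-in for a parsed binlog event; rows is the
// number of rows it carries (0 for non-row events).
type event struct {
	name string
	rows int
}

// rowLimitedCache is a FIFO buffer bounded by total row count instead of
// event count.
type rowLimitedCache struct {
	mu      sync.Mutex
	changed *sync.Cond // signalled whenever the queue changes
	queue   []event
	rows    int // weighted rows currently buffered
	maxRows int
}

func newRowLimitedCache(maxRows int) *rowLimitedCache {
	c := &rowLimitedCache{maxRows: maxRows}
	c.changed = sync.NewCond(&c.mu)
	return c
}

// weight counts every event as at least one row.
func weight(e event) int {
	if e.rows < 1 {
		return 1
	}
	return e.rows
}

// putLocked appends e if it fits in the row budget. The caller holds c.mu.
// An event is always accepted into an empty cache.
func (c *rowLimitedCache) putLocked(e event) bool {
	w := weight(e)
	if len(c.queue) > 0 && c.rows+w > c.maxRows {
		return false
	}
	c.queue = append(c.queue, e)
	c.rows += w
	c.changed.Broadcast()
	return true
}

// tryPut is the non-blocking variant; it reports whether e was accepted.
func (c *rowLimitedCache) tryPut(e event) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.putLocked(e)
}

// put blocks the producer until e fits in the row budget.
func (c *rowLimitedCache) put(e event) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for !c.putLocked(e) {
		c.changed.Wait()
	}
}

// get blocks until an event is available and returns the oldest one.
func (c *rowLimitedCache) get() event {
	c.mu.Lock()
	defer c.mu.Unlock()
	for len(c.queue) == 0 {
		c.changed.Wait()
	}
	e := c.queue[0]
	c.queue = c.queue[1:]
	c.rows -= weight(e)
	c.changed.Broadcast()
	return e
}

func main() {
	c := newRowLimitedCache(1000) // hypothetical budget of 1000 rows
	fmt.Println(c.tryPut(event{name: "small", rows: 10}))  // true
	fmt.Println(c.tryPut(event{name: "large", rows: 900})) // true, 910 rows buffered
	fmt.Println(c.tryPut(event{name: "extra", rows: 500})) // false, would exceed 1000
	fmt.Println(c.get().name)                              // small
}
```

With this shape, many small events and a few large ones share the same budget: the cache fills after roughly maxRows rows regardless of how they are split across events.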
Reference
The relevant code section for the event cache is here: https://github.com/go-mysql-org/go-mysql/blob/master/replication/binlogsyncer.go#L195