Skip to content

Conversation

@adinauer
Copy link
Member

📜 Description

💡 Motivation and Context

💚 How did you test it?

📝 Checklist

  • I added GH Issue ID & Linear ID
  • I added tests to verify the changes.
  • No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled.
  • I updated the docs if needed.
  • I updated the wizard if needed.
  • Review from the native team if needed.
  • No breaking change or entry added to the changelog.
  • No breaking change for hybrid SDKs or communicated to hybrid SDKs.

🔮 Next steps

Copy link
Member Author

adinauer commented Dec 22, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

@github-actions
Copy link
Contributor

github-actions bot commented Dec 22, 2025

Fails
🚫 Please consider adding a changelog entry for the next release.
Messages
📖 Do not forget to update Sentry-docs with your feature once the pull request gets approved.

Instructions and example for changelog

Please add an entry to CHANGELOG.md to the "Unreleased" section. Make sure the entry includes this PR's number.

Example:

## Unreleased

### Features

- [Trace Metrics 5] Add batch processor for Metrics ([#4984](https://github.com/getsentry/sentry-java/pull/4984))

If none of the above apply, you can opt out of this check by adding #skip-changelog to the PR description or adding a skip-changelog label.

Generated by 🚫 dangerJS against ff0fe29

Copy link
Member

@markushi markushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, just one concern within MetricsBatchProcessor.close() which I would like to discuss before merging.

executorService.submit(() -> executorService.close(options.getShutdownTimeoutMillis()));
} else {
executorService.close(options.getShutdownTimeoutMillis());
while (!queue.isEmpty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

h: Hmm if we have some customer emitting metrics in an endless loop, and the app/sdk gets closed - wouldn't this continue to call flushBatch()? Maybe it makes sense to have a isClosed flag to guard against that and early exit here + within add()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in MetricsApi there's a check that should prevent adding anything after the SDK has closed down:

      if (!scopes.isEnabled()) {
        options
            .getLogger()
            .log(SentryLevel.WARNING, "Instance is disabled and this 'metrics' call is a no-op.");
        return;
      }

There might be an issue where this loops and the SDK can't set enabled to false, gonna verify.

Comment on lines +80 to +84
if (hasScheduled && !forceSchedule) {
return;
}
try (final @NotNull ISentryLifecycleToken ignored = scheduleLock.acquire()) {
final @Nullable Future<?> latestScheduledFlush = scheduledFlush;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is the only place where hasScheduled is being used outside the scheduleLock, for consistency it could make sense to move this inside.

Suggested change
if (hasScheduled && !forceSchedule) {
return;
}
try (final @NotNull ISentryLifecycleToken ignored = scheduleLock.acquire()) {
final @Nullable Future<?> latestScheduledFlush = scheduledFlush;
try (final @NotNull ISentryLifecycleToken ignored = scheduleLock.acquire()) {
if (hasScheduled && !forceSchedule) {
return;
}
final @Nullable Future<?> latestScheduledFlush = scheduledFlush;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here was to avoid waiting for the lock when scheduling has already happened anyways.

The code checks whether queue is empty before hasScheduled = false.
Do you think ensuring this check is correct warrants the performance hit from acquiring the lock for each metric added?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think ensuring this check is correct warrants the performance hit from acquiring the lock for each metric added?

No not at all 😅 I totally overlooked this is being called as part of every metric emission, all good then!

@adinauer adinauer force-pushed the 12-18-add_batch_processor_for_metrics branch from b324629 to ff0fe29 Compare January 9, 2026 10:26
private final @NotNull ISentryExecutorService executorService;
private volatile @Nullable Future<?> scheduledFlush;
private static final @NotNull AutoClosableReentrantLock scheduleLock =
new AutoClosableReentrantLock();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Static lock shared across instances causes concurrency issues

Medium Severity

The scheduleLock field is declared as static final, meaning all instances of MetricsBatchProcessor share the same lock. However, the fields it protects (scheduledFlush, hasScheduled) are instance variables. When multiple processor instances exist (e.g., after SDK reinitialization), one instance's lock acquisition blocks another independent instance, and the hasScheduled state can become incorrectly coordinated across unrelated instances.

Fix in Cursor Fix in Web

@adinauer
Copy link
Member Author

@sentry review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants