Unsafe fiber termination by name causes killing unrelated fibers #473

@vakhov

Description

After upgrading CRUD to 1.7.0, we observed unstable behavior on storage nodes when the expirationd role is enabled. During role application, the fiber executing apply() may be unexpectedly killed. This happens while CRUD is performing storage calls in “fast mode”. As a result, role application is interrupted and fails. The issue is caused by how CRUD currently identifies and terminates its internal fibers.

Root cause
CRUD relies on fiber.name() to identify internal “fast-mode” fibers and kills fibers by matching their name. This assumption is unsafe:

  • A fiber name changed by CRUD persists after the CRUD request has completed.
  • As a result, a different, unrelated fiber may later be killed by CRUD.
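
To make the failure mode concrete, here is a toy model in Python. It is not Tarantool's fiber API and not CRUD's actual code: the `Fiber` class, the `crud_fast_mode` marker name, and the helper functions are all hypothetical. It also assumes that the renamed fiber later goes on to do unrelated work, which the report above implies but does not spell out. The sketch contrasts killing by name with killing by an explicitly tracked set of ids.

```python
import itertools

# Hypothetical marker name; the real name used by CRUD is not given in this issue.
FAST_MODE_NAME = "crud_fast_mode"


class Fiber:
    """Toy stand-in for a fiber: a unique id, a mutable name, a liveness flag."""

    _next_id = itertools.count(1)

    def __init__(self, name):
        self.id = next(Fiber._next_id)
        self.name = name
        self.alive = True


def run_request(fiber):
    """Model a request that tags the fiber it runs in and never restores the name."""
    fiber.name = FAST_MODE_NAME
    # ... request work would happen here; the name is still set on return ...


def kill_by_name(fibers, name):
    """Unsafe: kill every live fiber whose name matches, whatever it is doing now."""
    killed = []
    for f in fibers:
        if f.alive and f.name == name:
            f.alive = False
            killed.append(f.id)
    return killed


def kill_by_id(fibers, tracked_ids):
    """Safer alternative: kill only fibers whose ids were explicitly tracked."""
    killed = []
    for f in fibers:
        if f.alive and f.id in tracked_ids:
            f.alive = False
            killed.append(f.id)
    return killed


if __name__ == "__main__":
    worker = Fiber("worker")
    run_request(worker)  # the request completes, but the name sticks

    # The same fiber now performs unrelated work (for example, role application).
    in_flight = set()  # no request is in flight, so nothing is tracked

    print("killed by id:  ", kill_by_id([worker], in_flight))
    print("killed by name:", kill_by_name([worker], FAST_MODE_NAME))
```

In this model the id-based variant leaves the fiber alone because no request is being tracked, while the name-based variant kills it because the stale name still matches. Tracking explicit ids (or restoring the original name when the request finishes) is one possible direction for a fix, not necessarily the one the maintainers will choose.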

Metadata

Labels: bug (Something isn't working)
