Description
Background
My team at my employer has been using the OpenTelemetry Python SDK in production for the past year. One issue we have consistently run into is managing the configuration of the SDK across multiple services. While the auto instrumentation library, custom configurators, distros, etc. have helped, many services still have specific instrumentation requirements, so we end up either writing custom SDK initialization code or relying heavily on environment variables to configure the SDK. We have considered writing our own configuration library to address this, but before doing so, I wanted to see if there was interest in adding a configuration system to the OpenTelemetry Python SDK itself.
I am aware of previous discussions about configuration in the OpenTelemetry Python SDK (e.g., #1035 and #663). However, these discussions seem to have been dormant for a while, and the proposed solutions did not seem sufficiently general.
Proposal: SDK Autoconfiguration
Summary
This proposal introduces a YAML-based declarative configuration system for the Python OpenTelemetry SDK. This approach allows users to configure the SDK pipeline through a single configuration file, eliminating the need to declare a large number of environment variables or write boilerplate initialization code.
Design Goals
A key goal of this proposal is to create a configuration format that is familiar to Python developers and OpenTelemetry users. The draft configuration format draws inspiration from both Python's built-in logging configuration format and the OpenTelemetry Collector's configuration format.
Concepts borrowed from Python's Logging Configuration
Python's logging.config module supports dictionary-based configuration with internal references using the cfg:// and ext:// URI schemes. This proposal adopts the same pattern:
- cfg:// references other objects defined within the configuration file (e.g., cfg://service.traces.exporters.console)
- ext:// references external Python objects by their full name (e.g., ext://sys.stdout)
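For reference, the standard library already resolves both schemes. The sketch below uses logging.config.BaseConfigurator, the helper class that dictConfig builds on, against a made-up dictionary. The keys exporters, console and kind are illustrative only and are not part of the proposed format.

```python
import sys
from logging.config import BaseConfigurator

# Illustrative dictionary; the key names are invented for this demo.
config = {
    "exporters": {
        "console": {"kind": "console"},
    },
}

resolver = BaseConfigurator(config)

# ext:// imports the dotted name and returns the Python object itself.
stream = resolver.convert("ext://sys.stdout")

# cfg:// walks the configuration dictionary along the dotted path.
exporter_kind = resolver.convert("cfg://exporters.console.kind")

print(stream is sys.stdout)
print(exporter_kind)
```

Strings without a recognized prefix are returned unchanged by convert, so ordinary values pass through untouched.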
Concepts borrowed from OpenTelemetry Collector Configuration
The OpenTelemetry Collector also uses a YAML configuration format that has become standard. This proposal aligns with several of its conventions:
- Named component instances using the type/name pattern (e.g., batch/console, otlp/primary)
- Similar terminology for exporters, processors, and other components
Motivation
Reducing Excessive Environment Variable Usage
Currently, configuring the OpenTelemetry Python SDK often relies on numerous environment variables:
export OTEL_SERVICE_NAME=test-service
export OTEL_TRACES_EXPORTER=otlp
export OTEL_METRICS_EXPORTER=otlp
export OTEL_LOGS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_EXPORTER_OTLP_COMPRESSION=gzip
export OTEL_TRACES_SAMPLER=parentbased_traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.5
# ...
This approach would provide an alternative to defining so many environment variables.
Reducing Boilerplate Code
Beyond environment variables, configuring the OpenTelemetry Python SDK often requires writing several lines of initialization code when utilizing multiple signals. Users must manually instantiate providers, exporters, and processors, and wire them together.
A declarative configuration approach partially addresses this: the pipeline is described as data in one file, which minimizes the boilerplate code each service has to carry.
Proposed Configuration Format (Draft)
The following is a draft configuration format intended to illustrate the approach and receive feedback. The specific structure, field names and conventions are open for discussion and refinement if the team chooses to move forward.
version: 1
service:
  resource:
    detectors:
      - otel
      - process
      - os
      - host
    attributes:
      service.name: scratch-8-service
      service.version: 1.0.0
      service.environment: production
  traces:
    sampler: always_on
    id_generator: random
    exporters:
      console:
      otlp:
        endpoint: "${env:MY_OTEL_TRACE_EXPORTER_OTLP_ENDPOINT}"
        compression: "gzip"
    processors:
      batch/console:
        span_exporter: cfg://service.traces.exporters.console
        max_queue_size: 2048
        schedule_delay_millis: 5000
        max_export_batch_size: 512
        export_timeout_millis: 30000
      batch/otlp:
        span_exporter: cfg://service.traces.exporters.otlp
        max_queue_size: 2048
        schedule_delay_millis: 5000
        max_export_batch_size: 512
        export_timeout_millis: 30000
  metrics:
    exporters:
      console:
      otlp:
        endpoint: "${env:MY_OTEL_METRIC_EXPORTER_OTLP_ENDPOINT}"
        compression: "gzip"
    readers:
      periodic/console:
        exporter: cfg://service.metrics.exporters.console
        export_interval_millis: 60000
      periodic/otlp:
        exporter: cfg://service.metrics.exporters.otlp
        export_interval_millis: 60000
  logs:
    exporters:
      console:
      otlp:
        endpoint: "${env:MY_OTEL_LOG_EXPORTER_OTLP_ENDPOINT}"
        compression: "gzip"
    processors:
      batch/console:
        exporter: cfg://service.logs.exporters.console
      batch/otlp:
        exporter: cfg://service.logs.exporters.otlp
Key Features
Environment Variable Substitution and Extensible Providers
The ${env:VARIABLE_NAME} syntax allows sensitive or environment-specific values to be injected at runtime:
endpoint: "${env:MY_OTEL_TRACE_EXPORTER_OTLP_ENDPOINT}"
This substitution syntax is the same as the one used by the OpenTelemetry Collector and is designed to be extensible. In the future, additional configuration providers could be supported under the same syntax:
# Environment variables
endpoint: "${env:OTEL_ENDPOINT}"
# File based secrets
api_key: "${file:/run/secrets/otel_api_key}"
# HashiCorp Vault
api_key: "${vault:secret/data/otel#api_key}"
# AWS Secrets Manager
api_key: "${secretsmanager:arn:aws:secretsmanager:us-west-2:123456789012:secret:otel}"
# HTTP endpoint
config_value: "${http:https://config-server/otel/settings}"
...
This extensibility has numerous benefits, as demonstrated in the OpenTelemetry Collector configuration specification.
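As a rough illustration of how such provider-based substitution could work, here is a minimal sketch. The function names (default_providers, substitute) and the provider registry are assumptions made for this example, not existing SDK APIs. Only the env and file providers are implemented; the others listed above would plug into the same registry.

```python
import os
import re
from typing import Callable, Dict, Mapping, Optional

# Matches ${scheme:value}; the scheme ends at the first colon.
_PLACEHOLDER = re.compile(r"\$\{(?P<scheme>[a-z][a-z0-9_]*):(?P<value>[^}]*)\}")

Provider = Callable[[str], str]


def _read_file(path: str) -> str:
    """Return the stripped contents of a file, e.g. a mounted secret."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()


def default_providers(environ: Optional[Mapping[str, str]] = None) -> Dict[str, Provider]:
    """Build the provider registry (hypothetical helper for this sketch)."""
    env = os.environ if environ is None else environ
    return {
        "env": lambda name: env[name],
        "file": _read_file,
    }


def substitute(text: str, providers: Mapping[str, Provider]) -> str:
    """Replace every ${scheme:value} placeholder using the matching provider."""

    def replace(match: "re.Match[str]") -> str:
        scheme = match.group("scheme")
        if scheme not in providers:
            raise ValueError(f"unknown configuration provider: {scheme!r}")
        return providers[scheme](match.group("value"))

    return _PLACEHOLDER.sub(replace, text)


providers = default_providers({"MY_OTEL_ENDPOINT": "http://localhost:4317"})
print(substitute("${env:MY_OTEL_ENDPOINT}", providers))
```

Adding a new source then amounts to registering one more callable under its scheme name, which is what keeps the syntax open-ended.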
Internal References (cfg://)
Borrowed from Python's logging.config module, the cfg:// scheme enables referencing other objects defined within the configuration. This feature is essential for wiring components together. For example, Span processors almost always require a Span exporter as seen below:
...
traces:
  exporters:
    console/basic:
  processors:
    batch/console:
      span_exporter: cfg://service.traces.exporters.console/basic
...
External References (ext://)
Python's logging configuration also supports ext:// for referencing external Python objects. This could be adopted to allow referencing custom components not registered via entry points:
...
traces:
  exporters:
    console/stdout:
      output: ext://sys.stdout
    console/stderr:
      output: ext://sys.stderr
  processors:
    batch/stdout:
      span_exporter: cfg://service.traces.exporters.console/stdout
    batch/stderr:
      span_exporter: cfg://service.traces.exporters.console/stderr
...
This syntax would be familiar to Python developers and provides an alternative for advanced use cases.
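The two reference schemes could be resolved along the following lines. This is a sketch under stated assumptions: resolve, resolve_cfg and resolve_ext are hypothetical names, the cfg:// path is split on dots only so that instance names such as console/basic stay intact (which also means keys that themselves contain dots cannot be addressed), and the cfg:// lookup returns the raw configuration node, whereas a real loader would presumably return the constructed component.

```python
import importlib
from typing import Any, Mapping


def resolve_ext(path: str) -> Any:
    """Import the longest importable module prefix, then walk attributes."""
    parts = path.split(".")
    for split in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:split]))
        except ImportError:
            continue  # Not a module; try a shorter prefix.
        try:
            for attribute in parts[split:]:
                obj = getattr(obj, attribute)
        except AttributeError as exc:
            raise ValueError(f"cannot resolve external reference: {path!r}") from exc
        return obj
    raise ValueError(f"cannot resolve external reference: {path!r}")


def resolve_cfg(path: str, config: Mapping[str, Any]) -> Any:
    """Walk the configuration mapping one dot-separated segment at a time."""
    node: Any = config
    for segment in path.split("."):
        try:
            node = node[segment]
        except (KeyError, TypeError) as exc:
            raise ValueError(f"cannot resolve {path!r} at {segment!r}") from exc
    return node


def resolve(value: Any, config: Mapping[str, Any]) -> Any:
    """Resolve cfg:// and ext:// strings; return anything else unchanged."""
    if isinstance(value, str):
        if value.startswith("cfg://"):
            return resolve_cfg(value[len("cfg://"):], config)
        if value.startswith("ext://"):
            return resolve_ext(value[len("ext://"):])
    return value
```

A loader would call resolve on each value while building a component, so that span_exporter receives the referenced exporter rather than the reference string.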
Named Instances
Components can be named using the type/name convention (e.g., batch/console, periodic/otlp), enabling multiple instances of the same component type with different configurations.
The type portion of the name will be resolved depending on the context, using the entry points already defined in the OpenTelemetry Python SDK and contrib packages. For example, when the configuration loader encounters batch/console under traces.processors, it:
- Looks up the opentelemetry_traces_processor entry point group:
[project.entry-points.opentelemetry_traces_processor]
batch = "opentelemetry.sdk.trace.export:BatchSpanProcessor"
simple = "opentelemetry.sdk.trace.export:SimpleSpanProcessor"
my_custom = "mycompany.telemetry:CustomSpanProcessor"
- Finds the entry registered as batch
- Instantiates that class with the provided configuration and the instance name console
This provides an unambiguous way to reference specific component types in a concise format and utilizes the existing plugin architecture.
Open Question: Configuration Mapping Strategy
One design question that I was unsure about is how configuration values should map to SDK object construction. Two alternatives I have considered are:
Option A: Direct Constructor Mapping (Proposed)
Configuration keys map directly to constructor keyword arguments. For example, for BatchSpanProcessor:
class BatchSpanProcessor(SpanProcessor):
    def __init__(
        self,
        span_exporter: SpanExporter,
        max_queue_size: int | None = None,
        schedule_delay_millis: float | None = None,
        max_export_batch_size: int | None = None,
        export_timeout_millis: float | None = None,
    ):
        ...
Configuration:
batch/console:
  span_exporter: cfg://service.traces.exporters.console
  max_queue_size: 2048
  schedule_delay_millis: 5000
  max_export_batch_size: 512
  export_timeout_millis: 30000
Pros:
- Mirrors the API directly
- No additional abstraction layer to maintain
- Easy for users familiar with the SDK to understand
Cons:
- Tightly couples configuration schema to constructor signatures
- Constructor changes become breaking changes to configuration format
- Cannot be applied to classes without suitable constructors (e.g. SynchronousMultiSpanProcessor)
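To make Option A concrete, here is a minimal sketch of direct constructor mapping. build_from_config is a hypothetical helper and DemoProcessor is a stand-in class used so the example runs without the SDK installed. It relies on inspect.signature to reject unknown or missing keys before the constructor is called.

```python
import inspect
from typing import Any, Mapping, Optional, Type, TypeVar

T = TypeVar("T")


def build_from_config(cls: Type[T], config: Mapping[str, Any]) -> T:
    """Instantiate cls by passing configuration keys as keyword arguments."""
    signature = inspect.signature(cls)
    try:
        # bind() raises TypeError for unknown keys and missing required ones.
        bound = signature.bind(**config)
    except TypeError as exc:
        raise TypeError(f"invalid configuration for {cls.__name__}: {exc}") from exc
    return cls(*bound.args, **bound.kwargs)


class DemoProcessor:
    """Stand-in with a constructor shaped like a span processor's."""

    def __init__(self, span_exporter: object, max_queue_size: Optional[int] = None):
        self.span_exporter = span_exporter
        self.max_queue_size = max_queue_size


processor = build_from_config(
    DemoProcessor,
    {"span_exporter": object(), "max_queue_size": 2048},
)
print(processor.max_queue_size)
```

The coupling named in the cons is visible here: renaming a constructor parameter immediately changes which configuration keys are accepted.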
Option B: Explicit Configuration Classes
Introduce separate entry points that define configuration schemas and handle object construction. For example:
[project.entry-points.opentelemetry_traces_processor_cfg]
batch = "opentelemetry.sdk.trace.export.config:BatchSpanProcessorConfig"
Where BatchSpanProcessorConfig might look like:
from __future__ import annotations

from dataclasses import dataclass

from opentelemetry.sdk.trace.export import BatchSpanProcessor, SpanExporter


@dataclass
class BatchSpanProcessorConfig:
    span_exporter: SpanExporter
    max_queue_size: int = 2048
    schedule_delay_millis: float = 5000.0
    max_export_batch_size: int = 512
    export_timeout_millis: float = 30000.0

    @classmethod
    def from_config(cls, config: dict, context: ConfigContext) -> "BatchSpanProcessorConfig":
        """Parse and validate configuration."""
        # ConfigContext is a type this proposal would introduce; it does not exist yet.
        # Custom validation, transformation, deprecation handling
        return cls(**config)

    def build(self) -> BatchSpanProcessor:
        """Construct the SDK object."""
        return BatchSpanProcessor(
            span_exporter=self.span_exporter,
            max_queue_size=self.max_queue_size,
            schedule_delay_millis=self.schedule_delay_millis,
            max_export_batch_size=self.max_export_batch_size,
            export_timeout_millis=self.export_timeout_millis,
        )
Pros:
- Decouples configuration schema from constructors
- Supports configuration transformations
- Can provide schema introspection for documentation/tooling
Cons:
- Additional abstraction layer to write and maintain
- Higher barrier for third-party contributions
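As an illustration of the "configuration transformations" point, the sketch below shows how a configuration class could keep accepting an old key name after a rename and validate values before construction. Everything here is hypothetical: DemoProcessorConfig is a stand-in, and the old key delay_millis is invented for the example and was never part of any real schema.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping

# Invented rename, for illustration only: old key -> current key.
RENAMED_KEYS = {"delay_millis": "schedule_delay_millis"}


@dataclass
class DemoProcessorConfig:
    max_queue_size: int = 2048
    schedule_delay_millis: float = 5000.0

    @classmethod
    def from_config(cls, config: Mapping[str, Any]) -> "DemoProcessorConfig":
        """Normalize renamed keys, validate, then build the config object."""
        normalized: Dict[str, Any] = {}
        for key, value in config.items():
            normalized[RENAMED_KEYS.get(key, key)] = value
        if normalized.get("max_queue_size", 1) <= 0:
            raise ValueError("max_queue_size must be positive")
        return cls(**normalized)


legacy = DemoProcessorConfig.from_config({"delay_millis": 100})
print(legacy)
```

Under Option A the same rename would have to happen in the constructor itself, which is the coupling Option B is meant to avoid.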
Describe alternatives you've considered
No response
Additional Context
- [OpenTelemetry Configuration Specification](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/sdk-configuration.md)
- Python logging.config Documentation: see "User-defined objects" and "Access to internal objects"
- [Python Entry Points Specification](https://packaging.python.org/en/latest/specifications/entry-points/)
- [OpenTelemetry Collector Configuration](https://opentelemetry.io/docs/collector/configuration/)
- [OpenTelemetry Python SDK](https://github.com/open-telemetry/opentelemetry-python)
Would you like to implement a fix?
Yes