Distribute .proto files with datafusion .proto crate #19489

@DarkWanderer

Is your feature request related to a problem or challenge?

Currently, the datafusion-proto crate excludes its protobuf definitions from the published crate. The generated structs are available for use, but consumers of the crate cannot define their own messages or gRPC services that reference types defined in datafusion.proto.

Workarounds exist (copying datafusion.proto into the consumer's source tree, or creating a stub and using extern_path with Prost), but both are brittle whenever the upstream .proto file changes.

Describe the solution you'd like

Publish the .proto file as part of the datafusion-proto package. Users could then import datafusion.proto by adding the crate's directory to the tonic import path in build.rs.
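A rough sketch of the consumer side, assuming the .proto file did ship with the crate. The helper below only assembles the list of import directories for a build.rs. How the crate's directory would actually be discovered is an open question: the `DATAFUSION_PROTO_DIR` environment variable and the local `proto` directory are hypothetical names used for illustration, not something datafusion-proto or Cargo provides today.

```rust
use std::path::PathBuf;

/// Build the list of directories the protobuf compiler should search
/// when resolving `import "datafusion.proto";`.
/// `local_dir` holds the consumer's own .proto files; `upstream_dir` is the
/// (hypothetical) location of the .proto files shipped with datafusion-proto.
fn proto_include_dirs(local_dir: &str, upstream_dir: Option<&str>) -> Vec<PathBuf> {
    let mut dirs = vec![PathBuf::from(local_dir)];
    if let Some(dir) = upstream_dir {
        dirs.push(PathBuf::from(dir));
    }
    dirs
}

/// Hypothetical lookup: read the upstream directory from an environment
/// variable. This variable name is an assumption made for this sketch.
fn upstream_dir_from_env() -> Option<String> {
    std::env::var("DATAFUSION_PROTO_DIR").ok()
}
```

In build.rs, the resulting list would be passed as the include paths to the tonic code generator alongside the consumer's own file (for example `proto/worker.proto`). An extern_path mapping from `.datafusion` to the types already generated in datafusion-proto would presumably still be needed so that those structs are reused rather than regenerated; the exact builder calls depend on the tonic-build version in use.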

Describe alternatives you've considered

  • Copying the datafusion.proto file into my own source code. This may cause silent data corruption in the future if upstream types change.
  • Creating a stub datafusion.proto file. This is my current workaround; it is slightly less brittle but requires manual "forward declarations" and can still break if types change or are renamed.

Additional context

This is the outline of the worker service I am trying to define using datafusion-proto types:

syntax = "proto3";

package worker;

import "datafusion.proto";

service WorkerService {
  // Execute a logical plan on this worker, streaming back record batches
  rpc ExecuteLogicalPlan(ExecuteLogicalPlanRequest) returns (stream ExecuteLogicalPlanResponse);
}

message ExecuteLogicalPlanRequest {
  // DataFusion LogicalPlan from datafusion-proto crate
  datafusion.LogicalPlanNode logical_plan = 1;
}

message ExecuteLogicalPlanResponse {
  optional bytes data = 1; // Serialized Apache Arrow data
}

Metadata

Assignees: No one assigned
Labels: enhancement (New feature or request)
Projects: No projects
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests