-
Notifications
You must be signed in to change notification settings - Fork 301
Add include-vector-fields-by-default property to DataSource configuration #3005
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…tion - Add IncludeVectorFieldsByDefault property to DataSource record (default: false) - Add UserProvidedIncludeVectorFieldsByDefault flag for serialization control - Update DataSourceConverterFactory to read/write the new property - Update JSON schema with include-vector-fields-by-default property and validation - Add --data-source.include-vector-fields-by-default CLI option to ConfigureOptions - Update ConfigGenerator to handle the new CLI option with MSSQL validation - Add validation in RuntimeConfigValidator for non-MSSQL database types - Add unit tests for the new CLI configure option Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
|
@copilot The PR doesn't modify:
The configuration can be set, but has no effect on actual query results. |
You're correct. This PR implements only the configuration infrastructure for
I'll work on implementing the complete functionality. The vector type detection and filtering logic needs to be added to the metadata provider and query builders. |
- Add IsVectorType property to ColumnDefinition for tracking vector columns - Update MsSqlMetadataProvider to detect vector type columns during schema discovery - Modify AuthorizationResolver.ResolveEntityDefinitionColumns to filter out vector columns when IncludeVectorFieldsByDefault is false - Update SetEntityPermissionMap to pass includeVectorFields setting from DataSource config Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Adds an MSSQL-only runtime configuration switch to omit vector-type columns from API results by default, reducing payload size and improving performance, with an opt-in to include them.
Changes:
- Introduces
include-vector-fields-by-defaultin the DataSource object model, CLI (dab configure), JSON schema, and runtime validation. - Detects MSSQL vector columns during metadata discovery and tracks them on
ColumnDefinition. - Filters vector columns out of authorization-resolved allowed column sets when the setting is disabled.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Core/Services/MetadataProviders/MsSqlMetadataProvider.cs | Marks columns as vector type based on MSSQL DATA_TYPE. |
| src/Core/Configurations/RuntimeConfigValidator.cs | Adds MSSQL-only validation for the new config flag. |
| src/Core/Authorization/AuthorizationResolver.cs | Excludes vector columns from resolved columns when configured. |
| src/Config/ObjectModel/DataSource.cs | Adds IncludeVectorFieldsByDefault plus user-provided tracking flag. |
| src/Config/DatabasePrimitives/DatabaseObject.cs | Adds ColumnDefinition.IsVectorType metadata flag. |
| src/Config/Converters/DataSourceConverterFactory.cs | Reads/writes the new property and conditionally serializes it. |
| src/Cli/ConfigGenerator.cs | Implements CLI update behavior and MSSQL-only enforcement. |
| src/Cli/Commands/ConfigureOptions.cs | Adds CLI option --data-source.include-vector-fields-by-default. |
| src/Cli.Tests/ConfigureOptionsTests.cs | Adds unit tests for MSSQL enable/disable and non-MSSQL rejection. |
| schemas/dab.draft.schema.json | Adds schema definition and MSSQL-only constraint (when true). |
Comments suppressed due to low confidence (1)
src/Core/Authorization/AuthorizationResolver.cs:326
- Vector filtering is only applied when resolving “all columns” (wildcard/implicit includes). If
entityAction.Fields.Includeexplicitly lists a vector column, it bypassesResolveEntityDefinitionColumns(..., includeVectorFields)and the vector column will still be allowed/returned even whenIncludeVectorFieldsByDefaultis false. If the goal is to omit vector columns globally unless the data-source setting is true, consider filtering vector columns out of explicit include lists (or out ofallowedColumns) as well.
else
{
operationToColumn.Included = entityAction.Fields.Include;
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| case "include-vector-fields-by-default": | ||
| if (reader.TokenType is JsonTokenType.True or JsonTokenType.False) | ||
| { | ||
| includeVectorFieldsByDefault = reader.GetBoolean(); | ||
| userProvidedIncludeVectorFieldsByDefault = true; |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
include-vector-fields-by-default parsing currently falls through silently for invalid inputs (e.g., non-boolean token types, or a string that doesn’t parse to bool), which can hide config mistakes and leaves UserProvidedIncludeVectorFieldsByDefault false. Consider throwing a JsonException when the value isn’t a boolean or a parseable boolean string (similar to how options rejects unexpected values).
| // Write health if present | ||
| if (value.Health is not null) | ||
| { | ||
| writer.WritePropertyName("health"); | ||
| JsonSerializer.Serialize(writer, value.Health, options); |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
health is written whenever value.Health is non-null, but DatasourceHealthCheckConfig’s converter only writes JSON when UserProvidedEnabled is true. If Health is non-null with UserProvidedEnabled == false (e.g., config contains "health": null), this can cause serialization to write a property name with no value / throw during serialization. Consider only writing the health property when the health config has user-provided content (or ensure the health converter writes an explicit null/object in the non-user-provided case).
| // Write health if present | |
| if (value.Health is not null) | |
| { | |
| writer.WritePropertyName("health"); | |
| JsonSerializer.Serialize(writer, value.Health, options); | |
| // Write health only if user provided health configuration | |
| if (value.Health is DatasourceHealthCheckConfig healthConfig && healthConfig.UserProvidedEnabled) | |
| { | |
| writer.WritePropertyName("health"); | |
| JsonSerializer.Serialize(writer, healthConfig, options); |
| // include-vector-fields-by-default is only valid for mssql database type | ||
| if (dataSource.IncludeVectorFieldsByDefault && dataSource.DatabaseType != DatabaseType.MSSQL) | ||
| { | ||
| HandleOrRecordException(new DataApiBuilderException( | ||
| message: "The include-vector-fields-by-default option is only valid for MSSQL database type.", |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Runtime validation only rejects this option for non-MSSQL when the value is true. If the intent is “option is only valid/applicable for MSSQL” (as the message says and as the CLI enforces), consider validating based on UserProvidedIncludeVectorFieldsByDefault instead, so configs that explicitly set it (even to false) on non-MSSQL are rejected consistently.
| // that are accessible to the user. | ||
| HashSet<string> allowedColumns = new(); | ||
| IEnumerable<string> allTableColumns = ResolveEntityDefinitionColumns(entityName, metadataProvider); | ||
| IEnumerable<string> allTableColumns = ResolveEntityDefinitionColumns(entityName, metadataProvider, includeVectorFields); |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assignment to allTableColumns is useless, since its value is never read.
| IEnumerable<string> allTableColumns = ResolveEntityDefinitionColumns(entityName, metadataProvider, includeVectorFields); |
Why make this change?
Closes #2704
Vector-type columns can be large and are rarely read directly. This adds configuration to omit them by default, improving performance and reducing payload size while allowing explicit opt-in.
What is this change?
Adds
include-vector-fields-by-defaultproperty todata-sourceconfiguration (MSSQL only):IncludeVectorFieldsByDefaultproperty (default:false)trueto MSSQL only--data-source.include-vector-fields-by-defaultoption indab configureIsVectorTypeproperty to track vector columnsDATA_TYPEis "vector"IncludeVectorFieldsByDefaultisfalseWhen
include-vector-fields-by-defaultisfalse(default):IsVectorType = true{ "data-source": { "database-type": "mssql", "connection-string": "...", "include-vector-fields-by-default": false } }How was this tested?
Added 4 test cases:
true/falsefor MSSQLSample Request(s)
Original prompt
Vectorfrom results #2837✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.