-
Notifications
You must be signed in to change notification settings - Fork 3.4k
SQL Server: Support for full-text search TVFs (CONTAINSTABLE/FREETEXTTABLE) #37489
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
235a837 to
9785ec0
Compare
roji
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for submitting this PR - it's a good start but requires quite a lot of work. Translating a specialized TVF like this is quite challenging, and requires quite a bit of EF internals knowledge (it's not a good issue for a casual external contributor).
Unless you're wililng to invest significant time here and go into complicated internal architecture (just using AI here probably won't be sufficient) please let me know - I'm OK to take this over and complete it (and you can be happy with the work you've done here!).
Otherwise, see first round of comments below.
| /// <param name="property">The property to search.</param> | ||
| /// <param name="searchCondition">The search condition.</param> | ||
| /// <returns>The table-valued function result containing Key and Rank columns.</returns> | ||
| public static IQueryable<FullTextTableResult> ContainsTable( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move these to SqlServerQueryableExtensions, similar to RelationalQueryableExtensions.
| /// <returns>The table-valued function result containing Key and Rank columns.</returns> | ||
| public static IQueryable<FullTextTableResult> FreeTextTable( | ||
| this DbFunctions _, | ||
| object property, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentionally object as opposed to string? Is there a reason for that?
Also, is it possible for the full-text search column to be nullable? If so this parameter needs to be nullable too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the object type: I used object to stay consistent with other full-text extensions like SqlServerDbFunctionsExtensions.Contains, which takes an object property to allow the expression tree to capture the property access before it's translated. However, if the convention for Table-Valued Functions in EF Core prefers a specific type like string, I’m happy to change it.
Regarding nullability: Yes, full-text search columns can be nullable in SQL Server. I will update the parameter to be object? (or string?) to correctly reflect the provider's capabilities and avoid any potential issues with nullable properties
| /// <summary> | ||
| /// The key of the row matched. | ||
| /// </summary> | ||
| public object Key { get; set; } = default!; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't ideal for this to be an untyped object, but I can't really see a way for this to be generic over the key type... The key column is determined in the CREATE FULLTEXT INDEX DDL, and we have no way of flowing that as a generic parameter here.
I'm thinking it might be a good idea to accept the key type as a generic type parameter to the FreeTextTable/ContainsTable methods; another alternative is to accept an additional parameter on the functions where the user selects the key property - that would allow us to flow that type to the return type, but it would be a bit silly (if the user specifies anything other than the key property from CREATE FULLTEXT INDEX, the call fails).
Any thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the Generic approach (ContainsTable<TKey>) is the best way to go.
It’s simple and explicit. The user knows their database schema best, so they can just provide the correct type for the Key. This also keeps the translation logic clean and avoids the extra work of trying to infer the type from other properties.
If you're okay with this, I will update the code to use <TKey> and make FullTextTableResult generic
| /// <summary> | ||
| /// An expression that represents a SQL Server full-text table-valued function (e.g., CONTAINSTABLE). | ||
| /// </summary> | ||
| public class FullTextTableExpression : TableExpressionBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why introduce a new expression type? Is TableValuedFunctionExpression not suitable somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I didn't use TableValuedFunctionExpression is that its Print method calls VisitCollection(Arguments). This treats all arguments as standard SqlExpression nodes, which the ExpressionPrinter translates into SQL literals (quoted strings) or parameters.
For CONTAINSTABLE, the first two arguments must be raw identifiers (Table and Column). By using a custom expression, I can bypass the standard Visit logic and use expressionPrinter.Append(Column.TableAlias) to print the table name directly without quotes, ensuring the SQL is syntactically correct for SQL Server
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't need a whole new expression type for this. We should be able to use TableValuedFunctionExpression, with the first two arguments (the table and column names) just being SqlFragmentExpressions. We'd still need special handling in SqlServerQuerySqlGenerator to quote those names, but at least we wouldn't have an additional expression type.
test/EFCore.SqlServer.FunctionalTests/Query/SqlQuerySqlServerTest.cs
Outdated
Show resolved
Hide resolved
| { | ||
| Sql.Append(fullTextTableExpression.FunctionName) | ||
| .Append("(") | ||
| .Append(_sqlGenerationHelper.DelimitIdentifier(fullTextTableExpression.Column.TableAlias!)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the first argument to these functions can only be a literal table name - not a table alias. For example, the following does not work:
SELECT *
FROM Flags AS f
WHERE (
SELECT TOP(1) RANK
FROM CONTAINSTABLE (f, FlagColors, 'Green or Black')
ORDER BY RANK DESC) <> 0;Where replacing f with a table name (Flags) does:
SELECT *
FROM Flags AS f
WHERE (
SELECT TOP(1) RANK
FROM CONTAINSTABLE (Flags, FlagColors, 'Green or Black')
ORDER BY RANK DESC) <> 0;We can use SqlFragmentExpression to represent the table and column names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the clarification. You're right, CONTAINSTABLE requires the actual table name rather than the alias.
I will refactor the implementation to use SqlFragmentExpression for both the table and column names. This ensures they are treated as raw SQL fragments and bypasses the quoting issues we discussed. I'll also make sure to pull the base table name from the metadata instead of using the alias
|
|
||
|
|
||
| var query = from c in ctx.Set<Microsoft.EntityFrameworkCore.TestModels.Northwind.Customer>() | ||
| join ct in EF.Functions.ContainsTable(ctx.Set<TestModels.Northwind.Customer>().Select(x => x.ContactName), "John") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requiring invokers to do ctx.Set<T>().Select() to get the property like this is problematic:
- It isn't a good dev experience.
- If the user diverges from this pattern in any way (e.g. projecting something other than a simple property), this will fail.
- What happens if there's a Where before the Select? I think your code above would just ignore that (I may be wrong). In any case, these functions do not support pre-filtering or any other operator - they accept a literal table name as the source only.
To avoid all the above problems, we should instead try to make ContainsTable an extension method over DbSet; its source (first argument) would thus represent the table, and it would accept a lambda selector for the full-text property to be searched (and of course, the actual text to be searched):
var query =
from c in ctx.Set<Customer>()
join ct in ctx.Set<Customer>().ContainsTable(c => c.ContactName, "John")
....Makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I’ll refactor the implementation to move ContainsTable and FreeTextTable as extension methods over IQueryable<TEntity> and update the visitor to handle this new pattern
| => Fixture.TestSqlLoggerFactory.AssertBaseline(expected); | ||
|
|
||
| [ConditionalFact] | ||
| public virtual void FullText_ContainsTable_queryable_simple() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move tests to NorthwindDbFunctionsQuerySqlServerTest alongside the existing full-text search tests (unless there's a specific reason not to).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will move them
| => Fixture.TestSqlLoggerFactory.AssertBaseline(expected); | ||
|
|
||
| [ConditionalFact] | ||
| public virtual void FullText_ContainsTable_queryable_simple() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a lot more testing here - one simple test isn't enough.
|
Thanks for your reply and for the feedback. I am definitely willing to invest the necessary time to work on this feature and dive deeper into the EF Core internal architecture. I'd be honored to learn from your expertise throughout this process. If you're available to provide guidance or point me in the right direction, I would be happy to put in the work and collaborate to get this implemented correctly. |
…est.cs Co-authored-by: Shay Rojansky <[email protected]>
…into feature-TVFs
|
Hi @roji
|
roji
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some more comments.
Please ensure that all tests pass before requesting another review.
| /// <param name="propertySelector">A lambda expression selecting the property to search.</param> | ||
| /// <param name="searchCondition">The search condition.</param> | ||
| /// <returns>The table-valued function result containing Key and Rank columns.</returns> | ||
| public static IQueryable<FullTextTableResult<TKey>> ContainsTable<TEntity, TKey>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the two SQL Server functions accept two additional parameters: language and top_n_by_rank.
| /// </summary> | ||
| /// <param name="key">The key of the row matched.</param> | ||
| /// <param name="rank">The ranking value assigned to the row.</param> | ||
| public FullTextTableResult(TKey key, int rank) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use primary constructor
| /// The result type for SQL Server full-text table-valued functions (CONTAINSTABLE / FREETEXTTABLE). | ||
| /// </summary> | ||
| /// <typeparam name="TKey">The type of the key column.</typeparam> | ||
| public class FullTextTableResult<TKey> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this a readonly struct, as it's a very think wrapper that can be immutable.
| /// <summary> | ||
| /// An expression that represents a SQL Server full-text table-valued function (e.g., CONTAINSTABLE). | ||
| /// </summary> | ||
| public class FullTextTableExpression : TableExpressionBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't need a whole new expression type for this. We should be able to use TableValuedFunctionExpression, with the first two arguments (the table and column names) just being SqlFragmentExpressions. We'd still need special handling in SqlServerQuerySqlGenerator to quote those names, but at least we wouldn't have an additional expression type.
| private void AssertSql(params string[] expected) | ||
| => Fixture.TestSqlLoggerFactory.AssertBaseline(expected); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove
| columnFragment, | ||
| searchCondition, | ||
| alias); | ||
| var resultType = typeof(SqlServerQueryableExtensions.FullTextTableResult<>).MakeGenericType(keyType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not call MakeGenericType, as it's incompatible with NativeAOT, which we're starting to move towards. You should already have the correct return type on the method in the inputted MethodCallExpression.
| alias); | ||
| var resultType = typeof(SqlServerQueryableExtensions.FullTextTableResult<>).MakeGenericType(keyType); | ||
| var keyTypeMapping = _typeMappingSource.FindMapping(keyType); | ||
| var keyColumn = new ColumnExpression("Key", alias, keyType, keyTypeMapping, nullable: false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call the columns KEY and RANK (uppercase) as per the documentation.
| return QueryCompilationContext.NotTranslatedExpression; | ||
| } | ||
| var keyType = method.GetGenericArguments()[1]; // TKey is the second generic argument | ||
| var alias = _queryCompilationContext.SqlAliasManager.GenerateTableAlias("ft"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| var alias = _queryCompilationContext.SqlAliasManager.GenerateTableAlias("ft"); | |
| var alias = _queryCompilationContext.SqlAliasManager.GenerateTableAlias("fts"); |
|
|
||
| var newSelectExpression = new SelectExpression( | ||
| [fullTextTableExpression], | ||
| keyColumn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong - the select projects out both the key and the column; down below you're also mixing two projection type (this is a complicated and messy part of SelectExpression currently). Instantiate a SelectExpression without any projections, then call ReplaceProjection() with a dictionary argument to provide the two projections.
| INNER JOIN CONTAINSTABLE([Employees], [Title], N'Sales') AS [f] ON [e].[EmployeeID] = [f].[Key] | ||
| """); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove
249ae47 to
6b86657
Compare
Description
This PR implements support for SQL Server Full-Text Search Table-Valued Functions:
CONTAINSTABLEandFREETEXTTABLE.Currently, EF Core only supports scalar Full-Text functions (
CONTAINSandFREETEXT). This PR bridges the gap by allowing users to treat Full-Text results as queryable sources, enabling access to the Rank column and allowing for efficient Joins between full-text results and entity tables.Key Architectural Changes
FullTextTableExpression(inheriting fromTableExpressionBase) to represent the TVF in the relational command tree.SqlServerQueryableMethodTranslatingExpressionVisitorto interceptContainsTableandFreeTextTablemethod calls.SelectExpressionwrapped in aShapedQueryExpression.[Key]and[Rank]columns.SqlAliasManagerfor dynamic table alias generation.ShaperExpressionusingExpression.Newto bind SQL result columns to theFullTextTableResultDTO.VisitFullTextTableinSqlServerQuerySqlGeneratorto handle the specific TVF syntax requirements in SQL Server.Example Usage
Related Issue
Fixes #11487