feat(pkg-py): add multi-table support #195
Draft
cpsievert wants to merge 38 commits into main from feat/py-multi-table
Add a new DataSource implementation that keeps Polars LazyFrames lazy
until the render boundary. Key changes:
- Add `AnyFrame` type alias (`Union[nw.DataFrame, nw.LazyFrame]`)
- Widen DataSource ABC return types to support lazy frames
- Implement `PolarsLazySource` using Polars SQLContext for lazy SQL
- Update `normalize_data_source()` to detect and route LazyFrames
- Collect LazyFrames at render boundary in `app()` method
- Update type hints throughout
Usage:
```python
import polars as pl
from querychat import QueryChat
lf = pl.scan_parquet("large_data.parquet")
qc = QueryChat(data_source=lf, table_name="data")
```
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
PolarsLazySource._polars_dtype_to_sql was mapping pl.Time to "TIMESTAMP" but it should map to "TIME". Time-only values are not timestamps. Also added noqa comment for PLR0911 (too many return statements) since the function now has 7 return statements after the fix. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, test_query only validated schema structure via collect_schema() without executing the query. This meant runtime errors (e.g., invalid casts) wouldn't surface until actual collection. Now test_query collects one row to catch runtime errors, matching the behavior of DataFrameSource.test_query. The return type changes from LazyFrame to DataFrame since we've already done the work. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The noqa: A005 comment was accidentally removed from types/__init__.py. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add IbisSource class to _datasource.py that wraps Ibis Tables for use with QueryChat. Key features:
- Accepts ibis.Table and table_name, extracts backend and column names
- get_db_type() returns the backend name (e.g., "duckdb", "postgres")
- execute_query() uses check_query() for SQL injection protection and returns ibis.Table (lazy) for chaining additional operations
- get_data() returns the original table
- cleanup() is a no-op since Ibis manages connection lifecycle
- Stores _colnames for use by test_query() (to be implemented later)
Note: get_schema() and test_query() raise NotImplementedError for now; they will be implemented in separate tasks.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement the get_schema() method for IbisSource that generates schema information for the LLM prompt. The implementation:
- Classifies columns by type using Ibis dtype methods (is_numeric, is_string, is_date, is_timestamp)
- Uses a single aggregate query for efficiency to get min/max for numeric/date columns and nunique for text columns
- Shows categorical values for text columns with unique count below the threshold
- Includes _ibis_dtype_to_sql() helper to convert Ibis dtypes to SQL type names
The output format matches other DataSource implementations (DataFrameSource, PolarsLazySource, SQLAlchemySource).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace NotImplementedError stub with real implementation that:
- Uses check_query() for SQL injection protection
- Wraps query in LIMIT 1 subquery to test without full execution
- Always collects (calls .execute()) to catch runtime errors
- Returns nw.DataFrame via nw.from_native() on executed result
- Validates all original columns present when require_all_columns=True
- Raises MissingColumnsError when columns are missing
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
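The LIMIT 1 probe described in this commit can be illustrated without Ibis. The following is a minimal sketch that uses Python's built-in `sqlite3` in place of an Ibis backend. `probe_query` and the local `MissingColumnsError` class are hypothetical stand-ins, not querychat's implementation, and the `check_query()` validation step is left out.

```python
import sqlite3


class MissingColumnsError(ValueError):
    """Hypothetical stand-in for the error named in the commit message."""


def probe_query(conn, query, expected_columns, require_all_columns=False):
    """Run `query` against `conn`, fetching at most one row (illustrative only)."""
    # Wrap the query in a subquery so that at most one row is fetched.
    inner = query.strip().rstrip(";")
    wrapped = f"SELECT * FROM ({inner}) AS probe LIMIT 1"

    # Executing and fetching a row surfaces errors that a schema-only
    # check would miss.
    cursor = conn.execute(wrapped)
    row = cursor.fetchone()
    returned = [col[0] for col in cursor.description]

    if require_all_columns:
        missing = [c for c in expected_columns if c not in returned]
        if missing:
            raise MissingColumnsError(f"Query result is missing columns: {missing}")
    return returned, row


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, name TEXT)")
conn.execute("INSERT INTO data VALUES (1, 'a'), (2, 'b')")

columns, first_row = probe_query(
    conn, "SELECT id, name FROM data", ["id", "name"], require_all_columns=True
)
```

With `require_all_columns=True`, a query such as `SELECT id FROM data` would raise `MissingColumnsError` in this sketch because `name` is dropped from the result.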
- Update AnyFrame type with TYPE_CHECKING guard to include ibis.Table
- Add Ibis Table detection in normalize_data_source()
- Update render boundary in app() to handle Ibis Tables via to_pandas()
- Export IbisSource and other DataSource classes from __init__.py
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add test_querychat_with_ibis_table() that verifies QueryChat correctly accepts an Ibis Table as a data source, creating an IbisSource and executing queries that return ibis.Table objects. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add documentation for using Ibis Tables as a data source in querychat, including examples for DuckDB, PostgreSQL, and BigQuery backends. The section explains Ibis's value proposition (lazy evaluation, backend flexibility, chainable operations) and provides guidance on when to choose Ibis vs SQLAlchemy. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Design for extending QueryChat to support multiple tables:
- .add_table() / .remove_table() API
- Relationship specification (explicit, auto-detect, free-text)
- Independent filter state per table
- .table("name").df()/.sql()/.title() accessor pattern
- Tabbed UI with auto-switch on filter
- Full backwards compatibility for single-table usage
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Store data sources in _data_sources dict keyed by table name. Maintains backwards compatibility via data_source property. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Returns list of registered table names in add-order. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Allows adding additional tables after construction. Stores relationships and descriptions for LLM context. Validates table names and prevents duplicates. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Allows removing tables before server initialization. Cleans up data source and associated metadata. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
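The storage commits above (dict keyed by table name, add-order listing, duplicate prevention, removal only before server initialization) can be sketched roughly as follows. `TableRegistry` is a hypothetical class written for illustration, and the identifier check is an assumed validation rule; the commit messages do not say how names are validated.

```python
class TableRegistry:
    """Hypothetical registry illustrating the commits above; not querychat code."""

    def __init__(self):
        # Plain dicts preserve insertion order, which gives add-order listing.
        self._data_sources = {}
        self._server_initialized = False

    def add_table(self, name, source):
        # Validate the name before storing anything (assumed rule: identifier).
        if not name.isidentifier():
            raise ValueError(f"Invalid table name: {name!r}")
        if name in self._data_sources:
            raise ValueError(f"Table {name!r} is already registered")
        self._data_sources[name] = source

    def remove_table(self, name):
        if self._server_initialized:
            raise RuntimeError("Tables cannot be removed after server initialization")
        if name not in self._data_sources:
            raise ValueError(f"Table {name!r} is not registered")
        del self._data_sources[name]

    def table_names(self):
        return list(self._data_sources)

    @property
    def data_source(self):
        # Single-table shortcut: only unambiguous with exactly one table.
        if len(self._data_sources) != 1:
            raise ValueError(
                "data_source requires exactly one table; "
                f"registered tables: {self.table_names()}"
            )
        return next(iter(self._data_sources.values()))


registry = TableRegistry()
registry.add_table("orders", "orders-source")
registry.add_table("customers", "customers-source")
names = registry.table_names()
```

The strings passed as sources are placeholders; in the PR these would be DataSource objects.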
Provides per-table access pattern: qc.table('name').data_source
Reactive methods (df, sql, title) will be added in Phase 6.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Verifies data_source raises when multiple tables present. Full df/sql/title tests will be added with reactive state. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Prepares for multi-table support in update_dashboard tool. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Tool now requires table parameter to specify which table to filter. Validates table exists before executing query.
- Updated tool_update_dashboard to accept dict[str, DataSource] instead of single DataSource
- Added `table` parameter to the tool function signature
- Added table validation with helpful error message listing available tables
- Updated prompt template to document the table parameter
- Updated callers in _querychat.py and _querychat_module.py
- Added tests for new functionality
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
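As a rough illustration of the table validation described in this commit, the sketch below builds a tool function over a dict of sources and rejects unknown table names with a message listing the available ones. `make_update_dashboard`, the callable sources, and the returned dict shape are all assumptions made for this example, not the signatures used in the PR.

```python
def make_update_dashboard(data_sources):
    """Build a tool function bound to a dict of {table name: query runner}.

    Each value is assumed to be a callable taking a SQL string; this stands
    in for a DataSource and is not querychat's actual interface.
    """

    def update_dashboard(table, query, title):
        # Validate the table before running anything.
        if table not in data_sources:
            available = ", ".join(data_sources)
            raise ValueError(
                f"Table '{table}' not found. Available tables: {available}"
            )
        result = data_sources[table](query)
        return {"table": table, "title": title, "result": result}

    return update_dashboard


sources = {
    "orders": lambda sql: f"ran on orders: {sql}",
    "customers": lambda sql: f"ran on customers: {sql}",
}
tool = make_update_dashboard(sources)
ok = tool("orders", "SELECT * FROM orders WHERE total > 100", "Large orders")

try:
    tool("invoices", "SELECT * FROM invoices", "Invoices")
except ValueError as err:
    message = str(err)
```

Listing the valid names in the error gives the model enough information to retry with a correct `table` argument.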
Tool now requires table parameter to specify which table to reset. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Query tool now accepts dict of data sources. Full JOIN support will require shared database connection. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
QueryChatSystemPrompt now accepts dict of data sources via data_sources parameter while maintaining backwards compatibility with single data_source. Changes:
- Add data_sources parameter accepting dict[str, DataSource]
- Generate combined schema with <table name="..."> tags for each table
- Add relationships parameter for foreign key information
- Add table_descriptions parameter (reserved for future use)
- Add _generate_combined_schema() and _generate_relationships_text() methods
- Add data_source property for backwards compatibility
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
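The combined schema with one `<table name="...">` block per table might look something like the sketch below. The function name, the closing `</table>` tag, the blank-line separator, and the sample schema strings are assumptions for illustration; the commit message only states that each table's schema is wrapped in a `<table name="...">` tag.

```python
def generate_combined_schema(schemas):
    """Wrap each table's schema text in a <table name="..."> block.

    `schemas` maps table name to already-rendered schema text (hypothetical
    input; in the PR the text would come from each DataSource).
    """
    blocks = [
        f'<table name="{name}">\n{schema_text}\n</table>'
        for name, schema_text in schemas.items()
    ]
    # Separate tables with a blank line to keep the prompt readable.
    return "\n\n".join(blocks)


combined = generate_combined_schema(
    {
        "orders": "- id (INTEGER)\n- customer_id (INTEGER)",
        "customers": "- id (INTEGER)\n- name (TEXT)",
    }
)
```

Tagging each block with the table name lets the prompt refer to tables unambiguously when the tools take a `table` argument.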
Adds relationship section and multi-table filtering instructions. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ServerValues now contains tables dict with per-table TableState objects. Each TableState holds df, sql, and title reactive values for its table. Key changes:
- Added TableState dataclass for per-table reactive state
- Updated ServerValues to use tables dict and active_table tracker
- Changed mod_server to accept data_sources dict instead of single source
- Per-table callbacks use the table parameter from tool calls
- Bookmarking saves/restores per-table state
- Backwards compatibility maintained via properties on ServerValues
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
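The per-table state layout can be pictured with a simplified, non-reactive sketch. Plain attributes stand in for the Shiny reactive values used in the PR, the `df` field is omitted, and `apply_filter` is a hypothetical helper that shows how a tool call's `table` argument could select which state to update.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TableState:
    # Plain attributes here; the PR describes these as reactive values.
    sql: Optional[str] = None
    title: Optional[str] = None


@dataclass
class ServerValues:
    tables: Dict[str, TableState] = field(default_factory=dict)
    active_table: Optional[str] = None

    def apply_filter(self, table, sql, title):
        """Hypothetical callback: update one table's state and mark it active."""
        state = self.tables[table]
        state.sql = sql
        state.title = title
        self.active_table = table


vals = ServerValues(
    tables={name: TableState() for name in ["orders", "customers"]}
)
vals.apply_filter(
    "customers", "SELECT * FROM customers WHERE region = 'EU'", "EU customers"
)
```

Because each table has its own `TableState`, filtering `customers` leaves the `orders` state untouched.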
TableAccessor.df(), .sql(), .title() now access per-table reactive state. Requires .server() to be called first. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
.app() renders tabs when multiple tables are present. Single table mode unchanged for backwards compatibility. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
UI automatically switches to the most recently filtered table. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Enables custom layouts with qc.table('name').ui().
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Integrates all multi-table components:
- System prompt now lazily generated with current tables
- _make_system_prompt() creates fresh prompt with all sources
- client() gets updated system prompt with all tables
- Relationships and descriptions passed to system prompt
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add type: ignore comments for dynamic _vals attribute access. The hasattr check ensures runtime safety. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Makes TableAccessor available for type hints and documentation. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Move table name validation before data source storage
- Initialize _server_initialized in __init__
- Set _server_initialized flag when server() is called
- Simplify server init checks to use attribute directly
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Resolve merge conflicts by:
- Taking main's refactored architecture (_querychat_base.py, _shiny.py, _shiny_module.py)
- Taking main's improved typing (IntoFrameT, native return types)
- Taking main's new framework support (Streamlit, Gradio, Dash)
- Combining CHANGELOG entries from both branches
- Keeping main's test improvements
Note: Multi-table support from this branch will need to be re-integrated into the new architecture in a follow-up PR.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
This PR adds multi-table support to the QueryChat Python package, allowing users to chat with multiple related tables in a single session.
Key Features
- `add_table()`, `remove_table()`, `table_names()`, `table("name")` methods
- `qc.table("name").df()`, `.sql()`, `.title()`, `.ui()`
- `.app()` renders tabs when multiple tables are present
API Examples
Files Changed
- `_querychat.py`
- `_table_accessor.py`
- `_querychat_module.py`
- `_system_prompt.py`
- `tools.py`
- `prompts/prompt.md`
- `prompts/tool-*.md`
- `tests/test_multi_table.py`
Design Document
See `docs/plans/2025-01-14-multi-table-design.md` for the full design specification.
Test plan
- Tests in `test_multi_table.py` cover storage, add/remove, accessor, ambiguity errors
🤖 Generated with Claude Code