-
Notifications
You must be signed in to change notification settings - Fork 19
Doc and tests for search multiple partitions #227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughAdds CI/docker env flags for token-based auth and super-admin mode, extensive docs for user management/auth, and many auth-aware test updates (fixtures, multi-partition access tests) enforcing token-based RBAC across search and OpenAI-compatible endpoints. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant API as API Service
participant Auth as Auth Handler
participant DB as Database
Client->>API: Request (e.g. /v1/chat/completions) with Bearer token
API->>Auth: Validate token / resolve user context
Auth->>DB: Fetch user, roles, partition memberships
DB-->>Auth: Return permissions
Auth-->>API: User context (roles, partitions)
API->>DB: Query/verify requested partition(s)
DB-->>API: Access allowed / denied
API-->>Client: 200 OK or 403 Forbidden
sequenceDiagram
participant Tester as Test Runner
participant CI as CI Workflow
participant Compose as Docker Compose
participant Service as openrag Service
Tester->>CI: Trigger "Run API tests"
CI->>Compose: Start services (env: AUTH_TOKEN, SUPER_ADMIN_MODE)
Compose->>Service: Launch openrag with env vars
Tester->>Service: API requests from tests (api_client / user_client)
Service->>Auth: Enforce token & partition RBAC
Auth-->>Service: Allow or deny
Service-->>Tester: Test responses (200 / 403 / 404)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@CLAUDE.md`:
- Around line 111-153: The markdown tables under "User API Endpoints" and
"Partition Membership Endpoints" in CLAUDE.md are violating markdownlint rules
MD058/MD060; fix by adding a blank line before and after each table and
normalizing pipe spacing so each cell uses a single space after and before the
pipe (e.g., "| Column | Column |"), ensure header separator rows use consistent
dashes and pipes, and re-run markdownlint to confirm MD058/MD060 are resolved.
In `@tests/api_tests/test_openai_compat.py`:
- Around line 94-125: In the three test functions
test_chat_completions_with_openrag_all,
test_chat_completions_with_specific_partition, and
test_models_list_includes_openrag_all replace identity comparisons using "is"
when checking response.status_code against the literal 200 with equality
comparisons "=="; update the assertions (e.g., change assert
response.status_code is 200 to assert response.status_code == 200) so the status
code is compared by value rather than object identity.
🧹 Nitpick comments (4)
tests/api_tests/test_search.py (2)
108-147: Harden setup and indexing wait to reduce flakes.Right now partition creation/indexing responses are unchecked and
task_status_urlisn’t handled, which can yield before indexing completes.♻️ Suggested update
- # Create partitions - api_client.post(f"/partition/{partition1}") - api_client.post(f"/partition/{partition2}") + # Create partitions + resp = api_client.post(f"/partition/{partition1}") + assert resp.status_code in [200, 201], f"Failed to create partition1: {resp.text}" + resp = api_client.post(f"/partition/{partition2}") + assert resp.status_code in [200, 201], f"Failed to create partition2: {resp.text}" @@ - # Wait for indexing - data = response.json() - if "task_id" in data: - task_path = f"/indexer/task/{data['task_id']}" - for _ in range(30): - task_response = api_client.get(task_path) - if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]: - break - time.sleep(2) + # Wait for indexing + assert response.status_code in [200, 201], f"Failed to index file: {response.text}" + data = response.json() + if "task_status_url" in data: + task_path = "/" + "/".join(data["task_status_url"].split("/")[3:]) + elif "task_id" in data: + task_path = f"/indexer/task/{data['task_id']}" + else: + task_path = None + + if task_path: + for _ in range(30): + task_response = api_client.get(task_path) + state = task_response.json().get("task_state", "") + if state in ["SUCCESS", "COMPLETED", "success", "completed"]: + break + if state in ["FAILED", "FAILURE", "failed", "failure"]: + pytest.skip(f"Indexing failed: {task_response.json()}") + time.sleep(2)
192-200: Strengthen the “sees all partitions” assertion.The test only checks that
"documents"exists. To validate multi-partition coverage, assert returned partitions (and ensure partition2 has docs in the fixture).✅ Suggested assertion
def test_admin_search_all_sees_all_partitions(self, api_client, two_partitions_with_docs): """Admin with partitions=all can see all partitions (if SUPER_ADMIN_MODE).""" + partition1 = two_partitions_with_docs["partition1"] + partition2 = two_partitions_with_docs["partition2"] response = api_client.get( "/search", params={"partitions": "all", "text": "machine learning", "top_k": 10}, ) assert response.status_code == 200 data = response.json() - assert "documents" in data + partitions = {doc["metadata"]["partition"] for doc in data.get("documents", [])} + assert partition1 in partitions + assert partition2 in partitionstests/api_tests/test_openai_compat.py (2)
58-92: Harden indexing wait to reduce flakes.The fixture doesn’t handle
task_status_urland doesn’t fail fast on index errors.♻️ Suggested update
# Index file with open(sample_text_file, "rb") as f: response = api_client.post( f"/indexer/partition/{partition}/file/chat-doc", files={"file": ("test.txt", f, "text/plain")}, data={"metadata": "{}"}, ) # Wait for indexing - if response.status_code in [200, 201]: - data = response.json() - if "task_id" in data: - task_path = f"/indexer/task/{data['task_id']}" - for _ in range(30): - task_response = api_client.get(task_path) - if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]: - break - time.sleep(2) + assert response.status_code in [200, 201], f"Failed to index file: {response.text}" + data = response.json() + if "task_status_url" in data: + task_path = "/" + "/".join(data["task_status_url"].split("/")[3:]) + elif "task_id" in data: + task_path = f"/indexer/task/{data['task_id']}" + else: + task_path = None + + if task_path: + for _ in range(30): + task_response = api_client.get(task_path) + state = task_response.json().get("task_state", "") + if state in ["SUCCESS", "COMPLETED", "success", "completed"]: + break + if state in ["FAILED", "FAILURE", "failed", "failure"]: + pytest.skip(f"Indexing failed: {task_response.json()}") + time.sleep(2)
131-175: Harden setup and indexing wait to reduce flakes.Partition creation/indexing responses are unchecked and
task_status_urlisn’t handled.♻️ Suggested update
- # Create partitions - api_client.post(f"/partition/{partition1}") - api_client.post(f"/partition/{partition2}") + # Create partitions + resp = api_client.post(f"/partition/{partition1}") + assert resp.status_code in [200, 201], f"Failed to create {partition1}: {resp.text}" + resp = api_client.post(f"/partition/{partition2}") + assert resp.status_code in [200, 201], f"Failed to create {partition2}: {resp.text}" @@ - # Wait for indexing - if response.status_code in [200, 201]: - data = response.json() - if "task_id" in data: - task_path = f"/indexer/task/{data['task_id']}" - for _ in range(30): - task_response = api_client.get(task_path) - if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]: - break - time.sleep(2) + # Wait for indexing + assert response.status_code in [200, 201], f"Failed to index file: {response.text}" + data = response.json() + if "task_status_url" in data: + task_path = "/" + "/".join(data["task_status_url"].split("/")[3:]) + elif "task_id" in data: + task_path = f"/indexer/task/{data['task_id']}" + else: + task_path = None + + if task_path: + for _ in range(30): + task_response = api_client.get(task_path) + state = task_response.json().get("task_state", "") + if state in ["SUCCESS", "COMPLETED", "success", "completed"]: + break + if state in ["FAILED", "FAILURE", "failed", "failure"]: + pytest.skip(f"Indexing failed: {task_response.json()}") + time.sleep(2)
5ad06d1 to
24c31f0
Compare
- Add comprehensive User Management & Authentication section covering: - Database schema (users, partition_memberships, partitions) - Authentication flow via AuthMiddleware - Role hierarchy (viewer < editor < owner) - Permission dependencies and SUPER_ADMIN_MODE behavior - User and partition membership API endpoints - Core implementation patterns via PartitionFileManager - Document multi-partition search behavior: - partitions=all resolves to user's accessible partitions - Admin with SUPER_ADMIN_MODE sees all system partitions - Model prefix is openrag- (legacy: ragondin-) Co-Authored-By: Claude Opus 4.5 <[email protected]>
24c31f0 to
26361a6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@tests/api_tests/test_openai_compat.py`:
- Around line 1-9: Run the Ruff formatter on the test module to fix style
issues: execute `ruff format` (or your editor's Ruff integration) for the
OpenAI-compatible tests file (the test_openai_compat.py module containing the
module docstring and imports) so the file adheres to project linting rules and
removes the CI Ruff failure.
In `@tests/api_tests/test_search.py`:
- Around line 109-139: The fixture two_partitions_with_docs should create and
verify both partitions and index a document into each so tests requiring
cross-partition data are valid; update the calls that create partitions
(api_client.post for f"/partition/{partition1}" and f"/partition/{partition2}")
to assert successful creation, then POST the sample_text_file to both
f"/indexer/partition/{partition1}/file/doc1" and
f"/indexer/partition/{partition2}/file/doc2"; when handling the indexing
response, prefer any returned task_status_url or task_id from response.json(),
poll that URL (or /indexer/task/{task_id}) with a bounded timeout, bail early on
explicit failure states and raise if neither SUCCESS nor COMPLETED is reached to
avoid flaky waits, and then yield the two-partition map.
🧹 Nitpick comments (1)
tests/api_tests/test_openai_compat.py (1)
58-85: Harden indexing wait logic and reuse it across fixtures.Both fixtures only handle
task_idand never assert completion, which can produce flaky tests whentask_status_urlis returned or indexing stalls. A small helper can make this reliable and DRY.♻️ Proposed refactor
import pytest from . import conftest +def _wait_for_indexing(api_client, response): + assert response.status_code in [200, 201], f"Indexing failed: {response.text}" + data = response.json() + if "task_status_url" in data: + task_url = data["task_status_url"] + task_path = "/" + "/".join(task_url.split("/")[3:]) + elif "task_id" in data: + task_path = f"/indexer/task/{data['task_id']}" + else: + return + for _ in range(30): + task_response = api_client.get(task_path) + if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]: + return + time.sleep(2) + pytest.fail("Indexing did not complete in time") + class TestChatCompletionsMultiPartition: """Test chat completions with multi-partition access.""" @@ with open(sample_text_file, "rb") as f: response = api_client.post( f"/indexer/partition/{partition}/file/chat-doc", files={"file": ("test.txt", f, "text/plain")}, data={"metadata": "{}"}, ) - - # Wait for indexing - if response.status_code in [200, 201]: - data = response.json() - if "task_id" in data: - task_path = f"/indexer/task/{data['task_id']}" - for _ in range(30): - task_response = api_client.get(task_path) - if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]: - break - time.sleep(2) + _wait_for_indexing(api_client, response) @@ with open(file_path, "rb") as f: response = api_client.post( f"/indexer/partition/{partition1}/file/doc1", files={"file": ("test.txt", f, "text/plain")}, data={"metadata": "{}"}, ) - - # Wait for indexing - if response.status_code in [200, 201]: - data = response.json() - if "task_id" in data: - task_path = f"/indexer/task/{data['task_id']}" - for _ in range(30): - task_response = api_client.get(task_path) - if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]: - break - time.sleep(2) + _wait_for_indexing(api_client, response)Also applies to: 131-166
26361a6 to
48689d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/api_tests/test_openai_compat.py`:
- Around line 129-165: The fixture two_partitions_for_chat only indexes a
document into partition1, leaving partition2 empty and allowing false-positive
access-control results; update the fixture to (1) verify partition creation
responses for both partition1 and partition2 after calling api_client.post, (2)
create and POST a second file (or reuse the same file) to
/indexer/partition/{partition2}/file/doc2 so partition2 contains indexed data,
and (3) replicate the existing indexing wait loop (checking
task_response.json().get("task_state") for "SUCCESS"/"COMPLETED") for the task
returned when indexing partition2 to ensure indexing finished before yielding
the partitions.
🧹 Nitpick comments (1)
tests/api_tests/test_openai_compat.py (1)
73-83: Make the indexing wait fail-fast and handletask_status_url.
Right now the fixture can proceed without a completed index if onlytask_status_urlis returned or if the task never completes.Proposed refinement
- # Wait for indexing - if response.status_code in [200, 201]: - data = response.json() - if "task_id" in data: - task_path = f"/indexer/task/{data['task_id']}" - for _ in range(30): - task_response = api_client.get(task_path) - if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]: - break - time.sleep(2) + assert response.status_code in [200, 201], f"Indexing failed: {response.text}" + data = response.json() + if "task_status_url" in data: + task_url = data["task_status_url"] + task_path = "/" + "/".join(task_url.split("/")[3:]) + elif "task_id" in data: + task_path = f"/indexer/task/{data['task_id']}" + else: + pytest.fail("Indexing did not return task info") + for _ in range(30): + task_response = api_client.get(task_path) + state = task_response.json().get("task_state", "") + if state in ["SUCCESS", "COMPLETED", "success", "completed"]: + break + if state in ["FAILED", "FAILURE", "failed", "failure"]: + pytest.fail(f"Indexing failed: {task_response.json()}") + time.sleep(2) + else: + pytest.fail("Indexing did not complete in time")
- Add auth support to test fixtures (AUTH_TOKEN, created_user, user_client) - Add multi-partition search tests: - test_search_all_partitions - test_user_search_all_only_sees_own_partitions - test_user_cannot_search_unauthorized_partition - test_admin_search_all_sees_all_partitions - Add chat completions multi-partition tests: - test_chat_completions_with_openrag_all - test_chat_completions_with_specific_partition - test_models_list_includes_openrag_all - Add chat completions user access tests: - test_user_chat_all_only_uses_own_partitions - test_user_cannot_chat_unauthorized_partition - test_user_models_list_only_shows_accessible - Enable AUTH_TOKEN in test docker-compose for auth testing Co-Authored-By: Claude Opus 4.5 <[email protected]>
48689d1 to
b4d63a3
Compare
Summary by CodeRabbit
Documentation
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.