Doc and tests for search multiple partitions #227

paultranvan · 2026-01-29T10:08:47Z

Add comprehensive User Management & Authentication section to CLAUDE.md covering database schema, auth flow, role hierarchy, and API endpoints
Document multi-partition search behavior (partitions=all resolves to user's accessible partitions for regular users)
Add API tests for multi-partition search with user access restrictions
Add API tests for chat completions with openrag-all model
Enable AUTH_TOKEN in test docker-compose for auth testing

Summary by CodeRabbit

Documentation
- Added comprehensive User Management & Authentication docs covering token-based RBAC, roles, and API behavior.
Tests
- Tests now support token-based auth and include new fixtures and expanded coverage for multi-partition scenarios, access controls, and stricter API response assertions.
Chores
- CI and test workflows updated to enable auth tokens and super-admin mode; linting scope narrowed to core and test directories.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-29T10:09:02Z

Warning

Rate limit exceeded

@paultranvan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 14 minutes and 44 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

Adds CI/docker env flags for token-based auth and super-admin mode, extensive docs for user management/auth, and many auth-aware test updates (fixtures, multi-partition access tests) enforcing token-based RBAC across search and OpenAI-compatible endpoints.

Changes

Cohort / File(s)	Summary
CI / Docker Compose `.github/workflows/api_tests/docker-compose.yaml`, `.github/workflows/api_tests.yml`	Added `AUTH_TOKEN` (`test-admin-token`) to CI step and `openrag` service env; added `SUPER_ADMIN_MODE=true` to `openrag` service in docker-compose.
Documentation `CLAUDE.md`	Inserted a comprehensive User Management & Authentication section describing token-based RBAC, schema, auth flow, token format/bootstrapping, API endpoints, and testing patterns (duplicated content appears in multiple doc contexts).
Test fixtures & config `tests/api_tests/conftest.py`	Added `is_auth_enabled()` helper; `api_client` now attaches Authorization when enabled; added `created_user` and `user_client` fixtures; removed prior health-check wait loop.
OpenAI-compatible tests `tests/api_tests/test_openai_compat.py`	Tightened model/list assertions; added completion/chat tests; added `TestChatCompletionsMultiPartition` and `TestChatCompletionsUserAccess` with fixtures to create/index partitions, poll indexing, and assert multi-partition access and model visibility.
Search tests `tests/api_tests/test_search.py`	Added multi-partition search coverage and `TestMultiPartitionUserAccess` with `two_partitions_with_docs` fixture; tests for user/admin differences for `partitions=all`, unauthorized access (403), and admin visibility under `SUPER_ADMIN_MODE`.
Misc tests `openrag/test_version.py`	Removed explicit version-equality test (`test_package_version_matches_expected`).

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant API as API Service
    participant Auth as Auth Handler
    participant DB as Database

    Client->>API: Request (e.g. /v1/chat/completions) with Bearer token
    API->>Auth: Validate token / resolve user context
    Auth->>DB: Fetch user, roles, partition memberships
    DB-->>Auth: Return permissions
    Auth-->>API: User context (roles, partitions)
    API->>DB: Query/verify requested partition(s)
    DB-->>API: Access allowed / denied
    API-->>Client: 200 OK or 403 Forbidden

sequenceDiagram
    participant Tester as Test Runner
    participant CI as CI Workflow
    participant Compose as Docker Compose
    participant Service as openrag Service

    Tester->>CI: Trigger "Run API tests"
    CI->>Compose: Start services (env: AUTH_TOKEN, SUPER_ADMIN_MODE)
    Compose->>Service: Launch openrag with env vars
    Tester->>Service: API requests from tests (api_client / user_client)
    Service->>Auth: Enforce token & partition RBAC
    Auth-->>Service: Allow or deny
    Service-->>Tester: Test responses (200 / 403 / 404)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

chore: Add CLAUDE.md for better AI understanding #204 — Overlaps on CLAUDE.md documentation edits; likely related documentation changes.

Poem

🐰 I found a token by CI light,
Hopped through partitions left and right,
Admins may roam, users stay fenced,
Tests now check each gate, well-sensed,
The rabbit nods — access aligned 🥕

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly summarizes the main objective of the PR: adding documentation and tests for searching multiple partitions.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@CLAUDE.md`:
- Around line 111-153: The markdown tables under "User API Endpoints" and
"Partition Membership Endpoints" in CLAUDE.md are violating markdownlint rules
MD058/MD060; fix by adding a blank line before and after each table and
normalizing pipe spacing so each cell uses a single space after and before the
pipe (e.g., "| Column | Column |"), ensure header separator rows use consistent
dashes and pipes, and re-run markdownlint to confirm MD058/MD060 are resolved.

In `@tests/api_tests/test_openai_compat.py`:
- Around line 94-125: In the three test functions
test_chat_completions_with_openrag_all,
test_chat_completions_with_specific_partition, and
test_models_list_includes_openrag_all replace identity comparisons using "is"
when checking response.status_code against the literal 200 with equality
comparisons "=="; update the assertions (e.g., change assert
response.status_code is 200 to assert response.status_code == 200) so the status
code is compared by value rather than object identity.

🧹 Nitpick comments (4)

tests/api_tests/test_search.py (2)

108-147: Harden setup and indexing wait to reduce flakes.

Right now partition creation/indexing responses are unchecked and task_status_url isn’t handled, which can yield before indexing completes.

♻️ Suggested update

-        # Create partitions
-        api_client.post(f"/partition/{partition1}")
-        api_client.post(f"/partition/{partition2}")
+        # Create partitions
+        resp = api_client.post(f"/partition/{partition1}")
+        assert resp.status_code in [200, 201], f"Failed to create partition1: {resp.text}"
+        resp = api_client.post(f"/partition/{partition2}")
+        assert resp.status_code in [200, 201], f"Failed to create partition2: {resp.text}"
@@
-        # Wait for indexing
-        data = response.json()
-        if "task_id" in data:
-            task_path = f"/indexer/task/{data['task_id']}"
-            for _ in range(30):
-                task_response = api_client.get(task_path)
-                if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]:
-                    break
-                time.sleep(2)
+        # Wait for indexing
+        assert response.status_code in [200, 201], f"Failed to index file: {response.text}"
+        data = response.json()
+        if "task_status_url" in data:
+            task_path = "/" + "/".join(data["task_status_url"].split("/")[3:])
+        elif "task_id" in data:
+            task_path = f"/indexer/task/{data['task_id']}"
+        else:
+            task_path = None
+
+        if task_path:
+            for _ in range(30):
+                task_response = api_client.get(task_path)
+                state = task_response.json().get("task_state", "")
+                if state in ["SUCCESS", "COMPLETED", "success", "completed"]:
+                    break
+                if state in ["FAILED", "FAILURE", "failed", "failure"]:
+                    pytest.skip(f"Indexing failed: {task_response.json()}")
+                time.sleep(2)

192-200: Strengthen the “sees all partitions” assertion.

The test only checks that "documents" exists. To validate multi-partition coverage, assert returned partitions (and ensure partition2 has docs in the fixture).

✅ Suggested assertion

 def test_admin_search_all_sees_all_partitions(self, api_client, two_partitions_with_docs):
     """Admin with partitions=all can see all partitions (if SUPER_ADMIN_MODE)."""
+    partition1 = two_partitions_with_docs["partition1"]
+    partition2 = two_partitions_with_docs["partition2"]
     response = api_client.get(
         "/search",
         params={"partitions": "all", "text": "machine learning", "top_k": 10},
     )
     assert response.status_code == 200
     data = response.json()
-    assert "documents" in data
+    partitions = {doc["metadata"]["partition"] for doc in data.get("documents", [])}
+    assert partition1 in partitions
+    assert partition2 in partitions

tests/api_tests/test_openai_compat.py (2)

58-92: Harden indexing wait to reduce flakes.

The fixture doesn’t handle task_status_url and doesn’t fail fast on index errors.

♻️ Suggested update

         # Index file
         with open(sample_text_file, "rb") as f:
             response = api_client.post(
                 f"/indexer/partition/{partition}/file/chat-doc",
                 files={"file": ("test.txt", f, "text/plain")},
                 data={"metadata": "{}"},
             )
 
         # Wait for indexing
-        if response.status_code in [200, 201]:
-            data = response.json()
-            if "task_id" in data:
-                task_path = f"/indexer/task/{data['task_id']}"
-                for _ in range(30):
-                    task_response = api_client.get(task_path)
-                    if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]:
-                        break
-                    time.sleep(2)
+        assert response.status_code in [200, 201], f"Failed to index file: {response.text}"
+        data = response.json()
+        if "task_status_url" in data:
+            task_path = "/" + "/".join(data["task_status_url"].split("/")[3:])
+        elif "task_id" in data:
+            task_path = f"/indexer/task/{data['task_id']}"
+        else:
+            task_path = None
+
+        if task_path:
+            for _ in range(30):
+                task_response = api_client.get(task_path)
+                state = task_response.json().get("task_state", "")
+                if state in ["SUCCESS", "COMPLETED", "success", "completed"]:
+                    break
+                if state in ["FAILED", "FAILURE", "failed", "failure"]:
+                    pytest.skip(f"Indexing failed: {task_response.json()}")
+                time.sleep(2)

131-175: Harden setup and indexing wait to reduce flakes.

Partition creation/indexing responses are unchecked and task_status_url isn’t handled.

♻️ Suggested update

-        # Create partitions
-        api_client.post(f"/partition/{partition1}")
-        api_client.post(f"/partition/{partition2}")
+        # Create partitions
+        resp = api_client.post(f"/partition/{partition1}")
+        assert resp.status_code in [200, 201], f"Failed to create {partition1}: {resp.text}"
+        resp = api_client.post(f"/partition/{partition2}")
+        assert resp.status_code in [200, 201], f"Failed to create {partition2}: {resp.text}"
@@
-        # Wait for indexing
-        if response.status_code in [200, 201]:
-            data = response.json()
-            if "task_id" in data:
-                task_path = f"/indexer/task/{data['task_id']}"
-                for _ in range(30):
-                    task_response = api_client.get(task_path)
-                    if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]:
-                        break
-                    time.sleep(2)
+        # Wait for indexing
+        assert response.status_code in [200, 201], f"Failed to index file: {response.text}"
+        data = response.json()
+        if "task_status_url" in data:
+            task_path = "/" + "/".join(data["task_status_url"].split("/")[3:])
+        elif "task_id" in data:
+            task_path = f"/indexer/task/{data['task_id']}"
+        else:
+            task_path = None
+
+        if task_path:
+            for _ in range(30):
+                task_response = api_client.get(task_path)
+                state = task_response.json().get("task_state", "")
+                if state in ["SUCCESS", "COMPLETED", "success", "completed"]:
+                    break
+                if state in ["FAILED", "FAILURE", "failed", "failure"]:
+                    pytest.skip(f"Indexing failed: {task_response.json()}")
+                time.sleep(2)

CLAUDE.md

tests/api_tests/test_openai_compat.py

- Add comprehensive User Management & Authentication section covering: - Database schema (users, partition_memberships, partitions) - Authentication flow via AuthMiddleware - Role hierarchy (viewer < editor < owner) - Permission dependencies and SUPER_ADMIN_MODE behavior - User and partition membership API endpoints - Core implementation patterns via PartitionFileManager - Document multi-partition search behavior: - partitions=all resolves to user's accessible partitions - Admin with SUPER_ADMIN_MODE sees all system partitions - Model prefix is openrag- (legacy: ragondin-) Co-Authored-By: Claude Opus 4.5 <[email protected]>

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@tests/api_tests/test_openai_compat.py`:
- Around line 1-9: Run the Ruff formatter on the test module to fix style
issues: execute `ruff format` (or your editor's Ruff integration) for the
OpenAI-compatible tests file (the test_openai_compat.py module containing the
module docstring and imports) so the file adheres to project linting rules and
removes the CI Ruff failure.

In `@tests/api_tests/test_search.py`:
- Around line 109-139: The fixture two_partitions_with_docs should create and
verify both partitions and index a document into each so tests requiring
cross-partition data are valid; update the calls that create partitions
(api_client.post for f"/partition/{partition1}" and f"/partition/{partition2}")
to assert successful creation, then POST the sample_text_file to both
f"/indexer/partition/{partition1}/file/doc1" and
f"/indexer/partition/{partition2}/file/doc2"; when handling the indexing
response, prefer any returned task_status_url or task_id from response.json(),
poll that URL (or /indexer/task/{task_id}) with a bounded timeout, bail early on
explicit failure states and raise if neither SUCCESS nor COMPLETED is reached to
avoid flaky waits, and then yield the two-partition map.

🧹 Nitpick comments (1)

tests/api_tests/test_openai_compat.py (1)

58-85: Harden indexing wait logic and reuse it across fixtures.

Both fixtures only handle task_id and never assert completion, which can produce flaky tests when task_status_url is returned or indexing stalls. A small helper can make this reliable and DRY.

♻️ Proposed refactor

 import pytest

 from . import conftest

+def _wait_for_indexing(api_client, response):
+    assert response.status_code in [200, 201], f"Indexing failed: {response.text}"
+    data = response.json()
+    if "task_status_url" in data:
+        task_url = data["task_status_url"]
+        task_path = "/" + "/".join(task_url.split("/")[3:])
+    elif "task_id" in data:
+        task_path = f"/indexer/task/{data['task_id']}"
+    else:
+        return
+    for _ in range(30):
+        task_response = api_client.get(task_path)
+        if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]:
+            return
+        time.sleep(2)
+    pytest.fail("Indexing did not complete in time")
+
 class TestChatCompletionsMultiPartition:
     """Test chat completions with multi-partition access."""
@@
         with open(sample_text_file, "rb") as f:
             response = api_client.post(
                 f"/indexer/partition/{partition}/file/chat-doc",
                 files={"file": ("test.txt", f, "text/plain")},
                 data={"metadata": "{}"},
             )
-
-        # Wait for indexing
-        if response.status_code in [200, 201]:
-            data = response.json()
-            if "task_id" in data:
-                task_path = f"/indexer/task/{data['task_id']}"
-                for _ in range(30):
-                    task_response = api_client.get(task_path)
-                    if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]:
-                        break
-                    time.sleep(2)
+        _wait_for_indexing(api_client, response)
@@
         with open(file_path, "rb") as f:
             response = api_client.post(
                 f"/indexer/partition/{partition1}/file/doc1",
                 files={"file": ("test.txt", f, "text/plain")},
                 data={"metadata": "{}"},
             )
-
-        # Wait for indexing
-        if response.status_code in [200, 201]:
-            data = response.json()
-            if "task_id" in data:
-                task_path = f"/indexer/task/{data['task_id']}"
-                for _ in range(30):
-                    task_response = api_client.get(task_path)
-                    if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]:
-                        break
-                    time.sleep(2)
+        _wait_for_indexing(api_client, response)

Also applies to: 131-166

tests/api_tests/test_openai_compat.py

tests/api_tests/test_search.py

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@tests/api_tests/test_openai_compat.py`:
- Around line 129-165: The fixture two_partitions_for_chat only indexes a
document into partition1, leaving partition2 empty and allowing false-positive
access-control results; update the fixture to (1) verify partition creation
responses for both partition1 and partition2 after calling api_client.post, (2)
create and POST a second file (or reuse the same file) to
/indexer/partition/{partition2}/file/doc2 so partition2 contains indexed data,
and (3) replicate the existing indexing wait loop (checking
task_response.json().get("task_state") for "SUCCESS"/"COMPLETED") for the task
returned when indexing partition2 to ensure indexing finished before yielding
the partitions.

🧹 Nitpick comments (1)

tests/api_tests/test_openai_compat.py (1)

73-83: Make the indexing wait fail-fast and handle task_status_url.
Right now the fixture can proceed without a completed index if only task_status_url is returned or if the task never completes.

Proposed refinement

-        # Wait for indexing
-        if response.status_code in [200, 201]:
-            data = response.json()
-            if "task_id" in data:
-                task_path = f"/indexer/task/{data['task_id']}"
-                for _ in range(30):
-                    task_response = api_client.get(task_path)
-                    if task_response.json().get("task_state") in ["SUCCESS", "COMPLETED"]:
-                        break
-                    time.sleep(2)
+        assert response.status_code in [200, 201], f"Indexing failed: {response.text}"
+        data = response.json()
+        if "task_status_url" in data:
+            task_url = data["task_status_url"]
+            task_path = "/" + "/".join(task_url.split("/")[3:])
+        elif "task_id" in data:
+            task_path = f"/indexer/task/{data['task_id']}"
+        else:
+            pytest.fail("Indexing did not return task info")
+        for _ in range(30):
+            task_response = api_client.get(task_path)
+            state = task_response.json().get("task_state", "")
+            if state in ["SUCCESS", "COMPLETED", "success", "completed"]:
+                break
+            if state in ["FAILED", "FAILURE", "failed", "failure"]:
+                pytest.fail(f"Indexing failed: {task_response.json()}")
+            time.sleep(2)
+        else:
+            pytest.fail("Indexing did not complete in time")

tests/api_tests/test_openai_compat.py

- Add auth support to test fixtures (AUTH_TOKEN, created_user, user_client) - Add multi-partition search tests: - test_search_all_partitions - test_user_search_all_only_sees_own_partitions - test_user_cannot_search_unauthorized_partition - test_admin_search_all_sees_all_partitions - Add chat completions multi-partition tests: - test_chat_completions_with_openrag_all - test_chat_completions_with_specific_partition - test_models_list_includes_openrag_all - Add chat completions user access tests: - test_user_chat_all_only_uses_own_partitions - test_user_cannot_chat_unauthorized_partition - test_user_models_list_only_shows_accessible - Enable AUTH_TOKEN in test docker-compose for auth testing Co-Authored-By: Claude Opus 4.5 <[email protected]>

paultranvan changed the title ~~Search multiple partitions~~ Doc and tests for search multiple partitions Jan 29, 2026

paultranvan added the chore No production code impact, typically improve tooling, code quality, etc label Jan 29, 2026

coderabbitai bot reviewed Jan 29, 2026

View reviewed changes

CLAUDE.md Show resolved Hide resolved

tests/api_tests/test_openai_compat.py Show resolved Hide resolved

paultranvan force-pushed the search-multiple-partitions branch from 5ad06d1 to 24c31f0 Compare January 29, 2026 11:10

paultranvan force-pushed the search-multiple-partitions branch from 24c31f0 to 26361a6 Compare January 29, 2026 11:15

coderabbitai bot reviewed Jan 29, 2026

View reviewed changes

tests/api_tests/test_openai_compat.py Show resolved Hide resolved

tests/api_tests/test_search.py Show resolved Hide resolved

paultranvan force-pushed the search-multiple-partitions branch from 26361a6 to 48689d1 Compare January 29, 2026 11:17

coderabbitai bot reviewed Jan 29, 2026

View reviewed changes

tests/api_tests/test_openai_compat.py Show resolved Hide resolved

paultranvan and others added 3 commits January 29, 2026 12:25

fix: CI lint

eada6e1

test: Remove useless test

b4d63a3

paultranvan force-pushed the search-multiple-partitions branch from 48689d1 to b4d63a3 Compare January 29, 2026 11:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Doc and tests for search multiple partitions #227

Doc and tests for search multiple partitions #227

Uh oh!

paultranvan commented Jan 29, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Jan 29, 2026 •

edited

Loading

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Doc and tests for search multiple partitions #227

Are you sure you want to change the base?

Doc and tests for search multiple partitions #227

Uh oh!

Conversation

paultranvan commented Jan 29, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

paultranvan commented Jan 29, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 29, 2026 •

edited

Loading