# Concordance

Concordance is an open-source inference stack that lets you observe, modify, and control LLM generation in real time. It provides:
- Quote Engine — An inference server with a programmable mod system for token-level intervention
- Thunder Backend — Observability service that captures full inference traces
- Web UI — Frontend for exploring traces, viewing mod actions, and debugging generation
- CLI — Command-line tool for local development and mod management
- [Architecture](#architecture)
- [Quick Start](#quick-start)
- [Manual Setup](#manual-setup)
- [Writing Mods](#writing-mods)
- [Deployment (Modal)](#deployment-modal)
- [Component Documentation](#component-documentation)
- [Configuration Reference](#configuration-reference)
- [Project Structure](#project-structure)
- [Contributing](#contributing)
- [License](#license)
## Architecture

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Frontend   │────▶│   Backend   │◀────│   Engine    │
│ (React/TS)  │     │ (Rust/Axum) │     │  (Python)   │
└─────────────┘     └─────────────┘     └─────────────┘
     :3000               :6767               :8000
```

The frontend talks to the backend's API, and the engine sends inference logs to the backend for observability.
## Quick Start

The fastest way to get Concordance running is with our interactive setup script.

### Prerequisites

Before running the setup script, make sure you have the following installed:
| Tool | Purpose | Installation |
|---|---|---|
| uv | Python package manager | `curl -LsSf https://astral.sh/uv/install.sh \| sh` |
| Rust | Backend and CLI | `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \| sh` |
| Node.js 18+ | Frontend | Download from [nodejs.org](https://nodejs.org) or use your package manager |
| psql | Database migrations | `brew install postgresql` (macOS) or `apt install postgresql-client` (Linux) |
You'll also need:
- A Hugging Face account with an API token (create one at [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens))
**1. Set up a PostgreSQL database**

The backend requires a PostgreSQL database. We recommend [Neon](https://neon.tech) for a free, serverless Postgres:

- Create an account at [neon.tech](https://neon.tech)
- Create a new project
- Copy your connection string from the dashboard

Your connection string will look like:

```
postgresql://user:[email protected]/dbname?sslmode=require
```

**2. Get your Hugging Face token**

- Go to [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
- Create a new token with read access
- Copy the token (it starts with `hf_`)
Clone the repository and run the interactive setup:

```bash
git clone https://github.com/concordance-co/quote.git
cd quote
./setup.sh
```

The setup script will guide you through configuring all components:
| Step | What it configures |
|---|---|
| Prerequisites Check | Verifies uv, Rust, Node.js, npm are installed; optionally installs missing tools |
| Backend Setup | Database URL, server host/port, bootstrap secret, playground settings |
| Database Migrations | Runs SQL migrations to create required tables |
| Engine Setup | HF token, admin key, model ID, deployment mode (local/Modal), server settings |
| Frontend Setup | API URL, WebSocket URL for real-time streaming |
| Dependency Installation | Builds backend, installs Python/Node packages |
You can also run setup for individual components:

```bash
./setup.sh --quick backend    # Set up only the backend
./setup.sh --quick engine     # Set up only the engine
./setup.sh --quick frontend   # Set up only the frontend
./setup.sh --quick all        # Set up everything (non-interactive defaults)
```

After setup, use the run script to start all services:
```bash
./run.sh start          # Start all services
./run.sh status         # Check service status
./run.sh logs engine    # View engine logs
./run.sh stop           # Stop all services
```

Or start services individually:

```bash
./run.sh start backend
./run.sh start engine
./run.sh start frontend
```

Once running:
- Frontend: http://localhost:3000
- Backend API: http://localhost:6767
- Engine API: http://localhost:8000
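
To confirm all three services are up, you can probe the same endpoints this README uses for verification in the Manual Setup section below. A stdlib-only Python sketch (the frontend check simply fetches the dev server's index page):

```python
import urllib.request

# Probe each service. /healthz and /v1/models are the verification
# endpoints used in the Manual Setup section; the frontend URL is
# just the dev server's root page.
checks = {
    "backend":  "http://localhost:6767/healthz",
    "engine":   "http://localhost:8000/v1/models",
    "frontend": "http://localhost:3000/",
}

for name, url in checks.items():
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            print(f"{name}: HTTP {resp.status}")
    except Exception as exc:  # connection refused, timeout, HTTP errors
        print(f"{name}: DOWN ({exc})")
```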
Ready to build your first mod? Visit docs.concordance.co to get started!
## Manual Setup

If you prefer to set things up manually, follow these steps:
### 1. Set up a PostgreSQL database

The backend requires a PostgreSQL database. We recommend [Neon](https://neon.tech) for a free, serverless Postgres:

- Create an account at [neon.tech](https://neon.tech)
- Create a new project
- Copy your connection string from the dashboard (it looks like `postgresql://user:[email protected]/dbname`)

See Neon's quickstart guide for detailed instructions.
### 2. Backend

```bash
cd backend
cp .env.example .env
```

Edit `.env` and set your database URL:

```
DATABASE_URL=postgresql://user:[email protected]/dbname?sslmode=require
```

Run database migrations:

```bash
./run_migration.sh
```

Then start the server:

```bash
cargo run
```

Verify it's running:

```bash
curl http://localhost:6767/healthz
```

### 3. Engine

The engine runs LLM inference with mod support.
```bash
cd engine
```

Create an `inference/.env` file with your Hugging Face token:

```
HF_TOKEN=hf_your_token_here
MODEL_ID=modularai/Llama-3.1-8B-Instruct-GGUF
```

Install dependencies and start the server:

```bash
uv sync --all-packages
uv pip install -e inference
uv run -m quote.server.openai.local --host 0.0.0.0 --port 8000
```

**Note:** First run downloads the model and compiles it, which takes several minutes. Subsequent starts are faster.
Test the engine:

```bash
curl http://localhost:8000/v1/models
```

### 4. Frontend

```bash
cd frontend
npm install
npm run dev
```

Open http://localhost:3000 in your browser.
## Writing Mods

Mods let you intercept and modify inference at the token level. Here's a simple example:
```python
from quote_mod_sdk import mod, ForwardPassEvent, tokenize

@mod
def inject_thinking(event, actions, tokenizer):
    if isinstance(event, ForwardPassEvent) and event.step == 0:
        tokens = tokenize("<think>", tokenizer)
        return actions.force_tokens(tokens)
    return actions.noop()
```
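
The same hooks support step-dependent behavior later in generation. Here is a sketch built only from the calls shown above (`event.step`, `tokenize`, `force_tokens`, `noop`); the step threshold is an arbitrary illustration, not an SDK constant:

```python
from quote_mod_sdk import mod, ForwardPassEvent, tokenize

@mod
def close_thinking(event, actions, tokenizer):
    # Hypothetical companion to inject_thinking: after a fixed budget of
    # forward passes, force the closing tag so generation leaves the
    # thinking block. Step 64 is an arbitrary example value.
    if isinstance(event, ForwardPassEvent) and event.step == 64:
        return actions.force_tokens(tokenize("</think>", tokenizer))
    return actions.noop()
```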
Upload mods to a running server:

```bash
# Install the CLI first
cargo install --path cli

# Upload your mod
concai mod upload --file-name my_mod.py
```

Then enable the mod in your API calls by appending the mod name to the model ID:
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "modularai/Llama-3.1-8B-Instruct-GGUF/inject_thinking",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

See engine/sdk/README.md for the full mod authoring guide, or visit docs.concordance.co to build your first mod!
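
Because the engine exposes the OpenAI-compatible `/v1` routes shown above, any OpenAI client can call it. A minimal Python sketch (assumes `pip install openai`; the API key value is a placeholder for a local dev server, so substitute whatever your engine's auth expects):

```python
from openai import OpenAI

# Point the client at the local engine instead of api.openai.com.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="placeholder-for-local-dev",  # assumption: swap in a real key if required
)

# Appending "/inject_thinking" to the model ID enables that mod, as above.
response = client.chat.completions.create(
    model="modularai/Llama-3.1-8B-Instruct-GGUF/inject_thinking",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```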
## Deployment (Modal)

For GPU inference in production, deploy the engine to Modal:

```bash
cd engine/inference
modal serve src/quote/server/openai/remote.py
```

Modal provides serverless GPU instances that scale to zero when not in use. See engine/inference/README.md for full deployment details.
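
For orientation only, a Modal ASGI deployment generally has the shape sketched below. This is not the contents of `remote.py`; the app name, image, GPU type, and stub route are all illustrative assumptions:

```python
import modal

# Illustrative only: remote.py is the source of truth for the real
# image, GPU class, and server wiring.
image = modal.Image.debian_slim().pip_install("fastapi")
app = modal.App("quote-engine-sketch", image=image)

@app.function(gpu="A100")  # GPU class is an assumption; size to the model
@modal.asgi_app()
def serve():
    from fastapi import FastAPI

    api = FastAPI()

    @api.get("/v1/models")
    def models():
        # Stub route; the real server exposes the full
        # OpenAI-compatible surface with mod support.
        return {"data": [{"id": "modularai/Llama-3.1-8B-Instruct-GGUF"}]}

    return api
```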
## Component Documentation

| Component | Description | Docs |
|---|---|---|
| Engine | Inference server with mod system | engine/inference/README.md |
| Mod SDK | Python SDK for authoring mods | engine/sdk/README.md |
| Backend | Observability and logging service | backend/README.md |
| CLI | Command-line tool | cli/README.md |
| Frontend | Web UI | frontend/README.md |
## Configuration Reference

### Backend (`backend/.env`)

| Variable | Required | Description |
|---|---|---|
| `DATABASE_URL` | Yes | Postgres connection string |
| `APP_HOST` | No | Server bind address (default: `127.0.0.1`) |
| `APP_PORT` | No | Server port (default: `6767`) |
| `BOOTSTRAP_SECRET` | No | Secret for creating the initial admin API key |
| `PLAYGROUND_ADMIN_KEY` | No | Admin key for the playground feature |
| `PLAYGROUND_LLAMA_8B_URL` | No | Modal URL for the Llama 8B playground |
| `PLAYGROUND_QWEN_14B_URL` | No | Modal URL for the Qwen 14B playground |
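
Putting the backend variables together, a local development `backend/.env` might look like this (all values are placeholders):

```
DATABASE_URL=postgresql://user:[email protected]/dbname?sslmode=require
APP_HOST=127.0.0.1
APP_PORT=6767
BOOTSTRAP_SECRET=change-me
```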
### Engine (`engine/inference/.env`)

| Variable | Required | Description |
|---|---|---|
| `HF_TOKEN` | Yes* | Hugging Face token for model downloads |
| `MODEL_ID` | No | Model to load (default: `modularai/Llama-3.1-8B-Instruct-GGUF`) |
| `ADMIN_KEY` | No | Admin key for authenticated operations |
| `HOST` | No | Server bind address (default: `0.0.0.0`) |
| `PORT` | No | Server port (default: `8000`) |
| `USERS_PATH` | No | Path to users JSON (default: `./users/users.json`) |
| `MODS_BASE` | No | Base path for mods storage (default: `./mods`) |
| `QUOTE_LOG_INGEST_URL` | No | Backend URL for sending inference logs |
### Frontend (`frontend/.env`)

| Variable | Required | Description |
|---|---|---|
| `VITE_API_URL` | No | Backend API URL (default: `/api`) |
| `VITE_WS_URL` | No | WebSocket URL for log streaming (default: `ws://localhost:6767`) |
See each component's `.env.example` for all available options.
## Project Structure

```
concordance/
├── backend/        # Rust observability service (Thunder)
├── cli/            # Rust CLI tool (concai)
├── engine/
│   ├── inference/  # Python inference server (Quote)
│   ├── sdk/        # Mod SDK
│   └── shared/     # Shared utilities
├── frontend/       # React web UI
├── scripts/        # Build and release scripts
├── setup.sh        # Interactive setup script
└── run.sh          # Service management script
```
## Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests (`cargo test`, `uv run pytest`, `npm test`)
5. Submit a pull request
## License

MIT