Autonomous Backend Refactor & Hardening Agent (Claude Sonnet 4)
Metapod is an autonomous refactor-and-hardening agent designed for production-grade backends. It plans, researches, edits, tests, validates, and ships small, reversible PR-sized changes until the task is truly done. It follows strict architectural and security standards and prefers a stack of Lovable (UI prototyping), Supabase (DB/auth), and GitHub (SCM/CI).
- True autonomy: Keeps going until the user's request is solved end-to-end
- No premature handoff: Only finishes after all TODO items are verified complete
- Small, safe steps: Prefers many small, safe steps with verification over large risky edits
- Resume capability: Can locate last incomplete TODO and proceed from there
- Fresh intelligence: Assumes training data is stale, fetches current documentation
- Authoritative sources: Uses search → fetch → follow links for definitive guidance
- Ground truth: Summarizes findings and cites URLs in comments/PR notes
- Cross-validation: Checks conflicting sources, prefers official docs
- Pure core: Keep use-cases pure; isolate I/O in adapters
- Authorization boundary: Centralized authz at use-case boundary
- No inline permissions: No ad-hoc authz in handlers
- Domain vs Infra vs Programmer errors → RFC 9457 Problem Details
- Fail closed: Clear problem+json responses
- No sensitive leakage: Proper error sanitization
- Timeouts: On every outbound call
- Retries: Jittered exponential backoff for transient failures
- Circuit breakers: Protect against cascade failures
- Graceful shutdown: Clean deployments and restarts
- Structured logs: req_id/user/tenant/build SHA
- RED metrics: Rate, Errors, Duration across all boundaries
- Distributed tracing: Cross-service correlation
- No PII/secrets: Redaction filters with tests
- All edges: Body/query/headers validated
- Fail closed: Clear problem+json on validation failure
- Type safety: Schema-driven validation
- Environment-based: Secrets via env/secret store
- Least privilege: Minimal DB/cloud roles
- No code secrets: Zero secrets in codebase
- Rotation ready: Support for secret rotation
- ASVS alignment: Address Application Security Verification Standard
- API Top 10: Map addressed items in PR descriptions
- PII handling: Tagging, retention, deletion/export paths
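Of the reliability principles above, the circuit breaker is the one whose behavior is least obvious from a single bullet. The following is a minimal sketch of the idea, not Metapod's implementation: `CircuitBreaker`, `CircuitOpenError`, and every parameter name are hypothetical, and the class is single-threaded for brevity.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical minimal breaker: closed -> open -> half-open -> closed."""

    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock  # injectable so tests do not need to sleep
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast instead of hitting the struggling dependency.
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open, so one trial call is let through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            half_open = self._opened_at is not None
            if half_open or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each outbound request, for example `breaker.call(lambda: client.get(url))`, and translate `CircuitOpenError` into a fast, sanitized failure response.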
git clone https://github.com/NJRca/Metapod.git
cd Metapod
pip install -r requirements.txt
# Let Metapod autonomously refactor your backend
python cli.py /path/to/your/project "Implement hexagonal architecture"
# Add error handling with RFC 9457 compliance
python cli.py /path/to/your/project "Add standardized error handling"
# Full production hardening
python cli.py /path/to/your/project "Harden for production deployment"
python cli.py /path/to/your/project --interactive
Interactive commands:
- refactor <description>: Start autonomous refactoring
- research <topic>: Research best practices for a topic
- status: Show current TODO progress
- help: Show all commands
# Use custom configuration
python cli.py /path/to/project --config metapod.yaml "Refactor request"
# Dry run to see planned changes
python cli.py /path/to/project --dry-run "Show planned changes"
# Verbose logging for debugging
python cli.py /path/to/project --verbose "Debug mode refactoring"
Metapod follows a systematic 8-phase approach:
- Restate the user's goal and identify risks
- Identify entrypoints, side-effects, acceptance criteria
- Create concise TODO list
- Enumerate dependencies and versions
- Check for EOL/LTS mismatches
- Capture current behavior via characterization tests
- Instrument basic observability if missing
- Produce phased, reversible refactor plan (2-5 steps max per PR)
- Each step shippable behind flags
- Specify contracts: ports, error model, idempotency rules
- For each unknown: search → fetch → follow links
- Obtain authoritative guidance from primary sources
- Record "Research Notes" with citations
- Minimal, verifiable changes
- After each change: compile/typecheck, run tests
- Enforce timeouts, retries, breakers, validation
- Update/add: unit, property-based, contract, smoke/e2e tests
- Run performance smoke tests
- Record before/after metrics
- Ensure logs/metrics/traces include correlation IDs
- Add/adjust dashboards and alerts
- Document runbooks for on-call
- Open PR with comprehensive checklist
- Ship behind feature flags if behavioral risk
- Provide rollback and verification steps
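The baseline phase above depends on characterization tests, which pin down what the code currently does before any refactor touches it. A minimal sketch of that technique follows, assuming a hypothetical legacy function `legacy_price`. The helper names are illustrative and not part of Metapod.

```python
import json
from typing import Any, Callable, Dict, Iterable, List, Tuple


def legacy_price(quantity: int, unit_cents: int) -> int:
    """Hypothetical stand-in for untested legacy code."""
    total = quantity * unit_cents
    if quantity >= 10:
        total -= total // 10  # undocumented bulk discount we must preserve
    return total


def record_baseline(
    func: Callable[..., Any], cases: Iterable[Tuple[Any, ...]]
) -> Dict[str, Any]:
    """Run the current code once and store its outputs keyed by its inputs."""
    return {json.dumps(args): func(*args) for args in cases}


def diff_against_baseline(
    func: Callable[..., Any], baseline: Dict[str, Any]
) -> List[str]:
    """Return one message per input whose output no longer matches."""
    changed = []
    for key, expected in baseline.items():
        actual = func(*json.loads(key))
        if actual != expected:
            changed.append(f"{key}: expected {expected!r}, got {actual!r}")
    return changed


CASES = [(1, 250), (9, 100), (10, 100), (25, 199)]
BASELINE = record_baseline(legacy_price, CASES)  # captured before refactoring


def test_refactor_preserves_behavior() -> None:
    # After each refactor step, point this at the new implementation.
    assert diff_against_baseline(legacy_price, BASELINE) == []
```

In practice the baseline would be written to a file and committed, so that later steps compare against the behavior recorded before the first change rather than against the code under edit.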
Metapod automatically tracks and displays progress:
TODO - Metapod Refactoring Session:
✅ Scope & acceptance criteria confirmed
✅ Baseline tests/telemetry in place
✅ Plan approved (small reversible cuts)
⏳ Implement step 1 (inputs validated, errors standardized)
⏳ Tests green (unit/contract/property)
⏳ Observability updated (logs/metrics/traces)
⏳ PR opened with checklist & research notes
⏳ Rollout plan & rollback documented
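The resume capability mentioned earlier amounts to finding the first unfinished entry in a list like the one above. A minimal sketch of that lookup follows. `TodoItem`, `next_incomplete`, and `render` are hypothetical names, and Metapod's internal representation may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TodoItem:
    title: str
    done: bool = False


def next_incomplete(items: List[TodoItem]) -> Optional[TodoItem]:
    """Return the first unfinished item, or None when everything is verified."""
    for item in items:
        if not item.done:
            return item
    return None


def render(
    items: List[TodoItem], header: str = "TODO - Metapod Refactoring Session:"
) -> str:
    """Format the list in the same style as the progress display."""
    lines = [header]
    lines += [f"{'✅' if item.done else '⏳'} {item.title}" for item in items]
    return "\n".join(lines)
```

A session is finished only when `next_incomplete` returns `None`, which matches the rule that work stops only after every TODO item is verified complete.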
- UI: Lovable for rapid prototyping
- Database: Supabase Postgres with RLS
- Auth: Supabase Auth with row-level security
- Storage: Supabase Storage
- SCM: GitHub with Actions CI/CD
- Hosting: Vercel for automatic deployments
- Validation: Pydantic schemas
- HTTP Client: httpx with tenacity retry
- Logging: structlog with correlation IDs
- ORM: SQLAlchemy with async unit-of-work
- Testing: pytest with async support
- Validation: Zod schemas
- HTTP Client: undici with retry/breaker
- Logging: pino structured logging
- ORM: Prisma with connection pooling
- Testing: Jest with supertest
- Validation: validator package
- HTTP Client: net/http with context timeouts
- Logging: zerolog structured logs
- ORM: GORM or sqlc for type safety
- Testing: testify suite
src/
├── core/
│   ├── domain/          # Pure business models
│   ├── use_cases/       # Business logic orchestration
│   └── ports/           # Interface definitions
├── adapters/
│   ├── web/             # HTTP handlers & middleware
│   ├── database/        # Repository implementations
│   └── external/        # Third-party service clients
└── infrastructure/
    ├── config/          # Configuration management
    ├── logging/         # Structured logging setup
    └── metrics/         # Observability infrastructure
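To make the layout above concrete, here is a minimal single-file sketch of how the layers could fit together: a pure use case that depends only on a port, one authorization check at the use-case boundary, and a web adapter that maps domain errors to RFC 9457 style bodies. All class and function names are hypothetical, and the tenant rule is an assumed example policy.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Protocol, Tuple


@dataclass(frozen=True)
class User:  # would live in core/domain
    id: str
    tenant_id: str
    email: str


@dataclass(frozen=True)
class Actor:  # the authenticated caller, resolved by the web adapter
    user_id: str
    tenant_id: str


class UserRepository(Protocol):  # would live in core/ports
    def get(self, user_id: str) -> Optional[User]: ...


class DomainError(Exception):
    status = 400
    title = "Domain error"


class Forbidden(DomainError):
    status = 403
    title = "Forbidden"


class NotFound(DomainError):
    status = 404
    title = "Not found"


class GetUser:  # would live in core/use_cases; performs no I/O itself
    def __init__(self, users: UserRepository) -> None:
        self._users = users

    def execute(self, actor: Actor, user_id: str) -> User:
        user = self._users.get(user_id)
        if user is None:
            raise NotFound(f"user {user_id} does not exist")
        # The only authorization check: handlers never re-implement it.
        if user.tenant_id != actor.tenant_id:
            raise Forbidden("actor may not read users of another tenant")
        return user


class InMemoryUsers:  # would live in adapters/database
    def __init__(self, rows: Dict[str, User]) -> None:
        self._rows = rows

    def get(self, user_id: str) -> Optional[User]:
        return self._rows.get(user_id)


def handle_get_user(
    use_case: GetUser, actor: Actor, user_id: str
) -> Tuple[int, Dict[str, Any]]:  # would live in adapters/web
    try:
        user = use_case.execute(actor, user_id)
    except DomainError as err:
        # Domain errors become problem+json bodies; nothing else leaks out.
        return err.status, {
            "type": "about:blank",
            "title": err.title,
            "status": err.status,
            "detail": str(err),
        }
    return 200, {"id": user.id, "email": user.email}
```

Because `GetUser` sees only the `UserRepository` protocol, the in-memory adapter can be swapped for a real database repository without touching the core, and contract tests can run the same assertions against both.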
{
  "type": "https://api.example.com/errors/validation-failed",
  "title": "Request validation failed",
  "status": 400,
  "detail": "The 'email' field must be a valid email address",
  "instance": "/users/create",
  "trace_id": "abc123-def456-ghi789"
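}
# Timeout example
import httpx
from tenacity import retry, stop_after_attempt, wait_exponential_jitter

async def fetch(url: str) -> httpx.Response:
    # Apply a 30-second timeout to connect, read, write, and pool acquisition
    async with httpx.AsyncClient(timeout=httpx.Timeout(30.0)) as client:
        return await client.get(url)

# Retry with jittered exponential backoff: at most 3 attempts, waits capped at 10 s
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential_jitter(initial=1, max=10)
)
async def call_external_service():
    pass

# Circuit breaker (circuit_breaker is assumed to be a project-provided decorator;
# it is not part of httpx or tenacity)
@circuit_breaker(failure_threshold=5, timeout=60)
async def fragile_operation():
    pass
- ✓ Type-checks (mypy/TypeScript/Go vet)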
- ✓ Lint/format compliance (black/prettier/gofmt)
- ✓ Unit tests >90% coverage
- ✓ Contract tests for all ports
- ✓ SCA (dependency vulnerability scan)
- ✓ SAST (static code analysis)
- ✓ Secret scanning
- ✓ API lint compliance
- ✓ P95 latency budget enforced
- ✓ Error rate budget (<1%)
- ✓ Bundle size limits (where applicable)
- ✓ Memory usage thresholds
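The latency and error-rate gates above can be evaluated mechanically from a performance smoke run. The sketch below shows one way to do it, assuming nearest-rank percentiles and a caller-supplied P95 budget. `p95` and `check_budgets` are hypothetical helpers, not part of Metapod's CLI.

```python
from typing import List, Sequence


def p95(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of the observed latencies."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def check_budgets(
    latencies_ms: Sequence[float],
    errors: int,
    requests: int,
    p95_budget_ms: float,
    max_error_rate: float = 0.01,
) -> List[str]:
    """Return a list of violated budgets; an empty list means the gate passes."""
    violations = []
    observed = p95(latencies_ms)
    if observed > p95_budget_ms:
        violations.append(f"p95 {observed} ms exceeds budget {p95_budget_ms} ms")
    rate = errors / requests if requests else 0.0
    if rate >= max_error_rate:  # the budget is strictly below 1%
        violations.append(f"error rate {rate:.2%} is not below {max_error_rate:.2%}")
    return violations
```

A CI job could fail the build whenever the returned list is non-empty, and record the same numbers as the before/after metrics for the PR.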
Metapod automatically generates:
- Context and rationale for changes
- Implementation consequences
- References to standards and patterns
- Comprehensive checklists
- Security and reliability assessments
- Performance impact analysis
- Rollback procedures
- Health check endpoints
- Key metrics and alert thresholds
- Troubleshooting procedures
- Emergency contacts
git clone https://github.com/NJRca/Metapod.git
cd Metapod
pip install -r requirements.txt
python -m pytest test_metapod.py -v
# Unit tests
python -m pytest test_metapod.py
# Integration tests
python -m pytest test_metapod.py::TestIntegration
# With coverage
python -m pytest --cov=metapod test_metapod.py
# Format code
black *.py
# Type checking
mypy metapod.py
# Linting
flake8 *.py
MIT License - see LICENSE file for details.
- Documentation: Check this README and inline code documentation
- Issues: Open GitHub issues for bugs or feature requests
- Discussions: Use GitHub Discussions for questions and ideas
Metapod: Evolving your backend into its best form 🛡️→🦋