# ROADMAP — PowerRepo Development Tracker

> **Single source of truth** for what exists, what's planned, and what's in progress.
> Updated: 2026-05-28

---

## Legend

| Symbol | Meaning |
|--------|---------|
| ✅ | Done, tested |
| 🔧 | In progress |
| 📋 | Planned |
| ⚠️ | Blocked or has known issues |
| ❌ | Deprecated / Won't do |

---

## Epic 1: RAG Pipeline — Knowledge Base

**ADR:** [ADR-001](docs/adr/001-rag-pipeline-architecture.md)

### Core Pipeline

| ID | Feature | Status | Priority | ADR | Dependencies |
|----|---------|--------|----------|-----|-------------|
| RAG-01 | `rag_min_score` threshold in retrieval | 📋 | **P0** | ADR-001 | — |
| RAG-02 | Configure `MISTRAL_API_KEY` for OCR | 📋 | **P0** | ADR-001 | — |
| RAG-03 | Fix cache hit `chunks_count` (SELECT bug) | 📋 | **P1** | ADR-001 | — |
| RAG-04 | Defensive JSON parsing on embedding API | 📋 | **P1** | ADR-001 | — |
| RAG-05 | Populate `token_count` in rag_documents | 📋 | P2 | ADR-001 | — |
| RAG-06 | HyDE fallback logging | 📋 | P2 | ADR-001 | — |
| RAG-07 | Batch size validation for embedding API | 📋 | P3 | ADR-001 | — |
| RAG-08 | Web Scraping source type | 📋 | P2 | — | RAG-02 |
| RAG-09 | Batch upload (multiple files) | 📋 | P3 | — | — |
| RAG-10 | File replacement (PUT endpoint) | 📋 | P3 | — | — |
| RAG-11 | Reprocess without re-upload | 📋 | P2 | — | RAG-02 |
| RAG-12 | Re-ranking (cross-encoder) | 📋 | P3 | — | RAG-01 |
| RAG-13 | Retrieval evaluation framework | 📋 | P2 | — | RAG-01 |
| RAG-14 | Scheduled re-indexing | 📋 | P3 | — | — |
| RAG-15 | Firecrawl API integration | 📋 | P1 | — | RAG-08 |
| RAG-16 | Scheduled scraping with cron expressions | 📋 | P1 | — | RAG-15 |
| RAG-17 | Scraping configuration CRUD API | 📋 | P1 | — | RAG-15 |
| RAG-18 | Change detection via content hash | 📋 | P1 | — | RAG-15 |
| RAG-19 | Scraping config UI (SourceManager) | 📋 | P2 | — | RAG-17 |

### Frontend (SourceManager)

| ID | Feature | Status | Priority | Dependencies |
|----|---------|--------|----------|-------------|
| UI-01 | Fix login redirect bug (sandbox) | 📋 | P1 | — |
| UI-02 | Fix i18n missing keys (`POWER_AGENTS.SETTINGS_TITLE`) | 📋 | P2 | — |
| UI-03 | Remove duplicate `test_rag_doc.txt` in DB | 📋 | P2 | — |

---

## Epic 2: Development Infrastructure

**ADR:** TBD

| ID | Feature | Status | Priority | Dependencies |
|----|---------|--------|----------|-------------|
| INFRA-01 | ROADMAP.md (this file) | ✅ | — | — |
| INFRA-02 | ADR template + ADR-001 | ✅ | — | — |
| INFRA-03 | `run-tests` skill (exhaustive testing) | 📋 | **P0** | — |
| INFRA-04 | `review-feature` skill (multi-agent review) | 📋 | P1 | INFRA-03 |
| INFRA-05 | CHANGELOG.md (conventional commits) | 📋 | P1 | — |
| INFRA-06 | Feature doc template | 📋 | P1 | INFRA-02 |
| INFRA-07 | Architecture references doc (Substack) | 📋 | P1 | — |

---

## Epic 3: Agent Runner

| ID | Feature | Status | Priority | Dependencies |
|----|---------|--------|----------|-------------|
| AR-01 | Agent worker — display_name → UUID fallback | 🔧 | P1 | — |
| AR-02 | Agent worker — run_agent silent failure logging | 📋 | P1 | AR-01 |
| AR-03 | Tool execution — max iterations guard | ✅ | — | — |
| AR-04 | Tool execution — execute_tool_node | ✅ | — | — |
| AR-05 | NATS JetStream — publisher + consumer | ✅ | — | — |
| AR-06 | Admin API — tool CRUD with json.dumps | ✅ | — | — |
| AR-07 | Admin API — power_agents_controller Ruby fixes | ✅ | — | — |
| AR-08 | Multi-agent concurrency safety (locks, rate limiting) | 📋 | P1 | — | — |

---

## Epic 4: Cost Optimization

**Refs:** [references.md](docs/architecture/references.md) #8, #9, #10

| ID | Feature | Status | Priority | Dependencies |
|----|---------|--------|----------|-------------|
| COST-01 | Model routing by task type (nano/mini/full) | 📋 | **P0** | — |
| COST-02 | `max_tokens` explicit on every LLM call | 📋 | **P0** | — |
| COST-03 | Conversation window management (truncate/summarize) | 📋 | **P0** | — |
| COST-04 | Prompt caching structure (static first, dynamic last) | 📋 | P1 | COST-01 |
| COST-05 | Semantic caching via Redis | 📋 | P1 | — |
| COST-06 | Reasoning token monitoring (GPT-5) | 📋 | P2 | — |
| COST-07 | Batch processing for document ingestion | 📋 | P2 | — |
| COST-08 | Cost attribution per endpoint/feature | 📋 | P2 | — |

---

## Epic 5: Quality & Safety

**Refs:** [references.md](docs/architecture/references.md) #2, #7, #11, #12, #13

| ID | Feature | Status | Priority | Dependencies |
|----|---------|--------|----------|-------------|
| QUAL-01 | Code-based eval checks (JSON schema, regex, required fields) | 📋 | **P0** | — |
| QUAL-02 | `max_iterations` guard on agent loops | ✅ | — | — |
| QUAL-03 | LLM as Judge with rubric (3-5 criteria) | 📋 | P1 | QUAL-01 |
| QUAL-04 | Golden set (50+ production examples) | 📋 | P1 | QUAL-01 |
| QUAL-05 | Guardrails — PII detection | 📋 | P1 | — |
| QUAL-06 | Guardrails — toxicity/content filtering | 📋 | P2 | QUAL-05 |
| QUAL-07 | Human-in-the-loop for high-risk agent actions | 📋 | P1 | — |
| QUAL-08 | Policy layer (OPA or similar) | 📋 | P2 | QUAL-07 |
| QUAL-09 | Shadow mode for new policies | 📋 | P3 | QUAL-08 |
| QUAL-10 | Continuous eval pipeline in CI | 📋 | P2 | QUAL-03, QUAL-04 |
| QUAL-11 | Langfuse/LangSmith tracing | 📋 | P2 | — |
| QUAL-12 | Pass^K metric for agent reliability | 📋 | P1 | QUAL-01 |
| QUAL-13 | Eval harness (tasks → trials → transcripts → graders) | 📋 | P1 | QUAL-01 |
| QUAL-14 | Context engineering — progressive disclosure (tool-based retrieval) | 📋 | P1 | — |
| QUAL-15 | Agent note-taking (external scratchpad) | 📋 | P2 | QUAL-14 |
| QUAL-16 | Compaction — summarization + tool clearing | 📋 | P2 | COST-03 |
| QUAL-17 | Outcome-oriented evaluation (grade final state, not trajectory) | 📋 | P1 | QUAL-13 |
| QUAL-18 | Parallel guardrails with inference | 📋 | P2 | QUAL-05 |

---

## Epic 6: FastAPI Concurrency

**Refs:** [references.md](docs/architecture/references.md) #15

| ID | Feature | Status | Priority | Dependencies |
|----|---------|--------|----------|-------------|
| FAST-01 | Audit all async def routes for blocking calls | 📋 | **P0** | — |
| FAST-02 | Load models in lifespan, not endpoint | 📋 | P1 | — |
| FAST-03 | Externalize local inference to dedicated server (vLLM/TGI) | 📋 | P2 | FAST-02 |
| FAST-04 | BackgroundTasks for inference >30s | 📋 | P2 | — |

---

## Quick Stats

| Metric | Count |
|--------|-------|
| Total features tracked | 65 |
| Done | 10 |
| In progress | 1 |
| Planned | 54 |
| P0 (blocking) | 9 |
| P1 (high) | 24 |
| P2 (medium) | 21 |
| P3 (low) | 7 |

---

## Next Up (Priority Order)

1. **INFRA-03** — `run-tests` skill (blocks all future feature testing)
2. **RAG-01** — `rag_min_score` threshold (1 line, high impact)
3. **RAG-02** — Mistral OCR key (unlocks PDF/DOCX/PPTX/images)
4. **COST-01** — Model routing by task type
5. **COST-02** — `max_tokens` on every LLM call
6. **COST-03** — Conversation window management
7. **QUAL-01** — Code-based eval checks
8. **RAG-03** — Cache hit bug fix (1 line)