System Design

Experience Navigator Architecture

A portfolio feature designed to demonstrate AI product thinking: explainability, human-in-the-loop workflow, grounding, and structured output. No hallucinations — every response is drawn from the actual CV context.


Request Pipeline

1. User submits query

Job description or topic text, up to 500 characters. Response format selected (Fit Summary, Recruiter Brief, Matched Bullets, Case Study).

2. Sanitization (Hono API)

Zod schema validation plus NFKC Unicode normalization, a tiktoken token-count gate (≤300 tokens), a special-character ratio check, and a 12-pattern injection regex. In-process sliding-window rate limit (10 req/60s per IP).
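The sliding-window limiter described above can be sketched as follows. This is an illustrative stand-in, not the production code: `allowRequest` and its parameters are assumed names, and the real server keys on the client IP extracted by Hono.

```typescript
// Hypothetical sketch of an in-process sliding-window rate limit
// (10 requests per 60s per IP), kept in a plain Map.
type Window = number[]; // timestamps (ms) of recent requests per IP

const windows = new Map<string, Window>();

function allowRequest(
  ip: string,
  now: number = Date.now(),
  limit = 10,
  windowMs = 60_000,
): boolean {
  // Drop timestamps that have slid out of the 60s window.
  const recent = (windows.get(ip) ?? []).filter((t) => now - t < windowMs);
  if (recent.length >= limit) {
    windows.set(ip, recent);
    return false; // over the limit: caller would respond 429
  }
  recent.push(now);
  windows.set(ip, recent);
  return true;
}
```

Because the state lives in a module-level `Map`, this only works on an always-warm single instance, which is exactly the constraint ADR-003 addresses.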

3. TF-IDF concept extraction

natural.TfIdf scores the query against a 50-term domain vocabulary. The top 8 concepts are scored by salience (0–100) and tiered as primary/supporting/context; explicit mentions get a +2.0 score boost. No LLM is used in this step.
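A dependency-free sketch of this step might look like the following. It is a simplified stand-in for `natural.TfIdf`: the three-term vocabulary, the normalization, and the tier thresholds are all illustrative assumptions, not the real 50-term list or scoring.

```typescript
// Simplified stand-in for the TF-IDF extraction step: score each
// vocabulary term against the query, apply the +2.0 explicit-mention
// boost, normalize to 0–100, tier, and keep the top 8.
type ConceptScore = { term: string; salience: number; tier: string };

function extractConcepts(query: string, vocabulary: string[]): ConceptScore[] {
  const words = query.toLowerCase().split(/\W+/).filter(Boolean);
  const raw = vocabulary.map((term) => {
    // Term frequency of the vocabulary term in the query "document".
    const tf =
      words.filter((w) => w === term.toLowerCase()).length /
      Math.max(words.length, 1);
    // Explicit mentions get a flat boost, mirroring the +2.0 rule above.
    const boost = query.toLowerCase().includes(term.toLowerCase()) ? 2.0 : 0;
    return { term, score: tf + boost };
  });
  const max = Math.max(...raw.map((r) => r.score), 1e-9);
  return raw
    .map(({ term, score }) => ({
      term,
      salience: Math.round((score / max) * 100),
      tier:
        score / max > 0.66 ? "primary" : score / max > 0.33 ? "supporting" : "context",
    }))
    .filter((c) => c.salience > 0)
    .sort((a, b) => b.salience - a.salience)
    .slice(0, 8);
}
```

The key property, as ADR-002 notes, is that every number here is inspectable: a user can trace a concept's salience back to term counts and the boost rule.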

4. Human weight adjustment

The user sees the extracted concepts with their TF-IDF salience scores, drags to reorder, slides to adjust priority weight (0–10), and can dismiss irrelevant concepts. This step makes the system auditable and demonstrates explainable AI.

5. Cache check

SHA-256 key over (query + responseMode). LRU cache with a 5-minute TTL and 10MB max size. A cache hit returns immediately with a cached: true flag.
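The key derivation is small enough to sketch. The null-byte separator is an assumption to keep distinct (query, mode) pairs from colliding; the real key scheme may concatenate differently.

```typescript
import { createHash } from "node:crypto";

// Sketch of the cache key: SHA-256 over (query + responseMode).
// The "\u0000" separator is an illustrative choice, not the real scheme.
function cacheKey(query: string, responseMode: string): string {
  return createHash("sha256")
    .update(`${query}\u0000${responseMode}`)
    .digest("hex");
}
```

Because generation runs at temperature 0 (ADR-005), identical keys map to identical outputs, so serving from the LRU cache is safe.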

6. LLM generation (Claude 3.5 Haiku)

The full cv.md is passed as a single context block in <experience> tags. Concepts are sorted by userWeight descending; higher weight means a more explicit emphasis instruction. generateObject with a Zod schema, temperature: 0, 30s timeout. Proxied via Helicone for observability.
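Prompt assembly for this step might be sketched as below. `buildSystemPrompt`, the field names, and the instruction wording are all illustrative assumptions; only the `<experience>` wrapper and weight-descending ordering come from the description above.

```typescript
// Hypothetical prompt assembly: full CV in an <experience> block,
// plus user-weighted concepts rendered as an emphasis list,
// highest weight first.
type WeightedConcept = { term: string; userWeight: number };

function buildSystemPrompt(cv: string, concepts: WeightedConcept[]): string {
  const emphasis = concepts
    .filter((c) => c.userWeight > 0) // assume dismissed concepts carry weight 0
    .sort((a, b) => b.userWeight - a.userWeight)
    .map((c, i) => `${i + 1}. ${c.term} (weight ${c.userWeight}/10)`)
    .join("\n");
  return [
    "Answer only from the experience below. Do not invent facts.",
    `<experience>\n${cv}\n</experience>`,
    `Emphasize these themes, most important first:\n${emphasis}`,
  ].join("\n\n");
}
```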

7. Structured response

Returns answer (format-specific), citedRoles (the roles referenced), and confidence (high/medium/low). The evidence panel shows the grounding sources, and the response is cached for future requests.

System Topology

  Browser (Vercel)                Fly.io (always-warm)
  ─────────────────               ──────────────────────────────────
  Next.js 15 App Router           Hono Node.js server
  │                               │
  ├── / (homepage)                ├── POST /extract
  │   └── AlgorithmCanvas             │  sanitize() → extractConcepts()
  │       └── Canvas 2D               │  → ConceptScore[]
  │           Dijkstra / BFS /        │
  │           MergeSort           ├── POST /generate
  │                                   │  cache check → sanitize()
  ├── /navigator                      │  → generateResponse()
  │   └── ExperienceNavigator         │  → LRU cache set
  │       ├── useReducer              │
  │       │   (state machine)     lib/
  │       ├── AnimatePresence      ├── cv.ts        (cv.md loaded once)
  │       │   (SPA transitions)    ├── cache.ts     (lru-cache v11)
  │       ├── InputScreen          ├── sanitize.ts  (4-step pipeline)
  │       ├── ConceptScreen        ├── extraction.ts (TF-IDF)
  │       │   └── ConceptPanel     └── generation.ts (Helicone→Anthropic)
  │       │       └── ConceptCard
  │       │           (@dnd-kit)
  │       ├── OutputScreen
  │       └── EvidenceScreen
  │
  └── /system-design              External Services
                                  ─────────────────
  lib/api.ts                      Helicone (proxy)
  └── extractConcepts()           └── Anthropic Claude 3.5 Haiku
      generateResponse()              temperature: 0, generateObject

Technology Stack

Layer               Technology
──────────────────  ──────────────────────────────────────
Frontend            Next.js 15 (App Router)
Animations          Motion v12 (AnimatePresence)
Drag & Drop         @dnd-kit/sortable
API Server          Hono on Fly.io (Node.js 22)
LLM                 Claude 3.5 Haiku via Vercel AI SDK v6
LLM Proxy           Helicone
Concept Extraction  natural (TF-IDF)
Token Counting      tiktoken (cl100k_base)
Input Validation    Zod v4
Rate Limiting       In-process sliding window Map
Output Cache        lru-cache v11
Styling             Tailwind CSS v4
Monorepo            pnpm workspaces + Turborepo

Architecture Decision Records

ADR-001: Full CV as single context window, no vector DB
Status: Accepted
Context
CV is ~2,000 tokens — well within Claude 3.5 Haiku's 200K context window. Many portfolio sites chunk the CV and use embeddings for retrieval.
Decision
Pass the entire cv.md as a single <experience> block in the system prompt. No vector DB, no chunking, no embeddings.
Consequences
Simpler architecture. Every response has full CV context — no missed references. Slight token cost increase ($0.001/request at Haiku pricing). No hallucination risk from retrieval errors.
ADR-002: TF-IDF for concept extraction, not an LLM
Status: Accepted
Context
Concept extraction could use an LLM call. This would add latency, cost, and nondeterminism to a step that benefits from speed and explainability.
Decision
Use natural.TfIdf with a 50-term domain vocabulary seeded from the CV. Query as document, vocabulary terms as corpus. Salience = normalized TF-IDF score.
Consequences
Extraction is instant (<5ms), deterministic, free, and auditable. The vocabulary is the explicit model of what the CV contains — users can see exactly why concepts were chosen.
ADR-003: Dedicated Fly.io server over Vercel serverless functions
Status: Accepted
Context
Vercel serverless functions are subject to cold starts and cannot reliably hold in-memory state (the LRU cache, the rate-limit Map) across invocations.
Decision
Deploy a Hono/Node.js server to Fly.io with auto_stop_machines=false and min_machines_running=1. Single always-warm 256MB instance.
Consequences
No cold starts. Shared in-memory LRU cache and rate limiter work correctly. Cost: $0 (Fly.io free tier covers one always-on shared-cpu-1x 256MB machine). Trade-off: single instance means no horizontal scaling — acceptable for a portfolio demo.
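The always-warm setup described here maps to a small fly.toml fragment. This is an illustrative sketch: the key names follow Fly.io's config schema, and only the `auto_stop_machines`, `min_machines_running`, and 256MB values come from the ADR itself.

```toml
# Illustrative fly.toml fragment for the always-warm single instance.
[http_service]
  auto_stop_machines = false
  auto_start_machines = true
  min_machines_running = 1

[[vm]]
  memory = "256mb"
  cpu_kind = "shared"
  cpus = 1
```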
ADR-004: Human-in-the-loop concept weight adjustment
Status: Accepted
Context
The system could skip concept review and go straight to generation. This is faster but opaque.
Decision
Add a mandatory concept review step where users see the extracted themes, their TF-IDF salience scores, and can adjust priority weights (0–10) and reorder via drag-and-drop.
Consequences
Demonstrates explainability and human-in-the-loop design. User weights are injected into the prompt sorted by weight descending; higher weight produces a more explicit emphasis instruction to Claude. Adds one interaction step but makes the system auditable.
ADR-005: Structured LLM output with Zod schema
Status: Accepted
Context
Free-text LLM output requires parsing and validation. Structured output with generateObject gives type-safe responses.
Decision
Use Vercel AI SDK generateObject with a Zod schema: { answer: string, citedRoles: string[], confidence: "high" | "medium" | "low" }. temperature: 0.
Consequences
Type-safe responses. citedRoles enables the evidence panel to show exactly which roles were cited. confidence enables the UI badge. temperature: 0 makes outputs deterministic (LRU cache effective).
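The contract this ADR enforces can be shown as a dependency-free type guard. The real implementation uses a Zod schema passed to generateObject; this sketch only illustrates the shape being validated, with `isNavigatorResponse` as an assumed name.

```typescript
// Dependency-free sketch of the ADR-005 response contract.
type NavigatorResponse = {
  answer: string;
  citedRoles: string[];
  confidence: "high" | "medium" | "low";
};

function isNavigatorResponse(value: unknown): value is NavigatorResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.answer === "string" &&
    Array.isArray(v.citedRoles) &&
    v.citedRoles.every((r) => typeof r === "string") &&
    (v.confidence === "high" || v.confidence === "medium" || v.confidence === "low")
  );
}
```

With this shape guaranteed, citedRoles can drive the evidence panel and confidence can drive the UI badge without any free-text parsing.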