Architecture
This page provides a high-level overview of Sovara's architecture and how its components work together.
System Overview
Sovara consists of three main processes that work together:

1. User Program (Green)
The user launches their program with so-record script.py. This works exactly like running python script.py: the same terminal I/O, the same crash behavior, and the same debugger support.
Key point: User code runs completely unmodified. Sovara uses monkey patching to intercept LLM calls and content-based matching to detect dataflow edges.
Components:
- Agent Runner (agent_runner.py): wraps the user's Python command. Sets up the environment, connects to the server, applies monkey patches, then executes the user's program.
- Monkey Patching (monkey_patching/): intercepts LLM API calls to record inputs and outputs.
- String Matching (string_matching.py): detects dataflow edges using content-based matching.
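Conceptually, the interception works like standard Python monkey patching: an API function is swapped for a wrapper that records the call and then delegates to the original. The sketch below illustrates the idea; the helper name, the stand-in client, and the callback signature are all illustrative, not Sovara's actual API.

```python
import functools
import types

def patch_llm_call(module, attr, on_call):
    """Swap module.attr for a wrapper that reports each call, then delegates."""
    original = getattr(module, attr)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        result = original(*args, **kwargs)
        on_call(args, kwargs, result)   # report the call; the caller sees no change
        return result

    setattr(module, attr, wrapper)
    return original                     # keep a handle for unpatching

# Demo with a stand-in "LLM client"; the real patches target
# httpx/requests-based LLM libraries.
fake_client = types.SimpleNamespace(complete=lambda prompt: f"echo: {prompt}")
events = []
patch_llm_call(fake_client, "complete", lambda a, kw, out: events.append(out))
print(fake_client.complete("hi"))       # behaves exactly as before: echo: hi
```

Because the wrapper forwards arguments and return values untouched, user code never has to change to be recorded.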
2. Development Server (Blue)
The core analysis engine that receives events from the user process and updates the UI.
Responsibilities:
- Receives LLM call events from the runner
- Builds and maintains the dataflow graph
- Manages LLM call caching
- Handles user edits to inputs/outputs
- Controls the UI
Communication: All messages flow through a TCP socket (default port: 5959).
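As an illustration of that channel, the sketch below sends one JSON-encoded event over a TCP connection. Only the default port (5959) comes from this page; the JSON-lines wire format, the event fields, and the helper names are assumptions made for the example. A throwaway listener stands in for the development server so the round trip is self-contained.

```python
import json
import socket
import threading

SOVARA_PORT = 5959  # default server port from this page

def send_event(event, host="127.0.0.1", port=SOVARA_PORT):
    """Send one event as a JSON line over TCP (wire format is an assumption)."""
    payload = json.dumps(event).encode() + b"\n"
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload)

# Demo: a throwaway listener standing in for the development server.
received = []
listener = socket.socket()
listener.bind(("127.0.0.1", 0))          # ephemeral port so the demo never collides
listener.listen(1)
demo_port = listener.getsockname()[1]

def accept_one():
    conn, _ = listener.accept()
    with conn:
        received.append(json.loads(conn.makefile().readline()))

t = threading.Thread(target=accept_one)
t.start()
send_event({"type": "llm_call", "model": "example-model"}, port=demo_port)
t.join()
listener.close()
print(received[0]["type"])  # llm_call
```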
3. UI (Red)
The VS Code extension (or web app) that displays the dataflow graph and provides interactive controls.
Features:
- Visualizes the dataflow graph
- Allows editing of LLM inputs/outputs
- Triggers re-runs with modifications
- Shows run history
Content-Based Edge Detection
Sovara detects dataflow between LLM calls using content-based matching:
- Store outputs: When an LLM call completes, all text strings from the response are stored in an in-memory registry
- Match inputs: When a new LLM call is made, we check if any previously stored output strings appear as substrings in the input
- Create edges: If a match is found, an edge is created from the source node to the current node
This approach is simple and robust:
- User code runs completely unmodified
- Works with any LLM library that uses httpx/requests
- No risk of crashing user code
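The three steps above can be sketched as a small registry class (the class and method names are illustrative, not Sovara's actual API):

```python
class EdgeDetector:
    """Sketch of content-based edge detection (names are illustrative)."""

    def __init__(self):
        self.registry = {}  # output text -> id of the LLM call that produced it

    def record_output(self, node_id, texts):
        # Step 1: store every text string from a completed call's response.
        for text in texts:
            self.registry[text] = node_id

    def match_input(self, node_id, input_text):
        # Steps 2-3: any stored output appearing as a substring of the new
        # input implies a dataflow edge from the producer to this call.
        return [(src, node_id)
                for text, src in self.registry.items()
                if text in input_text]

detector = EdgeDetector()
detector.record_output("call_1", ["The capital of France is Paris."])
edges = detector.match_input("call_2",
                             "Verify this claim: The capital of France is Paris.")
print(edges)  # [('call_1', 'call_2')]
```

Because matching is purely on content, it works regardless of how the user's code moves strings between calls.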
Execution Flow
- User runs so-record script.py
- Agent runner sets up the environment (random seeds, server connection)
- Agent runner connects to server (starts it if needed)
- Monkey patches are applied to LLM libraries
- User code executes unmodified
- LLM calls are intercepted and reported to server
- Content-based matching detects dataflow edges
- Server builds dataflow graph
- UI displays the graph in real-time
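The runner side of this flow can be outlined in a few lines. This is a sketch only: the function name is hypothetical, and the server-connection and patching steps are stubbed out as comments.

```python
import random
import runpy
import sys

def run_recorded(script_path, seed=0):
    """Illustrative outline of the so-record flow; names here are stubs."""
    random.seed(seed)                    # set up a deterministic environment
    # connect_to_server()                # start/attach to the dev server (stub)
    # apply_monkey_patches()             # intercept LLM libraries (stub)
    sys.argv = [script_path]
    # The user's script runs unmodified, as if invoked with `python script.py`.
    runpy.run_path(script_path, run_name="__main__")
```

Running the script via runpy.run_path with run_name="__main__" preserves ordinary `if __name__ == "__main__":` behavior, which is one way to keep execution identical to a plain python invocation.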
Module Organization
src/
└── sovara/
├── cli/ # Command-line interface
│ ├── so_record.py # Main launch command
│ ├── so_server.py # Server management
│ └── so_config.py # Configuration
├── runner/ # Runtime execution
│ ├── agent_runner.py # Main runner (setup + execution)
│ ├── string_matching.py # Content-based edge detection
│ ├── context_manager.py # Session management
│ └── monkey_patching/ # API interception
│ ├── apply_monkey_patches.py
│ └── patches/ # Per-API patches
└── server/ # Core server
├── app.py # FastAPI app factory
├── state.py # In-memory state and git versioning
└── database_manager.py # Caching/storage + content registry
ui/
├── shared_components/ # Shared React components and types
├── vscode_extension/ # VS Code extension
└── web_app/ # Standalone web app
Next Steps
- Server internals - Deep dive into the development server
- Edge detection - How dataflow edges are detected
- API patching - How LLM APIs are intercepted