Architecture

This page provides a high-level overview of Sovara's architecture and how its components work together.

System Overview

Sovara consists of three main processes that work together:

Figure: Processes Overview

1. User Program (Green)

The user launches their program with so-record script.py. This behaves exactly like running python script.py: the same terminal I/O, the same crash behavior, and the same debugger support.

Key point: User code runs completely unmodified. Sovara uses monkey patching to intercept LLM calls and content-based matching to detect dataflow edges.

Components:

Agent Runner (agent_runner.py) - Wraps the user's Python command. Sets up the environment, connects to the server, applies monkey patches, then executes the user's program.
Monkey Patching (monkey_patching/) - Intercepts LLM API calls to record inputs/outputs.
String Matching (string_matching.py) - Detects dataflow edges using content-based matching.
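
To make the interception idea concrete, here is a minimal sketch of monkey patching in general. It is not Sovara's actual patch code: FakeLLMClient, apply_patch, and recorded_calls are hypothetical names invented for this illustration, and a real patch would target an LLM library's client method and report to the server rather than append to a list.

```python
import functools


class FakeLLMClient:
    """Hypothetical stand-in for a real LLM client class."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


# Hypothetical in-process log; an assumption for this sketch only.
recorded_calls = []


def apply_patch(cls, method_name):
    """Replace cls.method_name with a wrapper that records inputs and outputs."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        recorded_calls.append({"input": (args, kwargs), "output": result})
        return result

    setattr(cls, method_name, wrapper)
    return original  # returned so the patch could be undone later


original_complete = apply_patch(FakeLLMClient, "complete")

# "User code" stays unchanged: it calls the client exactly as before.
answer = FakeLLMClient().complete("hello")
```

Because the wrapper returns the original result untouched, the caller sees the same behavior as without the patch, which is what lets user code run unmodified.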

2. Development Server (Blue)

The development server is the core analysis engine: it receives events from the user process and updates the UI.

Responsibilities:

Receives LLM call events from the runner
Builds and maintains the dataflow graph
Manages LLM call caching
Handles user edits to inputs/outputs
Controls the UI

Communication: All messages flow through a TCP socket (default port: 5959).
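
As a rough illustration of event reporting over a TCP socket, the sketch below sends one newline-delimited JSON message to a local listener. The wire format is not specified on this page, so the JSON framing, the event fields, and the names EventHandler and send_event are all assumptions. The demo binds an ephemeral port rather than the default 5959 so it cannot collide with a running server.

```python
import json
import socket
import socketserver
import threading

received = []             # events decoded by the demo listener
done = threading.Event()  # set once the listener has read the whole connection


class EventHandler(socketserver.StreamRequestHandler):
    """Hypothetical listener: reads one JSON object per line until EOF."""

    def handle(self):
        for line in self.rfile:
            received.append(json.loads(line))
        done.set()


def send_event(host, port, event):
    """Send a single event as newline-delimited JSON (assumed format)."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall((json.dumps(event) + "\n").encode("utf-8"))


# Port 0 asks the OS for a free port; the page states Sovara's default is 5959.
server = socketserver.TCPServer(("127.0.0.1", 0), EventHandler)
host, port = server.server_address
threading.Thread(target=server.serve_forever, daemon=True).start()

send_event(host, port, {"type": "llm_call", "input": "hi", "output": "hello"})

done.wait(timeout=5)  # wait for the listener before shutting down
server.shutdown()
server.server_close()
```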

3. UI (Red)

The UI is a VS Code extension (or web app) that displays the dataflow graph and provides interactive controls.

Features:

Visualizes the dataflow graph
Allows editing of LLM inputs/outputs
Triggers re-runs with modifications
Shows run history

Content-Based Edge Detection

Sovara detects dataflow between LLM calls using content-based matching:

Store outputs: When an LLM call completes, all text strings from the response are stored in an in-memory registry.
Match inputs: When a new LLM call is made, Sovara checks whether any previously stored output string appears as a substring of the input.
Create edges: If a match is found, an edge is created from the source node to the current node.

This approach is simple and robust:

User code runs completely unmodified
Works with any LLM library that uses httpx/requests
No risk of crashing user code
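
The store, match, and create-edge steps can be sketched as a small in-memory registry. This is a simplified illustration under stated assumptions, not the code in string_matching.py: ContentRegistry, store_outputs, and match_input are hypothetical names, and the node IDs are made up.

```python
from dataclasses import dataclass, field


@dataclass
class ContentRegistry:
    """Hypothetical registry mapping node IDs to their output strings."""

    outputs: dict = field(default_factory=dict)  # node_id -> list of strings
    edges: set = field(default_factory=set)      # (source_id, target_id) pairs

    def store_outputs(self, node_id, strings):
        # Skip empty strings: "" is a substring of every input.
        self.outputs.setdefault(node_id, []).extend(s for s in strings if s)

    def match_input(self, node_id, input_text):
        """Return source nodes whose stored output appears in input_text."""
        sources = [
            src
            for src, strings in self.outputs.items()
            if src != node_id and any(s in input_text for s in strings)
        ]
        for src in sources:
            self.edges.add((src, node_id))
        return sources


registry = ContentRegistry()
registry.store_outputs("call_1", ["Paris is the capital of France."])

# The second call embeds the first call's output, so an edge is created.
sources = registry.match_input("call_2", "Summarize: Paris is the capital of France.")

# An unrelated input matches nothing and creates no edge.
unrelated = registry.match_input("call_3", "What is 2 + 2?")
```

One consequence of plain substring matching, visible in this sketch, is that an output which the user program rewrites before reuse (for example, by paraphrasing or truncating it) would not produce an edge.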

Execution Flow

User runs so-record script.py
Agent runner sets up environment (random seeds, server connection)
Agent runner connects to server (starts it if needed)
Monkey patches are applied to LLM libraries
User code executes unmodified
LLM calls are intercepted and reported to server
Content-based matching detects dataflow edges
Server builds dataflow graph
UI displays the graph in real time
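
The runner side of this flow can be sketched as follows. This is only an assumption about how such a wrapper might look: run_user_script is a hypothetical function, the use of runpy is a choice made for this example rather than a confirmed detail of agent_runner.py, and the server connection and patching steps are reduced to a comment.

```python
import os
import random
import runpy
import sys
import tempfile


def run_user_script(script_path, seed=0):
    """Hypothetical runner: seed, (connect and patch), then run the script."""
    random.seed(seed)  # fixed seed so repeated runs behave the same
    # A real runner would connect to the server and apply monkey patches here.
    old_argv = sys.argv
    sys.argv = [script_path]  # the script sees itself as the launched program
    try:
        # run_name="__main__" makes the script's main guard fire, as with
        # `python script.py`; the script's globals are returned.
        return runpy.run_path(script_path, run_name="__main__")
    finally:
        sys.argv = old_argv


with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "script.py")
    with open(script, "w") as f:
        f.write(
            "import random\n"
            "if __name__ == '__main__':\n"
            "    value = random.randint(0, 10**6)\n"
        )
    first = run_user_script(script)["value"]
    second = run_user_script(script)["value"]
```

Both runs draw the same value because the seed is reset before each execution, which illustrates why a runner might fix random seeds as part of environment setup.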

Module Organization

src/
└── sovara/
    ├── cli/                    # Command-line interface
    │   ├── so_record.py        # Main launch command
    │   ├── so_server.py        # Server management
    │   └── so_config.py        # Configuration
    ├── runner/                 # Runtime execution
    │   ├── agent_runner.py     # Main runner (setup + execution)
    │   ├── string_matching.py  # Content-based edge detection
    │   ├── context_manager.py  # Session management
    │   └── monkey_patching/    # API interception
    │       ├── apply_monkey_patches.py
    │       └── patches/        # Per-API patches
    └── server/                 # Core server
        ├── app.py              # FastAPI app factory
        ├── state.py            # In-memory state and git versioning
        └── database_manager.py # Caching/storage + content registry

ui/
├── shared_components/      # Shared React components and types
├── vscode_extension/       # VS Code extension
└── web_app/                # Standalone web app

Next Steps

Server internals - Deep dive into the development server
Edge detection - How dataflow edges are detected
API patching - How LLM APIs are intercepted