API Patching¶
Sovara uses monkey patching to intercept LLM API calls and record their inputs/outputs for building dataflow graphs.
Overview¶
When you import an LLM SDK (like OpenAI or Anthropic), Sovara patches the relevant methods to:
- Record the call inputs
- Execute the original API call
- Record the outputs
- Detect dataflow edges using content-based matching
- Report the call to the server
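The steps above can be illustrated with a toy wrapper. This is a minimal sketch, not Sovara's code: `FakeClient`, `record_calls`, and `recorded_calls` are hypothetical names, and edge detection and server reporting are reduced to appending a record to a list.

```python
from functools import wraps

class FakeClient:
    """Hypothetical stand-in for an LLM SDK client."""
    def complete(self, prompt):
        return f"echo: {prompt}"

recorded_calls = []  # stand-in for "report the call to the server"

def record_calls(cls, method_name):
    """Replace cls.method_name with a wrapper that records inputs and outputs."""
    original = getattr(cls, method_name)

    @wraps(original)
    def patched(self, *args, **kwargs):
        inputs = {"args": args, "kwargs": kwargs}      # record the call inputs
        output = original(self, *args, **kwargs)       # execute the original call
        recorded_calls.append({                        # record the output
            "api_type": f"{cls.__name__}.{method_name}",
            "inputs": inputs,
            "output": output,
        })
        return output

    setattr(cls, method_name, patched)

record_calls(FakeClient, "complete")
reply = FakeClient().complete("hello")
```

The caller still receives the original return value; the only visible side effect is the extra record.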
Supported APIs¶
Sovara intercepts most LLM calls by patching HTTP libraries. Separate patches cover Google GenAI, MCP tool calls, and random-number seeding:
| Patch | Covers |
|---|---|
| httpx_patch.py | OpenAI, Anthropic (via httpx) |
| requests_patch.py | APIs using requests library |
| genai_patch.py | Google GenAI |
| mcp_patches.py | MCP tool calls |
| randomness_patch.py | numpy, torch, uuid seeding |
How Patches Are Applied¶
Patches are applied lazily, when the relevant module is first imported. The PATCHES dict in apply_monkey_patches.py maps each module name to a (patch module path, patch function name) pair:
PATCHES = {
    "httpx": ("sovara.runner.monkey_patching.patches.httpx_patch", "httpx_patch"),
    "requests": ("sovara.runner.monkey_patching.patches.requests_patch", "requests_patch"),
    "google.genai": ("sovara.runner.monkey_patching.patches.genai_patch", "genai_patch"),
    "mcp": ("sovara.runner.monkey_patching.patches.mcp_patches", "mcp_patch"),
    ...
}
When you import httpx, Sovara's import hook triggers httpx_patch() before returning the module.
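To make the lazy behaviour concrete, here is one way such an import hook could be built with the standard `importlib` machinery. It is a sketch under assumptions, not Sovara's actual hook: `LazyPatchFinder` and `PatchingLoader` are hypothetical names, and the stdlib module `colorsys` stands in for an SDK.

```python
import importlib.abc
import sys

class PatchingLoader(importlib.abc.Loader):
    """Wraps a real loader and runs a patch right after the module executes."""
    def __init__(self, inner, patch):
        self.inner = inner
        self.patch = patch

    def create_module(self, spec):
        return self.inner.create_module(spec)

    def exec_module(self, module):
        self.inner.exec_module(module)  # run the module body as usual
        self.patch(module)              # then patch it before the importer sees it

class LazyPatchFinder(importlib.abc.MetaPathFinder):
    """Meta path finder that applies a patch the first time a module is imported."""
    def __init__(self, patches):
        self.patches = dict(patches)    # module name -> callable(module)

    def find_spec(self, fullname, path, target=None):
        if fullname not in self.patches:
            return None
        # Ask the remaining finders for the real spec.
        for finder in sys.meta_path:
            find = getattr(finder, "find_spec", None)
            if finder is self or find is None:
                continue
            spec = find(fullname, path, target)
            if spec is not None and spec.loader is not None:
                spec.loader = PatchingLoader(spec.loader, self.patches.pop(fullname))
                return spec
        return None

applied = []
sys.modules.pop("colorsys", None)  # demo only: force a fresh import
sys.meta_path.insert(0, LazyPatchFinder({"colorsys": lambda m: applied.append(m.__name__)}))

import colorsys  # the patch runs during this import
```

Modules that are never imported are never patched, so unused SDKs add no import cost.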
Patch Structure¶
A typical patch follows this pattern (see httpx_patch.py for a complete example):
from sovara.runner.context_manager import get_session_id
from sovara.runner.monkey_patching.patching_utils import get_input_dict, send_graph_node_and_edges
from sovara.runner.string_matching import find_source_nodes, store_output_strings
from sovara.server.database_manager import DB

# original_function is the unpatched method, captured by the enclosing patch helper.
def patched_function(self, *args, **kwargs):
    api_type = "my_api.method"

    # 1. Build input dict from args/kwargs
    input_dict = get_input_dict(original_function, *args, **kwargs)

    # 2. Find edges using content-based matching (BEFORE cache lookup)
    session_id = get_session_id()
    source_node_ids = find_source_nodes(session_id, input_dict, api_type)

    # 3. Check cache or call the LLM
    cache_output = DB.get_in_out(input_dict, api_type)
    if cache_output.output is None:
        result = original_function(**cache_output.input_dict)
        DB.cache_output(cache_result=cache_output, output_obj=result, api_type=api_type)

    # 4. Store output strings for future matching
    store_output_strings(cache_output.session_id, cache_output.node_id, cache_output.output, api_type)

    # 5. Report node and edges to server
    send_graph_node_and_edges(
        node_id=cache_output.node_id,
        input_dict=cache_output.input_dict,
        output_obj=cache_output.output,
        source_node_ids=source_node_ids,
        api_type=api_type,
    )
    return cache_output.output
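Step 1 turns positional and keyword arguments into a single dict that can later be replayed as `original_function(**input_dict)`. The real `get_input_dict` lives in patching_utils and is not shown here; the sketch below only illustrates how such a helper could work using `inspect.signature`. `build_input_dict` and `send` are hypothetical names.

```python
import inspect

def build_input_dict(func, *args, **kwargs):
    """Map every argument to its parameter name, filling in defaults."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)

def send(request, stream=False, timeout=None):
    """Hypothetical API method used only for this demo."""
    return (request, stream, timeout)

input_dict = build_input_dict(send, "req-1", stream=True)
replayed = send(**input_dict)  # same call, now expressed purely as keywords
```

A name-keyed dict gives the cache a stable key regardless of whether the caller passed arguments positionally or by keyword. Note that this simple version does not round-trip functions that declare `*args` or `**kwargs` parameters.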
Content-Based Edge Detection¶
Sovara detects dataflow between LLM calls using content-based matching:
- Store outputs: When an LLM call completes, all text strings from the response are stored
- Match inputs: When a new LLM call is made, we check if any stored output strings appear in the input
- Create edges: If a match is found, an edge is created from the source node to the current node
This approach requires no changes to user code: the program runs unmodified, and edges are inferred automatically from the recorded strings.
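The store/match/edge cycle can be sketched with an in-memory index. This is an illustration only and does not reproduce `store_output_strings` or `find_source_nodes`: the `OutputIndex` class and its `min_length` threshold (used to skip very short strings that would match almost anything) are assumptions made for this example.

```python
class OutputIndex:
    """Toy per-session index of output strings for substring matching."""
    def __init__(self, min_length=8):
        self.min_length = min_length
        self._by_session = {}  # session_id -> {output string: node_id}

    def store(self, session_id, node_id, strings):
        """Remember which node produced each sufficiently long output string."""
        table = self._by_session.setdefault(session_id, {})
        for text in strings:
            if len(text) >= self.min_length:
                table.setdefault(text, node_id)

    def find_sources(self, session_id, input_text):
        """Return the nodes whose stored output appears verbatim in the input."""
        table = self._by_session.get(session_id, {})
        return sorted({node for text, node in table.items() if text in input_text})

index = OutputIndex()
index.store("s1", "node-1", ["Paris is the capital of France.", "ok"])
sources = index.find_sources("s1", "Translate to German: Paris is the capital of France.")
edges = [(source, "node-2") for source in sources]  # edge from source node to current node
```

Because matching is verbatim, an output that the program rewrites before reusing it would not produce an edge in this sketch.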
Writing New Patches¶
Step 1: Identify the Target¶
Determine which method you need to patch. For example, the httpx patch wraps the client's send method (api_type "httpx.Client.send").
Step 2: Create the Patch File¶
Add a new file in src/sovara/runner/monkey_patching/patches/:
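# src/sovara/runner/monkey_patching/patches/my_api_patch.py
from functools import wraps

from sovara.runner.monkey_patching.patching_utils import get_input_dict, send_graph_node_and_edges
from sovara.runner.string_matching import find_source_nodes, store_output_strings
from sovara.server.database_manager import DB

def patch_my_api_send(bound_obj, bound_cls):
    # Same (instance, class) signature as patch_httpx_send
    original_function = bound_obj.send

    @wraps(original_function)
    def patched_send(self, *args, **kwargs):
        # Your patching logic here (see httpx_patch.py for full example).
        # Until it is filled in, delegate to the original method.
        return original_function(*args, **kwargs)

    # Bind the wrapper to this instance
    bound_obj.send = patched_send.__get__(bound_obj, bound_cls)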
Step 3: Create the Patch Function¶
In your patch file, create a function that applies the patches when called:
import logging

# Assumption: a standard module-level logger; use Sovara's own logger if one is provided.
logger = logging.getLogger(__name__)

def my_api_patch():
    try:
        from my_api import Client
    except ImportError:
        logger.info("my_api not installed, skipping patches")
        return

    def create_patched_init(original_init):
        @wraps(original_init)
        def patched_init(self, *args, **kwargs):
            original_init(self, *args, **kwargs)
            # Apply method patches to the new instance
            patch_my_api_send(self, type(self))
        return patched_init

    Client.__init__ = create_patched_init(Client.__init__)
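The pattern of patching `__init__` so that every new instance gets its own wrapped method can be tried out in isolation. The sketch below uses a hypothetical `Client` class in place of a real SDK, and records arguments in a list instead of calling Sovara's helpers.

```python
from functools import wraps

class Client:
    """Hypothetical SDK client used only for this demo."""
    def __init__(self, name):
        self.name = name

    def send(self, request):
        return f"{self.name} sent {request}"

seen = []  # stand-in for the real recording logic

def patch_send(bound_obj, bound_cls):
    original_send = bound_obj.send  # bound method: already carries the instance

    @wraps(original_send)
    def patched_send(self, *args, **kwargs):
        seen.append(args)
        return original_send(*args, **kwargs)

    # Store the wrapper on the instance, bound so that `self` is passed in.
    bound_obj.send = patched_send.__get__(bound_obj, bound_cls)

def create_patched_init(original_init):
    @wraps(original_init)
    def patched_init(self, *args, **kwargs):
        original_init(self, *args, **kwargs)
        patch_send(self, type(self))
    return patched_init

unpatched_send = Client.send
Client.__init__ = create_patched_init(Client.__init__)

client = Client("c1")
result = client.send("ping")
```

Only instances created after the patch are affected, and the class attribute `Client.send` itself is left untouched.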
Step 4: Register in PATCHES¶
Add your patch to the PATCHES dict in apply_monkey_patches.py:
PATCHES = {
    "httpx": ("sovara.runner.monkey_patching.patches.httpx_patch", "httpx_patch"),
    "my_api": ("sovara.runner.monkey_patching.patches.my_api_patch", "my_api_patch"),  # Add here
    ...
}
The patch will be applied automatically when users import my_api.
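Storing a (module path, function name) pair rather than the function itself means the patch module need not be imported until it is needed. How apply_monkey_patches.py resolves an entry is not shown in this document; a plausible sketch uses `importlib.import_module`. `resolve_patch` and `DEMO_PATCHES` are hypothetical, and a stdlib function stands in for a real patch function.

```python
import importlib

def resolve_patch(entry):
    """Turn a (module path, function name) pair into the callable it names."""
    module_path, function_name = entry
    module = importlib.import_module(module_path)  # imported only at this point
    return getattr(module, function_name)

# Demo registry: os.path.basename is a stand-in for a patch function.
DEMO_PATCHES = {
    "my_api": ("os.path", "basename"),
}

patch_function = resolve_patch(DEMO_PATCHES["my_api"])
```

A misspelled module path or function name surfaces as `ImportError` or `AttributeError` at resolution time, so a new entry is worth exercising once in a test.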
Example: httpx Patch¶
Here's a simplified view of how the httpx patch works (used by OpenAI, Anthropic, etc.):
def patch_httpx_send(bound_obj, bound_cls):
    original_function = bound_obj.send

    @wraps(original_function)
    def patched_function(self, *args, **kwargs):
        api_type = "httpx.Client.send"
        input_dict = get_input_dict(original_function, *args, **kwargs)

        # Check if URL is whitelisted (LLM endpoint)
        request = input_dict["request"]
        if not is_whitelisted_endpoint(str(request.url), request.url.path):
            return original_function(*args, **kwargs)

        # Get cached result or call LLM
        cache_output = DB.get_in_out(input_dict, api_type)
        if cache_output.output is None:
            result = original_function(**cache_output.input_dict)
            DB.cache_output(cache_result=cache_output, output_obj=result, api_type=api_type)

        # Content-based edge detection
        source_node_ids = find_source_nodes(cache_output.session_id, cache_output.input_dict, api_type)
        store_output_strings(cache_output.session_id, cache_output.node_id, cache_output.output, api_type)

        # Report to server
        send_graph_node_and_edges(...)
        return cache_output.output

    bound_obj.send = patched_function.__get__(bound_obj, bound_cls)
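The whitelist check matters because httpx carries all of a program's HTTP traffic, not only LLM requests. The real `is_whitelisted_endpoint` is not shown in this document; the sketch below is one way such a filter might look. The function name `looks_like_llm_endpoint` and the host and path lists are illustrative assumptions, not Sovara's actual whitelist.

```python
import re
from urllib.parse import urlparse

# Illustrative values only; the real whitelist may differ.
ALLOWED_HOSTS = {"api.openai.com", "api.anthropic.com"}
ALLOWED_PATHS = [
    re.compile(r"/v1/chat/completions$"),
    re.compile(r"/v1/messages$"),
]

def looks_like_llm_endpoint(url, path):
    """Return True only for known LLM hosts combined with known completion paths."""
    host = urlparse(url).hostname or ""
    return host in ALLOWED_HOSTS and any(p.search(path) for p in ALLOWED_PATHS)

is_llm = looks_like_llm_endpoint(
    "https://api.openai.com/v1/chat/completions", "/v1/chat/completions"
)
is_other = looks_like_llm_endpoint("https://example.com/health", "/health")
```

Requests that fail the check go straight to the original method, so unrelated traffic is neither cached nor added to the graph.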
Async Support¶
Many LLM APIs are async. Patches must handle both sync and async methods:
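import inspect
from functools import wraps

def patch_method(original):
    # inspect.iscoroutinefunction is preferred: asyncio.iscoroutinefunction
    # is deprecated since Python 3.14.
    if inspect.iscoroutinefunction(original):
        @wraps(original)
        async def async_patched(*args, **kwargs):
            # async implementation: record around the awaited call
            return await original(*args, **kwargs)
        return async_patched
    else:
        @wraps(original)
        def sync_patched(*args, **kwargs):
            # sync implementation: record around the plain call
            return original(*args, **kwargs)
        return sync_patched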
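The following runnable sketch applies the same dispatch to two toy functions, `fetch` and `fetch_async` (both hypothetical), and records each result in a list in place of the real patching logic.

```python
import asyncio
import inspect
from functools import wraps

calls = []  # stand-in for recording and reporting

def patch_method(original):
    if inspect.iscoroutinefunction(original):
        @wraps(original)
        async def async_patched(*args, **kwargs):
            result = await original(*args, **kwargs)  # must be awaited inside the wrapper
            calls.append(("async", result))
            return result
        return async_patched

    @wraps(original)
    def sync_patched(*args, **kwargs):
        result = original(*args, **kwargs)
        calls.append(("sync", result))
        return result
    return sync_patched

def fetch(x):
    return x * 2

async def fetch_async(x):
    await asyncio.sleep(0)
    return x * 2

sync_result = patch_method(fetch)(2)
async_result = asyncio.run(patch_method(fetch_async)(3))
```

Wrapping a coroutine function with a plain `def` would record the coroutine object rather than its result, which is why the async branch needs its own `async def` wrapper.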
API Parsers¶
Each LLM API has different request/response formats. API parsers extract relevant information:
src/sovara/runner/monkey_patching/api_parsers/
├── httpx_api_parser.py # OpenAI, Anthropic (via httpx)
├── requests_api_parser.py # APIs using requests
├── genai_api_parser.py # Google GenAI
└── mcp_api_parser.py # MCP tool calls
Parsers normalize HTTP responses into a common format for caching and display. See api_parser.py for the main interface that routes to the appropriate parser based on api_type.
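Routing on `api_type` can be pictured as a small dispatch table. The interface of api_parser.py is not shown in this document, so everything in this sketch is an assumption: the prefix-based registry, the function names, and the shape of the normalized dict.

```python
from typing import Any, Callable, Dict

Parser = Callable[[Any], Dict[str, Any]]
_PARSERS: Dict[str, Parser] = {}  # api_type prefix -> parser

def register(prefix: str) -> Callable[[Parser], Parser]:
    """Decorator that registers a parser for all api_types with this prefix."""
    def decorator(parser: Parser) -> Parser:
        _PARSERS[prefix] = parser
        return parser
    return decorator

@register("httpx.")
def parse_httpx(response: Any) -> Dict[str, Any]:
    return {"source": "httpx", "body": response}

@register("requests.")
def parse_requests(response: Any) -> Dict[str, Any]:
    return {"source": "requests", "body": response}

def parse_output(api_type: str, response: Any) -> Dict[str, Any]:
    """Route a raw response to the parser registered for its api_type."""
    for prefix, parser in _PARSERS.items():
        if api_type.startswith(prefix):
            return parser(response)
    raise ValueError(f"no parser registered for api_type {api_type!r}")

parsed = parse_output("httpx.Client.send", '{"ok": true}')
```

Raising on an unknown `api_type` makes a missing parser visible immediately instead of silently storing an unparsed response.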
Maintenance¶
LLM APIs change frequently. To detect API changes:
- Run tests after upgrading SDK versions
- Check for deprecation warnings
- Review SDK changelogs
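One inexpensive test to run after an SDK upgrade is to check that every method a patch wraps still exists. The helper below is a hypothetical sketch, not part of Sovara; the demo targets use the standard library so the example runs without any SDK installed.

```python
import importlib

def find_missing_targets(targets):
    """Return 'module:attr.path' for every patch target that no longer resolves."""
    missing = []
    for module_name, attr_path in targets:
        try:
            obj = importlib.import_module(module_name)
            for part in attr_path.split("."):
                obj = getattr(obj, part)
        except (ImportError, AttributeError):
            missing.append(f"{module_name}:{attr_path}")
    return missing

# In a real test suite this list would name SDK methods such as ("httpx", "Client.send").
demo_targets = [
    ("json", "JSONEncoder.encode"),
    ("json", "JSONEncoder.no_such_method"),
    ("module_that_does_not_exist", "anything"),
]
missing = find_missing_targets(demo_targets)
```

Running the test suite with `python -W error::DeprecationWarning` additionally turns deprecation warnings into failures, which surfaces upcoming SDK changes earlier.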
Next Steps¶
- Edge Detection - How dataflow edges are detected
- Testing - Running the test suite
- Architecture - System overview