
Type Safety

Strong type hints do two things: they guide agents toward the correct output structure, and they give you type-safe returns in your code. The more specific your types, the more constrained an agent’s output will be. Use literal types to restrict outputs to specific values:
from typing import Literal

@agentic()
async def classify(text: str) -> Literal['positive', 'negative', 'neutral']:
    """Classify sentiment"""
    ...

# The agent can only return one of these three exact strings
result = await classify("Great product!")  # Type is Literal['positive', 'negative', 'neutral']
Use structured types for complex outputs. The agent will match your type structure exactly:
from dataclasses import dataclass
from typing import Literal

@dataclass
class Review:
    rating: Literal[1, 2, 3, 4, 5]
    sentiment: Literal['positive', 'negative', 'neutral']
    categories: list[str]
    summary: str

@agentic()
async def analyze_review(text: str) -> Review:
    """Analyze a product review"""
    ...

# Returns a fully typed Review object
review = await analyze_review("Great product, fast shipping!")
print(review.rating)  # Type-safe access
Combine types with validation for even stronger guarantees. See Error Handling for validation patterns using Pydantic and Zod.
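Even without a validation library, you can enforce at construction time what the Literal hints promise statically. Here is a standard-library-only sketch of the Review type above using `__post_init__` (the Pydantic equivalent is covered in Error Handling):

```python
from dataclasses import dataclass

@dataclass
class Review:
    rating: int
    sentiment: str
    categories: list[str]
    summary: str

    def __post_init__(self) -> None:
        # Enforce at runtime what the Literal hints promise statically
        if self.rating not in (1, 2, 3, 4, 5):
            raise ValueError(f"rating out of range: {self.rating}")
        if self.sentiment not in ('positive', 'negative', 'neutral'):
            raise ValueError(f"unknown sentiment: {self.sentiment}")
        if not self.summary.strip():
            raise ValueError("summary must not be empty")
```

Constructing a `Review` with an out-of-range rating now raises `ValueError` immediately, so malformed agent output fails loudly instead of propagating.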

Security

Credential Management

Never hardcode API keys or secrets. Use environment variables. This keeps credentials out of your codebase and allows different values per environment.
import os

# Good - use environment variables
api_key = os.environ["API_KEY"]
database_url = os.environ.get("DATABASE_URL")

# Bad - hardcoded secrets
api_key = "sk-proj-abc123..."  # Never commit this
Never pass raw API keys to agents. Instead, pass pre-authenticated SDK clients or specific methods. The agent uses the functionality without ever seeing the credentials:
from agentica import spawn
from github import Github

# Good - pass authenticated client methods
gh = Github(os.environ["GITHUB_TOKEN"])
agent = await spawn(premise="You are a GitHub analyst")

result = await agent.call(
    Report,
    "Analyze the repository's recent activity",
    get_repo=gh.get_repo,
    search_issues=gh.search_issues
)
# Agent can use GitHub API without accessing the token

# Bad - passing raw credentials
result = await agent.call(
    Report,
    "Analyze repository",
    github_token=os.environ["GITHUB_TOKEN"]  # Never do this
)
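You can narrow the surface further by pre-binding arguments before handing a method to the agent. The sketch below uses `functools.partial` with a stand-in search function; with PyGithub-style clients, extra keyword arguments to `search_issues` become search qualifiers, but treat that as an assumption to verify against your client:

```python
from functools import partial

def scope_to_repo(search_issues, repo: str):
    # Every call the agent makes is automatically qualified to one repo,
    # so it cannot search arbitrary repositories.
    return partial(search_issues, repo=repo)

# Stand-in for gh.search_issues, used here only for illustration
def fake_search(query: str, **qualifiers):
    return {"query": query, "qualifiers": qualifiers}

scoped = scope_to_repo(fake_search, "octocat/hello-world")
result = scoped("label:bug")
```

Pass `scoped` into `agent.call` instead of the raw method and the agent's reach is limited to that one repository.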

Input Validation

Validate user input before passing it to agentic functions. This prevents injection attacks and ensures your agentic functions receive clean data.
from agentica import agentic

@agentic()
async def query_database(user_input: str, schema: dict) -> list[dict]:
    """
    Generate and execute a database query based on user input.
    Only generate SELECT queries. Use the schema to validate table/column names.
    """
    ...

async def safe_query(user_input: str, schema: dict) -> list[dict]:
    # Validate input length
    if len(user_input) > 500:
        raise ValueError("Input too long")

    # Check for suspicious patterns
    dangerous_keywords = ['drop', 'delete', 'truncate', 'insert', 'update']
    if any(keyword in user_input.lower() for keyword in dangerous_keywords):
        raise ValueError("Invalid query keywords")

    # Now safe to pass to an agent
    return await query_database(user_input, schema)
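Keyword blocklists like the one above are easy to bypass (mixed casing is handled, but comments, string concatenation, or synonyms are not). Where your input format allows it, prefer an allowlist that rejects anything outside a known-safe character set. A hypothetical sketch:

```python
import re

# Hypothetical policy: letters, digits, whitespace, and basic punctuation
# only. Anything that could smuggle SQL metacharacters (quotes, semicolons,
# comment markers) fails the match outright.
SAFE_INPUT = re.compile(r"[\w\s.,?-]{1,500}")

def validate_input(user_input: str) -> str:
    if not SAFE_INPUT.fullmatch(user_input):
        raise ValueError("Input contains disallowed characters")
    return user_input
```

Tune the character class to your domain; the point is to define what is allowed rather than enumerate what is forbidden.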

File Access Scope

Agents that can open arbitrary paths can easily escape their intended sandbox (for example by traversing ../) and read, modify, or delete files across your system. Avoid passing Path objects or unrestricted file paths directly to agents or agentic functions. Instead, pre-open only the specific files you want the agent to access and pass those file handles instead.
from typing import TextIO
from agentica import agentic

@agentic()
async def summarize_report(report_file: TextIO) -> str:
    """
    Read the already-open report_file and summarize its contents.
    """
    ...

with open("/var/reports/weekly.csv", "r", encoding="utf-8") as f:
    # The agent only sees this specific handle, not your whole filesystem
    summary = await summarize_report(f)
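When the agent needs to pick among several files, keep the choice server-side: resolve the requested name against an allowed base directory and refuse anything that escapes it. A sketch (requires Python 3.9+ for `Path.is_relative_to`):

```python
from pathlib import Path

def open_within(base_dir: str, name: str):
    """Open `name` only if it resolves inside `base_dir`.

    Resolving before the containment check defeats '../' traversal
    and symlink escapes.
    """
    base = Path(base_dir).resolve()
    target = (base / name).resolve()
    if not target.is_relative_to(base):
        raise PermissionError(f"{name} escapes {base_dir}")
    return target.open("r", encoding="utf-8")
```

Expose `open_within` (with `base_dir` pre-bound) to the agent rather than `open` itself.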

Rate Limiting

Implement rate limiting to protect against abuse and manage costs. This is especially important for user-facing features.
from collections import defaultdict
from time import time

class RateLimiter:
    def __init__(self, max_calls: int, window_seconds: int):
        self.max_calls = max_calls
        self.window = window_seconds
        self.calls: dict[str, list[float]] = defaultdict(list)

    def allow(self, user_id: str) -> bool:
        now = time()
        # Remove old calls outside window
        self.calls[user_id] = [t for t in self.calls[user_id] if now - t < self.window]

        if len(self.calls[user_id]) >= self.max_calls:
            return False

        self.calls[user_id].append(now)
        return True

limiter = RateLimiter(max_calls=10, window_seconds=60)

@agentic()
async def summarize(text: str) -> str:
    """Summarize the text"""
    ...

async def rate_limited_summarize(user_id: str, text: str) -> str:
    if not limiter.allow(user_id):
        raise Exception("Rate limit exceeded. Try again in a minute.")
    return await summarize(text)
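The sliding-window limiter above admits a full burst the moment the oldest calls age out. A token bucket (sketch below, not part of agentica) refills continuously instead, which smooths bursts while allowing the same average rate:

```python
from time import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.updated = time()

    def allow(self) -> bool:
        now = time()
        # Refill in proportion to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Keep one bucket per user (e.g. in a dict) and call `allow()` exactly where `limiter.allow(user_id)` is called today.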
Exponential backoff handles transient failures when agentic functions or agents call external APIs that may be rate-limited.
import asyncio
from agentica import agentic
from dataclasses import dataclass
from typing import Literal

@dataclass
class FetchResult:
    status: Literal['success', 'rate_limited', 'error']
    data: list[dict] | None
    message: str

@agentic()
async def fetch_github_data(query: str, api_search) -> FetchResult:
    """
    Search GitHub using the provided api_search function.
    If you encounter a rate limit response, return status='rate_limited'.
    If successful, return status='success' with the data.
    If other error, return status='error' with a message.
    """
    ...

async def fetch_with_backoff(query: str, api_search, max_retries: int = 3) -> FetchResult:
    for attempt in range(max_retries):
        result = await fetch_github_data(query, api_search)

        if result.status == 'success':
            return result
        elif result.status == 'rate_limited' and attempt < max_retries - 1:
            # Exponential backoff: 1s, 2s, 4s
            wait_time = 2 ** attempt
            await asyncio.sleep(wait_time)
            continue
        else:
            return result

    return FetchResult('error', None, 'Max retries exceeded')
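One refinement worth considering: with fixed 1 s / 2 s / 4 s delays, clients that were rate-limited together retry together. "Full jitter" (randomizing within the backoff ceiling) spreads retries out; the helper below could replace the `wait_time` calculation above:

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    # Full jitter: pick uniformly in [0, min(cap, base * 2**attempt)]
    # so simultaneous retriers desynchronize instead of colliding again.
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

The `cap` keeps worst-case waits bounded no matter how many retries you allow.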

Monitoring

Track these key metrics in production to understand your agentic operations:
  • Latency. How long do agentic functions and agents take to respond?
  • Error rates. What percentage of agentic calls fail or timeout?
  • Usage patterns. Which functions are called most? By which users?
  • Output quality. Are results meeting expectations? Use sampling to review outputs.

Logging

Log agentic operations with structured data. Include the operation name, input size, model used, and timing. This helps debug issues and identify patterns.
import logging
import time

logger = logging.getLogger(__name__)

@agentic()
async def classify(text: str) -> str:
    """Classify sentiment"""
    ...

async def monitored_classify(text: str) -> str:
    start = time.time()
    try:
        result = await classify(text)
        logger.info("Agentic operation succeeded", extra={
            "operation": "classify",
            "input_length": len(text),
            "latency_ms": (time.time() - start) * 1000,
            "model": "gpt-4"
        })
        return result
    except Exception as e:
        logger.error("Agentic operation failed", extra={
            "operation": "classify",
            "error": str(e),
            "input_length": len(text),
            "latency_ms": (time.time() - start) * 1000
        })
        raise
Never log sensitive data. User inputs, API keys, or PII should not appear in logs. See Error Handling › Sensitive Data Handling for examples of safe logging practices.
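The wrapper pattern above generalizes into a decorator, so each agentic function does not need a hand-written monitoring shim. A sketch that keeps the no-sensitive-data rule by logging only timings and an operation name, never the text itself:

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)

def monitored(operation: str):
    """Wrap an async function with structured success/failure logging."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            start = time.time()
            try:
                result = await fn(*args, **kwargs)
                logger.info("Agentic operation succeeded", extra={
                    "operation": operation,
                    "latency_ms": (time.time() - start) * 1000,
                })
                return result
            except Exception as e:
                logger.error("Agentic operation failed", extra={
                    "operation": operation,
                    "error": str(e),
                    "latency_ms": (time.time() - start) * 1000,
                })
                raise
        return wrapper
    return decorator
```

Apply it as `@monitored("classify")` above the agentic function, and adapt the `extra` fields to whatever your log pipeline indexes.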

Performance

Caching

Cache agent responses when the same inputs produce the same outputs. This reduces latency and costs for repeated operations. Use caching for:
  • Reference data that changes infrequently (product descriptions, documentation)
  • Expensive operations called repeatedly with the same inputs
  • Read-heavy workflows where consistency is acceptable
from functools import lru_cache

# Decorate the agentic function directly
@lru_cache(maxsize=1000)
@agentic()
async def categorize_product(description: str) -> str:
    """Categorize product into a department"""
    ...

# Same description returns cached result
category1 = await categorize_product("Red cotton t-shirt")  # Calls agent
category2 = await categorize_product("Red cotton t-shirt")  # Returns cached
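One caveat to verify in your setup: `functools.lru_cache` caches whatever the wrapped callable returns, and for a plain `async def` that is the coroutine object, which can only be awaited once. If the `@agentic()` wrapper does not handle this for you, a small result-caching decorator is safer:

```python
import asyncio
import functools

def async_cache(fn):
    """Cache the *results* of an async function, not coroutine objects.

    Awaits the wrapped function first and stores the value; the lock
    serializes lookups so concurrent identical calls invoke the
    underlying function only once.
    """
    cache: dict = {}
    lock = asyncio.Lock()

    @functools.wraps(fn)
    async def wrapper(*args):
        async with lock:
            if args not in cache:
                cache[args] = await fn(*args)
            return cache[args]
    return wrapper
```

Stack it where `@lru_cache` appears above; the cached value can be awaited from as many call sites as you like.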
Advanced: Best-of-N caching with retries. Like JIT compilation that eventually compiles hot code paths, you can combine caching with retry strategies to create a “best-of-N” pattern: retry failed operations until you get a high-quality result, then cache that successful response. Future calls skip the retry logic entirely and use the cached “compiled” result. This is particularly useful for expensive operations where you want to pay the retry cost once, then reuse the validated output.
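The pattern that paragraph describes can be sketched as a wrapper. Here `score` is a hypothetical quality check you supply (anything from output validation to an LLM judge); none of this is agentica API:

```python
def best_of_n(fn, score, n: int = 3, threshold: float = 0.8):
    """Retry up to n times, keep the best result, cache it once validated.

    `score` maps a result to a quality value in [0.0, 1.0]; results at or
    above `threshold` are cached so future calls skip the retry loop.
    """
    cache: dict = {}

    async def wrapper(*args):
        if args in cache:
            return cache[args]
        best, best_score = None, -1.0
        for _ in range(n):
            candidate = await fn(*args)
            s = score(candidate)
            if s > best_score:
                best, best_score = candidate, s
            if s >= threshold:
                break
        if best_score >= threshold:
            cache[args] = best  # only cache validated outputs
        return best
    return wrapper
```

Low-quality results are returned but never cached, so the next call gets another chance to produce a "compiled" answer.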

Parallel Processing

Process multiple items in parallel when they’re independent. This is faster than sequential processing.
import asyncio

@agentic()
async def analyze(text: str) -> dict:
    """Analyze the text"""
    ...

# Process all texts in parallel
texts = ["text 1", "text 2", "text 3"]
results = await asyncio.gather(*[analyze(text) for text in texts])
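If the list is large, unbounded gather can trip provider rate limits. A semaphore caps in-flight calls while preserving result order; a minimal sketch:

```python
import asyncio

async def gather_bounded(coros, limit: int = 5):
    # At most `limit` coroutines run at once; asyncio.gather still
    # returns results in the order the coroutines were given.
    sem = asyncio.Semaphore(limit)

    async def run(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))
```

Usage mirrors the example above: `results = await gather_bounded([analyze(t) for t in texts], limit=5)`.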

Stateful Workflows with Agents

Use agents for multi-step workflows where later steps depend on earlier results. Agents maintain context across invocations, allowing them to make decisions based on what they’ve already done. Here’s an agent that debugs code by analyzing, then deciding whether to fix or explain based on what it finds:
from agentica import spawn

agent = await spawn(
    premise="""
    You are a code debugger. When given code with an error:
    1. First analyze the error to understand the root cause
    2. If it's a simple fix (syntax, typo), fix it and return the corrected code
    3. If it's a logic error requiring design changes, explain the issue instead
    """,
    model="openai:gpt-4.1"
)

# First invocation: analyze
await agent.call(None, "Analyze this error", code=broken_code, error=error_msg)

# Second invocation: agent decides to fix or explain based on analysis
result = await agent.call(
    str,
    "Based on your analysis, either fix the code or explain what needs to change"
)
# The agent remembers its analysis and chooses the appropriate action
For truly independent operations, use agentic functions and process in parallel. For dependent workflows where context matters, use a single agent across multiple calls.

Cost Optimization

Inference costs money. Optimize by choosing the right model, caching responses, and using agents only when needed.
  • Choose the right model for the task. Use cheaper models for simple operations and more expensive models for complex reasoning. See Model Selection for guidance.
  • Cache aggressively. Every cache hit is a cost you don't pay. See Caching above.
  • Keep prompts concise. Longer prompts cost more. Remove unnecessary context or examples once you've validated your agentic function works.
  • Use agents strategically. Agents maintain conversation history, which grows with each call and costs more. For stateless operations, use agentic functions instead.
Bad: Using an agent for independent operations
# Inefficient - agent maintains unnecessary history
agent = await spawn(premise="You are a data processor")

for item in items:
    result = await agent.call(dict, f"Process this item: {item}")
    # Each call adds to history, increasing cost
Good: Using agentic function for independent operations
@agentic()
async def process_item(item: str) -> dict:
    """Process the item"""
    ...

# Each call is independent, no growing history
for item in items:
    result = await process_item(item)
Good: Using agent when context matters
# Agent remembers context across steps
agent = await spawn(premise="You are a research assistant")

# Step 1: Find relevant papers
papers = await agent.call(list[str], "Search for papers on quantum computing", web_search=search)

# Step 2: Agent remembers which papers it found
summary = await agent.call(str, "Summarize the key findings from these papers")

# Step 3: Agent has full context to compare
comparison = await agent.call(str, "Which paper has the most practical applications?")

Deployment Checklist

Before deploying agentic features to production:

Environment & Configuration
  • Environment variables configured for all environments (dev, staging, prod)
  • API keys secured and not hardcoded
  • Model selections appropriate for each environment (cheaper models for dev/test)

Error Handling & Reliability
  • Try/except blocks around all agent operations
  • Fallback strategies for critical paths
  • Retry logic for transient failures
  • Validation on agent outputs where needed

Security
  • Input validation on user-provided data
  • Rate limiting implemented for user-facing features
  • Sensitive data excluded from logs
  • Authenticated SDK clients passed instead of raw API keys

Monitoring & Observability
  • Structured logging in place for agentic operations
  • Metrics tracked (latency, error rates, usage)
  • Alerts configured for error spikes or high latency
  • Sample-based output quality monitoring

Testing
  • Unit tests for agentic functions with representative inputs
  • Integration tests for multi-agent workflows
  • Load tests if serving high-volume traffic
  • Manual review of agent outputs on diverse test cases

Cost Management
  • Caching implemented for repeated operations
  • Model selection optimized (avoid expensive models for simple tasks)
  • Budget alerts configured with your agent provider
  • Rate limits prevent runaway costs

Next Steps