Overview

Agentic operations can fail due to factors outside your code — infrastructure issues, network problems, or service limitations. These operational errors originate from the platform or external services, not from your application logic or the agent’s decisions. Common operational errors include:
  • Inference errors: When the underlying model fails to respond
  • Network/API errors: Connection failures, rate limits
  • Sandbox errors: Communication issues between the model and its execution environment
  • Internal server errors: Platform infrastructure problems
Operational errors are different from agent errors, which are exceptions intentionally raised by the agent based on your business logic.

When Errors Reach You

The platform handles most operational errors automatically. Agentica employs retry strategies with exponential backoff and jitter for transient failures like inference rate limits, temporary network issues, and service unavailability. You only see errors that are unrecoverable at the platform level: when an error bubbles up to your code, the platform has exhausted its retry attempts and the operation cannot be completed without intervention. This design keeps your error handling focused on genuinely exceptional situations.
For errors that do reach your code, handle them as you would any other async network operation — with try/except blocks, additional retry logic if appropriate, and fallback strategies. Some of these errors may represent limitations or bugs in Agentica itself; see Reporting Bugs for how to report them to us.
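As a minimal sketch of that pattern (analyze_text is the @agentic() function defined under Catching Specific Errors below; the fallback value is our own choice):
import logging

from agentica.errors import AgenticaError

logger = logging.getLogger(__name__)

async def analyze_or_fallback(document: str) -> dict:
    try:
        # Any agentic call can surface an unrecoverable operational error.
        return await analyze_text(document)
    except AgenticaError as e:
        logger.error(f"Analysis failed, using fallback: {e}")
        # Fall back to a conservative default rather than failing the request.
        return {"sentiment": "neutral", "topics": []}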

Error Types

Agentica exports a comprehensive set of error classes to help you handle different failure scenarios. All errors inherit from AgenticaError, making it easy to catch all SDK-related errors.
from agentica.errors import AgenticaError

Error Hierarchy

Errors not intentionally raised by the agent can be caught as exceptions inheriting from the AgenticaError base class, which the SDK exports under agentica.errors. The hierarchy organizes errors by their source: connection issues, agent invocation problems, or errors from the inference service.

Base Exceptions

| Error | Description |
| --- | --- |
| AgenticaError | Base exception for all Agentica errors. Catch this to handle any error not purposefully raised by the agent. |
| ServerError | Base class for errors during remote operations. Parent of GenerationError. |
| GenerationError | Base class for errors during agent generation. Parent of inference-related errors. |
| InferenceError | Base class for HTTP errors from the inference service. Parent of most API errors below. |

Connection Errors

| Error | Parent | Description |
| --- | --- | --- |
| ConnectionError | AgenticaError | General connection failure |
| WebSocketConnectionError | ConnectionError | WebSocket connection failed or was interrupted |
| WebSocketTimeoutError | ConnectionError | WebSocket connection timed out |

Invocation Errors

| Error | Parent | Description |
| --- | --- | --- |
| InvocationError | AgenticaError | General error during agent invocation |
| TooManyInvocationsError | InvocationError | Exceeded maximum number of invocations |
| NotRunningError | InvocationError | Attempted to use an agent that is not running |

Inference Service Errors

These errors originate from the inference service and are forwarded through the SDK:
| Error | Parent | Description |
| --- | --- | --- |
| MaxTokensError | GenerationError | Response exceeded maximum token limit |
| ContentFilteringError | GenerationError | Content was filtered by safety systems |
| APIConnectionError | InferenceError | Failed to connect to the inference API |
| APITimeoutError | InferenceError | Inference API request timed out |
| RateLimitError | InferenceError | Rate limit exceeded, slow down requests |
| BadRequestError | InferenceError | Request was malformed or invalid (400) |
| UnauthorizedError | InferenceError | Authentication failed or missing (401) |
| PermissionDeniedError | InferenceError | Insufficient permissions (403) |
| NotFoundError | InferenceError | Requested resource not found (404) |
| ConflictError | InferenceError | Request conflicts with current state (409) |
| UnprocessableEntityError | InferenceError | Request understood but cannot be processed (422) |
| RequestTooLargeError | InferenceError | Request payload too large for inference service |
| InternalServerError | InferenceError | Internal server error (500) |
| ServiceUnavailableError | InferenceError | Service temporarily unavailable (503) |
| OverloadedError | InferenceError | Inference service is overloaded, try again later |
| DeadlineExceededError | InferenceError | Operation exceeded its deadline |
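Because a handler for a base class also catches its descendants, you can choose how granular to be. A quick sketch of the relationships documented in the tables above:
from agentica.errors import (
    AgenticaError,
    ConnectionError,
    GenerationError,
    InferenceError,
    RateLimitError,
    ServerError,
    WebSocketTimeoutError,
)

# Parent/child relationships from the tables above:
assert issubclass(ServerError, AgenticaError)
assert issubclass(GenerationError, ServerError)
assert issubclass(RateLimitError, InferenceError)
assert issubclass(WebSocketTimeoutError, ConnectionError)

# So `except AgenticaError:` catches every operational error listed here.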

Controlling Token Limits

The MaxTokensError occurs when a response exceeds the maximum token limit. You can proactively control this by setting max_tokens when creating agents or agentic functions. Note that this limit applies per inference request — a single agent or agentic function invocation may make multiple inference requests, and each request is subject to this limit independently.
from agentica import spawn, agentic, MaxTokens

# For agents
agent = await spawn(
    premise="You are a concise assistant.",
    max_tokens=500  # Limit each inference request to 500 output tokens
)

# For agentic functions
@agentic(max_tokens=1000)
async def summarize(text: str) -> str:
    """Create a brief summary."""
    ...

# For finer control, use MaxTokens
agent = await spawn(
    premise="You are a concise assistant.",
    max_tokens=MaxTokens(per_invocation=5000, per_round=1000, rounds=5)
)
For more details on MaxTokens, see the Python API reference.
Use cases for token limits:
  • Cost control: Limit token usage per inference request to manage costs
  • Response length: Ensure individual inference outputs meet length requirements
  • Error prevention: Avoid unexpectedly long responses in a single inference request
When you set max_tokens, each inference request will stop generating when it reaches that limit. If the natural response would exceed this limit, a MaxTokensError will be raised, which you can catch and handle appropriately (see examples below).
Setting appropriate token limits upfront is more efficient than handling MaxTokensError after the fact, as it prevents wasted compute on overly long responses.

Catching Specific Errors

You can catch specific error types to implement different handling strategies. The error hierarchy lets you handle errors at different levels of granularity:
import asyncio
import logging

from agentica import agentic
from agentica.errors import (
    AgenticaError,
    RateLimitError,
    InferenceError,
    ConnectionError,
    MaxTokensError,
)

logger = logging.getLogger(__name__)

@agentic()
async def analyze_text(text: str) -> dict:
    """Analyze the sentiment and extract key topics."""
    ...

try:
    result = await analyze_text(document)
except RateLimitError as e:
    # Handle specific inference error
    logger.warning(f"Rate limited: {e}")
    await asyncio.sleep(60)  # Back off without blocking the event loop
    result = await analyze_text(document)
except MaxTokensError as e:
    # Handle token limit from inference service
    logger.warning(f"Response too long: {e}")
    result = await analyze_text(document[:len(document)//2])  # Try with less input
except InferenceError as e:
    # Catch all other inference service errors (API errors, timeouts, etc.)
    logger.error(f"Inference service error: {e}")
    result = {"sentiment": "neutral", "topics": []}
except ConnectionError as e:
    # Handle connection failures
    logger.error(f"Connection failed: {e}")
    # retry_with_backoff: your own async retry helper (see Retry Strategies below)
    result = await retry_with_backoff(lambda: analyze_text(document))
except AgenticaError as e:
    # Catch any other SDK errors
    logger.error(f"Unexpected Agentica error: {e}")
    raise

Retry Strategies

The platform already retries most failures automatically using exponential backoff and jitter. When an error reaches your code, the platform's built-in retry logic has been exhausted, so you typically don't need additional retries.
If your application has specific requirements (for example, custom error handling or longer retry windows), you can implement your own retry strategy, such as this simple one:
from typing import Any, Awaitable, Callable

async def retry(operation: Callable[[], Awaitable[Any]], retries: int = 3) -> Any:
    error = None
    for _ in range(retries):
        try:
            # The operation must be awaited: agentic calls are coroutines
            return await operation()
        except Exception as e:
            error = e
    raise error
which may be used to wrap any agentic function or agent call:
result = await retry(lambda: extract_date(document), retries=3)
Because agentic functions are ordinary Python functions, you can also use widely available libraries such as tenacity to handle retry logic:
from tenacity import retry, stop_after_attempt

@retry(stop=stop_after_attempt(3))
@agentic()
async def process_data(input_data: str) -> dict:
    """Process and analyze the input data."""
    ...
Remember that the platform has already retried transient failures before an error reaches your code. If an operation consistently fails even with your own additional retry logic, consider adjusting your prompts, providing more context, or choosing a different model rather than increasing retry attempts.
Once you get successful results, consider caching them. After retries produce a good response, you can cache it for future calls with the same inputs. This combines resilience with performance. See Caching for strategies.

Error Logging

Proper logging of agentic operations helps you monitor reliability, debug failures, and identify patterns in errors. Log enough context to diagnose issues, but be mindful of sensitive data.

Basic Structured Logging

Log agent failures with structured data that includes the operation, error type, and relevant context. This example shows agent-driven test generation with comprehensive logging:
import logging
from agentica import agentic

logger = logging.getLogger(__name__)

@agentic()
async def generate_tests(source_code: str, framework: str) -> str:
    """
    Generate comprehensive unit tests for the given source code.
    Use the specified testing framework.
    """
    ...

async def create_test_suite(source_code: str, framework: str, file_path: str) -> str:
    try:
        tests = await generate_tests(source_code, framework)
        logger.info("Test suite generated successfully", extra={
            "operation": "generate_tests",
            "framework": framework,
            "file_path": file_path,
            "source_lines": source_code.count('\n'),
            "test_lines": tests.count('\n')
        })
        return tests
    except Exception as e:
        logger.error("Test generation failed", extra={
            "operation": "generate_tests",
            "error_type": type(e).__name__,
            "error_message": str(e),
            "framework": framework,
            "file_path": file_path,
            "source_lines": source_code.count('\n')
        })
        raise

Sensitive Data Handling

Never log sensitive user data. Redact or omit PII while preserving enough context for debugging:
import hashlib
import logging

from agentica import agentic

logger = logging.getLogger(__name__)

def hash_for_logging(value: str) -> str:
    """Create a consistent hash for correlation without exposing data."""
    return hashlib.sha256(value.encode()).hexdigest()[:8]

@agentic()
async def analyze_support_ticket(ticket_text: str, customer_email: str) -> dict:
    """Analyze customer support ticket."""
    ...

async def process_ticket(ticket_text: str, customer_email: str) -> dict:
    try:
        result = await analyze_support_ticket(ticket_text, customer_email)
        logger.info("Ticket analyzed", extra={
            "operation": "analyze_support_ticket",
            "customer_id": hash_for_logging(customer_email),
            "ticket_length": len(ticket_text)
        })
        return result
    except Exception as e:
        logger.error("Ticket analysis failed", extra={
            "operation": "analyze_support_ticket",
            "error": str(e),
            "customer_id": hash_for_logging(customer_email),
            "ticket_length": len(ticket_text)
            # Do NOT log customer_email or ticket_text
        })
        raise

Tracking Degraded Performance

When using fallback strategies, log which tier succeeded. Frequent fallbacks suggest reviewing your approach — consider trying a different model better suited to the task, refining your prompts, or providing additional context. This example shows agentic code review with quality tracking:
import logging
from dataclasses import dataclass

from agentica import agentic

logger = logging.getLogger(__name__)

@dataclass
class ReviewResult:
    issues: list[str]
    severity: str
    suggestions: list[str]

@agentic()
async def deep_code_review(code: str, context: dict) -> ReviewResult:
    """
    Perform comprehensive code review including:
    - Security vulnerabilities
    - Performance issues
    - Best practices violations
    - Design pattern suggestions
    Analyze with full codebase context.
    """
    ...

@agentic()
async def basic_code_review(code: str) -> ReviewResult:
    """
    Perform basic code review:
    - Syntax issues
    - Common anti-patterns
    - Simple style violations
    No codebase context required.
    """
    ...

async def review_code_with_monitoring(code: str, context: dict | None = None) -> ReviewResult:
    # Try comprehensive review with context
    if context:
        try:
            result = await deep_code_review(code, context)
            logger.info("Code review completed", extra={
                "method": "deep_review",
                "issues_found": len(result.issues),
                "severity": result.severity
            })
            return result
        except Exception as e:
            logger.warning("Deep review failed, falling back to basic", extra={
                "error": str(e),
                "code_length": len(code)
            })

    # Fallback to basic review
    try:
        result = await basic_code_review(code)
        logger.warning("Code review completed with basic analysis only", extra={
            "method": "basic_review",
            "issues_found": len(result.issues),
            "severity": result.severity
        })
        return result
    except Exception as e:
        logger.error("All review methods failed", extra={
            "error": str(e),
            "code_length": len(code)
        })
        raise
Use metrics from these logs to track fallback rates and identify patterns. A high fallback rate is a useful diagnostic: it usually means the model choice, prompts, or context for the primary path need attention.

Caching

Caching at the inference level is managed internally and is specific to the model provider.
Results of previous agentic calls can also be cached client-side. Note that functools.cache does not work directly on coroutine functions: it caches the coroutine object itself, which can only be awaited once. Cache the awaited result instead.
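For example, a minimal client-side cache around the summarize function from the token-limits example above (the wrapper and dictionary are our own; pick a cache key that matches your inputs):
_summary_cache: dict[str, str] = {}

async def cached_summarize(text: str) -> str:
    # Await the agentic call once per distinct input, then reuse the result.
    # Caching the awaited result avoids re-awaiting a spent coroutine.
    if text not in _summary_cache:
        _summary_cache[text] = await summarize(text)
    return _summary_cache[text]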

Rate limiting

When a provider imposes rate limits, the platform applies exponential backoff with sensible defaults, blocking until the request can be retried. On each retry the current delay is multiplied by a factor of exponential_base * (1 + jitter * random_float), where 0.0 <= random_float < 1.0, until max_retries is reached.
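To make that schedule concrete, here is an illustrative sketch (exponential_base, jitter, and max_retries come from the description above; initial_delay and all default values are assumptions, not the platform's actual settings):
import random

def backoff_delays(
    initial_delay: float = 1.0,
    exponential_base: float = 2.0,
    jitter: float = 0.25,
    max_retries: int = 5,
) -> list[float]:
    """Generate an example sequence of backoff delays, in seconds."""
    delays = []
    delay = initial_delay
    for _ in range(max_retries):
        # Each retry scales the delay by exponential_base * (1 + jitter * random_float).
        delay *= exponential_base * (1 + jitter * random.random())
        delays.append(delay)
    return delays

# The schedule grows roughly geometrically, with randomized jitter on each step.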

Next Steps