
Overview

AI operations can fail due to factors outside your code—infrastructure issues, network problems, or service limitations. These operational errors originate from the platform or external services, not from your application logic or the agent’s decisions. Common operational errors include:
  • Inference errors: When the AI model fails to respond
  • Network/API errors: Connection failures, rate limits
  • Sandbox errors: Communication issues between the model and its execution environment
  • Internal server errors: Platform infrastructure problems
Operational errors are different from agent errors, which are exceptions intentionally raised by the agent based on your business logic.

When Errors Reach You

The platform handles most operational errors automatically. Agentica employs retry strategies with exponential backoff and jitter for transient failures like inference rate limits, temporary network issues, and service unavailability. You only see errors that are unrecoverable at the platform level. When an error bubbles up to your code, it means the platform has exhausted its retry attempts and the operation cannot be completed without intervention. This design keeps your error handling focused on genuinely exceptional situations. For errors that do reach your code, you should handle them as you would for any other async network operation—with try/catch blocks, additional retry logic if appropriate, and fallback strategies.
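For example, a minimal sketch of that pattern (the extract_invoice_total magic function and the zero fallback are hypothetical placeholders for your own operation and default) might look like this:
from agentica import magic
from agentica.errors import AgenticaError

@magic()
def extract_invoice_total(invoice_text: str) -> float:
    """Extract the total amount due from an invoice."""
    ...

def get_invoice_total(invoice_text: str) -> float:
    try:
        return extract_invoice_total(invoice_text)
    except AgenticaError:
        # The platform has already retried; fall back to a safe default
        # (or re-raise, queue for later, etc.) rather than retrying blindly.
        return 0.0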

Error Types

Agentica exports a comprehensive set of error classes to help you handle different failure scenarios. All errors inherit from AgenticaError, making it easy to catch all SDK-related errors.
from agentica.errors import AgenticaError

Error Hierarchy

All operational errors inherit from AgenticaError, which the SDK exports under agentica.errors; any error not intentionally raised by the agent can be caught as an instance of this base class. The hierarchy organizes errors by their source: connection issues, agent invocation problems, or errors from the inference service.
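Because the hierarchy is plain Python inheritance, catching a class higher in the tree also catches its subclasses. The relationships shown below follow the parent classes listed in the sections that follow:
from agentica.errors import AgenticaError, InferenceError, RateLimitError

# An `except InferenceError:` handler also catches RateLimitError,
# and `except AgenticaError:` catches every SDK error.
assert issubclass(RateLimitError, InferenceError)
assert issubclass(InferenceError, AgenticaError)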

Base Exceptions

  • AgenticaError: Base exception for all Agentica errors. Catch this to handle any error not purposefully raised by the agent.
  • ServerError: Base class for errors during remote operations. Parent of GenerationError.
  • GenerationError: Base class for errors during agent generation. Parent of inference-related errors.
  • InferenceError: Base class for HTTP errors from the inference service. Parent of most API errors below.

Connection Errors

  • ConnectionError (parent: AgenticaError): General connection failure
  • WebSocketConnectionError (parent: ConnectionError): WebSocket connection failed or was interrupted
  • WebSocketTimeoutError (parent: ConnectionError): WebSocket connection timed out
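For example, a hedged sketch of reconnecting after a failed WebSocket connection (assuming spawn raises these errors when the connection cannot be established) might look like this:
import asyncio

from agentica import spawn
from agentica.errors import WebSocketConnectionError, WebSocketTimeoutError

async def spawn_with_retry(premise: str, attempts: int = 3):
    for attempt in range(attempts):
        try:
            return await spawn(premise=premise)
        except (WebSocketConnectionError, WebSocketTimeoutError):
            if attempt == attempts - 1:
                raise
            # Brief exponential backoff before reconnecting
            await asyncio.sleep(2 ** attempt)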

Invocation Errors

  • InvocationError (parent: AgenticaError): General error during agent invocation
  • TooManyInvocationsError (parent: InvocationError): Exceeded maximum number of invocations
  • NotRunningError (parent: InvocationError): Attempted to use an agent that is not running

Inference Service Errors

These errors originate from the inference service and are forwarded through the SDK:
  • MaxTokensError (parent: GenerationError): Response exceeded maximum token limit
  • ContentFilteringError (parent: GenerationError): Content was filtered by safety systems
  • APIConnectionError (parent: InferenceError): Failed to connect to the inference API
  • APITimeoutError (parent: InferenceError): Inference API request timed out
  • RateLimitError (parent: InferenceError): Rate limit exceeded, slow down requests
  • BadRequestError (parent: InferenceError): Request was malformed or invalid (400)
  • UnauthorizedError (parent: InferenceError): Authentication failed or missing (401)
  • PermissionDeniedError (parent: InferenceError): Insufficient permissions (403)
  • NotFoundError (parent: InferenceError): Requested resource not found (404)
  • ConflictError (parent: InferenceError): Request conflicts with current state (409)
  • UnprocessableEntityError (parent: InferenceError): Request understood but cannot be processed (422)
  • RequestTooLargeError (parent: InferenceError): Request payload too large for inference service
  • InternalServerError (parent: InferenceError): Internal server error (500)
  • ServiceUnavailableError (parent: InferenceError): Service temporarily unavailable (503)
  • OverloadedError (parent: InferenceError): Inference service is overloaded, try again later
  • DeadlineExceededError (parent: InferenceError): Operation exceeded its deadline
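One way to use this grouping (a sketch; the classification helper below is hypothetical) is to treat the 4xx-style errors as problems with the request itself and the 5xx-style errors as candidates for a delayed retry or fallback:
from agentica.errors import (
    BadRequestError,
    UnauthorizedError,
    PermissionDeniedError,
    InternalServerError,
    ServiceUnavailableError,
    OverloadedError,
)

# Client-side errors (4xx): fix the request instead of retrying it.
NON_RETRYABLE = (BadRequestError, UnauthorizedError, PermissionDeniedError)

# Service-side errors (5xx): a delayed retry or fallback may still succeed.
RETRYABLE = (InternalServerError, ServiceUnavailableError, OverloadedError)

def classify(error: Exception) -> str:
    if isinstance(error, NON_RETRYABLE):
        return "fix-request"
    if isinstance(error, RETRYABLE):
        return "retry-later"
    return "unknown"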

Controlling Token Limits

The MaxTokensError occurs when a response exceeds the maximum token limit. You can proactively control this by setting max_tokens when creating agents or magic functions. Note that this limit applies per inference request—a single agent or magic function invocation may make multiple inference requests, and each request is subject to this limit independently.
from agentica import spawn, magic

# For agents
agent = await spawn(
    premise="You are a concise assistant.",
    max_tokens=500  # Limit each inference request to 500 tokens
)

# For magic functions
@magic(max_tokens=1000)
async def summarize(text: str) -> str:
    """Create a brief summary."""
    ...
Use cases for token limits:
  • Cost control: Limit token usage per inference request to manage costs
  • Response length: Ensure individual inference outputs meet length requirements
  • Error prevention: Avoid unexpectedly long responses in a single inference request
When you set max_tokens, each inference request will stop generating when it reaches that limit. If the natural response would exceed this limit, a MaxTokensError will be raised, which you can catch and handle appropriately (see examples below).
Setting appropriate token limits upfront is more efficient than handling MaxTokensError after the fact, as it prevents wasted compute on overly long responses.

Catching Specific Errors

You can catch specific error types to implement different handling strategies. The error hierarchy lets you handle errors at different levels of granularity:
import logging
import time

from agentica import magic
from agentica.errors import (
    AgenticaError,
    RateLimitError,
    InferenceError,
    ConnectionError,
    MaxTokensError,
)

logger = logging.getLogger(__name__)

@magic()
def analyze_text(text: str) -> dict:
    """Analyze the sentiment and extract key topics."""
    ...

try:
    result = analyze_text(document)
except RateLimitError as e:
    # Handle specific inference error
    logger.warning(f"Rate limited: {e}")
    time.sleep(60)
    result = analyze_text(document)
except MaxTokensError as e:
    # Handle token limit from inference service
    logger.warning(f"Response too long: {e}")
    result = analyze_text(document[:len(document)//2])  # Try with less input
except InferenceError as e:
    # Catch all other inference service errors (API errors, timeouts, etc.)
    logger.error(f"Inference service error: {e}")
    result = {"sentiment": "neutral", "topics": []}
except ConnectionError as e:
    # Handle connection failures (retry_with_backoff is your own helper;
    # see Retry Strategies below)
    logger.error(f"Connection failed: {e}")
    result = retry_with_backoff(lambda: analyze_text(document))
except AgenticaError as e:
    # Catch any other SDK errors
    logger.error(f"Unexpected Agentica error: {e}")
    raise

Retry Strategies

The platform already retries most failures automatically using exponential backoff and jitter. When an error reaches your code, it means the platform’s built-in retry logic has been exhausted. You typically don’t need additional retry logic, but you may add it for application-specific requirements.
If you need retry behavior beyond what the platform provides (for example, application-specific error handling or longer retry windows), you can implement your own. A simple retry helper might look like this:
from typing import Any, Callable

def retry(operation: Callable[[], Any], retries: int = 3) -> Any:
    error: Exception | None = None
    for _ in range(retries):
        try:
            return operation()
        except Exception as e:
            error = e
    raise error
which may be used to wrap any magic function or agent call:
result = retry(lambda: extract_date(document), retries=3)
Since magic functions are ordinary functions, you can also use widely available libraries such as tenacity to handle retry logic.
from agentica import magic
from tenacity import retry, stop_after_attempt

@retry(stop=stop_after_attempt(3))
@magic()
def process_data(input_data: str) -> dict:
    """Process and analyze the input data."""
    ...
Remember that the platform has already retried transient failures before an error reaches your code. If an operation consistently fails even with your own additional retry logic, consider adjusting your prompts, providing more context, or choosing a different model rather than increasing retry attempts.
Once you get successful results, consider caching them. After retries produce a good response, you can cache it for future calls with the same inputs. This combines resilience with performance. See Caching for strategies.
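As a minimal sketch, assuming string inputs (which are hashable) and reusing the hypothetical extract_date magic function from the retry example above, an in-process functools.lru_cache can serve repeated calls with the same input from the cache:
from functools import lru_cache

from agentica import magic

# The first call for a given document hits the model; later calls with an
# identical document string are answered from the in-process cache.
@lru_cache(maxsize=128)
@magic()
def extract_date(document: str) -> str:
    """Extract the primary date mentioned in the document."""
    ...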

Error Logging

Proper logging of AI operations helps you monitor reliability, debug failures, and identify patterns in errors. Log enough context to diagnose issues, but be mindful of sensitive data.

Basic Structured Logging

Log AI failures with structured data that includes the operation, error type, and relevant context. This example shows AI-driven test generation with comprehensive logging:
import logging
from agentica import magic

logger = logging.getLogger(__name__)

@magic()
def generate_tests(source_code: str, framework: str) -> str:
    """
    Generate comprehensive unit tests for the given source code.
    Use the specified testing framework.
    """
    ...

def create_test_suite(source_code: str, framework: str, file_path: str) -> str:
    try:
        tests = generate_tests(source_code, framework)
        logger.info("Test suite generated successfully", extra={
            "operation": "generate_tests",
            "framework": framework,
            "file_path": file_path,
            "source_lines": source_code.count('\n'),
            "test_lines": tests.count('\n')
        })
        return tests
    except Exception as e:
        logger.error("Test generation failed", extra={
            "operation": "generate_tests",
            "error_type": type(e).__name__,
            "error_message": str(e),
            "framework": framework,
            "file_path": file_path,
            "source_lines": source_code.count('\n')
        })
        raise

Sensitive Data Handling

Never log sensitive user data. Redact or omit PII while preserving enough context for debugging:
import hashlib
import logging

from agentica import magic

logger = logging.getLogger(__name__)

def hash_for_logging(value: str) -> str:
    """Create a consistent hash for correlation without exposing data."""
    return hashlib.sha256(value.encode()).hexdigest()[:8]

@magic()
def analyze_support_ticket(ticket_text: str, customer_email: str) -> dict:
    """Analyze customer support ticket."""
    ...

def process_ticket(ticket_text: str, customer_email: str) -> dict:
    try:
        result = analyze_support_ticket(ticket_text, customer_email)
        logger.info("Ticket analyzed", extra={
            "operation": "analyze_support_ticket",
            "customer_id": hash_for_logging(customer_email),
            "ticket_length": len(ticket_text)
        })
        return result
    except Exception as e:
        logger.error("Ticket analysis failed", extra={
            "operation": "analyze_support_ticket",
            "error": str(e),
            "customer_id": hash_for_logging(customer_email),
            "ticket_length": len(ticket_text)
            # Do NOT log customer_email or ticket_text
        })
        raise

Tracking Degraded Performance

When using fallback strategies, log which tier succeeded. Frequent fallbacks suggest reviewing your approach—consider trying a different model better suited to the task, refining your prompts, or providing additional context. This example shows AI-powered code review with quality tracking:
import logging
from dataclasses import dataclass

from agentica import magic

logger = logging.getLogger(__name__)

@dataclass
class ReviewResult:
    issues: list[str]
    severity: str
    suggestions: list[str]

@magic()
def deep_code_review(code: str, context: dict) -> ReviewResult:
    """
    Perform comprehensive code review including:
    - Security vulnerabilities
    - Performance issues
    - Best practices violations
    - Design pattern suggestions
    Analyze with full codebase context.
    """
    ...

@magic()
def basic_code_review(code: str) -> ReviewResult:
    """
    Perform basic code review:
    - Syntax issues
    - Common anti-patterns
    - Simple style violations
    No codebase context required.
    """
    ...

def review_code_with_monitoring(code: str, context: dict | None = None) -> ReviewResult:
    # Try comprehensive review with context
    if context:
        try:
            result = deep_code_review(code, context)
            logger.info("Code review completed", extra={
                "method": "deep_review",
                "issues_found": len(result.issues),
                "severity": result.severity
            })
            return result
        except Exception as e:
            logger.warning("Deep review failed, falling back to basic", extra={
                "error": str(e),
                "code_length": len(code)
            })

    # Fallback to basic review
    try:
        result = basic_code_review(code)
        logger.warning("Code review completed with basic analysis only", extra={
            "method": "basic_review",
            "issues_found": len(result.issues),
            "severity": result.severity
        })
        return result
    except Exception as e:
        logger.error("All review methods failed", extra={
            "error": str(e),
            "code_length": len(code)
        })
        raise
Use metrics from these logs to track fallback rates and identify patterns. A high fallback rate is a useful diagnostic: it points to opportunities to choose a model better suited to your task, refine your prompts, or provide additional context.
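For example (a sketch using an in-process collections.Counter; in production you would more likely emit these counts to your metrics system), you can record which tier succeeded and derive a fallback rate:
from collections import Counter

review_outcomes: Counter[str] = Counter()

def record_review_method(method: str) -> None:
    """Record "deep_review", "basic_review", or "failed" after each review."""
    review_outcomes[method] += 1

def fallback_rate() -> float:
    """Fraction of successful reviews that needed the basic fallback."""
    completed = review_outcomes["deep_review"] + review_outcomes["basic_review"]
    return review_outcomes["basic_review"] / completed if completed else 0.0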

Next Steps