# Operational Errors

> Handle platform-level failures in agent operations

export const TypeScript = props => {
  const getLang = () => {
    let lang = null;
    try {
      const raw = window.localStorage.getItem('code');
      lang = raw;
      try {
        lang = JSON.parse(raw);
      } catch (_err) {}
    } catch (_err) {}
    return (lang || '').toLowerCase();
  };
  const lang = getLang();
  return <span class="when-lang when-typescript" style={{
    display: lang.includes('typescript') ? 'inline' : 'none'
  }}>{props.children}</span>;
};

export const Python = props => {
  let lang = null;
  try {
    const raw = window.localStorage.getItem('code');
    lang = raw;
    try {
      lang = JSON.parse(raw);
    } catch (_err) {}
  } catch (_err) {}
  lang = (lang || 'Python').toLowerCase();
  if (!['python', 'typescript'].some(l => lang.includes(l))) {
    lang = 'python';
  }
  return <span class="when-lang when-python" style={{
    display: lang.includes('python') ? 'inline' : 'none'
  }}>{props.children}</span>;
};

## Overview

Agentic operations can fail due to factors outside your code -- infrastructure issues, network problems, or service limitations. These **operational errors** originate from the platform or external services, not from your application logic or the agent's decisions.

Common operational errors include:

* **Inference errors**: When the underlying model fails to respond
* **Network/API errors**: Connection failures, rate limits
* **Sandbox errors**: Communication issues between the model and its execution environment
* **Internal server errors**: Platform infrastructure problems

<Tip>
  Operational errors are different from [agent errors](/guides/agent-errors), which are exceptions intentionally raised by the agent based on your business logic.
</Tip>

### When Errors Reach You

**The platform handles most operational errors automatically.** The Agentica SDK employs retry strategies with exponential backoff and jitter for transient failures like inference rate limits, temporary network issues, and service unavailability.

**You only see errors that are unrecoverable** at the platform level. When an error bubbles up to your code, it means the platform has exhausted its retry attempts and the operation cannot be completed without intervention. This design keeps your error handling focused on genuinely exceptional situations.

For errors that do reach your code, you should handle them as you would for any other async network operation -- with try/catch blocks, additional retry logic if appropriate, and fallback strategies.

Some of these errors may represent limitations or bugs in the Agentica SDK itself. See [Reporting Bugs](/guides/reporting-bugs) for how to report them to us.

## Error Types

The Agentica SDK exports a comprehensive set of error classes to help you handle different failure scenarios. All errors inherit from `AgenticaError`, making it easy to catch all SDK-related errors.

<CodeGroup>
  ```python Python theme={null}
  from agentica.errors import AgenticaError
  ```

  ```typescript TypeScript theme={null}
  import { AgenticaError } from '@symbolica/agentica/errors';
  ```
</CodeGroup>

### Error Hierarchy

All operational errors inherit from `AgenticaError`, so any error that was not intentionally raised by the agent can be caught through this base class. It is exported by the SDK under <Python>`agentica.errors`</Python><TypeScript>`@symbolica/agentica/errors`</TypeScript>.

<Accordion title="Complete Error Reference">
  All errors inherit from `AgenticaError`. The hierarchy organizes errors by their source: connection issues, agent invocation problems, or errors from the inference service.

  ### Base Exceptions

  | Error             | Description                                                                                                      |
  | ----------------- | ---------------------------------------------------------------------------------------------------------------- |
  | `AgenticaError`   | Base exception for all Agentica SDK errors. Catch this to handle any error not purposefully raised by the agent. |
  | `ServerError`     | Base class for errors during remote operations. Parent of `GenerationError`.                                     |
  | `GenerationError` | Base class for errors during agent generation. Parent of inference-related errors.                               |
  | `InferenceError`  | Base class for HTTP errors from the inference service. Parent of most API errors below.                          |

  ### Connection Errors

  | Error                      | Parent            | Description                                    |
  | -------------------------- | ----------------- | ---------------------------------------------- |
  | `ConnectionError`          | `AgenticaError`   | General connection failure                     |
  | `WebSocketConnectionError` | `ConnectionError` | WebSocket connection failed or was interrupted |
  | `WebSocketTimeoutError`    | `ConnectionError` | WebSocket connection timed out                 |

  ### Invocation Errors

  | Error                     | Parent            | Description                                   |
  | ------------------------- | ----------------- | --------------------------------------------- |
  | `InvocationError`         | `AgenticaError`   | General error during agent invocation         |
  | `TooManyInvocationsError` | `InvocationError` | Exceeded maximum number of invocations        |
  | `NotRunningError`         | `InvocationError` | Attempted to use an agent that is not running |

  ### Inference Service Errors

  These errors originate from the inference service and are forwarded through the SDK:

  | Error                      | Parent            | Description                                      |
  | -------------------------- | ----------------- | ------------------------------------------------ |
  | `MaxTokensError`           | `GenerationError` | Response exceeded maximum token limit            |
  | `ContentFilteringError`    | `GenerationError` | Content was filtered by safety systems           |
  | `APIConnectionError`       | `InferenceError`  | Failed to connect to the inference API           |
  | `APITimeoutError`          | `InferenceError`  | Inference API request timed out                  |
  | `RateLimitError`           | `InferenceError`  | Rate limit exceeded, slow down requests          |
  | `BadRequestError`          | `InferenceError`  | Request was malformed or invalid (400)           |
  | `UnauthorizedError`        | `InferenceError`  | Authentication failed or missing (401)           |
  | `PermissionDeniedError`    | `InferenceError`  | Insufficient permissions (403)                   |
  | `NotFoundError`            | `InferenceError`  | Requested resource not found (404)               |
  | `ConflictError`            | `InferenceError`  | Request conflicts with current state (409)       |
  | `UnprocessableEntityError` | `InferenceError`  | Request understood but cannot be processed (422) |
  | `RequestTooLargeError`     | `InferenceError`  | Request payload too large for inference service  |
  | `InternalServerError`      | `InferenceError`  | Internal server error (500)                      |
  | `ServiceUnavailableError`  | `InferenceError`  | Service temporarily unavailable (503)            |
  | `OverloadedError`          | `InferenceError`  | Inference service is overloaded, try again later |
  | `DeadlineExceededError`    | `InferenceError`  | Operation exceeded its deadline                  |
</Accordion>
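
Because handlers are matched in order, the hierarchy determines how `except` clauses should be arranged: subclasses first, base classes last. The sketch below illustrates this with local stand-in classes that mirror a few rows of the tables above. They are not the SDK's classes (import the real ones from `agentica.errors`), and the link from `InferenceError` to `GenerationError` is an assumption that the tables imply but do not state outright.

```python
# Local stand-ins mirroring part of the documented hierarchy (NOT the SDK classes).
class AgenticaError(Exception): ...
class ServerError(AgenticaError): ...
class GenerationError(ServerError): ...
class InferenceError(GenerationError): ...  # Assumed parent; see the note above.
class RateLimitError(InferenceError): ...
class MaxTokensError(GenerationError): ...


def classify(error: Exception) -> str:
    """Return the label of the first (most specific) handler that matches."""
    try:
        raise error
    except RateLimitError:
        return "rate_limit"
    except InferenceError:
        return "inference"
    except GenerationError:
        return "generation"
    except AgenticaError:
        return "agentica"
    except Exception:
        return "other"


print(classify(RateLimitError()))  # prints "rate_limit"
print(classify(MaxTokensError()))  # prints "generation"
print(classify(ValueError()))      # prints "other"
```

If the `InferenceError` handler were listed before `RateLimitError`, the rate-limit branch would never run, since the broader handler would match first.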

## Controlling Token Limits

The `MaxTokensError` occurs when a response exceeds the maximum token limit. You can proactively control this by setting <Python>`max_tokens`</Python><TypeScript>`maxTokens`</TypeScript> when creating agents or agentic functions. When an integer is supplied, this is the maximum number of output tokens for an invocation (across all rounds of inference). For more fine-grained control, use a `MaxTokens` object.

<CodeGroup>
  ```python Python theme={null}
  from agentica import spawn, agentic, MaxTokens

  # For agents
  agent = await spawn(
      premise="You are a concise assistant.",
      max_tokens=500  # Limit each invocation to total 500 output tokens
  )

  # For agentic functions
  @agentic(max_tokens=1000)
  async def summarize(text: str) -> str:
      """Create a brief summary."""
      ...

  # For finer control, use MaxTokens
  agent = await spawn(
      premise="You are a concise assistant.",
      max_tokens=MaxTokens(per_invocation=5000, per_round=1000, rounds=5)
  )
  ```

  ```typescript TypeScript theme={null}
  import { agentic, MaxTokens, spawn } from '@symbolica/agentica';

  // For agents
  const agent = await spawn({
      premise: "You are a concise assistant.",
      maxTokens: 500  // Limit each invocation to total 500 output tokens
  });

  // For agentic functions
  async function summarize(text: string): Promise<string> {
      return agentic<string>("Create a brief summary", { text }, { maxTokens: 1000 });
  }

  // For finer control, use MaxTokens
  const agent2 = await spawn({
      premise: "You are a concise assistant.",
      maxTokens: MaxTokens.from({ perInvocation: 5000, perRound: 1000, rounds: 5 })
  });
  ```
</CodeGroup>

For more details on `MaxTokens`, see the <Python>[Python API reference](/references/python/usage#maxtokens)</Python><TypeScript>[TypeScript API reference](/references/ts/usage#maxtokens)</TypeScript>.

**Use cases for token limits:**

* **Cost control**: Limit token usage per inference request to manage costs
* **Response length**: Ensure individual inference outputs meet length requirements
* **Error prevention**: Avoid unexpectedly long responses in a single inference request

When you set <Python>`max_tokens`</Python><TypeScript>`maxTokens`</TypeScript>, generation stops once the configured limit is reached. If the natural response would exceed the limit, a `MaxTokensError` is raised, which you can catch and handle appropriately (see [Catching Specific Errors](#catching-specific-errors) below).

<Tip>
  Setting appropriate token limits upfront is more efficient than handling `MaxTokensError` after the fact, as it prevents wasted compute on overly long responses.
</Tip>

## Catching Specific Errors

You can catch specific error types to implement different handling strategies. The error hierarchy lets you handle errors at different levels of granularity:

<CodeGroup>
  ```python Python expandable theme={null}
  import asyncio
  import logging

  from agentica import agentic
  from agentica.errors import (
      AgenticaError,
      RateLimitError,
      InferenceError,
      ConnectionError,  # Shadows Python's built-in ConnectionError in this module
      MaxTokensError,
  )

  logger = logging.getLogger(__name__)

  @agentic()
  async def analyze_text(text: str) -> dict:
      """Analyze the sentiment and extract key topics."""
      ...

  try:
      result = await analyze_text(document)
  except RateLimitError as e:
      # Handle specific inference error
      logger.warning(f"Rate limited: {e}")
      await asyncio.sleep(60)  # Non-blocking; time.sleep would stall the event loop
      result = await analyze_text(document)
  except MaxTokensError as e:
      # Handle token limit from inference service
      logger.warning(f"Response too long: {e}")
      result = await analyze_text(document[:len(document)//2])  # Try with less input
  except InferenceError as e:
      # Catch all other inference service errors (API errors, timeouts, etc.)
      logger.error(f"Inference service error: {e}")
      result = {"sentiment": "neutral", "topics": []}
  except ConnectionError as e:
      # Handle connection failures
      logger.error(f"Connection failed: {e}")
      # retry_with_backoff is a placeholder for your own async retry helper
      result = await retry_with_backoff(lambda: analyze_text(document))
  except AgenticaError as e:
      # Catch any other SDK errors
      logger.error(f"Unexpected Agentica error: {e}")
      raise
  ```

  ```typescript TypeScript expandable theme={null}
  import { agentic } from '@symbolica/agentica';
  import {
    AgenticaError,
    RateLimitError,
    InferenceError,
    ConnectionError,
    MaxTokensError,
  } from '@symbolica/agentica/errors';

  interface AnalysisResult {
    sentiment: string;
    topics: string[];
  }

  async function analyzeText(text: string): Promise<AnalysisResult> {
    return agentic<AnalysisResult>("Analyze the sentiment and extract key topics.", { text });
  }

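  // `logger`, `sleep`, and `retryWithBackoff` are placeholders for your own helpers.
  // Declare `result` outside the try block so the catch branches can assign it.
  let result: AnalysisResult;
  try {
    result = await analyzeText(document);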
  } catch (e) {
    if (e instanceof RateLimitError) {
      // Handle specific inference error
      logger.warn(`Rate limited: ${e}`);
      await sleep(60000);
      result = await analyzeText(document);
    } else if (e instanceof MaxTokensError) {
      // Handle token limit from inference service
      logger.warn(`Response too long: ${e}`);
      result = await analyzeText(document.slice(0, document.length / 2));
    } else if (e instanceof InferenceError) {
      // Catch all other inference service errors (API errors, timeouts, etc.)
      logger.error(`Inference service error: ${e}`);
      result = { sentiment: "neutral", topics: [] };
    } else if (e instanceof ConnectionError) {
      // Handle connection failures
      logger.error(`Connection failed: ${e}`);
      result = await retryWithBackoff(() => analyzeText(document));
    } else if (e instanceof AgenticaError) {
      // Catch any other SDK errors
      logger.error(`Unexpected Agentica error: ${e}`);
      throw e;
    } else {
      throw e;
    }
  }
  ```
</CodeGroup>

## Retry Strategies

<Note>
  **The platform already retries most failures automatically** using exponential backoff and jitter. When an error reaches your code, it means the platform's built-in retry logic has been exhausted. You typically don't need additional retry logic, but you may add it for application-specific requirements.
</Note>

If you need additional retry logic beyond what the platform provides (for example, for application-specific error handling or longer retry windows), you can implement your own retry strategy.

You could implement a simple retry strategy like this:

<CodeGroup>
  ```python Python theme={null}
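  from typing import Awaitable, Callable, TypeVar

  T = TypeVar("T")

  async def retry(operation: Callable[[], Awaitable[T]], retries: int = 3) -> T:
      """Await operation() up to `retries` times, re-raising the last error."""
      error: Exception | None = None
      for _ in range(retries):
          try:
              # Await inside the try block so failures are actually caught
              return await operation()
          except Exception as e:
              error = e
      if error is None:
          raise ValueError("retries must be at least 1")
      raise error
  ```

  ```typescript TypeScript theme={null}
  async function retry<T>(operation: () => Promise<T>, retries: number = 3): Promise<T> {
    let error: Error | undefined;
    for (let i = 0; i < retries; i++) {
      try {
        return await operation();
      } catch (e) {
        error = e as Error;
      }
    }
    throw error;
  }
  ```
</CodeGroup>

This helper can wrap any agentic function or agent call:

<CodeGroup>
  ```python Python theme={null}
  result = await retry(lambda: extract_date(document), retries=3)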
  ```

  ```typescript TypeScript theme={null}
  const result = await retry(() => extractDate(document), 3);
  ```
</CodeGroup>

Since agentic functions are just functions already, you may also use widely available libraries such as <Python>[tenacity](https://github.com/jd/tenacity)</Python><TypeScript>[ts-retry-promise](https://github.com/normartin/ts-retry-promise)</TypeScript> to handle logic for retries.

<CodeGroup>
  ```python Python theme={null}
  from agentica import agentic
  from tenacity import retry, stop_after_attempt

  @retry(stop=stop_after_attempt(3))
  @agentic()
  async def process_data(input_data: str) -> dict:
      """Process and analyze the input data."""
      ...
  ```

  ```typescript TypeScript theme={null}
  import { agentic } from '@symbolica/agentica';
  import { retry } from 'ts-retry-promise';

  interface ProcessedData {
    result: string;
    confidence: number;
  }

  async function processData(input: string): Promise<ProcessedData> {
    return agentic<ProcessedData>("Process and analyze the input data", { input });
  }

  // Wrap with retry for transient failures
  const reliableProcessing = (input: string): Promise<ProcessedData> =>
    retry(() => processData(input), {
      retries: 3
    });
  ```
</CodeGroup>
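
For the "longer retry windows" case, a hand-rolled helper can also wait between attempts and retry only on selected error types. The following is a minimal sketch, not part of the SDK: `retry_with_backoff` and all of its parameters are hypothetical names, and the delay schedule (doubling with full jitter, capped at `max_delay`) is one common choice rather than the platform's own policy.

```python
import asyncio
import random
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    *,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
) -> T:
    """Await operation(), retrying only on the given exception types."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        try:
            return await operation()
        except retry_on:
            if attempt == retries - 1:
                raise  # Out of attempts: surface the last error unchanged.
            # Exponential backoff with full jitter, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            await asyncio.sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```

A call might look like `await retry_with_backoff(lambda: process_data(text), retry_on=(ConnectionError,))`. Errors outside `retry_on` propagate immediately, which keeps non-transient failures such as a `BadRequestError` from being retried pointlessly.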

<Note>
  Remember that the platform has already retried transient failures before an error reaches your code. If an operation consistently fails even with your own additional retry logic, consider adjusting your prompts, providing more context, or choosing a different model rather than increasing retry attempts.
</Note>

<Tip>
  **Once you get successful results, consider caching them.** After retries produce a good response, you can cache it for future calls with the same inputs. This combines resilience with performance. See [Caching](/guides/best-practices#caching) for strategies.
</Tip>

## Error Logging

Proper logging of agentic operations helps you monitor reliability, debug failures, and identify patterns in errors. Log enough context to diagnose issues, but be mindful of sensitive data.

### Basic Structured Logging

Log agent failures with structured data that includes the operation, error type, and relevant context. This example shows agent-driven test generation with comprehensive logging:

<CodeGroup>
  ```python Python theme={null}
  import logging
  from agentica import agentic

  logger = logging.getLogger(__name__)

  @agentic()
  async def generate_tests(source_code: str, framework: str) -> str:
      """
      Generate comprehensive unit tests for the given source code.
      Use the specified testing framework.
      """
      ...

  async def create_test_suite(source_code: str, framework: str, file_path: str) -> str:
      try:
          tests = await generate_tests(source_code, framework)
          logger.info("Test suite generated successfully", extra={
              "operation": "generate_tests",
              "framework": framework,
              "file_path": file_path,
              "source_lines": source_code.count('\n'),
              "test_lines": tests.count('\n')
          })
          return tests
      except Exception as e:
          logger.error("Test generation failed", extra={
              "operation": "generate_tests",
              "error_type": type(e).__name__,
              "error_message": str(e),
              "framework": framework,
              "file_path": file_path,
              "source_lines": source_code.count('\n')
          })
          raise
  ```

  ```typescript TypeScript theme={null}
  import { agentic } from '@symbolica/agentica';
  import { logger } from './logger';

  async function generateTests(sourceCode: string, framework: string): Promise<string> {
    return agentic<string>(
      `Generate comprehensive unit tests for the given source code.
       Use the specified testing framework.`,
      { sourceCode, framework }
    );
  }

  async function createTestSuite(
    sourceCode: string,
    framework: string,
    filePath: string
  ): Promise<string> {
    try {
      const tests = await generateTests(sourceCode, framework);
      logger.info("Test suite generated successfully", {
        operation: "generateTests",
        framework,
        filePath,
        sourceLines: (sourceCode.match(/\n/g) || []).length,
        testLines: (tests.match(/\n/g) || []).length
      });
      return tests;
    } catch (e) {
      logger.error("Test generation failed", {
        operation: "generateTests",
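        // `e` is `unknown` under strict TypeScript, so narrow it before reading properties
        errorType: e instanceof Error ? e.constructor.name : typeof e,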
        errorMessage: String(e),
        framework,
        filePath,
        sourceLines: (sourceCode.match(/\n/g) || []).length
      });
      throw e;
    }
  }
  ```
</CodeGroup>
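
In Python, fields passed through `extra` are attached to the log record but are not printed by the default formatter. A minimal sketch of a formatter that emits them as JSON follows; `JsonFormatter` is a hypothetical name, and production setups often use an existing structured-logging library instead.

```python
import json
import logging

# Attribute names present on every LogRecord; anything else came from `extra`.
_STANDARD_ATTRS = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__) | {
    "message",
    "asctime",
}


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, including `extra` fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy over any non-standard attributes, i.e. the `extra` fields.
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        # default=str keeps non-JSON-serializable values from raising.
        return json.dumps(payload, default=str)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logging.getLogger().addHandler(handler)
```

With this handler installed, the `operation`, `error_type`, and `file_path` fields from the example above appear as top-level keys in each log line.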

### Sensitive Data Handling

Never log sensitive user data. Redact or omit PII while preserving enough context for debugging:

<CodeGroup>
  ```python Python theme={null}
  import hashlib
  import logging

  from agentica import agentic

  logger = logging.getLogger(__name__)

  def hash_for_logging(value: str) -> str:
      """Create a consistent hash for correlation without exposing data."""
      return hashlib.sha256(value.encode()).hexdigest()[:8]

  @agentic()
  async def analyze_support_ticket(ticket_text: str, customer_email: str) -> dict:
      """Analyze customer support ticket."""
      ...

  async def process_ticket(ticket_text: str, customer_email: str) -> dict:
      try:
          result = await analyze_support_ticket(ticket_text, customer_email)
          logger.info("Ticket analyzed", extra={
              "operation": "analyze_support_ticket",
              "customer_id": hash_for_logging(customer_email),
              "ticket_length": len(ticket_text)
          })
          return result
      except Exception as e:
          logger.error("Ticket analysis failed", extra={
              "operation": "analyze_support_ticket",
              "error": str(e),
              "customer_id": hash_for_logging(customer_email),
              "ticket_length": len(ticket_text)
              # Do NOT log customer_email or ticket_text
          })
          raise
  ```

  ```typescript TypeScript theme={null}
  import { agentic } from '@symbolica/agentica';
  import { createHash } from 'crypto';
  import { logger } from './logger';

  function hashForLogging(value: string): string {
    return createHash('sha256').update(value).digest('hex').slice(0, 8);
  }

  async function analyzeSupportTicket(
    ticketText: string,
    customerEmail: string
  ): Promise<object> {
    return agentic<object>("Analyze customer support ticket.", { ticketText, customerEmail });
  }

  async function processTicket(ticketText: string, customerEmail: string): Promise<object> {
    try {
      const result = await analyzeSupportTicket(ticketText, customerEmail);
      logger.info("Ticket analyzed", {
        operation: "analyzeSupportTicket",
        customerId: hashForLogging(customerEmail),
        ticketLength: ticketText.length
      });
      return result;
    } catch (e) {
      logger.error("Ticket analysis failed", {
        operation: "analyzeSupportTicket",
        error: String(e),
        customerId: hashForLogging(customerEmail),
        ticketLength: ticketText.length
        // Do NOT log customerEmail or ticketText
      });
      throw e;
    }
  }
  ```
</CodeGroup>
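
One caveat about the hashing approach above: an unkeyed SHA-256 of an email address can be reversed by hashing a list of candidate addresses and comparing digests. If that matters for your threat model, a keyed hash is a small change. The sketch below is one option, assuming a secret key is available to the application; `keyed_hash_for_logging` and the `LOG_HASH_KEY` environment variable are hypothetical names.

```python
import hashlib
import hmac
import os

# Hypothetical variable name; in production, load the key from your secret store.
# The fallback value is for local development only.
LOG_HASH_KEY = os.environ.get("LOG_HASH_KEY", "dev-only-key").encode()


def keyed_hash_for_logging(value: str, key: bytes = LOG_HASH_KEY) -> str:
    """Return a short keyed digest usable as a correlation ID in logs."""
    normalized = value.strip().lower()  # Treat "A@x.com " and "a@x.com" alike.
    return hmac.new(key, normalized.encode(), hashlib.sha256).hexdigest()[:16]
```

The digest stays stable for a given key, so log entries for the same customer still correlate, but without the key an attacker cannot confirm a guessed address.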

### Tracking Degraded Performance

When using fallback strategies, log which tier succeeded. Frequent fallbacks suggest reviewing your approach -- consider trying a different model better suited to the task, refining your prompts, or providing additional context.

This example shows agentic code review with quality tracking:

<CodeGroup>
  ```python Python theme={null}
  import logging
  from dataclasses import dataclass

  from agentica import agentic

  logger = logging.getLogger(__name__)

  @dataclass
  class ReviewResult:
      issues: list[str]
      severity: str
      suggestions: list[str]

  @agentic()
  async def deep_code_review(code: str, context: dict) -> ReviewResult:
      """
      Perform comprehensive code review including:
      - Security vulnerabilities
      - Performance issues
      - Best practices violations
      - Design pattern suggestions
      Analyze with full codebase context.
      """
      ...

  @agentic()
  async def basic_code_review(code: str) -> ReviewResult:
      """
      Perform basic code review:
      - Syntax issues
      - Common anti-patterns
      - Simple style violations
      No codebase context required.
      """
      ...

  async def review_code_with_monitoring(code: str, context: dict | None = None) -> ReviewResult:
      # Try comprehensive review with context
      if context:
          try:
              result = await deep_code_review(code, context)
              logger.info("Code review completed", extra={
                  "method": "deep_review",
                  "issues_found": len(result.issues),
                  "severity": result.severity
              })
              return result
          except Exception as e:
              logger.warning("Deep review failed, falling back to basic", extra={
                  "error": str(e),
                  "code_length": len(code)
              })

      # Fallback to basic review
      try:
          result = await basic_code_review(code)
          logger.warning("Code review completed with basic analysis only", extra={
              "method": "basic_review",
              "issues_found": len(result.issues),
              "severity": result.severity
          })
          return result
      except Exception as e:
          logger.error("All review methods failed", extra={
              "error": str(e),
              "code_length": len(code)
          })
          raise
  ```

  ```typescript TypeScript theme={null}
  import { agentic } from '@symbolica/agentica';
  import { logger } from './logger';

  interface ReviewResult {
    issues: string[];
    severity: string;
    suggestions: string[];
  }

  async function deepCodeReview(code: string, context: object): Promise<ReviewResult> {
    return agentic<ReviewResult>(
      `Perform comprehensive code review including security vulnerabilities,
       performance issues, best practices, and design patterns.
       Analyze with full codebase context.`,
      { code, context }
    );
  }

  async function basicCodeReview(code: string): Promise<ReviewResult> {
    return agentic<ReviewResult>(
      `Perform basic code review: syntax issues, common anti-patterns,
       and simple style violations. No codebase context required.`,
      { code }
    );
  }

  async function reviewCodeWithMonitoring(
    code: string,
    context?: object
  ): Promise<ReviewResult> {
    // Try comprehensive review with context
    if (context) {
      try {
        const result = await deepCodeReview(code, context);
        logger.info("Code review completed", {
          method: "deep_review",
          issuesFound: result.issues.length,
          severity: result.severity
        });
        return result;
      } catch (e) {
        logger.warn("Deep review failed, falling back to basic", {
          error: String(e),
          codeLength: code.length
        });
      }
    }

    // Fallback to basic review
    try {
      const result = await basicCodeReview(code);
      logger.warn("Code review completed with basic analysis only", {
        method: "basic_review",
        issuesFound: result.issues.length,
        severity: result.severity
      });
      return result;
    } catch (e) {
      logger.error("All review methods failed", {
        error: String(e),
        codeLength: code.length
      });
      throw e;
    }
  }
  ```
</CodeGroup>

Use metrics from these logs to track fallback rates and identify patterns. High fallback rates are a useful diagnostic -- they suggest opportunities to choose a model better suited to your task, refine your prompts, or provide additional context.
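
As a rough illustration of such a metric, the sketch below counts which tier produced each result and derives a fallback rate. `FallbackTracker` is a hypothetical in-process helper; in production you would more likely emit these counts to your metrics system or compute them from the structured logs.

```python
from collections import Counter


class FallbackTracker:
    """Count which review tier produced each result."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def record(self, method: str) -> None:
        self.counts[method] += 1

    def fallback_rate(self, primary: str = "deep_review") -> float:
        """Fraction of recorded results that did not come from the primary tier."""
        total = sum(self.counts.values())
        if total == 0:
            return 0.0
        return 1 - self.counts[primary] / total


tracker = FallbackTracker()
for method in ["deep_review", "deep_review", "basic_review", "deep_review"]:
    tracker.record(method)
print(f"fallback rate: {tracker.fallback_rate():.0%}")  # prints "fallback rate: 25%"
```

Calling `tracker.record(...)` with the same `method` value that is logged in `review_code_with_monitoring` keeps the metric and the logs aligned.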

## Caching

Inference-level caching is managed internally and depends on the model provider.

<Tip>
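  Results of previous agent invocations can also be cached client-side. In Python, `@functools.cache` works for synchronous callables. Applied directly to an `async def` function, it caches the coroutine object rather than its result, so for async agentic functions cache the awaited value instead.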
</Tip>
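
The sketch below shows one way to cache awaited results for async functions. It avoids the problem that arises when `functools.cache` wraps an `async def` directly, where the cached coroutine object can only be awaited once. `async_cache` is a hypothetical helper, not an SDK feature, and it assumes all arguments are hashable.

```python
import asyncio
import functools
from typing import Any, Awaitable, Callable, Dict, Hashable


def async_cache(func: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
    """Cache awaited results of an async function, keyed by its arguments."""
    results: Dict[Hashable, Any] = {}

    @functools.wraps(func)
    async def wrapper(*args: Hashable, **kwargs: Hashable) -> Any:
        key = (args, tuple(sorted(kwargs.items())))
        if key not in results:
            # Failures are not cached: an exception propagates before assignment.
            results[key] = await func(*args, **kwargs)
        return results[key]

    return wrapper


@async_cache
async def slow_square(x: int) -> int:  # Stand-in for an agentic function.
    await asyncio.sleep(0)
    return x * x
```

Two concurrent calls with the same arguments may both run the underlying function before either result is stored; add a per-key lock if that matters. The cache is also unbounded, so consider an eviction policy for long-running processes.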

### Rate Limiting

When a provider imposes rate limits, the SDK applies exponential backoff with sensible defaults, blocking the call until the retry completes.
With every retry, the `delay` is multiplied by a factor of `exponential_base * (1 + jitter * random_float)`, where `0.0 <= random_float < 1.0`, until `max_retries` is reached.
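
To get a feel for how the delays grow, the rule above can be written out directly. This is an illustrative sketch rather than the SDK's implementation: the default values are made up, and it assumes the multiplication is applied before each wait, including the first.

```python
import random
from typing import List, Optional


def backoff_delays(
    delay: float = 1.0,
    exponential_base: float = 2.0,
    jitter: float = 0.1,
    max_retries: int = 5,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return the wait before each retry under the multiplicative rule above."""
    rng = rng or random.Random()
    delays = []
    for _ in range(max_retries):
        # rng.random() is drawn from [0.0, 1.0), matching the stated range.
        delay *= exponential_base * (1 + jitter * rng.random())
        delays.append(delay)
    return delays


# With jitter disabled the schedule is a plain geometric sequence.
print(backoff_delays(jitter=0.0, max_retries=4))  # prints [2.0, 4.0, 8.0, 16.0]
```

Because the jitter multiplies into the running delay, its effect compounds: each retry's randomness carries into every later wait, which helps desynchronize clients that were rate-limited at the same moment.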

## Next Steps

<CardGroup cols={2}>
  <Card title="Agent Errors" icon="robot" href="/guides/agent-errors">
    Handle custom exceptions raised by agents
  </Card>

  <Card title="Best Practices" icon="star" href="/guides/best-practices">
    Production deployment best practices
  </Card>

  <Card title="Multi-Agent Systems" icon="users" href="/guides/multi-agent-systems">
    Handle errors in multi-agent systems
  </Card>

  <Card title="How It Works" icon="gears" href="/concepts/how-it-works">
    Understand execution modes and RPC (Warp)
  </Card>

  <Card title="Reporting Bugs" icon="bug" href="/guides/reporting-bugs">
    Report bugs in the Agentica SDK
  </Card>
</CardGroup>
