
Overview

Agentica lets agents operate within your runtime: agents write and execute arbitrary Python code in a sandboxed execution environment while having direct access to the objects in your runtime that you choose to expose.

Why?

Agentica is built on the premise that code is the most expressive interface through which models can interact with their environment. With Agentica, agents can manage and engineer their own context, manipulate and return objects by reference, and dynamically create their own tools. Check out the blog post Beyond Code Mode: Agentica.

The Mental Model

This is achieved by combining Remote Procedure Call (RPC) with transparent proxying in a Python REPL using a language-agnostic object model.
  • Sandboxed execution: Agents write Python code that is executed in a safe, isolated environment
  • RPC bridge: Functions you pass in scope appear in the sandbox as stubs, but execute in your runtime
  • Your code stays local: The functions you pass in execute in your process with full access to your dependencies
  • Objects are proxied: Values returned by your functions are represented in the sandbox as lightweight proxies, not fully serialized copies
  • Type safety enforced: Return values are validated against your type annotations
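The RPC bridge can be pictured with a small, self-contained sketch. This is not Agentica's implementation: the registry, the JSON message shape, and the names expose, handle_request and make_stub are all hypothetical, and the tier is a plain string here to keep the message serializable. It only illustrates the idea that a stub in the sandbox has no logic of its own and forwards each call to the real function in your runtime.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical stand-in for "your runtime": a registry of the real functions.
RUNTIME_FUNCTIONS: Dict[str, Callable[..., Any]] = {}


def expose(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a real function so requests can reach it by name."""
    RUNTIME_FUNCTIONS[fn.__name__] = fn
    return fn


@expose
def calculate_discount(tier: str) -> float:
    # Real logic; runs only on the runtime side.
    return {"gold": 0.15, "silver": 0.05}.get(tier, 0.0)


def handle_request(raw: str) -> str:
    """Runtime side: decode a request, run the real function, encode the reply."""
    request = json.loads(raw)
    result = RUNTIME_FUNCTIONS[request["name"]](*request["args"])
    return json.dumps({"result": result})


def make_stub(name: str, transport: Callable[[str], str]) -> Callable[..., Any]:
    """Sandbox side: build a function that only forwards its arguments."""

    def stub(*args: Any) -> Any:
        reply = transport(json.dumps({"name": name, "args": list(args)}))
        return json.loads(reply)["result"]

    stub.__name__ = name
    return stub


# In a real deployment the transport would cross a process boundary;
# here it is a direct function call so the sketch stays runnable.
sandbox_calculate_discount = make_stub("calculate_discount", handle_request)
print(sandbox_calculate_discount("gold"))  # 0.15
```

Only the function name and its arguments cross the boundary; the body of calculate_discount never leaves the runtime side.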

Anatomy of an Invocation

Let’s walk through exactly what happens when you call an agentic function (invoking an agent works much the same way). Consider the simplified example below, where portions of the code are elided for clarity.
from agentica import agentic

# Your existing types and functions
class OrderResult: ...
class CustomerTier: ...
def get_customer_tier(name: str) -> CustomerTier: ...
def calculate_discount(tier: CustomerTier) -> float: ...

@agentic(get_customer_tier, calculate_discount)
async def process_order(customer_name: str, base_price: float) -> OrderResult:
    """Look up customer tier, calculate discount, and create order"""
    ...

# Call the function
result = await process_order("Alice", 100.0)

What Happens Behind the Scenes

When process_order("Alice", 100.0) is called, it triggers the following interaction.
1. Agentica sends a request to the underlying model with:
  • The instruction: "Look up customer tier, calculate discount, and create order"
  • The input parameters: customer_name = "Alice", base_price = 100.0
  • The signatures and docstrings of functions in scope: get_customer_tier(), calculate_discount()
  • The details of the types in scope: OrderResult, CustomerTier
  • The details of the expected return type, OrderResult
2. The agent interactively writes and evaluates code in a sandboxed REPL environment. The agent always writes Python code in the sandbox, regardless of whether you’re using the Python or TypeScript SDK. From its perspective, the Python REPL in the sandbox has everything in scope: get_customer_tier, calculate_discount, OrderResult, CustomerTier, etc. The agent can write and evaluate code like normal. Here’s what a sample REPL session could look like.
An example of an agent's output in Agentica.
[Agent]
I should begin by obtaining the customer tier
```python
tier = get_customer_tier(customer_name)
tier
```
[Execution]
```
CustomerTier(level='gold', benefits=['free_shipping', 'priority_support'])
```
[Agent]
This must be an important customer
```python
# I need to calculate the discount factor
discount = calculate_discount(tier)
discount
```
[Execution]
```
0.15
```
[Agent]
That looks reasonable, a 15% discount. I should compute the price using the discount
```python
final_price = base_price * (1 - discount)
final_price
```
[Execution]
```
85.0
```
[Agent]
Good, let me assemble the order
```python
return OrderResult(
    customer=customer_name,
    original_price=base_price,
    discount=discount,
    final_price=final_price
)
```
[Execution]
```
No output was produced.
```
Async Functions in the REPL: The REPL includes a top-level event loop, so async functions work naturally. Async functions passed in from your runtime appear in the REPL as functions returning Future[T]: Python's async def foo(...) -> T becomes def foo(...) -> Future[T], and TypeScript's async function foo(...): Promise<T> translates similarly. The agent can use top-level await, and standard patterns like asyncio.gather() work as expected.
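The translation from async def foo(...) -> T to def foo(...) -> Future[T] can be sketched with plain asyncio. The wrapper as_future_returning and the function fetch_price are hypothetical names for illustration, not part of Agentica. Because an ordinary Python script has no top-level await, the REPL session is simulated inside a coroutine.

```python
import asyncio
from typing import Any, Callable, Coroutine


def as_future_returning(
    fn: Callable[..., Coroutine[Any, Any, Any]],
) -> Callable[..., "asyncio.Future[Any]"]:
    """Turn `async def fn(...) -> T` into `def fn(...) -> Future[T]`."""

    def wrapper(*args: Any, **kwargs: Any) -> "asyncio.Future[Any]":
        # Schedules the coroutine on the running event loop and returns
        # immediately with a Task, which is a Future subclass.
        return asyncio.ensure_future(fn(*args, **kwargs))

    return wrapper


async def fetch_price(item: str) -> float:
    # Hypothetical async function living in "your runtime".
    await asyncio.sleep(0)
    return {"apple": 1.5, "pear": 2.0}[item]


async def repl_session():
    # Stands in for the sandbox REPL, which has its own top-level event loop.
    fetch = as_future_returning(fetch_price)
    future = fetch("apple")  # a plain call, no await needed to start it
    first = await future  # awaiting the Future yields the value
    both = await asyncio.gather(fetch("apple"), fetch("pear"))
    return first, both


result = asyncio.run(repl_session())
print(result)  # (1.5, [1.5, 2.0])
```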

A breakdown

Let’s break down the key lines from the agent’s output above:
tier = get_customer_tier(customer_name)
This calls a stub function in the sandbox. get_customer_tier was never defined in the sandbox, but RPC and transparent proxying make it appear to be present in the agent's execution environment. The stub intercepts the call and triggers an RPC to your runtime, where your actual get_customer_tier() executes with access to your database, environment, etc.
tier
CustomerTier(level='gold', benefits=['free_shipping', 'priority_support'])
The return value is sent back as a transparent proxy. On inspection, the agent sees what looks like a CustomerTier object, but it’s actually a lightweight reference to the real object in your runtime.
discount = calculate_discount(tier)
When the agent passes tier to another function, the proxy is sent over RPC. Your actual calculate_discount() executes in your runtime with the real CustomerTier object.
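A toy version of this proxy round trip is shown below. It assumes a simple in-memory handle table and uses hypothetical helpers (register, Proxy, resolve, call_in_runtime), and the function bodies are invented for the example; Agentica's actual object model is not shown in this document and may differ. The sketch only demonstrates the pattern: the sandbox holds a handle, inspection is forwarded to the real object, and passing the proxy back resolves it to that object.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Any, Dict, List


@dataclass
class CustomerTier:
    level: str
    benefits: List[str] = field(default_factory=list)


# Runtime side: real objects stay here, keyed by an integer handle.
_OBJECTS: Dict[int, Any] = {}
_next_handle = count(1)


def register(obj: Any) -> int:
    handle = next(_next_handle)
    _OBJECTS[handle] = obj
    return handle


class Proxy:
    """Sandbox side: stores only a handle and forwards inspection."""

    def __init__(self, handle: int) -> None:
        self._handle = handle

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes not found on the proxy itself.
        return getattr(_OBJECTS[self._handle], name)

    def __repr__(self) -> str:
        return repr(_OBJECTS[self._handle])


def resolve(value: Any) -> Any:
    """Swap a proxy for the real object before calling runtime code."""
    return _OBJECTS[value._handle] if isinstance(value, Proxy) else value


def call_in_runtime(fn, *args: Any) -> Any:
    result = fn(*(resolve(arg) for arg in args))
    if isinstance(result, (int, float, str, bool, type(None))):
        return result  # simple values are sent as-is
    return Proxy(register(result))  # everything else goes by reference


def get_customer_tier(name: str) -> CustomerTier:
    return CustomerTier("gold", ["free_shipping", "priority_support"])


def calculate_discount(tier: CustomerTier) -> float:
    assert isinstance(tier, CustomerTier)  # receives the real object
    return 0.15 if tier.level == "gold" else 0.0


tier = call_in_runtime(get_customer_tier, "Alice")
print(tier)  # CustomerTier(level='gold', benefits=['free_shipping', 'priority_support'])
discount = call_in_runtime(calculate_discount, tier)
print(discount)  # 0.15
```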
final_price = base_price * (1 - discount)
Simple calculations execute directly in the sandbox — no RPC needed for basic operations.
return OrderResult(...)
Instantiating OrderResult calls a stub that triggers your actual class constructor in your runtime, returning another proxy.
3. The result type is validated and returned to your code:
result = await process_order("Alice", 100.0)
# Returns: OrderResult(customer="Alice", original_price=100.0, discount=0.15, final_price=85.0)
Observe that no schema was generated or needed to return a value to your code. Instead, an object was instantiated in your runtime, and Agentica checked that its type matches the required return type (OrderResult), which ensures type safety.
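A check of this kind can be approximated in plain Python by reading the function's return annotation and comparing it with the value produced. The helper check_return is hypothetical and the OrderResult fields are assumed from the example output above. It handles only plain classes, not generic annotations such as list[int], so treat it as a sketch of the idea rather than a description of how Agentica validates results.

```python
from dataclasses import dataclass
from typing import Any, Callable, get_type_hints


@dataclass
class OrderResult:
    customer: str
    original_price: float
    discount: float
    final_price: float


def check_return(fn: Callable[..., Any], value: Any) -> Any:
    """Raise TypeError unless `value` matches fn's return annotation."""
    expected = get_type_hints(fn).get("return")
    if expected is not None and not isinstance(value, expected):
        raise TypeError(
            f"{fn.__name__} must return {expected.__name__}, "
            f"got {type(value).__name__}"
        )
    return value


async def process_order(customer_name: str, base_price: float) -> OrderResult:
    """Look up customer tier, calculate discount, and create order"""
    ...


order = OrderResult("Alice", 100.0, 0.15, 85.0)
print(check_return(process_order, order))
```

Passing anything other than an OrderResult, for example a dict with the same keys, raises TypeError instead of silently reaching your code.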

Current limitations

Agents and agentic functions currently cannot:
  • define a type and then return an instance of that type
  • return functions, types, or generators that they have defined themselves
  • take inputs such as certain iterators and generators, binary buffers, URL objects, etc.
  • use complex numerical data such as torch tensors, numpy arrays, and pandas dataframes
Stay tuned for updates!

Bug reports

To report bugs and errors to the Agentica team, please create an issue in the agentica-issues repo and include the agent logs (filenames are printed to output). Happy programming!
