Overview
Agentica lets agents operate inside your runtime: they write and execute arbitrary Python code in a sandboxed execution environment while keeping direct access to the runtime objects you choose to expose.
Why?
Agentica is built on the premise that code is the most expressive interface through which models can interact with their environment; with Agentica, agents can manage and engineer their own context, manipulate and return objects by reference, and dynamically create their own tools. Check out the blog post Beyond Code Mode: Agentica.
The Mental Model
This is achieved by combining Remote Procedure Call (RPC) with transparent proxying in a Python REPL using a language-agnostic object model.
- Sandboxed execution: Agents write Python code that is executed in a safe, isolated environment
- RPC bridge: Functions you pass in scope appear in the sandbox as stubs, but execute in your runtime
- Your code stays local: All actual computation happens in your process with full access to your dependencies
- Objects are proxied: Return values from your functions are represented in the sandbox as lightweight proxies rather than being fully serialized
- Type safety enforced: Return values are validated against your type annotations
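The stub-and-proxy idea in the list above can be illustrated with a minimal, single-process sketch. This is not Agentica's implementation: `Host`, `Proxy`, and `make_stub` are hypothetical names, the bodies of `get_customer_tier` and `calculate_discount` are invented stand-ins, and a plain method call takes the place of the real RPC transport.

```python
import itertools
from dataclasses import dataclass

# --- "Your runtime": real objects and functions live here ---

@dataclass
class CustomerTier:
    name: str
    discount_rate: float

def get_customer_tier(customer_name: str) -> CustomerTier:
    # Hypothetical stand-in for a real database lookup.
    if customer_name == "Alice":
        return CustomerTier("gold", 0.25)
    return CustomerTier("standard", 0.0)

def calculate_discount(tier: CustomerTier, base_price: float) -> float:
    return base_price * tier.discount_rate

@dataclass(frozen=True)
class Proxy:
    """What the sandbox sees: an opaque handle plus a type name."""
    handle: int
    type_name: str

class Host:
    """Keeps real objects and hands out integer handles for them."""

    def __init__(self, functions):
        self._functions = {f.__name__: f for f in functions}
        self._objects = {}
        self._ids = itertools.count(1)

    def call(self, name, args):
        # Swap incoming proxies for the real objects they refer to.
        real_args = [self._objects[a.handle] if isinstance(a, Proxy) else a
                     for a in args]
        result = self._functions[name](*real_args)
        # Plain values are returned as-is; other objects stay behind a handle.
        if isinstance(result, (int, float, str, bool, type(None))):
            return result
        handle = next(self._ids)
        self._objects[handle] = result
        return Proxy(handle, type(result).__name__)

# --- "The sandbox": only stubs and proxies exist here ---

def make_stub(host, name):
    def stub(*args):
        # In a real system this call would cross a process boundary.
        return host.call(name, args)
    stub.__name__ = name
    return stub

host = Host([get_customer_tier, calculate_discount])
sandbox = {name: make_stub(host, name)
           for name in ("get_customer_tier", "calculate_discount")}

tier = sandbox["get_customer_tier"]("Alice")           # a Proxy, not a CustomerTier
discount = sandbox["calculate_discount"](tier, 100.0)  # proxy resolved on the host
print(tier, discount)
```

The point of the sketch is that the sandbox side never holds a `CustomerTier`; it only holds a handle that the host resolves whenever the proxy is passed back in.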
Anatomy of an Invocation
Let’s walk through exactly what happens when you call an agentic function (calling an agent works much the same way). Consider the simplified example below where we have elided portions of the code for clarity.
What Happens Behind the Scenes
When process_order("Alice", 100.0) is called, it triggers the following interaction.
1. Agentica sends a request to the underlying model with:
- The instruction: "Look up customer tier, calculate discount, and create order"
- The input parameters: customer_name = "Alice", base_price = 100.0
- The signatures and docstrings of functions in scope: get_customer_tier(), calculate_discount()
- The details of the types in scope: OrderResult, CustomerTier
- The details of the expected return type, OrderResult
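To make the request contents above concrete, here is a rough sketch of how such a payload could be assembled with the standard `inspect` and `dataclasses` modules. The dictionary layout, the helpers `describe_function` and `describe_type`, the docstrings, and the fields of `OrderResult` and `CustomerTier` are all assumptions made for illustration; the actual wire format Agentica uses is not described in this document.

```python
import inspect
from dataclasses import dataclass, fields

@dataclass
class CustomerTier:
    name: str             # hypothetical field
    discount_rate: float  # hypothetical field

@dataclass
class OrderResult:
    customer_name: str    # hypothetical field
    final_price: float    # hypothetical field

def get_customer_tier(customer_name: str) -> CustomerTier:
    """Look up the pricing tier for a customer."""
    ...

def calculate_discount(tier: CustomerTier, base_price: float) -> float:
    """Return the discount amount for a tier and price."""
    ...

def describe_function(fn):
    # Signature and docstring are what the model sees, never the body.
    return {
        "signature": f"{fn.__name__}{inspect.signature(fn)}",
        "doc": inspect.getdoc(fn),
    }

def describe_type(cls):
    return {
        "name": cls.__name__,
        "fields": {f.name: getattr(f.type, "__name__", str(f.type))
                   for f in fields(cls)},
    }

request = {
    "instruction": "Look up customer tier, calculate discount, and create order",
    "inputs": {"customer_name": "Alice", "base_price": 100.0},
    "functions": [describe_function(f)
                  for f in (get_customer_tier, calculate_discount)],
    "types": [describe_type(t) for t in (OrderResult, CustomerTier)],
    "return_type": OrderResult.__name__,
}
```

Only descriptions leave your process in this sketch; the function bodies and any data they touch stay local, which matches the "your code stays local" point made earlier.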
2. The agent writes and runs code in the sandboxed REPL, where the following are in scope: get_customer_tier, calculate_discount, OrderResult, CustomerTier, etc. The agent can write and evaluate code like normal. Here’s what a sample REPL session could look like.
An example of an agent's output in Agentica.
Async Functions in the REPL: The REPL includes a top-level event loop, so async functions work naturally. When you pass async functions from your runtime, they appear in the REPL as functions returning Future[T]: in Python, async def foo(...) -> T becomes def foo(...) -> Future[T], and in TypeScript, async function foo(...): Promise<T> translates similarly. The agent can use top-level await, and standard patterns like asyncio.gather() work as expected.
A breakdown
Let’s break down the key lines from the agent’s output above:
- get_customer_tier was never defined in the sandbox, but RPC and transparent proxying make it appear to be present in the agent’s execution environment. The stub intercepts the call and triggers an RPC to your runtime, where your actual get_customer_tier() executes with access to your database, environment, etc.
- The returned tier looks like a CustomerTier object, but it’s actually a lightweight reference to the real object in your runtime.
- When the agent passes tier to another function, the proxy is sent over RPC. Your actual calculate_discount() executes in your runtime with the real CustomerTier object.
- Calling OrderResult invokes a stub that triggers your actual class constructor in your runtime, returning another proxy.
3. The result is validated and returned to your code: Agentica checks that result matches the required return type (OrderResult), which ensures type safety.
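The validation step can be pictured as a check of the returned value against the function's declared return annotation. The sketch below is an assumption about the general technique, not Agentica's code: `check_return` is a hypothetical helper, the `OrderResult` fields are invented, and only plain class annotations are handled.

```python
from dataclasses import dataclass
from typing import get_type_hints

@dataclass
class OrderResult:
    customer_name: str  # hypothetical field
    final_price: float  # hypothetical field

def process_order(customer_name: str, base_price: float) -> OrderResult:
    ...  # body elided; only the annotations matter for this sketch

def check_return(fn, value):
    """Return value unchanged if it matches fn's return annotation, else raise."""
    hints = get_type_hints(fn)
    if "return" not in hints:
        return value  # nothing declared, nothing to check
    expected = hints["return"]
    if not isinstance(expected, type):
        raise NotImplementedError("this sketch only handles plain classes")
    if not isinstance(value, expected):
        raise TypeError(
            f"{fn.__name__} must return {expected.__name__}, "
            f"got {type(value).__name__}"
        )
    return value

# A well-typed result passes through untouched.
ok = check_return(process_order, OrderResult("Alice", 75.0))

# A result of the wrong type is rejected before it reaches the caller.
try:
    check_return(process_order, {"customer_name": "Alice"})
    rejected = False
except TypeError:
    rejected = True
```

Failing at this boundary means a mistyped result surfaces as an error at the call site rather than propagating into the rest of your code.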
Current limitations
Agents and agentic functions currently cannot:
- define a type and then return an instance of that type
- return functions, types, or generators that they have defined themselves
- take inputs such as certain iterators and generators, binary buffers, URL objects, etc.
- use complex numerical data such as torch tensors, numpy arrays, and pandas dataframes