Token Usage

`MaxTokens`

A dataclass passed when spawning an agent or decorating an agentic function to cap the number of output tokens generated by the underlying model.

from dataclasses import dataclass

@dataclass(slots=True, frozen=True)
class MaxTokens:
    """Control the maximum number of tokens an agent can generate"""

    per_invocation: int | None = None
    per_round: int | None = None
    rounds: int | None = None

Parameters

per_invocation

int | None

The maximum number of output tokens generated over all rounds of inference in a single invocation. Defaults to None, meaning unlimited.

per_round

int | None

The maximum number of output tokens generated in a single round of inference within an invocation. Defaults to None, meaning unlimited.

rounds

int | None

The maximum number of rounds of inference in a single invocation. Defaults to None, meaning unlimited.

`ResponseUsage`

Token usage statistics returned by all usage-reporting methods. This is the `ResponseUsage` type from `openai.types.responses`, which includes breakdowns of cached input tokens and reasoning output tokens.

from openai.types.responses import ResponseUsage

class ResponseUsage:
    input_tokens: int
    output_tokens: int
    total_tokens: int
    input_tokens_details: InputTokensDetails   # .cached_tokens: int
    output_tokens_details: OutputTokensDetails  # .reasoning_tokens: int

Key Fields

input_tokens

int

The number of input tokens consumed by the model.

output_tokens

int

The number of output tokens generated by the model.

input_tokens_details.cached_tokens

int

The number of input tokens that were served from cache.

output_tokens_details.reasoning_tokens

int

The number of tokens used for internal reasoning / chain-of-thought.

Example

u = agent.last_usage()
print(f"Input tokens:     {u.input_tokens}")
print(f"Output tokens:    {u.output_tokens}")
print(f"Cached tokens:    {u.input_tokens_details.cached_tokens}")
print(f"Reasoning tokens: {u.output_tokens_details.reasoning_tokens}")

`total_usage`

A function to obtain the total token usage of an agent or agentic function across all invocations.

def total_usage(ag: AgenticFunction | Agent | Callable[..., Any]) -> ResponseUsage:

Parameters

ag

AgenticFunction | Agent | Callable[..., Any]

The agent, agentic function, or decorated callable to query.

Returns

ResponseUsage

The token usage across all invocations of the agent or agentic function.

`last_usage`

A function to obtain the token usage of an agent or agentic function for the last invocation.

def last_usage(ag: AgenticFunction | Agent | Callable[..., Any]) -> ResponseUsage:

Parameters

ag

AgenticFunction | Agent | Callable[..., Any]

The agent, agentic function, or decorated callable to query.

Returns

ResponseUsage

The token usage of the last invocation of the agent or agentic function.
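
The difference between the two functions is one of scope: the last invocation versus every invocation so far. The sketch below models that distinction with a hypothetical `UsageLog` class that records plain (input, output) token pairs. It illustrates the documented semantics under the assumption that totals are simple sums, and it says nothing about how the library tracks usage internally.

```python
from dataclasses import dataclass, field


@dataclass
class UsageLog:
    """Hypothetical stand-in: one (input_tokens, output_tokens) pair per invocation."""

    invocations: list[tuple[int, int]] = field(default_factory=list)

    def record(self, input_tokens: int, output_tokens: int) -> None:
        """Store the usage of one finished invocation."""
        self.invocations.append((input_tokens, output_tokens))

    def last(self) -> tuple[int, int]:
        """Usage of the most recent invocation only."""
        return self.invocations[-1]

    def total(self) -> tuple[int, int]:
        """Usage summed over every recorded invocation."""
        return (
            sum(i for i, _ in self.invocations),
            sum(o for _, o in self.invocations),
        )


log = UsageLog()
log.record(input_tokens=500, output_tokens=80)
log.record(input_tokens=650, output_tokens=120)

print("last: ", log.last())
print("total:", log.total())
```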
