
MaxTokens

A dataclass passed when spawning an agent or decorating an agentic function to cap the number of output tokens generated by the underlying model.
@dataclass(slots=True, frozen=True)
class MaxTokens:
    """Control the maximum number of tokens an agent can generate"""

    per_invocation: int | None = None
    per_round: int | None = None
    rounds: int | None = None
Parameters
per_invocation
int | None
The maximum number of output tokens generated over all rounds of inference in a single invocation. Defaults to None, meaning unlimited.
per_round
int | None
The maximum number of output tokens generated in a single round of inference in a single invocation. Defaults to None, meaning unlimited.
rounds
int | None
The maximum number of rounds of inference in a single invocation. Defaults to None, meaning unlimited.

ResponseUsage

Token usage statistics returned by all usage-reporting methods. This is the ResponseUsage type from openai.types.responses, which includes breakdowns of cached input tokens and reasoning output tokens.
from openai.types.responses import ResponseUsage

class ResponseUsage:
    input_tokens: int
    output_tokens: int
    total_tokens: int
    input_tokens_details: InputTokensDetails   # .cached_tokens: int
    output_tokens_details: OutputTokensDetails  # .reasoning_tokens: int
Key Fields
input_tokens
int
The number of input tokens consumed by the model.
output_tokens
int
The number of output tokens generated by the model.
input_tokens_details.cached_tokens
int
The number of input tokens that were served from cache.
output_tokens_details.reasoning_tokens
int
The number of tokens used for internal reasoning / chain-of-thought.
Example
u = agent.last_usage()
print(f"Input tokens:     {u.input_tokens}")
print(f"Output tokens:    {u.output_tokens}")
print(f"Cached tokens:    {u.input_tokens_details.cached_tokens}")
print(f"Reasoning tokens: {u.output_tokens_details.reasoning_tokens}")
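The detail fields can be combined with the top-level counters to derive a few useful figures. The sketch below uses local stand-in dataclasses that mirror only the documented fields, so it runs without `openai` installed; `summarize` is a hypothetical helper, not part of any library. It assumes cached tokens are counted within `input_tokens` and reasoning tokens within `output_tokens`.

```python
from dataclasses import dataclass


# Stand-ins mirroring the documented fields (not the real openai classes).
@dataclass
class InputTokensDetails:
    cached_tokens: int


@dataclass
class OutputTokensDetails:
    reasoning_tokens: int


@dataclass
class ResponseUsage:
    input_tokens: int
    output_tokens: int
    total_tokens: int
    input_tokens_details: InputTokensDetails
    output_tokens_details: OutputTokensDetails


def summarize(u: ResponseUsage) -> dict[str, float]:
    """Hypothetical helper: derive a few figures from a usage record."""
    cached = u.input_tokens_details.cached_tokens
    reasoning = u.output_tokens_details.reasoning_tokens
    return {
        # Guard against division by zero when no input was consumed.
        "cache_hit_ratio": cached / u.input_tokens if u.input_tokens else 0.0,
        "uncached_input_tokens": u.input_tokens - cached,
        "visible_output_tokens": u.output_tokens - reasoning,
    }


u = ResponseUsage(
    input_tokens=1000,
    output_tokens=400,
    total_tokens=1400,
    input_tokens_details=InputTokensDetails(cached_tokens=250),
    output_tokens_details=OutputTokensDetails(reasoning_tokens=100),
)
summary = summarize(u)
```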

total_usage

A function to obtain the total token usage of an agent or agentic function across all invocations.
def total_usage(ag: AgenticFunction | Agent | Callable[..., Any]) -> ResponseUsage:
Parameters
ag
AgenticFunction | Agent | Callable[..., Any]
The agent, agentic function, or decorated callable to query.
Returns
ResponseUsage
The token usage across all invocations of the agent or agentic function.

last_usage

A function to obtain the token usage of an agent or agentic function for the last invocation.
def last_usage(ag: AgenticFunction | Agent | Callable[..., Any]) -> ResponseUsage:
Parameters
ag
AgenticFunction | Agent | Callable[..., Any]
The agent, agentic function, or decorated callable to query.
Returns
ResponseUsage
The token usage of the last invocation of the agent or agentic function.
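To make the difference between the two functions concrete, the toy model below keeps one usage record per invocation and derives both views from it. `Usage` and `UsageLog` are hypothetical stand-ins written for this illustration; they are not the library's implementation and track only the top-level counters.

```python
from dataclasses import dataclass, field


@dataclass
class Usage:
    """Simplified stand-in for ResponseUsage (top-level counters only)."""

    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


@dataclass
class UsageLog:
    """Hypothetical bookkeeping: one Usage record per invocation."""

    invocations: list[Usage] = field(default_factory=list)

    def record(self, usage: Usage) -> None:
        self.invocations.append(usage)

    def last(self) -> Usage:
        # Usage of the most recent invocation only.
        return self.invocations[-1] if self.invocations else Usage()

    def total(self) -> Usage:
        # Sum over every invocation recorded so far.
        return Usage(
            input_tokens=sum(u.input_tokens for u in self.invocations),
            output_tokens=sum(u.output_tokens for u in self.invocations),
        )


log = UsageLog()
log.record(Usage(input_tokens=120, output_tokens=30))  # first invocation
log.record(Usage(input_tokens=200, output_tokens=50))  # second invocation
last, total = log.last(), log.total()
```

In this model the "last" view reflects only the second invocation, while the "total" view grows with every call, which matches the descriptions of last_usage and total_usage above.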