MaxTokens
A dataclass passed when spawning an agent or decorating an agentic function to limit the number of output tokens generated by the underlying model.
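The three limits in this entry (a cap over the whole invocation, a cap per round, and a cap on the number of rounds) can interact in more than one way. The following is a minimal sketch of one plausible reading, not the library's implementation. `TokenLimits`, its field names `per_invocation`, `per_round`, and `rounds`, and the helper `round_budget` are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenLimits:
    """Hypothetical stand-in for MaxTokens; the field names are assumptions."""
    per_invocation: Optional[int] = None  # cap over all rounds; None = unlimited
    per_round: Optional[int] = None       # cap for one round; None = unlimited
    rounds: Optional[int] = None          # cap on round count; None = unlimited


def round_budget(limits: TokenLimits, rounds_done: int, tokens_used: int) -> Optional[int]:
    """Return the output-token budget for the next round.

    Returns 0 when no further round is allowed and None when unlimited.
    """
    # The round cap stops the invocation regardless of remaining tokens.
    if limits.rounds is not None and rounds_done >= limits.rounds:
        return 0
    # Tokens still available under the invocation-wide cap, if there is one.
    remaining = None
    if limits.per_invocation is not None:
        remaining = max(limits.per_invocation - tokens_used, 0)
    # The tighter of the two token caps applies to the next round.
    if limits.per_round is None:
        return remaining
    if remaining is None:
        return limits.per_round
    return min(limits.per_round, remaining)
```

Under this reading, leaving every field at None never restricts a round, and the per-round cap shrinks automatically as the invocation-wide budget runs out.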
The maximum number of output tokens generated over all rounds of inference in a single invocation. Defaults to None, meaning unlimited.
The maximum number of output tokens generated in a single round of inference in a single invocation. Defaults to None, meaning unlimited.
The maximum number of rounds of inference in a single invocation. Defaults to None, meaning unlimited.
ResponseUsage
Token usage statistics returned by all usage-reporting methods. This is ResponseUsage from openai.types.responses, which provides richer detail including cached and reasoning token breakdowns.
The number of input tokens consumed by the model.
The number of output tokens generated by the model.
The number of input tokens that were served from cache.
The number of tokens used for internal reasoning / chain-of-thought.
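As a rough illustration of how the four figures above relate, the sketch below splits a usage record into cached versus uncached input and reasoning versus visible output. To stay self-contained it uses small stand-in dataclasses rather than importing the real type. The attribute layout (`input_tokens_details.cached_tokens`, `output_tokens_details.reasoning_tokens`) is an assumption about how the OpenAI type nests these counts, and the helper `summarize` is hypothetical. It also assumes cached tokens are counted within input tokens and reasoning tokens within output tokens.

```python
from dataclasses import dataclass


@dataclass
class InputDetails:
    cached_tokens: int  # input tokens served from cache


@dataclass
class OutputDetails:
    reasoning_tokens: int  # tokens spent on internal reasoning


@dataclass
class UsageStandIn:
    """Stand-in with an assumed ResponseUsage-like layout (not the real class)."""
    input_tokens: int
    input_tokens_details: InputDetails
    output_tokens: int
    output_tokens_details: OutputDetails
    total_tokens: int


def summarize(usage) -> dict:
    """Break a usage record into cached/uncached input and reasoning/visible output."""
    cached = usage.input_tokens_details.cached_tokens
    reasoning = usage.output_tokens_details.reasoning_tokens
    return {
        "uncached_input": usage.input_tokens - cached,
        "cached_input": cached,
        "reasoning_output": reasoning,
        "visible_output": usage.output_tokens - reasoning,
    }


usage = UsageStandIn(1200, InputDetails(800), 500, OutputDetails(300), 1700)
print(summarize(usage))
```

Because `summarize` only reads attributes, it should work on any object with the same layout, which is the assumption to verify against the installed version of the OpenAI package.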
total_usage
A function to obtain the total token usage of an agent or agentic function across all invocations.
The agent, agentic function, or decorated callable to query.
The token usage across all invocations of the agent or agentic function.
last_usage
A function to obtain the token usage of an agent or agentic function for the last invocation.
The agent, agentic function, or decorated callable to query.
The token usage of the last invocation of the agent or agentic function.
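The difference between total_usage and last_usage is only one of scope: the first aggregates over every invocation, the second reports the most recent one. The sketch below models that relationship with a toy tracker. `Usage`, `UsageTracker`, and their methods are hypothetical names for illustration and say nothing about how the library stores usage internally.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Usage:
    """Simplified usage record holding only input and output counts."""
    input_tokens: int = 0
    output_tokens: int = 0

    def __add__(self, other: "Usage") -> "Usage":
        return Usage(
            self.input_tokens + other.input_tokens,
            self.output_tokens + other.output_tokens,
        )


@dataclass
class UsageTracker:
    """Toy model: one Usage entry is recorded per invocation."""
    history: List[Usage] = field(default_factory=list)

    def record(self, usage: Usage) -> None:
        self.history.append(usage)

    def last(self) -> Optional[Usage]:
        # Usage of the most recent invocation, or None if never invoked.
        return self.history[-1] if self.history else None

    def total(self) -> Usage:
        # Sum over every recorded invocation.
        result = Usage()
        for usage in self.history:
            result = result + usage
        return result


tracker = UsageTracker()
tracker.record(Usage(input_tokens=100, output_tokens=20))
tracker.record(Usage(input_tokens=250, output_tokens=40))
print(tracker.last())
print(tracker.total())
```

In this model the two values coincide after exactly one invocation and diverge from the second invocation onward.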