AIG. - AI Google Gemini Integration



AIG.SetMaxToken



MiniRobotLanguage (MRL)

 

AIG.SetMaxToken
Set the Maximum Token Limit for AI Responses

 

Intention

 

SetMaxToken Command: Control Response Length
 
The SetMaxToken command allows you to define the maximum number of output tokens the Google Gemini AI can generate in a response, helping manage length, cost, and processing time.

This command sets the AIG_MaxToken global variable, influencing subsequent API calls.

It’s part of the AIG - Google Gemini API suite.

 

What is the SetMaxToken Command?

 

This command configures the maximum output token limit for responses from the Google Gemini API, ranging from 1 to 8192 tokens.

If an invalid value is provided (e.g., exceeding 8192 or less than 1), it defaults to 2048 tokens, aligning with the Gemini 1.0 Pro default output limit.
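The clamping rule above can be sketched in a few lines of Python (an illustration of the described behavior only; the function name `set_max_token` is assumed and is not the actual MRL internal):

```python
# Sketch of the validation rule described above: values outside
# the 1..8192 range (or non-numeric input) fall back to the
# 2048-token default. Names here are illustrative, not MRL internals.

MIN_TOKENS = 1
MAX_TOKENS = 8192
DEFAULT_TOKENS = 2048  # Gemini 1.0 Pro default output limit

def set_max_token(value):
    """Return the effective output-token limit for a requested value."""
    try:
        value = int(value)
    except (TypeError, ValueError):
        return DEFAULT_TOKENS
    if value < MIN_TOKENS or value > MAX_TOKENS:
        return DEFAULT_TOKENS
    return value
```

For example, a request of 1000 is kept as-is, while 0 or 9000 falls back to 2048.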

 

Why Do You Need It?

 

Setting the output token limit is critical for:

Cost Management: Caps output tokens to control API usage costs.

Response Precision: Ensures concise, focused answers.

Performance Optimization: Reduces latency for large responses.

 

How to Use the SetMaxToken Command?

 

Specify a numeric value between 1 and 8192 to set the output token limit.

The Google Gemini API (as of March 19, 2025) supports various models with distinct token capabilities:

Gemini 1.0 Pro: 32,768 input tokens, 2,048 output tokens. Cost: $0.000125/1K input, $0.000375/1K output.

Gemini 1.5 Pro: 2M input tokens (stable at 128K, up to 2M in preview), 8,192 output tokens. Cost: $1.25/1M input (128K), $2.50/1M output.

Gemini 1.5 Flash: 1M input tokens, 8,192 output tokens. Cost: $0.0375/1M input, $0.15/1M output.

Gemini 2.0 Flash: 2M input tokens (experimental), 8,192 output tokens. Cost: TBD, experimental release.

Note: The AIG framework currently limits output to 8192 tokens, matching the API’s maximum output capacity.
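The cost impact of capping output tokens can be estimated directly from the per-token prices in the table above. The sketch below uses the Gemini 1.5 Flash figures ($0.0375 per 1M input tokens, $0.15 per 1M output tokens); prices change over time, so treat the numbers as illustrative:

```python
# Rough per-call cost estimate from the pricing table above.
# Defaults use the Gemini 1.5 Flash rates; pass other rates for
# other models. Capping output tokens bounds the output-side cost.

def estimate_cost_usd(input_tokens, output_tokens,
                      in_price_per_m=0.0375, out_price_per_m=0.15):
    """Return the estimated cost in USD for one API call."""
    return (input_tokens / 1_000_000) * in_price_per_m \
         + (output_tokens / 1_000_000) * out_price_per_m

# With a 1000-token output cap, the output-side cost of a call can
# never exceed estimate_cost_usd(0, 1000).
worst_case_output_cost = estimate_cost_usd(0, 1000)
```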

 

Latest Google Gemini API Updates

As of March 19, 2025, the Gemini API has introduced:

Gemini 2.0 Flash: Experimental model with enhanced multimodal capabilities and 2M token context.

Context Caching: Reduces costs for repeated inputs, available for 1.5 models.

Rate Limits: Increased to 2,000 RPM for 1.5 Flash, 1,000 RPM for 1.5 Pro.
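The RPM figures above imply a minimum spacing between requests for a client that issues calls at a steady rate. A one-line sketch (simple pacing only, not the token-bucket behavior an API gateway may actually use):

```python
# Minimum spacing between requests implied by the RPM limits quoted
# above (2,000 RPM for 1.5 Flash, 1,000 RPM for 1.5 Pro). A client
# that waits at least this long between calls stays under the limit.

def min_interval_seconds(requests_per_minute):
    """Seconds a client should wait between calls at the given RPM cap."""
    return 60.0 / requests_per_minute

flash_interval = min_interval_seconds(2000)  # 0.03 s between calls
pro_interval = min_interval_seconds(1000)    # 0.06 s between calls
```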

 

Example Usage

 

AIG.SetMaxToken|1000

AIG.Ask|Describe the universe

DBP.Response limited to 1000 tokens

 

Sets the output limit to 1000 tokens for subsequent API calls.
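In the Gemini REST API, a limit like this is carried in the `generationConfig.maxOutputTokens` field of the `generateContent` request body. The sketch below shows that payload shape; whether the AIG framework assembles the request exactly this way is an assumption:

```python
# Sketch of how an output-token cap typically reaches the Gemini
# REST API: as generationConfig.maxOutputTokens in the request body.
# The payload shape follows Google's public REST documentation; the
# helper name build_request is illustrative.

def build_request(prompt, max_output_tokens):
    """Build a generateContent-style request body with an output cap."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"maxOutputTokens": max_output_tokens},
    }

payload = build_request("Describe the universe", 1000)
```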

 

Illustration

 

┌───────────────┐
│ Max Tokens    │
├───────────────┤
│ 1000          │
└───────────────┘

Caps the AI response at 1000 output tokens.

 

Syntax

 

AIG.SetMaxToken|P1

AIG.Set_MaxToken|P1

 

Parameter Explanation

 

P1 - The maximum output token value (1-8192). If invalid, defaults to 2048.

 

Example

 

AIG.SetMaxToken|500

AIG.Ask|What is consciousness?

DBP.Response capped at 500 tokens

ENR.

 

Remarks

 

- Applies to output tokens only; input context can reach 2M tokens depending on the model.

- Persists until changed or reset via AIG_Initialize.

 

Limitations

 

- Hard limit of 8192 output tokens aligns with Gemini API constraints.

- Does not affect input token limits, which vary by model.

 

See also:

 

AIG.GetMaxToken

AIG.SetModel

Generation Parameters