06 Context Window

Introduction

  • The context window is the maximum number of tokens the model can consider when generating the next token.
  • It includes all prompts (system prompts and user prompts) as well as the model's subsequent replies, accumulated over the conversation.
  • This matters for few-shot prompting, where the prompt includes examples, and for long conversations.
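Because the context window must hold the entire conversation so far, long chats eventually need trimming. Below is a minimal sketch of one common approach: keep the system prompt and drop the oldest turns until the rest fits a token budget. The function names and the ~4-characters-per-token estimate are illustrative assumptions; a real application would use the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate (~4 characters per token; an assumption)."""
    return max(1, len(text) // 4)

def trim_history(messages, max_tokens):
    """Keep the system prompt plus the most recent turns that fit.

    `messages` is a list of {"role": ..., "content": ...} dicts,
    oldest first.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)
    kept = []
    # Walk backwards so the newest turns are kept first.
    for m in reversed(rest):
        cost = estimate_tokens(m["content"])
        if cost > budget:
            break
        kept.append(m)
        budget -= cost
    return system + list(reversed(kept))

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "First question, long ago. " * 10},
    {"role": "assistant", "content": "An old answer. " * 10},
    {"role": "user", "content": "The latest question."},
]
trimmed = trim_history(history, max_tokens=20)
print([m["role"] for m in trimmed])  # the old turns no longer fit
```

Dropping whole turns from the front (rather than truncating mid-message) keeps each surviving message coherent; production systems often summarize the dropped turns instead of discarding them outright.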

Comparison of Context Windows

As of today, the LLMs below are ordered by context window size, from largest to smallest:

1. Llama 4 Scout - 10,000,000 tokens
2. Llama 4 Maverick - 1,000,000 tokens
3. Gemini 2.5 Flash - 1,000,000 tokens
4. GPT-4.1 nano - 1,000,000 tokens
5. GPT-4.1 mini - 1,000,000 tokens

References

Refer to the LLM Leaderboard, which compares and ranks the most common LLM models.