06 Context Window

Introduction

  • The context window is the maximum number of tokens the model can consider when generating the next token.
  • It includes all prompts (system prompts and user prompts) as well as the model's subsequent replies, accumulated over the conversation.
  • This matters for few-shot prompting, where the prompt includes examples, and for long conversations.
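Because the context window must hold the entire conversation so far, long chats eventually need trimming. Below is a minimal sketch of one common approach: keep the system prompt and drop the oldest turns until the rest fits a token budget. The function names and the ~4-characters-per-token estimate are illustrative assumptions; a real application would use the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate (~4 characters per token; an assumption)."""
    return max(1, len(text) // 4)

def trim_history(messages, max_tokens):
    """Keep the system prompt plus the most recent turns that fit.

    `messages` is a list of {"role": ..., "content": ...} dicts,
    oldest first.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)
    kept = []
    # Walk backwards so the newest turns are kept first.
    for m in reversed(rest):
        cost = estimate_tokens(m["content"])
        if cost > budget:
            break
        kept.append(m)
        budget -= cost
    return system + list(reversed(kept))

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "First question, long ago. " * 10},
    {"role": "assistant", "content": "An old answer. " * 10},
    {"role": "user", "content": "The latest question."},
]
trimmed = trim_history(history, max_tokens=20)
print([m["role"] for m in trimmed])  # the old turns no longer fit
```

Dropping whole turns from the front (rather than truncating mid-message) keeps each surviving message coherent; production systems often summarize the dropped turns instead of discarding them outright.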

Comparison of Context Windows

As of today, the LLMs below are ordered by context window size, from largest to smallest:

1. Llama 4 Scout - 10,000,000 tokens
2. Llama 4 Maverick - 1,000,000 tokens
3. Gemini 2.5 Flash - 1,000,000 tokens
4. GPT-4.1 nano - 1,000,000 tokens
5. GPT-4.1 mini - 1,000,000 tokens

References

Refer to the LLM Leaderboard, which compares and ranks the most common LLM models.