Context Window
The context window is the maximum amount of text (in tokens) that a model can take as input in one request; it limits how much you can send or retain in a conversation.
In Simple Terms
Think of it as the size of the whiteboard the model can see at once; anything beyond that is out of view.
Detailed Explanation
Everything you send (system prompt, conversation history, and current message) counts toward the context window, and the model's reply consumes part of the same budget. When you exceed the limit, you must shorten, summarize, or drop older content. When it matters: long documents, multi-turn chats, and RAG, where the window caps how many retrieved chunks you can include. Common mistakes: sending huge prompts without checking the limit, or forgetting that the model's reply also uses context.
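As a rough sketch of the budgeting described above, here is a minimal check that a request fits the window. It uses an approximate characters-per-token heuristic rather than a real tokenizer (a real tokenizer such as tiktoken gives exact counts), and the function names and the default reply budget are illustrative assumptions, not any particular API.

```python
def approx_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # Real tokenizers give exact counts; this is only an estimate.
    return max(1, len(text) // 4)

def fits_in_context(system_prompt: str, history: list[str], user_message: str,
                    context_limit: int, reply_budget: int = 512) -> bool:
    # Everything sent counts toward the window: system prompt, history,
    # and the current message. The model's reply must also fit, so we
    # reserve a reply budget inside the same limit.
    used = approx_tokens(system_prompt)
    used += sum(approx_tokens(m) for m in history)
    used += approx_tokens(user_message)
    return used + reply_budget <= context_limit
```

When the check fails, a common strategy is to drop or summarize the oldest history entries until the request fits again.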
Related Terms
Transformer
A transformer is a neural network architecture that uses attention to process sequences (e.g., text or tokens) in parallel rather than step-by-step. It underlies most large language models and many vision and multimodal systems.
Attention Mechanism
The attention mechanism lets a model focus on different parts of its input when producing each part of the output. It is the core of transformer architectures and enables handling long sequences and rich context.
Fine-Tuning
Fine-tuning is training a pre-trained model on your own data so it gets better at specific tasks or styles while keeping its general abilities.