A sliding context window maintains only the most recent messages within the context limit. When new messages arrive, older messages are "pushed out" of the window.
Initial State (Window Size = 3):
┌─────────────────────────┐
│ [A] [B] [C] │ ← Context Window
└─────────────────────────┘
After Adding Message D:
┌─────────────────────────┐
│ [B] [C] [D] │ ← A is dropped
└─────────────────────────┘
After Adding Message E:
┌─────────────────────────┐
│ [C] [D] [E] │ ← B is dropped
└─────────────────────────┘
class SlidingWindow:
def __init__(self, window_size=3):
self.window_size = window_size
self.messages = []
def add_message(self, message):
self.messages.append(message)
# Keep only the most recent messages
if len(self.messages) > self.window_size:
self.messages = self.messages[-self.window_size:]
def get_context(self):
return self.messages
Conversation:
User: "I'm working on a Python web app"
Assistant: "Great! What framework are you using?"
User: "Django with PostgreSQL"
Assistant: "Excellent choice! What's your app about?"
User: "It's an inventory management system"
Assistant: "Nice! What specific features do you need?"
Sliding Window (last 3 messages):
┌─────────────────────────────────────────────────┐
│ "What specific features do you need?" │
│ "It's an inventory management system" │
│ "Nice! What specific features do you need?" │
└─────────────────────────────────────────────────┘
Problem: Model no longer knows we're using Django and PostgreSQL!
Original Context:
User: "My name is John, I'm a frontend developer"
Assistant: "Nice to meet you, John!"
User: "Can you help me with React?"
Assistant: "Sure, what React issue are you facing?"
User: "I'm having trouble with hooks"
Assistant: "What specific hook problem?"
After Sliding:
┌─────────────────────────────────────────┐
│ "I'm having trouble with hooks" │
│ "What specific hook problem?" │
│ "Sure, what React issue are you facing?" │
└─────────────────────────────────────────┘
Result: Model doesn't know the user's name or that they're a developer!
The model may lose track of the original conversation topic and provide irrelevant responses.
User: "What's my name?"
Assistant: "I don't know your name."
User: "I told you earlier, I'm John"
Assistant: "Nice to meet you, John!"
User: "Now what's my name?"
Assistant: "I don't know your name." ← Context window slid again!
System prompts and important instructions can be lost:
Initial:
System: "Always respond in JSON format"
User: "Convert this to JSON"
Assistant: {"status": "ok"}
Later (after sliding):
User: "Convert this to JSON"
Assistant: "Sure, here's the conversion:" ← Forgot JSON format!
For brief interactions where context loss isn't critical:
When each request is largely independent:
For live conversations where only recent messages matter:
def priority_sliding(messages, window_size):
# Always keep system messages
system_msgs = [m for m in messages if m['role'] == 'system']
other_msgs = [m for m in messages if m['role'] != 'system']
# Keep recent messages within window size
available_space = window_size - len(system_msgs)
recent_msgs = other_msgs[-available_space:]
return system_msgs + recent_msgs
Instead of purely time-based sliding, keep messages based on relevance:
def semantic_sliding(messages, window_size, current_query):
# Score messages by relevance to current query
scored = [(msg, relevance_score(msg, current_query))
for msg in messages]
# Keep most relevant messages
scored.sort(key=lambda x: x[1], reverse=True)
return [msg for msg, _ in scored[:window_size]]
Combine sliding with other strategies:
Sliding window is simple and efficient but can lead to significant context loss. It's best suited for:
For longer, more complex conversations, consider combining sliding windows with other context management strategies.