Context Engineering

Sliding Window Strategy

How sliding context windows work and their limitations.

Sliding Window Strategy

How Sliding Context Window Works

A sliding context window maintains only the most recent messages within the context limit. When new messages arrive, older messages are "pushed out" of the window.

Basic Mechanism

Initial State (Window Size = 3):
┌─────────────────────────┐
│ [A] [B] [C]            │ ← Context Window
└─────────────────────────┘

After Adding Message D:
┌─────────────────────────┐
│ [B] [C] [D]            │ ← A is dropped
└─────────────────────────┘

After Adding Message E:
┌─────────────────────────┐
│ [C] [D] [E]            │ ← B is dropped
└─────────────────────────┘

Implementation Example

class SlidingWindow:
    def __init__(self, window_size=3):
        self.window_size = window_size
        self.messages = []
    
    def add_message(self, message):
        self.messages.append(message)
        
        # Keep only the most recent messages
        if len(self.messages) > self.window_size:
            self.messages = self.messages[-self.window_size:]
    
    def get_context(self):
        return self.messages

Real-world Scenario

Conversation:
User: "I'm working on a Python web app"
Assistant: "Great! What framework are you using?"
User: "Django with PostgreSQL"
Assistant: "Excellent choice! What's your app about?"
User: "It's an inventory management system"
Assistant: "Nice! What specific features do you need?"

Sliding Window (last 3 messages):
┌─────────────────────────────────────────────────┐
│ "What specific features do you need?"           │
│ "It's an inventory management system"           │
│ "Nice! What specific features do you need?"     │
└─────────────────────────────────────────────────┘

Problem: Model no longer knows we're using Django and PostgreSQL!

Problems with Sliding Window

1. Loss of Critical Context

Original Context:
User: "My name is John, I'm a frontend developer"
Assistant: "Nice to meet you, John!"
User: "Can you help me with React?"
Assistant: "Sure, what React issue are you facing?"
User: "I'm having trouble with hooks"
Assistant: "What specific hook problem?"

After Sliding:
┌─────────────────────────────────────────┐
│ "I'm having trouble with hooks"         │
│ "What specific hook problem?"           │
│ "Sure, what React issue are you facing?" │
└─────────────────────────────────────────┘

Result: Model doesn't know the user's name or that they're a developer!

2. Topic Drift

The model may lose track of the original conversation topic and provide irrelevant responses.

3. Repetitive Questions

User: "What's my name?"
Assistant: "I don't know your name."
User: "I told you earlier, I'm John"
Assistant: "Nice to meet you, John!"
User: "Now what's my name?"
Assistant: "I don't know your name." ← Context window slid again!

4. Loss of Instructions

System prompts and important instructions can be lost:

Initial:
System: "Always respond in JSON format"
User: "Convert this to JSON"
Assistant: {"status": "ok"}

Later (after sliding):
User: "Convert this to JSON"
Assistant: "Sure, here's the conversion:" ← Forgot JSON format!

When Sliding Window Works Well

1. Short Conversations

For brief interactions where context loss isn't critical:

  • Simple Q&A
  • Code generation
  • Translation tasks

2. Stateless Operations

When each request is largely independent:

  • API calls
  • Data processing
  • Content generation

3. Real-time Chat

For live conversations where only recent messages matter:

  • Customer support
  • Live assistance
  • Quick troubleshooting

Advanced Sliding Window Techniques

1. Priority Sliding

def priority_sliding(messages, window_size):
    # Always keep system messages
    system_msgs = [m for m in messages if m['role'] == 'system']
    other_msgs = [m for m in messages if m['role'] != 'system']
    
    # Keep recent messages within window size
    available_space = window_size - len(system_msgs)
    recent_msgs = other_msgs[-available_space:]
    
    return system_msgs + recent_msgs

2. Semantic Sliding

Instead of purely time-based sliding, keep messages based on relevance:

def semantic_sliding(messages, window_size, current_query):
    # Score messages by relevance to current query
    scored = [(msg, relevance_score(msg, current_query)) 
              for msg in messages]
    
    # Keep most relevant messages
    scored.sort(key=lambda x: x[1], reverse=True)
    return [msg for msg, _ in scored[:window_size]]

3. Hybrid Approach

Combine sliding with other strategies:

  • Keep system messages always
  • Slide user/assistant messages
  • Periodically summarize older content

Summary

Sliding window is simple and efficient but can lead to significant context loss. It's best suited for:

  • Short conversations
  • Stateless operations
  • Situations where recent context is most important

For longer, more complex conversations, consider combining sliding windows with other context management strategies.