Introduction to Server-Sent Events (SSE)

Server-Sent Events (SSE) is the protocol your MCP server uses to stream AI model responses in real time. This guide explains how to consume SSE streams from your MCP server and handle streamed responses on the client.

What is SSE?

SSE is a web standard, defined in the HTML specification, that lets a server push a stream of text events to a client over a single long-lived HTTP connection. When you request a streaming completion from your MCP server, it sends tokens as SSE events as they are generated, so the client can display and process the response incrementally instead of waiting for the full text.

Why MCP Servers Use SSE

Your MCP server uses SSE for streaming AI responses because:

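  1. Real-time Delivery: Tokens are sent as soon as they are generated, not after the full response is complete
  2. Efficient Protocol: One long-lived HTTP response carries the whole stream, avoiding the per-request overhead of polling
  3. Broad Compatibility: Works across browsers, frameworks, and languages
  4. Auto-Reconnection: The browser EventSource API reconnects automatically if the connection drops; clients that read the stream manually (fetch, requests) must reconnect themselves
  5. HTTP-Based: Works with standard web infrastructure and security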

Consuming SSE Streams from Your MCP Server

JavaScript Example

Here's how to connect to your MCP server's SSE stream in a web application:

// Request a streaming completion from your MCP server
async function getStreamingCompletion(prompt) {
  // Send the request to your MCP server
  const response = await fetch('https://your-mcp-server.mcp-cloud.ai/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': 'Bearer your_api_key'
    },
    body: JSON.stringify({
      model: 'claude-3-5-sonnet',
      messages: [{ role: 'user', content: prompt }],
      stream: true  // Request streaming response
    })
  });

  // Fail early on HTTP errors instead of parsing an error body as SSE
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  // Read the response body incrementally. fetch is used here because
  // EventSource cannot send POST requests or custom headers.
  const reader = response.body.getReader();
  const decoder = new TextDecoder('utf-8');
  let buffer = '';

  // Process the stream
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // Decode the chunk and add to buffer
    buffer += decoder.decode(value, { stream: true });

    // SSE events are separated by a blank line; split out the complete ones
    const events = buffer.split('\n\n');
    buffer = events.pop() || '';  // Keep the incomplete event in the buffer

    for (const event of events) {
      if (!event.startsWith('data: ')) continue;
      const data = event.substring(6);

      // [DONE] marks the end of the stream
      if (data === '[DONE]') {
        console.log('Stream complete');
        return;
      }

      try {
        // Parse the JSON data
        const jsonData = JSON.parse(data);

        // Extract the token content (the final chunk has an empty delta)
        const token = jsonData.choices[0]?.delta?.content || '';

        // Do something with the token
        displayToken(token);
      } catch (error) {
        console.error('Error parsing SSE data:', error);
      }
    }
  }
}

// Placeholder: replace with your own rendering logic
function displayToken(token) {
  console.log(token);
}

// Call the function
getStreamingCompletion('Explain quantum computing').catch(console.error);
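The buffering step in the example above matters because network chunks do not line up with event boundaries: a single event can arrive split across two reads. The following is a minimal Python sketch of the same idea, written so it can be run without a server. The class name `SSEBuffer` and the sample chunks are illustrative assumptions, not part of any MCP server API.

```python
class SSEBuffer:
    """Accumulates raw text chunks and returns the data payloads of complete events."""

    def __init__(self):
        self._buffer = ""

    def feed(self, chunk):
        # Append the new chunk, then split on the blank line that ends each event.
        self._buffer += chunk
        # Everything before the last separator is complete; the remainder
        # (possibly empty) stays in the buffer until more data arrives.
        *complete, self._buffer = self._buffer.split("\n\n")
        return [e[len("data: "):] for e in complete if e.startswith("data: ")]


buf = SSEBuffer()
first = buf.feed('data: {"a"')          # event is incomplete, nothing returned yet
second = buf.feed(': 1}\n\ndata: [DO')  # first event completes, second is partial
third = buf.feed('NE]\n\n')             # second event completes
print(first, second, third)
```

Only the incomplete tail is kept between calls, so the buffer stays small no matter how long the stream runs.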

Using EventSource in Browsers

For simpler browser-based implementations, you can use the native EventSource API:

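// Note: EventSource can only send GET requests and cannot set custom headers
// (such as Authorization). This approach requires your MCP server to support
// GET requests for completions, or a proxy that accepts the GET request and
// forwards it as a POST
const evtSource = new EventSource(
  'https://your-mcp-server.mcp-cloud.ai/v1/stream?prompt=Explain+quantum+computing',
  { withCredentials: true } // Include if you need to send cookies for auth
);

evtSource.onmessage = function(event) {
  // Close on [DONE]. Otherwise EventSource treats the end of the stream as a
  // dropped connection, reconnects, and sends the request again.
  if (event.data === '[DONE]') {
    evtSource.close();
    return;
  }

  // The final chunk has an empty delta, so default to an empty string
  const token = JSON.parse(event.data).choices[0]?.delta?.content || '';
  document.getElementById('response').innerText += token;
};

evtSource.onerror = function(error) {
  console.error('EventSource error:', error);
  evtSource.close(); // Also disables automatic reconnection
};

// Close the connection when done
function closeConnection() {
  evtSource.close();
}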

Python Example

Using Python to consume SSE streams from your MCP server:

import json

import requests

def stream_completion(prompt):
    # Set up the request to your MCP server
    url = "https://your-mcp-server.mcp-cloud.ai/v1/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer your_api_key"
    }
    payload = {
        "model": "claude-3-5-sonnet",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True
    }

    # Make the request; the context manager releases the connection
    # even when the loop exits early on [DONE]
    with requests.post(url, headers=headers, json=payload, stream=True) as response:
        # Ensure the request was successful
        response.raise_for_status()

        # Process the streaming response line by line
        for line in response.iter_lines():
            if not line:
                continue  # Blank lines separate SSE events

            line = line.decode('utf-8')
            if not line.startswith('data: '):
                continue

            data_str = line[6:]  # Remove 'data: ' prefix

            # Check for stream end
            if data_str == '[DONE]':
                break

            try:
                # Parse the JSON data
                chunk = json.loads(data_str)
            except json.JSONDecodeError:
                print(f"Error parsing JSON: {data_str}")
                continue

            # Extract token content (absent in the final chunk)
            token = chunk['choices'][0]['delta'].get('content') or ''

            # Do something with the token
            print(token, end='', flush=True)

# Use the function
stream_completion("Explain quantum computing")
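One possible refinement is to separate parsing from transport, so the parsing logic can be exercised without a network connection. The sketch below assumes the same `data: ` line format as the example above; the generator name `iter_tokens` and the `fake_lines` sample are hypothetical and exist only for illustration.

```python
import json


def iter_tokens(lines):
    """Yield token strings from an iterable of raw SSE lines (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        if not line.startswith("data: "):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # this sketch simply ignores malformed chunks
        token = chunk["choices"][0]["delta"].get("content")
        if token:
            yield token


# Stand-in for response.iter_lines(); no server is contacted here.
fake_lines = [
    b'data: {"choices":[{"delta":{"content":"Hi"}}]}',
    b"",
    b'data: {"choices":[{"delta":{"content":" there"}}]}',
    b"",
    b"data: [DONE]",
]
print("".join(iter_tokens(fake_lines)))
```

In real use, the generator would be fed `response.iter_lines()` from a streaming `requests` call in place of `fake_lines`.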

Understanding the MCP Server SSE Format

When your MCP server sends an SSE stream, each message follows this format:

data: {"id":"chatcmpl-123abc","object":"chat.completion.chunk","created":1694268190,"model":"claude-3-5-sonnet","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123abc","object":"chat.completion.chunk","created":1694268190,"model":"claude-3-5-sonnet","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}

data: {"id":"chatcmpl-123abc","object":"chat.completion.chunk","created":1694268190,"model":"claude-3-5-sonnet","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-123abc","object":"chat.completion.chunk","created":1694268190,"model":"claude-3-5-sonnet","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]

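Key elements to note:

  1. data: prefix: Each event is a single line beginning with "data: ", followed by a blank line
  2. delta.content: The new text carried by the chunk; concatenate these values to rebuild the full response
  3. finish_reason: null while tokens are being generated, and "stop" in the final chunk, whose delta is empty
  4. id and model: Identical in every chunk of the same completion
  5. [DONE]: A final, non-JSON sentinel that marks the end of the stream; check for it before parsing JSON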
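To make the format concrete, here is a minimal Python sketch that turns a captured stream of this shape into the full response text and the final finish reason. The function name `extract_text` and the shortened `sample` string are illustrative assumptions; the field names are the ones shown in the sample stream above.

```python
import json


def extract_text(sse_text):
    """Return (text, finish_reason) from a captured SSE stream."""
    parts = []
    finish_reason = None
    # Events are separated by a blank line.
    for event in sse_text.split("\n\n"):
        event = event.strip()
        if not event.startswith("data: "):
            continue
        payload = event[len("data: "):]
        if payload == "[DONE]":
            break  # sentinel is not JSON, so check it before parsing
        choice = json.loads(payload)["choices"][0]
        # The final chunk has an empty delta, so content may be missing.
        parts.append(choice["delta"].get("content") or "")
        finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason


sample = (
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}\n\n'
    'data: {"choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}\n\n'
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}\n\n'
    "data: [DONE]\n\n"
)
print(extract_text(sample))
```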

Client Libraries for SSE Consumption

Several libraries can simplify consuming SSE streams from your MCP server:

JavaScript

Python

Other Languages

Handling Errors and Reconnection

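SSE clients should handle dropped connections and retry with a bounded backoff. The example below closes the failed EventSource, which turns off the browser's built-in reconnection, and retries manually with exponential backoff up to a fixed number of attempts:

// Note: EventSource cannot send custom headers, so an API key cannot be passed
// as an Authorization header here; authenticate with cookies or through a proxy
function setupSSEConnection(url) {
  let retryCount = 0;
  const maxRetries = 5;
  let retryDelay = 1000; // Start with 1 second
  let eventSource = null;
  let retryTimer = null;

  function connect() {
    eventSource = new EventSource(url);

    eventSource.onopen = () => {
      console.log('Connected to SSE stream');
      // Reset the backoff after a successful connection
      retryCount = 0;
      retryDelay = 1000;
    };

    eventSource.onmessage = (event) => {
      // Handle message...
    };

    eventSource.onerror = () => {
      // Close to disable the built-in reconnection, then retry manually
      eventSource.close();

      if (retryCount < maxRetries) {
        retryCount++;
        console.log(`Connection error, reconnecting (${retryCount}/${maxRetries})...`);
        retryTimer = setTimeout(connect, retryDelay);
        retryDelay = Math.min(retryDelay * 2, 30000); // Exponential backoff, max 30 seconds
      } else {
        console.error('Maximum reconnection attempts reached');
      }
    };
  }

  connect();

  // Return a handle that closes the current connection, including one
  // created by a later reconnection attempt, and cancels any pending retry
  return {
    close() {
      clearTimeout(retryTimer);
      eventSource.close();
    }
  };
}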
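Clients that read the stream manually, such as a Python client, get no built-in reconnection and need to implement the retry loop themselves. The sketch below shows one way to do this with the same schedule as above (1 second initial delay, doubling, capped at 30 seconds, 5 retries). The names `backoff_delays` and `connect_with_retry` are hypothetical, and the code catches Python's built-in `ConnectionError`; substitute the exception type your HTTP library actually raises.

```python
import time


def backoff_delays(initial=1.0, factor=2.0, cap=30.0, max_retries=5):
    """Yield the delay in seconds to wait before each retry attempt."""
    delay = initial
    for _ in range(max_retries):
        yield delay
        delay = min(delay * factor, cap)


def connect_with_retry(connect, sleep=time.sleep, **backoff):
    """Call connect() until it succeeds or the retries are exhausted."""
    try:
        return connect()
    except ConnectionError as error:
        last_error = error
    for delay in backoff_delays(**backoff):
        sleep(delay)  # injectable so the schedule can be tested without waiting
        try:
            return connect()
        except ConnectionError as error:
            last_error = error
    raise last_error


print(list(backoff_delays()))
```

Passing `sleep` as a parameter is a design choice that keeps the retry logic testable; production code can rely on the default `time.sleep`.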

Performance Considerations

For optimal performance when consuming SSE streams from your MCP server:

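  1. Buffer Management: Keep only the incomplete trailing event in the parse buffer and release processed tokens, so memory does not grow with response length
  2. UI Updates: Batch DOM updates (for example, once per animation frame) instead of updating once per token, to prevent rendering bottlenecks
  3. Connection Pooling: Reuse HTTP connections (keep-alive) when making multiple requests
  4. Error Handling: Cap retries and back off between attempts so a failing server is not flooded with reconnections
  5. Timeout Settings: Configure connect and read timeouts that tolerate pauses between tokens in long-running streams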
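As an illustration of the batching advice, the following minimal Python sketch groups tokens and hands them to a display callback in batches instead of one at a time. The class name `TokenBatcher`, the `on_flush` callback, and the batch size are assumptions chosen for the example; a real UI might flush on a timer or animation frame instead of a token count.

```python
class TokenBatcher:
    """Collects tokens and passes them to a callback in batches."""

    def __init__(self, on_flush, max_tokens=8):
        self._on_flush = on_flush
        self._max_tokens = max_tokens
        self._pending = []

    def add(self, token):
        self._pending.append(token)
        if len(self._pending) >= self._max_tokens:
            self.flush()

    def flush(self):
        """Emit whatever is pending; call once more when the stream ends."""
        if self._pending:
            self._on_flush("".join(self._pending))
            self._pending = []


updates = []
batcher = TokenBatcher(updates.append, max_tokens=3)
for token in ["Hel", "lo", ",", " wor", "ld"]:
    batcher.add(token)
batcher.flush()  # emit the final partial batch
print(updates)
```

Here five tokens result in two display updates, and the pending list is cleared after each flush so it never grows beyond the batch size.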