Introduction to Server-Sent Events (SSE)
Server-Sent Events (SSE) is the protocol your MCP server uses to stream AI model responses in real time. This guide explains how to consume SSE streams from your MCP server and how to handle streaming AI responses on the client side.
What is SSE?
SSE (Server-Sent Events) is a standardized technology that enables servers to push updates to clients over a single HTTP connection. When you request a completion from your MCP server, it uses SSE to deliver tokens as they're generated, allowing for real-time display and processing of AI responses.
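To make the wire format concrete, the following minimal sketch parses a single SSE event into its fields. It assumes the standard field names from the SSE specification (data, event, id, retry) and treats lines starting with a colon as comments. The function name parse_sse_event and the sample event are illustrative, not part of your MCP server's API.

```python
import re


def parse_sse_event(raw_event: str) -> dict:
    """Parse one SSE event (the text between two blank lines) into its fields."""
    fields = {"event": "message", "data": [], "id": None, "retry": None}
    # SSE lines may end in CRLF, CR, or LF
    for line in re.split(r"\r\n|\r|\n", raw_event):
        if not line or line.startswith(":"):
            continue  # blank lines and ":" comment lines carry no field
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if name == "data":
            fields["data"].append(value)
        elif name in ("event", "id"):
            fields[name] = value
        elif name == "retry" and value.isdigit():
            fields["retry"] = int(value)
    # Multiple data lines in one event are joined with newlines
    fields["data"] = "\n".join(fields["data"])
    return fields


# Hypothetical sample event, for illustration only
sample = ": keep-alive comment\nevent: message\nid: 42\ndata: first line\ndata: second line"
parsed = parse_sse_event(sample)
print(parsed)
```

In the streams described in this guide, each event carries a single data field, so most clients only need the data branch.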
Why MCP Servers Use SSE
Your MCP server uses SSE for streaming AI responses because:
- Real-time Delivery: Tokens appear immediately as they're generated
- Efficient Protocol: One long-lived HTTP response carries every token, avoiding the per-request overhead of polling
- Broad Compatibility: Works across browsers, frameworks, and languages
- Auto-Reconnection: The browser EventSource API reconnects automatically if the connection drops
- HTTP-Based: Works with standard web infrastructure and security
Consuming SSE Streams from Your MCP Server
JavaScript Example
Here's how to connect to your MCP server's SSE stream in a web application:
// Request a streaming completion from your MCP server
async function getStreamingCompletion(prompt) {
  // Send the request to your MCP server
  const response = await fetch('https://your-mcp-server.mcp-cloud.ai/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': 'Bearer your_api_key'
    },
    body: JSON.stringify({
      model: 'claude-3-5-sonnet',
      messages: [{ role: 'user', content: prompt }],
      stream: true // Request streaming response
    })
  });

  // Fail early on HTTP errors instead of parsing an error body as SSE
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  // Read the response body incrementally (EventSource cannot send POST requests)
  const reader = response.body.getReader();
  const decoder = new TextDecoder('utf-8');
  let buffer = '';

  // Process the stream
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // Decode the chunk and add to buffer
    buffer += decoder.decode(value, { stream: true });

    // SSE events are separated by a blank line
    const events = buffer.split('\n\n');
    buffer = events.pop() || ''; // Keep the incomplete event in the buffer

    for (const event of events) {
      if (!event.startsWith('data: ')) continue;
      const data = event.substring(6);

      // Stop reading once the [DONE] message arrives
      if (data === '[DONE]') {
        console.log('Stream complete');
        return;
      }

      try {
        // Parse the JSON data
        const jsonData = JSON.parse(data);
        // Extract the token content
        const token = jsonData.choices?.[0]?.delta?.content || '';
        // Do something with the token
        displayToken(token);
      } catch (error) {
        console.error('Error parsing SSE data:', error);
      }
    }
  }
}

// Placeholder handler: replace with your own rendering logic
function displayToken(token) {
  console.log(token);
}

// Call the function
getStreamingCompletion('Explain quantum computing').catch(console.error);
Using EventSource in Browsers
For simpler browser-based implementations, you can use the native EventSource API:
// Note: EventSource only issues GET requests and cannot set custom headers,
// so your MCP server must support GET requests for completions, or you'll
// need a proxy that translates the GET request into the POST your server expects
const evtSource = new EventSource(
  'https://your-mcp-server.mcp-cloud.ai/v1/stream?prompt=Explain+quantum+computing',
  { withCredentials: true } // Include if you need to send cookies for auth
);

evtSource.onmessage = function(event) {
  // The [DONE] message is not JSON, so handle it before parsing
  if (event.data === '[DONE]') {
    evtSource.close();
    return;
  }
  // The final chunk has an empty delta, so fall back to an empty string
  const token = JSON.parse(event.data).choices?.[0]?.delta?.content || '';
  document.getElementById('response').innerText += token;
};

evtSource.onerror = function(error) {
  console.error('EventSource error:', error);
  evtSource.close();
};

// Close the connection when done
function closeConnection() {
  evtSource.close();
}
Python Example
Using Python to consume SSE streams from your MCP server:
import json

import requests


def stream_completion(prompt):
    # Set up the request to your MCP server
    url = "https://your-mcp-server.mcp-cloud.ai/v1/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer your_api_key"
    }
    payload = {
        "model": "claude-3-5-sonnet",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True
    }

    # Make the request; the context manager releases the connection when done
    with requests.post(url, headers=headers, json=payload, stream=True) as response:
        # Ensure the request was successful
        response.raise_for_status()

        # Process the streaming response
        for line in response.iter_lines():
            if not line:
                continue  # Skip the blank lines that separate events
            line = line.decode('utf-8')
            # Lines start with "data: "
            if not line.startswith('data: '):
                continue
            data_str = line[6:]  # Remove 'data: ' prefix

            # Check for stream end
            if data_str == '[DONE]':
                break

            try:
                # Parse the JSON data
                chunk = json.loads(data_str)
            except json.JSONDecodeError:
                print(f"Error parsing JSON: {data_str}")
                continue

            # Extract token content (the final chunk has an empty delta)
            token = chunk['choices'][0]['delta'].get('content') or ''
            # Do something with the token
            print(token, end='', flush=True)


# Use the function
stream_completion("Explain quantum computing")
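If you read raw byte chunks instead of lines, an event (or even a multi-byte character) can be split across two chunks. The sketch below mirrors the buffering idea from the JavaScript example: it keeps the incomplete tail and releases only complete events. The class name SSEBuffer and the sample chunks are hypothetical, chosen for illustration.

```python
import codecs
from typing import List


class SSEBuffer:
    """Accumulates byte chunks and releases only complete SSE events."""

    def __init__(self) -> None:
        # An incremental decoder holds back partial multi-byte characters
        self._decoder = codecs.getincrementaldecoder("utf-8")()
        self._pending = ""

    def feed(self, chunk: bytes) -> List[str]:
        """Add a chunk; return the events that this chunk completed."""
        text = self._decoder.decode(chunk)
        # Normalize CRLF so a blank line is always "\n\n"
        self._pending = (self._pending + text).replace("\r\n", "\n")
        # Everything before the last separator is complete; the rest waits
        *complete, self._pending = self._pending.split("\n\n")
        return [event for event in complete if event]


# Hypothetical chunks that split both an event and a UTF-8 character
buf = SSEBuffer()
chunks = [b'data: {"a"', b': 1}\n\ndata: caf\xc3', b'\xa9\n\ndata: [DO', b'NE]\n\n']
events = []
for chunk in chunks:
    events.extend(buf.feed(chunk))
print(events)
```

With the requests library, chunks like these could come from response.iter_content(); iter_lines() already performs the line splitting for you, which is why the example above does not need a buffer.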
Understanding the MCP Server SSE Format
When your MCP server sends an SSE stream, each message follows this format:
data: {"id":"chatcmpl-123abc","object":"chat.completion.chunk","created":1694268190,"model":"claude-3-5-sonnet","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-123abc","object":"chat.completion.chunk","created":1694268190,"model":"claude-3-5-sonnet","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}
data: {"id":"chatcmpl-123abc","object":"chat.completion.chunk","created":1694268190,"model":"claude-3-5-sonnet","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}
data: {"id":"chatcmpl-123abc","object":"chat.completion.chunk","created":1694268190,"model":"claude-3-5-sonnet","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
Key elements to note:
- Each chunk starts with data:
- The JSON contains a delta object with the new token content
- The final chunk includes a finish_reason
- The stream ends with data: [DONE]
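As a rough illustration of how these elements fit together, the sketch below joins the delta contents of a stream into the full response text and records the finish_reason. The helper name collect_completion is an assumption for this example, and the sample lines are shortened versions of the chunks shown above.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_completion(data_lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Join delta contents until [DONE]; return (text, finish_reason)."""
    parts = []
    finish_reason = None
    for line in data_lines:
        if not line.startswith("data:"):
            continue  # ignore blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # The final chunk has an empty delta, so content may be missing
            parts.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason


# Shortened copies of the sample chunks (id, object, created, model omitted)
stream = [
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]
text, reason = collect_completion(stream)
print(text, reason)
```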
Client Libraries for SSE Consumption
Several libraries can simplify consuming SSE streams from your MCP server:
JavaScript
Python
Other Languages
- Java: okhttp-eventsource
- C#: LaunchDarkly.EventSource
- Ruby: sse-client-ruby
Handling Errors and Reconnection
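SSE clients should handle dropped connections explicitly. The following example closes the failed connection and retries with exponential backoff, up to a maximum number of attempts: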
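function setupSSEConnection(url, apiKey) {
  // Note: the native EventSource API cannot send custom headers, so apiKey is
  // not used here; authenticate with cookies (withCredentials) or another
  // mechanism your server supports
  let retryCount = 0;
  const maxRetries = 5;
  let retryDelay = 1000; // Start with 1 second

  function connect() {
    const eventSource = new EventSource(url);

    eventSource.onopen = () => {
      console.log('Connected to SSE stream');
      // Reset the backoff state after a successful connection
      retryCount = 0;
      retryDelay = 1000;
    };

    eventSource.onmessage = (event) => {
      // Handle message...
    };

    eventSource.onerror = (error) => {
      // Close explicitly so the browser's built-in retry does not compete with ours
      eventSource.close();
      if (retryCount < maxRetries) {
        retryCount++;
        console.log(`Connection error, reconnecting (${retryCount}/${maxRetries})...`);
        setTimeout(connect, retryDelay);
        retryDelay = Math.min(retryDelay * 2, 30000); // Exponential backoff, max 30 seconds
      } else {
        console.error('Maximum reconnection attempts reached');
      }
    };

    return eventSource;
  }

  // Returns the initial connection; each reconnect creates a new EventSource
  return connect();
}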
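To see what the backoff schedule above works out to, this small sketch computes the delay before each retry. The function name backoff_delays is hypothetical; the default values copy the constants used in the JavaScript example (5 retries, 1 second initial delay, 30 second cap).

```python
from typing import List


def backoff_delays(max_retries: int = 5, initial_ms: int = 1000, cap_ms: int = 30000) -> List[int]:
    """Return the wait in milliseconds before each reconnection attempt."""
    delays = []
    delay = initial_ms
    for _ in range(max_retries):
        delays.append(delay)
        delay = min(delay * 2, cap_ms)  # double each time, never above the cap
    return delays


print(backoff_delays())               # [1000, 2000, 4000, 8000, 16000]
print(backoff_delays(max_retries=7))  # the cap applies from the sixth retry on
```

With the default of five retries the cap is never reached; it only matters if you raise maxRetries.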
Performance Considerations
For optimal performance when consuming SSE streams from your MCP server:
- Buffer Management: Process and discard parsed events promptly so the buffer holds only the incomplete tail of the stream
- UI Updates: Batch DOM updates to prevent rendering bottlenecks
- Connection Pooling: Reuse connections when making multiple requests
- Error Handling: Implement robust error recovery and reconnection logic
- Timeout Settings: Configure appropriate timeouts for long-running streams
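One way to approach the batching point is to collect tokens and hand them to the display layer in groups rather than one at a time. The sketch below is a minimal illustration under assumed thresholds; the class name TokenBatcher, the sink callback, and the default limits are all hypothetical and should be tuned for your application.

```python
import time
from typing import Callable, List


class TokenBatcher:
    """Collects tokens and passes them to `sink` in groups."""

    def __init__(self, sink: Callable[[str], None], max_tokens: int = 8,
                 max_wait_s: float = 0.05, clock: Callable[[], float] = time.monotonic) -> None:
        self._sink = sink
        self._max_tokens = max_tokens
        self._max_wait_s = max_wait_s
        self._clock = clock
        self._tokens: List[str] = []
        self._last_flush = clock()

    def add(self, token: str) -> None:
        """Queue a token; flush when the batch is full or overdue."""
        self._tokens.append(token)
        overdue = self._clock() - self._last_flush >= self._max_wait_s
        if len(self._tokens) >= self._max_tokens or overdue:
            self.flush()

    def flush(self) -> None:
        """Send any queued tokens to the sink as one string."""
        if self._tokens:
            self._sink("".join(self._tokens))
            self._tokens.clear()
        self._last_flush = self._clock()


# Usage: a list stands in for the UI; a long max_wait_s keeps the demo deterministic
rendered: List[str] = []
batcher = TokenBatcher(rendered.append, max_tokens=3, max_wait_s=60.0)
for token in ["Hel", "lo", " wor", "ld", "!"]:
    batcher.add(token)
batcher.flush()  # flush the remainder when the stream ends
print(rendered)
```

Calling flush() once the stream ends matters: otherwise the last partial batch would never reach the display.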