Monitoring MCP Servers
This guide explains how to monitor the performance and health of your MCP servers using MCP-Cloud's monitoring features.
Overview
MCP-Cloud provides comprehensive monitoring capabilities for your MCP servers, including:
- Real-time metrics visualization
- Log streaming and analysis
- Alerting and notifications
- Historical performance data
- Usage analytics
Metrics Dashboard
The metrics dashboard provides a real-time overview of your server's performance.
Key Metrics
| Metric | Description | Typical Values | Alert Threshold |
|---|---|---|---|
| CPU Usage | Percentage of CPU resources used | 10-60% | >80% |
| Memory Usage | Memory consumption in MB | Varies by model | >90% of allocated |
| Request Rate | Requests per minute | Varies by usage | N/A |
| Response Time | Average response time in ms | 200-2000 ms | >5000 ms |
| Error Rate | Percentage of failed requests | <1% | >5% |
| Token Throughput | Tokens processed per second | Varies by model | N/A |
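The alert thresholds in the table can be checked against a single metrics sample, as in the minimal sketch below. The field names (`cpuPercent`, `memoryMb`, `responseTimeMs`, `errorRatePercent`) and the `findBreaches` helper are hypothetical and chosen for illustration; they are not MCP-Cloud's actual payload schema.

```javascript
// Alert thresholds taken from the Key Metrics table.
// Field names are hypothetical, not MCP-Cloud's actual schema.
function findBreaches(sample, allocatedMemoryMb) {
  const breaches = [];
  if (sample.cpuPercent > 80) breaches.push('cpu');
  if (sample.memoryMb > 0.9 * allocatedMemoryMb) breaches.push('memory');
  if (sample.responseTimeMs > 5000) breaches.push('responseTime');
  if (sample.errorRatePercent > 5) breaches.push('errorRate');
  return breaches;
}

const sample = {
  cpuPercent: 85,
  memoryMb: 1500,
  responseTimeMs: 900,
  errorRatePercent: 0.4,
};
console.log(findBreaches(sample, 2048)); // [ 'cpu' ]
```

Request rate and token throughput have no alert threshold in the table, so the sketch leaves them out.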
Real-time Logs
MCP-Cloud provides real-time log streaming via Server-Sent Events (SSE), allowing you to monitor your server's operations as they happen.
Accessing Logs
- Navigate to your server's dashboard
- Select the "Logs" tab
- View real-time logs or search historical logs
Log Levels
Logs are categorized by severity:
- DEBUG: Detailed information for debugging
- INFO: General operational information
- WARN: Warnings that don't affect functionality
- ERROR: Errors that affect functionality
- FATAL: Critical errors that cause service failure
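Because the levels are ordered by severity, a "minimum level" check reduces to comparing positions in that order. The sketch below illustrates the idea; `LEVELS` and `isAtLeast` are hypothetical names, not part of any MCP-Cloud SDK.

```javascript
// Log levels from least to most severe, as listed above.
const LEVELS = ['DEBUG', 'INFO', 'WARN', 'ERROR', 'FATAL'];

// Returns true when `level` is at least as severe as `minimum`.
function isAtLeast(level, minimum) {
  const levelIndex = LEVELS.indexOf(level);
  const minimumIndex = LEVELS.indexOf(minimum);
  if (levelIndex === -1 || minimumIndex === -1) {
    throw new Error('Unknown log level');
  }
  return levelIndex >= minimumIndex;
}

console.log(isAtLeast('ERROR', 'WARN')); // true
console.log(isAtLeast('INFO', 'WARN'));  // false
```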
Log Filtering
You can filter logs by:
- Time range
- Log level
- Message content
- Request ID
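If you export or stream logs and want to apply the same filters in your own tooling, the logic might look like the sketch below. The log entry shape (`timestamp`, `level`, `message`, `requestId`) and the `filterLogs` helper are assumptions made for illustration; the actual log format may differ.

```javascript
// Hypothetical entry shape: { timestamp (ISO 8601), level, message, requestId }.
// Every filter is optional; an omitted filter matches all entries.
function filterLogs(entries, { from, to, level, contains, requestId } = {}) {
  return entries.filter((entry) => {
    const time = Date.parse(entry.timestamp);
    if (from !== undefined && time < Date.parse(from)) return false;
    if (to !== undefined && time > Date.parse(to)) return false;
    if (level !== undefined && entry.level !== level) return false;
    if (contains !== undefined && !entry.message.includes(contains)) return false;
    if (requestId !== undefined && entry.requestId !== requestId) return false;
    return true;
  });
}

const logs = [
  { timestamp: '2024-01-01T10:00:00Z', level: 'INFO', message: 'Server started', requestId: 'req-1' },
  { timestamp: '2024-01-01T10:05:00Z', level: 'ERROR', message: 'Upstream timeout', requestId: 'req-2' },
  { timestamp: '2024-01-01T11:00:00Z', level: 'ERROR', message: 'Upstream timeout', requestId: 'req-3' },
];

const matches = filterLogs(logs, { level: 'ERROR', to: '2024-01-01T10:30:00Z' });
console.log(matches.map((entry) => entry.requestId)); // [ 'req-2' ]
```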
SSE Metrics Stream
For programmatic monitoring, MCP-Cloud provides a metrics stream via SSE:
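// Note: the standard browser EventSource constructor does not accept a
// `headers` option. This example assumes an implementation that does, such as
// version 2.x of the `eventsource` npm package for Node.js.
const eventSource = new EventSource(
  'https://api.mcp-cloud.ai/api/servers/123/metrics/stream',
  {
    headers: {
      'Authorization': 'Bearer YOUR_API_TOKEN'
    }
  }
);

// Handle each "metrics" event pushed by the server.
eventSource.addEventListener('metrics', (event) => {
  const metrics = JSON.parse(event.data);
  console.log('Current CPU usage:', metrics.cpu, '%');
  console.log('Current memory usage:', metrics.memory, 'MB');
  console.log('Requests in last minute:', metrics.requestRate);
});

// Log connection problems; EventSource normally retries on its own.
eventSource.onerror = (error) => {
  console.error('Metrics stream error:', error);
};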
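If you read the stream without an EventSource implementation (for example with a plain HTTP client), you need to parse the SSE wire format yourself. The sketch below handles the `event:` and `data:` fields and comment lines, and assumes the input contains only complete events. `parseSseEvents` is a hypothetical helper; the `cpu`, `memory`, and `requestRate` fields follow the example above.

```javascript
// Parses SSE text into { event, data } objects.
// Simplified: `id:` and `retry:` fields are ignored, and the input is
// assumed to end on an event boundary (a blank line).
function parseSseEvents(text) {
  const events = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = 'message'; // default event type in the SSE format
    const dataLines = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === '' || line.startsWith(':')) continue; // blank or comment
      const colon = line.indexOf(':');
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? '' : line.slice(colon + 1);
      if (value.startsWith(' ')) value = value.slice(1); // strip one leading space
      if (field === 'event') event = value;
      if (field === 'data') dataLines.push(value);
    }
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join('\n') });
    }
  }
  return events;
}

const raw = 'event: metrics\ndata: {"cpu":42,"memory":512,"requestRate":120}\n\n';
for (const { event, data } of parseSseEvents(raw)) {
  if (event === 'metrics') console.log(JSON.parse(data).cpu); // 42
}
```

A production client would also buffer partial events across network chunks, which this sketch omits.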
Setting Up Alerts
MCP-Cloud allows you to configure alerts based on metrics thresholds:
- Navigate to your server's dashboard
- Select the "Alerts" tab
- Click "Create Alert"
- Configure alert conditions:
  - Metric: Select the metric to monitor
  - Threshold: Set the value that triggers the alert
  - Duration: Set how long the condition must persist before the alert fires
  - Channel: Choose email, SMS, webhook, Slack, or PagerDuty notification
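The Duration setting means a single spike should not trigger an alert; the metric has to stay past the threshold for the whole window. One way to express that rule is sketched below. `shouldFire` and the sample shape are hypothetical, and MCP-Cloud's actual evaluation logic may differ.

```javascript
// Returns true when every sample in the trailing `durationMs` window exceeds
// `threshold`, and the samples cover that whole window.
// Samples are { time (ms since epoch), value }, sorted by time ascending.
function shouldFire(samples, threshold, durationMs) {
  if (samples.length === 0) return false;
  const latest = samples[samples.length - 1].time;
  const windowStart = latest - durationMs;
  if (samples[0].time > windowStart) return false; // not enough history yet
  return samples
    .filter((s) => s.time >= windowStart)
    .every((s) => s.value > threshold);
}

const MINUTE = 60000;
const cpu = [
  { time: 0 * MINUTE, value: 70 },
  { time: 1 * MINUTE, value: 85 },
  { time: 2 * MINUTE, value: 90 },
  { time: 3 * MINUTE, value: 88 },
];

console.log(shouldFire(cpu, 80, 2 * MINUTE)); // true
console.log(shouldFire(cpu, 80, 3 * MINUTE)); // false
```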
Alert Channels
MCP-Cloud supports multiple notification channels:
- Email notifications
- SMS alerts
- Webhook integration
- Slack integration
- PagerDuty integration
Historical Analytics
MCP-Cloud retains performance data for historical analysis:
- 1-minute resolution for the last 24 hours
- 1-hour resolution for the last 30 days
- 1-day resolution for the last year
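When querying historical data, the finest available resolution depends on how old the data is. The sketch below maps a data point's age to the tiers listed above; `resolutionForAge` is a hypothetical helper, and a year is approximated as 365 days.

```javascript
const HOUR = 60 * 60 * 1000;
const DAY = 24 * HOUR;

// Maps the age of a data point (in ms) to the finest retained resolution.
function resolutionForAge(ageMs) {
  if (ageMs < 0) throw new RangeError('age must be non-negative');
  if (ageMs <= 24 * HOUR) return '1m';
  if (ageMs <= 30 * DAY) return '1h';
  if (ageMs <= 365 * DAY) return '1d';
  return null; // older than the retention window
}

console.log(resolutionForAge(2 * HOUR));  // 1m
console.log(resolutionForAge(7 * DAY));   // 1h
console.log(resolutionForAge(90 * DAY));  // 1d
console.log(resolutionForAge(400 * DAY)); // null
```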
Access historical data through:
- The web dashboard
- CSV export
- API access
Resource Optimization
Based on monitoring data, MCP-Cloud can recommend resource optimizations:
- Memory allocation adjustments
- CPU allocation adjustments
- Scaling configuration changes
- Region recommendations
Integration with External Monitoring
MCP-Cloud can integrate with external monitoring systems:
- Prometheus metrics export
- Grafana dashboards
- Datadog integration
- New Relic integration
Example Prometheus configuration:
scrape_configs:
  - job_name: 'mcp-server'
    scrape_interval: 15s
    metrics_path: '/metrics'
    scheme: https
    basic_auth:
      username: 'prometheus'
      password: 'YOUR_API_TOKEN'
    static_configs:
      - targets: ['your-server-id.metrics.mcp-cloud.ai']
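To sanity-check what the metrics endpoint returns before pointing Prometheus at it, you can parse the response body yourself. The sketch below handles a simplified subset of the Prometheus text exposition format (no escaped characters in label values). `parseExposition` is a hypothetical helper, and the metric names in the sample are invented for illustration, not names MCP-Cloud is known to export.

```javascript
// Parses Prometheus text exposition lines into { name, labels, value }.
// Simplified: label values containing "}" are not supported, and special
// values such as "+Inf" are not converted.
function parseExposition(text) {
  const samples = [];
  const pattern = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)/;
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '' || trimmed.startsWith('#')) continue; // HELP/TYPE/comments
    const match = pattern.exec(trimmed);
    if (match) {
      samples.push({
        name: match[1],
        labels: match[2] || '',
        value: Number(match[3]),
      });
    }
  }
  return samples;
}

// Hypothetical metric names, for illustration only.
const body = [
  '# HELP mcp_cpu_usage_percent Hypothetical metric name.',
  '# TYPE mcp_cpu_usage_percent gauge',
  'mcp_cpu_usage_percent 42.5',
  'mcp_requests_total{status="200"} 1027',
].join('\n');

console.log(parseExposition(body).map((s) => `${s.name}=${s.value}`));
// [ 'mcp_cpu_usage_percent=42.5', 'mcp_requests_total=1027' ]
```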
Best Practices
- Set up alerts for critical metrics: CPU, memory, error rate
- Monitor response times: Identify performance degradation early
- Review logs regularly: Look for warning patterns before they become errors
- Track usage patterns: Understand peak usage times
- Correlate metrics with business events: Identify impact of marketing campaigns, etc.
- Test alert configurations: Ensure notifications work as expected
Troubleshooting
If monitoring data appears incorrect:
- Check the server status to ensure it's running
- Verify that no proxy or firewall is blocking metrics traffic
- Ensure your API token has the necessary permissions
- Try accessing raw metrics via the API to bypass dashboard issues
- Contact support if issues persist