Analytics

System Health

Monitor Keva's operational status and performance

The System Health dashboard shows the operational status of Keva's infrastructure. Monitor uptime, API performance, and service availability in real time.

Health Overview

The health dashboard displays:

  • Overall Status - System-wide health indicator
  • Service Status - Individual component health
  • Performance Metrics - Response times and throughput
  • Recent Incidents - Past issues and resolutions

Status Indicators

Status          Meaning
Operational     All systems normal
Degraded        Partial issues, reduced performance
Partial Outage  Some features unavailable
Major Outage    Critical services down

Service Components

Monitor each service independently:

Core Services

Service   Description
Web App   Main application interface
API       REST API endpoints
Worker    Background job processing
Database  PostgreSQL cluster

AI Services

Service     Description
AI Engine   Claude/Anthropic API
Embeddings  Vector search service
Brain       Learning and memory

Integrations

Service     Description
Email       IMAP/SMTP connections
Connectors  Platform integrations
Webhooks    Inbound events

Performance Metrics

API Response Times

Track latency across endpoints:

  • p50 - Median response time
  • p95 - 95th percentile
  • p99 - 99th percentile (near-worst-case latency)

Target: p95 under 200ms
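
To make the percentile figures concrete, here is a minimal sketch of how p50, p95, and p99 could be derived from a batch of response-time samples and compared against the 200ms target. It uses the nearest-rank method, which is an assumption: the dashboard may use a different interpolation. The function names (`percentile`, `latency_summary`) are hypothetical and not part of Keva.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_ms)
    # pct is expected to be an integer from 1 to 100.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(samples_ms, p95_target_ms=200):
    """Summarize latency samples and check p95 against the target (200ms by default)."""
    summary = {
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
    }
    summary["meets_target"] = summary["p95"] < p95_target_ms
    return summary


if __name__ == "__main__":
    # Illustrative data only: 100 requests taking 1ms, 2ms, ..., 100ms.
    print(latency_summary(list(range(1, 101))))
```

Because p99 ignores the slowest 1% of requests, a single very slow request can go unnoticed in these figures, so p99 is best read as "near worst case" rather than the maximum.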

Throughput

Requests processed per minute:

  • Current rate
  • Peak rate (24h)
  • Rate limit remaining

Error Rate

Percentage of failed requests:

  • 4xx errors (client issues)
  • 5xx errors (server issues)

Target: < 0.1%
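
As a rough illustration of the error-rate figure, the sketch below computes the 4xx, 5xx, and combined percentages from a list of HTTP status codes. It assumes the 0.1% target applies to 4xx and 5xx responses combined; the page does not say exactly which failures count toward the target. The function name `error_rates` is hypothetical.

```python
def error_rates(status_codes, target_pct=0.1):
    """Return failed-request percentages, split into client (4xx) and server (5xx) errors."""
    total = len(status_codes)
    if total == 0:
        return {"4xx": 0.0, "5xx": 0.0, "total": 0.0, "within_target": True}
    client = sum(1 for code in status_codes if 400 <= code < 500)
    server = sum(1 for code in status_codes if 500 <= code < 600)
    combined = 100 * (client + server) / total
    return {
        "4xx": 100 * client / total,
        "5xx": 100 * server / total,
        "total": combined,
        # Assumption: the target is a strict upper bound on the combined rate.
        "within_target": combined < target_pct,
    }


if __name__ == "__main__":
    # Illustrative data only: 1000 requests, two 404s and one 500.
    codes = [200] * 997 + [404] * 2 + [500]
    print(error_rates(codes))
```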

Uptime Tracking

View historical availability:

Last 24 hours:  ████████████████████████  100%
Last 7 days:    ███████████████████████░   99.8%
Last 30 days:   ███████████████████████░   99.9%
Last 90 days:   ███████████████████████░   99.95%

Click any period for a detailed breakdown.
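
To relate an uptime percentage to actual downtime, the arithmetic can be sketched as follows. The helper names are hypothetical, and the calculation assumes uptime is measured as the fraction of minutes in the period with no downtime, which may differ from how the dashboard counts partial degradation.

```python
def uptime_percent(downtime_minutes, period_days):
    """Uptime as a percentage of all minutes in the period."""
    total_minutes = period_days * 24 * 60
    return 100 * (1 - downtime_minutes / total_minutes)


def allowed_downtime_minutes(uptime_pct, period_days):
    """Downtime budget, in minutes, implied by an uptime percentage over the period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_pct / 100)


if __name__ == "__main__":
    # 99.9% over 30 days leaves roughly 43 minutes of downtime.
    print(round(allowed_downtime_minutes(99.9, 30), 1))
    # 22 minutes of downtime in 30 days (illustrative figure).
    print(round(uptime_percent(22, 30), 3))
```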

Incident History

Review past issues:

Date    Duration  Impact    Root Cause
Mar 20  5 min     Degraded  Database failover
Mar 15  2 min     None      Deployment
Mar 10  15 min    Partial   Third-party API

Click an incident for the full post-mortem.

Alerts Configuration

Set up health alerts:

Threshold Alerts

  1. Go to Health > Alerts
  2. Click New Alert
  3. Choose a metric (for example, response time or error rate)
  4. Set threshold value
  5. Configure notification channel

Alert Channels

  • Email notification
  • Slack channel message
  • Webhook to external system
  • SMS for critical alerts

Example Alerts

Alert            Condition            Severity
High Error Rate  > 1% errors          Critical
Slow Response    p95 > 500ms          Warning
Service Down     Health check fails   Critical
High Queue       > 1000 pending jobs  Warning
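
The example alerts can be read as simple threshold rules. The sketch below encodes the four conditions from the table and evaluates them against a metrics snapshot. It is an illustration only: the metric keys (`error_rate_pct`, `p95_ms`, `health_check_ok`, `pending_jobs`) and the `AlertRule` type are hypothetical and do not describe how Keva stores or evaluates alerts.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AlertRule:
    name: str
    severity: str
    is_firing: Callable[[dict], bool]  # receives a metrics snapshot


# Conditions taken from the example alerts table; metric keys are assumed names.
RULES = [
    AlertRule("High Error Rate", "Critical", lambda m: m["error_rate_pct"] > 1),
    AlertRule("Slow Response", "Warning", lambda m: m["p95_ms"] > 500),
    AlertRule("Service Down", "Critical", lambda m: not m["health_check_ok"]),
    AlertRule("High Queue", "Warning", lambda m: m["pending_jobs"] > 1000),
]


def firing_alerts(metrics):
    """Return (name, severity) for every rule whose condition holds."""
    return [(rule.name, rule.severity) for rule in RULES if rule.is_firing(metrics)]


if __name__ == "__main__":
    snapshot = {
        "error_rate_pct": 0.4,
        "p95_ms": 620,
        "health_check_ok": True,
        "pending_jobs": 1500,
    }
    print(firing_alerts(snapshot))
```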

Maintenance Windows

Schedule planned maintenance:

  1. Go to Health > Maintenance
  2. Click Schedule Window
  3. Set start time and duration
  4. Add description

Alerts are suppressed during the window.
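
Alert suppression during a maintenance window amounts to a time-range check. Here is a minimal sketch, assuming a window is defined by its start time and duration as in the steps above and that the end of the window is exclusive. The `MaintenanceWindow` type and `alerts_suppressed` function are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class MaintenanceWindow:
    start: datetime       # timezone-aware start time
    duration: timedelta
    description: str

    def contains(self, moment):
        """True while the window is active (start inclusive, end exclusive)."""
        return self.start <= moment < self.start + self.duration


def alerts_suppressed(moment, windows):
    """True if any scheduled window covers the given moment."""
    return any(window.contains(moment) for window in windows)


if __name__ == "__main__":
    # Illustrative window: 30 minutes starting at 02:00 UTC.
    window = MaintenanceWindow(
        start=datetime(2024, 3, 20, 2, 0, tzinfo=timezone.utc),
        duration=timedelta(minutes=30),
        description="Database upgrade",
    )
    print(alerts_suppressed(datetime(2024, 3, 20, 2, 15, tzinfo=timezone.utc), [window]))
```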

SOC 2 Health Checks

For compliance, automated checks run every 5 minutes:

  • Service availability
  • Security controls active
  • Backup verification
  • Access control status

Results feed into compliance evidence collection.

External Monitoring

Keva's status is also available at:

  • Status Page - Public availability dashboard
  • API Endpoint - /api/health returns JSON status
  • RSS Feed - Subscribe to incident updates
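
For external monitoring scripts, the /api/health endpoint could be polled along these lines. This is a sketch under stated assumptions: the page only says the endpoint returns JSON status, so the base URL, the `status` field, and the `"operational"` value used below are hypothetical and should be checked against the actual response.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url, timeout=5.0):
    """GET {base_url}/api/health and return the decoded JSON body."""
    with urlopen(f"{base_url}/api/health", timeout=timeout) as response:
        return json.load(response)


def is_operational(payload):
    """Assumed schema: a top-level "status" field set to "operational" when healthy."""
    return payload.get("status") == "operational"


if __name__ == "__main__":
    # Hypothetical base URL; replace with your Keva instance.
    payload = fetch_health("https://keva.example.com")
    print("healthy" if is_operational(payload) else f"attention needed: {payload}")
```

Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so a monitoring script should treat that exception as a failed health check rather than letting it crash.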

Troubleshooting

If you see degraded status:

  1. Check the affected component
  2. Review recent changes or deployments
  3. Check third-party service status
  4. Contact support if the issue persists