Internal Tool

Vigil

Always watching. Never blinking.

Features

Know Before Your Users Do

Continuous endpoint monitoring with sub-second WebSocket alerts, latency tracking, incident management, and uptime reporting. When something breaks at 3 AM, Vigil is already on it.

Health Monitoring

Endpoints, Latency, and Uptime

  • Automated Endpoint Polling

    Configurable check intervals per site. HTTP status validation, response time measurement, and content verification. APScheduler-based polling engine.

  • Latency Tracking

    Millisecond-resolution response time history. Sparkline visualizations on dashboard cards. Detailed timeseries charts in site detail view.

  • Uptime Statistics

    Rolling uptime percentages across 24-hour, 7-day, 30-day, and 90-day windows. Per-site and aggregate fleet uptime on the dashboard.

  • Health Status Classification

    Three-state health model: healthy (green), degraded (yellow), and down (red). Degraded status triggered by slow response times before full failure.
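The three-state model above can be sketched as a small classification function. This is a minimal illustration, not Vigil's actual code: the names `Health` and `classify`, the default expected status of 200, and the 1000 ms degraded threshold are all assumptions made for the example.

```python
from enum import Enum


class Health(str, Enum):
    HEALTHY = "healthy"    # green
    DEGRADED = "degraded"  # yellow
    DOWN = "down"          # red


def classify(status_code, elapsed_ms, expected_status=200, degraded_after_ms=1000.0):
    """Map one check result onto the three-state model.

    status_code is None when the request never completed (timeout, DNS
    failure, connection refused). Threshold values are hypothetical.
    """
    if status_code is None or status_code != expected_status:
        return Health.DOWN
    if elapsed_ms > degraded_after_ms:
        # The endpoint answered correctly, but too slowly.
        return Health.DEGRADED
    return Health.HEALTHY
```

Under these assumptions, a 200 response in 120 ms is healthy, the same response in 2500 ms is degraded, and a 503 or a timeout is down regardless of latency.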

Incident Management

Alerts, Incidents, and Resolution

  • Automatic Incident Creation

    Incidents created automatically when health checks fail. Tracks start time, affected service, and current status. Auto-resolves when the service recovers.

  • Alarm Banner

    Red banner across the top of the dashboard when any service is down. Shows count of affected sites. Can be muted for acknowledged incidents.

  • WebSocket Live Updates

    Real-time check results and incident broadcasts via WebSocket. Dashboard updates instantly without polling. No refresh needed.

  • Incident History

    Full incident timeline per site with start, end, and duration. Filter by status (active, resolved). Incident table in the site detail modal.
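The incident lifecycle described above (open on failure, auto-resolve on recovery, keep a history with start, end, and duration) might look roughly like the following. `Incident` and `IncidentTracker` are hypothetical names for this sketch, and it keeps state in memory rather than in PostgreSQL.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    site: str
    started_at: datetime
    resolved_at: Optional[datetime] = None

    @property
    def status(self) -> str:
        return "resolved" if self.resolved_at else "active"

    @property
    def duration(self) -> Optional[timedelta]:
        if self.resolved_at is None:
            return None
        return self.resolved_at - self.started_at


class IncidentTracker:
    """In-memory stand-in for the incident table (hypothetical)."""

    def __init__(self):
        self.history = []   # every incident, active or resolved
        self._active = {}   # site name -> currently open incident

    def record_check(self, site, ok, at):
        """Open an incident on the first failure, close it on the first success.

        Returns the incident whose state changed, or None if nothing changed.
        """
        active = self._active.get(site)
        if not ok and active is None:
            incident = Incident(site=site, started_at=at)
            self._active[site] = incident
            self.history.append(incident)
            return incident
        if ok and active is not None:
            active.resolved_at = at
            del self._active[site]
            return active
        return None

    def down_count(self):
        """Number of sites with an open incident, as the alarm banner would show."""
        return len(self._active)
```

Repeated failures for the same site do not open a second incident in this sketch; only the transition between passing and failing changes state.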

Dashboard

Card View and List View

  • Card View

    Grid of site cards showing name, status indicator, uptime percentage, latency sparkline, and last check time. Color-coded borders for health state.

  • List View

    Tabular view with sortable columns for name, URL, status, uptime, latency, and last check. Toggle between card and list from the header.

  • Site Detail Modal

    Click any site to open a detail panel with latency chart (Recharts), uptime stats across all time windows, incident table, sub-checks, and settings.

  • Status Summary

    Aggregated counts at the top: total sites, healthy, degraded, and down. Fleet-wide uptime percentage for the last 24 hours.
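One plausible way to compute the rolling uptime figures is as the share of successful checks inside each window. The sketch below assumes that check-count definition (a time-weighted definition would give different numbers), and the names `WINDOWS`, `uptime_percent`, and `uptime_report` are illustrative only.

```python
from datetime import datetime, timedelta

# The four windows named on this page.
WINDOWS = {
    "24h": timedelta(hours=24),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
    "90d": timedelta(days=90),
}


def uptime_percent(checks, now, window):
    """Percentage of successful checks with a timestamp in (now - window, now].

    checks is an iterable of (timestamp, ok) pairs. Returns None when the
    window holds no checks, so "no data" is never reported as 0% or 100%.
    """
    cutoff = now - window
    recent = [ok for ts, ok in checks if cutoff < ts <= now]
    if not recent:
        return None
    return 100.0 * sum(recent) / len(recent)


def uptime_report(checks, now):
    """Uptime for every window; pass one site's checks or the whole fleet's."""
    checks = list(checks)
    return {label: uptime_percent(checks, now, w) for label, w in WINDOWS.items()}
```

Passing the pooled checks of all sites to `uptime_report` yields an aggregate figure; whether Vigil pools checks or averages per-site percentages is not stated here.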

Accessibility

Built for Everyone

  • WCAG 2.1 AA Compliance

    4.5:1 contrast for body text, 3:1 for large text and UI components, in both light and dark themes.

  • Keyboard Navigation

    Every interaction reachable via keyboard. Logical tab order, visible focus indicators, Escape-to-dismiss for modals.

  • Screen Reader Support

    Tested with VoiceOver, NVDA, and JAWS. Semantic HTML, ARIA labels, live regions for dynamic updates.

  • Reduced Motion

    Respects prefers-reduced-motion. Usable at 200% zoom. Touch targets meet the 44x44 CSS pixel minimum.
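The 4.5:1 and 3:1 thresholds come from the WCAG contrast ratio, which compares the relative luminance of two sRGB colors. The sketch below follows the formula as published in WCAG 2.1; the function names and the `#rrggbb` input format are choices made for this example, and the sample colors are not taken from the AVIAN design system.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.1 definition)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a '#rrggbb' color, from 0.0 (black) to 1.0 (white)."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1.0 (identical colors) to 21.0 (black on white)."""
    l1, l2 = relative_luminance(foreground), relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(foreground, background, large_text_or_ui=False):
    """Check a color pair against the AA thresholds quoted above."""
    required = 3.0 if large_text_or_ui else 4.5
    return contrast_ratio(foreground, background) >= required
```

A check like this can run over a theme's color tokens for both light and dark mode, so a palette change that drops below the threshold is caught before release.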

How It Works

From Endpoint to Alert

  1. Register

    Add endpoints to monitor with URL, expected status, check interval, and timeout thresholds.

  2. Poll

    APScheduler fires checks on schedule. Records status code, response time, and content hash for each endpoint.

  3. Detect

    Failed checks trigger incidents automatically. Checks are marked degraded when latency exceeds the configured threshold. WebSocket broadcasts the change instantly.

  4. Resolve

    When the endpoint recovers, the incident auto-resolves. Uptime stats recalculate. The alarm banner clears.
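The poll step records a status code, a response time, and a content hash for each check. A minimal sketch of that record follows, assuming SHA-256 as the hash (the page does not name an algorithm) and using hypothetical names `CheckResult`, `build_result`, and `content_changed`. The HTTP request itself is left out; the functions take an already fetched body.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CheckResult:
    url: str
    status_code: Optional[int]   # None if the request never completed
    elapsed_ms: float
    content_hash: Optional[str]  # SHA-256 hex digest of the body (assumed algorithm)


def build_result(url, status_code, body, elapsed_ms):
    """Package one poll into a record; body is bytes, or None on failure."""
    digest = hashlib.sha256(body).hexdigest() if body is not None else None
    return CheckResult(url, status_code, elapsed_ms, digest)


def content_changed(previous, current):
    """True only when both checks captured a body and the digests differ."""
    if previous is None or previous.content_hash is None or current.content_hash is None:
        return False
    return previous.content_hash != current.content_hash
```

Storing a digest instead of the body keeps each record small while still revealing when an endpoint starts returning different content, such as a maintenance page served with a 200 status.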

Technical Specifications

Under the Hood

  • Backend

    • FastAPI (Python 3.12+)
    • PostgreSQL + SQLAlchemy 2.0 async
    • APScheduler for check scheduling
    • Alembic migrations
    • avian-diagnostics integration
  • Frontend

    • React 19 + TypeScript
    • Vite build system
    • Recharts for latency charts
    • Framer Motion animations
    • CSS Modules with AVIAN design system
    • Light and dark mode
  • Real-Time

    • WebSocket for live updates
    • Check result broadcasting
    • Incident state changes
    • No polling required
  • Monitoring

    • HTTP status validation
    • Response time measurement
    • Content hash verification
    • Configurable check intervals
    • Automatic incident lifecycle
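The real-time layer pushes check results and incident state changes to every connected dashboard. The fan-out pattern can be sketched with the standard library alone. In this illustration each `asyncio.Queue` stands in for one WebSocket connection, and the `Broadcaster` class, the event type strings, and the JSON message shape are assumptions rather than Vigil's actual wire format.

```python
import asyncio
import json


class Broadcaster:
    """Fan out JSON events to every subscriber.

    Each subscriber is an asyncio.Queue standing in for one WebSocket
    connection; a real handler would forward queue items to its socket.
    """

    def __init__(self):
        self._subscribers = set()

    def subscribe(self):
        queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue):
        self._subscribers.discard(queue)

    def publish(self, event_type, payload):
        """Serialize once, then enqueue the same message for every subscriber."""
        message = json.dumps({"type": event_type, "data": payload})
        for queue in self._subscribers:
            queue.put_nowait(message)
        return message
```

Because the server pushes each event as it happens, a client only needs to hold one open connection and apply messages as they arrive, which is what removes the need for dashboard-side polling.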

Development

100% Built by Claude

Every tool in the Renkara fleet was built by Claude (Anthropic) working alongside a single human supervisor. Every line of code, every test, every deployment: AI-authored with human direction. The leverage factor across the fleet runs in the 20x–50x range, with individual sessions regularly exceeding 100x.

See the daily leverage records for per-task numbers across the full build history.