Monitoring

Overslash exposes a Prometheus metrics endpoint, structured JSON logs, and a public status page that operators can mirror or reuse. The metrics surface counts actions, approvals, secret reads, OAuth refreshes, and HTTP error rates by service — enough to alert on stuck approvals, failing connections, and unusual write volume.

Prometheus metrics

Metrics are exposed at GET /internal/metrics in Prometheus text format, on the API's normal port. The endpoint is mounted outside auth and rate limiting so a scraper can reach it without credentials — keep it on an internal network or restrict it at your proxy.

A minimal scrape config:

```yaml
scrape_configs:
  - job_name: 'overslash'
    metrics_path: /internal/metrics
    static_configs:
      - targets: ['overslash:8080']
```

Every metric is prefixed overslash_. They are grouped below by area.

HTTP

| Metric | Type | Labels |
| --- | --- | --- |
| overslash_http_requests_total | counter | method, path, status |
| overslash_http_request_duration_seconds | histogram | method, path |
| overslash_http_requests_in_flight | gauge | (none) |
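
The duration histogram is usually more useful once aggregated into a quantile. A minimal sketch of Prometheus recording rules for that, assuming the histogram is exposed in the classic Prometheus form with a `_bucket` series and an `le` label; the group name and the recorded series names are illustrative and not part of Overslash:

```yaml
groups:
  - name: overslash_http_recording   # hypothetical group name
    rules:
      # 95th-percentile request latency per path over the last 5 minutes.
      # Assumes classic histogram buckets (the _bucket suffix and le label).
      - record: path:overslash_http_request_duration_seconds:p95_5m
        expr: |
          histogram_quantile(
            0.95,
            sum by (le, path) (rate(overslash_http_request_duration_seconds_bucket[5m]))
          )
      # Request rate broken down by path and status, for dashboards.
      - record: path_status:overslash_http_requests:rate5m
        expr: sum by (path, status) (rate(overslash_http_requests_total[5m]))
```

The same `histogram_quantile` pattern should apply to the other `_duration_seconds` histograms listed in the sections below, swapping in their own labels.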

Action execution

| Metric | Type | Labels |
| --- | --- | --- |
| overslash_action_executions_total | counter | template_key, mode, status |
| overslash_action_execution_duration_seconds | histogram | template_key, mode |
| overslash_action_validations_total | counter | template_key, mode, outcome |
| overslash_action_validation_duration_seconds | histogram | template_key, mode |
| overslash_outbound_http_total | counter | template_key, status_class |
| overslash_outbound_http_duration_seconds | histogram | template_key, status_class |

Approvals

| Metric | Type | Labels |
| --- | --- | --- |
| overslash_approval_events_total | counter | event, identity_kind |
| overslash_approval_resolution_duration_seconds | histogram | decision |
| overslash_approvals_pending | gauge | (none) |

OAuth

| Metric | Type | Labels |
| --- | --- | --- |
| overslash_oauth_events_total | counter | provider, flow, status |
| overslash_oauth_token_refresh_duration_seconds | histogram | provider, status |

Permissions & rate limiting

| Metric | Type | Labels |
| --- | --- | --- |
| overslash_permission_checks_total | counter | decision, layer |
| overslash_rate_limit_decisions_total | counter | scope, decision |

Search & secrets

| Metric | Type | Labels |
| --- | --- | --- |
| overslash_search_queries_total | counter | mode, status |
| overslash_secret_operations_total | counter | op, status |

Webhooks

| Metric | Type | Labels |
| --- | --- | --- |
| overslash_webhook_deliveries_total | counter | event_type, status, final |
| overslash_webhook_delivery_attempts | histogram | event_type, outcome |

Database & background tasks

| Metric | Type | Labels |
| --- | --- | --- |
| overslash_db_pool_connections | gauge | state (active/idle) |
| overslash_background_task_ticks_total | counter | task, status |
| overslash_background_task_duration_seconds | histogram | task |
| overslash_background_task_last_success_timestamp | gauge | task |

Structured logs

Overslash emits structured logs via tracing. Control verbosity with RUST_LOG — a global level or per-target filters:

```bash
RUST_LOG=info
RUST_LOG=info,overslash_metrics=debug   # per-crate override
```

Run Overslash behind a log collector that parses the JSON output and ships it to your aggregator.

Health checks

Two unauthenticated endpoints, mounted outside auth and rate limiting and safe to poll frequently:

| Endpoint | Meaning |
| --- | --- |
| GET /health | Liveness: always returns 200 once the process is up. |
| GET /ready | Readiness: returns 200 when the app is initialised (migrations done, pool connected). |

Wire /ready to your load balancer health check or Kubernetes readinessProbe, and /health to the livenessProbe.
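
As an illustration of that wiring, here is a sketch of a Kubernetes container spec fragment. The container name, image, and probe timings are placeholders, and port 8080 is assumed from the scrape target shown earlier; adjust all of them to your deployment:

```yaml
# Fragment of a Pod or Deployment template (spec.containers).
containers:
  - name: overslash              # hypothetical container name
    image: overslash:latest      # hypothetical image reference
    ports:
      - containerPort: 8080      # assumed API port
    # Liveness: restart the container if the process stops answering.
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      periodSeconds: 10          # illustrative timing
      failureThreshold: 3
    # Readiness: keep the pod out of rotation until migrations are done
    # and the database pool is connected.
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 5           # illustrative timing
      failureThreshold: 3
```

Keeping the two probes on separate endpoints means a slow migration delays traffic without triggering a restart loop.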

Status page

Live production health: status.overslash.com.

Alerts

A starting set of alerts, expressed against the metrics above:

  • Stuck background task: time() - max by (task) (overslash_background_task_last_success_timestamp) > 300. A task that hasn't succeeded in 5 minutes is wedged.
  • Pending approvals piling up: sustained high overslash_approvals_pending, or growth without resolutions in overslash_approval_events_total{event="approved"}.
  • HTTP 5xx ratio: sum(rate(overslash_http_requests_total{status=~"5.."}[5m])) / sum(rate(overslash_http_requests_total[5m])) above your threshold.
  • OAuth refresh failures: rate(overslash_oauth_events_total{flow="refresh",status="failure"}[5m]) > 0. Failing refreshes mean connections will start breaking.
  • Webhook delivery failures: rate(overslash_webhook_deliveries_total{status="failed"}[15m]) > 0.
  • Secret operation errors / denials: watch overslash_secret_operations_total{status="error"} and {status="denied"} for misconfiguration or abuse.
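
Several of the expressions above can be dropped into a Prometheus rules file more or less directly. The following is a sketch rather than a recommended policy: the alert names, `for` durations, severity labels, and the 5% error-ratio threshold are all assumptions to tune for your environment, and only the expressions come from the list above.

```yaml
groups:
  - name: overslash_alerts       # hypothetical group name
    rules:
      - alert: OverslashBackgroundTaskStuck
        expr: |
          time() - max by (task) (overslash_background_task_last_success_timestamp) > 300
        for: 1m                  # illustrative
        labels:
          severity: page         # illustrative label
        annotations:
          summary: "Background task {{ $labels.task }} has not succeeded in over 5 minutes"

      - alert: OverslashHttp5xxRatioHigh
        # 0.05 (5%) is a placeholder threshold, not an Overslash default.
        expr: |
          sum(rate(overslash_http_requests_total{status=~"5.."}[5m]))
            / sum(rate(overslash_http_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: page

      - alert: OverslashOAuthRefreshFailing
        expr: |
          rate(overslash_oauth_events_total{flow="refresh",status="failure"}[5m]) > 0
        for: 5m
        labels:
          severity: ticket

      - alert: OverslashWebhookDeliveryFailing
        expr: |
          rate(overslash_webhook_deliveries_total{status="failed"}[15m]) > 0
        for: 15m
        labels:
          severity: ticket
```

The `for` clause keeps a single transient failure from firing an alert; shorten it for conditions you want to hear about immediately.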

Pre-release software — subject to change without notice.