Performance & Scaling

Real-world throughput numbers, resource targets, and how to scale Cariosan from a $12 droplet to production.

Cariosan is designed to be cheap to run and stay fast as you grow. This page documents what one server can carry, where the bottlenecks are, and how to scale when you outgrow a single droplet.

Headline number

A $12 DigitalOcean Singapore droplet (1 vCPU, 2 GB RAM) sustains 500 concurrent WebSocket connections at ~0% CPU and ~28 MiB of resident server memory, with Postgres, Redis, MinIO, and the Caddy edge proxy sharing the same droplet. The server design target is 10,000 concurrent connections per node; see the resource sizing guide below for the path there.

Baseline measurement

Recorded 2026-05-05 against staging.cariosan.com from a k6 client in Indonesia: a 5-minute soak with 500 VUs pinging every 10 s and 100 sender VUs each sending 1 message every 30 s.

Metric	Result
WebSocket upgrade success	100%
Median pong RTT	85 ms
p95 pong RTT	591 ms (tail elevation from client laptop concurrency, not server)
Server CPU at peak	~0%
Server memory at peak	28 MiB
Per-connection rate-limit drops	348 (defensive, policy-driven)

The 85 ms median RTT includes ~30 ms of baseline network latency (Indonesia laptop → Singapore droplet). Co-located clients in production will see lower numbers.

Why Go + WebSocket scales cheap

net/http-based stack. Cariosan uses github.com/coder/websocket over Echo's net/http server, with no custom event loop and no userland scheduling. One goroutine per connection lets the runtime spread load across cores without thread pools you have to size.
Postgres + Redis split. Hot reads (presence, typing, rate limits, pub/sub) hit Redis. Durable state (messages, members, channels) hits Postgres. Each scales independently, and Redis takes the brunt of fan-out work that would otherwise slam the database.
No per-connection memory growth. WebSocket frames are decoded into reusable buffers. With 500 connections we observed 28 MiB resident, roughly 57 KiB per connection including Go runtime overhead. Projected linearly, that is ~280 MiB at 5,000 connections.
Distroless Docker image. Multi-stage build produces a ~30 MB final image with no shell, no package manager, no init system. Cold start in production is dominated by Postgres connection pool warm-up, not Go binary startup.
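The per-connection figure and the linear projection above are plain arithmetic on the baseline numbers. A minimal sketch, assuming memory really does scale linearly (the baseline only measured 500 connections, so larger values are projections rather than measurements); the function names are illustrative:

```go
package main

import "fmt"

const (
	baselineConns  = 500  // connections in the recorded soak test
	baselineRSSMiB = 28.0 // resident memory observed at that load
)

// perConnKiB returns the average resident memory per connection in KiB,
// with the Go runtime's fixed overhead folded into the average.
func perConnKiB() float64 {
	return baselineRSSMiB * 1024 / baselineConns
}

// projectMiB linearly extrapolates resident memory to n connections.
// This is an assumption, not a measurement.
func projectMiB(n int) float64 {
	return baselineRSSMiB * float64(n) / baselineConns
}

func main() {
	fmt.Printf("per connection: %.1f KiB\n", perConnKiB())
	for _, n := range []int{1000, 5000, 10000} {
		fmt.Printf("%6d connections: ~%.0f MiB\n", n, projectMiB(n))
	}
}
```

The division gives about 57 KiB per connection, and the 10,000-connection design target projects to roughly 560 MiB under the same assumption.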

Performance targets by tier

These are the official MVP targets from the technical spec. The baseline above (28 MiB at 500 connections) already sits well inside the memory targets, though at a smaller scale than the connection target.

Metric	Target
Message send latency (API return)	< 50 ms p95
Message delivery (WebSocket peer-to-peer, local)	< 100 ms p95
Message history fetch (50 messages)	< 100 ms p95
Concurrent WebSocket connections per server	10,000
Docker image size	< 30 MB
Server memory at idle	< 64 MB
Server memory at 1,000 connections	< 256 MB

Resource sizing guide

Tier	Server size	Concurrent connections	Suggested setup
Hobby / staging	1 vCPU, 2 GB	up to ~1,000	$12 DO droplet
Production small	2 vCPU, 4 GB	up to ~5,000	$24 DO droplet, dedicated DB
Production scale	4 vCPU, 8 GB	up to ~10,000	$48 DO droplet, managed Postgres + Redis
Multi-node	horizontal	10,000+	one Cariosan server per ~10k connections, shared Redis pub/sub for cross-node fan-out

The ceiling is ~10k connections per server process regardless of vCPU count, because at that point the Go runtime's network poller, not CPU, is expected to become the bottleneck. Beyond that, scale horizontally: a Redis pub/sub channel per workspace handles cross-node message fan-out without touching Postgres.
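The fan-out pattern described here can be sketched without any external services. In the sketch below, Bus is a hypothetical in-memory stand-in for Redis pub/sub, Node stands in for one server process, and the "workspace:<id>" channel naming is an assumption. None of these names come from the Cariosan source, and real Redis delivery is asynchronous over the network rather than a synchronous function call.

```go
package main

import (
	"fmt"
	"sync"
)

// Bus is an in-memory stand-in for Redis pub/sub (hypothetical).
type Bus struct {
	mu   sync.Mutex
	subs map[string][]func(string)
}

func NewBus() *Bus {
	return &Bus{subs: make(map[string][]func(string))}
}

// Subscribe registers handler for every message published on channel.
func (b *Bus) Subscribe(channel string, handler func(string)) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.subs[channel] = append(b.subs[channel], handler)
}

// Publish calls every handler subscribed to channel and returns how many ran.
func (b *Bus) Publish(channel, msg string) int {
	b.mu.Lock()
	handlers := make([]func(string), len(b.subs[channel]))
	copy(handlers, b.subs[channel])
	b.mu.Unlock()
	for _, h := range handlers {
		h(msg)
	}
	return len(handlers)
}

// workspaceChannel is an assumed naming scheme: one channel per workspace.
func workspaceChannel(workspaceID string) string {
	return "workspace:" + workspaceID
}

// Node models one server process holding local WebSocket connections.
type Node struct {
	Name      string
	mu        sync.Mutex
	Delivered []string // stands in for writes to locally connected sockets
}

// Join subscribes the node to the channel of a workspace it serves.
func (n *Node) Join(bus *Bus, workspaceID string) {
	bus.Subscribe(workspaceChannel(workspaceID), func(msg string) {
		n.mu.Lock()
		defer n.mu.Unlock()
		n.Delivered = append(n.Delivered, msg)
	})
}

func main() {
	bus := NewBus()
	nodeA, nodeB, nodeC := &Node{Name: "a"}, &Node{Name: "b"}, &Node{Name: "c"}
	nodeA.Join(bus, "acme")
	nodeB.Join(bus, "acme")
	nodeC.Join(bus, "other")

	reached := bus.Publish(workspaceChannel("acme"), "hello")
	fmt.Println("nodes reached:", reached)
	for _, node := range []*Node{nodeA, nodeB, nodeC} {
		fmt.Printf("node %s delivered %d message(s)\n", node.Name, len(node.Delivered))
	}
}
```

Only the nodes that joined the publishing workspace receive the message, which is why per-workspace channels keep cross-node traffic proportional to the workspaces a node actually serves.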

When to scale up

Watch these signals on the staging or prod server:

Server CPU > 60% sustained: scale vertically (add vCPU) before it reaches 80%. Under GC pressure, Go's tail latency degrades before throughput drops.
Postgres connections saturated: cariosan-server uses pgx connection pooling with a default pool size of 10. Raise CARIOS_DATABASE_MAX_CONNS if pool wait times start appearing in the logs.
Redis memory > 70%: presence and rate-limit data is volatile but accumulates. Move to a managed Redis (DO Managed Database, Upstash, Redis Cloud).
WebSocket drop rate > 1%: client-side reconnection loops flood your logs and inflate your bill. Diagnose with runbook-ws-connection-drops.md (in the source repo's docs/operations/).
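The four signals above amount to a small threshold table. A minimal sketch of encoding them as a check, assuming you already collect the values from your own monitoring; the Snapshot type and its field names are hypothetical and not part of Cariosan:

```go
package main

import "fmt"

// Snapshot holds one reading of the four signals (hypothetical type).
type Snapshot struct {
	CPUPercent       float64 // sustained server CPU utilisation, 0-100
	PoolWaitObserved bool    // pgx pool wait times are showing up in logs
	RedisMemPercent  float64 // Redis used memory as a percentage of its limit
	WSDropPercent    float64 // WebSocket drop rate, 0-100
}

// Recommend returns one action per threshold that the snapshot crosses.
func Recommend(s Snapshot) []string {
	var actions []string
	if s.CPUPercent > 60 {
		actions = append(actions, "add vCPU before CPU reaches 80%")
	}
	if s.PoolWaitObserved {
		actions = append(actions, "raise CARIOS_DATABASE_MAX_CONNS")
	}
	if s.RedisMemPercent > 70 {
		actions = append(actions, "move to managed Redis")
	}
	if s.WSDropPercent > 1 {
		actions = append(actions, "follow runbook-ws-connection-drops.md")
	}
	return actions
}

func main() {
	// Example reading (made-up values for illustration).
	s := Snapshot{CPUPercent: 65, RedisMemPercent: 40, WSDropPercent: 2.5}
	for _, a := range Recommend(s) {
		fmt.Println("action:", a)
	}
}
```

A check like this could run from a cron job or an alerting rule; the thresholds are the ones listed above, and everything else is a placeholder.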

Reproducing the baseline

terminal — k6 baseline run

# Run from a checkout of the source repo (the load-test scripts are not in the binary distribution)
cd apps/server/loadtest
k6 run \
  --vus 500 \
  --duration 5m \
  --env BASE=wss://staging.cariosan.com \
  --env API_KEY="$YOUR_KEY" \
  --env API_SECRET="$YOUR_SECRET" \
  ws-baseline.js

The k6 script lives next to the server source. Each run produces a JSON summary you can diff against the baseline above to spot regressions before they ship.
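One way to turn that diff into a pass/fail check is to compare each metric against the baseline with a tolerance. The sketch below works on already-parsed metric maps; the key names are hypothetical and do not reflect the k6 summary schema, and the "current" values are made up for illustration. It also assumes lower is better for every metric.

```go
package main

import (
	"fmt"
	"sort"
)

// regressions returns the names of metrics in current that exceed the
// baseline by more than tolerance (a fraction, so 0.10 means 10%).
// Metrics missing from current are skipped. Lower is assumed to be better.
func regressions(baseline, current map[string]float64, tolerance float64) []string {
	var out []string
	for name, base := range baseline {
		cur, ok := current[name]
		if !ok {
			continue
		}
		if cur > base*(1+tolerance) {
			out = append(out, name)
		}
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out
}

func main() {
	// Baseline values from the table on this page.
	baseline := map[string]float64{
		"median_pong_rtt_ms": 85,
		"p95_pong_rtt_ms":    591,
		"server_mem_mib":     28,
	}
	// Made-up values standing in for a later run.
	current := map[string]float64{
		"median_pong_rtt_ms": 88,
		"p95_pong_rtt_ms":    700,
		"server_mem_mib":     29,
	}
	fmt.Println("regressed:", regressions(baseline, current, 0.10))
}
```

With a 10% tolerance, only the p95 value in this made-up run would be flagged.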

Caveats

Client geography matters. Run k6 from the same region as your production server for accurate p95 numbers. Cross-region tests add roughly 30-100 ms of baseline latency to every measurement.
MinIO co-located with the server limits attachment throughput. For production, use S3 directly or run MinIO on a separate VM with dedicated I/O.
The 10,000 connection target is per Cariosan server process, not per droplet. Run multiple replicas on bigger droplets if the kernel's file-descriptor limits allow it.
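Each WebSocket connection holds one file descriptor, so the per-process descriptor limit is worth checking before aiming for the 10,000-connection target. A minimal sketch for Linux and other Unix-like systems using the standard syscall package; the reserve of 256 descriptors for database, Redis, and log handles is an assumed figure, not a Cariosan setting:

```go
package main

import (
	"fmt"
	"syscall"
)

// fdHeadroom reports whether softLimit leaves room for conns sockets plus
// reserve descriptors for everything else the process keeps open.
func fdHeadroom(softLimit, conns, reserve uint64) bool {
	return softLimit >= conns+reserve
}

func main() {
	const (
		targetConns = 10000 // design target per server process
		reserve     = 256   // assumed allowance for DB, Redis, logs, etc.
	)

	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		fmt.Println("getrlimit failed:", err)
		return
	}
	soft := uint64(lim.Cur)
	fmt.Printf("soft limit: %d, hard limit: %d\n", soft, uint64(lim.Max))
	if fdHeadroom(soft, targetConns, reserve) {
		fmt.Println("descriptor limit is sufficient for the target")
	} else {
		fmt.Println("raise the nofile limit before targeting 10,000 connections")
	}
}
```

The limit is per process, so each replica on a shared droplet needs its own headroom; container runtimes and systemd units usually set it through their own nofile options.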

Next steps

Docker Compose self-host — deploy the full stack
Environment variables — config flags that affect performance (CARIOS_DATABASE_MAX_CONNS, CARIOS_RATE_LIMIT_*)
Operations runbook — daily ops, monitoring hooks, log aggregation
