Moderation
Automatic, configurable content moderation and anti-abuse — non-AI, in-core, free.
Cariosan moderates message content automatically — no manual report/review queue. Filtering is non-AI (wordlists + patterns + counters), runs in the normal server at zero extra cost, and is off by default (a workspace opts in). AI toxicity classification is a separate, optional upgrade on the roadmap; it never gates the core.
How it works
Every send is evaluated before it persists. If a rule matches, the workspace's configured action is applied:
| Action | Effect |
|---|---|
| `flag` | Message is delivered; an action is recorded + a `message.moderated` webhook fires. |
| `mask` | Matched spans are replaced with `***`; the masked text is what's stored and broadcast. |
| `reject` | Send fails with `422 MESSAGE_REJECTED`; no message is created. |
| `auto_mute` | Reject and mute the sender for a window (mute machinery lands with block/mute). |
A clean message, or a workspace that hasn't enabled moderation, passes straight through unchanged.
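The action table above can be read as a small dispatch over the configured action. The following is a minimal sketch, not Cariosan's actual implementation: the type names, the `Outcome` shape, and the helper functions are all hypothetical, and it assumes that matched spans do not overlap and that `auto_mute` rejects with the same `422 MESSAGE_REJECTED` as `reject`.

```typescript
// Hypothetical types for illustration; not the real server internals.
type Action = "flag" | "mask" | "reject" | "auto_mute";

interface Match {
  start: number; // inclusive offset into the message text
  end: number;   // exclusive offset
  rule: string;
}

type Outcome =
  | { kind: "deliver"; text: string; recorded: boolean }
  | { kind: "reject"; status: 422; code: "MESSAGE_REJECTED"; muteSender: boolean };

// Replace each matched span with "***", working right to left so that
// earlier offsets stay valid while the string changes length.
// Assumes the spans do not overlap.
function maskSpans(text: string, matches: Match[]): string {
  const sorted = [...matches].sort((a, b) => b.start - a.start);
  let out = text;
  for (const m of sorted) {
    out = out.slice(0, m.start) + "***" + out.slice(m.end);
  }
  return out;
}

function applyAction(text: string, matches: Match[], action: Action): Outcome {
  // A clean message passes straight through unchanged.
  if (matches.length === 0) return { kind: "deliver", text, recorded: false };
  switch (action) {
    case "flag":
      return { kind: "deliver", text, recorded: true };
    case "mask":
      return { kind: "deliver", text: maskSpans(text, matches), recorded: true };
    case "reject":
      return { kind: "reject", status: 422, code: "MESSAGE_REJECTED", muteSender: false };
    case "auto_mute":
      return { kind: "reject", status: 422, code: "MESSAGE_REJECTED", muteSender: true };
  }
}
```

Because `flag` and `mask` both deliver a message while `reject` and `auto_mute` do not, modelling the outcome as a two-variant union keeps the "no message is created" case explicit for callers.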
Rules
- `profanity`: a built-in Indonesian + English wordlist, matched as whole words (so "pantai" never trips "tai"). Extend it per workspace with a custom blocklist.
- `link`: explicit URLs (`http(s)://`, `www.`).
- `phone`: contact-number leakage (Indonesian `08…`/`+62…` mobiles and generic long digit runs); the classic marketplace "DM me at 08…" pattern.
- `flood`: the same message repeated by one user in a channel more than N times within a short window.
profanity and flood are on by default once moderation is enabled; link and phone are opt-in (they're higher-false-positive and use-case dependent).
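To make the rule descriptions concrete, here is a rough sketch of how wordlist, pattern, and counter checks of this kind can be written. It is illustrative only: the regular expressions, the digit-length thresholds, and the names `buildWordMatcher`, `findSpans`, and `FloodCounter` are assumptions, not the patterns Cariosan ships. Note that `\b` in JavaScript treats only ASCII letters, digits, and underscore as word characters.

```typescript
// Escape regex metacharacters so wordlist entries are matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Whole-word matcher: the \b anchors keep "tai" from matching inside "pantai".
function buildWordMatcher(words: string[]): RegExp | null {
  if (words.length === 0) return null;
  return new RegExp("\\b(?:" + words.map(escapeRegExp).join("|") + ")\\b", "gi");
}

// Illustrative patterns; the real thresholds are not documented here.
const LINK_RE = /(?:https?:\/\/|\bwww\.)\S+/gi;
const PHONE_RE = /(?:\+62|\b0)8\d{7,11}\b|\b\d{9,}\b/g;

// Collect [start, end) spans for every match of a global regex.
function findSpans(text: string, re: RegExp): Array<{ start: number; end: number }> {
  return Array.from(text.matchAll(re), (m) => {
    const start = m.index ?? 0;
    return { start, end: start + m[0].length };
  });
}

// Counts identical messages per (user, channel) inside a sliding window.
class FloodCounter {
  private readonly seen = new Map<string, number[]>();
  private readonly maxRepeats: number;
  private readonly windowMs: number;

  constructor(maxRepeats: number, windowMs: number) {
    this.maxRepeats = maxRepeats;
    this.windowMs = windowMs;
  }

  // Returns true once the same text has been sent more than maxRepeats
  // times within the window, counting the current send.
  isFlood(userId: string, channelId: string, text: string, now: number): boolean {
    const key = JSON.stringify([userId, channelId, text]);
    const recent = (this.seen.get(key) ?? []).filter((t) => now - t < this.windowMs);
    recent.push(now);
    this.seen.set(key, recent);
    return recent.length > this.maxRepeats;
  }
}
```

A custom blocklist would fit this sketch by concatenating the workspace's entries with the built-in list before calling `buildWordMatcher`.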
Blocking & muting
Two complementary controls sit alongside the automatic filter:
- Block: an end-user capability. `client.blockUser(externalId)` hides that user's messages from your channel history and search, workspace-wide; `unblockUser` reverses it. `listBlocked()` returns your list; use it to also filter the live WebSocket stream client-side (history/search are filtered server-side). Blocking yourself is rejected.
- Mute: a moderation/admin control. A user is silenced in a channel: their sends are rejected (`403 MUTED`) until the mute expires. Written automatically by the `auto_mute` action above (and by channel admins later). Muting isn't an end-user action; to stop seeing someone, use block.
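Since history and search are filtered server-side but the live stream is not, a client needs a small filter of its own. The sketch below shows one way to build it. The `LiveMessage` shape, the `senderExternalId` field, and the assumption that `listBlocked()` resolves to an array of external IDs are all hypothetical; check the SDK's actual types before relying on them.

```typescript
// Assumed shape of a message arriving on the live stream.
interface LiveMessage {
  senderExternalId: string;
  text: string;
}

// Build a predicate that returns true for messages that should be shown.
function makeBlockFilter(blockedIds: Iterable<string>): (m: LiveMessage) => boolean {
  const blocked = new Set(blockedIds);
  return (m) => !blocked.has(m.senderExternalId);
}

// Wiring sketch: assumes listBlocked() resolves to external IDs.
interface BlockListClient {
  listBlocked(): Promise<string[]>;
}

async function filterForClient(
  client: BlockListClient,
): Promise<(m: LiveMessage) => boolean> {
  return makeBlockFilter(await client.listBlocked());
}
```

The filter is a snapshot of the block list, so it would need rebuilding after each `blockUser` or `unblockUser` call.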
Audit trail
Every automatic action is recorded in a purpose-built moderation_actions log — rule, action, offending user, the message (when one was created), and a redacted excerpt — powering the dashboard feed and the message.moderated webhook. Nothing requires a human in the loop.
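As a rough picture of what one audit entry carries, the fields listed above could be typed as follows. The field names and the `describeAction` helper are assumptions for illustration; the actual `moderation_actions` schema and the `message.moderated` payload may be named and structured differently.

```typescript
// Hypothetical record shape mirroring the fields named in the text.
interface ModerationActionRecord {
  rule: "profanity" | "link" | "phone" | "flood";
  action: "flag" | "mask" | "reject" | "auto_mute";
  userId: string;            // the offending user
  messageId: string | null;  // null when the send was rejected and no message exists
  excerpt: string;           // redacted excerpt of the offending text
}

// Render one record as a single line, e.g. for a dashboard feed or a log.
function describeAction(r: ModerationActionRecord): string {
  const target = r.messageId === null ? "no message created" : `message ${r.messageId}`;
  return `${r.action} by rule ${r.rule} for user ${r.userId} (${target}): ${r.excerpt}`;
}
```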
Configuration
Moderation settings are per-workspace (enabled rules, action, custom blocklist, flood thresholds) and will be managed from the Cariosan Cloud dashboard. Until that ships they default to disabled, so existing deployments are unaffected until you turn moderation on.
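The per-workspace settings described above might be modelled along these lines. This is only a sketch: the field names are hypothetical, and the default action and flood thresholds shown are placeholder values, since the text does not state them. The rule defaults follow the page: `profanity` and `flood` on, `link` and `phone` off, and nothing active until moderation is enabled.

```typescript
type RuleName = "profanity" | "link" | "phone" | "flood";
type ModerationAction = "flag" | "mask" | "reject" | "auto_mute";

// Hypothetical settings shape; not the actual configuration schema.
interface ModerationSettings {
  enabled: boolean;
  rules: Record<RuleName, boolean>;
  action: ModerationAction;
  customBlocklist: string[];
  flood: { maxRepeats: number; windowSeconds: number };
}

const DEFAULT_SETTINGS: ModerationSettings = {
  enabled: false, // moderation is off until a workspace opts in
  rules: { profanity: true, flood: true, link: false, phone: false },
  action: "flag", // placeholder: the default action is not documented
  customBlocklist: [],
  flood: { maxRepeats: 3, windowSeconds: 10 }, // placeholder thresholds
};

// Rules that actually run: none while moderation is disabled.
function activeRules(s: ModerationSettings): RuleName[] {
  if (!s.enabled) return [];
  return (Object.keys(s.rules) as RuleName[]).filter((r) => s.rules[r]);
}
```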
Not AI (yet)
These filters are deterministic and free. Toxicity/AI moderation is an optional, self-hostable add-on planned later — it extends, and never replaces, this in-core layer.