The moderation problem

Gaming chat is built on aggressive language that is almost never a real threat. Players say they will 'destroy', 'kill', or 'wreck' each other every match. A filter that treats those words as violence floods your review queue and frustrates your most engaged users, while the genuine harassment hides in the same vocabulary.

Why a word list isn't enough

A static word list can't tell 'I'm gonna kill you, easy win' from 'kill yourself'. Both contain the token 'kill', and a word list sees nothing else. Once you whitelist 'kill' to stop the false positives, you've also blinded yourself to the real threat.
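The trade-off can be seen in a few lines. The sketch below is a deliberately naive word-list filter written for illustration; `makeWordListFilter` is a hypothetical helper, not part of any product.

```javascript
// Hypothetical static word-list filter, for illustration only.
function makeWordListFilter(blocklist) {
  const blocked = new Set(blocklist.map((w) => w.toLowerCase()));
  // Flags a message if any of its tokens is on the blocklist.
  return function isFlagged(message) {
    const tokens = message.toLowerCase().match(/[a-z']+/g) ?? [];
    return tokens.some((t) => blocked.has(t));
  };
}

const banter = "I'm gonna kill you, easy win";
const harassment = "kill yourself";

// With 'kill' on the list, both lines are flagged: a false positive on banter.
const strict = makeWordListFilter(["kill"]);
const strictResults = [strict(banter), strict(harassment)];

// With 'kill' taken off the list, neither is flagged: a false negative on harassment.
const lenient = makeWordListFilter([]);
const lenientResults = [lenient(banter), lenient(harassment)];

console.log(strictResults, lenientResults);
```

Neither list setting separates the two messages, because the only signal the filter has is the shared token.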

How The Profanity API handles it

Set the context to `gaming` so that a single static match such as 'die' or 'trash' never auto-flags on its own. Run in `balanced` mode so that when the layers disagree (the static layer flags a line but its semantic intent is low), the ambiguous line is routed to LLM intent classification instead of being blanket-blocked.
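The layer-disagreement idea can be pictured roughly as follows. This is only a conceptual sketch: `routeLine`, its field names, the return labels, and the 0.5 threshold are all assumptions made for illustration and do not describe the API's actual internals.

```javascript
// Conceptual sketch of layer-disagreement routing.
// All names and the threshold are hypothetical, not the API's real logic.
function routeLine({ staticHit, semanticScore }, threshold = 0.5) {
  const semanticHit = semanticScore >= threshold;
  // Both layers agree the line is abusive: flag it directly.
  if (staticHit && semanticHit) return "flag";
  // The layers disagree: treat the line as ambiguous and escalate.
  if (staticHit !== semanticHit) return "classify_with_llm";
  // Both layers agree the line is benign.
  return "allow";
}

// A static match with low semantic intent is escalated rather than blocked.
const decision = routeLine({ staticHit: true, semanticScore: 0.1 });
console.log(decision);
```

The point of the design is that a static match alone never decides the outcome; it only triggers a closer look when the other layer disagrees.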

Examples

Input                      Verdict
get wrecked noob, ez       joking → allowed
kys you worthless trash    abusive / threatening → flagged

Quick start

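// The chat line to check (example value).
const text = "get wrecked noob, ez";

// Send the line with the gaming context and balanced mode.
const res = await fetch("https://api.theprofanityapi.com/v1/check", {
  method: "POST",
  headers: {
    "Authorization": "Bearer " + process.env.PROFANITY_API_KEY,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ text, context: "gaming", mode: "balanced" }),
});

// Read the verdict fields from the JSON response.
const { flagged, intent, score } = await res.json();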
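In a chat server, the request above would typically sit behind a small helper. The following is a minimal sketch under stated assumptions: `checkMessage`, the injectable `fetchImpl` parameter, and the `deliver` field are hypothetical names introduced here, and only the endpoint, request body, and the `flagged`, `intent`, and `score` fields come from the quick start.

```javascript
// Sketch: wraps the quick-start request in a reusable helper.
// `checkMessage`, `fetchImpl`, and `deliver` are hypothetical names.
async function checkMessage(text, apiKey, fetchImpl = fetch) {
  const res = await fetchImpl("https://api.theprofanityapi.com/v1/check", {
    method: "POST",
    headers: {
      "Authorization": "Bearer " + apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ text, context: "gaming", mode: "balanced" }),
  });
  // Surface HTTP failures instead of parsing an error body as a verdict.
  if (!res.ok) {
    throw new Error("Moderation request failed with status " + res.status);
  }
  const { flagged, intent, score } = await res.json();
  // Deliver the message only when it was not flagged.
  return { deliver: !flagged, flagged, intent, score };
}
```

Passing `fetchImpl` as a parameter is a design choice that lets the helper be exercised with a stub in tests, without a network call or a real key.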

Gaming Chat Moderation
