How CleanOutput works
A plain-English walkthrough of the nine-engine linguistic linter that powers your Clean Output Score.
What makes writing "sound AI"?
Large language models are trained to predict the most statistically probable next token. This makes their output extraordinarily consistent, and that consistency is exactly what sets it apart from human writing. Real human prose is messy, varied, and idiosyncratic.
CleanOutput measures the dimensions where AI consistency is most detectable: word choice, sentence rhythm, structural balance, and grammatical patterns. The further your text deviates from these norms in a human direction, the higher your Clean Output Score.
All analysis runs locally in your browser using JavaScript. No text is transmitted anywhere.
Baseline (100)
− Vocabulary penalty (up to 25)
− Passive voice penalty (up to 15)
− Transition penalty (up to 10)
− Burstiness penalty (up to 20)
− Redundancy penalty (up to 10)
− Paragraph uniformity (up to 10)
+ First-person bonus (up to 5)
+ Readability variance bonus (up to 3)
= Clean Output Score (0–100)
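The formula above can be sketched as a small function. This is an illustration, not CleanOutput's actual source: the name `cleanOutputScore` and the field names are assumptions. Only the caps, the baseline of 100 (stated in the score computation step), and the 0 to 100 range come from this page.

```javascript
// Hypothetical sketch of the score formula; all names are assumptions.
const PENALTY_CAPS = {
  vocabulary: 25,
  passiveVoice: 15,
  transitions: 10,
  burstiness: 20,
  redundancy: 10,
  paragraphUniformity: 10,
};
const BONUS_CAPS = { firstPerson: 5, readabilityVariance: 3 };

// Keep a value inside [min, max].
const clamp = (value, min, max) => Math.min(max, Math.max(min, value));

function cleanOutputScore(penalties = {}, bonuses = {}) {
  let score = 100; // baseline
  for (const [name, cap] of Object.entries(PENALTY_CAPS)) {
    score -= clamp(penalties[name] ?? 0, 0, cap); // each penalty is capped
  }
  for (const [name, cap] of Object.entries(BONUS_CAPS)) {
    score += clamp(bonuses[name] ?? 0, 0, cap); // each bonus is capped
  }
  return Math.round(clamp(score, 0, 100)); // single integer, 0 to 100
}
```

For example, `cleanOutputScore({ vocabulary: 12, burstiness: 8 }, { firstPerson: 5 })` subtracts 20, adds 5, and returns 85.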
The analysis pipeline
Text tokenization
Your input is split into sentences (by punctuation), words (by whitespace and boundary rules), and paragraphs (by double line breaks). Character-level indices are tracked throughout so every finding can be pinpointed to an exact position.
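A minimal sketch of index-tracking tokenization could look like this. The regular expressions and the names `spans` and `tokenize` are assumptions for illustration. The sentence rule is deliberately naive, so abbreviations such as "e.g." would be split.

```javascript
// Hypothetical tokenizer sketch; the patterns below are assumptions.
function spans(text, regex) {
  // Record where each piece starts and how long it is in the original text.
  return Array.from(text.matchAll(regex), (m) => ({
    text: m[0],
    index: m.index,
    length: m[0].length,
  }));
}

function tokenize(text) {
  return {
    // Paragraphs: text runs that stop at a double line break.
    paragraphs: spans(text, /[^\n](?:[^\n]|\n(?!\n))*/g),
    // Sentences: start at a non-space character, end at . ! ? or a line end.
    sentences: spans(text, /[^\s.!?][^.!?\n]*(?:[.!?]+|(?=\n)|$)/g),
    // Words: letters and digits, allowing inner apostrophes and hyphens.
    words: spans(text, /[\p{L}\p{N}]+(?:['’-][\p{L}\p{N}]+)*/gu),
  };
}
```

Because every span stores `index` and `length`, `text.slice(span.index, span.index + span.length)` always returns the original piece, which is what lets later steps point at an exact position.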
Lexical scan
Every word and phrase is checked against our tiered vocabulary database of 200+ AI-signature terms. Each match is recorded with its exact index, length, and a suggested replacement from our alternatives dictionary.
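As a rough sketch, a scan like this can be done with one case-insensitive search per term. The three entries below are placeholders invented for the example, not the real 200+ term database. The names `AI_TERMS` and `lexicalScan` are assumptions too.

```javascript
// Hypothetical stand-in for the tiered vocabulary database.
const AI_TERMS = [
  { term: "delve into", tier: 1, alternatives: ["dig into", "look at"] },
  { term: "leverage", tier: 2, alternatives: ["use"] },
  { term: "tapestry", tier: 1, alternatives: ["mix"] },
];

// Escape characters that have a special meaning inside a regular expression.
const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

function lexicalScan(text, terms = AI_TERMS) {
  const findings = [];
  for (const { term, tier, alternatives } of terms) {
    // Whole-word, case-insensitive match for the term or phrase.
    const pattern = new RegExp(`\\b${escapeRegex(term)}\\b`, "gi");
    for (const match of text.matchAll(pattern)) {
      findings.push({
        category: "vocabulary",
        term,
        tier,
        index: match.index,
        length: match[0].length,
        suggestion: alternatives[0],
      });
    }
  }
  // Return findings in reading order.
  return findings.sort((a, b) => a.index - b.index);
}
```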
Grammatical pattern matching
Eight regular-expression patterns scan for passive voice constructions (e.g. "is written," "was established," "can be achieved"). Padded verb phrases and nominalisations are caught by a separate phrase-level dictionary.
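To give a feel for this step, here is a single illustrative pattern. It is not one of the eight patterns CleanOutput uses, and the names `PASSIVE` and `findPassive` are assumptions. It looks for a form of "to be" (optionally after a modal verb), an optional adverb, and then a word that looks like a past participle.

```javascript
// One hypothetical pattern for illustration only.
// "to be" form (optionally after a modal), optional -ly adverb, then a word
// ending in -ed/-en or one of a few irregular participles.
const PASSIVE =
  /\b(?:(?:can|could|may|might|must|shall|should|will|would)\s+be|is|are|was|were|been|being|be)\s+(?:\w+ly\s+)?(?:\w+(?:ed|en)|made|done|built|found|known|held|shown|sent)\b/gi;

function findPassive(text) {
  return Array.from(text.matchAll(PASSIVE), (m) => ({
    category: "passive",
    text: m[0],
    index: m.index,
    length: m[0].length,
  }));
}
```

This catches the three examples above ("is written," "was established," "can be achieved"). A pattern this simple also produces false positives such as "is often", which is one reason a real linter would need several narrower patterns.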
Cadence analysis (burstiness)
Word counts per sentence are extracted and their standard deviation (SD) is computed. A low SD (e.g. < 3 words) indicates robotic uniformity: AI tends to write every sentence with 18–24 words. Human writing swings between short and long sentences, so a high SD signals natural rhythm.
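The cadence measurement can be sketched in a few lines. The function names are assumptions, and so is the choice of population standard deviation (dividing by n), since this page does not say which variant is used.

```javascript
// Population standard deviation (divides by n); this choice is an assumption.
function standardDeviation(values) {
  if (values.length === 0) return 0;
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  return Math.sqrt(variance);
}

// Takes an array of sentence strings and returns word counts plus their SD.
function burstiness(sentences) {
  const counts = sentences.map(
    (s) => s.trim().split(/\s+/).filter(Boolean).length
  );
  return { counts, sd: standardDeviation(counts) };
}
```

Two sentences of three words each give an SD of 0, which is the uniform end of the scale. Mixing a two-word sentence with a thirty-word one pushes the SD far higher.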
Structural uniformity check
Paragraph-level word counts are measured and their standard deviation computed. AI tends to produce consistently balanced paragraphs of 60–100 words. Significant variance here reduces the paragraph uniformity penalty and pushes your score up.
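The same idea at paragraph level might look like the sketch below. The name `paragraphStats` is an assumption, and the population standard deviation is again an assumed choice.

```javascript
// Hypothetical sketch: word count per paragraph, plus mean and SD.
function paragraphStats(text) {
  const counts = text
    .split(/\n\s*\n/) // paragraphs are separated by a blank line
    .map((p) => p.trim().split(/\s+/).filter(Boolean).length)
    .filter((n) => n > 0); // ignore empty paragraphs
  if (counts.length === 0) return { counts, mean: 0, sd: 0 };
  const mean = counts.reduce((a, b) => a + b, 0) / counts.length;
  const sd = Math.sqrt(
    counts.reduce((sum, n) => sum + (n - mean) ** 2, 0) / counts.length
  );
  return { counts, mean, sd };
}
```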
Readability scoring
A Flesch reading-ease calculation (one of the Flesch-Kincaid readability tests) is applied. AI tends to calibrate to mid-range readability, so scores at the extremes, very easy or very complex, are treated as a sign of human authorship.
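The reading-ease formula itself is standard: 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word). The sketch below applies it with a rough syllable counter. That counter, the sentence splitting, and the function names are assumptions, and real syllable counting is harder than this.

```javascript
// Rough syllable estimate (an assumption): count vowel groups, drop a silent
// trailing "e" (but keep "-le"), and never return less than one per word.
function countSyllables(word) {
  const w = word.toLowerCase().replace(/[^a-z]/g, "");
  if (w.length === 0) return 0;
  const trimmed =
    w.length > 2 && w.endsWith("e") && !w.endsWith("le") ? w.slice(0, -1) : w;
  const groups = trimmed.match(/[aeiouy]+/g);
  return Math.max(1, groups ? groups.length : 0);
}

function fleschReadingEase(text) {
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const words = text.match(/[A-Za-z]+(?:'[A-Za-z]+)*/g) ?? [];
  if (sentences.length === 0 || words.length === 0) return null;
  const syllables = words.reduce((sum, w) => sum + countSyllables(w), 0);
  return (
    206.835 -
    1.015 * (words.length / sentences.length) -
    84.6 * (syllables / words.length)
  );
}
```

Higher values mean easier text. A sentence made only of short one-syllable words scores above 100, while dense academic prose can fall below 30.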
First-person and voice analysis
The ratio of first-person pronouns ("I") to depersonalising language ("one," "the reader") is measured. AI typically avoids committing to a personal voice, or overcorrects into formal impersonality.
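A simple version of this measurement counts both kinds of wording and compares them. The word lists below are assumptions built around the examples given ("I", "one", "the reader"). Reporting the first-person share instead of a raw ratio is also an assumption, made to avoid dividing by zero.

```javascript
// Hypothetical word lists; a real check would need to tell the pronoun "one"
// apart from the number.
const FIRST_PERSON = /\b(?:I|me|my|mine|myself)\b/gi;
const IMPERSONAL = /\b(?:one|the reader)\b/gi;

function voiceRatio(text) {
  const firstPerson = (text.match(FIRST_PERSON) ?? []).length;
  const impersonal = (text.match(IMPERSONAL) ?? []).length;
  const total = firstPerson + impersonal;
  return {
    firstPerson,
    impersonal,
    // Share of first-person wording, or null when neither kind appears.
    firstPersonShare: total === 0 ? null : firstPerson / total,
  };
}
```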
Score computation & deduplication
Weighted penalties are subtracted from a baseline of 100. Overlapping findings are merged and deduplicated so no position is double-counted. The result is a single integer: your Clean Output Score.
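The merging step can be pictured as an interval merge over finding positions. This is a sketch under assumptions: the name `mergeFindings` and the `{ index, length, category }` shape are illustrative, and the real deduplication rules may differ.

```javascript
// Hypothetical sketch: merge findings whose character ranges overlap.
function mergeFindings(findings) {
  // Sort by start position; for equal starts, put the longer finding first.
  const sorted = [...findings].sort(
    (a, b) => a.index - b.index || b.length - a.length
  );
  const merged = [];
  for (const f of sorted) {
    const last = merged[merged.length - 1];
    const lastEnd = last ? last.index + last.length : -1;
    if (last && f.index < lastEnd) {
      // Overlap: extend the previous span and remember the extra category.
      last.length = Math.max(lastEnd, f.index + f.length) - last.index;
      if (!last.categories.includes(f.category)) {
        last.categories.push(f.category);
      }
    } else {
      merged.push({
        index: f.index,
        length: f.length,
        categories: [f.category],
      });
    }
  }
  return merged;
}
```

After merging, each character position belongs to at most one span, so a penalty can be counted once per span even when two engines flagged the same words.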
Visual rendering
Findings are rendered as color-coded inline highlights over your text. Each highlight is interactive — click any flagged phrase to see why it was caught and choose from 2–4 humanized alternatives. Filters let you isolate any category of issue.
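One way to turn findings into inline highlights is to walk the text once and wrap each flagged span in a `<mark>` element. The class names, the `data-index` attribute, and the function `renderHighlights` are assumptions for this sketch. Click handling and the alternatives menu are left out.

```javascript
// Escape text so it is safe to place inside HTML.
const escapeHtml = (s) =>
  s.replace(
    /[&<>"']/g,
    (c) =>
      ({ "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" }[c])
  );

// Hypothetical renderer: findings are { index, length, category } objects.
function renderHighlights(text, findings, activeCategory = null) {
  const visible = findings
    .filter((f) => activeCategory === null || f.category === activeCategory)
    .sort((a, b) => a.index - b.index);
  let html = "";
  let cursor = 0;
  for (const f of visible) {
    if (f.index < cursor) continue; // skip overlaps; assumes merging upstream
    html += escapeHtml(text.slice(cursor, f.index)); // unflagged text
    const flagged = escapeHtml(text.slice(f.index, f.index + f.length));
    const cls = escapeHtml(f.category);
    html += `<mark class="finding finding-${cls}" data-index="${f.index}">${flagged}</mark>`;
    cursor = f.index + f.length;
  }
  return html + escapeHtml(text.slice(cursor));
}
```

Passing a category name as the third argument mimics a filter: only findings in that category are wrapped, and the rest of the text is left plain.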