A writing style request made at the start of a conversation degrades as the session grows. The model does not decide to stop using your requested style. The instruction moves deeper into context, competes with accumulated conversation history, and loses attention weight. By message thirty, the style you requested at message one is a faint signal in a noisy context.
Analysis Briefing
- Topic: Style instruction drift in long LLM writing sessions
- Analyst: Mike D (@MrComputerScience)
- Context: Sparked by a question from Claude Sonnet 4.6
- Source: Pithy Cyborg
- Key Question: Why does the voice I carefully set up at the start of our session disappear by the end?
Why Style Instructions Are the First Thing Context Rot Takes
Style is a property of every output token. An instruction to write in a specific voice, tense, sentence length, or register must be applied consistently across every sentence the model generates. That requires the style instruction to maintain attention weight throughout the entire generation process for every response.
In a short conversation, the style instruction is prominent. It sits near the beginning of the context, close to where the model is generating, and the context is small enough that the instruction competes with few other signals. The style holds.
In a long conversation, the style instruction has moved deep into the context while thousands of subsequent tokens have accumulated between it and the current generation position. The model's own prior outputs, most of which are in whatever style the model was producing at the time, now dominate the examples of how to write in this conversation. Style drifts toward the model's default because the default is what the recent context reflects most strongly.
The drift is not sudden. It is gradual, and easy to miss until the writing sounds noticeably different from what you requested. Creative writers, content creators, and anyone using AI for consistent-voice writing are the most likely to catch it because they are attuned to voice consistency in ways that most users are not.
The Self-Averaging Effect on Custom Voices
The model’s prior outputs in the session become implicit style examples for subsequent outputs. If the style held perfectly for the first ten responses and then drifted slightly in response eleven, response eleven becomes part of the context that response twelve is generated from. Response twelve has a slight pull toward the drifted style in response eleven.
This self-averaging effect means style drift compounds. A small deviation from the requested style in an early response creates a slightly larger deviation in the next, which creates a slightly larger one after that. By the end of a long session, the accumulated drift can be substantial even though no single response deviated dramatically from the previous one.
The effect is strongest for styles that differ significantly from the model’s RLHF-trained default. A request to write in a highly formal register, an extremely terse style, or a distinctive voice pattern diverges further from the model’s default than a request for moderate polish. The further the requested style is from the default, the more the self-averaging pull toward the default affects the output over time.
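The compounding is easy to see in a toy model. This sketch is an illustrative assumption, not a measurement of any real model: it treats style fidelity as a number from 1.0 (exactly the requested style) toward 0.0 (the model's default) and applies a fixed fractional pull toward the default on each response. The `pull_toward_recent` rate is made up for illustration.

```python
# Toy model of self-averaging drift. Assumption for illustration only:
# each response blends the previous response's style with a fixed pull
# toward the model's default, so small deviations compound.

def simulate_drift(responses: int, pull_toward_recent: float = 0.15) -> list[float]:
    """Track style fidelity across responses: 1.0 means the exact
    requested style, values near 0.0 mean the model's default voice."""
    fidelity = 1.0
    history = []
    for _ in range(responses):
        # Each response inherits its style mostly from the previous one,
        # with a fixed fraction pulled toward the default.
        fidelity *= 1.0 - pull_toward_recent
        history.append(fidelity)
    return history
```

No single step is dramatic: each response keeps 85% of the previous one's fidelity in this toy setup, yet after twenty responses the accumulated drift is large. That matches the article's point that drift compounds without any one response deviating sharply from its neighbor.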
The Approaches That Maintain Style Through Long Sessions
Style anchoring with examples in the system prompt is the most reliable approach for API users. Including two or three examples of the exact style you want directly in the system prompt, alongside the style description, gives the model concrete reference points that remain in the context window throughout the session and maintain attention weight better than a textual description alone.
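A minimal sketch of that anchoring, assuming the common chat-message format of a system entry followed by user entries. The style description, the exemplar texts, and the helper name are all illustrative, not tied to any particular provider's API:

```python
# Sketch: pair a style description with concrete exemplars of the target
# voice inside the system prompt. Exemplar texts below are assumptions.

STYLE_DESCRIPTION = (
    "Write in a terse, direct voice. Short sentences. "
    "Active verbs. One idea per sentence."
)

STYLE_EXEMPLARS = [
    "Ship the fix. Test it first. Then tell the team.",
    "The cache was stale. We cleared it. Latency dropped by half.",
]

def build_system_prompt(description: str, exemplars: list[str]) -> str:
    """Combine the style description with numbered examples so the model
    has concrete reference points, not just an abstract instruction."""
    examples = "\n\n".join(
        f"Example {i + 1}:\n{text}" for i, text in enumerate(exemplars)
    )
    return f"{description}\n\nWrite every response in this voice:\n\n{examples}"

messages = [
    {"role": "system", "content": build_system_prompt(STYLE_DESCRIPTION, STYLE_EXEMPLARS)},
    {"role": "user", "content": "Summarize the quarterly report."},
]
```

Because the examples live in the system prompt, they stay in context for the whole session rather than scrolling away with the conversation.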
Periodic style re-injection works for users without system prompt access. Every ten to fifteen exchanges, paste a brief reminder of the style requirements alongside your next request. “Continue in the same terse, direct style as earlier” re-anchors the instruction near the current generation position where it has maximum influence.
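Re-injection can be as simple as a counter. This sketch prefixes the style reminder onto every tenth user message; the reminder wording and the interval are assumptions to tune for your own sessions:

```python
# Sketch: re-inject the style reminder every N exchanges so it sits near
# the current generation position. Wording and interval are assumptions.

STYLE_REMINDER = "Continue in the same terse, direct style as earlier."
REINJECT_EVERY = 10  # exchanges between reminders

def with_style_reminder(user_message: str, exchange_count: int) -> str:
    """Prefix the user message with the style reminder whenever the
    exchange count reaches a multiple of the re-injection interval."""
    if exchange_count > 0 and exchange_count % REINJECT_EVERY == 0:
        return f"{STYLE_REMINDER}\n\n{user_message}"
    return user_message
```

For example, `with_style_reminder("Draft the next section.", 10)` returns the reminder followed by the request, while exchange three passes the message through untouched.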
For long-form writing projects specifically, treating each major section as a separate session is often more effective than trying to maintain style across one very long session. Start each section fresh with the full style instructions, produce the section, then distill it into the next session’s starting context. The style holds better across shorter sessions than across one marathon session.
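A sketch of that section-per-session workflow, again assuming the standard chat-message format. The function name and the shape of the distilled summary are assumptions; the point is that each session starts with the full style instructions plus a compact summary, not the full prior history:

```python
# Sketch: start each major section as a fresh session. Carry forward the
# full style instructions and a distilled summary, not the conversation.

def start_section_messages(
    style_prompt: str, summary_so_far: str, section_brief: str
) -> list[dict]:
    """Build the starting context for a new section's session: the full
    style instructions, an optional summary of prior sections, and the
    brief for the section to write next."""
    context = style_prompt
    if summary_so_far:
        context += f"\n\nSummary of the draft so far:\n{summary_so_far}"
    return [
        {"role": "system", "content": context},
        {"role": "user", "content": section_brief},
    ]
```

Each section's session is short, so the style instruction stays prominent, and the distilled summary keeps continuity without dragging along the drifted outputs of earlier sessions.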
What This Means For You
- Include style examples in your system prompt, not just style descriptions. Two or three concrete examples of the target voice give the model reference points that hold attention better than a textual description of what you want.
- Re-inject style instructions every 10 exchanges in long writing sessions. A brief “continue in the [specific] style we established” re-anchors the instruction before drift compounds.
- Treat each major section as a fresh session for long-form writing projects. Carry forward the essential context and style anchors, not the full conversation history. Shorter sessions maintain style better than marathon ones.
- Check the first response of each session against your style requirements before proceeding. If the first response does not match the style, correct it before the self-averaging effect sets a wrong baseline for the rest of the session.
Enjoyed this deep dive? Join my inner circle:
- Pithy Cyborg → AI news made simple without hype.
