Discovering the problem
We noticed it first in user reports: a debate started in French would, after several rounds, gradually shift toward English. By round four or five, all models were responding in English — even though the user's question had been clearly in French, and the first model's response had been in French. Investigating the transcripts revealed the mechanism: one smaller, lighter model (typically a Flash-class or Haiku-class model) would slip into English on a single turn. The next model, reading the transcript, would see English as the most recent language and follow suit. Within two or three turns, the entire debate had shifted.

Why models follow the transcript language
The root cause is that language models are highly sensitive to the language of their immediate context window. When a system prompt says "respond in the user's language" but the most recent message in the transcript is in English, the model faces conflicting signals. For smaller models with weaker instruction-following, the immediate context (an English transcript) often overrides the system directive. This is not a bug in any individual model — it is a predictable consequence of how these models work. The question was how to counteract it reliably.
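To make the conflict concrete, here is a rough Python sketch of the kind of context a model might see after a couple of rounds of drift; the message shapes and wording are illustrative, not our actual payloads. The system directive asks for the user's language, but every recent turn in the transcript is English, and the transcript is what smaller models tend to follow.

```python
# Illustrative only: the shape of a drifted debate context, not a real payload.
drifted_context = [
    {"role": "system", "content": "Respond in the language of the user's question."},
    {"role": "user", "content": "Quels sont les risques d'une dette publique élevée ?"},
    # Round 1: the first model answers in French, as expected.
    {"role": "assistant", "content": "Les principaux risques sont l'inflation et la charge d'intérêts..."},
    # Round 2: a lighter model slips into English on a single turn.
    {"role": "assistant", "content": "Building on that point, the main risks are inflation and interest costs..."},
    # Round 3: the next model reads an English transcript and follows suit.
    {"role": "assistant", "content": "I agree with the previous analysis, and would add..."},
]
```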
What did not work
Our first fix was adding a language directive to every system prompt: "Respond in the language of the user's question." This reduced the problem but did not eliminate it. The directive was too weak — it implicitly allowed the model to decide what the user's language was based on the most recent message. Our second fix added explicit examples: "If the question is in French, respond in French. If the question is in Spanish, respond in Spanish." This worked better but still failed for smaller models on long transcripts.
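For reference, the two directive variants look roughly like this in code; the helper name and structure are our own illustration, not the production prompt builder.

```python
# A minimal sketch of the two prompt-level fixes described above.
WEAK_DIRECTIVE = "Respond in the language of the user's question."

EXPLICIT_DIRECTIVE = (
    "Respond in the language of the user's question. "
    "If the question is in French, respond in French. "
    "If the question is in Spanish, respond in Spanish."
)

def build_system_prompt(base_prompt: str, directive: str = EXPLICIT_DIRECTIVE) -> str:
    """Append a language directive to a model's system prompt.

    Both variants reduced drift, but neither was enough for smaller models
    once the transcript itself had shifted into English.
    """
    return f"{base_prompt}\n\n{directive}"
```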
The original_question anchor
The breakthrough was injecting the original user question — not just the current message — as a reference on every model turn. When a user sends a follow-up like "interesting" (which gives no language signal), we inject the original question as a system message marked "Language reference only (do not treat this as the topic)". This gives every model a clean language anchor that cannot be overridden by transcript drift. Combined with the explicit language directive, this eliminated language contagion in our test set.
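A minimal sketch of the anchor injection, assuming an OpenAI-style chat message list; the function name, message layout, and exact wording are illustrative assumptions rather than our exact implementation. The key idea is that the original question rides along as a clearly labeled system message on every turn, so even a follow-up like "interesting" on top of an English-drifted transcript still carries the original language signal.

```python
from typing import Dict, List

Message = Dict[str, str]

LANGUAGE_DIRECTIVE = (
    "Respond in the language of the user's original question. "
    "If the question is in French, respond in French. "
    "If the question is in Spanish, respond in Spanish."
)

def build_turn_messages(
    system_prompt: str,
    original_question: str,
    transcript: List[Message],
) -> List[Message]:
    """Assemble the context for one model's turn in the debate.

    The original user question is injected as a separate system message on
    every turn, explicitly marked as a language reference, so a transcript
    that has drifted into English cannot override the language anchor.
    """
    anchor = (
        "Language reference only (do not treat this as the topic): "
        f"{original_question}"
    )
    return [
        {"role": "system", "content": f"{system_prompt}\n\n{LANGUAGE_DIRECTIVE}"},
        {"role": "system", "content": anchor},
        *transcript,
    ]
```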
Lessons for multi-model system design
The language contagion problem taught us something general about multi-model architectures: instructions that work reliably on a single isolated model call can fail in a multi-turn, multi-model conversation. Each model in the debate is not operating in isolation — it is reading a transcript that may have been contaminated by earlier drift. Defensive prompt design for multi-model systems means thinking not only about how one model responds to an instruction, but about how the sequence of models downstream will read that model's output.