# Authoring Prompt Ablation

## The Problem

Identity models were skewing toward dominant topics in the source data. A subject who wrote extensively about prediction markets had their entire identity model framed around prediction markets — even though their actual identity is about probabilistic reasoning, institutional skepticism, and charitable interpretation. The authoring prompts (~1,000 words each) had no guard against topic-specific positions being elevated to identity axioms.

## The Finding

A 73-word instruction eliminated topic skew entirely:

> **DOMAIN-AGNOSTIC REQUIREMENT:** You are writing a UNIVERSAL operating guide — not a summary of interests or positions. Every item must apply ACROSS this person's life, not within one topic. Test: if removing a specific subject (markets, policy, technology, medicine) makes the item meaningless, it does not belong. How someone reasons IS identity. What they reason ABOUT is not.
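
The guard's removability test is applied by the model, but it can be approximated mechanically. Below is a minimal sketch of that approximation; the domain term lists and the 50% threshold are illustrative assumptions, not part of the actual pipeline.

```python
# Hypothetical heuristic approximating the guard's removability test:
# strip a domain's terms from an item and flag it if little meaning remains.
# DOMAIN_TERMS and the threshold are illustrative, not from the real prompt.
DOMAIN_TERMS = {
    "markets": {"prediction", "market", "markets", "trading", "odds"},
    "policy": {"policy", "regulation", "legislation"},
}

def survives_domain_removal(item: str, threshold: float = 0.5) -> bool:
    """Return True if the item keeps most of its words after every
    domain-specific term is removed (i.e., it is plausibly universal)."""
    words = [w.strip(".,;:!?").lower() for w in item.split()]
    all_terms = set().union(*DOMAIN_TERMS.values())
    kept = [w for w in words if w not in all_terms]
    return len(kept) / max(len(words), 1) >= threshold
```

Under this heuristic, a domain-bound item like "Frames problems as prediction market trading odds" fails, while a universal item like "Ranks empirical measurement above theoretical argument" passes.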

## Test Design

We ran four rounds of testing across ten prompt conditions, using two subjects with known skew problems (one with 74 prediction-market facts among 1,478 total; one with 45 trading facts among 115 behavioral facts).

### Round 1: Does the guard work?

| Condition | Prompt size | Topic mentions | Result |
|-----------|------------|----------------|--------|
| Control (current) | 983 words | 9 mentions | Timed out on large inputs |
| Stripped (no guard) | 260 words | 9 mentions | Same skew, faster |
| **Stripped + guard** | **333 words** | **0 mentions** | **Topic skew eliminated** |
| Minimal + guard | 164 words | 0 mentions | Also works |
| Ultra-minimal + guard | 128 words | 0 mentions | Also works |

The guard is the only change that matters; roughly 700 words of the original prompt were ceremonial.
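
The "topic mentions" column can be computed with a simple term-frequency count over the generated identity model. The sketch below assumes a hypothetical term list; the evaluation set actually used is not shown in this report.

```python
# Illustrative sketch of the "topic mentions" metric from the table above:
# count occurrences of dominant-domain terms in a generated identity model.
# TOPIC_PATTERNS is an assumed stand-in for the real evaluation term list.
import re

TOPIC_PATTERNS = [r"\bprediction markets?\b", r"\btrading\b", r"\btraders?\b"]

def topic_mentions(output: str) -> int:
    """Total count of dominant-domain term occurrences in the output."""
    text = output.lower()
    return sum(len(re.findall(pattern, text)) for pattern in TOPIC_PATTERNS)
```

For example, `topic_mentions("He frames policy via prediction markets and trading.")` returns 2: one match for "prediction markets" and one for "trading". A guarded condition should drive this count to zero.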

### Round 2: How concise can we go?

We combined the best qualities from different conditions: concise output (C), interaction failure modes (D), and psychological depth (E).

**Winner: Condition H** — stripped structure + guard + hard output caps + psychological precision + interaction failure modes.

- 78% smaller prompts (2,903 words to 645 words)
- Zero topic skew
- Tightest output (3,690 words total across 3 layers)
- Axiom interactions now include explicit failure modes

### Round 3: Detection balance

Even with the domain guard, prediction detection examples can skew toward the dominant domain (the data is densest there). Two additional instructions fixed this:

- **Detection balance:** Lead detection with less-represented domains
- **Domain suppression:** No single domain in more than 2 predictions

Result: 0 trading terms in predictions, down from 12.
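
The domain-suppression rule is easy to verify after generation. Here is a minimal post-hoc check for it, assuming hypothetical domain term lists and a plain-text prediction format; neither detail comes from this report.

```python
# Sketch of a post-hoc check for the domain-suppression rule above:
# no single domain may appear in more than `cap` predictions.
# The DOMAINS term lists are assumed for illustration.
from collections import Counter

DOMAINS = {
    "trading": {"trading", "market", "position"},
    "medicine": {"trial", "diagnosis"},
    "policy": {"regulation", "policy"},
}

def domains_hit(prediction: str) -> set:
    """Return the set of domains whose terms appear in one prediction."""
    words = set(prediction.lower().split())
    return {d for d, terms in DOMAINS.items() if words & terms}

def violates_suppression(predictions: list, cap: int = 2) -> bool:
    """True if any single domain appears in more than `cap` predictions."""
    counts = Counter(d for p in predictions for d in domains_hit(p))
    return any(n > cap for n in counts.values())
```

Three predictions that all hit the trading domain would violate the cap; a set spread across trading, medicine, and policy would pass.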

### Round 4: Does framing matter?

We tested three framings: "operating guide" (H3), "find the invariants" (H5), and "behavioral specification" (H6).

| Framing | Total output | Topic skew |
|---------|-------------|-----------|
| Operating guide | 3,384 words | 5 terms |
| Abstraction/invariants | 4,580 words | 8 terms |
| Behavioral specification | 3,944 words | 2 terms |

"Operating guide" produces the most concise, directive output. "Behavioral specification" has lowest skew but 17% more words. "Find the invariants" actually increased both output and skew.

## What Changed

The identity model now captures **how someone reasons** (probabilistic thinking, structural analysis, charitable interpretation) rather than **what they reason about** (prediction markets, trading, policy). The same behavioral patterns that showed up as domain-specific in the old output now appear as universal patterns with domain-specific detection examples.

Before: "Frame complex social problems as information aggregation challenges that prediction markets could solve."

After: "They reason from a stable ranking of evidence types — empirical measurement beats theoretical argument, randomized beats observational, outcome beats process."

Same person. Same data. Different abstraction level.

## Implications

1. **Identity is domain-agnostic.** How you think is who you are. What you think about is context.
2. **Prompt bloat is real.** 78% of our authoring instructions were accumulated ceremony that didn't affect output quality.
3. **Small guards beat large constraints.** 73 words did what 1,000 words of careful instruction couldn't.
4. **The model already knows the difference** between identity and interests — it just needs to be asked.
