# Model Cheatsheet

Quick reference for model-specific prompt optimization.

---

## Claude (Anthropic)

| Aspect | Recommendation |
|--------|---------------|
| Structure | XML tags: `<instructions>`, `<context>`, `<example>`, `<thinking>` |
| Strengths | Long context (200K tokens), instruction following, nuanced reasoning |
| Format | Layered context with XML separators, explicit role in system message |
| CoT | Naturally strong -- use `<thinking>` tags for internal reasoning |
| Constraints | Responds well to direct, assertive constraints ("You MUST", "NEVER") |
| Pitfall | Over-politeness in output -- use direct tone instructions to counter |
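The XML layering above can be sketched as a small prompt builder. This is a minimal illustration, not an official template: the helper name `build_xml_prompt` and the sample text are hypothetical, and only the tag names come from the table.

```python
def build_xml_prompt(instructions: str, context: str, examples: list[str]) -> str:
    """Wrap each prompt layer in its own XML tag so the layers stay separable."""
    parts = [
        f"<instructions>\n{instructions.strip()}\n</instructions>",
        f"<context>\n{context.strip()}\n</context>",
    ]
    # One <example> block per example, in the order given.
    parts.extend(f"<example>\n{ex.strip()}\n</example>" for ex in examples)
    return "\n\n".join(parts)


# Hypothetical sample content, for illustration only.
prompt = build_xml_prompt(
    instructions="Summarize the report in three bullet points. NEVER add opinions.",
    context="Q3 revenue rose 12% year over year.",
    examples=["- Revenue up 12%", "- Costs flat"],
)
print(prompt)
```

The role itself would still go in the system message; this helper only assembles the user-turn content.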

## GPT (OpenAI)

| Aspect | Recommendation |
|--------|---------------|
| Structure | Markdown with `###` delimiters, system role emphasis |
| Strengths | Function/tool calling, JSON mode and structured outputs, broad knowledge |
| Format | Strong system message + clear user/assistant separation |
| CoT | Use explicit "Think step by step" or structured output with reasoning field |
| Constraints | System message constraints are highly respected |
| Pitfall | Can be verbose -- always specify length limits |
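One way to combine the `###` delimiters, the reasoning field, and the length limit is sketched below. The system message wording, the `parse_reply` helper, and the 50-word limit are all assumptions made for illustration. The code makes no API call and only validates a reply string you already have.

```python
import json

# Hypothetical system message: ### delimiters, a reasoning field, a length limit.
SYSTEM_MESSAGE = (
    "You are a concise assistant.\n"
    "### Output format\n"
    'Reply with a JSON object: {"reasoning": "<step-by-step notes>", '
    '"answer": "<final answer>"}.\n'
    "### Length\n"
    "Keep the answer under 50 words."
)


def parse_reply(raw: str, max_words: int = 50) -> dict:
    """Parse a JSON reply and check it has both fields and respects the limit."""
    data = json.loads(raw)  # raises a ValueError subclass on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    missing = {"reasoning", "answer"} - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if len(str(data["answer"]).split()) > max_words:
        raise ValueError("answer exceeds the length limit")
    return data


reply = parse_reply('{"reasoning": "2 + 2 = 4", "answer": "4"}')
```

Checking the length on the client side is a fallback: the instruction in the prompt is a request, not a guarantee.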

## Gemini (Google)

| Aspect | Recommendation |
|--------|---------------|
| Structure | Structured context blocks, grounding with provided data |
| Strengths | Multi-modal (text + image + audio + video), large context (1M+ tokens) |
| Format | Clear context separation, explicit grounding instructions |
| CoT | Benefits from explicit decomposition of complex tasks |
| Constraints | Use "You must ONLY use information from the provided context" for grounding |
| Pitfall | Can over-generalize -- be very specific about scope |
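The grounding pattern above might be assembled like this. It is a sketch under stated assumptions: the `[Context: ...]` and `[Question]` labels and the helper name are hypothetical conventions, and only the grounding sentence is taken from the table.

```python
GROUNDING_RULE = "You must ONLY use information from the provided context."


def build_grounded_prompt(question: str, documents: dict[str, str]) -> str:
    """Put the grounding rule first, then labeled context blocks, then the question."""
    blocks = [
        f"[Context: {name}]\n{text.strip()}" for name, text in documents.items()
    ]
    return "\n\n".join([GROUNDING_RULE, *blocks, f"[Question]\n{question.strip()}"])


# Hypothetical documents, for illustration only.
grounded = build_grounded_prompt(
    "What is the refund window?",
    {
        "policy.md": "Refunds are accepted within 30 days of purchase.",
        "faq.md": "Gift cards are not refundable.",
    },
)
print(grounded)
```

Naming each block makes it easier to narrow the scope later, for example by asking the model to answer from `policy.md` only.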

## Mistral

| Aspect | Recommendation |
|--------|---------------|
| Structure | Brevity-first, concise instructions |
| Strengths | Speed, efficiency, JSON mode, function calling, code generation |
| Format | Short system messages, direct task description |
| CoT | Keep reasoning chains short -- model is optimized for speed |
| Constraints | Fewer, stronger constraints work better than many weak ones |
| Pitfall | Less reliable with very long, complex instructions -- simplify |

---

## Universal Tips

1. **System message** -- always use it to set the role and constraints (all four model families above honor it)
2. **Temperature** -- 0.0-0.3 for factual/structured, 0.7-1.0 for creative tasks
3. **Max tokens** -- always set explicitly to prevent runaway responses
4. **Stop sequences** -- use for predictable output termination
5. **Few-shot** -- works across all models; 2-3 examples is usually enough
