Chain of Thought | Recipes

Concepto

Chain of Thought (CoT) instruye al modelo a razonar antes de responder. Al separar el proceso de pensamiento del output final, el modelo comete menos errores en tareas que requieren múltiples pasos lógicos: matemáticas, análisis legal, debugging, decisiones multi-criterio.

Prompt base

Before answering, work through the problem step by step inside <thinking> tags.
Your reasoning in <thinking> is private — the user does not see it.
After reasoning, write your final answer inside <answer> tags.

<thinking>
[Reason through the problem here. Consider edge cases, alternative
interpretations, and potential errors in your own reasoning.]
</thinking>

<answer>
[Final response here — clear, concise, without showing the reasoning steps
unless the user specifically asked for them.]
</answer>

Variante con confianza

Útil cuando necesitas que el modelo evalúe su propia certeza:

Work through the problem step by step inside <thinking> tags, then give
your final answer inside <answer> tags and a confidence score (0–100) inside
<confidence> tags.

<thinking>
...
</thinking>

<answer>
...
</answer>

<confidence>82</confidence>

Parsing del output (TypeScript)

typescript

function parseCoTResponse(text: string): {
  thinking: string;
  answer: string;
  confidence?: number;
} {
  const thinking = text.match(/<thinking>([\s\S]*?)<\/thinking>/)?.[1]?.trim() ?? "";
  const answer = text.match(/<answer>([\s\S]*?)<\/answer>/)?.[1]?.trim() ?? text;
  const confidenceStr = text.match(/<confidence>(\d+)<\/confidence>/)?.[1];
  const confidence = confidenceStr ? parseInt(confidenceStr, 10) : undefined;
 
  return { thinking, answer, confidence };
}
 
const response = await client.messages.create({
  model: "claude-opus-4-5",
  max_tokens: 4096,
  system: COT_SYSTEM_PROMPT,
  messages: [{ role: "user", content: userQuestion }],
});
 
const rawText = response.content
  .filter((b): b is Anthropic.TextBlock => b.type === "text")
  .map((b) => b.text)
  .join("");
 
const { thinking, answer, confidence } = parseCoTResponse(rawText);
 
// Log thinking for debugging, show only answer to user
console.debug("[thinking]", thinking);
console.log("[answer]", answer);

Extended Thinking (Claude)

Para Claude, puedes delegar el razonamiento al modelo de forma nativa usando thinking:

typescript

const response = await client.messages.create({
  model: "claude-opus-4-5",
  max_tokens: 16000,
  thinking: {
    type: "enabled",
    budget_tokens: 10000, // Tokens máximos para razonamiento interno
  },
  messages: [{ role: "user", content: userQuestion }],
});
 
for (const block of response.content) {
  if (block.type === "thinking") {
    console.debug("[thinking]", block.thinking);
  }
  if (block.type === "text") {
    console.log("[answer]", block.text);
  }
}

Cuándo usar CoT

Tarea	Beneficio esperado
Matemáticas y lógica	Alto — reduce errores de cálculo
Clasificación multi-criterio	Alto — mejora consistencia
Preguntas de conocimiento general	Bajo — puede añadir ruido
Generación creativa	Bajo o negativo — rompe el flujo

No uses CoT para todas las llamadas. En tareas simples, el razonamiento adicional no mejora el resultado y aumenta la latencia y el costo. Reserva CoT para decisiones que tienen consecuencias reales.

Errores comunes

No separar el thinking del output: si el razonamiento queda en la respuesta final, confunde al usuario y expone trabajo interno que puede incluir errores intermedios.
Budget demasiado bajo: con budget_tokens muy pequeño en extended thinking, el modelo puede generar razonamiento cortado que degrada la respuesta.
Pensar que CoT = correcto: CoT mejora la fiabilidad, no la garantiza. Sigue siendo necesaria la validación del output en casos críticos.