Constrained Decoding Visualizer

Watch how grammar constraints reshape a language model's token distribution, step by step.

[Interactive chart: feasible mass Z_t at each step (the fraction of probability mass on grammar-valid tokens, on a scale from 0 to 1), shown alongside the raw model distribution p_t and the constrained distribution q_t.]

At each generation step, the model produces a distribution p_t over all tokens. A grammar constraint defines which tokens are valid given the current parse state. The feasible mass Z_t = sum of p_t(i) for valid tokens i tells us how much the constraint "agrees" with what the model wanted to say.
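As a rough illustration of the definition above, the sketch below computes Z_t for a single step. The four-token vocabulary, its probabilities, and the helper name `feasible_mass` are made up for the example; a real model would supply p_t over its full vocabulary, and a real grammar engine would supply the valid set.

```python
def feasible_mass(p, valid):
    """Return Z_t: the total probability p assigns to grammar-valid tokens."""
    return sum(prob for tok, prob in p.items() if tok in valid)

# Hypothetical step: the grammar requires a JSON value to begin here,
# but the model would rather start a sentence.
p_t = {"{": 0.05, "[": 0.05, "Sure": 0.60, "The": 0.30}
valid_t = {"{", "["}

z_t = feasible_mass(p_t, valid_t)
print(f"Z_t = {z_t:.2f}")
```

Here only a tenth of the model's probability mass lands on tokens the grammar allows, so the constraint and the model disagree strongly at this step.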

The constrained distribution q_t(i) = p_t(i) / Z_t for valid tokens (zero otherwise) is what we actually sample from. Dividing by Z_t renormalizes the surviving probabilities so that q_t sums to 1. When Z_t is low (red), the constraint is fighting the model: few valid tokens carry probability mass. When Z_t is high (green), the model naturally wants to produce grammar-valid tokens.
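The renormalization can be sketched in a few lines. This is a minimal example rather than the visualizer's own code: the function name `constrain`, the toy vocabulary, and the choice to raise an error when Z_t is zero (where q_t is undefined) are all assumptions made for illustration.

```python
import random

def constrain(p, valid):
    """Return (q, z): q(i) = p(i) / z for valid tokens and 0 otherwise."""
    z = sum(prob for tok, prob in p.items() if tok in valid)
    if z == 0.0:
        # q is undefined when no valid token has any probability mass.
        raise ValueError("feasible mass is zero; cannot renormalize")
    q = {tok: (prob / z if tok in valid else 0.0) for tok, prob in p.items()}
    return q, z

# Hypothetical step with low feasible mass.
p_t = {"{": 0.05, "[": 0.05, "Sure": 0.60, "The": 0.30}
valid_t = {"{", "["}

q_t, z_t = constrain(p_t, valid_t)

# Sample the next token from q_t; invalid tokens have weight 0
# and can never be drawn.
tokens = list(q_t)
next_token = random.choices(tokens, weights=[q_t[t] for t in tokens], k=1)[0]
print(z_t, q_t, next_token)
```

In this toy case each valid token held 0.05 of the raw mass, and after dividing by Z_t each holds 0.5 of the constrained mass. The relative odds among valid tokens are unchanged; only the invalid tokens are removed.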