Dragging distributions apart
Each panel starts from five logits over the same vocabulary. Drag a bar handle up or down to change those logits directly, then compare the softmax probabilities, the resulting samples, and the KL divergence.
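As a rough illustration of what each panel computes, the sketch below turns five logits into softmax probabilities and draws a short sample sequence from them. The word list, the logit values, and the seed are hypothetical placeholders, not the ones used by the page.

```python
import math
import random

def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical five-word vocabulary and logits (assumed for illustration).
words = ["a", "b", "c", "d", "e"]
logits_p = [2.0, 1.0, 0.0, -1.0, -2.0]

probs_p = softmax(logits_p)

# Draw a sample sequence of 10 words; a fixed seed keeps the run repeatable.
rng = random.Random(0)
sample = rng.choices(words, weights=probs_p, k=10)

print([round(p, 3) for p in probs_p])
print(" ".join(sample))
```

Raising one logit raises that word's probability and lowers all the others, because softmax normalizes over the whole vocabulary.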
Words: the same 5-word support for both distributions
Drag the logits directly
KL shows how far P is from Q
KL divergence: 0.000 bits
This page shows D_KL(P || Q) = Σ_x p(x) log2(p(x) / q(x)), measured in bits because the logarithm is base 2. It is directional: swapping P and Q generally gives a different value.
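The formula can be checked with a few lines of code. This is a minimal sketch, assuming both distributions come from softmax over five logits (so no probability is exactly zero); the logit values are made up for illustration.

```python
import math

def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_bits(p, q):
    """D_KL(P || Q) in bits: sum over x of p(x) * log2(p(x) / q(x))."""
    # Terms with p(x) == 0 contribute nothing; softmax outputs keep q(x) > 0.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits: P favors the first word, Q is uniform.
p = softmax([2.0, 0.0, 0.0, 0.0, 0.0])
q = softmax([0.0, 0.0, 0.0, 0.0, 0.0])

print(kl_bits(p, p))  # identical distributions give 0.0
print(kl_bits(p, q))  # D_KL(P || Q)
print(kl_bits(q, p))  # D_KL(Q || P), a different number
```

Two identical sets of logits give a divergence of zero, which corresponds to the 0.000 bits readout when both panels match.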
Reference Distribution
Bar heights are logits for P.
Softmax Probabilities
Sample Sequence From P
Comparison Distribution
Bar heights are logits for Q.
Softmax Probabilities
Sample Sequence From Q