The Geometry of LLM Logits (an analytical outer bound)
1 Preliminaries
| Symbol | Meaning |
| --- | --- |
| $d$ | width of the residual stream (e.g. 768 in GPT-2-small) |
| $L$ | number of Transformer blocks |
| $V$ | vocabulary size, so logits live in $\mathbb{R}^V$ |
| $h_\ell \in \mathbb{R}^d$ | residual-stream vector after block $\ell$ ($h_0$ = embeddings) |
| $\Delta_\ell \in \mathbb{R}^d$ | the update written by block $\ell$ |
| $W_U \in \mathbb{R}^{V \times d}$, $b \in \mathbb{R}^V$ | un-embedding matrix and bias |
**Additive residual stream.** With (pre-/peri-norm) residual connections,

$$h_\ell = h_{\ell-1} + \Delta_\ell, \qquad \ell = 1, \dots, L.$$

Hence the final pre-logit state is the sum of $L+1$ contributions (block 0 = token + positional embeddings, written $\Delta_0 := h_0$):

$$h_L = \sum_{\ell=0}^{L} \Delta_\ell.$$
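For readers who like to see the bookkeeping, here is a minimal numerical sketch of the additive decomposition; the toy blocks (tanh in place of GELU, random weights, arbitrary shapes) are illustrative only, not taken from any real model:

```python
import numpy as np

rng = np.random.default_rng(0)
d, L = 8, 4  # toy residual width and block count

def layer_norm(h, gamma, beta, eps=1e-5):
    mu, var = h.mean(), h.var()
    return gamma * (h - mu) / np.sqrt(var + eps) + beta

# Toy blocks: each reads a LayerNormed copy of the stream and
# writes an update back into it (pre-norm residual connection).
params = [
    (rng.normal(size=(d, d)) / np.sqrt(d),   # W_in
     rng.normal(size=(d, d)) / np.sqrt(d),   # W_out
     rng.normal(size=d),                     # gamma
     rng.normal(size=d))                     # beta
    for _ in range(L)
]

def block_update(h, W_in, W_out, gamma, beta):
    a = np.tanh(layer_norm(h, gamma, beta) @ W_in)  # Lipschitz non-linearity
    return a @ W_out

h = rng.normal(size=d)            # Delta_0: token + positional embedding
deltas = [h.copy()]
for p in params:
    delta = block_update(h, *p)   # Delta_ell
    h = h + delta                 # h_ell = h_{ell-1} + Delta_ell
    deltas.append(delta)

# The final pre-logit state equals the sum of all L+1 contributions.
assert np.allclose(h, np.sum(deltas, axis=0))
```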
2 Each update is contained in an ellipsoid
**Why a bound exists.** Every sub-module (attention head or MLP)

- reads a LayerNormed copy of its input, so $\|\tilde h\|_2 \le \sqrt{d}\,\|\gamma_\ell\|_\infty + \|\beta_\ell\|_2$, where $\tilde h = \mathrm{LN}_\ell(h)$ and $\gamma_\ell$ is that block's learned scale (and $\beta_\ell$ its learned shift);
- applies linear maps, a Lipschitz point-wise non-linearity (GELU, SiLU, …), and another linear map back to $\mathbb{R}^d$.
Because the composition of linear maps and Lipschitz functions is itself Lipschitz, and the LayerNormed input ranges over a bounded set, there exists a constant $\kappa_\ell < \infty$ such that

$$\|\Delta_\ell\|_2 \le \kappa_\ell \quad \text{for every input and position.}$$

Define the centred ellipsoid (here a ball, the special case of an ellipsoid with equal semi-axes)

$$\mathcal{E}_\ell := \{\, x \in \mathbb{R}^d : \|x\|_2 \le \kappa_\ell \,\}.$$

Then every realisable update lies inside that ellipsoid:

$$\Delta_\ell \in \mathcal{E}_\ell.$$
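One way to obtain a concrete (if loose) $\kappa_\ell$ is to chain operator norms: assuming the block computes $W^{\text{out}}\,\phi(W^{\text{in}}\,\mathrm{LN}(h) + b^{\text{in}}) + b^{\text{out}}$ with a pointwise $L_\phi$-Lipschitz activation fixing $0$, then $\kappa_\ell \le \|W^{\text{out}}\|_2\, L_\phi\, (\|W^{\text{in}}\|_2\, r_\ell + \|b^{\text{in}}\|_2) + \|b^{\text{out}}\|_2$, with $r_\ell$ the LayerNorm bound above. A numpy sketch under those assumptions (the MLP form and GELU's Lipschitz constant ≈ 1.13 are assumptions, not from the post):

```python
import numpy as np

def spectral_norm(W):
    """Largest singular value = operator 2-norm."""
    return np.linalg.norm(W, 2)

def mlp_update_bound(W_in, b_in, W_out, b_out, gamma, beta, lip_act=1.13):
    """Upper bound kappa_ell on ||Delta_ell||_2 for one MLP block.

    Assumes the block computes W_out @ act(W_in @ LN(h) + b_in) + b_out,
    with a pointwise lip_act-Lipschitz activation that fixes 0
    (1.13 is roughly GELU's Lipschitz constant).
    """
    d = gamma.shape[0]
    r = np.sqrt(d) * np.abs(gamma).max() + np.linalg.norm(beta)  # ||LN(h)|| bound
    pre_act = spectral_norm(W_in) * r + np.linalg.norm(b_in)     # pre-activation bound
    return spectral_norm(W_out) * lip_act * pre_act + np.linalg.norm(b_out)

# Toy example with random weights at GPT-2-small shapes (d=768, 4d hidden):
rng = np.random.default_rng(0)
d = 768
W_in = rng.normal(size=(4 * d, d)) / np.sqrt(d)
W_out = rng.normal(size=(d, 4 * d)) / np.sqrt(4 * d)
kappa = mlp_update_bound(W_in, np.zeros(4 * d), W_out, np.zeros(d),
                         gamma=np.ones(d), beta=np.zeros(d))
print(f"kappa_ell <= {kappa:.1f}")
```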
3 Residual stream ⊆ Minkowski sum of ellipsoids
Using additivity and Step 2,

$$h_L = \sum_{\ell=0}^{L} \Delta_\ell \in \mathcal{S} := \mathcal{E}_0 \oplus \mathcal{E}_1 \oplus \cdots \oplus \mathcal{E}_L,$$

where $\mathcal{S}$ is the Minkowski sum $\{x_0 + x_1 + \cdots + x_L : x_\ell \in \mathcal{E}_\ell\}$ of the individual ellipsoids. (Step 2 covers blocks $1, \dots, L$; the embedding contribution $\Delta_0$ ranges over a finite set of vectors, one per token/position pair, so it too lies in some ball $\mathcal{E}_0$.)
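The Minkowski sum never needs to be materialised: its support function is the sum of the summands' support functions, which is what makes the outer set computable. A sketch with the Step-2 balls (the radii $\kappa_\ell$ below are made-up numbers):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16
kappas = np.array([3.0, 1.5, 2.2, 0.7])  # per-block bounds from Step 2

def support_ball(u, kappa):
    """Support function h(u) = max_{||x|| <= kappa} <x, u> = kappa * ||u||."""
    return kappa * np.linalg.norm(u)

def sample_ball(kappa):
    """Uniform sample from the radius-kappa ball (direction times radius)."""
    x = rng.normal(size=d)
    return kappa * rng.uniform() ** (1 / d) * x / np.linalg.norm(x)

# Support function of a Minkowski sum = sum of support functions.
u = rng.normal(size=d)
h_sum = sum(support_ball(u, k) for k in kappas)

# Any sum of one point per ball must respect it.
for _ in range(1000):
    x = sum(sample_ball(k) for k in kappas)
    assert x @ u <= h_sum + 1e-9
```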
4 Logit space is an affine image of that sum
Logits are produced by the affine map $z = W_U h_L + b$. For any sets $A, B \subseteq \mathbb{R}^d$,

$$W_U (A \oplus B) = W_U A \oplus W_U B.$$

Hence

$$z \in b + W_U \mathcal{S} = b + \bigl( W_U \mathcal{E}_0 \oplus W_U \mathcal{E}_1 \oplus \cdots \oplus W_U \mathcal{E}_L \bigr).$$

Because linear images of ellipsoids are ellipsoids, each $W_U \mathcal{E}_\ell$ is still an ellipsoid (a possibly degenerate one in $\mathbb{R}^V$, since $\operatorname{rank} W_U \le d < V$).
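Concretely, the image of the radius-$\kappa_\ell$ ball under $W_U$ is the ellipsoid whose semi-axes point along the left singular vectors of $W_U$ with lengths $\kappa_\ell\,\sigma_i(W_U)$. A toy verification (shapes and $\kappa$ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
V, d = 50, 16          # toy vocabulary and residual width
W_U = rng.normal(size=(V, d))
kappa = 2.0            # Step-2 bound for this block

# Semi-axes of the ellipsoid W_U * {||x|| <= kappa}:
U, s, _ = np.linalg.svd(W_U, full_matrices=False)
semi_axes = kappa * s            # lengths, along the columns of U

# Empirical check: images of random ball points satisfy the
# ellipsoid inequality sum_i (<y, u_i> / (kappa * s_i))^2 <= 1.
for _ in range(1000):
    x = rng.normal(size=d)
    x *= kappa * rng.uniform() ** (1 / d) / np.linalg.norm(x)
    y = W_U @ x
    coords = U.T @ y             # coordinates along the axes
    assert np.sum((coords / semi_axes) ** 2) <= 1 + 1e-9
```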
5 Ellipsotopes
An **ellipsotope** is an affine shift of a finite Minkowski sum of ellipsoids. The set

$$\mathcal{T} := b + W_U \mathcal{E}_0 \oplus W_U \mathcal{E}_1 \oplus \cdots \oplus W_U \mathcal{E}_L$$

is therefore an ellipsotope.
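With ball-shaped $\mathcal{E}_\ell$, this ellipsotope has the closed-form support function $h_{\mathcal{T}}(u) = \langle b, u\rangle + \bigl(\sum_\ell \kappa_\ell\bigr)\,\|W_U^\top u\|_2$, and exhibiting any direction $u$ with $\langle z, u\rangle > h_{\mathcal{T}}(u)$ certifies that a logit vector $z$ falls outside $\mathcal{T}$. A sketch of that one-sided test (the random-direction search is a heuristic of ours, not part of the post):

```python
import numpy as np

rng = np.random.default_rng(3)
V, d = 50, 16
W_U = rng.normal(size=(V, d))
b = rng.normal(size=V)
kappas = np.array([3.0, 1.5, 2.2, 0.7])   # Step-2 bounds, incl. embeddings

def support_T(u):
    """Support function of T = b + sum_l W_U * ball(kappa_l)."""
    return b @ u + kappas.sum() * np.linalg.norm(W_U.T @ u)

def certify_outside(z, n_dirs=2000):
    """Return a direction u with <z,u> > h_T(u) if one is found (z is
    provably outside T); else None (test inconclusive)."""
    for _ in range(n_dirs):
        u = rng.normal(size=V)
        if z @ u > support_T(u) + 1e-9:
            return u
    return None

# A point far outside the ellipsotope is easily certified:
z_far = b + 10 * kappas.sum() * np.linalg.norm(W_U, 2) * np.ones(V) / np.sqrt(V)
print(certify_outside(z_far) is not None)   # True
```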
6 Main result (outer bound)
**Theorem.** For any pre-norm or peri-norm Transformer language model whose blocks receive LayerNormed inputs, the set $\mathcal{Z}$ of all logit vectors attainable over every prompt and position satisfies

$$\mathcal{Z} \subseteq \mathcal{T},$$

where $\mathcal{T}$ is the ellipsotope defined above.

*Proof.* Containments in Steps 2–4 compose to give the stated inclusion; Step 5 shows the outer set is an ellipsotope. ∎
7 Remarks & implications
**It is an outer approximation.** Equality would require showing that every point of the ellipsotope can actually be realised by some token context, which the argument does not provide.
**Geometry-aware compression and safety.** Because $\mathcal{T}$ is convex and centrally symmetric about $b$, one can fit a minimum-volume outer ellipsoid to it, yielding tight norm-based regularisers or robustness certificates against weight noise / quantisation.
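As an illustration of what such a fit can look like (this particular construction is standard ellipsoidal calculus, not something the post specifies): a Minkowski sum of centred ellipsoids with shape matrices $P_\ell$ lies inside the centred ellipsoid with shape matrix $(\sum_\ell c_\ell)\sum_\ell P_\ell / c_\ell$, $c_\ell = \sqrt{\operatorname{tr} P_\ell}$, which is trace-optimal among bounds of this family:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 6, 3

def random_shape_matrix(d):
    """Random SPD shape matrix P; the ellipsoid is {x : x^T P^-1 x <= 1}."""
    A = rng.normal(size=(d, d))
    return A @ A.T + 0.1 * np.eye(d)

Ps = [random_shape_matrix(d) for _ in range(n)]

# Trace-optimal outer ellipsoid of the Minkowski sum:
# P_outer = (sum_l c_l) * sum_l P_l / c_l  with  c_l = sqrt(tr P_l).
cs = [np.sqrt(np.trace(P)) for P in Ps]
P_outer = sum(cs) * sum(P / c for P, c in zip(Ps, cs))

# Check via support functions: h_sum(u) = sum_l sqrt(u^T P_l u)
# must never exceed h_outer(u) = sqrt(u^T P_outer u).
for _ in range(1000):
    u = rng.normal(size=d)
    h_sum = sum(np.sqrt(u @ P @ u) for P in Ps)
    assert h_sum <= np.sqrt(u @ P_outer @ u) + 1e-9
```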
**Layer-wise attribution.** The individual sets $W_U \mathcal{E}_\ell$ bound how much any single layer can move the logits, complementing "logit-lens"-style analyses.
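For instance, with the Step-2 balls, layer $\ell$ can move token $i$'s logit by at most $\kappa_\ell\,\|(W_U)_{i,:}\|_2$, so a full layer-by-token influence table costs one pass over the rows of $W_U$ (the numbers below are made up):

```python
import numpy as np

rng = np.random.default_rng(5)
V, d = 50, 16
W_U = rng.normal(size=(V, d))
kappas = np.array([3.0, 1.5, 2.2, 0.7])   # Step-2 bounds per layer

# max_{||Delta|| <= kappa_l} |<(W_U)_i, Delta>| = kappa_l * ||(W_U)_i||_2
row_norms = np.linalg.norm(W_U, axis=1)    # shape (V,)
influence = np.outer(kappas, row_norms)    # (layers, tokens)

layer, token = np.unravel_index(influence.argmax(), influence.shape)
print(f"layer {layer} can move token {token}'s logit by at most "
      f"{influence[layer, token]:.2f}")
```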
**Assumptions.** LayerNorm guarantees each block's input is bounded; Lipschitz (but not necessarily bounded) activations (GELU, SiLU) then give finite $\kappa_\ell$. Architectures without such norm control would require separate analysis.