
Scoring Dimensions

eigenhelm scores code across five dimensions. Each measures a different aspect of code quality, and together they produce a holistic assessment.

Manifold drift

Weight: 0.30 | Source: PCA reconstruction error

Measures how far a file's structural fingerprint sits from the learned code quality manifold. High drift means the code has unusual structural properties not seen in the training corpus.

What drives high drift:

  • Unusual AST patterns (deeply nested conditionals, atypical class structures)
  • Code that doesn't resemble any category in the training corpus
  • Generated code with repetitive boilerplate
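Drift can be sketched as a standard PCA reconstruction error. The function and argument names below are illustrative, not eigenhelm's actual API; it assumes a components matrix and corpus mean have already been fitted on the training corpus.

```python
import numpy as np

def manifold_drift(features, mean, components):
    """Reconstruction error of one file's structural fingerprint
    against a fitted PCA basis.

    features:   1-D feature vector for the file
    mean:       corpus mean, same shape as features
    components: learned principal axes, shape (n_components, n_features)
    """
    centered = features - mean
    projected = components @ centered          # coordinates on the manifold
    reconstructed = components.T @ projected   # back into feature space
    return float(np.linalg.norm(centered - reconstructed))
```

A file whose centered features lie entirely within the span of the learned components reconstructs perfectly (drift 0.0); anything orthogonal to the manifold shows up directly in the error.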

Manifold alignment

Weight: 0.30 | Source: PCA projection onto quality axes

Measures how well the code aligns with the principal directions of quality variation learned during training. Good alignment means the code varies from the corpus mean in the same directions that high-quality code does.

What drives poor alignment:

  • Metric combinations rarely seen in quality code (e.g., high complexity + low vocabulary)
  • Structural patterns orthogonal to the quality manifold
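One plausible reading of "aligns with the principal directions of quality variation" is a cosine similarity in component space between the file's PCA coordinates and a learned quality direction. This is a sketch under that assumption — `quality_direction` and the function name are hypothetical, not part of eigenhelm's documented interface.

```python
import numpy as np

def manifold_alignment(features, mean, components, quality_direction):
    """Cosine alignment between a file's PCA coordinates and a learned
    quality direction in component space: 1.0 means the file varies from
    the corpus mean the way high-quality code does, -1.0 the opposite."""
    coords = components @ (features - mean)
    denom = np.linalg.norm(coords) * np.linalg.norm(quality_direction)
    if denom == 0.0:
        return 0.0  # no variation to align (or degenerate direction)
    return float(coords @ quality_direction / denom)
```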

Token entropy

Weight: 0.15 | Source: Shannon entropy of byte stream

Measures the information density of the source code. Well-written code tends to have moderate entropy — not too repetitive (low entropy) and not too noisy (high entropy).

The normalized penalty is 1.0 - (entropy / 8.0), where 8.0 bits per byte is the maximum possible byte entropy, so lower entropy (more repetitive code) yields a higher penalty.

What drives poor entropy:

  • Highly repetitive code (copy-pasted blocks)
  • Extremely terse or obfuscated code
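The penalty above is a direct computation over byte frequencies; a minimal sketch (function name is illustrative):

```python
import math
from collections import Counter

def entropy_penalty(source: bytes) -> float:
    """Shannon entropy of the byte stream, mapped to the normalized
    penalty 1.0 - (entropy / 8.0). A single repeated byte has entropy 0
    (penalty 1.0); a uniformly random stream approaches 8 bits/byte
    (penalty ~0.0)."""
    if not source:
        return 1.0  # treat empty input as maximally repetitive
    total = len(source)
    entropy = -sum(
        (n / total) * math.log2(n / total)
        for n in Counter(source).values()
    )
    return 1.0 - entropy / 8.0
```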

Compression structure

Weight: 0.15 | Source: Birkhoff aesthetic measure

Applies the Birkhoff aesthetic measure — an information-theoretic ratio of order to complexity:

$$M_Z = \frac{N \cdot H - K}{N \cdot H}$$

Where N is the raw size in bytes, H is the Shannon entropy in bits per byte, and K is the zlib-compressed size in bytes.

High values indicate the code has significant redundancy that compression can exploit — often a sign of structural repetition.
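Taking the formula literally (K in compressed bytes set against the entropy-predicted size N·H, as written above — the real implementation's unit handling may differ), the measure can be sketched as:

```python
import math
import zlib
from collections import Counter

def birkhoff_measure(source: bytes) -> float:
    """M_Z = (N*H - K) / (N*H): how much smaller the zlib-compressed
    size K is than the entropy-predicted size N*H. Higher values mean
    more structural redundancy for the compressor to exploit."""
    n = len(source)
    if n == 0:
        return 0.0
    counts = Counter(source)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    if h == 0.0:
        return 0.0  # degenerate: a single repeated byte
    k = len(zlib.compress(source))
    return (n * h - k) / (n * h)
```

This is also why small files behave badly (see below): with little statistical mass, the fixed zlib overhead in K dominates the ratio.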

Small files

Files under ~80 lines almost always score high on this dimension due to insufficient statistical mass. Directives from compression structure are capped at medium severity for small files.

NCD exemplar distance

Weight: 0.10 | Source: Normalized Compression Distance

Measures how similar the code is to the nearest high-quality exemplar stored in the model, using Normalized Compression Distance:

$$NCD(x, y) = \frac{C(xy) - \min(C(x), C(y))}{\max(C(x), C(y))}$$

Low NCD means the code structurally resembles a known good example. High NCD means it's unlike anything in the exemplar set.
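The NCD formula above maps directly onto zlib as the compressor C — a minimal sketch:

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance with zlib as the compressor C:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    Near 0 when x and y share structure; near 1 when they don't."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)
```

In scoring, this would be evaluated against each stored exemplar and the minimum distance kept.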

Weight configurations

Weights adjust based on available data:

Scenario             Drift   Alignment   Entropy   Compression   NCD
Model + exemplars    0.30    0.30        0.15      0.15          0.10
Model only           0.35    0.35        0.15      0.15          0.00
Exemplars only       0.00    0.00        0.30      0.30          0.40
Neither (fallback)   0.00    0.00        0.50      0.50          0.00
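The table can be restated as a small selection function — a sketch whose function and key names are illustrative, not eigenhelm's actual configuration API:

```python
def select_weights(has_model: bool, has_exemplars: bool) -> dict:
    """Per-dimension weights, chosen by which artifacts are available.
    Values mirror the weight table in the docs; each row sums to 1.0."""
    if has_model and has_exemplars:
        return {"drift": 0.30, "alignment": 0.30, "entropy": 0.15,
                "compression": 0.15, "ncd": 0.10}
    if has_model:  # no exemplars: NCD drops out, PCA dimensions absorb it
        return {"drift": 0.35, "alignment": 0.35, "entropy": 0.15,
                "compression": 0.15, "ncd": 0.00}
    if has_exemplars:  # no model: lean on NCD and the local signals
        return {"drift": 0.00, "alignment": 0.00, "entropy": 0.30,
                "compression": 0.30, "ncd": 0.40}
    # fallback: only the model-free local signals remain
    return {"drift": 0.00, "alignment": 0.00, "entropy": 0.50,
            "compression": 0.50, "ncd": 0.00}
```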