Quick Start¶
Evaluate a file¶
The --classify flag adds accept/marginal/reject labels to the output.
Evaluate a directory¶
eigenhelm discovers all supported files recursively.
Evaluate only changed files¶
Useful in CI to score only what changed in a PR.
Read the output¶
myfile.py
decision: marginal
score: 0.55 (p78 — better than 78% of training corpus)
confidence: high
contributions:
manifold_drift 0.13 (weight: 0.30, normalized: 0.43)
manifold_alignment 0.12 (weight: 0.30, normalized: 0.40)
token_entropy 0.07 (weight: 0.15, normalized: 0.46)
compression_structure 0.14 (weight: 0.15, normalized: 0.92)
ncd_exemplar_distance 0.09 (weight: 0.10, normalized: 0.92)
directives:
[low] reduce_complexity → MyClass (lines 7-50)
#1 halstead_difficulty: contribution=-0.84, deviation=+1.6σ
Score: 0.0 (ideal) to 1.0 (worst). Lower is better.
Percentile: "p78" means this file scores better than 78% of the training corpus.
Decision:
| Decision | Hardcoded default | Meaning |
|---|---|---|
| accept | score < 0.4 | Code quality is good |
| marginal | 0.4 ≤ score < 0.6 | Review directives, improve if straightforward |
| reject | score ≥ 0.6 | Quality issues need attention |
Note
eh init generates .eigenhelm.toml with thresholds of 0.3/0.7. These override the hardcoded defaults above. Model-calibrated thresholds (from training corpus percentiles) also override when available.
Directives: Actionable suggestions with severity ([high], [medium], [low]) pointing to specific code locations.
Regions: When a file contains inline test code (Rust #[cfg(test)], Python class Test*), eigenhelm shows a breakdown:
This helps you see whether a bad score comes from production code or repetitive test patterns. See Test code dilution for details.
Output formats¶
Initialize project config¶
Creates .eigenhelm.toml with sensible defaults. See Configuration for details.
Set up for AI agents¶
Install the agent skill to give your coding agent the correct workflow:
The skill teaches agents to evaluate after tests pass, apply obvious fixes for rejects, and stop after two passes — preventing the over-iteration trap where agents break working code to chase a score.