Quick Start

Evaluate a file

eh evaluate path/to/file.py --classify

The --classify flag adds accept/marginal/reject labels to the output.

Evaluate a directory

eh evaluate src/ --classify

eigenhelm discovers all supported files recursively.

Evaluate only changed files

eh evaluate --diff origin/main...HEAD --classify

Useful in CI to score only what changed in a PR.

Read the output

myfile.py
  decision: marginal
  score:    0.55 (p78 — better than 78% of training corpus)
  confidence: high
  contributions:
    manifold_drift           0.13  (weight: 0.30, normalized: 0.43)
    manifold_alignment       0.12  (weight: 0.30, normalized: 0.40)
    token_entropy            0.07  (weight: 0.15, normalized: 0.46)
    compression_structure    0.14  (weight: 0.15, normalized: 0.92)
    ncd_exemplar_distance    0.09  (weight: 0.10, normalized: 0.92)
  directives:
    [low] reduce_complexity → MyClass (lines 7-50)
      #1 halstead_difficulty: contribution=-0.84, deviation=+1.6σ

Score: 0.0 (ideal) to 1.0 (worst). Lower is better.
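
In the sample output above, each contribution matches its weight multiplied by its normalized value, and the contributions add up to the reported score. The sketch below checks that arithmetic with the printed numbers. The weighted-sum reading is inferred from the sample rather than stated by eigenhelm, and the variable names are illustrative only.

```python
# (weight, normalized) pairs copied from the sample output above.
metrics = {
    "manifold_drift": (0.30, 0.43),
    "manifold_alignment": (0.30, 0.40),
    "token_entropy": (0.15, 0.46),
    "compression_structure": (0.15, 0.92),
    "ncd_exemplar_distance": (0.10, 0.92),
}

# Assumption: contribution = weight * normalized (inferred from the sample).
contributions = {
    name: weight * normalized for name, (weight, normalized) in metrics.items()
}
score = sum(contributions.values())

for name, value in contributions.items():
    print(f"{name:<24} {value:.2f}")
print(f"score: {score:.2f}")
```

Under that reading, a metric with a high normalized value but a low weight (such as ncd_exemplar_distance here) moves the score less than a moderately elevated metric with a high weight.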

Percentile: "p78" means this file scores better than 78% of the training corpus.
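
Because lower scores are better, "better than 78%" means that 78% of corpus files have a higher (worse) score. A minimal sketch of that reading follows. The function name and the corpus values are hypothetical, and eigenhelm's actual percentile computation may differ.

```python
def percentile_better_than(score: float, corpus_scores: list[float]) -> int:
    """Percent of corpus files whose score is worse (higher) than `score`."""
    worse = sum(1 for s in corpus_scores if s > score)
    return round(100 * worse / len(corpus_scores))


# Hypothetical corpus scores, for illustration only.
corpus = [0.20, 0.35, 0.50, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90]

print(f"p{percentile_better_than(0.55, corpus)}")  # prints "p70"
```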

Decision:

Decision   Hardcoded default   Meaning
accept     score < 0.4         Code quality is good
marginal   0.4 ≤ score < 0.6   Review directives, improve if straightforward
reject     score ≥ 0.6         Quality issues need attention

Note

eh init generates .eigenhelm.toml with thresholds of 0.3/0.7. These override the hardcoded defaults above. Model-calibrated thresholds (derived from training corpus percentiles) also override the hardcoded defaults when available.
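
The decision rule can be sketched as a small function. This is an illustration of the thresholds described above, not eigenhelm's implementation: the function and parameter names are hypothetical, and it assumes that 0.3/0.7 means "accept below 0.3, reject at or above 0.7".

```python
def classify(score: float, accept_below: float = 0.4, reject_from: float = 0.6) -> str:
    """Map a score to a decision; defaults mirror the hardcoded thresholds."""
    if score < accept_below:
        return "accept"
    if score < reject_from:
        return "marginal"
    return "reject"


# Hardcoded defaults (0.4 / 0.6).
print(classify(0.55))  # marginal
print(classify(0.35))  # accept
print(classify(0.65))  # reject

# Assumed reading of the eh init thresholds (0.3 / 0.7).
print(classify(0.35, accept_below=0.3, reject_from=0.7))  # marginal
print(classify(0.65, accept_below=0.3, reject_from=0.7))  # marginal
```

The same score can therefore receive a different decision depending on which thresholds are in effect, so check whether a .eigenhelm.toml is present before comparing results across projects.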

Directives: Actionable suggestions with severity ([high], [medium], [low]) pointing to specific code locations.

Regions: When a file contains inline test code (Rust #[cfg(test)], Python class Test*), eigenhelm shows a breakdown:

  regions:
    production (lines 1-80):  0.55 (p55)
    test (lines 81-270):      0.82 (p8)

This helps you see whether a bad score comes from production code or repetitive test patterns. See Test code dilution for details.

Output formats

eh evaluate myfile.py --classify
eh evaluate myfile.py --format json
eh evaluate myfile.py --format sarif

SARIF output integrates with GitHub Code Scanning and other static analysis dashboards.

Initialize project config

eh init

Creates .eigenhelm.toml with default settings, including the 0.3/0.7 decision thresholds. See Configuration for details.

Set up for AI agents

Install the agent skill to give your coding agent the correct workflow:

# Via skills registry
npx skills add metacogdev/skills

# Or via eigenhelm CLI
eh skill --install

The skill teaches agents to evaluate after tests pass, apply obvious fixes for rejects, and stop after two passes. This prevents the over-iteration trap, where agents break working code to chase a score.

See the full agent integration guide for details.

Next steps