Model Validation & Testing#

nnterp includes automatic validation to prevent silent failures and ensure model correctness. When you load a model, a series of fast tests runs automatically to verify that the model works as expected.

Automatic Testing System#

When loading a StandardizedTransformer, nnterp automatically runs tests to ensure:

  • Model renaming correctness: All modules are properly renamed to the standardized interface

  • Module output shapes: Layer outputs have the expected shape (batch_size, seq_len, hidden_size)

  • Attention probabilities: If enabled, attention probabilities have correct shape (batch_size, num_heads, seq_len, seq_len), sum to 1 for each token, and modifying them changes model output

from nnterp import StandardizedTransformer

# Automatic tests run during model loading
model = StandardizedTransformer("gpt2")
# Tests passed: model is ready to use

# If tests fail, you'll see detailed error messages
model = StandardizedTransformer("unsupported-model")
# Error: Could not find layers module...
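
You can also reproduce the shape check yourself once a model is loaded. The snippet below is a minimal sketch assuming the standard nnsight trace API and a single prompt:

from nnterp import StandardizedTransformer

model = StandardizedTransformer("gpt2")

with model.trace("The quick brown fox"):
    # Hidden state after layer 0, expected shape (batch_size, seq_len, hidden_size)
    hidden = model.layers_output[0].save()

# Depending on your nnsight version you may need hidden.value instead
print(hidden.shape)  # (1, seq_len, 768) for gpt2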

What nnterp Guarantees#

nnterp guarantees that:

  • All models follow the standardized naming convention

  • model.layers_output[i] returns tensors with expected shapes

  • model.attention_probabilities[i] (if enabled) returns properly normalized attention matrices
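
As a quick sanity check of the normalization guarantee, you can sum the attention probabilities over the key dimension inside a trace. This is a sketch that assumes the model was loaded with enable_attention_probs=True (see Attention Implementation below):

import torch
from nnterp import StandardizedTransformer

model = StandardizedTransformer("gpt2", enable_attention_probs=True)

with model.trace("The quick brown fox"):
    # Shape (batch_size, num_heads, seq_len, seq_len)
    probs = model.attention_probabilities[0].save()

# For every query position, the probabilities over keys should sum to 1
sums = probs.sum(dim=-1)
print(torch.allclose(sums, torch.ones_like(sums)))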

What nnterp Cannot Guarantee#

nnterp cannot guarantee:

  • Attention probabilities remain unmodified: The model might apply additional transformations after the attention probabilities are computed but before they’re used. Use model.attention_probabilities.print_source() to check the exact hook location in the HuggingFace implementation.

  • Perfect HuggingFace compatibility: While nnterp uses original HuggingFace implementations, some edge cases might behave differently due to the renaming process.

# Check where attention probabilities are hooked
model.attention_probabilities.print_source()

Trade-offs and Configuration#

The automatic testing system comes with some trade-offs:

Model Dispatch#

nnterp automatically dispatches your model to available devices (device_map="auto") during loading. This can be inconvenient if you don’t want to load the model weights immediately. However, you can set allow_dispatch=False to disable this behavior (at the cost of some tests not being run).
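
For example, to keep loading lazy (a minimal sketch using the allow_dispatch flag described above):

from nnterp import StandardizedTransformer

# Skip automatic dispatch; some of the automatic tests won't be run
model = StandardizedTransformer("gpt2", allow_dispatch=False)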

Attention Implementation#

By default, nnterp uses the standard HuggingFace attention implementation. To access attention probabilities, you must explicitly set enable_attention_probs=True, which automatically configures the model with attn_implementation="eager", since other implementations (such as sdpa or flash attention) don’t support attention pattern tracing.

# To access attention probabilities (slower but traceable)
model = StandardizedTransformer(
    "gpt2",
    enable_attention_probs=True
)

# Use default HuggingFace implementation (faster, no attention tracing)
model = StandardizedTransformer("gpt2")  # Uses default HF implementation
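
Once attention probabilities are enabled, you can also intervene on them inside a trace. The snippet below is an illustrative sketch rather than a definitive intervention recipe: it edits the probability tensor in place and assumes the tensor exposed by the accessor is the one consumed by the rest of the forward pass (use print_source() to confirm the hook location for your architecture):

with model.trace("The quick brown fox"):
    # Shape (batch_size, num_heads, seq_len, seq_len)
    probs = model.attention_probabilities[0]
    # Illustrative head ablation: zero out head 0's attention pattern in layer 0
    probs[:, 0] = 0
    # Downstream activations should now reflect the edited attention pattern
    layer_out = model.layers_output[0].save()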

If you try to use both enable_attention_probs=True and a non-eager attn_implementation, nnterp will raise an error:

# This will raise an error
model = StandardizedTransformer(
    "gpt2",
    enable_attention_probs=True,
    attn_implementation="sdpa"  # Conflicts with enable_attention_probs
)

Manual Testing#

You can run tests manually for specific models or architectures:

# Test specific models
python -m nnterp run_tests --model-names "gpt2" "meta-llama/Llama-2-7b-hf"

# Test using toy models of specific architectures (faster/cheaper)
python -m nnterp run_tests --class-names "LlamaForCausalLM" "GPT2LMHeadModel"

This is useful when:

  • You’re using a different version of transformers or nnsight than the officially tested ones

  • You want to test a new model architecture before using it in research

Version Compatibility#

nnterp checks if tests were run for your current nnsight and transformers versions. If not, it will warn you and suggest running manual tests.

Thanks to the automatic testing system, even an architecture that hasn’t been officially tested is likely to work correctly if it loads successfully.