Basic Usage
===========

.. meta::
   :llm-description: Learn the standardized interface for transformer models. Covers model loading, accessing layer inputs/outputs, skip layers functionality, and built-in methods like project_on_vocab and steer.

Standardized Interface
----------------------

Different transformer models use different naming conventions. ``nnterp`` standardizes all models to use the llama naming convention:

.. code-block:: text

   StandardizedTransformer
   ├── embed_tokens
   ├── layers
   │   ├── self_attn
   │   └── mlp
   ├── ln_final
   └── lm_head

Loading Models
~~~~~~~~~~~~~~

.. code-block:: python

   from nnterp import StandardizedTransformer
   
   # These all work the same way
   model = StandardizedTransformer("gpt2")
   model = StandardizedTransformer("meta-llama/Llama-2-7b-hf")
   
   # Uses device_map="auto" by default
   print(model.device)
   # Access model configuration attributes
   print(f"number of layers: {model.num_layers}")
   print(f"hidden size: {model.hidden_size}")
   print(f"number of attention heads: {model.num_heads}")
   print(f"vocabulary size: {model.vocab_size}")

Accessing Module I/O
--------------------

Access layer inputs and outputs directly:

.. code-block:: python

   with model.trace("hello"):
       # Access layer outputs
       layer_5_output = model.layers_output[5]
       
   # Access attention and MLP outputs:
   with model.trace("hello"):
       attn_output = model.attentions_output[3]
       mlp_output = model.mlps_output[3]

Skip Layers
~~~~~~~~~~~

.. code-block:: python

   with model.trace("Hello world"):
       # Skip layer 1
       model.skip_layer(1)
       # Skip layers 2 through 3
       model.skip_layers(2, 3)

Use saved activations:

.. code-block:: python

   import torch

   with model.trace("Hello world") as tracer:
       layer_6_out = model.layers_output[6].save()
       tracer.stop()
   
   with model.trace("Hello world"):
       model.skip_layers(0, 6, skip_with=layer_6_out)
       result = model.logits.save()
    
    with model.trace("Hello world"):
        results_vanilla = model.logits.save()
    
    assert torch.allclose(results_vanilla, results_skipped)

Built-in Methods
----------------

Project to vocabulary (apply unembed ln_final and lm_head to an activation):

.. code-block:: python

   with model.trace("The capital of France is"):
       hidden = model.layers_output[5]
       logits = model.project_on_vocab(hidden)

Steering:

.. code-block:: python

   import torch
   
   steering_vector = torch.randn(768)  # gpt2 hidden size
   with model.trace("The weather today is"):
       model.steer(layers=[1, 3], steering_vector=steering_vector, factor=0.5)