Scaffolding has emerged as a critical approach to enhancing large language model (LLM) capabilities without modifying their internal architecture. By building external systems around a model, developers can significantly expand what LLMs accomplish, from tool use to error reduction, while also creating new opportunities for safety evaluation and interpretability research.
The big picture: Scaffolding refers to code structures built around LLMs that augment their abilities without altering their internal workings, as fine-tuning or activation steering would.
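To make the idea concrete, here is a minimal sketch: external code wraps an unmodified model call in a check-and-retry loop. The `call_model` function is a hypothetical stand-in for any LLM API, not a real library.

```python
def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call (an assumption, not an actual library)."""
    raise NotImplementedError

def scaffolded_answer(question: str, max_retries: int = 3) -> str:
    """Ask the model, then have it check its own work; retry on a failed check."""
    answer = ""
    for _ in range(max_retries):
        answer = call_model(f"Question: {question}\nAnswer concisely:")
        verdict = call_model(
            f"Question: {question}\nProposed answer: {answer}\n"
            "Reply YES if the answer is correct, otherwise NO:"
        )
        if verdict.strip().upper().startswith("YES"):
            return answer
    return answer  # fall back to the last attempt
```

The model itself is untouched; all of the added behavior lives in the surrounding code.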
Why this matters: Understanding scaffolding is crucial for safety evaluations because once a model is deployed, users inevitably attempt to enhance its power through external systems, potentially revealing latent capabilities that standard prompting tests might miss.
Key capabilities: Scaffolding systems allow LLMs to perform functions beyond their baseline abilities, including using tools, searching for information, and reducing error rates.
Common implementations: The scaffolding ecosystem includes various approaches of varying complexity, from simple prompt templates to sophisticated multi-agent systems.
Looking ahead: Despite slower progress than initially anticipated, researchers continue exploring scaffolding as a potential alternative path toward artificial general intelligence.