We’ve developed a new way to train small AI models whose internal mechanisms are easier for humans to examine and understand.

Language models like the ones behind ChatGPT have complex, sometimes surprising internal structures, and we don’t yet fully understand how they work.

This approach is an early step toward closing that gap, and part of a broader effort across OpenAI to make our systems more interpretable: developing methods that help us understand why a model produced a given output. In some cases that means examining the model’s step-by-step reasoning; in others, it means trying to reverse-engineer the small circuits inside the network.

There’s still a long path to fully understanding the complex behaviors of our most capable models.

https://lnkd.in/gqWJyw_b