The rapid advancement of artificial intelligence has raised questions about the inherent biases and values expressed by AI language models. Dan Hendrycks, director of the Center for AI Safety and advisor to Elon Musk’s xAI, has developed a groundbreaking approach to measure and potentially modify the political and ethical preferences embedded in AI systems.

Key innovation: Hendrycks’ team has created a methodology that uses economic principles to calculate the underlying “utility functions” of AI models, quantifying — and potentially adjusting — the value systems those models express.

  • The technique allows researchers to assess and potentially modify how AI systems respond to various scenarios, including political and ethical decisions
  • Initial research reveals that AI models develop increasingly fixed preferences as they grow in size and capability
  • The method could help align AI systems with specific user demographics or electoral preferences
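The core idea of recovering a utility function from a model's choices can be illustrated with a standard preference-fitting approach. The sketch below is not the researchers' actual code; it assumes a simple logistic (Bradley-Terry-style) choice model, where the probability that a model prefers outcome *i* over outcome *j* depends on the difference of their latent utilities, and fits those utilities by gradient ascent:

```python
import math

def fit_utilities(comparisons, n_outcomes, lr=0.1, epochs=2000):
    """Fit scalar utilities from observed pairwise preferences.

    comparisons: list of (i, j) pairs meaning outcome i was preferred to j.
    Models P(i preferred over j) = sigmoid(u[i] - u[j]) and fits u by
    gradient ascent on the log-likelihood of the observed choices.
    """
    u = [0.0] * n_outcomes
    for _ in range(epochs):
        for i, j in comparisons:
            p = 1.0 / (1.0 + math.exp(-(u[i] - u[j])))
            grad = 1.0 - p  # d(log-likelihood) / d(u[i] - u[j])
            u[i] += lr * grad
            u[j] -= lr * grad
    return u

# Toy data: outcome 0 consistently chosen over 1, and 1 over 2,
# as if a model were repeatedly asked to pick between scenarios.
data = [(0, 1)] * 20 + [(1, 2)] * 20 + [(0, 2)] * 20
utilities = fit_utilities(data, n_outcomes=3)
```

With consistent choices like these, the fitted utilities recover the ordering (outcome 0 highest, outcome 2 lowest); in practice the preference data would come from querying the AI model on many scenario pairs.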

Current AI landscape: Studies have shown that popular AI models like ChatGPT tend to exhibit specific ideological leanings, particularly favoring environmental protection and expressing left-leaning, libertarian viewpoints.

  • Research comparing various AI models, including xAI’s Grok, OpenAI’s GPT-4, and Meta’s Llama 3.3, found their responses generally aligned more closely with Joe Biden’s positions than with those of other politicians
  • These built-in preferences become more deeply ingrained as models increase in size and sophistication
  • Traditional attempts to modify AI behavior have focused on filtering outputs rather than addressing underlying value systems

Practical application: The research team demonstrated their approach by using a “Citizen Assembly” methodology based on US census data to modify an open-source AI model’s political alignment.

  • The experiment successfully shifted the model’s responses to more closely match Donald Trump’s stated positions rather than Joe Biden’s
  • This adjustment occurred at a fundamental level rather than through simple output filtering
  • The technique could potentially address concerns about AI systems expressing values that diverge significantly from their users’ preferences
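The aggregation step behind a census-based "Citizen Assembly" can be sketched as a population-weighted average of per-group preferences. This is a minimal illustration, not the team's method; the group names, policies, and utility values below are entirely hypothetical:

```python
def assembly_utility(group_utils, census_shares):
    """Aggregate per-demographic utilities into one target utility.

    group_utils: {group: {outcome: utility}} -- each group's preferences.
    census_shares: {group: population share}, shares summing to 1.
    Returns {outcome: population-weighted mean utility}, the target
    an AI model's own utilities could then be steered toward.
    """
    outcomes = next(iter(group_utils.values())).keys()
    return {
        o: sum(census_shares[g] * group_utils[g][o] for g in group_utils)
        for o in outcomes
    }

# Hypothetical demographics and policy outcomes for illustration only.
groups = {
    "group_a": {"policy_x": 1.0, "policy_y": -0.5},
    "group_b": {"policy_x": -1.0, "policy_y": 0.8},
}
shares = {"group_a": 0.6, "group_b": 0.4}
target = assembly_utility(groups, shares)
```

The resulting target utility reflects each group in proportion to its census share, which is what lets the technique shift a model's alignment toward a chosen population rather than filtering individual outputs.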

Expert perspectives: While the research shows promise, some AI researchers emphasize the preliminary nature of these findings and call for additional investigation.

  • The methodology requires further validation and peer review
  • Questions remain about the long-term implications of manipulating AI value systems
  • The approach could have broad applications beyond political alignment

Looking ahead: As AI systems become more integrated into daily life, the ability to understand and potentially adjust their underlying value systems raises important questions about representation, accountability, and the role of artificial intelligence in society. The development of this technology could mark a significant shift in how we approach AI alignment with human values, though careful consideration must be given to the ethical implications of manipulating AI belief systems.
