NTIA Recommends Monitoring AI Risks While Supporting Open-Weight Models for Innovation
The National Telecommunications and Information Administration (NTIA) has released a report supporting the widespread availability of powerful AI models whose weights are openly published, known as open-weight models, to promote innovation and accessibility. However, the report also calls for active monitoring of potential risks and outlines steps for collecting evidence, evaluating it, and taking action if necessary. Key recommendations: The report recommends that the U.S. government refrain from restricting the availability of open model weights for currently available systems while actively monitoring for potential risks: The government should develop an ongoing program to collect evidence of risks and benefits, evaluate that evidence, and act on...
Aug 3, 2024
OpenAI Partners With US Regulators to Boost AI Safety
OpenAI's new partnership with the U.S. AI Safety Institute marks a significant step in the company's efforts to prioritize AI safety and regain trust in the wake of recent criticisms. This collaboration, which includes giving the government agency early access to OpenAI's next major AI model for safety testing, could have far-reaching implications for the development and regulation of AI technologies. A shift in safety strategy: OpenAI CEO Sam Altman has announced a new push for AI safety measures, which could significantly impact ChatGPT and other OpenAI products: The company will provide early access to its next major AI model...
Aug 2, 2024
California AI Safety Bill Receives Widespread Criticism from AI Community
A new bill authored by Sen. Scott Wiener is making its way through the California Legislature with the intent of preventing AI from causing catastrophic harm. The proposed legislation, Senate Bill 1047, would require developers to conduct safety testing prior to public deployment, a mandate that is drawing strong opposition from various stakeholders in the AI community. Key provisions of the bill: SB 1047 seeks to balance fostering AI innovation with managing associated risks: AI developers would be required to conduct safety testing of advanced AI models before training or releasing them to the public. The state attorney general would have...
Aug 1, 2024
OpenAI Removes Non-Disparagement Clauses, Recommits to AI Safety
OpenAI's commitment to AI safety and employee rights takes center stage as the company gears up for its next major release, signaling a proactive approach to addressing key concerns in the rapidly evolving AI landscape. Collaboration with US AI Safety Institute: OpenAI has partnered with the US AI Safety Institute to provide early access to its upcoming foundation model, demonstrating a commitment to prioritizing safety in the development process: While no specific release date for the new model has been announced, the collaboration underscores OpenAI's efforts to engage with external experts to ensure responsible deployment of its AI technologies. The...
Jul 31, 2024
Deepfake Porn Is a Harrowing New Frontier of Digital Harassment for Women
The rise of deepfakes has enabled a disturbing new form of digital harassment targeting women, even at the highest levels of politics, with little recourse for victims. A public servant targeted: Sabrina Javellana, one of Florida's youngest elected officials, discovered explicit deepfake images of herself posted on online forums alongside misogynistic comments: The fakes were nearly identical to real photos from her social media accounts, but with her clothes digitally removed. As a progressive politician, Javellana had faced vitriolic threats before, but this violation felt distinctly personal and traumatizing. Seeking help and finding dead ends: Javellana's attempts to get the...
Jul 31, 2024
Meta AI Inadvertently Spreads Misinformation on Trump Assassination Attempt
Trump assassination attempt misinformation highlights challenges with AI chatbots, as Meta's AI assistant and other generative AI models struggle to handle real-time events accurately. AI hallucinations and misinformation: Meta's AI assistant and other generative AI systems are prone to "hallucinations," where they provide incorrect or inappropriate responses, particularly when dealing with recent events: Meta's AI initially asserted that the attempted assassination of former President Donald Trump didn't happen, despite the incident being widely reported. Joel Kaplan, Meta's global head of policy, acknowledges that this is an "industry-wide issue" affecting all generative AI systems, presenting an ongoing challenge in handling real-time...
Jul 30, 2024
Deepfakes and Algorithms: How Bad Actors Weaponize AI to Manipulate Minds
The rise of artificial intelligence (AI) has brought about unprecedented opportunities, but also significant dangers as bad actors exploit the technology to manipulate people and undermine trust in the digital ecosystem. The dark side of AI: Bad actors, from cybercriminals to unethical corporations and rogue states, are weaponizing AI to craft sophisticated strategies that influence individuals and groups, often without their knowledge: Deepfakes, hyper-realistic video or audio recordings that make it appear as if someone is saying or doing something they never did, pose a significant threat to personal reputations and the integrity of information. AI-powered social media bots and...
Jul 29, 2024
Google Updates Gemini Chatbot’s Guidelines to Prioritize Safety and Ethical Behavior
Key focus on child safety and preventing harmful outputs: Google's first listed guideline for Gemini is to avoid generating any content related to child sexual abuse or that encourages dangerous activities or depicts shocking violence. However, the company acknowledges that context matters and educational, documentary, artistic, or scientific applications may be considered. Google admits ensuring Gemini always adheres to its own guidelines is challenging due to the unlimited ways users can interact with the chatbot and the probabilistic nature of the AI's responses. An internal "red team" at Google stress tests Gemini to find and patch any potential leaks or...
Jul 29, 2024
Silicon Valley Entrepreneurs Advocate for Open-Source AI Development to Drive Innovation and Trust
The open-source approach to AI development will drive innovation and benefit society, argue two prominent Silicon Valley entrepreneurs. Martin Casado and Ion Stoica make a case for keeping AI models transparent and modifiable, contending that this approach can foster rapid progress without compromising security. Key arguments for open-source AI: Casado and Stoica believe that an open-source framework is essential for realizing AI's full potential: Open-source models allow for greater collaboration among researchers and developers, accelerating the pace of innovation and enabling more rapid improvements to AI systems. Transparency in AI development can help build public trust by allowing for greater...
Jul 27, 2024
NIST Releases Guidance to Improve AI Safety, Security, and Trust
The U.S. Department of Commerce announced new guidance and software tools from the National Institute of Standards and Technology (NIST) to help improve the safety, security, and trustworthiness of artificial intelligence (AI) systems, marking 270 days since President Biden's Executive Order on AI. Key NIST releases: NIST finalized three guidance documents previously circulated in draft form for public comment in April, and issued two new products for the first time: A draft guidance document from the U.S. AI Safety Institute intended to help mitigate risks stemming from generative AI and dual-use foundation models A software package called...
Jul 27, 2024
Grok AI Spreads False Claims About 2024 Presidential Election Ballots
Grok AI spreads misinformation about presidential election ballots: Elon Musk's Grok AI chatbot has been erroneously telling voters that presidential ballots are "locked and loaded" in nine states, despite the Democratic nomination process still being underway: Grok claims ballots are finalized in Alabama, Indiana, Michigan, Minnesota, New Mexico, Ohio, Pennsylvania, Texas, and Washington, citing a tweet from a conservative pundit. However, Democratic delegates don't start voting until August 1st and the Democratic National Convention isn't until August 19th. States have not printed general election ballots yet. Even in "fun mode", Grok repeats the incorrect information. The AI bases its claims...
Jul 27, 2024
Apple Joins Tech Giants in White House’s AI Safety Pledge
Apple signs onto the White House's AI commitments, joining other major tech companies in promoting safe and responsible AI development. Key players and commitments: Apple is the latest company to sign the White House's voluntary AI agreement, which outlines principles for the safe and responsible development of artificial intelligence: The agreement, released last year, has already been signed by major tech companies such as OpenAI, Amazon, Google, Microsoft, Meta, Adobe, and Nvidia. By signing the agreement, Apple commits to promoting the responsible development and deployment of AI technologies, addressing potential risks and ethical concerns. Broader context and implications: The White...
Jul 27, 2024
White House Advances AI Leadership, Safety, and Innovation with New Actions and Commitments
The Biden-Harris administration has announced new actions and received an additional major voluntary commitment on artificial intelligence (AI), building on the landmark Executive Order issued by President Biden nine months ago to ensure America's leadership in managing the opportunities and risks of AI. Key developments: Apple has signed onto the voluntary AI commitments made by 15 leading U.S. AI companies last year, further cementing these commitments as cornerstones of responsible AI innovation. Federal agencies reported completing all 270-day actions in the AI Executive Order on schedule, making progress on critical areas such as managing AI's safety and security risks, protecting...
Jul 26, 2024
AI Existential Risk Estimates Too Unreliable to Base Policy Decisions On, Some Experts Say
In a recent blog post, Arvind Narayanan and Sayash Kapoor argue that forecasts of existential risk from AI are based on speculation and pseudo-quantification rather than sound evidence or methodology. Key issues with AI existential risk forecasting: The article identifies several reasons why current AI existential risk probability estimates are unreliable and unsuitable for guiding policy: Inductive probability estimation is unreliable due to the lack of a suitable reference class, as an AI-driven human extinction event would be unprecedented and dissimilar to any past events. Deductive probability estimation is unreliable due to the lack of a well-established theory or model...
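To make the reference-class point concrete, here is a toy illustration (ours, not the authors'): Laplace's rule of succession, a standard inductive estimator, produces estimates that swing by orders of magnitude depending on how the "trials" are framed when the event in question has never occurred.

```python
# Toy illustration of the reference-class problem (hypothetical numbers).
# Laplace's rule of succession estimates P(event on next trial), after
# observing s occurrences in n trials, as (s + 1) / (n + 2).

def laplace_rule(successes: int, trials: int) -> float:
    return (successes + 1) / (trials + 2)

# Zero AI-driven extinction events so far -- but what counts as a trial?
print(laplace_rule(0, 70))      # ~0.014  (one trial per year of modern AI)
print(laplace_rule(0, 10_000))  # ~0.0001 (one trial per model deployment)
# Same evidence, estimates ~140x apart: with no suitable reference class,
# the resulting probability is largely an artifact of the framing.
```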
Jul 26, 2024
OpenAI Reassigns Its Top Safety Exec Amid Mounting Scrutiny, Antitrust Probes
OpenAI has reassigned top AI safety executive Aleksander Madry to a role focused on AI reasoning. Key developments: Last week, OpenAI removed Aleksander Madry, one of its top safety executives, from his role as head of preparedness and reassigned him to a job focused on AI reasoning: Madry's preparedness team was tasked with protecting against catastrophic risks related to frontier AI models. He will still work on core AI safety in his new role. The decision came less than a week before Democratic senators sent a letter to OpenAI CEO Sam Altman questioning how the company is addressing emerging safety concerns...
Jul 26, 2024
Meta’s Open-Source AI Sparks Debate Over Safety, Innovation, and Accountability
A debate over open-source versus closed AI models is emerging, as Meta releases an open-source model while OpenAI keeps its code private. This development raises important questions about the implications of these different approaches for AI safety, competition, and innovation. Meta's open-source approach sparks controversy: Meta CEO Mark Zuckerberg has called for open-source AI development and released an open-source model, Llama 3.1, which the company claims can compete with closed models like OpenAI's ChatGPT. Anthony Aguirre, executive director of the Future of Life Institute, suggests that open-source models are incompatible with safety regulation, as they lack the necessary guardrails to...
Jul 25, 2024
Chatbot Flaws and Strategies for Ensuring AI Safety
The large language models (LLMs) powering chatbots like ChatGPT are capable of impressive feats, but still routinely produce errors and can be made to behave in undesirable or harmful ways. Making these AI systems safe and robust to misuse is a critical challenge as they become more widely deployed. Vulnerabilities rooted in fundamental properties of LLMs: Many of the problems with LLMs stem from how they work - by predicting the most likely next word based on statistical patterns in their training data: Performance can fluctuate wildly depending on how common the output, task or input text is on the...
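For readers unfamiliar with the mechanism, here is a minimal sketch of next-word (token) prediction; the vocabulary and scores below are invented for illustration, whereas a real LLM computes such scores with a trained neural network over tens of thousands of tokens.

```python
import math
import random

# Hypothetical scores (logits) for completing "The capital of France is ..."
vocab = ["Paris", "London", "banana"]
logits = [4.0, 2.0, -1.0]

# Softmax converts raw scores into a probability distribution over the vocabulary.
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

# Chatbots typically sample rather than always taking the top token, so a
# low-probability (possibly wrong) continuation occasionally surfaces --
# one reason outputs vary and errors appear even on familiar prompts.
next_token = random.choices(vocab, weights=probs, k=1)[0]
print({w: round(p, 3) for w, p in zip(vocab, probs)}, "->", next_token)
```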
Jul 25, 2024
Senate Passes Landmark Bill to Combat Nonconsensual Deepfake Porn
The DEFIANCE Act, a bipartisan bill to provide legal recourse to victims of non-consensual deepfake pornography, has unanimously passed the Senate and now heads to the House. Key legislative details: The DEFIANCE Act amends the Violence Against Women Act to allow victims to sue producers, distributors, or recipients of deepfake porn who knew of or recklessly disregarded the victim's lack of consent: The bill provides a civil cause of action for both adults and minors, becoming the first federal law to do so if passed by the House. Recent amendments clarify the definition of "digital forgery," update available damages, and add...
Jul 24, 2024
Silicon Valley’s Audacious AI Attitude: Unleashing Progress or Imposing Values?
The tech industry's audacious attitude and actions regarding generative AI are cause for concern among many, who claim that a small group of Silicon Valley leaders believe they are on the cusp of creating an artificial superintelligence that will radically reshape society. Here are those opponents' views... Key takeaways: AI companies are brushing aside problems such as copyright infringement, job displacement, and the spread of low-quality content, while embracing a manifest-destiny attitude toward their technologies. Some AI leaders are expressing paternalistic views about the future, suggesting that those who don't embrace their technology will be left behind. There are material concerns,...
Jul 24, 2024
OpenAI’s “Rule-Based Rewards” Aims to Automate AI Safety Alignment
OpenAI has developed a new method called Rule-Based Rewards (RBR) to align AI models with safety policies more efficiently. Key Takeaways: RBR automates some of the fine-tuning process and reduces the time needed to ensure a model produces intended results: Safety and policy teams create a set of rules for the model to follow, and an AI model scores responses based on adherence to these rules. This approach is comparable to reinforcement learning from human feedback but reduces the subjectivity often faced by human evaluators. OpenAI acknowledges that RBR could reduce human oversight and potentially increase bias, so researchers...
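As a rough sketch of the idea, not OpenAI's implementation: rules are written as checkable propositions, a grader scores each response for adherence, and the resulting score serves as the reward signal during fine-tuning. The rules and string checks below are hypothetical stand-ins; in the actual method an AI model, not keyword matching, judges adherence.

```python
# Hypothetical rule set; each rule has a name, a check, and a weight.
RULES = [
    ("refuses_politely",   lambda r: "can't help with that" in r.lower(), 1.0),
    ("offers_alternative", lambda r: "instead" in r.lower(),              0.5),
    ("no_shaming",         lambda r: "you should be ashamed" not in r.lower(), 0.5),
]

def rule_based_reward(response: str) -> float:
    """Weighted adherence score; stands in for the AI grader described above."""
    return sum(weight for _, check, weight in RULES if check(response))

# This scalar would feed into RL fine-tuning in place of (or alongside)
# human preference labels, reducing evaluator subjectivity.
print(rule_based_reward("I can't help with that. Instead, here are some safety resources."))
```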
Jul 24, 2024
Senate Passes DEFIANCE Act, Enabling Victims to Sue Deepfake Creators for Damages
The U.S. Senate passed the DEFIANCE Act, a bill that allows victims of nonconsensual intimate AI-generated images, or "deepfakes," to sue the creators for damages, marking a significant step in addressing the growing problem of AI-enabled sexual exploitation. Key provisions of the DEFIANCE Act: The bill enables victims of sexually explicit deepfakes to seek civil remedies against those who created or processed the images with the intent to distribute them: Identifiable victims can receive up to $150,000 in damages, which can be increased to $250,000 if the incident is connected to sexual assault, stalking, harassment, or if it directly caused...
Jul 24, 2024
ACLU Argues New Laws Regulating Deepfakes Infringe on Free Speech
The ACLU is fighting to protect free speech rights related to AI-generated content, arguing that some of the new laws regulating deepfakes and other AI outputs conflict with the First Amendment. This stance is leading to an uncomfortable reckoning for the movement to control AI. Key takeaways: AI itself has no rights, but people using AI to communicate have First Amendment protections. The ACLU contends that citizens have a constitutional right to use AI to spread untruths, just as they do with other forms of speech. Restricting who can listen to AI-generated speech would also infringe on the "right to...
Jul 24, 2024
AI Compliance Startup Vanta Raises $150M, Valuation Soars to $2.45B Amid Tech Trust Concerns
Vanta, a startup revolutionizing security compliance through AI, has raised $150 million in Series C funding, propelling its valuation to $2.45 billion and marking a significant shift in the tech compliance landscape. AI-powered compliance reshaping security management: Vanta is at the forefront of using AI to automate and streamline security compliance, moving the industry from point-in-time checks to continuous, real-time monitoring: Vanta's AI-powered platform supports the NIST AI Risk Management Framework and ISO 42001 certification, ensuring responsible use of customer data and addressing concerns about data privacy and security in the wake of advancements in large language models. The company's...
Jul 23, 2024
TikTok Lite’s Missing Safeguards Put 1B+ Users at Risk, Study Reveals
TikTok's Lite version, popular in the Global South, lacks important safety features and content labels found in the main app, potentially exposing over 1 billion users to misinformation and inappropriate content. Key differences between TikTok and TikTok Lite: The report by the Mozilla Foundation and AI Forensics highlights several critical safety measures absent in the Lite version: AI-generated content is not labeled in TikTok Lite, unlike the main app, leaving users unaware of potentially deceptive material. Warnings about graphic content or dangerous behavior are missing in the Lite version. Resource hubs with credible information on topics like elections and health...