In The Space

Research and white papers

There are hundreds of smart people focused on issues related to AI trust and safety. The following is a sample of the great work being done.

Hazard, Harm and/or Risk Taxonomies

Safety Benchmarks

Capability Benchmarks

Benchmark Aggregation

Evaluating AI

Model Development

Bias

General

Seminal Papers

In the Past

Anthropic makes the case for government regulation of catastrophic AI risks

Oct 2024

Anthropic, a leading AI safety and research company, has developed a Responsible Scaling Policy (RSP) that provides a credible roadmap for how governments can move forward with regulating catastrophic AI risks, such as CBRNE (chemical, biological, radiological, nuclear, and explosives) threats. Regulations based on an RSP-type document are workable and could be passed in a reasonable timeframe. At the same time, however, we also need to encourage governments to adopt regulations around non-existential AI risks such as hate speech and bias (a comprehensive taxonomy of these types of risks can be found here).

U.S. Senate hearing: Subcommittee on Privacy, Technology and the Law

May 2023

At just over three hours, it takes a bit of time to get through, but this hearing demonstrates how the Senate is trying to understand the mistakes that were made in overseeing social media and how it is trying to get things right with AI.

White House Involvement: Blueprint for an AI Bill of Rights

Oct 2022

“…the White House Office of Science and Technology Policy has identified five principles that should guide the design, use, and deployment of automated systems to protect the American public in the age of artificial intelligence.”

Congress needs to clarify section 230

Oct 2022

The Internet has changed enormously in the last 25 years; it is high time Congress put in the work to clarify Section 230 (part of the Communications Decency Act of 1996) instead of leaving it to the Supreme Court to interpret how the law should work.

The Pause: Future of Life Institute details AI oversight policies

March 2023

Whether or not you agree with the FLI open letter asking for “…all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4”, their policy recommendations provide a good set of “concrete recommendations for how we can manage AI risks.”

EU AI Act: EU moves forward with AI governance

April 2021

The European Union is making good progress on mitigating the risks of artificial intelligence. The Act isn’t perfect, but it’s a good first step and a definite move in the right direction. The link below provides the text of the Act in 24 languages.

We need to work together to tackle disinformation

Oct 2022

The New York Times has a great article on the challenges of combating disinformation/misinformation. One of the key takeaways of the article is that, because of the way information is shared, tackling the problem will require companies and organizations to work together to find a comprehensive solution.

EU Artificial Intelligence Act Approved

March 2024

This groundbreaking agreement by the member states of the European Union isn’t perfect, but it provides sorely needed guidance to help ensure AI can be deployed safely and effectively. Passage “is just the beginning of a long road of further rulemaking, delegating and legalese wrangling.”

ASEAN releases Guide on AI Governance and Ethics

February 2024

This is a practical guide for organizations in the 10 member states of the Association of Southeast Asian Nations (ASEAN) on their use of AI. It “includes recommendations on national-level and regional-level initiatives that governments in the region can consider implementing to design, develop, and deploy AI systems responsibly.”

Implementation of President Biden’s Executive Order on AI Moving Forward

April 2024

It’s been 180 days since President Biden’s EO 14110 on Safe, Secure, and Trustworthy Artificial Intelligence was issued, and NIST (the National Institute of Standards and Technology) continues to make progress toward safer AI with the release of four documents: AI RMF Generative AI Profile; Secure Software Development Practices for Generative AI and Dual-Use Foundation Models; Reducing Risks Posed by Synthetic Content; and A Plan for Global Engagement on AI Standards. These documents are drafts, and NIST is soliciting public feedback on them.

MLCommons announces AI safety benchmark

April 2024

This benchmark is a proof of concept showing that it is possible to assess the risks posed by AI systems in a concrete way. The benchmark’s use case is “...text-to-text interactions with a general purpose AI chat model…” It is a great first step on the long road to ensuring AI can be deployed safely and effectively.