In the Space
In the Past
U.S. Senate hearing: Subcommittee on Privacy, Technology, and the Law
May 2023
At just over three hours, the hearing takes some time to get through, but it demonstrates how the Senate is trying to understand the mistakes that were made in the oversight of social media and to get things right with AI.
White House Involvement: Blueprint for an AI Bill of Rights
October 2022
“…the White House Office of Science and Technology Policy has identified five principles that should guide the design, use, and deployment of automated systems to protect the American public in the age of artificial intelligence.”
Congress needs to clarify Section 230
October 2022
The Internet has changed enormously in the last 25 years; it is high time Congress put in the work to clarify Section 230 (part of the Communications Decency Act of 1996) instead of leaving it to the Supreme Court to interpret how the law should work.
The Pause: Future of Life Institute details AI oversight policies
March 2023
Whether or not you agree with the FLI open letter calling for “…all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4,” the accompanying policy document provides a useful set of “concrete recommendations for how we can manage AI risks.”
EU AI Act: EU moves forward with AI governance
April 2021
The European Union is making real progress in its effort to mitigate the risks of artificial intelligence. The Act isn’t perfect, but it’s a good first step in the right direction. The link below provides the text of the Act in 24 languages.
We need to work together to tackle disinformation
October 2022
The New York Times has a great article on the challenges of combating disinformation and misinformation. One key takeaway is that, because of the way information is shared, tackling the problem will require companies and organizations to work together on a comprehensive solution.
Research and white papers
There are hundreds of super smart people focused on issues related to AI safety. The following is just a sample of the great work being done.
Introducing v0.5 of the AI Safety Benchmark from MLCommons
BBQ: A Hand-Built Bias Benchmark for Question Answering
All that Agrees Is Not Gold: Evaluating Ground Truth Labels and Dialogue Content for Safety
A Framework to Assess (Dis)agreement Among Diverse Rater Groups
SimpleSafetyTests: A Test Suite for Identifying Critical Safety Risks in Large Language Models
Ethical and social risks of harm from Language Models
LLM Agents can Autonomously Hack Websites
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts
An Insider’s Guide to Designing and Operationalizing a Responsible AI Governance Framework
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Evaluating Frontier Models for Dangerous Capabilities
Frontier Model Forum: What is Red Teaming?
Holistic Evaluation of Language Models
The History and Risks of Reinforcement Learning and Human Feedback
A Survey on Bias and Fairness in Machine Learning
The Political Preferences of LLMs