Turning Vision into Action: Implementing the Senate AI Roadmap
Future of Life Institute | 17 Jun 2024
https://futureoflife.org/document/vision-into-action-senate-ai-roadmap/

Executive Summary 

On May 15, 2024, the Senate AI Working Group released “Driving U.S. Innovation in Artificial Intelligence: A Roadmap for Artificial Intelligence Policy in the United States Senate,” which synthesized the findings from the Senate AI Insight Forums into a set of recommendations for Senate action moving forward. The Senate’s sustained efforts to identify and remain abreast of the key issues raised by this rapidly evolving technology are commendable, and the Roadmap demonstrates a remarkable grasp of the critical questions Congress must grapple with as AI matures and permeates our everyday lives.

The need for regulation of the highest-risk AI systems is urgent. The pace of AI advancement has been frenetic, with Big Tech locked in an out-of-control race to develop increasingly powerful, and increasingly risky, AI systems. Given the more deliberate pace of the legislative process, we remain concerned that the Roadmap’s deference to committees for the development of policy frameworks could delay the passage of substantive legislation until it is too late for effective policy intervention.

To expedite the process of enacting meaningful regulation of AI, we offer the following actionable recommendations for such policy frameworks that can form the basis of legislation to reduce risks, foster innovation, secure wellbeing, and strengthen global leadership.

AGI and Testing of Advanced General-Purpose AI Systems 

  • Adopt a stratified, capabilities-based oversight framework including pre-deployment assessment, auditing, and licensure, and post-deployment monitoring for the most advanced general-purpose AI systems (GPAIS), with the burden of proving the system’s suitability for release resting on the developer of the system.
  • Require audits evaluating the safety and security of advanced GPAIS to be conducted by independent, objective actors either within the government or accredited by the government, with the authority to prohibit the release of an advanced AI system if they identify that it poses a significant unresolved risk to the safety, security, or wellbeing of the American public.
  • Establish regulatory thresholds that are inclusive of, but not limited to, training compute to ensure that current and future systems of concern remain in scope, with each threshold independently sufficient to qualify a system as subject to additional scrutiny.
  • Establish a centralized federal authority responsible for monitoring, evaluating, and regulating advanced GPAIS, and for advising other agencies on activities related to AI within their respective jurisdictions.
  • Augment whistleblower protections to cover reporting on unsafe practices in the development or planned deployment of AI systems.

Liability 

  • Specify strict liability (“abnormally dangerous activity”) for the development of the most advanced GPAIS, which pose risks that cannot be anticipated and therefore cannot be eliminated through reasonable care.
  • Apply a rebuttable presumption of negligence for harms caused by advanced GPAIS that do not meet the threshold to be subject to strict liability but do meet a lower threshold of capability.
  • Subject domain-specific AI systems to the legal standards applicable in that domain.
  • Apply joint and several liability for harms caused by advanced GPAIS.
  • Clarify that Section 230 does not shield AI providers from liability for harms resulting from their systems.

AI and National Security 

  • Establish an Information Sharing and Analysis Center and require AI developers to share documentation with the ISAC pertaining to the development and deployment lifecycle.
  • Require the most powerful AI systems and those that could pose CBRN threats to be tested in an AIxBio sandbox as proposed in the 2025 NDAA draft.
  • Prohibit training models on the most dangerous dual-use research of concern and restrict the use of dual-use research of concern for training narrow AI systems.
  • Invest in core CBRN defense strategies such as personal protective equipment, novel medical countermeasures, and ultraviolet-C technologies.

Compute Security And Export Controls 

  • Support the passage of the ENFORCE Act (H.R. 8315) with clarifications to avoid loopholes from open publication of model weights.
  • Require the use of chips with secure hardware for the training of advanced AI systems, and condition export licenses for high-end chips destined for restricted countries on the presence of such secure hardware.

Autonomous Weapons Systems And Military Integration Of AI 

  • Mandate that nuclear launch systems remain independent from CJADC2 capabilities.
  • Require DOD to establish boards comprised of AI ethics officers across offices involved in the production, procurement, development, and deployment of military AI systems.
  • Task CDAO with establishing clear protocols for measuring the accuracy of AI systems integrated into defense operations and prohibit integration of systems that do not meet the threshold of “five-digit accuracy.”
  • Codify DOD Directive 3000.09 and raise required human involvement in military use of AI from “appropriate levels of human judgment” to “meaningful human control”; require CDAO to file a report establishing concrete guidance for meaningful human control in practice.
  • Invest in the development of non-kinetic counter-autonomous weapons systems.

Open-source AI 

  • Require advanced AI systems with open model weights to undergo thorough testing and evaluation in secure environments appropriate to their level of risk.
  • Hold developers legally responsible for performing all reasonable measures to prevent their models from being retrained to enable illegal activities, and for harms resulting from their failure to do so.
  • Pursue “public options” for AI such as the National AI Research Resource (NAIRR) to democratize AI development and combat concentration of power.

Supporting US AI Innovation

  • Allocate R&D funding to BIS for the development of on-chip hardware governance solutions, and for the implementation of those solutions.
  • Expand NAIRR public compute programs to include funding directed toward the development of secure testing and usage infrastructure for academics, researchers, and members of civil society.
  • Ensure that R&D funding allocated toward improving interagency coordination at the intersection of AI and critical infrastructure includes requirements to fund safety and security research.

Combatting Deepfakes 

  • Create civil and/or criminal liability mechanisms to hold developers and providers accountable for harms resulting from deepfakes.
  • Ensure users accessing models to produce and share deepfakes are subject to civil and/or criminal liability.
  • Place a responsibility on compute providers to revoke access to their services when they have knowledge that their services are being used to create harmful deepfakes, or to host models that facilitate the creation of harmful deepfakes. 
  • Support the passage of proposed bills including the NO FAKES Act, with some modifications to clarify the liability of service providers such as model developers.

Provenance and Watermarking 

  • Require model developers and providers to integrate provenance tracking capabilities into their systems. 
  • Require model developers and providers to make content provenance information as difficult to bypass or remove as possible, taking into account the current state of science.
  • Support the passage of bills like the AI Labeling Act, which mandate clear and permanent notices on AI-generated content that identify the content as AI-produced and specify the tool used along with the creation date. 

Introduction 

In May 2024, the Bipartisan Senate AI Working Group, spearheaded by Majority Leader Schumer, Sen. Rounds, Sen. Heinrich, and Sen. Young, released a “roadmap for artificial intelligence policy in the United States Senate” entitled “Driving U.S. Innovation in Artificial Intelligence.” The Roadmap is a significant achievement in bipartisan consensus, and thoughtfully identifies the diversity of potential avenues AI presents for both flourishing and catastrophe. Drawing on the input of experts at the Senate AI Insight Forums and beyond, the Roadmap includes several promising recommendations for the Senate’s path forward.

At the same time, the Roadmap lacks the sense of urgency for congressional action we see as critical to ensuring AI is a net benefit for the wellbeing of the American public, rather than a source of unfettered risk. The pace of advancement in the field of AI has accelerated faster than even leading experts had anticipated, with competitive pressures and profit incentives driving Big Tech companies to race haphazardly toward creating more powerful, and consequently less controllable, systems by the month. A byproduct of this race is the relegation of safety and security to secondary concerns for these developers.

The speed with which this technology continues to evolve and integrate stands in stark contrast to the typical, more deliberate pace of government. This mismatch raises a risk that requisite government oversight will not be implemented quickly enough to steer AI development and adoption in a more responsible direction. Realization of this risk would likely result in a broad array of significant harms, from systematic discrimination against disadvantaged communities to the deliberate or accidental failure of critical infrastructure, that could otherwise be avoided. The social and economic permeation of AI could also render future regulation nearly impossible without disrupting and potentially destabilizing the US’s socioeconomic fabric – as we have seen with social media, reactive regulation of emerging technology raises significant obstacles where proactive regulation would not, and pervasive harm is often the result. In other words, the time to establish meaningful regulation and oversight of advanced AI is now.

With this in mind, we commend the Senate AI Working Group for acting swiftly to efficiently bring the Senate up to speed on this rapidly evolving technology through the Senate AI Insight Forums and other briefings. However, we are concerned that, in most cases, the Roadmap encourages committees to undertake additional consideration toward developing frameworks from which legislation could then be derived, rather than contributing to those actionable frameworks directly. We recognize that deference to committees of relevant jurisdiction is not unusual, but fear that this process will imprudently delay the implementation of AI governance, particularly given the November election’s potential to disrupt legislative priorities and personnel.

To streamline congressional action, we offer concrete recommendations for establishing legislative frameworks across a range of issues raised in the Roadmap. Rather than building the necessary frameworks from the ground up, our hope is that the analyses and recommendations included herein will provide actionable guidance for relevant committees and interested members that would reduce risks from advanced AI, improve US innovation, wellbeing, and global leadership, and meet the urgency of the moment.

AGI and Testing of Advanced General-Purpose AI Systems 

We applaud the AI Working Group for recognizing the unpredictability and risk associated with the development of increasingly advanced general-purpose AI systems (GPAIS). The Roadmap notes “the significant level of uncertainty and unknowns associated with general purpose AI systems achieving AGI.” We caution, however, against the assumption that the uncertainty and risks from AGI manifest only beyond a defined, rigid threshold, and emphasize that these systems exist on a spectrum of capability that correlates with risk and uncertainty. Unpredictability and risks have already been observed in the current state-of-the-art, which most experts categorize as sub-AGI, and are expected to increase in successive generations of more advanced systems, even as new risks emerge.

While the Roadmap encourages relevant committees to identify and address gaps in the application of existing law to AI systems within their jurisdiction, the general capabilities of these systems make it particularly challenging to identify appropriate committees of jurisdiction as well as existing legal frameworks that may apply. This challenge was a major impetus for establishing the AI Working Group — as the Roadmap notes in the Introduction, “the AI Working Group’s objective has been to complement the traditional congressional committee-driven policy process, considering that this broad technology does not neatly fall into the jurisdiction of any single committee.” 

Rather than a general approach to regulating the technology, the Roadmap suggests addressing the broad scope of AI risk through use case-based requirements on high-risk uses of AI. This approach may indeed be appropriate for most AI systems, which are designed to perform a particular function and operate exclusively within a specific domain. For instance, while some tweaks may be necessary, AI systems designed exclusively for financial evaluation and prediction can reasonably be overseen by existing bodies and frameworks for financial oversight. We are also pleased by the AI Working Group’s acknowledgement that some more egregious uses of AI should be categorically banned – the Roadmap specifically recommends a prohibition on the use of AI for social scoring, and encourages committees to “review whether other potential uses for AI should be either extremely limited or banned.”

That said, a use case-based approach is not sufficient for today’s most advanced GPAIS, which can effectively perform a wide range of tasks, including some for which they were not specifically designed, and can be utilized across distinct domains and jurisdictions. If the same system is routinely deployed in educational, medical, financial, military, and industrial contexts but is specialized for none of them, the governing laws, standards, and authorities applicable to that system cannot be easily discerned, complicating compliance with existing law and rendering regulatory oversight cumbersome and inefficient.

Consistent with this, the Roadmap asks committees to “consider a capabilities-based AI risk regime that takes into consideration short-, medium-, and long-term risks, with the recognition that model capabilities and testing and evaluation capabilities will change and grow over time.” In the case of GPAIS, such a regime would categorically include particular scrutiny for the most capable GPAIS, rather than distinguishing them based on the putative use-case.

Metrics 

Our ability to preemptively assess the risks and capabilities of a system is currently limited. As the Roadmap prudently notes, “(a)s our understanding of AI risks further develops, we may discover better risk-management regimes or mechanisms. Where testing and evaluation are insufficient to directly measure capabilities, the AI Working Group encourages the relevant committees to explore proxy metrics that may be used in the interim.” While substantial public and private effort is being invested in the development of reliable benchmarks for assessment of capabilities and associated risk, the field has not yet fully matured. Though some metrics exist for testing the capabilities of models at various cognitive tasks, no established benchmarks exist for determining their capacity for hazardous behavior without extensive testing across multiple metrics.

The number of floating-point operations (FLOPs) is a measure of computation and, in the context of AI, reflects the amount of computational resources (“compute”) used to train an AI model. Thus far, the amount of compute used to train an AI model has scaled remarkably well with the general capabilities of that model. The flurry of advancement in AI capabilities over the past few years has been driven primarily by innovations in high-performance computing infrastructure that allow models to leverage more training data and computational power, rather than by major innovations in model design. The resulting models have demonstrated capabilities highly consistent with predictions based on the amount of computing power used in training, and capabilities have in turn consistently correlated with identifiable risks.

While it is not clear whether this trend will continue, training compute has so far been the objective, quantitative measurement that best predicts the capabilities of a model prior to testing. In a capabilities-based regulatory framework, such a quantitative threshold is essential for initially delineating models subject to certain requirements from those that are not. That said, using a single proxy metric as the threshold creates the risk of failing to identify potentially high-risk models as advances in technology and efficiency are made, and of gamesmanship to avoid regulatory oversight.

Recommendations 

  1. Congress should implement a stratified, capabilities-based oversight framework for the most advanced GPAIS to complement use case-dependent regulatory mechanisms for domain-specific AI systems. Such a framework, through pre-deployment assessment, auditing, and licensure, and post-deployment monitoring, could conceivably mitigate risks from these systems regardless of whether they meet the relatively arbitrary threshold of AGI. While establishing a consensus definition of AGI is a worthwhile objective, it should not be considered prerequisite to developing a policy framework designed to mitigate risks from advanced GPAIS.
  2. Regulatory oversight of advanced GPAIS should employ the precautionary principle, placing the burden of proving the safety, security, and net public benefit of the system, and therefore its suitability for release, on the developer, and should prohibit the release of the system if it does not demonstrate such suitability. The framework should impose the most stringent requirements on the most advanced systems, with fewer regulatory requirements for less capable systems, in order to avoid unnecessary red tape and minimize the burden on smaller AI developers who lack the financial means to train the most powerful systems regardless.
  3. Audits and assessments should be conducted by independent, objective third-parties who lack financial and other conflicts of interest. These auditors could either be employed by the government or accredited by the government to ensure they are bound by standards of practice. For less powerful and lower risk systems, some assessments could be conducted in-house to reduce regulatory burden, but verifying the safety of the highest-risk systems should under no circumstances rely on self-governance by profit-motivated companies.
  4. Legislation governing advanced GPAIS should adopt regulatory thresholds that are inclusive of, but not limited to, training compute to ensure that current and future systems of concern remain in scope. Critically, these thresholds should each independently be sufficient to qualify a system as subject to additional scrutiny, such that exceeding, e.g., 10²⁵ FLOPs in training compute OR 100 billion parameters OR 2 trillion tokens of training data OR a particular score on a specified capabilities benchmark, risk assessment benchmark, risk assessment rubric, etc.,1 would require a model to undergo independent auditing and receive a license for distribution. This accounts for potential blindspots resulting from the use of proxy metrics, and allows flexibility for expanding the threshold qualifications as new benchmarks become available. (A minimal sketch of such an OR-combined test follows this list.)
  5. Congress should establish a centralized federal authority responsible for monitoring, evaluating, and regulating GPAIS due to their multi-jurisdictional nature, and for advising other agencies on activities related to AI within their respective jurisdictions. This type of “hub and spoke” model for an agency has been effectively implemented for the Cybersecurity and Infrastructure Security Agency (CISA), and would be most appropriate for the efficient and informed regulation of AI. Such an agency could also lead response coordination in the event of an emergency caused by an AI system. Notably, CISA began as a directorate within the Department of Homeland Security (National Protection and Programs Directorate), but was granted additional operational independence thereafter. A similar model for an AI agency could mitigate the logistical and administrative strain that could delay establishment of a brand new agency, with the Department of Energy or the Department of Commerce serving as the hub for incubating the new oversight body.
  6. Whistleblower protections should be augmented to cover reporting on unsafe practices in development and/or planned deployment of AI systems. It is not presently clear whether existing whistleblower protections for consumer product safety would be applicable in these circumstances; as such, new regulations may be necessary to encourage reporting of potentially dangerous practices. These protections should be expanded to cover a wide range of potential whistleblowers, including employees, contractors, and external stakeholders who know of unsafe practices. Protection should include legal protection against retaliation, confidentiality, safe reporting channels, and the investigation of reports documenting unsafe practices.
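To make Recommendation 4 concrete, the following is a minimal sketch, in Python, of how independently sufficient thresholds might be evaluated in practice. The cutoff values are the illustrative examples given above rather than settled regulatory figures, and the benchmark score field is a placeholder for whatever standardized capability or risk metric regulators eventually adopt.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative cutoffs drawn from the examples above; actual values would be set in statute or regulation.
TRAINING_COMPUTE_FLOP = 1e25   # training compute
PARAMETER_COUNT = 100e9        # 100 billion parameters
TRAINING_TOKENS = 2e12         # 2 trillion tokens of training data
BENCHMARK_SCORE = 0.85         # placeholder score on a hypothetical standardized capability/risk benchmark

@dataclass
class ModelProfile:
    training_flop: float
    parameters: float
    training_tokens: float
    benchmark_score: Optional[float] = None  # None if the model has not yet been benchmarked

def requires_heightened_scrutiny(m: ModelProfile) -> bool:
    """Each criterion is independently sufficient: exceeding ANY one of them
    places the system in scope for independent auditing and licensure."""
    return (
        m.training_flop >= TRAINING_COMPUTE_FLOP
        or m.parameters >= PARAMETER_COUNT
        or m.training_tokens >= TRAINING_TOKENS
        or (m.benchmark_score is not None and m.benchmark_score >= BENCHMARK_SCORE)
    )

# A model below the compute threshold but above the parameter threshold is still in scope.
print(requires_heightened_scrutiny(ModelProfile(training_flop=8e24, parameters=120e9, training_tokens=1.5e12)))  # True
```

Because the criteria are OR-combined, efficiency gains that reduce training compute do not automatically move a highly capable model out of scope, which is the blindspot a single proxy metric would create.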

Liability 

The Roadmap emphasizes the need to “hold AI developers and deployers accountable if their products or actions cause harm to consumers.” We agree that developers, deployers, and users should all be expected to behave responsibly in the creation, deployment, and use of AI systems, and emphasize that the imposition of liability on developers is particularly critical in order to encourage early design choices that prioritize the safety and wellbeing of the public.

The Roadmap also correctly points out that “the rapid evolution of technology and the varying degrees of autonomy in AI products present difficulties in assigning legal liability to AI companies and their users.” Under current law, it is unclear who is responsible when an AI system causes harm, particularly given the complexity of the AI supply chain. 

Joint and Several Liability 

When an individual is harmed as a consequence of an AI system, there are several parties that could be responsible: the developer who trained the AI model, the provider who offers that model for use, the deployer who deploys the model as part of an AI system, or the user/consumer who employs the system for a given purpose. In addition, advanced GPAIS often serve as “foundation models,” which are incorporated as one component of a more elaborate system, or which are fine-tuned by third-parties to select for particular characteristics. This presents the possibility for multiple parties to assume each of the aforementioned roles.

In such circumstances, joint and several liability is often appropriate. Joint and several liability provides that a person who has suffered harm can recover the full amount of damages from any of the joint and severally liable parties, i.e. those comprising the AI supply chain. The burden then rests on the defendant to recover portions of those damages from other parties based on their respective responsibilities for the harm. In other words, if a person is harmed by an AI system, that person would be able to sue any one of the developer, the provider, or the deployer of the system, and recover the full amount of damages, with these parties then determining their relative liability for that payment of damages independently of the injured party. 

This absolves the person harmed of the burden of identifying the specific party responsible for the harm they suffered, which would be nearly impossible given the complexity of the supply chain, the opacity of the backend functions of these systems, and the likelihood that multiple parties may have contributed to the system causing harm. Instead, the defendant, who is more familiar with the parties involved in the system’s lifecycle and the relative contributions of each to the offending function, would be charged with identifying the other responsible parties, and joining them as co-defendants in the case, as appropriate.

Strict Liability 

Strict liability refers to a form of liability in which the exercise of due care is not sufficient to absolve a defendant of liability for harm caused by their action or product. While products are generally subject to a particular brand of strict liability, in most cases services rely on a negligence standard, which absolves the defendant of liability if due care was exercised in the conduct of the service. The lack of clarity as to whether advanced GPAIS should be classified as products or services draws into question whether this strict liability framework applies.

The inherent unpredictability of advanced GPAIS and inevitability of emergent unforeseen risks make strict liability appropriate. Many characteristics of advanced GPAIS render their training and provision akin to an “abnormally dangerous activity” under existing tort law, and abnormally dangerous activities are typically subject to strict liability. Existing law considers an activity abnormally dangerous and subject to strict liability if: (1) the activity creates a foreseeable and highly significant risk of physical harm even when reasonable care is exercised by all actors; and (2) the activity is not one of common usage.2 A risk is considered highly significant if it is either unusually likely or unusually severe, or both. For instance, the operation of a nuclear power plant is considered to present a highly significant risk because while the likelihood of a harm-causing incident when reasonable care is exercised is low, the severity of harm should an incident occur would be extremely high. 

A significant portion of leading AI experts have attested to a considerable risk of catastrophic harm from the most powerful AI systems, including executives from the major AI companies developing the most advanced systems.3 The presence of these harms is thus evidently foreseeable and highly significant. Importantly, reasonable care is not sufficient to eliminate catastrophic risk from advanced GPAIS due to their inherent unpredictability and opacity, as demonstrated by the emergence of behaviors that were not anticipated by their developers in today’s state-of-the-art systems.4 As more capable advanced GPAIS are developed, this insufficiency of reasonable care will likely compound – an AGI system that exceeds human capacity across virtually all cognitive tasks, for instance, by definition would surpass the capacity of humans to exercise reasonable care in order to allay its risks. 

Additionally, given the financial and hardware constraints on training such advanced models, only a handful of companies have the capacity to do so, suggesting that the practice also “is not one of common usage.” In contrast, less capable systems are generally less likely to present emergent behaviors and inherently present a far lower risk of harm, particularly when reasonable care is exercised. Such systems are also less hardware intensive to train, and, while not necessarily “of common usage” at present, could qualify as such with continued proliferation.

Section 230 

Section 230 of the Communications Decency Act of 1996 provides that, among other things, “no provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider.”5 This provision, along with the statute’s protection of interactive computer services from liability for good faith moderation actions, has been broadly interpreted to protect online platforms from liability for the content they host, so long as the content was contributed by a party other than the platform itself.

The application of this statute to AI has yet to be tested in courts, and there is disagreement among legal scholars as to how Section 230 relates to generative AI outputs. On one hand, the outputs of generative AI systems are dependent on the input of another information content provider, i.e. the user providing the prompt. On the other hand, the outputs generated by the system are wholly unique, more akin to content provided by the platform itself. The prevailing view among academics is that generative AI products “operate on something like a spectrum between a retrieval search engine (more likely to be covered by Section 230) and a creative engine (less likely to be covered).”6 A robust liability framework for AI must therefore ensure that this area of the law is clarified, either by explicitly superseding Section 230, or by amending Section 230 itself to provide this clarification. Shielding the developers and operators of advanced AI systems from liability for harms resulting from their products would provide little incentive for responsible design that minimizes risk, and, as we have seen with social media, could result in wildly misaligned incentives to the detriment of the American public.

Recommendations 

  1. The development of a GPAIS trained with greater than, e.g., 10²⁵ FLOPs, or a system equivalent in capability,7 should be considered an abnormally dangerous activity and subject to strict liability due to its inherent unpredictability and risk, even when reasonable care is exercised.
  2. Developers of advanced GPAIS that fall below this threshold, but still exceed a lower threshold (e.g. 10²³ FLOPs) should be subject to a rebuttable presumption of negligence if the system causes harm – given the complexity, opacity, and novelty of these systems, and the familiarity of their developers with the pre-release testing the systems underwent, developers of these systems are best positioned to bear the burden of proving reasonable care was taken. (This tiering is illustrated in the sketch following this list.)
  3. Domain-specific AI systems should be subject to the legal standards applicable in that domain.
  4. Where existing law does not explicitly indicate an alternative apportionment of liability, AI systems should be subject to joint and several liability, including for all advanced GPAIS.
  5. Congress should clarify that Section 230 of the Communications Decency Act does not shield AI providers from liability for harms resulting from their systems, even if the output was generated in response to a prompt provided by a user. This could be accomplished by amending the definition of “information content provider” in Section 230(f)(3) to specify that the operator of a generative AI system shall be considered the “information content provider” for the outputs generated by that system.
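As a purely illustrative companion to Recommendations 1 and 2, the sketch below maps a system onto the proposed liability standards. The 10²⁵ and 10²³ FLOP figures are the example thresholds used above, and an actual statute would pair them with the additional capability criteria discussed in the previous section; joint and several liability (Recommendation 4) would apply across these tiers rather than replace them.

```python
def applicable_liability_standard(training_flop: float, general_purpose: bool) -> str:
    """Map a system onto the tiered liability standards proposed above.
    The 1e25 / 1e23 FLOP figures are illustrative thresholds, not settled values."""
    if general_purpose and training_flop >= 1e25:
        return "strict liability (abnormally dangerous activity)"
    if general_purpose and training_flop >= 1e23:
        return "rebuttable presumption of negligence"
    if not general_purpose:
        return "legal standards applicable in the system's domain"
    return "ordinary negligence under existing law"

print(applicable_liability_standard(3e25, general_purpose=True))   # strict liability (abnormally dangerous activity)
print(applicable_liability_standard(5e23, general_purpose=True))   # rebuttable presumption of negligence
print(applicable_liability_standard(1e24, general_purpose=False))  # legal standards applicable in the system's domain
```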

AI and National Security  

AI systems continue to display unexpected, emergent capabilities that present risks to the American public. Many of these risks, for example those from disinformation8 and cyberattacks9, were not discovered by evaluations conducted during the development life cycle (i.e. pre-training, training, and deployment), or were discovered but were deemed insufficiently severe or probable to justify delaying release. Moreover, AI companies have incentives to deploy AI systems quickly in order to establish and/or maintain market advantage, which may lead to substandard monitoring and mitigation of AI risks. When risks are discovered and harm is imminent or has occurred, it is vital for authorities to be informed as soon as possible to respond to the threat.

The Roadmap encourages committees to explore “whether there is a need for an AI-focused Information Sharing and Analysis Center to serve as an interface between commercial AI entities and the federal government to support monitoring of AI risks.” We see such an interface as essential to preserving national security in light of the risks, both unexpected and reasonably foreseeable, presented by these systems.

The Roadmap also notes that “AI has the potential to increase the risk posed by bioweapons and is directly relevant to federal efforts to defend against CBRN threats.” As state-of-the-art AI systems have become more advanced, they have increasingly demonstrated capabilities that could pose CBRN threats. For instance, in 2022, an AI system used for pharmaceutical research effectively identified 40,000 novel candidate chemical weapons in six hours.10 While the current generation of models has not yet significantly increased the ability of malicious actors to launch biological attacks, newer models are adept at providing the scientific knowledge, step-by-step experimental protocols, and guidance for troubleshooting experiments necessary to effectively develop biological weapons.11 Additionally, current models have been shown to significantly facilitate the identification and exploitation of cyber vulnerabilities.12 These capabilities are likely to scale over time.

Over the last two years, an additional threat has emerged at the convergence of biotechnology and AI, as ever more powerful AI models are ‘bootstrapped’ with increasingly sophisticated biological design tools, allowing for AI-assisted identification of virulence factors, in silico design of pathogens, and other capabilities that could significantly increase the capacity of malicious actors to cause harm.13 

The US government should provide an infrastructure for monitoring these AI risks that puts the safety of the American public front and center, gives additional support to efforts by AI companies, and allows for rapid response to harms from AI systems. Several precedents for such monitoring already exist. For instance, CISA’s Joint Cyber Defense Collaborative is a nimble network of cross-sector entities that are trusted to analyze and share cyber threats, the SEC requires publicly traded companies to disclose cybersecurity incidents within four business days, and the 2023 AI Executive Order requires companies to disclose ‘the physical and cybersecurity protections taken to assure the integrity of that training process against sophisticated threats’.14 

Recommendations 

  1. Congress should establish an Information Sharing and Analysis Center (ISAC) which will designate any model or system that meets a specified quantitative threshold15 as a model or system of national security concern. Congress should require developers building advanced AI systems to share documentation with the ISAC about the decisions taken throughout the development and deployment life-cycle (e.g., model cards detailing decisions taken before, during, and after the training and release of a model; an illustrative documentation schema is sketched after this list).
  2. The current draft of the 2025 National Defense Authorization Act (NDAA)16 tasks the Chief Digital and Artificial Intelligence Officer with developing an implementation plan for a secure computing and data storage environment (an ‘AIxBio sandbox’) to facilitate the testing of AI models trained on biological data, as well as the testing of products generated by such models. Congress should mandate that AI systems as or more powerful than those defined as models of national security concern (see above), or that are otherwise deemed to pose CBRN threats, be subjected to testing in this sandbox before deployment to ensure that these systems do not pose severe risks to the American public. This type of facility should follow the design and protocols of the national security sector’s Sensitive Compartmented Information Facility (SCIF) standards or the similar Data Cleanroom standards used in software litigation discovery.
  3. To ensure that GPAIS are not capable of revealing hazardous information, Congress should prohibit AI models from being trained on the most dangerous dual-use research of concern (DURC). Congress should also recommend appropriate restrictions for DURC data being used to train narrow AI systems – such as ringfencing of the most hazardous biological information from use in training – that could pose significant risk of misuse, malicious use, or unintended harm. In both cases, these requirements should cover data that, if widely available, would pose a potential CBRN risk.
  4. The federal government should invest in core CBRN defense strategies that are agnostic to AI, while bearing in mind that AI increases the probability of these threats materializing. Such investments should include next-generation personal protective equipment (PPE), novel medical countermeasures, ultraviolet-C technologies, and other recommendations from the National Security Commission on Emerging Biotechnology.17
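To illustrate the kind of lifecycle documentation Recommendation 1 envisions developers sharing with the ISAC, the following is a minimal, hypothetical schema sketched in Python. The field names are assumptions chosen for illustration, not an existing or proposed reporting standard, and a real reporting regime would be defined by the ISAC and the relevant agencies.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List

@dataclass
class LifecycleDisclosure:
    """Hypothetical model-card-style record a developer might file with an AI ISAC."""
    developer: str
    model_name: str
    estimated_training_flop: float            # estimated training compute
    data_sources: List[str]                   # high-level description of training data provenance
    pre_deployment_evaluations: List[str]     # safety/security evaluations performed and by whom
    known_dual_use_risks: List[str]           # e.g., CBRN- or cyber-relevant capabilities identified
    deployment_decisions: List[str] = field(default_factory=list)  # release scope, access controls, monitoring
    incidents_reported: List[str] = field(default_factory=list)    # post-deployment incidents shared with the ISAC

disclosure = LifecycleDisclosure(
    developer="ExampleLab (hypothetical)",
    model_name="example-gpais-1",
    estimated_training_flop=2e25,
    data_sources=["licensed web corpus", "public code repositories"],
    pre_deployment_evaluations=["third-party red team, May 2024", "CBRN uplift evaluation in a secure sandbox"],
    known_dual_use_risks=["assistance with cyber reconnaissance"],
)
print(json.dumps(asdict(disclosure), indent=2))  # serialized record as it might be transmitted to the ISAC
```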

Compute Security And Export Controls  

High-end AI chips are responsible for much of the rapid acceleration in development of AI systems. As these chips are an integral component of AI development and rely on a fairly tight supply chain – i.e. the supply chain is concentrated in a small number of companies in a small number of countries18 – chips are a promising avenue for regulating the proliferation of the highest-risk AI systems, especially among geopolitical adversaries and malicious non-state actors.

The Roadmap “encourages the relevant committees to ensure (the Bureau of Industry and Security) proactively manages (critical) technologies and to investigate whether there is a need for new authorities to address the unique and quickly burgeoning capabilities of AI, including the feasibility of options to implement on-chip security mechanisms for high-end AI chips.” We appreciate the recognition by the AI Working Group of on-chip security as a useful approach toward mitigating AI risk. Congress must focus on both regulatory and technical aspects of this policy problem to mitigate the risk of AI development from malicious actors.

The Roadmap also asks committees to develop a framework for determining when, or if, export controls should be placed on advanced AI systems. We view hardware governance and export controls as complementary and mutually-reinforcing measures, wherein on-chip security mechanisms can serve to mitigate shortcomings of export controls as a means of reducing broad proliferation of potentially dangerous systems.

Export controls, especially those with an expansive purview, often suffer from serious gaps in enforcement. In response to export controls on high-end chips used for training AI systems, for instance, a growing informal economy around chip smuggling has already emerged, and is likely to grow as BIS restrictions on AI-related hardware and systems become more expansive.19 Coupling export controls with on-chip governance mechanisms can help remedy this gap in enforcement by providing the ability to track and verify the location of chips, and to automatically or remotely disable their functionality based on their location when they are used or transferred in violation of export controls.

Export controls also generally target particular state actors rather than select applications, which may foreclose economic benefits and exacerbate geopolitical risks to United States interests relative to more targeted restrictions on trade. For example, broadly-applied export controls20 targeted at the People’s Republic of China (PRC) do not effectively distinguish between harmless use cases (e.g., chips used for video games or peaceful academic collaborations) and harmful use cases (e.g., chips used to train dangerous AI military systems) within the PRC. Expansive export controls have already led to severe criticism from the Chinese government,21 and may be having the unintended effect of pushing China toward technological self-reliance.22

In contrast, relaxing restrictions on chip exports to demonstrably low-risk customers and for low-risk uses in countries otherwise subject to export controls could improve the economic competitiveness of US firms and strengthen trade relationships key to maintaining global stability. These benefits are integral to guaranteeing sustained US leadership on the technological frontier, and to maintaining the geopolitical posture of the US. The ability for on-chip governance mechanisms to more precisely identify the location of a given chip and to determine whether the chip is co-located with many other chips or used in a training cluster could facilitate more targeted export controls that maintain chip trade with strategic competitors for harmless uses, while limiting their application toward potentially risky endeavors.

New and innovative hardware governance solutions are entirely compatible with the current state-of-the-art chips sold by leading manufacturers. All hardware relevant to AI development (e.g., H100s, A100s, TPUs) has some form of “trusted platform module (TPM)”, a hardware device that generates random numbers, holds encryption keys, and interfaces with other hardware modules to ensure platform integrity and report security-relevant metrics.23 Some new hardware (H100s in particular) has an additional “trusted execution environment (TEE)” or “secure enclave” capability, which prevents access to chosen sections of memory at the hardware level. TPMs and secure enclaves are already available and in use today, presently serving to prevent iPhones from being “jailbroken,” or used when stolen, and to secure biometric and other highly sensitive information in modern phones and laptops. As discussed, they can also facilitate monitoring of AI development to identify the most concerning uses of compute and take appropriate action, including automatic or remote shutdown if the chips are used in ways or in locations that are not permitted by US export controls.

These innovations could be transformative for policies designed to monitor AI development, as TEEs and TPMs use cryptographic technology to guarantee confidentiality and privacy for all users across a variety of use and governance models.24 Such guarantees are likely necessary for these chips to become the industry and international standard for use, and for willing adoption by strategic competitors. TEE and TPM security capabilities can also be used to construct an “attested provenance” capability that gives cryptographic proof that a given set of AI model weights or model outputs results from a particular auditable combination of data, source code, training characteristics (including amount of compute employed), and input data. This provides a uniquely powerful tool in verifying and enforcing licensing standards.
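The attested provenance capability described above can be sketched as follows: a digest binding the model weights to key training metadata is signed by a key that, in a real deployment, would be generated and held inside the chip’s TEE or TPM rather than in software. The example below is a simplified Python illustration using the widely available cryptography package, with an ordinary Ed25519 key standing in for a hardware-held attestation key; it is a sketch of the concept, not a description of any vendor’s actual attestation interface.

```python
import hashlib, json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# Stand-in for a key generated inside a TEE/TPM; in practice it would never leave the hardware.
attestation_key = Ed25519PrivateKey.generate()

def provenance_digest(weights: bytes, metadata: dict) -> bytes:
    """Bind the model weights to training metadata (compute used, code version, data manifest)."""
    record = hashlib.sha256(weights).hexdigest() + json.dumps(metadata, sort_keys=True)
    return hashlib.sha256(record.encode()).digest()

def attest(weights: bytes, metadata: dict) -> bytes:
    """The (simulated) hardware signs the digest, producing a provenance certificate for auditors."""
    return attestation_key.sign(provenance_digest(weights, metadata))

def verify(weights: bytes, metadata: dict, signature: bytes) -> bool:
    """An auditor or regulator checks that the claimed weights and metadata match the attestation."""
    try:
        attestation_key.public_key().verify(signature, provenance_digest(weights, metadata))
        return True
    except InvalidSignature:
        return False

weights = b"...model weight bytes..."
metadata = {"training_flop": 2e25, "code_version": "v1.2.0", "data_manifest": "sha256:abc123"}
signature = attest(weights, metadata)
print(verify(weights, metadata, signature))                              # True
print(verify(weights, {**metadata, "training_flop": 9e23}, signature))   # False: understating compute breaks verification
```

In a real system, the attestation public key would be certified by the chip manufacturer, allowing third parties to verify the provenance claim without relying on the developer’s self-reporting.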

Because state-of-the-art chips already possess the technical capability for this type of on-chip security, a technical solution to hardware governance would not impose serious costs on leading chip companies to modify the architecture of chips currently in inventory or in production. Additionally, it is possible to use these technical solutions for more centralized compute governance without creating back-channels that would harm the privacy of end-users of the chip supply chain – indeed these mechanisms can ensure privacy and limit communication of information to telemetry such as location and usage levels.

Recommendations 

  1. Congress should support the passage of H.R.8315, the Enhancing National Frameworks for Overseas Restriction of Critical Exports (ENFORCE) Act, which gives the Bureau of Industry and Security (BIS) the authority to control the export and re-export of covered AI systems, with amendments to ensure that the publication of AI models in a manner that is publicly accessible does not create a loophole to circumvent these controls, i.e., that open-weight systems meeting specified conditions qualify as exports under the Act.25 
  2. Congress should require companies developing AI systems that meet specified thresholds26 to use AI chips with secure hardware. This hardware should be privacy-preserving to allow for confidential computing but should also provide information on proof-of-location and the ability to switch chips off in emergency circumstances.27 Such a technical solution would complement robust export controls by facilitating enforcement and more effectively targeting harmful applications in particular. This could be accomplished through direct legislation prohibiting the domestic training of advanced AI systems using chips without such technology, and by providing a statutory obligation for BIS to grant export licenses for high-end AI chips and dual-use AI models only if they are equipped with these on-chip security mechanisms and trained using such chips, respectively.
  3. To avoid gaming, inaccuracy, or misrepresentation in the satisfaction of licensing requirements, Congress should phase in increasingly stringent evidentiary requirements for reporting of compute usage and auditing results. The recently established US AI Safety Institute within the National Institute of Standards and Technology should be tasked with developing a comprehensive standard for compute accounting to be used in threshold determinations (a simple illustration of compute accounting follows this list). Additionally, self-attestation of compute usage and capability evaluations should be upgraded to cryptographically attested provenance when this becomes technically practical.
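As an indication of what a compute accounting standard might formalize, a common rule of thumb estimates the training compute of a dense transformer model as roughly six FLOPs per parameter per training token. The sketch below applies that approximation to check a model against an illustrative reporting threshold; the constant, the example model sizes, and the threshold are assumptions for illustration, and an actual NIST standard would need to address sparse architectures, fine-tuning runs, and hardware-level measurement.

```python
def estimated_training_flop(parameters: float, training_tokens: float) -> float:
    """Rule-of-thumb estimate for dense transformer training compute: ~6 FLOPs per parameter per token."""
    return 6 * parameters * training_tokens

REPORTING_THRESHOLD_FLOP = 1e25  # illustrative threshold for licensing/reporting requirements

params = 175e9   # 175-billion-parameter model (illustrative)
tokens = 10e12   # 10 trillion training tokens (illustrative)

flop = estimated_training_flop(params, tokens)
print(f"Estimated training compute: {flop:.2e} FLOP")             # ~1.05e+25 FLOP
print("Subject to reporting:", flop >= REPORTING_THRESHOLD_FLOP)  # True
```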

Autonomous Weapons Systems (AWS) and Military Integration of AI  

The Roadmap asks committees to take actions that prioritize the “development of secure and trustworthy algorithms for autonomy in DOD platforms” and ensure “the development and deployment of Combined Joint All-Domain Command and Control (CJADC2) and similar capabilities by DOD.”

Following the 2021 CJADC2 Strategy28, the Department of Defense (DOD) announced a new generation of capabilities for CJADC2 early this year, which intend to use AI to “connect data-centric information from all branches of service, partners, and allies, into a singular internet of military things.” This built on similar efforts led by the Chief Digital and Artificial Intelligence Office (CDAO) and on the objectives of Task Force Lima to monitor, develop, evaluate, and recommend the responsible and secure implementation of generative AI capabilities across DOD.

While such innovations in the war-fighting enterprise present potential benefits – e.g., rapid integration of military intelligence, providing strategic decision advantage to commanders – there are significant pitfalls to rapid integration of AI systems, which have repeatedly proven unreliable, opaque, and unpredictable. Bugs in AI systems used in such critical settings could severely hamper the national defense enterprise and put American citizens and allies in danger, as a centralized system responsible for virtually all military functions creates a single point of failure and vulnerability. Integration of these systems may also lead to amplification of correlated biases in the decision-making of what would otherwise be independent AI systems used in military applications.

The Roadmap also “recognizes the DOD’s transparency regarding its policy on fully autonomous lethal weapons systems (and encourages) relevant committees to assess whether aspects of the DOD’s policy should be codified or if other measures, such as notifications concerning the development and deployment of such weapon systems, are necessary.”

As the draft text of the 2025 National Defense Authorization Act (NDAA)29 notes, the ‘small unmanned aircraft systems (UAS) threat continues to evolve, with enemy drones becoming more capable and dangerous.’ Autonomous weapons systems (AWS) are becoming increasingly cheap to produce and use, and swarms of such weapons pose a serious threat to the safety of citizens worldwide. When deployed en masse, swarms of autonomous weapons, which have demonstrated little progress in distinguishing between civilians and combatants in complex conflict environments, have the potential to cause mass casualties at the level of other kinds of WMDs. Their affordability also makes them a potentially potent tool for carrying out future genocides.

Overall, AWS have proven to be dangerously unpredictable and unreliable, demonstrating difficulty distinguishing between friend and foe. As these systems become more capable over time, they present a unique risk from loss of control or unintended escalation. Additionally, such systems are prone to cyber-vulnerabilities, and may be hacked by malicious actors and repurposed for malicious use.

Recommendations 

  1. Congress should mandate that nuclear launch systems remain independent from CJADC2 capabilities. The current air-gapped state of nuclear launch systems ensures that the critical decision to launch a nuclear weapon always remains within full human control. This separation also guards the nuclear command and control system against cyber-vulnerabilities that could otherwise arise if the system were integrated with various other defense systems. Other systems may possess unique vulnerabilities from which nuclear launch systems are presently insulated, but to which they would be exposed were the functions integrated.
  2. Building on the precedent set by the Air Force, Congress should require DOD to establish boards comprised of AI ethics officers across all offices involved in the production, procurement, development, and deployment of military AI systems.30
  3. In light of the comments made by former Chief Digital and AI Officer Dr. Craig Martell that all AI systems integrated into defense operations must have ‘five-digit accuracy’ (99.999%),31 Congress should task the CDAO with establishing clear protocols to measure this accuracy and prohibit systems which fall below this level of accuracy from being used in defense systems. 
  4. Congress should codify DOD Directive 3000.09 in statute to ensure that it is firmly established, and amend it to raise the bar from requiring ‘appropriate levels of human judgment’ to requiring ‘meaningful human control’ when AI is incorporated in military contexts. This is critical in ensuring that ‘human-in-the-loop’ is not used as a rubber stamp, and in emphasizing the need for human control at each stage of deployment. In addition, Congress should require the CDAO to file a report which establishes concrete guidance for meaningful human control in practice, for both AWS and decision-support systems.
  5. As the 2025 NDAA draft indicates, there is a need for development of counter-UAS (C-UAS) systems. Rather than ramping up development of unreliable and risky offensive AWS, Congress should instead instruct DOD to invest in non-kinetic counter-AWS (C-AWS) development. As AWS development accelerates and the risk of escalation heightens, the US should reassure allies that AWS is not the best countermeasure and instead push for advanced non-kinetic C-AWS technology.

Open-source AI  

Recently, “open-source AI” has been used to refer to AI models for which model weights, the numerical values that dictate how a model translates inputs into outputs, are widely available to the public. It should be noted that an AI system with widely available model weights alone does not fit the traditional criteria for open-source. The inconsistent use of this term has allowed many companies to benefit from the implication that models with varying degrees of openness might still fulfill the promises of open-source software (OSS), even when they do not adhere to the core principles of the open-source movement32. Contrary to the marketing claims of Big Tech companies deploying “open-source” AI models, Widder et al. (2023) argue that while maximally “open” AI can indeed provide transparency, reusability, and extensibility, allowing third parties to deploy and build upon powerful AI models, it does not guarantee democratic access, meaningful competition, or sufficient oversight and scrutiny in the AI field. 

Advanced AI models with widely available model weights pose particularly significant risks to society due to their unique characteristics, potential for misuse, and the difficulty of evaluating and controlling their capabilities. In the case of CBRN risks, as of early 2024, evidence suggests that the current generation of closed AI systems function as instruments comparable to internet search engines in facilitating the procurement of information that could lead to harm.33 However, these experiments were carried out using proprietary models with fine-tuned safeguards. The release of model weights allows for trivial removal of any safeguards that might be added to mitigate these risks and lowers the barrier to entry for adapting systems toward more dangerous capabilities through fine-tuning.34,35,36

As AI models become more advanced, their reasoning, planning, and persuasion capabilities are expected to continue to grow, which will in turn increase the potential for misuse by malicious actors and loss of control over the systems by careless operators. Relevant legislation should account for the difficulty in accurately predicting which models will possess capabilities strong enough to pose significant risks with and without the open release of their model weights.37 Unanticipated vulnerabilities and dangerous capabilities can be particularly insidious when model weights are openly released, as such models cannot be effectively retracted in order to patch issues, and the unpatched versions remain indefinitely available for use.

“Open AI systems” have already demonstrated the potential to facilitate harmful behavior, particularly by way of cyberattacks, disinformation, and the proliferation of child sexual abuse material (CSAM).38, 39 The UK National Cyber Security Centre found that AI systems are expected to significantly increase the volume and impact of cyber attacks by 2025, with varying degrees of influence on different types of cyber threats.40 While the near-term threat primarily involves the enhancement of existing tactics, techniques, and procedures, AI is already being used by both state and non-state actors to improve reconnaissance and social engineering. More advanced AI applications in cyber operations would likely be limited to well-resourced actors with access to quality training data, immense computational resources, and expertise, but open release of model weights by these well-resourced actors could provide the same capacity to a wider range of threat actors, including cybercriminals and state-sponsored groups.

The Roadmap asks committees to “investigate the policy implications of different product release choices for AI systems, particularly to understand the differences between closed versus fully open-source models (including the full spectrum of product release choices between those two ends of the spectrum).” We appreciate the Roadmap’s implication that “open-source model” product releases present additional questions in understanding the risks posed by AI systems, and recommend the following measures to mitigate the unique risks posed by the release of model weights.

Recommendations  

  1. Congress should require that AI systems with open model weights undergo thorough testing and evaluation in secure environments appropriate to their level of risk. The government should conduct these assessments directly or delegate them to a group of government-approved independent auditors. When assessing these models, auditors must assume that a) built-in safety measures or restrictions could be removed or bypassed once the model is released, and b) the model could be fine-tuned or combined with other resources, potentially leading to the development of entirely new and unanticipated capabilities. Insufficient safeguards to protect against dangerous capabilities or dangerous unpredictable behavior should justify the authority to suspend the release of model weights, and potentially the system itself, until such shortcomings are resolved. In cases where full access to a model’s weights is needed to reliably audit the capabilities of a system, assessment should be conducted in Sensitive Compartmented Information Facilities (SCIFs) to ensure appropriate security measures.41,42
  2. Developers should be legally responsible for performing all reasonable measures to prevent their models from being retrained to substantially enable illegal activities, and for any harms resulting from their failure to do so. When model weights are made widely available, it becomes intractable for developers to retract, monitor, or patch the system. Presently, there is no reliable method of comprehensively identifying all of the capabilities of an AI system. Latent capabilities, problematic use-cases, and vulnerabilities are often identified far into the deployment life-cycle of a system or through additional fine-tuning. Despite the difficulties in identifying the full range of capabilities, developers should be held liable if their model was used to substantially enable illegal activities.
  3. To mitigate the concentration of power in AI while ensuring AI safety and security, initiatives like the National Artificial Intelligence Research Resource (NAIRR) should be pursued to create “public options” for AI. As previously discussed, the impacts of open-source AI on the concentration of power and on mitigating market consolidation are often overstated. This does not discount the importance of preventing the concentration of power, both within the technology market and for society at large, that is likely to result from the high barrier to entry for training the most advanced AI systems. One potential solution is for the U.S. to further invest in “public options” for AI. Initiatives like the National Artificial Intelligence Research Resource could help develop and maintain publicly-funded AI models, services, and infrastructure. This approach would ensure that access to advanced AI is not solely controlled by corporate or proprietary interests, allowing researchers, entrepreneurs, and the general public to benefit from the technology while prioritizing safety, security, and oversight.

Supporting US AI Innovation 

The Roadmap has the goal of “reaching as soon as possible the spending level proposed by the National Security Commission on Artificial Intelligence (NSCAI) in their final report: at least $32 billion per year for (non-defense) AI innovation.” We appreciate the support for non-military innovation of AI, and emphasize that AI innovation should not be limited to advancing the capabilities or raw power of AI systems. Rather, innovation should prioritize specific functions that maximize public benefit and tend to be under-incentivized in industry, and should include extensive research into improving the safety and security of AI systems. This means enhancing the explainability of outputs, developing tools for the evaluation of risk, and building mechanisms for ensuring predictability and maintaining control over system behavior.

To this end, the Roadmap also expresses the need for funding efforts to enhance AI safety and reliability through initiatives to support AI testing and evaluation infrastructure and the US AI Safety Institute, as well as increased resources for BIS to ensure effective monitoring and compliance with export control regulations. The Roadmap also emphasizes the importance of R&D and interagency coordination focused on the intersection of AI and critical infrastructure. We commend the comprehensive approach to R&D efforts across multiple agencies, as it recognizes the critical role that each of these entities plays in ensuring the safe and responsible development of AI technologies. In particular, we see the intersection of AI and critical infrastructure as a major vector of potential AI risk if due care is not taken to ensure the reliability and security of systems integrated into critical infrastructure and to strengthen resilience against possible AI-assisted cyberthreats.

Research focused on the safe development, evaluation, and deployment of AI is vastly under-resourced when compared to research focused on the general development of AI. AI startups received almost $50 billion in funding in 2023.43 According to the 2024 Stanford AI Index Report, industry produced 51 notable machine learning models, academia contributed 15, and the government contributed 2.44 While the amount of resources that private companies allocate to safety research is unclear – there can be some overlap between safety and capabilities research – it is significantly less than investment in capabilities. Recently, members of teams working on AI safety at OpenAI have resigned, citing concerns about the company’s approach to AI safety research.45 This underscores the need for funding focused on the safe development, evaluation, and deployment of AI.

Recommendations 

1. R&D funding to BIS should include allocation to the development of on-chip hardware governance solutions, and the implementation of those solutions. To best complement the role of BIS in implementing export controls on advanced chips and potentially on AI models, this funding should include R&D supporting the further development of privacy-preserving monitoring such as proof-of-location and the ability to switch chips off in circumstances where there is a significant safety or regulatory violation.46 After appropriate on-chip governance solutions are identified, funding should also be directed towards enabling the implementation of those solutions in relevant export control legislation.

2. The expansion of NAIRR programs should include funding directed toward the development of secure testing and usage infrastructure for academics, researchers, and members of civil society. We support efforts by the NAIRR pilot program to improve public access to research infrastructure. As AI systems become increasingly capable, levels of access to AI tools and resources should be dynamic relative to their level of risk. Accordingly, it may be beneficial for those receiving any government funding for their work on powerful models (including the private sector) to provide structured access to their systems via the NAIRR, subject to specific limitations on use and security measures, including clearance requirements and SCIFs where necessary, to allow third parties to probe these systems and develop the tools necessary to make them safer.

3. R&D on interagency coordination focused on the intersection of AI and critical infrastructure should include allocation to safety and security research. The ultimate goal of this research should be to establish stringent baseline standards for the safe and secure integration of AI into critical infrastructure. These standards should address key aspects such as transparency, predictability, and robustness of AI systems, ensuring that they can be effectively integrated without introducing additional vulnerabilities. Funding should also acknowledge the lower barrier to entry for malicious actors to conduct cyberattacks as publicly-accessible AI becomes more advanced and widespread, and seek improved mechanisms to strengthen cybersecurity accordingly.

Combating Deepfakes 

The Roadmap encourages the relevant committees to consider legislation “to protect children from potential AI-powered harms online by ensuring companies take reasonable steps to consider such risks in product design and operation.” We appreciate the Roadmap’s recognition that product design, and by extension product developers, play a key role in mitigating AI-powered harms. The Roadmap also encourages the consideration of legislation “that protects against unauthorized use of one’s name, image, likeness, and voice, consistent with First Amendment principles, as it relates to AI”, “legislation to address online child sexual abuse material (CSAM), including ensuring existing protections specifically cover AI-generated CSAM,” and “legislation to address similar issues with non-consensual distribution of intimate images and other harmful deepfakes.”

Deepfakes, which are pictures, videos, and audio that depict a person without their consent, usually for the purpose of harming that person or misleading those who are exposed to the material, lie at the intersection of these objectives. There are many ways in which deepfakes systematically undermine individual autonomy, perpetuate fraud, and threaten our democracy. For example, 96% of deepfakes are sexual material,47 and fraud committed using deepfakes rose 3,000% globally in 2023 alone.48 Deepfakes have also begun interfering with democratic processes by spreading false information and manipulating public opinion49 through convincing fake media, which can influence, and have already influenced, electoral outcomes50. The Roadmap encourages committees to “review whether other potential uses for AI should be either extremely limited or banned.” We believe deepfakes fall into that category.

Deepfakes are the result of a multilayered supply chain, which begins with model developers, who design the underlying algorithms and models. Cloud compute providers such as Amazon Web Services (AWS), Google Cloud Platform, and Microsoft Azure form the next link in the chain by offering the necessary computational resources for running and in some cases training deepfake models. These platforms provide the infrastructure and scalability required to process large datasets and generate synthetic media efficiently. Following them are model providers, such as Deepnude51, Deepgram52, and Hoodem53, which offer access to pre-trained deepfake models or user-friendly software tools, enabling even those with limited technical expertise to produce deepfakes.

The end users of deepfake technology are typically individuals or groups with malicious intent, utilizing these tools to spread misinformation, manipulate public opinion, blackmail individuals, or engage in other illicit activities. Once created, these deepfakes are distributed through various online platforms, including social media sites such as Facebook, Twitter, and YouTube, as well as messaging apps like WhatsApp and Telegram. The proliferation of deepfakes on these platforms can be rapid and extensive, making it nearly impossible to remove the synthetic media once published. Accordingly, it is critical to prevent the production of deepfakes before their publication and distribution can occur.

As the creators and distributors of the powerful tools that enable the mass production of this harmful content, model developers and providers hold the most control and responsibility in the deepfake supply chain. Developers have the capability to stop the misuse of these technologies at the source by restricting access, disabling harmful functionalities, and simply refusing to train models for harmful and illegal purposes such as the generation of non-consensual intimate images. There are far fewer model developers than providers, making this link in the supply chain particularly effective for operationalizing accountability mechanisms. While compute providers also play a role by supplying the necessary resources for AI systems to function, their ability to monitor and control the specific use cases of these resources is more limited. To effectively stem the risks and harms that deepfakes engender, legislative solutions must address the issue as a whole, rather than only particular use cases, reflecting the broad and multifaceted threats that extend beyond any single application. A comprehensive legal framework would ensure that all potential abuses are addressed, creating a robust defense against the diverse and evolving nature of deepfake technology.

Recommendations 

  1. Congress should set up accountability mechanisms that reflect the spread of responsibility and control across the deepfake supply chain. Specifically, model developers and providers should be subject to civil and/or criminal liability for harms resulting from deepfakes generated by their systems. Similar approaches have been taken in existing Congressional proposals such as the NO AI FRAUD Act (H.R. 6943), which would create a private right of action against companies providing a “personalized cloning service”. When a model is being used to quickly and cheaply create an onslaught of deepfakes, merely holding each end-user accountable would be infeasible and would nonetheless be insufficient to prevent the avalanche of harmful deepfakes flooding the internet.
  2. Users accessing models to produce and share deepfakes should be subject to civil and/or criminal liability. This approach is already reflected in several bills proposed within Congress such as the NO AI FRAUD Act (H.R. 6943), NO FAKES Act, DEFIANCE Act (S.3696), and the Preventing Deepfakes of Intimate Images Act (H.R. 3106).
  3. Congress should place a responsibility on compute providers to revoke access to their services when they have knowledge that their services are being used to create harmful deepfakes, or to host models that facilitate the creation of harmful deepfakes. This will ensure that compute providers are not complicit in the mass production of deepfakes.
  4. Congress should support the passage of proposed bills like the NO FAKES Act, with some modifications to clarify the liability of model developers.54 Many recently introduced bills contain elements which would be effective in combating deepfakes, although it is crucial that they are strengthened to adequately address the multilayered nature of the deepfakes supply chain.

Provenance And Watermarking 

Watermarking aims to embed a statistical signal into AI-generated content, making it identifiable as such. Ideally, this would allow society to differentiate between AI-generated and non-AI content. However, watermarking has significant drawbacks. First, deepfakes such as non-consensual intimate images and CSAM are still considered harmful even when marked as AI-generated.55 Websites hosting AI-generated sexual images often disclose their origin, yet the content continues to cause distress to those depicted. Second, recent research has shown that robust watermarking is infeasible, as determined adversaries can easily remove these markers.56 As such, it is not sufficient to rely on watermarking alone as the solution to preventing the proliferation of deepfakes, nor for conclusively distinguishing real from synthetic content.

Nonetheless, certain types of watermarks and/or provenance data can be beneficial in combating the deepfake problem. “Model-of-origin” watermarking provisions would require generative AI outputs to carry information identifying the model used to create them, along with that model’s developer and/or provider. This information can be included in the metadata of the output and can greatly enhance both legal and public accountability for developers of models used to create harmful content. Indicating the model of origin of outputs would also enable the identification of models that are disproportionately vulnerable to untoward use.
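
To make the mechanics concrete, the sketch below shows one minimal, illustrative way a generator could attach model-of-origin information to an image output’s metadata. It is a hedged example only: the field names and JSON payload are our own assumptions rather than a statutory or standards-based schema (such as those surveyed in NIST AI 100-4 or C2PA), and metadata of this kind can be stripped, which is why the robustness expectations discussed below remain important.

```python
# Illustrative sketch only: attaching "model-of-origin" provenance to a generated
# PNG's metadata with Pillow. Field names are hypothetical, not a standard schema.
import json
from datetime import datetime, timezone

from PIL import Image
from PIL.PngImagePlugin import PngInfo


def save_with_provenance(image: Image.Image, out_path: str,
                         model_name: str, model_version: str, developer: str) -> None:
    """Save a generated image with model-of-origin information embedded in its metadata."""
    provenance = {
        "model_name": model_name,          # e.g. "ExampleGen" (hypothetical)
        "model_version": model_version,    # e.g. "1.2"
        "developer": developer,            # the entity that trained/released the model
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
    info = PngInfo()
    info.add_text("ai_provenance", json.dumps(provenance))  # stored as a PNG text chunk
    image.save(out_path, pnginfo=info)


# Usage (hypothetical model names):
# save_with_provenance(generated_image, "output.png", "ExampleGen", "1.2", "Example Labs")
```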

Consistent with this approach, the Roadmap encourages committees to “review forthcoming reports from the executive branch related to establishing provenance of digital content, for both synthetic and non-synthetic content.” It also recommends considering “developing legislation that incentivizes providers of software products using generative AI and hardware products such as cameras and microphones to provide content provenance information and to consider the need for legislation that requires or incentivizes online platforms to maintain access to that content provenance information.”

Forthcoming legislation should indeed require providers of AI models to include content provenance information embedded in or presented along with the outputted content; however, developers should also bear this responsibility. Unlike model providers, developers can embed provenance information directly into the models during the development phase, ensuring that it is an integral part of the AI-generated content from the outset.

Recommendations 

  1. Both model developers and providers should be required to integrate provenance tracking capabilities into their systems. While voluntary commitments have been made by certain developers, provenance watermarking is most trustworthy when it is widespread, and this is not currently the industry norm. As the National Institute of Standards and Technology report on Reducing Risks Posed by Synthetic Content (NIST AI 100-4) outlines, several watermarking and labeling techniques have become prominent, meaning that there are established standards that can be viably adopted by both developers and providers.
  2. Model developers and providers should be expected to make content provenance information as difficult to bypass or remove as possible, taking into account the current state of science. It is unlikely that most users creating deepfakes have the technical competency to remove watermarks and/or metadata, but model-of-origin provisions are nonetheless most effective if they are inseparable from the content. While studies have shown that malicious actors can bypass current deepfake labeling and watermarking techniques, stakeholders should ensure, to the greatest extent possible, that such bypasses are minimized. The absence of requirements that model-of-origin information be as difficult to remove as possible may unintentionally incentivize developers and deployers to employ watermarks that are easier to remove in an effort to minimize accountability.
  3. Congress should support the passage of the AI Labeling Act, which mandates clear and permanent notices on AI-generated content, identifying the content as AI-produced and specifying the tool used along with the creation date. This transparency helps hold developers accountable for harmful deepfakes, potentially deterring irresponsible AI system design.
  4. Congress should support amendments to various bills originating in the House, including the AI Disclosure Act and the DEEPFAKES Accountability Act, such that they clearly include model-of-origin watermarking provisions.57

Conclusion 

We thank the Senate AI Working Group for its continued dedication to the pressing issue of AI governance. AI as a technology is complex, but the Roadmap demonstrates a remarkable grasp of the major issues it raises for the continued flourishing of the American people. The next several months will be critical for maintaining global leadership in responsible AI innovation, and the urgent adoption of binding regulation is essential to creating the right incentives for continued success. Time and time again, Congress has risen to meet the challenge of regulating complex technology, from airplanes to pharmaceuticals, and we are confident that the same can be done for AI.


↩ 1 Throughout this document, where we recommend thresholds for the most advanced/dangerous GPAIS, we are generally referring to multi-metric quantitative thresholds set at roughly these levels. While the recent AI Executive Order (“Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence”) presumes an AI model to be “dual-use” if it is trained on 10²⁶ FLOPs or more, we recommend a training compute threshold set at 10²⁵ FLOPs to remain consistent with the EU AI Act’s threshold for presuming systemic risk from GPAIS. Such a threshold would apply to fewer than 10 current systems, most of which have already demonstrated some hazardous capabilities.

↩ 2 Restatement (Third) of Torts: Liability for Physical Harm § 20 (Am. Law Inst. 1997).

↩ 3 In a survey of 2,778 researchers who had published research in top-tier AI venues, roughly half of respondents gave at least a 10% chance of advanced AI leading to outcomes as bad as extinction. K Grace, et al., “Thousands of AI Authors on the Future of AI,” Jan. 2024.

See also, e.g., S Mukherjee, “Top AI CEOs, experts raise ‘risk of extinction’ from AI,” Reuters, May 30, 2023.

↩ 4 See, e.g., J Wei, et al., “Emergent Abilities of Large Language Models,” Jun. 15, 2022 (last revised: Oct. 26, 2022).

↩ 5 47 U.S.C. § 230(c)(1)

↩ 6 Henderson P, Hashimoto T, Lemley M, “Where’s the Liability for Harmful AI Speech?” Journal of Free Speech Law.

↩ 7 See fn. 1.

↩ 8 Exclusive: GPT-4 readily spouts misinformation, study finds. Axios.

↩ 9 OpenAI’s GPT-4 Is Capable of Autonomously Exploiting Zero-Day Vulnerabilities. Security Today.

↩ 10 AI suggested 40,000 new possible chemical weapons in just six hours. The Verge.

↩ 11 Can large language models democratize access to dual-use biotechnology? MIT.

↩ 12 R Fang, et al., “Teams of LLM Agents can Exploit Zero-Day Vulnerabilities,” Jun. 2, 2024; T Claburn, “OpenAI’s GPT-4 can exploit real vulnerabilities by reading security advisories,” The Register, Apr. 17, 2024 (accessed Jun. 13, 2024).

↩ 13 J O’Brien & C Nelson, “Assessing the Risks Posed by the Convergence of Artificial Intelligence and Biotechnology,” Health Secur., 2020 May/Jun; 18(3):219-227. doi: 10.1089/hs.2019.0122. https://pubmed.ncbi.nlm.nih.gov/32559154/.

↩ 14 Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. The White House.

SEC Adopts Rules on Cybersecurity Risk Management, Strategy, Governance, and Incident Disclosure by Public Companies. US Securities and Exchange Commission. https://www.sec.gov/news/press-release/2023-139

↩ 15 See fn. 1.

↩ 16 Draft text current as of May 25th, 2024.

↩ 17 Interim Report. National Security Commission on Emerging Biotechnology. Also see AIxBio White Paper 4: Policy Options for AIxBio. National Security Commission on Emerging Biotechnology.

↩ 18 Maintaining the AI Chip Competitive Advantage of the United States and its Allies. Center for Security and Emerging Technology.

↩ 19 Preventing AI Chip Smuggling to China. Center for a New American Security.

↩ 20 BIS has limited information on distinguishing between use-cases, and is compelled to favor highly adversarial and broad controls to mitigate security risks from lack of enforcement.

↩ 21 China lashes out at latest U.S. export controls on chips. Associated Press.

↩ 22 Examining US export controls against China. East Asia Forum.

↩ 23 For more information on TPMs, see Safeguarding the Future of AI: The Imperative for Responsible Development. Trusted Computing Group.

↩ 24 Similar technology is also employed in Apple’s Private Cloud Compute. See “Private Cloud Compute: A new frontier for AI privacy in the cloud,” Apple Security Research Blog, Jun. 10, 2024, https://security.apple.com/blog/private-cloud-compute/.

↩ 25 Open source systems developed in the United States have supercharged the development of AI systems in China, the UAE and elsewhere.

See, How dependent is China on US artificial intelligence technology? Reuters;

Also see, China’s Rush to Dominate A.I. Comes With a Twist: It Depends on U.S. Technology. New York Times.

↩ 26 See fn. 1.

↩ 27 For an example of a technical project meeting these conditions, see the Future of Life Institute response to the Bureau of Industry and Security’s Request for Comment RIN 0694–AI94 on implementation of additional export controls, which outlines an FLI project underway in collaboration with Mithril Security.

↩ 28 A CJADC2 Primer: Delivering on the Mission of “Sense, Make Sense, and Act”. Sigma Defense.

↩ 29 Draft text current as of May 25th, 2024.

↩ 30 “Air Force names Joe Chapa as chief responsible AI ethics officer.” FedScoop.

↩ 31 “US DoD AI chief on LLMs: ‘I need hackers to tell us how this stuff breaks’.” VentureBeat.

↩ 32 Widder, David Gray and West, Sarah and Whittaker, Meredith, Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI (August 17, 2023). Available at SSRN: https://ssrn.com/abstract=4543807 

↩ 33 Mouton, Christopher A., Caleb Lucas, and Ella Guest, The Operational Risks of AI in Large-Scale Biological Attacks: Results of a Red-Team Study. Santa Monica, CA: RAND Corporation, 2024.

↩ 34 Lermen, Simon, Charlie Rogers-Smith, and Jeffrey Ladish. “LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B.” arXiv, Palisade Research, 2023, https://arxiv.org/abs/2310.20624.

↩ 35 Gade, Pranav, et al. “BadLlama: Cheaply Removing Safety Fine-Tuning from Llama 2-Chat 13B.” arXiv, Conjecture and Palisade Research, 2023, https://arxiv.org/abs/2311.00117.

↩ 36 Yang, Xianjun, et al. “Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models.” arXiv, 2023, https://arxiv.org/abs/2310.02949.

↩ 37 M Anderljung, et al., “Frontier AI Regulation: Managing Emerging Risks to Public Safety.” Nov. 7, 2023. pp.35-36. https://arxiv.org/pdf/2307.03718 (accessed June 7, 2024).

↩ 38 CrowdStrike. 2024 Global Threat Report. CrowdStrike, 2023, https://www.crowdstrike.com/global-threat-report/.

↩ 39 Thiel, David, Melissa Stroebel, and Rebecca Portnoff. “Generative ML and CSAM: Implications and Mitigations.” Stanford Cyber Policy Center, Stanford University, 24 June 2023.

↩ 40 The near-term impact of AI on the cyber threat. (n.d.).

↩ 41 S Casper, et al., “Black-Box Access is Insufficient for Rigorous AI Audits,” Jan. 25, 2024 (last revised: May 29, 2024), https://doi.org/10.1145/3630106.3659037.

↩ 42 Editor, C. C. (n.d.). Sensitive Compartmented Information Facility (SCIF) – glossary: CSRC. CSRC Content Editor.

↩ 43 “Rounds Raised by Startups Using AI In 2023,” Crunchbase.

↩ 44 N Maslej, et al., “The AI Index 2024 Annual Report,” AI Index Steering Committee, Institute for Human-Centered AI, Stanford University, Stanford, CA, Apr. 2024, https://aiindex.stanford.edu/report/.

↩ 45 Roose, K. (2024, June 4). OpenAI insiders warn of a “reckless” race for dominance. The New York Times.

↩ 46 For an example of a technical project meeting these conditions, see the Future of Life Institute response to the Bureau of Industry and Security’s Request for Comment (RIN 0694–AI94), which outlines an FLI project underway in collaboration with Mithril Security. Additional detail on the implementation of compute governance solutions can be found in the “Compute Security and Export Controls” section of this document.

↩ 47 H Ajder et al. “The State of Deepfakes.” Sept. 2019.

↩ 48 Onfido. “Identity Fraud Report 2024.” 2024.

↩ 49 G De Vynck. “OpenAI finds Russian and Chinese groups used its tech for propaganda campaigns.” May. 30, 2024.

↩ 50 M Meaker. “Slovakia’s election deepfakes show AI Is a danger to democracy.” Oct. 3, 2023.

↩ 51 S Cole. “This horrifying app undresses a photo of any Woman with a single click.” Jun. 26, 2019.

↩ 52 Deepgram. “Build voice into your apps.” https://deepgram.com/

↩ 53 Hoodem. “Create any deepfake with no limitation.” https://hoodem.com/

↩ 54 See Future of Life Institute’s ‘Recommended Amendments to Legislative Proposals on Deepfakes‘ report.

↩ 55 M B Kugler, C Pace. “Deepfake privacy: Attitudes and regulation.” Feb. 8, 2021.

↩ 56 H Zhang, B Elderman, & B Barak. “Watermarking in the sand.” Nov. 9, 2023. Kempner Institute, Harvard University.

↩ 57 See Future of Life Institute’s ‘Recommended Amendments to Legislative Proposals on Deepfakes‘ report.

FLI Recommendations for the AI Research, Innovation, and Accountability Act of 2023 https://futureoflife.org/document/ai-research-innovation-accountability-act/ Thu, 16 May 2024 14:35:08 +0000 https://futureoflife.org/?post_type=document&p=124558

Summary of Recommendations

  • Require external, independent assessment of compliance with prescribed TEVV standards for critical-impact AI systems.
  • Include developers of critical-impact AI systems in the definition of “critical-impact AI organization.”
  • Include advanced general-purpose AI systems in the definition of “critical-impact AI system.”
  • Remove right to cure.
  • Authorize the Secretary to prohibit a critical-impact AI organization from making a critical-impact AI system available to the public on the basis of reported information if the system presents unacceptable risk of harm.
  • Increase maximum civil penalties to scale with gross receipts of organization.
  • Clarify that the limitation on disclosure of trade secrets, IP, and confidential business information applies only to the extent that the information is not necessary to comply with the requirements of the bill.

Bill Summary

The “Artificial Intelligence Research, Innovation, and Accountability Act of 2023 (AIRIAA),” authored by Sen. Thune and co-authored by Sens. Klobuchar, Wicker, Hickenlooper, Luján, and Capito, is, to date, the most comprehensive legislative framework for the regulation of high-risk AI systems that has been introduced in Congress. The AIRIAA has several major provisions that seek to address a range of potential harms from advanced AI systems through transparency, recommendations for standards and best practices, mandatory assessments, and self-certification.

These provisions:

  1. commission research and development of standards for verifying provenance of AI-generated and human-produced content, including through watermarking;
  2. commission research and development of standardized methods for the detection and understanding of anomalous behavior by AI systems, and safeguards to mitigate “potentially adversarial or compromising anomalous behavior”;
  3. commission a study into the barriers and best practices for the use of AI by government agencies;
  4. require notice to users when a covered internet platform uses generative AI to generate content that the user sees;
  5. require transparency reports for “high-impact AI systems” to be submitted to the Secretary of Commerce prior to deployment that detail design and safety plans for the system;
  6. require the National Institute of Standards and Technology (NIST) to provide sector-specific recommendations to individual Federal agencies on the oversight of high-impact AI systems within their domain, and require each agency to submit a formal written response to the Director of NIST indicating whether or not those recommendations will be adopted;
  7. require an organization that deploys a “critical-impact AI system” to perform a risk management assessment at least 30 days before the system is made publicly available and at least biennially thereafter, and to submit a report to the Secretary of Commerce detailing that assessment;
  8. establish an advisory committee to provide recommendations on standards for testing, evaluation, validation, and verification (TEVV) of critical-impact AI systems;
  9. require the Secretary of Commerce to establish a 3-year implementation plan for self-certification of compliance with established TEVV standards by critical-impact AI organizations, and;
  10. establish a working group for furthering consumer education related to AI systems.

We applaud the Senator on his effort to develop a detailed framework toward AI governance infrastructure in both the public and private sector. The framework recognizes that AI systems vary in their potential risk, and takes the reasoned approach of stratifying requirements on the basis of impact. Provisions for development of digital content provenance and risk mitigation standards, detailed guidance on responsible government AI use, and transparency, assessment, and certification requirements for the highest-risk AI systems are commendable, and have the potential to reduce the risks presented by advanced AI systems.

Unfortunately, several shortcomings in the bill in print significantly limit the effectiveness of these provisions, rendering the proposed bill insufficient to adequately address the catastrophic risks these high-risk AI systems present. These shortcomings, discussed in the following section, can be easily resolved by relatively minor amendments to the bill’s provisions, which would render the bill a promising avenue toward thoughtful oversight of this promising yet risky technology. Failure by Congress to effectively mitigate these risks could lead to large-scale harm that would stymie future innovation in AI, depriving the public of the anticipated benefits of the technology, similar to the foreclosure of advancement in nuclear technology following the catastrophic failure at Three Mile Island.

We encourage the Senator to address the following concerns to ensure the bill accomplishes its laudable objective of mitigating the risks of advanced AI systems and realizing their considerable benefits.

Definitions

Critical-impact AI systems

The bill in print defines “critical-impact AI systems” to mean non-defense/intelligence AI systems that are “used or intended to be used […] to make decisions that have a legal or similarly significant effect on” the collection of biometric data without consent, the management or operation of critical infrastructure and space-based infrastructure, or criminal justice “in a manner that poses a significant risk to the rights afforded under the Constitution of the United States.”

  • “Used or intended to be used” — Demonstrating intent can be extremely challenging, and even if a system was not specifically designed for a particular use, developers and deployers should be cognizant of likely uses outside of their intended scope and evaluate accordingly. These are largely general-purpose technologies, so their intended use isn’t always apparent. The definition should include reasonably expected uses in addition to intended uses in order to avoid a loophole allowing circumvention of the bill’s requirements. This recommendation is also applicable to the analogous language in the definition of “high-impact AI system.”
  • “In a manner that poses a significant risk to rights […]” — The standards and assessments prescribed by the bill are intended to evaluate the risks to rights and safety posed by critical-impact systems. Requiring the systems to meet that criterion to fall within the scope of the standards and assessments risks non-compliance if the organization lacks (or alleges to lack) a prima facie expectation that their system poses a significant risk to rights or safety, whether or not it actually does. This provision should be struck, as systems involved in these critical functions should be presumed to pose a significant risk to constitutional rights or safety, with compliance with standards and assessment then evaluating the scale and scope of that risk. This recommendation is also applicable to the analogous language in the definition of “high-impact AI system.”
  • Advanced general-purpose AI systems (GPAIS) are not included in definition — Any general-purpose AI system that exceeds a certain threshold of capability poses an inherent risk of dangerous emergent behavior and should undergo the requisite scrutiny to ensure safety before deployment. We believe systems more capable than GPT-4 should fall into this category, based on the positions of various experts and existing evidence of emergent capabilities in that class of systems. We recognize that the bill prefers a use-based approach to regulation, but GPAIS do not fit neatly into such a paradigm, as they are by definition multi-use, and can exhibit capabilities for which they were not specifically trained. While their use for, e.g., collection of biometric data, management of critical infrastructure, or criminal justice may not be intended or anticipated, such uses may nonetheless arise, necessitating preliminary evaluation of their effects on rights and safety. Accordingly, inclusion of these systems in the category of “critical-impact” is imperative.

Significant risk

The bill in print defines “significant risk” to mean “a combination of severe, high-intensity, high-probability, and long-duration risk of harm to individuals.” Requiring a combination of these factors would exclude many major risks from AI, which tend to be high-intensity and/or long-duration, but relatively low-probability. For instance, failure or compromise of a critical infrastructure management system may be low-probability, but would have catastrophic, high-intensity harmful effects. Any of these characteristics in isolation should qualify as a significant risk. We thus recommend striking “a combination of” and changing the final “and” to an “or” to capture the spectrum of major risks AI is likely to present.
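
The difference is easy to see if the definition is read as a logical test. In the hedged sketch below, the factor names are shorthand for the statutory language rather than defined terms; the point is simply that an all-factors test excludes the low-probability, high-consequence failures described above, while an any-factor test captures them.

```python
# Illustrative sketch of the definitional change; factor names are shorthand only.

def significant_risk_as_drafted(severe: bool, high_intensity: bool,
                                high_probability: bool, long_duration: bool) -> bool:
    # Bill in print: a *combination* of the factors is required.
    return all([severe, high_intensity, high_probability, long_duration])


def significant_risk_as_recommended(severe: bool, high_intensity: bool,
                                    high_probability: bool, long_duration: bool) -> bool:
    # Recommended: any single factor should suffice.
    return any([severe, high_intensity, high_probability, long_duration])


# A low-probability but catastrophic failure of a critical infrastructure system:
factors = dict(severe=True, high_intensity=True, high_probability=False, long_duration=True)
print(significant_risk_as_drafted(**factors))       # False -> falls outside the definition
print(significant_risk_as_recommended(**factors))   # True  -> captured
```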

Critical-impact AI organization

The bill in print defines “critical-impact AI organization” to mean a non-governmental organization that serves as the deployer of a critical-impact AI system. However, the requirements placed on critical-impact AI organizations under this bill should also extend to developers of critical-impact AI systems. This would improve transparency by allowing developers to more readily provide requisite information to deployers, and would ensure that developers are accounting for safety by design in the process of developing the system itself. Developers are most directly exposed to the mechanics of system development and training, and as such are in the best position both to assess risk and to mitigate risks early in the value chain to ensure they do not proliferate. We recommend including developers and deployers of critical-impact AI systems under the definition of “critical-impact AI organization,” and further recommend striking the exclusion of deployers from the definition of “developer,” since this would otherwise absolve developers of these systems who also act as deployers from any obligations under the bill.

Transparency and Certification

Self-certification

Perhaps most concerning is that the bill in print does not require any independent evaluation of compliance with prescribed standards, and relies entirely on self-attestation to the Secretary that critical-impact AI organizations are compliant with the prescribed safety standards. Self-certification is mainly useful in that it can be adopted quickly, but as a long-term solution for ensuring the safety of critical-impact AI systems it is woefully insufficient. It could be extremely dangerous to have those producing and deploying the AI systems managing our critical infrastructure, who may be driven by a profit motive, determining whether or not their own assessments were sufficient to comply with standards and whether their systems are safe. Having these companies grade their own homework provides no external assurance that compliance has actually been achieved, and in such a high-risk circumstance, that can lead to catastrophic outcomes. We do not rely on self-certification for assessing the safety of our airplanes, and we should not rely on self-certification for assessing the safety of these critical systems.

The three year period allotted for the development of TEVV standards provides a reasonable window to iron out any ambiguities in how to structure an independent mechanism for certifying compliance with the issued standards. To the extent that reasonable standards can be developed in a three year period, so too can a mechanism for independently verifying compliance with those standards.

Limitations on information disclosure requirements

The bill in print specifies that it shall not be construed to require deployers of high-impact AI systems or critical-impact AI organizations to disclose information relating to trade secrets, protected IP, confidential business information, or privileged information. These carve-outs severely limit the usefulness of the reporting requirements and hamper effective assessment of the systems, as most pertinent information relating to the AI system can be construed to fall into these categories, and omissions on this basis would not be apparent to the Secretary.

In other words, if the deployer cannot be required to disclose confidential business information/trade secrets/intellectual property to the Secretary, the Secretary has no way to evaluate the completeness of the assessment, since the information they can evaluate is substantially limited. These sensitive types of information are routinely disclosed to government agencies for a number of functions, and, to the extent that information is necessary to effectively evaluate the system and meet the bill’s requirements, should be disclosed here as well. Accordingly, we recommend amending the rules of construction to specify that the limitations on compelled information disclosure apply only to the extent that the information is not necessary for satisfying the bill’s requirements, and to reiterate the confidentiality of information disclosed to the Secretary in this manner.

Enforcement

Right to cure

The bill in print empowers the Secretary to initiate an enforcement action and recover penalties for violation of the bill’s provisions only if the violator does not “take sufficient action to remedy the non-compliance” within 15 days of notification. The inclusion of this right to cure provides little incentive for those subject to the bill’s requirements to comply unless and until they receive notice of non-compliance from the Secretary. To the extent that compliance entails any cost to the company, it would not be in the financial interest of the company to comply prior to receiving notice, which in practical terms means a long latency before companies actually comply, additional administrative costs to the Secretary, and the possibility that more non-compliant companies fall through the cracks with no incentive to comply. The companies at issue here, which are predominantly large and well-resourced, generally have the legal resources to remain up to speed on what laws are being passed, so a right to cure provides little benefit at significant cost.

Maximum civil penalty

The bill in print caps civil penalties for violation of the bill at the greater of “an amount not to exceed $300,000; or […] an amount that is twice the value of the transaction that is the basis of the violation with respect to which the penalty is imposed.” Because the latter may be difficult to quantify – not all violations will occur within the context of a transaction – the former will likely often apply, and seems insufficient to incentivize compliance by large companies. While $300,000 may have some effect on the budget of a very small company, the AI industry is presently dominated by a small number of very large companies, most of which would be virtually unaffected by such a penalty. For reference, $300,000 would constitute 0.00014% of Microsoft’s annual revenue, making such a penalty an extremely small line-item in their annual budget. Rather, the maximum civil penalty should be based on a percentage of the annual gross receipts of the organization such that the impact of the penalty scales with the size and resources of the organization. Additionally, it is not clear whether failure to, e.g., perform a required assessment at all would constitute a single violation, or whether each required assessment item not reported to the Secretary would be considered a separate violation.
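
A rough arithmetic sketch illustrates the scaling argument. The revenue figure and the example rate below are assumptions chosen for illustration (on the order of the largest AI developers’ annual revenue), not figures proposed in the bill.

```python
# Back-of-the-envelope comparison of a flat penalty cap versus a receipts-scaled penalty.
# The revenue figure and the 1% rate are illustrative assumptions, not proposed values.
flat_cap = 300_000
annual_gross_receipts = 212_000_000_000   # roughly the scale of the largest AI developers

share = flat_cap / annual_gross_receipts
print(f"Flat cap as a share of gross receipts: {share:.5%}")   # ~0.00014%

scaled_rate = 0.01                          # e.g., 1% of annual gross receipts
print(f"Receipts-scaled penalty at 1%: ${annual_gross_receipts * scaled_rate:,.0f}")
```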

Prohibition of deployment

The bill in print explicitly deprives the Secretary of the authority to prohibit a critical-impact AI organization from making a critical-impact AI system available to the public based on review of a transparency report or any additional clarifying information submitted to the Secretary. This undermines the bill’s intent to ensure that the systems being deployed are safe. If an assessment reveals material harms that are likely to occur from the release of the system, it is critical that the Secretary be able to take action to ensure that the unsafe system is not put on the market. For essentially all consumer products, if there is reason to believe the product is unsafe, there is an obligation to remove it from the market. In this case, these assessments are the most comprehensive evaluation of the safety of a system, so barring the Secretary from acting on the basis of that information seems inconsistent with the safety standards we otherwise expect of consumer products, which generally have significantly less potential for catastrophic harm relative to these systems.

The bill does permit the Secretary to prohibit the deployment of a critical-impact AI system if the Secretary determines that the organization intentionally violated the act or any regulation issued under the act. However, intent would be remarkably difficult to prove in order to actually enforce this. It would effectively require internal documentation/communications that explicitly communicate this intention, which are unlikely to be available to the Secretary in determining whether to take such action. Additionally, violations of the act would generally be failures to perform certain assessments or submit reports, but so long as the assessments are conducted and reports are submitted, the actual contents of those assessments and reports cannot be acted upon. This significantly hampers the Secretary’s capacity to protect the public from dangerous systems.

To put it more succinctly, if a system is being implemented in a critical-impact context (e.g. supporting the power grid, ensuring clean water, protecting biometric information, etc.), a single failure can be catastrophic. If there is reason to believe the system is likely to fail, it seems it should be well within the remit of the Secretary to prevent those risks from being realized.

Advisory Committee/Working Group

Exclusion of civil society

As it is currently drafted, the Artificial Intelligence Advisory Committee established to provide advice and recommendations on TEVV standards and certification of critical-impact AI systems would specifically require representation of two different types of technology companies, but no representation of civil society groups focused on ethical or safe AI. Similarly, while the working group relating to responsible consumer education efforts for AI systems established by the bill in print includes representation of nonprofit technology industry trade associations, it excludes independent civil society representation. Given the valuable, public-interest perspective civil society organizations can provide, we recommend including required representation of nonprofit organizations with a substantial focus on the safe and ethical use of AI in both the Advisory Committee and the working group.

Recommended Amendments to Legislative Proposals on Deepfakes https://futureoflife.org/document/recommended-amendments-to-legislative-proposals-on-deepfakes/ Mon, 13 May 2024 18:04:23 +0000 https://futureoflife.org/?post_type=document&p=124497

NO FAKES Act

The NO FAKES Act of 2023 was introduced by Sens. Coons, Blackburn, Klobuchar, and Tillis.

  • Creates a property right over one’s own likeness in relation to digital replicas which are nearly indistinguishable from the actual image, voice, or visual likeness of that individual.
  • Imposes civil liability for the non-consensual production, publication, distribution, or transmission of a digital replica.
  • States that it would not be a defense for the defendant to have included a disclaimer claiming that the digital replica was unauthorized or that they did not participate in its creation, development, distribution, or dissemination.

Analysis: This proposal would enable victims to bring civil liability claims against persons who produce, publish, transmit, or otherwise distribute deepfakes of them. While this might capture model developers or providers whose tools enable the production of such deepfakes, this interpretation should be made clearer by the bill. Moreover, civil liability for breaching a property right will not be as strong of a deterrent as criminal liability, which better reflects the level of harm deepfakes pose to the public.

Recommended Amendment

Amend § 2(c) as follows:

(1) IN GENERAL.—Any person that, in a manner affecting interstate or foreign commerce (or using any means or facility of interstate or foreign commerce), engages in an activity described in paragraph (2) shall be jointly and severally liable in a civil action brought under subsection (d) for any damages sustained by the individual or rights holder injured as a result of that activity.

(2) ACTIVITIES DESCRIBED.—An activity described in this paragraph is either of the following:

(A) The production of a digital replica without consent of the applicable individual or rights holder.

(B) The publication, distribution, or transmission of, or otherwise making available to the public, an unauthorized digital replica, if the person engaging in that activity has knowledge that the digital replica was not authorized by the applicable individual or rights holder.

(C) The publication, distribution, or transmission of, or otherwise making available to the public, a system or application which employs artificial intelligence to produce a digital replica without consent of the applicable individual or rights holder, if the system or application can generate such unauthorized digital replicas without extraordinary effort.

NO AI FRAUD Act

The NO AI FRAUD Act of 2024 was introduced by Rep. Salazar along with other House lawmakers.

  • Creates a property right in each person’s rights and likeness.
  • Imposes civil liability for distributing, transmitting, or otherwise making available to the public a “personalized cloning service”, with a remedy of at least $50,000.
  • Defines personalized cloning services as algorithms, software, tools, or other technologies, services, or devices the primary purpose or function of which is to produce digital voice replicas or digital depictions of particular, identified individuals.
  • Also imposes civil liability for publishing, performing, distributing, transmitting, or otherwise making available to the public a deepfake.

Analysis: This proposal is strong in that it specifically incorporates model developers and providers through a civil liability provision for persons distributing, transmitting, or otherwise publishing a “personalized cloning service”. However, the definition of “personalized cloning service” is limited by requiring the “primary purpose” of the AI system to be creating digital replicas. Because generative AI systems are often multi-purpose, this definition would therefore fail to capture systems which were not specifically designed to generate harmful deepfakes but which are actively being used for this purpose. Furthermore, the proposal requires actual knowledge that the individual or rights holder did not consent to the conduct in order to hold the creator, distributor, or other contributor liable. This standard is remarkably hard to meet and may discourage moderation of content produced by these services to ensure that there is plausible deniability of knowledge of what would otherwise be a violation of the bill. The remedy for providing such a service would also start at only $50,000 per violation. While the generation of many deepfakes would constitute multiple violations for those creating the deepfakes themselves, the publication of a service used to create them, i.e. the source of the deepfake supply chain, would seemingly only constitute a single violation regardless of the number of deepfakes produced by the service. This penalty may therefore be insufficient for incentivizing the incorporation of robust safeguards by model developers and providers.

Recommended Amendment

Amend §3(a)(3) as follows:

(3) The term “personalized cloning service” means an algorithm, software, tool, or other technology, service, or device ~~the primary purpose or function of which~~ which can, without extraordinary effort, produce one or more digital voice replicas or digital depictions of particular, identified individuals.

Amend §3(c)(1) as follows:

(1) IN GENERAL. — Any person or entity who, in a manner affecting interstate or foreign commerce (or using any means or facility of interstate or foreign commerce), and without consent of the individual holding the voice or likeness rights affected thereby–

(A) distributes, transmits, or otherwise makes available to the public a personalized cloning service;

(B) publishes, performs, distributes, transmits, or otherwise makes available to the public a digital voice replica or digital depiction ~~with knowledge~~ that the person knows or should have known that the digital voice replica or digital depiction was not authorized by the individual holding the voice or likeness rights affected thereby; or

(C) materially contributes to, directs, or otherwise facilitates any of the conduct proscribed in subparagraph (A) or (B) ~~with knowledge~~ and knows or should have known that the individual holding the affected voice or likeness rights has not consented to the conduct, shall be liable for damages as set forth in paragraph (2).

Amend §3(c)(2) as follows:

(A) The person or entity who violated the section shall be liable to the injured party or parties in an amount equal to the greater of—

(i) in the case of an unauthorized distribution, transmission, or other making available of a personalized cloning service, fifty thousand dollars ($50,000) per ~~violation~~ injured party or the actual damages suffered by the injured party or parties as a result of the unauthorized use, plus any profits from the unauthorized use that are attributable to such use and are not taken into account in computing the actual damages; and

Preventing Deepfakes of Intimate Images Act

The Preventing Deepfakes of Intimate Images Act was introduced by Rep. Morelle and Rep. Kean.

  • Prohibits the disclosure of intimate digital depictions, defining such depictions broadly to include sexually explicit content or altered images.
  • Outlines both civil and criminal penalties for unauthorized disclosure, provides relief options for victims, and includes exceptions for certain disclosures, such as those made in good faith to law enforcement or in matters of public concern.

Analysis: The bill successfully creates a two-pronged approach to combating non-consensually generated, sexually explicit material by introducing both civil and criminal penalties. The bill draws on the definition of ‘disclose’ outlined in the Violence Against Women Act Reauthorization Act of 2022 which is to “transfer, publish, distribute, or make accessible.” While this definition might encompass the actions performed by platforms which facilitate the distribution of deepfakes and developers who enable the creation of deepfakes, the language of the bill must be clarified in order to clearly capture these key entities in the supply chain.

Recommended Amendment

Amend §1309A(b) “Right of Action” by adding the following clause:

“(1) IN GENERAL.—Except as provided in subsection (e), an individual who is the subject of an intimate digital depiction that is disclosed, in or affecting interstate or foreign commerce or using any means or facility of interstate or foreign commerce, without the consent of the individual, where such disclosure was made by a person who knows that, or recklessly disregards whether, the individual has not consented to such disclosure, may bring a civil action against that person in an appropriate district court of the United States for relief as set forth in subsection (d).

“(2) RIGHT OF ACTION AGAINST DEVELOPERS.–Except as provided in subsection (e), an individual who is the subject of an intimate digital depiction that is disclosed, in or affecting interstate or foreign commerce or using any means or facility of interstate or foreign commerce, without the consent of the individual may bring a civil action against a developer or provider of an artificial intelligence system if both of the following apply:

“(A) the artificial intelligence system was used to create, alter, or distribute the intimate digital depiction; and

“(B) the developer or provider of the artificial intelligence system intentionally or negligently failed to implement reasonable measures to prevent such use.

[…]

AI Labeling Act

The AI Labeling Act of 2023 was proposed by Sens. Schatz and Kennedy.

  • Requires that each AI system that produces text, image, video, audio, or multimedia AI-generated content include on such AI-generated content a clear and conspicuous notice.
  • Requires that the notice include an identification that the content is AI-generated, the identity of the tool used to create the content, and the date and time the content was created.
  • Requires that, to the greatest extent possible, the disclosure must also be permanent or difficult to remove.
  • Imposes a responsibility on developers and third-party licensees to “implement reasonable procedures to prevent downstream use of such system without the disclosures required”. There is also a duty to include in any licenses a clause prohibiting the removal of the disclosure notice.

Analysis: Disclosing the tool used to generate the content in the notice can make it easier to identify and hold developers accountable for the harmful deepfakes their models create. This can also help identify models that are particularly vulnerable to being employed for the creation of deepfakes due to insufficient design safeguards. Though we suggest the authors clarify that identification of the tool used to create the content should include, at a minimum, the model name, model version, and the name of the model’s developer, we applaud the authors’ foresight in including this requirement. We further note that many deepfakes, such as those which are sexually abusive in nature, are harmful in and of themselves, even if it is known that the content has been manipulated. As such, while useful, labeling of AI-generated content alone is likely insufficient to discourage deepfake proliferation and incentivize responsible design of generative AI systems without imposing liability for harms from the deepfakes themselves. That said, this may best be accomplished through complementary legislation.

AI Disclosure Act

The AI Disclosure Act of 2023 was introduced by Rep. Torres.

  • Requires that any output generated by AI includes: “Disclaimer: this output has been generated by artificial intelligence.”
  • States that failing to include this disclaimer would violate the Federal Trade Commission Act (15 U.S.C. 57a(a)(1)(B)) regarding unfair or deceptive acts or practices.

Analysis: These types of disclaimers can be easily removed from the outputs of generative AI, such as videos, images, and audio, meaning malicious actors could simply strip the disclaimer and continue to utilize deepfakes for deceptive purposes, with the content made all the more deceptive by the absence of a disclaimer. It is also not clear which party would be responsible for the lack of a disclaimer in the output. It would be most reasonable for the model developer to bear this responsibility, given that they can design models to ensure all outputs incorporate this disclaimer, but without any requirement to include information identifying the generative AI model used to create the content, enforcement would be extremely challenging.

Recommended Amendment

Amend §2(a) as follows:

(a) Disclaimer Required.—Generative artificial intelligence shall include on any output generated by such artificial intelligence the following: “Disclaimer: this output has been generated by artificial intelligence.”. 

(b) Information Required.—Generative artificial intelligence shall embed in any output generated by such artificial intelligence the following information concerning the artificial intelligence model used to generate that output:

(1) The model name;

(2) The model version;

(3) The name of the person responsible for developing the model; and

(4) The date and time the output was generated by that model.

[…]

DEEPFAKES Accountability Act

The DEEPFAKES Accountability Act was introduced by Rep. Clarke.

  • Requires that deepfakes include a contents provenance disclosure, such as a notice that the content was created using artificial intelligence.
  • Requires that audiovisual content includes no less than one verbal statement to this effect and a written statement at the bottom of any visual component.
  • Imposes a criminal penalty for a failure to comply with disclosure requirements where the noncompliance is:
    • intended to cause violence or physical harm;
    • intended to incite armed or diplomatic conflict;
    • intended to interfere in an official proceeding;
    • in the course of criminal conduct related to fraud;
    • intended to humiliate or harass a person featured in the deepfake in a sexual manner;
    • by a foreign power intending to influence a domestic public policy debate, interfere in an election, or engage in other unlawful acts.
  • Also applies the criminal penalty where a person removes or meaningfully obscures the disclosures, with the intent of distribution.
  • Grants a private right of action against a person who fails to comply with the disclosure requirements, even if there is only a “tangible risk” of suffering the enumerated harms.

Analysis: While the inclusion of criminal penalties may increase compliance and reflects the range of harms posed by deepfakes, this penalty would only apply for failure to include a contents provenance disclosure. Therefore, producing and sharing harmful deepfakes would still be permitted, provided the required disclosure is included. Moreover, the bill includes an exception to the disclosure requirements for deepfakes created by an officer or employee of the United States in furtherance of public safety or national security, which could create a wide loophole. Finally, the contents provenance provision does not appear to require disclosure of the specific model that was used to generate the deepfake content; as such, it would be difficult to identify the upstream model provider responsible for a harmful deepfake.

Recommended Amendment

Amend §1041 as follows:

“(c) Audiovisual Disclosure.—Any advanced technological false personation records containing both an audio and a visual element shall include—

“(1) not less than 1 clearly articulated verbal statement that identifies the record as containing altered audio and visual elements, and a concise description of the extent of such alteration, a timestamp of when the content was generated or altered, and, where applicable, a description including the name, version, and developer of the artificial intelligence system used to generate or alter the record;

“(2) an unobscured written statement in clearly readable text appearing at the bottom of the image throughout the duration of the visual element that identifies the record as containing altered audio and visual elements, and a concise description of the extent of such alteration; and

“(3) a link, icon, or similar tool to signal that the content has been altered by, or is the product of, generative artificial intelligence or similar technology.

“(d) Visual Disclosure.—Any advanced technological false personation records exclusively containing a visual element shall include an unobscured written statement in clearly readable text appearing at the bottom of the image throughout the duration of the visual element that identifies the record as containing altered visual elements, and either—

“(1) a concise description of the extent of such alteration, a timestamp of when the content was generated or altered, and, where applicable, a description including the name, version, and developer of the artificial intelligence system used to generate or alter the record; or

“(2) a clearly visible link, icon, or similar tool to signal that the content has been altered by, or is the product of, generative artificial intelligence or similar technology, along with a description, either linked or embedded, which provides the timestamp of when the content was generated or altered, and, where applicable, a description including the name, version, and developer of the artificial intelligence system used to generate or alter the record.

“(e) Audio Disclosure.—Any advanced technological false personation records exclusively containing an audio element shall include—

“(1) at the beginning of such record, a clearly articulated verbal statement that identifies the record as containing altered audio elements and a concise description of the extent of such alteration, a timestamp of when the content was generated or altered, and, where applicable, the name, version, and developer of the artificial intelligence system used to generate or alter the record; and

“(2) in the event such record exceeds two minutes in length, not less than 1 additional clearly articulated verbal statement and additional concise description at some interval during each two-minute period thereafter.
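To make the audio cadence requirement in paragraph (2) concrete, the short sketch below computes the minimum number of verbal disclosures for a record of a given length, under one possible reading in which a partial two-minute period after the first also requires a statement; the function name and that reading are our own assumptions, not part of the bill text.

```python
import math

def minimum_verbal_disclosures(duration_seconds: float) -> int:
    """One verbal disclosure at the beginning of the record, plus at least one more
    during each two-minute period after the first two minutes (assumed reading:
    a partial final period still requires a statement)."""
    if duration_seconds <= 120:
        return 1
    additional_periods = math.ceil((duration_seconds - 120) / 120)
    return 1 + additional_periods

# A 90-second clip needs 1 disclosure; a 5-minute clip needs 3; a 61-minute clip needs 31.
assert minimum_verbal_disclosures(90) == 1
assert minimum_verbal_disclosures(5 * 60) == 3
assert minimum_verbal_disclosures(61 * 60) == 31
```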

Protecting Consumers from Deceptive AI Act

The Protecting Consumers from Deceptive AI Act was introduced by Reps. Eshoo and Dunn.

  • Directs the National Institute of Standards and Technology to establish task forces to develop watermarking guidelines for identifying content created by AI.
  • Requires providers of generative AI applications to ensure that content created or modified by their application includes machine-readable disclosures acknowledging its AI origin.
  • Requires providers of generative AI applications to make available to users the ability to incorporate, within the metadata of content created or modified by the application, information including the AI application used to create or modify the content.
  • Requires providers of covered online platforms to prominently display disclosures included in the content accessed through their platform.
  • States that violations of these watermarking provisions would constitute unfair or deceptive practices under the FTC Act.

Analysis: The development of content provenance standards through dedicated NIST task forces would be valuable in establishing best practices for watermarking. While the proposal requires that application providers enable users to include the model name and version in the metadata of AI-generated content, application providers and model developers should be required to automatically embed this information in the metadata in order to facilitate tracing back to the relevant developer and provider. This enables greater accountability and incentivizes design choices that incorporate protections against deepfakes by default. However, the bill does not address how users, developers, and platforms might be held accountable for the harmful deepfakes they create, publish, and distribute outside of simply requiring watermarking of these deepfakes.

Recommended Amendment

Amend §2(b)(1) as follows:

(1) Developers and providers of generative artificial intelligence applications.—A person who develops, distributes, transmits, or makes available to users a generative artificial intelligence model or a software application based on generative artificial intelligence technology shall—

(A) ensure that audio or visual content created or substantially modified by such model or application incorporates (as part of such content and in a manner that may or may not be perceptible by unaided human senses) a disclosure that–

[…]

(D) ensure that such model or application incorporates, within the metadata of content created or modified by such model or application, information regarding the generative artificial intelligence origin of such content, including tamper-evident information regarding—

(i) the name of such application, where applicable;

(ii) the name, version, and developer of the generative artificial intelligence model utilized by such application to create or modify such content;

(iii) the date and time associated with the creation or modification of such content by such model or application; and

(iv) the portion of such content that was created or modified by such model or application.
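To illustrate the kind of tamper-evident provenance record contemplated by clauses (i) through (iv), the sketch below assembles the four required fields into a signed metadata payload. It is a minimal illustration only: the field names, the keyed-digest signing step, and the example values are our own assumptions, not drawn from the bill or from any existing watermarking standard such as C2PA.

```python
import hmac
import hashlib
import json
from datetime import datetime, timezone

SIGNING_KEY = b"provider-held-secret"  # hypothetical key; real key management is out of scope

def build_provenance_record(app_name, model_name, model_version, developer, modified_portion):
    """Assemble the metadata fields listed in the amended section 2(b)(1)(D)(i)-(iv)."""
    record = {
        "application": app_name,                                 # (i) name of the application
        "model": {                                               # (ii) model name, version, and developer
            "name": model_name,
            "version": model_version,
            "developer": developer,
        },
        "generated_at": datetime.now(timezone.utc).isoformat(),  # (iii) date and time of creation or modification
        "modified_portion": modified_portion,                    # (iv) portion created or modified
    }
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    # A keyed digest over the payload makes later tampering with these fields detectable.
    record["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return record

print(json.dumps(build_provenance_record(
    "ExampleImageApp", "example-model", "1.2", "Example AI Lab", "entire image"), indent=2))
```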

DEFIANCE Act

The DEFIANCE Act of 2024 is sponsored by Sens. Durbin, Hawley, Klobuchar, and Graham.

  • Imposes civil liability for disclosing, soliciting, and intending to disclose a “digital forgery”.
  • Defines “digital forgery” as “any intimate visual depiction of an identifiable individual” created through means including artificial intelligence, regardless of whether it includes a label.

Analysis: The bill explicitly states that sexually explicit deepfakes which include disclaimers should still be subject to civil liability, and it also prohibits the production and possession of sexual deepfakes in certain cases. However, it is not clear that it limits a provider’s ability to supply individuals with deepfake-generating services, nor does it incentivize developers of AI systems to ensure that their models cannot produce this content. Furthermore, we note that the definition for a “digital forgery” may be too broad in that it could also capture manipulated content created by means other than artificial intelligence, such as traditional image editing software. AI-powered production and alteration of visual content arguably necessitates a distinct approach from traditional tools, as AI technology trivializes the process of producing realistic deepfakes at scale.

Recommended Amendment

Amend §2 CIVIL ACTION RELATING TO DISCLOSURE OF INTIMATE IMAGES as follows:

(b) CIVIL ACTION.—Section 1309(b) of the Consolidated Appropriations Act, 2022 (15 U.S.C. 6851(b)) is amended—

(1) in paragraph (1)—

(A) by striking paragraph (A) and inserting the following:

‘‘(A) IN GENERAL.—Except as provided in paragraph (5)—

‘‘(i) an identifiable individual whose intimate visual depiction is disclosed, in or affecting interstate or foreign commerce or using any means or facility of interstate or foreign commerce, without the consent of the identifiable individual, where such disclosure was made by a person who knows or recklessly disregards that the identifiable individual has not consented to such disclosure, may bring a civil action against the developer or provider of an artificial intelligence system used to generate or modify the depiction, the person that disclosed the intimate digital depiction, or any combination thereof, in an appropriate district court of the United States for relief as set forth in paragraph (3);

FLI Response to OMB: Request for Information on Responsible Procurement of Artificial Intelligence in Government https://futureoflife.org/document/rfi-responsible-procurement-of-ai-in-government/ Mon, 29 Apr 2024 12:01:00 +0000 https://futureoflife.org/?post_type=document&p=124358 Organization: Future of Life Institute

Point of Contact: Isabella Hampton, AI Policy Researcher. isabella@futureoflife.org 

About the Organization

The Future of Life Institute (FLI) is an independent nonprofit organization with the goal of reducing large-scale risks and steering transformative technologies to benefit humanity, with a particular focus on artificial intelligence (AI). Since its founding, FLI has taken a leading role in advancing key disciplines such as AI governance, AI safety, and trustworthy and responsible AI, and is widely considered to be among the first civil society actors focused on these issues. FLI was responsible for convening the first major conference on AI safety in Puerto Rico in 2015, and for publishing the Asilomar AI principles, one of the earliest and most influential frameworks for the governance of artificial intelligence, in 2017. FLI is the UN Secretary General’s designated civil society organization for recommendations on the governance of AI and has played a central role in deliberations regarding the EU AI Act’s treatment of risks from AI. FLI has also worked actively within the United States on legislation and executive directives concerning AI. Members of our team have contributed extensive feedback to the development of the NIST AI Risk Management Framework, testified at Senate AI Insight Forums, participated in the UK AI Summit, and connected leading experts in the policy and technical domains to policymakers across the US government. We thank the Office of Management and Budget (OMB) for the opportunity to respond to this request for information (RfI) regarding the OMB’s obligations regarding the responsible procurement of Artificial Intelligence in government, as outlined in the Executive Order on Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.


Executive Summary

We welcome the recent publication of the OMB memorandum on Advancing Governance, Innovation, and Risk Management for Agency Use of Artificial Intelligence (hereafter known as the ‘Memo’), which we see as a promising first step in setting standards for the federal government and beyond for the procurement of advanced AI systems that could threaten the rights and safety of the public. This response to the RfI is intended to help the OMB ensure that AI procurement practices follow this spirit and protect citizens against potential rights- and safety-impacting risks.

This section summarizes our contributions to this RfI, which are as follows:

  1. National security use cases should be within the scope of the Memo’s requirements. At present, any AI system that is part of a ‘National Security System’ is exempt from the Memo’s requirements. We believe bringing these systems within scope is vital to ensuring that the most high-risk cases of AI procurement receive appropriate scrutiny.
  2. AI procurement in rights- and safety-impacting cases should be subject to strict criteria for model developers. In particular, we lay out the case for the use of Operational Design Domains (ODD) taxonomies relevant to the use of AI technologies. Agencies should be tasked with establishing specific operational conditions for their needs, and should take appropriate steps to ensure that the AI systems they procure are reliable and minimize risks, or can be modified to minimize risks, within these domains.
  3. The granting of extensions and waivers should be strictly limited to appropriate cases. We ask that OMB only grant these extensions and waivers in exceptional circumstances, such as when temporarily suspending the use of an AI system to achieve compliance would significantly disrupt essential government services, or when the standards themselves would threaten safety and security, and more protective alternative safety standards have been identified and verifiably implemented. 
  4. Agencies should only be allowed to opt out of compliance after submitting extensive evidence to OMB and exploring alternative options: We recommend that the OMB require that agencies submit extensive evidence demonstrating how compliance would increase risks to safety or rights or would pose unacceptable impediments to the agency’s operations. This evidence should be reported to the OMB as part of the agency’s petition to opt-out of compliance, subject to OMB approval.
  5. OMB should clarify its definition of ‘principal basis’ for use of AI systems to ensure compliance: We ask that the OMB clarify this term with a robust set of criteria to ensure that agencies only employ this exception in circumstances where the AI systems concerned do not play a material role in driving important decisions or actions.
  6. An appeals process must be set up to address opt out decisions taken by Chief Artificial Intelligence Officers (CAIO): We ask that the OMB set up a review and appeals process for decisions made by CAIOs to ensure that they are taken responsibly and within the remit of exceptions outlined in the Memo. 
  7. A central office should be created to supervise all CAIOs: To ensure CAIOs can fulfill their obligations effectively, we recommend establishing a centralized office to supervise CAIOs across the US government. This approach will foster a whole-of-government approach and ensure coordination and consistency across several use cases. 
  8. The frequency of reporting should be increased to keep up with rapid AI development: We recommend that OMB change reporting requirements from at least annually to at least every six months. The publication of agencies’ compliance plans should also be updated from every two years to at least annually.

Recommendations

1. National security use cases should be within the scope of the Memo’s requirements. At present, much of the Memo applies to all agencies outlined in 44 U.S.C. § 3502(1). Any AI system that is part of a ‘National Security System,’ however, is exempt from the Memo’s requirements, and, in many cases, so are members of the intelligence community as defined in 50 U.S.C. § 3003. We believe bringing these systems within scope is vital for ensuring that the most high-risk cases of AI procurement receive appropriate scrutiny.

AI systems intended for use in national security and military applications present some of the greatest potential for catastrophic risk due to their intended use in critical, often life-or-death circumstances.1  Considering the sizable impact malfunction, misunderstanding, or misuse of national security AI systems could entail, applicable standards should be at least as rigorous in assessing and mitigating potential risks as those developed for civilian AI systems.

National security AI systems often operate with less transparency and public oversight due to their classified nature. This lack of visibility makes it even more crucial that these systems adhere to rigorous risk assessment, testing, and monitoring standards. Without the Memo’s requirements in place, there is a greater risk that flaws in these systems could go undetected or receive minimal oversight. The Department of Defense’s (DoD) Chief Artificial Intelligence Officer (CAIO) has expressed a need for “more rigor” in AI development, stating that he expects “99.999% accuracy” for LLMs used in a military context. These standards could help facilitate the DoD in their pursuit of this goal.

Bringing these systems within scope would ensure they receive the same level of risk assessment and mitigation as civilian AI applications. This is essential for maintaining public trust and ensuring that the benefits of AI in national security are realized without unintended and potentially devastating consequences.

2. AI procurement in rights- and safety-impacting cases should be subject to strict criteria for model developers. Current AI systems operate across a wide variety of domains. To enable the comprehensive evaluation of these diverse use-cases, we propose the development of Operational Design Domains (ODD)2 taxonomies relevant to the use of AI systems. These taxonomies provide a framework for managing procurement, audits, and evaluations through their operational domain, i.e., the specific functions for which the target system is to be used.

Agencies and departments should be tasked with establishing specific operational domains for their AI procurement needs. This process involves defining the intended operational domain, including the users, vectors, protected characteristics, and assets relevant to their specific use cases. By clearly outlining these conditions, agencies can better communicate their requirements to model developers or integrators and ensure that the procured AI systems are designed to operate safely and effectively within the specified domain.
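Purely as an illustration of how an agency might record such a specification in machine-readable form, the sketch below captures the elements mentioned above (intended users, vectors, protected characteristics, and assets) and checks whether a proposed deployment stays within the defined domain. The field names, example values, and containment check are our own assumptions, not drawn from Khlaaf’s taxonomy or from OMB guidance.

```python
# Minimal sketch of a machine-readable ODD specification; fields and the
# containment check are illustrative assumptions, not an official taxonomy.
from dataclasses import dataclass, field

@dataclass
class OperationalDesignDomain:
    use_case: str
    intended_users: set = field(default_factory=set)
    vectors: set = field(default_factory=set)              # e.g. input channels or modalities
    protected_characteristics: set = field(default_factory=set)
    assets: set = field(default_factory=set)                # systems or data the AI may touch

    def permits(self, users, vectors, assets):
        """Return True only if a proposed deployment stays within the specified domain."""
        return (set(users) <= self.intended_users
                and set(vectors) <= self.vectors
                and set(assets) <= self.assets)

# Hypothetical example: benefits-claims triage restricted to trained caseworkers.
odd = OperationalDesignDomain(
    use_case="benefits claims triage",
    intended_users={"trained caseworker"},
    vectors={"structured claim form"},
    protected_characteristics={"age", "disability"},
    assets={"claims database"},
)
assert odd.permits(["trained caseworker"], ["structured claim form"], ["claims database"])
assert not odd.permits(["member of the public"], ["free-text chat"], ["claims database"])
```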

When evaluating competing bids for AI procurement contracts, agencies should prioritize proposals that demonstrate a strong success rate at minimizing identified risks within the specified operational domain. To prevent misuse or unintended consequences, strict prohibitions should be established on the use of the AI system outside of the application domain, along with appropriate penalties for violations, in accordance with existing legal and regulatory frameworks.3 Agencies should also implement regular audits and reviews to ensure that procured AI systems are being used responsibly and transparently, without any unauthorized expansion beyond the intended application domain.

Incorporating ODD taxonomies into AI procurement guidelines will allow agencies to ensure a more targeted and risk-informed approach to acquiring AI systems. This not only promotes the safe and responsible use of AI in rights- and safety-impacting cases but also fosters greater transparency and accountability in the procurement process. Model developers will be required to demonstrate that their AI systems can operate safely and effectively within the specified operational conditions, reducing the likelihood of harm to individuals and society as a whole.

3. The granting of extensions and waivers should be strictly limited to appropriate cases: According to current guidance in Section 5(a)(i) of the Memo, agencies must either implement the minimum practices specified in Section 5(c) for safety-impacting AI systems by December 1, 2024, or cease using any AI that is not compliant with these practices, unless they receive a one-year extension or a waiver from the OMB.

While we recognize that one-year extensions may be necessary for the continued use of systems already in place at the time of the Memo’s issuance to avoid disruption of essential government services, we believe that these extensions should only be granted in exceptional circumstances where the immediate cessation of the AI system would lead to significant harm or disruption.

To ensure that extensions are only granted when truly necessary, we recommend that OMB require agencies to provide a detailed justification for why the AI system cannot be brought into compliance with the minimum practices within the specified timeframe. This justification should include a comprehensive description of how essential government functions would be disrupted if the system were temporarily taken out of operation while the necessary improvements are made to achieve compliance.

Furthermore, agencies should be required to explain why alternative mechanisms for achieving those essential functions are not feasible in the interim period. This explanation should include an analysis of potential workarounds, manual processes, or other AI systems that could be used to fulfill the essential functions while the non-compliant system is being updated. Only in cases where no reasonable alternatives exist and the disruption to essential services would be severe should extensions be granted.

4. Agencies should only be allowed to opt out of compliance after submitting extensive evidence to OMB and exploring alternative options: While the Memo sets up important requirements for agencies procuring and using AI, it also allows agencies to waive these requirements if “fulfilling the requirement would increase risks to safety or rights overall or would create an unacceptable impediment to critical agency operations.” Given the subjectivity of this threshold, we are concerned that these exceptions could transform into a loophole, with requests being granted whenever agency priorities conflict with compliance.

To address this, we recommend that OMB require agencies to submit extensive evidence when seeking a waiver. This evidence should include detailed documentation of the specific AI system, the requirements the agency believes pose a safety risk, and a thorough analysis demonstrating how compliance would directly cause significant harm or danger. In cases where the agency believes compliance would increase overall risks to safety or rights, they should be required to provide a detailed explanation of these risks, along with alternative practices that are equally or more rigorous than the minimum standards outlined in the Memo.

We also recommend that OMB provide clearer guidance on what constitutes “critical agency operations” (as described in section 5(c)(iii) of the Memo). Agencies should be required to justify why the disruption caused by compliance with the Memo’s requirements would be unacceptable and demonstrate that they have thoroughly explored alternative options that would comply with the standards.

If an agency is granted a waiver, this waiver should be time-limited, and the agency should be required to immediately begin efforts to identify and procure alternative AI systems or other solutions that can comply with the Memo’s requirements without causing significant disruption. The waiver should only remain in effect until a compliant alternative can be implemented.

5. OMB should clarify its definition of ‘principal basis’ for use of AI systems to ensure compliance: The Memo allows agencies to opt out of applying minimum risk management practices to AI systems that are not deemed to be the “principal basis” for any decision or action impacting rights or safety. While this qualifier rightly aims to focus oversight on higher-stakes use cases, the current “principal basis” language is ambiguous and could engender unintended loopholes. The Memo does not provide an explicit definition of “principal basis” or any test for what constitutes a principal basis, leaving room for inconsistent interpretations across agencies.

We recommend that OMB replace the “principal basis” standard for rights- and safety-impacting AI with a clearer “material influence” threshold. The definition of “material influence” should provide a set of criteria to ensure agencies can only use this exception to opt out in circumstances where AI systems do not significantly influence important decisions or actions. An AI system should be considered to materially influence a decision or action when it is a contributing factor that markedly affects the outcome of the decision-making process, even if other information is also considered. Informing or being consulted in a decision or action would not constitute material influence alone, but an AI system need not be the sole or primary basis to meet this standard.

6. An appeals process must be set up to address opt out decisions taken by Chief Artificial Intelligence Officers (CAIO): We welcome the OMB’s 2023 recommendation to set up CAIOs in agencies across the federal government in order to ensure that a dedicated office supervises AI-related functions and decisions, including those taken on procurement. As currently outlined in the Memo, these CAIOs have considerable power in the procurement process – they can unilaterally make the decision to opt out of OMB requirements, and while their decision must be reported to both the OMB and the public, it is final.

Even if CAIOs have nominal independence, they may still act in the perceived interest of their agency’s mission, potentially conflicting with responsible, objective decision-making regarding compliance waivers and leading to more liberal use of the opt-out authority than appropriate. We ask the OMB to implement a review and appeals process for these decisions to ensure they are made responsibly and within the scope of exceptions outlined in the Memo. 

7. A central office should be created to supervise all CAIOs: As stated previously, we welcome the establishment of CAIOs in each agency to ensure that minimum requirements as set out in the Memo are fulfilled across agencies, while giving different offices within each agency the opportunity to exercise their mandate when it comes to specific use cases. We believe that the next step in this process for ensuring that the CAIOs can fulfill their obligations is to set up a centralized office which supervises CAIOs across the US government, similar to how the Office of the Director of National Intelligence (ODNI) coordinates all intelligence community activities. This centralized office would foster a whole-of-government approach and ensure coordination and consistency across several use cases. This would complement existing efforts to set up an interagency council to coordinate the development and use of AI in agencies’ programs and operations as specified in the Memo.

8. The frequency of reporting should be increased to keep up with rapid AI development: Keeping pace with the rapid advancements in AI technology and its increasing adoption in government agencies is crucial for ensuring that oversight remains relevant. While we commend the Memo for establishing regular reporting requirements for agencies, we believe that the current reporting intervals may not be sufficient to keep up with the pace of AI development and procurement.

To address this issue, we propose that the OMB increase the frequency of information requests from agencies. Specifically, we recommend that, under the existing remit of the OMB, the reporting requirements be updated from at least annually to at least every six months, maintaining the opportunity to further shorten this interval should the pace of AI advancement and procurement continue to accelerate. This more frequent updating will provide the OMB with a clearer picture of how AI systems are deployed and used across government agencies, allowing for more timely identification of potential risks and misuse.

We also recommend that agencies be required to provide updates on their ODD taxonomies as part of their regular reporting, in accordance with our second recommendation. These ODD updates should be accompanied by the results of audits and evaluations, demonstrating that procured AI systems as deployed are operating solely within their intended operational domain. Agencies should also report on any changes to their intended applications, especially those that arise from new releases, security patches, or updates to procured systems. Agencies should be required to further report any expansions to the operational scope of AI systems since the previous reporting period, as well as relevant evaluations that have been performed to ensure the system’s reliability, safety, and compliance with OMB standards in the proposed use case. OMB should be granted the authority to reject the expansion of the AI system’s operational conditions should evidence of compliance with the standards within the new domain of use prove insufficient. This information will enable the OMB to assess the ongoing compliance of agencies with the Memo and the effectiveness of established risk management practices.

Closing Remarks

We would like to thank the OMB for the opportunity to provide comments on the OMB memorandum on Advancing Governance, Innovation, and Risk Management for Agency Use of Artificial Intelligence. We believe that the recommendations described above will help ensure the responsible procurement and use of AI systems across the government.


1 Moon, M. (2024, February 27). The Pentagon used Project Maven-developed AI to identify air strike targets. Engadget.

2 This section draws significantly from Heidy Khlaaf’s work on AI risk assessments. For more information on ODD taxonomies and their application in AI risk assessment, please see: Khlaaf, H. (n.d.). Toward Comprehensive Risk Assessments and Assurance of AI-Based Systems.

3 For more information on procurement regulations, see Federal Acquisition Regulation (FAR) Subpart 9.1, “Responsible Prospective Contractors,” which outlines standards for contractor responsibility and the factors that can lead to suspension or debarment.

Competition in Generative AI: Future of Life Institute’s Feedback to the European Commission’s Consultation https://futureoflife.org/document/competition-in-generative-ai-future-of-life-institutes-feedback-to-the-european-commissions-consultation/ Wed, 20 Mar 2024 14:21:09 +0000 https://futureoflife.org/?post_type=document&p=123952 1) What are the main components (i.e., inputs) necessary to build, train, deploy and distribute generative AI systems? Please explain the importance of these components

Generative AI (GAI) systems are the user-facing applications built on top of general purpose AI (GPAI) models. These models undergo training and inference using cloud computing, typically infrastructure as a service (IaaS), and advanced semiconductors.1 This requires access to limited and expensive chips; cloud capacity built on an extensive volume of servers; vast amounts of (high quality) data; and the sought-after skills needed to develop competitive GPAI models, including discovering innovative algorithms that advance the state of the art.

Currently, buying API access to these models, or downloading open source alternatives, to adapt them into generative AI applications is much cheaper and faster than building these models in-house from scratch.2 This means that only a handful of the most well-resourced corporations in history can afford to bankroll the development of the models that structure and underpin the applications upon which they are built. The generality of these models in competently performing a wide range of distinct tasks means that they could quickly become the digital infrastructure that forms the bedrock of the entire economy.3 

GPT-4 boasts an end-user base of 100 million weekly active users and a business user base of over two million developers using it as a platform, including 92% of Fortune 500 companies.4 OpenAI’s GPT store allows users to develop and monetise their own GPTs, illustrating the base layer infrastructural nature of their GPAI model.5 These corporations are also preeminent in other markets, allowing them to disseminate GPAI models across cloud, search, social media, operating systems, app stores, and productivity software.6 Thus, the implications of market concentration are much starker than for other technologies. The combination of concentrated development resources with ubiquitous adoption and distribution throughout adjacent markets risks a winner-take-all scenario, as explored in this feedback.

2) What are the main barriers to entry and expansion for the provision, distribution, or integration of generative AI systems and/or components, including AI models? Please indicate to which components they relate.

Increasing the number of model parameters enhances a model’s capabilities by improving its capacity to learn from data. However, this requires more computing power, in chips and cloud capacity, and more data, which makes it cost-prohibitive for many SMEs and startups.7 The market for chips, the first layer in the AI value chain, is highly concentrated, a phenomenon which is exacerbated by shortages stemming from significant demand-supply imbalances in components. General purpose AI is fuelled by the parallel computation processing capabilities of NVIDIA-designed graphics processing units (GPUs), which capture 90% of the market8, and are manufactured by Taiwan Semiconductor Manufacturing Company (TSMC), which, in turn, captures the largest share in the global chips foundry market at 56%9. Many developers train AI models using CUDA, NVIDIA’s proprietary software development platform, which means they must also use NVIDIA’s GPUs.10 Even the well-capitalised challengers in this market face competition issues: OpenAI’s CEO, Sam Altman, has sought to raise $5-7 trillion to create his own chip-building capacity, highlighting the difficulties of competing on chips.11

While the hardware market in semiconductors is almost a monopoly, the infrastructure market is more like an oligopoly12, which should still concern the Commission from a competition perspective.13 Amazon’s AWS (31%), Microsoft’s Azure (24%) and Google Cloud (11%) collectively cover two thirds of the cloud computing market.14 This collective dominance arises from the significant investment required to establish data centres, server farms, and the network infrastructure to interconnect them.15 If OpenAI, Anthropic or DeepMind were to create their own in-house cloud infrastructure, independent of the Big Tech companies that have partnered with, merged with, or acquired them, it would require considerable investments in land, energy, and datacentre equipment (cabling, servers, server racks, coolers, etc.).16 While the Data Act may abolish egress charges, excluding those related to parallel data storage, there remain additional non-financial hurdles hindering GPAI model developers from establishing their own in-house cloud hosting infrastructure, namely the risk of service downtime for their customers, including generative AI developers and end-users.17
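One common way competition authorities quantify this kind of concentration is the Herfindahl-Hirschman Index (HHI), the sum of squared market shares. The short sketch below applies it to the shares cited above purely for illustration; because the long tail of smaller providers is ignored, the resulting figures are lower bounds.

```python
# Illustrative only: HHI computed from the market shares cited in the text.
# Omitting the long tail of small providers makes these figures lower bounds.
def hhi(shares_percent):
    """Herfindahl-Hirschman Index: sum of squared percentage market shares (0-10,000)."""
    return sum(share ** 2 for share in shares_percent)

cloud_iaas = {"AWS": 31, "Azure": 24, "Google Cloud": 11}  # shares cited above
gpus = {"NVIDIA": 90}                                      # share cited above

print(f"Cloud IaaS HHI (top three only): {hhi(cloud_iaas.values())}")  # 1658
print(f"GPU HHI (NVIDIA alone):          {hhi(gpus.values())}")        # 8100
# Competition authorities commonly treat an HHI above roughly 2,000 as indicating a
# highly concentrated market, so the GPU figure is extreme and the cloud lower bound
# already approaches that line.
```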

Hyperscalers (large cloud service providers that can provide computing and storage services at enterprise scale) enjoy privileged access to limited hardware resources, enabling them to offer exclusive access to GPAI models developed internally, in collaboration with partners, or through investments, thereby creating serious barriers to entry.18 Amazon not only provides cloud infrastructure to Anthropic, Stability AI, and AI21, but also competes with them by offering its own GPAI models on its Amazon Bedrock platform.19 Cloud hosts have unparalleled power to monitor, detect, and stifle competitors emerging on their cloud platform20, while resources generated using upstream dominance allow them to influence research and development downstream.21

Big Tech-backed research papers are frequently cited in research, indicating a notable uptake of their ideas among the wider scientific community.22 Their ownership and operation of software development frameworks – standardised processes for developing AI and other software, including vast data repositories, training methods, and evaluation tools – shape AI development and deployment by ensuring that engineers adhere to development practices that are interoperable with Big Tech products.23 Although PyTorch is governed by a foundation within the Linux Foundation, it is bankrolled by Meta. Google’s TensorFlow programming framework is specifically designed for Google’s Tensor Processing Units (TPUs), Google’s inhouse AI semiconductors, available on Google Cloud Platform, facilitating Google’s vertical integration from development practices to compute resources.

Developers of the most advanced GPAI models currently on the market have a first-mover advantage. It is more straightforward for OpenAI to maintain and attract business users, because some of those clients may be hesitant to switch to a GPAI competitor due to data security concerns or the cost and complexity of moving their data24, as previously witnessed in the cloud space25. Having made the large initial investment, OpenAI have a head start, learning from and building upon their already advanced model, while recovering that initial investment from the monetisation of GPT-4, as others seek to get their models off the ground.

Early entry allows these providers to purchase access to compute and data at rates that will be lower than for new entrants if increased demand pushes prices up.26 It affords them greater time to improve their models’ performance through finetuning, build brand visibility, and a strong consumer base. They have a head start in harvesting user data to feed into future training runs and developing higher performance models. This reinforces their prominence, whereby greater performance attracts more users, builds better trust in the model at the expense of new and unknown alternatives, and gives them capital to continue crowding out the market.

The best models currently belong to Big Tech, not universities – in 2022, industry produced 32 leading models, while academia produced three.27 This reduces academic access to cutting-edge models for evaluations of systemic risks and the development of effective mitigation measures. Compared to nonprofits and universities, the private sector has the most resources to recruit the best talent, use large amounts of compute, and access data, both in quantity and quality, all of which is required to build state-of-the-art GPAI. This confines the pool of highly skilled workers needed to build the most competitive AI to industry, hindering academia in training the next generation of advanced AI developers.28 As a result, supply is not meeting demand, not least because there is a race to find better engineers, who can discover algorithmic innovations that reduce the amount of compute or data – and costs – required for training.29 SMEs and startups must try to attract talent away from more resourceful incumbents, who can offer bigger employee remunerations.

GPAI models, and generative AI systems, involve fixed costs in development, such as pretraining and fine-tuning compute resources, data collation, and inhouse and outsourced labour, and relatively low marginal costs in deployment.30 These economies of scale are a significant barrier to entry for startups, as they would need to develop and deploy models and systems at scale from the outset in order to compete.31 It is usually more realistic for smaller European providers to fine-tune American models into customised models or domain-specific systems that require less compute, data, and labour.32 But this still renders downstream developers and deployers dependent on larger upstream model providers.  

The general purpose nature of these AI models and systems allows for versatile and flexible deployment settings, which will increase their uptake throughout diverse industries. For providers, this allows them to spread substantial initial investment spending across these use cases, while their downstream customers will save money by deploying the same capability across different tasks.33 These economies of scope are a challenging barrier to entry for Big Tech challengers, as they would need to be servicing the same range of sectors in order to compete.34 

3) What are the main drivers of competition (i.e., the elements that make a company a successful player) for the provision, distribution or integration of generative AI systems and/or components, including AI models?

The leading GPAI models and generative systems are more performant because they have access to the most or best data, computational resources, and skilled developers. These factors allow them to attract more users; amass more data and capital to purchase more chips; access more cloud infrastructure; develop better models and applications; and, in turn, attract more and better developers.35 OpenAI engineers can make up to $800,000 per year, salaries that no SME or startup, especially in Europe, could afford.36 Importantly, as their models become increasingly capable, doors open up for the leading GPAI providers to monetise and advertise their models, as well as enter into commercial relationships with downstream system developers, which not only provides even greater user-facing visibility, but can also offer access to specialised domain or task specific data that is held by particular downstream parties.37 If they are unable to obtain such unique data from these partnerships, then their increased revenues can be used to purchase it elsewhere.

These network effects are accelerated by data feedback effects, whereby general purpose AI developers leverage data generated from the conversations between the system and its users to advance capabilities.38 While data generated during user interactions is not automatically used to train the model, since developers need to vet feedback for quality and safety, this may change if innovations lead to safe automatic continuous learning post-deployment.39 OpenAI recently announced that ChatGPT will be able to memorise conversations in order to better tailor its responses to user preferences.40 The more GPAI developers can refine a model toward their customers, the more useful it will be for customers, who will be less inclined to try out another challenger platform.41

Even if they mainly use feedback data in aggregate to understand wider trends, this is still a considerable competitive advantage for the most widely used models and systems that can collect the most amount of user data, providing more enriched aggregate analysis. Companies like OpenAI are at a particular advantage because they are present at both the model and system level, allowing them to use system level feedback to improve their model. European GPAI system developers, who will be largely reliant on building their systems upon American GPAI models, would be unable to benefit from this feedback loop, because they would be unable to use the data generated from their system to improve the underlying model. Superior access to resources – capital, computing power, data, and expertise – enables the creation of superior models. These models attract more consumers, resulting in increased revenue. This revenue, in turn, provides access to even better resources, thus perpetuating the cycle of developing high-quality models, asserting market dominance, and the capacity to foreclose competition from challengers.

4) Which competition issues will likely emerge for the provision, distribution, or integration of generative AI systems and/or components, including AI models? Please indicate to which components they relate.

While user feedback may not necessarily be leveraged for marketing services or constructing advertising profiles, enhanced capabilities can improve downstream GPAI services. This enables more precise customisation to consumer preferences, thereby driving adoption rates and establishing user loyalty.42 End users and business users will be locked in unless it is sufficiently practical to port data when switching to an alternative. Even with adequate interoperability, they may be discouraged from trying alternatives due to the convenience of accessing all their GPAI and related tools, services, and plug-ins via the one established platform. Such lock-in creates a positive feedback loop for the GPAI provider, positioning the model for further monetisation, as it continues to progressively build a more robust and holistic picture of the user, thereby empowering it to offer more tailored targeting of products, including the provider’s other downstream GPAI services in adjacent markets, such as search, social media, app stores and productivity software. 

This grants the provider the power to engage in unfair and abusive practices. Dominance in both the GPAI model and system market coupled with dominance in these adjacent markets allows large incumbents to buttress their dominance in each by bundling their GPAI service with their other services in search, online platforms, or productivity software. Beyond the convenience of a one-stop shop interface, users may be unable to switch if doing so means they would lose access to another tied service. The first-mover advantage of the currently leading GPAI models – GPT-4, Gemini, Llama 2 – allows them to enjoy network effects, and with customer lock-in, switching costs will also act as a barrier to entry for SMEs and startups.

5) How will generative AI systems and/or components, including AI models likely be monetised, and which components will likely capture most of this monetization?

As recognised by Stanford researchers43, when GPAI model providers grant sufficient access to their models to downstream system developers, through an application programming interface (API), they are operating AI as a platform, similar to platform as a service (PaaS) for software, allowing downstream developers to access the models and adapt them into specific user-facing GPAI and AI systems, much like an app store for app developers. Beyond this, OpenAI, for example, also allows plugin integrations that connect third-party apps to the paid version of ChatGPT (based on GPT-4, not GPT-3.5, as in the free version). This increases capabilities by allowing ChatGPT to retrieve real-time information, access proprietary information, and act on real-world user instructions.44 Plugins empower ChatGPT to act as a platform by enabling it to select options among different providers or present different options to the user.45

More recently, OpenAI launched GPT Store46, so its non-expert paying users can find and build fine-tuned versions of the ChatGPT GPAI system.47 All of this attracts third-party app and plugin developers to OpenAI’s ecosystem, rendering more applications compatible with its main GPAI system, while providing OpenAI with oversight on developments that threaten their offerings.48 Smaller plugin providers, in particular, may come to rely on platforms like ChatGPT, the fastest growing consumer application in history49, for their user base50, but OpenAI may use this position to provide their own competing applications downstream.51 As OpenAI expands its plug-in offerings, their platform becomes more appealing for plug-in developers, allowing OpenAI to draw in more plug-ins, which increases the amount of consumers, motivates more developers, and makes their platform ever-more appealing.52 

6) Do open-source generative AI systems and/or components, including AI models compete effectively with proprietary AI generative systems and/or components? Please elaborate on your answer.

The considerable costs required to develop general purpose AI models from square one and then deploy them at scale apply equally to closed and open models. While open source licenses offer new entrants more accessibility at the model level (parameters, data, training support), open source models do not address compute concentration in the markets for semiconductors and cloud infrastructures.53 All the frontier open source models rely on Big Tech compute54: Meta’s Llama 2 runs on Microsoft Azure; UAE-based Technology Innovation Institute’s Falcon 180B model runs on AWS55; and Mistral’s Mixtral models run on Google Cloud56. EleutherAI’s GPT-NeoX-20B runs on NVIDIA-backed, AI-focused CoreWeave Cloud57, which rents out GPUs at an hourly rate58, allowing it to scale from 3 to 14 data centres in 202359, though it remains well below Meta and Microsoft, who are NVIDIA’s top GPU customers60. Microsoft have committed to billions of dollars in investment in CoreWeave in the coming years to secure access to NVIDIA’s GPUs61 ahead of their real rivals, AWS and Google Cloud62.

At first glance, Meta’s Llama 2 meets the definition of a free and open source license in the recently agreed AI Act, considering that Meta publishes the model parameters, including weights, and information on model architecture and model usage. However, Meta does not publish information on the model training data – precisely why providers of such models are required to do so under the AI Act, regardless of whether they present systemic risks or not. Nevertheless, Meta’s Llama 2 licence63 is not open source64, as widely recognised65, particularly by the Open Source Initiative66, whose open source definition67 is the global community benchmark. Meta does not allow developers to use Llama 2 or its outputs to improve any other large language model (LLM), and app developers with more than 700 million monthly active users must request a license from Meta, which Meta is not obliged to grant, presumably if it feels competitively challenged.68 By permitting commercial use of Llama 2, on a small and non-threatening scale, Meta leverages unpaid labour to enhance the model’s architecture, enabling it to monetise such improvements, as endorsed by their CEO.69

European SMEs and startups will still be highly dependent on the largest chips developers (largely NVIDIA) and cloud providers (Amazon, Microsoft, and Google), as well as the leading AI development frameworks (Meta and Google). This dependence affirms and asserts gatekeepers’ market monitoring powers that can anticipate and foreclose competition from innovative new entrants through self-preferencing or copying.70 Even with leveraging open source GPAI models, EU players will still need major funding to train and deploy their GPAI models, if they are to be competitive, which will need to come from EU governments and venture capital firms, if they are not to be bought up by Big Tech. Otherwise, EU GPAI developers will be limited to fine-tuning existing models, open or closed, which does not empower downstream parties to fundamentally alter data and design choices that were shaped upstream.71 

According to Mistral, their latest Mixtral 8x7B model matches or exceeds Meta’s Llama 2 70B and OpenAI’s GPT-3.5 on many performance metrics and is better on maths, code, and multilingual tasks, while using fewer parameters during inference.72 By utilising only a portion of the overall parameters per token, it effectively manages costs and reduces latency. It is open source (though this is reasonably disputed)73, released under the Apache 2.0 license, and free for academic and commercial use. Until recently, the European developer’s capacity to develop competitive GPAI models was supported by €385 million, among other funding, including from American venture capital firms, such as Andreessen Horowitz and Lightspeed.74 Building on their successes, and seeking to secure their long-term financing and operational viability, they have reached a deal with Microsoft, who will invest €15 million. This allows Mistral to use Microsoft supercomputers to train their GPAI models on Azure and access Microsoft customers for greater distribution of their products, while it allows Microsoft to offer Mistral models as premium features for its customers.75 The partnership positions Microsoft with a leading presence in both the open source model market (through Mistral) and closed proprietary model market (through OpenAI).76 While Microsoft’s investment in Mistral currently doesn’t confer ownership stake, it could convert to equity in Mistral’s subsequent funding round.77 

This episode vividly illustrates that when an open source alternative appears to threaten the most well-funded proprietary models, such as GPT-4, those funding the challenged model quickly move in to stake their financial interests in the upstart new entrant, foreclosing competition. Microsoft is hedging its bets in case Mistral’s models should come to outperform those of their other investment, OpenAI, in case open source AI becomes the dominant business model or ecosystem that generates the greatest economic value. While open source holds promise for greater transparency and accessibility, this development underscores that it is incredibly difficult for open source AI models to get off the ground without the financial backing of Big Tech.

It highlights that the AI Act threshold for classifying models as systemic – those models trained on compute using 10^25 or more FLOPS – should not be raised, as desired by industry. During trilogue discussions, and even now, the French government argue that the threshold should be 10^26, in part due to concerns that their national champion, Mistral, would reach the threshold within a year. The deal between Microsoft and Mistral makes it clear that reaching that threshold, which depends on vast resources in cloud computing capacity, requires funding from those with entrenched positions in digital markets.
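To give a sense of scale for these thresholds, the sketch below uses the widely cited rule of thumb that training a dense transformer takes roughly 6 x parameters x training tokens floating-point operations; the model sizes and token counts are hypothetical inputs chosen for illustration, not figures for Mistral or any other named model.

```python
# Rough illustration of the AI Act's systemic-risk compute threshold, using the
# common 6 * parameters * training-tokens approximation for dense transformer
# training FLOPs. The model sizes and token counts below are hypothetical.
AI_ACT_THRESHOLD_FLOPS = 1e25      # current threshold in the AI Act
PROPOSED_HIGHER_THRESHOLD = 1e26   # level argued for by some industry voices

def training_flops(parameters, training_tokens):
    """Approximate training compute for a dense transformer."""
    return 6 * parameters * training_tokens

for name, params, tokens in [
    ("hypothetical 70B-parameter model, 2T tokens", 70e9, 2e12),
    ("hypothetical 500B-parameter model, 15T tokens", 500e9, 15e12),
]:
    flops = training_flops(params, tokens)
    print(f"{name}: {flops:.1e} FLOPs -> "
          f"over 1e25: {flops >= AI_ACT_THRESHOLD_FLOPS}, "
          f"over 1e26: {flops >= PROPOSED_HIGHER_THRESHOLD}")
```

Under these assumed figures, even a very large training run that crosses the current 10^25 threshold can fall short of 10^26, which illustrates how raising the threshold would narrow the set of models classified as posing systemic risk.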

The partnership has undermined the self-proclaimed78 European independence of Mistral.79 For EU policymaking, naturally there is concern about conflicts of interest during the AI Act negotiations, as highlighted by Green MEPs in a letter to the Commission80, especially considering that this deal was likely also under negotiation over the same period. While this may not reach a threshold of a competition breach or market abuse, the Commission should be concerned when European AI startups, that are able to achieve a certain scale, can only survive through gatekeeper funding. This renders the European AI startup vulnerable to being co-opted as a useful voice or vehicle for Big Tech lobbying that seeks to minimise their compliance burden at the expense of safety for European consumers. For highly capable or impactful open model GPAI, risks are amplified by the inability of the original provider to effectively remediate or recall a dangerous open model after it has been released and downloaded innumerable times. While their inherent transparency may have benefits for accountability, it can also provide malicious actors with access to the model weights, enabling the discovery and exploitation of vulnerabilities or the circumvention of guardrails to generate harmful illegal outputs, including the development of lethal weapons, cyberattacks against critical infrastructure, and electoral manipulation.

7) What is the role of data and what are its relevant characteristics for the provision of generative AI systems and/or components, including AI models?

While publicly available data is still broadly accessible to new entrants, public data can be of low quality, leading to less capable and even dangerous models.81 Stanford researchers found that one of the leading public datasets, LAION-5B, includes thousands of images of child sexual abuse material.82 Lensa, an image generator built on top of the GPAI model Stable Diffusion, which is trained on LAION-5B, was found to create realistic sexualised and nude avatars of women, particularly women from traditionally marginalised communities, with far less propensity to do the same when prompted to render men.83

Proprietary datasets can offer more specialised and unique data that gives a model a deeper understanding of the world.84 This not only makes a model more capable, but, in theory, also allows it to be more easily aligned with our interests, since it can understand us better. This mitigates biases and inaccuracies within models, generating trust and encouraging adoption, thereby creating positive feedback loops for those with the best data. Big Tech’s accumulated data banks – both personal data from their B2C markets and non-personal data from their B2B/B2G markets – give them an edge: they have access to the same public datasets as new entrants, as well as their own enormous proprietary datasets, which are closed off to new entrants.85 High quality proprietary data is often held in downstream companies that specialise in a certain domain and have gathered data on that domain’s customers.86 Google’s $2.1 billion acquisition of Fitbit gives it more than a decade of health data from millions of users, as well as access to Fitbit’s insurance partners.87 This allows Google to leverage this wealth of proprietary data if it seeks to fit its GPAI models with health features tailored to its users, giving it a competitive edge over models without this capability. Such an acquisition is beyond the reach of European startups.

The innovation gap is widened further by Big Tech’s greater experience in developing models, housing the best expertise, and scraping, labelling, and analysing data. Moreover, search engine providers, namely Google and Microsoft88, can leverage public data more effectively by using their web indexes to filter out meaningless or useless information and retain the high quality public data – a more efficient process, given that web data is far more vast than proprietary datasets.89 One way European SMEs and startups could catch up is through algorithmic innovations that do more with less data, but this requires access to the best talent, which further increases costs. The competitive frontier goes even further: ChatGPT and Gemini will compete on how many other systems they are connected to, providing them with continuous, real-time data.

Successes in commercial GPAI, and its vast potential across innumerable use cases, have also led data providers to seek to cash in on the AI gold rush by monetising their data for AI training.90 When data was free and easy to access, the current GPAI model leaders got in early.91 As content creators, or the platforms on which such content is hosted, restrict access or seek remuneration, new entrants may face prohibitively costly data on top of exorbitant compute costs. If legislation and judicial rulings reassert rightsholders’ intellectual property rights, public data could become increasingly scarce, pushing the price up further.92 European SMEs and startups could turn to synthetic data as an alternative to proprietary data, but generating such artificial data requires more compute.93 Saving on data pushes costs up elsewhere. Moreover, using AI models to generate data for future models can transfer errors and bugs from the old model to the new one.94

8) What is the role of interoperability in the provision of generative AI systems and/or components, including AI models? Is the lack of interoperability between components a risk to effective competition?

The concentrated cloud market, upon which GPAI is developed and deployed, combined with the lack of interoperability between AWS, Azure, and Google Cloud Platform, creates single points of failure that could be disruptive and destabilising across sectors, given the market share of these three providers.95 As single points of failure, they are an attractive target for cyberattacks by malicious actors.96 If such an attack were successful, it would cut off not only the cloud infrastructure platform, but also the general purpose AI model and the many downstream generative AI systems deriving from it that run on the same cloud. The lack of interoperability means critical applications, such as those in defence, health, or finance, cannot easily be migrated to another cloud provider in order to get them up and running again.97 In a scenario where a hostile nation or well-funded terrorist group penetrates a single point of failure to cripple critical services, a full-blown assault on both private and public databases could not only cause widespread disruption, but may also be difficult to detect, making it all the more challenging to restore organisational data to a safe and reliable state.98 Concentration at the model level can produce similar security risks, given that any vulnerability in a model upon which user-facing applications are built could produce systemic hazards, exacerbated by emergent capabilities that can develop unpredictably with further fine-tuning at the system level.99

9) Do the vertically integrated companies, which provide several components along the value chain of generative AI systems (including user facing applications and plug-ins), enjoy an advantage compared to other companies? Please elaborate on your answer.

General purpose AI models and generative AI systems that are plugged into third-party applications can operate as platforms, or they can be vertically integrated into another platform.100 To catch up with Google, Microsoft integrated GPT-4 as Bing Chat into its platforms, Bing search and Edge101, and as Copilot into its 365 productivity suite.102 In response, Google is testing the integration of generative AI into Google Search in the US103 – the Search Generative Experience (SGE)104 – which allows Google to leverage an adjacent market, bolstering its advertising revenues and strengthening its grip on online website traffic.105 This illustrates a transition from search engines to answer engines.106 Users may be less inclined to visit third-party websites, provided as citations or footnotes, since the answer is already in the chatbot interface, to which advertisers turn their attention at those websites’ expense.107 This could allow Google’s generative search engine to benefit from the intellectual property of others, whose data is fed into the generative interface, not only without compensation, but also without the usual user traffic to their sites.

For users, and society at large, reliance on generative search engines risks reducing the accuracy of information, as it is difficult to distinguish between outputs derived from the training data and those from the search results, and to spot the hidden hallucinations therein.108 Stanford researchers found that users’ perception of utility is inversely correlated with the precision of citations purporting to support claims made by the generative search/answer engine.109 While Bing Chat achieves the highest citation precision rate, users find it the least useful, whereas YouChat has the lowest citation precision rate, but users deem it the most useful. Given that upstream GPAI models are likely to significantly augment the user experience or perceived utility, if not accuracy, of downstream search or answer engines, users will be increasingly drawn to these platforms.110 This creates a barrier to entry for GPAI model providers that do not compete on both nodes of the value chain – that is, those that offer GPAI models but not GPAI-infused search engines.111 

Google is present throughout the entire AI value chain112: it produces its own semiconductors (TPUs), hosts its own cloud infrastructure (Google Cloud), develops its own GPAI models (PaLM-2 and Gemini), provides GPAI systems (Gemini, so users can interact directly with the model)113 and integrates those systems into its platforms (Search Generative Experience). From these platforms, it also gathers data that can be used to improve future iterations of Gemini, increasing the model’s capabilities and utility. The revenues can also be used to purchase more compute, data, and talent. Vertically integrated corporations will have easier access to unique data, such as conversations between users on their platforms.114 

Vertical integration risks institutionalising unbreakable tech oligopolies, hindering the innovative efforts of European SMEs and startups, weakening consumer choice, and inflating the cost of gatekeeper services beyond their value, whether in subscription charges or data extraction. While NVIDIA currently leads on GPUs, Microsoft, Google and Meta are all seeking to compete by developing their own chips for AI.115 If Microsoft or Google were to overtake NVIDIA, their vertical integration, whether legally or in practice, from semiconductors (Microsoft’s own AI chips; Google’s TPUs) to cloud (Azure; GCP) to models (GPT-4; Gemini) to systems (Bing, ChatGPT, Copilot; Gemini) could ensure that AI development becomes a two-horse race, as it would be incredibly difficult, if not impossible, for challengers to compete at each level of that value chain. In this scenario, Microsoft or Google could then engage in unfair and abusive practices, such as limiting the access of GPAI model and system competitors to key ingredients like chips and cloud infrastructure.

Microsoft’s CEO, Satya Nadella, claims116 that his company’s partnership with OpenAI challenges vertically integrated companies like Google.117 Yet concerns are mounting that the partnership may amount to decisive influence under Article 3 of the Merger Control Regulation, given their exclusivity arrangements, as well as the successful pressure Microsoft put on OpenAI’s board to reinstate its fired CEO, Sam Altman, including by offering him and other OpenAI staff employment.118 This raises questions about OpenAI’s ability to operate independently and to be considered a separate company that is not vertically integrated with Microsoft “in spirit”, if not in narrow legal terms. The grey-area manoeuvrings of Altman’s firing and rehiring illuminate that Microsoft can control OpenAI, without acquiring it or breaking antitrust or merger laws, by leveraging its exclusive access to OpenAI’s leading GPAI models and the scaleup’s access to the gatekeeper’s compute – an arrangement that prohibits OpenAI from migrating its models to other cloud providers.119 

10) What is the rationale of the investments and/or acquisitions of large companies in small providers of generative AI systems and/or components, including AI models? How will they affect competition?

Cloud providers typically prefer to create partnerships with established GPAI providers, affording the latter preferential access to scarce computational resources and investment opportunities.120 This is cheaper for the GPAI developer than paying for access via on-demand rates or via upfront or subscription charges, let alone building its own data centre. OpenAI must use Azure, while Microsoft can integrate OpenAI products across all its offerings121, with priority access.122 

11) Do you expect the emergence of generative AI systems and/or components, including AI models to trigger the need to adapt EU legal antitrust concepts?

While the Digital Markets Act (DMA) is not strictly an antitrust instrument, it does seek to ensure open digital markets and provides an additional lever in the Commission’s toolkit to complement lengthy antitrust investigations. Although the DMA does not explicitly cover AI, generative AI should be in scope when it is integrated into a core platform service.123 

At the infrastructural level of GPAI model and system development and deployment, cloud computing is already listed as a core platform service. However, no cloud computing service has been designated at the time of writing, primarily because hyperscalers do not meet the quantitative thresholds given that, under the DMA definition, they do not technically have end users.124 There is recognition that business users may also be counted as end users when they use cloud computing services for their own purposes (Recital 14 of the DMA). This should be reflected when counting active end users of cloud computing services, given that AI labs such as OpenAI and Anthropic (and the many other businesses fine-tuning their GPAI models via an API run on cloud services) might all be considered end users125 of Azure, Google Cloud Platform and AWS, rather than business users126, based on DMA definitions. This could mean that the cloud services of hyperscalers would be designated as core platform services, thereby ensuring that the oligopolistic cloud market is tackled by EU ex-ante regulation, rather than by complaints brought by cloud service challengers that would struggle to afford lengthy legal proceedings.

As in the Commission’s initial DMA impact assessment127, the end user and business user definitions should equally cover Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS) in order to avoid loopholes in the DMA’s application. If this is not already the case, the Commission should amend and update the methodology and list of indicators in the Annex of the DMA through delegated acts (Article 3(7)) to ensure the DMA can mitigate further concentration in the cloud market, which underpins GPAI development and deployment. To ensure the DMA remains fit for purpose given the rapid advances in AI, as well as the role of APIs as intermediation platforms, the Commission should consider whether it has the legal basis to update the list of core platform services to accommodate GPAI models and systems.128 

According to reports, Microsoft has cautioned competing search engines that it will terminate licenses providing access to its Bing search index if they continue to use it for generative AI chat development.129 Article 6(2) of the DMA prohibits gatekeepers from using non-publicly available data, generated by business users or their customers using the core platform service, to compete with those business users. This could help prevent GPAI model providers from stopping GPAI system developers – dependent on the model provider for API access – from competing using data generated through their use of the cloud service.130 Although Bing has not been designated, it may reach the threshold if its integration with OpenAI models makes it more attractive to end users and business users.

Given their foundational and systemic function across the economy and society, large cloud computing and GPAI model providers should be regulated like public utilities, adhering to similar rules on non-discrimination, equitable treatment of all customers, and assurance of safety and security.131 Since antitrust law primarily seeks to address monopolies, public utility framing is critical, as the oligopoly in the cloud market may make the AI market more concentrated in the coming years.132 

The Commission could also consider the feasibility of structural separation to prohibit companies from owning both GPAI models and other technologies and platforms that enable them to engage in unfair and abusive practices.133 While this could be achieved through antitrust law, it typically requires a lengthy investigation process, which means that regulation may be more viable. As in the AI Act, at present, the amount of compute used during training is one of the best ways of quantifying a GPAI model’s impact and capabilities. Based on the current state of the art, the Commission could use compute as a proxy for determining the largest market players in order to apply structural remedies that would mitigate market failures.134

12) Do you expect the emergence of generative AI systems to trigger the need to adapt EU antitrust investigation tools and practices?

Notwithstanding competition authorities’ increased scrutiny of issues related to the digital economy in recent years135, detecting and assessing potential competition law infringements will become increasingly complex. Such complexity is particularly pronounced when facing companies with business models that deploy network effects or benefit from ecosystems136, which generate and collect data to enhance value. This data allows companies to refine algorithms and services, which subsequently increases their value on markets. Their business models often use GPAI models and systems, blockchain, IoT, robotics, algorithms, and machine learning137 to offer services such as providing search results (Google), recommending products (Amazon) or accommodation (Airbnb).138 These digital platforms centred around data are changing competitive dynamics rapidly, posing considerable challenges for competition authorities.

As a result, the current competition law enforcement framework and tools will come under pressure. It might be necessary to account for increasingly diverse parameters, beyond the traditional focus on output and prices. For example, in fast-moving and dynamic markets powered by AI, competition authorities will be required to capture and understand data more quickly. In addition, in the context of devising a market definition, which has become more complex for digital platforms, the traditional SSNIP test may no longer suffice. Similarly, while the EU Merger Regulation can be somewhat adapted, it does not adequately capture collaborations in which companies like Microsoft partner with, and provide minority investments in, other parties (such as OpenAI), gaining influence and control without outright ownership.139 If it is not possible to tackle these kinds of relationships – vertical outsourcing rather than vertical integration – then reassessment and revision of the Merger Regulation is needed.

GPAI also enables companies to engage in new kinds of anticompetitive behaviour (see also the answer to question 4). For example, algorithms enable companies to automatically monitor competitors’ prices in real time and then re-price (algorithmic collusion). Companies with substantial market power in a certain market may use GPAI to reinforce their market power in another market or to exclude competitors (as seen in the Google Shopping case140).

In view of the transformations and advancements stemming from the emergence and deployment of GPAI, there is a significant risk that competition authorities may struggle to grasp companies’ behaviour and market dynamics in time to prevent anti-competitive effects from taking place. Considering that the European Commission directly enforces EU competition rules and possesses an extensive toolkit for antitrust investigations, it is imperative to bolster enforcement tools and reevaluate how competition is analysed to ensure EU competition policy remains future-proof. By fostering a competitive GPAI market and value chain, other regulations – such as the AI Act, the Product Liability Directive, the forthcoming AI Liability Directive, the Data Act, the GDPR, etc. – become more enforceable. Monopolists and oligopolists should not become too big to regulate, treating fines for non-compliance with these EU laws as operating expenses.141 Better compliance improves AI safety, fostering trust and accelerating adoption of beneficial AI across the EU, while levelling the playing field for innovative European AI startups to offer competitive alternatives.


↩ 1 Narechania and Sitaraman. “An Antimonopoly Approach to Governing AI”.

↩ 2 UK CMA. “AI Foundation Models Initial Report”. 

↩ 3 Vipra and Korinek. “Market concentration implications of foundation models: The invisible hand of ChatGPT”.  

↩ 4 Malik. “OpenAI’s ChatGPT now has 100 million weekly active users”.

↩ 5 Stringer, Wiggers, and Corrall. “ChatGPT: Everything you need to know about the AI-powered chatbot”. 

↩ 6 Lynn, von Thun, and Montoya. “AI in the Public Interest: Confronting the Monopoly Threat”.

↩ 7 Carugati. “The Generative AI Challenges for Competition Authorities”.

↩ 8 Techo Vedas. “NVIDIA has 90% of the AI GPU Market Share; 1.5 to 2 million AI GPUs to be sold by NVIDIA in 2024”. 

↩ 9 Statista. “Semiconductor foundries revenue share worldwide from 2019 to 2023, by quarter”.

↩ 10 Whittaker, Widder, and West. “Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI”.

↩ 11 Field. “OpenAI CEO Sam Altman seeks as much as $7 trillion for new AI chip project: Report”. 

↩ 12 Informed by discussion with Friso Bostoen, Assistant Professor of Competition Law and Digital Regulation at Tilburg University.

↩ 13 AI Now Institute. “Computational Power and AI”. 

↩ 14 Statista. “Amazon Maintains Cloud Lead as Microsoft Edges Closer”.

↩ 15 Narechania and Sitaraman. “An Antimonopoly Approach to Governing AI”.

↩ 16 Belfield and Hua. “Compute and Antitrust”.

↩ 17 Narechania and Sitaraman. “An Antimonopoly Approach to Governing AI”.

↩ 18 Bornstein, Appenzeller and Casado. “Who Owns the Generative AI Platform?”

↩ 19 UK CMA. “AI Foundation Models Initial Report”.

↩ 20 Lynn, von Thun, and Montoya. “AI in the Public Interest: Confronting the Monopoly Threat”.

↩ 21 Kak, Myers West, and Whittaker. “Make no mistake – AI is owned by Big Tech”. 

↩ 22 Giziński et al. “Big Tech influence over AI research revisited: memetic analysis of attribution of ideas to affiliation.”

↩ 23 Whittaker, Widder, and West. “Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI”.

↩ 24 Economist. “Could OpenAI be the next tech giant?”.

↩ 25 Savanta. “European cloud customers affected by restrictive licensing terms for existing on-premise software, new research finds”.

↩ 26 UK CMA. “AI Foundation Models Initial Report”.

↩ 27 Stanford University HAI. “AI Index Report 2023”. 

↩ 28 UK CMA. “AI Foundation Models Initial Report”.

↩ 29 Vipra and Korinek. “Market concentration implications of foundation models: The invisible hand of ChatGPT”.

↩ 30 Ada Lovelace Institute. “Foundation models in the public sector”.

↩ 31 Vipra and Korinek. “Market concentration implications of foundation models: The invisible hand of ChatGPT”.

↩ 32 Leicht. “The Economic Case for Foundation Model Regulation”.

↩ 33 Vipra and Korinek. “Market concentration implications of foundation models: The invisible hand of ChatGPT”.

↩ 34 UK CMA. “AI Foundation Models Initial Report”.

↩ 35 Hausfeld. “ChatGPT, Bard & Co.: an introduction to AI for competition and regulatory lawyers”.

↩ 36 Constantz. “OpenAI Engineers Earning $800,000 a Year Turn Rare Skillset Into Leverage”.

↩ 37 Schrepel and Pentland. “Competition between AI Foundation Models: Dynamics and Policy Recommendations”.

↩ 38 OpenAI. “How your data is used to improve model performance”.

↩ 39 UK CMA. “AI Foundation Models Initial Report”. 

↩ 40 OpenAI. “Memory and new controls for ChatGPT”.

↩ 41 UK CMA. “AI Foundation Models Initial Report”.

↩ 42 Ibid.

↩ 43 Stanford University HAI. “AI Accountability Policy Request for Comment”.

↩ 44 OpenAI. “Chat Plugins”. 

↩ 45 Vipra and Korinek. “Market concentration implications of foundation models: The invisible hand of ChatGPT”.

↩ 46 OpenAI. “Introducing the GPT Store”.

↩ 47 Sentance. “The GPT Store isn’t ChatGPT’s ‘app store’ – but it’s still significant for marketers”.

↩ 48 OpenAI. “ChatGPT plugins”.

↩ 49 Reuters. “ChatGPT sets record for fastest-growing user base – analyst note”.

↩ 50 UK CMA. “AI Foundation Models Initial Report”.

↩ 51 Narechania and Sitaraman. “An Antimonopoly Approach to Governing AI”.

↩ 52 UK CMA. “AI Foundation Models Initial Report”.

↩ 53 Narechania and Sitaraman. “An Antimonopoly Approach to Governing AI”.

↩ 54 Whittaker, Widder, and West. “Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI”.

↩ 55 Jackson. “TII trains state-of-the-art LLM, Falcon 40B, on AWS model”.

↩ 56 Reuters. “Google Cloud partners with Mistral AI on generative language models”.

↩ 57 Hjelm. “Looking Ahead to 2023: How CoreWeave Is Using NVIDIA GPUs to Advance the New Era of AI and Machine Learning”.

↩ 58 Krazit. “How CoreWeave went all-in on Nvidia to take on Big Cloud”.

↩ 59 Economist. “Data centres improved greatly in energy efficiency as they grew massively larger”.

↩ 60 Elder. “Sell Nvidia”. 

↩ 61 Novet. “Microsoft signs deal for A.I. computing power with Nvidia-backed CoreWeave that could be worth billions”.

↩ 62 Haranas. “Microsoft’s CoreWeave Deal ‘Adds AI Pressure’ To AWS, Google”.

↩ 63 Meta. “Request access to Llama”.

↩ 64 OpenUK. “State of Open: The UK in 2024 Phase One AI and Open Innovation”.

↩ 65 Tarkowski. “The Mirage of Open-source AI: Analysing Meta’S LLaMa 2 release strategy”.

↩ 66 Open Source Initiative. “Meta’s LLaMa 2 license is not Open Source”.

↩ 67 Open Source Initiative. “The Open Source Definition”.

↩ 68 OpenSource Connections. “Is Llama 2 open source? No – and perhaps we need a new definition of open…”.

↩ 69 Whittaker, Widder, and West. “Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI”.

↩ 70 Lynn, von Thun, and Montoya. “AI in the Public Interest: Confronting the Monopoly Threat”.

↩ 71 Whittaker, Widder, and West. “Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI”.

↩ 72 Jiang et al. “Mixtral of Experts”. 

↩ 73 Robertson. “France’s Mistral takes a victory lap”. 

↩ 74 Volpicelli. “Microsoft’s AI deal with France’s Mistral faces EU scrutiny”.

↩ 75 Volpicelli. “European lawmakers question Commission on Microsoft-Mistral AI deal”.

↩ 76 Murgia. “Microsoft strikes deal with Mistral in push beyond OpenAI”.

↩ 77 Coulter and Yun Chee. “Microsoft’s deal with Mistral AI faces EU scrutiny”.

↩ 78 Mensch. X (Twitter) post.

↩ 79 Zenner. “Microsoft-Mistral partnership and the EU AI Act”.

↩ 80 Volpicelli. “European lawmakers question Commission on Microsoft-Mistral AI deal”.

↩ 81 UK CMA. “AI Foundation Models Initial Report”. 

↩ 82 Ropek. “An Influential AI Dataset Contains Thousands of Suspected Child Sexual Abuse Images”.

↩ 83 Heikkilä. “The viral AI avatar app Lensa undressed me without my consent.” 

↩ 84 UK CMA. “AI Foundation Models Initial Report”.

↩ 85 Lynn, von Thun, and Montoya. “AI in the Public Interest: Confronting the Monopoly Threat”.

↩ 86 Vipra and Korinek. “Market concentration implications of foundation models: The invisible hand of ChatGPT”.

↩ 87 Austin. “The Real Reason Google Is Buying Fitbit”.

↩ 88 Hausfeld. “ChatGPT, Bard & Co.: an introduction to AI for competition and regulatory lawyers”.

↩ 89 UK CMA. “AI Foundation Models Initial Report”.

↩ 90 Narechania and Sitaraman. “An Antimonopoly Approach to Governing AI”.

↩ 91 Lynn, von Thun, and Montoya. “AI in the Public Interest: Confronting the Monopoly Threat”.

↩ 92 UK CMA. “AI Foundation Models Initial Report”.

↩ 93 Vipra and Korinek. “Market concentration implications of foundation models: The invisible hand of ChatGPT”.

↩ 94 UK CMA. “AI Foundation Models Initial Report”.

↩ 95 Narechania and Sitaraman. “An Antimonopoly Approach to Governing AI”.

↩ 96 Lutkevich. “Foundation models explained: Everything you need to know”.

↩ 97 Narechania and Sitaraman. “An Antimonopoly Approach to Governing AI”. 

↩ 98 World Economic Forum. “Understanding Systemic Cyber Risk”. 

↩ 99 Vipra and Korinek. “Market concentration implications of foundation models: The invisible hand of ChatGPT”.

↩ 100 Ibid.

↩ 101 Techsyn. “Microsoft Integrates OpenAI’s GPT-4 Model Into Bing For A Powerful Search Experience”.

↩ 102 Sullivan. “Inside Microsoft’s sprint to integrate OpenAI’s GPT-4 into its 365 app suite”.

↩ 103 Google. “Supercharging Search with generative AI”.

↩ 104 Google. “Search Labs”. 

↩ 105 Lynn, von Thun, and Montoya. “AI in the Public Interest: Confronting the Monopoly Threat”.

↩ 106 Carugati. “Competition in Generative AI Foundation Models”.

↩ 107 Carugati. “The Generative AI Challenges for Competition Authorities”.

↩ 108 Miller. “Generative Search Engines: Beware the Facade of Trustworthiness”.

↩ 109 Liu, Zhang, and Liang. “Evaluating Verifiability in Generative Search Engines”.

↩ 110 Vipra and Korinek. “Market concentration implications of foundation models: The invisible hand of ChatGPT”.

↩ 111 UK CMA. “AI Foundation Models Initial Report”.

↩ 112 Narechania and Sitaraman. “An Antimonopoly Approach to Governing AI”.

↩ 113 Google. “Bard becomes Gemini: Try Ultra 1.0 and a new mobile app today”.

↩ 114 UK CMA. “AI Foundation Models Initial Report”.

↩ 115 David. “Chip race: Microsoft, Meta, Google, and Nvidia battle it out for AI chip supremacy”.

↩ 116 Hartmann. “Microsoft CEO defends OpenAI’s ‘partnership’ amid EU, UK regulators’ scrutiny”.

↩ 117 Smith. “Microsoft’s AI Access Principles: Our commitments to promote innovation and competition in the new AI economy”.

↩ 118 Irish Council for Civil Liberties et al. “Submission to European Commission on Microsoft-OpenAI “partnership” merger inquiry”.

↩ 119 Callaci. “The Antitrust Lessons of the OpenAI Saga”.

↩ 120 UK CMA. “AI Foundation Models Initial Report”.

↩ 121 Lynn, von Thun, and Montoya. “AI in the Public Interest: Confronting the Monopoly Threat”.

↩ 122 Irish Council for Civil Liberties et al. “Submission to European Commission on Microsoft-OpenAI “partnership” merger inquiry”.

↩ 123 Informed by discussion with Friso Bostoen, Assistant Professor of Competition Law and Digital Regulation at Tilburg University.

↩ 124 Abecasis et al. “6 reflections on the recent designation of gatekeepers under the DMA”.

↩ 125 Digital Markets Act definition of active end users for cloud computing: “Number of unique end users who engaged with any cloud computing services from the relevant provider of cloud computing services at least once in the month, in return for any type of remuneration, regardless of whether this remuneration occurs in the same month.”

↩ 126 Digital Markets Act definition of active business users for cloud computing: “Number of unique business users who provided any cloud computing services hosted in the cloud infrastructure of the relevant provider of cloud computing services during the year.”

↩ 127 European Commission. “Impact assessment of the Digital Markets Act 1/2”: “Cloud services . . . provide infrastructure to support and enable functionality in services offered by others and at the same time offer a range of products and services across multiple sectors, and mediate many areas of society. . . They benefit from strong economies of scale (associated to a high fixed cost and minimal marginal costs) and high switching costs (associated to the integration of business users in the cloud). The vertical integration of the large cloud services providers and the business model they deploy has contributed to further concentration on the market, where it is very difficult for other less-integrated players, or market actors operating in just one market segment to compete. Consequently, these startups are likely to be completely reliant on large online platform companies.”

↩ 128 von Thun. “EU does not need to wait for the AI Act to act”.

↩ 129 Dixit. “Microsoft reportedly threatens to cut-off Bing search data access to rival AI chat products”.

↩ 130 Yasar et al. “AI and the EU Digital Markets Act: Addressing the Risks of Bigness in Generative AI”.

↩ 131 Lynn, von Thun, and Montoya. “AI in the Public Interest: Confronting the Monopoly Threat”.

↩ 132 Informed by discussion with Friso Bostoen, Assistant Professor of Competition Law and Digital Regulation at Tilburg University.

↩ 133 von Thun. “After Years of Leading the Charge Against Big Tech Dominance, is the EU Falling Behind?”

↩ 134 Belfield and Hua. “Compute and Antitrust”. 

↩ 135 Google Android decision; Apple Pay investigation; Apple App Store investigation; Amazon’s use of marketplace seller data investigation.

↩ 136 Lianos. Hellenic Competition Commission and BRICS Competition Law and Policy Centre. “Computational Competition Law and Economics: An Inception Report”.

↩ 137 Schrepel. “Collusion by Blockchain and Smart Contracts”.

↩ 138 Iansiti and Lakhani. “From Disruption to Collision: The New Competitive Dynamics”. 

↩ 139 Lynn, von Thun, and Montoya. “AI in the Public Interest: Confronting the Monopoly Threat”.

↩ 140 T-612/17 – Google and Alphabet v Commission (Google Shopping)

↩ 141 Lynn, von Thun, and Montoya. “AI in the Public Interest: Confronting the Monopoly Threat”.

Manifesto for the 2024-2029 European Commission https://futureoflife.org/document/european-commission-manifesto/ Wed, 13 Mar 2024 20:50:59 +0000 https://futureoflife.org/?post_type=document&p=124028 The Future of Life Institute (FLI) is an independent nonprofit organization with the goal of reducing large-scale risks and steering transformative technologies to benefit humanity, with a particular focus on artificial intelligence. Since its founding ten years ago, FLI has taken a leading role in advancing key disciplines such as AI governance, AI safety, and trustworthy and responsible AI, and is widely considered to be among the first civil society actors focused on these issues. FLI was responsible for convening the first major conference on AI safety in Puerto Rico in 2015, and for publishing the Asilomar AI principles, one of the earliest and most influential frameworks for the governance of artificial intelligence, in 2017. FLI is the UN Secretary General’s designated civil society organization for recommendations on the governance of AI and has played a central role in deliberations regarding the EU AI Act’s treatment of risks from AI. FLI has also worked actively within the United States on legislation and executive directives concerning AI. Members of our team have contributed extensive feedback to the development of the NIST AI Risk Management Framework, testified at Senate AI Insight Forums, participated in the UK AI Summit, and connected leading experts in the policy and technical domains to policymakers across the US government.


Europe must lead the way on innovating trustworthy AI

Policy recommendations for the next EU mandate

The rapid evolution of technology, particularly in artificial intelligence (AI), plays a pivotal role in shaping today’s Europe.

As AI capabilities continue to advance at an accelerated pace, the imperative to address the associated dangers becomes increasingly urgent. Europe’s future security is intricately linked to the formulation and implementation of measures that effectively mitigate the risks posed by AI technologies.

Myopic policies which fail to anticipate the possibly catastrophic risks posed by AI must be replaced with strategies that effectively combat emergent risks. Europe must continue leading the way on AI governance, as it has repeatedly shown that its digital policies create global ripple effects. Europe must seize this opportunity to ensure deployment of AI aligns with ethical considerations and prioritises the safety of individuals and societies.

Key Recommendations

  1. Ensure that the AI Office is robust and has the ability to perform the tasks it has been set.
  2. Reboot the AI Liability directive to safeguard against unchecked risks and ensure accountability.
  3. Actively involve civil society organisations in the drafting of the Codes of Practice.
  4. Issue clear, concise, and implementable AI Act guidance.
  5. Proactively foster international collaboration.
  6. Build relationships with national competent authorities and ensure seamless collaboration on enforcement.
  7. Secure the future of AI regulation by addressing the AI Office funding challenge.

The AI Act is a done deal. Now it’s time to implement it.

With the historic adoption of the AI Act, the world’s first comprehensive hard-law regulation of AI, the focus will shift to its effective implementation and enforcement. This also necessitates renewed attention to complementary legislation, particularly the AI Liability Directive (AILD), to establish a holistic regulatory framework and solidify the EU’s position as a global leader. Prioritising the following areas will ensure that the shared goal of trustworthy, innovative, and safe AI is achieved:

i. Ensure that the AI Office is robust and has the ability to perform the tasks it has been set.

To ensure the robustness and efficacy of the AI Office within the European Commission, a series of strategic recommendations should be implemented. Firstly, offering competitive salaries to attract and retain top talent is essential. Adequate remuneration not only attracts technical experts who would otherwise be captured by industry, but also reflects the value placed on their expertise. Moreover, appointing leaders who possess a deep understanding of AI technologies and the risks they pose is crucial in order to articulate the mission and objectives of the AI Office, garnering support and engagement from stakeholders within and outside the Commission.

Additionally, facilitating secondments from industry and civil society organisations, as the UK AI Safety Institute has done, can bring diverse perspectives and experiences to the AI Office, within the context of limited resources. Temporary exchanges of personnel allow for knowledge transfer and collaboration, enriching the office’s monitoring and enforcement capabilities.

Furthermore, seamless collaboration between governance and technical teams, supported by effective leadership, operations, and human resources management, is paramount. Mirroring the range of roles and salaries made available by entities like the UK AI Safety Institute, the AI Office must provide sufficient incentives to attract experts who will further the Office’s goals, as prescribed by the AI Act.

ii. Reboot the AI Liability Directive to safeguard against unchecked risks and ensure accountability.

As the EU moves past the elections, it is necessary to resume work on the AI Liability Directive (AILD). The explosive growth of AI across manufacturing, healthcare, finance, agriculture and beyond demands a robust legal framework that provides victims with recourse for damages caused by AI, and thereby incentivises responsible development and deployment. Current Union fragmentation, resulting from disparate AI liability regimes, leaves citizens vulnerable under less protective liability approaches at the national level. It also leads to legal uncertainty that hinders European competitiveness and inhibits start-ups from scaling up across national markets.

The AILD would enable customers, both businesses and citizens, to understand which AI providers are reliable, creating an environment of trust that facilitates uptake. By establishing clear rules for different risk profiles – from strict liability for systemic GPAI models to fault-based liability for others – we can foster fairness and accountability within the AI ecosystem. As these frontier GPAI systems have the most advanced capabilities, they present a diverse range of potential and sometimes unpredictable harms, leading to informational asymmetries which disempower potential claimants. Moreover, the necessary level of care and the acceptable level of risk may be too difficult for the judiciary to determine in view of how rapidly the most capable GPAI systems are evolving.

Re-engaging with the Directive reaffirms the EU’s position as a global leader in AI regulation, complementing the AI Act and PLD to create a holistic governance framework. The implementation of harmonised compensatory measures, covering both immaterial and societal damages, ensures uniform protection for victims throughout the EU. By addressing liability comprehensively and fairly, the AI Liability Directive can unlock the immense potential of AI for good while mitigating its risks. This is not just about regulating technology, but about shaping a future where AI empowers humanity, guided by principles of responsibility, trust, and the protection of individuals and society.

See FLI’s position paper on the proposed AI Liability Directive here.

iii. Actively involve civil society organisations in the drafting of Codes of Practice.

It is essential for the Commission to actively involve civil society groups in the formulation of Codes of Practice, as sanctioned by Article 52e(3) and Recital 60s of the AI Act. Both are ambivalent about civil society’s role, stating that civil society “may support the process” with the AI Office, which can consult civil society “where appropriate”. Collaborating with civil society organisations on the drafting of Codes of Practice is crucial to ensure that the guidelines reflect the state of the art and consider a diverse array of perspectives. More importantly, the Codes of Practice will be relied upon until standards are developed, a process that is itself far from concluded. It is therefore crucial that the Codes of Practice accurately reflect the neutral spirit of the AI Act and are not co-opted by industry in an effort to reduce their duties under the AI Act.

Civil society groups also often possess valuable expertise and insights, representing the interests of the wider public and offering unique viewpoints on the technical, economic, and social dimensions of various provisions. Inclusion of these stakeholders not only enhances the comprehensiveness and credibility of the Codes of Practice, but also fosters a more inclusive and democratic decision-making process. By tapping into the wealth of knowledge within civil society, the Commission can create a regulatory framework that is not only technically robust but also aligned with European values, reinforcing the commitment to responsible and accountable AI development within the EU.

iv. Issue clear, concise, and implementable AI Act guidance.

Another key goal for the new Commission and AI Office is to commit to issuing timely, concise, and implementable guidance on AI Act obligations. Drawing from lessons learned with the implementation of past Regulations, such as the GDPR, where extensive guidance documents became cumbersome and challenging even for experts, the focus should be on creating guidance that is clear, accessible, and practical.

Article 3(2)(c) of the Commission’s Decision on the AI Office highlights its role in assisting the Commission in preparing guidance for the practical implementation of forthcoming regulations. This collaboration should prioritise the development of streamlined guidance that demystifies the complexities of specific duties, especially with regard to general-purpose AI (GPAI) models with systemic risk. The availability of clear guidance removes ambiguities in the text which can otherwise be exploited. It also makes duties for providers, such as high-risk AI system developers, comprehensible, especially for SME developers with limited access to legal advice. The Commission should view guidance as an opportunity to start building lines of communication with SMEs, including start-ups and deployers.

For example, Article 62 of the AI Act centres around serious incident reporting and calls on the Commission to issue guidance on reporting such incidents. The effectiveness of Article 62, in many ways, rides on the comprehensiveness of the guidance the Commission provides.

v. Proactively foster international collaboration.

As the new Commission assumes its role, it is critical that it empowers the AI Office to spearhead international collaboration on AI safety. In accordance with Article 7 of the Commission Decision establishing the AI Office, which highlights its role in “advocating the responsible stewardship of AI and promoting the Union approach to trustworthy AI”, it is essential for the Commission to ensure that the AI Office takes a leadership position in fostering global partnerships. The upcoming AI safety summit in South Korea in May 2024 and the subsequent one in France in 2025 present opportune platforms for the EU to actively engage with other jurisdictions. When third countries take legislative inspiration from the EU, the AI Office can steer international governance according to the principles it has established through the AI Act.

Given the cross-border nature of AI, and for the purpose of establishing legal certainty for businesses, the AI Office should strive to work closely with foreign AI safety agencies, such as the recently established AI Safety Institutes in the US, UK, and Japan respectively. Additionally, it must play a pivotal role in the implementation of global agreements on AI rules. In doing so, the EU can position itself as a driving force in shaping international standards for AI safety, reinforcing the Union’s commitment to responsible innovation on the global stage.

vi. Build relationships with national competent authorities and ensure seamless collaboration on enforcement.

In line with Article 59 of the AI Act, we urge the new Commission to closely monitor the designation of national competent authorities and foster a collaborative relationship with them for robust enforcement of the AI Act. The Commission should exert political capital to nudge Member States to abide by the 12-month timeline for designating their notifying and market surveillance authorities. While these national competent authorities will operate independently, the Office should maintain a publicly accessible list of single points of contact and begin building roads for collaboration.

To ensure effective enforcement of the AI Act’s pivotal provisions, Member States must equip their national competent authorities with adequate technical, financial, and human resources, especially personnel with expertise in AI technologies, data protection, cybersecurity, and legal requirements. Given the uneven distribution of resources across Member States, it is to be expected that certain Member States may require more guidance and support from the Commission and the AI Office specifically. It is crucial that the AI Board uses its powers to facilitate the exchange of experiences among national competent authorities, to ensure that differences in competencies and resource availability do not impede incident monitoring.

vii. Secure the future of AI regulation by addressing the AI Office funding challenge.

Establishing the AI Office as mandated by the AI Act is crucial for effective governance and enforcement. However, concerns arise regarding the proposed funding through reallocation from the Digital Europe Program, originally geared towards cybersecurity and supercomputing. This approach risks diverting resources from existing priorities while potentially falling short of the AI Office’s needs. Moreover, the absence of dedicated funding within the current MFF (2021-2027) further necessitates a proactive solution.

While the new governance and enforcement structure presents uncertainties in cost prediction, established authorities like the European Data Protection Supervisor (EDPS) offer valuable benchmarks. In 2024 alone, the EDPS has a budget of €24.33 million and employs 89 staff members. Another relevant benchmark is the European Medicines Agency (EMA), with 897 employees and a 2024 budget of €478.5 million (of which €34.8 million comes from the EU budget). The AI Office would require financial resources comparable to other EU agencies, as well as an additional budget stream for the compute resources needed to evaluate powerful models. Recent reports suggest a budget of €12.76 million once the AI Office is fully developed in 2025, an amount that will fall short of securing the proper governance and enforcement of the AI Act. Therefore, we urge the Commission to take immediate action and:

  • Guarantee adequate funding for the AI Office until the next MFF comes into effect. This interim measure should ensure the Office can begin its critical work without resource constraints.
  • Negotiate a dedicated budget line within the MFF 2028-2034. This aligns with the strategic importance of the AI Office and prevents reliance on reallocations potentially compromising other programs.

Investing in the AI Office is not just a budgetary decision; it’s an investment in a robust regulatory framework for responsible AI development. By ensuring adequate funding, the Commission can empower the AI Office to effectively oversee the AI Act, safeguard public trust, and enable Europe to remain at the forefront of responsible AI governance.

Chemical & Biological Weapons and Artificial Intelligence: Problem Analysis and US Policy Recommendations https://futureoflife.org/document/chemical-biological-weapons-and-artificial-intelligence-problem-analysis-and-us-policy-recommendations/ Tue, 27 Feb 2024 20:08:59 +0000 https://futureoflife.org/?post_type=document&p=132578 Domain Definition

A Chemical Weapon is a chemical used intentionally to kill or harm with its toxic properties. Munitions, devices and other equipment specifically designed to weaponize toxic chemicals also fall under the definition of chemical weapons. Chemical agents such as blister agents, choking agents, nerve agents and blood agents have the potential to cause immense pain and suffering, permanent damage and death.1 After these weapons caused millions of casualties in both world wars, 200 countries signed the Chemical Weapons Convention – enforced by the Organization for the Prohibition of Chemical Weapons (OPCW) – and sought to destroy their chemical stockpiles. With the destruction of the last chemical weapon by the United States in July 2023, the OPCW has declared the end of all official chemical stockpiles.2 While small-scale attacks by non-state actors and rogue state actors have occurred over the last fifty years, these are isolated cases. The vast majority of chemical weapons have been eradicated.

Biosecurity encompasses actions to counter biological threats, reduce biological risks, and prepare for, respond to and recover from biological incidents – whether naturally occurring, accidental, or deliberate in origin and whether impacting human, animal, plant, or environmental health. The National Biodefense Strategy and Implementation Plan published by the White House in October 2022 finds biosecurity to be critical to American national security interests, economic innovation, and scientific empowerment.3 In addition, American leadership from both sides of the political spectrum has undertaken significant investments in strengthening biosecurity over the last two decades. Finally, the COVID-19 pandemic has crystallized the threat to American life, liberty, and prosperity from pandemics in the future.

Problem Definition

Artificial intelligence (AI) could reverse the progress made in the last fifty years to abolish chemical weapons and develop strong norms against their use. As part of an initiative at the Swiss Federal Institute for Nuclear, Biological, and Chemical (NBC) Protection, a computational toxicology company was asked to investigate the potential dual-use risks of AI systems involved in drug discovery. The initiative demonstrated that these systems could generate thousands of novel chemical weapons. Most of these new compounds, as well as their key precursors, were not on any government watch-lists due to their novelty.4 This is even more concerning in light of the advent of large language model (LLM)-based artificial agents, whose ability to sense their environment, make decisions, and take actions compounds the unpredictability and risks associated with this kind of research.

On the biological weapons front, cutting-edge biosecurity research, such as gain-of-function research, qualifies as dual-use research of concern – i.e. while such research offers significant potential benefits it also creates significant hazards. For instance, such research may be employed to develop vital medical countermeasures or to synthesize and release a dangerous pathogen. Over the last two decades, the cost of advanced biotechnology has rapidly decreased and access has rapidly expanded through advancements in cheaper and more accessible DNA sequencing, faster DNA synthesis, the discovery of efficient and accurate gene-editing tools such as CRISPR/Cas9, and developments in synthetic biology.5

Accompanying these rapid developments are even faster advancements in AI tools used in tandem with biotechnology. For instance, advanced AI systems have enabled several novel practices such as AI-assisted identification of virulence factors and in silico design of novel pathogens.6 More general-purpose AI systems, such as large language models, have also enabled a much larger set of individuals to access potentially hazardous information with regard to procuring and weaponizing dangerous pathogens, lowering the barrier of biological competency necessary to carry out these malicious acts.

The threats posed by biological and chemical weapons in convergence with AI are of paramount importance. Sections 4.1 and 4.4 of the White House Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence explicitly outline the potential Chemical, Biological, Radiological and Nuclear (CBRN) threats posed by advanced AI systems.7  They can be broadly divided into two categories:

#1. Exponentially Enhanced Capacity to Engineer Deadly Toxins and Biological Weapons

As discussed in the example of the toxicology company above, there is growing concern regarding the potential misuse of molecular machine learning models for harmful purposes. The dual-use application of models for predicting cytotoxicity to create new poisons or employing AlphaFold2 to develop novel toxins has raised alarm. Recent developments in AI have allowed for an expansion of open-source biological design tools (BDTs), increasing access by bad actors.8 This creates three kinds of risks:

  1. Increased Access to Rapid Identification of Toxins: The MegaSyn AI software used by the toxicology company discussed was able to find 40,000 toxins with minimal digital architecture (namely some programming), open-source data, a 2015 Mac computer and less than six hours of machine time.9 This suggests that AI systems may democratize the ability to create chemical weapons, increasing access by non-state actors, rogue states or individuals acting on their own who would otherwise be precluded by insufficient resources. Combined with the use of LLMs and other general-purpose AI tools, the bar for expert knowledge needed to develop chemical weapons has been substantially lowered, further diffusing the ability to identify and release deadly toxins.
  2. Discovery of Novel Toxins: An important aspect of the findings from the experiment discussed above is that the AI system not only found VX and other known chemical weapons; it also discovered thousands of completely new putatively toxic substances. This creates serious hazards for chemical defense, as malevolent actors may try to make AI systems develop novel toxins that are less well understood, and for which defensive, neutralizing, or treatment procedures have not yet been developed.
  3. AI-Accelerated Development of Biological Design Tools: These tools span different fields such as bio-informatics, genomics, synthetic biology, and others. In essence, these tools allow smaller groups of individuals, with fewer resources, to discover, synthesize, and deploy enhanced pathogens of pandemic potential (PPPs). Critically, these AI systems can amplify risks from gain-of-function research, enabling malevolent actors to make pathogens more deadly, transmissible, and resilient against medical counter-measures.10 AI assistance can also help bad actors direct such bio-weapons at targets of particular genotypes, races, ethnicities, tribes, families, or individuals, facilitating the conduct of genocide at a potentially global scale.11

#2. Increased Access to Dangerous Information and Manipulation Techniques Through LLMs

Outside the use of narrow AI systems to discover deadly toxic substances, developments in general purpose systems such as large language models may allow malevolent actors to execute many of the other steps needed to deploy a chemical weapon. Essential steps include baseline knowledge of chemistry and biology, access to critical materials and lab infrastructure, and access to means of deploying the weapon (e.g. munitions). LLMs equip malevolent actors with the ability to send deceptive emails and payments to custom manufacturers of chemical and biological materials, access substances through illicit markets, and hire temporary workers to accomplish specialized, compartmentalized tasks around the world. Taken together, these capacities enable the production and deployment of chemical weapons. More narrow AI systems have displayed effectiveness in writing code to exploit technical loopholes in the cybersecurity architecture of several organizations, such as identifying and exploiting zero-day vulnerabilities.12 Such techniques could be used to target critical bio-infrastructure such as Biosafety Level 3 and 4 Labs (BSL-3 and BSL-4), research laboratories, hospital networks, and more. These practices could enable access to dangerous information or be used to cripple recovery and response to a high-consequence biological incident.

An experiment conducted at MIT demonstrated that students without a technical background were able, within 60 minutes, to use LLMs to identify four potential pandemic pathogens, explain how they can be generated from synthetic DNA using reverse genetics, supply the names of DNA synthesis companies unlikely to screen orders, identify detailed protocols and how to troubleshoot them, and recommend that anyone lacking the skills to perform reverse genetics engage a core facility or contract research organization.13 Other experiments conducted across different settings and time horizons have also demonstrated how large language models can be exploited to access and/or use hazardous information.14 Traditionally, access to this kind of expertise and information was mediated through established systems (completing a Ph.D. in an advanced field, being hired by a top research laboratory, meeting specified safety and security criteria for conducting sensitive research, etc.). Now its democratization allows many more individuals, with less skill and expertise, to access this knowledge and potentially use it to cause considerable harm.

AI-powered cyberattacks also present a threat to biosecurity and chemical security. Advancements in AI have allowed a wider net of actors to more easily construct cyber exploits that could be used to target cyber-vulnerabilities in water treatment facilities, research labs, and containment facilities, causing widespread harmful chemical or biological exposure. In addition, AI systems may be used to improve the cyber-manipulation techniques used by malevolent actors. Cyber-manipulation encompasses a wide array of practices such as spearphishing, pharming, smishing, vishing, and others intended to deceive, blackmail, mislead, or otherwise compel the victim of such a practice to reveal high-value information. Large language models have demonstrated a considerable capacity to amplify the power of these illegal practices, which could allow malevolent actors to access dangerous biological information or infrastructure by manipulating owners of DNA synthesis companies, prominent academics in the field, and biosecurity professionals.15 While many large language models have some preliminary guardrails built in to guard against this misuse, several experiments have demonstrated that even trivial efforts can overcome these safeguards.16 For instance, relabeling toxic substances within the data provided to the model can overcome safeguards set up to preclude the model from providing dangerous information. Prompt engineering by compartmentalizing (breaking up one dangerous process into several steps which seem innocuous by themselves), as well as faking authority (pretending to be in charge of a government chemical facility), have also yielded success in manipulating these models.17

Policy Recommendations

In light of the significant challenges analyzed in the previous section, considerable attention from policymakers is necessary to ensure the safety and security of the American people. The following policy recommendations represent critical, targeted first steps toward mitigating the risks posed by AI in the domains of chemical security and biosecurity:

  1. Explicit Requirements to Evaluate Advanced General Purpose AI Systems for Chemical Weapons Use: There is considerable ongoing policy discussion to develop a framework for evaluating advanced general purpose AI systems before and after they are developed and/or deployed, through red-teaming, internal evaluations, external audits and other mechanisms. In order to guard against emerging risks from biological and chemical weapons, it is vital that these evaluations explicitly incorporate a regimen for evaluating a system’s capacity to facilitate access to sensitive information and procedures necessary to develop chemical weapons. This could include the capability of these systems to provide dangerous information as discussed, as well as the capability to deceive, manipulate, access illicit spaces, and/or order illegal financial transactions. In order to prevent malevolent actors from accessing hazardous information and expertise, or further exploiting AI systems to access high-risk infrastructure, it is also critical to set up minimum auditing requirements for these general-purpose systems before launch. These practices could help test and strengthen the safeguards underpinning these systems. Such a requirement could also be incorporated into the existing risk management frameworks, such as the NIST AI Risk Management Framework.
  2. Restrict the Release of Model Weights for Systems that Could be Used, or Modified to be Used, to Discover Dangerous Toxins: In order to reduce the ability of malevolent actors to use AI capabilities in the production of dangerous chemical toxins, it is critical that both narrow and general-purpose AI systems shown to be dangerous in this regard (as well as future iterations of those and similar systems) be subject to significant restrictions on access, both for use and to the underlying model weights. Critically, the release of model weights is an irreversible act that eliminates the capacity to restrict use in perpetuity. Accordingly, red-teaming procedures such as those mentioned in the previous recommendation must include extensive assessment to confirm the lack of potential for these dangerous capabilities, and for modification or fine-tuning to introduce these dangerous capabilities, if the developer intends to release the model weights.18
  3. Ring-fence Dangerous Information from Being Used to Train Large Language Models. In order to ensure that general-purpose AI systems do not reveal hazardous information, it is vital to require that companies not use this kind of information in the training runs for their AI models. Proactively keeping information that would very likely pose a significant health and/or safety issue to the general public classified, using new classification levels and initiatives, would significantly reduce these risks.19
  4. Incorporating AI Threats into Dual Use Research of Concern Guidance and Risk Frameworks: Over the last two decades, considerable policy attention has been devoted to establishing policy frameworks, including guidance and requirements, for biosecurity. However, these frameworks do not currently include policy prescriptions and guidance for unique challenges posed by AI. National-level policy frameworks such as those published by the National Science Advisory Board for Biosecurity (NSABB), the CDC, HHS, DHS, and others must explicitly integrate concerns at the convergence of AI and biosecurity, and establish technical working groups within these bodies populated by experts in both fields to study these risks. Finally, these convergence risks should also be integrated into AI risk frameworks such as the NIST AI RMF. With the exception of the NIST AI RMF, all of these regulatory directives and review regimes were instituted before the exponential development of AI systems seen over the last few years. It is important to update this guidance and include explicit provisions for the use of AI in dual-use biological and chemical research.
  5. Expand Know Your Customer (KYC) and Know Your Order (KYO) Requirements. Companies that provide sequencing and synthesis services, research laboratories, and other relevant stakeholders should be required to follow KYC and KYO standards, ensuring that potentially dangerous sequences are kept out of the hands of malevolent actors.20 Regulation should further require standardized, scalably secure synthesis screening methods (such as SecureDNA). These requirements must also include assurance that correspondence pertaining to these services is between human agents and does not involve AI systems.
  6. Strengthen Existing Capabilities and Capacities for Biodefense. As developments in AI and biotechnology accelerate, it is also vital to ensure that there is considerable capacity to prevent, detect, and respond to high-consequence biological incidents of all kinds. This includes significant investments in early warning and detection, response capacities, interoperability and coordination, national stockpiles of PPEs and other relevant infrastructure, supply-chain resilience, development of medical countermeasures, and accountability and enforcement mechanisms to disincentivize both accidents and intentional misuse.21

More general oversight and governance infrastructure for advanced AI systems is also essential to protect against biological and chemical risks from AI, among many other risks. We further recommend these broader regulatory approaches to track, evaluate, and incentivize the responsible design of advanced AI systems:

  1. Require Advanced AI Developers to Register Large Training Runs and to “Know Their Customers”: The Federal Government lacks a mechanism for tracking the development and proliferation of advanced AI systems that could exacerbate bio-risk. To mitigate these risks adequately, it is essential to know what systems are being developed and who has access to them. Requiring registration for the acquisition of large amounts of computational resources for training advanced AI systems, and for carrying out the training runs themselves, would help with evaluating possible risks and taking appropriate precautions. “Know Your Customer” requirements similar to those imposed in the financial services industry would reduce the risk of systems that can facilitate biological and chemical attacks falling into the hands of malicious actors (see the illustrative sketch after this list).
  2. Clarify Liability for Developers of AI Systems Used in Bio- and Chemical Attacks: It is not clear under existing law whether the developers of AI systems used by others, for example to synthesize and launch a pathogen, would be held liable for resulting harms. Absolving developers of liability in these circumstances creates little incentive for profit-driven developers to expend financial resources on precautionary design principles and robust assessment. Because these systems are opaque and can possess unanticipated, emergent capabilities, there is inherent risk in developing advanced AI systems and systems expected to be used in critical contexts. Implementing strict liability when these systems facilitate or cause harm would better incentivize developers to take appropriate precautions against vulnerabilities, and the risk of use in biological and chemical attacks.
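As a minimal illustration of how such a registration-and-KYC trigger could operate (the compute threshold, field names, and function below are hypothetical assumptions offered as a sketch, not a description of any enacted rule), a compute provider or developer would check a planned training run against these requirements before work begins:

```python
# Hypothetical screening of a planned training run against registration
# and Know-Your-Customer (KYC) requirements. The threshold is illustrative only.
TRAINING_OPS_THRESHOLD = 1e26  # assumed total-training-operations figure, not a statutory one

def screen_training_run(planned_ops: float, customer_verified: bool) -> dict:
    """Return the obligations a compute provider would flag before provisioning."""
    return {
        "registration_required": planned_ops >= TRAINING_OPS_THRESHOLD,
        "blocked_pending_kyc": not customer_verified,
    }

# Example: a very large run for an unverified customer is flagged on both counts.
print(screen_training_run(planned_ops=3e26, customer_verified=False))
# {'registration_required': True, 'blocked_pending_kyc': True}
```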

↩ 1 What is a Chemical Weapon? Organization for the Prohibition of Chemical Weapons.

↩ 2 US Completes Chemical Weapons Stockpile Destruction Operations. Department of Defense.

↩ 3 National Biodefense Strategy And Implementation Plan. The White House. 

↩ 4 Dual Use of Artificial Intelligence-powered Drug Discovery. National Center for Biotechnology Information. National Institutes of Health.

↩ 5 The Blessing and Curse of Biotechnology: A Primer on Biosafety and Biosecurity. Carnegie Endowment for International Peace.

↩ 6 Assessing the Risks Posed by the Convergence of Artificial Intelligence and Biotechnology.  National Center for Biotechnology Information. National Institutes of Health.

↩ 7 Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. The White House.

↩ 8 Bio X AI: Policy Recommendations For A New Frontier. Federation of American Scientists.

↩ 9 AI suggested 40,000 new possible chemical weapons in just six hours. The Verge.

↩ 10 The Convergence of Artificial Intelligence and the Life Sciences. Nuclear Threat Initiative.

↩ 11 The Coming Threat of a Genetically Engineered ‘Ethnic Bioweapon’. National Review.

↩ 12 US adversaries employ generative AI in attempted cyberattack. Security Magazine.

↩ 13 Can large language models democratize access to dual-use biotechnology? Computers and Society. https://arxiv.org/abs/2306.03809

↩ 14 The Operational Risks of AI in Large-Scale Biological Attacks. RAND Corporation.

↩ 15 AI tools such as ChatGPT are generating a mammoth increase in malicious phishing emails. CNBC.

↩ 16 NIST Identifies Types of Cyberattacks That Manipulate Behavior of AI Systems. National Institutes of Standards and Technology.

↩ 17 Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study. Computer Engineering. https://arxiv.org/abs/2305.13860

↩ 18 BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B. Computation and Language. https://arxiv.org/abs/2311.00117

↩ 19 Artificial Intelligence in the Biological Sciences: Uses, Safety, Security, and Oversight. Congressional Research Service.

↩ 20 Preventing the Misuse of DNA Synthesis Technology. Nuclear Threat Initiative.

↩ 21 Biosecurity In The Age Of AI. Helena Biosecurity. https://www.helenabiosecurity.org

FLI Response to OMB: Request for Comments on AI Governance, Innovation, and Risk Management https://futureoflife.org/document/fli-response-to-omb-request-for-comments-on-ai-governance-innovation-and-risk-management/ Wed, 21 Feb 2024 11:02:27 +0000 https://futureoflife.org/?post_type=document&p=122417 Organization: Future of Life Institute

Point of Contact: Hamza Tariq Chaudhry, US Policy Specialist. hamza@futureoflife.org 

We would like to thank the Office of Management and Budget (OMB) for the opportunity to provide comments on OMB–2023–0020, or the Memorandum on ‘Advancing Governance, Innovation, and Risk Management for Agency Use of Artificial Intelligence’ (hereafter referred to as ‘the Memorandum’). The Future of Life Institute (FLI) has a long-standing tradition of work on AI governance to mitigate the risks and maximize the benefits of artificial intelligence. For the remainder of this Request for Comment (RfC) document, we provide a brief summary of our organization’s work in this space, followed by substantive comments on the Memorandum. The ‘substantive comments’ section provides responses to the questions outlined in the RfC. The ‘miscellaneous comments’ section offers general comments outside the scope of the questions outlined in the Federal Register. We look forward to continuing this correspondence and being a resource for the OMB’s efforts in this space in the months and years to come.

About the Organization

The Future of Life Institute (FLI) is an independent nonprofit organization with the goal of reducing large-scale risks and steering transformative technologies to benefit humanity, with a particular focus on artificial intelligence. Since its founding ten years ago, FLI has taken a leading role in advancing key disciplines such as AI governance, AI safety, and trustworthy and responsible AI, and is widely considered to be among the first civil society actors focused on these issues. FLI was responsible for convening the first major conference on AI safety in Puerto Rico in 2015, and for publishing the Asilomar AI principles, one of the earliest and most influential frameworks for the governance of artificial intelligence. FLI is the UN Secretary General’s designated civil society organization for recommendations on the governance of AI and has played a central role in deliberations regarding the EU AI Act’s treatment of risks from AI. FLI has also worked actively within the United States on legislation and executive directives concerning AI. Members of our team have contributed extensive feedback to the development of the NIST AI Risk Management Framework, testified at the Senate Insight Forums, participated in the UK AI Summit, and helped connect leading experts in the policy and technical domains to policymakers across the US government.

FLI’s wide-ranging work on artificial intelligence can be found at www.futureoflife.org.


Substantive Comments

Definitions and best practices

Comments in Response to Questions 5 and 6 from the Federal Register: Definitions of and best practices regarding safety-impacting and rights-impacting AI.1

The Memorandum establishes a clear minimum threshold for safety (Section 5,c,iv) that must be attained before agencies are allowed to use AI systems, applicable to both systems being used presently and those intended for use in the future.2 The requirements for these agencies – which include impact assessments, real-world testing of AI systems before deployment, independent evaluations and periodic post-deployment testing – are a positive step towards minimizing the safety risks from government use of AI models.

We would, however, welcome further details for agencies on the periodic reviews that occur post-deployment, specifying that these reviews would also include red-teaming and other auditing processes that make up portions of the pre-deployment review process. In addition, while we appreciate the inclusion of language prohibiting agencies from using AI systems in cases where ‘the benefits do not meaningfully outweigh the risks’, we invite the OMB to support this language with quantitative examples, as risk captures both the probability and the magnitude of harm, especially in the case of safety concerns. For instance, even if the probability of a given risk is considerably lower than that of the potential benefit, the magnitude of that risk (e.g., a bio-weapon attack) may be so high that it overrides the benefit despite its low probability. Agencies should be required to establish, subject to public comment and external review, risk tolerances for activities for which use of AI systems is anticipated, including unacceptable risks to individuals, communities, and society that would disqualify the system from adoption. Establishing these thresholds prior to testing and adoption would help prevent risk tolerances from gradually drifting to unacceptable levels.
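To make the interaction of probability and magnitude concrete, the following minimal sketch (the thresholds, scales, and function names are hypothetical illustrations, not drawn from the Memorandum) shows how a pre-registered risk tolerance could veto adoption when a potential harm is catastrophic, however improbable it is judged to be:

```python
from dataclasses import dataclass

# Hypothetical, pre-registered tolerances set before testing and adoption begin.
CATASTROPHIC_MAGNITUDE = 9.0   # severity (0-10 scale) that disqualifies a system outright
MAX_EXPECTED_HARM = 0.5        # maximum tolerable probability-weighted harm score

@dataclass
class Risk:
    description: str
    probability: float  # estimated likelihood of the harm occurring (0-1)
    magnitude: float    # severity of the harm if it occurs (0-10 scale)

def adoption_permitted(benefit_score: float, risks: list[Risk]) -> bool:
    """Reject adoption if any single risk is catastrophic, or if the
    probability-weighted harm is intolerable or outweighs the benefit."""
    if any(r.magnitude >= CATASTROPHIC_MAGNITUDE for r in risks):
        return False  # low probability does not excuse catastrophic magnitude
    expected_harm = sum(r.probability * r.magnitude for r in risks)
    return expected_harm <= MAX_EXPECTED_HARM and benefit_score > expected_harm

# Example: a low-probability but catastrophic risk blocks adoption despite clear benefits.
blocked = adoption_permitted(
    benefit_score=5.0,
    risks=[Risk("misuse enabling a bio-weapon attack", probability=0.001, magnitude=10.0)],
)
print(blocked)  # -> False
```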

The Memorandum provides adequate definitions for two categories of potential harm posed by AI systems – safety-impacting AI systems and rights-impacting AI systems. FLI, which predominantly focuses on AI safety, supports the broader definition of safety-impacting AI systems offered in the Memorandum, which captures a more expansive set of AI models and does not rely on technical thresholds. We believe this best positions the implementing agencies to exercise appropriate oversight over use of AI models. In addition, we are pleased to see that under the proposed definition, many models are presumed to be safety-impacting (Section 5,b). This is critical as it relieves relevant agencies of administrative burdens and time delays that would otherwise occur in evaluating each system with risk assessments, instead allowing them to devote more time and resources to setting up adequate guardrails. By the same token, we are pleased that additional risk assessments can be conducted to expand the scope of systems receiving due scrutiny.

Finally, when it comes to ‘use of AI’, we support the Memorandum’s inclusion of procurement in addition to direct use (Section 5, d). However, the Memorandum currently frames its guidance on procurement and contracts as a set of recommendations rather than requirements. It is imperative that the OMB set up robust requirements for government purchasing of AI systems that mirror the requirements on direct use, ensuring that procurement of AI systems includes consistent, robust evaluation to protect the safety and rights of the American public. This has the potential to minimize harm from government use of AI, and to inform best practices for the private sector, where most state-of-the-art models are created.

Chief AI Officer and AI Governance Body

Comments in Response to Questions 1 and 2 from the Federal Register: Role of Chief AI Officer and the benefits and drawbacks of central AI governance body.3

We agree that effective oversight of AI adoption by government agencies should rely on AI governance bodies within each agency to coordinate and supervise AI procurement and use across the broad functions of the agency. This structure facilitates oversight and accountability to ensure that the minimum requirements set out in the Memorandum are met by each agency writ large, while giving different offices within each agency the capability to exercise their mandate when it comes to specific use cases. In addition, we believe such a body can facilitate effective communication across different offices, bureaus, and centers within the agency to ensure that poor communication does not lead to under-reporting of use cases or to uses of AI that could lead to potential harm. Finally, we believe such a body would appropriately empower the Chief AI Officer (CAIO) to exercise their mandate as specified in the Memorandum.

However, we contend that this “hub and spoke” structure of a centralized AI governance body coordinating and overseeing domain-specific AI governance should be implemented on a whole-of-government level. In other words, we believe that just as there are benefits to having a new central body within each agency that helps enforce requirements laid out within the Memorandum, these bodies themselves would benefit from a single governance body that has representation and oversight across different agencies. This would facilitate interagency coordination, provide a central hub of expertise to advise agencies where appropriate, avoid costly redundancies in efforts by various agencies, and provide a body to inform and evaluate government AI adoption where domain-specific agency jurisdiction is not clear.

Information for Public Reporting

Comments in Response to Question 8 from the Federal Register: Nature of information that should be publicly reported by agencies in use case inventories.4

While we welcome provisions within the Memorandum which require annual reporting of use cases of covered AI systems by the relevant agencies (Section 3, a), we are concerned that the OMB does not further elaborate on the details of these use case inventories. We believe that the public should have access to information on the full results of the impact assessments, real-world testing, independent evaluations, and periodic human reviews, wherever possible. Where it is not possible to provide this information in full, we believe it is vital to provide redacted versions of these documents upon the filing of a Freedom of Information Act (FOIA) request. Secondly, considering that there is some precedent of agencies neglecting to report all use cases in the past, we believe that the Memorandum would benefit from explicit provisions to guard against under-reporting of use cases. This could, for instance, include guidance for Inspectors General to audit these use cases periodically within their respective agencies. Finally, while we recognize this as a positive first step towards creating transparency in use cases, we emphasize that it does not ensure sufficient accountability in and of itself, and will require further guidance and requirements empowering the OMB, the CAIOs, and other relevant authorities to take action against violations of the use case guidance set out in the Memorandum.

Miscellaneous Comments

Comments on Scope

Section 2 (‘Scope’) explicitly exempts the intelligence community (‘covered agencies’, Section 2, a) and cases where AI is used as a component of a national security system (‘applicability to national security systems’, Section 2, c). As the Memorandum is intended to minimize the risks of government use of AI systems, we believe it is critical to establish robust requirements for the intelligence and defense communities, as these are likely to be the highest-risk cases of government AI use with the greatest potential harm, and hence the most urgent need for scrutiny. Where it is within the remit of the OMB to set up requirements within these domains, we ask that they urgently do so.

Comments on Definitions

We are pleased to see that Section 6 (‘Definitions’) outlines an expansive definition of “artificial intelligence” that is broader than the definition offered in the AI Executive Order. In addition, we support the Memorandum’s description of AI systems, which encompasses systems across different ranges of autonomous behavior, technical parameters, and human oversight. However, we believe it is vital to ensure that the definition of AI employed in this section is treated as an ‘or’ definition as opposed to an ‘and’ definition. In other words, we believe that any system which fulfills any of these criteria should fall within the definitional scope of AI. For the same reason, we are concerned that the definition of ‘dual-use foundation models’ mirrors the definition included in the AI Executive Order, which offers an ‘and’ definition, leading to very few models coming under definitional scope and potentially excluding those which pose safety risks but do not meet other criteria.5

The Memorandum also employs the AI Executive Order definition for ‘red-teaming’.6 While this definition outlines what red-teaming would cover, it does not provide any detail on how rigorous this red-teaming must be, and for what period within the lifecycle of the AI system. We support further clarification from the OMB in this regard to ensure that red-teaming as defined in guidance adequately tests models for safety harms for the duration of their procurement and use.

We endorse the OMB’s decision to establish a broad definition for what would count as ‘risks from the use of AI’ as well as the expansive definition of ‘safety-impacting AI’. However, we recommend the addition of loss of control from use of AI systems to the considerable list of risk factors identified in the definition of ‘safety-impacting AI’.

Comments on Distinguishing between Generative and Other AI

We believe that all advanced AI systems, whether they are generative or otherwise, should be subject to appropriate requirements to ensure safety. Hence, we are pleased to see that, in a slight divergence from the AI Executive Order, the Memorandum bases requirements on potential harms from AI and does not distinguish between generative AI and other AI systems.


↩ 1 5. Are there use cases for presumed safety-impacting and rights-impacting AI (Section 5 (b)) that should be included, removed, or revised? If so, why?

6. Do the minimum practices identified for safety-impacting and rights-impacting AI set an appropriate baseline that is applicable across all agencies and all such uses of AI? How can the minimum practices be improved, recognizing that agencies will need to apply context-specific risk mitigations in addition to what is listed?

↩ 2 We are particularly pleased to see that the scope of this Memorandum applies not just to use and application of AI systems in the future, but also those currently in use by relevant agencies.

↩ 3 1. The composition of Federal agencies varies significantly in ways that will shape the way they approach governance. An overarching Federal policy must account for differences in an agency’s size, organization, budget, mission, organic AI talent, and more. Are the roles, responsibilities, seniority, position, and reporting structures outlined for Chief AI Officers sufficiently flexible and achievable for the breadth of covered agencies?

2. What types of coordination mechanisms, either in the public or private sector, would be particularly effective for agencies to model in their establishment of an AI Governance Body? What are the benefits or drawbacks to having agencies establishing a new body to perform AI governance versus updating the scope of an existing group (for example, agency bodies focused on privacy, IT, or data)?

↩ 4 8. What kind of information should be made public about agencies’ use of AI in their annual use case inventory?

↩ 5 Section 3 of the AI Executive Order defines such a model in the following way: “dual-use foundation model” means an AI model that is trained on broad data; generally uses self-supervision; contains at least tens of billions of parameters; is applicable across a wide range of contexts; and that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters.” (Emphasis added).

↩ 6 Section 3 of the AI Executive Order defines red-teaming as: The term “AI red-teaming” means a structured testing effort to find flaws and vulnerabilities in an AI system, often in a controlled environment and in collaboration with developers of AI. Artificial Intelligence red-teaming is most often performed by dedicated “red teams” that adopt adversarial methods to identify flaws and vulnerabilities, such as harmful or discriminatory outputs from an AI system, unforeseen or undesirable system behaviors, limitations, or potential risks associated with the misuse of the system.

FLI Response to NIST: Request for Information on NIST’s Assignments under the AI Executive Order https://futureoflife.org/document/fli-response-to-nist-request-for-information-on-nists-assignments-under-the-ai-executive-order/ Wed, 21 Feb 2024 10:53:07 +0000 https://futureoflife.org/?post_type=document&p=122418 Response to Request for Information (RFI NIST-2023-0009-0001) Related to NIST’s Assignments Under Sections 4.1, 4.5 and 11 of the Executive Order Concerning Artificial Intelligence (Sections 4.1, 4.5, and 11)

Organization: Future of Life Institute

Point of Contact: Hamza Tariq Chaudhry, US Policy Specialist. hamza@futureoflife.org 

About the Organization

The Future of Life Institute (FLI) is an independent nonprofit organization with the goal of reducing large-scale risks and steering transformative technologies to benefit humanity, with a particular focus on artificial intelligence. Since its founding, FLI has taken a leading role in advancing key disciplines such as AI governance, AI safety, and trustworthy and responsible AI, and is widely considered to be among the first civil society actors focused on these issues. FLI was responsible for convening the first major conference on AI safety in Puerto Rico in 2015, and for publishing the Asilomar AI principles, one of the earliest and most influential frameworks for the governance of artificial intelligence, in 2017. FLI is the UN Secretary General’s designated civil society organization for recommendations on the governance of AI and has played a central role in deliberations regarding the EU AI Act’s treatment of risks from AI. FLI has also worked actively within the United States on legislation and executive directives concerning AI. Members of our team have contributed extensive feedback to the development of the NIST AI Risk Management Framework, testified at Senate AI Insight Forums, participated in the UK AI Summit, and connected leading experts in the policy and technical domains to policymakers across the US government.


Executive Summary

We would like to thank the National Institute of Standards and Technology (NIST) for the opportunity to provide comments regarding NIST’s assignments under Sections 4.1, 4.5 and 11 of the Executive Order Concerning Artificial Intelligence (Sections 4.1, 4.5, and 11). The Future of Life Institute (FLI) has a long-standing tradition of work on AI governance to mitigate the risks and maximize the benefits of artificial intelligence. In NIST’s implementation of the Executive Order on the ‘Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence’ (EO), we recommend consideration of the following:

  • Military and national security AI use-cases should not be exempt from guidance. Military and national security AI use-cases, despite exemptions in the Executive Order, should not be beyond NIST’s guidance, considering their potential for serious harm. Given their previous work, NIST is well-positioned to incorporate standards for military and national security into their guidelines and standards.
  • A companion resource to the AI Risk Management Framework (RMF) should explicitly characterize minimum criteria for unacceptable risks. NIST should guide the establishment of tolerable risk thresholds. These thresholds should include guidelines on determining unacceptable risk and outline enforcement mechanisms and incentives to encourage compliance.
  • Responsibility for managing risks inherent to AI systems should fall primarily on developers. The NIST companion resource for generative AI should define roles for developers, deployers, and end-users in the assessment process. Developers, end-users, and deployers should all work to mitigate risk, but the responsibility of developers is paramount, considering their central role in ensuring the safety of systems.
  • Dual-use foundation models developed by AI companies should require external red-teaming. External red-teams are essential for encouraging comprehensive and unbiased assessments of AI models. NIST should establish standards to ensure that external auditors remain independent and aligned with best practices.
  • The NIST companion resource for generative AI should include specific guidance for AI models with widely available model weights. Safeguards designed to mitigate risks from dual-use foundation models with widely available weights can be easily removed and require specific standards to ensure security.
  • Embedded provenance on synthetic content should include developer and model information. Including information on synthetic content about the developer and system of origin would better inform consumers and incentivize developers to prioritize safety from the design phase.
  • NIST should adopt a less restrictive definition of “dual-use foundation models.” Switching from a restrictive definition (using ‘and’) to a more expansive definition (using ‘or’) as stated in the EO would enable NIST to bring all models of concern within its purview.

We look forward to continuing this correspondence and to serving as a resource for NIST efforts pertaining to AI in the months and years to come.


Recommendations

1. Military and national security use-cases  

Standards for national security and the military are not beyond the remit of NIST. AI systems intended for use in national security and military applications present some of the greatest potential for catastrophic risk due to their intended use in critical, often life-or-death circumstances. While the EO exempts national security and military AI from most of its provisions, NIST has previously established standards1 related to national security, including standards for chemical, biological, radiological, nuclear, explosive (CBRNE) detection, personal protective equipment (PPE), and physical infrastructure resilience and security. Given this precedent, NIST can and should update the AI RMF and future companion pieces to include standards applicable to national security and military uses of AI. Specifically, NIST can play a vital role in mitigating risks presented by these systems by, inter alia, working with the Defense Technology Security Administration (DTSA) and the Office of Science and Technology to instate standards for procurement, development and deployment of AI technologies. Considering the sizable impact malfunction, misuse, or malicious use of military or national security AI systems could entail, such standards should be at least as rigorous in assessing and mitigating potential risks as those developed for civilian AI applications.

2. Addressing AI RMF gaps   

NIST should provide guidance on identifying unacceptable risks. The AI risk management framework lacks guidance on tolerable risk thresholds. As a result, developers of potentially dangerous AI systems can remain in compliance with the AI RMF despite failure to meaningfully mitigate substantial risks, so long as they document identification of the risk and determine that risk to be acceptable to them. Accordingly, companies can interpret risk solely in terms of their interests – tolerable risk may be construed as risks that are tolerable for the developer, even if those risks are unacceptable to other affected parties. The ability to make internal determinations of tolerable risk without a framework for evaluating externalities overlooks the potential impact on government, individuals, and society. NIST should revise the AI RMF, introducing criteria for determining tolerable risk thresholds. This revision should incorporate evaluations of risk to individuals, communities, and society at each stage of the assessment process, and these revisions should be applied to all relevant companion resources.

Enforcement mechanisms and structural incentives are necessary. While industries may voluntarily adopt NIST standards, we cannot rely on AI companies to continue to self-regulate. The significance of these standards warrants explicit commitment through structured incentives and enforcement measures. To encourage the adoption of these standards, NIST should offer independent evaluation of systems and practices for compliance with their framework, provide feedback, and provide compliant parties with a certificate of accreditation that can demonstrate good faith and strengthen credibility with the public and other stakeholders.

Guidelines must set clear red-lines to halt or remediate projects. NIST should internally define minimum red-lines and encourage AI companies to predetermine additional red-lines for each assessment. Failure to stay within these limits should prevent the project from progressing or mandate remediation. Red-lines should encompass material risks of catastrophic harm and significant risks related to the ease and scale of misinformation, disinformation, fraud, and objectionable content like child sexual abuse material and defamatory media. Such predetermined, explicit thresholds for halting a project or taking remediation efforts will prevent movement of safety and ethical goalposts in the face of potential profits by companies, increasing the practical impact of the AI RMF’s extensive guidance on assessment of risk.

3. AI developer responsibility  

The NIST companion resource for generative AI should define clear roles for developers, deployers, and end-users in the assessment process. All of these parties should take steps to mitigate risks to the extent possible, but the role of the developer in proactively identifying, addressing, and continuously monitoring potential risks throughout the lifecycle of the AI system is paramount. This should include (but is not limited to) implementing robust risk mitigation strategies, regularly updating the system to address new vulnerabilities, and transparently communicating with deployers and end-users about the limitations and safe usage guidelines of the system.

Compared to downstream entities, developers have the most comprehensive understanding of how a system was trained, its behavior, implemented safeguards, architectural details, and potential vulnerabilities. This information is often withheld from the public for security or intellectual property reasons, significantly limiting the ability of deployers and end-users to understand the risks these systems may present. For this reason, deployers and end-users cannot be reasonably expected to anticipate, mitigate, or compensate harms to the extent that developers can.

Having developers implement safety and security by design, and thus mitigate risks at the outset prior to distribution, is more cost-effective, as the responsibility for the most intensive assessment and risk mitigation falls primarily on the handful of major companies developing advanced systems, rather than imposing these requirements on the more numerous, often resource-limited deployers. This upstream approach to risk mitigation also simplifies oversight, as monitoring a smaller group of developers is more manageable than overseeing the larger population of deployers and end-users. Furthermore, the ability of generative AI to trivialize and scale the proliferation of content makes dealing with the issue primarily at the level of the end-user infeasible and may also necessitate more privacy-invasive surveillance to implement effectively.

Developer responsibility does not fully exempt deployers or end-users from liability in cases of intentional misuse or harmful modifications of the system. A framework including strict, joint and several liability, which holds all parties in the value chain accountable within their respective liability scopes, is appropriate. Failure by a developer to design a system with sufficient safeguards that cannot be easily circumvented should be considered akin to producing and distributing an inherently unsafe or defective product.

4. External red-teaming of dual-use foundation models  

External red-teaming should be considered a best practice for AI safety. While many AI developers currently hire external teams with specialized knowledge to test their products, relying solely on developers to select these teams is insufficient due to inadequate standardization, conflicts of interest, and lack of expertise.

Ideally, the government would establish the capacity to serve in this role. However, in situations where government-led red-teaming is not feasible, alternative mechanisms must be in place. NIST should move to establish criteria to assess external auditors for their expertise and independence.2 These mechanisms could be implemented as an official certification displayed on the product’s website, signifying that the model has passed testing by an approved entity. This approach not only enhances safety but also fosters transparency and public trust.

Ensuring comprehensive safety assessments requires red-teams to have access to the exact model intended for deployment, along with detailed information on implemented safeguards and internal red-teaming results. External testers are typically given “black-box” access to AI models via an API.3 While fine-tuning can still be supported via API access, this approach at least somewhat limits their testing abilities to prompting the system and observing its outputs. While this is a necessary part of the assessment process, it is not sufficient and has been shown to be unreliable in various ways.4 Conversely, structured access provides testers with information that allows them to execute stronger, more comprehensive adversarial attacks.5 Many companies oppose providing complete access to their models due to concerns about intellectual property and security leaks. To mitigate these concerns, we recommend that NIST establish physical and contractual standards and protocols to enable secure model access, such as on-site testing environments and nondisclosure agreements. To ensure that external auditors are conducting tests in accordance with these standards and practices, the audits should be conducted by the government or other approved entities.

Red-teams should be afforded ample time, resources, and access for comprehensive testing. A multi-stage red-teaming process including data, pre-training, model, system, deployment, and post-deployment phases is needed. Access to training data, for example, could foster transparency and enable pathways for the enforcement of copyright law. Furthermore, developers should be encouraged to proactively engage with deployers to understand the use-cases of their products and inform external auditors so that they may tailor their testing strategies effectively.

Finally, AI companies should be encouraged to establish mechanisms for the continuous identification and reporting of vulnerabilities post-deployment. Many companies have created pipelines for these processes.6,7 NIST should consider providing guidelines to encourage consistency and standardization.

5. Safety limitations of AI models with widely available model weights  

The NIST companion resource on generative AI should include recommendations on evaluating the risks of releasing and developing models with widely available model weights. With current technologies and architectures, removing safeguards from AI models with widely available model weights through fine-tuning is relatively trivial.8 This makes it intractable to set or enforce guidelines for developers who build on open-source models. This ease of removal has enabled the proliferation of harmful synthetic materials.9 

6. Inclusion of developer information in synthetic content  

Embedded information on synthetic content should include information about the developer and system of origin. Much attention has been paid in recent months to the potential for synthetic content to contribute to the spread of mis- and disinformation and non-consensual sexual imagery. The proliferation of synthetic content also carries significant national security risks, including the use of synthetic blackmail or spearphishing against high-ranking officials and the creation of fake intelligence, which could introduce serious vulnerabilities. Some generative AI systems may lack sufficient safeguards, making them more prone to these malicious uses, but detecting these vulnerabilities and holding their developers accountable for rectifying them is at present extremely challenging.

Labeling and watermarking techniques have been proposed as one possible method for verifying the authenticity or synthetic nature of content, and Section 4.5(a) of the EO tasks the Department of Commerce with developing or identifying existing tools, standards, methods, practices, and techniques for detecting, labeling, and authenticating synthetic content. We recommend that standards for watermarking or other embedded information should include information detailing the developer and system of origin. Such measures would incentivize developers to prioritize safety from the design phase, facilitate identification of systems especially vulnerable to creation of untoward content, and streamline the identification and tracking of problematic synthetic content back to its creators to impose liability for harms where appropriate. Given the stakes of the issues raised by synthetic content, the emphasis on safety and accountability should take precedence over concerns about the economic feasibility of implementation. That said, any additional economic burden for embedding system and developer of origin information would likely be negligible relative to embedding information relating to the authenticity of the content alone.
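As a simplified sketch of the kind of record such a standard could require (the field, developer, and model names below are hypothetical; a real deployment would bind the record to the content through a signed manifest or robust watermark under a standard such as C2PA rather than detached metadata), developer and system-of-origin information would travel with the content itself:

```python
import hashlib
import json
from datetime import datetime, timezone

def provenance_record(content: bytes, developer: str, system_of_origin: str) -> dict:
    """Hypothetical provenance manifest binding developer and system-of-origin
    information to a hash of the generated content."""
    return {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "developer": developer,                # entity that trained and operates the model
        "system_of_origin": system_of_origin,  # specific model and version that produced the content
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "synthetic": True,
    }

# Example with placeholder content and invented names:
record = provenance_record(b"<generated image bytes>", "ExampleAI Inc.", "examplegen-v3")
print(json.dumps(record, indent=2))
# In practice the developer would cryptographically sign this record and embed it
# (or a watermark referencing it) so that it survives redistribution of the content.
```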

7. Definition of “dual-use foundation models”  

The EO defines “dual-use foundation model” to mean “an AI model that is trained on broad data; generally uses self-supervision; contains at least tens of billions of parameters; is applicable across a wide range of contexts; and that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters, such as by:

(i) substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons;

(ii) enabling powerful offensive cyber operations through automated vulnerability discovery and exploitation against a wide range of potential targets of cyber attacks; or

(iii) permitting the evasion of human control or oversight through means of deception or obfuscation.”

It should be noted, however, that the broad general purpose capabilities of “foundation” models inherently render them dual-use technologies. These models can often possess latent or unanticipated capabilities, or be used in unanticipated ways that present substantial risk, even if they do not obviously exhibit performance that poses “a serious risk to security, national economic security, national public health or safety, or any combination of those matters” upon initial observation. Furthermore, models that are not developed in accordance with the described characteristics (i.e. trained on broad data, generally using self-supervision, containing at least tens of billions of parameters, and applicable across a wide range of contexts) that exhibit, or can be easily modified to exhibit, high levels of performance at tasks that pose those serious risks should nonetheless be considered dual-use. Novel architectures for AI systems that can be trained on more specialized datasets or can effectively use fewer parameters, for instance, should fall under the definition if it is evident that they can pose serious risks to national security and public health. Models of this inherently risky architecture AND models that pose an evident risk to security and/or health should be subject to guidance and rigorous safety standards developed by NIST and other agencies pursuant to the EO and beyond.

A slight modification to the EO’s definition of “dual-use foundation models,” as follows, could accommodate this more inclusive concept of dual-use to appropriately scope NIST’s guidance for ensuring the safety of AI systems:

“[…] an AI model that is trained on broad data; generally uses self-supervision; contains at least tens of billions of parameters; is applicable across a wide range of contexts; or [replacing ‘and’] that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters, such…”
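To illustrate the practical effect of that single-word change, the sketch below (with hypothetical criteria flags for an imagined model; none of the values are drawn from the EO text) shows how the conjunctive reading leaves a risky but architecturally atypical model out of scope, while the disjunctive reading covers it:

```python
# Hypothetical flags for a model under evaluation.
architectural_profile = {
    "trained_on_broad_data": False,           # e.g., trained on a specialized dataset
    "uses_self_supervision": True,
    "tens_of_billions_of_parameters": False,  # e.g., a smaller but highly capable model
    "wide_range_of_contexts": True,
}
serious_risk_capability = True  # exhibits, or can easily be modified to exhibit, risky performance

# EO's conjunctive ("and") reading: the full architectural profile AND risky capability.
in_scope_conjunctive = all(architectural_profile.values()) and serious_risk_capability
# -> False: the risky but non-standard model falls outside the definition.

# Proposed disjunctive ("or") reading: the architectural profile OR risky capability.
in_scope_disjunctive = all(architectural_profile.values()) or serious_risk_capability
# -> True: the same model is covered.

print(in_scope_conjunctive, in_scope_disjunctive)
```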

8. Global engagement and global military use-cases  

Inclusion of strategic competitors: While we welcome efforts through NIST as outlined in Sec. 11 of the EO to advance global technical standards for AI development, we are concerned that engagement on this issue is restricted to ‘key international allies and partners’. Cognizant of political realities, we ask that NIST also engage with strategic competitors on global technical standards, in particular those states which are considered to be leaders in AI development, such as the PRC. Without engaging with these strategic competitors, any global standards developed will suffer from a lack of enforcement and global legitimacy. Conversely, standards developed in cooperation with strategic competitors would likely strengthen the legitimacy and enforcement potential of technical standards. Moreover, it is in the United States’ national security interests for adversaries’ AI to behave more reliably and predictably, and for these systems to remain under proper human control, rather than malfunctioning to escalate situations without human intent or otherwise causing substantial harm that could diffuse beyond their borders.

The exclusion of military AI use-cases will hinder progress on developing global technical standards generally: As the EO outlines, developing global technical standards on civilian AI development and deployment is vital to reaching a global agreement on use of AI. However, considering the blurry boundary between AI developed and deployed for civilian versus military use, we are concerned that a standards agreement on civilian AI alone will likely be difficult without discussing basic guardrails regarding military development and use of AI. This is because with the most advanced AI systems, distinguishing between military and civilian use cases is becoming and will continue to become increasingly difficult, especially considering their general-purpose nature. Mistrust regarding military AI endeavors is likely to impede the international cooperation necessary to ensure global safety in a world with powerful AI systems, including in civilian domains. Adopting basic domestic safety standards for military use of AI, as recommended in #1 (“Military and national security use-cases”), would reduce the risk of catastrophic failure of military systems and inadvertent escalation between strategic competitors, encourage international adoption of military AI safety and security standards, and foster the trust necessary to encourage broader civilian global AI standards adoption. Hence, we reiterate the request that NIST work actively with the Department of State, the Assistant to the President for National Security and other relevant actors as specified in Section 11, to clarify how its AI safety and security standards can be applied in the military context, especially with respect to models that meet the EO definition of ‘dual-use foundation models’.

Closing Remarks

We appreciate the efforts of NIST to thoughtfully and comprehensively carry out its obligations under the AI EO and are grateful for the opportunity to contribute to this important effort. We hope to continue engaging with this project and subsequent projects seeking to ensure AI does not jeopardize the continued safety, security, and wellbeing of the United States.


↩ 1 Public Safety – National Security Standards. National Institute of Standards and Technology. Accessed at: https://www.nist.gov/national-security-standards 

↩ 2 Inioluwa Deborah Raji, Peggy Xu, Colleen Honigsberg, and Daniel Ho. 2022. Outsider Oversight: Designing a Third Party Audit Ecosystem for AI Governance. In Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society (AIES ’22). Association for Computing Machinery, New York, NY, USA, 557–571. https://doi.org/10.1145/3514094.3534181

↩ 3 METR. (March 17, 2023). Update on ARC’s recent eval efforts. Model Evaluation and Threat Research.

↩ 4 Casper, S., Ezell, C., Siegmann, C., Kolt, N., Curtis, T. L., Bucknall, B., …and Hadfield-Menell, D. (2024). Black-Box Access is Insufficient for Rigorous AI Audits. arXiv preprint arXiv:2401.14446.

↩ 5 Bucknall, B. S., and Trager, R. F. (2023). Structured Access for Third-party Research on Frontier AI Models: Investigating Researchers’ Model Access Requirements. Oxford Martin School AI Governance Initiative.

↩ 6 Company Announcement. (July, 2023). Frontier Threats Red Teaming for AI Safety. Anthropic. 

↩ 7 Blog. OpenAI Red Teaming Network. OpenAI.

↩ 8 Qi, X., Zeng, Y., Xie, T., Chen, P. Y., Jia, R., Mittal, P., & Henderson, P. (2023). Fine-tuning aligned language models compromises safety, even when users do not intend to!. arXiv preprint arXiv:2310.03693. Accessed at: https://arxiv.org/abs/2310.03693

↩ 9 Weiss, B. and Sternlicht, A. (January 8, 2024). Meta and OpenAI have spawned a wave of AI sex companions—and some of them are children.

FLI Response to Bureau of Industry and Security (BIS): Request for Comments on Implementation of Additional Export Controls https://futureoflife.org/document/fli-response-to-bureau-of-industry-and-security-bis-request-for-comments-on-implementation-of-additional-export-controls/ Wed, 21 Feb 2024 10:45:06 +0000 https://futureoflife.org/?post_type=document&p=122420 Request for Comments on Implementation of Additional Export Controls: Certain Advanced Computing Items; Supercomputer and Semiconductor End Use (RIN 0694–AI94)

Organization: Future of Life Institute

Point of Contact: Hamza Tariq Chaudhry, US Policy Specialist. hamza@futureoflife.org 

We would like to thank the Bureau of Industry and Security for the opportunity to provide comments on the October 7 Interim Final Rule (IFR), or the Rule on ‘Implementation of Additional Export Controls: Certain Advanced Computing Items; Supercomputer and Semiconductor End Use’ (hereafter referred to as ‘AC/S IFR’). The Future of Life Institute (FLI) has a long-standing tradition of work on AI governance to mitigate the risks and maximize the benefits of artificial intelligence. For the remainder of this Request for Comment (RfC) document, we provide a brief summary of our organization’s work in this space, followed by comments on the AC/S IFR. Our primary comment responds to the RfC on developing technical solutions to exempt items otherwise classified under ECCNs 3A090 and 4A090, and recommends a pilot program for a technical solution. The comment includes arguments for how the pilot program could help improve BIS export controls and mitigate threats to US economic and national-security interests. In the final section, we offer general comments to the AC/S IFR.

We look forward to continuing this correspondence and to serving as a resource for BIS efforts pertaining to AI in the months and years to come.

About the Organization

The Future of Life Institute (FLI) is an independent nonprofit organization with the goal of reducing large-scale risks and steering transformative technologies to benefit humanity, with a particular focus on artificial intelligence. Since its founding ten years ago, FLI has taken a leading role in advancing key disciplines such as AI governance, AI safety, and trustworthy and responsible AI, and is widely considered to be among the first civil society actors focused on these issues. FLI was responsible for convening the first major conference on AI safety in Puerto Rico in 2015, and for publishing the Asilomar AI principles, one of the earliest and most influential frameworks for the governance of artificial intelligence, in 2017. FLI is the UN Secretary General’s designated civil society organization for recommendations on the governance of AI and has played a central role in deliberations regarding the EU AI Act’s treatment of risks from AI. FLI has also worked actively within the United States on legislation and executive directives concerning AI. Members of our team have contributed extensive feedback to the development of the NIST AI Risk Management Framework, testified at Senate AI Insight Forums, participated in the UK AI Summit, and connected leading experts in the policy and technical domains to policymakers across the US government.

FLI’s wide-ranging work on artificial intelligence and beyond can be found at www.futureoflife.org.


Primary Comment on Hardware Governance

On the Request for Comment on Developing technical solutions to exempt items otherwise classified under ECCNs 3A090 and 4A090.

We welcome the request for technical solutions on this issue. FLI has recently been involved in multiple initiatives to create and improve technical solutions for the governance of AI hardware, including semiconductors. In this primary comment, we offer arguments in favor of technical solutions for hardware governance, and introduce a new project from FLI which seeks to improve on-chip governance.

Arguments for Technical Solutions for Hardware Governance

Technical solutions for hardware governance, and specifically chip governance, offer many benefits that can supplement top-down export controls as currently implemented by BIS.

Generic Export Controls Are More Vulnerable to Enforcement Gaps than Hardware Governance

Export controls, especially those with a wide and expansive purview, are likely to suffer from serious gaps in enforcement. A growing informal economy around chip smuggling has already emerged over the last few years, and it is likely to expand as BIS rules become more expansive. A solution focused on hardware governance is less susceptible to this gap in enforcement.

Hardware Governance as a Less Blunt Instrument, Less Likely to Hurt US Economic Interests

Export controls most directly target state actors, leading to a conflation of ‘actor’ with ‘application’ that may foreclose benefits and exacerbate risks to United States interests. Broadly applied export controls targeted at the People’s Republic of China (PRC), for instance, do not distinguish between harmless and harmful use cases within the PRC, the former of which can be economically beneficial to the United States and reduce geo-strategic escalation. Relaxing restrictions on chip exports to demonstrably low-risk customers in China, for example, helps drive the economic competitiveness of US firms. These economic benefits are integral to guaranteeing continued US leadership at the technological frontier, and help preserve global stability. Hardware governance, a more targeted instrument, side-steps these issues with export controls by focusing on applications as opposed to actors.

Hardware Governance is Privacy-Preserving and Compatible with Existing Chip Technology

New and innovative hardware governance solutions are completely compatible with the current state-of-the-art chips sold by leading manufacturers. All relevant hardware (H100s, A100s, TPUs, etc.) has some form of “trusted platform module” (TPM), a hardware device that generates random numbers, holds encryption keys, and interfaces with other hardware modules to provide security. Some new hardware (H100s in particular) has an additional hardware “secure enclave” capability, which prevents access to chosen sections of memory at the hardware level. TPMs and secure enclaves already serve to prevent iPhones from being “jailbroken,” and to secure biometric and other highly sensitive information in modern phones and laptops. Hence, a technical solution to hardware governance would not impose serious costs on leading chip companies to modify the architecture of chips currently in inventory or in production. Critically, as the project described below demonstrates, it is possible to use these technical solutions without creating back-channels that would harm the privacy of end-users of the chip supply chain. Accordingly, hardware governance solutions such as the one proposed below are less likely to face resistance to implementation from concerned parties.
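To make the attestation pattern behind these modules concrete, the minimal sketch below (purely illustrative, not tied to any vendor's TPM or enclave API; the key and function names are invented) shows the core idea: the device signs a measurement of the code it runs with a key that never leaves the chip, and a verifier checks that signature against the expected workload.

```python
# Illustrative sketch only: real TPMs and secure enclaves expose vendor-specific
# attestation APIs; here an HMAC over a code measurement stands in for a
# hardware-signed attestation quote.
import hashlib
import hmac

DEVICE_KEY = b"device-endorsement-key"  # hypothetical; fused into the chip in practice

def measure(code: bytes) -> bytes:
    """Hash (measurement) of the code the device is about to run."""
    return hashlib.sha256(code).digest()

def attest(code: bytes) -> bytes:
    """Device-side: sign the measurement with a key that never leaves the chip."""
    return hmac.new(DEVICE_KEY, measure(code), hashlib.sha256).digest()

def verify(code: bytes, quote: bytes) -> bool:
    """Verifier-side: confirm the device is running exactly the expected code."""
    expected = hmac.new(DEVICE_KEY, measure(code), hashlib.sha256).digest()
    return hmac.compare_digest(expected, quote)

approved_workload = b"inference server v1.2"
quote = attest(approved_workload)
assert verify(approved_workload, quote)          # expected code -> accepted
assert not verify(b"tampered workload", quote)   # modified code -> rejected
```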

Technical Project – Secure Hardware Solutions for Safe AI Deployment

Background

Modern techniques in cryptography and secure hardware technology provide the building blocks to create verifiable systems that can enforce AI governance policies. For example, an unfalsifiable cryptographic proof can be created to attest that a model comes from the application of a specific code on a specific dataset. This could prevent copyright issues, or prove that a certain number of training epochs were carried out for a given model, verifying whether a compute threshold has or has not been breached. The field of secure hardware has been evolving and has reached a stage where it can be used in production to make AI safer. While initially developed for users’ devices (e.g. Apple’s use of secure enclaves to securely store and process biometric data on iPhones), large server-side processors have become mature enough to tackle securely governed AI workloads.
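As a hedged illustration of this kind of provenance attestation (not FLI's or Mithril Security's actual implementation; the record format and threshold are invented), the sketch below commits to the code, dataset, and epoch count of a training run so a verifier can later confirm what was trained and whether a hypothetical compute threshold was crossed.

```python
# Minimal, assumption-laden sketch: commit to the code, dataset, and epoch count
# of a training run so the commitment can later be checked against a policy.
import hashlib
import json

def commitment(code: bytes, dataset: bytes, epochs: int) -> str:
    record = {
        "code_hash": hashlib.sha256(code).hexdigest(),
        "data_hash": hashlib.sha256(dataset).hexdigest(),
        "epochs": epochs,
    }
    # In a real system this record would be signed inside secure hardware.
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def within_threshold(epochs: int, flops_per_epoch: float, threshold_flops: float) -> bool:
    """Check a (hypothetical) compute threshold from attested training metadata."""
    return epochs * flops_per_epoch <= threshold_flops

c = commitment(b"train.py", b"corpus-v3", epochs=10)
print(c[:16], within_threshold(10, 1e22, 1e26))  # True: under the hypothetical threshold
```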

While recent cutting-edge AI hardware, such as Intel Xeon with Intel SGX or Nvidia H100s with Confidential Computing, possesses the hardware features to implement technical mechanisms for AI governance, few projects have emerged to leverage them to build AI governance tooling. The Future of Life Institute has partnered with Mithril Security, a startup pioneering the use of secure hardware with enclave-based solutions for trustworthy AI. This collaboration aims to demonstrate how AI governance policies can be enforced with cryptographic guarantees. In our first joint project, we created a proof-of-concept demonstration of confidential inference. We provide details of this work here because a crucial step to potential adoption of these mechanisms is demonstration that various use cases are practical using current technology.

Description of Project

Consider here two parties:

  1. an AI custodian with a powerful AI model
  2. an AI borrower who wants to run the model on their infrastructure but is not to be trusted with the weights directly

The AI custodian wants technical guarantees that:

  1. the model weights are not directly accessible to the AI borrower.
  2. trustable telemetry is provided to know how much computing is being done.
  3. a non-removable off-switch button can be used to shut down inference if necessary.

Current AI deployment solutions, where the model is shipped on the AI borrower’s infrastructure, provide no IP protection, and it is trivial for the AI borrower to extract the weights without the custodian’s awareness.

Through this collaboration, we have developed a framework for packaging and deploying models in an enclave using Intel secure hardware. This enables the AI custodian to lease a model, deployed on the infrastructure of the AI borrower, while the hardware guarantees that the weights are protected and that the trustable telemetry for consumption and the off-switch are enforced. While this proof-of-concept is not necessarily deployable as is, due to performance limitations (we used Intel CPUs1) and specific hardware attacks that need mitigation, it serves as a demonstration of how secure enclaves can enable collaboration under agreed terms between parties with potentially misaligned interests.
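The toy sketch below illustrates, at a conceptual level only, the three guarantees listed above as they would appear to the custodian; it is not the actual enclave framework, and the class and method names are invented for illustration.

```python
# Conceptual sketch of the guarantees described above (not the actual
# FLI/Mithril enclave code): weights stay inside the protected object, every
# call increments telemetry for the custodian, and a custodian-set off-switch
# halts inference.
class EnclaveModel:
    def __init__(self, weights):
        self._weights = weights        # never exported outside the "enclave"
        self.tokens_served = 0         # trustable telemetry for the custodian
        self._killed = False

    def off_switch(self, custodian_signature: bool):
        # In practice this would require a signed command verified in hardware.
        if custodian_signature:
            self._killed = True

    def infer(self, prompt: str) -> str:
        if self._killed:
            raise RuntimeError("inference disabled by custodian")
        self.tokens_served += len(prompt.split())
        return f"<output for: {prompt!r}>"   # placeholder for the real model

model = EnclaveModel(weights=[0.1, 0.2])
print(model.infer("hello world"), model.tokens_served)
model.off_switch(custodian_signature=True)
# model.infer("again")  # would now raise: the custodian has shut down inference
```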

By building upon this work, one can imagine how the US could lease its advanced AI models to allied countries while ensuring the model’s IP is protected and the ally’s data remains confidential and not exposed to the model provider. By developing and evaluating frameworks for hardware-backed AI governance, FLI and Mithril hope to encourage the creation and use of such technical measures so that we can keep AI safe without compromising the interests of AI providers, users, or regulators.

Future Projects Planned

Many other capabilities are possible, and we plan to roll out demos and analyses of more technical governance approaches in the coming months. The topic of BIS’s solicitation is one such approach: hardware could require remote approval if it identifies as part of a cluster satisfying some set of properties, including size, interconnection throughput, and/or certificates of authorization. The objectives of the AC/S IFR could be further achieved through a secure training faculty, whereby an authority metes out the authorized training compute cycles required for large training runs to take place.2 This secure training faculty could include a training monitor, with all ML training runs above a threshold cluster size requiring, by law, licensing and compute training monitoring. In this process, licensing could be mandated via regulation requiring cluster limiting in all GPUs, and a commitment to training monitoring could be required to obtain a license for training.
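A minimal sketch of the metering logic behind such a secure training faculty follows; the license object and cycle accounting are hypothetical illustrations under stated assumptions, not an existing BIS or industry mechanism.

```python
# Hypothetical sketch of the "secure training faculty" idea described above:
# an authority grants a budget of compute cycles, and training steps are
# refused once the budget is exhausted or no license is present.
from typing import Optional

class TrainingLicense:
    def __init__(self, licensee: str, authorized_cycles: int):
        self.licensee = licensee
        self.remaining = authorized_cycles  # metered budget issued by the authority

def run_training_step(lic: Optional[TrainingLicense], cycles_needed: int) -> bool:
    if lic is None:
        return False                   # unlicensed large runs are blocked
    if lic.remaining < cycles_needed:
        return False                   # budget exhausted; re-authorization required
    lic.remaining -= cycles_needed     # consumption that would be reported back
    return True

grant = TrainingLicense("lab-A", authorized_cycles=1_000)
print(run_training_step(grant, 600))   # True: within the authorized budget
print(run_training_step(grant, 600))   # False: only 400 cycles remain
```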

Many of these solutions can be implemented on existing and widely deployed hardware to allow AI compute governance to be backed by hardware measures. This addresses concerns that compute governance mechanisms are unenforceable or enforceable only with intrusive surveillance. The security of these measures needs testing and improvement for some scenarios, and we hope these demonstrations, and the utility of hardware-backed AI governance, will encourage both chip-makers and policymakers to include more and better versions of such security measures in upcoming hardware. Thus, while enforcement would initially rely heavily on export restrictions and the cooperation of data centers and cloud providers, on-chip mechanisms could in principle eventually carry the lion’s share of the responsibility for enforcement.

In this spirit, we recommend that: 

(a) BIS consider requiring the more robust secure enclaves on advanced ICs, rather than just TPMs, which can serve similar functions less robustly.

(b) BIS encourage and support engagement with chipmakers and other technical experts to audit, test, and improve security levels of hardware security measures.

We welcome more engagement and collaboration with BIS on this front.


General Comments

#1 – The Inclusion of Civil Society Groups in Input for BIS Rules

In its responses to comments on the IFR, BIS has made clear that the input of Technical Advisory Committees (TACs) is an important aspect of deliberating and instituting new rules on export controls. It is also clear that the BIS annual report allows industry players to offer feedback on export control trade-offs for semiconductors, as outlined in ECRA Sections 1765 and 1752, and on national security issues under Sections 4812 and 4811. However, it is not evident whether civil society actors have the same opportunities for comment and input, aside from this specific Request for Comment. There is now a significant and diverse set of AI policy groups in the civil society ecosystem, populated – as in the case of FLI – by some of the world’s leading experts from academia, government and industry. These actors possess a vital viewpoint to share on export control beyond the perspectives typically shared by industry. We invite BIS to clarify and make explicit the requirement for considerable input from civil society actors when it comes to the opportunities listed above, and those in the years to come.

#2 Clarifying Rule Applying to Those States which Facilitate Third Party WMD Activities

We commend the actions outlined within the AC/S IFR to ensure that export controls facilitate the restriction of weapons of mass destruction (WMD) related activities. FLI has published multiple reports on cyber, nuclear, chemical, and biological risks that intersect with the development of advanced AI systems. However, this risk does not emanate from state actors alone. In fact, several reports published over the last year demonstrate these same threats from non-state actors. We invite BIS to clarify that the IFR applies both to states listed in Country Group D:5 (and elsewhere) that use semiconductor technology for indigenous WMD-related activities, and to those that are liable to share these technologies with allied and sponsored non-state actors, which in turn could use them to further WMD activities.

#3 Preventing a Chilling Effect on Friendly US-China Cooperation

We support BIS’s clarification of its position on § 744.6 in light of concerns that an overreach of the AC/S IFR might have a chilling effect on academic and corporate cooperation between Chinese and American persons and entities, cooperation which may in fact advance the economic and national-security interests of the United States. We ask that BIS expand on this acknowledgement by positively affirming in a separate section that such cooperation is welcome and within the remit of the AC/S IFR. The absence of clarity over this rule could detrimentally impact the current balance of US-China cooperation, threatening global stability and harming US national-security interests.

#4 On National Security Updates to the IFR

We welcome the BIS decision to introduce a density performance parameter to ensure that less powerful chips cannot be ‘daisy-chained’ into more powerful technologies and hence circumvent the principal purpose of the BIS rule. We also commend the use of a tiered approach when it comes to control of advanced integrated circuits. We hope that BIS takes further account of emerging technological developments in hardware governance. For instance, new and innovative secure training hardware governance mechanisms (see point #5) could be required of IC makers in order to help prevent the training of dual-use models via unauthorized, heterogeneous distributed training.

#5 On addressing access to “development” at an infrastructure as a service (IaaS) provider by customers developing or intending to develop large dual-use AI foundation models with potential capabilities of concern, such as models exceeding certain thresholds of parameter count, training compute, and/or training data.

We welcome discussion on thresholds and potential capabilities of concern with regard to large dual-use foundation models. However, it is important to underscore that there should be explicit authority to change (and likely lower) these thresholds over time. This is because large dual-use AI foundation models at a constant set of thresholds may become more powerful and dangerous due to other factors.

For instance, algorithmic improvements in an AI model may significantly increase dual-use risk even if parameter count and training compute are held constant. In addition, the threshold for training data cannot be merely quantitative; it must also be qualitative – a model trained on a higher-quality or more dangerous (albeit smaller) training dataset can still present capabilities of concern.

Finally, the IFR would benefit from explicit discussion of the unique risk profile for capabilities of concern presented by dual-use AI models with widely available model weights. Models with widely available weights at the same threshold as closed models will likely present greater potential capabilities of concern, as the guardrails on these models are more easily removed (if there are guardrails in the first place) and the models can be fine-tuned, using relatively little compute, to improve specific capabilities of concern.

Closing Remarks

We appreciate the thoughtful approach of BIS to the development of the AC/S IFR and are grateful for the opportunity to contribute to this important effort. We hope to continue engaging with this project and subsequent projects seeking to ensure AI does not jeopardize the continued safety, security, and wellbeing of the United States.


↩ 1 While we used CPUs in this case, a variation of this proof of concept would also work for GPUs, as they also support the Trusted Platform Module (TPM) and secure enclave architectures.

↩ 2 This mechanism also facilitates various future auditability affordances.

Response to CISA Request for Information on Secure by Design AI Software https://futureoflife.org/document/response-to-cisa-request-for-information-on-secure-by-design-ai-software/ Mon, 19 Feb 2024 10:49:12 +0000 https://futureoflife.org/?post_type=document&p=122419 Request for Information (CISA-2023-0027-0001) on “Shifting the Balance of Cybersecurity Risk: Principles and Approaches for Secure by Design Software”

Organization: Future of Life Institute

Point of Contact: Hamza Tariq Chaudhry, US Policy Specialist. hamza@futureoflife.org 

About the Organization

The Future of Life Institute (FLI) is an independent nonprofit organization with the goal of reducing large-scale risks and steering transformative technologies to benefit humanity, with a particular focus on artificial intelligence (AI). Since its founding, FLI has taken a leading role in advancing key disciplines such as AI governance, AI safety, and trustworthy and responsible AI, and is widely considered to be among the first civil society actors focused on these issues. FLI was responsible for convening the first major conference on AI safety in Puerto Rico in 2015, and for publishing the Asilomar AI principles, one of the earliest and most influential frameworks for the governance of artificial intelligence, in 2017. FLI is the UN Secretary General’s designated civil society organization for recommendations on the governance of AI and has played a central role in deliberations regarding the EU AI Act’s treatment of risks from AI. FLI has also worked actively within the United States on legislation and executive directives concerning AI. Members of our team have contributed extensive feedback to the development of the NIST AI Risk Management Framework, testified at Senate AI Insight Forums, participated in the UK AI Summit, and connected leading experts in the policy and technical domains to policymakers across the US government. We thank the Cybersecurity and Infrastructure Security Agency (CISA) for the opportunity to respond to this Request for Information (RfI) on “Shifting the Balance of Cybersecurity Risk: Principles and Approaches for Secure by Design Software.”


Executive Summary

The Future of Life Institute (FLI) has a long-standing tradition of thought leadership on AI governance toward mitigating the risks and maximizing the benefits of AI. As part of this effort, we have undertaken research and policy work focused on the intersection of AI and cybersecurity.

The principles outlined in CISA’s Secure by Design white paper offer a tractable foundation for ensuring the security of traditional software systems. However, as the RfI suggests, there are security considerations unique to AI that are not covered by, or that necessitate reinterpretation of, these principles. Focusing on AI as software, we advocate four core principles (Protect, Prevent, Strengthen, and Standardize) to guide CISA’s actions in ensuring that developers adhere to secure by design principles:

  1. Protect advanced AI models developed in the United States from theft by malicious state and non-state actors, and from manipulation by these actors.
  2. Prevent advanced AI systems from being used to launch AI-powered cyberattacks, both targeted at other kinds of software and also at the AI software itself.
  3. Strengthen requirements that must be met before integrating advanced AI into cyber-defense systems, to ensure that cyber-defenses are not vulnerable to data poisoning, bias and other AI-derived harms.
  4. Standardize ontologies and terminology unique to AI to inform the safe development, deployment, and governance of AI in the context of cybersecurity.

In line with these principles, we offer the following contributions:

  1. Offer a Framework for a ‘Secure By Design‘ Technical Solution for AI systems. The RfI is clear that ‘AI is software and therefore should adhere to secure by design principles.’ Using advanced AI for formal verification and mechanistic interpretability and relying on prior innovations such as cryptographic guarantees, we offer a framework for ‘provably safe’ AI systems, providing necessary conditions to make them secure by design.
  2. Analysis of, and Recommendations to Mitigate, Harms Posed to Software at the Intersection of AI and Cybersecurity. The white paper extensively discusses the complexity of guarding against and responding to software vulnerabilities, and the RfI poses several questions regarding these issues. As advancements in AI have accelerated, the cyber threats posed to software underpinning our digital and physical infrastructure have also increased. In our policy contribution to this effort, we offer analysis of the risks posed to software by AI-cyber threats, alongside recommendations to mitigate them to protect and strengthen software security. This includes recommendations for the national cybersecurity strategy and guidance for the integration of AI in national security systems. 
  3. Ensuring Transparency and Accountability from AI Products: In keeping with a fundamental principle of the Secure by Design framework, which directs developers of software – including AI – to develop ‘safe and secure products’, we offer recommendations to ensure that any development of advanced AI software is transparent and that developers are held accountable for the advanced AI they produce. This includes suggestions for licensing, auditing, and assigning liability for resulting harms.
  4. Foster the Development of Common Ontologies and Terminology: In order for software systems to be safe-by-design, they must be verifiable against technical specifications. However, these technical specifications and their expression through common ontologies have yet to be standardized. We recommend that CISA support the standardization of these ontologies and terms.

1. Technical Framework for ‘Secure By Design’ AI systems

Background – Summary of Research Findings

In September 2023, FLI founder and President Dr. Max Tegmark published a paper on provably safe AI systems, in co-authorship with AI safety pioneer Dr. Steve Omohundro.1 Here, we condense the findings of that paper into a secure by design technical framework for AI systems.

The paper proposes a technical solution to designing secure AI systems by advancing the concept of provably safe AI systems. This framework has five components:

  1. A Provably Compliant System (PCS) is a system (hardware, software, social or any combination thereof) that provably meets certain formal specifications.
  2. Proof-carrying code (PCC) is software that is not only provably compliant, but also carries within it a formal mathematical proof of its compliance, i.e., that executing it will satisfy certain formal specifications. Because of the dramatic improvements in hardware and machine learning (ML), it is now feasible to expand the scope of PCC far beyond its original applications such as type safety, since ML can discover proofs too complex for humans to create.
  3. Provably Compliant Hardware (PCH) is physical hardware the operation of which is governed by a Provable Contract.
  4. Provable Contracts (PC) govern physical actions by using secure hardware to provably check compliance with a formal specification before actions are taken. They are a generalization of blockchain “Smart Contracts”, which use cryptographic guarantees to ensure that specified code is correctly executed to enable blockchain transactions. Provable contracts can control the operation of devices such as drones, robots, GPUs and manufacturing centers. They can ensure safety by checking cryptographic signatures, zero-knowledge proofs, proof-carrying code proofs, etc. for compliance with the specified rules.
  5. Provable Meta-Contracts (PMC) impose formal constraints on the creation or modification of other provable contracts. For example, they might precisely define a voting procedure for updating a contract. Or they might encode requirements that provable contracts obey local laws. At the highest level, a PMC might encode basic human values that all PCs must satisfy.

Taking these components together, provably compliant systems form a natural hierarchy of software and hardware. If a GPU is PCH, then it should be unable to run anything but PCC meeting the GPU’s specifications. As far as software is concerned, PCH guarantees are analogous to immutable laws of physics: the hardware simply cannot run non-compliant code. Moreover, a PCC can often be conveniently factored into a hierarchy of packages, subroutines and functions that have their own compliance proofs. If a provable contract controls the hardware that PCC attempts to run on, it must comply with the specification. Compliance is guaranteed not by fear of sanctions from a court, but because it is provably physically impossible for the system to violate the contract.
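The hierarchy can be illustrated with a toy example: the "hardware" below refuses to execute any code object that does not carry a proof verifying against its specification. A real PCH would run a formal proof checker in silicon; the string comparison here is only a stand-in, and all names are invented for illustration.

```python
# Toy illustration of the PCC/PCH relationship described above: hardware that
# refuses to run code objects lacking a valid compliance proof.
from dataclasses import dataclass

@dataclass
class ProofCarryingCode:
    program: str
    proof: str           # stands in for a machine-checkable compliance proof

def check_proof(spec: str, pcc: ProofCarryingCode) -> bool:
    # Placeholder proof checker; a real PCH would run a formal verifier.
    return pcc.proof == f"proof-that-{pcc.program}-meets-{spec}"

def provably_compliant_hardware(spec: str, pcc: ProofCarryingCode) -> str:
    if not check_proof(spec, pcc):
        raise PermissionError("non-compliant code: hardware will not execute it")
    return f"executed {pcc.program} under spec {spec!r}"

good = ProofCarryingCode("fw-1.0", "proof-that-fw-1.0-meets-no-exfiltration")
print(provably_compliant_hardware("no-exfiltration", good))
# Code without a matching proof would raise PermissionError instead of running.
```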

Implications for Secure by Design AI

Due to the black box nature of AI systems, some AI experts argue that it is nearly impossible to fully secure an AI system through technical means alone.2,3

By applying and building on the research of Dr. Tegmark and Dr. Omohundro, however, developers can build technical components into AI systems that create a pathway to verifiably secure systems. Hence, this line of research serves as proof of concept that securing AI systems by design is technically feasible. Coupled with thoughtful policy mechanisms to strengthen the security of AI systems, we believe this type of technical solution can be effective in ensuring secure by design AI systems. We look forward to engaging with CISA in the future to expand this research project and integrate it with ‘secure by design’ guidance offered by CISA to AI software developers.


2. Problem Analysis and Recommendations to Mitigate Harms Posed to Software at the Intersection of AI and Cybersecurity

Numerous reports have pointed to the ways that AI systems can make it easier for malevolent actors to develop more virulent and disruptive malware, and can lower the barrier of technical expertise necessary for motivated individuals to carry out cyberattacks.4,5 AI systems can also help adversaries automate attacks on cyberspaces, increasing the efficiency, creativity and impact of cyberattacks via novel zero-day exploits (i.e. previously unidentified vulnerabilities), targeting critical infrastructure, better automating penetration scans and exploits, and enhancing techniques such as phishing and ransomware. As AI systems are increasingly empowered to plan and execute self-selected tasks to achieve assigned objectives, we can also expect to see the emergence of autonomous hacking activities initiated by these systems in the near future. All of these developments have changed the threat landscape for software vulnerabilities. This policy contribution first summarizes these threats, and then provides recommendations that could help companies, government entities and other actors protect their software.

Threat Analysis

  1. Threat to Software Underpinning Critical Infrastructure. An increasing proportion of US critical infrastructure, including components relevant to health (e.g. hospital systems), utilities (e.g. heating, electrical supply and water supply), telecommunications, finance, and defense, is now ‘on the grid’ – reliant on integrated online software – leaving these systems vulnerable to potential cyberattacks by malicious actors. Such an attack could, for instance, shut off the power supply of entire cities, access high-value confidential financial or security information, or disable telecommunications networks. AI systems are increasingly demonstrating success in exploiting such vulnerabilities in the software underpinning critical infrastructure.6 Crucially, the barrier to entry, i.e. the level of skill necessary for conducting such an attack, is considerably lower with AI than without it, increasing threats from non-state actors and the number and breadth of possible attempts that may occur. Patching these vulnerabilities once they have been exploited takes time, which means that painful and lasting damage may be inflicted before the problem is remedied.
  2. Cyber-vulnerabilities in Labs Developing Advanced AI Software. As the RfI outlines, there is a need to ensure that AI is protected from vulnerabilities just as is the case with traditional software. The “Secure by Design” white paper advocates for software developers to “take ownership of their customer’s security outcomes.” This responsibility should also apply to AI developers, compelling them to address AI-specific cyber vulnerabilities that affect both product safety for customers and wider societal concerns. The most advanced AI systems in the world – primarily being developed in the United States – are very likely to be targeted by malicious state and non-state actors to access vital design information (e.g., the model weights underpinning the most advanced large language models). Because developing these systems is resource intensive and technically complex, strategic competitors and adversaries may instead steal these technologies without taking the considerable effort to innovate and develop them, damaging U.S. competitiveness and exacerbating risks from malicious use. Once model weights are obtained, these actors could relatively easily remove the safeguards from these powerful models, which normally protect against access to dangerous information such as how to develop WMDs. Several top cybersecurity experts have expressed concerns that the top AI labs are ill-equipped to protect these critical technologies from cyber-attacks.
  3. Integration of Opaque, Unpredictable and Unreliable AI-Enabled Cybersecurity Systems. Partly to guard against exploitation of vulnerabilities, there has been increasing interest in the potential use of AI systems to enhance cybersecurity and cyber-defense. This comes with its own set of threats, especially with opaque AI systems for which behavior is extremely difficult to predict and explain. Data poisoning – cases where attackers manipulate the data being used to train cyber-AI systems – could lead to systems yielding false positives, failing to detect intrusions, or behaving in unexpected, undesired ways. In addition, the model weights of the systems themselves can be largely inferred or stolen using querying techniques designed to find loopholes in the model. These systems could also autonomously escalate or counter-attack beyond their operators’ intentions, targeting allied systems or risking serious escalations with adversaries.

In summary, software vulnerabilities are under greater threat of covert identification and exploitation due to AI-powered cyberattacks. At the same time, the integration of AI into cybersecurity systems to guard software presents unique threats of its own. Finally, the state-of-the-art AI software being developed at leading labs within the United States is itself under threat from malicious actors.

Recommendations for Threat Mitigation

To mitigate these problems, we propose the following recommendations:

  1. Industry and governmental guidance should focus explicitly on AI-enabled cyber attacks in national cyber strategies: AI goes completely unmentioned in the National Cybersecurity Strategy Implementation Plan published by the White House in July 2023, despite recognition of cyber risks of AI in the National Cybersecurity Strategy itself. AI risks need to be integrated explicitly into a broader cybersecurity posture, including in the DOD Cyber Strategy, the National Cyber Incident Response Plan (NCIRP), the National Cybersecurity Investigative Joint Task Force (NCIJTF) and other relevant plans.
  2. Promulgate Guidance for Minimum Standards for Integration of AI into Cybersecurity Systems and Critical Infrastructure: Integrating unpredictable and vulnerable AI systems into critical cybersecurity systems may create cyber-vulnerabilities of its own. Minimum standards regarding transparency, predictability and robustness of these systems should be set up before they are used for cybersecurity functions in critical industries. Additionally, building on guidance issued in accordance with EO 13636 on Improving Critical Infrastructure Cybersecurity, EO 13800 on Strengthening the Cybersecurity of Federal Networks and Critical Infrastructure, and the Framework for Improving Critical Infrastructure Cybersecurity published by NIST, AI-conscious standards for cybersecurity in critical infrastructure should be developed and enforced. Such binding standards should account in particular for risks from AI-enabled cyber-attacks, and should be developed in coordination with CISA, SRMA and SLTT offices.

3. Ensuring Transparency and Accountability from AI Products

We ask that CISA and DHS consider the following recommendations to guarantee the transparent and accountable development of secure AI. In addition, these recommendations would ensure that developers take responsibility for software security and do not impose unfair costs on consumers, a fundamental principle of the Secure by Design framework. To protect and strengthen AI systems, we recommend that CISA:

  1. Require Advanced AI Developers to Register Large Training Runs and to “Know Their Customers”: The Federal Government lacks a mechanism for tracking the development and proliferation of advanced AI systems, despite there being a clear need expressed by agencies including CISA to guarantee security of AI software. In addition, these advanced AI systems could exacerbate cyber-risk for other kinds of software. In order to mitigate cybersecurity risks, it is essential to know what systems are being developed and what kinds of actors have access to them. Requiring registration for the acquisition of large amounts of computational resources for training advanced AI systems, and for carrying out the training runs themselves, would help with tracking and evaluating possible risks and taking appropriate precautions. “Know Your Customer” requirements, similar to those imposed in the financial services industry, would reduce the risk of powerful AI systems falling into the hands of malicious actors.
  2. Establish, or Support the Establishment of, a Robust Pre-deployment Auditing and Licensure Regime for Advanced AI Systems: In order to ensure the security of AI software, it must first be guaranteed that AI systems do not behave in dangerous and unpredictable ways. Advanced AI systems that can pose risks to cybersecurity, may be integrated into a system’s critical functions, or may be misused for malicious attacks are not presently required to undergo independent assessment for safety, security, and reliability before being deployed. Additionally, there are presently no comprehensive risk assessments for AI systems across their extensive applications and integrations. Requiring licensure before potentially dangerous advanced AI systems are deployed, contingent on credible independent audits for compliance with minimum standards for safety, security, and reliability, would identify and mitigate risks before the systems are released and become more difficult to contain. Audits should include red-teaming to identify cyber-vulnerabilities and to ensure that systems cannot be readily used or modified to threaten cybersecurity.
  3. Clarify Liability for Developers of AI Systems Used in Cyber-attacks: In order to encourage transparency, accountability and generally protect software from AI-powered cyberattacks, it is critical to establish a liability framework for developers of AI systems that could conceivably be used to exploit cyber-vulnerabilities. At present, it is not clear under existing law whether the developers of AI systems used to, e.g., damage or unlawfully access critical infrastructure would be held liable for resulting harms. Absolving developers of liability in these circumstances creates little incentive for profit-driven developers to expend financial resources on precautionary design principles and robust assessment. Because these systems are opaque and can possess unanticipated, emergent capabilities, there is inherent risk in developing systems expected to be used in critical contexts as well as advanced AI systems more generally. Implementing strict liability when these systems facilitate or cause harm would better incentivize developers to take appropriate precautions against cybersecurity vulnerabilities, critical failure, and the risk of use in cyber-attacks.

4. Foster the Development of Common Ontologies and Terminology

The lack of standardized ontologies, terminology, and comprehensive risk management frameworks complicates the security landscape for AI systems, which present novel and amplified challenges compared to traditional software.7 In order for software systems to be safe by design, they must be verifiably compliant with technical specifications, and technical specifications are expressed using ontologies, i.e. graphical schema representing the entity types, properties, relationships, and constraints within one or more domains of knowledge. Furthermore, the general purpose nature of many machine learning systems, which inherently have a wide range of applications, renders the assessment of their risks particularly challenging. To standardize these shared approaches we recommend that CISA:

  1. Encourage and support the development of shared ontologies at the intersection of AI and cybersecurity8: These should be developed within and across industries, government, and nations so that broader and deeper networks of compatible and provable security can more easily flourish. Likewise, development of crosswalks, bridge ontologies, and ontology alignment faculties would also aid such an ecosystem.9 A toy illustration of a crosswalk follows this list.
  2. Support the standardization of terminology relevant to AI and cybersecurity: AI security approaches have often borrowed terms, frameworks, and techniques from related fields like cybersecurity, hardware, and system safety engineering.10 While this can occasionally be appropriate, it often leads to misinterpretations that prevent the effective use of established risk mitigation strategies. Formal definitions for what constitutes, e.g., audits, system requirements and safety requirements should be established within the context of AI and cybersecurity to avoid conflation with other fields and inform downstream management.11
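As referenced in the first recommendation above, the following toy sketch shows what a crosswalk between two AI-security vocabularies could look like; the vocabularies, terms, and function are invented purely for illustration and do not reflect any existing standard.

```python
# Hypothetical illustration only (no standard ontology is implied): a tiny
# crosswalk mapping terms from two different AI-security vocabularies onto a
# shared concept, so tools using either vocabulary can interoperate.
from typing import Optional

SHARED_CONCEPTS = {
    "model-extraction": {"vocab_a": "model theft", "vocab_b": "weight exfiltration"},
    "data-poisoning":   {"vocab_a": "training data manipulation", "vocab_b": "dataset tampering"},
}

def crosswalk(term: str, source: str, target: str) -> Optional[str]:
    """Translate a term from one vocabulary to another via the shared concept."""
    for entry in SHARED_CONCEPTS.values():
        if entry.get(source) == term:
            return entry.get(target)
    return None

print(crosswalk("model theft", "vocab_a", "vocab_b"))  # -> "weight exfiltration"
```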

Closing Remarks

We appreciate the thoughtful approach of CISA to the development of the Secure by Design Software framework and are grateful for the opportunity to contribute to this important effort. We hope to continue engaging with this project and subsequent projects seeking to ensure AI software does not jeopardize the continued safety, security, and wellbeing of the United States.


↩ 1 Max Tegmark and Steve Omohundro. (2023). Provably safe systems: the only path to controllable AGI. arXiv preprint arXiv:2309.01933.

↩ 2 Mike Crapps. (March, 2023). Making AI trustworthy: Can we overcome black-box hallucinations? TechCrunch.

↩ 3 W.J. von Eschenbach. (2021). Transparency and the black box problem: Why we do not trust AI. Philosophy & Technology, 34(4), 1607-1622.

↩ 4 Bécue, A., Praça, I., & Gama, J. (2021). Artificial intelligence, cyber-threats and Industry 4.0: Challenges and opportunities. Artificial Intelligence Review, 54(5), 3849-3886.

↩ 5 Menn, J. (May, 2023). Cybersecurity faces a challenge from artificial intelligence’s rise. Washington Post.

↩ 6 Office of Intelligence and Analysis. Homeland Threat Assessment 2024. Department of Homeland Security.

↩ 7 While the NIST AI RMF may constitute a standardized RMF, we believe it still requires considerable iteration to fill gaps in AI risk management.

↩ 8 A shared ontology – or a shared schematic representation of concepts and terminologies across different contexts – is often developed to help collaborate on workflows. For instance, a shared biomedical ontology could help computer systems and decision-makers collate and analyze information across several different biomedical websites. In this context, it would help different actors working with a wide variety of systems in diverse contexts to effectively cooperate on AI and cybersecurity issues.

↩ 9 Crosswalks effectively function as translators in cases where complex networks of systems and data employ different terminologies and classifications for concepts. Crosswalks provide mappings to allow translation between these different schemes. A bridge ontology can serve a similar function, representing the construction of a bridge between different ontologies. All of these efforts feed into ontology alignment, the practice of ensuring correspondence between concepts in different ontologies.

↩ 10 Heidy Khlaf. (March, 2023). Toward Comprehensive Risk Assessments and Assurance of AI-Based Systems. Trail of Bits.

↩ 11 We commend current efforts in this regard such as the NIST glossary of terms as a starting point. (See Trustworthy and Responsible AI Resource Center. Glossary. NIST) We request that this glossary be expanded and more widely adopted and applied to serve the function of effectively standardizing terminology. CISA can play a critical role here by incorporating interpretations of cybersecurity terms in AI contexts where their meaning may be more ambiguous due to distinctions between AI and traditional software.

FLI AI Liability Directive: Executive Summary https://futureoflife.org/document/fli-ai-liability-directive-executive-summary/ Tue, 28 Nov 2023 20:07:35 +0000 https://futureoflife.org/?post_type=document&p=118812 Executive Summary

FLI position on adapting non-contractual civil liability rules to artificial intelligence (AI Liability Directive)1

Introduction

The Future of Life Institute (FLI) welcomes the opportunity to provide feedback on the European Commission’s proposal to adapt non-contractual civil liability rules to artificial intelligence (AILD). Liability can play a key role in catalysing safe innovation by encouraging the development of risk-mitigation strategies that reduce the likelihood of harm before products or services are deployed into a market. Moreover, an effective liability framework protects consumers’ fundamental rights and can increase their trust in and uptake of new technologies. Safety and liability are intertwined concepts. Keeping AI safe requires a coherent and strong liability framework that guarantees the accountability of AI systems. 

In light of the ongoing AI Act negotiations and the discussions around the adoption of a revised Product Liability Directive (PLD)2, we considered it timely to update the recommendations of our 2022 AI Liability Position Paper.

The European Commission proposal on non-contractual civil liability rules for AI (AILD) establishes a fault-based liability framework for all AI systems, regardless of their risk, under the proposed AI Act. The AILD covers non-contractual fault-based civil liability claims for damages caused by an output, or the absence of an output, from an AI system. A fault-based claim usually requires proof of damage, the fault of a liable person or entity, and the causal link between that fault and the damage. However, AI systems can make it difficult or impossible for victims to gather the evidence required to establish this causal link. The difficulty in gathering evidence and presenting it in an explainable manner to a judge lies at the heart of claimants’ procedural rights. The AILD seeks to help claimants fulfil their burden of proof by requiring disclosure of relevant evidence and by mandating access, under specific circumstances, to defendants’ information regarding high-risk AI systems that can be crucial for establishing and supporting liability claims. It also imposes a rebuttable presumption of causality, establishing a causal link between non-compliance with a duty of care and the AI system output, or failure to produce an output, that gave rise to the damage. This presumption aims to alleviate the burden of proof for claimants. Such a mechanism is distinct from a full reversal of the burden of proof, in which the victim bears no burden and the person presumed liable must prove that the conditions of liability are not fulfilled. Moreover, the AILD specifically addresses the burden of proof in AI-related damage claims, while national laws govern other aspects of civil liability. In this sense, the AILD focuses on the procedural aspects of liability consistent with a minimum harmonisation approach, which allows claimants to invoke more favourable rules under national law (e.g., reversal of the burden of proof). National laws can impose specific obligations to mitigate risks, including additional requirements for users of high-risk AI systems.

The AILD’s shortcomings are hard to overlook.3 It falls short of what is expected of an effective AI liability framework in three crucial aspects. First, it underestimates the black box phenomenon of AI systems and, therefore, the difficulties for claimants, and sometimes defendants, in understanding and obtaining relevant and explainable evidence of the logic involved in self-learning AI systems. This situation is particularly evident for advanced general-purpose AI systems (GPAIS). Second, it fails to distinguish between the requirements for evidential disclosure needed in the case of GPAIS versus other AI systems. In a case involving GPAIS, claimants’ ability to take their cases to court and provide relevant evidence will be severely undermined under a fault-based liability regime. Third, it does not acknowledge the distinct characteristics and potential for systemic risks and immaterial harms stemming from certain AI systems. It is time to acknowledge these shortcomings and work towards enhanced effectiveness (an effective possibility for parties to access facts and adduce evidence in support of their claims) and fairness (implying a proportionate allocation of the burden of proof).

To remedy these points, FLI recommends the following:

I. Strict liability for general-purpose AI systems (GPAIS) to encourage safe innovation by AI providers. 

FLI recommends a strict liability regime for GPAIS. Strict, or no-fault, liability accounts for the knowledge gap between providers, operators of a system, claimants, and the courts. It also addresses the non-reciprocal risks created by AI systems. This model incentivises the development of safer systems and the placement of appropriate guardrails by entities that develop GPAIS (including foundation models), and increases legal certainty. Furthermore, it protects the internal market from unpredictable and large-scale risks.

To clarify the scope of our proposal, it is important to understand that we define GPAIS as “An AI system that can accomplish or be adapted to accomplish a range of distinct tasks, including some for which it was not intentionally and specifically trained.”4 This definition underscores the unique capability of a GPAIS to accomplish tasks beyond its specific training. It is worth noting that the AI Act, in its current stage of negotiation, seems to differentiate between foundation models and GPAIS. For the sake of clarity, we use “general-purpose AI systems” as a future-proof term encompassing the terms “foundation model” and “generative AI”. It provides legal certainty for standalone GPAIS (deployed directly to affected persons) and foundational GPAIS (provided downstream to deployers or other developers). Moreover, we consider in-scope GPAIS to be those above a certain capability threshold, which brings currently deployed AI systems such as Megatron-Turing NLG, Llama 2, OPT-175B, Gopher, PanGu Sigma, AlexaTM, and Falcon into scope, among other examples. Furthermore, GPAIS can be used in high-risk use cases, such as dispatching first response services or recruiting natural persons for a job. Those cases are under the scope of high-risk AI systems. But GPAIS serve a wide range of functions not regulated by Annex III of the AI Act, which presents serious risks; for example, they can be used to develop code or create weapons.5

Given their emergent and unexpected capabilities, unpredictable outputs, potential for instrumental autonomous goal development, and low level of interpretability, GPAIS should be explicitly included in the scope of the AILD. GPAIS opacity challenges the basic goal of legal evidence, which is to provide accurate knowledge that is both fact-dependent and rationally construed.6 This barrier triggers myriad procedural issues in the context of GPAIS that are not resolved by the mechanisms established in Art. 3 and 4 AILD. It also disproportionately disadvantages claimants who need to lift the veil of opacity of GPAIS logic and outputs. Moreover, GPAIS creates non-reciprocal risks even if the desired level of care is attained; only strict liability is sufficient to incentivise a reduction of harmful levels of activity.

There are three compelling reasons for adding strict liability to GPAIS:

  1. Strict liability mitigates informational asymmetries in disclosure rules for cases involving GPAIS, guaranteeing redress and a high level of consumer protection.
  2. The necessary level of care to safely deploy a GPAIS is too complex for the judiciary to determine on a case-by-case basis, leading to a lack of legal certainty for all economic actors in the market.
  3. Disclaimers on liability issues and a lack of adequate information-sharing regimes between upstream and downstream providers place a disproportionate compliance burden on downstream providers and operators using GPAIS.

Recommendations:

  • Specify the wording in Art. 1(1)(a) of the AILD so that the Directive will be applicable to GPAIS, whether or not they would otherwise qualify as high-risk AI systems.
  • Include GPAIS in the definitions in Art. 2 of the AILD, and clearly define GPAIS that will be subject to strict liability.
  • Add a provision to the AILD establishing strict liability for GPAIS.
  • Establish a joint liability scheme between upstream and downstream developers and deployers. In order to ensure consumers are protected, all parties should be held jointly liable when a GPAIS causes damage, with compensation mechanisms allowing the injured person to recover the total relevant damage. This is in line with Art. 11 and 12 of the PLD, and the legislator can draw inspiration from the GDPR and the way responsibilities for controllers and processors of data are allocated.
  • Specify that knowledge of potential harm should be a standard when allocating responsibility to the different links of the value chain, whether the harm has occurred or not. Model cards for AI systems should be regarded as a standard of the knowledge a GPAIS provider has of harms from the deployment of their system, for the purpose of allocating risk.
  • Clearly link the forthcoming AI Act obligations on information sharing to GPAIS in the AILD to mitigate informational asymmetries between (potential) claimants and AI developers.
  • Specify that neither contractual derogations, nor financial ceilings on the liability of an AI corporation providing GPAIS are permitted. The objective of consumer protection would be undermined if it were possible to limit or exclude an economic operator’s liability through contractual provisions. This is in line with Recital 42 of the PLD proposal.

II. Include commercial and non-commercial open-source7 AI systems under the liability framework of the AILD to ensure a strong and effective liability framework.

The term “open source” is being applied to vastly different products without a clear definition.8 The business model of some AI systems labelled as open source is also unclear. Finally, there is no consensus on which elements can be determined to characterise commercial or non-commercial open source in this new regulatory landscape. Open-source AI systems are not directly addressed in the scope of the AILD. However, there are three crucial reasons to include commercial and non-commercial open-source AI systems explicitly under the liability framework of the AILD, regardless of whether they are considered GPAIS or narrow AI systems:

1. Unlike with open-source software, there is no clarity about what “open source” means in the context of AI. This introduces loopholes for unsafe AI systems to be deployed under the banner of ‘open source’ to avoid regulatory scrutiny.

2. Deploying AI systems under an open source license poses irreversible security risks and enables misuse by malicious actors. This compromises the effectiveness and legal certainty of the whole AI liability framework. The decentralised control of open-source systems means that any misuses or unintended consequences that arise will be extremely challenging, if not impossible, for the upstream provider to cut off. There is no clear mechanism to control the open distribution of high-risk capabilities in the case of advanced AI systems and models once they are distributed or deployed.

3. If open-source AI systems are allowed to be deployed in the market without being subject to the same rules as other systems, this would not only create an unequal playing field between economic actors but also deprive the AI liability framework of its effectiveness. It would suffice to be branded open-source to escape liability, which is already a market dominance strategy of some tech behemoths. By going the route of explicitly including all open-source AI systems in the AILD framework, this ex-post framework would contribute indirectly to the enforcement of the AI Act provisions on risk mitigation and the application of sectoral product safety regulation that intersects with the products under the scope of the EU AI Act.

Recommendation:

  • Explicitly include in the scope of the AILD both commercial and non-commercial open-source AI systems.
  • Define the elements to be considered commercial open-source AI systems in collaboration with the open-source AI community to enhance economic operators’ legal certainty. Llama 2 is an example of commercial open source, even though it is mostly not sold and its source code was not released. Therefore, it should be under the scope of the AILD.
  • Carefully review and justify, based on evidence, whether exemptions for open source are needed. If so, explicitly address non-commercial open-source AI system exemptions, in line with other EU law instruments. For example, through licensing agreements, there could be a limited exemption in the liability framework for exclusively academic researchers, so long as they do not proliferate the liability-emitting artefacts to third parties and there are obligations to subject these systems to rigorous physical and cybersecurity access controls to prevent the deliberate or accidental leaking or proliferation of model weights. They should also be subject to external audits, red-teaming, and information-sharing obligations.

III. Establish fault-based liability with a reversed burden of proof for non-general purpose high-risk AI systems.

FLI agrees with the AILD proposal that some high-risk AI systems should fall under a fault-based liability regime. This will be the case for non-general purpose high-risk AI systems.9 However, the presumption of fault should lie with the provider of an AI system. Pursuing this course of action would ease the burden for claimants and increase their access to justice by minimising information asymmetry and transaction costs. Providers of AI systems can rebut this presumption of fault by proving their compliance with and observance of the required level of care, or by showing the lack of a causal link between the output and the damage. Non-compliance liability relies on the AI Act as the “backbone” of AI safety legislation for the liability framework.

As mentioned earlier, several specific characteristics of AI can make it difficult and costly for injured parties to identify and prove the fault of a potentially liable entity in order to receive compensation.10 Harmed individuals are subject to significant information asymmetry with respect to the AI systems they interact with because they may not know which code or input caused harm. The interplay between different systems and components, the multitude of actors involved, and the increasing autonomy of AI systems add to the complexity of proving fault.11 In this case, liability will be placed on the AI provider, the party that can reduce harm at the lowest cost. 

FLI believes that a fault-based liability regime with a reversed burden of proof for non-general purpose high-risk AI systems is a sufficient and balanced approach. Following the risk-based approach of the AI Act, it seems sensible to have less stringent requirements than strict liability for these AI systems, which do not necessarily exhibit the self-learning and autonomous capabilities of GPAIS. Moreover, most of the use cases for these systems are defined narrowly in Annex III and will be subject to rigorous requirements under the AI Act. However, some non-general purpose AI systems might not be captured by Annex III of the AI Act. For this reason, we propose that the liability regime not depend on the high-risk categorisation of the AI Act, but instead have a broader scope to fully capture risks of harm from AI providers and offer claimants an effective possibility of redress.

Recommendation

  • Modify Art. 3 (1) AILD to include a reversed burden of proof for non-general purpose high-risk AI systems.
  • Establish a clear distinction between non-general purpose high-risk AI systems (also sometimes referred to as high-risk narrow AI systems) and GPAIS in the AILD.
  • Create a mechanism that aligns the AI Act regulatory authorities, such as the AI Office, with the liability framework. For example, regulatory authorities under the AI Act could also become a “one-stop shop” for AI providers, potential claimants, and lawyers seeking to obtain evidence on high-risk systems and their compliance with their duty of care under the AI Act. They will be placed advantageously to assess prima facie the level of compliance of a given non-general purpose high-risk AI system and support potential claimants’ evidence requests. This “one-stop-shop” mechanism could mirror some of the features of the mechanisms under GDPR that allow for cross-border enforcement cooperation between data protection authorities.

IV. Protect the fundamental rights of parties injured by AI systems by including systemic harms and immaterial damages in the scope of the AILD.

FLI calls for compensable damages to be harmonised across the EU and to include immaterial and systemic harms. This recommendation is without prejudice to the liability frameworks of EU Member States and the minimum harmonisation approach that the AILD aims to achieve. FLI argues that (a) immaterial and systemic harms stemming from AI systems should fall within the scope of recoverable damages, and (b) to ensure consistent protection of fundamental rights across Member States, immaterial, societal, and systemic harms produced by an AI system should be defined by EU law rather than by national laws.

Addressing “systemic risk” and, by extension, societal-level harms is not a new concept for the EU legislator,12 as it has already been addressed in the context of the Digital Services Act (DSA).13 Some of the risks that AI poses are relatively small or unlikely on a per-incident basis but can aggregate into severe, correlated, and adverse outcomes for specific communities or for society as a whole. Adding a systemic risk dimension to the proposed liability framework in the AILD therefore reflects fundamental rights considerations.

Along with systemic harms, we also propose that immaterial harms (also referred to as “non-material harms” or “non-material damages”) be covered within the scope of the AILD. Immaterial harms refer to harms that are challenging to quantify in monetary terms, as the damage itself is of a “qualitative” nature and not directly related to a person’s physical health, assets, wealth, or income. Covering immaterial harms is necessary to account for the particular nature of damages caused by AI systems, including “loss of privacy, limitations to the right of freedom of expression, human dignity, discrimination, for instance in access to employment.”14 It is reasonable to consider that risks associated with AI systems can quickly scale up and affect an entire society. However, the proposed Directive leaves it up to Member States to define the damages covered. This could mean that a person discriminated against by a credit-scoring AI system could claim damages for such discrimination in one Member State but not in another.

Scholars have also proposed attaching compensation for immaterial harms to a model of non-compliance liability when deployers and operators engage in prohibited or illegal practices under the AI Act.15 This model could fit easily into existing non-discrimination, data protection, and consumer protection legislation. For example, Article 82 of the GDPR16 provides for the liability of a controller or processor where that entity violates its obligations under the GDPR. In this sense, the scope of application for recoverable immaterial damages would not be too broad, countering the idea that including immaterial damages disproportionately broadens liability provisions.

Explicitly including immaterial damages and systemic harms in the recitals and definitions of the AILD would enhance the protective capacity of the framework and solidify the links between the AI Act and the AILD. This is especially notable given that Recital 4 of the AI Act17 explicitly recognises “immaterial” harms posed by AI in both the European Commission and Council texts. The European Parliament’s mandate for the AI Act further highlights immaterial harms, mentioning “societal” harm specifically.18 The Parliament’s resolution already proposed that ‘significant immaterial harm’ should be understood as harm that results in the affected person suffering considerable detriment, an objective and demonstrable impairment of his or her personal interests, and an economic loss.

Recommendation: 

  • Modify Recital 10 AILD to include systemic harms and immaterial damages as recoverable damages.
  • Include a definition of immaterial harm in the AILD based on the AI Act and the European Parliament’s resolution.
  • Include a notion of systemic risk in the AILD based on the DSA.

Notes & references

  1. Proposal for a Directive of the European Parliament and of the Council on adapting non-contractual civil liability rules to artificial intelligence, COM(2022) 496 final, 28.9.2022 (AILD)  ↩︎
  2. For the proposed revision see Proposal for a Regulation of the European Parliament and of the Council on General Product Safety, amending Regulation (EU) No 1025/2012 of the European Parliament and of the Council, and repealing Council Directive 87/357/EEC and Directive 2001/95/EC of the European Parliament and of the Council, COM/2021/346 final (PLD proposal);  For the original text see Council Directive 85/374/EEC of 25 July 1985 on the approximation of the laws, regulations and administrative provisions of the Member States concerning liability for defective products (OJ L 210, 7.8.1985, p. 29).  ↩︎
  3. For a detailed analysis on the main shortcomings of the AILD and its interaction with the PLD framework, see Hacker, Philipp, The European AI Liability Directives – Critique of a Half-Hearted Approach and Lessons for the Future (November 25, 2022). Available at http://dx.doi.org/10.2139/ssrn.4279796  ↩︎
  4. This definition includes unimodal (e.g., GPT-3 and BLOOM) and multimodal (e.g., Stable Diffusion, GPT-4, and DALL-E) systems. It covers systems at different points of the autonomy spectrum, with and without humans in the loop.  ↩︎
  5. These risks have been acknowledged by the Hiroshima process. See OECD (2023), ‘G7 Hiroshima Process on Generative Artificial Intelligence (AI): Towards a G7 Common Understanding on Generative AI’.  ↩︎
  6. For a procedural law perspective on the admissibility of evidence in courts regarding AI systems cases, see Grozdanovski, Ljupcho. (2022). L’agentivité algorithmique, fiction futuriste ou impératif de justice procédurale ?: Réflexions sur l’avenir du régime de responsabilité du fait de produits défectueux dans l’Union européenne. Réseaux. N° 232-233. 99-127. 10.3917/res.232.0099; Grozdanovski, Ljupcho. (2021). In search of effectiveness and fairness in proving algorithmic discrimination in EU law. Common Market Law Review. 58. 99-136. 10.54648/COLA2021005.   ↩︎
  7. For ease of understanding the term “open-source” is used as a colloquial term to refer to models with public model weights. As briefly discussed in this paper, so-called open-source AI systems don’t actually provide many of the benefits traditionally associated with open-source software, such as the ability to audit the source code to understand and predict functionality.  ↩︎
  8. Widder, David Gray and West, Sarah and Whittaker, Meredith, Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI (August 17, 2023). http://dx.doi.org/10.2139/ssrn.4543807  ↩︎
  9. Non-general purpose AI systems are sometimes referred to as high-risk narrow AI systems. As indicated above, GPAIS would be subject to strict liability. ↩︎
  10. Such characteristics are autonomous behaviour, continuous adaptation, limited predictability, and opacity – European Commission (2021), Civil liability – adapting liability rules to the digital age and artificial intelligence, Inception Impact Assessment  ↩︎
  11. Buiten, Miriam and de Streel, Alexandre and Peitz, Martin, EU Liability Rules for the Age of Artificial Intelligence (April 1, 2021). Available at SSRN: https://ssrn.com/abstract=3817520 or http://dx.doi.org/10.2139/ssrn.3817520; Zech, H. Liability for AI: public policy considerations. ERA Forum 22, 147–158 (2021). https://doi.org/10.1007/s12027-020-00648-0↩︎
  12. Interestingly, Recital 12 of the AILD acknowledges systemic risks under the DSA framework.  ↩︎
  13. Regulation (EU) 2022/2065 of the European Parliament and of the Council of 19 October 2022 on a Single Market For Digital Services and amending Directive 2000/31/EC (Digital Services Act) OJ L 277, 27.10.2022, p. 1–102.  ↩︎
  14. European Commission, White Paper On Artificial Intelligence – A European approach to excellence and trust, COM(2020) 65 final.  ↩︎
  15. See Wendehorst, C. (2022). Liability for Artificial Intelligence: The Need to Address Both Safety Risks and Fundamental Rights Risks. In S. Voeneky, P. Kellmeyer, O. Mueller, & W. Burgard (Eds.), The Cambridge Handbook of Responsible Artificial Intelligence: Interdisciplinary Perspectives (Cambridge Law Handbooks, pp. 187-209). Cambridge: Cambridge University Press. doi:10.1017/9781009207898.016; Hacker, Philipp, The European AI Liability Directives – Critique of a Half-Hearted Approach and Lessons for the Future (November 25, 2022). Available at http://dx.doi.org/10.2139/ssrn.4279796  ↩︎
  16. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) OJ L 119, 4.5.2016, p. 1–88.  ↩︎
  17. The European Commission’s initial proposal of the AI Act as well as the Council mandate both include in Recital 4 the wording: “Such harm might be material or immaterial.” ↩︎
  18. Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on Artificial Intelligence (Artificial Intelligence Act) and amending certain Union legislative acts, 2021/0106(COD), version for Trilogue on 24 October, 2023. ↩︎
FLI AI Liability Directive: Full Version https://futureoflife.org/document/fli-ai-liability-directive-full-version/ Tue, 28 Nov 2023 20:07:34 +0000 https://futureoflife.org/?post_type=document&p=118811 Artificial Intelligence and Nuclear Weapons: Problem Analysis and US Policy Recommendations https://futureoflife.org/document/ai-and-nuclear-problem-analysis-and-policy-recommendations/ Tue, 14 Nov 2023 22:55:49 +0000 https://futureoflife.org/?post_type=document&p=118734 Domain Definition

Since 1945, eight states other than the United States have successfully acquired nuclear weapons: the UK, France, China, Russia, Israel, Pakistan, India, and North Korea. While the possession of nuclear weapons by a handful of states has the potential to create a stable equilibrium through strategic deterrence, the risk of nuclear weapons use on the part of any state actor – and consequent nuclear responses – poses an existential threat to the American public and the international community.

Problem Definition

Developments in artificial intelligence (AI) can produce destabilizing effects on nuclear deterrence, increasing the probability of nuclear weapons use and imperiling international security. Advanced AI systems could enhance nuclear risks through further integration into nuclear command and control procedures, by reducing the deterrence value of nuclear stockpiles through augmentation of Intelligence, Surveillance, and Reconnaissance (ISR), by making nuclear arsenals vulnerable to cyber-attacks and manipulation, and by driving nuclear escalation with AI-generated disinformation.

#1. AI Integration into Nuclear Command and Control

As developments in AI have accelerated, some military and civilian defense agencies have considered integrating AI systems into nuclear decision-making frameworks alongside integration into conventional weapons systems with the intention of reducing human error.1,2 In the United States, this framework is referred to as the nuclear command, control, and communications (NC3) system, which dictates the means through which authority is exercised and operational command and control of nuclear procedures are conducted.

However, a growing body of research has highlighted the potentially destabilizing consequences of integrating AI into NC3.3 This includes the following threat vectors:

  1. Increased Reliance on Inaccurate Information: AI systems have already displayed significant inaccuracy and inconsistency across a wide range of domains. As the relevant data that are needed for the training of AI systems for NC3 are extremely sparse – nuclear weapons have only been deployed twice in history, and were deployed in a substantially different nuclear landscape – AI systems are even more likely to exhibit error in these use cases than others. While there has been considerable focus on ensuring that there are ‘humans in the loop’ (i.e., the final decision is made by a human authority), this may prove to be challenging in practice. If an AI system claims that a nuclear weapon has been launched by an adversary, studies suggest it is unlikely that human agents would oppose this conclusion, regardless of its validity. This problem of ‘machine bias’ has already been demonstrated in other domains, making the problem of ensuring ‘meaningful human control’ over AI systems incredibly difficult.4
  2. Increased Reliance on Unverifiable Information: At present, it is nearly impossible to determine the exact means by which advanced AI systems reach their conclusions. This is because current means of ‘interpretability’ – or understanding why AI systems behave the way they do – lag far behind the state-of-the-art systems themselves. In addition, because modern nuclear launch vehicles (e.g. intercontinental ballistic missiles (ICBMs), submarine-launched ballistic missiles (SLBMs)) deliver payloads in a matter of minutes, it is unlikely there would be enough time to independently verify inferences, conclusions, and recommendations or decisions made by AI systems integrated in NC3.
  3. Artificial Escalation and General Loss of Control: If multiple nuclear powers integrate AI into nuclear decision-making, there is a risk of “artificial escalation.” Artificial escalation refers to a type of inadvertent escalation in which adversaries’ respective AI systems make calculations based on strategic maneuvers or information originating from other AI systems, rather than from human judgment, creating a positive feedback loop that continuously escalates conflict.5 Importantly, there is likely to be a dilution of human-control in these situations, as there would be incentives to rely on AI judgements in response to adversary states which are doing the same. For instance, if adversaries are presumed to be making military decisions at machine speeds, to avoid strategic disadvantage, military leaders are likely to yield increasing deference to decision-making and recommendations by advanced AI systems at the expense of meaningful human judgment. This leaves significantly less time for clear-headed communication and consideration, instead motivating first-strike, offensive actions with potentially catastrophic consequences.

#2. Expansion of Nuclear Arsenals and Escalation due to developments in Intelligence, Surveillance and Reconnaissance (ISR) capabilities

ISR refers to coordinated acquisition, processing, and dissemination of accurate, relevant, and timely information and intelligence to support military decision-making processes. The belief that other states do not have perfect information about their adversaries’ nuclear launch capabilities is essential to maintaining strategic deterrence and reducing insecurity, as it theoretically preserves second strike capabilities in the event of an attack, underscoring mutually-assured destruction. Toward this end, many nuclear powers, including Russia and China, employ mobile missile launchers because they are more difficult to track and target compared to stationary weapons systems. However, both actual and imagined developments in ISR resulting from AI integration increase the perceived threat of detection and preemptive attack on mobile missile launchers and other clandestine military technology. Should a competing nuclear power come to believe that an adversary possesses perfect information regarding the locations of nuclear weapons systems, the possibility that adversaries deploy their nuclear stockpiles rather than risk having them dismantled increases considerably. Such instability is prone to lead to expansion of nuclear arsenals, increased escalation on other fronts, and further risk of nuclear conflict.

#3 Increased Vulnerability of Nuclear Arsenals and Command Systems to Cyber Attacks

Advancements in artificial intelligence have led to rapid expansion in the capacity for malevolent actors to launch cyberattacks and exploit cyber-vulnerabilities.6 This includes significantly enhanced capabilities to exploit technical gaps in nuclear security infrastructure (e.g. zero-day vulnerabilities) and to manipulate high-value persons in positions of nuclear command and control (e.g. through deception or blackmail via phishing and spearphishing attacks). NATO allies have pointed out the threat of AI systems being used to attack critical infrastructure, nuclear arsenals, and command and control centers.7 In addition, if states move toward integrating AI into NC3 systems, such systems would be even more vulnerable to cyberattacks and data poisoning, a practice that entails manipulating the datasets AI systems are trained on to modify their behavior and exploit weaknesses. As data centers and systems are often networked, a cyber-failure could rapidly spread throughout the system and damage other military command and control systems.

#4 Nuclear Escalation and Misperception due to AI-Generated Disinformation

Advanced AI systems have already displayed the capacity to generate vast amounts of compelling disinformation. This disinformation is generated in text using large language models, and via the synthetic construction of fake audiovisual content such as pictures and videos, also known as deep-fakes. Such disinformation is likely to have an outsized negative impact on military confrontation, and in particular on nuclear risk. For instance, if an artificially-engineered piece of audiovisual material is incredibly compelling and signals intended nuclear action, the immediacy of advanced missile technology (see #1B) would not provide sufficient time for vetting the authenticity of the information and may push decision-makers to default to a nuclear response.

Policy Recommendations 

In light of the significant risks identified in the previous section, considerable attention from policymakers is necessary to ensure that the safety and security of the American people are not jeopardized. The following policy recommendations represent critical, targeted first steps to mitigating these risks:

  1. Limit use of AI Systems in NC3 and Establish Criteria for ‘Meaningful Human Control’: As recommended by a growing number of experts, the US should prohibit or place extremely stringent constraints on the use of AI systems in the highest-risk domains of military decision-making. As discussed, mere human involvement at the tail-end of nuclear decision-making is unlikely to be effective in preventing escalation of nuclear risk from integration of AI systems. Minimizing the use of advanced AI systems where safer alternatives are available, requiring meaningful human control at each step in the decision-making process, and ensuring human understanding of the decision-making criteria of any systems deployed in NC3 would reduce risks of accidental use and loss of human control, and would also provide crucial signals to geopolitical adversaries that would minimize undue escalation risk.
  2. Require Meaningful Human Control for All Potentially Lethal Conventional Weapon Use: Escalation to nuclear conflict does not occur solely within the nuclear domain, but rather emerges from broader geopolitical tensions and military maneuvers. Though this brief focuses specifically on the risks at the intersection of AI and NC3, incorporating AI into any military decision-making with major, irreversible consequences increases the risk of artificial escalation and loss of control that could eventually evolve into nuclear conflict. In order to reduce the risk of artificial escalation that could trigger nuclear conflict, the US should require by law that any potentially lethal military decision is subject to meaningful human control, regardless of whether it involves nuclear or conventional weapons systems. While the Department of Defense (DoD) Directive 3000.09 on Autonomy in Weapons Systems presently requires “appropriate levels of human judgment” in the use of force, this could be interpreted to allow for low levels of human judgment in some military operations, and is subject to change depending on DoD leadership. To ensure sound military decision-making that mitigates artificial escalation risk, “meaningful human control” should be codified in statute for any use of potentially-lethal force.
  3. Improve Status Quo Stability by Reducing Nuclear Ambiguities: The US should formally renounce first strikes – i.e., categorically state that it will not initiate a nuclear conflict – which would help assuage tensions, reduce the risk of escalation due to ambiguities or misunderstanding, and facilitate identification of seemingly inconsistent actions or intelligence that may not be authentic. Finally, the US should improve and expand its military crisis communications network, or ‘hotlines’, with adversary states, to allow for rapid leadership correspondence in times of crisis.
  4. Lead International Engagement and Standard-Setting: The US must adopt best practices for integration of AI into military decision-making, up to and potentially including recommending against such integration altogether at critical decision points, to exercise policy leadership on the international stage. In addition, the US should help strengthen the Nuclear Non-Proliferation Treaty and reinforce the norms underpinning the Treaty on the Prohibition of Nuclear Weapons in light of risks posed by AI.
  5. Adopt Stringent Procurement and Contracting Standards for Integration of AI into Military Functions: Because NC3 is not completely independent of broader military decision-making and the compromise or malfunction of other systems can feed into nuclear escalation, it is vital that stringent standards be established for procuring AI technology for military purposes. This should include rigorous auditing, red-teaming, and stress-testing of systems intended for military use prior to procurement.
  6. Fund Technical Research on AI Risk Management and NC3: The US should establish a risk management framework for the use of AI in NC3. Research in this regard can take place alongside extensive investigation of robust cybersecurity protocols and measures to identify disinformation. It should also include research into socio-technical mechanisms for mitigating artificial escalation risk (e.g. how to minimize machine bias, how to ensure that military decision-making happens at human speeds) as well as mechanisms for verifying the authenticity of intelligence and other information that could spur disinformation-based escalation. This would encourage the development of AI decision-support systems that are transparent and explainable, and subject to robust testing, evaluation, validation and verification (TEVV) protocols specifically developed for AI in NC3. Such research could also reveal innovations in NC3 that do not rely on AI.

Finally, it is vital to set up an architecture for scrutiny and regulation of powerful AI systems more generally, including those developed and released by the private sector for civilian use. The nuclear risks posed by AI systems, such as those emerging from AI-enhanced disinformation and cyberwarfare, cannot be mitigated through policies at the intersection of the AI-nuclear frontier alone. The US must establish an auditing and licensing regime for advanced AI systems deployed in civilian domains that includes evaluation of risk for producing and proliferating widespread misinformation that could escalate geopolitical tensions, and risk of use for cyberattacks that could compromise military command control and decision support systems.


↩ 1 Horowitz, M. and Scharre, P. (December, 2019). A Stable Nuclear Future? The Impact of Autonomous Systems and Artificial Intelligence; Dr. James Johnson on How AI is Transforming Nuclear Deterrence. Nuclear Threat Initiative.

↩ 2 Reiner, P. and Wehsener, A. (November, 2019). The real value of AI in Nuclear Command and Control. War on the Rocks.

↩ 3 Rautenbach, P. (February, 2023). Keeping humans in the loop is not enough to make AI safe for nuclear weapons. Bulletin of the Atomic Scientists.

↩ 4 Baraniuk, S. (October, 2021). Why we place too much trust in machines. BBC News.

↩ 5 The short film “Artificial Escalation,” released by FLI in July 2023, provides a dramatized account of how this type of escalation can occur. This policy primer delves into mitigation strategies for the risks portrayed in the film.

↩ 6 These concerns are discussed in greater detail in “Cybersecurity and Artificial Intelligence: Problem Analysis and US Policy Recommendations“.

↩ 7 Vasquez, C. (May, 2023), Top US cyber official warns AI may be the ‘most powerful weapon of our time’. Cyberscoop; Artificial Intelligence in Digital Warfare: Introducing the Concept of the Cyberteammate. Cyber Defense Review. US Army.

FLI Governance Scorecard and Safety Standards Policy (SSP) https://futureoflife.org/document/fli-governance-scorecard-and-safety-standards-policy/ Mon, 30 Oct 2023 13:18:21 +0000 https://futureoflife.org/?post_type=document&p=118603 Introduction

AI remains the only powerful technology lacking meaningful binding safety standards. This is not for lack of risks. The rapid development and deployment of ever more powerful systems is now absorbing more investment than any previous generation of models. Along with great benefits and promise, we are already witnessing widespread harms such as mass disinformation, deepfakes and bias – all on track to worsen at the currently unchecked, unregulated and frantic pace of development. As AI systems get more sophisticated, they could further destabilize labor markets and political institutions, and continue to concentrate enormous power in the hands of a small number of unelected corporations. They could threaten national security by facilitating the inexpensive development of chemical, biological, and cyber weapons by non-state groups. And they could pursue goals, either human- or self-assigned, in ways that place negligible value on human rights, human safety, or, in the most harrowing scenarios, human existence.

Despite acknowledging these risks, AI companies have been unwilling or unable to slow down. There is an urgent need for lawmakers to step in to protect people, safeguard innovation, and help ensure that AI is developed and deployed for the benefit of everyone. This is common practice with other technologies. Requiring tech companies to demonstrate compliance with safety standards enforced by e.g. the FDA, FAA or NRC keeps food, drugs, airplanes and nuclear reactors safe, and ensures sustainable innovation. Society can enjoy these technologies’ benefits while avoiding their harms. Why wouldn’t we want the same with AI?

With this in mind, the Future of Life Institute (FLI) has undertaken a comparison of AI governance proposals, and put forward a safety framework which looks to combine effective regulatory measures with specific safety standards.

AI Governance Scorecard

Recent months have seen a wide range of AI governance proposals. FLI has analyzed the different proposals side-by-side, evaluating them in terms of the different measures required. The results can be found below. The comparison demonstrates key differences between proposals, but, just as importantly, the consensus around necessary safety requirements. The scorecard focuses particularly on concrete and enforceable requirements, because strong competitive pressures suggest that voluntary guidelines will be insufficient.

The policies fall into two main categories: those with binding safety standards (akin to the situation in e.g. the food, biotech, aviation, automotive and nuclear industries) and those without (focusing on industry self-regulations or voluntary guidelines). For example, Anthropic’s Responsible Scaling Policy (RSP) and FLI’s Safety Standards Policy (SSP) are directly comparable in that they both build on four AI Safety Levels – but where FLI advocates for an immediate pause on AI not currently meeting the safety standards below, Anthropic’s RSP allows development to continue as long as companies consider it safe. The FLI SSP is seen to check many of the same boxes as various competing proposals that insist on binding standards, and can thus be viewed as a more detailed and specific variant alongside Anthropic’s RSP.

Table 1: A summary of the AI governance playing field going into the November 1-2 UK AI Summit. View as a PDF.

FLI Safety Standards Policy (SSP)

Taking this evaluation and our own previous policy recommendations into account, FLI has outlined an AI safety framework that incorporates the necessary standards, oversight and enforcement to mitigate risks, prevent harms, and safeguard innovation. It seeks to combine the “hard-law” regulatory measures necessary to ensure compliance – and therefore safety – with the technical criteria necessary for practical, real-world implementation.

The framework contains specific technical criteria to distinguish different safety levels. Each of these calls for a specific set of hard requirements before training and deploying such systems, enforced by national or international governing bodies. While these are being enacted, FLI advocates for an immediate pause on all AI systems that do not meet the outlined safety standards.

Crucially, this framework differs from those put forward by AI companies (such as Anthropic’s ‘Responsible Scaling Policy’ proposal) as well as those organized by other bodies such as the Partnership on AI and the UK Task Force, by calling for legally binding requirements – as opposed to relying on corporate self-regulation or voluntary commitments.

The framework is by no means exhaustive, and will require more specification. After all, the project of AI governance is complex and perennial. Nonetheless, implementing this framework, which largely reflects a broader consensus among AI policy experts, will serve as a strong foundation.


Table 2: FLI’s Proposed Policy Framework. View as a PDF.

Clarifications

Triggers: A given ASL-classification is triggered if either the hardware trigger or the capabilities trigger applies.

Registration: This includes both training plans (data, model and compute specifications) and subsequent incident reporting. National authorities decide what information to share.

Safety audits: This includes both cybersecurity (preventing unauthorized model access) and model safety, using whitebox and blackbox evaluations (with/without access to system internals).

Responsibility: Safety approvals are broadly modeled on the FDA approach, where the onus is on AI labs to demonstrate to government-appointed experts that they meet the safety requirements.

IAIA international coordination: Once key players have national AI regulatory bodies, they should aim to coordinate and harmonize regulation via an international regulatory body, which could be modeled on the IAEA – above this is referred to as the IAIA (“International AI Agency”) without making assumptions about its actual name. In the interim before the IAIA is constituted, ASL-4 systems require UN Security Council approval.

Liability: Developers of systems above ASL-1 are liable for harm to which their models or derivatives contribute, either directly or indirectly (via e.g. API use, open-sourcing, weight leaks or weight hacks).

Kill-switches: Systems above ASL-3 need to include non-removable kill-switches that allow appropriate authorities to safely terminate them and any copies.

Risk quantification: Quantitative risk bounds are broadly modeled on the practice in e.g. aircraft safety, nuclear safety and medicine safety, with quantitative analysis producing probabilities for various harms occurring. A security mindset is adopted, whereby the probability of harm factors in the possibility of adversarial attacks.

Compute triggers: These can be updated by the IAIA, e.g. lowered in response to algorithmic improvements.
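As a rough illustration of how the trigger and risk-quantification clarifications above could be operationalised in a regulator’s tooling, the Python sketch below classifies a system into an ASL level whenever either the hardware (compute) trigger or a capabilities trigger applies, and checks a quantitative risk bound. The specific thresholds, capability flags, and bound are hypothetical placeholders, not values proposed in the SSP.

from dataclasses import dataclass, field

# Hypothetical training-compute thresholds (FLOP) per level; under the SSP,
# the IAIA could lower these in response to algorithmic improvements.
COMPUTE_TRIGGERS = {"ASL-2": 1e23, "ASL-3": 1e25, "ASL-4": 1e27}

# Hypothetical capability flags that trigger a level regardless of compute used.
CAPABILITY_TRIGGERS = {
    "ASL-2": {"persuasive_disinformation"},
    "ASL-3": {"meaningful_weapons_uplift"},
    "ASL-4": {"autonomous_replication"},
}

@dataclass
class SystemProfile:
    training_flop: float
    capabilities: set = field(default_factory=set)
    p_severe_harm: float = 0.0  # estimated yearly probability of severe harm,
                                # including the possibility of adversarial attacks

def asl_level(profile: SystemProfile) -> str:
    """A level is triggered if EITHER the hardware (compute) trigger OR a
    capabilities trigger applies; the highest triggered level governs."""
    level = "ASL-1"
    for lvl in ("ASL-2", "ASL-3", "ASL-4"):
        hits_compute = profile.training_flop >= COMPUTE_TRIGGERS[lvl]
        hits_capability = bool(profile.capabilities & CAPABILITY_TRIGGERS[lvl])
        if hits_compute or hits_capability:
            level = lvl
    return level

def within_risk_bound(profile: SystemProfile, bound: float = 1e-4) -> bool:
    """Quantitative risk bound in the spirit of aviation and nuclear practice:
    the estimated probability of severe harm must stay below the bound."""
    return profile.p_severe_harm < bound

# Example: 3e25 FLOP of training compute and no flagged capabilities -> ASL-3.
system = SystemProfile(training_flop=3e25, p_severe_harm=2e-5)
print(asl_level(system), within_risk_bound(system))  # -> ASL-3 True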

Why regulate now?

Until recently, most AI experts expected truly transformative AI impact to be at least decades away, and viewed associated risks as “long-term”. However, recent AI breakthroughs have dramatically shortened timelines, making it necessary to consider these risks now. The plot below (courtesy of the Metaculus prediction site) shows that the number of years remaining until (their definition of) Artificial General Intelligence (AGI) is reached has plummeted from twenty years to three in the last eighteen months, and many leading experts concur.


Image: ‘When will the first weakly general AI system be devised, tested, and publicly announced?’ at Metaculus.com

For example, Anthropic CEO Dario Amodei predicted AGI in 2-3 years, with 10-25% chance of an ultimately catastrophic outcome. AGI risks range from exacerbating all the aforementioned immediate threats, to major human disempowerment and even extinction – an extreme outcome warned about by industry leaders (e.g. the CEOs of OpenAI, Google DeepMind & Anthropic), academic AI pioneers (e.g. Geoffrey Hinton & Yoshua Bengio) and leading policymakers (e.g. European Commission President Ursula von der Leyen and UK Prime Minister Rishi Sunak).

Reducing risks while reaping rewards

Returning to our comparison of AI governance proposals, our analysis revealed a clear split between those that do, and those that don’t, consider AGI-related risk. To see this more clearly, it is convenient to split AI development crudely into two categories: commercial AI and AGI pursuit. By commercial AI, we mean all uses of AI that are currently commercially valuable (e.g. improved medical diagnostics, self-driving cars, industrial robots, art generation and productivity-boosting large language models), be they for-profit or open-source. By AGI pursuit, we mean the quest to build AGI and ultimately superintelligence that could render humans economically obsolete. Although building such systems is the stated goal of OpenAI, Google DeepMind, and Anthropic, the CEOs of all three companies have acknowledged the grave associated risks and the need to proceed with caution.

The AI benefits that most people are excited about come from commercial AI, and don’t require AGI pursuit. AGI pursuit is covered by ASL-4 in the FLI SSP, and motivates the compute limits in many proposals: the common theme is for society to enjoy the benefits of commercial AI without recklessly rushing to build more and more powerful systems in a manner that carries significant risk for little immediate gain. In other words, we can have our cake and eat it too. We can have a long and amazing future with this remarkable technology. So let’s not pause AI. Instead, let’s stop training ever-larger models until they meet reasonable safety standards.

2022 Annual Report https://futureoflife.org/document/2022-annual-report/ Fri, 13 Oct 2023 15:05:38 +0000 https://futureoflife.org/?post_type=document&p=118563 Cybersecurity and AI: Problem Analysis and US Policy Recommendations https://futureoflife.org/document/cybersecurity-and-ai-problem-analysis-and-us-policy-recommendations/ Tue, 10 Oct 2023 20:30:13 +0000 https://futureoflife.org/?post_type=document&p=118582 Domain Definition

Cybersecurity refers to the wide array of practices concerning the attack and protection of computer systems and networks. This includes protection from attacks by malicious actors that may result in unauthorized information disclosure, theft, or damage to hardware, software, or data, as well as protection from the disruption or misdirection of services that rely on these systems. The National Cybersecurity Strategy Implementation Plan (NSCIP) published by the White House in July 2023 recognizes cybersecurity as critical to American national security interests, economic innovation, and digital empowerment.

Problem Definition

Numerous reports have pointed to the ways that artificial intelligence (AI) systems can make it easier for malevolent actors to develop more virulent and disruptive malware.1,2 AI systems can also help adversaries automate attacks on cyberspaces, increasing the efficiency, creativity and impact of cyberattacks via novel zero-day exploits (i.e. previously unidentified vulnerabilities), targeting critical infrastructure and also enhancing techniques such as phishing and ransomware. As powerful AI systems are increasingly empowered to develop the set of tasks and subtasks to accomplish their objectives, autonomously-initiated hacking is also expected to emerge in the near-term.

The threats posed to cybersecurity in convergence with artificial intelligence can be broadly divided into four categories:

#1. AI-Enabled/Enhanced Cyberattacks on Critical Infrastructure and Resources 

An increasing proportion of US critical infrastructure, including that relevant to health (hospital systems), utilities (including heating, electrical supply and water supply), telecommunications, finance, and defense, is now ‘on the grid’, leaving it vulnerable to potential cyberattacks by malicious actors. Such an attack could, for instance, shut off the power supply of entire cities, access high-value confidential financial or security information, or disable telecommunications networks. Several AI systems have already demonstrated some success in exploiting such vulnerabilities. Crucially, the barrier to entry, i.e. the level of skill necessary, for conducting such an attack is considerably lower with AI than without it, increasing threats from non-state actors and the number of possible attempts that may occur. In addition, patching these vulnerabilities once they have been exploited takes time, which means that painful and lasting damage may be inflicted before the problem is remedied.

#2. AI-Enabled Cyber-Manipulation of High-Value Persons 

Phishing refers to the fraudulent practice of sending communication (e.g., emails, caller-ID spoofed and deep-fake voice phone calls) purporting to be from reputable sources, to extract information. Advanced AI systems, in particular large language models, have demonstrated considerable effectiveness in powering phishing attacks, both by enabling greater efficiency and volume in launching these attacks, and by tailoring them to hyper-target and more effectively deceive individuals. As these abilities scale, they could be used to launch spearphishing attacks on individuals in leadership positions within organizations critical to national-security interests. The attacker could then manipulate that individual into revealing high-value information, compromising access protections (e.g. passwords) for sensitive information or critical systems, or taking decisions detrimental to national-security interests. Beyond deception, this manipulation could include blackmail techniques to compel harmful actions. Generative AI systems could also facilitate spearphishing attacks targeted at leaders of geopolitical adversaries in order to trick them into destructive ‘retaliatory’ action.

#3. Cyber-vulnerabilities in Labs Developing Advanced AI Systems 

The companies developing the most advanced AI systems in the world are primarily based within the United States and the United Kingdom. These AI systems are very likely to be targeted by malicious state and non-state actors seeking access to vital design information (e.g., the model weights underpinning the most advanced large language models). Strategic competitors and adversaries may steal these technologies without taking the considerable effort to innovate and develop them, damaging the competitiveness of the U.S. and exacerbating risks from malicious use. These actors could also remove the safeguards from these powerful models which normally protect against access to dangerous information such as how to develop WMDs. In a straw poll, a majority of top cybersecurity experts expressed concern that the top AI labs are ill-equipped to protect these critical technologies from cyber-attacks.

#4. Integration of Opaque and Unreliable AI-Enabled Cybersecurity Systems 

There has been growing discussion around using AI systems to enhance cybersecurity and cyber-defense. This comes with its own set of dangers, especially with opaque AI systems whose behavior is extremely difficult to predict and explain. Data poisoning – cases where attackers manipulate the data being used to train cyber-AI systems – could lead to systems yielding false positives or failing to detect intrusions. In addition, the model weights of the systems themselves can be stolen using querying techniques designed to find loopholes in the model. These systems could also counter-attack beyond their operators’ intentions, targeting allied systems or risking escalation with adversaries.

Policy Recommendations 

In light of the significant challenges analyzed in the previous section, considerable attention from policymakers is necessary to ensure the safety and security of the American people. The following policy recommendations represent critical, targeted first steps to mitigating these risks:

  1. Minimum Cybersecurity Requirements for Advanced AI Developers: Only a handful of AI developers, primarily based in the United States, are presently developing the world’s most advanced AI systems, with significant implications for American economic stability and national security. In order to safeguard these AI systems from malicious state and non-state actors, minimum cybersecurity requirements should be adopted for those developing and maintaining them, as is the case with high-risk biosafety labs (BSLs) and national nuclear laboratories (NNLs). These standards should include minimum criteria for cybersecurity personnel numbers, red-team tests, and external evaluations.
  2. Explicitly Focus on AI-Enabled Cyberattacks in National Cyber-Strategies: Artificial intelligence goes completely unmentioned in the National Cybersecurity Strategy Implementation Plan published by the White House in July 2023, despite recognition of cyber risks of AI in the National Cybersecurity Strategy itself.3 AI risks need to be integrated explicitly into a broader cybersecurity posture, including in the DOD Cyber Strategy, the National Cyber Incident Response Plan (NCIRP), the National Cybersecurity Investigative Joint Task Force (NCIJTF) and other relevant plans.
  3. Establish Minimum Standards for Integration of AI into Cybersecurity Systems and Critical Infrastructure: Integrating unpredictable and vulnerable AI systems into critical cybersecurity systems may create cyber-vulnerabilities of its own. Minimum standards regarding transparency, predictability and robustness of these systems should be set up before they are used for cybersecurity functions in critical industries. Additionally, building on guidance issued in accordance with EO 13636 on Improving Critical Infrastructure Cybersecurity4, EO 13800 on Strengthening the Cybersecurity of Federal Networks and Critical Infrastructure5, and the Framework for Improving Critical Infrastructure Cybersecurity published by NIST6, AI-conscious standards for cybersecurity in critical infrastructure should be developed and enforced. Such binding standards should account in particular for risks from AI-enabled cyber-attacks, and should be developed in coordination with CISA, SRMAs, and SLTT offices.

More general oversight and governance infrastructure for advanced AI systems is also essential to protect against cyber-risks from AI, among many other risks. We further recommend these broader regulatory approaches to track, evaluate, and incentivize the responsible design of advanced AI systems:

  1. Require Advanced AI Developers to Register Large Training Runs and to “Know Their Customers”: The Federal Government lacks a mechanism for tracking the development and proliferation of advanced AI systems that could exacerbate cyber-risk. In order to adequately mitigate cybersecurity risks, it is essential to know what systems are being developed and who has access to them. Requiring registration for the acquisition of large amounts of computational resources for training advanced AI systems, and for carrying out the training runs themselves, would help with evaluating possible risks and taking appropriate precautions. “Know Your Customer” requirements similar to those imposed in the financial services industry would reduce the risk of systems that can facilitate cyber-attacks falling into the hands of malicious actors (a minimal sketch of how such a gating check might work follows this list).
  2. Establish a Robust Pre-deployment Auditing and Licensure Regime for Advanced AI Systems: Advanced AI systems that can pose risks to cybersecurity, or may be integrated into cybersecurity or other critical functions, are not presently required to undergo independent assessment for safety, security, and reliability before being deployed. Requiring licensure before advanced AI systems are deployed, contingent on independent audits for compliance with minimum standards for safety, security, and reliability, would identify and mitigate risks before the systems are released and become more difficult to contain. Audits should include red-teaming to identify cyber-vulnerabilities and ensure that systems cannot be readily used or modified to threaten cybersecurity.
  3. Clarify Liability for Developers of AI Systems Used in Cyber-attacks: It is not clear under existing law whether the developers of AI systems used to, e.g., damage or unlawfully access critical infrastructure would be held liable for resulting harms. Absolving developers of liability in these circumstances creates little incentive for profit-driven developers to expend financial resources on precautionary design principles and robust assessment. Because these systems are opaque and can possess unanticipated, emergent capabilities, there is inherent risk in developing advanced AI systems and systems expected to be used in critical contexts. Implementing strict liability when these systems facilitate or cause harm would better incentivize developers to take appropriate precautions against cybersecurity vulnerabilities, critical failure, and the risk of use in cyber-attacks.
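As referenced in recommendation 1 above, the sketch below shows one way a compute provider might gate a large training-run request on both a completed “Know Your Customer” check and, above a reporting threshold, registration of the run with the relevant authority. The threshold value and field names are hypothetical placeholders; no existing reporting scheme or API is implied.

from dataclasses import dataclass

# Hypothetical reporting threshold for a single training run (total FLOP);
# an actual figure would be set by the overseeing federal authority.
REPORTING_THRESHOLD_FLOP = 1e26

@dataclass
class ComputeRequest:
    customer_id: str
    kyc_verified: bool               # has the customer passed identity checks?
    declared_training_flop: float    # size of the planned training run
    run_registered: bool             # has the run been registered with the regulator?

def may_allocate(request: ComputeRequest) -> bool:
    """Refuse allocation to unverified customers, and require registration
    for any run at or above the reporting threshold."""
    if not request.kyc_verified:
        return False
    if request.declared_training_flop >= REPORTING_THRESHOLD_FLOP:
        return request.run_registered
    return True

# Example: a verified customer requesting an unregistered run above the
# threshold is refused until the run is registered.
request = ComputeRequest("acme-labs", kyc_verified=True,
                         declared_training_flop=5e26, run_registered=False)
print(may_allocate(request))  # -> False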

↩ 1 Bécue, A., Praça, I., & Gama, J. (2021). Artificial intelligence, cyber-threats and Industry 4.0: Challenges and opportunities. Artificial Intelligence Review, 54(5), 3849-3886.

↩ 2 Menn, J. (May, 2023). Cybersecurity faces a challenge from artificial intelligence’s rise. Washington Post.

↩ 3 “Too often, we are layering new functionality and technology onto already intricate and brittle systems at the expense of security and resilience. The widespread introduction of artificial intelligence systems—which can act in ways unexpected to even their own creators—is heightening the complexity and risk associated with many of our most important technological systems.” National Cybersecurity Strategy, March 2023, p.2.

↩ 4 Office of the Press Secretary. (February, 2013). Executive Order — Improving Critical Infrastructure Cybersecurity. The White House.

↩ 5 Executive Office of the President. (May, 2017). Strengthening the Cybersecurity of Federal Networks and Critical Infrastructure. National Archives. 

↩ 6 National Institute of Standards and Technology. (2018). Framework for Improving Critical Infrastructure Cybersecurity

FLI recommendations for the UK Global AI Safety Summit https://futureoflife.org/document/fli-recommendations-for-the-uk-global-ai-safety-summit/ Sun, 10 Sep 2023 20:19:49 +0000 https://futureoflife.org/?post_type=document&p=118578

“The time for saying that this is just pure research has long since passed […] It’s in no country’s interest for any country to develop and release AI systems we cannot control. Insisting on sensible precautions is not anti-industry.

Chernobyl destroyed lives, but it also decimated the global nuclear industry. I’m an AI researcher. I do not want my field of research destroyed. Humanity has much to gain from AI, but also everything to lose.”


Professor Stuart Russell
Founder of the Center for Human-Compatible AI at the University of California, Berkeley

Contact: Mark Brakel, Director of Policy, policy@futureoflife.org


Introduction

Prime Minister Sunak,
Secretary of State Donelan,

The Future of Life Institute (FLI) is an independent non-profit organisation that works on reducing global catastrophic and existential risks from powerful technologies. Back in 2017, FLI organised a conference in Asilomar, California to formulate one of the earliest artificial intelligence (AI) governance instruments: the “Asilomar AI principles.” The organisation has since become one of the leading voices on AI policy in Washington D.C. and Brussels, and is now the civil society champion for AI recommendations in the United Nations Secretary General’s Digital Cooperation Roadmap.

In March, FLI – joined by over 30,000 leading AI researchers, professors, CEOs, engineers, and others – called for a pause of at least six months on the largest and riskiest AI experiments, to reduce the likelihood of catastrophic accidents. The letter sparked United States Senate hearings, a formal reply from the European Parliament, and a call from UNESCO to implement a global ethical framework for AI.

Despite this shift in the public conversation, we remain locked in a race that has only accelerated. No company has developed the shared safety protocols that we believe are necessary. In our letter, we also wrote: “if such a pause cannot be enacted quickly, governments should step in“. The need for public sector involvement has never been clearer. As a result, we would like to thank you for your personal leadership in convening the world’s first AI safety summit.

In our view, the Summit should achieve three things:

  1. Establish a common understanding of the severity and urgency of AI risks;
  2. Make the global nature of the AI challenge explicit, recognising that all of humanity has a stake in this issue and that some solutions require a unified global response, and;
  3. Embrace the need for urgent government intervention, including hard law where appropriate.

With this document, we offer a draft outcome declaration, a number of recommendations to participating governments, and a roadmap for post-summit work. We imagine that summit preparations are well under way, but hope that this document can provide further inspiration as to what themes should be covered during the preparatory meetings and at the summit itself. Ultimately, we hope that the summit can kickstart the development of a new international architecture for AI regulation.

We wish you good luck with the preparations for the summit and stand ready to offer our expertise in support of effective global AI governance.

Sincerely,

Professor Anthony Aguirre
Executive Director
Professor Max Tegmark
President

Proposed Declaration on AI Safety

  1. Increasingly powerful AI systems pose risks, through accidents, misuse, or structural problems, with potentially catastrophic consequences. The mitigation of these emerging risks must become a global priority.
  2. The robust mitigation of AI risks demands leadership from the public sector. Advanced AI systems should, like other potentially dangerous technologies, be carefully regulated to ensure compliance with adequate safety measures.
  3. Neither systems nor the risks they pose can be contained within the borders of one nation state. Adequate governance of advanced AI necessitates ongoing and intensive global coordination.

The participating nations in the world’s first global AI safety summit agree to:

  1. Reconvene in six months, and every six months thereafter, to accelerate the creation of a robust AI governance regime;
  2. Increase public funding for AI safety research to improve understanding of the risks and reduce the probability of accidents;
  3. Develop national AI safety strategies consisting of the following measures:
    1. Standards for advanced AI, plus associated benchmarks and thresholds for dangerous capabilities,
    2. Mandatory pre-deployment audits for potentially dangerous AI systems by independent third parties,
    3. Monitoring of entities with large-scale AI compute concentrations,
    4. Safety protocols to prevent systems with dangerous capabilities from being developed, deployed or stolen,
    5. Restrictions on open source AI based on capability thresholds to prevent the proliferation of powerful models amongst malicious actors,
    6. Immediate enhancement of cybersecurity standards at leading AI companies,
    7. Adaptation of national liability law to AI-specific challenges;
  4. Establish a post-summit working group with a mandate to develop a blueprint for a new global agency that can coordinate the governance of advanced AI, advise on safety standards, and ensure global adherence;
  5. Encourage leading companies to share information with the UK Foundation Model task force and welcome the UK’s intention to put this entity at the disposal of the international community.

Recommendations in advance of the Summit  

Recommendations for all participating governments:

  • Do not let the perceived complexity of AI technology stand in the way of action. Many useful risk-mitigation measures are technology-neutral.
  • Critically assess whether existing national AI strategies are sufficiently attuned to AI safety risk.

Recommendations for the UK hosts:

  • Involve all major AI powers in the summit, including Brazil, China, the EU, India, Japan and the US. Instruct UK embassies in key capitals to convene informational sessions about AI safety.
  • Ensure the summit takes a truly global perspective and avoid overly relying on the transatlantic relationship. Look beyond the US voluntary commitments for inspiration.
  • Carefully balance private sector perspectives with those of independent experts from academia and civil society.
  • Ask any private sector attendees to submit their safety plans ahead of time and make these documents available to other governments for scrutiny.
  • Exclude autonomous weapons systems from the agenda. A treaty process is already emerging at other fora and the inclusion of too many issues could undermine progress on civilian AI safety.
  • Amplify the UK’s global leadership by bringing forward domestic AI legislation, recognising that societal-scale risks of AI cannot be managed by sector or through voluntary guidelines.

Recommendations for the People’s Republic of China:

  • Inform other governments about the recently enacted Interim Measures for the Management of Generative Artificial Intelligence Services.
  • Engage in an open dialogue with the US, despite ongoing economic and strategic competition. Global threats from advanced AI, much like climate change, urgently demand cooperation even if this is not possible on most bilateral issues.

Recommendations for the European Union and its Member States:

  • Inform other governments about the proposed EU AI Act and any key insights that have emerged during the drafting process, with a particular focus on the regime for more general AI systems.
  • Where the AI Act falls short in mitigating risks identified at the summit, keep an open mind to adapting the overall EU regulatory framework.
  • Inspire other governments by sharing information about the structure of the Spanish Agency for the Supervision of Artificial Intelligence (AESIA), the first specialized regulatory entity for AI in the EU.

Recommendations for the United States:

  • Closely monitor compliance of key AI corporations with the recent voluntary commitments and apply appropriate pressure to ensure compliance.
  • Do not leave the global regulatory conversation to Brazil, China and Europe; fast-track binding legislation.
  • Engage in an open dialogue with China, despite ongoing economic and strategic competition. Global threats from advanced AI, much like climate change, urgently demand cooperation even if this is not possible on most bilateral issues.

Recommendations for the summit programme   

The UK government has set out five strong ambitions for the AI Safety Summit. Given how unfamiliar many government officials are with AI safety, we recommend that the final programme ensure all participants develop a shared understanding of the risks we face. To this end, we suggest involving independent experts to clearly articulate the risks the international community needs to address.

Existing large-scale harms

As the UK government frames the conversation, it may want to consider highlighting recent examples of large-scale harms caused by AI. The Australian Robodebt scheme and the Dutch childcare benefit scandal, for example, have shown how simple algorithms can already disrupt societies today.

Proposed speakers: Minister Alexandra van Huffelen (The Netherlands) and Royal Commissioner Catherine Holmes (Australia)

Catastrophic risks from accidents

A survey of 738 leading AI scientists found that, in aggregate, researchers believe there is a 50% chance that we will develop systems surpassing human abilities in all domains before 2060. Currently, no robust mechanism exists to ensure that humans will stay in control of these incredibly powerful systems, nor do we understand how to accurately align the objectives they pursue with our own. This session would lay out the risks from out-of-control AI systems.

Proposed speaker: Stuart Russell (University of California, Berkeley)

Catastrophic risks from misuse and proliferation

Through cybertheft or voluntary open-sourcing, very powerful AI systems can end up in the hands of malicious actors and be used to cause significant harm to the public. Once the compute-intensive training phase has been completed, consumer hardware can be sufficient to fine-tune AI models for destructive behaviour (e.g. automated cyberattacks that disable critical infrastructure or the creation of pathogens that cause catastrophic pandemics).   

Proposed speaker: Professor Yoshua Bengio (University of Montreal)


Post-summit roadmap  

Building on initial agreements at fora like the G7, the Bletchley Summit should be the start of a process rather than a one-off event. Ahead of a successor summit, which FLI would propose take place in May 2024, we recommend the following roadmap.

For the post-summit working group:

The proposed working group would have a mandate to develop the blueprint for a new global agency that can coordinate the governance of advanced AI, advise on safety standards, and ensure global adherence.

Functions of the agency (or associated entities) would need to include i) risk identification, ii) promoting agreement on governance standards, such as thresholds that risky capabilities ought not to exceed, and iii) assistance with implementation and enforcement.

No perfect template exists for dealing with the challenges that AI will bring. In a recent working paper, Trager et al. examine the features of analogous institutions, show which relevant functions they fulfil, and propose a design for an International AI Organisation (IAIO).

Given the exponential growth in AI capabilities and the corresponding urgency of mitigating risk, the blueprint should be ready for discussion at the next summit. As with other international organisations, FLI recommends that the initial (UK) hosts act as a temporary secretariat in developing the agency until such time as the agency can itself support national governments.

At national level:

Following the summit, governments should revise their national AI strategies. Whereas these strategies1 previously focused almost exclusively on economic competitiveness, recalibration is required to account for AI safety risks.

Firstly, governments need to establish safety standards for the responsible design, development, and deployment of powerful AI systems. These standards should regularly be updated as technology progresses, and include:

  1. Comprehensive pre-deployment risk assessments informed by internal and independent third-party model audits. These audits should test for dangerous capabilities, controllability, and ethical alignment.
  2. Standardised protocols for permissible deployment options for AI systems. These should range from fully open sourcing a model to not deploying it at all. If a system fails an audit, deployment should be prohibited (a minimal illustrative sketch of this gating logic follows the list).
  3. Post-deployment monitoring requirements that can trigger 1) repeated risk assessments if post-deployment enhancement techniques significantly alter system capabilities and 2) immediate termination of model deployment if unacceptably dangerous behaviour is detected.
  4. Categories of AI capabilities, such as automated cyberattacks or fraud, that should be restricted to prevent large harms to public safety.
  5. Strong measures to prevent and track model leaks. These should include robust cybersecurity standards as well as safeguards against threats from outside the relevant companies.
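
To make the gating logic concrete, here is a minimal, purely illustrative sketch of how standards 1–3 above might be operationalised. All class names, categories, and thresholds (for example the 0.3 risk-score cut-off) are hypothetical placeholders for illustration, not an existing standard or API.

```python
# Illustrative sketch only: hypothetical names, categories, and thresholds.
from dataclasses import dataclass
from enum import Enum


class DeploymentTier(Enum):
    PROHIBITED = 0        # failed audit: no deployment permitted
    RESTRICTED_API = 1    # gated access with post-deployment monitoring
    OPEN_RELEASE = 2      # e.g. open-sourcing, only for low-risk systems


@dataclass
class AuditResult:
    passed: bool                       # overall pass/fail from a third-party audit
    dangerous_capabilities: list[str]  # e.g. ["automated_cyberattack"]
    risk_score: float                  # 0.0 (negligible) to 1.0 (severe), hypothetical scale


def permitted_tier(audit: AuditResult) -> DeploymentTier:
    """Map an audit outcome to the most permissive deployment option allowed (items 1-2)."""
    if not audit.passed or audit.dangerous_capabilities:
        return DeploymentTier.PROHIBITED
    if audit.risk_score > 0.3:         # placeholder threshold for restricted access only
        return DeploymentTier.RESTRICTED_API
    return DeploymentTier.OPEN_RELEASE


def monitoring_action(capability_drift: float, dangerous_behaviour: bool) -> str:
    """Post-deployment triggers corresponding to item 3 of the list above."""
    if dangerous_behaviour:
        return "terminate_deployment"
    if capability_drift > 0.2:         # placeholder: significant post-deployment enhancement
        return "repeat_risk_assessment"
    return "continue_monitoring"
```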

Alongside standards, robust enforcement mechanisms should be enshrined in national legislation to ensure leading AI corporations comply with appropriate safety standards. To enable adequate enforcement, governments should create national AI agencies with the authority to initiate enforcement action. National authorities also have a role in mandating private, third-party actors to audit the most capable AI systems and to put arrangements in place that minimise conflicts of interest. Moreover, the adaptation of national liability law to AI can help dissuade corporate leaders from taking excessive risks.

Governments should also improve their institutional understanding of the key risks. On the one hand, and especially in countries with leading AI corporations, mandatory information-sharing regimes should be put in place. Continual information-sharing will provide governments with insight into development processes, compute usage, and model capabilities and grant governments early access to models for testing purposes. Furthermore, a global AI incident database should be created to monitor recorded harms.

On the other hand, all governments should expand academic research on AI safety. Additional funding is needed both to increase the number of scientists working on the problem and to expand the computational resources available to safety researchers. This will empower publicly funded academics to conduct the type of safety research (on large-scale models) that has recently become the exclusive preserve of the private sector. The establishment of an international research institution for AI safety, similar to CERN, should be seriously considered.

Finally, governments should establish hardware governance regimes. Giant data centers with several thousand cutting-edge AI chips are needed to develop the most capable systems. This physical infrastructure represents the most tractable bottleneck for government intervention. As a first step, large domestic AI compute concentrations and their highly centralised global supply chains need to be mapped. Additionally, reporting requirements for large training runs should be introduced for monitoring purposes. To substantially reduce the risk of catastrophic accidents, licensing regimes for large-scale training runs must be developed to ensure requesting entities can demonstrate compliance with the required safety precautions.
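
As a rough illustration of how such a regime might operate in practice, the sketch below flags compute clusters that would need to be registered and training runs that would need a licence. The thresholds, field names, and functions are hypothetical assumptions, not values drawn from any existing statute or proposal.

```python
# Illustrative sketch only: hypothetical thresholds and field names.
from dataclasses import dataclass

REPORTING_THRESHOLD_CHIPS = 1_000   # hypothetical cluster size above which mapping/registration applies
LICENSING_THRESHOLD_FLOP = 1e25     # hypothetical training-compute level above which a licence is required


@dataclass
class ComputeCluster:
    operator: str
    accelerator_count: int


def must_be_registered(cluster: ComputeCluster) -> bool:
    """Step 1: map large domestic compute concentrations."""
    return cluster.accelerator_count >= REPORTING_THRESHOLD_CHIPS


def licence_decision(planned_training_flop: float, safety_case_approved: bool) -> str:
    """Step 2: license large training runs, conditional on demonstrated safety precautions."""
    if planned_training_flop < LICENSING_THRESHOLD_FLOP:
        return "no licence required (below threshold; reporting may still apply)"
    return "licence granted" if safety_case_approved else "licence denied"
```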


Recommended Experts

Tackling these new challenges will require governments to build considerable expertise. Below is a list of suggested experts to involve in preparatory meetings for the summit at expert level, and in eventual post-summit working groups.

International institutions for the governance of advanced AI

Professor Robert Trager (University of California, Los Angeles),
Professor Duncan Snidal (University of Oxford),
Dr. Allan Dafoe (Google DeepMind),
Mustafa Suleyman (Inflection AI)

Auditing regimes for high-risk AI systems

Professor Ellen P. Goodman (Rutgers Law School),
Dr. Paul Christiano (Alignment Research Center),
Markus Anderljung (Center for the Governance of AI),
Elizabeth Barnes (ARC Evals)

Hardware governance

Professor Anthony Aguirre (University of California, Santa Cruz),
Dr. Jess Whittlestone (Centre for Long-Term Resilience),
Lennart Heim (Center for the Governance of AI),
Dr. Shahar Avin (Centre for the Study of Existential Risk, University of Cambridge)


↩ 1 See the OECD AI Policy Observatory for an overview.

FLI AI Act Trilogues https://futureoflife.org/document/fli-ai-act-trilogues/ Tue, 04 Jul 2023 20:12:49 +0000 https://futureoflife.org/?post_type=document&p=118577 Policymaking In The Pause https://futureoflife.org/document/policymaking-in-the-pause/ Wed, 12 Apr 2023 20:30:13 +0000 https://futureoflife.org/?post_type=document&p=118581

“The time for saying that this is just pure research has long since passed. […] It’s in no country’s interest for any country to develop and release AI systems we cannot control. Insisting on sensible precautions is not anti-industry. Chernobyl destroyed lives, but it also decimated the global nuclear industry. I’m an AI researcher. I do not want my field of research destroyed. Humanity has much to gain from AI, but also everything to lose.”

Stuart Russell, Smith-Zadeh Chair in Engineering and Professor of Computer Science at the University of California, Berkeley, Founder of the Center for Human Compatible Artificial Intelligence (CHAI).

“Let’s slow down. Let’s make sure that we develop better guardrails, let’s make sure that we discuss these questions internationally just like we’ve done for nuclear power and nuclear weapons. Let’s make sure we better understand these very large systems, that we improve on their robustness and the process by which we can audit them and verify that they are safe for the public.”

Yoshua Bengio, Scientific Director of the Montreal Institute for Learning Algorithms (MILA), Professor of Computer Science and Operations Research at the Université de Montréal, 2018 ACM A.M. Turing Award Winner.

“We have a perfect storm of corporate irresponsibility, widespread adoption, lack of regulation and a huge number of unknowns. [The letter] shows how many people are deeply worried about what is going on. I think it is a really important moment in the history of AI – and maybe humanity.”

Gary Marcus, Professor Emeritus of Psychology and Neural Science at New York University, Founder of Geometric Intelligence

“It feels like we are moving too quickly. I think it is worth getting a little bit of experience with how they can be used and misused before racing to build the next one. This shouldn’t be a race to build the next model and get it out before others.”

Peter Stone, Professor at the University of Texas at Austin, Chair of the One Hundred Year Study on AI.

“We don’t know what these systems are trained on or how they are being built. All of this happens behind closed doors at commercial companies. This is worrying.”

Catelijne Muller, President of ALLAI, Member of the EU High Level Expert Group on AI

“Those making these have themselves said they could be an existential threat to society and even humanity, with no plan to totally mitigate these risks. It is time to put commercial priorities to the side and take a pause for the good of everyone to assess rather than race to an uncertain future”

Emad Mostaque, Founder and CEO of Stability AI

Introduction  

Prominent AI researchers have identified a range of dangers that may arise from the present and future generations of advanced AI systems if they are left unchecked. AI systems are already capable of creating misinformation and authentic-looking fakes that degrade the shared factual foundations of society and inflame political tensions.1 AI systems already show a tendency toward amplifying entrenched discrimination and biases, further marginalizing disadvantaged communities and diverse viewpoints.2 The current, frantic rate of development will worsen these problems significantly.

As these types of systems become more sophisticated, they could destabilize labor markets and political institutions, and lead to the concentration of enormous power in the hands of a small number of unelected corporations. Advanced AI systems could also threaten national security, e.g., by facilitating the inexpensive development of chemical, biological, and cyber weapons by non-state groups. The systems could themselves pursue goals, either human- or self-assigned, in ways that place negligible value on human rights, human safety, or, in the most harrowing predictions, human existence.3

In an effort to stave off these outcomes, the Future of Life Institute (FLI), joined by over 15,000 leading AI researchers, professors, CEOs, engineers, students, and others on the frontline of AI progress, called for a pause of at least six months on the riskiest and most resource-intensive AI experiments – those experiments seeking to further scale up the size and general capabilities of the most powerful systems developed to date.4

The proposed pause provides time to better understand these systems, to reflect on their ethical, social, and safety implications, and to ensure that AI is developed and used in a responsible manner. The unchecked competitive dynamics in the AI industry incentivize aggressive development at the expense of caution5. In contrast to the breakneck pace of development, however, the levers of governance are generally slow and deliberate. A pause on the production of even more powerful AI systems would thus provide an important opportunity for the instruments of governance to catch up with the rapid evolution of the field.

We have called on AI labs to institute a development pause until they have protocols in place to ensure that their systems are safe beyond a reasonable doubt. Regardless of whether the labs heed our call, this policy brief provides policymakers with concrete recommendations for how governments can manage AI risks.

The recommendations are by no means exhaustive: the project of AI governance is a long game and will extend far beyond any pause. Nonetheless, implementing these recommendations, which largely reflect a broader consensus among AI policy experts, will establish a strong governance foundation for AI.


Policy recommendations

  1. Mandate robust third-party auditing and certification.
  2. Regulate access to computational power.
  3. Establish capable AI agencies at the national level.
  4. Establish liability for AI-caused harms.
  5. Introduce measures to prevent and track AI model leaks.
  6. Expand technical AI safety research funding.
  7. Develop standards for identifying and managing AI-generated content and recommendations.

To coordinate, collaborate, or inquire regarding the recommendations herein, please contact us at policy@futureoflife.org.


1. Mandate robust third-party auditing and certification for specific AI systems

For some types of AI systems, the potential to impact the physical, mental, and financial wellbeing of individuals, communities, and society is obvious. For example, a credit scoring system could discriminate against certain ethnic groups. For other systems – in particular general-purpose AI systems6 – the applications and potential risks are often not immediately evident. General-purpose AI systems trained on massive datasets also have unexpected (and often unknown) emergent capabilities.7

In Europe, the draft AI Act already requires that, prior to deployment and upon any substantial modification, ‘high-risk’ AI systems undergo ‘conformity assessments’ in order to certify compliance with specified harmonized standards or other common specifications.8 In some cases, the Act requires such assessments to be carried out by independent third-parties to avoid conflicts of interest.  

In contrast, the United States has thus far established only a general, voluntary framework for AI risk assessment.9 The National Institute of Standards and Technology  (NIST) is developing so-called ‘profiles’ that will provide specific risk assessment and mitigation guidance for certain types of AI systems, but these profiles still allow organizations simply to ‘accept’ the risks that they create for society instead of addressing them. In other words, the United States does not require any third-party risk assessment or risk mitigation measures before a powerful AI system can be deployed at scale.

To ensure proper vetting of powerful AI systems before deployment, we recommend a robust independent auditing regime for models that are general-purpose, trained on large amounts of compute, or intended for use in circumstances likely to impact the rights or the wellbeing of individuals, communities, or society. This mandatory third-party auditing and certification scheme could be derived from the EU’s proposed ‘conformity assessments’ and should be adopted by jurisdictions worldwide10.

In particular, we recommend third-party auditing of such systems across a range of benchmarks for the assessment of risks11, including possible weaponization12 and unethical behaviors13, and mandatory certification by accredited third-party auditors before these high-risk systems can be deployed. Certification should only be granted if the developer of the system can demonstrate that appropriate measures have been taken to mitigate risk, and that any residual risks deemed tolerable are disclosed and are subject to established protocols for minimizing harm.

2. Regulate organizations’ access to computational power  

At present, the most advanced AI systems are developed through training that requires an enormous amount of computational power – ‘compute’ for short. The amount of compute used to train a general-purpose system largely correlates with its capabilities, as well as the magnitude of its risks.

Today’s most advanced models, like OpenAI’s GPT-4 or Google’s PaLM, can only be trained with thousands of specialized chips running over a period of months. While chip innovation and better algorithms will reduce the resources required in the future, training the most powerful AI systems will likely remain prohibitively expensive to all but the best-resourced players.

Figure 1. OpenAI is estimated to have used approximately 700% more compute to train GPT-4 than was used for the next closest model (Minerva, DeepMind), and 7,000% more compute than was used to train GPT-3 (Davinci). Depicted above is an estimate of compute used to train GPT-4 calculated by Ben Cottier at Epoch, as official training compute details for GPT-4 have not been released. Data from: Sevilla et al., ‘Parameter, Compute and Data Trends in Machine Learning’, 2021.

In practical terms, compute is more easily monitored and governed than other AI inputs, such as talent, data, or algorithms. It can be measured relatively easily, and the supply chain for advanced AI systems is highly centralized, which means governments can leverage these characteristics to limit the harms of large-scale models.14
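
To illustrate why compute is comparatively monitorable, the sketch below estimates the training compute of a run from records a data centre already keeps (chip counts, utilisation, and duration). The formula is a rough approximation and the figures in the example are hypothetical, not an official monitoring methodology.

```python
# Illustrative sketch only: a back-of-the-envelope training-compute estimate
# from data-centre usage records. All example figures are hypothetical.

def estimated_training_flop(num_chips: int,
                            peak_flop_per_s_per_chip: float,
                            utilisation: float,
                            days: float) -> float:
    """Rough estimate: chips x peak throughput x realised utilisation x wall-clock seconds."""
    return num_chips * peak_flop_per_s_per_chip * utilisation * days * 86_400


# Hypothetical run: 10,000 accelerators at 3e14 FLOP/s peak, 40% utilisation, 90 days.
print(f"{estimated_training_flop(10_000, 3e14, 0.4, 90):.2e} FLOP")  # prints ~9.33e+24 FLOP
```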

To prevent reckless training of the highest risk models, we recommend that governments make access to large amounts of specialized computational power for AI conditional upon the completion of a comprehensive risk assessment. The risk assessment should include a detailed plan for minimizing risks to individuals, communities, and society, consider downstream risks in the value chain, and ensure that the AI labs conduct diligent know-your-customer checks.

Successful implementation of this recommendation will require governments to monitor the use of compute at data centers within their respective jurisdictions.14.5 The supply chains for AI chips and other key components for high-performance computing will also need to be regulated such that chip firmware can alert regulators to unauthorized large training runs of advanced AI systems.15  

Through passage of the CHIPS and Science Act of 2022, the United States has instituted licensing requirements15.5 for export of many of these components in an effort to monitor and control their global distribution. However, licensing is only required when exporting to certain destinations, limiting the capacity to monitor the aggregation of equipment for unauthorized large training runs within the United States and outside the scope of the export restrictions. Companies within the specified destinations have also successfully skirted monitoring by training AI systems using compute leased from cloud providers.16 We recommend expansion of know-your-customer requirements to all high-volume suppliers of high-performance computing components, as well as providers that permit access to large amounts of cloud compute.
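
A minimal sketch of what an expanded know-your-customer regime could compute is shown below: aggregating per-customer compute consumption across providers and flagging accounts above a review threshold. The threshold, record format, and function name are hypothetical assumptions for illustration only.

```python
# Illustrative sketch only: hypothetical aggregation of customer compute usage
# across cloud providers for know-your-customer screening.
from collections import defaultdict

AGGREGATE_REVIEW_THRESHOLD_FLOP = 1e24   # hypothetical trigger for enhanced due diligence

# usage_reports: (customer_id, provider, flop_consumed) records filed by providers
def customers_needing_review(usage_reports: list[tuple[str, str, float]]) -> set[str]:
    totals: dict[str, float] = defaultdict(float)
    for customer_id, _provider, flop in usage_reports:
        totals[customer_id] += flop
    return {cid for cid, total in totals.items() if total >= AGGREGATE_REVIEW_THRESHOLD_FLOP}
```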

3. Establish capable AI agencies at national level   

AI is developing at a breakneck pace and governments need to catch up. The establishment of AI regulatory agencies helps to consolidate expertise and reduces the risk of a patchwork approach.

The UK has already established an Office for Artificial Intelligence and the EU is currently legislating for an AI Board. Similarly, in the US, Representative Ted Lieu has announced legislation to create a non-partisan AI Commission with the aim of establishing a regulatory agency. These efforts need to be sped up and taken up around the world.

We recommend that national AI agencies be established in line with a blueprint17 developed by Anton Korinek at Brookings. Korinek proposes that an AI agency have the power to:

  1. Monitor public developments in AI progress and define a threshold for which types of advanced AI systems fall under the regulatory oversight of the agency (e.g. systems above a certain level of compute or that affect a particularly large group of people).
  2. Mandate impact assessments of AI systems on various stakeholders, define reporting requirements for advanced AI companies and audit the impact on people’s rights, wellbeing, and society at large. For example, in systems used for biomedical research, auditors would be asked to evaluate the potential for these systems to create new pathogens.
  3. Establish enforcement authority to act upon risks identified in impact assessments and to prevent abuse of AI systems.
  4. Publish generalized lessons from the impact assessments such that consumers, workers and other AI developers know what problems to look out for. This transparency will also allow academics to study trends and propose solutions to common problems.

Beyond this blueprint, we also recommend that national agencies around the world mandate record-keeping of AI safety incidents, such as when a facial recognition system causes the arrest of an innocent person. Existing examples of such record-keeping include the non-profit AI Incident Database and the forthcoming EU AI Database created under the European AI Act.18
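
As an illustration of what such record-keeping might capture, here is a minimal, hypothetical incident-record schema. The field names are assumptions chosen for illustration and are not taken from the AI Incident Database or the proposed EU database.

```python
# Illustrative sketch only: a hypothetical minimal schema for a national AI incident register.
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AIIncidentRecord:
    incident_id: str
    reported_on: date
    system_name: str                   # e.g. a facial recognition product
    deployer: str
    harm_description: str              # e.g. "wrongful arrest following a false match"
    affected_persons: int
    severity: str                      # e.g. "minor" | "serious" | "catastrophic"
    corrective_actions: list[str] = field(default_factory=list)
```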

4. Establish liability for AI-caused harm   

AI systems present a unique challenge in assigning liability. In contrast to typical commercial products or traditional software, AI systems can perform in ways that are not well understood by their developers, can learn and adapt after they are sold and are likely to be applied in unforeseen contexts. The ability for AI systems to interact with and learn from other AI systems is expected to expedite the emergence of unanticipated behaviors and capabilities, especially as the AI ecosystem becomes more expansive and interconnected.

Several plug-ins have already been developed that allow AI systems like ChatGPT to perform tasks through other online services (e.g. ordering food delivery, booking travel, making reservations), broadening the range of potential real-world harms that can result from their use and further complicating the assignment of liability.19 OpenAI’s GPT-4  system card references an instance of the system explicitly deceiving a human into bypassing a CAPTCHA bot-detection system using TaskRabbit, a service for soliciting freelance labor.20

When such systems make consequential decisions or perform tasks that cause harm, assigning responsibility for that harm is a complex legal challenge. Is the harmful decision the fault of the AI developer, deployer, owner, end-user, or the AI system itself?

Key among measures to better incentivize responsible AI development is a coherent liability framework that allows those who develop and deploy these systems to be held responsible for resulting harms. Such a proposal should impose a financial cost for failing to exercise necessary diligence in identifying and mitigating risks, shifting profit incentives away from reckless empowerment of poorly-understood systems toward emphasizing the safety and wellbeing of individuals, communities, and society as a whole.

To provide the necessary financial incentives for profit-driven AI developers to exercise abundant caution, we recommend the urgent adoption of a framework for liability for AI-derived harms. At a minimum, this framework should hold developers of general-purpose AI systems and AI systems likely to be deployed for critical functions21 strictly liable for resulting harms to individuals, property, communities, and society. It should also allow for joint and several liability for developers and downstream deployers when deployment of an AI system that was explicitly or implicitly authorized by the developer results in harm.

5. Introduce measures to prevent and track AI model leaks   

Commercial actors may not have sufficient incentives to protect their models, and their cyberdefense measures can often be insufficient. In early March 2023, Meta demonstrated that this is not a theoretical concern when its model known as LLaMA was leaked to the internet.22 As of the date of this publication, Meta has been unable to determine who leaked the model. The leak allowed anyone to copy the model and represented the first time that a major tech firm’s restricted-access large language model was released to the public.

Watermarking of AI models provides effective protection against theft, illegitimate redistribution, and unauthorized application, because it enables legal action against identifiable leakers. Many digital media are already protected by watermarking – for example through the embedding of company logos in images or videos. A similar process23 can be applied to advanced AI models, either by inserting information directly into the model parameters or by training the model on specific trigger data.
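
A minimal sketch of how a trigger-data watermark might be verified is given below, assuming the developer retains a secret set of trigger prompts and the idiosyncratic responses the model was trained to produce on them. The `query_model` callable and the exact-match rule are hypothetical stand-ins, not a reference to any existing watermarking library.

```python
# Illustrative sketch only: verifying a hypothetical trigger-data watermark.
from typing import Callable


def watermark_match_rate(query_model: Callable[[str], str],
                         secret_triggers: dict[str, str]) -> float:
    """Fraction of secret trigger prompts on which a suspect model reproduces the
    watermarked responses. A rate far above chance suggests the model is a copy
    of the watermarked original."""
    if not secret_triggers:
        return 0.0
    hits = sum(1 for prompt, expected in secret_triggers.items()
               if query_model(prompt).strip() == expected)
    return hits / len(secret_triggers)
```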

We recommend that governments mandate watermarking for AI models, which will make it easier for AI developers to take action against illegitimate distribution.

6. Expand technical AI safety research funding   

The private sector under-invests in research that ensures that AI systems are safe and secure. Despite nearly USD 100 billion of private investment in AI in 2022 alone, it is estimated that only about 100 full-time researchers worldwide are specifically working to ensure AI is safe and properly aligned with human values and intentions.24

In recent months, companies developing the most powerful AI systems have either downsized or entirely abolished their respective ‘responsible AI’ teams.25 While this partly reflects a broader trend of mass layoffs across the technology sector, it nonetheless reveals the relative de-prioritization of safety and ethics considerations in the race to put new systems on the market.

Governments have also invested in AI safety and ethics research, but these investments have primarily focused on narrow applications rather than on the impact of more general AI systems like those that have recently been released by the private sector. The US National Science Foundation (NSF), for example, has established ‘AI Research Institutes’ across a broad range of disciplines. However, none of these institutes are specifically working on the large-scale, societal, or aggregate risks presented by powerful AI systems.

To ensure that our capacity to control AI systems keeps pace with the growing risk that they pose, we recommend a significant increase in public funding for technical AI safety research in the following research domains:

  • Alignment: development of technical mechanisms for ensuring AI systems learn and perform in accordance with human intentions and values.
  • Robustness and assurance: design features to ensure that AI systems responsible for critical functions26 can perform reliably in unexpected circumstances, and that their performance can be evaluated by their operators.
  • Explainability and interpretability: development of mechanisms for opaque models to report the internal logic used to produce output or make decisions in understandable ways. More explainable and interpretable AI systems facilitate better evaluations of whether output can be trusted.

In the past few months, experts such as the former Special Advisor to the UK Prime Minister on Science and Technology James W. Phillips27 and a Congressionally-established US taskforce have called for the creation of national AI labs as ‘a shared research infrastructure that would provide AI researchers and students with significantly expanded access to computational resources, high-quality data, educational tools, and user support.’28 Should governments move forward with this concept, we propose that at least 25% of resources made available through these labs be explicitly allocated to technical AI safety projects.

7. Develop standards for identifying and managing AI-generated content and recommendations

The need to distinguish real from synthetic media and factual content from ‘hallucinations’ is essential for maintaining the shared factual foundations underpinning social cohesion. Advances in generative AI have made it more difficult to distinguish between AI-generated media and real images, audio, and video recordings. Already we have seen AI-generated voice technology used in financial scams.29

Creators of the most powerful AI systems have acknowledged that these systems can produce convincing textual responses that rely on completely fabricated or out-of-context information.30 For society to absorb these new technologies, we will need effective tools that allow the public to evaluate the authenticity and veracity of the content they consume.

We recommend increased funding for research into techniques, and development of standards, for digital content provenance. This research, and its associated standards, should ensure that a reasonable person can determine whether content published online is of synthetic or natural origin, and whether the content has been digitally modified, in a manner that protects the privacy and expressive rights of its creator.  
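
As a simplified illustration of content provenance, the sketch below signs a content hash into a small manifest that a recipient can later verify. It is a bare-bones example under stated assumptions (a shared signing key and made-up field names), not an implementation of C2PA or any other existing standard.

```python
# Illustrative sketch only: a hypothetical, simplified provenance manifest.
import hashlib
import hmac
import json


def make_manifest(content: bytes, creator: str, generator: str, signing_key: bytes) -> dict:
    """Bind a creator and generating tool to a hash of the content, then sign the claim."""
    claim = {"creator": creator, "generator": generator, "sha256": hashlib.sha256(content).hexdigest()}
    signature = hmac.new(signing_key, json.dumps(claim, sort_keys=True).encode(), "sha256").hexdigest()
    return {"claim": claim, "signature": signature}


def verify_manifest(content: bytes, manifest: dict, signing_key: bytes) -> bool:
    """Check that the claim was signed with the key and that the content is unmodified."""
    claim = manifest["claim"]
    expected_sig = hmac.new(signing_key, json.dumps(claim, sort_keys=True).encode(), "sha256").hexdigest()
    return (hmac.compare_digest(expected_sig, manifest["signature"])
            and hashlib.sha256(content).hexdigest() == claim["sha256"])
```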

We also recommend the expansion of ‘bot-or-not’ laws that require disclosure when a person is interacting with a chatbot. These laws help prevent users from being deceived or manipulated by AI systems impersonating humans, and facilitate contextualizing the source of the information. The draft EU AI Act requires that AI systems be designed such that users are informed they are interacting with an AI system,31 and the US State of California enacted a similar bot disclosure law in 2019.32 Almost all of the world’s nations, through the adoption of a UNESCO agreement on the ethics of AI, have recognized33 ‘the right of users to easily identify whether they are interacting with a living being, or with an AI system imitating human or animal characteristics.’ We recommend that all governments convert this agreement into hard law to avoid fraudulent representations of natural personhood by AI from outside regulated jurisdictions.

Even if a user knows they are interacting with an AI system, they may not know when that system is prioritizing the interests of the developer or deployer over the user. These systems may appear to be acting in the user’s interest, but could be designed or employed to serve other functions.  For instance, the developer of a general-purpose AI system could be financially incentivized to design the system such that when asked about a product, it preferentially recommends a certain brand, when asked to book a flight, it subtly prefers a certain airline, when asked for news, it provides only media advocating specific viewpoints, and when asked for medical advice, it prioritizes diagnoses that are treated with more profitable pharmaceutical drugs. These preferences could in many cases come at the expense of the end user’s mental, physical, or financial well-being.

Many jurisdictions require that sponsored content be clearly labeled, but because the provenance of output from complex general-purpose AI systems is remarkably opaque, these laws may not apply. We therefore recommend, at a minimum, that conflict-of-interest trade-offs should be clearly communicated to end users along with any affected output; ideally, laws and industry standards should be implemented that require AI systems to be designed and deployed with a duty to prioritize the best interests of the end user.

Finally, we recommend the establishment of laws and industry standards clarifying, and requiring the fulfillment of, ‘duty of loyalty’ and ‘duty of care’ obligations when AI is used in place of, or in assistance to, a human fiduciary. In some circumstances – for instance, financial advice and legal counsel – human actors are legally obligated to act in the best interest of their clients and to exercise due care to minimize harmful outcomes. AI systems are increasingly being deployed to advise on these types of decisions or to make them (e.g. trading stocks) independent of human input. Laws and standards towards this end should require that if an AI system is to contribute to the decision-making of a fiduciary, the fiduciary must be able to demonstrate beyond a reasonable doubt that the AI system will observe duties of loyalty and care comparable to those of its human counterparts. Otherwise, any breach of these fiduciary responsibilities should be attributed to the human fiduciary employing the AI system.

Conclusion

The new generation of advanced AI systems is unique in that it presents significant, well-documented risks, but can also manifest high-risk capabilities and biases that are not immediately apparent. In other words, these systems may perform in ways that their developers had not anticipated or malfunction when placed in a different context. Without appropriate safeguards, these risks are likely to result in substantial harm, in both the near- and longer-term, to individuals, communities, and society.

Historically, governments have taken critical action to mitigate risks when confronted with emerging technology that, if mismanaged, could cause significant harm. Nations around the world have employed both hard regulation and international consensus to ban the use and development of biological weapons, pause human genetic engineering, and establish robust government oversight for introducing new drugs to the market. All of these efforts required swift action to slow the pace of development, at least temporarily, and to create institutions that could realize effective governance appropriate to the technology. Humankind is much safer as a result.

We believe that approaches to advancement in AI R&D that preserve safety and benefit society are possible, but require decisive, immediate action by policymakers, lest the pace of technological evolution exceed the pace of cautious oversight. A pause in development at the frontiers of AI is necessary to mobilize the instruments of public policy toward common-sense risk mitigation. We acknowledge that the recommendations in this brief may not be fully achievable within a six month window, but such a pause would hold the moving target still and allow policymakers time to implement the foundations of good AI governance.

The path forward will require coordinated efforts by civil society, governments, academia, industry, and the public. If this can be achieved, we envision a flourishing future where responsibly developed AI can be utilized for the good of all humanity.


↩ 1 See, e.g., Steve Rathje, Jay J. Van Bavel, & Sander van der Linden, ‘Out-group animosity drives engagement on social media,’ Proceedings of the National Academy of Sciences, 118 (26) e2024292118, Jun. 23, 2021, and Tiffany Hsu & Stuart A. Thompson, ‘Disinformation Researchers Raise Alarms About A.I. Chatbots,’ The New York Times, Feb. 8, 2023

↩ 2 See, e.g., Abid, A., Farooqi, M. and Zou, J. (2021a), ‘Large language models associate Muslims with violence’, Nature Machine Intelligence, Vol. 3, pp. 461–463.

↩ 3 In a 2022 survey of over 700 leading AI experts, nearly half of respondents gave at least a 10% chance of the long-run effect of advanced AI on humanity being ‘extremely bad,’ at the level of ‘causing human extinction or similarly permanent and severe disempowerment of the human species.’

↩ 4 Future of Life Institute, ‘Pause Giant AI Experiments: An Open Letter,’ Mar. 22, 2023.

↩ 5 Recent news about AI labs cutting ethics teams suggests that companies are failing to prioritize the necessary safeguards.

↩ 6 The Future of Life Institute has previously defined “general-purpose AI system” to mean ‘an AI system that can accomplish or be adapted to accomplish a range of distinct tasks, including some for which it was not intentionally and specifically trained.’

↩ 7 Samuel R. Bowman, ’Eight Things to Know about Large Language Models,’ ArXiv Preprint, Apr. 2, 2023.

↩ 8 Proposed EU Artificial Intelligence Act, Article 43.1b.

↩ 9 National Institute of Standards and Technology, ‘Artificial Intelligence Risk Management Framework (AI RMF 1.0),’ U.S. Department of Commerce, Jan. 2023.

↩ 10 International standards bodies such as IEC, ISO and ITU can also help in developing standards that address risks from advanced AI systems, as they have highlighted in response to FLI’s call for a pause.

↩ 11 See, e.g., the Holistic Evaluation of Language Models approach by the Center for Research on Foundation Models: Rishi Bommasani, Percy Liang, & Tony Lee, ‘Language Models are Changing AI: The Need for Holistic Evaluation’.

↩ 12 OpenAI described weaponization risks of GPT-4 on p.12 of the “GPT-4 System Card.”

↩ 13 See, e.g., the following benchmark for assessing adverse behaviors including power-seeking, disutility, and ethical violations:  Alexander Pan, et al., “Do the Rewards Justify the Means? Measuring Trade-offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark,” ArXiv Preprint, Apr. 6, 2023.

↩ 14 Jess Whittlestone et al., ‘Future of compute review – submission of evidence’, Aug. 8, 2022.

↩ 14.5 Please see fn. 14 for a detailed proposal for government compute monitoring as drafted by the Centre for Long-Term Resilience and several staff members of AI lab Anthropic.

↩ 15 Yonadav Shavit at Harvard University has proposed a detailed system for how governments can place limits on how and when AI systems get trained.

↩ 15.5 Bureau of Industry and Security, Department of Commerce, ‘Implementation of Additional Export Controls: Certain Advanced Computing and Semiconductor Manufacturing Items; Supercomputer and Semiconductor End Use; Entity List Modification‘, Federal Register, Oct. 14, 2022.

↩ 16 Eleanor Olcott, Qianer Liu, & Demetri Sevastopulo, ‘Chinese AI groups use cloud services to evade US chip export control,’ Financial Times, Mar. 9, 2023.

↩ 17 Anton Korinek, ‘Why we need a new agency to regulate advanced artificial intelligence: Lessons on AI control from the Facebook Files,’ Brookings, Dec. 8 2021.

↩ 18 Proposed EU Artificial Intelligence Act, Article 60.

↩ 19 Will Knight & Khari Johnson, ‘Now That ChatGPT is Plugged In, Things Could Get Weird,’ Wired, Mar. 28, 2023. 

↩ 20 OpenAI, ‘GPT-4 System Card,’ Mar. 23, 2023, p.15.

↩ 21 I.e., functions that could materially affect the wellbeing or rights of individuals, communities, or society.

↩ 22 Joseph Cox, “Facebook’s Powerful Large Language Model Leaks Online,” VICE, Mar. 7, 2023.

↩ 23 For a systematic overview of how watermarking can be applied to AI models, see: Franziska Boenisch, ‘A Systematic Review on Model Watermarking of Neural Networks,’ Front. Big Data, Sec. Cybersecurity & Privacy, Vol. 4, Nov. 29, 2021.

↩ 24 This figure, drawn from ‘The AI Arms Race is Changing Everything’ (Andrew R. Chow & Billy Perrigo, TIME, Feb. 16, 2023), likely represents a lower bound for the estimated number of AI safety researchers. Another resource posits a significantly higher number of workers in the AI safety space, but includes in its estimate all workers affiliated with organizations that engage in AI safety-related activities. Even if a worker has no involvement with an organization’s AI safety work or research efforts in general, they may still be included in the latter estimate.

↩ 25 Christine Criddle & Madhumita Murgia, ‘Big tech companies cut AI ethics staff, raising safety concerns,’ Financial Times, Mar. 29, 2023.

↩ 26 See fn. 21, supra.

↩ 27 Original call for a UK government AI lab is set out in this article.

↩ 28 For the taskforce’s detailed recommendations, see: ‘Strengthening and Democratizing the U.S. Artificial Intelligence Innovation Ecosystem: An Implementation Plan for a National Artificial Intelligence Research Resource,’ National Artificial Intelligence Research Resource Task Force Final Report, Jan. 2023.

↩ 29 Pranshu Verma, ‘They thought loved ones were calling for help. It was an AI scam.‘ The Washington Post, Mar. 5, 2023.

↩ 30 Tiffany Hsu & Stuart A. Thompson, ‘Disinformation Researchers Raise Alarms About A.I. Chatbots,’ The New York Times, Feb. 8, 2023 .

↩ 31 Proposed EU Artificial Intelligence Act, Article 52.

↩ 32 SB 1001 (Hertzberg, Ch. 892, Stats. 2018).

↩ 33 Recommendation 125, ‘Outcome document: first draft of the Recommendation on the Ethics of Artificial Intelligence,’ UNESCO, Sep. 7, 2020, p. 21.
