The EU Artificial Intelligence Act (AI Act) introduces specific rules for large, general AI models that can serve many purposes. These models are often called foundation models in the AI community – think of powerful AI systems like GPT language models or image generators. In the AI Act, they’re officially termed “general-purpose AI models” (GPAI models). This article explains what that means, how the Act regulates such models (including key provisions such as the definitions in Article 3 and the dedicated chapter on general-purpose AI models, Chapter V, which grew out of the draft Article 52a), and what obligations providers of these models must follow. We’ll break down the legal terms into plain language and give examples for clarity.
General-Purpose AI Models vs. Foundation Models (Definitions)
General-Purpose AI (GPAI) Model: The AI Act defines a GPAI model as “an AI model… that displays significant generality and is capable of competently performing a wide range of distinct tasks… and that can be integrated into a variety of downstream systems or applications.” In essence, these are versatile AI models trained on very large data sets (often via self-supervision) to handle many different tasks. Crucially, the definition excludes AI models used for research, development or prototyping activities before they are placed on the market – only models actually made available on the market are covered.
Foundation Model: “Foundation model” is the popular term in tech for these broad AI models, and in practice it refers to the same concept the Act calls a GPAI model. For example, a large language model like OpenAI’s GPT-4 or an image-generating model like DALL·E 2 would be considered a foundation model – and thus a GPAI model under the Act. These models are “foundational” because they serve as base models that can be adapted or fine-tuned for a wide array of applications (chatbots, writing assistants, image creation tools, etc.). As IBM’s explainer puts it, a foundation model is a GPAI model; a chatbot or other AI application built on that model would be considered an AI system under the Act. In summary, foundation models = GPAI models in the AI Act’s language: large-scale, general-purpose AI trained for broad versatility.
Illustrative Examples: Some well-known GPAI/foundation models include large language models like GPT-3/GPT-4, Google’s PaLM, and open-source models like Meta’s LLaMA. Image generators such as OpenAI’s DALL·E 2 or Stable Diffusion are also foundation models, since they can create novel images for countless purposes. Code-generation models such as OpenAI’s Codex (itself derived from GPT-3) show that the same idea extends beyond natural language. These examples illustrate the range: from text to images to code and multimodal capabilities, foundation models are AI models with broad skills that can be plugged into many downstream uses.
How the AI Act Treats General-Purpose & Foundation Models
Special Category in the AI Act: The EU AI Act was designed around a risk-based approach that mainly targets specific AI systems (applications like medical devices, hiring tools, etc.). General-purpose AI models didn’t fit neatly into that scheme because they aren’t tied to one use – they can be used in many applications, some risky, some not. To address this, lawmakers added special provisions for GPAI models (foundation models). Chapter V of the AI Act is dedicated to rules for general-purpose AI models. (In earlier drafts these rules were introduced via an Article 52a, which evolved into the final Chapter V of the Act.)
Definitions in Law: Article 3 of the Act contains the GPAI model definition (discussed above), legally pinpointing what counts as a general-purpose model. The Act also introduces concepts specific to these models, such as “high-impact capabilities” and “systemic risk.” In simple terms, systemic risk refers to the scenario where an extremely advanced, widely used general-purpose model could have large-scale impacts on society or the economy – for example, if a model is so powerful and pervasive that its failures or misuse could cause widespread harm. The concern is that such top-tier models (sometimes dubbed “high-impact” foundation models) might enable serious problems like sophisticated disinformation, cyberattacks, or major bias and safety issues.
Two-Tier Approach – “Generic” vs “Systemic” Models: The AI Act distinguishes between two categories of GPAI models:
- Regular General-Purpose AI Models: These are foundation models that do not meet the threshold of “systemic risk.” They are subject to a set of baseline obligations (described in the next section). This covers the majority of foundation models placed on the market.
- GPAI Models with “Systemic Risk”: These are the high-impact foundation models considered so capable and widely used that they present heightened risks at the EU level. Article 51 of the Act sets out how a general-purpose AI model is classified as having systemic risk. One benchmark is the amount of computational power used for training: if the cumulative training compute exceeds roughly 10^25 FLOPs, the model is presumed to have “high-impact capabilities,” suggesting it is among the most advanced of its kind. The classification isn’t decided by size alone – the provider can contest the presumption, and the Commission, supported by the AI Office and scientific experts, can also designate a model based on its broader impact and capabilities. In essence, this category is meant to capture the very powerful foundation models (think of cutting-edge models that rival the most advanced known AI). A minimal sketch of this compute-based test appears after this list.
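To make the two tiers more concrete, here is a minimal Python sketch of the compute-based presumption described above. The constant reflects the 10^25 FLOPs figure mentioned in the list; the function names and obligation labels are informal shorthand invented for this illustration, not terms from the Act, and the real classification depends on the Commission’s broader assessment rather than a single number.

```python
# Minimal sketch of the compute-based presumption discussed above.
# All names here are hypothetical; the legal test is richer than this.

SYSTEMIC_RISK_COMPUTE_THRESHOLD_FLOPS = 1e25  # figure cited in the article

def presumed_high_impact(training_compute_flops: float) -> bool:
    """True if cumulative training compute exceeds the threshold,
    which creates a (rebuttable) presumption of high-impact capabilities."""
    return training_compute_flops > SYSTEMIC_RISK_COMPUTE_THRESHOLD_FLOPS

def applicable_obligations(training_compute_flops: float,
                           designated_by_commission: bool = False) -> list[str]:
    """Rough sketch of which obligation tier a provider should expect."""
    baseline = ["technical documentation", "downstream information",
                "copyright policy", "training-data summary"]
    if presumed_high_impact(training_compute_flops) or designated_by_commission:
        return baseline + ["risk assessment and mitigation",
                           "serious-incident reporting", "cybersecurity measures"]
    return baseline

# Example: a model trained with ~5 x 10^25 FLOPs would land in the
# systemic-risk tier under this rough reading.
print(applicable_obligations(5e25))
```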
Regulatory Implications: Why does this distinction matter? Because the Act imposes additional, more stringent obligations on providers of systemic-risk GPAI models. If a foundation model is classified as “systemic,” the provider must meet all the baseline duties plus extra requirements closer to what high-risk AI systems face: model evaluations (including adversarial testing), assessing and mitigating the identified systemic risks, reporting serious incidents to the authorities, and ensuring robust safety and cybersecurity measures. In contrast, a “regular” GPAI model (without a systemic-risk designation) has lighter obligations focused on transparency and documentation rather than active risk mitigation. The idea is a proportionate approach: all foundation models need some oversight, but the most powerful ones need greater oversight.
To summarize, the AI Act carves out foundation models (GPAI models) as a separate regulated group. All such models must follow certain rules (transparency, documentation, etc.), and if a model is deemed high-impact/systemic, its provider will face additional controls (similar to how “high-risk” AI applications are regulated). This two-tier model was introduced to ensure innovation with general-purpose AI isn’t stifled across the board, while still reining in the potential outsized harms of the very powerful AI models.
Obligations for Providers of General-Purpose AI Models (Foundation Models)
Providers who develop and make available general-purpose AI models in the EU must comply with specific obligations under the AI Act. In plain terms, these rules aim to ensure foundation models are documented, transparent, and used responsibly. If you provide a foundation model (GPAI model) in the EU, here are the key requirements (for non-“high-impact” models):
- Maintain Technical Documentation: You must create and keep up to date detailed technical documentation about the model. This includes information on how the model was developed, how it was trained (training data and process), how it was tested, and how it performs (evaluation results). Essentially, regulators want a record of the model’s design and behavior. This documentation should be ready to share with the new EU “AI Office” or national authorities upon request. (Annex XI of the Act lists what to include in these documents, such as a description of the model architecture, the training approach, dataset details, performance benchmarks, and even information on compute resources and energy usage.) A hedged sketch of what such a record might look like appears after this list.
- Provide Information to Downstream Users (Developers): If you’re the provider of a GPAI model, you also need to supply documentation and usage guidelines to any company or person who integrates your model into their own AI systems. In practice, this means sharing sufficient information so that those building on your model understand its capabilities and limitations and can comply with the AI Act in their use-case. For example, if a company uses your foundation model to build an AI hiring tool, you should have provided them info about what your model can and cannot reliably do, known risk areas (like biases), etc., so they can deploy it responsibly. This transfer of information should include at minimum the elements listed in Annex XII of the Act (e.g. intended functions, performance metrics, limitations, and guidelines for safe use of the model). Importantly, you can redact or protect truly sensitive intellectual property in this documentation, but only insofar as it doesn’t prevent downstream users from understanding and safely using the model.
- Ensure Copyright and Data Rights Compliance: Providers of foundation models must put in place a policy and technical measures to comply with EU copyright law. In simple terms, if your model was trained on copyrighted works (text, images, etc.), you need to respect any rights reservations by content owners. The Act specifically references Article 4(3) of the EU Copyright Directive (2019/790), which allows rights holders to opt out of text and data mining. So providers should implement state-of-the-art solutions to detect whether data carries a “do not train” reservation and ensure the model doesn’t use such data without permission. This obligation addresses a big concern with foundation models – that they are often trained on vast internet data, potentially ingesting copyrighted material. Under the AI Act, model developers need to take active steps to avoid IP infringement, for example by filtering training data or honoring opt-out lists; a simplified illustration of such filtering appears after this list. This is a new kind of requirement, effectively marrying AI development with copyright compliance.
- Publish a Summary of Training Data: To increase transparency, providers must publish a high-level summary of the data used to train the model. The Act calls for a “sufficiently detailed summary about the content” the model was trained on, using a template to be provided by the EU’s AI Office. This doesn’t mean you have to list every data point used, but you need to give an overview of the types of data, sources, and characteristics of the training dataset. For instance, the summary for a model like GPT-4 might state that it was trained on large swaths of internet text from books, websites, and articles up to a certain date, in multiple languages, plus code from public repositories, and so on. The goal is to let users and regulators know at a general level what kind of material shaped the AI’s knowledge. This helps identify potential biases or gaps and improves trust through some openness about data provenance.
- Cooperate with Authorities: Under the Act, model providers must be ready to cooperate with EU regulators and national supervisory authorities upon request. This could involve answering questions, providing additional information, or addressing issues identified about the model. Essentially, if the AI Office or a national authority is investigating something related to your foundation model, you have a duty to assist and share relevant information. This cooperation clause is about oversight – ensuring that authorities can obtain information needed to enforce the law or assess risks.
- Optional Compliance via Codes of Practice or Standards: The Act acknowledges that formal standards for these emerging technologies may take time to develop. Therefore, providers can rely on approved “codes of practice” (co-regulatory guidelines drawn up with the AI Office and industry) to help meet the above obligations. If a code of practice is approved under Article 56, following it can serve as evidence that you’re in compliance, at least until official harmonised standards are published. Once formal standards exist, complying with those gives a “presumption of conformity” with the law. In practice, this means the AI industry and regulators may collaboratively develop best-practice guidelines for documentation, transparency, etc., and model providers who adopt them will have an easier time demonstrating compliance. If you choose not to follow an established code or standard, you’ll need to prove in other ways that you meet the requirements.
- Open-Source Model Exception: Notably, the Act carves out an exception for certain open-source AI models. If you release a model under a free and open-source license and make its details public (including the model’s architecture, parameters/weights, and information on usage), then the documentation obligations (the technical file for regulators and the information for downstream users) do not apply. The rationale is that if a model’s inner workings and parameters are fully open to all, there’s less need to force the provider to produce separate technical documentation – the transparency is achieved through openness. However, this exception disappears if the model is classified as posing systemic risk. In other words, an open-source foundation model that becomes extremely advanced and widely adopted could still face the full set of obligations. For the vast majority of open models (e.g. a moderately sized language model released on GitHub), the developers won’t have to submit technical documentation to authorities or to every downstream user. They still have to publish the training data summary and follow the copyright rules, though – the Act doesn’t waive those. This provision was included to support open science and avoid unintentionally stifling open-source AI development, while still holding the biggest, most impactful models accountable.
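As a rough illustration of the kind of structured record a provider might keep to support the documentation and training-data-summary duties above, here is a hedged Python sketch. The field names are hypothetical and do not reproduce Annex XI or the AI Office’s summary template; they simply mirror the categories of information the bullets describe (architecture, training, evaluation, compute and energy, and a high-level data summary).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TrainingDataSummary:
    """High-level description of training data, in the spirit of the
    public summary obligation (not the official AI Office template)."""
    data_categories: list[str]   # e.g. ["web text", "books", "source code"]
    main_sources: list[str]      # e.g. ["filtered web crawl", "public code repos"]
    languages: list[str]
    collection_cutoff: str       # e.g. "2023-09"
    known_limitations: str

@dataclass
class ModelDocumentation:
    """Internal technical file mirroring the categories named above.
    Field names are illustrative only."""
    model_name: str
    architecture_description: str
    training_process: str
    evaluation_results: dict[str, float]
    training_compute_flops: float
    estimated_energy_kwh: float
    training_data_summary: Optional[TrainingDataSummary] = None

# Example record for a hypothetical model
doc = ModelDocumentation(
    model_name="example-llm-7b",
    architecture_description="decoder-only transformer, ~7B parameters",
    training_process="self-supervised pre-training plus instruction tuning",
    evaluation_results={"mmlu": 0.62, "toxicity_rate": 0.01},
    training_compute_flops=3e23,
    estimated_energy_kwh=450_000,
    training_data_summary=TrainingDataSummary(
        data_categories=["web text", "books", "source code"],
        main_sources=["filtered web crawl", "public code repositories"],
        languages=["en", "de", "fr"],
        collection_cutoff="2023-09",
        known_limitations="limited coverage of low-resource languages",
    ),
)
```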
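The copyright bullet above implies some form of machine-readable opt-out handling during data collection. The sketch below shows one simplistic way to drop documents whose metadata carries a rights-reservation flag; the `tdm_opt_out` field and the helper functions are invented for this illustration, and a real pipeline would need to honor whatever reservation mechanisms rights holders actually use (robots.txt rules, metadata tags, license terms, and so on).

```python
# Hypothetical illustration of honoring text-and-data-mining opt-outs
# when assembling a training corpus. The metadata field names are
# invented for this sketch; they are not defined by the AI Act.

from typing import Iterable

def respects_rights_reservation(doc: dict) -> bool:
    """Keep a document only if no opt-out signal was recorded for it."""
    return not doc.get("tdm_opt_out", False)

def filter_training_corpus(docs: Iterable[dict]) -> list[dict]:
    """Drop documents whose source reserved its rights against TDM."""
    return [d for d in docs if respects_rights_reservation(d)]

corpus = [
    {"url": "https://example.org/article-1", "text": "...", "tdm_opt_out": False},
    {"url": "https://example.org/article-2", "text": "...", "tdm_opt_out": True},
]
print(len(filter_training_corpus(corpus)))  # -> 1
```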
The above obligations focus on transparency, documentation, and responsible practices rather than strict risk controls. They apply to all GPAI model providers; models designated as having systemic risk carry these baseline duties plus the further requirements noted above. In effect, if you’re providing a foundation model in the EU, you need to be much more open about your model’s development and ensure it’s accompanied by information for safe use. These duties are lighter than those for “high-risk AI systems” (like medical AI devices), but they are unprecedented for general AI models – marking the first time broad AI models are directly regulated in the EU.
(Note: If a foundation model is classified as high-impact/systemic, the provider must do more on top of the above. Additional duties for those “GPAI with systemic risk” include performing standardized risk evaluations, mitigating foreseeable risks, and reporting serious incidents or misuse to the authorities, as well as ensuring strong cybersecurity for the model and its infrastructure. Those enhanced obligations are analogous to the governance and risk management steps required for high-risk AI applications. However, for this article we focused on the baseline obligations that apply to all general-purpose AI models, excluding the extra “systemic risk” provisions.)
Conclusion: Clarity and Accountability for Foundation Model Providers
The EU AI Act’s provisions on general-purpose AI models (foundation models) represent a new approach to AI governance. Rather than regulating only end-use applications, the Act reaches upstream to the providers of large AI models that underpin many services. It clearly defines what counts as a general-purpose (foundation) model and sets uniform rules across the EU for those who develop them. Providers of models like GPT, image generators, or other versatile AI systems will need to be more transparent about how these models work, what data they were trained on, and how to use them responsibly. By requiring documentation, sharing of information, and public disclosure of training data summaries, the Act aims to make foundation model development more accountable.
From a practical standpoint, these rules should help downstream companies and developers better understand the AI models they are integrating. For example, a startup building a legal document analyzer on top of a large language model will have access to the model’s provided documentation and know its limits, helping them comply with law and avoid misuse. The public summary of training data and the mandate to respect copyright address concerns about dataset transparency and intellectual property rights in AI.
For general-purpose AI providers, compliance will require some work – documenting models thoroughly, engaging with the new AI Office, and possibly adjusting data practices – but it also offers a clear framework to operate legally in the EU market. Many large AI developers may start aligning with these obligations even before the law fully applies (which is expected in 2025 for GPAI provisions). The Act is poised to influence global standards, so meeting its requirements could become a badge of trust and safety for AI model providers.
In sum, the EU AI Act treats foundation models as a class that needs transparency and oversight, without equating them all to “high-risk” use cases by default. The distinction between a general-purpose AI model and other AI systems is now codified in law: if you build the foundational model itself, you have direct obligations under the Act, and if you build an application using that model, you have your own (potentially different) obligations depending on the use case. This layered approach is intended to foster innovation in AI (by not blanket-regulating every use of a foundation model) while ensuring that model creators take responsibility for the general capabilities they unleash. It’s a novel regulatory balance, and providers of GPAI models like GPT or DALL·E will need to navigate these rules carefully to remain compliant and continue driving AI progress in a safe, trustworthy manner.