The US Copyright Office is taking public comment on potential new rules around generative AI’s use of copyrighted materials, and the biggest AI companies in the world had plenty to say. Below are the arguments from Meta, Google, Microsoft, Adobe, Anthropic, Andreessen Horowitz, Hugging Face, and StabilityAI, as well as a response from Apple that focused on copyrighting AI-written code.
The Current Debate
There are some differences in their approaches, but the overall message for most is the same: They don’t think they should have to pay to train AI models on copyrighted work.
The Copyright Office opened the comment period on August 30th and set an October 18th deadline for written comments. It asked about changes it was considering in three areas: the use of copyrighted data for AI model training, whether AI-generated material can be copyrighted without human involvement, and AI copyright liability. There’s been no shortage of copyright lawsuits in the last year, with artists, authors, developers, and companies alike alleging violations in different cases.
Meta: "Copyright holders wouldn’t get much money anyway"
"Imposing a first-of-its-kind licensing regime now, well after the fact, will cause chaos as developers seek to identify millions and millions of rights-holders, for very little benefit, given that any fair royalty due would be incredibly small in light of the insignificance of any one work among an AI training set."
Google: "AI training is just like reading a book"
"If training could be accomplished without the creation of copies, there would be no copyright questions here. Indeed, the act of 'knowledge harvesting,' to use the Court’s metaphor from Harper & Row, like the act of reading a book and learning the facts and ideas within it, would not only be non-infringing, it would further the very purpose of copyright law. The mere fact that, as a technological matter, copies need to be made to extract those ideas and facts from copyrighted works should not alter that result."
Microsoft: "Changing copyright law could hurt small AI developers"
"Any requirement to obtain consent for accessible works to be used for training would chill AI innovation. It is not feasible to achieve the scale of data necessary to develop responsible AI models even when the identity of a work and its owner is known. Such licensing schemes will also impede innovation from start-ups and entrants who don’t have the resources to obtain licenses, leaving AI development to a small set of companies with the resources to run large-scale licensing programs or to developers in countries that have decided that use of copyrighted works to train AI models is not infringement."
Adobe: "It’s fair use, like when Accolade copied Sega’s code"
"In Sega v. Accolade, the Ninth Circuit held that intermediate copying of Sega’s software was fair use. The defendant made copies while reverse engineering to discover the functional requirements—unprotected information—for making games compatible with Sega’s gaming console. Such intermediate copying also benefited the public: it led to an increase in the number of independently designed video games (which contain a mix of functional and creative aspects) available for Sega’s console. This growth in creative expression was precisely what the Copyright Act was intended to promote."
Anthropic: "Copying is just an intermediate step"
"For Claude, as discussed above, the training process makes copies of information for the purposes of performing a statistical analysis of the data. The copying is merely an intermediate step, extracting unprotectable elements about the entire corpus of works, in order to create new outputs. In this way, the use of the original copyrighted work is non-expressive; that is, it is not re-using the copyrighted expression to communicate it to users."
Andreessen Horowitz: "Investors have spent ‘billions and billions’"
"There has been an enormous amount of investment—billions and billions of dollars—in the development of AI technologies, premised on an understanding that, under current copyright law, any copying necessary to extract statistical facts is permitted. A change in this regime will significantly disrupt settled expectations in this area. Those expectations have been a critical factor in the enormous investment of private capital into U.S.-based AI companies which, in turn, has made the U.S. a global leader in AI. Undermining those expectations will jeopardize future investment, along with U.S. economic competitiveness and national security."
Hugging Face: "Training on copyrighted material is fair use"
"The use of a given work in training is of a broadly beneficial purpose: the creation of a distinctive and productive AI model. Rather than replacing the specific communicative expression of the initial work, the model is capable of creating a wide variety of different sort of outputs wholly unrelated to that underlying, copyrightable expression. For those and other reasons, generative AI models are generally fair use when they train on large numbers of copyrighted works. We use 'generally' deliberately, however, as one can imagine patterns of facts that would raise tougher calls."
StabilityAI: "Other countries call AI model training fair use"
"A range of jurisdictions including Singapore, Japan, the European Union, the Republic of Korea, Taiwan, Malaysia, and Israel have reformed their copyright laws to create safe harbors for AI training that achieve similar effects of fair use. In the United Kingdom, the Government Chief Scientific Advisor has recommended that 'if the government’s aim is to promote an innovative AI industry in the UK, it should enable mining of available data, text, and images (the input) and utilize existing protections of copyright and IP law on the output of AI.'"
Apple: "Let us copyright our AI-made code"
"In circumstances where a human developer controls the expressive elements of output and the decisions to modify, add to, enhance, or even reject suggested code, the final code that results from the developer’s interactions with the tools will have sufficient human authorship to be copyrightable."