The increasing ubiquity of artificial intelligence (AI) has introduced uncertainties around the copyright protection of software developed through both open- and closed-source models. For small business software developers, like ACT | The App Association members, who both deploy and use AI, these uncertainties disproportionately harm the ability to create competitive products across markets. The first blog in our “Copyright and AI Basics” series is designed to provide a foundational understanding of the copyright implications of using and deploying public-facing generative AI (GAI).

Background

Software developers have been using AI systems for years to increase efficiency in developing new technologies by reducing waste (i.e., cost and time), streamlining repeatable tasks, and optimizing solutions. However, the deployment of advanced and public-facing GAI has raised new and unique questions about copyright infringement, authorship, and the potential weakening of copyright protections.

GAI models learn from training data and generate content in response to user prompts, expending fewer resources to create desirable outcomes. This technology has enabled startup and small business developers to compete with larger, well-resourced competitors. What remains uncertain is whether the way GAI models are trained could lead to infringement liability for the GAI provider and, separately, the platform user.

Copyright Liability for GAI Models

While U.S. courts have not yet determined the extent of copyright liability for GAI providers or users, they have identified and continue to explore potential inflection points for copyright infringement through GAI technology. In a traditional copyright context, infringement occurs when a protected work is copied without permission from the copyright holder; copying can be shown directly or inferred from access to the protected work combined with substantial similarity.

Software developers also benefit from a unique copyright protection model: open-source licensing. An open-source license allows anyone to use, study, or modify the underlying source code as long as they adhere to the conditions outlined in the license. Violating an open-source license can constitute infringement of the underlying copyrighted work. A common example of an open-source license is the GNU General Public License. Stay tuned for a deep dive into the unique concerns around the use of GAI and open-source licenses in a future blog in this series.

The Fair Use Defense

The largest hurdle the courts face in defining liability for copyright infringement in relation to GAI is “fair use.” Fair use is an exception to copyright infringement that permits the limited use of another’s copyrighted work without permission, safeguarding the public’s First Amendment right to freedom of expression. Courts determine fair use on a case-by-case basis using four factors: the purpose and character of the use, the nature of the underlying copyrighted work, the amount and substantiality of the portion used in relation to the underlying work as a whole, and the effect of the use on the potential market for or value of the underlying work. In the context of software, the analysis often turns on whether the copied component is a functional or expressive part of the code.

Because software is a hybrid of expressive and functional elements, the United States Supreme Court has recognized that fair use plays an important role in protecting software innovation when functional elements are copied. In Google LLC v. Oracle America, Inc., the Court found that Google’s limited copying of Oracle’s Java application programming interface (API) packages to develop an independent program for the Android mobile operating system was fair use under U.S. copyright law. The Court found Google’s use of the Java API transformative because Google copied only about 11,500 lines of declaring code, roughly 0.4 percent of the API at issue, which was what its programmers needed to work in a different environment with a familiar language.

A fair-use examination of expressive components of software follows a traditional analysis. Recently, the United States Supreme Court further clarified this examination in Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith, where the Andy Warhol Foundation argued that applying aesthetic variations to a photograph of Prince taken by Lynn Goldsmith was transformative enough to qualify as fair use. The Court did not agree, holding that the Foundation’s commercial licensing of the image served substantially the same purpose as the original photograph, so the purpose-and-character factor weighed against fair use. So, what does this have to do with software? Applied to software, if the non-functional elements of a computer program are only slightly transformed from those of an established program and serve the same purpose, a court would likely rule that this is not fair use and that the new work infringes the original program.

Types of AI-Assisted Infringement

For purposes of AI, a fair use defense is tested against two types of potential copyright infringement: training infringement and output infringement.

Training Infringement

Training infringement can occur when an AI system processes copyright-protected works during its training phase. AI systems that train on larger data sets may be less likely to directly copy underlying works than those trained on small data sets.

While its decisions are not controlling precedent outside the six federal districts in Connecticut, New York, and Vermont, the U.S. Court of Appeals for the Second Circuit has ruled that, in some contexts, large-scale digital copying of copyrighted works constitutes fair use, reasoning that is frequently invoked in debates over machine learning. For example, in Authors Guild v. Google, Inc., the court deemed the use of the copyrighted works transformative because making digital copies of a work to provide the public with information about that work increases public knowledge without providing a substantial substitute for the original.

In another case, ML Genius Holdings LLC v. Google LLC, the Second Circuit affirmed that the U.S. Copyright Act preempted a website owner’s state law breach of contract and unfair competition claims. Genius alleged that Google copied and displayed lyric transcriptions from Genius’ website without permission. Genius did not assert copyright in the lyrics themselves because it does not hold that copyright; instead, it claimed that Google was not authorized to display the transcriptions. The Second Circuit found that transcriptions, as opposed to underlying lyrics, are not copyrightable. While courts like the Second Circuit have contemplated the tension between advanced AI and IP infringement, there is no consensus yet among different U.S. courts.

Output Infringement

Output infringement can occur when an AI system, after being trained and prompted by an end user, produces a work that includes, in part or in whole, copyrighted material. In this case, fair use may depend on the extent to which the trained-on work is redistributed to the end user. While the extent of copyright liability for AI providers and end users remains unresolved, integrating potentially infringing outputs into your work could subject you to liability and render your work uncopyrightable.

This issue can arise with popular programs like GitHub Copilot or ChatGPT. For example, GitHub Copilot is a GAI tool trained on publicly accessible code across many programming languages, and it generates code in response to user input. Issues arise because Copilot could replicate existing lines of protected code, potentially introducing security vulnerabilities and subjecting platform users to unintended liability for copyright infringement. While there is not enough precedent defining the contours of output infringement, the App Association continues to be a significant player in molding this landscape for our member companies.

Authorship of AI-Assisted Works

Historically, the courts have interpreted the author of a copyrightable work to be human. For example, in Naruto v. Slater, the U.S. Court of Appeals for the Ninth Circuit held that the U.S. Copyright Act does not expressly authorize animals to bring copyright claims. This was the case of the infamous “monkey selfie,” where an individual left camera equipment unattended, and a monkey by the name of Naruto used it to take a selfie. The individual tried to assert copyright ownership over the selfie, which was going viral and being reproduced. The Ninth Circuit ultimately decided that the monkey lacked standing under the Copyright Act and so could not claim authorship of the photo.

Similarly, if a work is entirely produced by AI, the U.S. Copyright Office will deem the work uncopyrightable. In Thaler v. Perlmutter, the U.S. District Court for the District of Columbia upheld the U.S. Copyright Office’s decision to reject the registration of a work created entirely by GAI. While this case is currently on appeal, the U.S. Copyright Office is confident that the decision is likely to be upheld. This case suggests that for a work to be copyrightable, it must have human authorship.

The U.S. Copyright Office has released guidance on AI-assisted works consistent with the Thaler case, requiring a person seeking copyright protection for an AI-assisted work to disclose AI-generated content that is more than de minimis and to describe the human author’s contributions. Otherwise, the U.S. Copyright Office states that the work may not be able to receive copyright protection at all.

While the App Association is encouraged by the clarity this guidance provides, we have offered the agency detailed comments indicating that the guidelines are misguided and inflexible in the age of advanced AI. We believe the U.S. Copyright Office should consider the amount and level of “human authorship” rather than the amount of content generated by GAI when making registration determinations.

Conclusion

Our members and software developers at large play a crucial role in shaping an innovation landscape where strong copyright protections align with the advancement of AI. The App Association will continue to elevate our membership’s perspective through regulatory comments and agency meetings. As we amplify the voice of the small software developer community, we encourage our community to continue sharing their stories. Email Brad Simonich here to share your perspective. In the meantime, stay tuned for the second blog in this series, which will discuss transparency requirements Congress should consider to better meet the needs of small businesses.