SRHE Blog

The Society for Research into Higher Education

Fair use or copyright infringement? What academic researchers need to know about ChatGPT prompts

by Anita Toh

As scholarly research into generative AI tools like ChatGPT, and research that uses them, becomes more prevalent, it is crucial for researchers to understand how copyright, fair use, and the use of generative AI in research intersect. While there is much discussion about the copyrightability of generative AI outputs and the legality of generative AI companies’ use of copyrighted material as training data (Lucchi, 2023), there has been relatively little discussion about copyright in relation to user prompts. In this post, I share an interesting discovery about the use of copyrighted material in ChatGPT prompts.

Imagine a situation where a researcher wishes to conduct a content analysis on specific YouTube videos for academic research. Does the researcher need to obtain permission from YouTube or the content creators to use these videos?

As per YouTube’s guidelines, researchers do not require explicit copyright permission if they are using the content for “commentary, criticism, research, teaching, or news reporting,” as these activities fall under the umbrella of fair use (Fair Use on YouTube – YouTube Help, 2023).

What about this scenario? A researcher wants to compare the types of questions posed by investors on the reality television series, Shark Tank, with questions generated by ChatGPT as it roleplays an angel investor. The researcher plans to prompt ChatGPT with a summary of each Shark Tank pitch and ask ChatGPT to roleplay as an angel investor and ask questions. In this case, would the researcher need to obtain permission from Shark Tank or its production company, Sony Pictures Television?

In my exploration, I discovered that it is indeed crucial to obtain permission from Sony Pictures Television. ChatGPT’s terms of service emphasise that users should “refrain from using the service in a manner that infringes upon third-party rights. This explicitly means the input should be devoid of copyrighted content unless sanctioned by the respective author or rights holder” (Fiten & Jacobs, 2023).

I therefore initiated communication with Sony Pictures Television, seeking approval to incorporate Shark Tank videos in my research. However, my request was declined by Sony Pictures Television in California, citing “business and legal reasons”. Undeterred, I approached Sony Pictures Singapore, only to receive a reaffirmation that Sony cannot endorse my proposed use of their copyrighted content “at the present moment”. They emphasised that any use of their copyrighted content must strictly align with the Fair Use doctrine.

This raises the question: why doesn’t the proposed research align with fair use? My initial understanding is that the fair use doctrine allows re-users to use copyrighted material without permission from the rights holders for news reporting, criticism, review, educational and research purposes (Copyright Act 2021 Factsheet, 2022).

In the absence of further responses from Sony Pictures Television, I searched the web for answers.

Two findings emerged which could shed light on Sony’s reservations:

  • ChatGPT’s terms highlight that “user inputs, besides generating corresponding outputs, also serve to augment the service by refining the AI model” (Fiten & Jacobs, 2023; OpenAI Terms of Use, 2023).
  • OpenAI is currently facing legal action from various authors and artists alleging copyright infringement (Milmo, 2023). They contend that OpenAI used their copyrighted content to train ChatGPT without their consent. In addition, the New York Times is contemplating legal action against OpenAI for the same reason (Allyn, 2023).

These revelations point to a potential rationale behind Sony Pictures Television’s reluctance: while use of their copyrighted content for academic research might be considered fair use, introducing this content into ChatGPT could infringe upon the non-commercial stipulations (What Is Fair Use?, 2016) inherent in the fair use doctrine.

In conclusion, the landscape of copyright laws and fair use in relation to generative AI tools is still evolving. While previously researchers could rely on the fair use doctrine for the use of copyrighted material in their research work, the availability of generative AI tools now introduces an additional layer of complexity. This is particularly pertinent when the AI itself might store or use data to refine its own algorithms, which could potentially be considered a violation of the non-commercial use clause in the fair use doctrine. Sony Pictures Television’s reluctance to grant permission for the use of their copyrighted content in association with ChatGPT reflects the caution that content creators and rights holders are exercising in this new frontier. For researchers, this highlights the importance of understanding the terms of use of both the AI tool and the copyrighted material prior to beginning a research project.

Anita Toh is a lecturer at the Centre for English Language Communication (CELC) at the National University of Singapore (NUS). She teaches academic and professional communication skills to undergraduate computing and engineering students.

4 thoughts on “Fair use or copyright infringement? What academic researchers need to know about ChatGPT prompts”

  1. Thank you, this is interesting. I am wondering how this changes now that AI can access the internet, because at that point the material would be available to the AI anyway, so you are only pointing it to data it already has access to.

  2. If GEN AIs are refined with user inputs, can its application be limited in any way?

  3. Pingback: Ask a Information Ethicist: What Information Is OK to Use to Immediate ChatGPT? - BitWolf

  4. You could consider checking the Anthropic terms. They say that they will not train on data in prompts or output. This indicates that fair use could apply. On the other hand, they demand that you have rights to the data used in prompts, so that could be an issue.
