While generative AI tools can help users with such tasks as brainstorming for new ideas, organizing existing information, mapping out scholarly discussions, or summarizing sources, they are also notorious for not relying fully on factual information or rigorous research strategies. In fact, they are known for producing "hallucinations," an AI science term used to describe false information created by the AI system to defend its statements. Oftentimes, these "hallucinations" can be presented in a very confident manner and consist of partially or fully fabricated citations or facts.
Certain AI tools have even been used to intentionally produce false images or audiovisual recordings to spread misinformation and mislead the audience. Referred to as "deep fakes," these materials can be utilized to subvert democratic processes and are thus particularly dangerous.
Additionally, the information presented by generative AI tools may lack currency as some of the systems do not necessarily have access to the latest information. Rather, they may have been trained on past datasets, thus generating dated representations of current events and the related information landscape.
Another potentially significant limitation of AI is the bias that can be embedded in the products it generates. Fed immense amounts of data and text available on the internet, these large language model systems are trained to simply predict the most likely sequence of words in response to a given prompt, and will therefore reflect and perpetuate the biases inherent in the inputted internet information. An additional source of bias lies in the fact that some generative AI tools utilize reinforcement learning with human feedback (RLHF), with the caveat that the human testers used to provide this feedback are themselves non-neutral. Accordingly, generative AI like ChatGPT is documented to have provided output that is socio-politically biased, occasionally even containing sexist, racist, or otherwise offensive information.
Generative AI tools have introduced new challenges in academic integrity, particularly related to plagiarism.
Plagiarism is typically defined as presenting someone else's work or ideas as one's own. While a generative AI tool might not qualify as a "someone," using text generated from an AI tool without citing is still considered plagiarism, according to Georgetown University Honor Council, because the work is still not the researcher's own. Individual policies for using and crediting GAI tools might vary from class to class, so looking at the syllabus and having a clear understanding from the professor is important.
A note about plagiarism detection tools:
A number of AI detection tools are currently available to publishers and institutions, but there are concerns about low rates of accuracy and false accusations. Because generative AI tools do not generate large amounts of text word-for-word from existing works, it can be difficult for automated tools to detect plagiarism. Georgetown does not currently use the AI detection feature of its plagiarism detection tool, Turnitin.
Another area of academic integrity affected by GAI tools is that of false citations.
Providing false citations in research, whether intentional or unintentional, violates the Honor Council's Standards of Conduct. GAI tools such as ChatGPT have been known to generate false citations, and even if the citations represent actual papers, the cited content in ChatGPT might still be inaccurate.
There are currently also multiple privacy concerns associated with the use of generative AI tools. The most prominent issues revolve around the possibility of a breach of personal/sensitive data and re-identification. More specifically, most AI-powered language models, including ChatGPT, require for users to input large amounts of data to be trained and generate new information products effectively. This translates into personal or sensitive user-submitted data becoming an integral part of the collection of material used to further train the AI without the explicit consent of the user. Moreover, certain generative AI policies even permit AI developers to profit off of this personal/sensitive information by selling it to third parties. Even in cases when clear identifying personal information is not entered by AI user, the utilization of the system carries a risk of re-identification as the submitted dataset may contain patterns allowing for the generated information to be linked back to the individual or entity.
Given these issues, extensive downloading of Library materials to build AI training corpora is prohibited. Additionally, some Library content providers prohibit any amount of their content being used with AI tools (please see Library's Policy on Responsible Use of Electronic Resources).
This work is licensed under a Creative Commons Attribution NonCommercial 4.0 International License. | Details of our policy