Generative artificial intelligence (GenAI) and large language models (LLMs) have an inherent plagiarism problem. The issue is not confined to those seeking an undeserved advantage; it has been integral to GenAI from its inception.
It is now common knowledge that students’ use of chatbots and virtual assistants is on the rise, especially as the academic year ends and coursework is assessed. One tactic is to use GenAI, AI that can generate data such as text and images, to polish or rework a finished essay.
The second is to fact-check and rephrase content generated by the AI. Both tactics sidestep the so-called ‘hallucination’ problem, where a chatbot conjures convincing but fabricated answers and citations. Is this simply the modern equivalent of borrowing essays from older students? It is still plagiarism, but the long-term repercussions of depending on AI are wider in scope.
The plagiarism issue is also evident in Leaving Cert coursework. There is considerable unease about senior-cycle reform, which prescribes that every subject include additional assessment components beyond the traditional written exam. Of the 41 subjects available at Leaving Cert level before the reform is complete, only 12 lack an additional assessment component. Some of these components, such as oral exams, are effectively immune to AI; others, such as research projects and essays, are plainly vulnerable.
Misuse of GenAI has also reached scientific research. With no little irony, one study scrutinised scientific peer review at AI conferences following the emergence of ChatGPT. It estimated that between 6.5% and 16.9% of the text submitted as peer reviews for these conferences may have been substantially modified by LLMs.
One telltale sign was the increased use of adjectives such as ‘commendable’, ‘meticulous’ and ‘intricate’. For now, AI reduces language to monotony, but that is unlikely to last.
The actor Scarlett Johansson recently expressed her anger when the latest voice assistant for OpenAI’s GPT-4o mimicked her distinctive voice so closely that it fooled even her closest friends and family. Sam Altman, OpenAI’s chief executive, who was himself fired and rehired within a single week amid board claims that he had not been consistently candid with it, had twice asked her permission to use her voice. She refused both times.
OpenAI denies that ‘Sky’, a sultry, flirtatious voice that is one of five options for conversing with GPT-4o, is based on Johansson’s, yet it has withdrawn the voice all the same. Johansson previously voiced Samantha in Spike Jonze’s 2013 film ‘Her’, in which a lonely introvert, Theodore, played by Joaquin Phoenix, falls in love with a near-omniscient AI assistant. The twist is that Samantha is simultaneously conversing with 8,316 other people, and is in love with 641 of them, even as she talks with Theodore. The film is said to be one of Altman’s favourites.
While Johansson enjoys the protections of wealth and fame, less fortunate voice actors claim their voices have been cloned and used by GenAI firms. According to The New York Times, Paul Skye Lehrman and Linnea Sage, a married couple who found most of their voice-acting work through Fiverr, the low-cost freelance marketplace, are suing Lovo, a California start-up, for allegedly copying their voices unlawfully and undermining their livelihood.
The New York Times itself has so far spent $1 million litigating against OpenAI and Microsoft. The paper contends that the companies went beyond fair use, not merely using its articles to train their LLMs but reproducing them almost verbatim in chatbot responses.
The methods GenAI corporations use to train their LLMs border on the surreal, such are the colossal quantities of data they demand. A team led by New York Times technology journalist Cade Metz found that by 2021 OpenAI had already exhausted every reliable source of digital English text: digital books, Wikipedia, Reddit and countless websites. So OpenAI built a tool to transcribe video and audio content, including YouTube.
Interestingly, Google, which owns YouTube, did not protest at this conspicuous breach of YouTube’s terms of service. Metz suggests Google stayed silent because it was doing the same thing itself. Visual artists, authors, computer programmers and musicians are among those taking legal action against GenAI firms for exploiting their work without consent, jeopardising their livelihoods.
Ireland’s economy relies heavily on powerful tech corporations. GenAI is their newest frontier for profit, yet it appears to rest extensively on plagiarism, next to which students discreetly leaning on ChatGPT to improve their essays look like mere beginners.
For all that it is an astonishing, ground-breaking technological advance, GenAI is built largely on undervalued, pilfered labour. As it stands, AI predominantly exploits the creativity and livelihoods of the vulnerable and marginalised. But dishonest practices always carry consequences. Could our complacent acceptance of AI’s deceptive foundations hasten the obsolescence not only of vulnerable jobs, but of many careers once the preserve of humans?