
Re: Some advice please re. ChatGPT



Thanks, Steve,
That’s a helpful and interesting overview. I appreciate the perspective.
Best,
Ronda 


Sent by magic 


On Dec 28, 2022, at 5:05 PM, Steve DiPaola <sdipaola@sfu.ca> wrote:



This is a very complex space, and something like email is not the medium in which to take on all the issues.

But here are some overall bullet-pointed thoughts (sometimes academic; AI is very misunderstood, and the only way to talk about the ethical issues is also to explain what an AI program is actually doing).
I have slides, papers, and data on most of what I am saying quickly here.

Some overall thoughts: 
- ChatGPT is nothing new (though the press, and now everyone, is talking about it and playing with it); it is just a specific new fork of OpenAI's GPT-3 (effectively a GPT-3.5, with training aimed specifically at the conversation space).
- It is an NLP (Natural Language Processing) system using deep-learning transformer AI, and it gets trained on a massive dataset (10x larger than those before it) of hundreds of thousands of books, web crawls, and so on.
- Therefore it will never be better than the dataset it is trained on, and currently those datasets, because of their enormous size (and how hard and expensive they are to clean, verify, and correct), are not very accurate (they are essentially raw data dumps from books and web crawls).
- The dataset is one current issue, with bias, mistakes, and so on. It is based on human output, so it carries our mistakes (in books and other information), our biases, and much else, and the AI systems learn all of these as patterns. Researchers are working on this.
- There is no all-knowing AI here; it is simply a stack of deep-learning transformers able to complete a sentence, and now a sophisticated query, using the history of human written discourse (that dataset of books and web crawls). A small code sketch after these bullets makes the "completion" point concrete.
- The plagiarism problem and outcry are even more substantial on the visual-systems side, which combines 1) a GPT-style NLP front end with 2) diffusion-based visual generation, so that a user can type in a sentence (a prompt) and get an image out (soon video, 3D, VR).
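
To illustrate what "completing a sentence" means, here is a minimal sketch in Python using the open-source Hugging Face transformers library, with the small public GPT-2 model standing in (GPT-3's weights are not public; the prompt and settings are just examples):

    # A toy illustration of transformer "sentence completion":
    # the model continues a prompt with statistically likely tokens
    # learned from its training data; it has no notion of truth.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    prompt = "Hamlet delays his revenge because"
    for out in generator(prompt, max_new_tokens=40,
                         do_sample=True, num_return_sequences=3):
        print(out["generated_text"])
        print("---")

Each run prints three plausible-sounding continuations, which is exactly why accuracy is bounded by the training data.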

- Historically these NLP transformer systems have a problem called "hallucinating" (different from the classical meaning of the word), where they get things wrong, in our experience roughly 5-10% of the time, including fully contradicting themselves (even while performing perfectly otherwise). Our art piece with "Socrates", El Turco, tries to show that in its 17 dialogues on life, love, and so on.
- This getting-things-wrong happens in specific enough ways that you can write tools to detect it, and researchers are working on this; a rough sketch of one such check follows these bullets. However, unlike generated visuals (such as deepfakes), detecting these issues in words is hard, because prose can simply be edited, fixed, and changed by a user even when most of it came directly from the AI system.
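
As a hedged illustration of what such a detector might look like (this is a generic self-consistency check, not our tool; generate() is a hypothetical stand-in for whatever model API you have):

    from collections import Counter

    def self_consistency_flag(question, generate, n=5, threshold=0.6):
        # `generate` is a hypothetical callable: prompt -> answer string.
        # Sample n answers to the same question; if no single answer
        # dominates, flag the question as a hallucination risk.
        answers = [generate(question).strip().lower() for _ in range(n)]
        top_answer, count = Counter(answers).most_common(1)[0]
        agreement = count / n
        return agreement < threshold, top_answer, agreement

The intuition: a model tends to answer consistently when the pattern is well supported by its training data, and to scatter when it is hallucinating.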

- This is the biggest issue with generated essays: they can be edited, and in doing so they do become the person's own ideas. How much editing moves it from semi-plagiarism to no plagiarism at all? It is similar to a student who starts by pulling from Wikipedia but keeps adding more sources and their own ideas and edits.
- You can work with these systems (we do) in rather sophisticated ways, questioning the output and re-asking better questions. If a student is doing that and coming up with something strong after many hours of working with the tool, is that wrong in every situation?

- These GPT-type NLP systems are good at many things, from writing prose to writing programs, so they have become (and will be even more so soon) a way to use natural-language input (requests, or prompts) to do almost anything. THEREFORE they will eventually be a new type of Wikipedia / knowledge search: a front end to research and even code writing, and through code writing, the implementation of many kinds of things (again including search).
- This is the issue: you can use these systems (now or soon) to help you research a topic as you would with a Google search, and that is not wrong per se. What matters, of course, is the usual standard: if you did not write it yourself, or did not substantially take the knowledge and put it in your own voice and thoughts, then it should be cited, or it is plagiarism.
- So where will the line be with these systems? It is hard to know, as the research part seems fine. You can even craft a very good prompt (query) in a way that adds to the output more than others might manage.
- I have been using our system to bring Van Gogh to life, and for a person we only know so much about, we feel the narrative is now strong as a representation of him, say to visitors. Constantly improving and teaching it is a new type of craft. Who's to say that doing the same for an essay is not also a skill with merit?
- I also write poetry (that gets into collections) with an iterative technique, using our own coded version of these systems. So I don't just say "yo AI, write a poem in the style of ____________". I go back and forth with it as you would with a good collaborator: I keep the good, edit the bad, and send it back. We have a whole paper (my PhD student and I) on this new kind of disruptive, creative, iterative process; a rough sketch of the loop follows these bullets. So, back to the essay: it might be hard to draw the line. Surely typing the assignment instructions straight in and just grabbing the result is pure junk/plagiarism. But as we all use these tools more seriously, there is a grey zone.
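
To make the back-and-forth concrete, here is a minimal sketch of that kind of loop (the general shape of the process, not our actual system; generate() and edit() are hypothetical stand-ins for the model call and the human revision step):

    def iterative_draft(seed_prompt, generate, edit, rounds=4):
        # `generate`: prompt -> draft text (hypothetical model call).
        # `edit`: draft -> revised draft (the human keeps the good
        # and fixes the bad, e.g. in a text editor).
        draft = generate(seed_prompt)
        for _ in range(rounds):
            revised = edit(draft)  # human judgment in the loop
            draft = generate("Revise and extend, keeping these edits:\n"
                             + revised)
        return draft

The point of the design is that authorship accumulates in the edit step: after a few rounds the text is a genuine hybrid, which is why the line is hard to draw.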

There is an effort to computationally or digitally watermark anything that is generated by AI systems. This is a very important effort that I give talks on, as it goes well beyond essay systems; I see it as a far bigger ethical issue than self-driving cars, because in a decade or two we will dumb all of this down as these systems train on their own bad output (words, images, ideas), always moving away from minority voices and variation toward average data.
Again, you will see articles about watermarking AI, and it is important; it is just not obvious (to me) how you can do this for words that can be moved and edited at will. A toy sketch of one proposed approach follows.
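
One scheme researchers have proposed is a statistical watermark: at each generation step the sampler favors a pseudorandom "green list" of words keyed on the preceding word, and a detector later counts how many green words appear. This is a toy sketch of the detection side only, under those assumptions (a real scheme operates on model tokens, not whole words):

    import hashlib

    def is_green(prev_word, word):
        # Pseudorandomly assign ~half of all words to a "green list"
        # that depends on the preceding word (toy stand-in for a
        # real token-level scheme).
        digest = hashlib.sha256((prev_word + "|" + word).encode()).digest()
        return digest[0] % 2 == 0

    def green_fraction(text):
        # Watermarked text should score well above the ~0.5 expected
        # by chance; human or heavily edited text sits near 0.5.
        words = text.lower().split()
        if len(words) < 2:
            return 0.0
        hits = sum(is_green(a, b) for a, b in zip(words, words[1:]))
        return hits / (len(words) - 1)

And the fragility is visible right in the sketch: moving or editing words changes the word pairs and washes the signal out, which is exactly my worry above.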

For the bigger issues here, and SFU's place in educating the world about them, we have groups. I gave a talk recently at SFU's Digital Democracy Institute; they are a great institution to work through.


Hope this helps; just a quick, on-the-fly update on this during the holidays. Back to that paper writing.

-  Steve DiPaola, PhD    -  -  
 - Prof: Sch of Interactive Arts & Technology (SIAT);  
 - Past Director: Cognitive Science Program; 
 - - Simon Fraser University - - -  
    research site:   ivizlab.sfu.ca
    art work site:    www.dipaola.org/art/
    our book on:     AI and Cognitive Virtual Characters
At Simon Fraser University, we live and work on the unceded traditional territories of the Coast Salish peoples of the xʷməθkwəy̓əm (Musqueam), Skwxwú7mesh (Squamish), and Səl̓ílwətaɬ (Tsleil-Waututh) and in SFU Surrey, Katzie, Kwantlen, Kwikwetlem (kʷikʷəƛ̓əm), Qayqayt, Musqueam (xʷməθkʷəy̓əm), Tsawassen, and numerous Stó:lō Nations.


On Wed, Dec 28, 2022 at 2:39 PM Ronda Arab <ronda_arab@sfu.ca> wrote:
I don’t think so. If a student can get a passing grade on an essay about “Hamlet” without having read the play, they have learned nothing that they are supposed to learn. The point isn’t what a student can claim to know about the play; it is the process by which they used their own brain, not an AI tool, to learn what is now written on the essay that they’ve handed in. I’m not a mathematician and I don’t know what all a calculator can do these days, but even with a calculator in my (decades ago) first year calculus class, I still had to learn and understand the formulas to get good grades.

Ronda 

Sent by magic 


On Dec 28, 2022, at 3:30 PM, Behraad Bahreyni <bba19@sfu.ca> wrote:

Is using AI tools to generate a unique text, and then using that text as part of your own work, plagiarism?

I believe our teaching will have to evolve. As of now, the only real creativity in responses from GPT is in the rewording of sentences; the system does not produce new knowledge. Our assessment methods may have to prioritize creative responses rather than rehashes of Google/Wikipedia results. I think of this new technological tool the way I think of the moment we allowed calculators into exam rooms, or take-home exams in the age of the internet.

Cheers 





Sent from a mobile device



On Dec 28, 2022, at 2:07 PM, Ronda Arab <ronda_arab@sfu.ca> wrote:

 Hi Steve, Nicky and others,

Nicky, your advice is helpful, and I will use it. I wonder if Steve has advice on how to deal with plagiarism using these AI tools? 

Ronda 

Sent by magic 


On Dec 28, 2022, at 2:27 PM, Nicky Didicher <didicher@sfu.ca> wrote:



Thanks, Sam, for mentioning this, and I'm interested in hearing what the AIAs have been discussing, James.

I hadn't heard of ChatGPT before, but I just spent half an hour getting it to generate English papers for me. They're all about four paragraphs long and C- or D quality, but the program was also able to generate lists of appropriate quotations to use as evidence (not always the best quotations) for both a work in public domain (a Jane Austen novel) and a work currently in copyright that shouldn't be available full-text online legally (a Rebecca Stead novel). The bot's plot summaries and critical assessments were paraphrases of existing ones, but not close enough to be detected as plagiarism, and it can create new versions of those paraphrases instantly. This means that we can't tell classes "I've already put this essay topic through ChatGPT, so I'll recognize if you use it." 

Maybe we could suspect that a C paper's thesis and evidence were generated by ChatGPT, but I don't think we could prove it. Its idea of writing a conclusion is to repeat and rephrase its introduction, and the writing style is completely bland and repetitive, but those things are true of D or C- English papers in general. If we're lucky, the generated essay will have a big error in it (one of the Rebecca Stead ones I asked for identified the wrong character as Black, and the other used evidence that didn't really make sense for the topic), and we can ask the student questions to show whether they actually read the material they were supposed to.

I think this situation may end up being like one in which a student asks or pays someone else to write a paper for them, but the paper *isn't* obviously way above the student's writing or thinking level as demonstrated in other assignments. We can interview the student and ask them questions about their work and hope they can't explain it, but we won't be able to prove they didn't write it themselves.

Perhaps as teachers the best we can do is make fun of ChatGPT in class when we're talking about academic dishonesty and say how we've tried it out on our essay topics and never gotten anything back worth more than a C-.

Nicky

 


From: James Fleming <james_fleming@sfu.ca>
Sent: December 28, 2022 12:57:38 PM
To: Sam Black; academic-discussion@sfu.ca
Subject: Re: Some advice please re. ChatGPT
 

Coincidentally those of us currently serving as departmental Academic Integrity Advisors are having a chat about this issue on our own list--not with regard to policy, but pedagogy and evaluation. Would there be interest in broadening the discussion? JDF


James Dougal Fleming

Professor, Department of English

Simon Fraser University

Burnaby/Vancouver, 

British Columbia,

Canada.


The truth is an offence, but not a sin. 

-- Bob Marley





From: Sam Black <samuel_black@sfu.ca>
Sent: December 28, 2022 12:24 PM
To: academic-discussion@sfu.ca
Subject: Some advice please re. ChatGPT
 

Hi All,



Does anyone know if any policy guidelines have been issued by SFU re. ChatGPT and academic dishonesty? Specifically, what would SFU accept as dispositive evidence that an essay had been generated using ChatGPT or similar AI software? Obviously, it will be impossible to introduce cut-and-pasted material as evidence, since the text is generated rather than copied from an existing source.


In this vein, I recently had a chat with an engineering student (but not an SFU student!) who received an A+ on an assignment using ChatGPT. 


The software could not generate an A+ paper in Philosophy (of course not!). For the moment, I'm mostly concerned with suspicious C+ papers. 



Thanks in advance,


Sam



Sam Black

Assoc. Prof. Philosophy, SFU


I respectfully acknowledge that SFU is on the unceded ancestral and traditional territories of the səl̓ilw̓ətaʔɬ (Tsleil-Waututh), Sḵwx̱wú7mesh Úxwumixw (Squamish), xʷməθkʷəy̓əm (Musqueam) and kʷikʷəƛ̓əm (Kwikwetlem) Nations.