1 of 57

Rethinking Writing for Assessment in the Era of Artificial Intelligence

UC San Diego "Threats & Opportunities" Virtual Symposium

Anna Mills, April 18, 2023

Licensed CC BY NC 4.0

2 of 57

Welcome!

Housekeeping

I like to approach this in a spirit of inquiry. It’s a complex topic, and I don’t claim to have the answers.

Feel free to ask questions in the Q&A as we go, and I’ll try to get to them at the end of each section.

Slides: https://bit.ly/rethinkwritingUC

Resources: https://bit.ly/AITextEdu

3 of 57

Agenda

  • How well does ChatGPT mimic academic writing?
  • Can we out-prompt it?
  • Can we detect it?
  • How can we assign writing so as to deter misuse?
  • How might we want to integrate it into our pedagogy?

Presentation by Anna Mills, licensed CC BY NC 4.0.

4 of 57

Background

  • ChatGPT is an AI text generator, or large language model, created by the company OpenAI and funded by Microsoft. It was released November 30, 2022.
  • ChatGPT uses GPT-4 and/or GPT-3.5, updated versions of software available since 2020.
  • OpenAI is considered to have the most capable publicly available language model. But competitors include Anthropic’s Claude and Google’s Bard.
  • Open-source models also exist, and other companies like Meta have built models that are not yet public.

5 of 57

ChatGPT’s release means that anyone can ask for a particular kind of text and get something more or less passable, free or cheap.

6 of 57

Let’s take as a starting point the idea that ChatGPT can produce passable academic prose in response to many writing prompts

Grammatically correct

On topic

Academic style

Sounds plausible

7 of 57

It can take in and incorporate background information, sources, quotations, and lists of ideas.

  • You feed it the information it needs (up to around 2,000 words).
  • Give it anything you have that will help it produce the piece of writing.
  • Describe the style, length, and any other requirements.

8 of 57

It can produce multiple original word combinations to respond to one prompt

  • Its outputs are not usually copies of human-written text.
  • If you don’t like one, you can request a different one.
  • Two users may put in the same prompt and get different results.
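
One way to picture why the same prompt can yield different outputs is a toy sampler that picks each next word at random according to a probability table. This is only an illustrative sketch, not how ChatGPT is implemented: the word table, its probabilities, and the `generate` function are all invented for this example, and a real language model computes its probabilities over a far larger vocabulary.

```python
import random

# Hypothetical next-word probabilities, invented for illustration only.
NEXT_WORD = {
    "<start>": {"Writing": 0.6, "Revision": 0.4},
    "Writing": {"helps": 0.5, "shapes": 0.3, "clarifies": 0.2},
    "Revision": {"helps": 0.4, "shapes": 0.3, "clarifies": 0.3},
    "helps": {"thinking.": 0.5, "learning.": 0.5},
    "shapes": {"thinking.": 0.6, "learning.": 0.4},
    "clarifies": {"thinking.": 0.7, "learning.": 0.3},
}


def generate(seed):
    """Build one short sentence by sampling each next word from the table."""
    rng = random.Random(seed)  # the seed stands in for one user's session
    word = "<start>"
    words = []
    # Keep sampling until we reach a word that has no listed continuations.
    while word in NEXT_WORD:
        options = NEXT_WORD[word]
        word = rng.choices(list(options), weights=list(options.values()))[0]
        words.append(word)
    return " ".join(words)


if __name__ == "__main__":
    # Same "prompt" (the same table), several sessions.
    for seed in range(5):
        print(generate(seed))
```

Because each word is drawn at random, separate runs can produce different sentences from the same starting point, which mirrors the behavior described in the bullets above.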

9 of 57

Can teachers out-prompt ChatGPT? Are there kinds of academic writing it can’t generate?

I did some testing for OpenAI before GPT-4 came out, and I find that it can produce something passable for most prompts with a little help.

Feed it a reading, some real sources, snippets on current events, bits of personal experience, a transcript or description of an image—whatever the prompt demands.

The system will generate a personal essay, a piece about something timely and hyper-local, a simulated reflection on the writing process, a close analysis of a text.

10 of 57

But don’t take my word for it. We should probably all be trying out our prompts.

  • Try yours on ChatGPT running GPT-4
  • Poe.com will let you compare results from different language models
  • Don’t just ask it once—keep prompting it to refine the output.
  • If you find a kind of prompt it can’t do, please let me know!
  • And remember that language models have not reached their endpoint, so we can’t count on current limitations.

11 of 57

Let’s briefly try ChatGPT (with GPT-4) together

  • Is there a simple essay prompt you’d like me to try?
  • Please share in the chat.
  • Here’s a sample generation related to Chinese savings rates. Here I prompted it to get it to look more like an essay and add sources.

12 of 57

If we can’t out-prompt ChatGPT, can we detect its use?

13 of 57

First, a caution: ChatGPT doesn’t “know” if it wrote something!

  • Don’t paste student work into ChatGPT and ask, “Did you write this?”
  • First, it’s not legal to share student work without permission.
  • ChatGPT may well give an answer. The answer might be right, but it might be wrong.
  • ChatGPT has not been tested for this purpose. (OpenAI has a different AI text classifier system.)
  • ChatGPT has no access to a database of its own outputs to check against.

14 of 57

AI text detection software exists. It is not reliable.

15 of 57

How does AI detection compare to (traditional) plagiarism detection?

Differences

  • No access to a database of AI text for direct comparison (unless companies create and share such)
  • Detection is based not on direct similarity but on probabilistic guesses based on features of the text
  • ChatGPT and other AI tools can tweak the text to evade detection

Similarities

    • Similar questions about whether it makes teachers focus too much on policing and an adversarial relationship
    • Similar options for using it punitively versus more collaboratively (e.g., second chances without penalty)
    • Similar concerns about the use of student data

16 of 57

Turnitin claims a less than 1/100 false positive rate, but is that accurate?

  • “Our AI writing preview has been trained on academic writing with high efficacy rates and can identify 97% of AI writing”
  • But the company has shared no data at all, let alone external peer-reviewed studies
  • They didn’t test their system on the most sophisticated, recent AI software, ChatGPT running GPT-4.
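
To make the stakes of a claimed false positive rate concrete, here is a small arithmetic sketch. It takes the claimed upper bound of 1 in 100 at face value. The essay counts are hypothetical, and the second function assumes each essay is flagged independently of the others, which may not hold in practice.

```python
# Claimed upper bound on the false positive rate, taken at face value.
CLAIMED_FALSE_POSITIVE_RATE = 1 / 100


def expected_false_flags(human_essays, rate=CLAIMED_FALSE_POSITIVE_RATE):
    """Average number of human-written essays wrongly labeled as AI."""
    return human_essays * rate


def chance_of_at_least_one_false_flag(human_essays, rate=CLAIMED_FALSE_POSITIVE_RATE):
    """Probability that one or more human-written essays are wrongly flagged.

    Assumption: each essay is flagged independently of the others.
    """
    return 1 - (1 - rate) ** human_essays


if __name__ == "__main__":
    # Hypothetical numbers of human-written essays checked by one instructor.
    for n in (30, 200, 1000):
        print(
            f"{n} essays: expect {expected_false_flags(n):.1f} false flags, "
            f"{chance_of_at_least_one_false_flag(n):.0%} chance of at least one"
        )
```

Under these assumptions, an instructor who checks 200 human-written essays should expect about two of them to be wrongly flagged, even if the claimed rate is accurate.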

17 of 57

The Washington Post found an example of a false positive

18 of 57

GPTZero labeled the Bill of Rights “likely AI”

On April 17, 2023, I retested a popular Reddit experiment using GPTZero.me.

19 of 57

Caution: false positives = false accusations?

  • All these tools currently may get it wrong: sometimes they label human text as “likely AI” and sometimes they label AI text as likely human.

  • This is not likely to change. I have not heard any experts optimistic about these tools ever being able to eliminate false positives, though the rates could well improve.

20 of 57

How will students feel when we tell them their writing may be wrongly flagged as likely AI?

To help students prepare for the possibility of false positives, Turnitin advises “Establish your voice: Make sure that your writing style and voice are evident to your instructor.”

  • But is it fair to ask students to write in such a way that proves their humanity?
  • How will the student know what this means, how to do it, and how to make sure that the instructor will perceive it as their voice?
  • As AI gets “voicier,” this may not work.

21 of 57

Caution: privacy violations

  • If you are considering pasting student work into a detector, consider whether you have the legal right.
  • Have you looked at the privacy policy attached to the tool?
  • How might the student data be used?
  • Has your institution’s IT department vetted the software and approved it?

22 of 57

Caution: Is detection a good use of money?

  • Turnitin’s detector is free for now,

BUT

  • “Beginning January 1, 2024, only customers licensing Originality or TFS with Originality will have access to the full AI writing detection experience”--Turnitin

23 of 57

Turnitin will charge extra for paraphrase detection, if they release it

From Turnitin’s FAQ:

  • “Turnitin has been working on building paraphrase detection capabilities – ability to detect when students have paraphrased content either with the help of paraphrasing tools or re-written it themselves…We have plans for a beta release in 2023…for an additional cost.”

24 of 57

And another thing: it’s simple for students to get around Turnitin’s AI detection with Quillbot.

From the Washington Post on Turnitin’s AI detector: “It couldn’t spot the ChatGPT in papers we ran through Quillbot, a paraphrasing program that remixes sentences.”

25 of 57

Free software explicitly designed to get around AI detectors

26 of 57

Controversy!

27 of 57

Even according to Turnitin, detectors will often be inconclusive

  • “Rely on relationships with the student: This kind of judgment should never be made without a respectful dialogue with the student.”
  • “Compare the writing in question to the diagnostic sample.”
  • “After review, if the evidence isn’t clear, give the student the benefit of the doubt. All the right conversations have taken place, with all the right questions asked, and there’s still uncertainty, the student cannot be penalized based on that.” –Turnitin

My note: this is not just inconclusive; it is time intensive and raises equity concerns about how the judgments will be made and which students will be affected.

28 of 57

So if we can’t out-prompt AI and we can’t rely on AI text detection, can we rely on in-class writing?

  • Maybe some of the time.
  • But there’s not enough time in class to develop the ideas fully in a longer piece.
  • In-class writing or handwritten work can create a lot of anxiety and barriers for many students with disabilities.

29 of 57

Then how can we rely on writing as a form of assessment?

My working hypothesis: we can choose a combination of approaches to

  • deter misuse of AI and
  • thus help prevent learning loss.

30 of 57

Let’s go back to why we assign writing. Not for the product. For the thinking process.

“A fundamental tenet of Writing Across the Curriculum is that writing is a mode of learning. Students develop understanding and insights through the act of writing. Rather than writing simply being a matter of presenting existing information or furnishing products for the purpose of testing or grading, writing is a fundamental means to create deep learning and foster cognitive development.”

-- The Association for Writing Across the Curriculum

31 of 57

How does the writing process contribute to student learning in your field?

What might we lose if we give up on assigning writing in non-writing courses?

32 of 57

Are these almost one and the same?

  • The best strategies for deterring misuse of AI
  • The best ways to assign writing anyway
  • Share how we personally use the writing process to help our thinking process in our discipline.
  • Create assignments that have intrinsic meaning and are likely to be motivating.
  • Allow room for creativity, original thinking, and choice about the focus of the assignment.

33 of 57

First and foremost, emphasize purpose and engagement

  • Emphasize how writing helps us think and form our own ideas and voice.
  • If students see meaning in the writing assignment and understand what they will get out of wrestling with it, they are more likely not to resort to a text generator.

34 of 57

Build relationships and community so that writing and reading happen in relationship

  • Hold conferences with students
  • Offer video feedback
  • Ask students to record audio or video notes about their writing.
  • Invite peer responses on what is interesting in each student’s writing

35 of 57

Teach the writing process

  • Collaborative annotation of the foundational texts for the writing assignment
  • Prewriting/drafting
  • Reflections on their own thinking and writing process
  • Peer review
  • Revision
  • Ask students to share and comment on their version history/track changes

36 of 57

Discuss ethics and transparency with students

  • Have an open discussion at the beginning of the course–when do we need to know how AI was used in writing? Come up with examples of when it matters, when it doesn’t.
  • Seek student input or collaboration on AI policy formation
  • What do students think will deter misuse of AI?
  • What values do students see as important here?

37 of 57

Some students want teachers and students to use detectors. They don’t want to be at a competitive disadvantage.

  • Boston University students in a data science class developed a policy that allows different options for AI use but involves detectors extensively.
  • “Students shall…Employ AI detection tools and originality checks prior to submission, ensuring that their submitted work is not mistakenly flagged.”
  • “Instructors shall…Employ AI detection tools to evaluate the degree to which AI tools have likely been employed.”

38 of 57

Detection as deterrent: Point out that what’s generated by AI might be labeled as AI, sooner or later

  • No one should assume that AI text is undetectable.
  • As the software evolves, what’s not detectable now might become retroactively detectable.
  • Might OpenAI and/or Quillbot at some point allow anyone to check against a database of outputs to certify whether a particular text sequence has ever been generated by their systems?
  • Remind students that their next teacher or school or workplace might use detection software.

39 of 57

Honor system approaches

  • Example: Ask students to affirm that they have labeled any AI text as such.
  • Similar to affirming that the writing they have submitted is their own and they have not plagiarized.

40 of 57

Rating activity

Are these strategies doable?

Which should we prioritize?

  • Go to https://www.menti.com/aliudexi4vsm (in the chat)
  • Or, go to Menti.com and enter 2395 3860 to rate the options

41 of 57

42 of 57

43 of 57

Should we incorporate text generator use into our teaching?

  • Ethics: Can we justify using a particular tool given concerns about labor, environmental impacts, data rights, bias, and others?
  • Privacy: Are we requiring students to submit their data to platforms that will not keep it private?
  • Opportunity costs: Are the ways we might use it better than existing and alternative teaching practices? Even if we ask students to critique or improve its outputs, is spending time reading and critiquing them really better than reading and critiquing human-written texts?
  • Preparation: Are we ready to explain the technology, show students its pitfalls, and make sure they can identify problems on their own?

44 of 57

Considerations if we do try incorporating text generators in our pedagogy: Effect on quality of thought

  • What kind of thinking processes are we hoping for in this assignment?
  • How might use of a text generator preempt thinking? It’s not always the best starting point. When we want to encourage students to develop their inklings, questions, suspicions, interests, maybe we shouldn’t use it too much or too early.
  • How could it help students extend their thinking? Do we want to encourage them to ask it for feedback?

45 of 57

Considerations if we do try incorporating text generators: Privacy

  • Consider offering students a pre-generated ChatGPT session to critique or another alternative to making an OpenAI account if they have concerns about their data privacy.
  • Warn students and make sure your assignment doesn’t invite writing that someone might not want to be public.

46 of 57

Do we need to teach students prompt engineering?

  • I’m skeptical that the skills needed to work with language models in our disciplines are different from rhetorical skills and expertise in our disciplines.
  • My take: Let’s stick to our principal learning goals. These will probably make for good prompting.
  • Won’t any technical details of prompting likely have changed by the time our students graduate?

47 of 57

Many instructors are exploring incorporating use of ChatGPT into pedagogy

Wharton business school professor Ethan Mollick has been a vocal and popular proponent of teaching students to use AI.

48 of 57

In his substack One Useful Thing, Mollick shares strategies such as

  • Requiring students to use multiple variations on a prompt and reflect on the variations in quality of ChatGPT outputs about course material
  • Teaching students to use more specific directions in their prompts, such as describing the style desired, length, tone, etc.
  • Teaching students to prompt for a short section of text at a time and give ChatGPT feedback on how to improve
  • Asking students to prompt ChatGPT for many ideas and then pick promising ones to develop.

49 of 57

Critical AI literacy? Yes, please!

Whether or not we teach with text generators, we can teach about them. We can start by introducing the concept of statistically generated text and dispelling any notion that AI is sentient, authoritative, or neutral. Teach students to watch for problems in AI outputs.

50 of 57

We need course materials

  • About AI in general and its risks and ethical considerations
  • About text generators/large language models and their uses, risks, and ethical considerations

Materials for discussion and adaptation

  • Handouts
  • Slides
  • Assignments
  • Lesson plans

51 of 57

Critical AI Literacy and Critical Assessment

A Canvas module

My idea: Students watch video and annotate orientations to ChatGPT, then read a NYT article and a sample ChatGPT critical assessment alongside a sample human-written assessment. They reflect on what ChatGPT misses and what they can learn about language models from the contrast.

Context: Complements the open text How Arguments Work.

What I am aiming to achieve: Understanding of language models as statistical text predictors, not thinkers. Familiarity with common deficiencies in their outputs. Increased skill and confidence with critical assessment.

Link to more information: View the activities on Canvas or Canvas Commons

References: Gary Marcus’s Scientific American article “AI Platforms like ChatGPT Are Easy to Use but Also Potentially Dangerous,” Leon Furze’s Teaching AI Ethics and others.

52 of 57

A Canvas Commons module

53 of 57

Ask students to reflect on the differences between ChatGPT’s critical assessment and the human-written assessment

What did ChatGPT miss? What did its output get right?

How do those observations match what we learned about how language models work?

How might the sample essay have turned out if the student had started with the ChatGPT output and revised from there?

What lessons do you draw from this comparison?

54 of 57

Further Resources from the Writing Across the Curriculum Clearinghouse

  • https://bit.ly/AIwritingEDU

  • AI Text Generators and Teaching Writing: Starting Points for Inquiry: news, analysis, and educators’ approaches to the subject.
  • Established with the support of Lee Nickoson and Mike Palmquist, curated by Anna Mills

55 of 57

56 of 57

About AI Text Generators/Large Language Models

Implications for Higher Ed Writing Assignments

Audio and Video 

Sample Policy Statements about Text Generators

Student Perspectives and Marketing to Students

Course Materials on AI Text Generators 

Assignments That Incorporate Text Generators

Peer-Reviewed Papers

Short Pieces on the General Topic of AI

Books on the General Topic of AI

Using Language Models, Including ChatGPT

Detecting AI-Generated Text

Using Text Generators for Help Preparing Courses and Assessing Students

Calls for Papers and Proposals

57 of 57

Questions? Comments?

Let the discussions continue as we sort this out together.

Anna Mills

armills@marin.edu, @EnglishOER

Slides: https://bit.ly/rethinkwritingUC

This presentation is shared under a CC BY NC 4.0 license.