1 of 43

AI for Research Assistance: Skeptical Approaches

https://bit.ly/skepticalAIresearch

Anna Mills, English Instructor, College of Marin

From a keynote address to GICOIL

April 19, 2024

Licensed CC BY NC 4.0

2 of 43

Could AI be the best research assistant ever?

“I have basically found that it is the best research assistant I’ve ever had. So now if I’m looking up something for a column or preparing for a podcast interview, I do consult with generative AI almost every day for ideas and brainstorming.

And just things like research — make me a timeline of all the major cyber attacks in the last 10 years, or something like that. And of course I will fact-check that research before I use it in a piece, just like I would with any research assistant.”

-Kevin Roose on the Hard Fork podcast for The New York Times Opinion, April 5, 2024

3 of 43

What skills do you need to work with generative AI?

  • You ask it for what you want.
  • Then you question what it gives you. You revise, reject, add, start over, tweak.

To do this, you need

  • critical thinking, reading, and writing skills.
  • subject-matter expertise.
  • knowledge of what kinds of weaknesses to look out for in AI. Let’s call that critical AI literacy.

4 of 43

Let’s cultivate skepticism as we explore AI research tools

5 of 43

We’ve probably heard that AI makes things up. But what about when it works with real, current sources?

  • Remember the warning that “AI makes up sources”? AI systems still make them up, but not as often as a year ago.
  • Many AI systems now combine text generation with Internet or database search. This includes ChatGPT Plus, Google’s Gemini, Microsoft Copilot, and Perplexity.

6 of 43

Perplexity combines search and chat

Try it! You can go back to your earlier experiment with Perplexity and retry it with the default “All” focus.

It searches the Internet and (seems to) base its answers on the top sources it finds.

7 of 43

AI + real sources = good and bad news

  • It sounds like a dream! At long last, could we just explain what we are looking for and be in dialogue with the papers themselves?
  • But it doesn’t work reliably. These AI research assistants mimic academic citation practices, but they don’t always summarize correctly and may cite a source that is not really the origin of their output.

8 of 43

How does this work for academic research? Let’s look at Elicit: “The AI Research Assistant”

  • With Elicit, you write “your search query in natural language… There's no need to try to think of every possible keyword or synonym to find relevant papers. Just ask Elicit a question like you would ask an expert in the field.”
  • “Elicit's Find papers step can handle filter criteria directly in your query. If you try a query like ‘papers about the benefits of taking l-theanine published after 2020’ Elicit will automatically filter to papers published after 2020.” —Elicit Help

9 of 43

Elicit answers your research question in its own words

Instead of searching on “teacher shortages students effects”

And also “educator shortages” and “instructor shortages,” with “impacts”

the student can just ask their question in one way.
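The contrast can be sketched in a few lines of Python. This is a hypothetical illustration using the example terms above; the Boolean syntax is generic database query syntax, not the syntax of any particular academic database.

```python
# Keyword searching: the student must combine every synonym by hand.
subjects = ["teacher shortages", "educator shortages", "instructor shortages"]
outcomes = ["effects", "impacts"]

boolean_query = (
    "(" + " OR ".join(f'"{s}"' for s in subjects) + ")"
    + " AND "
    + "(" + " OR ".join(f'"{o}"' for o in outcomes) + ")"
)
print(boolean_query)
# ("teacher shortages" OR "educator shortages" OR "instructor shortages") AND ("effects" OR "impacts")

# Natural-language searching: one plain question stands in for all of the above.
question = "How do teacher shortages affect students?"
```

Adding just one more synonym to either list multiplies the combinations the keyword searcher must cover, which is exactly the cognitive overload the natural-language approach removes.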

10 of 43

Elicit lists papers and summarizes their elements

11 of 43

My students enjoyed Elicit’s intuitive interface and immediate response to their questions.

  • They found it easier to navigate than the academic databases they had just learned to search.
  • I want to keep sharing it as a way to open research to students who find it intimidating and who shut down due to cognitive overload when they have to tweak search terms, filters, and databases.
  • But I can see the potential for harm as well as good.

12 of 43

In one test, Elicit’s synthesis addressed a different question from the one I asked.

Question: Do language models trained partly on AI-generated text perform worse than ones trained only on human text?

Its answer was about detection of AI text and comparison of the quality of human writing and AI text, not about how training data affects performance.

Can we help students practice catching this kind of misinterpretation?

13 of 43

Elicit’s one-sentence summaries of papers sometimes miss key points.

Elicit’s summary of Student Perceptions of AI-Powered Writing Tools: Towards Individualized Teaching Strategies by Michael Burkhard: “AI-powered writing tools can be used by students for text translation, to improve spelling or for rewriting and summarizing texts.”

But the real abstract includes this other central point: “[S]tudents may need guidance from the teacher in interacting with those tools, to prevent the risk of misapplication. Depending on the different student types, individualized teaching strategies might be helpful to promote or urge caution in the use of these tools.”

Elicit’s “main findings” column describes this better, but the user has to specifically choose that option.

14 of 43

A variety of AI research apps offer similar functionality to Elicit.

15 of 43

Consensus.AI attempts to assess the level of agreement among scholars

  • You ask it a research question.
  • It offers a brief AI-generated synthesis answer based on ten papers.
  • For yes or no questions, it estimates what percent of the research leans “yes,” “possibly,” and “no” on the question.
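The percentage breakdown described above can be illustrated with a minimal tally sketch. This is hypothetical: Consensus classifies papers with AI models, not from pre-labeled data, and the labels below are invented for illustration.

```python
from collections import Counter

def consensus_meter(labels):
    """Tally 'yes' / 'possibly' / 'no' classifications into the
    percentage breakdown a Consensus-Meter-style display shows."""
    counts = Counter(labels)
    total = len(labels)
    return {k: round(100 * counts[k] / total) for k in ("yes", "possibly", "no")}

# Ten hypothetical paper classifications for a yes/no research question
labels = ["yes"] * 6 + ["possibly"] * 2 + ["no"] * 2
print(consensus_meter(labels))  # {'yes': 60, 'possibly': 20, 'no': 20}
```

The arithmetic is trivial; the hard, error-prone step is the AI classification of each paper's stance, which is why the percentages deserve the skepticism discussed on the next slides.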

16 of 43

The “Consensus Meter”

17 of 43

The “Consensus Meter” looks authoritative and quantitative with its percent ratings, but it comes with warnings and would be hard to double check.

  • “This feature is powered by generative AI and may generate incorrect information. Please read the papers and consult other sources.”
  • “Important Reminder: A result appearing at the top does NOT mean it is true. Interpreting research is complicated.”

18 of 43

Undermind finds and explains the most relevant papers

  • “Undermind highlights the precise papers you should focus on and gives a clear explanation for each decision.”
  • It analyzes the full text of research articles.

19 of 43

If you have one research paper, Keenious helps you find related ones

  1. Upload a paper
  2. Keenious suggests topics based on the paper
  3. Keenious suggests related research and allows you to filter by topic.

20 of 43

Scite: similar to Elicit, but with a special focus on citation

  • “Read what research articles say about each other”
  • “Smart Citations allow users to see how a publication has been cited by providing the context of the citation and a classification describing whether it provides supporting or contrasting evidence for the cited claim”
  • How accurate are these assessments?

21 of 43

SciSpace: similar to Elicit, but adds “AI Chat for Scientific PDFs”

  • “Copilot” mode takes a chat-with-the-research approach: it shows a scholarly paper on one side and a chat pane on the other, with suggested prompts.
  • It will do an automated “literature review” on the basis of a user-entered question. What’s the quality?

22 of 43

ResearchRabbit.AI, “Spotify for papers”

  • Research Rabbit helps scholars organize papers and discover connections between them.
  • Its maps of networks of connections between authors and between papers could give students a visual representation of research as conversation. Connected Papers also does this.

23 of 43

AI features common in these apps are now appearing within academic databases themselves. AI functionality is going mainstream in the research process.

24 of 43

So how do we guide students to use AI for research wisely?

25 of 43

Let’s be skeptical of the efficiencies promised by AI apps. Much of the thinking happens in that inefficient reading time.

SciSpace’s tag line is “Do hours worth of reading in minutes”

Elicit.org’s tag lines are

  • “Analyze research papers at superhuman speed.”
  • “Automate time-consuming research tasks like summarizing papers, extracting data, and synthesizing your findings.”

26 of 43

There are losses with such seeming efficiency. Emily Bender and Chirag Shah have raised concerns about these search-LLM combinations.

In “Situating Search,” Bender and Shah argue that “removing or reducing interactions in an effort to retrieve presumably more relevant information can be detrimental to many fundamental aspects of search, including information verification, information literacy, and serendipity.”

Proceedings of the 2022 Conference on Human Information Interaction and Retrieval, March 2022

27 of 43

Even the systems that search the Internet and databases will misrepresent and make things up

  • They may create an illusion of easy and efficient access and understanding of the field.
  • Maybe the student isn’t really getting the sources they need or understanding them accurately.
  • Even if the student reads the original text, are their reading skills adequate to help them discern whether the summary is accurate?

28 of 43

Example of an error in summary: After listening to Ezra Klein’s podcast, I asked Perplexity, “What does Ezra Klein think AI will do to the Internet?”

Perplexity.AI: [screenshot of Perplexity’s answer, which attributes the view to Klein]

But no! His guest Nilay Patel said that, as the footnoted source indicates!

29 of 43

Let’s make sure students practice checking how AI handles information

Invite students to try out one of these systems that purports to cite its sources and/or aid with research. Ask them to find something the AI missed.

  • Is each source really talking about what the AI summary says it’s talking about?
  • Did the AI summary miss anything central?
  • Is the summary or synthesis accurate?

30 of 43

One lesson: “Fact-Checking Auto-Generated AI Hype”

I asked students to fact-check a list of claims and sources generated by ChatGPT. They commented in the margins of a chat session transcript, speaking back to and correcting ChatGPT’s handling of sources.

See this description of the assignment with materials and samples, published in TextGenEd: Teaching with Text Generation Technologies from the Writing Across the Curriculum Clearinghouse.

31 of 43

ChatGPT misinformation from a chat session on surprising AI facts

“AI Can Decode Ancient Scripts:

  • Fact: Researchers used AI to decipher the ancient script known as Linear B.
  • Source: Jucha, M. A., Arjovsky, M., & Bengio, Y. (2017). Decipherment of the Linear B script with deep learning. arXiv preprint arXiv:1708.08731.
  • Credibility: arXiv is a repository for electronic preprints, and while not peer-reviewed, many significant findings are first reported here. Yann Bengio, one of the authors, is a Turing Award winner.”

There’s no such paper and no such author!

32 of 43

What happened? Yann LeCun + Yoshua Bengio = Yann Bengio?

Yann LeCun and Yoshua Bengio are computer scientists considered “godfathers” of AI who have collaborated. ChatGPT combined their names.

33 of 43

ChatGPT generated the claim, “AI creates original art and music.” I annotated its supposed source and shared this with students.

34 of 43

Students also practiced assessing ChatGPT’s explanations for why sources were credible

ChatGPT output cited the Facebook AI blog: “While a company blog might not be a traditional academic source, it's a primary source in this case because it's directly from the team that conducted the research.”

The students pushed back on the idea that a company blog is credible just because it contains internal company information.

35 of 43

What will we do if we ask students to use AI and the students don’t want to?

36 of 43

If you incorporate a language model, give students a comparable alternative in case they have privacy or data rights concerns

  • Consider offering students a pre-generated ChatGPT session to critique or another alternative if they have concerns about their data privacy.
  • Perplexity.ai with the “Writing” focus option doesn’t require an account.
  • See Blueprint for an AI Bill of Rights for Education by Kathryn Conrad.

37 of 43

Further resources for ideas on teaching with and about AI

Collections of ideas and tested pedagogical practices. 

38 of 43

The AI Pedagogy Project from Harvard's metaLAB

39 of 43

TextGenEd: Teaching with Text Generation Technologies

Edited by Annette Vee, Tim Laquintano & Carly Schnitzler

And published by the Writing Across the Curriculum Clearinghouse

40 of 43

Browse, comment, and share your own informal reflections on the Exploring AI Pedagogy site from the MLA/CCCC Task Force on Writing and AI

41 of 43

One more reason why we need to teach discerning, skeptical approaches to AI: We and our students can help shape the future of the information landscape and mitigate harms.

From the Ezra Klein Show interview with Nilay Patel for New York Times Opinion, April 5, 2024. Patel is editor of The Verge.

EZRA KLEIN: What is A.I. doing to the internet right now?

NILAY PATEL: It is flooding our distribution channels with a cannon-blast of — at best — C+ content that I think is breaking those distribution channels…. I think right now it’s higher than people think, the amount of A.I. generated noise, and it is about to go to infinity.

EZRA KLEIN: What happens when this flood of A.I. content gets better? What happens when it doesn’t feel like garbage anymore? What happens when we don’t know if there’s a person on the other end of what we’re seeing or reading or hearing?

42 of 43

With AI, knowing what’s true and where information comes from will keep being important, and will get more complicated. How will we as a society shape this?

The bottom line: let’s get to know AI. Our voices are needed!

Be curious, be bold.

If we work in education, we likely have critical thinking and communication skills that will help us use AI.

Our students need our guidance, and our voices are needed in the larger policy conversations around AI in society.

43 of 43

Questions or comments? Thank you, and feel free to get in touch!

AnnaRMills.com

Twitter/X: @EnglishOER

LinkedIn: anna-mills-oer

Slides open for commenting: https://bit.ly/skepticalAIresearch

This presentation is shared under a CC BY NC 4.0 license.