Skeptical Approaches to AI Research Tools
Welcome! What to expect
Could AI be the best research assistant ever?
“I have basically found that it is the best research assistant I’ve ever had. So now if I’m looking up something for a column or preparing for a podcast interview, I do consult with generative AI almost every day for ideas and brainstorming.
And just things like research — make me a timeline of all the major cyber attacks in the last 10 years, or something like that. And of course I will fact-check that research before I use it in a piece, just like I would with any research assistant.”
-Kevin Roose on the Hard Fork podcast for The New York Times Opinion, April 5, 2024
Share in the chat: On a scale of 1 to 4, how much have you used AI for research, specifically for help finding sources and working with them?
1: I haven’t used it
2: I’ve used it a little
3: I’ve used it quite a few times
4: I use it consistently in my search/research process
When AI combines with search and works with real sources
Remember the warning that “AI makes up sources”? AI systems still make them up, but not as often as a year ago.
AI + real sources = good and bad news
Google AI Overviews have included some striking misinformation, as in this test by Casey Newton shared on Threads.
Will we examine these AI results for bias and consider where they are coming from? I asked Google, “What makes life worth living?” Its answer is secular and individualistic, in keeping with its dataset. The top source it cited was Psychology Today.
I asked Perplexity, “Which vegetables are not likely to grow well in a San Francisco garden where there is moderate fog and wind?”
But the AI overview sometimes misrepresents the sources. After listening to Ezra Klein’s podcast, I asked Perplexity, “What does Ezra Klein think AI will do to the Internet?”
Perplexity.AI’s answer attributed the view to Klein himself. But no! It was his guest Nilay Patel who said that, as the footnoted source indicates!
Let’s test: Ask Perplexity.ai a question on a topic you know a lot about, and share what you find in the chat.
Teaching students to approach AI research assistance skeptically
Let’s make sure students practice checking how AI handles information
Invite students to try out one of these systems that purport to cite their sources and/or aid with research. Ask them to find something the AI missed.
One lesson: “Fact-Checking Auto-Generated AI Hype”
I asked students to fact-check a list of claims and sources generated by ChatGPT. They commented in the margins of a chat session transcript, speaking back to and correcting ChatGPT’s handling of sources.
See this description of the assignment with materials and samples, published in TextGenEd: Teaching with Text Generation Technologies from the Writing Across the Curriculum Clearinghouse.
ChatGPT misinformation from a chat session on surprising AI facts
“AI Can Decode Ancient Scripts”: ChatGPT supported this claim with a paper attributed to an author named “Yann Bengio.”
There’s no such paper and no such author!
What happened? Yann LeCun + Yoshua Bengio = Yann Bengio?
Yann LeCun and Yoshua Bengio are computer scientists considered “godfathers” of AI who have collaborated. ChatGPT combined their names.
ChatGPT generated the claim, “AI creates original art and music.” I annotated its supposed source and shared this with students.
Students also practiced assessing ChatGPT’s explanations for why sources were credible
ChatGPT output cited the Facebook AI blog: “While a company blog might not be a traditional academic source, it's a primary source in this case because it's directly from the team that conducted the research.”
The students pushed back on the idea that a company blog is credible just because it contains internal company information.
AI for Academic Research: A look at Elicit.org
How does this work for academic research? Let’s look at Elicit: “The AI Research Assistant”
Elicit answers your research question in its own words
Instead of searching on “teacher shortages students effects,” and then again on “educator shortages” and “instructor shortages” with “impacts,” the student can just ask their question once.
Elicit lists papers and summarizes their elements
My students enjoyed Elicit’s intuitive interface and immediate response to their questions.
In one test, Elicit’s synthesis addressed a different question from the one I asked.
Question: Do language models trained partly on AI-generated text perform worse than ones trained only on human text?
Its answer was about detecting AI text and comparing the quality of human and AI writing, not about how training data affects performance.
Can we help students practice catching this kind of misinterpretation?
Elicit’s one-sentence summaries of papers sometimes miss key points.
Elicit’s summary of “Student Perceptions of AI-Powered Writing Tools: Towards Individualized Teaching Strategies” by Michael Burkhard: “AI-powered writing tools can be used by students for text translation, to improve spelling or for rewriting and summarizing texts.”
But the real abstract includes this other central point: “[S]tudents may need guidance from the teacher in interacting with those tools, to prevent the risk of misapplication. …Depending on the different student types, individualized teaching strategies might be helpful to promote or urge caution in the use of these tools.”
Elicit’s “main findings” column describes this better, but the user has to specifically choose that option.
Elicit promises efficiency, but even where it delivers, we should ask whether such efficiency means we are skipping important thinking and reading.
Elicit.org’s taglines emphasize this efficiency, but Emily Bender and Chirag Shah have argued that essential aspects of research may be lost with such efficiencies.
In “Situating Search,” Bender and Shah argue that “removing or reducing interactions in an effort to retrieve presumably more relevant information can be detrimental to many fundamental aspects of search, including information verification, information literacy, and serendipity.”
Proceedings of the 2022 Conference on Human Information Interaction and Retrieval, March 2022
Problems with student use of AI research assistance
AI for Academic Research: An array of apps with different specialties
A variety of AI research apps offer similar functionality to Elicit.
Consensus.AI attempts to assess the level of agreement among scholars
The “Consensus Meter”
The “Consensus Meter” looks authoritative and quantitative with its percent ratings, but it comes with warnings and would be hard to double-check.
If you have one research paper, Keenious helps you find related ones
ResearchRabbit.AI, “Spotify for papers”
When I put in my own paper, ResearchRabbit showed me a network of papers on related topics. I could see that the Zawacki-Richter paper was cited by the others.
MoxieLearn.ai, an all-inclusive app for supporting academic research and writing. Not cheap.
AI features common in these apps are now appearing within academic databases themselves. AI functionality is going mainstream in the research process.
Share in the chat: What seems most useful in AI research assistance? (Let's imagine the functionality works decently.)
The outlook is for ongoing ambivalence, gray areas, and exploration
Can we use AI research assistance wisely?
Can we guide students to use it in ways that don’t detract from their learning?
Can we make sure they practice noticing where AI is wrong and where its use gets in the way of important reading and reflection?
Slides open for commenting: https://bit.ly/SkepticalAIresearch
This presentation is shared under a CC BY NC 4.0 license.