Overview of Generative AI for R coding
A/Prof Chris Brown
University of Tasmania
c.j.brown@utas.edu.au
Claude Sonnet 3.7 ….
Overview
genAI what is it?
How LLMs work
LLM abilities
Example
Help me complete this analysis @benthic-readme.md
What do LLMs mean for science?
What do LLMs mean for science?
LLM evaluation
LLM evaluation – R example
Level of complexity, number of tasks from:
1. Read data from file(s)
2. Data wrangling (filtering, transposing, etc.)
3. Visualization
4. Machine learning or statistics applications
5. Handling more than one dataset
LLM evaluation example
Jansen et al. 2025 “Leveraging large language models for data analysis automation”
LLM evaluation
Jansen et al. 2025 “Leveraging large language models for data analysis automation”
Common responses to LLMs
What should you learn right now?
What tools should I use for coding and stats?
Github copilot in VScode
LLM jargon essentials
Token a part of a word that is the unit of prediction. e.g. GPT3.0 had ~50,000 tokens
Prompt a piece of text that is used to generate content from an LLM.
Prompt engineering the process of designing and refining prompts to get the desired output from an LLM.
Context window the number of tokens that the LLM uses to predict the next token. The LLM’s ‘memory’ size.
Temperature Low temperature means the LLM favours the highest probability token, high temperature means its more random
LLM jargon essentials
API Application Programming Interface
System message
User message
Assistant message
LLM software
AI assistant Software that manages your interactions with an LLM
Tools Give LLM abilities to do things like access data, run code
Agents Can run autonomously by writing code, returning results to themselves as a prompt, then responding and so on
Prompt engineering
A conversation looks like this:
System message
User message
Assistant response
User message
Assistant response
etc. until the context window is full
Prompting strategies
Be clear about your workflow
1. Plan statistical approach (science part)
2. Organize project and plan workflow
3. Implement plan – write the code
Best to treat each step separately
See our pre-print linked in the notes: Brown and Spillias (2025)
1. Prompting strategies for statistical advice
What the LLM won’t do that a real statistical consultant would do:
Ask you for more information before giving an answer
1. Prompting strategies for statistical advice
1. Prompting strategies for statistical advice
You are an expert in ecological statistics with the R program.
I want to statistically test the dependence of fish abundance on coral cover. I have observations of coral cover (continuous percentage) and fish abundance (count of number of fish). Observations were made at 49 different locations. Observations were made with standardized surveys, so the area surveyed at each site was the same.
Sites are spatially clustered into different regions. Provide me with several options for statistical approaches would be appropriate for answering my research question. Use chain of thought to reason about each approach before providing a final summary.
I've attached the data [data] and a reference on analysis of count data with ecology [reference].
2. Prompting strategies for planning
2. Prompting strategies for planning
Help me plan the steps to complete this analysis. This should include a series of scripts that we will need to make for each step of the analysis. It should also include a plan for how to structure the project directory. Create modular scripts for this analysis with separate files for data preparation, model fitting, diagnostics, and visualization. Save data files for intermediate steps. Use chain of thought reasoning to think carefully about each step.
2. Prompting strategies for planning
3. Prompting strategies for implementation
What does this all mean for science?
Illusion of understanding
Messeri and Crockett Nature “Artificial intelligence and illusions of understanding in scientific research” https://www.nature.com/articles/s41586-024-07146-0
Illusion of understanding
Messeri and Crockett Nature “Artificial intelligence and illusions of understanding in scientific research” https://www.nature.com/articles/s41586-024-07146-0
Illusion of understanding
Messeri and Crockett Nature “Artificial intelligence and illusions of understanding in scientific research” https://www.nature.com/articles/s41586-024-07146-0
Ethics and environmental issues
What this means for science?
What you should do
What our discipline should do
Interactive workshop
You’ll need: