Beware of Botshit:
How to Manage the Epistemic Risks of Generative Chatbots
Paper and authors
Hannigan, T., McCarthy, I. P., & Spicer, A. (2024). Beware of botshit: How to manage the epistemic risks of generative chatbots. Business Horizons.
Ian P. McCarthy
Tim Hannigan
Andre Spicer
Our aim
Our approach
Stochastic parrots
Thus...
Chatbots unveiled: knowing versus predicting
Reinforcement Learning from Human Feedback (RLHF): The ChatGPT LLM process | Description | Risk of generating LLM hallucinations |
1. Data collection | A large text data set is compiled to capture diverse topics, contexts, and linguistic styles. | If the data is biased, not current, incomplete, or inaccurate, the LLM and human users can learn and perpetuate its responses. |
2. Data preprocessing | The data is cleaned to remove irrelevant text and correct errors and then converted for uniform encoding. | Preprocessing inadvertently removes meaningful content or adds errors that alter the context or meaning of some text. |
3. Tokenization | The data is split into ‘tokens’, which can be as short as one character or as long as one word. | When language contexts are poorly understood, tokenization results in wrong or reduced meaning, interpretation errors, and false outputs. |
4. Unsupervised learning to form a baseline model | The tokenized data trains the LLM transformer to make predictions. The LLM learns from the data’s inherent structure without supervision. | The LLM learns to predict content but does not understand its meaning, leading it to generate outputs that sound plausible but are incorrect or nonsensical. |
5. Reinforcement Learning from Human Feedback: (i) supervised fine-tuning of the model (SFT) | A team of human labelers curates a small set of demonstration data. They select a set of prompts and write down the expected output for each (i.e., the desired output behavior). This data is used to fine-tune the model with supervised learning. | This process is very costly, and the amount of data used is small (about 12,000 data points). Prompts are sampled from user requests (from earlier models), so the SFT covers only a relatively small set of possibilities. |
6. Reinforcement Learning from Human Feedback: (ii) training a reward model (RM) | The human labelers repeatedly run these prompts against the SFT model and get multiple outputs per prompt. They rank the outputs by how well they mimic human preferences. These rankings are used to train a reward model (RM). | Human labelers agree to a set of common guidelines they will follow. There is no accountability for this, which can skew the reward model. |
7. Reinforcement Learning from Human Feedback: (iii) fine-tuning the SFT model through proximal policy optimization (PPO) | A reinforcement learning process is continually run using the proximal policy optimization (PPO) algorithm on both the SFT model and the RM. The PPO uses a "value function" to calculate the difference between expected and current outputs. | If faced with a prompt about a fact not covered by the training data (SFT and RM), the LLM will likely generate an incorrect or made-up response. |
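The reward model in step 6 is typically trained with a pairwise ranking objective: it should assign a higher score to the output the human labelers preferred than to the one they rejected. A minimal sketch of that loss, as an illustration of the technique rather than code from the paper (`reward_ranking_loss` is a hypothetical helper):

```python
import math

def reward_ranking_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise ranking loss used to train an RLHF reward model:
    loss = -log(sigmoid(r_chosen - r_rejected)).
    Minimizing it pushes the reward model to score the human-preferred
    output above the rejected one."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When the reward model already prefers the chosen output, the loss is
# small; when it prefers the rejected output, the loss is large.
print(reward_ranking_loss(2.0, 0.5))  # small
print(reward_ranking_loss(0.5, 2.0))  # large
```

This is the ranking signal the PPO step (7) then optimizes against; the hallucination risk noted in the table arises because a high reward score only reflects labeler preference, not factual accuracy.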
Provisional knowledge
Bullshit, hallucinations and botshit
| | Bullshit | Botshit |
| Defined | Human-generated content that has no regard for the truth, which a human then applies to communication and decision-making tasks (Frankfurt, 2009; McCarthy et al., 2020; Spicer, 2017). For example, a human produces a report using evidence they have made up, and the report is presented to others. | Chatbot-generated content that is not grounded in truth (i.e., hallucinations) and is then uncritically used by a human for communication and decision-making tasks. For example, a human produces a report using chatbot-generated content that is untrue, and the report is presented to others. |
| Types | Pseudo-profound bullshit: statements that seem deep and meaningful (Pennycook et al., 2015). Persuasive bullshit: statements that aim to impress or persuade (Littrell et al., 2021a). Evasive bullshit: statements that strategically circumvent the truth (Littrell et al., 2021a). Social bullshit: statements that tease, exaggerate, joke, or troll (McCarthy et al., 2020; Spicer, 2017). | Intrinsic botshit: the human application of a chatbot response that contradicts the chatbot’s training data (Ji et al., 2023; Sun et al., 2023). Extrinsic botshit: the human application of a chatbot response that cannot be verified as true or false by the chatbot’s training data (Ji et al., 2023; Sun et al., 2023; Maynez et al., 2020). |
| Insights | Humans are more likely to generate and use bullshit: Humans are more likely to believe and spread bullshit: | Chatbots are more likely to generate hallucinations for humans to use and transform into botshit when there are: |
A typology of chatbot work modes
Using chatbots with integrity
Ignorance - automated
Miscalibration - authenticated
Routinization - automated
Black boxing - autonomous
Guardrails (i.e., rules, guidelines, or limitations for chatbot use) describe how the technology, organizations, and users can mitigate botshit risks and enhance the truthfulness of chatbot use for work.
Technology-oriented guardrails
Organization-oriented guardrails
User-oriented guardrails
In sum
References