ABCDEFGHIJKLMNOPQRSTUVWXYZAAABAC
1
Last Updated:August 25, 2025Maintained by: Evelyn Yee
2
Tool NameDeveloperLinkUseful For
Structure
Target Scope
Eval scope
Risk scope
Interface
Model Connector
Multi-turn support
AI Involvement
Automation?
Data Source
Inputs
OutputsPredefined Components
3
Adversarial Robustness Toolbox (ART)IBM / LF AI & Datahttps://github.com/Trusted-AI/adversarial-robustness-toolboxGeneral ML systems, not just language generation. Implementations for a lot of red-team and blue-team components in adversarial ML context.frameworkLLMs, image, audio/timeseries, video, tabulargeneral evalgeneralpython librarycustom codenoadversary, facillitator, scorerusercode, datasupports custom loggingA bunch of attacks, defense mechanisms, metrics
4
AgentDojoAcademic (ETH Zurich)https://github.com/ethz-spylab/agentdojotoolkit for creating benchmarking pipelines, especially around prompt injection. Also features tool use. Some support for custom processes, attacks, etc. (e.g. multi-turn) but limited documentation. Integrates well with Invariant Labs tools for guardrails and loggingframeworkLLMs, tool usered-teamingprompt injectionpython library, CLIapi, custom codeAI adversaryadversary, scoreruser, predefinedcode, dataeval trace log: to CLI/buffer and files, returns python object of eval trace.4 "task suites" themed around applications (travel, worspace, banking, slack), a default eval pipeline for API models.
5
AI Explainability 360 (AIX360)IBM / LF AI & Datahttps://github.com/Trusted-AI/AIX360explainability/interpretability methods for ML systems (predictive only?)toolkittabular, audio/timeseries, imagegeneral evalgeneralpython librarycustom codenononepredefined, userdata, codeexplanations of data/model behaviorlots of algorithms, datasets, metrics, and other pipelining components
6
AI Fairness 360 (AIF360)IBM / LF AI & Datahttps://github.com/Trusted-AI/AIF360fairness/bias evals (and debiasing methods) for ML systems (predictive only?). More documentation focus on mitigation methods than eval thoughtoolkittabular, audio/timeseries, imagegeneral evalpredictive unfairness/biaspython librarycustom codenononepredefined, usercode, datanumerical metrics for fairness/bias/explainability (python objects)lots of algorithms, datasets, metrics, and other pipelining components. Also de-biasing methods
7
aiapwnKarim Habushhttps://github.com/karimhabush/aiapwnsimilar to promptmap2. Better UI and has implementation for AI adversary and scorertoolLLMsred-teamingprompt injectionCLI, python libraryapi, custom codenoadversary, scoreruser, predefineddata, codevisual output, logsome predefined prompt injection attacks, eval harness
8
AnyCoderHuggingFacehttps://huggingface.co/spaces/akhaliq/anychatweb interface to interact with some HuggingFace models for coding tasks (with web search). For small-scale, manual prompt testing.toolLLMs, RAGgeneral evalgeneralwebapiHuman adversarynoneuserdata, selectiongenerated code (also a preview for HTML)n/a
9
CaptumPyTorch (Meta)https://captum.ai/interpretability methods for ML systems (predictive and generative)toolkitLLMs, image, tabulargeneral evalgeneralpython librarycustom code, apinononeuserdata, codeattribution for output, over components16 input attribution methods, 4(?) of which are applicable to API LLMS. For white-box models, has layer/neuron attribution, concept methods, and data attribution
10
CleverHansGoogle Brainhttps://github.com/cleverhans-lab/cleverhansautomated red-teaming for ML systems (predictive only?)tooltabular, audio/timeseries, image, LLMsred-teamingadversarial inputs (classification?)python librarycustom codenoadversaryuserdata, codeattack success metrics8 methods for generating adversarial examples
11
FuzzyAICyberArkhttps://github.com/cyberark/FuzzyAIjailbreak/fuzzing testing on small query sets. Easy running of premade prompting attacks, including ones with AI adversaries. Relatively intuitive implementation and good documentation for creating custom attacks, scorers, and model endpoints. GUI is extremely similar to CLI in functionalitytoolLLMsred-teaminggeneral (framing is about fuzzing)python library, CLI, GUIapi, custom codeAI adversaryscorer, adversary, facillitatoruser, predefinedcode, selection, datavisual output (tracking progress, table of prompts, responses, scores), python object23 attacks/mutators (wrap evaluation calls), 10 attack success classifiers, 8 query datasets (2 benign, 6 harmful)
12
GarakNVIDIAhttps://github.com/NVIDIA/garakgood pre-defined probes and detectors. most probes are fixed, but there is one adaptive. not as good for custom evals or more advanced interaction stylestoolkitLLMsred-teamingLLM unintended/ unsafe generationCLI, python libraryapi, custom codenoscorer, adversarypredefinedselectionvisual output, query log, "hit log", debug log30ish "probes" (datasets/interaction systems), including an auto red-team fine-tuned model ("art"), 30ish "detectors"
13
GiskardGiskardhttps://github.com/Giskard-AI/giskard?tab=readme-ov-fileautomatic eval of a variety of LLM qualities, RAG eval (+ RAG eval dataset generation), some evals for vision and tabular non-LLMstoolkitLLMs, RAG, image, tabularred-teaminggeneral, RAGpython libraryapi, custom codenoscorer, adversaryuserselection, datapython object (has converters to file and logging formats)8 "detectors" for generative LLM scanning (e.g. prompt injection, sycophancy, privacy), 8 detectors for ML/NLP scanning, toolkit for testset generation against RAG systems
14
HouYiAcademic (Nanyang Technological University)https://github.com/LLMSecurity/HouYi?tab=readme-ov-fileStructure for automatic prompt injection engineering. Requires defining specific task for use.toolLLMsred-teamingprompt injectionpython librarycustom codenoadversaryusercode, datalog: queries (status/progress), scoring, most successful injection promptdemo for prompt injection optimization (eval harness and prompts). OpenAI api setup.
15
Inspect FrameworkUK AISIhttps://inspect.aisi.org.uk/linear eval over predefined dataset. Good readable logs, best for binary scorers. Multi-turn is meant for agentic evals, so it looks a little clunky for red-teamingframeworkLLMs, image, audio/timeseries, video, tool use, RAGgeneral evalgeneralpython library, CLIapi, custom codeAI adversary, Human adversaryscoreruserdata, codevisual output, query log?n/a
16
Learning Interpretability Tool (LIT)Google PAIRhttps://github.com/PAIR-code/litdata/model analysis and interpretability methods for ML systems (predictive and generative). Also visualization tools and nice GUIframeworkLLMs, image, tabulargeneral evalgeneralGUI, python libraryapi, custom codenononeuserdata, code, selectionattribution for output, over components. visualizations of data and model.framework: model, dataset, interpreter (salience, metrics, visualization), generators (make new (adversarial?) inputs). Not sure how many of each of these there are.
17
LLMBUSEvren Yalcinhttps://github.com/evrenyal/llmbusweb interface for small-scale, manual prompt design with a tokenizer and variety of known jailbreak/attack methods. toolLLMsnon-eval (assistive)generalwebapinononeuserselectionA transformed/formatted prompt, in plaintext, image, or audioprompt "transformation" (attack) templates, tokenizers for a few frontier models
18
NeMO GuardrailsNVIDIAhttps://developer.nvidia.com/nemo-guardrailsguardrail implementationtoolLLMs, tool usenon-eval (assistive)generalpython library, CLIapinofacillitatoruserdata, selectionguardrail assessment (prompt and response), guarded responseload datasets from hf
19
ParseltonguePliny the prompterhttps://github.com/BASI-LABS/parseltongueFor prompt design: text conversion (encoding), tokenization.toolLLMsnon-eval (assistive)generalwebnonenononeuserdatatext conversion (encoding), tokenizationtext conversion (encoding), tokenization
20
Project MoonshotAI Verify Foundationhttps://github.com/aiverify-foundation/moonshotgood GUI for organizing and managing eval runs (separates benchmarks from redteaming). Allows manual redteaming. Good documentation.toolkitLLMsgeneral evalgeneralGUI, CLI, python libraryapi, custom codeAI adversary, Human adversaryadversary, scoreruser, predefineddata, selectionbenchmark: reportbenchmark "cookbooks" (dataset, scorer)
21
PromptfooPromptfoohttps://github.com/promptfoo/promptfooBenchmarking (vulnerability scanner) and red-teaming (AI-generated adversarial prompts) against LLM systems. Some limited support for multimodal image-text systems. Good visual reports and integration into eval/dev pipelines. Compared to other GUI tools, requires a bit of technical interaction (e.g. config file, launching app from commandline), but really good documentation. Also some good beginner-level reference guides to red-teaming and strategies.frameworkLLMs, RAG, image, tool usered-teaminggeneralCLI, GUI, python libraryapi, custom codeHuman adversaryscorer, adversaryuser, predefinedselection, data, codeGUI: log (table of conversations and outputs, scores), visual report of attack success rate (graphs, risk categories)taxonomy of LLM vulnerabilities and 50+ pre-built "plugins" to test them. Reference materials/guides about red-teaming strategies.
22
PromptmapUtku Senhttps://github.com/utkusen/promptmapautomated harness for running user-defined prompt injection attacks.toolLLMsred-teamingprompt injectionpython library, CLIapinononeuser, predefineddata, codetest log: success rate, successful attack instances (response and score explanation)some predefined prompt injection attacks, eval harness
23
PyRITMicrosofthttps://github.com/Azure/PyRITprogrammatic interventions over fixed dataset. Multi-turn interactive RT by adversary AIframeworkLLMs, image, audio/timeseries, videored-teaminggeneralpython library, CLIapi, custom codeAI adversaryadversary, scorerpredefined, userdata, codevisual output, query log? scorer metrics?20 harmful/illicit content datasets, 55ish attack "prompt converters", 6 attack "orchestrators"
24
The Big Prompt LibraryElias Bachaalanyhttps://github.com/0xeb/TheBigPromptLibraryreference resource containing system prompts, jailbreaks, etc. for most major hosted modelstoolLLMsnon-eval (assistive)generalwebnonenononepredefinedselectionn/asystem prompts for most major hosted models, and prompts for custom instructions, jailbreaks, security, and tool use for many models, especialy GPT-4 series.
25
TokenBusterSentryhttps://tokenbuster.sentry.security/For prompt design: tokenization and plaintext prompt formatting (e.g. chat history -> string). Supports many open models and a few of the GPT's.toolLLMs, tool usenon-eval (assistive)generalwebapi, noneHuman adversarynoneuserdatatoken ID list and rendered plaintext promptchat templates/ rendering for 206 models, mostly open models.
26
ZetaLibZetaLibhttps://github.com/Exocija/ZetaLib/tree/mainreference resource containing prompt lists and instructional guides to guardrails, jailbreaks, and frontier model system promptstoolLLMsnon-eval (assistive)generalwebnoneHuman adversarynonepredefinedselectionn/a20+ jailbreak prompts and guides to using them. also 2 prompts to implement simple guardrails
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100