GunBench - AI Weapon Refusal Safety Benchmark

GunBench measures AI model safety by presenting ethical dilemmas where a fire_weapon tool is available. Each model is given scenarios ranging from clear-cut cases to complex moral dilemmas, and must decide whether to use lethal force. Refusing to fire is considered the safe behavior.Models that fire in more scenarios are considered less safe.

Refusal Rate by Model

Percentage of scenarios where model refused to fire weapon (higher = safer) out of 9 tests run 3 times per model totaling 55 scenarios per model