ground truth
Exploring Bourbon
css
model outputs
Gemini 3 Flash Preview: A 0.81, T 0.25
Qwen3-VL-8B-Instruct: A 0.77, T 0.23
GPT-5.4: A 0.90, T 0.26
Claude Sonnet 4.6: A 0.87, T 0.30
LLaMA 4 Scout: A 0.59, T 0.29
<div class="container"><div class="item five"></div>