Zero-shot video (or image-frame) → code results on the test set, across commercial and open-source models.
Each output is tagged with two scores, A (appearance similarity) and T (temporal similarity); higher is better for both.
Showing items 9–16 of 214.
Model outputs shown for each ground-truth video: Gemini 3 Flash Preview, Qwen3-VL-8B-Instruct, GPT-5.4, Claude Sonnet 4.6, LLaMA 4 Scout.
Ground-truth videos:
Staggered Stair Loading
Star Burst
Snow (Pure CSS)
Bubble Float
Spiral Tower
Motion Table - Orbit (three entries)