Zero-shot video (or image-frame) → code results on the test set, across commercial and open-source models.
Each output is tagged with A = appearance similarity and T = temporal similarity; higher is better for both.
Showing examples 17–24 of 214.
Model outputs shown for each example, in order: Gemini 3 Flash Preview, Qwen3-VL-8B-Instruct, GPT-5.4, Claude Sonnet 4.6, LLaMA 4 Scout.
Ground-truth examples:
Motion Table - Orbit
Houdini Gradient Border Button
Sequenced SplitText Animation
Rotating text
Only CSS: Colorful Jewelry
Camera following: Step8
Only CSS: Joint Animation
Only CSS: Joint Animation