arxiv:2511.08195

UI2Code^N: A Visual Language Model for Test-Time Scalable Interactive UI-to-Code Generation

Published on Nov 11 · Submitted by Wenyi Hong on Nov 17
Abstract

UI2Code^N, a visual language model enhanced through staged pretraining, fine-tuning, and reinforcement learning, achieves superior performance in UI-to-code generation, editing, and polishing with iterative feedback.

AI-generated summary

User interface (UI) programming is a core yet highly complex part of modern software development. Recent advances in visual language models (VLMs) highlight the potential of automatic UI coding, but current approaches face two key limitations: multimodal coding capabilities remain underdeveloped, and single-turn paradigms make little use of iterative visual feedback. We address these challenges with an interactive UI-to-code paradigm that better reflects real-world workflows and raises the upper bound of achievable performance. Under this paradigm, we present UI2Code^N, a visual language model trained through staged pretraining, fine-tuning, and reinforcement learning to achieve foundational improvements in multimodal coding. The model unifies three key capabilities: UI-to-code generation, UI editing, and UI polishing. We further explore test-time scaling for interactive generation, enabling systematic use of multi-turn feedback. Experiments on UI-to-code and UI polishing benchmarks show that UI2Code^N establishes a new state of the art among open-source models and achieves performance comparable to leading closed-source models such as Claude-4-Sonnet and GPT-5. Our code and models are available at https://github.com/zai-org/UI2Code_N.

Community

Paper submitter

UI2Code^N is a visual language foundation model trained through staged pretraining, fine-tuning, and reinforcement learning to achieve foundational improvements in multimodal coding. It unifies three key capabilities: UI-to-code generation, UI editing, and UI polishing.

GitHub: https://github.com/zai-org/UI2Code_N
HF: https://huggingface.co/zai-org/UI2Code_N
Homepage: https://zheny2751-dotcom.github.io/ui2code-n.github.io/
Paper: https://arxiv.org/abs/2511.08195
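
For readers who want to try the released checkpoint, here is a minimal, untested sketch of loading the Hugging Face model with Transformers and prompting it with a UI screenshot. The AutoModelForCausalLM/AutoProcessor classes, the processor call signature, and the generation settings are assumptions on my part; consult the model card for the officially supported loading code.

```python
# Minimal sketch (assumptions noted in comments), not the official usage example.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "zai-org/UI2Code_N"

# Assumption: the repo ships remote code compatible with the Auto* classes.
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Single-turn UI-to-code: ask for HTML/CSS that reproduces a screenshot.
image = Image.open("ui_screenshot.png")
prompt = "Generate HTML/CSS that reproduces this user interface."
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=2048)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```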

  • Interactive UI-to-Code Paradigm: We redefine UI-to-code generation as iterative reasoning with visual feedback, enabling flexible code generation, editing, and test-time scaling (e.g., a +12% improvement from four rounds of polishing; see the sketch after this list).
  • First Open-Source Unified UI2Code Model: UI2Code^N is the first open-source VLM to jointly support UI-to-code generation, UI editing, and UI polishing, achieving state-of-the-art results on Design2Code, Flame-React-Eval, and Web2Code and outperforming Gemini-2.5-Pro and Claude-4-Sonnet.
  • Full Training Recipe for a Coding VLM: We are the first to release a complete three-stage training pipeline (pretraining, supervised fine-tuning, and reinforcement learning with a novel reward design) that balances data realism with code generation quality.
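
To make the interactive paradigm concrete, below is a minimal sketch of the generate-render-polish loop behind test-time scaling. The helper callables (generate_code, render_to_screenshot, polish_code) are hypothetical wrappers around the model and a rendering backend such as a headless browser; they are not part of the released API. The default of four rounds mirrors the polishing setting cited above.

```python
from typing import Callable

def interactive_ui_to_code(
    target_screenshot,
    generate_code: Callable,         # screenshot -> code (initial UI-to-code pass)
    render_to_screenshot: Callable,  # code -> rendered screenshot (e.g., headless browser)
    polish_code: Callable,           # (code, rendered, target) -> revised code
    num_rounds: int = 4,             # four polishing rounds, as in the +12% result above
) -> str:
    """Iterative UI-to-code with visual feedback: generate, render, compare, revise."""
    code = generate_code(target_screenshot)
    for _ in range(num_rounds):
        rendered = render_to_screenshot(code)                   # visual feedback for this round
        code = polish_code(code, rendered, target_screenshot)   # model revises against the target
    return code
```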


Help me to create that website

