arxiv:2511.08195

UI2Code^N: A Visual Language Model for Test-Time Scalable Interactive UI-to-Code Generation

Published on Nov 11 · Submitted by Wenyi Hong on Nov 17
Abstract

UI2Code^N, a visual language model enhanced through staged pretraining, fine-tuning, and reinforcement learning, achieves superior performance in UI-to-code generation, editing, and polishing with iterative feedback.

AI-generated summary

User interface (UI) programming is a core yet highly complex part of modern software development. Recent advances in visual language models (VLMs) highlight the potential of automatic UI coding, but current approaches face two key limitations: multimodal coding capabilities remain underdeveloped, and single-turn paradigms make little use of iterative visual feedback. We address these challenges with an interactive UI-to-code paradigm that better reflects real-world workflows and raises the upper bound of achievable performance. Under this paradigm, we present UI2Code^N, a visual language model trained through staged pretraining, fine-tuning, and reinforcement learning to achieve foundational improvements in multimodal coding. The model unifies three key capabilities: UI-to-code generation, UI editing, and UI polishing. We further explore test-time scaling for interactive generation, enabling systematic use of multi-turn feedback. Experiments on UI-to-code and UI polishing benchmarks show that UI2Code^N establishes a new state of the art among open-source models and achieves performance comparable to leading closed-source models such as Claude-4-Sonnet and GPT-5. Our code and models are available at https://github.com/zai-org/UI2Code_N.

Community

Paper submitter

UI2Code^N is a visual language foundation model trained through staged pretraining, fine-tuning, and reinforcement learning to achieve foundational improvements in multimodal coding. It unifies three key capabilities: UI-to-code generation, UI editing, and UI polishing.

GitHub: https://github.com/zai-org/UI2Code_N
HF: https://huggingface.co/zai-org/UI2Code_N
Homepage: https://zheny2751-dotcom.github.io/ui2code-n.github.io/
Paper: https://arxiv.org/abs/2511.08195
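
For readers who want to try the released checkpoint, here is a minimal, untested sketch of loading the Hugging Face model with Transformers and prompting it with a UI screenshot. The AutoModelForCausalLM/AutoProcessor classes, the processor call signature, and the generation settings are assumptions on my part; consult the model card for the officially supported loading code.

```python
# Minimal sketch (assumptions noted in comments), not the official usage example.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "zai-org/UI2Code_N"

# Assumption: the repo ships remote code compatible with the Auto* classes.
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Single-turn UI-to-code: ask for HTML/CSS that reproduces a screenshot.
image = Image.open("ui_screenshot.png")
prompt = "Generate HTML/CSS that reproduces this user interface."
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=2048)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```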

  • Interactive UI-to-Code Paradigm: We redefine UI-to-code generation as iterative reasoning with visual feedback, enabling flexible code generation, editing, and test-time scaling (e.g., a +12% improvement from four rounds of polishing; see the sketch after this list).
  • First Open-Source Unified UI2Code Model: UI2Code^N is the first open-source VLM to jointly support UI-to-code generation, UI editing, and UI polishing, achieving state-of-the-art results on Design2Code, Flame-React-Eval, and Web2Code and outperforming Gemini-2.5-Pro and Claude-4-Sonnet.
  • Full Training Recipe for a Coding VLM: We are the first to release a complete three-stage training pipeline (pretraining, supervised fine-tuning, and reinforcement learning with a novel reward design) that balances data realism with code generation quality.
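
To make the interactive paradigm concrete, below is a minimal sketch of the generate-render-polish loop behind test-time scaling. The helper callables (generate_code, render_to_screenshot, polish_code) are hypothetical wrappers around the model and a rendering backend such as a headless browser; they are not part of the released API. The default of four rounds mirrors the polishing setting cited above.

```python
from typing import Callable

def interactive_ui_to_code(
    target_screenshot,
    generate_code: Callable,         # screenshot -> code (initial UI-to-code pass)
    render_to_screenshot: Callable,  # code -> rendered screenshot (e.g., headless browser)
    polish_code: Callable,           # (code, rendered, target) -> revised code
    num_rounds: int = 4,             # four polishing rounds, as in the +12% result above
) -> str:
    """Iterative UI-to-code with visual feedback: generate, render, compare, revise."""
    code = generate_code(target_screenshot)
    for _ in range(num_rounds):
        rendered = render_to_screenshot(code)                   # visual feedback for this round
        code = polish_code(code, rendered, target_screenshot)   # model revises against the target
    return code
```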


Help me to create that website

