---
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
---

# CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images

This repository contains the **CodePlot-CoT** model, a core component of the paper [CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images](https://huggingface.co/papers/2510.11718).

CodePlot-CoT is a code-driven Chain-of-Thought (CoT) paradigm that lets Vision Language Models (VLMs) "think with images" when solving mathematical problems. Instead of generating pixel-based images directly, the model outputs executable plotting code to represent its "visual thoughts". That code is executed to render a precise figure, and the figure is fed back to the model as visual input for the next reasoning steps.

The model is built on the Qwen2.5-VL architecture and is compatible with the `transformers` library.
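
The generate-execute-feed-back loop described above can be sketched roughly as follows. This is an illustrative outline only, not the repository's inference code: `render_plot_code`, `solve`, `fake_model`, and `DEMO_CODE` are hypothetical names, the convention that plotting code writes its figure to the path in `sys.argv[1]` is an assumption, and the demo "plot" is a plain SVG file so the sketch runs with the standard library alone. In practice the model callable would wrap the actual VLM, and the generated code would use whatever plotting library the model was trained to emit.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for model-generated plotting code. It writes a small
# SVG triangle to the path passed as argv[1] (a convention assumed here).
DEMO_CODE = '''
import sys
svg = ('<svg xmlns="http://www.w3.org/2000/svg" width="120" height="120">'
       '<polygon points="10,110 110,110 10,10" fill="none" stroke="black"/>'
       '</svg>')
with open(sys.argv[1], "w", encoding="utf-8") as f:
    f.write(svg)
'''


def render_plot_code(code: str, timeout: float = 10.0) -> bytes:
    """Run plotting code in a separate process and return the figure bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "plot.py"
        out_path = Path(tmp) / "figure.svg"
        script.write_text(code, encoding="utf-8")
        result = subprocess.run(
            [sys.executable, str(script), str(out_path)],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        if result.returncode != 0 or not out_path.exists():
            raise RuntimeError(f"plot code failed: {result.stderr.strip()}")
        return out_path.read_bytes()


def fake_model(question: str, images: list) -> dict:
    """Hypothetical model stub: emit code first, then answer once it has a figure."""
    if not images:
        return {"type": "code", "content": DEMO_CODE}
    return {"type": "answer",
            "content": f"answered after viewing {len(images)} figure(s)"}


def solve(question: str, model, max_steps: int = 4):
    """Alternate between model steps and code execution until an answer appears."""
    images = []
    for _ in range(max_steps):
        step = model(question, images)
        if step["type"] == "answer":
            return step["content"], images
        # A "visual thought": execute the code and feed the figure back in.
        images.append(render_plot_code(step["content"]))
    raise RuntimeError("no answer within the step budget")


if __name__ == "__main__":
    answer, figures = solve("Find the hypotenuse of the triangle.", fake_model)
    print(answer)
```

Running the generated code in a subprocess with a timeout keeps a crashing or non-terminating script from taking down the reasoning loop. It is not a security sandbox, so untrusted model output would still need proper isolation.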