AskAnythingInCharts-Qwen2.5-7B
A Qwen2.5-VL-7B-Instruct model fine-tuned with LoRA (Low-Rank Adaptation) on the ChartQA dataset, optimized for chart understanding tasks.
Model Details
- Base Model: Qwen/Qwen2.5-VL-7B-Instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Dataset: ChartQA (chart understanding benchmark)
- Accuracy: 66.0% on the ChartQA validation set (+8.5 percentage points over the base model)
Performance Comparison
| Model | ChartQA Accuracy | Improvement |
|---|---|---|
| Qwen2.5-VL-7B-Instruct (base) | 57.5% | - |
| AskAnythingInCharts-Qwen2.5-7B | 66.0% | +8.5 pts |
Training Configuration
- Epochs: 6
- Learning Rate: 4e-5
- LoRA Rank: 64
- LoRA Alpha: 16
- Target Modules: Vision and language attention layers
- Batch Size: 1 (with gradient accumulation)
- Optimizer: AdamW with fused implementation
- Scheduler: Cosine learning rate schedule
- Hardware: GPU with 16GB+ VRAM
- Framework: HuggingFace Transformers + PEFT + DeepSpeed (see the configuration sketch below)
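The hyperparameters above map naturally onto PEFT and Transformers configuration objects. A minimal sketch follows; the `target_modules` names, gradient-accumulation steps, and output directory are illustrative assumptions, not the exact training script:

```python
# Hedged configuration sketch: maps the listed hyperparameters onto
# peft.LoraConfig and transformers.TrainingArguments. Values marked
# "assumed" are illustrative, not taken from the actual training run.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=64,                     # LoRA rank
    lora_alpha=16,            # LoRA alpha
    target_modules=[          # assumed attention projections (vision + language)
        "q_proj", "k_proj", "v_proj", "o_proj",
    ],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="outputs",               # assumed
    num_train_epochs=6,
    learning_rate=4e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,      # assumed accumulation factor
    optim="adamw_torch_fused",          # fused AdamW
    lr_scheduler_type="cosine",
    bf16=True,
    # deepspeed="ds_config.json",       # DeepSpeed config path, if used (assumed)
)
```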
Key Improvements
The fine-tuned model shows significant improvements in:
- ✅ Concise Answers: Returns exact values without verbose explanations
- ✅ Label Recognition: Better at reading text labels from charts
- ✅ Color Identification: More accurate at identifying chart colors
- ✅ Statistical Calculations: Improved at medians, ratios, differences
- ✅ Counting: Better accuracy in counting chart elements
- ✅ Region Comparison: Accurate comparisons across chart regions
- ✅ Yes/No Questions: More reliable binary responses
Usage
Direct Usage
```python
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from peft import PeftModel
from PIL import Image

# Load the base model
base_model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-7B-Instruct",
    torch_dtype="bfloat16",
    device_map="auto",
)

# Load the LoRA adapter and merge it into the base weights
model = PeftModel.from_pretrained(base_model, "prakashchhipa/Qwen2.5-VL-7B-ChartQA-LoRA")
model = model.merge_and_unload()

# Load the processor
processor = AutoProcessor.from_pretrained("prakashchhipa/Qwen2.5-VL-7B-ChartQA-LoRA")

# Inference
image = Image.open("chart.png")
question = "What is the highest value in the chart?"
messages = [
    {"role": "user", "content": [
        {"type": "text", "text": question},
        {"type": "image", "image": image},
    ]}
]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Process the inputs and generate a response
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)
generated_ids = model.generate(**inputs, max_new_tokens=128)
# Drop the prompt tokens so only the generated answer is decoded
answer_ids = generated_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(answer_ids, skip_special_tokens=True)[0])
```
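Note: `merge_and_unload()` folds the LoRA weights into the base model, so inference runs at full speed without PEFT overhead; skip the merge if you prefer to keep the adapter separate and hot-swappable.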
Try the Demo
- Interactive Demo: HuggingFace Spaces
- GitHub Repository: AskAnythingInCharts-Qwen2.5-7B
Training Data
The model was fine-tuned on the ChartQA dataset, which contains:
- Chart images from various sources
- Questions about chart content
- Ground truth answers
- Multiple chart types (bar charts, line graphs, pie charts, etc.)
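For exploration, ChartQA is available on the Hugging Face Hub. A minimal loading sketch is below, assuming the commonly used `HuggingFaceM4/ChartQA` mirror; the exact copy used for this fine-tune is not specified here:

```python
# Hedged sketch: load a ChartQA mirror from the Hub. The dataset id and
# column names below are assumptions based on a common public mirror.
from datasets import load_dataset

dataset = load_dataset("HuggingFaceM4/ChartQA")
sample = dataset["train"][0]
print(sample["query"])   # the question about the chart (assumed column name)
print(sample["label"])   # the ground-truth answer (assumed column name)
sample["image"].show()   # the chart image (PIL)
```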
Evaluation
- Evaluation Set: ChartQA validation split (500 examples)
- Metric: Exact Match with answer normalization and numeric tolerance (sketched below)
- Filtering: Only genuine improvements counted; verbose-but-correct base-model answers were excluded
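A minimal sketch of a relaxed exact-match metric of this kind, assuming lower-cased string normalization and a 5% relative tolerance on numeric answers (the common ChartQA convention; the exact normalization used here is an assumption):

```python
# Hedged sketch of relaxed exact match: normalize both strings, then
# compare numerically with a relative tolerance when both parse as numbers.
def relaxed_match(pred: str, gold: str, rel_tol: float = 0.05) -> bool:
    norm = lambda s: s.strip().lower().rstrip(".").replace(",", "").replace("%", "")
    p, g = norm(pred), norm(gold)
    try:
        pv, gv = float(p), float(g)
        # Numeric answers: match if within rel_tol of the gold value
        return abs(pv - gv) <= rel_tol * abs(gv) if gv != 0 else pv == gv
    except ValueError:
        # Non-numeric answers fall back to exact string match
        return p == g
```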
Limitations
- Primarily optimized for chart understanding tasks
- May not perform as well on general vision-language tasks
- Requires GPU with sufficient VRAM for inference
- Performance may vary on chart types not well-represented in training data
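If VRAM is tight, 4-bit quantized loading with bitsandbytes is one option; a hedged sketch follows. Merging an adapter into quantized base weights has caveats, so the simpler path is to keep the adapter un-merged in this setup:

```python
# Hedged sketch: 4-bit loading to reduce VRAM requirements.
from peft import PeftModel
from transformers import BitsAndBytesConfig, Qwen2_5_VLForConditionalGeneration

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype="bfloat16")
base_model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-7B-Instruct",
    quantization_config=bnb,
    device_map="auto",
)
# Keep the adapter un-merged when the base weights are quantized
model = PeftModel.from_pretrained(base_model, "prakashchhipa/Qwen2.5-VL-7B-ChartQA-LoRA")
```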
Citation
If you use this model in your research, please cite:
```bibtex
@misc{askanything-charts-qwen2.5,
  title={AskAnythingInCharts-Qwen2.5-7B: Fine-tuned Qwen2.5-VL for Chart Understanding},
  author={Prakash Chandra Chhipa},
  year={2025},
  url={https://huggingface.co/prakashchhipa/Qwen2.5-VL-7B-ChartQA-LoRA}
}
```
License
This model is released under the MIT License. The base Qwen2.5-VL model is subject to its own license terms.
Contact
- Author: Prakash Chandra Chhipa
- Portfolio: prakashchhipa.github.io
- GitHub: @prakashchhipa
Built with ❤️ using Qwen2.5-VL and HuggingFace Transformers