AskAnythingInCharts-Qwen2.5-7B

A fine-tuned Qwen2.5-VL-7B model specifically optimized for chart understanding tasks using LoRA (Low-Rank Adaptation) on the ChartQA dataset.

Model Details

  • Base Model: Qwen/Qwen2.5-VL-7B-Instruct
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Dataset: ChartQA (chart understanding benchmark)
  • Accuracy: 66.0% on the ChartQA validation set (+8.5 percentage points over the base model)

Performance Comparison

Model                            ChartQA Accuracy   Improvement
Qwen2.5-VL-7B-Instruct (base)    57.5%              –
AskAnythingInCharts-Qwen2.5-7B   66.0%              +8.5 pp

Training Configuration

  • Epochs: 6
  • Learning Rate: 4e-5
  • LoRA Rank: 64
  • LoRA Alpha: 16
  • Target Modules: Vision and language attention layers
  • Batch Size: 1 (with gradient accumulation)
  • Optimizer: AdamW with fused implementation
  • Scheduler: Cosine learning rate schedule
  • Hardware: GPU with 16GB+ VRAM
  • Framework: HuggingFace Transformers + PEFT + DeepSpeed
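
For reference, a minimal sketch of how these settings might map onto PEFT and Transformers objects. The target_modules names, lora_dropout, and gradient_accumulation_steps values below are illustrative assumptions; the card only states that vision and language attention layers were adapted and that gradient accumulation was used.

from peft import LoraConfig
from transformers import TrainingArguments

# Hypothetical reconstruction of the adapter config described above
lora_config = LoraConfig(
    r=64,                    # LoRA rank
    lora_alpha=16,           # LoRA alpha
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed layer names
    lora_dropout=0.05,       # assumed; not stated in the card
    task_type="CAUSAL_LM",
)

# Hypothetical reconstruction of the training arguments described above
training_args = TrainingArguments(
    output_dir="qwen2.5-vl-chartqa-lora",
    num_train_epochs=6,
    learning_rate=4e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # assumed; card only says "with gradient accumulation"
    lr_scheduler_type="cosine",
    optim="adamw_torch_fused",      # fused AdamW implementation
    bf16=True,
)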

Key Improvements

The fine-tuned model shows significant improvements in:

  • ✅ Concise Answers: Returns exact values without verbose explanations
  • ✅ Label Recognition: Better at reading text labels from charts
  • ✅ Color Identification: More accurate at identifying chart colors
  • ✅ Statistical Calculations: Improved at medians, ratios, differences
  • ✅ Counting: Better accuracy in counting chart elements
  • ✅ Region Comparison: Accurate comparisons across chart regions
  • ✅ Yes/No Questions: More reliable binary responses

Usage

Direct Usage

import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from peft import PeftModel
from PIL import Image

# Load the base model in bfloat16, sharded across available devices
base_model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load the LoRA adapter and merge it into the base weights,
# so inference runs without the PEFT wrapper overhead
model = PeftModel.from_pretrained(base_model, "prakashchhipa/Qwen2.5-VL-7B-ChartQA-LoRA")
model = model.merge_and_unload()

# Load processor
processor = AutoProcessor.from_pretrained("prakashchhipa/Qwen2.5-VL-7B-ChartQA-LoRA")

# Inference
image = Image.open("chart.png")
question = "What is the highest value in the chart?"

messages = [
    {"role": "user", "content": [
        {"type": "text", "text": question},
        {"type": "image", "image": image}
    ]}
]

# Build the chat-formatted prompt, then tokenize it together with the image
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)

# Generate, then decode only the newly produced tokens
output_ids = model.generate(**inputs, max_new_tokens=64)
answer = processor.batch_decode(
    output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)[0]
print(answer)
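
Note that merge_and_unload() folds the adapter weights into the base model, so the result behaves like a plain Qwen2.5-VL checkpoint. If you prefer to keep the adapter separate (for example, to compare base and fine-tuned behavior), skip the merge and use the PeftModel directly; its disable_adapter() context manager temporarily restores base-model behavior.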

Try the Demo

Training Data

The model was fine-tuned on the ChartQA dataset, which contains:

  • Chart images from various sources
  • Questions about chart content
  • Ground truth answers
  • Multiple chart types (bar charts, line graphs, pie charts, etc.)
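
A minimal loading sketch, assuming the HuggingFaceM4/ChartQA mirror of the dataset on the Hugging Face Hub (the field names below follow that mirror's schema):

from datasets import load_dataset

# Each example pairs a chart image with a question ("query") and answer ("label")
chartqa = load_dataset("HuggingFaceM4/ChartQA")
example = chartqa["train"][0]
print(example["query"], example["label"])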

Evaluation

  • Evaluation Set: 500 examples from the ChartQA validation set
  • Metric: Exact match with answer normalization and numeric tolerance
  • Filtering: Only genuine improvements over the base model were counted (verbose-but-correct cases were excluded)
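
A minimal sketch of such a metric, assuming a 5% relative tolerance for numeric answers (the exact tolerance used here is not stated in the card; 5% mirrors ChartQA's usual relaxed-accuracy convention):

def exact_match(prediction: str, target: str, rel_tol: float = 0.05) -> bool:
    """Exact match after normalization, with relative tolerance for numbers."""
    pred = prediction.strip().lower().rstrip(".").replace("%", "").replace(",", "")
    gold = target.strip().lower().rstrip(".").replace("%", "").replace(",", "")
    try:
        # Numeric answers: accept predictions within rel_tol of the target
        p, g = float(pred), float(gold)
        return p == g if g == 0 else abs(p - g) <= rel_tol * abs(g)
    except ValueError:
        # Non-numeric answers: require a normalized string match
        return pred == gold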

Limitations

  • Primarily optimized for chart understanding tasks
  • May not perform as well on general vision-language tasks
  • Requires a GPU with sufficient VRAM for inference
  • Performance may vary on chart types not well-represented in training data

Citation

If you use this model in your research, please cite:

@misc{askanything-charts-qwen2.5,
  title={AskAnythingInCharts-Qwen2.5-7B: Fine-tuned Qwen2.5-VL for Chart Understanding},
  author={Prakash Chandra Chhipa},
  year={2025},
  url={https://huggingface.co/prakashchhipa/Qwen2.5-VL-7B-ChartQA-LoRA}
}

License

This model is released under the MIT License. The base Qwen2.5-VL model is subject to its own license terms.

Contact


Built with ❤️ using Qwen2.5-VL and HuggingFace Transformers
