---
license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
base_model:
- llava-hf/llava-1.5-7b-hf
---

# ViSpec-LLaVA-1.5-7b (Benchmark Release)

This model repo is part of a **multimodal speculative decoding benchmark suite**.

## Why this repo exists

We maintain a unified benchmark codebase that implements several decoding methods (Baseline, EAGLE, EAGLE2, Lookahead, MSD, ViSpec) so that training and evaluation can be run under a single setup.

- The methods are aggregated here for **user convenience** (shared dataset format, scripts, and metrics).
- The original ideas and implementations belong to their respective authors.
- This specific Hugging Face repo hosts the **ViSpec-LLaVA-1.5-7b checkpoint** used in our benchmark runs.

## Upstream / Base Model

- Base model: `llava-hf/llava-1.5-7b-hf`
- Original ViSpec LLaVA release: `JLKang/ViSpec-llava-1.5-7b-hf`
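
As a minimal sketch, the two repositories listed above can be fetched from the Hub and the base model loaded with `transformers`. The repo IDs are taken from this card; the helper names `download_checkpoints` and `load_base_model` are hypothetical and not part of the benchmark code. The sketch assumes `huggingface_hub`, `transformers`, and `torch` are installed. It only downloads the ViSpec files: how the draft weights are loaded depends on the ViSpec or benchmark code, which is not shown here.

```python
"""Sketch: fetch the base model and the upstream ViSpec release from the Hub."""

# Repo IDs copied from this model card.
BASE_MODEL_ID = "llava-hf/llava-1.5-7b-hf"
UPSTREAM_VISPEC_ID = "JLKang/ViSpec-llava-1.5-7b-hf"


def download_checkpoints(cache_dir=None):
    """Download both repos and return a dict of repo ID -> local snapshot path.

    Hypothetical helper. `huggingface_hub` is imported lazily so this module
    can be imported even when the package is not installed.
    """
    from huggingface_hub import snapshot_download

    return {
        repo_id: snapshot_download(repo_id=repo_id, cache_dir=cache_dir)
        for repo_id in (BASE_MODEL_ID, UPSTREAM_VISPEC_ID)
    }


def load_base_model():
    """Load the LLaVA-1.5 base model and its processor in half precision.

    Hypothetical helper. The model is returned on CPU; move it to a GPU
    with `.to("cuda")` if one is available.
    """
    import torch
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    processor = AutoProcessor.from_pretrained(BASE_MODEL_ID)
    model = LlavaForConditionalGeneration.from_pretrained(
        BASE_MODEL_ID, torch_dtype=torch.float16
    )
    return processor, model


if __name__ == "__main__":
    # Downloads several GB of weights on first run.
    for repo_id, path in download_checkpoints().items():
        print(f"{repo_id} -> {path}")
```

Running the file as a script prints the local cache path of each downloaded repository; those paths can then be passed to whichever benchmark scripts you use.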

## Citation

If you use this checkpoint or the benchmark, please cite ViSpec and the original papers for any methods you compare against.

### ViSpec

```bibtex
@inproceedings{vispec,
  title={ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding},
  author={Kang, Jialiang and Shu, Han and Li, Wenshuo and Zhai, Yingjie and Chen, Xinghao},
  booktitle={Annual Conference on Neural Information Processing Systems},
  year={2025}
}
```

### EAGLE / EAGLE2 / EAGLE3

```bibtex
@inproceedings{li2024eagle,
  author = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title = {{EAGLE}: Speculative Sampling Requires Rethinking Feature Uncertainty},
  booktitle = {International Conference on Machine Learning},
  year = {2024}
}

@inproceedings{li2024eagle2,
  author = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title = {{EAGLE-2}: Faster Inference of Language Models with Dynamic Draft Trees},
  booktitle = {Empirical Methods in Natural Language Processing},
  year = {2024}
}

@inproceedings{li2025eagle3,
  author = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title = {{EAGLE-3}: Scaling up Inference Acceleration of Large Language Models via Training-Time Test},
  booktitle = {Annual Conference on Neural Information Processing Systems},
  year = {2025}
}
```

### Other integrated baselines (links)

- Lookahead Decoding: https://lmsys.org/blog/2023-11-21-lookahead-decoding/
- MSD-LLaVA1.5-7B: https://huggingface.co/lucylyn/MSD-LLaVA1.5-7B
- Medusa: https://github.com/FasterDecoding/Medusa

## Notes

- This model card focuses on benchmark usage and attribution.
- For the full benchmark code and scripts, refer to the benchmark repository used in your experiment setup.