---
license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
base_model:
- llava-hf/llava-1.5-7b-hf
---

# ViSpec-LLaVA-1.5-7b (Benchmark Release)

This model repo is part of a **multimodal speculative decoding benchmark suite**.

## Why this repo exists

We maintain a unified benchmark codebase that includes multiple methods (Baseline, EAGLE, EAGLE2, Lookahead, MSD, ViSpec) so users can run training/evaluation more easily under one setup.

- The methods are aggregated here for **user convenience** (shared dataset format, scripts, and metrics).
- The original ideas and implementations belong to their respective authors.
- This specific Hugging Face repo hosts the **ViSpec-LLaVA-1.5-7b checkpoint** used in our benchmark runs.

## Upstream / Base Model

- Base model: `llava-hf/llava-1.5-7b-hf`
- Original ViSpec LLaVA release: `JLKang/ViSpec-llava-1.5-7b-hf`

## Citation

If you use this checkpoint and benchmark, please cite ViSpec and the original methods you compare against.

### ViSpec

```bibtex
@inproceedings{vispec,
  title     = {ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding},
  author    = {Kang, Jialiang and Shu, Han and Li, Wenshuo and Zhai, Yingjie and Chen, Xinghao},
  booktitle = {Annual Conference on Neural Information Processing Systems},
  year      = {2025}
}
```

### EAGLE / EAGLE2 / EAGLE3

```bibtex
@inproceedings{li2024eagle,
  author    = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title     = {{EAGLE}: Speculative Sampling Requires Rethinking Feature Uncertainty},
  booktitle = {International Conference on Machine Learning},
  year      = {2024}
}

@inproceedings{li2024eagle2,
  author    = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title     = {{EAGLE-2}: Faster Inference of Language Models with Dynamic Draft Trees},
  booktitle = {Empirical Methods in Natural Language Processing},
  year      = {2024}
}

@inproceedings{li2025eagle3,
  author    = {Yuhui Li and Fangyun Wei and Chao Zhang and Hongyang Zhang},
  title     = {{EAGLE-3}: Scaling up Inference Acceleration of Large Language Models via Training-Time Test},
  booktitle = {Annual Conference on Neural Information Processing Systems},
  year      = {2025}
}
```

### Other integrated baselines (links)

- Lookahead Decoding: https://lmsys.org/blog/2023-11-21-lookahead-decoding/
- MSD-LLaVA1.5-7B: https://huggingface.co/lucylyn/MSD-LLaVA1.5-7B
- Medusa: https://github.com/FasterDecoding/Medusa

## Notes

- This model card focuses on benchmark usage and attribution.
- For full benchmark code and scripts, please refer to the benchmark repository used in your experiment setup.