Models: 289

Active filters: VLM

NemoStation/Marlin-2B

Video-Text-to-Text • 2B • Updated 10 days ago • 14.7k • 444

numind/NuMarkdown-8B-Thinking

Image-to-Text • 8B • Updated 10 days ago • 41.5k • 472

lunahr/Marlin-2B-ungated

Video-Text-to-Text • 2B • Updated 8 days ago • 1.07k • 5

nvidia/NVIDIA-Nemotron-Parse-v1.2

Image-Text-to-Text • 0.9B • Updated 24 days ago • 129k • 37

nvidia/Eagle2-2B

Image-Text-to-Text • 2B • Updated Apr 27, 2025 • 525 • 34

nvidia/Eagle2-1B

Image-Text-to-Text • 1B • Updated Apr 27, 2025 • 2.14k • 30

nvidia/VILA-HD-8B-PS3-1.5K-SigLIP

Image-Text-to-Text • Updated Jul 30, 2025 • 58 • 4

nvidia/VILA-HD-8B-PS3-4K-SigLIP

Image-Text-to-Text • Updated Jul 30, 2025 • 61 • 2

nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1

Image-Text-to-Text • 9B • Updated Dec 4, 2025 • 1.21M • 179

xlangai/OpenCUA-7B

Image-Text-to-Text • 8B • Updated Feb 1 • 12.8k • 30

nvidia/VILA-HD-8B-PS3-1.5K-SigLIP2

Image-Text-to-Text • Updated Jul 30, 2025 • 540 • 1

nvidia/VILA-HD-8B-PS3-4K-SigLIP2

Image-Text-to-Text • Updated Jul 30, 2025 • 55 • 3

nvidia/VILA-HD-8B-PS3-1.5K-C-RADIOv2

Image-Text-to-Text • Updated Jul 30, 2025 • 57 • 1

nvidia/VILA-HD-8B-PS3-4K-C-RADIOv2

Image-Text-to-Text • Updated Jul 30, 2025 • 60 • 1

nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16

Image-Text-to-Text • 13B • Updated Dec 2, 2025 • 170k • 83

HongxinLi/GoClick-Large

Image-Text-to-Text • 0.8B • Updated 20 days ago • 165 • 1

mPLUG/ToolCUA-8B

Image-Text-to-Text • 9B • Updated 17 days ago • 112 • 4

mradermacher/ToolCUA-8B-GGUF

8B • Updated 16 days ago • 779 • 2

adnankhan-11/VisionNav-3B

4B • Updated 13 days ago • 123 • 1

mradermacher/VisionNav-3B-GGUF

3B • Updated 11 days ago • 503 • 1

rohanshad/cmr_c0.1

Updated Mar 25 • 3

Efficient-Large-Model/VILA-13b

Text Generation • 13B • Updated Mar 4, 2024 • 23 • 20

Efficient-Large-Model/VILA-7b

Text Generation • 7B • Updated Mar 4, 2024 • 585 • 27

Efficient-Large-Model/VILA-7b-4bit-awq

Text Generation • Updated Mar 4, 2024 • 14 • 2

Efficient-Large-Model/VILA-13b-4bit-awq

Text Generation • Updated Mar 4, 2024 • 13 • 2

Efficient-Large-Model/VILA-2.7b

Text Generation • 3B • Updated Mar 4, 2024 • 138 • 15

TIGER-Lab/Mantis-bakllava-7b

Image-Text-to-Text • 8B • Updated May 18, 2024 • 49 • 5

TIGER-Lab/Mantis-llava-7b

Image-Text-to-Text • 7B • Updated May 18, 2024 • 22 • 16

Efficient-Large-Model/VILA1.5-3b

Text Generation • Updated Jul 18, 2024 • 1.58k • 34

Efficient-Large-Model/VILA1.5-13b

Text Generation • Updated Jul 18, 2024 • 260 • 5