---
language:
- en
license: apache-2.0
tags:
- mlx
- vision
- multimodal
base_model: janhq/Jan-v2-VL-med
---

# Jan-v2-VL-med BF16 MLX

This is a BF16 (bfloat16) precision MLX conversion of [janhq/Jan-v2-VL-med](https://huggingface.co/janhq/Jan-v2-VL-med).

## Model Description

Jan-v2-VL is an 8-billion-parameter vision-language model designed for long-horizon, multi-step tasks in real software environments. This "med" variant offers a balanced trade-off between inference speed and reasoning capability, delivering strong performance on agentic automation and UI-control tasks.

**Key Features:**

- Vision-language understanding for browser and desktop applications
- Screenshot grounding and tool-calling capabilities
- Stable multi-step execution with minimal performance drift
- Error recovery and maintenance of intermediate state

## Precision

This model was converted to MLX format with bfloat16 precision (no quantization) using [MLX-VLM](https://github.com/Blaizzy/mlx-vlm) by Prince Canuma. BF16 provides near full-precision quality with a reduced memory footprint.

**Conversion command:**

```bash
mlx_vlm.convert --hf-path janhq/Jan-v2-VL-med --dtype bfloat16 --mlx-path Jan-v2-VL-med-bf16-mlx
```

## Usage

### Installation

```bash
pip install mlx-vlm
```

### Python

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model
model_path = "mlx-community/Jan-v2-VL-med-bf16-mlx"
model, processor = load(model_path)
config = load_config(model_path)

# Prepare input
image = ["path/to/image.jpg"]
prompt = "Describe this image."

# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(image)
)

# Generate output
output = generate(model, processor, formatted_prompt, image, verbose=False)
print(output)
```

### Command Line

```bash
mlx_vlm.generate --model mlx-community/Jan-v2-VL-med-bf16-mlx --max-tokens 100 --prompt "Describe this image" --image path/to/image.jpg
```

## Intended Use

This model is designed for:

- Agentic automation and UI control
- Stepwise operation in browsers and desktop applications
- Screenshot grounding and tool calls
- Long-horizon, multi-step task execution

## License

This model is released under the Apache 2.0 license.

## Original Model

For more information, please refer to the original model: [janhq/Jan-v2-VL-med](https://huggingface.co/janhq/Jan-v2-VL-med)

## Acknowledgments

- Original model by [Jan](https://huggingface.co/janhq)
- [MLX](https://github.com/ml-explore/mlx) framework by Apple
- MLX conversion framework by [Prince Canuma](https://github.com/Blaizzy/mlx-vlm)
- Model conversion by [Incept5](https://incept5.ai)
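
## Hardware Note

As a rough guide to whether this BF16 model will fit in memory: bfloat16 stores each parameter in 2 bytes, so the weights of an 8-billion-parameter model alone occupy about 16 GB. The sketch below is a back-of-the-envelope estimate only; actual peak usage is higher because activations, the KV cache, and image embeddings add overhead on top of the weights.

```python
# Back-of-the-envelope memory estimate for the BF16 weights.
# Inputs: ~8 billion parameters (from the model description) and
# 2 bytes per parameter (bfloat16 = 16 bits).
params = 8e9          # approximate parameter count
bytes_per_param = 2   # bfloat16

weight_bytes = params * bytes_per_param
print(f"BF16 weights: ~{weight_bytes / 1e9:.0f} GB "
      f"({weight_bytes / 2**30:.1f} GiB)")
```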