# Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs

[[🏠 Homepage](https://open-bee.github.io/)] [[📖 arXiv Paper (coming soon)](https://www.google.com/search?q=)] [[🤗 Models](https://huggingface.co/collections/Open-Bee/bee-8b-68ecbf10417810d90fbd9995)] [[🤗 Datasets (coming soon)](https://www.google.com/search?q=)] [[💻 Code (coming soon)](https://github.com/Open-Bee)]

## Introduction

We introduce **Bee-8B**, a new state-of-the-art, fully open 8B Multimodal Large Language Model (MLLM) designed to close the performance gap with proprietary models by focusing on data quality.

Bee-8B is trained on our new **Honey-Data-15M** corpus, a high-quality supervised fine-tuning (SFT) dataset of approximately 15 million samples. This dataset was meticulously created with our transparent, adaptable, and open-source data curation pipeline, **HoneyPipe**, which systematically cleans noisy data and enriches it with a novel dual-level (short and long) Chain-of-Thought (CoT) strategy.

This dataset enables Bee-8B to achieve exceptional performance, particularly in complex reasoning and factual accuracy, establishing a new standard for fully open MLLMs.
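
To make the dual-level CoT idea concrete, the sketch below shows how an enrichment step could route each sample to either a brief or a detailed reasoning trace and keep only traces consistent with the reference answer. This is purely illustrative and is not the HoneyPipe implementation: `Sample`, `needs_long_cot`, `enrich_with_cot`, the difficulty heuristic, and the consistency check are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Sample:
    question: str
    answer: str
    cot: Optional[str] = None  # filled in by the enrichment step


def needs_long_cot(sample: Sample) -> bool:
    # Hypothetical difficulty heuristic: send long or proof-style questions
    # down the long-CoT path, everything else down the short-CoT path.
    q = sample.question.lower()
    return len(q.split()) > 40 or any(k in q for k in ("prove", "derive", "step by step"))


def enrich_with_cot(sample: Sample, annotate: Callable[[str], str]) -> Sample:
    """Attach a short or long chain of thought produced by an LLM annotator.

    `annotate` is any prompt -> completion callable (e.g. an API client);
    it is a placeholder, not part of the released pipeline.
    """
    style = "detailed, step-by-step" if needs_long_cot(sample) else "brief, two-to-three sentence"
    prompt = (
        f"Question: {sample.question}\n"
        f"Reference answer: {sample.answer}\n"
        f"Write a {style} chain of thought that ends with the reference answer."
    )
    cot = annotate(prompt)
    # Keep the trace only if it still reaches the reference answer.
    if sample.answer.strip().lower() in cot.lower():
        sample.cot = cot
    return sample
```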

## Key Features

  - **High-Quality, Large-Scale Dataset:** We release **Honey-Data-15M**, a new 15M-sample SFT corpus. It has undergone extensive cleaning to remove widespread noise and has been enriched with dual-level CoT reasoning to enhance advanced problem-solving capabilities.
  - **Fully Open-Source Data Curation Suite:** We provide not just the data, but the entire methodology. **HoneyPipe** and its underlying framework **DataStudio** offer the community a transparent and reproducible pipeline, moving beyond static dataset releases.
  - **State-of-the-Art Open Model:** Our model, **Bee-8B**, achieves state-of-the-art performance among fully open MLLMs and is highly competitive with recent semi-open models like InternVL3.5-8B, demonstrating the power of high-quality data.

## News

  - **[2025.10.13]** 🎉 **Bee-8B is Released!** Our model is now publicly available. You can download it from [Hugging Face](https://huggingface.co/collections/Open-Bee/bee-8b-68ecbf10417810d90fbd9995).
  - **[2025.10.13]** 🐝 We are thrilled to release the paper and code for the Open-Bee project! More resources, including the Honey-Data-15M dataset, are coming soon.

## Quickstart

Below, we provide simple examples to show how to use Bee-8B with 🤗 Transformers.

### Using 🤗 Transformers to Chat

```python
import requests
from PIL import Image
from transformers import AutoModel, AutoProcessor

model_path = "Open-Bee/Bee-8B-RL"

# Load model
model = AutoModel.from_pretrained(
    model_path,
    torch_dtype="float32",
    trust_remote_code=True,
).to("cuda")

# Load processor
processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)

# Define conversation messages
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://huggingface.co/Open-Bee/Bee-8B-RL/assets/logo.png",
            },
            {
                "type": "text",
                "text": "Based on this picture, write an advertising slogan about Bee-8B (a Fully Open Multimodal Large Language Model).",
            },
        ],
    },
]

# Apply chat template
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

# Load image
image_url = "https://huggingface.co/Open-Bee/Bee-8B-RL/assets/logo.png"
image = Image.open(requests.get(image_url, stream=True).raw)

# Process inputs
inputs = processor(images=image, text=text, return_tensors="pt").to("cuda")

# Generate output
generated_ids = model.generate(**inputs, max_new_tokens=16384, temperature=0.6)
output_ids = generated_ids[0][len(inputs.input_ids[0]):]

# Decode output
output_text = processor.decode(output_ids, skip_special_tokens=True)

# Print result
print(output_text)
```
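
The snippet above turns on the model's long-form reasoning with `enable_thinking=True`. If you only want a direct answer, the template flag can presumably be flipped; the variation below reuses the `model`, `processor`, `messages`, and `image` objects from the example and assumes `enable_thinking=False` is accepted (only the `True` case is shown above, so treat this as an untested assumption).

```python
# Direct-answer variant: same objects as in the example above, but without the
# thinking prefix. Assumption: the chat template accepts enable_thinking=False
# (only the True case is shown in the example).
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

inputs = processor(images=image, text=text, return_tensors="pt").to("cuda")
generated_ids = model.generate(**inputs, max_new_tokens=1024, temperature=0.6)
output_ids = generated_ids[0][len(inputs.input_ids[0]):]
print(processor.decode(output_ids, skip_special_tokens=True))
```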

## Experimental Results

<div align="center">
<img src="assets/results.png" alt="Experimental results"/>
</div>

1.  **New State-of-the-Art:** Bee-8B establishes a new performance standard for fully open MLLMs, proving highly competitive with recent semi-open models across a wide array of benchmarks.
2.  **Excellence in Complex Reasoning:** Thanks to the CoT-enriched Honey-Data-15M, Bee-8B shows its most significant advancements in complex math and reasoning. It achieves top scores on challenging benchmarks like **MathVerse**, **LogicVista**, and **DynaMath**.
3.  **Superior Document and Chart Understanding:** The model demonstrates powerful capabilities in analyzing structured visual data, securing the top rank on the **CharXiv** benchmark for both descriptive and reasoning questions.
4.  **Robust Factual Accuracy:** Bee-8B excels in general VQA tasks that test factual accuracy and core visual skills, ranking first on benchmarks like **CountBench** and **RealWorldQA**.


## Acknowledgements

Bee-8B is developed based on the architectures and codebases of the following projects: [R-4B](https://huggingface.co/YannQi/R-4B), [LLaVA-OneVision](https://github.com/LLaVA-VL/LLaVA-NeXT), [SigLIP2](https://huggingface.co/google/siglip2-so400m-patch14-384), [Qwen3](https://github.com/QwenLM/Qwen3), and evaluated using [VLMEvalKit](https://github.com/open-compass/VLMEvalKit). We sincerely thank these projects for their outstanding contributions to the open-source community.