Files changed (1)
  1. README.md +4 -68
README.md CHANGED
@@ -8,7 +8,6 @@ tags:
  - ocr
  - custom_code
  license: mit
- library_name: transformers
  ---
  <div align="center">
  <img src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true" width="60%" alt="DeepSeek AI" />
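Context on the metadata hunk above: on the Hugging Face Hub, `library_name` is the front-matter field that attributes a model to a library and drives the default loading snippet shown on the card; with it removed, the card relies on the remaining `custom_code` tag, while `license: mit` is unchanged.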
@@ -41,18 +40,18 @@ library_name: transformers
  <a href="https://github.com/deepseek-ai/DeepSeek-OCR"><b>🌟 Github</b></a> |
  <a href="https://huggingface.co/deepseek-ai/DeepSeek-OCR"><b>📥 Model Download</b></a> |
  <a href="https://github.com/deepseek-ai/DeepSeek-OCR/blob/main/DeepSeek_OCR_paper.pdf"><b>📄 Paper Link</b></a> |
- <a href="https://arxiv.org/abs/2510.18234"><b>📄 Arxiv Paper Link</b></a> |
+ <a href=""><b>📄 Arxiv Paper Link</b></a> |
  </p>
  <h2>
  <p align="center">
- <a href="https://huggingface.co/papers/2510.18234">DeepSeek-OCR: Contexts Optical Compression</a>
+ <a href="">DeepSeek-OCR: Contexts Optical Compression</a>
  </p>
  </h2>
  <p align="center">
  <img src="assets/fig1.png" style="width: 1000px" align=center>
  </p>
  <p align="center">
- <a href="https://huggingface.co/papers/2510.18234">Explore the boundaries of visual-text compression.</a>
+ <a href="">Explore the boundaries of visual-text compression.</a>
  </p>
 
  ## Usage
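The `## Usage` section itself is untouched by this change; its closing `res = model.infer(...)` call is what surfaces as the context line in the next hunk's header. For orientation, a minimal sketch of that call path, assuming the card's `trust_remote_code` custom-code loading; the file paths and any keywords beyond what the context line shows are illustrative, not quoted from the card:

```python
# Minimal sketch of the unchanged "Usage" section, whose final
# `res = model.infer(...)` call appears as context in the next hunk's header.
# DeepSeek-OCR ships custom modeling code, so trust_remote_code=True is required.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-OCR"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True, use_safetensors=True)
model = model.eval().cuda().to(torch.bfloat16)

prompt = "<image>\nFree OCR."           # same prompt the vLLM example below uses
image_file = "path/to/your/image.png"   # illustrative path
output_path = "path/to/output/dir"      # illustrative path

# The custom infer() entry point quoted in the next hunk's context line.
res = model.infer(tokenizer, prompt=prompt, image_file=image_file, output_path=output_path)
```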
@@ -99,63 +98,6 @@ res = model.infer(tokenizer, prompt=prompt, image_file=image_file, output_path =
  ## vLLM
  Refer to [🌟GitHub](https://github.com/deepseek-ai/DeepSeek-OCR/) for guidance on model inference acceleration, PDF processing, etc.
 
- [2025/10/23] 🚀🚀🚀 DeepSeek-OCR is now officially supported in upstream [vLLM](https://docs.vllm.ai/projects/recipes/en/latest/DeepSeek/DeepSeek-OCR.html#installing-vllm).
- ```shell
- uv venv
- source .venv/bin/activate
- # Until the v0.11.1 release, install vLLM from the nightly build
- uv pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
- ```
-
- ```python
- from vllm import LLM, SamplingParams
- from vllm.model_executor.models.deepseek_ocr import NGramPerReqLogitsProcessor
- from PIL import Image
-
- # Create the model instance
- llm = LLM(
-     model="deepseek-ai/DeepSeek-OCR",
-     enable_prefix_caching=False,
-     mm_processor_cache_gb=0,
-     logits_processors=[NGramPerReqLogitsProcessor]
- )
-
- # Prepare batched input with your image files
- image_1 = Image.open("path/to/your/image_1.png").convert("RGB")
- image_2 = Image.open("path/to/your/image_2.png").convert("RGB")
- prompt = "<image>\nFree OCR."
-
- model_input = [
-     {
-         "prompt": prompt,
-         "multi_modal_data": {"image": image_1}
-     },
-     {
-         "prompt": prompt,
-         "multi_modal_data": {"image": image_2}
-     }
- ]
-
- sampling_param = SamplingParams(
-     temperature=0.0,
-     max_tokens=8192,
-     # n-gram logits processor args
-     extra_args=dict(
-         ngram_size=30,
-         window_size=90,
-         whitelist_token_ids={128821, 128822},  # whitelist: <td>, </td>
-     ),
-     skip_special_tokens=False,
- )
- # Generate output
- model_outputs = llm.generate(model_input, sampling_param)
-
- # Print output
- for output in model_outputs:
-     print(output.outputs[0].text)
- ```
-
-
  ## Visualizations
  <table>
  <tr>
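A few notes on the design of the example removed above, which lives on in the vLLM recipe linked from the deleted announcement: `temperature=0.0` makes decoding deterministic for OCR; `enable_prefix_caching=False` and `mm_processor_cache_gb=0` turn off caches; and the per-request arguments to the n-gram logits processor travel in `SamplingParams.extra_args`, with token ids 128821 and 128822 (`<td>`, `</td>`) whitelisted because table-cell tags legitimately repeat and should not be suppressed as degenerate n-gram loops.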
@@ -177,10 +119,4 @@ We also appreciate the benchmarks: [Fox](https://github.com/ucaslcl/Fox), [Omini
 
 
  ## Citation
- ```bibtex
- @article{wei2025deepseek,
-   title={DeepSeek-OCR: Contexts Optical Compression},
-   author={Wei, Haoran and Sun, Yaofeng and Li, Yukun},
-   journal={arXiv preprint arXiv:2510.18234},
-   year={2025}
- }
- ```
+ Coming soon!