Add handler.py to resolve “no handler.py file found” deployment error
### Why
Deploying **`allenai/olmOCR-7B-0725`** on Hugging Face Inference Endpoints failed because the repository lacked a `handler.py`. A handler is required so the service knows how to run inference.

### What changed
* **Added `handler.py`**
  * Loads the model and processor from `allenai/olmOCR-7B-0725`
  * Accepts an image under the `inputs` key
  * Reads `max_new_tokens` from `data["parameters"]` and defaults to `256` if not supplied
  * Returns the OCR text in `generated_text`
**`handler.py`** (new file, +33 −0):
```python
from typing import Any

import torch
from transformers import AutoModelForSeq2SeqLM, AutoProcessor

class EndpointHandler:
    """
    Handler for allenai/olmOCR-7B-0725

    Input:
      {
        "inputs": <PIL.Image | base64 str | URL>,
        "parameters": {"max_new_tokens": <int, optional>}
      }

    Output: {"generated_text": <str>}
    """

    def __init__(self, path: str = "") -> None:
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        model_path = path or "allenai/olmOCR-7B-0725"
        self.processor = AutoProcessor.from_pretrained(model_path)
        self.model = AutoModelForSeq2SeqLM.from_pretrained(model_path).to(self.device)

    def __call__(self, data: dict) -> Any:
        image = data.get("inputs")
        params = data.get("parameters", {})
        max_tokens = params.get("max_new_tokens", 256)

        inputs = self.processor(images=image, return_tensors="pt").to(self.device)
        ids = self.model.generate(**inputs, max_new_tokens=max_tokens)
        text = self.processor.batch_decode(ids, skip_special_tokens=True)[0]

        return {"generated_text": text}
```

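Once the endpoint is deployed, the handler can be exercised over HTTP. A standard-library client sketch, where `ENDPOINT_URL` and `HF_TOKEN` are placeholders for your own deployment values:

```python
import json
import urllib.request

ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud"  # placeholder
HF_TOKEN = "hf_xxx"  # placeholder token with access to the endpoint


def make_request(payload: dict) -> urllib.request.Request:
    """Prepare a POST whose JSON body matches handler.py's input schema."""
    return urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {HF_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = make_request({"inputs": "<base64 image>", "parameters": {"max_new_tokens": 256}})
# urllib.request.urlopen(req) returns the handler's {"generated_text": ...} JSON.
```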