---
language: en
license: mit
tags:
- computer-vision
- object-detection
- yolov8
- document-analysis
- heritage-ai
- pytorch
pipeline_tag: object-detection
model-index:
- name: TypoRef YOLOv8 Historical Document Detection
  results:
  - task:
      type: object-detection
      name: Object Detection
    dataset:
      name: TypoRef Historical Prints
      type: document-images
    metrics:
    - name: mAP
      type: map
      value: 0.95
---

# 📜 TypoRef YOLOv8 Historical Document Detection

**Author:** Martin Badrous

This repository packages an industrial research project on detecting decorative elements in historical documents. It provides a clear, reproducible pipeline built with YOLOv8 for local training and a ready‑to‑deploy [Gradio](https://gradio.app) demo for inference.

The aim is to automatically find **lettrines**, **illustrations**, **bandeaux**, and **vignettes** in scanned pages from 16th–18th century printed works. Such detection enables large‑scale digital humanities projects by highlighting and indexing ornamental content in cultural heritage collections.

---

## 🧾 Overview

The **TypoRef dataset** comprises high‑resolution scans of printed books from the TypoRef corpus. Annotators labelled four types of graphical elements: `lettrine` (decorative initials), `illustration` (engraved images), `bandeau` (horizontal bands), and `vignette` (small ornaments). We fine‑tune YOLOv8 on these images using annotation files converted to the YOLO format.

The training script in this repository reuses the **Ultralytics YOLOv8 API**, exposing command‑line parameters for the data path, model backbone, image size, batch size, epoch count, augmentation hyper‑parameters, and deterministic seeding. The evaluation and inference scripts mirror the training CLI for consistency.

Once trained, the model achieves **mAP ≈ 0.95** on held‑out validation pages (computed with the COCO AP metric averaged across classes). Inference runs in real time on consumer GPUs, making it suitable for production pipelines.

---

## 🗃️ Dataset

The dataset used to train this model originates from the TypoRef collection of historical prints. Each page was scanned at 300–600 dpi and annotated with bounding boxes around ornaments.

Labels and images must be organised into a **YOLO dataset structure**. A sample dataset configuration (`configs/ornaments.yaml`) is provided and expects the following folder structure relative to the file:

```text
dataset_yolo/
├── train/
│   ├── images/
│   └── labels/
├── val/
│   ├── images/
│   └── labels/
└── test/
    ├── images/
    └── labels/
```

If you start from VIA annotation JSON files, use `src/dataset_tools/convert_via_to_yolo.py` to convert them to YOLO text labels, then split the data into train/val/test sets with `src/dataset_tools/split_dataset.py`.

---

## 🛠️ Training

Install the dependencies and run the training script:

```bash
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# Train YOLOv8 on the TypoRef dataset
python src/train.py \
  --data configs/ornaments.yaml \
  --model yolov8s.pt \
  --imgsz 1024 \
  --epochs 100 \
  --batch 8 \
  --project runs/typoref \
  --name yolov8s_typoref
```

Checkpoints and logs will be saved under `runs/typoref/`.

---

## 🔍 Inference

To perform inference on a folder of images using a trained model:

```bash
python src/infer.py \
  --weights runs/typoref/yolov8s_typoref/weights/best.pt \
  --source path/to/page_images \
  --imgsz 1024 \
  --conf 0.25 \
  --save_txt --save_conf
```

The predictions (bounding boxes and labels) will be written to `runs/predict/`.
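If you prefer calling the model from Python rather than the CLI, the snippet below is a minimal sketch using the Ultralytics `YOLO` API. The checkpoint path mirrors the training command above, and `path/to/page_images` is a placeholder for your own scans:

```python
from ultralytics import YOLO

# Load the fine-tuned checkpoint produced by the training run above.
model = YOLO("runs/typoref/yolov8s_typoref/weights/best.pt")

# Predict on a folder of page scans with the same settings as the CLI example.
results = model.predict(source="path/to/page_images", imgsz=1024, conf=0.25)

# Print one line per detected ornament.
for result in results:
    for box, cls, conf in zip(result.boxes.xyxy, result.boxes.cls, result.boxes.conf):
        label = model.names[int(cls)]  # e.g. lettrine, illustration, bandeau, vignette
        print(f"{label} ({conf:.2f}): {box.tolist()}")
```

Each `result` also exposes `result.plot()`, which returns the page with boxes drawn on it if you want an annotated image rather than raw coordinates.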
You can visualise these predictions using the example Gradio app or the provided scripts.

---

## 🧠 Model Architecture & Training Details

- **Backbone:** YOLOv8 (choose from `yolov8n.pt`, `yolov8s.pt`, etc.)
- **Input size:** 1024×1024 pixels
- **Batch size:** 8
- **Epochs:** 100
- **Optimisation:** SGD with momentum and weight decay; the learning‑rate schedule is provided in `configs/hyp_augment.yaml`
- **Augmentations:** Horizontal flips, scale jittering, colour jitter, mosaic, and mixup
- **Metrics:** mAP@50–95 ≈ 0.95 on the validation set

The training pipeline is deterministic when `--seed` is set. See `configs/hyp_augment.yaml` for the full list of augmentation hyper‑parameters.

---

## 📊 Performance Metrics

| Metric | Value |
|--------|------:|
| mAP@50–95 | 0.95 |
| Precision | 0.94 |
| Recall | 0.93 |
| FPS (RTX 3060) | > 60 |

These numbers are indicative of the typographical ornament detection task and may vary depending on dataset size and augmentation settings.

---

## 📸 Example Output

An example historical page showing a detected ornament:

![Detected ornament example](dia.jpg)

---

## 🎛️ Demo Application

A Gradio demo is included in `app.py`. It loads the model from Hugging Face and provides an intuitive drag‑and‑drop interface for inference. To run the demo locally:

```bash
python app.py
```

The model identifier in `app.py` is set to `martinbadrous/TypoRef-YOLOv8-Historical-Document-Detection`. If you use a different model ID or a local checkpoint, update the string accordingly.

---

## 📖 Citation

If you use this repository or the model in your research, please cite it as follows:

```bibtex
@misc{badrous2025typoref,
  author       = {Martin Badrous},
  title        = {TypoRef YOLOv8 Historical Document Detection},
  year         = {2025},
  howpublished = {Hugging Face repository},
  url          = {https://huggingface.co/martinbadrous/TypoRef-YOLOv8-Historical-Document-Detection}
}
```

---

## 👤 Contact

For questions or collaboration requests, feel free to email **martin.badrous@gmail.com**.

---

## 🪪 License

This project is released under the MIT License. See the [LICENSE](LICENSE) file for details.