# TypoRef YOLOv8 Historical Document Detection
Author: Martin Badrous
This repository packages an industrial research project on detecting decorative elements in historical documents. It provides a clear, reproducible pipeline built with YOLOv8 for local training and a ready-to-deploy Gradio demo for inference. The aim is to automatically find lettrines, illustrations, bandeaux, and vignettes in scanned pages from 16th–18th century printed works. Such detection enables large-scale digital humanities projects by highlighting and indexing ornamental content in cultural heritage collections.
## Overview
The TypoRef dataset comprises high-resolution scans of printed books from the TypoRef corpus. Annotators labeled four types of graphical elements: lettrine (decorative initials), illustration (engraved images), bandeau (horizontal bands), and vignette (small ornaments). We fine-tune YOLOv8 on these images using annotation files converted to the YOLO format.
The training script in this repository reuses the Ultralytics YOLOv8 API, exposing command-line parameters for data path, model backbone, image size, batch size, epoch count, augmentation hyperparameters, and deterministic seeding. Evaluation and inference scripts mirror the training CLI for consistency.
Once trained, the model achieves mAP ≈ 0.95 on held-out validation pages (computed with the COCO AP metric averaged across classes). Inference runs in real time on consumer GPUs, making it suitable for production pipelines.
## Dataset
The dataset used to train this model originates from the TypoRef collection of historical prints. Each page was scanned at 300–600 dpi and annotated with bounding boxes around ornaments. Labels and images must be organised into a YOLO dataset structure. A sample dataset configuration (`configs/ornaments.yaml`) is provided and expects the following folder structure relative to the file:
```
dataset_yolo/
├── train/
│   ├── images/
│   └── labels/
├── val/
│   ├── images/
│   └── labels/
└── test/
    ├── images/
    └── labels/
```
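For reference, `configs/ornaments.yaml` follows the standard Ultralytics dataset-config layout and will look roughly like the sketch below. The class-index mapping shown is an assumption for illustration; make sure it matches the IDs actually used in your label files.

```yaml
# Ultralytics dataset config — paths are resolved relative to this file
path: ../dataset_yolo
train: train/images
val: val/images
test: test/images

# Class indices must match the IDs in the YOLO label files
names:
  0: lettrine
  1: illustration
  2: bandeau
  3: vignette
```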
If you start from VIA annotation JSON files, use `src/dataset_tools/convert_via_to_yolo.py` to convert them to YOLO text labels. Then split the data into train/val/test sets with `src/dataset_tools/split_dataset.py`.
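The core of any such conversion is the coordinate transform from VIA's pixel rectangles (top-left corner plus width/height) to YOLO's normalised centre format. A minimal sketch, assuming that layout; the function name is illustrative, not the repository's actual code:

```python
def via_rect_to_yolo(x, y, w, h, img_w, img_h, class_id):
    """Convert a VIA-style pixel rectangle (top-left x/y, width, height)
    into one YOLO label line: 'class x_center y_center width height',
    with all coordinates normalised to [0, 1] by the image dimensions."""
    xc = (x + w / 2) / img_w
    yc = (y + h / 2) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w / img_w:.6f} {h / img_h:.6f}"

# A 200x100 box at (100, 50) on a 1000x500 page, class 0 (lettrine)
print(via_rect_to_yolo(100, 50, 200, 100, 1000, 500, 0))
# → 0 0.200000 0.200000 0.200000 0.200000
```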
## Training
Install the dependencies and run the training script:
```shell
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# Train YOLOv8 on the TypoRef dataset
python src/train.py \
  --data configs/ornaments.yaml \
  --model yolov8s.pt \
  --imgsz 1024 \
  --epochs 100 \
  --batch 8 \
  --project runs/typoref \
  --name yolov8s_typoref
```
Checkpoints and logs will be saved under `runs/typoref/`.
## Inference
To perform inference on a folder of images using a trained model:
```shell
python src/infer.py \
  --weights runs/typoref/yolov8s_typoref/weights/best.pt \
  --source path/to/page_images \
  --imgsz 1024 \
  --conf 0.25 \
  --save_txt --save_conf
```
The predictions (bounding boxes and labels) will be written to `runs/predict/`. You can visualise them using the example Gradio app or the provided scripts.
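With `--save_txt --save_conf`, each prediction file contains one line per detection in the normalised YOLO format `class xc yc w h conf`. A minimal sketch of turning such a line back into pixel corner coordinates for downstream use (the function name is illustrative, not part of this repository):

```python
def yolo_line_to_pixels(line, img_w, img_h):
    """Parse one line of a YOLO prediction file written with
    --save_txt --save_conf ('class xc yc w h conf', normalised) and
    return (class_id, x1, y1, x2, y2, conf) in pixel coordinates."""
    cls, xc, yc, w, h, conf = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    x1, y1 = xc - w / 2, yc - h / 2
    return int(cls), round(x1), round(y1), round(x1 + w), round(y1 + h), float(conf)

# Example: a vignette (class 3) centred on a 1000x500 page
print(yolo_line_to_pixels("3 0.5 0.5 0.2 0.1 0.91", 1000, 500))
# → (3, 400, 225, 600, 275, 0.91)
```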
## Model Architecture & Training Details
- Backbone: YOLOv8 (choose from `yolov8n.pt`, `yolov8s.pt`, etc.)
- Input size: 1024×1024 pixels
- Batch size: 8
- Epochs: 100
- Optimisation: SGD with momentum, weight decay, and a learning-rate schedule provided in `configs/hyp_augment.yaml`
- Augmentations: horizontal flips, scale jittering, colour jitter, mosaic, and mixup
- Metrics: mAP@50–95 ≈ 0.95 on the validation set
The training pipeline is deterministic when `--seed` is set. See `configs/hyp_augment.yaml` for the full list of augmentation hyperparameters.
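The mAP@50–95 metric averages average precision over intersection-over-union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05, so a predicted ornament box counts as correct only when it overlaps a ground-truth box tightly enough. For reference, a minimal IoU computation over corner-format boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) pixel boxes —
    the overlap criterion behind the mAP@50-95 metric."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (empty if the boxes do not overlap)
    inter = max(0, min(ax2, bx2) - max(ax1, bx1)) * max(0, min(ay2, by2) - max(ay1, by1))
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter) if inter else 0.0

# Two 100x100 boxes offset by 50 px: intersection 5000, union 15000
print(iou((0, 0, 100, 100), (50, 0, 150, 100)))  # 0.3333...
```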
## Performance Metrics
| Metric | Value |
|---|---|
| mAP@50–95 | 0.95 |
| Precision | 0.94 |
| Recall | 0.93 |
| FPS (RTX 3060) | > 60 |
These numbers are indicative of the typographical ornament detection task and may vary depending on dataset size and augmentations.
## Example Output
An example historical page showing a detected ornament.
## Demo Application
A Gradio demo is included in `app.py`. It loads the model from Hugging Face and provides an intuitive drag-and-drop interface for inference. To run the demo locally:

```shell
python app.py
```
The model identifier in `app.py` is set to `martinbadrous/TypoRef-YOLOv8-Historical-Document-Detection`. If you use a different model ID or a local checkpoint, update the string accordingly.
## Citation
If you use this repository or the model in your research, please cite it as follows:
```bibtex
@misc{badrous2025typoref,
  author       = {Martin Badrous},
  title        = {TypoRef YOLOv8 Historical Document Detection},
  year         = {2025},
  howpublished = {Hugging Face repository},
  url          = {https://huggingface.co/martinbadrous/TypoRef-YOLOv8-Historical-Document-Detection}
}
```
## Contact
For questions or collaboration requests, feel free to email [email protected].
## License
This project is released under the MIT License. See the LICENSE file for details.
