fohra committed on
Commit
d01412d
·
1 Parent(s): b37539e
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.png filter=lfs diff=lfs merge=lfs -text
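The added `*.png` rule routes the new `metrics_plot.png` through Git LFS. The sketch below approximates how the four rules visible in this hunk apply to file names, using Python's `fnmatch`. It is a simplification under stated assumptions: real gitattributes matching has extra rules (for example for patterns containing `/`), earlier rules in the file fall outside this hunk and are not listed, and `tracked_by_lfs` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Only the rules visible in this hunk; earlier lines of .gitattributes are not shown.
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "*.png"]


def tracked_by_lfs(path: str) -> bool:
    """Approximate check: patterns without a slash are matched against the base name."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


for path in ["metrics_plot.png", "README.md", "logs/events.out.tfevents.123"]:
    print(path, tracked_by_lfs(path))
```

For an authoritative answer on a real checkout, `git check-attr filter -- <path>` reports the attribute Git actually resolves.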
README.md CHANGED
@@ -15,4 +15,68 @@ pipeline_tag: image-segmentation
  tags:
  - text-line-detection
  - text-region-detection
- ---
+ - document-analysis
+ - historical-documents
+ - handwritten-text
+ - rf-detr
+ - instance-segmentation
+ ---
+
+ # RF-DETR Seg-Preview: Historical Document Instance Segmentation
+
+ This model detects and segments text lines and text regions in historical handwritten documents dating from the 16th to the 20th century.
+
+ ## Model Description
+
+ RF-DETR Seg-Preview is an instance segmentation model based on the RF-DETR architecture. It predicts:
+ - Bounding boxes for text elements
+ - Class labels (text_region or text_line)
+ - Instance segmentation masks
+
+ ### Classes
+
+ The model detects two classes:
+ - **text_region** (index: 1) - Larger regions of text content
+ - **text_line** (index: 2) - Individual lines of text
+
+ ## Training Data
+
+ The model was trained on historical handwritten documents, split as follows:
+ - **Training set**: 11,495 images
+ - **Validation set**: 2,711 images
+ - **Test set**: 2,340 images
+
+ ## Performance Metrics
+
+ ### Validation Set Performance
+
+ | Class | mAP@50:95 | mAP@50 | Precision | Recall |
+ |-------|-----------|--------|-----------|--------|
+ | text_region | 0.822 | 0.963 | 0.949 | 0.940 |
+ | text_line | 0.621 | 0.936 | 0.957 | 0.940 |
+ | **Overall** | **0.721** | **0.950** | **0.953** | **0.940** |
+
+ ### Test Set Performance
+
+ | Class | mAP@50:95 | mAP@50 | Precision | Recall |
+ |-------|-----------|--------|-----------|--------|
+ | text_region | 0.822 | 0.959 | 0.949 | 0.940 |
+ | text_line | 0.688 | 0.955 | 0.978 | 0.940 |
+ | **Overall** | **0.755** | **0.957** | **0.964** | **0.940** |
+
+ ## Training Metrics
+
+ ![Training Metrics](metrics_plot.png)
+
+ ## Use Cases
+
+ This model is particularly suitable for:
+ - Text line detection for OCR preprocessing
+ - Document digitization projects involving historical manuscripts
+ - Historical document understanding and analysis
+
+ ## Limitations
+
+ - The model is trained specifically on historical handwritten documents (16th to 20th century)
+ - Performance may vary on modern printed documents or other documents outside the training distribution
+ - Performance depends on image quality and the document's state of preservation
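The README lists OCR preprocessing as a use case and gives the class indices (1 for `text_region`, 2 for `text_line`). As a minimal sketch of that step, the following keeps only `text_line` detections and sorts them top to bottom. The `Detection` container, the `min_score` threshold, and the sample boxes are all hypothetical; the README does not specify the model's actual output format, and a simple top-to-bottom sort is an assumption that would not suit multi-column pages.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Class indices as stated in the README.
CLASS_NAMES = {1: "text_region", 2: "text_line"}


@dataclass
class Detection:
    """Hypothetical container for one prediction (not the model's real output type)."""
    class_id: int
    box: Tuple[float, float, float, float]  # x_min, y_min, x_max, y_max
    score: float


def text_lines_in_reading_order(
    detections: List[Detection], min_score: float = 0.5
) -> List[Detection]:
    """Keep confident text_line detections, sorted by top edge, then left edge."""
    lines = [
        d for d in detections
        if CLASS_NAMES.get(d.class_id) == "text_line" and d.score >= min_score
    ]
    return sorted(lines, key=lambda d: (d.box[1], d.box[0]))


# Made-up sample detections for illustration only.
sample = [
    Detection(2, (40.0, 300.0, 900.0, 340.0), 0.97),
    Detection(1, (30.0, 90.0, 920.0, 600.0), 0.99),
    Detection(2, (40.0, 100.0, 900.0, 140.0), 0.95),
    Detection(2, (40.0, 200.0, 900.0, 240.0), 0.30),  # dropped: below min_score
]
ordered = text_lines_in_reading_order(sample)
print([d.box[1] for d in ordered])
```

The ordered boxes (or their masks) could then be cropped and passed to an OCR or HTR model line by line.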
metrics_plot.png ADDED

Git LFS Details

  • SHA256: 75d539ef8a7cc1214b84e8f5eb470ae9860283c548af481fe41167d350a84889
  • Pointer size: 131 Bytes
  • Size of remote file: 218 kB
results.json ADDED
@@ -0,0 +1,53 @@
+ {
+   "class_map": {
+     "valid": [
+       {
+         "class": "text_region",
+         "map@50:95": 0.8215820411796119,
+         "map@50": 0.9632309591819967,
+         "precision": 0.9491835242505484,
+         "recall": 0.9400000000000001
+       },
+       {
+         "class": "text_line",
+         "map@50:95": 0.6208253598742672,
+         "map@50": 0.9359343947812567,
+         "precision": 0.9570004372001946,
+         "recall": 0.9400000000000001
+       },
+       {
+         "class": "all",
+         "map@50:95": 0.7212037005269396,
+         "map@50": 0.9495826769816268,
+         "precision": 0.9530919807253715,
+         "recall": 0.9400000000000001
+       }
+     ],
+     "test": [
+       {
+         "class": "text_region",
+         "map@50:95": 0.8223769265055245,
+         "map@50": 0.9588798259965857,
+         "precision": 0.9490022172949002,
+         "recall": 0.9400000000000001
+       },
+       {
+         "class": "text_line",
+         "map@50:95": 0.6879523939826719,
+         "map@50": 0.954879362766605,
+         "precision": 0.9780753334901381,
+         "recall": 0.9400000000000001
+       },
+       {
+         "class": "all",
+         "map@50:95": 0.7551646602440983,
+         "map@50": 0.9568795943815954,
+         "precision": 0.9635387753925191,
+         "recall": 0.9400000000000001
+       }
+     ]
+   },
+   "map": 0.9495826769816268,
+   "precision": 0.9530919807253715,
+   "recall": 0.9400000000000001
+ }
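The README tables are these values rounded to three decimals, and the `"all"` entries appear to be the unweighted mean of the two per-class entries. The sketch below checks both for the validation split using numbers copied from `results.json`; the variable and function names are illustrative, and the "unweighted mean" reading is an inference from the numbers, not something the commit states.

```python
import math

# Validation-split values copied from results.json in this commit.
valid = {
    "text_region": {"map@50:95": 0.8215820411796119, "map@50": 0.9632309591819967,
                    "precision": 0.9491835242505484, "recall": 0.9400000000000001},
    "text_line": {"map@50:95": 0.6208253598742672, "map@50": 0.9359343947812567,
                  "precision": 0.9570004372001946, "recall": 0.9400000000000001},
    "all": {"map@50:95": 0.7212037005269396, "map@50": 0.9495826769816268,
            "precision": 0.9530919807253715, "recall": 0.9400000000000001},
}
METRICS = ["map@50:95", "map@50", "precision", "recall"]


def table_row(name: str) -> str:
    """Format one class entry the way the README table shows it (3 decimals)."""
    cells = " | ".join(f"{valid[name][m]:.3f}" for m in METRICS)
    return f"| {name} | {cells} |"


def all_is_class_mean() -> bool:
    """Check that each 'all' metric equals the mean of the two class metrics."""
    return all(
        math.isclose(
            (valid["text_region"][m] + valid["text_line"][m]) / 2,
            valid["all"][m],
            rel_tol=1e-9,
        )
        for m in METRICS
    )


for name in valid:
    print(table_row(name))
print("all == mean of classes:", all_is_class_mean())
```

The top-level `map`, `precision` and `recall` fields carry the same values as the validation `"all"` entry (`map` matching its `map@50`).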
rfdetr_text_seg_model_202510.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2bc935e9e0f668254976fe92f6811a4fe9f28984cc3edcc3d6d9bfd21217d541
+ size 135572819
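The three lines above are a Git LFS pointer, not the weights themselves: they record the spec version, the SHA-256 of the real file, and its size in bytes. As a rough sketch, a pointer like this can be parsed and used to check downloaded bytes. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the check reads the data fully into memory, which is an illustrative shortcut that a file of this size would be better served by hashing in chunks.

```python
import hashlib

# Pointer text copied from this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2bc935e9e0f668254976fe92f6811a4fe9f28984cc3edcc3d6d9bfd21217d541
size 135572819
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line and break the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """True if the bytes have the size and hash recorded in the pointer."""
    digest = hashlib.new(pointer["algo"], data).hexdigest()
    return len(data) == pointer["size"] and digest == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], f"{pointer['size'] / 1e6:.1f} MB")
```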