Zhaofeng Wu committed · commit a91e396 · Parent(s): dd0257a · readme
README.md
---
license: apache-2.0
---
Pretrained models for our paper (https://arxiv.org/abs/2210.08431)

```bibtex
@inproceedings{wu-etal-2022-modeling,
    title = "Modeling Context With Linear Attention for Scalable Document-Level Translation",
    author = "Zhaofeng Wu and Hao Peng and Nikolaos Pappas and Noah A. Smith",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
    month = dec,
    year = "2022",
    publisher = "Association for Computational Linguistics",
}
```

Please see the "Files and versions" tab for the models. You can find our IWSLT models and our OpenSubtitles models, which are early-stopped based on BLEU and consistency scores, respectively. The `c` part in the checkpoint name refers to the number of context sentences used; it is the same as the sliding window size (the `L` in our paper) minus one.
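As a small illustration of the naming convention above, the helper below recovers the sliding window size `L` from a checkpoint name. It is a sketch only: the exact filename pattern (here assumed to embed a `c<digits>` field, e.g. a hypothetical `iwslt_c3.pt`) and the helper itself are not part of this repository.

```python
import re

def window_size_from_checkpoint(name: str) -> int:
    """Recover the sliding window size L from a checkpoint name.

    Assumes the name embeds a `c<digits>` field giving the number of
    context sentences; L (the paper's sliding window size) is c + 1.
    """
    match = re.search(r"c(\d+)", name)
    if match is None:
        raise ValueError(f"no context-size field found in {name!r}")
    num_context_sentences = int(match.group(1))
    return num_context_sentences + 1

# A hypothetical checkpoint named "iwslt_c3.pt" would thus correspond
# to a sliding window of L = 4 sentences (3 context + 1 current).
```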