Research material on pre-training encoders, with an extensive comparison of the masked language modeling (MLM) and causal language modeling (CLM) paradigms.
-
Should We Still Pretrain Encoders with Masked Language Modeling?
Paper • 2507.00994 • Published • 79 -
MLMvsCLM/610m-clm-40k-mlm50-42k
Feature Extraction • Updated • 12 -
MLMvsCLM/1b-mlm50-42k
Feature Extraction • Updated • 10 -
MLMvsCLM/610m-clm-42k-5000
Feature Extraction • Updated • 7
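The two pre-training paradigms compared above differ in how tokens may attend to context and in which positions incur a loss. A minimal NumPy sketch of that contrast (the 15% masking rate is a typical BERT-style choice, assumed here, not taken from the paper):

```python
import numpy as np

seq_len = 6

# CLM: a causal attention mask — position i may attend only to
# positions <= i, and every position predicts the next token.
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))

# MLM: bidirectional attention — every position sees the full sequence.
# The loss is instead restricted to a random subset of masked inputs.
bidirectional_mask = np.ones((seq_len, seq_len), dtype=bool)

mask_prob = 0.15  # assumed BERT-style masking rate for illustration
rng = np.random.default_rng(0)
mlm_loss_positions = rng.random(seq_len) < mask_prob  # boolean per token
```

Under CLM every token contributes to the loss but sees only its left context; under MLM every token sees both directions but only masked positions are scored, which is the trade-off the collection's checkpoints probe.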