Tokenisation-Bias
Collection · 21 items
Tokenisers trained on MiniPile. The _raw_tokenisers folder contains the original tokenisers, each trained with a vocabulary size of 320k. Every other folder is a transformers-compatible tokeniser with a smaller vocabulary size.
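As a rough sketch of how this layout might be used, the snippet below lists the size-specific folders in a local copy of the repository and loads one of them with transformers. It assumes the repository has already been downloaded to a local directory. The helper names list_tokeniser_folders and load_tokeniser are illustrative and not part of the repository, and loading through AutoTokenizer is an assumption based on the folders being described as transformers-compatible.

```python
from pathlib import Path

# Folder holding the original 320k-vocabulary tokenisers (name taken from the description).
RAW_FOLDER = "_raw_tokenisers"


def list_tokeniser_folders(repo_root):
    """Return the sorted names of the sub-folders that are not the raw folder.

    Hidden directories (such as .git) and plain files are skipped.
    """
    root = Path(repo_root)
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_dir() and p.name != RAW_FOLDER and not p.name.startswith(".")
    )


def load_tokeniser(repo_root, folder):
    """Load one transformers-compatible tokeniser from a local folder.

    Assumption: the folder contains the files AutoTokenizer expects.
    """
    # Imported lazily so that listing folders does not require transformers.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(str(Path(repo_root) / folder))
```

A typical use would be to call list_tokeniser_folders on the local copy, pick one of the returned names, and pass it to load_tokeniser. The raw 320k tokenisers are left out because the description only says the smaller ones are transformers-compatible.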