Open Legal Data Collection A collection of our favorite open-source legal datasets on Hugging Face. • 2 items • Updated about 16 hours ago • 4
Article Australian-made LLM beats OpenAI and Google at legal retrieval By isaacus and 2 others • 9 days ago • 25
Article How I Built Lightning-Fast Vector Search for Legal Documents By adlumal • 12 days ago • 14
Article Introducing the Massive Legal Embedding Benchmark (MLEB) By isaacus and 2 others • 15 days ago • 18
Should We Still Pretrain Encoders with Masked Language Modeling? Paper • 2507.00994 • Published Jul 1 • 78
Article Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 Jul 1 • 126
Zeroshot Classifiers Collection These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 12 items • Updated Jan 6 • 146
Article Multi-Label Classification Model From Scratch: Step-by-Step Tutorial By Valerii-Knowledgator • Jan 8, 2024 • 47
SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain Paper • 2407.19584 • Published Jul 28, 2024 • 66
Tajik Datasets Collection Datasets that have a Tajik subset or are entirely Tajik • 13 items • Updated Feb 20 • 4
Open Australian Legal Models Collection A collection of open source Australian legal language models • 6 items • Updated Jun 15, 2024 • 1
Open Australian Legal Data Collection A collection of open source Australian legal datasets • 3 items • Updated Jun 15, 2024 • 5