arxiv:2402.11485
Ryokan Ri
ryo0634
AI & ML interests
Multilingual NLP, Pretrained Language Models, Information Retrieval
Models 21
ryo0634/TinySwallow-1.5B-Math-DPO
Text Generation • 2B • Updated
• 3
ryo0634/TinySwallow-1.5B-Math-SFT
Text Generation • 2B • Updated
• 1
ryo0634/Swallow-7b-hf-oasst1-21k-ja-alert-dpo-100-steps-beta-2e-1
Text Generation • 7B • Updated
• 1
ryo0634/Swallow-7b-hf-oasst1-21k-ja-alert-dpo-100-steps-beta-1e-1
Text Generation • 7B • Updated
• 1
ryo0634/Swallow-7b-hf-oasst1-21k-ja-hh-rlhf-12k-ja-200-steps
Text Generation • 7B • Updated
• 1
ryo0634/Swallow-7b-hf-oasst1-21k-ja-hh-rlhf-12k-ja-safety-150-steps
Text Generation • 7B • Updated
• 2
ryo0634/Swallow-7b-hf-oasst1-21k-ja-hh-rlhf-12k-ja-100-steps
Text Generation • 7B • Updated
• 1
ryo0634/Swallow-7b-hf-oasst1-21k-ja-aio-retriever-200-steps
Text Generation • 7B • Updated
• 1
ryo0634/Swallow-7b-hf-oasst1-21k-ja-hh-rlhf-12k-ja
Text Generation • 7B • Updated
• 1
ryo0634/Swallow-7b-plus-hf-oasst1-21k-ja
Text Generation • 7B • Updated
• 2
Datasets 22
ryo0634/gsm8k-ja-noisy-dpo-on-policy-4
• Updated
• 890 • 4
ryo0634/gsm8k-ja-noisy-dpo-on-policy-3
• Updated
• 900 • 4
ryo0634/gsm8k-ja-noisy-dpo-on-policy
• Updated
• 706 • 3
ryo0634/gsm8k-ja-noisy-dpo-on-policy-2
• Updated
• 1.07k • 2
ryo0634/gsm8k-ja-noisy-dpo
• Updated
• 1k • 2
ryo0634/gsm8k-ja-noisy-sft
• Updated
• 1k • 8
ryo0634/gsm8k-ja-filtered-dev
• Updated
• 400 • 20
ryo0634/gsm8k-ja-filtered-sft
• Updated
• 3k • 22
ryo0634/math-short-thought-filtered
• Updated
• 757 • 4
ryo0634/math-thought-filtered
• Updated
• 923 • 4