Sliding Window Attention Adaptation
yuyijiong/Qwen3-SWA-adaptation Text Generation • Updated Dec 17, 2025 • 5
yuyijiong/fusang-v1-filtered Viewer • Updated Jan 7 • 15.9k • 21
Sliding Window Attention Adaptation Paper • 2512.10411 • Published Dec 11, 2025 • 21
train_with_paraphrasing [long-context models trained with "original text paraphrasing" dataset](https://github.com/yuyijiong/train_with_paraphrasing)
yuyijiong/Qwen-14b-chat-yarn-32k Text Generation • 14B • Updated Jun 7, 2024 • 8 • 21
yuyijiong/Llama3-8B-Chinese-Chat-32k Text Generation • 8B • Updated Jun 19, 2024 • 4 • 3
yuyijiong/Qwen1.5-4b-chat-paraph Text Generation • 4B • Updated Jun 7, 2024 • 2
yuyijiong/Qwen2-7b-Instruct-paraph Text Generation • 8B • Updated Jun 28, 2024 • 2
yuyijiong/segformer-b5-remote-sensing-quality Image Segmentation • Updated May 24, 2024 • 7 • 4
yuyijiong/longalign_distill_Qwen3-4B-Thinking-2507_roll8_filtered_multianswer Preview • Updated Sep 23, 2025 • 6