---
license: mit
language:
- ko
- en
tags:
- korean
- reasoning
- instruction-tuning
- fine-tuning
- llama
- deepseek
- distillation
- sft
---

# 🧠 DeepSeek-R1-Distill-Llama-70B-Ko-Reasoning

> A large-scale Korean reasoning model fine-tuned from **deepseek-ai/DeepSeek-R1-Distill-Llama-70B**, designed to excel in logical and multi-hop reasoning tasks in Korean.

---

## 📌 Overview

**DeepSeek-R1-Distill-Llama-70B-Ko-Reasoning** is a fine-tuned version of [deepseek-ai/DeepSeek-R1-Distill-Llama-70B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B), specifically optimized for **logical reasoning in Korean**. This model is part of a broader research initiative to explore:

- The **transition from multilingual reasoning LLMs** to **Korean-specialized reasoning models**
- The enhancement of **non-reasoning Korean language models** into **reasoning-capable variants**
- The development of open-access models that rival proprietary alternatives in complex reasoning tasks

This model was fine-tuned using a large-scale Korean-English instruction dataset containing diverse multi-hop questions, symbolic logic tasks, and human-crafted reasoning steps.

---

## 🧑‍💻 Usage

Install `transformers` >= 4.50:

```bash
pip install -U transformers
```

Basic example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "DimensionSTP/DeepSeek-R1-Distill-Llama-70B-Ko-Reasoning"

# Load the checkpoint in its native dtype, sharded across available GPUs.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# "Which is bigger, Seoul or Busan?"
prompt = "서울과 부산 중 어디가 더 커?"
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
# Keep only the newly generated tokens (drop the prompt portion).
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
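
Running the full 70B checkpoint in bf16 requires multiple high-memory GPUs. If that is not available, 4-bit quantized loading is a common fallback; the snippet below is a minimal sketch (not part of the official instructions) and assumes the `bitsandbytes` package is installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "DimensionSTP/DeepSeek-R1-Distill-Llama-70B-Ko-Reasoning"

# Illustrative 4-bit NF4 settings; tune to your hardware and accuracy needs.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

Quantization trades some generation quality for memory, so verify reasoning outputs before relying on a quantized variant.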

---

## 🧠 Base Model: deepseek-ai/DeepSeek-R1-Distill-Llama-70B

The base model, [deepseek-ai/DeepSeek-R1-Distill-Llama-70B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B), is a chain-of-thought (CoT) LLM developed by the DeepSeek AI team, fine-tuned from Llama 3.3 Instruct.
For more technical details, refer to the [DeepSeek-R1 Technical Report](https://arxiv.org/pdf/2501.12948).

---

## 🧱 Model Architecture

| Property       | Value                |
|----------------|----------------------|
| Architecture   | LlamaForCausalLM     |
| Parameters     | 70B                  |
| Context Length | 131,072 tokens       |
| Tokenizer      | LlamaTokenizer (BPE) |

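These values can be checked locally against the repository's `config.json`; a minimal sketch using standard Llama config fields:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("DimensionSTP/DeepSeek-R1-Distill-Llama-70B-Ko-Reasoning")

print(config.architectures)            # expected: ["LlamaForCausalLM"]
print(config.max_position_embeddings)  # context length, expected: 131072
print(config.num_hidden_layers, config.hidden_size, config.num_attention_heads)
```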

---

## 📅 Release Date

**Mar 2025**

This model was released in March 2025 as part of the **Ko-Reasoning Series**, which focuses on pushing the boundaries of open-source reasoning in Korean using modern LLMs.

---

## 📬 Contact

For questions, collaborations, or deployment inquiries, please contact:

- 🤖 Hugging Face: [https://huggingface.co/DimensionSTP](https://huggingface.co/DimensionSTP)
- ✉️ Email: [[email protected]]

---

## 📦 Available Checkpoints

- ✅ `main`: Final stable version from the `last` branch
- ✅ All training artifacts available (tokenizer, config, model weights)
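
If you need to pin a specific branch or commit rather than the default `main`, `from_pretrained` accepts a `revision` argument; a minimal sketch (the branch name shown mirrors the list above):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "DimensionSTP/DeepSeek-R1-Distill-Llama-70B-Ko-Reasoning"

# Pin the checkout to an explicit branch, tag, or commit hash for reproducibility.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    revision="main",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name, revision="main")
```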