ArtusDev committed · verified
Commit 20919e0 · 1 Parent(s): 7d8437d

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
37
+ dmind-2-performance-tweet.jpeg filter=lfs diff=lfs merge=lfs -text
38
+ dmind-2-performance.jpeg filter=lfs diff=lfs merge=lfs -text
39
+ quantization_config.json filter=lfs diff=lfs merge=lfs -text
MODEL-LICENSE ADDED
@@ -0,0 +1,96 @@
1
+ DMind-2-107B Model License Agreement
2
+ Version 1.0, 9 September 2025
3
+ Copyright (c) 2025 DMind
4
+
5
+ Section 1: Preamble
6
+ DMind is an open-source AGI research organization focused on next-generation digital finance. Driven by real market needs, DMind continuously releases open-source products—including large language models, benchmarks, datasets, tools, and more.
7
+
8
+ As an open, research-driven community, DMind is powered by a global collective of AI and Web3 enthusiasts, builders, and researchers. All of our work is released under permissive licenses, enabling individuals and enterprises alike to freely use, adapt, and build upon it to create new AI-native innovations.
9
+
10
+ DMind-2-107B is a crypto investment analysis large language model designed to provide real-time, professional consulting for individual and institutional investors. Built via domain-adaptive post-training, it integrates macro market trends with micro on-chain behaviors and can orchestrate complex multi-protocol on-chain tasks. The model is enhanced with advanced tool-calling capabilities for seamless integration with protocols, APIs, and trading interfaces. DMind-2-107B is derived from the GLM-4.5-Air base model and contains approximately 107B parameters.
11
+
12
+ This license is designed to enable open and responsible use of DMind-2-107B. It grants broad rights to use, modify, and redistribute the model, while also enforcing responsible-use provisions to prevent misuse.
13
+
14
+ Section 2: Definitions
15
+ 2.1 License – This document outlining the terms under which you may use, modify, and distribute the Model.
16
+
17
+ 2.2 Model – Refers to DMind-2-107B, including all learned weights, parameters (including optimizer states), checkpoints, and model architecture.
18
+
19
+ 2.3 Derivatives of the Model – Any model, tool, or work derived from or incorporating the Model, including fine-tuned or distilled variants, or models trained using synthetic data generated by the Model.
20
+
21
+ 2.4 Complementary Material – The source code, configuration files, evaluation tools, and documentation provided with the Model.
22
+
23
+ 2.5 Output – Any content generated by operating the Model, including text, code, or structured data.
24
+
25
+ 2.6 Distribution – Making the Model or its Derivatives available to third parties by any method, including downloads, APIs, or services.
26
+
27
+ 2.7 DMind – The organization and research community releasing this Model.
28
+
29
+ 2.8 You – Any individual or entity using or distributing the Model.
30
+
31
+ Section 3: Intellectual Property Rights
32
+ 3.1 Grant of Copyright License
33
+ DMind grants you a perpetual, worldwide, royalty-free, non-exclusive, irrevocable license to use, reproduce, prepare derivative works of, publicly display, publicly perform, sublicense, and distribute the Model, its Derivatives, and the Complementary Material.
34
+
35
+ 3.2 Grant of Patent License
36
+ Where applicable, DMind grants you a worldwide, royalty-free, non-exclusive, irrevocable license under any relevant patent claims owned or licensable by DMind to make, use, sell, offer for sale, import, and distribute the Model and its Derivatives. This license terminates if you initiate patent litigation related to the Model.
37
+
38
+ Section 4: Usage and Distribution
39
+ 4.1 Permitted Use
40
+ You may use the Model and its Derivatives for any lawful purpose, including commercial use, research, deployment in applications, and further fine-tuning.
41
+
42
+ 4.2 Distribution Conditions
43
+ When distributing the Model or any Derivative:
44
+
45
+ 4.2.1 Include a copy of this License;
46
+
47
+ 4.2.2 Retain all original attribution and legal notices;
48
+
49
+ 4.2.3 Clearly mark any modifications made;
50
+
51
+ 4.2.4 Pass on the use-based restrictions in Section 5 and Attachment A in a legally enforceable way to downstream recipients.
52
+
53
+ 4.3 Use-Based Restrictions
54
+ You must not use the Model or its Derivatives in any manner prohibited by Attachment A. You are responsible for ensuring your users also comply with these terms.
55
+
56
+ 4.4 Output Ownership
57
+ DMind claims no rights over the Output generated by your use of the Model. However, you are solely responsible for how such Output is used, and you must ensure it does not violate the responsible use terms of this License.
58
+
59
+ Section 5: Other Provisions
60
+ 5.1 Updates and Restrictions
61
+ DMind reserves the right to restrict your use of the Model (including runtime control if applicable) if you breach this License.
62
+
63
+ 5.2 Trademarks
64
+ This License does not grant any rights to use DMind's name, logo, or branding. You may not suggest affiliation or endorsement without written permission.
65
+
66
+ 5.3 Compliance with Data & IP Laws
67
+ The Model may have been trained on data containing protected information or content. You are responsible for complying with data protection and IP laws when using the Model or its Derivatives.
68
+
69
+ 5.4 Disclaimer of Warranty
70
+ The Model is provided “AS IS” without warranties of any kind. You assume all risk associated with your use.
71
+
72
+ 5.5 Limitation of Liability
73
+ DMind shall not be liable for any direct, indirect, incidental, or consequential damages arising from use of the Model.
74
+
75
+ 5.6 Additional Warranties and Indemnities
76
+ If you offer support or warranty to others based on the Model, you do so at your own risk. You agree to indemnify DMind from any liability resulting from your warranties.
77
+
78
+ 5.7 Severability
79
+ If any part of this License is found unenforceable, the remainder will stay in effect.
80
+
81
+ 5.8 Upstream Base Model Attribution and Compliance
82
+ DMind-2-107B is derived from GLM-4.5-Air. When using or distributing the Model or any Derivative, you must also comply with the license terms applicable to GLM-4.5-Air (the "Zai License"). Nothing in this Agreement is intended to, or shall be interpreted to, limit your obligations under the upstream license.
83
+
84
+ Attachment A: Use Restrictions
85
+ You agree not to use the Model or Derivatives of the Model:
86
+
87
+ A.1 In violation of any law or regulation;
88
+ A.2 For any military, warfare, or weapons-related purpose;
89
+ A.3 To exploit or harm minors in any way;
90
+ A.4 To generate or spread false or harmful information;
91
+ A.5 To collect or reveal personal identifying information without consent;
92
+ A.6 To defame, harass, or harm individuals or communities;
93
+ A.7 For automated decision-making that affects individuals’ legal or human rights without oversight;
94
+ A.8 In ways that discriminate based on race, gender, orientation, disability, or other protected characteristics;
95
+ A.9 To exploit vulnerable groups or create manipulative or abusive systems.
96
+
Modelfile ADDED
@@ -0,0 +1,13 @@
1
+ # ollama modelfile auto-generated by llamafactory
2
+
3
+ FROM .
4
+
5
+ TEMPLATE """[gMASK]<sop>{{ if .System }}<|system|>
6
+ {{ .System }}{{ end }}{{ range .Messages }}{{ if eq .Role "user" }}<|user|>
7
+ {{ .Content }}<|assistant|>{{ else if eq .Role "assistant" }}
8
+ {{ .Content }}{{ end }}{{ end }}"""
9
+
10
+ PARAMETER stop "<|user|>"
11
+ PARAMETER stop "<|endoftext|>"
12
+ PARAMETER stop "<|observation|>"
13
+ PARAMETER num_ctx 4096
README.md ADDED
@@ -0,0 +1,241 @@
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ - zh
6
+ base_model:
7
+ - zai-org/GLM-4.5-Air
8
+ ---
9
+
10
+ # DMind-2: Advanced Crypto Domain-Specific Large Language Models with Distribution-Preserving CoT Distillation
11
+
12
+
13
+ <div align="center">
14
+ <img src="dmind-ai-logo.png" width="60%" alt="DMind-2" />
15
+ </div>
16
+
17
+ <hr>
18
+
19
+ <div align="center">
20
+ <a href="https://dmind.ai/">
21
+ <img alt="DMind Website" src="https://img.shields.io/badge/DMind-Homepage-blue?logo=data:image/svg+xml;base64,)"/>
22
+ </a>
23
+ <a href="https://huggingface.co/DMindAI">
24
+ <img alt="Hugging Face" src="https://img.shields.io/badge/HuggingFace-DMind-ffd21f?color=ffd21f&logo=huggingface"/>
25
+ </a>
26
+ <a href="https://x.com/dmind_ai">
27
+ <img alt="X" src="https://img.shields.io/badge/X-@dmindai-1DA1F2?logo=x"/>
28
+ </a>
29
+ <!-- <a href="https://huggingface.co/spaces/DMindAI/DMind-1">
30
+ <img alt="Chat"
31
+ src="https://img.shields.io/badge/🤖%20Chat-DMind-536af5?color=536af5&logoColor=white"/>
32
+ </a> -->
33
+ <a href="https://discord.gg/xxwmPHU3">
34
+ <img alt="Discord"
35
+ src="https://img.shields.io/badge/Discord-DMind-7289da?logo=discord&logoColor=white&color=7289da"/>
36
+ </a>
37
+ <a href="https://opensource.org/licenses/MIT">
38
+ <img alt="Code License: MIT" src="https://img.shields.io/badge/Code%20License-MIT-yellow.svg"/>
39
+ </a>
40
+ <!-- <a href="MODEL-LICENSE">
41
+ <img alt="Model License: Model Agreement" src="https://img.shields.io/badge/Model%20License-Model%20Agreement-yellow.svg"/>
42
+ </a> -->
43
+
44
+ </div>
45
+
46
+
47
+ ## Model Overview
48
+
49
+ DMind-2 is a series of crypto investment analysis language models designed to provide real-time, professional consulting services for individual investors and institutions. Standing on the shoulders of numerous open-source pioneers, we have launched two model variants through innovative post-training techniques.
50
+
51
+ Among these, **DMind-2-107B** demonstrates exceptional depth of understanding and analytical capability when addressing complex crypto ecosystem challenges, delivering comprehensive insights that span from macroeconomic trends to microscopic on-chain behaviors.
52
+
53
+ ## Model Variants (DMind-2-107B)
54
+ - **Base Model**: GLM-4.5-Air
55
+ - **Parameters**: 107B
56
+ - **Training Duration**: 1 month of refined post-training
57
+ - **Hardware Requirements**:
58
+ - **Features**: Its core advantage lies in deeply integrating macro market trends with micro on-chain activities, giving it panoramic multi-chain data analysis capability. It can autonomously orchestrate and execute complex on-chain tasks spanning multiple protocols and dozens of steps; it is enhanced with advanced tool-calling capabilities for seamless integration with protocols, APIs, and trading interfaces; and it can synthesize traditional indicators with crypto-native signals such as on-chain data and social sentiment, providing investors with deep insights and intelligent decision-making support.
59
+
60
+ ## Technical Innovations
61
+
62
+ ### 1. Domain-Adaptive Supervised Fine-Tuning (SFT)
63
+
64
+ In building DMind-2, we recognized the uniqueness of the crypto investment domain: it requires not only profound blockchain technical understanding but also keen financial market insight, and, most importantly, the ability to perform rigorous logical reasoning over complex on-chain data and market signals. Our domain-adaptive fine-tuning strategy therefore accounted for these requirements from the very beginning of dataset construction. We carefully curated 47.6K high-quality training samples, including 27.8K crypto domain-specific data points covering crypto investment scenarios from DeFi protocol analysis and NFT valuation models to DAO governance decisions. These data points are not simple Q&A pairs but contain complete investment logic chains, encompassing the entire reasoning process from market observation, data analysis, and risk assessment to investment recommendations.
65
+
66
+ To ensure the model maintains fundamental financial analysis capabilities while focusing on the crypto domain, we specifically incorporated 11.2K high-quality general domain data points and 8.6K pan-financial domain data points. These datasets help the model establish a solid foundation in financial theory and market analysis frameworks, enabling it to creatively apply mature methodologies from traditional finance to the emerging crypto sector. Through this multi-layered data fusion strategy, DMind-2 can act like a professional investment advisor who understands both technology and finance, providing users with comprehensive and in-depth investment analysis.
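+
+ As a concrete illustration, the sketch below restates this data mixture in code; the dictionary keys are invented for readability, and only the sample counts come from the description above.
+
+ ```python
+ # Hypothetical summary of the SFT data mixture (counts from the text above).
+ DATA_MIXTURE = {
+     "crypto_domain": 27_800,   # DeFi analysis, NFT valuation, DAO governance
+     "general": 11_200,         # general-domain data for broad capability
+     "pan_financial": 8_600,    # traditional finance theory and analysis
+ }
+
+ total = sum(DATA_MIXTURE.values())  # 47,600 samples in total
+ for name, count in DATA_MIXTURE.items():
+     print(f"{name}: {count} samples ({count / total:.1%})")
+ ```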
67
+
68
+ ### 2. 🔥 Core Innovation: Distribution-Preserving Chain-of-Thought Distillation (DPCD)
69
+
70
+ DMind-2's greatest technical breakthrough lies in our innovative Distribution-Preserving Chain-of-Thought Distillation method. Traditional domain fine-tuning causes catastrophic forgetting in CoT models, where the model loses reasoning coherence while gaining domain knowledge. Our DPCD method solves this through a mathematically rigorous dual-stream architecture.
71
+
72
+ #### Core Formulation
73
+
74
+ The DPCD optimization objective combines domain adaptation with reasoning preservation through the following loss function:
75
+
76
+ $$
77
+ \mathcal{L}_{\text{DPCD}} = \underbrace{\mathcal{L}_{\text{CE}}(\theta_s, \mathcal{D}_{\text{crypto}})}_{\text{Domain Learning}} + \underbrace{\lambda(t) \cdot \sum_{i=1}^{T} \alpha_i \cdot D_{\text{KL}}(P_{\theta_s}^{(i)} \| P_{\theta_t}^{(i)})}_{\text{Distribution Preservation}} + \underbrace{\beta \cdot \mathcal{L}_{\text{QS}}(\mathcal{C}_{\theta_s})}_{\text{Quality Score}}
78
+ $$
79
+
80
+ Where:
81
+
82
+ * \\(\theta_s\\) and \\(\theta_t\\) represent student (trainable) and teacher (frozen) model parameters.
83
+ * \\(P_{\theta}^{(i)}\\) denotes the probability distribution at reasoning step \\(i\\).
84
+ * \\(\lambda(t) = \lambda_0 \cdot (1 + \gamma \cdot \text{complexity}(x_t))\\) is the dynamic weight function.
85
+ * \\(\alpha_i = \exp(-\delta \cdot i/T)\\) implements exponential decay for later reasoning steps.
86
+ * \\(\mathcal{L}_{\text{QS}}\\) is the quality scoring loss ensuring reasoning coherence.
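+
+ To make this objective concrete, here is a minimal PyTorch sketch of a single DPCD step under the definitions above. It is an illustration rather than the actual training code: the teacher logits are assumed precomputed with gradients detached, and the quality-score term \\(\mathcal{L}_{\text{QS}}\\) is reduced to a scalar coherence penalty.
+
+ ```python
+ import torch
+ import torch.nn.functional as F
+
+ def dpcd_loss(student_logits, teacher_logits, labels,
+               lambda_t, beta=0.2, delta=0.1, quality_score=None):
+     """DPCD objective for one reasoning chain; logits have shape [T, vocab]."""
+     T = student_logits.size(0)
+
+     # Domain learning: cross-entropy on the crypto-domain targets.
+     ce = F.cross_entropy(student_logits, labels)
+
+     # Distribution preservation: per-step KL(P_s || P_t), weighted by
+     # alpha_i = exp(-delta * i / T) so later reasoning steps decay.
+     log_p_s = F.log_softmax(student_logits, dim=-1)
+     log_p_t = F.log_softmax(teacher_logits, dim=-1)   # teacher is frozen
+     kl_per_step = (log_p_s.exp() * (log_p_s - log_p_t)).sum(dim=-1)  # [T]
+     alpha = torch.exp(-delta * torch.arange(T, device=kl_per_step.device) / T)
+     preservation = (alpha * kl_per_step).sum()
+
+     # Quality score: scalar stand-in penalty for low-coherence chains.
+     qs = (1.0 - quality_score) if quality_score is not None else 0.0
+
+     return ce + lambda_t * preservation + beta * qs
+ ```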
87
+
88
+
89
+
90
+ #### Dynamic Weight Adjustment Mechanism
91
+
92
+ The complexity-aware weight adjustment is formulated as:
93
+
94
+ $$
95
+ \lambda(t) = \begin{cases}
96
+ \lambda_{\text{high}} \cdot \left(1 + \tanh\left(\frac{\mathcal{H}(x_t) - \mu_{\mathcal{H}}}{\sigma_{\mathcal{H}}}\right)\right) & \text{if } \mathcal{T}(x_t) \in \{\text{DeFi Analysis, Risk Assessment}\} \\
97
+ \lambda_{\text{base}} & \text{if } \mathcal{T}(x_t) \in \{\text{Market Data, Price Query}\} \\
98
+ \lambda_{\text{base}} \cdot \left(1 + \frac{\mathcal{S}(c_t)}{|\mathcal{V}_{\text{crypto}}|}\right) & \text{otherwise}
99
+ \end{cases}
100
+ $$
101
+
102
+ Where \\(\mathcal{H}(x_t)\\) measures reasoning complexity through chain length and branching factor, \\(\mathcal{S}(c_t)\\) counts domain-specific terms, and \\(|\mathcal{V}_{\text{crypto}}|\\) is the crypto vocabulary size.
103
+
104
+ This mathematical framework ensures that DMind-2 maintains the base model's powerful reasoning capabilities while acquiring deep crypto domain expertise. The KL divergence constraint operates at each token generation step, preserving the original model's reasoning patterns. The quality scoring mechanism \\(\mathcal{L}_{\text{QS}}\\) filters out low-quality reasoning chains, keeping only those paths with coherence scores above the threshold \\(\tau = 0.85\\).
105
+
106
+ Through extensive experimentation, we found optimal hyperparameters: \\(\lambda_{\text{base}} = 0.3\\), \\(\lambda_{\text{high}} = 0.7\\), \\(\beta = 0.2\\), and \\(\delta = 0.1\\). This configuration achieves a 94.1% reasoning chain completeness while improving domain-specific accuracy by 23.2% over baseline fine-tuning methods.
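+
+ The weighting rule itself is small enough to transcribe directly, as in the sketch below using the reported values of \\(\lambda_{\text{base}}\\) and \\(\lambda_{\text{high}}\\); the complexity statistics and the task classifier are assumptions standing in for components that are not published.
+
+ ```python
+ import math
+
+ LAMBDA_BASE, LAMBDA_HIGH = 0.3, 0.7  # reported optimal hyperparameters
+
+ def dynamic_lambda(task_type, complexity_h, mu_h, sigma_h,
+                    domain_term_count, crypto_vocab_size):
+     """Complexity-aware weight lambda(t), following the piecewise rule above."""
+     if task_type in {"DeFi Analysis", "Risk Assessment"}:
+         return LAMBDA_HIGH * (1 + math.tanh((complexity_h - mu_h) / sigma_h))
+     if task_type in {"Market Data", "Price Query"}:
+         return LAMBDA_BASE
+     return LAMBDA_BASE * (1 + domain_term_count / crypto_vocab_size)
+ ```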
107
+
108
+ ### 3. Reinforcement Learning from Human Feedback (RLHF) Optimization
109
+
110
+ After completing basic domain fine-tuning, we further optimize the model using the Group Relative Policy Optimization (GRPO) algorithm. GRPO offers better stability compared to traditional PPO algorithms, which is particularly important for financial domain models—we cannot tolerate dramatic performance fluctuations during optimization as this could lead to unpredictable investment advice. During the RLHF phase, we focused on addressing two key issues: professional output formatting and safety compliance.
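+
+ For readers unfamiliar with GRPO, the sketch below shows its defining step: computing advantages relative to a group of sampled responses rather than a learned value baseline. This follows the published GRPO formulation in general, not DMind's internal trainer.
+
+ ```python
+ import torch
+
+ def group_relative_advantages(rewards: torch.Tensor) -> torch.Tensor:
+     """rewards: [G] scalar rewards for G sampled responses to one prompt.
+
+     Normalizing within the group removes the need for a critic network,
+     which is the main source of GRPO's stability advantage over PPO here.
+     """
+     return (rewards - rewards.mean()) / (rewards.std() + 1e-6)
+
+ # Example: four sampled analyses scored by a reward model.
+ print(group_relative_advantages(torch.tensor([0.9, 0.4, 0.7, 0.2])))
+ ```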
111
+
112
+ For professional output formatting, we constructed 4.2K carefully designed professional format data points. These data samples are sourced from real investment research reports, market analysis documents, and project due diligence reports, covering all aspects of investment analysis. Through RLHF training, the model learned how to organize a professional investment analysis report: starting with an executive summary that clearly articulates investment opportunities and risks; conducting in-depth technical analysis and market evaluation in the main body; and finally providing clear investment recommendations and risk warnings. This structured output not only improves information readability but more importantly helps investors establish systematic analytical frameworks, avoiding impulsive investment decisions due to disorganized information.
113
+
114
+ Safety alignment is another aspect we particularly emphasize. The crypto investment field is full of high-risk, high-reward opportunities, and the model must accurately identify and highlight potential risks. We use proprietary risk case datasets to conduct safety training on the model, ensuring it won't output overly optimistic investment advice or overlook obvious risk signals. For example, when analyzing an emerging DeFi protocol, the model automatically checks key risk indicators such as smart contract audit status, team background, and total value locked, explicitly marking risk levels in investment recommendations. This responsible output approach not only protects users' asset security but also reflects our commitment to financial compliance.
115
+
116
+ ## Performance Metrics
117
+
118
+ <div align="center">
119
+ <img src="dmind-2-performance.jpeg" width="85%" alt="DMind-2" />
120
+ </div>
121
+
122
+
123
+ | Category | Benchmark (Metric) | DeepSeek-R1-0528 | gpt-oss-120b | Qwen3-235B-A22B | GLM-4.5-Air | **DMind-2-107B** |
124
+ | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
125
+ | **General** | | | | | | |
126
+ | | MMLU-Pro (EM) | 84.0 | 90.0 | 80.6 | 81.4 | 83.1 |
127
+ | | GPQA-Diamond (Pass@1) | 71.5 | 80.9 | 77.5 | 75.0 | 74.3 |
128
+ | | SimpleQA (Correct) | 30.1 | 6.7 | 54.3 | - | 51.5 |
129
+ | **Math** | | | | | | |
130
+ | | AIME 2024 (Pass@1) | 79.8 | 96.6 | 80.4 | 89.4 | 93.3 |
131
+ | | AIME 2025 (Pass@1) | 70.0 | 97.9 | 70.3 | - | 94.8 |
132
+ | | CNMO 2024 (Pass@1) | 78.8 | 86.9 | - | - | 84.1 |
133
+ | **Tools** | | | | | | |
134
+ | | BFCL_v3 | - | 67.8 | 70.3 | 76.4 | 74.5 |
135
+ | **Crypto** | | | | | | |
136
+ | | DMind Benchmark | 74.1 | 76.3 | 73.4 | 76.8 | 82.2 |
137
+
138
+
139
+ ## Application Scenarios
140
+
141
+ ### 🎯 Edge-Side Crypto Investment Decision Support
142
+
143
+ DMind-2 can provide real-time crypto investment analysis on users' personal devices, including DeFi yield comparisons, liquidity mining strategy optimization, and NFT valuation analysis. All calculations and analyses are completed locally, ensuring absolute privacy of investment strategies and position information. The model can analyze on-chain data, evaluate project fundamentals, identify market trends, and provide comprehensive support for investment decisions.
144
+
145
+ ### 💼 Personalized Financial Advisory Services
146
+
147
+ Based on users' risk preferences, investment objectives, and asset allocation needs, DMind-2 can provide customized investment advice. Whether for long-term value investing or short-term arbitrage opportunities, the model can provide professional analysis and recommendations. More importantly, it can explain complex crypto concepts in plain language, helping investors understand the logic behind every investment decision.
148
+
149
+ ### 📊 Comprehensive Financial Investment Computational Analysis
150
+
151
+ DMind-2 is not limited to the crypto domain but also possesses powerful pan-financial computational analysis capabilities. It can perform yield calculations, risk assessments, portfolio optimization, correlation analysis, and other professional financial computations. By integrating traditional financial theory with crypto innovative mechanisms, the model helps investors find optimal asset allocation solutions between old and new financial systems.
152
+
153
+ ### 🔍 Real-Time Market Monitoring and Alerts
154
+
155
+ Edge-deployed DMind-2 can monitor market dynamics 24/7, promptly alerting users when important market events or investment opportunities arise. Running locally ensures extremely fast response speeds, providing immediate response recommendations during severe market volatility.
156
+
157
+
158
+ ## Usage Example
159
+
160
+ ```python
161
+ from transformers import AutoModelForCausalLM, AutoTokenizer
162
+ import torch
163
+
164
+ # Load the model and tokenizer
165
+ model = AutoModelForCausalLM.from_pretrained(
166
+ "zai-org/GLM-4.5-Air",
167
+ torch_dtype=torch.bfloat16,  # matches the checkpoint's native dtype
168
+ device_map="auto",
169
+ trust_remote_code=True
170
+ )
171
+ tokenizer = AutoTokenizer.from_pretrained(
172
+ "zai-org/GLM-4.5-Air",
173
+ trust_remote_code=True
174
+ )
175
+
176
+ # Example dialogue (the tokenizer applies the GLM-4.5 chat template)
177
+ query = """Please analyze the following investment opportunity:
178
+ 1. Project: Emerging Layer2 DEX Protocol
179
+ 2. TVL: $50M, growth rate 200%/month
180
+ 3. Token Economics: 70% circulating, 30% team locked for 2 years
181
+ 4. My risk tolerance: Medium
182
+ Please provide investment advice and risk analysis."""
183
+ messages = [{"role": "user", "content": query}]
184
+ prompt = tokenizer.apply_chat_template(
185
+ messages, tokenize=False, add_generation_prompt=True
186
+ )
187
+
188
+ # Generate a response
189
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
190
+ outputs = model.generate(
191
+ **inputs,
192
+ max_new_tokens=2048,
193
+ temperature=0.7,
194
+ do_sample=True,
195
+ pad_token_id=tokenizer.eos_token_id
196
+ )
197
+
198
+ response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
199
+ print(response)
200
+ ```
201
+
202
+ ## Privacy & Security
203
+
204
+ - 🔐 **Fully Localized**: All inference computations are completed on user devices, no internet required
205
+ - 🛡️ **Data Privacy**: Investment strategies and personal information never leave local devices
206
+ - ⚡ **Real-Time Response**: No network latency, millisecond-level response speed
207
+ - 🔒 **Security Compliance**: Built-in risk warning mechanisms, compliant with financial regulations
208
+
209
+ ## Limitations & Disclaimers
210
+
211
+ 1. **Not Investment Advice**: Model outputs are for reference only; final investment decisions require users' own judgment
212
+ 2. **Market Risk**: Crypto markets are highly volatile; please carefully assess risk tolerance
213
+ 3. **Knowledge Timeliness**: Model knowledge has temporal limitations; latest market information requires additional verification
214
+ 4. **Regulatory Compliance**: Please comply with financial regulations in your jurisdiction when using
215
+
216
+ ## Acknowledgments
217
+
218
+ We thank the Qwen and zai teams for providing the excellent base model and the continuous contributions from the open-source community. DMind-2's success wouldn't be possible without the collective efforts of the entire AI and Crypto community.
219
+
220
+ ## License
221
+
222
+ This model follows the Apache 2.0 open-source license. Commercial use must comply with relevant terms.
223
+
224
+ ## Citation
225
+
226
+ ```bibtex
227
+ @misc{dmind2025,
228
+ title={DMind-2: Advanced Crypto Domain-Specific Large Language Models with Distribution-Preserving CoT Distillation},
229
+ author={DMind Team},
230
+ year={2025},
231
+ publisher={Hugging Face}
232
+ }
233
+ ```
234
+
235
+ ## Contact
236
+
237
+ - 🌐 Project Homepage: [https://dmind.ai](https://dmind.ai)
238
+ - 💬 Community Discussion: [Discord](https://discord.gg/dmind)
239
+ - 🐦 Twitter: [@DMindAI](https://twitter.com/DMindAI)
240
+
241
+ ---
chat_template.jinja ADDED
@@ -0,0 +1,103 @@
1
+ [gMASK]<sop>
2
+ {%- if tools -%}
3
+ <|system|>
4
+ # Tools
5
+
6
+ You may call one or more functions to assist with the user query.
7
+
8
+ You are provided with function signatures within <tools></tools> XML tags:
9
+ <tools>
10
+ {% for tool in tools %}
11
+ {{ tool | tojson(ensure_ascii=False) }}
12
+ {% endfor %}
13
+ </tools>
14
+
15
+ For each function call, output the function name and arguments within the following XML format:
16
+ <tool_call>{function-name}
17
+ <arg_key>{arg-key-1}</arg_key>
18
+ <arg_value>{arg-value-1}</arg_value>
19
+ <arg_key>{arg-key-2}</arg_key>
20
+ <arg_value>{arg-value-2}</arg_value>
21
+ ...
22
+ </tool_call>{%- endif -%}
23
+ {%- macro visible_text(content) -%}
24
+ {%- if content is string -%}
25
+ {{- content }}
26
+ {%- elif content is iterable and content is not mapping -%}
27
+ {%- for item in content -%}
28
+ {%- if item is mapping and item.type == 'text' -%}
29
+ {{- item.text }}
30
+ {%- elif item is string -%}
31
+ {{- item }}
32
+ {%- endif -%}
33
+ {%- endfor -%}
34
+ {%- else -%}
35
+ {{- content }}
36
+ {%- endif -%}
37
+ {%- endmacro -%}
38
+ {%- set ns = namespace(last_user_index=-1) %}
39
+ {%- for m in messages %}
40
+ {%- if m.role == 'user' %}
41
+ {% set ns.last_user_index = loop.index0 -%}
42
+ {%- endif %}
43
+ {%- endfor %}
44
+ {% for m in messages %}
45
+ {%- if m.role == 'user' -%}<|user|>
46
+ {{ visible_text(m.content) }}
47
+ {{- '/nothink' if (enable_thinking is defined and not enable_thinking and not visible_text(m.content).endswith("/nothink")) else '' -}}
48
+ {%- elif m.role == 'assistant' -%}
49
+ <|assistant|>
50
+ {%- set reasoning_content = '' %}
51
+ {%- set content = visible_text(m.content) %}
52
+ {%- if m.reasoning_content is string %}
53
+ {%- set reasoning_content = m.reasoning_content %}
54
+ {%- else %}
55
+ {%- if '</think>' in content %}
56
+ {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
57
+ {%- set content = content.split('</think>')[-1].lstrip('\n') %}
58
+ {%- endif %}
59
+ {%- endif %}
60
+ {%- if loop.index0 > ns.last_user_index and reasoning_content -%}
61
+ {{ '\n<think>' + reasoning_content.strip() + '</think>'}}
62
+ {%- else -%}
63
+ {{ '\n<think></think>' }}
64
+ {%- endif -%}
65
+ {%- if content.strip() -%}
66
+ {{ '\n' + content.strip() }}
67
+ {%- endif -%}
68
+ {% if m.tool_calls %}
69
+ {% for tc in m.tool_calls %}
70
+ {%- if tc.function %}
71
+ {%- set tc = tc.function %}
72
+ {%- endif %}
73
+ {{ '\n<tool_call>' + tc.name }}
74
+ {% set _args = tc.arguments %}
75
+ {% for k, v in _args.items() %}
76
+ <arg_key>{{ k }}</arg_key>
77
+ <arg_value>{{ v | tojson(ensure_ascii=False) if v is not string else v }}</arg_value>
78
+ {% endfor %}
79
+ </tool_call>{% endfor %}
80
+ {% endif %}
81
+ {%- elif m.role == 'tool' -%}
82
+ {%- if m.content is string -%}
83
+ {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
84
+ {{- '<|observation|>' }}
85
+ {%- endif %}
86
+ {{- '\n<tool_response>\n' }}
87
+ {{- m.content }}
88
+ {{- '\n</tool_response>' }}
89
+ {%- else -%}
90
+ <|observation|>{% for tr in m.content %}
91
+
92
+ <tool_response>
93
+ {{ tr.output if tr.output is defined else tr }}
94
+ </tool_response>{% endfor -%}
95
+ {% endif -%}
96
+ {%- elif m.role == 'system' -%}
97
+ <|system|>
98
+ {{ visible_text(m.content) }}
99
+ {%- endif -%}
100
+ {%- endfor -%}
101
+ {%- if add_generation_prompt -%}
102
+ <|assistant|>{{- '\n<think></think>' if (enable_thinking is defined and not enable_thinking) else '' -}}
103
+ {%- endif -%}
config.json ADDED
@@ -0,0 +1,55 @@
1
+ {
2
+ "architectures": [
3
+ "Glm4MoeForCausalLM"
4
+ ],
5
+ "attention_bias": true,
6
+ "attention_dropout": 0.0,
7
+ "eos_token_id": [
8
+ 151329,
9
+ 151336,
10
+ 151338
11
+ ],
12
+ "first_k_dense_replace": 1,
13
+ "head_dim": 128,
14
+ "hidden_act": "silu",
15
+ "hidden_size": 4096,
16
+ "initializer_range": 0.02,
17
+ "intermediate_size": 10944,
18
+ "max_position_embeddings": 131072,
19
+ "model_type": "glm4_moe",
20
+ "moe_intermediate_size": 1408,
21
+ "n_group": 1,
22
+ "n_routed_experts": 128,
23
+ "n_shared_experts": 1,
24
+ "norm_topk_prob": true,
25
+ "num_attention_heads": 96,
26
+ "num_experts_per_tok": 8,
27
+ "num_hidden_layers": 46,
28
+ "num_key_value_heads": 8,
29
+ "num_nextn_predict_layers": 1,
30
+ "pad_token_id": 151329,
31
+ "partial_rotary_factor": 0.5,
32
+ "rms_norm_eps": 1e-05,
33
+ "rope_scaling": null,
34
+ "rope_theta": 1000000,
35
+ "routed_scaling_factor": 1.0,
36
+ "tie_word_embeddings": false,
37
+ "topk_group": 1,
38
+ "torch_dtype": "bfloat16",
39
+ "transformers_version": "4.56.0.dev0",
40
+ "use_cache": true,
41
+ "use_qk_norm": false,
42
+ "vocab_size": 151552,
43
+ "quantization_config": {
44
+ "quant_method": "exl3",
45
+ "version": "0.0.11",
46
+ "bits": 2.91,
47
+ "head_bits": 6,
48
+ "calibration": {
49
+ "rows": 250,
50
+ "cols": 2048
51
+ },
52
+ "out_scales": "auto",
53
+ "codebook": "mcg"
54
+ }
55
+ }
dmind-2-performance.jpeg ADDED

Git LFS Details

  • SHA256: 1f4aeba3cbc5b2c26b30cb07b4276442f79da498cb5f50375b1a68bb9417cbb1
  • Pointer size: 131 Bytes
  • Size of remote file: 549 kB
dmind-ai-logo.png ADDED
generation_config.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "_from_model_config": true,
3
+ "eos_token_id": [
4
+ 151329,
5
+ 151336,
6
+ 151338
7
+ ],
8
+ "pad_token_id": 151329,
9
+ "transformers_version": "4.56.0.dev0"
10
+ }
model-00001-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:01a3ba550548a1a6015cbd6674418765aa599dc952454083695d7ec9ad3e57db
3
+ size 8046361584
model-00002-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7f7306ee5206cf6ddd269e3a44517244f16a56efd7ade06733ffaa2451da7c68
3
+ size 7831519308
model-00003-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ffdd95f57f1126b610c0e8786f8dc772754a4ce95e09bb5576b7f291384eb234
3
+ size 8148044392
model-00004-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7e14e0330e75673587c7261807a5b7994830deadf9d42a3572e2dba13697f63e
3
+ size 8240319168
model-00005-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:89288b23eb78a7ab092d059b15937bda7d7ced38c783f2aefab8e635e4f3b5dc
3
+ size 7836037040
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
quantization_config.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0c437ca8132d967ede4499394336da00a650a0ec24511174801e5801153a82c7
3
+ size 21625732
special_tokens_map.json ADDED
@@ -0,0 +1,40 @@
1
+ {
2
+ "additional_special_tokens": [
3
+ "<|endoftext|>",
4
+ "[MASK]",
5
+ "[gMASK]",
6
+ "[sMASK]",
7
+ "<sop>",
8
+ "<eop>",
9
+ "<|system|>",
10
+ "<|user|>",
11
+ "<|assistant|>",
12
+ "<|observation|>",
13
+ "<|begin_of_image|>",
14
+ "<|end_of_image|>",
15
+ "<|begin_of_video|>",
16
+ "<|end_of_video|>",
17
+ "<|begin_of_audio|>",
18
+ "<|end_of_audio|>",
19
+ "<|begin_of_transcription|>",
20
+ "<|end_of_transcription|>",
21
+ "<|code_prefix|>",
22
+ "<|code_middle|>",
23
+ "<|code_suffix|>",
24
+ "/nothink"
25
+ ],
26
+ "eos_token": {
27
+ "content": "<|endoftext|>",
28
+ "lstrip": false,
29
+ "normalized": false,
30
+ "rstrip": false,
31
+ "single_word": false
32
+ },
33
+ "pad_token": {
34
+ "content": "<|endoftext|>",
35
+ "lstrip": false,
36
+ "normalized": false,
37
+ "rstrip": false,
38
+ "single_word": false
39
+ }
40
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bda8e2146c3bb7b7e0fc96dcc4f0aeff041c6c27952e3ace0665663ebff346ba
3
+ size 19970700
tokenizer_config.json ADDED
@@ -0,0 +1,326 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "151329": {
4
+ "content": "<|endoftext|>",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "151330": {
12
+ "content": "[MASK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "151331": {
20
+ "content": "[gMASK]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "151332": {
28
+ "content": "[sMASK]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "151333": {
36
+ "content": "<sop>",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ },
43
+ "151334": {
44
+ "content": "<eop>",
45
+ "lstrip": false,
46
+ "normalized": false,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": true
50
+ },
51
+ "151335": {
52
+ "content": "<|system|>",
53
+ "lstrip": false,
54
+ "normalized": false,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": true
58
+ },
59
+ "151336": {
60
+ "content": "<|user|>",
61
+ "lstrip": false,
62
+ "normalized": false,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": true
66
+ },
67
+ "151337": {
68
+ "content": "<|assistant|>",
69
+ "lstrip": false,
70
+ "normalized": false,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": true
74
+ },
75
+ "151338": {
76
+ "content": "<|observation|>",
77
+ "lstrip": false,
78
+ "normalized": false,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": true
82
+ },
83
+ "151339": {
84
+ "content": "<|begin_of_image|>",
85
+ "lstrip": false,
86
+ "normalized": false,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": true
90
+ },
91
+ "151340": {
92
+ "content": "<|end_of_image|>",
93
+ "lstrip": false,
94
+ "normalized": false,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": true
98
+ },
99
+ "151341": {
100
+ "content": "<|begin_of_video|>",
101
+ "lstrip": false,
102
+ "normalized": false,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": true
106
+ },
107
+ "151342": {
108
+ "content": "<|end_of_video|>",
109
+ "lstrip": false,
110
+ "normalized": false,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": true
114
+ },
115
+ "151343": {
116
+ "content": "<|begin_of_audio|>",
117
+ "lstrip": false,
118
+ "normalized": false,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": true
122
+ },
123
+ "151344": {
124
+ "content": "<|end_of_audio|>",
125
+ "lstrip": false,
126
+ "normalized": false,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": true
130
+ },
131
+ "151345": {
132
+ "content": "<|begin_of_transcription|>",
133
+ "lstrip": false,
134
+ "normalized": false,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": true
138
+ },
139
+ "151346": {
140
+ "content": "<|end_of_transcription|>",
141
+ "lstrip": false,
142
+ "normalized": false,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": true
146
+ },
147
+ "151347": {
148
+ "content": "<|code_prefix|>",
149
+ "lstrip": false,
150
+ "normalized": false,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": true
154
+ },
155
+ "151348": {
156
+ "content": "<|code_middle|>",
157
+ "lstrip": false,
158
+ "normalized": false,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": true
162
+ },
163
+ "151349": {
164
+ "content": "<|code_suffix|>",
165
+ "lstrip": false,
166
+ "normalized": false,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": true
170
+ },
171
+ "151350": {
172
+ "content": "<think>",
173
+ "lstrip": false,
174
+ "normalized": false,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "151351": {
180
+ "content": "</think>",
181
+ "lstrip": false,
182
+ "normalized": false,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "151352": {
188
+ "content": "<tool_call>",
189
+ "lstrip": false,
190
+ "normalized": false,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "151353": {
196
+ "content": "</tool_call>",
197
+ "lstrip": false,
198
+ "normalized": false,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "151354": {
204
+ "content": "<tool_response>",
205
+ "lstrip": false,
206
+ "normalized": false,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "151355": {
212
+ "content": "</tool_response>",
213
+ "lstrip": false,
214
+ "normalized": false,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "151356": {
220
+ "content": "<arg_key>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": false
226
+ },
227
+ "151357": {
228
+ "content": "</arg_key>",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": false
234
+ },
235
+ "151358": {
236
+ "content": "<arg_value>",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": false
242
+ },
243
+ "151359": {
244
+ "content": "</arg_value>",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": false
250
+ },
251
+ "151360": {
252
+ "content": "/nothink",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "151361": {
260
+ "content": "<|begin_of_box|>",
261
+ "lstrip": false,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": false
266
+ },
267
+ "151362": {
268
+ "content": "<|end_of_box|>",
269
+ "lstrip": false,
270
+ "normalized": false,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "151363": {
276
+ "content": "<|image|>",
277
+ "lstrip": false,
278
+ "normalized": false,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "151364": {
284
+ "content": "<|video|>",
285
+ "lstrip": false,
286
+ "normalized": false,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ }
291
+ },
292
+ "additional_special_tokens": [
293
+ "<|endoftext|>",
294
+ "[MASK]",
295
+ "[gMASK]",
296
+ "[sMASK]",
297
+ "<sop>",
298
+ "<eop>",
299
+ "<|system|>",
300
+ "<|user|>",
301
+ "<|assistant|>",
302
+ "<|observation|>",
303
+ "<|begin_of_image|>",
304
+ "<|end_of_image|>",
305
+ "<|begin_of_video|>",
306
+ "<|end_of_video|>",
307
+ "<|begin_of_audio|>",
308
+ "<|end_of_audio|>",
309
+ "<|begin_of_transcription|>",
310
+ "<|end_of_transcription|>",
311
+ "<|code_prefix|>",
312
+ "<|code_middle|>",
313
+ "<|code_suffix|>",
314
+ "/nothink"
315
+ ],
316
+ "clean_up_tokenization_spaces": false,
317
+ "do_lower_case": false,
318
+ "eos_token": "<|endoftext|>",
319
+ "extra_special_tokens": {},
320
+ "model_max_length": 128000,
321
+ "pad_token": "<|endoftext|>",
322
+ "padding_side": "left",
323
+ "remove_space": false,
324
+ "split_special_tokens": false,
325
+ "tokenizer_class": "PreTrainedTokenizerFast"
326
+ }