Commit bf6eb55 (verified) by ArtusDev · 1 parent: cb0c9ae

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,114 @@
1
+ ---
2
+ language:
3
+ - multilingual
4
+ license: other
5
+ license_name: kwaipilot-license
6
+ license_link: LICENSE
7
+ library_name: transformers
8
+ ---
9
+ <div align="center">
10
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/61ee40a269351366e29972ad/KIYEa1c_WJEWPpeS0L_k1.png" width="100%" alt="Kwaipilot" />
11
+ </div>
12
+
13
+ <hr>
14
+
15
+
16
+ # Highlights
17
+ **KAT-Dev-32B** is an open-source 32B-parameter model for software engineering tasks.
18
+
19
+ On SWE-Bench Verified, **KAT-Dev-32B** resolves **62.4%** of tasks, ranking **5th** among open-source models of all sizes.
20
+
21
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61ee40a269351366e29972ad/dTpQQPQnp1TdD4YB8gZAu.png)
22
+
23
+ # Introduction
24
+
25
+ **KAT-Dev-32B** is trained in several stages: a mid-training stage, a supervised fine-tuning (SFT) and reinforcement fine-tuning (RFT) stage, and a large-scale agentic reinforcement learning (RL) stage. In summary, our contributions include:
26
+
27
+ <table>
28
+ <thead>
29
+ <tr>
30
+ <th style="text-align:left; width:18%;">Stage</th>
31
+ <th style="text-align:left;">Key Techniques</th>
32
+ </tr>
33
+ </thead>
34
+ <tbody>
35
+ <tr>
36
+ <td><strong>1. Mid-Training</strong></td>
37
+ <td>We observe that extensive training for tool use, multi-turn interaction, and instruction following at this stage may not yield large gains on current benchmarks (e.g., leaderboards such as SWE-bench). However, in our experiments, which are based on the Qwen3-32B model, strengthening these foundational capabilities has a significant impact on the subsequent SFT and RL stages. This suggests that improving such core abilities strongly influences the model’s capacity to handle more complex tasks.
38
+ </td>
39
+ </tr>
40
+ <tr>
41
+ <td><strong>2. SFT & RFT</strong></td>
42
+ <td>We curated eight task types and eight programming scenarios for the SFT stage to support generalization and broad capability. Before RL, we also introduced an RFT stage. Unlike traditional RL, it uses “teacher trajectories” annotated by human engineers as guidance during training, much like a learner driver practicing with an experienced co-driver before driving alone with a license. This step improves model performance and also stabilizes the subsequent RL training.
43
+ </td>
44
+ </tr>
45
+ <tr>
46
+ <td><strong>3. Agentic RL Scaling</strong></td>
47
+ <td>Scaling agentic RL hinges on three challenges: efficient learning over nonlinear trajectory histories, leveraging intrinsic model signals, and building scalable high-throughput infrastructure. We address these with a multi-level prefix caching mechanism in the RL training engine, an entropy-based trajectory pruning technique, and an internal implementation of the SeamlessFlow[1] architecture that cleanly decouples agents from training while exploiting heterogeneous compute. Together, these cut scaling costs and enable efficient large-scale RL.
48
+ </td>
49
+ </tr>
50
+ </tbody>
51
+ </table>
52
+
53
+ For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [blog](https://kwaipilot.github.io/KAT-Coder/).
54
+
55
+ # Quickstart
56
+
57
+ ```python
58
+ from transformers import AutoModelForCausalLM, AutoTokenizer
59
+
60
+ model_name = "Kwaipilot/KAT-Dev"
61
+
62
+ # load the tokenizer and the model
63
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
64
+ model = AutoModelForCausalLM.from_pretrained(
65
+     model_name,
66
+     torch_dtype="auto",
67
+     device_map="auto"
68
+ )
69
+
70
+ # prepare the model input
71
+ prompt = "Give me a short introduction to large language models."
72
+ messages = [
73
+     {"role": "user", "content": prompt}
74
+ ]
75
+ text = tokenizer.apply_chat_template(
76
+     messages,
77
+     tokenize=False,
78
+     add_generation_prompt=True,
79
+ )
80
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
81
+
82
+ # conduct text completion
83
+ generated_ids = model.generate(
84
+     **model_inputs,
85
+     max_new_tokens=65536
86
+ )
87
+ output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
88
+
89
+ content = tokenizer.decode(output_ids, skip_special_tokens=True)
90
+
91
+ print("content:", content)
92
+ ```
93
+
94
+ ## Claude Code
95
+ ### vLLM server
96
+ ```bash
97
+ MODEL_PATH="Kwaipilot/KAT-Dev"
98
+
99
+ vllm serve $MODEL_PATH \
100
+     --enable-prefix-caching \
101
+     --tensor-parallel-size 8 \
102
+     --tool-parser-plugin $MODEL_PATH/qwen3coder_tool_parser.py \
103
+     --chat-template $MODEL_PATH/chat_template.jinja \
104
+     --enable-auto-tool-choice --tool-call-parser qwen3_coder
105
+ ```
106
+
107
+ [claude-code-router](https://github.com/musistudio/claude-code-router) is a third-party routing utility that allows Claude Code to flexibly switch between different backend APIs.
108
+ On the DashScope platform, you can install the **claude-code-config** extension package, which automatically generates a default configuration for `claude-code-router` with built-in DashScope support.
109
+
110
+ Once the configuration files and plugin directory are generated, the environment required by `ccr` will be ready.
111
+ If needed, you can still manually edit `~/.claude-code-router/config.json` and the files under `~/.claude-code-router/plugins/` to customize the setup.
112
+
113
+ Finally, simply start `ccr` to run Claude Code and seamlessly connect it with the powerful coding capabilities of **KAT-Dev-32B**.
114
+ Happy coding!
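
Before wiring up `claude-code-router`, it can help to smoke-test the vLLM endpoint from the README directly. The sketch below builds an OpenAI-style chat completion request with one tool and posts it using only the standard library. It assumes the server listens on vLLM's default address (`http://localhost:8000/v1`) and that the served model name is the path passed to `vllm serve`; the `read_file` tool is a hypothetical example, not something shipped with the model.

```python
import json
import urllib.request

# Assumption: `vllm serve` was started as in the README and uses its default host/port.
BASE_URL = "http://localhost:8000/v1"
# Assumption: the served model name defaults to the path given to `vllm serve`.
MODEL = "Kwaipilot/KAT-Dev"


def build_chat_request(prompt: str) -> dict:
    """Build a chat completion payload that offers one illustrative tool."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "read_file",  # hypothetical tool for illustration
                    "description": "Read a text file and return its contents.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "path": {
                                "type": "string",
                                "description": "Path of the file to read.",
                            }
                        },
                        "required": ["path"],
                    },
                },
            }
        ],
        "tool_choice": "auto",
    }


def send(payload: dict) -> dict:
    """POST the payload to the chat completions endpoint and decode the JSON reply."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `send(build_chat_request("Show me setup.py"))["choices"][0]["message"]` should carry either plain `content` or a `tool_calls` list, depending on whether the model decided to call the tool.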
added_tokens.json ADDED
@@ -0,0 +1,28 @@
1
+ {
2
+ "</think>": 151668,
3
+ "</tool_call>": 151658,
4
+ "</tool_response>": 151666,
5
+ "<think>": 151667,
6
+ "<tool_call>": 151657,
7
+ "<tool_response>": 151665,
8
+ "<|box_end|>": 151649,
9
+ "<|box_start|>": 151648,
10
+ "<|endoftext|>": 151643,
11
+ "<|file_sep|>": 151664,
12
+ "<|fim_middle|>": 151660,
13
+ "<|fim_pad|>": 151662,
14
+ "<|fim_prefix|>": 151659,
15
+ "<|fim_suffix|>": 151661,
16
+ "<|im_end|>": 151645,
17
+ "<|im_start|>": 151644,
18
+ "<|image_pad|>": 151655,
19
+ "<|object_ref_end|>": 151647,
20
+ "<|object_ref_start|>": 151646,
21
+ "<|quad_end|>": 151651,
22
+ "<|quad_start|>": 151650,
23
+ "<|repo_name|>": 151663,
24
+ "<|video_pad|>": 151656,
25
+ "<|vision_end|>": 151653,
26
+ "<|vision_pad|>": 151654,
27
+ "<|vision_start|>": 151652
28
+ }
chat_template.jinja ADDED
@@ -0,0 +1,117 @@
1
+ {% macro render_extra_keys(json_dict, handled_keys) %}
2
+ {%- if json_dict is mapping %}
3
+ {%- for json_key in json_dict if json_key not in handled_keys %}
4
+ {%- if json_dict[json_key] is mapping or (json_dict[json_key] is sequence and json_dict[json_key] is not string) %}
5
+ {{- '\n<' ~ json_key ~ '>' ~ (json_dict[json_key] | tojson | safe) ~ '</' ~ json_key ~ '>' }}
6
+ {%- else %}
7
+ {{-'\n<' ~ json_key ~ '>' ~ (json_dict[json_key] | string) ~ '</' ~ json_key ~ '>' }}
8
+ {%- endif %}
9
+ {%- endfor %}
10
+ {%- endif %}
11
+ {% endmacro %}
12
+
13
+ {%- if messages[0]["role"] == "system" %}
14
+ {%- set system_message = messages[0]["content"] %}
15
+ {%- set loop_messages = messages[1:] %}
16
+ {%- else %}
17
+ {%- set loop_messages = messages %}
18
+ {%- endif %}
19
+
20
+ {%- if not tools is defined %}
21
+ {%- set tools = [] %}
22
+ {%- endif %}
23
+
24
+ {%- if system_message is defined %}
25
+ {{- "<|im_start|>system\n" + system_message }}
26
+ {%- else %}
27
+ {%- if tools is iterable and tools | length > 0 %}
28
+ {{- "<|im_start|>system\nYou are a helpful AI assistant that can interact with a computer to solve tasks." }}
29
+ {%- endif %}
30
+ {%- endif %}
31
+ {%- if tools is iterable and tools | length > 0 %}
32
+ {{- "\n\n# Tools\n\nYou have access to the following functions:\n\n" }}
33
+ {{- "<tools>" }}
34
+ {%- for tool in tools %}
35
+ {%- if tool.function is defined %}
36
+ {%- set tool = tool.function %}
37
+ {%- endif %}
38
+ {{- "\n<function>\n<name>" ~ tool.name ~ "</name>" }}
39
+ {%- if tool.description is defined %}
40
+ {{- '\n<description>' ~ (tool.description | trim) ~ '</description>' }}
41
+ {%- endif %}
42
+ {{- '\n<parameters>' }}
43
+ {%- if tool.parameters is defined and tool.parameters is mapping and tool.parameters.properties is defined and tool.parameters.properties is mapping %}
44
+ {%- for param_name, param_fields in tool.parameters.properties|items %}
45
+ {{- '\n<parameter>' }}
46
+ {{- '\n<name>' ~ param_name ~ '</name>' }}
47
+ {%- if param_fields.type is defined %}
48
+ {{- '\n<type>' ~ (param_fields.type | string) ~ '</type>' }}
49
+ {%- endif %}
50
+ {%- if param_fields.description is defined %}
51
+ {{- '\n<description>' ~ (param_fields.description | trim) ~ '</description>' }}
52
+ {%- endif %}
53
+ {%- set handled_keys = ['name', 'type', 'description'] %}
54
+ {{- render_extra_keys(param_fields, handled_keys) }}
55
+ {{- '\n</parameter>' }}
56
+ {%- endfor %}
57
+ {%- endif %}
58
+ {% set handled_keys = ['type', 'properties'] %}
59
+ {{- render_extra_keys(tool.parameters, handled_keys) }}
60
+ {{- '\n</parameters>' }}
61
+ {%- set handled_keys = ['type', 'name', 'description', 'parameters'] %}
62
+ {{- render_extra_keys(tool, handled_keys) }}
63
+ {{- '\n</function>' }}
64
+ {%- endfor %}
65
+ {{- "\n</tools>" }}
66
+ {{- '\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\n- Required parameters MUST be specified\n- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\n- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\n</IMPORTANT>' }}
67
+ {%- endif %}
68
+ {%- if system_message is defined %}
69
+ {{- '<|im_end|>\n' }}
70
+ {%- else %}
71
+ {%- if tools is iterable and tools | length > 0 %}
72
+ {{- '<|im_end|>\n' }}
73
+ {%- endif %}
74
+ {%- endif %}
75
+ {%- for message in loop_messages %}
76
+ {%- if message.role == "assistant" and message.tool_calls is defined and message.tool_calls is iterable and message.tool_calls | length > 0 %}
77
+ {{- '<|im_start|>' + message.role }}
78
+ {%- if message.content is defined and message.content is string and message.content | trim | length > 0 %}
79
+ {{- '\n' + message.content | trim + '\n' }}
80
+ {%- endif %}
81
+ {%- for tool_call in message.tool_calls %}
82
+ {%- if tool_call.function is defined %}
83
+ {%- set tool_call = tool_call.function %}
84
+ {%- endif %}
85
+ {{- '\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
86
+ {%- if tool_call.arguments is defined %}
87
+ {%- for args_name, args_value in tool_call.arguments|items %}
88
+ {{- '<parameter=' + args_name + '>\n' }}
89
+ {%- set args_value = args_value | tojson | safe if args_value is mapping or (args_value is sequence and args_value is not string) else args_value | string %}
90
+ {{- args_value }}
91
+ {{- '\n</parameter>\n' }}
92
+ {%- endfor %}
93
+ {%- endif %}
94
+ {{- '</function>\n</tool_call>' }}
95
+ {%- endfor %}
96
+ {{- '<|im_end|>\n' }}
97
+ {%- elif message.role == "user" or message.role == "system" or message.role == "assistant" %}
98
+ {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
99
+ {%- elif message.role == "tool" %}
100
+ {%- if loop.previtem and loop.previtem.role != "tool" %}
101
+ {{- '<|im_start|>user\n' }}
102
+ {%- endif %}
103
+ {{- '<tool_response>\n' }}
104
+ {{- message.content }}
105
+ {{- '\n</tool_response>\n' }}
106
+ {%- if not loop.last and loop.nextitem.role != "tool" %}
107
+ {{- '<|im_end|>\n' }}
108
+ {%- elif loop.last %}
109
+ {{- '<|im_end|>\n' }}
110
+ {%- endif %}
111
+ {%- else %}
112
+ {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>\n' }}
113
+ {%- endif %}
114
+ {%- endfor %}
115
+ {%- if add_generation_prompt %}
116
+ {{- '<|im_start|>assistant\n' }}
117
+ {%- endif %}
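
The template above serializes assistant tool calls as nested `<tool_call>` / `<function=...>` / `<parameter=...>` blocks. As a rough illustration of that wire format, here is a minimal Python sketch that renders one call the way the template's assistant branch does and parses it back with regular expressions. The helper names are hypothetical, `json.dumps` only approximates Jinja's `tojson` filter (which escapes some characters differently), and the parser is far simpler than the vLLM plugin added in this commit: it returns every parameter as a string and does no schema-based type conversion.

```python
import json
import re


def render_tool_call(name: str, arguments: dict) -> str:
    """Render one tool call in the layout the chat template emits."""
    parts = [f"\n<tool_call>\n<function={name}>\n"]
    for key, value in arguments.items():
        # Mappings and non-string sequences are JSON-encoded; other values are stringified.
        if isinstance(value, (dict, list, tuple)):
            text = json.dumps(value)
        else:
            text = str(value)
        parts.append(f"<parameter={key}>\n{text}\n</parameter>\n")
    parts.append("</function>\n</tool_call>")
    return "".join(parts)


FUNCTION_RE = re.compile(r"<function=([^>]+)>(.*?)</function>", re.DOTALL)
PARAM_RE = re.compile(r"<parameter=([^>]+)>\n?(.*?)\n?</parameter>", re.DOTALL)


def parse_tool_calls(text: str) -> list:
    """Extract function names and raw string parameters from model output."""
    calls = []
    for name, body in FUNCTION_RE.findall(text):
        params = {key: value for key, value in PARAM_RE.findall(body)}
        calls.append({"name": name, "arguments": params})
    return calls


rendered = render_tool_call("read_file", {"path": "src/app.py", "lines": [1, 20]})
print(parse_tool_calls(rendered))
```

Keeping parameter values as raw text between tags, rather than inside a JSON object, lets multi-line values such as code snippets pass through without escaping, which is presumably why the parser has to convert types afterwards using the tool schema.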
config.json ADDED
@@ -0,0 +1,41 @@
1
+ {
2
+ "architectures": [
3
+ "Qwen3ForCausalLM"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "eos_token_id": 151645,
8
+ "head_dim": 128,
9
+ "hidden_act": "silu",
10
+ "hidden_size": 5120,
11
+ "initializer_range": 0.02,
12
+ "intermediate_size": 25600,
13
+ "max_position_embeddings": 131072,
14
+ "max_window_layers": 64,
15
+ "model_type": "qwen3",
16
+ "num_attention_heads": 64,
17
+ "num_hidden_layers": 64,
18
+ "num_key_value_heads": 8,
19
+ "pad_token_id": 151643,
20
+ "rms_norm_eps": 1e-06,
21
+ "rope_scaling": null,
22
+ "rope_theta": 1000000,
23
+ "sliding_window": null,
24
+ "tie_word_embeddings": false,
25
+ "torch_dtype": "bfloat16",
26
+ "transformers_version": "4.52.3",
27
+ "use_cache": false,
28
+ "use_sliding_window": false,
29
+ "vocab_size": 151936,
30
+ "quantization_config": {
31
+ "quant_method": "exl3",
32
+ "version": "0.0.7",
33
+ "bits": 3.0,
34
+ "head_bits": 6,
35
+ "calibration": {
36
+ "rows": 100,
37
+ "cols": 2048
38
+ },
39
+ "out_scales": "auto"
40
+ }
41
+ }
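
As a sanity check on the `quantization_config` above, the architecture fields are enough to estimate the checkpoint size. The sketch below counts the linear-layer weights implied by `config.json` and prices them at `bits` per weight for the transformer blocks and `head_bits` for the output head. It assumes the input embedding table is stored at 16 bits per weight and ignores norm weights and quantization metadata; both are assumptions about the EXL3 layout, not facts stated in this commit.

```python
# Architecture values copied from config.json in this commit.
CONFIG = {
    "hidden_size": 5120,
    "intermediate_size": 25600,
    "num_hidden_layers": 64,
    "num_attention_heads": 64,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "vocab_size": 151936,
}
QUANT = {"bits": 3.0, "head_bits": 6}
# Sizes in bytes of the two safetensors shards listed later in this commit.
SHARD_BYTES = [8514816144, 5345171344]


def count_weights(cfg: dict) -> dict:
    """Count linear-layer weights; small norm vectors are ignored."""
    hidden = cfg["hidden_size"]
    q_dim = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_dim = cfg["num_key_value_heads"] * cfg["head_dim"]
    attention = hidden * q_dim + 2 * hidden * kv_dim + q_dim * hidden  # q, k, v, o
    mlp = 3 * hidden * cfg["intermediate_size"]  # gate, up, down
    embedding = cfg["vocab_size"] * hidden
    return {
        "layers": cfg["num_hidden_layers"] * (attention + mlp),
        "embed_tokens": embedding,
        "lm_head": embedding,  # tie_word_embeddings is false, so counted separately
    }


def estimate_bytes(weights: dict, quant: dict, embed_bits: int = 16) -> float:
    """Estimate file size; embed_bits=16 is an assumption about the stored embeddings."""
    body = weights["layers"] * quant["bits"] / 8
    head = weights["lm_head"] * quant["head_bits"] / 8
    embed = weights["embed_tokens"] * embed_bits / 8
    return body + head + embed


weights = count_weights(CONFIG)
estimate = estimate_bytes(weights, QUANT)
actual = sum(SHARD_BYTES)
print(f"parameters: {sum(weights.values()) / 1e9:.2f}B")
print(f"estimated: {estimate / 1e9:.2f} GB, shards: {actual / 1e9:.2f} GB")
```

Under these assumptions the estimate lands within about 1% of the combined shard sizes, which suggests the 3.0-bit setting applies to essentially all transformer-block weights.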
generation_config.json ADDED
@@ -0,0 +1,13 @@
1
+ {
2
+ "bos_token_id": 151643,
3
+ "do_sample": true,
4
+ "eos_token_id": [
5
+ 151645,
6
+ 151643
7
+ ],
8
+ "pad_token_id": 151643,
9
+ "temperature": 0.6,
10
+ "top_k": 20,
11
+ "top_p": 0.95,
12
+ "transformers_version": "4.52.3"
13
+ }
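
To make the sampling defaults above concrete (`temperature` 0.6, `top_k` 20, `top_p` 0.95), here is a small pure-Python sketch of how the three settings combine when picking one token from a list of logits. It illustrates the general idea only and is not the `transformers` implementation; the function name and the plain-list interface are hypothetical.

```python
import math
import random


def sample_token(logits, temperature=0.6, top_k=20, top_p=0.95, rng=random):
    """Pick a token index using temperature scaling, then top-k, then top-p filtering."""
    # Temperature below 1 sharpens the distribution before the softmax.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    probs = [value / total for value in exps]

    # Top-k: keep only the k most likely token indices.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Top-p: keep the smallest prefix whose renormalized mass reaches top_p.
    mass = sum(probs[i] for i in ranked)
    kept, cumulative = [], 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index] / mass
        if cumulative >= top_p:
            break

    # Sample among the survivors in proportion to their probabilities.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
```

For example, `sample_token([2.0, 1.0, 0.5, -1.0], rng=random.Random(0))` draws from the few most likely indices, while a very peaked input such as `[10.0, 0.0, 0.0]` always returns index 0 because the first token alone exceeds the `top_p` mass.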
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model-00001-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:941efd3f55bd64dd7eeddda60555a6e796b847281a8b3ed8475c55cc3f5bbc5a
3
+ size 8514816144
model-00002-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b622ab2487cc96f2aab220a2a4f5d97377a3fdf66ef156dd33b6d468aed6305f
3
+ size 5345171344
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
quantization_config.json ADDED
The diff for this file is too large to render. See raw diff
 
qwen3coder_tool_parser.py ADDED
@@ -0,0 +1,689 @@
1
+ # SPDX-License-Identifier: Apache-2.0
2
+ # SPDX-FileCopyrightText: Copyright contributors to the vLLM project
3
+ import ast
4
+ import json
5
+ import uuid
6
+ from collections.abc import Sequence
7
+ from typing import Any, List, Optional, Union
8
+
9
+ import regex as re
10
+
11
+ from vllm.entrypoints.openai.protocol import (ChatCompletionRequest,
12
+ ChatCompletionToolsParam,
13
+ DeltaFunctionCall, DeltaMessage,
14
+ DeltaToolCall,
15
+ ExtractedToolCallInformation,
16
+ FunctionCall, ToolCall)
17
+ from vllm.entrypoints.openai.tool_parsers.abstract_tool_parser import (
18
+ ToolParser, ToolParserManager)
19
+ from vllm.logger import init_logger
20
+ from vllm.transformers_utils.tokenizer import AnyTokenizer
21
+
22
+ logger = init_logger(__name__)
23
+
24
+
25
+ @ToolParserManager.register_module("qwen3_coder")
26
+ class Qwen3CoderToolParser(ToolParser):
27
+
28
+ def __init__(self, tokenizer: AnyTokenizer):
29
+ super().__init__(tokenizer)
30
+
31
+ self.current_tool_name_sent: bool = False
32
+ self.prev_tool_call_arr: list[dict] = []
33
+ self.current_tool_id: int = -1
34
+ self.streamed_args_for_tool: list[str] = []
35
+
36
+ # Sentinel tokens for streaming mode
37
+ self.tool_call_start_token: str = "<tool_call>"
38
+ self.tool_call_end_token: str = "</tool_call>"
39
+ self.tool_call_prefix: str = "<function="
40
+ self.function_end_token: str = "</function>"
41
+ self.parameter_prefix: str = "<parameter="
42
+ self.parameter_end_token: str = "</parameter>"
43
+ self.is_tool_call_started: bool = False
44
+ self.failed_count: int = 0
45
+
46
+ # Enhanced streaming state - reset for each new message
47
+ self._reset_streaming_state()
48
+
49
+ # Regex patterns
50
+ self.tool_call_complete_regex = re.compile(
51
+ r"<tool_call>(.*?)</tool_call>", re.DOTALL)
52
+ self.tool_call_regex = re.compile(
53
+ r"<tool_call>(.*?)</tool_call>|<tool_call>(.*?)$", re.DOTALL)
54
+ self.tool_call_function_regex = re.compile(
55
+ r"<function=(.*?)</function>|<function=(.*)$", re.DOTALL)
56
+ self.tool_call_parameter_regex = re.compile(
57
+ r"<parameter=(.*?)(?:</parameter>|(?=<parameter=)|(?=</function>)|$)",
58
+ re.DOTALL)
59
+
60
+ if not self.model_tokenizer:
61
+ raise ValueError(
62
+ "The model tokenizer must be passed to the ToolParser "
63
+ "constructor during construction.")
64
+
65
+ self.tool_call_start_token_id = self.vocab.get(
66
+ self.tool_call_start_token)
67
+ self.tool_call_end_token_id = self.vocab.get(self.tool_call_end_token)
68
+
69
+ if self.tool_call_start_token_id is None or self.tool_call_end_token_id is None:
70
+ raise RuntimeError(
71
+ "Qwen3 XML Tool parser could not locate tool call start/end "
72
+ "tokens in the tokenizer!")
73
+
74
+ logger.info(
75
+ f"vLLM successfully imported tool parser {self.__class__.__name__}!"
76
+ )
77
+
78
+ def _generate_tool_call_id(self) -> str:
79
+ """Generate a unique tool call ID."""
80
+ return f"call_{uuid.uuid4().hex[:24]}"
81
+
82
+ def _reset_streaming_state(self):
83
+ """Reset all streaming state."""
84
+ self.current_tool_index = 0
85
+ self.is_tool_call_started = False
86
+ self.header_sent = False
87
+ self.current_tool_id = None
88
+ self.current_function_name = None
89
+ self.current_param_name = None
90
+ self.current_param_value = ""
91
+ self.param_count = 0
92
+ self.in_param = False
93
+ self.in_function = False
94
+ self.accumulated_text = ""
95
+ self.json_started = False
96
+ self.json_closed = False
97
+ # Store accumulated parameters for type conversion
98
+ self.accumulated_params = {}
99
+ self.streaming_request = None
100
+
101
+ def _get_arguments_config(
102
+ self, func_name: str,
103
+ tools: Optional[list[ChatCompletionToolsParam]]) -> dict:
104
+ """Extract argument configuration for a function."""
105
+ if tools is None:
106
+ return {}
107
+ for config in tools:
108
+ if not hasattr(config, "type") or not (hasattr(
109
+ config, "function") and hasattr(config.function, "name")):
110
+ continue
111
+ if config.type == "function" and config.function.name == func_name:
112
+ if not hasattr(config.function, "parameters"):
113
+ return {}
114
+ params = config.function.parameters
115
+ if isinstance(params, dict) and "properties" in params:
116
+ return params["properties"]
117
+ elif isinstance(params, dict):
118
+ return params
119
+ else:
120
+ return {}
121
+ logger.warning(f"Tool '{func_name}' is not defined in the tools list.")
122
+ return {}
123
+
124
+ def _convert_param_value(self, param_value: str, param_name: str,
125
+ param_config: dict, func_name: str) -> Any:
126
+ """Convert parameter value based on its type in the schema."""
127
+ # Handle null value for any type
128
+ if param_value.lower() == "null":
129
+ return None
130
+
131
+ if param_name not in param_config:
132
+ if param_config != {}:
133
+ logger.warning(
134
+ f"Parsed parameter '{param_name}' is not defined in the tool "
135
+ f"parameters for tool '{func_name}', directly returning the string value."
136
+ )
137
+ return param_value
138
+
139
+ if isinstance(param_config[param_name],
140
+ dict) and "type" in param_config[param_name]:
141
+ param_type = str(param_config[param_name]["type"]).strip().lower()
142
+ else:
143
+ param_type = "string"
144
+ if param_type in ["string", "str", "text", "varchar", "char", "enum"]:
145
+ return param_value
146
+ elif param_type.startswith("int") or param_type.startswith(
147
+ "uint") or param_type.startswith(
148
+ "long") or param_type.startswith(
149
+ "short") or param_type.startswith("unsigned"):
150
+ try:
151
+ param_value = int(param_value)
152
+ except Exception:
153
+ logger.warning(
154
+ f"Parsed value '{param_value}' of parameter '{param_name}' is not an integer in tool "
155
+ f"'{func_name}', degenerating to string.")
156
+ return param_value
157
+ elif param_type.startswith("num") or param_type.startswith("float"):
158
+ try:
159
+ float_param_value = float(param_value)
160
+ param_value = float_param_value if float_param_value - int(
161
+ float_param_value) != 0 else int(float_param_value)
162
+ except Exception:
163
+ logger.warning(
164
+ f"Parsed value '{param_value}' of parameter '{param_name}' is not a float in tool "
165
+ f"'{func_name}', degenerating to string.")
166
+ return param_value
167
+ elif param_type in ["boolean", "bool", "binary"]:
168
+ param_value = param_value.lower()
169
+ if param_value not in ["true", "false"]:
170
+ logger.warning(
171
+ f"Parsed value '{param_value}' of parameter '{param_name}' is not a boolean (`true` or `false`) in tool '{func_name}', degenerating to false."
172
+ )
173
+ return param_value == "true"
174
+ else:
175
+ if param_type in ["object", "array", "arr"
176
+ ] or param_type.startswith(
177
+ "dict") or param_type.startswith("list"):
178
+ try:
179
+ param_value = json.loads(param_value)
180
+ return param_value
181
+ except Exception:
182
+ logger.warning(
183
+ f"Parsed value '{param_value}' of parameter '{param_name}' cannot be parsed with json.loads in tool "
184
+ f"'{func_name}', will try other methods to parse it.")
185
+ try:
186
+ param_value = ast.literal_eval(param_value) # safer
187
+ except Exception:
188
+ logger.warning(
189
+ f"Parsed value '{param_value}' of parameter '{param_name}' cannot be converted via Python `ast.literal_eval()` in tool '{func_name}', degenerating to string."
190
+ )
191
+ return param_value
192
+
193
+ def _parse_xml_function_call(
194
+ self, function_call_str: str,
195
+ tools: Optional[list[ChatCompletionToolsParam]]
196
+ ) -> Optional[ToolCall]:
197
+
198
+ # Extract function name
199
+ end_index = function_call_str.index(">")
200
+ function_name = function_call_str[:end_index]
201
+ param_config = self._get_arguments_config(function_name, tools)
202
+ parameters = function_call_str[end_index + 1:]
203
+ param_dict = {}
204
+ for match_text in self.tool_call_parameter_regex.findall(parameters):
205
+ idx = match_text.index(">")
206
+ param_name = match_text[:idx]
207
+ param_value = str(match_text[idx + 1:])
208
+ # Remove prefix and trailing \n
209
+ if param_value.startswith("\n"):
210
+ param_value = param_value[1:]
211
+ if param_value.endswith("\n"):
212
+ param_value = param_value[:-1]
213
+
214
+ param_dict[param_name] = self._convert_param_value(
215
+ param_value, param_name, param_config, function_name)
216
+ return ToolCall(
217
+ type="function",
218
+ function=FunctionCall(name=function_name,
219
+ arguments=json.dumps(param_dict,
220
+ ensure_ascii=False)),
221
+ )
222
+
223
+ def _get_function_calls(self, model_output: str) -> List[str]:
224
+ # Find all tool calls
225
+ matched_ranges = self.tool_call_regex.findall(model_output)
226
+ raw_tool_calls = [
227
+ match[0] if match[0] else match[1] for match in matched_ranges
228
+ ]
229
+
230
+ # Back-off strategy if no tool_call tags found
231
+ if len(raw_tool_calls) == 0:
232
+ raw_tool_calls = [model_output]
233
+
234
+ raw_function_calls = []
235
+ for tool_call in raw_tool_calls:
236
+ raw_function_calls.extend(
237
+ self.tool_call_function_regex.findall(tool_call))
238
+
239
+ function_calls = [
240
+ match[0] if match[0] else match[1] for match in raw_function_calls
241
+ ]
242
+ return function_calls
243
+
244
+ def extract_tool_calls(
245
+ self,
246
+ model_output: str,
247
+ request: ChatCompletionRequest,
248
+ ) -> ExtractedToolCallInformation:
249
+ # Quick check to avoid unnecessary processing
250
+ if self.tool_call_prefix not in model_output:
251
+ return ExtractedToolCallInformation(tools_called=False,
252
+ tool_calls=[],
253
+ content=model_output)
254
+
255
+ try:
256
+ function_calls = self._get_function_calls(model_output)
257
+ if len(function_calls) == 0:
258
+ return ExtractedToolCallInformation(tools_called=False,
259
+ tool_calls=[],
260
+ content=model_output)
261
+
262
+ tool_calls = [
263
+ self._parse_xml_function_call(function_call_str, request.tools)
264
+ for function_call_str in function_calls
265
+ ]
266
+
267
+ # Populate prev_tool_call_arr for serving layer to set finish_reason
268
+ self.prev_tool_call_arr.clear() # Clear previous calls
269
+ for tool_call in tool_calls:
270
+ if tool_call:
271
+ self.prev_tool_call_arr.append({
272
+ "name":
273
+ tool_call.function.name,
274
+ "arguments":
275
+ tool_call.function.arguments,
276
+ })
277
+
278
+ # Extract content before tool calls
279
+ content_index = model_output.find(self.tool_call_start_token)
280
+ content_index = content_index if content_index >= 0 else model_output.find(
281
+ self.tool_call_prefix)
282
+ content = model_output[:content_index] # .rstrip()
283
+
284
+ return ExtractedToolCallInformation(
285
+ tools_called=(len(tool_calls) > 0),
286
+ tool_calls=tool_calls,
287
+ content=content if content else None,
288
+ )
289
+
290
+ except Exception:
291
+ logger.exception("Error in extracting tool call from response.")
292
+ return ExtractedToolCallInformation(tools_called=False,
293
+ tool_calls=[],
294
+ content=model_output)
295
+
296
+ def extract_tool_calls_streaming(
297
+ self,
298
+ previous_text: str,
299
+ current_text: str,
300
+ delta_text: str,
301
+ previous_token_ids: Sequence[int],
302
+ current_token_ids: Sequence[int],
303
+ delta_token_ids: Sequence[int],
304
+ request: ChatCompletionRequest,
305
+ ) -> Union[DeltaMessage, None]:
306
+ # Store request for type conversion
307
+ if not previous_text:
308
+ self._reset_streaming_state()
309
+ self.streaming_request = request
310
+
311
+ # If no delta text, return None unless it's an EOS token after tool calls
312
+ if not delta_text:
313
+ # Check if this is an EOS token after all tool calls are complete
314
+ # We check for tool calls in the text even if is_tool_call_started is False
315
+ # because it might have been reset after processing all tools
316
+ if delta_token_ids and self.tool_call_end_token_id not in delta_token_ids:
317
+ # Count complete tool calls
318
+ complete_calls = len(
319
+ self.tool_call_complete_regex.findall(current_text))
320
+
321
+ # If we have completed tool calls and populated prev_tool_call_arr
322
+ if complete_calls > 0 and len(self.prev_tool_call_arr) > 0:
323
+ # Check if all tool calls are closed
324
+ open_calls = current_text.count(
325
+ self.tool_call_start_token) - current_text.count(
326
+ self.tool_call_end_token)
327
+ if open_calls == 0:
328
+ # Return empty delta message to allow finish_reason processing
329
+ return DeltaMessage(content="")
330
+ elif not self.is_tool_call_started and current_text:
331
+ # This is a regular content response that's now complete
332
+ return DeltaMessage(content="")
333
+ return None
334
+
335
+ # Update accumulated text
336
+ self.accumulated_text = current_text
337
+
338
+ # Check if we need to advance to next tool
339
+ if self.json_closed and not self.in_function:
340
+ # Check if this tool call has ended
341
+ tool_ends = current_text.count(self.tool_call_end_token)
342
+ if tool_ends > self.current_tool_index:
343
+ # This tool has ended, advance to next
344
+ self.current_tool_index += 1
345
+ self.header_sent = False
346
+ self.param_count = 0
347
+ self.json_started = False
348
+ self.json_closed = False
349
+ self.accumulated_params = {}
350
+
351
+ # Check if there are more tool calls
352
+ tool_starts = current_text.count(self.tool_call_start_token)
353
+ if self.current_tool_index >= tool_starts:
354
+ # No more tool calls
355
+ self.is_tool_call_started = False
356
+ # Continue processing next tool
357
+ return None
358
+
359
+ # Handle normal content before tool calls
360
+ if not self.is_tool_call_started:
361
+ # Check if tool call is starting
362
+ if self.tool_call_start_token_id in delta_token_ids or self.tool_call_start_token in delta_text:
363
+ self.is_tool_call_started = True
364
+ # Return any content before the tool call
365
+ if self.tool_call_start_token in delta_text:
366
+ content_before = delta_text[:delta_text.index(
367
+ self.tool_call_start_token)]
368
+ if content_before:
369
+ return DeltaMessage(content=content_before)
370
+ return None
371
+ else:
372
+ # Check if we're between tool calls - skip whitespace
373
+ if current_text.rstrip().endswith(self.tool_call_end_token):
374
+ # We just ended a tool call, skip whitespace
375
+ if delta_text.strip() == "":
376
+ return None
377
+ # Normal content, no tool call
378
+ return DeltaMessage(content=delta_text)
379
+
380
+ # Check if we're between tool calls (waiting for next one)
381
+ # Count tool calls we've seen vs processed
382
+ tool_starts_count = current_text.count(self.tool_call_start_token)
383
+ if self.current_tool_index >= tool_starts_count:
384
+ # We're past all tool calls, shouldn't be here
385
+ return None
386
+
387
+ # We're in a tool call, find the current tool call portion
388
+ # Need to find the correct tool call based on current_tool_index
389
+ tool_starts = []
390
+ idx = 0
391
+ while True:
392
+ idx = current_text.find(self.tool_call_start_token, idx)
393
+ if idx == -1:
394
+ break
395
+ tool_starts.append(idx)
396
+ idx += len(self.tool_call_start_token)
397
+
398
+ if self.current_tool_index >= len(tool_starts):
399
+ # No more tool calls to process yet
400
+ return None
401
+
402
+ tool_start_idx = tool_starts[self.current_tool_index]
403
+ # Find where this tool call ends (or current position if not ended yet)
404
+ tool_end_idx = current_text.find(self.tool_call_end_token,
405
+ tool_start_idx)
406
+ if tool_end_idx == -1:
407
+ tool_text = current_text[tool_start_idx:]
408
+ else:
409
+ tool_text = current_text[tool_start_idx:tool_end_idx +
410
+ len(self.tool_call_end_token)]
411
+
412
+ # Looking for function header
413
+ if not self.header_sent:
414
+ if self.tool_call_prefix in tool_text:
415
+ func_start = tool_text.find(self.tool_call_prefix) + len(
416
+ self.tool_call_prefix)
417
+ func_end = tool_text.find(">", func_start)
418
+
419
+ if func_end != -1:
420
+ # Found complete function name
421
+ self.current_function_name = tool_text[func_start:func_end]
422
+ self.current_tool_id = self._generate_tool_call_id()
423
+ self.header_sent = True
424
+ self.in_function = True
425
+
426
+ # IMPORTANT: Add to prev_tool_call_arr immediately when we detect a tool call
427
+ # This ensures finish_reason="tool_calls" even if parsing isn't complete
428
+ already_added = any(
429
+ tool.get("name") == self.current_function_name
430
+ for tool in self.prev_tool_call_arr)
431
+ if not already_added:
432
+ self.prev_tool_call_arr.append({
433
+ "name": self.current_function_name,
434
+ "arguments":
435
+ "{}", # Placeholder, will be updated later
436
+ })
437
+
438
+ # Send header with function info
439
+ return DeltaMessage(tool_calls=[
440
+ DeltaToolCall(
441
+ index=self.current_tool_index,
442
+ id=self.current_tool_id,
443
+ function=DeltaFunctionCall(
444
+ name=self.current_function_name, arguments=""),
445
+ type="function",
446
+ )
447
+ ])
448
+ return None
449
+
450
+ # We've sent header, now handle function body
451
+ if self.in_function:
452
+ # Send opening brace if not sent yet
453
+ if not self.json_started and self.parameter_prefix not in delta_text:
454
+ self.json_started = True
455
+ return DeltaMessage(tool_calls=[
456
+ DeltaToolCall(
457
+ index=self.current_tool_index,
458
+ function=DeltaFunctionCall(arguments="{"),
459
+ )
460
+ ])
461
+
462
+ # Make sure json_started is set if we're processing parameters
463
+ if not self.json_started:
464
+ self.json_started = True
465
+
466
+ # Check for function end in accumulated text
467
+ if not self.json_closed and self.function_end_token in tool_text:
468
+ # Close JSON
469
+ self.json_closed = True
470
+
471
+ # Extract the complete tool call to update prev_tool_call_arr with final arguments
472
+ # Find the function content
473
+ func_start = tool_text.find(self.tool_call_prefix) + len(
474
+ self.tool_call_prefix)
475
+ func_content_end = tool_text.find(self.function_end_token,
476
+ func_start)
477
+ if func_content_end != -1:
478
+ func_content = tool_text[func_start:func_content_end]
479
+ # Parse to get the complete arguments
480
+ try:
481
+ parsed_tool = self._parse_xml_function_call(
482
+ func_content, self.streaming_request.tools
483
+ if self.streaming_request else None)
484
+ if parsed_tool:
485
+ # Update existing entry in prev_tool_call_arr with complete arguments
486
+ for i, tool in enumerate(self.prev_tool_call_arr):
487
+ if tool.get(
488
+ "name") == parsed_tool.function.name:
489
+ self.prev_tool_call_arr[i][
490
+ "arguments"] = parsed_tool.function.arguments
491
+ break
492
+ except Exception:
493
+ pass # Ignore parsing errors during streaming
494
+
495
+ result = DeltaMessage(tool_calls=[
496
+ DeltaToolCall(
497
+ index=self.current_tool_index,
498
+ function=DeltaFunctionCall(arguments="}"),
499
+ )
500
+ ])
501
+
502
+ # Reset state for next tool
503
+ self.in_function = False
504
+ self.json_closed = True
505
+ self.accumulated_params = {}
506
+
507
+ return result
508
+
509
+ # Look for parameters
510
+ # Find all parameter starts
511
+ param_starts = []
512
+ idx = 0
513
+ while True:
514
+ idx = tool_text.find(self.parameter_prefix, idx)
515
+ if idx == -1:
516
+ break
517
+ param_starts.append(idx)
518
+ idx += len(self.parameter_prefix)
519
+
520
+ # Check if we should start a new parameter
521
+ if not self.in_param and self.param_count < len(param_starts):
522
+
523
+ if len(param_starts) > self.param_count:
524
+ # Process the next parameter
525
+ param_idx = param_starts[self.param_count]
526
+ param_start = param_idx + len(self.parameter_prefix)
527
+ remaining = tool_text[param_start:]
528
+
529
+ if ">" in remaining:
530
+ # We have the complete parameter name
531
+ name_end = remaining.find(">")
532
+ self.current_param_name = remaining[:name_end]
533
+
534
+ # Find the parameter value
535
+ value_start = param_start + name_end + 1
536
+ value_text = tool_text[value_start:]
537
+ if value_text.startswith("\n"):
538
+ value_text = value_text[1:]
539
+
540
+ # Find where this parameter ends
541
+ param_end_idx = value_text.find(
542
+ self.parameter_end_token)
543
+ if param_end_idx == -1:
544
+ # No closing tag, look for next parameter or function end
545
+ next_param_idx = value_text.find(
546
+ self.parameter_prefix)
547
+ func_end_idx = value_text.find(
548
+ self.function_end_token)
549
+
550
+ if next_param_idx != -1 and (func_end_idx == -1
551
+ or next_param_idx
552
+ < func_end_idx):
553
+ param_end_idx = next_param_idx
554
+ elif func_end_idx != -1:
555
+ param_end_idx = func_end_idx
556
+ else:
557
+ # Neither found, check if tool call is complete
558
+ if self.tool_call_end_token in tool_text:
559
+ # Tool call is complete, so parameter must be complete too
560
+ # Use all remaining text before function end as value
561
+ param_end_idx = len(value_text)
562
+ else:
563
+ # Still streaming, wait for more content
564
+ return None
565
+
566
+ if param_end_idx != -1:
567
+ # Complete parameter found
568
+ param_value = value_text[:param_end_idx]
569
+ if param_value.endswith("\n"):
570
+ param_value = param_value[:-1]
571
+
572
+ # Store raw value for later processing
573
+ self.accumulated_params[
574
+ self.current_param_name] = param_value
575
+
576
+ # Get parameter configuration for type conversion
577
+ param_config = self._get_arguments_config(
578
+ self.current_function_name,
579
+ self.streaming_request.tools
580
+ if self.streaming_request else None)
581
+
582
+ # Convert the parameter value to the appropriate type
583
+ converted_value = self._convert_param_value(
584
+ param_value, self.current_param_name,
585
+ param_config, self.current_function_name)
586
+
587
+ # Build JSON fragment based on the converted type
588
+ # Use json.dumps to properly serialize the value
589
+ serialized_value = json.dumps(converted_value,
590
+ ensure_ascii=False)
591
+
592
+ if self.param_count == 0:
593
+ json_fragment = f'"{self.current_param_name}": {serialized_value}'
594
+ else:
595
+ json_fragment = f', "{self.current_param_name}": {serialized_value}'
596
+
597
+ self.param_count += 1
598
+
599
+ return DeltaMessage(tool_calls=[
600
+ DeltaToolCall(
601
+ index=self.current_tool_index,
602
+ function=DeltaFunctionCall(
603
+ arguments=json_fragment),
604
+ )
605
+ ])
606
+
607
+ # Continue parameter value - Not used in the current implementation
608
+ # since we process complete parameters above
609
+ if self.in_param:
610
+ if self.parameter_end_token in delta_text:
611
+ # End of parameter
612
+ end_idx = delta_text.find(self.parameter_end_token)
613
+ value_chunk = delta_text[:end_idx]
614
+
615
+ # Skip past > if at start
616
+ if not self.current_param_value and ">" in value_chunk:
617
+ gt_idx = value_chunk.find(">")
618
+ value_chunk = value_chunk[gt_idx + 1:]
619
+
620
+ if not self.current_param_value and value_chunk.startswith(
621
+ "\n"):
622
+ value_chunk = value_chunk[1:]
623
+
624
+ # Store complete value
625
+ full_value = self.current_param_value + value_chunk
626
+ self.accumulated_params[
627
+ self.current_param_name] = full_value
628
+
629
+ # Get parameter configuration for type conversion
630
+ param_config = self._get_arguments_config(
631
+ self.current_function_name,
632
+ self.streaming_request.tools
633
+ if self.streaming_request else None)
634
+
635
+ # Convert the parameter value to the appropriate type
636
+ converted_value = self._convert_param_value(
637
+ full_value, self.current_param_name, param_config,
638
+ self.current_function_name)
639
+
640
+ # Serialize the converted value
641
+ serialized_value = json.dumps(converted_value,
642
+ ensure_ascii=False)
643
+
644
+ # Since we've been streaming the quoted version, we need to close it properly
645
+ # This is complex - for now just complete the value
646
+ self.in_param = False
647
+ self.current_param_value = ""
648
+
649
+ # Just close the current parameter string
650
+ return DeltaMessage(tool_calls=[
651
+ DeltaToolCall(
652
+ index=self.current_tool_index,
653
+ function=DeltaFunctionCall(
654
+ arguments='"'), # Close the string quote
655
+ )
656
+ ])
657
+ else:
658
+ # Continue accumulating value
659
+ value_chunk = delta_text
660
+
661
+ # Handle first chunk after param name
662
+ if not self.current_param_value and ">" in value_chunk:
663
+ gt_idx = value_chunk.find(">")
664
+ value_chunk = value_chunk[gt_idx + 1:]
665
+
666
+ if not self.current_param_value and value_chunk.startswith(
667
+ "\n"):
668
+ value_chunk = value_chunk[1:]
669
+
670
+ if value_chunk:
671
+ # Stream the escaped delta
672
+ prev_escaped = json.dumps(
673
+ self.current_param_value, ensure_ascii=False
674
+ )[1:-1] if self.current_param_value else ""
675
+ self.current_param_value += value_chunk
676
+ full_escaped = json.dumps(self.current_param_value,
677
+ ensure_ascii=False)[1:-1]
678
+ delta_escaped = full_escaped[len(prev_escaped):]
679
+
680
+ if delta_escaped:
681
+ return DeltaMessage(tool_calls=[
682
+ DeltaToolCall(
683
+ index=self.current_tool_index,
684
+ function=DeltaFunctionCall(
685
+ arguments=delta_escaped),
686
+ )
687
+ ])
688
+
689
+ return None
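The streaming branch above emits the tool-call arguments as a sequence of string fragments: an opening `{`, one `"name": value` fragment per completed parameter (comma-prefixed after the first), and a closing `}`. The following is a minimal sketch of that assembly and of the escaped-delta computation used in the `in_param` branch. It assumes the fragments are simply concatenated by the client; the helper names `assemble_argument_fragments` and `escaped_delta` are hypothetical and are not part of the committed parser.

```python
import json


def assemble_argument_fragments(params):
    """Yield argument fragments in the order the parser above emits them."""
    yield "{"  # opening brace, sent once per tool call
    for count, (name, value) in enumerate(params.items()):
        # Each completed parameter is serialized on its own with json.dumps.
        serialized = json.dumps(value, ensure_ascii=False)
        prefix = "" if count == 0 else ", "  # comma only after the first one
        yield f'{prefix}"{name}": {serialized}'
    yield "}"  # closing brace, sent once the function end token is seen


def escaped_delta(previous, chunk):
    """Return only the newly escaped characters for a growing string value."""
    prev_escaped = json.dumps(previous, ensure_ascii=False)[1:-1] if previous else ""
    full_escaped = json.dumps(previous + chunk, ensure_ascii=False)[1:-1]
    # JSON escapes each character independently, so the old escaped text
    # is always a prefix of the new one and can be sliced off.
    return full_escaped[len(prev_escaped):]


fragments = list(assemble_argument_fragments({"path": "src/main.py", "line": 12}))
arguments = json.loads("".join(fragments))
```

Concatenating the fragments gives a document that `json.loads` accepts, which is the property a client relies on when it accumulates `arguments` deltas. Like the parser, this sketch does not escape parameter names.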
special_tokens_map.json ADDED
@@ -0,0 +1,31 @@
1
+ {
2
+ "additional_special_tokens": [
3
+ "<|im_start|>",
4
+ "<|im_end|>",
5
+ "<|object_ref_start|>",
6
+ "<|object_ref_end|>",
7
+ "<|box_start|>",
8
+ "<|box_end|>",
9
+ "<|quad_start|>",
10
+ "<|quad_end|>",
11
+ "<|vision_start|>",
12
+ "<|vision_end|>",
13
+ "<|vision_pad|>",
14
+ "<|image_pad|>",
15
+ "<|video_pad|>"
16
+ ],
17
+ "eos_token": {
18
+ "content": "<|im_end|>",
19
+ "lstrip": false,
20
+ "normalized": false,
21
+ "rstrip": false,
22
+ "single_word": false
23
+ },
24
+ "pad_token": {
25
+ "content": "<|endoftext|>",
26
+ "lstrip": false,
27
+ "normalized": false,
28
+ "rstrip": false,
29
+ "single_word": false
30
+ }
31
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4
3
+ size 11422654
tokenizer_config.json ADDED
@@ -0,0 +1,240 @@
1
+ {
2
+ "add_bos_token": false,
3
+ "add_prefix_space": false,
4
+ "added_tokens_decoder": {
5
+ "151643": {
6
+ "content": "<|endoftext|>",
7
+ "lstrip": false,
8
+ "normalized": false,
9
+ "rstrip": false,
10
+ "single_word": false,
11
+ "special": true
12
+ },
13
+ "151644": {
14
+ "content": "<|im_start|>",
15
+ "lstrip": false,
16
+ "normalized": false,
17
+ "rstrip": false,
18
+ "single_word": false,
19
+ "special": true
20
+ },
21
+ "151645": {
22
+ "content": "<|im_end|>",
23
+ "lstrip": false,
24
+ "normalized": false,
25
+ "rstrip": false,
26
+ "single_word": false,
27
+ "special": true
28
+ },
29
+ "151646": {
30
+ "content": "<|object_ref_start|>",
31
+ "lstrip": false,
32
+ "normalized": false,
33
+ "rstrip": false,
34
+ "single_word": false,
35
+ "special": true
36
+ },
37
+ "151647": {
38
+ "content": "<|object_ref_end|>",
39
+ "lstrip": false,
40
+ "normalized": false,
41
+ "rstrip": false,
42
+ "single_word": false,
43
+ "special": true
44
+ },
45
+ "151648": {
46
+ "content": "<|box_start|>",
47
+ "lstrip": false,
48
+ "normalized": false,
49
+ "rstrip": false,
50
+ "single_word": false,
51
+ "special": true
52
+ },
53
+ "151649": {
54
+ "content": "<|box_end|>",
55
+ "lstrip": false,
56
+ "normalized": false,
57
+ "rstrip": false,
58
+ "single_word": false,
59
+ "special": true
60
+ },
61
+ "151650": {
62
+ "content": "<|quad_start|>",
63
+ "lstrip": false,
64
+ "normalized": false,
65
+ "rstrip": false,
66
+ "single_word": false,
67
+ "special": true
68
+ },
69
+ "151651": {
70
+ "content": "<|quad_end|>",
71
+ "lstrip": false,
72
+ "normalized": false,
73
+ "rstrip": false,
74
+ "single_word": false,
75
+ "special": true
76
+ },
77
+ "151652": {
78
+ "content": "<|vision_start|>",
79
+ "lstrip": false,
80
+ "normalized": false,
81
+ "rstrip": false,
82
+ "single_word": false,
83
+ "special": true
84
+ },
85
+ "151653": {
86
+ "content": "<|vision_end|>",
87
+ "lstrip": false,
88
+ "normalized": false,
89
+ "rstrip": false,
90
+ "single_word": false,
91
+ "special": true
92
+ },
93
+ "151654": {
94
+ "content": "<|vision_pad|>",
95
+ "lstrip": false,
96
+ "normalized": false,
97
+ "rstrip": false,
98
+ "single_word": false,
99
+ "special": true
100
+ },
101
+ "151655": {
102
+ "content": "<|image_pad|>",
103
+ "lstrip": false,
104
+ "normalized": false,
105
+ "rstrip": false,
106
+ "single_word": false,
107
+ "special": true
108
+ },
109
+ "151656": {
110
+ "content": "<|video_pad|>",
111
+ "lstrip": false,
112
+ "normalized": false,
113
+ "rstrip": false,
114
+ "single_word": false,
115
+ "special": true
116
+ },
117
+ "151657": {
118
+ "content": "<tool_call>",
119
+ "lstrip": false,
120
+ "normalized": false,
121
+ "rstrip": false,
122
+ "single_word": false,
123
+ "special": false
124
+ },
125
+ "151658": {
126
+ "content": "</tool_call>",
127
+ "lstrip": false,
128
+ "normalized": false,
129
+ "rstrip": false,
130
+ "single_word": false,
131
+ "special": false
132
+ },
133
+ "151659": {
134
+ "content": "<|fim_prefix|>",
135
+ "lstrip": false,
136
+ "normalized": false,
137
+ "rstrip": false,
138
+ "single_word": false,
139
+ "special": false
140
+ },
141
+ "151660": {
142
+ "content": "<|fim_middle|>",
143
+ "lstrip": false,
144
+ "normalized": false,
145
+ "rstrip": false,
146
+ "single_word": false,
147
+ "special": false
148
+ },
149
+ "151661": {
150
+ "content": "<|fim_suffix|>",
151
+ "lstrip": false,
152
+ "normalized": false,
153
+ "rstrip": false,
154
+ "single_word": false,
155
+ "special": false
156
+ },
157
+ "151662": {
158
+ "content": "<|fim_pad|>",
159
+ "lstrip": false,
160
+ "normalized": false,
161
+ "rstrip": false,
162
+ "single_word": false,
163
+ "special": false
164
+ },
165
+ "151663": {
166
+ "content": "<|repo_name|>",
167
+ "lstrip": false,
168
+ "normalized": false,
169
+ "rstrip": false,
170
+ "single_word": false,
171
+ "special": false
172
+ },
173
+ "151664": {
174
+ "content": "<|file_sep|>",
175
+ "lstrip": false,
176
+ "normalized": false,
177
+ "rstrip": false,
178
+ "single_word": false,
179
+ "special": false
180
+ },
181
+ "151665": {
182
+ "content": "<tool_response>",
183
+ "lstrip": false,
184
+ "normalized": false,
185
+ "rstrip": false,
186
+ "single_word": false,
187
+ "special": false
188
+ },
189
+ "151666": {
190
+ "content": "</tool_response>",
191
+ "lstrip": false,
192
+ "normalized": false,
193
+ "rstrip": false,
194
+ "single_word": false,
195
+ "special": false
196
+ },
197
+ "151667": {
198
+ "content": "<think>",
199
+ "lstrip": false,
200
+ "normalized": false,
201
+ "rstrip": false,
202
+ "single_word": false,
203
+ "special": false
204
+ },
205
+ "151668": {
206
+ "content": "</think>",
207
+ "lstrip": false,
208
+ "normalized": false,
209
+ "rstrip": false,
210
+ "single_word": false,
211
+ "special": false
212
+ }
213
+ },
214
+ "additional_special_tokens": [
215
+ "<|im_start|>",
216
+ "<|im_end|>",
217
+ "<|object_ref_start|>",
218
+ "<|object_ref_end|>",
219
+ "<|box_start|>",
220
+ "<|box_end|>",
221
+ "<|quad_start|>",
222
+ "<|quad_end|>",
223
+ "<|vision_start|>",
224
+ "<|vision_end|>",
225
+ "<|vision_pad|>",
226
+ "<|image_pad|>",
227
+ "<|video_pad|>"
228
+ ],
229
+ "bos_token": null,
230
+ "clean_up_tokenization_spaces": false,
231
+ "eos_token": "<|im_end|>",
232
+ "errors": "replace",
233
+ "extra_special_tokens": {},
234
+ "model_max_length": 131072,
235
+ "pad_token": "<|endoftext|>",
236
+ "padding_side": "right",
237
+ "split_special_tokens": false,
238
+ "tokenizer_class": "Qwen2Tokenizer",
239
+ "unk_token": null
240
+ }
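In the configuration above, the chat and vision markers are flagged `"special": true`, while `<tool_call>`, `</tool_call>`, `<think>` and the FIM markers are flagged `"special": false`. The sketch below separates the two groups using only the standard library. It assumes a hand-copied, abridged excerpt of the file (`CONFIG_EXCERPT`), and the helper `split_added_tokens` is hypothetical.

```python
import json

# Abridged excerpt of the tokenizer_config.json shown above (assumption:
# only the fields needed for this illustration are kept).
CONFIG_EXCERPT = """
{
  "added_tokens_decoder": {
    "151643": {"content": "<|endoftext|>", "special": true},
    "151645": {"content": "<|im_end|>", "special": true},
    "151657": {"content": "<tool_call>", "special": false},
    "151658": {"content": "</tool_call>", "special": false}
  },
  "eos_token": "<|im_end|>",
  "pad_token": "<|endoftext|>",
  "model_max_length": 131072
}
"""


def split_added_tokens(config):
    """Split added tokens into (special, regular) maps of content -> id."""
    special, regular = {}, {}
    for token_id, entry in config["added_tokens_decoder"].items():
        target = special if entry["special"] else regular
        target[entry["content"]] = int(token_id)
    return special, regular


config = json.loads(CONFIG_EXCERPT)
special, regular = split_added_tokens(config)
```

Because the tool-call markers are not flagged as special, they would typically remain in decoded text even when special tokens are skipped. That is presumably what lets the streaming parser match `tool_call_start_token` in `delta_text`, though the commit itself does not state this.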
vocab.json ADDED
The diff for this file is too large to render. See raw diff