kunhunjon committed · verified · Commit 6223ae4 · 1 Parent(s): 4e0e9cf

Add pipeline_tag metadata and model documentation
Files changed (1): README.md (+77, -0)

README.md ADDED

---
language:
- en
license: apache-2.0
pipeline_tag: text-generation
tags:
- chess
- neuron
- aws-trainium
- vllm
- optimum-neuron
base_model: karanps/ChessLM_Qwen3
---

# ChessLM Qwen3 - Neuron Traced for AWS Trainium/Inferentia

This is a Neuron-traced version of [karanps/ChessLM_Qwen3](https://huggingface.co/karanps/ChessLM_Qwen3), compiled for AWS Trainium (trn1) and Inferentia (inf2) instances and intended for serving with vLLM.

## Model Details

- **Base Model**: Qwen3-2B fine-tuned for chess
- **Compilation**: optimum-neuron[vllm]==0.3.0
- **Target Hardware**: AWS Trainium (trn1) / Inferentia (inf2)
- **Precision**: BF16
- **Tensor Parallelism**: 2 cores
- **Batch Size**: 1
- **Max Sequence Length**: 2048

## Requirements

```bash
pip install "optimum-neuron[vllm]==0.3.0"
pip install neuronx-distributed --extra-index-url=https://pip.repos.neuron.amazonaws.com
```
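To confirm the environment before loading the model, you can check that both packages above are installed. This is just an illustrative helper (standard library only), not part of the repository's tooling:

```python
# Illustrative environment check: confirm the packages from the Requirements
# section are installed (names are pip distribution names).
from importlib.metadata import PackageNotFoundError, version

for pkg in ("optimum-neuron", "neuronx-distributed"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```
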
## Usage

### Loading the Model

```python
from optimum.neuron import NeuronModelForCausalLM
from transformers import AutoTokenizer

# Load the traced model
model = NeuronModelForCausalLM.from_pretrained("kunhunjon/ChessLM_Qwen3_Trainium")
tokenizer = AutoTokenizer.from_pretrained("kunhunjon/ChessLM_Qwen3_Trainium")

# Run inference
prompt = "e2e4"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
```
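Because the checkpoint was compiled for vLLM compatibility (see Compilation Details below), it can also be served through vLLM's offline `LLM` API. The snippet below is a minimal sketch that mirrors the trace configuration; exact Neuron-backend behavior and flags depend on the installed `vllm` and `optimum-neuron` versions, so treat it as a starting point rather than a verified serving command.

```python
from vllm import LLM, SamplingParams

# Engine arguments mirror the trace configuration (batch size 1, 2048-token
# sequences, tensor parallelism across 2 Neuron cores); adjust to your instance.
llm = LLM(
    model="kunhunjon/ChessLM_Qwen3_Trainium",
    max_num_seqs=1,
    max_model_len=2048,
    tensor_parallel_size=2,
)

sampling_params = SamplingParams(temperature=0.0, max_tokens=20)
outputs = llm.generate(["e2e4"], sampling_params)
print(outputs[0].outputs[0].text)
```
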
### Hardware Requirements

- AWS Trainium (trn1.32xlarge, trn1.2xlarge) or Inferentia (inf2) instances
- At least 2 Neuron cores (as configured during tracing)
- Minimum 32GB RAM recommended

## Compilation Details

This model was traced with the following parameters (a re-export sketch follows the list):
- `batch_size=1`
- `sequence_length=2048`
- `num_cores=2`
- `auto_cast_type="bf16"`
- vLLM-compatible compilation
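For reference, a trace with the same parameters can be reproduced through optimum-neuron's export path. The snippet below is a sketch of that flow, not the exact command used to build this repository; the output directory name is illustrative.

```python
from optimum.neuron import NeuronModelForCausalLM

# Re-export the base model with the trace parameters listed above (sketch only).
model = NeuronModelForCausalLM.from_pretrained(
    "karanps/ChessLM_Qwen3",
    export=True,
    batch_size=1,
    sequence_length=2048,
    num_cores=2,
    auto_cast_type="bf16",
)
model.save_pretrained("ChessLM_Qwen3_Trainium")  # illustrative local output path
```
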
## License

This model inherits the license from the base model [karanps/ChessLM_Qwen3](https://huggingface.co/karanps/ChessLM_Qwen3).

## Citation

If you use this model, please cite the original ChessLM model and the AWS Neuron tools.