orionweller committed on
Commit 912f86e · verified · 1 Parent(s): 041b25b

Update README.md

Files changed (1)
  1. README.md +35 -0
README.md CHANGED
@@ -260,6 +260,45 @@ All training artifacts are publicly available:
 
 ## Usage Examples
 
+### Quantization
+<details>
+<summary>Click to expand <strong>quantization</strong> usage examples</summary>
+
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+
+# Load the decoder in 8-bit to roughly halve its memory footprint
+quantization_config = BitsAndBytesConfig(
+    load_in_8bit=True,
+)
+
+tokenizer = AutoTokenizer.from_pretrained("jhu-clsp/ettin-decoder-1b")
+model = AutoModelForCausalLM.from_pretrained(
+    "jhu-clsp/ettin-decoder-1b",
+    torch_dtype=torch.float16,
+    device_map="auto",
+    quantization_config=quantization_config,
+)
+
+prompt = "The future of artificial intelligence is"
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+with torch.no_grad():
+    outputs = model.generate(
+        **inputs,
+        max_length=50,
+        num_return_sequences=1,
+        temperature=0.7,
+        do_sample=True,
+        pad_token_id=tokenizer.eos_token_id,
+    )
+
+generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(f"Generated text: {generated_text}")
+```
+</details>
+
 ### Encoder: Masked Language Modeling
 <details>
 <summary>Click to expand <strong>encoder</strong> usage examples</summary>
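Beyond the 8-bit path shown in the diff, `BitsAndBytesConfig` also supports 4-bit loading, which shrinks memory use further at some cost in quality. The sketch below is illustrative and not part of the README: the NF4 quant type, compute dtype, and double-quantization settings are common choices, not values specified by the model authors.

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization config (parameter choices are illustrative assumptions)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # normal-float 4-bit quantization
    bnb_4bit_compute_dtype=torch.bfloat16, # dtype used for matmul compute
    bnb_4bit_use_double_quant=True,        # quantize the quantization constants too
)
```

Passing this object as `quantization_config=bnb_config` to `AutoModelForCausalLM.from_pretrained` would load the decoder's weights in 4-bit instead of 8-bit; the rest of the generation code is unchanged.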