Text Generation
Safetensors
English
hudsongouge commited on
Commit
a96cc7e
·
verified ·
1 Parent(s): 6f3f654

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -28,7 +28,7 @@ The training data was composed exclusively of the following sources:
28
 
29
  Only the datasets listed above were used, and each was included in its entirety.
30
 
31
- The Discord datasets (combined ~693MB) were formatted in **ChatML**, with usernames serving as speaker roles, enabling the model to learn natural dialogue structure and dynamics. Discord data included many diverse topics, especially code. Thus, the model understands basic syntax patterns of some common programming languages. However, due to its lack of training on large scale high quality code samples, generated code is likely not to be very helpful.
32
  Larger models in the family received a larger and more diverse training set.
33
 
34
  ---
 
28
 
29
  Only the datasets listed above were used, and each was included in its entirety.
30
 
31
+ The Discord datasets (combined ~693MB) were formatted in **ChatML**, with usernames serving as speaker roles, enabling the model to learn natural dialogue structure and dynamics. Discord data included many diverse topics, especially code. Thus, the model understands basic syntax patterns of some common programming languages. However, due to its lack of training on large scale high quality code samples, generated code will likely not be reliable or production-quality.
32
  Larger models in the family received a larger and more diverse training set.
33
 
34
  ---