Testing with the README code returns garbled output
#9 opened by tarjintor
It's probably an issue with a GPU operator. Changing the device to cpu and the dtype to torch.float16 makes it run correctly; running it with kt and similar frameworks will have to wait for other community members to add support.
I tried the README example (using torch forward); the result:
Could you provide more complete environment information (e.g. GPU model, CUDA version, PyTorch/Transformers versions, dtype/precision settings) so we can reproduce and pinpoint the issue?
python 3.12.9
gpu rtx4090
cuda 12.2
accelerate 1.11.0.dev0
certifi 2025.8.3
charset-normalizer 3.4.3
filelock 3.19.1
fsspec 2025.9.0
hf-xet 1.1.9
huggingface-hub 0.34.4
idna 3.10
jinja2 3.1.6
markupsafe 3.0.2
mpmath 1.3.0
networkx 3.5
numpy 2.3.3
nvidia-cublas-cu12 12.8.4.1
nvidia-cuda-cupti-cu12 12.8.90
nvidia-cuda-nvrtc-cu12 12.8.93
nvidia-cuda-runtime-cu12 12.8.90
nvidia-cudnn-cu12 9.10.2.21
nvidia-cufft-cu12 11.3.3.83
nvidia-cufile-cu12 1.13.1.3
nvidia-curand-cu12 10.3.9.90
nvidia-cusolver-cu12 11.7.3.90
nvidia-cusparse-cu12 12.5.8.93
nvidia-cusparselt-cu12 0.7.1
nvidia-nccl-cu12 2.27.3
nvidia-nvjitlink-cu12 12.8.93
nvidia-nvtx-cu12 12.8.90
packaging 25.0
pillow 11.3.0
psutil 7.0.0
pyyaml 6.0.2
regex 2025.9.1
requests 2.32.5
safetensors 0.6.2
setuptools 80.9.0
sympy 1.14.0
tokenizers 0.22.0
torch 2.8.0
torchaudio 2.8.0
torchvision 0.23.0
tqdm 4.67.1
transformers 4.57.0.dev0
triton 3.4.0
typing-extensions 4.15.0
urllib3 2.5.0
With the device set to cpu in the code, the output is not garbled; otherwise it is garbled:
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    dtype="auto",
    device_map="auto",
    # dtype=torch.float16,
    # device_map="cpu",
)
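To tell genuinely garbled decode output apart from valid non-ASCII text (e.g. Chinese) when comparing the CPU and GPU runs, a rough heuristic helper can be used. This is a hypothetical sketch, not part of transformers: it flags text whose share of replacement or non-printable characters exceeds a threshold.

```python
def looks_garbled(text: str, threshold: float = 0.3) -> bool:
    """Rough heuristic: flag output as garbled if more than `threshold`
    of its characters are U+FFFD replacement chars or non-printable
    (excluding common whitespace). Valid CJK text passes unchanged."""
    if not text:
        return False
    bad = sum(
        1 for ch in text
        if ch == "\ufffd" or (not ch.isprintable() and ch not in "\n\t ")
    )
    return bad / len(text) > threshold
```

Running the same prompt twice (device_map="cpu" vs device_map="auto") and comparing `looks_garbled` on both decoded outputs makes the GPU-only corruption easy to demonstrate in a bug report.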
LoL, can this problem be fixed soon?
CPU is a bit too slow
You can use fastllm, it works well now.