Testing with the README code returns garbled output

#9
by tarjintor - opened

image.png

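This is probably a GPU operator (kernel) issue. After changing device to cpu and dtype to torch.float16, it runs correctly. Running it with kt and similar backends will have to wait until others in the community add support.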

I tried the README example (using torch forward). Result:
image.png

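Could you share more complete environment information (e.g. GPU model, CUDA version, PyTorch/Transformers versions, dtype/precision settings) so that we can reproduce and locate the issue?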
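
In case it helps, here is a minimal sketch of a script that gathers most of that information in one go. The function name `collect_env` and the default package list are only illustrative assumptions, not something from the README. It reads package versions through `importlib.metadata` and queries torch for the CUDA build and GPU name only if torch is installed.

```python
import importlib.metadata as md
import platform


def collect_env(packages=("torch", "transformers", "accelerate", "safetensors", "triton")):
    """Return a dict of version and device info that is useful in a bug report.

    The package list is an illustrative default; adjust it as needed.
    """
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    # Installed package versions, without importing the packages themselves.
    for name in packages:
        try:
            info[name] = md.version(name)
        except md.PackageNotFoundError:
            info[name] = "not installed"

    # GPU details are only available when torch can be imported.
    try:
        import torch
    except ImportError:
        return info

    info["torch.version.cuda"] = torch.version.cuda  # CUDA version torch was built with, or None
    info["cuda_available"] = torch.cuda.is_available()
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
        info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Note that `torch.version.cuda` is the CUDA version the torch wheel was built against, which can differ from the system CUDA toolkit version, so it is worth reporting both. The dtype and device_map passed to from_pretrained still need to be stated separately.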

截图 2025-09-12 16-54-28.png

Python 3.12.9
GPU: RTX 4090
CUDA 12.2

accelerate               1.11.0.dev0
certifi                  2025.8.3
charset-normalizer       3.4.3
filelock                 3.19.1
fsspec                   2025.9.0
hf-xet                   1.1.9
huggingface-hub          0.34.4
idna                     3.10
jinja2                   3.1.6
markupsafe               3.0.2
mpmath                   1.3.0
networkx                 3.5
numpy                    2.3.3
nvidia-cublas-cu12       12.8.4.1
nvidia-cuda-cupti-cu12   12.8.90
nvidia-cuda-nvrtc-cu12   12.8.93
nvidia-cuda-runtime-cu12 12.8.90
nvidia-cudnn-cu12        9.10.2.21
nvidia-cufft-cu12        11.3.3.83
nvidia-cufile-cu12       1.13.1.3
nvidia-curand-cu12       10.3.9.90
nvidia-cusolver-cu12     11.7.3.90
nvidia-cusparse-cu12     12.5.8.93
nvidia-cusparselt-cu12   0.7.1
nvidia-nccl-cu12         2.27.3
nvidia-nvjitlink-cu12    12.8.93
nvidia-nvtx-cu12         12.8.90
packaging                25.0
pillow                   11.3.0
psutil                   7.0.0
pyyaml                   6.0.2
regex                    2025.9.1
requests                 2.32.5
safetensors              0.6.2
setuptools               80.9.0
sympy                    1.14.0
tokenizers               0.22.0
torch                    2.8.0
torchaudio               2.8.0
torchvision              0.23.0
tqdm                     4.67.1
transformers             4.57.0.dev0
triton                   3.4.0
typing-extensions        4.15.0
urllib3                  2.5.0

If the code sets device to cpu, the output is not garbled; otherwise it is garbled.

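import torch  # only needed for the commented-out dtype below
from transformers import AutoModelForCausalLM

# model_name is the model ID from the README example
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    dtype="auto",          # with these two settings (GPU): garbled output
    device_map="auto",
    # dtype=torch.float16,   # with these two settings instead (CPU): correct output
    # device_map="cpu",
)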

LoL, can this problem be fixed soon?
CPU is a bit too slow

You can use fastllm; it works well now.
