Table of Contents

    • Problem Description
    • Solution

Problem Description

While deploying the llama3 model, the following error occurred:

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
  0%|          | 0/175 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/stage2/auto/evaluation/C-eval/evaluate_zh2.py", line 206, in <module>
    main()
  File "/stage2/auto/evaluation/C-eval/evaluate_zh2.py", line 202, in main
    ceval.run(args.shot, args.split)
  File "/stage2/auto/evaluation/C-eval/evaluate_zh2.py", line 112, in run
    result, acc = self.run_single_task(task_name, shot, split)
  File "/stage2/auto/evaluation/C-eval/evaluate_zh2.py", line 139, in run_single_task
    output = self.model.generate(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1989, in generate
    result = self._sample(
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2932, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 169, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 1141, in forward
    outputs = self.model(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 914, in forward
    causal_mask = self._update_causal_mask(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py", line 1038, in _update_causal_mask
    causal_mask = torch.triu(causal_mask, diagonal=1)
RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'
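The failure can be reproduced outside of llama3 entirely. As a minimal sketch (assuming a CUDA-capable machine and an affected torch version), calling torch.triu on a BFloat16 CUDA tensor is enough to trigger the same error that modeling_llama.py hits when building the causal mask:

import torch

# Hypothetical minimal reproduction: build a BFloat16 mask on the GPU,
# then take its upper triangle, as modeling_llama.py does at line 1038.
mask = torch.full((4, 4), float("-inf"), dtype=torch.bfloat16, device="cuda")

# On torch versions lacking a BFloat16 CUDA kernel for triu, this raises:
# RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'
causal_mask = torch.triu(mask, diagonal=1)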

Solution

Cause analysis: the installed torch version does not match what the code expects. Older torch releases have no CUDA kernel for torch.triu on the BFloat16 half-precision dtype, so constructing the causal mask fails. Reference: https://github.com/meta-llama/llama3/issues/80
Installing torch 2.2.2 resolves the problem:

pip install torch==2.2.2
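After upgrading, a quick sanity check (a hedged sketch, not from the original post) confirms both the version and that the previously missing kernel now works:

import torch

print(torch.__version__)  # expect 2.2.2

# The exact call that failed before should now succeed on CUDA:
mask = torch.triu(torch.ones(4, 4, dtype=torch.bfloat16, device="cuda"), diagonal=1)
print(mask)

If upgrading torch is not an option, loading the model in float16 instead of bfloat16 (e.g. passing torch_dtype=torch.float16 to from_pretrained) may also sidestep the missing BFloat16 kernel, though the post itself only covers the upgrade path.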

Tags: RuntimeError, triu_tril_cuda_template, not implemented