Deploying a PyTorch environment on a clean Ubuntu 22.04 install - 闪电云 GPU computing platform

Model page: https://www.modelscope.cn/models/qwen/Qwen-14B-Chat-Int4/summary

I. Install dependencies

1. Requirements

python 3.8 or later
pytorch 2.0 or later
CUDA 11.4 or later is recommended (relevant to GPU users, flash-attention users, etc.)
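The minimum versions listed above can be checked with a few lines of Python before going further. This is only a sketch and not part of the original guide: the helper names `parse_version` and `meets_minimum` are made up for illustration, and the PyTorch check is simply skipped when torch has not been installed yet.

```python
import re
import sys
from importlib import metadata


def parse_version(text):
    """Turn a version string such as '2.1.0+cu118' into a tuple like (2, 1, 0)."""
    # Keep only the leading dotted numeric part; suffixes like '+cu118' are dropped.
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version_text, minimum):
    """Return True when version_text is at least the `minimum` tuple."""
    version = parse_version(version_text)
    minimum = tuple(minimum)
    # Pad with zeros so that '3.8' and (3, 8, 0) compare as equal.
    width = max(len(version), len(minimum))
    padded_version = version + (0,) * (width - len(version))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_version >= padded_minimum


if __name__ == "__main__":
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    print("python", python_version, "ok:", meets_minimum(python_version, (3, 8)))
    try:
        torch_version = metadata.version("torch")
        print("torch", torch_version, "ok:", meets_minimum(torch_version, (2, 0)))
    except metadata.PackageNotFoundError:
        print("torch is not installed yet")
```

The same helper can be reused for the CUDA requirement, for example `meets_minimum("11.8", (11, 4))`.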

2. Install Miniconda

wget https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-Linux-x86_64.sh

sh Miniconda3-py38_4.12.0-Linux-x86_64.sh

3. Install CUDA 11.8

wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run

sudo sh cuda_11.8.0_520.61.05_linux.run

Add the environment variables:

echo 'export PATH=/usr/local/cuda-11.8/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc

Load the newly installed shared libraries:

sudo ldconfig

Make the environment variables take effect:

source ~/.bashrc

Verify the installation:

nvcc -V
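To confirm from a script that the toolkit on PATH is the expected release, the output of `nvcc -V` can be parsed. The sketch below assumes the output contains a line of the form "Cuda compilation tools, release 11.8, ..."; the function names are hypothetical and not part of the original guide.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract the CUDA release, e.g. '11.8', from the text printed by `nvcc -V`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", output)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def installed_cuda_release():
    """Return the release reported by nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "-V"], capture_output=True, text=True, check=False)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found or its output was not recognised; check PATH in ~/.bashrc")
    else:
        print("CUDA toolkit release:", release)
```

If this reports that nvcc was not found right after installation, the usual cause is that the shell has not re-read ~/.bashrc yet.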

4. Install the PyTorch packages

Version list: https://pytorch.org/get-started/previous-versions/

Install PyTorch 2.1.0 built for CUDA 11.8:

pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
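After the install it is worth confirming that the CUDA 11.8 build was actually picked up and that a GPU is visible. The following is a minimal sketch; the function name `torch_cuda_report` is made up for illustration, and it returns early instead of failing when torch is not installed.

```python
import importlib.util


def torch_cuda_report():
    """Summarise the installed PyTorch build; does not raise if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch  # imported lazily so the script also runs before installation

    return {
        "installed": True,
        "torch_version": torch.__version__,         # carries a +cu118 suffix for this build
        "built_for_cuda": torch.version.cuda,       # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

A `built_for_cuda` value other than 11.8, or `cuda_available` being False, suggests that a different wheel was installed or that the driver is not working.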

5. Install the Qwen dependencies

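To run Qwen-14B-Chat-Int4, make sure the requirements above are met, then run the following pip commands to install the dependencies. If you run into problems installing auto-gptq, we recommend searching the official repo for a suitable prebuilt wheel.

pip install transformers==4.32.0 accelerate tiktoken einops scipy transformers_stream_generator==0.0.4 peft deepspeed -i https://mirrors.aliyun.com/pypi/simple
pip install auto-gptq optimum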
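A quick way to see whether the dependency install succeeded is to query the installed package metadata. This sketch is not from the original guide: the `REQUIRED` table and `find_problems` are illustrative names, the pins are copied from the pip command, and it assumes the distribution names match what was passed to pip.

```python
from importlib import metadata

# Pins copied from the pip commands; None means "any version is accepted".
REQUIRED = {
    "transformers": "4.32.0",
    "accelerate": None,
    "tiktoken": None,
    "einops": None,
    "scipy": None,
    "transformers_stream_generator": "0.0.4",
    "peft": None,
    "deepspeed": None,
    "auto-gptq": None,
    "optimum": None,
}


def find_problems(required):
    """Return {name: message} for packages that are missing or at another version."""
    problems = {}
    for name, pinned in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if pinned is not None and installed != pinned:
            problems[name] = f"found {installed}, expected {pinned}"
    return problems


if __name__ == "__main__":
    issues = find_problems(REQUIRED)
    if not issues:
        print("all listed packages are installed")
    for name, message in issues.items():
        print(f"{name}: {message}")
```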

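We also recommend installing the flash-attention library (flash attention 2 is now supported) for higher efficiency and lower GPU memory usage. First confirm that your network can reach https://objects.githubusercontent.com/ and https://github.com/; otherwise pip install will be very slow and will eventually fail.

git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention && pip install .
# The installs below are optional and may be slow.
# pip install csrc/layer_norm
# pip install csrc/rotary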
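The network check mentioned above can be scripted before starting the flash-attention build. The sketch below uses only the standard library; `is_reachable` is a hypothetical helper, and it treats any HTTP response, including an error status, as proof that the host can be reached.

```python
import urllib.error
import urllib.request

URLS = ("https://github.com/", "https://objects.githubusercontent.com/")


def is_reachable(url, timeout=5.0):
    """Return True if a request to `url` receives any HTTP response at all."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered (even with 4xx/5xx), so the host is reachable.
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or a malformed URL.
        return False


if __name__ == "__main__":
    for url in URLS:
        print(url, "reachable" if is_reachable(url) else "NOT reachable")
```

If either host is reported as not reachable, fixing the network or proxy first is likely to save time compared with waiting for the build to fail.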

6. Download the model

git clone https://www.modelscope.cn/qwen/Qwen-14B-Chat-Int4.git
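One thing that can go wrong with a git-based download is ending up with small placeholder files instead of the real weights. The sketch below assumes the repository stores its large files with Git LFS (an assumption, not stated in the original guide) and looks for files that still begin with the standard LFS pointer header; `find_lfs_pointers` is a hypothetical helper.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that are still Git LFS pointer stubs."""
    root = Path(model_dir)
    pointers = []
    for path in sorted(root.rglob("*")):
        # Skip directories and git's own bookkeeping files.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as handle:
            if handle.read(len(LFS_HEADER)) == LFS_HEADER:
                pointers.append(path)
    return pointers
```

Passing the directory created by git clone should return an empty list when the weights were fully downloaded; any paths returned are files whose real content is still missing.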

7. Test

The model's introduction page on ModelScope (魔搭社区) usually provides a test script.

# Below is an example of using the Qwen-14B-Chat-Int4 model:
from modelscope import AutoTokenizer, AutoModelForCausalLM

# Note: The default behavior now has injection attack prevention off.
tokenizer = AutoTokenizer.from_pretrained("qwen/Qwen-14B-Chat-Int4", trust_remote_code=True)

model = AutoModelForCausalLM.from_pretrained(
    "qwen/Qwen-14B-Chat-Int4",
    device_map="auto",
    trust_remote_code=True
).eval()
response, history = model.chat(tokenizer, "你好", history=None)
print(response)
# 你好！很高兴为你提供帮助。

Install the modelscope library:

pip install modelscope -i https://mirrors.aliyun.com/pypi/simple

Run the script, and the model files will be downloaded automatically.
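The test script sends a single message with history=None. For a multi-turn conversation, the history returned by each call can be passed into the next one. The sketch below relies only on the `model.chat(tokenizer, query, history=...)` call shape shown in the test script; `run_conversation` and the `EchoModel` stand-in (including its history format) are invented here so the helper can be tried without loading the real model.

```python
def run_conversation(model, tokenizer, prompts):
    """Send prompts one by one, feeding each call's history into the next call."""
    history = None
    replies = []
    for prompt in prompts:
        response, history = model.chat(tokenizer, prompt, history=history)
        replies.append(response)
    return replies, history


class EchoModel:
    """Stand-in for the real model; it only mimics the chat() call shape."""

    def chat(self, tokenizer, query, history=None):
        history = list(history or [])
        response = f"echo: {query}"
        history.append((query, response))
        return response, history


if __name__ == "__main__":
    replies, _ = run_conversation(EchoModel(), None, ["hello", "goodbye"])
    for reply in replies:
        print(reply)
```

With the real model and tokenizer loaded as in the test script, they would be passed in place of `EchoModel()` and `None`.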


Published: 2024-08-21 1813
