Note: domestic (China) download mirror for the chatglm-6b model:
https://cloud.tsinghua.edu.cn/d/fb9f16d6dc8f482596c2/
Download the model files
The files are normally downloaded from https://huggingface.co/internlm/internlm-7b/tree/main. git clone kept failing with errors, so I downloaded the files directly in the browser instead.
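If huggingface_hub is installed and Hugging Face is reachable, the same files can also be fetched programmatically; a minimal sketch (the local directory name is just an example):

from huggingface_hub import snapshot_download

# Fetch every file in the repo into ./internlm-7b;
# files that are already present are skipped on a re-run
snapshot_download(repo_id="internlm/internlm-7b", local_dir="./internlm-7b")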
Upload the files to the Linux server
rsync is used for the upload here; it is reportedly faster than scp:
https://groups.google.com/g/xidian_linux/c/58jMVPoZKcA
rsync -e "ssh -i ~/.ssh/common.pem" -aP ./internlm-7b root@101.126.33.219:/root/howard/InternLM/internlm-7b
➜ rsync -aP ./internlm-7b root@101.126.33.219:/root/howard/InternLM/internlm-7b
building file list ...
19 files to consider
internlm-7b/
internlm-7b/pytorch_model-00001-of-00008.bin
1969370847 100% 14.63MB/s 0:02:08 (xfer#1, to-check=11/19)
internlm-7b/pytorch_model-00002-of-00008.bin
1933844137 100% 14.06MB/s 0:02:11 (xfer#2, to-check=10/19)
internlm-7b/pytorch_model-00003-of-00008.bin
1933844201 100% 9.36MB/s 0:03:16 (xfer#3, to-check=9/19)
internlm-7b/pytorch_model-00004-of-00008.bin
1990458181 100% 15.18MB/s 0:02:05 (xfer#4, to-check=8/19)
internlm-7b/pytorch_model-00005-of-00008.bin
1990458775 100% 12.24MB/s 0:02:35 (xfer#5, to-check=7/19)
internlm-7b/pytorch_model-00006-of-00008.bin
1990458775 100% 12.47MB/s 0:02:32 (xfer#6, to-check=6/19)
internlm-7b/pytorch_model-00007-of-00008.bin
1990467305 100% 15.59MB/s 0:02:01 (xfer#7, to-check=5/19)
internlm-7b/pytorch_model-00008-of-00008.bin
845153194 100% 12.92MB/s 0:01:02 (xfer#8, to-check=4/19)
internlm-7b/pytorch_model.bin.index.json
37116 100% 362.46kB/s 0:00:00 (xfer#9, to-check=3/19)
internlm-7b/special_tokens_map.json
95 100% 0.92kB/s 0:00:00 (xfer#10, to-check=2/19)
internlm-7b/tokenizer.model
1658691 100% 5.40MB/s 0:00:00 (xfer#11, to-check=1/19)
internlm-7b/tokenizer_config.json
343 100% 1.14kB/s 0:00:00 (xfer#12, to-check=0/19)
sent 14638575565 bytes received 18296 bytes 13674538.87 bytes/sec
total size is 14645813931 speedup is 1.00
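After the transfer it is worth sanity-checking that every weight shard arrived. A small sketch that walks pytorch_model.bin.index.json, assuming it uses the standard weight_map layout of sharded Hugging Face checkpoints (the 14645813931 bytes rsync reported above also include the non-shard files, so only compare shard totals against shard totals):

import json, os

model_dir = "/root/howard/InternLM/internlm-7b/internlm-7b"  # path used above

with open(os.path.join(model_dir, "pytorch_model.bin.index.json")) as f:
    index = json.load(f)

# weight_map maps each tensor name to the shard file that stores it
shards = sorted(set(index["weight_map"].values()))
for shard in shards:
    path = os.path.join(model_dir, shard)
    assert os.path.exists(path), f"missing shard: {shard}"
    print(shard, os.path.getsize(path))

print("total shard bytes:",
      sum(os.path.getsize(os.path.join(model_dir, s)) for s in shards))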
SSH port forwarding
ssh root@101.126.33.219 -L 0.0.0.0:8501:0.0.0.0:8501
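This binds local port 8501 on all interfaces and forwards it to port 8501 on the server (Streamlit's default port), so the web app started below is reachable at http://localhost:8501 on the local machine.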
Start the web service
pip install streamlit -i https://pypi.tuna.tsinghua.edu.cn/simple
streamlit run web_demo.py --server.address 0.0.0.0
Alternatively, the model can be driven directly from Python:
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer and model from the local checkpoint directory;
# trust_remote_code is required because InternLM ships custom model code
tokenizer = AutoTokenizer.from_pretrained("/root/howard/InternLM/internlm-7b/internlm-7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("/root/howard/InternLM/internlm-7b/internlm-7b", trust_remote_code=True).cuda()
model = model.eval()  # inference mode

# chat() is provided by InternLM's remote code; history carries the dialogue
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
(python39) ➜ InternLM git:(main) ✗ python
Python 3.9.18 (main, Sep 11 2023, 13:41:44)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("/root/howard/InternLM/internlm-7b/internlm-7b", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("/root/howard/InternLM/internlm-7b/internlm-7b", trust_remote_code=True).cuda()
Loading checkpoint shards: 100%|██████████████████████| 8/8 [00:08<00:00, 1.04s/it]
Some weights of the model checkpoint at /root/howard/InternLM/internlm-7b/internlm-7b were not used when initializing InternLMForCausalLM: ['model.layers.16.self_attn.rotary_emb.inv_freq', 'model.layers.25.self_attn.rotary_emb.inv_freq', 'model.layers.29.self_attn.rotary_emb.inv_freq', 'model.layers.26.self_attn.rotary_emb.inv_freq', 'model.layers.3.self_attn.rotary_emb.inv_freq', 'model.layers.5.self_attn.rotary_emb.inv_freq', 'model.layers.7.self_attn.rotary_emb.inv_freq', 'model.layers.19.self_attn.rotary_emb.inv_freq', 'model.layers.24.self_attn.rotary_emb.inv_freq', 'model.layers.13.self_attn.rotary_emb.inv_freq', 'model.layers.9.self_attn.rotary_emb.inv_freq', 'model.layers.18.self_attn.rotary_emb.inv_freq', 'model.layers.15.self_attn.rotary_emb.inv_freq', 'model.layers.14.self_attn.rotary_emb.inv_freq', 'model.layers.23.self_attn.rotary_emb.inv_freq', 'model.layers.6.self_attn.rotary_emb.inv_freq', 'model.layers.0.self_attn.rotary_emb.inv_freq', 'model.layers.8.self_attn.rotary_emb.inv_freq', 'model.layers.21.self_attn.rotary_emb.inv_freq', 'model.layers.4.self_attn.rotary_emb.inv_freq', 'model.layers.27.self_attn.rotary_emb.inv_freq', 'model.layers.20.self_attn.rotary_emb.inv_freq', 'model.layers.31.self_attn.rotary_emb.inv_freq', 'model.layers.22.self_attn.rotary_emb.inv_freq', 'model.layers.1.self_attn.rotary_emb.inv_freq', 'model.layers.28.self_attn.rotary_emb.inv_freq', 'model.layers.10.self_attn.rotary_emb.inv_freq', 'model.layers.30.self_attn.rotary_emb.inv_freq', 'model.layers.11.self_attn.rotary_emb.inv_freq', 'model.layers.17.self_attn.rotary_emb.inv_freq', 'model.layers.2.self_attn.rotary_emb.inv_freq', 'model.layers.12.self_attn.rotary_emb.inv_freq']
- This IS expected if you are initializing InternLMForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing InternLMForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/root/anaconda3/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 918, in cuda
return self._apply(lambda t: t.cuda(device))
File "/root/anaconda3/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/root/anaconda3/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
File "/root/anaconda3/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 810, in _apply
module._apply(fn)
[Previous line repeated 2 more times]
File "/root/anaconda3/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 833, in _apply
param_applied = fn(param)
File "/root/anaconda3/envs/python39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 918, in <lambda>
return self._apply(lambda t: t.cuda(device))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 172.00 MiB. GPU 0 has a total capacty of 21.99 GiB of which 107.00 MiB is free. Including non-PyTorch memory, this process has 21.87 GiB memory in use. Of the allocated memory 21.65 GiB is allocated by PyTorch, and 1.49 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
>>> model = model.eval()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'model' is not defined
>>> response, history = model.chat(tokenizer, "你好", history=[])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'model' is not defined
>>> print(response)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'response' is not defined
>>> exit()
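The .cuda() call fails because from_pretrained loads the weights as float32 by default, so the 7B model needs roughly 28 GiB of GPU memory, more than the A10's 22 GiB. Loading in half precision roughly halves that footprint; a minimal sketch using the same paths as above:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_dir = "/root/howard/InternLM/internlm-7b/internlm-7b"
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)

# torch_dtype=torch.float16 keeps the weights in half precision
# (~14 GiB for 7B parameters) instead of upcasting to float32 (~28 GiB)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype=torch.float16, trust_remote_code=True
).cuda()
model = model.eval()

response, history = model.chat(tokenizer, "你好", history=[])
print(response)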
Check GPU usage
nvidia-smi
(python39) ➜ InternLM git:(main) ✗ nvidia-smi
Tue Oct 17 11:05:20 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.12             Driver Version: 535.104.12   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A10                     On  | 00000000:65:01.0 Off |                    0 |
|  0%   46C    P0              59W / 150W |  14236MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A    589375      C   .../anaconda3/envs/python39/bin/python    14224MiB |
+---------------------------------------------------------------------------------------+
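nvidia-smi still shows ~14 GiB in use because PyTorch's caching allocator holds on to GPU memory even after the failed .cuda() call; it is released when the process exits (or via torch.cuda.empty_cache() once the tensors are garbage-collected). The same numbers can also be read from inside Python; a small sketch:

import torch

# free/total device memory as the driver sees it (same view as nvidia-smi)
free, total = torch.cuda.mem_get_info()
print(f"free: {free / 2**30:.2f} GiB / total: {total / 2**30:.2f} GiB")

# memory held by live tensors vs. memory cached by PyTorch's allocator
print(f"allocated: {torch.cuda.memory_allocated() / 2**30:.2f} GiB")
print(f"reserved:  {torch.cuda.memory_reserved() / 2**30:.2f} GiB")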