#### Project launch options

```text
usage: langchina-ChatGLM [-h] [--no-remote-model] [--model MODEL] [--lora LORA] [--model-dir MODEL_DIR] [--lora-dir LORA_DIR] [--cpu] [--auto-devices] [--gpu-memory GPU_MEMORY [GPU_MEMORY ...]] [--cpu-memory CPU_MEMORY]
                         [--load-in-8bit] [--bf16]

An LLM document reader based on langchain and ChatGLM

options:
  -h, --help            show this help message and exit
  --no-remote-model     Do not fetch the model from a remote hub; add `--no-remote-model` when loading a local checkpoint.
  --model MODEL         Name of the model to load by default.
  --lora LORA           Name of the LoRA to apply to the model by default.
  --model-dir MODEL_DIR
                        Path to the directory with all the models.
  --lora-dir LORA_DIR   Path to the directory with all the LoRAs.
  --cpu                 Use the CPU to generate text. Warning: Training on CPU is extremely slow.
  --auto-devices        Automatically split the model across the available GPU(s) and CPU.
  --gpu-memory GPU_MEMORY [GPU_MEMORY ...]
                        Maximum GPU memory in GiB to be allocated per GPU. Example: --gpu-memory 10 for a single GPU, --gpu-memory 10 5 for two GPUs. You can also set values in MiB like --gpu-memory 3500MiB.
  --cpu-memory CPU_MEMORY
                        Maximum CPU memory in GiB to allocate for offloaded weights. Same as above.
  --load-in-8bit        Load the model with 8-bit precision.
  --bf16                Load the model with bfloat16 precision. Requires an NVIDIA Ampere GPU.
```

#### Examples

- 1. Load a local model

```text
--model-dir         directory holding the local checkpoints
--model             model name
--no-remote-model   do not fetch the model remotely
```
```shell
$ python cli_demo.py --model-dir /media/mnt/ --model chatglm-6b --no-remote-model
```

- 2. Load the model at reduced precision

```text
--model-dir         directory holding the local checkpoints
--model             model name
--no-remote-model   do not fetch the model remotely
--load-in-8bit      load the model with 8-bit precision
```
```shell
$ python cli_demo.py --model-dir /media/mnt/ --model chatglm-6b --no-remote-model --load-in-8bit
```

- 3. Run inference on the CPU

```text
--model-dir         directory holding the local checkpoints
--model             model name
--no-remote-model   do not fetch the model remotely
--cpu               use the CPU to generate text (warning: CPU inference is extremely slow)
```
```shell
$ python cli_demo.py --model-dir /media/mnt/ --model chatglm-6b --no-remote-model --cpu
```

- 4. Load a LoRA fine-tuning checkpoint

```text
--model-dir         directory holding the local checkpoints
--model             model name
--no-remote-model   do not fetch the model remotely
--lora-dir          directory holding the local LoRAs
--lora              LoRA name
```
```shell
$ python cli_demo.py --model-dir /media/mnt/ --model chatglm-6b --no-remote-model --lora-dir /media/mnt/loras --lora chatglm-step100
```
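
The `--gpu-memory`, `--cpu-memory`, and `--auto-devices` options described above have no example of their own. A minimal sketch, assuming a machine with two GPUs and the same checkpoint directory and model name as the examples above (the GiB values are illustrative, not recommendations):

```shell
# Split the model across devices, capping GPU 0 at 10 GiB and GPU 1 at 5 GiB,
# and allowing up to 16 GiB of CPU RAM for offloaded weights.
$ python cli_demo.py --model-dir /media/mnt/ --model chatglm-6b --no-remote-model \
    --auto-devices --gpu-memory 10 5 --cpu-memory 16
```

Tune the per-device limits to your hardware; weights that do not fit within the GPU caps are offloaded to CPU memory, which slows generation.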