
update README.md and README_en.md

imClumsyPanda · 2 years ago · commit abb1bc8956
2 changed files with 28 additions and 28 deletions: README.md (+4 −5), README_en.md (+24 −23)

README.md · +4 −5

@@ -58,10 +58,10 @@
 - Embedding model hardware requirements

     The default Embedding model in this project, [GanymedeNil/text2vec-large-chinese](https://huggingface.co/GanymedeNil/text2vec-large-chinese/tree/main), takes about 3 GB of GPU memory; it can also be switched to run on the CPU.
-### Software requirements
-This project has been tested in a python 3.8, cuda 11.7 environment.
 
+### Software requirements
 
+This project has been tested in a Python 3.8, CUDA 11.7 environment.
 
 ### 1. Environment setup
 
@@ -111,10 +111,9 @@ $ python webui.py
 ![webui](img/ui1.png)
 The Web UI supports the following features:
 
-1. Automatically reads the `LLM` and `embedding` model enumerations in `knowledge_based_chatglm.py`; after selecting one, click `setting` to load the model; models can be switched at any time for testing
+1. Before launch, automatically reads the `LLM` and `Embedding` model enumerations and the default model setting in `configs/model_config.py` and runs that model; to reload a model, re-select it in the interface and click `重新加载模型` (reload model);
 2. The length of retained dialogue history can be adjusted manually according to available GPU memory
-3. Adds a file upload function; select an uploaded file from the drop-down box and click `loading` to load it; the loaded file can be changed at any time during the process
-4. Adds `use via API` at the bottom for connecting to your own system
+3. Adds a file upload function; select an uploaded file from the drop-down box and click the `加载文件` (load file) button; the loaded file can be changed at any time during the process
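The model-enumeration behavior described in item 1 could be sketched roughly as follows. This is an illustrative sketch only: the dict and variable names mimic a common convention for a config module like `configs/model_config.py`, and are assumptions rather than the project's actual contents.

```python
# Hypothetical config-module contents; names and values are assumptions.
llm_model_dict = {
    "chatglm-6b": "THUDM/chatglm-6b",
    "chatglm-6b-int4": "THUDM/chatglm-6b-int4",
}
embedding_model_dict = {
    "text2vec": "GanymedeNil/text2vec-large-chinese",
}
LLM_MODEL = "chatglm-6b"  # default model loaded on startup


def model_choices():
    """Return (LLM options, Embedding options, default LLM) for the UI dropdowns."""
    return list(llm_model_dict), list(embedding_model_dict), LLM_MODEL
```

A UI would populate its dropdowns from these lists on startup, then reload only when the user picks a different key and clicks the reload button.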
 
 Or run the [cli_demo.py](cli_demo.py) script to experience **command-line interaction**:

README_en.md · +24 −23

@@ -18,22 +18,19 @@
 
 🚩 This project does not involve fine-tuning or training; however, fine-tuning or training can be employed to optimize the effectiveness of this project.
 
-[TOC]
 
 ## Changelog
 
-**[2023/04/07]**
+**[2023/04/15]**
 
-   1. Resolved the issue of doubled video memory usage when loading the ChatGLM model (thanks to [@suc16](https://github.com/suc16) and [@myml](https://github.com/myml));
-   2. Added a mechanism to clear video memory;
-   3. Added `nghuyong/ernie-3.0-nano-zh` and `nghuyong/ernie-3.0-base-zh` as Embedding model options, which consume less video memory resources than `GanymedeNil/text2vec-large-chinese` (thanks to [@lastrei](https://github.com/lastrei)).
+   1. Refactored the project structure to keep the command-line demo [cli_demo.py](cli_demo.py) and the Web UI demo [webui.py](webui.py) in the root directory;
+   2. Improved the Web UI so that, after launch, it first loads the model according to the default option in [configs/model_config.py](configs/model_config.py), and added error messages, etc.;
+   3. Updated the FAQ.
 
-**[2023/04/09]**
+**[2023/04/12]**
 
-   1. Replaced the previously selected `ChatVectorDBChain` with `RetrievalQA` in `langchain`, effectively reducing the issue of stopping due to insufficient video memory after asking 2-3 times;
-   2. Added `EMBEDDING_MODEL`, `VECTOR_SEARCH_TOP_K`, `LLM_MODEL`, `LLM_HISTORY_LEN`, `REPLY_WITH_SOURCE` parameter value settings in `knowledge_based_chatglm.py`;
-   3. Added `chatglm-6b-int4` and `chatglm-6b-int4-qe`, which require less GPU memory, as LLM model options;
-   4. Corrected code errors in `README.md` (thanks to [@calcitem](https://github.com/calcitem)).
+   1. Replaced the sample files in the Web UI to avoid issues with unreadable files due to encoding problems in Ubuntu;
+   2. Replaced the prompt template in `knowledge_based_chatglm.py` to prevent confusion in the content returned by ChatGLM, which may arise from the prompt template containing Chinese and English bilingual text.
 
 **[2023/04/11]**
 
@@ -42,10 +39,18 @@
    3. Enhanced automatic detection for the availability of `cuda`, `mps`, and `cpu` for LLM and Embedding model running devices;
    4. Added a check for `filepath` in `knowledge_based_chatglm.py`. In addition to supporting single file import, it now supports a single folder path as input. After input, it will traverse each file in the folder and display a command-line message indicating the success of each file load.
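The single-file-or-folder `filepath` check described in item 4 might look like the following sketch. The helper name and exact behavior are assumptions for illustration, not the project's actual code.

```python
from pathlib import Path


def collect_files(filepath: str) -> list:
    """Accept a single file or a folder path; return the files to load.

    Hypothetical helper mirroring the changelog description: a folder is
    traversed and each file is reported as it is collected.
    """
    p = Path(filepath)
    if p.is_file():
        return [p]
    if p.is_dir():
        files = sorted(f for f in p.rglob("*") if f.is_file())
        for f in files:
            # command-line message indicating the file was picked up
            print(f"loaded: {f}")
        return files
    raise FileNotFoundError(filepath)
```
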
 
-   **[2023/04/12]**
+**[2023/04/09]**
 
-   1. Replaced the sample files in the Web UI to avoid issues with unreadable files due to encoding problems in Ubuntu;
-   2. Replaced the prompt template in `knowledge_based_chatglm.py` to prevent confusion in the content returned by ChatGLM, which may arise from the prompt template containing Chinese and English bilingual text.
+   1. Replaced the previously selected `ChatVectorDBChain` with `RetrievalQA` in `langchain`, effectively reducing the issue of stopping due to insufficient video memory after asking 2-3 times;
+   2. Added `EMBEDDING_MODEL`, `VECTOR_SEARCH_TOP_K`, `LLM_MODEL`, `LLM_HISTORY_LEN`, `REPLY_WITH_SOURCE` parameter value settings in `knowledge_based_chatglm.py`;
+   3. Added `chatglm-6b-int4` and `chatglm-6b-int4-qe`, which require less GPU memory, as LLM model options;
+   4. Corrected code errors in `README.md` (thanks to [@calcitem](https://github.com/calcitem)).
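The parameters listed in item 2 might look like this inside `knowledge_based_chatglm.py`. The names come from the changelog entry above; the values shown are illustrative assumptions, not the project's actual defaults.

```python
# Parameter names from the changelog entry above; values are illustrative.
EMBEDDING_MODEL = "text2vec"   # key of the Embedding model to load
VECTOR_SEARCH_TOP_K = 6        # how many matched chunks feed the prompt
LLM_MODEL = "chatglm-6b"       # key of the LLM to load
LLM_HISTORY_LEN = 3            # rounds of dialogue history to keep
REPLY_WITH_SOURCE = True       # include source documents in the answer
```
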
+
+**[2023/04/07]**
+
+   1. Resolved the issue of doubled video memory usage when loading the ChatGLM model (thanks to [@suc16](https://github.com/suc16) and [@myml](https://github.com/myml));
+   2. Added a mechanism to clear video memory;
+   3. Added `nghuyong/ernie-3.0-nano-zh` and `nghuyong/ernie-3.0-base-zh` as Embedding model options, which consume less video memory resources than `GanymedeNil/text2vec-large-chinese` (thanks to [@lastrei](https://github.com/lastrei)).
 
 ## How to Use
 
@@ -111,13 +116,11 @@ Note: Before executing, check the remaining space in the `$HOME/.cache/huggingfa
 
 The resulting interface is shown below:
 ![webui](img/ui1.png)
-The API interface provided in the Web UI is shown below:
-![webui](img/ui2.png)The Web UI supports the following features:
+The Web UI supports the following features:
 
-1. Automatically reads the `LLM` and `embedding` model enumerations in `knowledge_based_chatglm.py`, allowing you to select and load the model by clicking `setting`. Models can be switched at any time for testing.
+1. Automatically reads the `LLM` and `embedding` model enumerations in `configs/model_config.py`, allowing you to select and reload the model by clicking `重新加载模型` (reload model).
 2. The length of retained dialogue history can be manually adjusted according to the available video memory.
-3. Adds a file upload function. Select the uploaded file through the drop-down box, click `loading` to load the file, and change the loaded file at any time during the process.
-4. Adds a `use via API` option at the bottom to connect to your own system.
+3. Adds a file upload function. Select the uploaded file through the drop-down box, click `加载文件` (load file) to load it, and change the loaded file at any time during the process.
 
 Alternatively, execute the [cli_demo.py](cli_demo.py) script to experience **command line interaction**:
 
@@ -189,12 +192,12 @@ ChatGLM's answer after using LangChain to access the README.md file of the ChatG
 >4. Introduce more evaluation metrics: Incorporate additional evaluation metrics to assess the model's performance, which can help identify the shortcomings and limitations of ChatGLM-6B.
 >5. Enhance the model architecture: Improve ChatGLM-6B's model architecture to boost its performance and capabilities. For example, employ larger neural networks or refined convolutional neural network structures.
 
-## Road map
+## Roadmap
 
 - [x] Implement LangChain + ChatGLM-6B for local knowledge application
 - [x] Unstructured file access based on langchain
   - [x] .md
-   - [x] .pdf (need to install `detectron2` as described in FAQ Q2)
+   - [x] .pdf
   - [x] .docx
   - [x] .txt
 - [ ] Add support for more LLM models
@@ -203,8 +206,6 @@ ChatGLM's answer after using LangChain to access the README.md file of the ChatG
    - [x] THUDM/chatglm-6b-int4-qe
 - [ ] Add Web UI DEMO
   - [x] Implement Web UI DEMO using Gradio
-   - [ ] Add model loading progress bar
-   - [ ] Add output and error messages
-   - [ ] Internationalization for language switching
+   - [x] Add output and error messages
    - [ ] Citation callout
 - [ ] Use FastAPI to implement API deployment method and develop a Web UI DEMO for API calls