Modal
This page covers how to use the Modal ecosystem to run LangChain custom LLMs. It is broken into two parts:
- Modal installation and web endpoint deployment
- Using the deployed web endpoint with the `LLM` wrapper class
Installation and Setup
- Install with `pip install modal`
- Run `modal token new`
Define your Modal Functions and Webhooks
You must include a prompt. There is a rigid response structure:
```python
class Item(BaseModel):
    prompt: str

@stub.function()
@modal.web_endpoint(method="POST")
def get_text(item: Item):
    return {"prompt": run_gpt2.call(item.prompt)}
```
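Both the request and the response are therefore a JSON object with a single `prompt` string. A minimal sketch of the two payloads (plain `json`, no Modal required):

```python
import json

# What a client POSTs to the endpoint:
request_body = json.dumps({"prompt": "Tell me a joke"})

# What the endpoint returns: the generated text, keyed under "prompt"
response_body = json.dumps({"prompt": "<text generated by run_gpt2>"})

print(json.loads(request_body)["prompt"])  # Tell me a joke
```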
The following is an example with the GPT2 model:
```python
from pydantic import BaseModel

import modal

CACHE_PATH = "/root/model_cache"


class Item(BaseModel):
    prompt: str


stub = modal.Stub(name="example-get-started-with-langchain")


def download_model():
    from transformers import GPT2Tokenizer, GPT2LMHeadModel

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    tokenizer.save_pretrained(CACHE_PATH)
    model.save_pretrained(CACHE_PATH)


# Define a container image for the LLM function below, which downloads
# and stores the GPT-2 model.
image = (
    modal.Image.debian_slim()
    .pip_install("tokenizers", "transformers", "torch", "accelerate")
    .run_function(download_model)
)


@stub.function(
    gpu="any",
    image=image,
    retries=3,
)
def run_gpt2(text: str):
    from transformers import GPT2Tokenizer, GPT2LMHeadModel

    tokenizer = GPT2Tokenizer.from_pretrained(CACHE_PATH)
    model = GPT2LMHeadModel.from_pretrained(CACHE_PATH)
    encoded_input = tokenizer(text, return_tensors="pt").input_ids
    output = model.generate(encoded_input, max_length=50, do_sample=True)
    return tokenizer.decode(output[0], skip_special_tokens=True)


@stub.function()
@modal.web_endpoint(method="POST")
def get_text(item: Item):
    return {"prompt": run_gpt2.call(item.prompt)}
```
Deploy the web endpoint

Deploy the web endpoint to the Modal cloud with the `modal deploy` CLI command. Your web endpoint will acquire a persistent URL under the modal.run domain.
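Once deployed, the endpoint can be exercised with any HTTP client before wiring it into LangChain. A sketch using the standard library's `urllib` (the URL below is a placeholder; substitute your own `*.modal.run` address):

```python
import json
import urllib.request

endpoint_url = "https://ecorp--custom-llm-endpoint.modal.run"  # placeholder URL

# The endpoint expects a POST whose JSON body matches the Item schema.
req = urllib.request.Request(
    endpoint_url,
    data=json.dumps({"prompt": "Once upon a time"}).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment to call the live endpoint once it is deployed:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["prompt"])
```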
LLM wrapper around Modal web endpoint

The `Modal` LLM wrapper class accepts your deployed web endpoint's URL.
```python
from langchain.chains import LLMChain
from langchain.llms import Modal
from langchain.prompts import PromptTemplate

# Example prompt template (any template with a "question" variable works).
template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])

endpoint_url = "https://ecorp--custom-llm-endpoint.modal.run"  # REPLACE ME with your deployed Modal web endpoint's URL
llm = Modal(endpoint_url=endpoint_url)
llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
llm_chain.run(question)
```