Modal

This page covers how to use the Modal ecosystem to run LangChain custom LLMs. It is broken into two parts:

  1. Modal installation and web endpoint deployment
  2. Using a deployed web endpoint with the LLM wrapper class

Installation and Setup

  • Install with pip install modal
  • Run modal token new

Define your Modal Functions and Webhooks

You must include a prompt. There is a rigid response structure:

class Item(BaseModel):
    prompt: str

@stub.function()
@modal.web_endpoint(method="POST")
def get_text(item: Item):
    return {"prompt": run_gpt2.call(item.prompt)}
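For illustration, this is the shape of the request a client would send to that endpoint. The URL and prompt text below are placeholders; what matters is the structure: a POST whose JSON body matches the Item model, answered with a JSON object whose single prompt key carries the generated text.

```python
import json
import urllib.request

# Hypothetical endpoint URL; substitute your own deployment under modal.run.
endpoint_url = "https://ecorp--custom-llm-endpoint.modal.run"

# The web endpoint expects a POST whose JSON body matches the Item model.
body = json.dumps({"prompt": "Tell me a joke."}).encode("utf-8")
request = urllib.request.Request(
    endpoint_url,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending it with urllib.request.urlopen(request) would return a body like
# {"prompt": "<generated text>"}, per the response structure above.
print(request.get_method())  # → POST
```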

The following is an example with the GPT-2 model:

from pydantic import BaseModel

import modal

CACHE_PATH = "/root/model_cache"

class Item(BaseModel):
    prompt: str

stub = modal.Stub(name="example-get-started-with-langchain")

def download_model():
    from transformers import GPT2Tokenizer, GPT2LMHeadModel
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
    model = GPT2LMHeadModel.from_pretrained('gpt2')
    tokenizer.save_pretrained(CACHE_PATH)
    model.save_pretrained(CACHE_PATH)

# Define a container image for the LLM function below,
# which downloads and stores the GPT-2 model.
image = modal.Image.debian_slim().pip_install(
    "tokenizers", "transformers", "torch", "accelerate"
).run_function(download_model)

@stub.function(
    gpu="any",
    image=image,
    retries=3,
)
def run_gpt2(text: str):
    from transformers import GPT2Tokenizer, GPT2LMHeadModel
    tokenizer = GPT2Tokenizer.from_pretrained(CACHE_PATH)
    model = GPT2LMHeadModel.from_pretrained(CACHE_PATH)
    encoded_input = tokenizer(text, return_tensors='pt').input_ids
    output = model.generate(encoded_input, max_length=50, do_sample=True)
    return tokenizer.decode(output[0], skip_special_tokens=True)

@stub.function()
@modal.web_endpoint(method="POST")
def get_text(item: Item):
    return {"prompt": run_gpt2.call(item.prompt)}

Deploy the web endpoint

Deploy the web endpoint to the Modal cloud with the modal deploy CLI command. Your web endpoint will acquire a persistent URL under the modal.run domain.
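Once deployed, a POST to that persistent URL returns a JSON object with a single prompt key, matching the rigid response structure described earlier. A minimal sketch of parsing such a response (the body here is a hand-written stand-in, not real model output):

```python
import json

# Hand-written example of the JSON body the endpoint returns (not real output).
raw_body = '{"prompt": "Tell me a joke. Why did the chicken cross the road?"}'

response = json.loads(raw_body)
generated_text = response["prompt"]
print(generated_text)
```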

LLM wrapper around Modal web endpoint

The Modal LLM wrapper class accepts your deployed web endpoint's URL.

from langchain.llms import Modal
from langchain import PromptTemplate, LLMChain

template = """Question: {question}

Answer: Let's think step by step."""

prompt = PromptTemplate(template=template, input_variables=["question"])

endpoint_url = "https://ecorp--custom-llm-endpoint.modal.run"  # REPLACE ME with your deployed Modal web endpoint's URL

llm = Modal(endpoint_url=endpoint_url)
llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"

llm_chain.run(question)