Hugging Face Local Pipelines
Hugging Face models can be run locally through the HuggingFacePipeline class.
The Hugging Face Model Hub hosts over 120,000 models, 20,000 datasets, and 50,000 demo apps (Spaces), all open source and publicly available, on an online platform where people can easily collaborate and build ML together.
These models can be called from LangChain either through this local pipeline wrapper or by calling their hosted inference endpoints through the HuggingFaceHub class. For more information on the hosted pipelines, see the HuggingFaceHub notebook.
To use this, you should have the transformers Python package installed.
pip install transformers > /dev/null
Load the model
from langchain import HuggingFacePipeline
llm = HuggingFacePipeline.from_model_id(
model_id="bigscience/bloom-1b7",
task="text-generation",
model_kwargs={"temperature": 0, "max_length": 64},
)
Integrate the model in an LLMChain
from langchain import PromptTemplate, LLMChain
template = """问题:{question}
回答:让我们逐步思考。"""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "What is electroencephalography?"
print(llm_chain.run(question))
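Conceptually, LLMChain.run fills the placeholders in the prompt template and passes the rendered prompt to the model. A minimal sketch of that flow in plain Python, using a hypothetical `fake_llm` stand-in for the HuggingFacePipeline above (the template and question are the ones from this example):

```python
# Sketch of what LLMChain.run does under the hood:
# 1) fill the prompt template, 2) send the rendered prompt to the LLM.
template = """Question: {question}
Answer: Let's think step by step."""

def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in: a real pipeline would generate a
    # continuation of `prompt` with the local model.
    return "First, we need to understand what an EEG is."

def run_chain(question: str) -> str:
    rendered = template.format(question=question)  # prompt formatting step
    return fake_llm(rendered)                      # model call step

print(run_chain("What is electroencephalography?"))
```

The real chain behaves the same way, except the model call goes through the locally loaded bigscience/bloom-1b7 pipeline.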
First, we need to understand what an electroencephalogram (EEG) is. An EEG is a method of recording brain activity. It records brain activity by placing electrodes on the scalp. The electrodes are placed on