
RePhraseQueryRetriever

A simple retriever that applies an LLM between the user input and the query that is passed to the retriever.

It can be used to pre-process the user input in any way.
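
Conceptually (a rough sketch, not the library's actual source), the retriever first asks an LLM chain to rewrite the raw user input and then forwards the rewritten query to the wrapped retriever:

# Rough conceptual sketch of the flow (not the library implementation):
# the LLM chain rewrites the raw user input, and the rewritten query is
# what the wrapped retriever actually sees.
def rephrase_then_retrieve(user_input, llm_chain, base_retriever):
    re_phrased = llm_chain.run(user_input)  # LLM strips irrelevant details
    return base_retriever.get_relevant_documents(re_phrased)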

The default prompt used in the from_llm class method:

DEFAULT_TEMPLATE = """You are an assistant tasked with taking a natural language \
query from a user and converting it into a query for a vectorstore. \
In this process, you strip out information that is not relevant for \
the retrieval task. Here is the user query: {question}"""

Create a vector store.

from langchain.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)

from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings

vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
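
Optionally (not part of the original example), you can sanity-check the vector store with a plain similarity search before wrapping it; this assumes an OpenAI API key is available for the embeddings:

# Optional sanity check (assumes OPENAI_API_KEY is set): confirm the vector
# store returns document chunks for a plain similarity search.
sample_docs = vectorstore.similarity_search("approaches to task decomposition", k=2)
print(len(sample_docs))
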
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.re_phraser").setLevel(logging.INFO)
from langchain.chat_models import ChatOpenAI
from langchain.retrievers import RePhraseQueryRetriever

Using the default prompt

llm = ChatOpenAI(temperature=0)
retriever_from_llm = RePhraseQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(), llm=llm
)
docs = retriever_from_llm.get_relevant_documents(
    "Hi I'm Lance. What are the approaches to Task Decomposition?"
)
    INFO:langchain.retrievers.re_phraser:Re-phrased question: The user query can be converted into the following query for the vectorstore:

    "approaches to Task Decomposition"
docs = retriever_from_llm.get_relevant_documents(
    "I live in San Francisco. What are the Types of Memory?"
)
    INFO:langchain.retrievers.re_phraser:Re-phrased question: Query for the vectorstore: "Types of Memory"

Supplying a custom prompt

from langchain import LLMChain
from langchain.prompts import PromptTemplate

QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an assistant tasked with taking a natural language query from a user
and converting it into a query for a vectorstore. In the process, strip out all
information that is not relevant for the retrieval task and return a new, simplified
question for vectorstore retrieval. The new user query should be in pirate speech.
Here is the user query: {question} """,
)
llm = ChatOpenAI(temperature=0)
llm_chain = LLMChain(llm=llm, prompt=QUERY_PROMPT)
retriever_from_llm_chain = RePhraseQueryRetriever(
    retriever=vectorstore.as_retriever(), llm_chain=llm_chain
)
docs = retriever_from_llm_chain.get_relevant_documents(
    "Hi I'm Lance. What is Maximum Inner Product Search?"
)
    INFO:langchain.retrievers.re_phraser:Re-phrased question: Ahoy matey! What be Maximum Inner Product Search, ye scurvy dog?
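
As a follow-up (a minimal sketch, not part of the original notebook), the re-phrasing retriever can be plugged into a retrieval QA chain like any other retriever:

from langchain.chains import RetrievalQA

# Minimal sketch: use the re-phrasing retriever as a drop-in retriever for a
# question-answering chain (assumes the objects defined above are in scope).
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever_from_llm_chain)
# qa.run("Hi I'm Lance. What is Maximum Inner Product Search?")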