Store and reference chat history
The ConversationalRetrievalQA chain builds on RetrievalQAChain to provide a chat history component.
It first combines the chat history (either explicitly passed in or retrieved from the provided memory) and the question into a standalone question, then looks up relevant documents from the retriever, and finally passes those documents and the question to a question-answering chain to return a response.
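Conceptually, a single turn through the chain looks roughly like this (an illustrative sketch of the flow just described, not the actual implementation):
# 1. Condense the chat history and the new question into a standalone question
# standalone_question = condense_llm(chat_history, question)
# 2. Retrieve documents relevant to the standalone question
# docs = retriever.get_relevant_documents(standalone_question)
# 3. Answer the question over the retrieved documents
# answer = qa_chain(docs, standalone_question)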
To create one, you need a retriever. In the example below, we will create a retriever from a vector store, which in turn can be created from embeddings.
from langchain.embeddings.openai import OpenAIEmbeddings  
from langchain.vectorstores import Chroma  
from langchain.text_splitter import CharacterTextSplitter  
from langchain.llms import OpenAI  
from langchain.chains import ConversationalRetrievalChain  
Load in documents. You can replace this with a loader for whatever type of data you want.
from langchain.document_loaders import TextLoader  
loader = TextLoader("../../state_of_the_union.txt")  
documents = loader.load()  
If you have multiple loaders that you want to combine, you can do something like:
# loaders = [....]  
# docs = []  
# for loader in loaders:  
#     docs.extend(loader.load())  
We now split the documents, create embeddings for them, and put them in a vector store. This allows us to do semantic search over them.
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)  
documents = text_splitter.split_documents(documents)  
      
embeddings = OpenAIEmbeddings()  
vectorstore = Chroma.from_documents(documents, embeddings)  
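As an optional sanity check, you can query the vector store directly; Chroma's similarity_search returns the chunks most relevant to a query:
# Sanity check only: fetch the most similar chunk for a sample query.
docs = vectorstore.similarity_search("What did the president say about Ketanji Brown Jackson")
print(docs[0].page_content[:200])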
We can now create a memory object, which is necessary for tracking the inputs/outputs and holding a conversation.
from langchain.memory import ConversationBufferMemory  
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)  
We now initialize the ConversationalRetrievalChain:
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), memory=memory)  
query = "What did the president say about Ketanji Brown Jackson"  
result = qa({"question": query})  
result["answer"]  
"总统说Ketanji Brown Jackson是全国顶尖的法律专家之一,曾是一位顶级的私人执业律师,曾是一位前联邦公共辩护人,并来自一家公立学校教育工作者和警察的家庭。他还表示,她是一个共识建设者,并得到了从警察协会到民主党和共和党任命的前法官的广泛支持。"
query = "Did he mention who she succeeded"  
result = qa({"question": query})  
result['answer']  
"Ketanji Brown Jackson在美国最高法院继任了Stephen Breyer大法官。"
Pass in chat history
In the above example, we used a Memory object to track the chat history. We can also just pass it in explicitly. In order to do this, we need to initialize a chain without any memory object.
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever())  
Here's an example of asking a question with no chat history:
chat_history = []  
query = "What did the president say about Ketanji Brown Jackson"  
result = qa({"question": query, "chat_history": chat_history})  
result["answer"]  
"总统说Ketanji Brown Jackson是全国顶尖的法律专家之一,曾是一位顶级的私人执业律师,曾是一位前联邦公共辩护人,并来自一家公立学校教育工作者和警察的家庭。他还表示,她是一个共识建设者,并得到了从警察协会到民主党和共和党任命的前法官的广泛支持。"
Here's an example of asking a question with some chat history:
chat_history = [(query, result["answer"])]  
query = "Did he mention who she succeeded"  
result = qa({"question": query, "chat_history": chat_history})  
result['answer']  
"Ketanji Brown Jackson在美国最高法院继任了Stephen Breyer大法官。"
Using a different model for condensing the question
This chain has two steps. First, it condenses the current question and the chat history into a standalone question. This is necessary to create a standalone vector to use for retrieval. After that, it does retrieval and then answers the question using retrieval-augmented generation with a separate model. Part of the power of LangChain's declarative nature is that you can easily use a separate language model for each call. This can be useful for using a cheaper and faster model for the simpler task of condensing the question, and then a more expensive model for answering the question. Here is an example of doing so.
from langchain.chat_models import ChatOpenAI  
qa = ConversationalRetrievalChain.from_llm(  
    ChatOpenAI(temperature=0, model="gpt-4"),  
    vectorstore.as_retriever(),  
    condense_question_llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo"),
)  
chat_history = []  
query = "What did the president say about Ketanji Brown Jackson"  
result = qa({"question": query, "chat_history": chat_history})  
chat_history = [(query, result["answer"])]  
query = "Did he mention who she succeeded"  
result = qa({"question": query, "chat_history": chat_history})  
Using a custom prompt for condensing the question
By default, ConversationalRetrievalQA uses CONDENSE_QUESTION_PROMPT to condense the question. Here is its implementation:
from langchain.prompts.prompt import PromptTemplate  
_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.  
Chat History:  
{chat_history}  
Follow Up Input: {question}  
Standalone question:"""  
CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(_template)  
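To see exactly what the condensing model receives, you can format the prompt yourself (the chat history string below is made up for illustration):
print(CONDENSE_QUESTION_PROMPT.format(
    chat_history="Human: What did the president say about Ketanji Brown Jackson\nAssistant: He praised her qualifications.",
    question="Did he mention who she succeeded",
))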
However, you can use any custom template to further augment the information in the question, or to instruct the LLM to do something. Here is an example:
from langchain.prompts.prompt import PromptTemplate  
custom_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question. At the end of standalone question add this 'Answer the question in German language.' If you do not know the answer reply with 'I am sorry'.  
Chat History:  
{chat_history}  
Follow Up Input: {question}  
Standalone question:"""  
CUSTOM_QUESTION_PROMPT = PromptTemplate.from_template(custom_template)  
model = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.3)  
embeddings = OpenAIEmbeddings()  
directory = "./chroma_db"  # hypothetical path to a persisted Chroma database; adjust to your setup
vectordb = Chroma(embedding_function=embeddings, persist_directory=directory)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)  
qa = ConversationalRetrievalChain.from_llm(  
    model,  
    vectordb.as_retriever(),  
    condense_question_prompt=CUSTOM_QUESTION_PROMPT,  
    memory=memory  
)  
query = "What did the president say about Ketanji Brown Jackson"  
result = qa({"question": query})  
query = "Did he mention who she succeeded"  
result = qa({"question": query})  
Return source documents
You can also easily return source documents from the ConversationalRetrievalChain. This is useful when you want to inspect which documents were returned.
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), return_source_documents=True)  
chat_history = []  
query = "What did the president say about Ketanji Brown Jackson"  
result = qa({"question": query, "chat_history": chat_history})  
result['source_documents'][0]  
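Each entry in source_documents is a Document with page_content and metadata, so you can, for example, list where each retrieved chunk came from:
# Print the source and a short preview of each retrieved chunk.
for doc in result["source_documents"]:
    print(doc.metadata.get("source"), doc.page_content[:100])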
ConversationalRetrievalChain with search_distance
If you are using a vector store that supports filtering by search distance, you can add a threshold value parameter.
vectordbkwargs = {"search_distance": 0.9}  
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), return_source_documents=True)  
chat_history = []  
query = "What did the president say about Ketanji Brown Jackson"  
result = qa({"question": query, "chat_history": chat_history, "vectordbkwargs": vectordbkwargs})  
ConversationalRetrievalChain with map_reduce
We can also use different types of combine-documents chains with the ConversationalRetrievalChain.
from langchain.chains import LLMChain  
from langchain.chains.question_answering import load_qa_chain  
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT  
llm = OpenAI(temperature=0)  
question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)  
doc_chain = load_qa_chain(llm, chain_type="map_reduce")  
      
chain = ConversationalRetrievalChain(  
    retriever=vectorstore.as_retriever(),  
    question_generator=question_generator,  
    combine_docs_chain=doc_chain,  
)  
chat_history = []  
query = "What did the president say about Ketanji Brown Jackson"  
result = chain({"question": query, "chat_history": chat_history})  
result['answer']  
"总统说Ketanji Brown Jackson是全国顶尖的法律专家之一,曾是一位顶级的私人执业律师,曾是一位前联邦公共辩护人,并来自一家公立学校教育工作者和警察的家庭。他还表示,她是一个共识建设者,并得到了从警察协会到民主党和共和党任命的前法官的广泛支持。"
ConversationalRetrievalChain with Question Answering with sources
You can also use this chain with the question answering with sources chain.
from langchain.chains.qa_with_sources import load_qa_with_sources_chain  
llm = OpenAI(temperature=0)  
question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)  
doc_chain = load_qa_with_sources_chain(llm, chain_type="map_reduce")  
      
chain = ConversationalRetrievalChain(  
    retriever=vectorstore.as_retriever(),  
    question_generator=question_generator,  
    combine_docs_chain=doc_chain,  
)  
chat_history = []  
query = "What did the president say about Ketanji Brown Jackson"  
result = chain({"question": query, "chat_history": chat_history})  
result['answer']  
"总统说Ketanji Brown Jackson是全国顶尖的法律专家之一,曾是一位顶级的私人执业律师,曾是一位前联邦公共辩护人,并来自一家公立学校教育工作者和警察的家庭。他还表示,她是一个共识建设者,并得到了从警察协会到民主党和共和党任命的前法官的广泛支持。 \nSOURCES: ../../state_of_the_union.txt"  
ConversationalRetrievalChain with streaming to stdout
In this example, the output from the chain will be streamed to stdout token by token.
from langchain.chains.llm import LLMChain  
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler  
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT, QA_PROMPT  
from langchain.chains.question_answering import load_qa_chain  
llm = OpenAI(temperature=0)  
streaming_llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)  
      
question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)  
doc_chain = load_qa_chain(streaming_llm, chain_type="stuff", prompt=QA_PROMPT)  
      
qa = ConversationalRetrievalChain(
    retriever=vectorstore.as_retriever(),
    combine_docs_chain=doc_chain,
    question_generator=question_generator,
)
chat_history = []  
query = "What did the president say about Ketanji Brown Jackson"  
result = qa({"question": query, "chat_history": chat_history})  
chat_history = [(query, result["answer"])]  
query = "Did he mention who she succeeded"  
result = qa({"question": query, "chat_history": chat_history})  
The get_chat_history function
You can also specify a get_chat_history function, which can be used to format the chat_history string.
def get_chat_history(inputs) -> str:  
    res = []  
    for human, ai in inputs:  
        res.append(f"Human:{human}\nAI:{ai}")  
    return "\n".join(res)  
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), get_chat_history=get_chat_history)  
chat_history = []  
query = "What did the president say about Ketanji Brown Jackson"  
result = qa({"question": query, "chat_history": chat_history})  
result['answer']  
"总统说Ketanji Brown Jackson是全国顶尖的法律专家之一,曾是一位顶级的私人执业律师,曾是一位前联邦公共辩护人,并来自一家公立学校教育工作者和警察的家庭。他还表示,她是一个共识建设者,并得到了从警察协会到民主党和共和党任命的前法官的广泛支持。"
