Multiple retrieval sources

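You will often want to retrieve from more than one source. The sources can be different vector stores (one holding information about topic X, another about topic Y), or they can be entirely different kinds of database.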

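The key is to run as much of the retrieval in parallel as possible, which keeps latency low: the total wait is then close to that of the slowest source rather than the sum of all of them. Fortunately, LangChain Expression Language supports parallel execution out of the box.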
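To make the latency argument concrete, here is a minimal sketch in plain Python, independent of LangChain. The two source functions are hypothetical stand-ins that only sleep to simulate I/O, and the thread pool is just one way to run them concurrently; it is not a description of how LangChain implements parallelism internally.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def query_sql_source(question: str) -> str:
    # Hypothetical stand-in for a SQL lookup; the sleep simulates I/O latency.
    time.sleep(0.2)
    return f"sql result for: {question}"


def query_vector_source(question: str) -> str:
    # Hypothetical stand-in for a vector store lookup.
    time.sleep(0.2)
    return f"vector result for: {question}"


def retrieve_in_parallel(question: str) -> dict:
    # Submit both lookups before waiting on either, so their waits overlap.
    with ThreadPoolExecutor(max_workers=2) as pool:
        sql_future = pool.submit(query_sql_source, question)
        vector_future = pool.submit(query_vector_source, question)
        return {
            "source1": sql_future.result(),
            "source2": vector_future.result(),
        }


start = time.perf_counter()
results = retrieve_in_parallel("How many employees are there?")
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

Run one after the other, the two simulated lookups would take about 0.4 seconds in total; run concurrently, the elapsed time should be close to the 0.2 seconds of a single lookup.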

Let's look at how to retrieve from a SQL database and a vector store.

from langchain.chat_models import ChatOpenAI

API Reference:

ChatOpenAI from langchain.chat_models

Set up the SQL query

from langchain.utilities import SQLDatabase
from langchain.chains import create_sql_query_chain

db = SQLDatabase.from_uri("sqlite:///../../../../../notebooks/Chinook.db")
query_chain = create_sql_query_chain(ChatOpenAI(temperature=0), db)

API Reference:

SQLDatabase from langchain.utilities
create_sql_query_chain from langchain.chains

Set up the vector store

from langchain.indexes import VectorstoreIndexCreator
from langchain.schema.document import Document

index_creator = VectorstoreIndexCreator()
index = index_creator.from_documents([Document(page_content="Foo")])
retriever = index.vectorstore.as_retriever()

API Reference:

VectorstoreIndexCreator from langchain.indexes
Document from langchain.schema.document

Combine

from langchain.prompts import ChatPromptTemplate

system_message = """Use the information from the below two sources to answer any questions.

Source 1: a SQL database about employee data
<source1>
{source1}
</source1>

Source 2: a text database of random information
<source2>
{source2}
</source2>
"""

prompt = ChatPromptTemplate.from_messages([("system", system_message), ("human", "{question}")])

API Reference:

ChatPromptTemplate from langchain.prompts

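# Each key in the dict is a separate branch that receives the same input.
full_chain = {
    # Generate a SQL query from the question, then run it against the database.
    "source1": {"question": lambda x: x["question"]} | query_chain | db.run,
    # Look up documents related to the question in the vector store.
    "source2": (lambda x: x["question"]) | retriever,
    # Pass the original question through unchanged.
    "question": lambda x: x["question"],
} | prompt | ChatOpenAI()

response = full_chain.invoke({"question": "How many Employees are there"})
print(response)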

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1

content='There are 8 employees.' additional_kwargs={} example=False
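The shape of the data flowing through the chain can be easier to see without any model or database calls. The sketch below is plain Python with made-up pieces: `fan_out`, `SYSTEM_TEMPLATE`, and the placeholder branch results are all hypothetical and are not LangChain APIs. It shows only that every branch receives the same input and that the branch results are collected under the keys the prompt expects.

```python
from typing import Any, Callable, Dict


def fan_out(branches: Dict[str, Callable[[dict], Any]], inputs: dict) -> dict:
    # Call every branch with the same inputs and collect results by key.
    # This sketch runs the branches one after another for simplicity.
    return {name: branch(inputs) for name, branch in branches.items()}


# Simplified, hypothetical stand-in for the prompt template.
SYSTEM_TEMPLATE = "Source 1: {source1}\nSource 2: {source2}\nQuestion: {question}"

branches = {
    # Placeholder value standing in for the string returned by a SQL query.
    "source1": lambda x: "[(8,)]",
    # Placeholder value standing in for the retrieved document contents.
    "source2": lambda x: ["Foo"],
    # The question is passed through unchanged.
    "question": lambda x: x["question"],
}

prompt_inputs = fan_out(branches, {"question": "How many Employees are there"})
prompt_text = SYSTEM_TEMPLATE.format(**prompt_inputs)
print(prompt_text)
```

The resulting dictionary has exactly the three keys that the prompt template references, which is why the chain above can pipe it straight into the prompt.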
