
Tair

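Tair is a cloud-native in-memory database service developed by Alibaba Cloud. It provides rich data models and enterprise-grade capabilities to support real-time online scenarios while remaining fully compatible with open-source Redis. Tair also offers persistent memory-optimized instances built on non-volatile memory (NVM) storage media.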

This notebook shows how to use the Tair vector database functionality.

To run it, you need a running Tair instance.

from langchain.embeddings.fake import FakeEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Tair
from langchain.document_loaders import TextLoader

loader = TextLoader("../../../state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
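The chunking step above can be pictured with a small standard-library sketch. It assumes the splitter breaks the text on blank lines and greedily merges pieces until the chunk_size limit would be exceeded; `split_text` is a hypothetical helper written for illustration, not the `CharacterTextSplitter` implementation.

```python
def split_text(text, chunk_size=1000, separator="\n\n"):
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size characters.

    Hypothetical illustration only. A single piece longer than chunk_size is kept whole.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Still fits (or nothing accumulated yet): keep growing the chunk.
            current = candidate
        else:
            # Adding this piece would overflow: close the chunk and start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample_chunks = split_text("aaaa\n\nbbbb\n\ncc", chunk_size=10)
print(sample_chunks)
```

With chunk_overlap=0, as in the notebook, consecutive chunks share no text, which is why the sketch carries nothing over between chunks.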

embeddings = FakeEmbeddings(size=128)
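FakeEmbeddings is a placeholder that returns random vectors of the requested size, which is enough to exercise the storage and query path without calling a real embedding model. The sketch below mimics that behavior with the standard library; `RandomEmbeddings` and its `seed` parameter are hypothetical names used for illustration, not part of LangChain.

```python
import random


class RandomEmbeddings:
    """Hypothetical stand-in: returns random vectors of a fixed size."""

    def __init__(self, size, seed=None):
        self.size = size
        self._rng = random.Random(seed)

    def embed_query(self, text):
        # The text is ignored: the vector carries no semantic information.
        return [self._rng.gauss(0.0, 1.0) for _ in range(self.size)]

    def embed_documents(self, texts):
        return [self.embed_query(t) for t in texts]


vectors = RandomEmbeddings(size=128, seed=0).embed_documents(["a", "b"])
print(len(vectors), len(vectors[0]))  # prints "2 128"
```

For meaningful search results, a real embedding model would take the place of the placeholder.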

Connect to Tair using the TAIR_URL environment variable:

export TAIR_URL="redis://{username}:{password}@{tair_address}:{tair_port}"

Alternatively, pass the connection URL through the tair_url keyword argument.
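How the two options relate can be sketched as follows, assuming an explicitly passed URL takes priority over the environment variable. `resolve_tair_url` is a hypothetical helper for illustration, not a LangChain function.

```python
import os


def resolve_tair_url(tair_url=None, env_key="TAIR_URL"):
    """Return the explicit URL if given, otherwise fall back to the environment variable."""
    url = tair_url or os.environ.get(env_key)
    if not url:
        raise ValueError(f"Set {env_key} or pass tair_url explicitly.")
    return url


# Placeholder value for illustration only.
os.environ["TAIR_URL"] = "redis://user:secret@tair.example.com:6379"
print(resolve_tair_url())                          # value from the environment
print(resolve_tair_url("redis://localhost:6379"))  # explicit argument wins
```

Keeping credentials in the environment variable avoids hard-coding them in the notebook.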

Then store the documents and their embeddings in Tair.

tair_url = "redis://localhost:6379"

# Drop the index first if it already exists
Tair.drop_index(tair_url=tair_url)

vector_store = Tair.from_documents(docs, embeddings, tair_url=tair_url)

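Query for similar documents. Because FakeEmbeddings produces random vectors, the document returned here is not expected to be semantically related to the query.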

query = "What did the president say about Ketanji Brown Jackson"
docs = vector_store.similarity_search(query)
docs[0]
    Document(page_content='We’re going after the criminals who stole billions in relief money meant for small businesses and millions of Americans.  \n\nAnd tonight, I’m announcing that the Justice Department will name a chief prosecutor for pandemic fraud. \n\nBy the end of this year, the deficit will be down to less than half what it was before I took office.  \n\nThe only president ever to cut the deficit by more than one trillion dollars in a single year. \n\nLowering your costs also means demanding more competition. \n\nI’m a capitalist, but capitalism without competition isn’t capitalism. \n\nIt’s exploitation—and it drives up prices. \n\nWhen corporations don’t have to compete, their profits go up, your prices go up, and small businesses and family farmers and ranchers go under. \n\nWe see it happening with ocean carriers moving goods in and out of America. \n\nDuring the pandemic, these foreign-owned companies raised prices by as much as 1,000% and made record profits.', metadata={'source': '../../../state_of_the_union.txt'})
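Conceptually, a similarity search embeds the query and returns the stored documents whose vectors are closest to it. The sketch below shows that ranking in plain Python, using cosine similarity as an assumed metric. Tair runs the search on the server with its own index and configured distance type, so `cosine_similarity` and `top_k` are hypothetical helpers that illustrate the idea only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k document vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
best = top_k([1.0, 0.1], doc_vecs, k=2)
print(best)  # prints "[0, 2]"
```

A brute-force scan like this is linear in the number of documents; a vector database uses an index to avoid comparing the query against every stored vector.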