
Ollama

Ollama lets you run open-source large language models, such as Llama 2, locally.

Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.

It optimizes setup and configuration details, including GPU usage.

For a complete list of supported models and model variants, see the Ollama model library.

Setup

First, follow these instructions to set up and run a local Ollama instance:

  • Download
  • Fetch a model, e.g., for Llama-7b: ollama pull llama2
  • Run ollama run llama2 (a quick reachability check is sketched below)
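
To confirm the server is up before wiring it into LangChain, you can query Ollama's REST API directly. This is a minimal sketch, assuming the default port 11434, that the requests package is installed, and that the /api/tags endpoint (which lists locally pulled models) is available:

import requests

# Ask the local Ollama server which models have been pulled.
# A successful response confirms the server is reachable on the default port.
resp = requests.get("http://localhost:11434/api/tags")
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])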

Usage

You can see the full list of supported parameters on the API reference page.

from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = Ollama(
    base_url="http://localhost:11434",
    model="llama2",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)
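
Only base_url and model are shown above, but the wrapper accepts additional generation parameters; the API reference page is the authoritative list. As a hedged sketch (temperature, top_p, and num_ctx are assumed here to be among the supported parameters), the same constructor can also tune sampling:

llm = Ollama(
    base_url="http://localhost:11434",
    model="llama2",
    temperature=0.1,  # assumed parameter: lower values make output more deterministic
    top_p=0.9,        # assumed parameter: nucleus sampling cutoff
    num_ctx=2048,     # assumed parameter: context window forwarded to Ollama
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)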

With StreamingStdOutCallbackHandler, you will see the tokens streamed to stdout.

llm("Tell me about the history of AI")
    
Great! The history of Artificial Intelligence (AI) is a fascinating and complex topic that spans several decades. Here's a brief overview:

1. Early Years (1950s-1960s): The term "Artificial Intelligence" was coined in 1956 by computer scientist John McCarthy. However, the concept of AI dates back to ancient Greece, where mythical creatures like Talos and Hephaestus were created to perform tasks without any human intervention. In the 1950s and 1960s, researchers began exploring ways to replicate human intelligence using computers, leading to the development of simple AI programs like ELIZA (1966) and PARRY (1972).
2. Rule-Based Systems (1970s-1980s): As computing power increased, researchers developed rule-based systems, such as Mycin (1976), which could diagnose medical conditions based on a set of rules. This period also saw the rise of expert systems, like EDICT (1985), which mimicked human experts in specific domains.
3. Machine Learning (1990s-2000s): With the advent of big data and machine learning algorithms, AI evolved to include neural networks, decision trees, and other techniques for training models on large datasets. This led to the development of applications like speech recognition (e.g., Siri, Alexa), image recognition (e.g., Google Image Search), and natural language processing (e.g., chatbots).
4. Deep Learning (2010s-present): The rise of deep learning techniques, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), has enabled AI to perform complex tasks like image and speech recognition, natural language processing, and even autonomous driving. Companies like Google, Facebook, and Baidu have invested heavily in deep learning research, leading to breakthroughs in areas like facial recognition, object detection, and machine translation.
5. Current Trends (present-future): AI is currently being applied to various industries, including healthcare, finance, education, and entertainment. With the growth of cloud computing, edge AI, and autonomous systems, we can expect to see more sophisticated AI applications in the near future. However, there are also concerns about the ethical implications of AI, such as data privacy, algorithmic bias, and job displacement.

Remember, AI has a long history, and its development is an ongoing process. As technology advances, we can expect to see even more innovative applications of AI in various fields.




'\nGreat! The history of Artificial Intelligence (AI) is a fascinating and complex topic that spans several decades. Here\'s a brief overview:\n\n1. Early Years (1950s-1960s): The term "Artificial Intelligence" was coined in 1956 by computer scientist John McCarthy. However, the concept of AI dates back to ancient Greece, where mythical creatures like Talos and Hephaestus were created to perform tasks without any human intervention. In the 1950s and 1960s, researchers began exploring ways to replicate human intelligence using computers, leading to the development of simple AI programs like ELIZA (1966) and PARRY (1972).\n2. Rule-Based Systems (1970s-1980s): As computing power increased, researchers developed rule-based systems, such as Mycin (1976), which could diagnose medical conditions based on a set of rules. This period also saw the rise of expert systems, like EDICT (1985), which mimicked human experts in specific domains.\n3. Machine Learning (1990s-2000s): With the advent of big data and machine learning algorithms, AI evolved to include neural networks, decision trees, and other techniques for training models on large datasets. This led to the development of applications like speech recognition (e.g., Siri, Alexa), image recognition (e.g., Google Image Search), and natural language processing (e.g., chatbots).\n4. Deep Learning (2010s-present): The rise of deep learning techniques, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), has enabled AI to perform complex tasks like image and speech recognition, natural language processing, and even autonomous driving. Companies like Google, Facebook, and Baidu have invested heavily in deep learning research, leading to breakthroughs in areas like facial recognition, object detection, and machine translation.\n5. Current Trends (present-future): AI is currently being applied to various industries, including healthcare, finance, education, and entertainment. With the growth of cloud computing, edge AI, and autonomous systems, we can expect to see more sophisticated AI applications in the near future. However, there are also concerns about the ethical implications of AI, such as data privacy, algorithmic bias, and job displacement.\n\nRemember, AI has a long history, and its development is an ongoing process. As technology advances, we can expect to see even more innovative applications of AI in various fields.'

RAG

We can use Ollama with RAG, just as shown here.

Let's use the 13b model:

ollama pull llama2:13b
ollama run llama2:13b
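
Note that the Python examples below still pass model="llama2"; to run against the 13b weights instead, pass the tag explicitly. A minimal sketch (same wrapper as above, only the model tag changes):

from langchain.llms import Ollama

llm = Ollama(
    base_url="http://localhost:11434",
    model="llama2:13b",  # use the 13b tag pulled above instead of the default llama2
)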

We can also use local embeddings, from GPT4AllEmbeddings, together with Chroma.

pip install gpt4all chromadb

from langchain.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)

from langchain.vectorstores import Chroma
from langchain.embeddings import GPT4AllEmbeddings

vectorstore = Chroma.from_documents(documents=all_splits, embedding=GPT4AllEmbeddings())
    Found model file at  /Users/rlm/.cache/gpt4all/ggml-all-MiniLM-L6-v2-f16.bin
question = "任务分解的方法有哪些?"
docs = vectorstore.similarity_search(question)
len(docs)
    4
from langchain import PromptTemplate

# Prompt
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
{context}
Question: {question}
Helpful Answer:"""
QA_CHAIN_PROMPT = PromptTemplate(
    input_variables=["context", "question"],
    template=template,
)
# LLM
from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = Ollama(
    base_url="http://localhost:11434",
    model="llama2",
    verbose=True,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)
# QA chain
from langchain.chains import RetrievalQA

qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
)
question = "AI代理的任务分解方法有哪些?"
result = qa_chain({"query": question})
    Task decomposition for AI agents can be done in several ways:

1. Using simple prompting for the LLM, such as "Steps for XYZ." or "What are the subgoals for achieving XYZ?"
2. Providing task-specific instructions, e.g., "Write a story outline." for writing a novel.
3. Using human inputs to help the AI agent understand the task and break it down into smaller steps.

You can also get logging for the tokens.

from langchain.schema import LLMResult
from langchain.callbacks.base import BaseCallbackHandler

class GenerationStatisticsCallback(BaseCallbackHandler):
    # Print Ollama's generation statistics (token counts and timings) when a call finishes
    def on_llm_end(self, response: LLMResult, **kwargs) -> None:
        print(response.generations[0][0].generation_info)

callback_manager = CallbackManager([StreamingStdOutCallbackHandler(), GenerationStatisticsCallback()])

llm = Ollama(
    base_url="http://localhost:11434",
    model="llama2",
    verbose=True,
    callback_manager=callback_manager,
)

qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
)

question = "任务分解的方法有哪些?"
result = qa_chain({"query": question})
    Task decomposition can be approached in three ways: (1) using simple prompting like "Steps for XYZ.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions, or (3) with human inputs.{'model': 'llama2', 'created_at': '2023-08-08T04:01:09.005367Z', 'done': True, 'context': [...], 'total_duration': 1364428708, 'load_duration': 1246375, 'sample_count': 62, 'sample_duration': 44859000, 'prompt_eval_count': 1, 'eval_count': 62, 'eval_duration': 1313002000}

eval_count / (eval_duration / 1e9) gives tok/s (eval_duration is reported in nanoseconds).

62 / (1313002000/1000/1000/1000)
    47.22003469910937
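
If you want this throughput computed automatically, a minimal sketch building on the statistics callback above could look like the following (TokensPerSecondCallback is a hypothetical name; it assumes the eval_count and eval_duration fields shown in generation_info above are present):

from langchain.schema import LLMResult
from langchain.callbacks.base import BaseCallbackHandler

class TokensPerSecondCallback(BaseCallbackHandler):
    def on_llm_end(self, response: LLMResult, **kwargs) -> None:
        info = response.generations[0][0].generation_info
        if info and info.get("eval_count") and info.get("eval_duration"):
            # eval_duration is reported in nanoseconds
            print(f"\n{info['eval_count'] / (info['eval_duration'] / 1e9):.2f} tok/s")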