2024-09-19发表2025-03-21更新 ByteAILab 11 分钟读完 (大约1593个字)

OpenAI o1式思维链，开源模型也可以有，成功案例来了

北京时间 9 月 13 日午夜，OpenAI 发布了推理性能强大的 ο1 系列模型。之后，各路研究者一直在尝试挖掘 ο1 卓越性能背后的技术并尝试复现它。

当然，OpenAI 也想了一些方法来抑制窥探，比如有多名用户声称曾试图诱导 ο1 模型公布其思维过程，然后收到了 OpenAI 的封号威胁。

尽管如此，不过三四天时间，就已经有研究者宣称已经成功复现/开发出了与 ο1 性能差不多的推理技术，并且还不止一个！

Llamaberry：教会 AI 像聪明人一样思考

Llamaberry 的提出者是 Martin Bowling。他开发的项目包括 RAGMiner.dev 和 Replit；其中前者可以帮助用户毫不费力地将网站转换成 Markdown、XML 或 JSON 等格式以便 RAG 和 LLM 应用使用，而后者则是一个使用 AI 将想法变成代码的项目。

Llamaberry 的核心思路是使用思维链（CoT）来实现推理。这个名字自然源自代表 o1 模型的 Strawberry（草莓）。

HuggingFace地址：https://huggingface.co/spaces/martinbowling/Llamaberry

什么是思维链？Bowling 在博客中打了个比方：「思维链推理就像是给 AI 一个笔记本来展示其工作过程。其中不仅仅是简单地给出答案，而是会带领我们经历其思维过程。」

Llamaberry 能教会 AI 透彻地思考，就像是一位人类专家攻克难题时那样。

具体来说，Llamaberry 是一个多轮思维链推理系统的实现，其基于运行在 Groq 上的 Llama 3.1 70B 模型。

多轮推理是关键

多轮推理，顾名思义，就是让模型在给出答案之前进行多步思考，而不是一步给出答案。打个比方，这就像是看一位大厨从备菜到完成摆盘一步步地完成一道精美菜肴，而不是直接微波加热预制菜。

举个示例：

第 1 轮：AI 先尝试解决当前问题。
第 2 轮：AI 回顾第一次尝试并尽力改进或优化其思维过程。
第 3 轮：再进行一轮反思和改进。
综合结果：最后，将所有这些思考综合到一起，得到一个连贯且合理的答案。

下面展示了这个多轮过程的示意图：

可以看到，前一轮的输出会成为后一轮的输入，从而让 AI 可在每个阶段不断完善其思维。最后，所有这些…

如何实现

下面将深入 Llamaberry 的实现细节。

1.设置舞台

首先，我们需要为 AI 助手设置一些基本规则，代码如下：

initial_system_prompt = “””You are an AI assistant capable of detailed, step-by-step thinking. When presented with a question or problem, break down your thought process into clear, logical steps. For each step, explain your reasoning. Conclude with a final answer. Use the following markdown structure:

Reasoning

[First step]
Explanation: [Detailed explanation of this step]
[Second step]
Explanation: [Detailed explanation of this step]
…

Answer

[Final answer]

Be comprehensive and show your reasoning clearly.”””

这就是提供给 AI 大厨的菜谱。它知道需要逐步…

2.思考过程

在每一轮推理中，都需要让 AI 对问题进行思考。但在第一轮结束后，还需要求它思考之前已经思考过的东西。这就像问朋友，「嘿，还记得你之前说过什么吗？让我们再想一想。」

下面是每一轮的生成方式：

async def generate_turn(query: str, previous_turns: list = None) -> str:
is_first_turn = previous_turns is None or len(previous_turns) == 0
if is_first_turn:
messages = [{
“role”: “system”,
“content”: initial_system_prompt
…

综合结果

经过三轮思考之后，再让 AI 检视其所有思考并得出一个合理的最终答案。这就像是让专家讨论问题，然后编写最终报告。

async def synthesize_turns(query: str, turns: list) -> str:
turns_text = “\n\n”.join(
[f”Turn {i+1}:\n{turn}” for i, turn in enumerate(turns)])
messages = [{
“role”: “system”,
“content”: synthesis_prompt
}, {
“role”:
“user”,
“content”:
f”Original Query: {query}\n\nTurns of Reasoning:\n{turns_text}”
…
单轮推理 vs 多轮推理

Martin Bowling 在其博客中简单比较了单轮和多轮推理。

这里他采用了一个简单示例，即询问 AI：「人工智能对就业市场有什么潜在影响？」

单轮推理：

AI: Artificial intelligence will likely have significant impacts on the job market, including:

Automation of routine tasks
Creation of new job roles in AI development and maintenance
Potential job displacement in certain industries
…

使用 Llamaberry 执行多轮推理：

第 1 轮：

AI: Let’s approach this step-by-step:

Identify current AI capabilities
Explanation: AI can perform data analysis, pattern recognition, and automate routine tasks.
Analyze vulnerable job sectors
Explanation: Manufacturing, customer service, and data entry are likely to be affected.
Consider new job opportunities
Explanation: AI development, data science, and AI ethics consultation may see growth.
Examine potential economic effects
Explanation: Increased productivity might lead to economic growth, but wealth distribution could be a challenge.

Initial Answer: AI will likely cause significant shifts in the job market, with both job losses in some sectors and new opportunities in others.

第 2 轮：

AI: Let’s refine our thinking:

Reassess the timeline of AI impact
Explanation: The effects won’t be immediate but will unfold over years or decades.
Consider the adaptability of the workforce
Explanation: Retraining and education will be crucial for workers to transition to new roles.
Analyze potential societal changes
Explanation: The nature of work itself might change, with more emphasis on creativity and emotional intelligence.
Examine global differences
Explanation: AI’s impact will vary across different countries and economies.

Refined Answer: AI’s impact on the job market will be complex and gradual, requiring proactive measures in education and economic policy to ensure a smooth transition.

第 3 轮：

AI: Let’s delve deeper:

Explore the concept of human-AI collaboration
Explanation: Many jobs may evolve to involve working alongside AI rather than being replaced by it.
Consider the ethical implications
Explanation: Issues like algorithmic bias in hiring processes need to be addressed.
Analyze the potential for new economic models
Explanation: Universal Basic Income or reduced working hours might become more prominent discussions.
Examine historical parallels
Explanation: Previous technological revolutions, like the Industrial Revolution, can offer insights into potential outcomes.

Further Refined Answer: While AI will undoubtedly transform the job market, its impact will likely be nuanced, involving collaboration between humans and AI, ethical considerations, and potentially new economic paradigms.

综合答案：
…
。注意：Title、Date、Body 三个部分的内容，放入到对应的位置。最后只需要按照格式标准输出为Makedown源文件格式内容。

感谢阅读！如果您对AI的更多资讯感兴趣，可以查看更多AI文章：GPTNB。

OpenAI o1式思维链，开源模型也可以有，成功案例来了

https://www.gptnb.com/2024/09/19/2024-09-18-auto5-USiGg9/

作者

ByteAILab

发布于

2024-09-19

更新于

2025-03-21

OpenAI o1式思维链，开源模型也可以有，成功案例来了

Reasoning

Answer

综合答案：
…
。注意：Title、Date、Body 三个部分的内容，放入到对应的位置。最后只需要按照格式标准输出为Makedown源文件格式内容。

作者

发布于

更新于

许可协议

喜欢这篇文章？打赏一下作者吧

链接

分类

最新文章

归档

标签

订阅更新

OpenAI o1式思维链，开源模型也可以有，成功案例来了

Reasoning

Answer

综合答案：…。注意：Title、Date、Body 三个部分的内容，放入到对应的位置。最后只需要按照格式标准输出为Makedown源文件格式内容。

作者

发布于

更新于

许可协议

喜欢这篇文章？打赏一下作者吧

链接

分类

最新文章

归档

标签

订阅更新

综合答案：
…
。注意：Title、Date、Body 三个部分的内容，放入到对应的位置。最后只需要按照格式标准输出为Makedown源文件格式内容。