OpenCode 原理解析:开源 AI Coding Agent 的架构与设计OpenCode Deep Dive: Architecture & Design of an Open-Source AI Coding Agent
1 OpenCode 是什么?What Is OpenCode?
如果你用过 Claude Code,你会立刻理解 OpenCode 想做什么:一个能读写代码、执行命令、自主完成任务的 AI Agent。但关键区别是:它不绑定任何一家 LLM 提供商,代码 100% 开源,并且架构上做了有趣的取舍。
OpenCode 由 Anomaly 团队(同时也是 terminal.shop 和 SST 的创建者)打造,提供终端 TUI、桌面应用和 IDE 扩展三种形态。核心定位:
- Provider-agnostic:不绑定任何单一 LLM 提供商,可使用 Claude、OpenAI、Google、本地模型等
- Terminal-first:由 Neovim 用户打造,把 TUI 做到极致
- Client/Server 架构:前端只是客户端之一,API 驱动一切
- 完全可扩展:通过 Agents、Skills、Plugins、MCP、Custom Tools 实现无限组合
截至 2026 年 4 月,OpenCode 在 GitHub 上已获得 141k+ stars,发布了 759 个版本,拥有 11,000+ commits 和 460+ 贡献者。它的增长速度很能说明问题:从 2025 年 6 月首发到 2026 年 4 月突破 141k stars,不到一年。
If you have used Claude Code, you immediately understand what OpenCode is trying to be: an AI Agent that can read and write code, execute commands, and complete tasks autonomously. The key differences are that it is not tied to any single LLM Provider, the codebase is fully open source, and its architecture makes some notably deliberate trade-offs.
OpenCode is built by the Anomaly team—the same group behind terminal.shop and SST. It ships in three main forms: a terminal TUI, a desktop app, and an IDE extension. Its positioning is straightforward:
- Provider-agnostic: not locked to any single LLM vendor; it can use Claude, OpenAI, Google, local models, and more
- Terminal-first: built by Neovim users who push the TUI experience seriously
- Client/Server architecture: the frontend is just one client; the system is fundamentally API-driven
- Fully extensible: Agents, Skills, Plugins, MCP, and Custom Tools can be combined without an obvious ceiling
As of April 2026, OpenCode had accumulated 141k+ stars on GitHub, shipped 759 releases, and passed 11,000+ commits with 460+ contributors. The growth curve is the real signal: it launched in June 2025 and crossed 141k stars by April 2026—in less than a year.
AI Coding Agent 的真正瓶颈不是代码生成能力——各家工具调用的本就是同一批前沿模型,差距在【控制权】。如果不可或缺的开发工具被打包进一份你无法控制的付费订阅,那就近似于把你的终端和目录结构交给第三方。
OpenCode 的核心主张是:Agent 的上限由 Harness(框架)决定,而不是模型本身。新模型每周演进,但框架如果不属于你,你永远是租户而非主人。MIT 开源、本地运行、所有数据在你的机器上——这是一个关于技术自主权的声明。
The real bottleneck in AI Coding Agents is not code generation itself—many tools are calling into the same frontier models anyway. The difference is control. When essential development tooling is bundled into a paid subscription you do not control, it is functionally similar to handing your terminal and directory structure to a third party.
OpenCode’s core argument is that the ceiling of an Agent is set by its Harness, not by the model alone. New models improve every week, but if the surrounding framework does not belong to you, you remain a tenant rather than an owner. MIT licensing, local execution, and keeping all data on your machine are therefore not just implementation details—they are a statement about technical sovereignty.
2 背景:AI Coding 工具演进Background: Evolution of AI Coding Tools
AI Coding 工具的演进可以分为三个阶段:
- Copilot 时代(2021-2023):自动补全,被动辅助。以 GitHub Copilot 为代表,LLM 只负责补全当前光标处的代码。
- Chat 时代(2023-2024):对话式编程。ChatGPT、Cursor Chat 让开发者通过自然语言描述需求,模型生成代码片段。但代码落地仍需人工复制粘贴。
- Agent 时代(2025-):自主执行。AI 不仅生成代码,还能读取文件、执行命令、运行测试、修复 bug——形成完整的 感知-决策-执行 闭环。Claude Code、OpenCode、Aider 是这一代的代表。
OpenCode 诞生于第三阶段,它的核心洞察是:Agent 能力 = LLM 推理 + 工具调用 + 环境感知。LLM 提供商会不断演进,模型差距会逐渐缩小,价格会持续下降——因此把自己绑定在单一提供商上是短视的。
这些数字的意义不只是“谁更火”,而是市场正在分化成三种路线:闭源托管式 Agent、开源终端原生 Agent、以及 IDE 内嵌的商业工作台。Claude Code 代表的是能力极强但不可审计的闭源路线;Cursor 代表把 Agent 包装成生产力订阅产品;Aider 和 OpenCode 则把重点放在可读、可改、可自托管的执行框架。对架构设计者来说,这不是品牌之争,而是控制平面归谁所有的问题。
从系统演进角度看,GitHub stars、贡献者数量、发布节奏这些指标共同说明了一件事:Agent 已经不是单点模型能力的展示,而是一个需要长期演化的基础设施层。当一个项目有数百名贡献者、上万次提交、数百个版本时,它讨论的重点自然会从“模型回答得像不像人”转向“权限如何收敛、上下文如何压缩、工具协议如何标准化、故障如何隔离”。这正是 OpenCode 想占据的位置。
第一,模型能力刚刚跨过阈值。早期模型可以补全函数,却无法在多文件、多步骤、多约束任务里稳定保持目标;而新一代模型已经能在较长时间跨度内维持计划、检查结果并进行自我修正。只有当“规划 + 执行 + 复盘”开始形成闭环,Agent 才从演示品变成工程系统。
第二,工具调用训练成熟了。函数调用、结构化输出、Tool Use、MCP 这类机制,让模型不再只会吐出一段文本,而是能调用文件系统、Shell、LSP、浏览器、测试框架等外部能力。换句话说,现代 Agent 的突破并不只是更聪明,而是终于拥有了稳定、可编排的“手”和“眼”。
第三,上下文窗口从 4K 级别扩展到 1M+,使“理解整个仓库”第一次接近现实。过去模型只能盯住一个函数或几页代码,现在可以同时保留需求、架构、错误日志、测试输出、历史决策与工具结果。大窗口不自动等于好 Agent,但它让上下文工程成为可能,而不是幻想。
第四,推理成本在两年内下降了约 100 倍。如果每一次计划、读取、执行、验证都昂贵到无法迭代,Agent 就只能停留在高价演示。成本下降带来的真正变化,是系统终于可以承受“先探索、再验证、然后修正”的多轮循环,这使得架构上的稳健性优先于单次回答的炫技。
为什么会有人坚持认为,只有开源 Agent 才适合被授予整个代码库、终端、环境变量和部署脚本的访问权?因为 Agent 与聊天机器人不同,它接触的是完整的执行面。一旦工具能读私有仓库、写本地文件、运行迁移、调用 CI,真正需要被信任的就不只是模型回答,而是它的权限系统、日志机制、上下文处理方式以及失败时的降级路径。闭源产品可以更易用,但你无法彻底验证它在本地做了什么、向远端发送了什么、未来版本会如何改变默认行为。
因此,OpenCode 这类系统的设计理由不是“开源更浪漫”,而是高权限软件必须可审计、可替换、可约束。当模型供应商变化、价格波动、合规要求升级时,真正有价值的是你是否仍然保有自己的 harness、协议层与工作流定义权。Agent 时代的核心竞争力,最终不是谁最会写 demo,而是谁能在不牺牲主权的前提下,把自动化变成可靠的工程能力。
The evolution of AI Coding tools can be divided into three stages:
- The Copilot era (2021-2023): autocomplete and passive assistance. GitHub Copilot is the canonical example; the LLM’s job was to complete code at the current cursor position.
- The Chat era (2023-2024): conversational programming. ChatGPT and Cursor Chat let developers describe requirements in natural language and receive generated code snippets, but the final application of that code still required manual copy-paste.
- The Agent era (2025-): autonomous execution. AI no longer just generates code; it can read files, run commands, execute tests, and fix bugs, forming a full perception-decision-execution loop. Claude Code, OpenCode, and Aider are representatives of this generation.
OpenCode emerged in this third stage. Its central insight is that Agent capability = LLM reasoning + tool use + environmental awareness. LLM Providers will keep improving, model gaps will narrow, and prices will continue falling—so tying yourself to a single Provider is shortsighted.
These numbers matter not merely as a popularity contest, but because they reveal three strategic directions in the market: closed hosted Agents, open terminal-native Agents, and commercial IDE workbenches. Claude Code represents the high-capability but non-auditable closed path; Cursor packages Agent power as a productivity subscription; Aider and OpenCode emphasize execution frameworks that can be read, modified, and self-hosted. For system designers, this is less about brands than about who owns the control plane.
From a systems perspective, stars, contributor count, and release cadence point to the same conclusion: Agents are no longer demos of raw model intelligence; they are becoming an infrastructure layer. Once a project has hundreds of contributors, tens of thousands of changes, and hundreds of releases, the discussion naturally shifts from “does the model sound smart?” to “how are permissions constrained, how is context compacted, how are tool protocols standardized, and how are failures isolated?” That is the layer OpenCode is trying to occupy.
First, model capability finally crossed the threshold. Earlier models could complete functions, but they could not reliably maintain intent across multi-file, multi-step, constraint-heavy work. Newer models can hold a plan longer, inspect outcomes, and self-correct. Only when planning, execution, and review begin to form a loop does an Agent become an engineering system instead of a stage demo.
Second, tool-use training has matured. Function calling, structured outputs, tool APIs, and MCP mean the model no longer emits only text; it can orchestrate filesystems, shells, LSPs, browsers, and test runners. In other words, the modern Agent breakthrough is not just that models got smarter—it is that they finally acquired reliable, composable hands and eyes.
Third, context windows expanded from roughly 4K to 1M+, making whole-repository reasoning plausible for the first time. Earlier systems could focus on a function or a few screens of code; now they can retain requirements, architecture notes, logs, test output, prior decisions, and tool results in one working set. A large window does not automatically create a good Agent, but it makes context engineering possible instead of hypothetical.
Fourth, reasoning cost dropped by roughly 100x in two years. If every plan-read-execute-verify cycle is too expensive to repeat, Agents remain luxury demos. Cost compression changed the economics of iteration: systems can now afford to explore first, validate second, and repair third. That shift makes architectural robustness more important than a single flashy answer.
Why do some engineers insist that only open-source Agents should be trusted with a full codebase, shell access, environment variables, and deployment scripts? Because an Agent is not merely a chatbot; it touches the entire execution surface. Once tooling can read private repositories, write local files, run migrations, and invoke CI, what must be trusted is not just the model response but the permission model, logging design, context-handling policy, and degradation path under failure. A closed product may be smoother, but you cannot fully verify what it did locally, what it transmitted remotely, or how future versions might shift the default behavior.
That is why the design rationale behind systems like OpenCode is not “open source is more romantic,” but rather that high-privilege software must be auditable, replaceable, and constrainable. When model vendors change, pricing moves, or compliance requirements tighten, the durable asset is whether you still control your harness, protocol layer, and workflow definitions. In the Agent era, the lasting advantage is not who can produce the most impressive demo; it is who can turn automation into a dependable engineering capability without surrendering sovereignty.
3 核心架构:Client/Server 分离Core Architecture: Client/Server Separation
OpenCode 最关键的架构决策是 Client/Server 分离。当你运行 opencode 命令时,实际上启动了两个东西:一个 HTTP Server 和一个 TUI Client。
OpenCode’s most important architectural decision is Client/Server separation. When you run opencode, you are actually starting two things: an HTTP Server and a TUI Client.
这个架构带来几个重要的能力:
- 多客户端:TUI、Web、Desktop、IDE 插件、甚至手机 App 都可以作为客户端连接同一个 Server
- Headless 模式:opencode serve 可以启动一个无 UI 的 API 服务器,用于编程接入
- 远程操作:Server 跑在开发机上,通过手机 App 远程驱动——真正的"随身 AI 程序员"
- 可测试性:每个 API 都有明确的 OpenAPI 3.1 规范,可自动生成 SDK
That architecture unlocks several important capabilities:
- Multiple clients: the TUI, Web app, Desktop app, IDE plugins, and even mobile apps can all connect to the same Server
- Headless mode: opencode serve can start a UI-less API server for programmatic access
- Remote operation: the Server can run on a development machine and be driven from a phone, making the idea of a “portable AI programmer” real
- Testability: every API has a concrete OpenAPI 3.1 specification, which makes SDK generation straightforward
具体而言,Event Bus 采用 双层发布/订阅架构:每种事件类型拥有独立的 typed PubSub 通道,同时还有一个 wildcard PubSub 接收所有事件。事件通过 Effect-TS 的 PubSub 原语广播,并同时发射到 Node.js 的全局 EventEmitter 以支持跨进程通信。关键事件类型包括:session.compacted(上下文压缩完成)、permission.asked / permission.replied(权限请求/响应)、lsp.updated(语言服务器状态变更)、mcp.tools.changed(MCP 工具列表变更)等。每个事件都使用 Zod schema 定义,确保类型安全。
More concretely, the Event Bus uses a two-layer publish/subscribe architecture: each event type has its own typed PubSub channel, and a wildcard PubSub receives all events. Events are broadcast through the Effect-TS PubSub primitive and are also emitted to Node.js’s global EventEmitter for cross-process communication. Important event types include session.compacted (context compaction complete), permission.asked / permission.replied (permission request / response), lsp.updated (language-server state changes), and mcp.tools.changed (MCP tool-list changes). Every event is defined with a Zod schema to preserve type safety.
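The two-layer structure described above can be sketched in plain TypeScript. This is a minimal illustration, not OpenCode's actual Effect-TS implementation: the `Bus` class, method names, and handler shapes are all assumptions; only the idea of per-type channels plus a wildcard channel comes from the text.

```typescript
// Minimal two-layer event bus sketch: one channel per event type,
// plus a wildcard channel that observes every event.
type Handler<T> = (payload: T) => void;

class Bus {
  private channels = new Map<string, Handler<unknown>[]>();
  private wildcard: Handler<{ type: string; payload: unknown }>[] = [];

  // Layer 1: subscribe to a single typed channel, e.g. "session.compacted"
  subscribe<T>(type: string, handler: Handler<T>): void {
    const list = this.channels.get(type) ?? [];
    list.push(handler as Handler<unknown>);
    this.channels.set(type, list);
  }

  // Layer 2: subscribe to every event regardless of type
  subscribeAll(handler: Handler<{ type: string; payload: unknown }>): void {
    this.wildcard.push(handler);
  }

  publish<T>(type: string, payload: T): void {
    for (const h of this.channels.get(type) ?? []) h(payload); // typed channel
    for (const h of this.wildcard) h({ type, payload });       // wildcard channel
  }
}
```

The payoff of the second layer is that cross-cutting consumers (logging, cross-process forwarding to an EventEmitter) can attach once instead of subscribing to every event type individually.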
4 Agent 系统:多角色协作Agent System: Multi-Role Collaboration
OpenCode 的 Agent 系统是一个分层设计:
OpenCode’s Agent system is intentionally layered:
数据流架构Data Flow Architecture
Primary Agents(主 Agent)Primary Agents
用户直接交互的 Agent,通过 Tab 键切换:
- Build:默认 Agent,拥有全部工具权限(读写文件、执行命令、搜索等),用于实际开发工作
- Plan:只读 Agent,禁止文件修改,bash 命令需要审批——用于代码分析和变更规划
The user interacts with these Agents directly and switches between them with Tab:
- Build: the default Agent, with full tool permissions (file I/O, commands, search, and more), used for actual implementation work
- Plan: a read-only Agent that cannot modify files and requires approval for bash commands, intended for analysis and change planning
Subagents(子 Agent)Subagents
被主 Agent 调度的专业 Agent,也可通过 @mention 手动调用:
- General:通用子 Agent,拥有全工具权限(除 todo),用于复杂搜索和多步任务
- Explore:只读探索 Agent,快速查找文件、搜索代码模式
- Compaction / Title / Summary:隐藏的系统 Agent,分别负责上下文压缩、标题生成和摘要
These specialist Agents are scheduled by a primary Agent, though they can also be invoked manually via @mention:
- General: a general-purpose subagent with full tool access (except todo), used for complex search and multi-step work
- Explore: a read-only exploration Agent optimized for fast file discovery and code-pattern search
- Compaction / Title / Summary: hidden system Agents responsible for context compaction, title generation, and summaries
Agent 的核心循环The Agent Core Loop
每个 Agent 的执行遵循经典的 ReAct (Reasoning + Acting) 模式:
Every Agent follows the classic ReAct (Reasoning + Acting) pattern:
while not done:
    # 1. Reasoning - LLM 分析当前状态,决定下一步
    response = llm.generate(system_prompt + conversation + tool_results)

    # 2. Acting - 如果 LLM 决定调用工具
    if response.has_tool_calls:
        for tool_call in response.tool_calls:
            result = execute_tool(tool_call)
            conversation.append(tool_result=result)
        continue  # 回到推理步骤

    # 3. 如果 LLM 生成了文本回复,循环结束
    return response.text
这个循环会在 max_steps(如果配置了)耗尽时强制结束,此时 Agent 收到特殊的 system prompt 要求总结已完成的工作和剩余任务。源码中还内置了Doom Loop 死循环防护:如果 Agent 连续 3 次调用同一工具且参数完全相同(DOOM_LOOP_THRESHOLD = 3),系统会自动暂停并请求用户确认是否继续,防止 LLM 陷入无效重复。
LLM 在不确定时有一种常见模式:重复尝试已经失败的操作。一个卡住的 Agent 可能会循环调用 grep 查找并不存在的符号,每次参数略有调整但本质上做的是同一件事。没有 Doom Loop 保护,这就是在白白烧掉 Token。阈值设为 3 是特意选择的:越小则误报率越高(合法的重试也会被打断),越大则保护失效。
This loop is forcibly terminated once max_steps is exhausted, if configured. At that point the Agent receives a special system prompt asking it to summarize completed work and remaining tasks. The codebase also includes built-in Doom Loop protection: if an Agent calls the same tool three times in a row with identical arguments (DOOM_LOOP_THRESHOLD = 3), the system pauses automatically and asks the user whether execution should continue, preventing the LLM from getting trapped in useless repetition.
When LLMs become uncertain, a common failure mode is to retry an already-failed action. A stuck Agent might call grep again and again for a symbol that does not exist, tweaking parameters slightly while doing essentially the same thing. Without Doom Loop protection, that behavior simply burns tokens. A threshold of 3 is an intentional compromise: lower thresholds generate too many false positives by interrupting legitimate retries, while higher thresholds weaken the guardrail too much.
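The detection rule is simple enough to sketch. Only the threshold value (`DOOM_LOOP_THRESHOLD = 3`) and the trigger condition (same tool, identical arguments, consecutively) come from the article; the `DoomLoopDetector` class and its interface are illustrative assumptions.

```typescript
// Sketch of doom-loop detection: pause when the same tool is called
// DOOM_LOOP_THRESHOLD times in a row with byte-identical arguments.
const DOOM_LOOP_THRESHOLD = 3;

class DoomLoopDetector {
  private lastKey = "";
  private repeats = 0;

  // Returns true when the Agent should pause and ask the user to confirm.
  record(tool: string, args: unknown): boolean {
    const key = tool + ":" + JSON.stringify(args);
    this.repeats = key === this.lastKey ? this.repeats + 1 : 1;
    this.lastKey = key;
    return this.repeats >= DOOM_LOOP_THRESHOLD;
  }
}
```

Note that any change in arguments resets the counter, which is why legitimate retries with refined parameters are not interrupted.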
子 Agent 调度的实际体验What Subagent Scheduling Feels Like in Practice
在实际使用中,当你向 Build Agent 发出一个复杂指令(比如「把登录流程从 session-based 迁移到 JWT」),它会自动编排多个子 Agent 并行工作:
- Build Agent 分析任务,创建 todo list,然后调度 Explore 子 Agent 先扫描现有的 auth 代码结构
- Explore 返回结果后,Build Agent 可能再调度一个 General 子 Agent 并行处理测试文件的修改,同时自己处理主代码逻辑
- 每个子 Agent 拥有独立的 Session(不会污染主 Agent 的上下文窗口),但结果会汇总回主 Agent
这种分层设计的关键优势是:主 Agent 保持高层任务感知,而子 Agent 专注于具体子任务,不会被无关上下文分散注意力——这在 LLM 上下文窗口有限的约束下尤为重要。
In real usage, when you give the Build Agent a complex instruction—say, “migrate the login flow from session-based auth to JWT”—it can automatically orchestrate multiple subagents in parallel:
- The Build Agent analyzes the task, creates a todo list, and dispatches an Explore subagent to scan the current auth code structure
- Once Explore returns, the Build Agent may dispatch a General subagent to update test files in parallel while it handles the main implementation itself
- Each subagent has its own Session, so it does not pollute the main Agent’s context window, but its results are summarized back into the parent Agent
The key benefit of this layered design is that the main Agent preserves high-level task awareness while subagents focus on concrete subproblems without being distracted by irrelevant context. That matters a great deal when LLM context windows are finite.
自定义 AgentCustom Agents
用户可以通过 JSON 配置或 Markdown 文件定义自己的 Agent。例如创建一个安全审计 Agent:
Users can define their own Agents through JSON config or Markdown files. For example, you could create a security-audit Agent like this:
# ~/.config/opencode/agents/security-auditor.md
---
description: Performs security audits and identifies vulnerabilities
mode: subagent
tools:
write: false
edit: false
---
You are a security expert. Focus on identifying potential security issues.
Look for:
- Input validation vulnerabilities
# ~/.config/opencode/agents/code-reviewer.md
---
description: Reviews code quality and suggests improvements
mode: subagent
tools:
write: false
edit: false
---
You are a code review expert. Focus on:
- Code readability and maintainability
- Performance anti-patterns
- Type safety issues
每个 Agent 可以独立配置:模型、温度、工具权限、系统提示词、最大步数等。权限系统支持细粒度控制,包括对特定 bash 命令的 glob 模式匹配。在源码实现中,Agent 的权限通过三层合并生成最终规则:Permission.merge(defaults, agentDefaults, userConfig)——系统默认规则 → Agent 特定规则 → 用户在 opencode.json 中的自定义配置,后者优先级最高。例如 Plan Agent 默认禁止写文件,但 Build Agent 允许所有编辑操作,用户可以进一步覆盖这些默认值。
Each Agent can be configured independently: model, temperature, tool permissions, system prompt, maximum step count, and more. The permission system supports fine-grained control, including glob-based matching for specific bash commands. In the implementation, final Agent permissions are produced through a three-layer merge: Permission.merge(defaults, agentDefaults, userConfig)—system defaults → Agent-specific defaults → user overrides in opencode.json, with user config taking highest precedence. For example, the Plan Agent forbids file writes by default, while Build allows editing, and the user can override either one.
findLast 语义
权限匹配使用 findLast(从后往前查找第一个匹配规则)而非 find。这意味着:用户配置永远能覆盖系统默认,包括用一条 deny 规则禁止某个危险命令。例如:系统层默认 bash 全允许,但用户加一条 "rm -rf": deny,应用后 rm -rf 就会被拦截,其他 bash 命令不受影响。这比「全默认拒绝 + 白名单」的模式更灵活,也比「全默认允许」的模式更安全。
findLast semantics
Permission matching uses findLast—searching backward for the first matching rule—instead of find. That means user config can always override system defaults, including rules that deny dangerous commands. For example, the system layer may allow all bash commands by default, but a user can add "rm -rf": deny, after which rm -rf will be blocked while all other bash commands remain unaffected. This is both more flexible than a pure “default deny + allowlist” model and safer than a blanket “default allow” model.
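The last-match-wins behavior can be sketched in a few lines. The `Rule` shape, the prefix-based matcher, and the `evaluate` helper are simplifications for illustration (OpenCode's real matcher uses glob patterns); only the layering order and the findLast semantics come from the text.

```typescript
type Action = "allow" | "ask" | "deny";
interface Rule { pattern: string; action: Action }

// Simplified matcher: "*" matches everything, otherwise prefix match.
function matches(rule: Rule, command: string): boolean {
  return rule.pattern === "*" || command.startsWith(rule.pattern);
}

// Equivalent to rules.findLast(...): scan backward so the LAST matching
// rule wins. Rules are concatenated as system defaults → agent defaults
// → user config, so later layers override earlier ones.
function evaluate(rules: Rule[], command: string): Action {
  for (let i = rules.length - 1; i >= 0; i--) {
    if (matches(rules[i], command)) return rules[i].action;
  }
  return "ask"; // no rule matched: fall back to asking the user
}
```

With this ordering, a user-layer `"rm -rf": deny` appended after a system-layer `"*": allow` blocks only `rm -rf` while leaving every other bash command allowed.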
5 Tool 系统:LLM 的双手Tool System: The Hands of LLMs
工具是 Agent 与外部世界交互的接口。OpenCode 内置了 14 个核心工具:
| 工具 | 功能 | 权限控制 |
|---|---|---|
| `bash` | 执行 shell 命令(npm install、git status 等) | allow / ask / deny,支持 glob 匹配 |
| `edit` | 精确字符串替换修改文件 | 统一的 edit 权限 |
| `write` | 创建新文件或覆盖现有文件 | 同 edit 权限 |
| `read` | 读取文件内容(支持行号范围) | 默认 allow |
| `grep` | 正则表达式搜索文件内容 | 默认 allow |
| `glob` | 文件名模式匹配 | 默认 allow |
| `list` | 列出目录内容 | 默认 allow |
| `lsp` | 代码智能(定义跳转、引用查找等) | 实验性功能 |
| `apply_patch` | 应用 diff/patch 文件 | 同 edit 权限 |
| `skill` | 加载可复用的技能定义 | 支持 glob 匹配 |
| `todowrite` | 管理任务列表(仅主 Agent) | 默认 allow |
| `webfetch` | 抓取网页内容 | allow / ask / deny |
| `websearch` | 网络搜索(Exa AI) | 需 OpenCode 提供商或 env 变量 |
| `question` | 向用户提问(多选 / 自由输入) | 默认 allow |
工具系统的设计精髓在于 权限分层:
The core design insight of the tool system is layered permissions:
// opencode.json - 全局权限
{
"permission": {
"edit": "ask", // 所有文件修改需审批
"bash": {
"*": "ask", // 默认 bash 需审批
"git status *": "allow", // git status 放行
"git push": "deny" // 禁止 push
}
}
}
// Agent 级别可覆盖全局权限
{
"agent": {
"build": {
"permission": {
"edit": "allow" // Build Agent 放行编辑
}
}
}
}
// opencode.json - 宽松模式
{
"permission": {
"edit": "allow",
"bash": {
"*": "allow"
}
}
}
// opencode.json - 团队模式
{
"permission": {
"edit": "allow",
"bash": {
"*": "allow",
"git push *": "deny",
"rm -rf *": "deny",
"DROP *": "deny"
}
}
}
// opencode.json - Global Permissions
{
"permission": {
"edit": "ask", // All edits require approval
"bash": {
"*": "ask", // Default bash requires approval
"git status *": "allow", // Allow git status
"git push": "deny" // Deny push
}
}
}
// Agent-level overrides
{
"agent": {
"build": {
"permission": {
"edit": "allow" // Build Agent allows edit
}
}
}
}
// opencode.json - Permissive Mode
{
"permission": {
"edit": "allow",
"bash": {
"*": "allow"
}
}
}
// opencode.json - Team Mode
{
"permission": {
"edit": "allow",
"bash": {
"*": "allow",
"git push *": "deny",
"rm -rf *": "deny",
"DROP *": "deny"
}
}
}
工具执行生命周期Tool Execution Lifecycle
当 Agent 决定调用一个工具时,内部会经历一个完整的四阶段 pipeline:
- Zod 参数验证:LLM 返回的工具参数先经过 Zod schema 校验,拒绝格式错误的调用(例如 read 工具缺少 filePath 参数)
- 权限评估:调用 Permission.evaluate(),按照「全局规则 → Agent 规则 → 用户规则」的顺序匹配,采用 findLast() 语义(最后匹配的规则胜出)。对于 bash 工具,还会用 Tree-sitter 解析命令语法树,提取文件路径和命令模式进行更精确的权限匹配
- Effect 执行:工具的 execute() 函数在 Effect-TS 运行时中执行,完整的错误链和依赖注入自动管理
- 输出截断与格式化:结果经 Truncate.output() 处理,超长输出被截断以防止撞上下文窗口上限,并附加 truncated 元数据标记
When an Agent decides to call a tool, the request passes through a full four-stage pipeline:
- Zod argument validation: tool arguments produced by the LLM are validated against a Zod schema first, and malformed calls are rejected immediately (for example, a read call missing filePath)
- Permission evaluation: the runtime calls Permission.evaluate() and resolves rules in the order “global → Agent → user,” using findLast() semantics so the last matching rule wins. For bash, Tree-sitter parses the command AST to extract file paths and command patterns for more precise checks
- Effect execution: the tool’s execute() function runs inside the Effect-TS runtime with error propagation and dependency injection managed automatically
- Output truncation and formatting: results pass through Truncate.output(), which clips oversized output before it overruns the context window and tags it with truncated metadata
工具调用还有一套状态机管理:pending → running → completed / error。在 running 阶段,系统记录开始时间戳用于性能监控。Glob 模式匹配使用正则转换:* 变为 .*,? 变为 .,尾部的 " *" 变为可选参数匹配 ( .*)?——这意味着 "git status *" 既匹配 git status 也匹配 git status --short。
Tool calls are also managed by a state machine: pending → running → completed / error. During running, the system records a start timestamp for performance monitoring. Glob patterns are converted into regular expressions: * becomes .*, ? becomes ., and a trailing " *" becomes the optional-argument matcher ( .*)?. That means "git status *" matches both git status and git status --short.
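The glob-to-regex conversion described above is mechanical enough to sketch directly. The three rewrite rules (`*` → `.*`, `?` → `.`, trailing `" *"` → `( .*)?`) come from the text; the function name and the metacharacter-escaping details are illustrative assumptions.

```typescript
// Sketch of the glob-to-regex conversion: "*" becomes ".*", "?" becomes
// ".", and a trailing " *" becomes the optional suffix "( .*)?" so that
// "git status *" matches bare "git status" as well as "git status --short".
function globToRegex(glob: string): RegExp {
  const optionalTail = glob.endsWith(" *");
  const body = optionalTail ? glob.slice(0, -2) : glob;
  const escaped = body
    .replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*/g, ".*")                 // glob wildcard → regex wildcard
    .replace(/\?/g, ".");                 // single-char wildcard
  return new RegExp("^" + escaped + (optionalTail ? "( .*)?" : "") + "$");
}
```

The special case for the trailing `" *"` is what lets a single permission rule like `"git status *": allow` cover both the bare command and all of its argument variants.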
当 bash 工具执行 find / -name "*.log" 返回 50,000 行结果时,把完整输出灌入上下文会直接消耗数万 tokens——既浪费又昂贵,还可能直接挤爆宝贵的上下文窗口。Truncate.output() 的设计体现了一个重要认知:Agent 和人类不同,它不需要读完全部文本才能理解——前 2,000 tokens 的输出加上 truncated: true 标记,足够让模型判断“结果太多,需要用更精确的 glob 过滤”。这个设计选择的内在逻辑是:控制上下文消耗是 Tool 系统的责任,不应该由 Agent 自行判断。
If the bash tool runs find / -name "*.log" and returns 50,000 lines, feeding the full result into the context would instantly consume tens of thousands of tokens—wasteful, expensive, and potentially enough to blow up a premium context window. Truncate.output() reflects a key insight: Agents are not humans, and they do not need to read everything to understand what to do next. The first 2,000 tokens plus a truncated: true marker are usually enough for the model to infer that the result set is too large and that it should refine the query. The deeper principle is that context-budget control belongs to the tool system; the Agent should not need to rediscover that concern on every call.
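A truncation helper in this spirit might look as follows. This is a sketch under stated assumptions: the 2,000-token head and the `truncated` flag come from the text, but the character-per-token approximation, the constant names, and the function shape are all illustrative, not OpenCode's actual `Truncate.output()`.

```typescript
// Sketch of tool-output truncation under an assumed token budget.
// We approximate one token as ~4 characters for illustration only.
const MAX_OUTPUT_TOKENS = 2000;
const CHARS_PER_TOKEN = 4;

interface ToolOutput { text: string; truncated: boolean }

function truncateOutput(raw: string): ToolOutput {
  const limit = MAX_OUTPUT_TOKENS * CHARS_PER_TOKEN;
  if (raw.length <= limit) return { text: raw, truncated: false };
  // Keep only the head of the output; the truncated flag tells the model
  // to refine its query rather than ask for the rest.
  return { text: raw.slice(0, limit), truncated: true };
}
```

The key design point is that the cap is enforced in the tool layer, so every tool's output is bounded before it ever reaches the model.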
6 Session 模型:对话即状态Session Model: Conversation as State
OpenCode 的 Session 是核心状态单元。每个 Session 包含:
- Message 链:用户消息、Assistant 回复、工具调用结果
- Todo List:Agent 维护的任务追踪
- Diff 快照:每次文件修改的增量记录,支持 /undo 和 /redo
- 子 Session:Subagent 调用会创建子 Session,形成树状结构
- Compaction 摘要:当上下文过长时,系统自动压缩历史消息
Session 持久化在本地 SQLite 数据库中(通过 Drizzle ORM),所有数据完全在用户本地,不上传云端。/share 命令可选择性地分享对话。
A Session is OpenCode’s fundamental unit of state. Each Session contains:
- Message chain: user messages, Assistant replies, and tool-call results
- Todo list: task tracking maintained by the Agent
- Diff snapshots: incremental records of each file edit, supporting /undo and /redo
- Child Sessions: subagent invocations create child Sessions, forming a tree
- Compaction summaries: when context grows too long, the system compresses historical messages automatically
Sessions are persisted locally in SQLite via Drizzle ORM. All data stays on the user’s machine and is never uploaded to the cloud. The /share command can publish a conversation selectively.
/undo 回滚到上一次修改前的状态,并恢复原始 prompt 让你调整后重试。这不是 git revert——它是 Agent 级别的操作回滚。
/undo rolls the workspace back to the state before the previous edit and restores the original prompt so you can retry with adjustments. This is not git revert; it is an Agent-level rollback mechanism.
Compaction 上下文压缩机制Compaction: Context Compression
当对话超过模型上下文窗口时,OpenCode 会自动触发上下文压缩。触发阈值的计算公式是:usable = model.limit.input - reserved,并预留 20,000 tokens 的缓冲区(COMPACTION_BUFFER)。当当前 token 计数超过可用空间时,压缩流程分两步执行:
- 工具输出剪枝(Pruning):从最旧的消息开始,清除工具调用的输出内容以释放空间。剪枝有保护机制:PRUNE_MINIMUM = 20,000 tokens(最少释放量)、PRUNE_PROTECT = 40,000 tokens(保护近期的工具输出不被清除)。skill 工具的调用结果始终受保护,不会被剪枝
- Compaction Agent 摘要:如果剪枝后仍然超限,系统会启动一个专用的 Compaction Agent(mode: "compaction"),它接收完整对话历史并生成一份结构化摘要,包含:Goal(用户目标)、Instructions(重要指令)、Discoveries(发现)、Accomplished(已完成)、Relevant files(相关文件)。这份摘要替换原始对话历史,让 Agent 在不丢失关键上下文的情况下继续工作
When a conversation exceeds the model context window, OpenCode automatically triggers compaction. The threshold is computed as usable = model.limit.input - reserved, with a reserved buffer of 20,000 tokens (COMPACTION_BUFFER). Once the current token count crosses the usable limit, compaction proceeds in two stages:
- Tool-output pruning: starting from the oldest messages, the system clears tool-call output to reclaim space. Pruning is guarded by PRUNE_MINIMUM = 20,000 tokens as the minimum reclaimed budget and PRUNE_PROTECT = 40,000 tokens to protect recent tool output. Results from the skill tool are always protected and never pruned.
- Compaction Agent summary: if pruning still is not enough, the system launches a dedicated Compaction Agent (mode: "compaction") that consumes the full conversation history and emits a structured summary containing Goal, Instructions, Discoveries, Accomplished, and Relevant files. That summary replaces the raw conversation history so the Agent can keep working without losing critical context.
① 工具输出剪枝 (Pruning)
从最旧消息开始清除工具输出。PRUNE_MINIMUM = 20K tokens 最少释放量,PRUNE_PROTECT = 40K tokens 保护近期结果。skill 工具调用始终受保护。
② Compaction Agent 摘要
如果剪枝仍超限,启动专用 Agent 生成结构化摘要:Goal、Instructions、Discoveries、Accomplished、Relevant files,替换原始对话历史。
① Tool-output pruning
Clear tool output starting from the oldest messages. PRUNE_MINIMUM = 20K defines the minimum reclaimed budget, while PRUNE_PROTECT = 40K protects recent results. skill outputs are always exempt.
② Compaction Agent summary
If pruning is still insufficient, a dedicated Agent generates a structured summary with Goal, Instructions, Discoveries, Accomplished, and Relevant files, replacing the raw history.
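The two-stage decision above can be condensed into a small planning function. The three constants (`COMPACTION_BUFFER`, `PRUNE_MINIMUM`, `PRUNE_PROTECT`) and the prune-then-summarize ordering come from the text; the function signature and the way prunable tokens are estimated are illustrative assumptions.

```typescript
// Sketch of the compaction decision: compute the usable window, try
// pruning old tool output first, and fall back to a summary pass.
const COMPACTION_BUFFER = 20_000; // reserved headroom below the input limit
const PRUNE_MINIMUM = 20_000;     // pruning must free at least this much
const PRUNE_PROTECT = 40_000;     // recent tool output is never pruned

type Plan = "none" | "prune" | "summarize";

function compactionPlan(
  inputLimit: number,    // model.limit.input
  currentTokens: number, // tokens currently in the session
  toolOutputTokens: number // total tokens held in tool-call outputs
): Plan {
  const usable = inputLimit - COMPACTION_BUFFER;
  if (currentTokens <= usable) return "none";
  // Only tool output older than the protected window can be reclaimed.
  const reclaimable = Math.max(toolOutputTokens - PRUNE_PROTECT, 0);
  if (reclaimable >= PRUNE_MINIMUM && currentTokens - reclaimable <= usable) {
    return "prune";
  }
  return "summarize"; // pruning alone cannot get back under the limit
}
```

Usage-wise, the cheap, lossless option (dropping stale tool output) is always attempted before the lossy one (replacing history with a structured summary).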
Session 树状结构Session Tree Structure
Subagent 调用不共享主 Agent 的上下文——它们拥有独立的 Session,以保护主 Agent 的上下文窗口不被细节占满。结果通过返回内容汇总回主 Session。
Subagent calls do not share the main Agent’s context. Each subagent gets an independent Session so the parent context window is not consumed by implementation detail. Only the returned result is summarized back into the main Session.
Compaction 不只是压缩 Token,而是对认知状态的管理。没有这个机制,长寿命的 Agent 任务会半途而废——不是因为模型不够聪明,而是因为上下文窗口装不下全部历史。
Compaction Agent 生成的摘要格式包含 Goal(任务目标)、Instructions(用户指令)、Discoveries(已发现)、Completed Tasks(已完成)、Remaining Tasks(待完成)——本质上是把 LLM 的“短期记忆”转化为结构化的“工作备忘录”,让 Agent 在新上下文中仍能“想起”自己做过什么。
Compaction is not just token compression; it is management of cognitive state. Without this mechanism, long-lived Agent tasks eventually break down—not because the model is not smart enough, but because the context window cannot carry the full history.
The Compaction Agent’s summary format typically includes Goal, Instructions, Discoveries, Completed Tasks, and Remaining Tasks. In practice, this converts the LLM’s short-term memory into a structured working notebook, so the Agent can still “remember” what it already accomplished after entering a fresh context.
7 LSP 集成:原生代码智能LSP Integration: Native Code Intelligence
这是 OpenCode 相比 Claude Code 的一个显著差异点。OpenCode 原生集成了 Language Server Protocol (LSP),让 Agent 拥有与 IDE 相同的代码理解能力:
goToDefinition:跳转到符号定义findReferences:查找所有引用hover:获取类型信息documentSymbol / workspaceSymbol:符号索引goToImplementation:跳转到实现prepareCallHierarchy:调用链分析
This is one of OpenCode’s clearest differentiators relative to Claude Code. OpenCode integrates the Language Server Protocol (LSP) natively, giving the Agent IDE-grade code understanding:
goToDefinition: jump to a symbol definitionfindReferences: find all referenceshover: retrieve type informationdocumentSymbol / workspaceSymbol: symbol indexinggoToImplementation: jump to implementationsprepareCallHierarchy: call graph analysis
为什么 LSP 比 grep 好?Why Is LSP Better Than grep?
考虑一个真实场景:你让 Agent 把函数 getUserById 重命名为 findUserById。
❌ 没有 LSP(grep 方式)
用 grep getUserById 全文搜索——会命中注释、字符串常量、其他模块中同名但无关的函数。无法区分「定义」与「引用」,无法处理 TypeScript 的 re-export 链,也无法识别通过接口间接调用的引用。结果:重构不完整,甚至引入 bug。
✅ 有 LSP(语义方式)
调用 findReferences 获取语义级完整引用列表——包括跨文件 import、接口实现、类型引用——然后精确逐一修改。等价于 IDE「重命名符号」功能的 Agent 版本,零误报,零漏报。
Consider a real scenario: you ask the Agent to rename a function from getUserById to findUserById.
❌ Without LSP (grep-based)
A plain grep getUserById search will hit comments, string literals, and unrelated functions with the same name in other modules. It cannot distinguish definitions from references, cannot follow TypeScript re-export chains, and cannot identify references that flow indirectly through interfaces. Result: incomplete refactors and potentially new bugs.
✅ With LSP (semantic)
Calling findReferences yields a semantically correct reference graph, including cross-file imports, interface implementations, and type references, so the Agent can update exactly the right sites. This is essentially the Agent equivalent of an IDE “rename symbol” operation: no false positives and no missed references.
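For concreteness, the rename scenario above starts with a standard LSP `textDocument/references` request over JSON-RPC. The request shape below follows the LSP specification; the file URI and position reuse the coordinates from the response example later in this article, and the request id is illustrative.

```json
{
  "jsonrpc": "2.0",
  "id": 41,
  "method": "textDocument/references",
  "params": {
    "textDocument": { "uri": "file:///workspace/src/user/service.ts" },
    "position": { "line": 87, "character": 16 },
    "context": { "includeDeclaration": true }
  }
}
```

The language server answers with a list of `Location` objects (URI plus range), which is exactly the machine-actionable coordinate format the Agent needs to edit each reference site precisely.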
Go Language Server 架构Go Language Server Architecture
更有意思的点不只是“支持 LSP”,而是 OpenCode 选择自己实现一个 Go 写的 LSP Client,而不是把这层逻辑继续堆在 Node.js 进程里。原因很工程化:LSP 不是一次性脚本,而是一个需要长期驻留、频繁收发 JSON-RPC、同时维护多语言会话状态的后台子系统。
对这种负载,Go 的形态非常合适:启动快,二进制单文件部署简单,常驻内存占用更稳定,而且可以用 goroutine 轻量并发地管理多个语言服务器连接。相比之下,如果把每个语言服务器的编排、超时控制、stdio 泵送、重连逻辑都塞进主 Node.js runtime,会把原本应该隔离的后台工作和 Agent 的主事件循环绑得更紧,增加尾延迟与资源竞争。
架构上,Go 进程扮演的是一个 LSP transport orchestrator:它向下通过 stdio 或本地进程通信管理每种语言对应的 Language Server,向上则通过 JSON-RPC over stdio 把诊断、定义跳转、符号索引、引用查询等结果回传给主 Effect-TS 进程。于是主 Agent 不必知道每个语言服务器的实现细节,只需消费统一的语义接口。这是典型的“把复杂性压到边界层”的设计——让多语言分析变成一个稳定服务,而不是一堆散落在 Agent 逻辑里的特例处理。
The interesting part is not merely that OpenCode “supports LSP,” but that it implements its own Go-based LSP client instead of piling this layer into the Node.js runtime. The reason is deeply architectural: LSP is not a one-shot script. It is a long-lived background subsystem that continually exchanges JSON-RPC messages and maintains per-language session state.
Go fits that workload extremely well: fast startup, simple single-binary deployment, more stable memory usage for persistent processes, and lightweight goroutine-based concurrency for managing multiple language-server connections at once. If all orchestration, timeout handling, stdio pumping, and restart logic were kept inside the main Node.js runtime, background infrastructure work would be coupled more tightly to the Agent’s primary event loop, increasing tail latency and resource contention.
Architecturally, the Go process acts as an LSP transport orchestrator: downward, it manages per-language Language Server processes over stdio or local process boundaries; upward, it sends diagnostics, definition lookups, symbol indices, and reference queries back to the main Effect-TS process via JSON-RPC over stdio. That means the Agent core does not need to understand the quirks of each language server. It only consumes a uniform semantic interface. This is a classic boundary-layer decision: push complexity outward so multi-language analysis behaves like a stable service rather than a pile of language-specific exceptions inside Agent logic.
这个数据流体现的是“事件驱动、语义返回、上下文按需注入”三原则:编辑器事件触发查询,Go Client 负责可靠传输,语言服务器返回结构化语义结果,最终只有高价值结果进入 Agent 上下文,而不是把整个文件重新塞给模型读一遍。
对 Agent 来说,LSP 最大的价值不是“像 IDE 一样聪明”,而是把语义信息压缩成极短的结构化响应。例如一次 hover 或 definition 调用,就能返回符号类型、声明位置、签名信息;如果没有 LSP,模型往往需要自己读多个文件、沿着 import 链推理,轻易就会消耗上千 tokens。换句话说,LSP 用机器可执行的语义索引,替代了昂贵的自然语言式“全文阅读”。
这也是为什么 LSP 在 Agent 系统里不是一个普通工具,而更像一种认知放大器:它把“理解代码关系”这件事从 token 密集型任务,改造成低成本、高确定性的查询操作。模型仍然负责决策,但定位事实的成本被基础设施显著降低了。
```json
{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "uri": "file:///workspace/src/user/service.ts",
    "range": {
      "start": { "line": 87, "character": 16 },
      "end": { "line": 87, "character": 27 }
    }
  }
}
```
上面的响应格式看起来朴素,但它非常关键:返回值不是模糊文本,而是可直接行动的坐标。Agent 拿到文件 URI 与精确 range 后,可以继续调用 definition、references、rename、diagnostics 等后续操作,形成一条由语义坐标驱动的工作链。
This flow encodes three design principles: event-driven triggering, semantic return values, and demand-shaped context injection. Editor events trigger queries, the Go client handles reliable transport, language servers return structured semantic data, and only the high-value result enters Agent context—instead of forcing the model to reread entire files.
For an Agent, the biggest value of LSP is not merely “IDE-like intelligence,” but that it compresses semantic facts into extremely short structured responses. A single hover or definition call can return symbol types, declaration locations, and signature information. Without LSP, the model often has to read several files and reason across import chains, easily spending 1000+ tokens. In other words, LSP replaces expensive natural-language-style file reading with machine-queryable semantic indexing.
That is why LSP is not just another tool inside an Agent system. It is closer to a cognitive multiplier: it turns code-relationship understanding from a token-intensive task into a low-cost, high-certainty query. The model still makes decisions, but the infrastructure dramatically lowers the cost of locating facts.
```json
{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "uri": "file:///workspace/src/user/service.ts",
    "range": {
      "start": { "line": 87, "character": 16 },
      "end": { "line": 87, "character": 27 }
    }
  }
}
```
The structure above looks simple, but that simplicity is the point: the response is not vague prose but an actionable coordinate. Once the Agent has a file URI and an exact range, it can chain further operations—definition, references, rename, diagnostics—into a workflow driven by semantic coordinates rather than heuristic text search.
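To make the coordinate-driven chain concrete, the sketch below (illustrative only; the request shape follows the LSP specification) turns a definition result like the one above into the follow-up textDocument/references request, with no text search or file re-reading involved:

```javascript
// Illustrative: the previous answer's coordinates become the next request.
const definitionResult = {
  uri: "file:///workspace/src/user/service.ts",
  range: {
    start: { line: 87, character: 16 },
    end: { line: 87, character: 27 },
  },
}

// Build a textDocument/references request from a definition result.
function buildReferencesRequest(id, def) {
  return {
    jsonrpc: "2.0",
    id,
    method: "textDocument/references",
    params: {
      textDocument: { uri: def.uri },
      position: def.range.start, // exact semantic coordinate, not a guess
      context: { includeDeclaration: true },
    },
  }
}

const next = buildReferencesRequest(43, definitionResult)
```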
8 MCP 协议:无限扩展MCP Protocol: Infinite Extensibility
Model Context Protocol (MCP) 是 Anthropic 提出的开放标准,用于让 LLM 与外部服务交互。OpenCode 原生支持 MCP,同时支持 stdio、SSE 和 HTTP 三种传输方式,可以:
- 连接数据库,让 Agent 直接查询和修改数据
- 集成 CI/CD 系统,触发部署流水线
- 接入第三方 API(Jira、Slack、Notion 等)
- 使用社区贡献的 MCP Server 扩展能力
Model Context Protocol (MCP) is an open standard proposed by Anthropic for connecting LLMs to external services. OpenCode supports MCP natively and supports three transport modes—stdio, SSE, and HTTP—making it possible to:
- connect to databases so the Agent can query and mutate data directly
- integrate with CI/CD systems and trigger deployment pipelines
- hook into third-party APIs such as Jira, Slack, and Notion
- extend capabilities using community-contributed MCP Servers
```jsonc
// opencode.json - 数据库配置
{
  "mcp": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgres://..."]
    }
  }
}
```

```jsonc
// opencode.json - GitHub 配置
{
  "mcp": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "..." }
    }
  }
}
```
```jsonc
// opencode.json - Database Config
{
  "mcp": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgres://..."]
    }
  }
}
```

```jsonc
// opencode.json - GitHub Config
{
  "mcp": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "..." }
    }
  }
}
```
MCP Server 的工具会以 servername_toolname 的格式出现在 Agent 可用工具列表中,并遵循统一的权限系统。三种传输方式适用不同场景:
📦 stdio
最常用。MCP Server 作为子进程运行,通过标准输入输出通信。配置只需 command + args,无需网络。
📡 SSE
远程 MCP Server。通过 HTTP 长连接推送事件,支持服务器主动通知。配置指定 url 即可切换。
🌐 HTTP
Streamable HTTP 传输,最新模式。基于标准 HTTP 请求/响应,适合无状态 serverless 部署场景。
Tools exposed by an MCP Server appear in the Agent’s tool list under the naming pattern servername_toolname, and they participate in the same unified permission system. The three transport modes map to different deployment scenarios:
📦 stdio
The most common mode. The MCP Server runs as a subprocess and communicates over standard input/output. Configuration usually requires only command + args, with no network dependency.
📡 SSE
For remote MCP Servers. It uses long-lived HTTP connections to push events and supports server-initiated notifications. Switching is typically just a matter of providing a url.
🌐 HTTP
The newest mode: streamable HTTP transport. It uses standard HTTP request/response semantics and is a natural fit for stateless serverless deployments.
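For contrast with the stdio examples above, a remote transport is typically configured with a URL instead of a command. The server name and URL below are hypothetical, and the exact schema should be verified against the current OpenCode docs:

```jsonc
// opencode.json - hypothetical remote MCP server (SSE/HTTP transport):
// a URL replaces command + args, since no subprocess is spawned locally.
{
  "mcp": {
    "tracker": {
      "url": "https://mcp.example.com/sse"
    }
  }
}
```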
9 Skills & Plugins:可组合的知识与行为Skills & Plugins: Composable Knowledge & Behavior
Skills(技能)Skills
Skills 是可复用的 Markdown 指令文件,Agent 在需要时按需加载。它解决的问题是:如何在不膨胀 system prompt 的前提下,让 Agent 掌握领域特定知识?
Skills are reusable Markdown instruction files that Agents load on demand. They solve a specific problem: how do you give an Agent domain-specific knowledge without bloating the system prompt?
```markdown
# .opencode/skills/git-release/SKILL.md
---
name: git-release
description: Create consistent releases and changelogs
---

## What I do

- Draft release notes from merged PRs
- Propose a version bump
- Provide a copy-pasteable `gh release create` command

## When to use me

Use this when you are preparing a tagged release.
```
Skills 通过 skill 工具按需加载——Agent 在工具描述中看到可用技能列表,只有在需要时才实际加载内容。这实现了"懒加载知识"。
Skills are loaded lazily through the skill tool. The Agent sees the list of available skills in the tool description, but only loads the actual content when it decides a given skill is needed. This is effectively “lazy-loaded knowledge.”
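The lazy-loading contract can be shown with a tiny in-memory sketch (this is not OpenCode's actual loader; the skill data is invented): the Agent's tool description carries only names and descriptions, and a skill body is read only when the Agent asks for it.

```javascript
// Illustrative sketch of lazy-loaded skills (not OpenCode's implementation).
const skillFiles = {
  "git-release": {
    description: "Create consistent releases and changelogs",
    body: "## What I do\n- Draft release notes from merged PRs\n...",
  },
}

const loads = [] // records which skill bodies were actually read

// What the model sees up front: one short line per skill.
function describeSkills() {
  return Object.entries(skillFiles).map(
    ([name, s]) => `${name}: ${s.description}`
  )
}

// Called only when the Agent decides a skill is relevant.
function loadSkill(name) {
  loads.push(name)
  return skillFiles[name].body
}

const menu = describeSkills() // cheap; no bodies loaded yet
```

Building the menu costs a handful of tokens per skill; the expensive content enters the context only on demand.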
Plugins(插件)Plugins
Plugins 是 JavaScript/TypeScript 模块,可以 hook 进 OpenCode 的事件系统:
- `tool.execute.before/after`:在工具执行前后拦截(实现审计、转换、保护等)
- `session.idle/error`:会话状态变更(发送通知、记录日志)
- `shell.env`:注入环境变量
- `file.edited`:文件修改事件
- 自定义工具:Plugin 可以注册新的 Tool,供 Agent 调用
Plugins are JavaScript/TypeScript modules that can hook into OpenCode’s event system:
- `tool.execute.before/after`: intercept tool execution before and after it runs, for auditing, transformation, protection, and related use cases
- `session.idle/error`: react to session state changes for notifications or logging
- `shell.env`: inject environment variables
- `file.edited`: observe file edit events
- Custom tools: a Plugin can register new Tools for the Agent to call
```javascript
// .opencode/plugins/env-protection.js
// Block the built-in read tool from opening .env files.
export const EnvProtection = async ({ project, client, $ }) => {
  return {
    // Runs before every tool call; throwing aborts the call.
    "tool.execute.before": async (input, output) => {
      if (input.tool === "read" && output.args.filePath?.includes(".env")) {
        throw new Error("Do not read .env files")
      }
    },
  }
}
```
有一种看似简单的替代方案:把所有领域知识全部塞进 system prompt。这会导致两个问题:① token 成本激增(每次请求都要为全量上下文付费);② 注意力稀释(模型在海量无关指令中对当前任务的关注度下降)。
Skills 的懒加载模式解决了这个问题:Agent 只看到 Skill 的名称和描述(常驻),只有当它判断需要某个技能时才加载内容(按需)。这与编程语言的模块导入机制如出一辙:你不需要预先加载每一个库,只需要在需要时 import。Skills 因此是 Agent 模块化上下文工程的一种核心实践。
There is an apparently simple alternative: stuff all domain knowledge directly into the system prompt. That causes two problems immediately: first, token cost explodes because every request pays for the full context up front; second, attention gets diluted because the model must sift through a mass of irrelevant instructions before focusing on the actual task.
The lazy-loading model solves this cleanly: the Agent initially sees only the Skill’s name and description, and loads the full content only when it judges that the Skill is relevant. This is structurally similar to module imports in programming languages—you do not pre-load every library, you import the one you need when you need it. Skills are therefore a core modular context-engineering technique for Agents.
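A back-of-the-envelope calculation makes the difference concrete. All numbers below are invented purely for illustration:

```javascript
// Hypothetical numbers, chosen only to illustrate the shape of the savings.
const skills = 20       // skills available to the Agent
const bodyTokens = 1500 // average tokens per full skill body
const headerTokens = 25 // tokens for one name + description line

// Eager: every request carries every skill body in the system prompt.
const eagerPerRequest = skills * bodyTokens

// Lazy: every request carries only the menu, plus one body when needed.
const lazyPerRequest = skills * headerTokens + bodyTokens

const savings = 1 - lazyPerRequest / eagerPerRequest
```

Under these assumptions the lazy scheme pays 2,000 tokens instead of 30,000 per request, and the gap widens as more skills are added, since only the cheap header line scales with the skill count.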
10 技术栈深度拆解Tech Stack Deep Dive
OpenCode 的技术选型充满了 "SST 团队风格"——偏好 Effect-TS 生态、Bun runtime、SolidJS 前端。这些选择不是偶然的:Bun 的高性能启动和内置 Shell API 让工具执行更快,Effect-TS 的类型化错误处理让复杂的 Agent 循环不会静默失败,SolidJS 的细粒度响应式更新确保终端 UI 流畅无闪烁。
在 Provider 抽象层方面,OpenCode 通过 Vercel AI SDK 封装了 21 个内置提供商,包括 Anthropic、OpenAI、Google、Amazon Bedrock、Azure、xAI、Mistral、Groq、OpenRouter、Cerebras、Cohere、TogetherAI、Perplexity、GitLab、GitHub Copilot 等。每个模型通过统一的 Model schema 描述,包含能力声明(是否支持图片/音频输入、是否支持 reasoning、是否支持 tool call)、成本信息(输入/输出/缓存单价)和限制信息(上下文窗口、最大输出 tokens)。这意味着 Agent 可以智能地根据模型能力决定是否发送图片、是否调用工具。
OpenCode’s technology choices carry a distinctly “SST team” signature: strong preference for the Effect-TS ecosystem, the Bun runtime, and SolidJS on the frontend. None of these choices are accidental. Bun’s fast startup and built-in Shell API make tool execution faster, Effect-TS’s typed error handling prevents complex Agent loops from failing silently, and SolidJS’s fine-grained reactivity keeps the terminal UI responsive and flicker-free.
At the Provider abstraction layer, OpenCode uses the Vercel AI SDK to unify 21 built-in Providers, including Anthropic, OpenAI, Google, Amazon Bedrock, Azure, xAI, Mistral, Groq, OpenRouter, Cerebras, Cohere, TogetherAI, Perplexity, GitLab, and GitHub Copilot. Each model is described through a common Model schema that records capabilities (image/audio input support, reasoning support, tool-call support), pricing metadata (input/output/cache cost), and operational limits (context window, max output tokens). That lets the Agent decide intelligently, for example, whether it can send images or invoke tools for a given model.
| 层 | 技术 | 说明 |
|---|---|---|
| Runtime | Bun 1.3+ | 高性能 JS runtime,内置 bundler、test runner、shell API |
| HTTP Server | Hono | 轻量、Edge-first 的 HTTP 框架,生成 OpenAPI spec |
| 类型效果系统 | Effect-TS | 函数式效果系统,处理异步、依赖注入、错误管理 |
| 数据库 | SQLite + Drizzle ORM | 本地持久化,snake_case 字段命名 |
| TUI 框架 | SolidJS + OpenTUI | 自研终端 UI 框架,渲染 SolidJS 组件到终端 |
| Web/Desktop UI | SolidJS + Tailwind CSS 4 | 共享 UI 组件库 |
| 桌面 App | Tauri (Rust) | 轻量原生壳,包裹 Web UI |
| AI 集成 | Vercel AI SDK (ai 包) | 统一的多提供商 LLM 接口 |
| Schema 验证 | Zod 4 | 运行时类型校验 |
| 搜索引擎 | ripgrep | grep/glob/list 工具的底层实现 |
| 代码解析 | Tree-sitter | 语法感知的代码分析 |
| 终端模拟 | node-pty | 伪终端,支持完整的 shell 交互 |
| Monorepo | Bun workspaces + Turborepo | 包管理和任务编排 |
| Layer | Technology | Notes |
|---|---|---|
| Runtime | Bun 1.3+ | High-performance JS runtime with built-in bundler, test runner, and shell API |
| HTTP Server | Hono | Lightweight, edge-first HTTP framework with OpenAPI generation |
| Typed effect system | Effect-TS | Functional effect system for async flows, dependency injection, and error management |
| Database | SQLite + Drizzle ORM | Local persistence with snake_case field naming |
| TUI framework | SolidJS + OpenTUI | Custom terminal UI framework that renders SolidJS components in the terminal |
| Web/Desktop UI | SolidJS + Tailwind CSS 4 | Shared UI component library |
| Desktop app | Tauri (Rust) | Lightweight native shell around the Web UI |
| AI integration | Vercel AI SDK (ai package) | Unified multi-Provider LLM interface |
| Schema validation | Zod 4 | Runtime type validation |
| Search engine | ripgrep | Underlying implementation for grep / glob / list style tooling |
| Code parsing | Tree-sitter | Syntax-aware code analysis |
| Terminal emulation | node-pty | Pseudo-terminal support for full shell interaction |
| Monorepo | Bun workspaces + Turborepo | Package management and task orchestration |
try/catch 很容易丢失错误上下文。Effect-TS 让每个操作的错误类型、依赖项和成功值都显式编码在类型中,编译器就能帮你捕获「忘记处理某种失败」的情况。这对一个需要管理 LLM 超时、工具失败、权限拒绝、上下文溢出等多种失败模式的 Agent 系统尤为重要。
关于 Bun vs Node 的具体差异:Bun 的启动时间仅为 Node.js 的几分之一(约 20-50ms vs 200-500ms),这对一个频繁启动子进程执行工具的 Agent 系统至关重要。Bun 内置的 Bun.Shell API 让 bash 工具不需要额外的 shell spawn 开销。但 trade-off 也很明确:Bun 的生态系统成熟度不及 Node.js,偶尔会遇到兼容性问题,这也是社区抱怨内存占用超 1GB 的原因之一。
Effect-TS 的实际 trade-off:它的类型签名可以极其复杂(如 Effect.Effect<A, E, R> 三参数泛型),新贡献者的学习曲线陡峭。源码中大量使用 Effect.gen(function* () { ... }) 的 generator 语法和 yield* 调用,这对习惯 async/await 的开发者来说需要思维转换。但回报是显著的:每个工具的错误类型、每个 Agent 的依赖项都在类型系统中显式声明,编译器能捕获「忘记处理某种失败」的情况。
try/catch patterns lose error context easily. Effect-TS encodes each operation’s error type, dependencies, and success value directly into the type system, so the compiler can catch cases where a failure mode was never handled. That matters enormously in an Agent runtime that must juggle LLM timeouts, tool failures, permission denials, and context overflow.
On Bun vs Node: Bun starts in a fraction of Node.js startup time—roughly 20-50ms versus 200-500ms—which matters a great deal for an Agent system that frequently spawns subprocesses for tools. Bun’s built-in Bun.Shell API also reduces shell-spawn overhead for bash execution. The trade-off is equally real: Bun’s ecosystem is less mature than Node’s, and compatibility issues still surface occasionally. That is also part of why the community sometimes complains about memory consumption exceeding 1GB.
The practical trade-off of Effect-TS is complexity. Its type signatures can become intimidating—Effect.Effect<A, E, R> is the canonical example—and new contributors face a steep learning curve. The codebase relies heavily on Effect.gen(function* () { ... }) and yield*, which requires a mental shift for developers used to async/await. The payoff, however, is substantial: every tool’s error surface and every Agent dependency are made explicit, enabling the compiler to catch unhandled failure paths.
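To give a feel for the generator style without pulling in the actual library, here is a deliberately tiny toy runner (emphatically not the Effect API) showing why yield*-style composition with tagged errors keeps failure paths visible instead of swallowed:

```javascript
// Toy illustration of typed-error style, NOT the Effect-TS API.
// Each step returns { ok: value } or { err: taggedError };
// the runner short-circuits on the first error instead of throwing.

function run(gen) {
  const it = gen()
  let input
  while (true) {
    const { value, done } = it.next(input)
    if (done) return { ok: value }
    if (value.err) return value // the error surfaces with its tag intact
    input = value.ok
  }
}

const readConfig = () => ({ ok: { model: "claude" } })
const callModel = (cfg) =>
  cfg.model ? { ok: `reply from ${cfg.model}` } : { err: { _tag: "NoModel" } }

// Reads like Effect.gen(function* () { ... }) with yield* steps.
const program = function* () {
  const cfg = yield readConfig()
  const reply = yield callModel(cfg)
  return reply
}

const result = run(program)
```

In real Effect-TS the error tags live in the type parameter `E` of `Effect.Effect<A, E, R>`, so the compiler, not a runtime check, flags any failure mode the caller never handled.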
Monorepo 结构Monorepo Structure
```text
packages/
├── opencode/        # 核心业务逻辑 & Server
│   └── src/
│       ├── agent/       # Agent 定义与循环
│       ├── tool/        # 内置工具实现
│       ├── session/     # Session 管理
│       ├── provider/    # LLM 提供商抽象
│       ├── server/      # HTTP API (Hono)
│       ├── lsp/         # LSP 客户端
│       ├── mcp/         # MCP 协议实现
│       ├── skill/       # Skill 加载器
│       ├── plugin/      # Plugin 系统
│       ├── permission/  # 权限管理
│       ├── cli/         # CLI 命令 & TUI
│       ├── storage/     # SQLite 持久化
│       ├── bus/         # 事件总线
│       └── ...
├── app/             # Web UI 组件 (SolidJS)
├── desktop/         # 桌面应用 (Tauri)
├── sdk/             # JS SDK (自动生成)
├── plugin/          # @opencode-ai/plugin 类型
├── docs/            # 文档站 (Mintlify/Astro)
└── web/             # 落地页 & 文档
```
```text
packages/
├── opencode/        # Core business logic & Server
│   └── src/
│       ├── agent/       # Agent definitions and loop
│       ├── tool/        # Built-in tool implementations
│       ├── session/     # Session management
│       ├── provider/    # LLM Provider abstraction
│       ├── server/      # HTTP API (Hono)
│       ├── lsp/         # LSP client
│       ├── mcp/         # MCP protocol implementation
│       ├── skill/       # Skill loader
│       ├── plugin/      # Plugin system
│       ├── permission/  # Permission management
│       ├── cli/         # CLI commands & TUI
│       ├── storage/     # SQLite persistence
│       ├── bus/         # Event bus
│       └── ...
├── app/             # Web UI components (SolidJS)
├── desktop/         # Desktop app (Tauri)
├── sdk/             # JS SDK (auto-generated)
├── plugin/          # @opencode-ai/plugin types
├── docs/            # Docs site (Mintlify/Astro)
└── web/             # Landing page & docs
```
11 与 Claude Code / Cursor / Aider 对比Comparison with Claude Code / Cursor / Aider
以下对比基于 2026 年 4 月各工具的最新版本。AI Coding 工具迭代极快,部分信息可能在你阅读时已变化。
The comparison below is based on the latest versions available as of April 2026. AI Coding tools evolve extremely quickly, so some details may already have changed by the time you read this.
| 维度 | OpenCode | Claude Code | Cursor | Aider |
|---|---|---|---|---|
| 开源 | MIT 完全开源 | 闭源 | 闭源 | Apache 2.0 |
| LLM 提供商 | 任意(Claude, OpenAI, Google, 本地等) | 仅 Anthropic | 多提供商但优先自有 | 多提供商 |
| 界面形态 | TUI + Web + Desktop + IDE | 仅终端 | IDE (VS Code fork) | 仅终端 |
| LSP 集成 | 原生支持 | 无 | 依赖 VS Code | 无 |
| MCP 支持 | 原生支持(stdio/SSE/HTTP) | 原生支持 | 支持(第三方插件) | 无 |
| 多 Agent | Primary + Subagent 分层 | 子任务派生(较新) | 后台 Agent(有限) | 单 Agent |
| Plugin 系统 | JS/TS 插件 + 事件 Hook | 无 | VS Code 扩展生态 | 无 |
| Client/Server | 完整分离 + OpenAPI | 一体化 | 一体化 | 一体化 |
| Undo/Redo | Agent 级别的操作回滚 | 无 | IDE 级别 | git-based |
| 技术栈 | TypeScript (Bun) | 未知(闭源) | TypeScript + Rust | Python |
| Dimension | OpenCode | Claude Code | Cursor | Aider |
|---|---|---|---|---|
| Open source | Fully open source under MIT | Closed source | Closed source | Apache 2.0 |
| LLM Providers | Any Provider (Claude, OpenAI, Google, local models, etc.) | Anthropic only | Multi-provider, but biased toward its own stack | Multi-provider |
| Interface modes | TUI + Web + Desktop + IDE | Terminal only | IDE (VS Code fork) | Terminal only |
| LSP integration | Native support | No | Inherited from VS Code | No |
| MCP support | Native support (stdio / SSE / HTTP) | Native support | Supported via third-party plugins | No |
| Multi-Agent | Layered Primary + Subagent model | Derived subtasks (newer feature) | Background Agents (limited) | Single Agent |
| Plugin system | JS/TS plugins + event hooks | No | VS Code extension ecosystem | No |
| Client/Server | Full separation + OpenAPI | Monolithic | Monolithic | Monolithic |
| Undo/Redo | Agent-level operational rollback | No | IDE-level | Git-based |
| Tech stack | TypeScript (Bun) | Unknown (closed source) | TypeScript + Rust | Python |
OpenCode 的核心竞争力不在于某一个功能点的突破,而在于 开放性 × 可扩展性 × 终端体验 三者的乘积效应。它让开发者完全掌控自己的 AI 工具链。
OpenCode’s core advantage is not any single breakthrough feature. It is the multiplicative effect of openness × extensibility × terminal experience. It gives developers full control over their own AI toolchain.
多维度对比可视化Multi-Dimensional Comparison
以下雷达图从五个维度对比三款主流 AI Coding 工具的设计取舍(数据基于架构分析,非精确基准测试):
The radar chart below compares three mainstream AI coding tools across five dimensions of design trade-offs (based on architectural analysis, not exact benchmarks):
12 设计哲学Design Philosophy
12.1 "不绑定"原则12.1 The “No Lock-in” Principle
模型会演进,价格会下降,新提供商会出现——把工具绑定在单一提供商上是短视的。OpenCode 通过 Provider 抽象层确保用户随时可以切换模型,不丢失任何配置或数据。
12.2 终端是一等公民12.2 The Terminal Is a First-class Environment
OpenCode 团队用 SolidJS 自研了 OpenTUI 框架——把 SolidJS 的响应式渲染引擎用于终端。这不是简单的 ANSI escape code 拼接,而是一个完整的终端 UI 框架,支持组件化、主题化、快捷键系统。
12.1 "不绑定"原则12.1 The “No Lock-in” Principle
Models will improve, prices will fall, and new Providers will keep appearing—so binding your tooling to a single vendor is strategically shortsighted. OpenCode’s Provider abstraction ensures users can switch models at any time without losing configuration or data.
12.2 终端是一等公民12.2 The Terminal Is a First-class Environment
The OpenCode team built OpenTUI on top of SolidJS, bringing SolidJS’s reactive rendering engine into the terminal. This is not a pile of ANSI escape codes—it is a full terminal UI framework with components, theming, and a shortcut system.
12.3 一切皆可配置12.3 Everything Is Configurable
从 Agent 的 system prompt 到每条 bash 命令的权限,从 UI 主题到快捷键映射——OpenCode 的可配置性覆盖了每一个层面。opencode.json 支持全局和项目级别的层叠配置。
12.4 代码风格即约束12.4 Code Style as Constraint
从 AGENTS.md 的代码风格指南可以看出团队的工程品味:偏好单词命名、禁止不必要的解构、const over let、函数式 array methods、避免 try/catch。这些约束不是教条,而是保持大型 TypeScript 代码库可读性的实践经验。
12.3 一切皆可配置12.3 Everything Is Configurable
From the Agent’s system prompt to per-command bash permissions, from UI themes to shortcut mappings, OpenCode exposes configuration at every layer. opencode.json supports both global and project-level cascading configuration.
12.4 代码风格即约束12.4 Code Style as Constraint
The team’s engineering taste is visible in its AGENTS.md style guidance: prefer word-based naming, avoid unnecessary destructuring, choose const over let, favor functional array methods, and avoid try/catch. These are not dogmas—they are practices shaped by maintaining a large TypeScript codebase that stays readable over time.
12.5 AGENTS.md 作为项目知识12.5 AGENTS.md as Project Knowledge
/init 命令让 OpenCode 分析项目结构并生成 AGENTS.md——一份关于项目编码规范、架构决策、目录结构的知识文档。这个文件应该提交到 Git,因为它不仅帮助 OpenCode 理解项目,也帮助人类开发者(和其他 AI 工具)理解项目。典型的 AGENTS.md 会包含:项目概述(技术栈、架构模式)、目录规约(哪个目录放什么)、代码风格指南(命名规则、import 顺序、禁用模式)、测试约定(测试框架、文件命名)和部署流程。以 OpenCode 自身的 AGENTS.md 为例,其中要求:偏好单词命名、禁止不必要的解构、const 优先于 let、使用函数式 array methods(map/filter/reduce 而非 for 循环)、避免 try/catch(改用 Effect-TS 错误处理)。
12.5 AGENTS.md 作为项目知识12.5 AGENTS.md as Project Knowledge
The /init command asks OpenCode to analyze a project and generate an AGENTS.md file—a knowledge document that captures coding conventions, architectural decisions, and directory structure. This file should be committed to Git because it helps not only OpenCode, but also human developers and other AI tools understand the project. A typical AGENTS.md includes a project overview (tech stack, architectural patterns), directory conventions (what belongs where), code style guidelines (naming, import ordering, prohibited patterns), testing conventions (frameworks, file naming), and deployment workflow. In OpenCode’s own AGENTS.md, for example, the team prefers word-based naming, disallows unnecessary destructuring, prioritizes const over let, favors functional array methods such as map/filter/reduce over for loops, and avoids try/catch in favor of Effect-TS error handling.
OpenCode 的五个设计信条(不绑定、终端优先、全面可配置、约束式风格、AGENTS.md)并非五条彼此孤立的原则,而是同一核心信念向外展开的同心圆:Agent 工具的长期价值取决于用户对它的控制程度。如果你无法切换模型,工具就不是你的;如果你无法修改行为,工具就不是你的;如果工具不理解你的项目约定,工具就是在重新发明轮子。这五条原则每一条都是在拆除「黑盒工具」的一道墙。
OpenCode’s five design commitments—no lock-in, terminal-first, full configurability, constraint-oriented style, and AGENTS.md—are not isolated ideas. They are concentric layers radiating out from a single belief: the long-term value of an Agent tool depends on how much control the user retains over it. If you cannot switch models, the tool is not really yours. If you cannot change its behavior, it is not really yours. If it does not understand your project conventions, it is just reinventing work you already did. Each principle removes one more wall from the black box.
13 社区声音与评价Community Voices & Reviews
创建者的话From the Creator
OpenCode 的灵魂人物是 Dax Raad(@thdxr),SST 和 terminal.shop 的创建者。在 2025 年 10 月接受 Baseten 采访时,他谈到了创建 OpenCode 的动机:
The central figure behind OpenCode is Dax Raad (@thdxr), creator of SST and terminal.shop. In a Baseten interview in October 2025, he explained the motivation for building OpenCode:
“Claude Code came out where it ran in your terminal, alongside your editor... That was the first AI coding product that really clicked for me. I had the aha moment... There's other models out there that I want to try. Every week there's a new cool model... I'm also a Neovim user, so I kind of understand what the ceiling of what you can do in the terminal is.”— Dax Raad, Baseten Interview
关于模型选择,Dax 有一句非常精辟的总结:
On model choice, Dax had an especially sharp summary:
“Smart engineers are basically doing astrology when you have opinions on these models or these tools.”— Dax Raad
社区评价Community Reactions
OpenCode 在 Hacker News 上曾获得 1,270 points 和 621 条评论的热度。社区对它的评价呈现两极化:
正面声音:
OpenCode once reached 1,270 points and 621 comments on Hacker News. Community opinion has been sharply polarized:
Positive takes:
“I found out about OpenCode through the Anthropic feud. I now spend most of my AI time in it, both at work and at home... over all it’s the most complete solution I’ve found.”— Hacker News 用户user
“Your AI agents are only as capable as the harness-owner permits. Want better? You need to own your own Harness. That’s what OpenCode provides. A way to own your stack.”— Reddit 用户user
The Agent Post 将其比作浏览器领域的 Firefox——开放、灵活、社区驱动,虽然比专有竞品粗糙一些,但代表了开发者主权的回归。
批评声音:
The Agent Post compared it to Firefox in the browser world—open, flexible, and community-driven. It may be rougher around the edges than proprietary rivals, but it represents a return of developer sovereignty.
Critical takes:
“The development practices of the people that are working on it are suboptimal at best; they’re constantly releasing at an extremely high cadence, where breakage is somewhat expected.”— Hacker News 用户user
也有用户指出资源消耗问题:OpenCode 的内存占用超过 1GB,而 Codex 实现类似功能仅需约 80MB。此外,opencode serve 的 Web UI 默认会代理请求到 app.opencode.ai,引发了隐私方面的社区讨论。
Some users also highlighted resource usage: OpenCode can consume more than 1GB of memory, whereas Codex reportedly achieves similar functionality in roughly 80MB. In addition, the Web UI for opencode serve proxies requests to app.opencode.ai by default, which triggered community debate around privacy.
14 项目历程与争议Project History & Controversies
发展时间线Timeline
Anthropic 之争The Anthropic Dispute
2025 年末至 2026 年初,Anthropic 对 OpenCode 采取了法律行动,要求其移除允许使用 Claude 订阅凭证的 opencode-anthropic-auth 插件。该插件被迫归档。然而,这一事件意外地推动了更多开发者转向 OpenCode:
From late 2025 into early 2026, Anthropic took legal action against OpenCode and demanded the removal of the opencode-anthropic-auth plugin, which had enabled the use of Claude subscription credentials. The plugin was ultimately archived. Ironically, the incident pushed even more developers toward OpenCode:
"Anthropic shot themselves in the foot with this decision. It's a PR nightmare and at the same time the open source community will find a way."— Hacker News 用户
"Before this drama started, OpenCode was just another item on a long list of tools I've been meaning to test... Yesterday, I finally installed OpenCode and tried it. It feels genuinely more polished."— Hacker News 用户
命名争议Naming Controversy
"OpenCode" 这个名字此前已被 Kujtim Hoxha 的一个基于 Go 的 TUI 工具使用(11K+ stars)。当 Anomaly 以相同名字发布项目后,原项目被归档并更名为 "Crush"(由 Charm 维护),在社区中引发了关于开源礼仪的讨论。
The name “OpenCode” had already been used by a Go-based TUI tool created by Kujtim Hoxha, which had accumulated more than 11K stars. When Anomaly released a new project under the same name, the original project was archived and later renamed “Crush” under Charm’s maintenance. That sparked a broader discussion in the community about naming etiquette and norms in open source.
15 安全考量Security Considerations
OpenCode 的安全模型值得认真审视。项目文档明确指出:
OpenCode’s security model deserves serious scrutiny. The project documentation states this explicitly:
"OpenCode does not sandbox the agent. The permission system exists as a UX feature... it is not designed to provide security isolation."— OpenCode SECURITY.md
2026 年 1 月曾暴露一个严重的远程代码执行(RCE)漏洞:当用户以 opencode serve 模式启动时,HTTP Server 默认监听本地端口且无任何认证机制。攻击者只需访问该端口即可通过 API 调用 bash 工具执行任意命令。从社区报告到完成修复花了约 2 个月时间,修复方案包括:强制要求 --password 参数、添加服务器 token 验证、默认只绑定 localhost。这个事件揭示了一个重要设计张力:Client/Server 分离架构带来灵活性的同时,也引入了传统 CLI 工具不需要面对的网络安全风险。
In January 2026, a serious remote code execution (RCE) issue was disclosed: when users started OpenCode in opencode serve mode, the HTTP Server listened on a local port with no authentication at all. Any attacker who could reach that port could invoke the bash tool through the API and execute arbitrary commands. It took roughly two months from community disclosure to a full fix. The remediation included making --password mandatory, adding server token validation, and binding to localhost by default. The episode exposed a real design tension: Client/Server separation brings flexibility, but it also introduces network security risks that traditional CLI tools never had to confront.
- 在 `opencode serve` 模式下务必设置密码认证
- 对高危 bash 命令(如 `rm -rf`、`git push --force`)设置 `deny` 或 `ask` 权限
- 不要在 CI/CD 环境中以 root 身份运行 OpenCode
- 定期更新到最新版本以获取安全修复
- 权限系统是 UX 层面的,不应依赖它做安全隔离
传统软件的沙箱(如浏览器 iframe、Docker 容器)可行,是因为其操作边界事先可枚举——只需拦截系统调用即可。但 AI Agent 的本质是开放性工具调用:它需要读写任意文件、执行 shell 命令、访问网络、启动子进程。任何真正的沙箱都等价于一个功能受限的虚拟机,而一旦进入虚拟机,Agent 的实用性就会大打折扣。
更深层的困难在于语义层安全:即便你能拦截所有系统调用,你仍无法判断「删除 node_modules 然后重装」是善意的清理还是恶意的破坏——这需要理解代码意图,而这恰恰是 LLM 本身做的事。因此,OpenCode 的设计选择是坦诚的:把权限系统定位为 UX 层(让用户知道 Agent 在做什么)而非安全层,并通过 glob 级别的白名单写入降低事故概率,而不是假装提供隔离保证。这种诚实设计比虚假的安全感更值得信赖。
- Always enable password-based authentication when running `opencode serve`
- Set high-risk bash commands such as `rm -rf` and `git push --force` to `deny` or `ask`
- Do not run OpenCode as root inside CI/CD environments
- Upgrade regularly to pick up security fixes
- The permission system is a UX mechanism and should not be treated as a security boundary
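Such rules map to configuration. The sketch below shows one plausible shape in opencode.json; the exact keys and glob syntax are assumptions here, so verify against the current permission docs:

```jsonc
// opencode.json - hypothetical permission rules for high-risk commands
{
  "permission": {
    "bash": {
      "rm -rf *": "deny",        // never allow destructive deletes
      "git push --force*": "ask", // require explicit user approval
      "*": "allow"                // everything else runs normally
    }
  }
}
```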
Traditional sandboxes—browser iframes, Docker containers, and similar mechanisms—work because the operation boundary can be enumerated in advance. Intercept the relevant system calls and you have a meaningful isolation layer. But AI Agents are fundamentally about open-ended tool use: they need to read and write arbitrary files, execute shell commands, access the network, and spawn subprocesses. Any truly restrictive sandbox quickly collapses into a feature-limited virtual machine, and once you are inside that VM the Agent becomes far less useful.
The deeper challenge is semantic security. Even if you intercept every system call, you still cannot reliably tell whether “delete node_modules and reinstall” is legitimate cleanup or malicious sabotage without understanding intent—the very problem the LLM itself is trying to solve. OpenCode’s choice is therefore refreshingly honest: position permissions as a UX layer that makes Agent actions legible to the user, not as a security layer that pretends to guarantee isolation. Using glob-level allowlisted writes reduces accident probability, but it does not manufacture false certainty. That honesty is more trustworthy than performative security.
Prompt Injection 防御Prompt Injection Defense
对编码 Agent 而言,Prompt Injection 的危险不只来自网页或聊天输入,更来自它被要求信任的一切上下文材料:README、issue、测试失败日志、自动生成的注释,甚至仓库里一段看似无害的 markdown 文本。一个恶意的 README.md 完全可以伪装成“项目说明”,再夹带指令要求 Agent 忽略原任务、读取私钥、打包源码并上传到外部服务。问题在于:从模型视角看,这些内容与正常开发文档在形式上几乎没有边界。
这正是 Agent 安全与传统输入校验不同的地方。传统系统往往把攻击面理解为“非法字符串进入解析器”,而 Agent 面对的是合法文本改变意图:输入本身不一定违反语法,却会重写决策优先级、诱导工具调用顺序,甚至把“帮助用户完成任务”悄悄篡改为“服从文件里的隐藏命令”。因此 Prompt Injection 本质上不是内容过滤问题,而是控制权归属问题:哪些文本拥有制定目标的权力,哪些文本只能作为证据被参考。
OpenCode 的缓解思路并不是宣称“模型已理解恶意内容”,而是尽量把危险动作拉回到可观察、可拒绝、可分层约束的执行链上。第一层是工具调用可见性:读文件、执行 bash、访问网络、改写代码这些关键动作都会以用户可见的 tool call 形式暴露出来。即便模型在上下文里受到诱导,它也很难在完全不可见的情况下完成高危外部动作;用户至少能看到它开始偏离任务。
第二层是权限审批机制。OpenCode 一再强调权限系统不是沙箱,但它仍然能在语义层失控时充当“最后一道人工闸门”:比如访问 shell、执行高危命令、写入特定路径、连接某些 MCP 工具,都可以被设置为 ask 或 deny。这并不能从根本上消灭 Prompt Injection,却能把“模型被一句话骗走”的风险,转化为“模型必须穿过显式审批面板才能继续”的风险。安全收益来自摩擦,而不是来自隔离神话。
第三层是系统提示词加固。也就是在最高优先级的 system prompt 中明确声明:仓库文件、网页内容、工具输出都可能是不可信输入;它们可以提供事实,但不能凌驾于用户目标和平台规则之上。这类硬化并不保证万无一失,因为 LLM 仍可能在长上下文中发生优先级漂移;但它至少建立了一个清晰的规范秩序:用户意图 > 系统约束 > 外部文本。对 Agent 产品而言,这种秩序感比“检测恶意提示词模式”更接近真正的安全设计。
For coding Agents, prompt injection danger does not come only from webpages or chat input. It also comes from every contextual artifact the Agent is asked to trust: READMEs, issues, failing test logs, autogenerated comments, even seemingly harmless markdown already sitting in the repository. A malicious README.md can masquerade as project documentation while embedding instructions telling the Agent to ignore the user’s task, read private keys, bundle the source tree, and send it somewhere else. From the model’s point of view, those instructions often look structurally similar to legitimate developer guidance.
This is why Agent security differs from classical input validation. Traditional systems usually frame the problem as “an invalid string reached a parser.” Agents face something more subtle: valid text rewrites intent. The content may be syntactically normal, yet it can reorder priorities, manipulate tool-selection logic, and silently mutate “help the user finish a task” into “obey the hidden command inside the file.” Prompt injection is therefore not mainly a content-filtering problem; it is a control-authority problem. Which texts are allowed to define goals, and which texts may only serve as evidence?
OpenCode’s mitigation strategy is not to pretend the model can perfectly recognize malicious instructions. Instead, it tries to pull risky behavior back into an execution chain that is observable, rejectable, and layered. The first layer is user-visible tool calls. Reading files, invoking bash, reaching the network, and editing code all surface as visible tool actions. Even if the model is influenced by hostile context, it has a harder time performing high-risk external actions invisibly; the user can at least see that the Agent is drifting.
The second layer is the permission system. OpenCode repeatedly says permissions are not a sandbox, but they still matter as a human-controlled choke point when semantic reasoning goes wrong. Shell access, dangerous commands, writes to sensitive paths, or certain MCP interactions can be forced into ask or deny states. That does not eliminate prompt injection. It changes the failure mode from “the model was tricked by one sentence” into “the model must now cross an explicit approval boundary before it can proceed.” The security value comes from friction, not from magical isolation.
The third layer is system-prompt hardening. At the highest-priority instruction level, the Agent can be told that repository files, web content, and tool outputs are all potentially untrusted inputs: they may provide facts, but they do not outrank the user’s intent or the platform’s operating rules. Hardening of this kind is not a silver bullet—long contexts can still cause priority drift—but it establishes a clear normative order: user intent > system constraints > external text. For Agent products, that explicit ordering is closer to real security design than any superficial detector for “suspicious prompts.”
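The layered defenses above are exercised through configuration in practice. The snippet below is an illustrative sketch of a permission policy of the kind described, with routine reads left fluent, shell access approval-gated, and destructive commands denied outright; treat the exact keys and values as assumptions to verify against OpenCode's documentation rather than a canonical schema.

```json
{
  "permission": {
    "edit": "allow",
    "bash": {
      "*": "ask",
      "git status": "allow",
      "rm -rf *": "deny"
    },
    "webfetch": "ask"
  }
}
```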
供应链攻击面Supply Chain Attack Surface
当 Agent 的能力通过 MCP Server、skills、插件机制不断扩展时,安全边界也随之外移。过去开发者担心的是“依赖包会不会执行恶意安装脚本”;在 Agent 世界里,还要额外担心扩展能力会不会操纵模型的下一步判断。一个恶意 MCP Server 未必需要直接拿到系统权限,它只要返回被精心构造的 tool result,就可能诱导 Agent 相信某个文件已验证安全、某个 URL 必须立刻访问、某个 shell 命令是“框架要求”的必要步骤。供应链风险从“代码执行”扩展成了“认知污染”。
这种风险尤其隐蔽,因为工具结果天然带有一种“来自系统组件”的权威感。模型往往会把工具输出视为高置信事实,而不是对抗性输入。因此,恶意插件最危险的地方不一定是它直接做坏事,而是它改变主 Agent 对世界状态的表征。一旦世界模型被污染,后续所有看似合理的推理都可能建立在伪造前提上。这和传统 API 安全不同:问题不只是返回值真假,而是返回值会不会重写 Agent 的行动计划。
OpenCode 对此的设计回应是承认插件不在统一信任边界内,并引入插件信任等级的思路。内置工具、用户本地显式配置的 MCP、第三方 skills/插件,本质上不是同一类安全主体,因而不应享有相同默认权限。高信任组件可以被允许提供更直接的执行能力;低信任组件则更适合作为信息源、候选建议源,或者必须经过用户审批才能触发外部副作用。这种分层不是为了“证明绝对安全”,而是为了避免所有扩展点被一刀切地当作可信内核。
更进一步看,Agent 时代的供应链治理重点不再只是校验二进制完整性,而是管理谁有资格影响意图形成。技能包可以改写系统提示词,MCP 可以增加新工具,插件可以塑造 UI 上的默认操作路径——这些都在改变 Agent 的决策地形。真正成熟的设计原则,是把扩展生态视作一个连续谱:能力越强、越自动化、越接近执行层的扩展,越需要更高的信任门槛、更细的权限说明,以及更显性的用户知情机制。
As Agent capability expands through MCP servers, skills, and plugin systems, the security boundary moves outward with it. In older software supply chains, developers mainly worried about whether a dependency would execute a malicious install script. In the Agent world, there is an additional concern: will the extension manipulate the model’s next decision? A malicious MCP server does not necessarily need direct system privileges. It can simply return carefully crafted tool results that persuade the Agent that a file has already been verified as safe, that a URL must be visited immediately, or that a shell command is “required by the framework.” Supply-chain risk thus expands from code execution into cognitive contamination.
That risk is subtle because tool output naturally carries an aura of authority. Models tend to treat it as high-confidence evidence rather than adversarial input. The most dangerous plugin is therefore not always the one that directly does something malicious; it is the one that changes the main Agent’s representation of reality. Once the world model is polluted, every later step of seemingly rational reasoning can rest on a fabricated premise. This is different from classical API security. The issue is not only whether the returned value is true, but whether it can rewrite the Agent’s action plan.
OpenCode’s design response is to acknowledge that plugins do not all belong to the same trust boundary and to think in terms of plugin trust levels. Built-in tools, locally configured MCP endpoints, and third-party skills/plugins are not the same security principal, so they should not receive identical defaults. Higher-trust components may be allowed to expose more direct execution power; lower-trust components are better treated as information sources, suggestion generators, or approval-gated capabilities. The point of this layering is not to “prove absolute safety.” It is to avoid the mistake of treating every extension point as if it were part of a trusted core.
At a deeper level, supply-chain governance for Agents is no longer just about binary integrity. It is about managing who is allowed to influence intent formation. Skill packs can rewrite system prompts, MCPs can add new tools, and plugins can shape the UI’s default action paths; all of them alter the Agent’s decision landscape. A mature design principle is to model the extension ecosystem as a continuum: the more powerful, automated, and execution-adjacent an extension becomes, the higher the trust threshold, the finer the permission disclosure, and the more explicit the user-awareness mechanism should be.
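The trust-continuum idea above can be sketched as a small gating function. This is not OpenCode's actual API; all names are illustrative. The point is only that side-effecting capability, not mere information flow, is what the trust tier restricts.

```typescript
// Sketch (not OpenCode's real code): extension trust tiers determine whether
// a component may act directly, must ask the user, or may only supply evidence.
type TrustTier = "builtin" | "local-mcp" | "third-party";

interface ToolSource {
  name: string;
  tier: TrustTier;
}

type Decision = "execute" | "ask-user" | "evidence-only";

// Higher-trust components get direct execution; lower-trust ones are demoted
// to approval-gated or purely informational roles.
function gate(source: ToolSource, hasSideEffects: boolean): Decision {
  if (!hasSideEffects) return "execute"; // pure reads are information, not action
  switch (source.tier) {
    case "builtin":
      return "execute"; // part of the trusted core
    case "local-mcp":
      return "ask-user"; // user-configured, but still approval-gated for side effects
    case "third-party":
      return "evidence-only"; // may suggest, never act
  }
}
```

The asymmetry is deliberate: every tier may contribute information, but only the innermost tier may turn information into action without a human in the loop.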
传统安全 vs Agent 安全Traditional Security vs Agent Security
传统安全:边界导向
经典安全设计擅长处理边界明确的系统:防火墙决定哪个流量能进,RBAC 决定哪个角色能调哪个 API,沙箱决定进程能否碰文件系统。核心假设是“先划边界,再做阻断”。一旦边界清晰,防御策略就能被工程化、自动化和形式化。
Agent 安全:意图导向
Agent 的难点在于它面对的是开放任务 + 开放工具 + 开放语义。同一个 bash 命令在不同上下文里可能是修复、实验或破坏;同一段文本既可能是文档,也可能是攻击载体。真正需要防守的不是单个系统调用,而是“谁在塑造行动意图”。
可枚举威胁
在传统系统中,威胁往往能用端口、身份、syscall、文件路径等对象来枚举,因此防御机制偏向静态策略:允许、拒绝、隔离、审计。
涌现式威胁
在 Agent 系统中,很多风险来自组合后的涌现行为:安全的文件读取 + 安全的网页访问 + 安全的 bash 工具,拼在一起却可能形成数据外传链路。风险只有放在完整任务轨迹里才会显现。
规则稳定
传统安全追求规则稳定性:策略一旦定义,输入落在什么边界内通常较容易判断。系统行为应尽可能不依赖模糊语义解释。
上下文依赖
Agent 安全高度依赖上下文:同样的读写操作,在“修复测试”场景可能合理,在“陌生仓库 README 指令”场景就可疑。安全判断不可避免地变成对上下文、目标和理由的联合评估。
Traditional security: boundary-oriented
Classical security design works best for systems with clear boundaries: firewalls decide which traffic may enter, RBAC decides which role may call which API, and sandboxes decide whether a process can touch the filesystem. The core assumption is “draw the boundary first, then block at it.” Once the boundary is crisp, defenses can be engineered, automated, and formalized.
Agent security: intent-oriented
The hard part of Agent security is that it operates over open tasks, open tools, and open semantics. The same bash command may be repair, experimentation, or sabotage depending on context; the same text may be documentation or an attack vector. What really has to be defended is not a single syscall, but who is shaping action intent.
Enumerable threats
In traditional systems, threats can often be enumerated in terms of ports, identities, syscalls, and file paths, so defenses lean toward static policy: allow, deny, isolate, audit.
Emergent threats
In Agent systems, many risks come from emergent composition: a safe file read, a safe webpage fetch, and a safe bash tool may together form an exfiltration path. The danger becomes visible only when the full task trajectory is considered.
Stable rules
Traditional security optimizes for rule stability: once policy is defined, it is usually easier to determine which side of the boundary an input belongs to. System behavior should depend as little as possible on ambiguous semantic interpretation.
Context-dependent judgments
Agent security is deeply context-dependent: the same read/write sequence may be legitimate during test repair but suspicious when triggered by instructions hidden in an unfamiliar README. Security judgments inevitably become joint evaluations of context, goal, and justification.
16 Agent 调度原理Agent Scheduling Principles
对 AI Agent 而言,调度从来不只是“把任务分给谁”这么简单。更深层的问题是:如何组织有限注意力、有限上下文和有限 token,使推理过程不被自己的中间产物反噬。OpenCode 的调度设计因此更接近一种认知架构,而不是传统分布式系统里的 job queue。
它关心的不是吞吐量最大化,而是推理质量在长任务中的稳定性:什么时候需要停下来反思,什么时候需要把问题外包给子 Agent,什么时候必须承认某条搜索路径已进入低价值重复。
For AI Agents, scheduling is never just about “which worker gets which task.” The deeper problem is how to organize limited attention, limited context, and limited tokens so that reasoning is not degraded by its own intermediate traces. OpenCode’s scheduler is therefore closer to a cognitive architecture than to a conventional distributed job queue.
Its real objective is not maximum throughput, but stable reasoning quality over long tasks: when to pause and reflect, when to outsource subproblems to child Agents, and when to admit that a search path has entered a low-value repetition regime.
ReAct 循环与元认知ReAct Loops and Metacognition
ReAct(Reason + Act)之所以有效,本质上是因为它把原本可能一次性冲出的“直觉式生成”,强制改造成分步自校验的元认知过程。如果借用认知科学里的说法,这相当于把大模型从更像 System 1 的快速联想,拉回到更像 System 2 的慢速推理。
单轮直接生成的问题在于:模型一旦走错方向,错误会在连续 token 中自我强化;而 ReAct 循环把“思考”与“行动”拆开,让每一次工具调用都变成一次外部实验。文件读取、搜索、诊断、命令执行,不只是获取信息,更是在为下一轮推理制造新的证据约束。
ReAct works because it turns what might otherwise be a one-shot burst of intuitive generation into a stepwise metacognitive process with explicit self-correction. In cognitive-science terms, it pulls the model away from something closer to System 1 association and forces it toward something closer to System 2 reasoning.
The problem with direct generation is that once the model drifts in the wrong direction, the error can self-amplify across subsequent tokens. A ReAct loop separates thinking from acting, so every tool call becomes an external experiment. Reading files, searching, diagnostics, and command execution do not merely fetch data; they create fresh evidence constraints for the next round of reasoning.
人类专家解决复杂问题时,也会通过写草稿、查资料、做中间验证来抑制直觉误判。ReAct 循环的设计意义就在这里:它把 LLM 的推理从“连续生成”转变为“离散决策”,让模型拥有类似外部工作台的思考空间。
Human experts solve hard problems by sketching, checking references, and validating intermediate assumptions. That is exactly what a ReAct loop simulates: it transforms LLM reasoning from continuous generation into discrete decision-making, giving the model something like an external workbench for thought.
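The reason/act alternation described above can be made concrete in a few lines. The sketch below is illustrative, not OpenCode's implementation: `decide` stands in for the model's reasoning step, and each tool call is the "external experiment" whose observation feeds the next round.

```typescript
// Minimal ReAct-style loop sketch: alternate a reasoning step ("decide") with
// an action step (a tool call), feeding each observation back as evidence.
interface Step {
  thought: string;
  action: string;
  observation: string;
}

type Tool = (input: string) => string;

function reactLoop(
  task: string,
  decide: (task: string, history: Step[]) => { action: string; input: string } | null,
  tools: Record<string, Tool>,
  maxSteps = 10,
): Step[] {
  const history: Step[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const next = decide(task, history); // "Reason": pick the next action from evidence
    if (next === null) break; // the model judges the task complete
    const observation = tools[next.action](next.input); // "Act": external experiment
    history.push({ thought: `step ${i}`, action: next.action, observation });
  }
  return history;
}
```

Because every iteration is a discrete decision over the accumulated history, an error made at step N can be contradicted by the observation at step N+1 instead of self-amplifying inside one continuous generation.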
多 Agent 编排的设计空间The Design Space of Multi-Agent Orchestration
多 Agent 系统至少有三种典型架构。第一种是扁平并行:一次性起多个 Agent 并行探索,优点是速度快,缺点是主 Agent 很快被各路返回结果淹没。第二种是流水线:一个 Agent 产出交给下一个 Agent,结构清晰,但容易把早期偏差一路传染到下游。第三种是层级委派:父 Agent 在主线上维持目标和判断,子 Agent 只负责局部探索,完成后返回压缩结果。
Multi-Agent systems have at least three canonical architectures. The first is flat parallelism: launch many Agents simultaneously and gather everything. It is fast, but the parent is quickly flooded by heterogeneous outputs. The second is a pipeline: one Agent hands work to the next. It is structurally neat, but early mistakes propagate downstream. The third is hierarchical delegation: the parent stays on the main reasoning line while child Agents explore local subproblems and return compressed results.
- 扁平并行:吞吐量高,但上下文污染最严重。父 Agent 必须亲自消化所有分支噪声。
- 流水线:结构线性、便于追踪,但错误像编译器中的坏中间表示一样,会沿阶段级联。
- 层级委派:父 Agent 保留决策权,子 Agent 在独立 session 中做局部搜索,是最有利于上下文卫生的架构。
OpenCode 选择层级委派,关键不只是“更优雅”,而是它几乎是唯一一种能真正保护父 Agent 上下文窗口的方式。子 Agent 在独立 session 中工作,意味着其详细试探、失败路径、无关搜索结果,不会原样回流到主上下文;父 Agent 只接收摘要化、任务相关的结论。
- Flat parallelism: high throughput, but maximal context pollution. The parent must personally absorb all branch noise.
- Pipeline: linear and traceable, but errors behave like corrupted compiler IR and cascade from stage to stage.
- Hierarchical delegation: the parent keeps judgment authority while children search locally in separate sessions, preserving context hygiene.
OpenCode chooses hierarchical delegation not just because it is “cleaner,” but because it is almost the only architecture that truly protects the parent’s context window. Child Agents work in independent sessions, so their dead ends, exploratory noise, and irrelevant details do not flow back verbatim into the main context. The parent receives a distilled conclusion instead.
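The isolation property can be sketched directly. In the illustrative code below (names are invented for the example), the child's dead ends live and die inside its own session; only one distilled line ever enters the parent's message history.

```typescript
// Sketch of hierarchical delegation with context isolation: the child explores
// in a separate session, and only a compressed summary crosses back.
interface Session {
  id: string;
  messages: string[];
}

function newSession(id: string): Session {
  return { id, messages: [] };
}

// The child does noisy local search inside its own session.
function childExplore(session: Session, subtask: string): string {
  session.messages.push(`exploring: ${subtask}`);
  session.messages.push("dead end A", "dead end B", "found: answer=42");
  // Only the distilled conclusion is returned; the noise stays here.
  return "answer=42";
}

function delegate(parent: Session, subtask: string): void {
  const child = newSession(`${parent.id}/child`);
  const summary = childExplore(child, subtask);
  parent.messages.push(`[subagent result] ${summary}`); // one line, not the whole trace
}
```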
Doom Loop 防护与信息隔离Doom Loop Protection and Information Isolation
所谓 Doom Loop,本质是 Agent 在低信息增益区域反复打转:重试同一命令、重复同一路径、持续生成相似推理,却没有获得新证据。对这种行为设置阈值,并不是经验主义补丁,而是一个典型的 precision-recall 权衡。
阈值过低,比如 1 或 2,会把正常的自我修正误判成异常,召回率高但精度很差;阈值过高,比如 5 或 6,虽然减少误报,却让无意义循环额外烧掉大量 token。threshold = 3 的直觉恰好落在一个平衡点:第一次失败可能是偶然,第二次失败可能是局部修正,第三次仍未带来新信息时,就足以视为行为模式而非偶发波动。
这背后还有一个信息论层面的考量:子 Agent 独立 session 的存在,相当于建立了信息隔离边界。不是所有信息都值得让主 Agent 看见;很多细节一旦进入上下文窗口,就会以注意力竞争者的身份存在,削弱主问题的信号强度。把无关细节限制在子 session 里,本质上是在做信噪比分离。
A Doom Loop is what happens when an Agent gets trapped in a low-information-gain region: retrying the same command, revisiting the same path, or generating near-identical reasoning without acquiring new evidence. Setting a threshold against that behavior is not an ad-hoc patch; it is a classic precision-recall trade-off applied to behavioral monitoring.
If the threshold is too low—say 1 or 2—legitimate retries are falsely flagged, giving high recall but poor precision. If it is too high—say 5 or 6—false alarms drop, but the system burns large amounts of tokens before intervening. Threshold = 3 sits at an intuitive optimum: the first failure may be incidental, the second may reflect local correction, but by the third uninformative repetition we are looking at a pattern rather than noise.
There is also an information-theoretic argument underneath. Independent child sessions create isolation boundaries. Not every piece of information deserves to be seen by the primary Agent; once it enters the context window, it competes for attention and weakens the signal of the main problem. Keeping irrelevant detail inside subagent sessions is fundamentally an exercise in signal-to-noise control.
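The threshold-of-three logic described above reduces to a short detector. This is an illustrative sketch, not OpenCode's code: it trips only when the same call produces the same uninformative observation three times in a row, so a single retry or a retry with a changed result never fires.

```typescript
// Doom-loop detection sketch with threshold = 3: three consecutive identical,
// uninformative attempts count as a behavioral pattern, not noise.
const DOOM_THRESHOLD = 3;

interface Attempt {
  call: string;
  observation: string;
}

function isDoomLoop(history: Attempt[]): boolean {
  if (history.length < DOOM_THRESHOLD) return false;
  const tail = history.slice(-DOOM_THRESHOLD);
  const first = tail[0];
  // Same call, same result, three times running: no new evidence is arriving.
  return tail.every(
    (a) => a.call === first.call && a.observation === first.observation,
  );
}
```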
在传统系统里,scheduler 关心 CPU 时间片;在 AI Agent 系统里,scheduler 更关心的是哪一些中间信息有资格进入“意识层”。OpenCode 的层级委派、阈值监控与独立 session,合起来构成的是一套上下文卫生机制。
In classical systems, a scheduler manages CPU slices. In AI Agent systems, a scheduler decides which intermediate artifacts deserve admission into the “conscious layer” at all. OpenCode’s hierarchical delegation, threshold monitoring, and independent sessions together form a context-hygiene mechanism.
17 上下文工程哲学Context Engineering Philosophy
如果说 prompt engineering 关注“怎样措辞”,那么 context engineering 关注的是怎样管理认知状态。在 Agent 系统里,上下文窗口不是被动容器,而是最昂贵的运行时资源之一。OpenCode 的很多实现细节——compaction、skills 懒加载、工具输出截断、子 session 隔离——看似分散,实际上都服从同一条原则:把 token 当作一等稀缺资源来治理。
If prompt engineering is about “how to phrase a request,” context engineering is about how to manage cognitive state. In Agent systems, the context window is not a passive container; it is one of the most expensive runtime resources available. Many of OpenCode’s implementation choices—compaction, lazy-loaded skills, output truncation, and session isolation—look separate on the surface, but they all follow the same principle: treat tokens as a first-class scarce resource.
Compaction 作为认知状态管理Compaction as Cognitive State Management
把 compaction 理解为“压缩旧对话”其实不够准确。更精确的类比是操作系统的内存换页:上下文窗口像 RAM,原始消息像高密度驻留页,而结构化摘要像 page table,记录哪些状态被降级存储、哪些线索仍然可被重新调入。这里并没有真正丢弃历史,只是把历史从“逐 token 可见”降为“按需可恢复”。
这一区别非常重要。删除意味着信息消失,compaction 则意味着信息密度重排。系统承认工作记忆是有限的,因此必须主动把低频但仍重要的内容迁移到更紧凑的表示形式。换句话说,compaction 不是清垃圾,而是在做认知层次管理。
It is slightly misleading to describe compaction as merely “compressing old conversation.” The more precise analogy is OS memory paging: the context window is RAM, raw messages are high-density resident pages, and the structured summary behaves like a page table that records which state has been demoted and can later be reloaded if needed. Nothing is truly discarded; history is moved from token-by-token visibility to a denser, demand-recoverable form.
That distinction matters. Deletion means loss. Compaction means reordering storage density. The system admits that working memory is finite, so low-frequency but still important material must be migrated into a more compact representation. In that sense, compaction is not garbage cleanup; it is cognitive state management.
- 上下文窗口(类比 RAM):适合高频访问,但容量昂贵且极易被噪声挤占。
- 结构化摘要(类比 page table):保留语义索引,让系统知道“曾经发生过什么”而非逐字背诵全部历史。
- The live context window (the RAM of the analogy): ideal for frequent access, but expensive and easily crowded out by noise.
- The structured summary (the page table of the analogy): preserves semantic indexing so the system remembers what happened without reciting the entire past verbatim.
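The paging analogy can be sketched as a small routine. The code below is illustrative (the token estimate is a crude 4-characters-per-token heuristic, and `summarize` stands in for an LLM summarization call): when the window exceeds its budget, the oldest messages are demoted into one structured summary line rather than deleted.

```typescript
// Compaction sketch: demote the oldest messages into a summary when the
// live window exceeds its token budget. Density reordering, not loss.
interface Msg {
  role: string;
  text: string;
}

function estimateTokens(m: Msg): number {
  return Math.ceil(m.text.length / 4); // crude 4-chars-per-token heuristic
}

function compact(
  history: Msg[],
  budget: number,
  summarize: (old: Msg[]) => string,
): Msg[] {
  let total = history.reduce((n, m) => n + estimateTokens(m), 0);
  if (total <= budget) return history;
  const demoted: Msg[] = [];
  const kept = [...history];
  // Page out from the oldest end until we fit (always keep the last message).
  while (total > budget && kept.length > 1) {
    const oldest = kept.shift()!;
    demoted.push(oldest);
    total -= estimateTokens(oldest);
  }
  return [{ role: "summary", text: summarize(demoted) }, ...kept];
}
```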
Skills 懒加载的模块化认识论Skill Lazy Loading as Modular Epistemology
Skills 的价值不仅在于“可扩展”,更在于它把知识看作可按需装载的模块。这与编程语言里的 import 非常相似:你不会在每个进程启动时把所有库静态链接进去,因为那会增加体积、延迟启动,并让依赖图失去边界。对 Agent 也是一样——预先加载一切知识,等价于让模型在每一次任务中都背着不相关的说明书。
懒加载因此是一种认识论选择:系统明确决定“当前什么值得被注意”。注意力不是免费的;每一段预加载说明都会参与竞争。把 skills 改成按需引入,等于把上下文从百科全书变成模块化运行时,让知识以任务相关性的名义进入现场。
The point of skills is not merely extensibility; it is that knowledge is treated as a module that should be loaded on demand. The analogy to import statements is close: you would not statically link every library into every process, because that inflates size, slows startup, and destroys dependency boundaries. The same applies to Agents: preloading all knowledge means forcing the model to carry irrelevant manuals into every task.
Lazy loading is therefore an epistemological choice: the system explicitly decides what deserves attention right now. Attention is not free; every preloaded instruction participates in competition. Turning skills into on-demand imports transforms context from an encyclopedia into a modular runtime where knowledge enters only under the discipline of relevance.
“知道很多”与“此刻该想什么”不是同一个问题。优秀的 Agent 不只是知识面广,更重要的是能控制何时让哪一块知识进入推理回路。
“Knowing many things” is not the same problem as “deciding what should occupy thought right now.” Strong Agents are defined not just by breadth of knowledge, but by control over when each piece of knowledge enters the reasoning loop.
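The on-demand pattern can be sketched as a two-tier registry. This is an illustrative sketch, not OpenCode's skill loader: cheap name/description stubs are always resident, while the expensive instruction body is pulled into context only when the task matches.

```typescript
// Lazy skill loading sketch: metadata stubs are always visible; full
// instruction bodies are imported only under the discipline of relevance.
interface Skill {
  name: string;
  description: string;
  load: () => string; // the expensive body, fetched on demand
}

const registry = new Map<string, Skill>();

function registerSkill(skill: Skill): void {
  registry.set(skill.name, skill); // only name + description stay resident
}

// Context assembly: descriptions always visible, bodies loaded on demand.
function buildContext(relevant: string[]): string[] {
  const ctx: string[] = [];
  for (const skill of registry.values()) {
    ctx.push(`skill available: ${skill.name} — ${skill.description}`);
    if (relevant.includes(skill.name)) {
      ctx.push(skill.load()); // pay the attention cost only when needed
    }
  }
  return ctx;
}
```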
输出截断与 Token 经济学Output Truncation and Token Economics
工具输出截断常被误解为权宜之计,但在好的 Agent 系统里,它其实是一级设计决策。把结果限制在约 2000 token,并通过 truncated: true 之类的元数据明确告诉模型“这里还有更多”,等于由工具层主动承担上下文预算管理责任。模型不必先吞下整个冗长输出再决定是否继续,而是先获得一个足以判断价值的摘要信号。
这带来了一个重要后果:查询 refinement 被前置了。模型看到截断标记后,会倾向于提出更窄、更有信息增益的下一次读取,而不是盲目消费全部文本。工具系统因此不只是 I/O 层,还是预算控制器。
从更广义的角度看,OpenCode 的 compaction、skills 懒加载、输出截断、独立 session 隔离,共同构成了一套一致的 token economics。上下文窗口中的每个 token 都有机会成本:它占据的位置,本可以留给更相关的证据、更关键的约束或更高价值的计划。好的 context engineering,本质上就是持续问一句话——这一个 token 值得在“意识层”里占位置吗?
Output truncation is often mistaken for a hack, but in a well-designed Agent system it is a first-class design decision. Capping tool output at roughly 2000 tokens and exposing metadata such as truncated: true means the tool layer takes responsibility for context-budget management. The model does not need to ingest a massive dump before deciding whether more detail is warranted; it first receives a signal strong enough to judge value.
The consequence is important: query refinement gets moved earlier in the loop. Once the model sees a truncation marker, it is encouraged to issue a narrower, higher-information-gain follow-up request instead of blindly consuming everything. The tool system is therefore not just an I/O interface; it is a budget controller.
More broadly, OpenCode’s compaction, skill lazy loading, output truncation, and session isolation form a coherent token economics. Every token in the context window has an opportunity cost: the space it occupies could have been reserved for more relevant evidence, tighter constraints, or a higher-value plan. Good context engineering is the discipline of repeatedly asking one question: does this token deserve to occupy consciousness?
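The truncation contract described above is simple to sketch. The code below is illustrative (the character-based token estimate is an assumption, not OpenCode's exact accounting): the tool caps its output near the budget and attaches a `truncated` flag so the model knows more exists without paying for it.

```typescript
// Output truncation sketch: cap tool results near a token budget and expose
// truncated metadata so the model can choose a narrower follow-up query.
const MAX_OUTPUT_TOKENS = 2000;

interface ToolResult {
  output: string;
  metadata: { truncated: boolean };
}

function truncateOutput(raw: string, maxTokens = MAX_OUTPUT_TOKENS): ToolResult {
  const maxChars = maxTokens * 4; // crude 4-chars-per-token heuristic
  if (raw.length <= maxChars) {
    return { output: raw, metadata: { truncated: false } };
  }
  return {
    output: raw.slice(0, maxChars),
    metadata: { truncated: true }, // explicit "there is more" signal
  };
}
```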
18 Effect-TS 深度解析Effect-TS Deep Dive
如果说 OpenCode 的外层看起来像一个“会调用工具的 LLM”,那么它的内核其实更接近一个可中断、可恢复、可组合的运行时系统。这正是为什么它最终站在 Effect-TS 之上:这里需要的不是更方便的异步语法糖,而是一套能把失败、依赖、并发、资源生命周期都提升为一等公民的语义基础。
If OpenCode looks from the outside like “an LLM that can call tools,” its inner core is closer to an interruptible, resumable, composable runtime system. That is why Effect-TS is the right foundation: what this system needs is not nicer async syntax, but a semantic model where failure, dependencies, concurrency, and resource lifecycles are all first-class.
为什么选择 EffectWhy Effect-TS
传统 async/await 的最大结构性问题,不是它不能表达异步,而是它几乎不类型化失败。Promise<T> 只告诉你成功时会得到什么,却不告诉你失败时究竟会发生什么;于是所有真实系统最终都把错误压扁成 catch (e: unknown)。对于只做一次网络请求的小程序,这还勉强可接受;但对要调用外部 API、执行 shell、读写文件、协调权限与中断的 Agent 来说,这会迅速退化成不可维护的异常泥潭。
Effect 的关键不同在于:Effect<A, E, R> 同时把成功值、错误通道与运行需求编码进类型系统。也就是说,返回值不再只是“最后会拿到什么”,而是完整描述“可能产出什么、可能失败成什么、运行时需要什么”。这对 Agent 系统尤其关键,因为它面对的不是单一失败模式,而是权限拒绝、磁盘错误、网络超时、用户取消、上游限流、解析失败等异构故障面。
一旦失败被显式放进类型,错误就不再是事后补丁,而是可组合的控制流。你可以推迟处理、局部恢复、精确匹配、选择重试策略,并让编译器持续提醒“还有哪些失败分支没有被思考”。这就是 Effect 在 OpenCode 里真正有价值的原因:它把系统设计从“希望别出错”变成“错误本身就是架构材料”。
The structural weakness of traditional async/await is not that it cannot express asynchrony; it is that it barely types failure. Promise<T> describes the success channel only, leaving the failure side collapsed into catch (e: unknown). That is survivable in a tiny app making one HTTP request. In an Agent that calls external APIs, executes shell commands, reads files, coordinates permissions, and handles interruption, it turns into an unmaintainable swamp of opaque exceptions.
Effect differs because Effect<A, E, R> encodes success, error, and requirements in the type system at the same time. A return type no longer means only “what eventually comes back”; it describes what can be produced, how it can fail, and which runtime capabilities are required. That matters enormously for Agent systems because their failure surface is radically diverse: permission denials, filesystem faults, timeouts, cancellations, rate limits, parse failures, and more.
Once failure is explicit in the type, it stops being an afterthought and becomes composable control flow. You can defer handling, recover locally, match precisely, choose retry policies, and let the compiler keep asking which failure branches remain unaccounted for. That is the real value of Effect in OpenCode: it shifts design from “hope errors do not happen” to “errors are part of the architecture itself.”
代数效果与 Generator 语法Algebraic Effects & Generator Syntax
Effect.gen(function* () { ... yield* ... }) 看起来像是一种更优雅的异步写法,但它背后其实更接近轻量级代数效果系统。在这里,yield* 不是简单地“等待一个 Promise 完成”,而是把执行权显式交回运行时,让运行时决定如何提供依赖、如何处理中断、是否重试、是否恢复,甚至是否在这个暂停点注入别的控制逻辑。
这和 async/await 有根本差异。在普通 Promise 世界里,await 之间几乎没有结构化的控制面;运行时无法把“等待”解释成带语义的程序节点。而在 OpenCode 的 agent loop(例如 session/prompt.ts 里的主 runLoop)中,while(true) 放在 Effect.gen 内部,每一次 yield* 都是一个可观察、可拦截、可组合的暂停点。这让系统能够在循环内部注入服务、传播取消、处理边界情况,而不是把这些逻辑散落在命令式样板代码里。
Effect.gen(function* () { ... yield* ... }) may look like a nicer way to write async code, but it behaves much more like a lightweight algebraic effects system. Here, yield* is not merely “wait for a Promise”; it hands control back to the runtime so the runtime can decide how to provide dependencies, handle interruption, retry, resume, or inject other control behavior at that suspension point.
That is fundamentally different from async/await. In ordinary Promise code, there is very little structured control surface between awaits; the runtime cannot interpret those pauses as semantically rich program nodes. In OpenCode’s agent loop—such as the main runLoop in session/prompt.ts—the while(true) lives inside Effect.gen, and every yield* becomes an observable, interceptable, composable suspension point. That gives the system room to inject services, propagate interrupts, and handle edge conditions inside the loop instead of scattering them across imperative boilerplate.
真正的优势不在于语法长得像同步代码,而在于每个 yield* 都是运行时可以接管的语义节点。Agent loop 因而不只是“顺序执行步骤”,而是“由运行时持续编排的可恢复过程”。
The real advantage is not that the syntax looks sequential; it is that every yield* is a semantic node the runtime can take over. The agent loop stops being “a list of awaited steps” and becomes a resumable process continuously orchestrated by the runtime.
类型化错误链Typed Error Chains
OpenCode 的错误建模不是把所有失败揉成一个字符串,而是用 Schema.TaggedErrorClass 定义领域错误:例如 Permission.RejectedError、Permission.CorrectedError、Permission.DeniedError、Runner.Cancelled、FileSystemError。这些错误在类型层形成判别联合,直接进入 Effect 的 Error 通道。
这样一来,处理器就可以根据错误标签决定控制流,而不是在一段模糊的异常文本里猜测含义:权限被拒时阻断主循环,用户取消时优雅清理,文件系统故障时按策略重试。重要的不是“错误名字更好看”,而是错误终于具备了可推理的结构。
Effect<ExecuteResult, Permission.Error | FileSystemError, Config.Service>
当你看到这样的类型时,你看到的已经不是某个函数签名,而是一条架构承诺:这个执行流程会返回什么、会以哪几类方式失败、并依赖哪个配置服务存在。与其说 Effect 在“处理异常”,不如说它在把失败链显式化,让系统可以精细地恢复、终止或升级错误。
OpenCode does not model failure by collapsing everything into a string. It defines domain errors with Schema.TaggedErrorClass: Permission.RejectedError, Permission.CorrectedError, Permission.DeniedError, Runner.Cancelled, FileSystemError, and others. Those errors form discriminated unions in the type layer and flow directly through the Error channel of Effect.
That lets the processor decide control flow based on tags instead of guessing from a vague exception message: rejected permissions block the loop, cancellations trigger graceful cleanup, filesystem faults can be retried according to policy. The important point is not cosmetic naming; it is that failure now has a structure the program can reason about.
Effect<ExecuteResult, Permission.Error | FileSystemError, Config.Service>
A type like this is already an architectural promise: what the flow can return, which failures it can produce, and which service it requires to run. Effect is not merely “handling exceptions”; it is making the failure chain explicit so the system can recover, terminate, or escalate with precision.
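The tag-based control flow described above can be sketched with a plain discriminated union. The error names below mirror the document's examples, but the code is an illustrative sketch rather than OpenCode's `Schema.TaggedErrorClass` definitions.

```typescript
// Tagged-error sketch: failures form a discriminated union, so the processor
// branches on structure instead of parsing exception text.
type AgentError =
  | { _tag: "PermissionRejected"; tool: string }
  | { _tag: "Cancelled" }
  | { _tag: "FileSystemError"; path: string; retryable: boolean };

type Outcome = "halt-loop" | "cleanup" | "retry" | "fail";

function handle(err: AgentError): Outcome {
  switch (err._tag) {
    case "PermissionRejected":
      return "halt-loop"; // rejected permissions block the main loop
    case "Cancelled":
      return "cleanup"; // user cancellation triggers graceful teardown
    case "FileSystemError":
      return err.retryable ? "retry" : "fail"; // policy-driven recovery
  }
}
```

Because the union is closed, adding a new error tag makes the compiler flag every handler that has not yet thought about it, which is the "which failure branches remain unaccounted for" property the text describes.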
Layer 与依赖注入Layers & Dependency Injection
OpenCode 的主要子系统几乎都遵循同一模式:导出 Service(tag)、layer(构造器)和 defaultLayer(已接线版本)。像 Provider.Service、Permission.Service、Bus.Service、SessionProcessor.Service 这样的接口,本质上都是带类型约束的上下文能力,而不是隐式单例。
当你写 Layer.effect(Service, Effect.gen(...)) 时,真正发生的事是:实现本身也通过 yield* 去声明自己依赖哪些上游服务。于是依赖注入不再依靠反射、装饰器或黑盒容器,而是纯粹由类型系统和 Layer 图来驱动。defaultLayer 再通过 Layer.provide(Auth.defaultLayer, Config.defaultLayer, ...) 把整张依赖图接起来,哪里缺了一层、哪里类型不匹配,编译期就会暴露出来。
Nearly every major OpenCode subsystem follows the same pattern: export a Service tag, a layer constructor, and a wired defaultLayer. Interfaces such as Provider.Service, Permission.Service, Bus.Service, and SessionProcessor.Service are typed contextual capabilities, not implicit singletons.
When you write Layer.effect(Service, Effect.gen(...)), the implementation itself declares its upstream dependencies by yield*-ing them. Dependency injection therefore happens without reflection, without decorators, and without a magic container; it is driven entirely by the type system and the Layer graph. Then defaultLayer wires the full graph with Layer.provide(Auth.defaultLayer, Config.defaultLayer, ...), and missing edges or mismatched capabilities surface at compile time.
在很多系统里,架构图只是文档;在 Effect 里,Layer 图本身就是可执行的装配说明。依赖关系不是“约定俗成”,而是被类型检查持续验证。
In many systems, the architecture diagram is only documentation. In Effect, the Layer graph is executable assembly logic. Dependencies are not “understood by convention”; they are continuously checked by the compiler.
为什么 Agent 系统需要 EffectWhy Agent Systems Need Effect
Agent 系统的难点从来不只是“发请求然后拿结果”,而是长生命周期控制:循环可能持续很久,工具会并发运行,用户可能中途取消,资源必须在异常路径中清理,流式输出需要背压控制,重试还要尊重 Retry-After 等上游协议。这些需求如果建立在普通 Promise 拼装之上,往往一开始看似轻便,最后却变成分散在各处的技术债。
Effect 则把这些能力直接提供为可组合构件:Fiber 负责轻量并发,Deferred 负责协调,Stream 负责带背压的迭代,Scope 负责资源管理,Schedule 负责重试策略。在 OpenCode 里,Runner 用 Deferred / SynchronizedRef 管理 shell ↔ run 状态切换;处理器用 Stream.tap 与 takeUntil 做事件处理和 compaction 边界;重试策略则显式尊重服务端返回的节流信号。换句话说,Effect 不是额外叠上去的框架,而是让 Agent 的本质复杂度从第一天起就能被结构化表达。
The hard part of an Agent system is never just “make a request and get a result”; it is long-lived control. Loops run for a long time, tools execute concurrently, users may cancel mid-flight, resources must be cleaned up on failure paths, streaming must respect backpressure, and retries must honor upstream contracts such as Retry-After. If these concerns are built on ad hoc Promise composition, the system often starts simple and ends in scattered technical debt.
Effect exposes those concerns as composable primitives from the start: Fiber for lightweight concurrency, Deferred for coordination, Stream for backpressure-aware iteration, Scope for resource management, and Schedule for retry policies. In OpenCode, the Runner uses Deferred / SynchronizedRef to manage shell ↔ run state transitions; the processor uses Stream.tap with takeUntil to handle event flows and compaction boundaries; retry policies explicitly respect server rate-limit signals. Effect is therefore not an add-on framework layered onto the system after the fact. It is the reason the system’s native complexity can remain composable instead of degenerating into incidental complexity.
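One of those concerns, retrying while honoring `Retry-After`, is easy to sketch outside of Effect's `Schedule` combinators. The function below is an illustrative policy, not OpenCode's implementation: a 429 carrying an explicit server delay overrides the local exponential-backoff curve, because the upstream's contract wins.

```typescript
// Retry-policy sketch: exponential backoff by default, but an explicit
// Retry-After signal from the server takes precedence over the local curve.
interface FailedResponse {
  status: number;
  retryAfterSeconds?: number; // parsed from the Retry-After header, if present
}

function nextDelayMs(
  attempt: number,
  resp: FailedResponse,
  baseMs = 500,
  capMs = 30_000,
): number | null {
  const MAX_RETRIES = 3;
  if (attempt >= MAX_RETRIES) return null; // give up after the retry budget
  if (resp.status === 429 && resp.retryAfterSeconds !== undefined) {
    return resp.retryAfterSeconds * 1000; // honor the server's explicit signal
  }
  return Math.min(baseMs * 2 ** attempt, capMs); // 500, 1000, 2000, ... capped
}
```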
19 Provider 抽象层原理Provider Abstraction Principles
OpenCode 的 Provider 抽象层,本质上是在解决一个比“接哪家 API”更深的问题:如何在模型能力高度异构、计费规则不透明、上游可靠性不稳定的前提下,仍然维持统一而可信的 Agent 运行语义。它的价值不在于把所有 Provider 拉平,而在于把差异变成可声明、可计算、可恢复的系统边界。
OpenCode’s Provider abstraction layer solves a problem deeper than “which API should we call?”: how to preserve a uniform and trustworthy Agent runtime semantics when model capabilities are highly heterogeneous, pricing rules are uneven, and upstream reliability is unstable. Its value is not that it flattens every Provider into sameness, but that it turns those differences into explicit, computable, recoverable boundaries.
能力协商的设计空间The Design Space of Capability Negotiation
LLM Provider 之间的能力差异远比传统 REST API 更剧烈:有的支持 tool_call,有的完全不支持;有的接受图像、音频和 PDF,多数仍然只吃纯文本;有的暴露 reasoning mode,有的则把思维链能力藏在黑箱后面。OpenCode 因而为每个模型维护一个静态 Model.capabilities 清单:{ temperature, reasoning, attachment, toolcall, input: { text, audio, image, video, pdf }, output: { text, audio, image, video, pdf }, interleaved }。
关键洞见在于:这不是运行时探测,而是来自 models.dev 的人工整理注册表。浏览器世界的 feature detection 对 LLM 并不可靠——一个模型即使“接受” image 参数,也可能只是吞下字段后对内容胡乱编造。对于语言模型,能力元数据必须是权威声明,不能是“先试试再说”的猜测。
这也是为什么 ProviderTransform 层(transform.ts)并不只是做路由转换。它会在请求发出前,根据 capability 标记主动改写消息:过滤不支持的模态、为 Claude/Mistral 擦除不兼容的 tool-call ID、给 Anthropic 注入 cache headers。换句话说,这一层做的是能力驱动的 request shaping,而不是“选个 URL 发出去”这么简单。
Capability differences across LLM Providers are far more drastic than in ordinary REST APIs: some support tool_call, some do not; some accept images, audio, and PDFs, while many still operate as text-only endpoints; some expose a reasoning mode, while others hide that behavior behind opaque product semantics. OpenCode therefore keeps a static Model.capabilities manifest per model: { temperature, reasoning, attachment, toolcall, input: { text, audio, image, video, pdf }, output: { text, audio, image, video, pdf }, interleaved }.
The critical insight is that this is not runtime probing. It is a curated registry sourced from models.dev. Browser-style feature detection is unreliable for LLMs: a model may “accept” an image parameter yet hallucinate about what the image contains. In language-model systems, capability metadata must be an authoritative declaration, not a best-effort discovery process.
That is why the ProviderTransform layer in transform.ts is more than a router adapter. Before a request leaves the process, it mutates messages according to capability flags: dropping unsupported modalities, scrubbing incompatible tool-call IDs for Claude or Mistral, and injecting cache headers for Anthropic. This is capability-driven request shaping, not mere endpoint routing.
在 SDP / RTSP 世界里,双方先依据权威能力表协商格式,再发送媒体流;不会靠“发一帧试试看”来决定协议。LLM 系统同理:manifest 必须先于请求存在,否则所谓抽象层只是在放大不确定性。
In SDP / RTSP-style systems, peers negotiate from an authoritative capability table before media starts flowing; they do not send a random frame and infer the protocol afterward. LLM abstractions work the same way: the manifest must exist before the request, or the abstraction layer merely amplifies uncertainty.
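Manifest-first request shaping can be sketched in a few lines. The types below are a simplified slice of the manifest described in the text (only three input modalities, invented field layout): unsupported parts are dropped before the request leaves the process, never probed at runtime.

```typescript
// Capability-driven request shaping sketch: filter message parts against a
// static, authoritative capability manifest before sending anything upstream.
interface Capabilities {
  toolcall: boolean;
  input: { text: boolean; image: boolean; pdf: boolean };
}

interface Part {
  kind: "text" | "image" | "pdf";
  data: string;
}

function shapeRequest(parts: Part[], caps: Capabilities): Part[] {
  // The manifest exists before the request; unsupported modalities never leave.
  return parts.filter((p) => caps.input[p.kind]);
}
```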
成本模型的信息论基础Information-Theoretic Foundation of Cost Tracking
Provider 抽象层的第二个任务不是“记账”,而是把 Provider 的资源分配逻辑显式化。OpenCode 的成本模型把 token 划分为 input、output、cache-read、cache-write,并允许在上下文超过 200K 时切换到更高单价 tier。实际计算遵循类似 Decimal(tokens.input * cost.input / 1000000) + Decimal(tokens.output * cost.output / 1000000) 的形式,所有项都用 Decimal.js 处理,避免浮点误差在大规模会话里累积成真实金钱损失。
这里最值得注意的不是公式本身,而是它背后的资源语义。reasoning token 按 output 费率计价,并非任意规定,而是因为它们占用的是模型的顺序生成能力;输入 token 通常更便宜,则反映了预填充缓存后每个 token 的边际注意力成本较低。换言之,价格表其实是 Provider 内部计算结构的外部投影。
当 200K 阈值触发高价档位时,成本曲线出现明显阶跃。这会反向塑造系统设计:上下文窗口不再只是“能塞多少”,而是一个需要严肃纪律管理的预算边界。信息论视角下,token 不是抽象字符,而是被编码进注意力与生成流水线的稀缺资源。
const total = Decimal(tokens.input * cost.input / 1000000)
.plus(Decimal(tokens.output * cost.output / 1000000))
.plus(Decimal(tokens.cacheRead * cost.cacheRead / 1000000))
.plus(Decimal(tokens.cacheWrite * cost.cacheWrite / 1000000))
The second job of the Provider layer is not simply “billing”; it is to make the Provider’s resource-allocation logic explicit. OpenCode’s cost model splits tokens into input, output, cache-read, and cache-write, with an override tier once context exceeds 200K. The arithmetic follows forms like Decimal(tokens.input * cost.input / 1000000) + Decimal(tokens.output * cost.output / 1000000), and every term is computed with Decimal.js so floating-point drift does not accumulate into real money at scale.
The interesting part is not the formula but the resource semantics it encodes. Reasoning tokens are billed at the output rate because they consume sequential generation capacity. Input tokens are usually cheaper because, once the prompt is prefetched into cache, the marginal compute pattern differs from autoregressive decoding. Pricing is therefore an externalized shadow of the Provider’s internal compute allocation.
Once the 200K threshold activates a higher tier, the cost curve develops a step function. That changes system behavior upstream: context windows stop being only “how much fits” and become a budget boundary that demands discipline. From an information-theoretic perspective, tokens are not abstract characters; they are scarce units flowing through attention, cache reuse, and generation.
const total = Decimal(tokens.input * cost.input / 1000000)
.plus(Decimal(tokens.output * cost.output / 1000000))
.plus(Decimal(tokens.cacheRead * cost.cacheRead / 1000000))
.plus(Decimal(tokens.cacheWrite * cost.cacheWrite / 1000000))
哪些 token 更贵,不只是商业定价问题,而是在告诉你:Provider 认为哪些阶段更消耗算力、更稀缺、也更值得被限制。抽象层若忽视这一点,就无法做出理性的上下文管理。
Which tokens cost more is not only a business choice. It reveals which stages the Provider considers more compute-intensive, scarcer, and therefore worth constraining. An abstraction layer that ignores this cannot make rational context decisions.
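下面用一个可运行的 TypeScript 草图还原这套四分类计费与 200K 阶跃档位。为了省去 Decimal.js 依赖,这里把单价表示为「每百万 token 的微美元」整数,用 BigInt 做精确整数运算达到同样的「无浮点漂移」目标;价格数字与档位单价纯属示意,并非任何 Provider 的真实价目。
The sketch below reproduces the four-way accounting and the 200K tier step in runnable TypeScript. To avoid the Decimal.js dependency, prices are integer micro-dollars per million tokens computed in exact BigInt arithmetic, achieving the same "no floating-point drift" goal; all price figures and the tier threshold values are illustrative assumptions, not any Provider's real rates.

```typescript
// Four-way token cost accounting in exact integer arithmetic.
// Prices: micro-dollars (1e-6 USD) per 1M tokens — hypothetical figures.

type TokenUsage = { input: bigint; output: bigint; cacheRead: bigint; cacheWrite: bigint }
type PriceTable = { input: bigint; output: bigint; cacheRead: bigint; cacheWrite: bigint }

const MILLION = 1_000_000n

// Hypothetical standard tier: $3/M input, $15/M output, cheap cache reads.
const standardTier: PriceTable = {
  input: 3_000_000n,
  output: 15_000_000n,
  cacheRead: 300_000n,
  cacheWrite: 3_750_000n,
}

// Hypothetical higher tier once the context exceeds 200K tokens.
const longContextTier: PriceTable = {
  input: 6_000_000n,
  output: 22_500_000n,
  cacheRead: 600_000n,
  cacheWrite: 7_500_000n,
}

function costMicroDollars(tokens: TokenUsage, contextTokens: bigint): bigint {
  // The tier switch is what produces the step in the cost curve.
  const price = contextTokens > 200_000n ? longContextTier : standardTier
  return (
    (tokens.input * price.input) / MILLION +   // BigInt division truncates;
    (tokens.output * price.output) / MILLION + // exact for whole-token counts
    (tokens.cacheRead * price.cacheRead) / MILLION +
    (tokens.cacheWrite * price.cacheWrite) / MILLION
  )
}
```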
多模型 Fallback 策略Multi-Model Fallback Strategy
Provider 抽象层还必须面对另一个现实:同一个“模型名”背后,往往对应多个物理上游。OpenCode 在 Zen hosted 层采用加权、基于哈希的 Provider 选择:先对 session ID 做哈希,得到确定性的 provider 命中,从而形成 sticky routing;若请求失败且返回非 200,则把失败 provider 从候选集合中排除,递归重试,最多 MAX_FAILOVER_RETRIES = 3;最后再降级到该模型的显式 fallbackProvider。如果遇到 429,则单独走 MAX_429_RETRIES = 3 的限流重试,并解析 Retry-After 头部。
本地 CLI 侧则更多依赖带指数退避的 Schedule 策略,同样尊重上游给出的节流信号。二者共同体现了一个很重要的架构分离:selection 负责回答“先试哪家”,resilience 负责回答“失败后怎么退、退几次、间隔多久”。前者追求会话亲和性与负载稳定,后者追求可靠性与对上游协议的礼貌。
if (status === 429) return retryWith(retryAfter)
if (status !== 200) return failoverToNextProvider()
return streamResponse()
The Provider layer must also confront a practical fact: the same “model name” may sit on top of multiple physical upstreams. In the Zen hosted layer, OpenCode uses weighted hash-based provider selection: hash the session ID, derive a deterministic provider choice, and preserve sticky routing. If a non-200 response occurs, exclude the failed provider from the candidate set and retry recursively up to MAX_FAILOVER_RETRIES = 3; then, as a final step, degrade to the model’s designated fallbackProvider. A 429 is handled separately with MAX_429_RETRIES = 3 and explicit Retry-After parsing.
The local CLI leans more heavily on exponential-backoff Schedule policies, again respecting upstream throttling headers. Together, these choices illustrate a crucial separation: selection answers “which provider should we try first?”, while resilience answers “what do we do after failure, how long do we wait, and how many retries are acceptable?” The first optimizes affinity and load stability; the second optimizes reliability and protocol correctness.
if (status === 429) return retryWith(retryAfter)
if (status !== 200) return failoverToNextProvider()
return streamResponse()
如果把 provider 选择和 retry 策略揉成一个开关,局部故障会被放大成级联失败。把横向选择与纵向韧性拆开,系统才能既保持 session affinity,又避免把同一个坏上游打爆。
If provider selection and retry policy are collapsed into one switch, local faults become cascading failures. Separate the horizontal choice of upstream from the vertical logic of retries, and the system can preserve session affinity without hammering the same broken endpoint.
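可以用一个小型 TypeScript 草图把「横向选择」与「纵向韧性」的分离写出来:对 session ID 做确定性哈希得到 sticky 命中,失败后把坏上游从候选集中剔除再递归重试。这里的 FNV-1a 哈希、provider 名称与权重都是示意性假设,并非 OpenCode 的真实实现。
A small TypeScript sketch can make the selection/resilience split concrete: hash the session ID deterministically for sticky routing, then exclude a failed upstream and recurse. The FNV-1a hash, provider names, and weights here are illustrative assumptions, not OpenCode's actual implementation.

```typescript
// Weighted, hash-based sticky provider selection with non-200 failover.

type Upstream = { name: string; weight: number }

// Deterministic FNV-1a string hash: the same session ID always lands on
// the same provider while that provider stays in the candidate set.
function fnv1a(s: string): number {
  let h = 0x811c9dc5
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i)
    h = Math.imul(h, 0x01000193) >>> 0
  }
  return h
}

function pick(sessionID: string, candidates: Upstream[]): Upstream {
  const total = candidates.reduce((n, c) => n + c.weight, 0)
  let point = fnv1a(sessionID) % total
  for (const c of candidates) {
    point -= c.weight
    if (point < 0) return c
  }
  return candidates[candidates.length - 1]
}

const MAX_FAILOVER_RETRIES = 3

// Selection answers "who first"; failover answers "what after failure":
// exclude the failed upstream and re-pick, up to the retry cap.
function route(
  sessionID: string,
  candidates: Upstream[],
  attempt: (u: Upstream) => number, // returns an HTTP status code
  retries = MAX_FAILOVER_RETRIES,
): string {
  if (candidates.length === 0 || retries === 0) throw new Error("all providers failed")
  const chosen = pick(sessionID, candidates)
  if (attempt(chosen) === 200) return chosen.name
  return route(sessionID, candidates.filter((c) => c !== chosen), attempt, retries - 1)
}
```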
api 三元组与动态加载The API Triple & Dynamic Loading
OpenCode 对每个模型还定义了一个 api: { id, url, npm } 三元组。它看起来只是配置字段,实际上却是 Provider 抽象层最重要的扩展点:id 对应逻辑 Provider 身份,url 允许指向自建、代理或区域化 endpoint,npm 则映射到具体 AI SDK provider 包名,使系统能在运行时通过 import() 动态加载真正的连接器。
这背后的设计意义是:接入新 Provider 不必先写一层手工 adapter。只要模型条目声明正确的 npm 包与 base URL,Provider.Service 就能利用 Effect 的 Layer 模式按需初始化连接。未被使用的 Provider 不会被 eager load;抽象层保留统一接口,同时把扩展成本压缩到“补一条模型元数据”。
const api = {
id: "anthropic",
url: "https://api.anthropic.com",
npm: "@ai-sdk/anthropic"
}
const providerModule = await import(api.npm)
OpenCode also gives each model an api: { id, url, npm } triple. It looks like a mere configuration record, but it is actually the most important extension point in the Provider layer: id names the logical Provider identity, url supports self-hosted, proxied, or region-specific endpoints, and npm maps directly to the concrete AI SDK provider package so the system can import() the connector at runtime.
The design consequence is profound: adding a new Provider does not first require handwritten adapter code. If a model entry declares the right package name and base URL, Provider.Service can lazily initialize it through Effect’s Layer pattern. Unused Providers are never eagerly loaded; the abstraction keeps a unified interface while reducing extensibility cost to “add one more model manifest entry.”
const api = {
id: "anthropic",
url: "https://api.anthropic.com",
npm: "@ai-sdk/anthropic"
}
const providerModule = await import(api.npm)
20 权限系统形式化Permission System Formalization
OpenCode 的权限系统看起来像几张配置表,但若从形式化访问控制视角审视,它其实是一套极其克制的求值机:规则按顺序拼接,匹配按末尾回溯,默认动作是 ask 而不是 allow。这使它不像传统 ACL 那样依赖复杂继承,也不像 CSS 那样落入 specificity 战争,而更接近一个为 Agent 运行时量身定制的交互式能力判定器。
OpenCode’s permission system may look like a few configuration tables, but through the lens of formal access control it is really a deliberately minimal evaluator: rules are concatenated in order, matching resolves from the tail, and the default action is ask rather than allow. That makes it neither a heavily inherited ACL nor a CSS-style specificity game, but a compact interactive capability resolver designed for an Agent runtime.
findLast 语义:最后匹配获胜findLast Semantics: Last Match Wins
权限判定的核心不是“找到第一个匹配”,而是执行 rules.findLast(rule => Wildcard.match(permission, rule.permission) && Wildcard.match(pattern, rule.pattern))。这意味着规则集天然具有覆写语义:数组越后面的规则,优先级越高;想覆盖先前行为,不需要计算权重,只需要把新规则 append 到末尾。
这也解释了为什么 OpenCode 故意不做 specificity 排序。一个追加在后面的通配符规则,完全可以压过前面更“具体”的规则;设计者选择的是显式顺序而非隐式特异性。若最终没有任何匹配,系统回退到 { action: "ask", permission, pattern: "*" },因此失败模式是 fail-safe 的人工确认,而不是 fail-open 的默认放行。
const resolved = rules.findLast(rule =>
Wildcard.match(permission, rule.permission) &&
Wildcard.match(pattern, rule.pattern)
) ?? { action: "ask", permission: permission, pattern: "*" }
The core evaluator does not “find the first match”; it runs rules.findLast(rule => Wildcard.match(permission, rule.permission) && Wildcard.match(pattern, rule.pattern)). That gives the ruleset native override semantics: later array entries outrank earlier ones, so changing behavior requires no weighting scheme—just append the new rule at the end.
That is also why OpenCode intentionally avoids specificity sorting. A wildcard appended later can absolutely override an earlier, more “specific” rule; the design prefers explicit ordering over implicit specificity. If nothing matches, the resolver falls back to { action: "ask", permission, pattern: "*" }, so the failure mode is fail-safe user confirmation rather than fail-open execution.
const resolved = rules.findLast(rule =>
Wildcard.match(permission, rule.permission) &&
Wildcard.match(pattern, rule.pattern)
) ?? { action: "ask", permission: permission, pattern: "*" }
这比 CSS specificity、iptables 链顺序或 RBAC 角色继承都更直白:没有额外元规则,只有线性扫描与显式覆写;对这个问题域而言,表达力已经足够。
That is simpler than CSS specificity, iptables chain ordering, or RBAC role hierarchies: no meta-rules beyond linear scan and explicit override, yet expressive enough for the problem domain.
三层合并:从默认到用户Three-Layer Merge: From Defaults to User
Permission.merge(...rulesets) 的实现几乎只有一行:rulesets.flat()。系统把三层规则按顺序串起来:第一层是系统默认值,第二层是 agent 角色覆写,第三层是用户配置;求值时再交给 findLast 决定最终动作,因此越靠后的层越有最终解释权。
默认层已经体现出安全边界:{ "*": "allow", doom_loop: "ask", external_directory: { "*": "ask" }, read: { "*.env": "ask", "*.env.*": "ask", "*.env.example": "allow" } }。而角色层则故意拉开差异:build 允许 question 与 plan_enter;plan 基本拒绝 edit,只保留计划文件例外;explore 则是“除 grep/glob/read/bash 外全部 deny”的只读侦察兵。用户配置坐在最末尾,因此能通过追加规则覆盖前两层;这不是继承树,而是平铺拼接。
const buildRules = Permission.merge(
systemDefaults,
[{ permission: "question", pattern: "*", action: "allow" },
{ permission: "plan_enter", pattern: "*", action: "allow" }],
userConfig
)
Permission.merge(...rulesets) is almost literally a one-liner: rulesets.flat(). The system concatenates three layers in order: system defaults first, agent-role overrides second, user configuration last; then findLast resolves the final action, which means later layers hold the last word.
The default layer already encodes safety boundaries: { "*": "allow", doom_loop: "ask", external_directory: { "*": "ask" }, read: { "*.env": "ask", "*.env.*": "ask", "*.env.example": "allow" } }. The role layer then diverges sharply by agent: build allows question and plan_enter; plan effectively denies edit except for plan files; explore becomes a read-only scout that denies almost everything except grep, glob, read, and bash. User config sits at the tail, so it overrides both earlier layers by appending rules; this is flat composition, not inheritance.
const buildRules = Permission.merge(
systemDefaults,
[{ permission: "question", pattern: "*", action: "allow" },
{ permission: "plan_enter", pattern: "*", action: "allow" }],
userConfig
)
整个访问控制模型可以压缩成“数组拼接 + findLast”。这种近乎激进的简单性,恰恰是它最大的设计优势。
The whole access-control model reduces to “array concatenation + findLast.” That radical simplicity is the design’s greatest strength.
Tree-sitter 命令解析Tree-sitter Command Parsing
bash 工具的安全边界不建立在正则猜测之上,而是用 web-tree-sitter 加载 Bash / PowerShell 的 WASM grammar,再沿 AST 遍历命令名、参数与路径。collect 过程会构造一个 Scan:{ dirs: Set, patterns: Set, always: Set };其中触碰文件系统的命令会进一步触发 external_directory 检查,防止越过工作区边界。
这一层还有一个不显眼但关键的部件:BashArity。它是生成出来的命令语义词典,用来表达诸如 git checkout、npm run 这类多词命令的“语义元数”,从而决定审批时应落入 always 的模式。选择 Tree-sitter 而不是 regex,本质原因是 Bash 语法对引号、管道、子 shell 与空格极其敏感;rm "file with spaces" 与 rm file with spaces 的安全含义根本不同。
The security boundary of the bash tool does not rest on regex heuristics. It loads Bash and PowerShell WASM grammars through web-tree-sitter, then walks the AST to extract command names, arguments, and paths. The collect pass builds a Scan: { dirs: Set, patterns: Set, always: Set }; commands that touch files trigger additional external_directory checks so execution cannot silently cross the workspace boundary.
There is also a quiet but important component here: BashArity. It is a generated semantic dictionary that understands multiword commands such as git checkout and npm run, which then determines the always pattern used for user approvals. Tree-sitter is chosen over regex because Bash syntax is extremely sensitive to quoting, pipes, subshells, and spacing; rm "file with spaces" and rm file with spaces are not the same security event.
type Scan = { dirs: Set, patterns: Set, always: Set }
const scan = collect(tree, BashArity)
对安全边界而言,“看起来像删除”与“确定是删除并已提取目标文件”不是一个等级的问题。Tree-sitter 让审批建立在语法事实之上,而不是字符串幻觉之上。
For a security boundary, that difference matters. Tree-sitter makes approval depend on syntactic facts rather than string-shaped guesses.
Glob 转正则Glob-to-Regex Conversion
Wildcard.match(str, pattern) 做的事并不神秘:它把 glob 映射成锚定正则,* 变成 .*,? 变成 .,最后用 ^...$ 强制整串匹配而不是子串命中。更特别的一条规则是结尾的 " *"(空格加星号)会被转换成 ( .*)?,因此 ls * 同时匹配 ls 与 ls -la,把“可选参数尾巴”直接编码进模式语义。
这里没有 ** globstar;普通 * 就跨路径分隔符匹配,模型反而更简单。匹配前所有反斜杠都会被规范化为正斜杠,Windows 上使用大小写不敏感标志,其他平台则保持大小写敏感。它不是 POSIX shell glob 的完全复刻,而是一个更小、更稳定、更适合权限推理的模式语言。
"*" => ".*"
"?" => "."
" *" => "( .*)?"
return new RegExp("^" + source + "$", windows ? "si" : "s")
Wildcard.match(str, pattern) is conceptually simple: it maps a glob to an anchored regex, where * becomes .*, ? becomes ., and ^...$ enforces full-string rather than substring matching. The unusual rule is that a trailing " *" (a space followed by a star) becomes ( .*)?, so ls * matches both ls and ls -la; optional argument tails are deliberately encoded into the pattern language.
There is no ** globstar. Ordinary * already spans path separators, which keeps the model smaller and easier to reason about. All backslashes are normalized to forward slashes before matching; Windows uses case-insensitive flags, while other platforms remain case-sensitive. This is not a full POSIX glob clone but a smaller, steadier language optimized for permission reasoning.
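按上文描述,这套转换可以用几十行 TypeScript 完整复现:* 变 .*、? 变 .、结尾 " *" 变 ( .*)?、匹配前规范化反斜杠;Windows 的大小写折叠在这里用一个参数来模拟,属于草图层面的假设。
Following the description above, the conversion reproduces in a few dozen lines of TypeScript: * → .*, ? → ., trailing " *" → ( .*)?, backslashes normalized before matching; Windows case folding is modeled here with a flag parameter, a sketch-level assumption.

```typescript
// Anchored glob-to-regex conversion as described in the text.

function globToRegExp(pattern: string, windows = false): RegExp {
  // Trailing " *" is special: an optional argument tail.
  const optionalTail = pattern.endsWith(" *")
  const body = optionalTail ? pattern.slice(0, -2) : pattern
  let source = ""
  for (const ch of body) {
    if (ch === "*") source += ".*"       // ordinary * spans path separators
    else if (ch === "?") source += "."
    else source += ch.replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex chars
  }
  if (optionalTail) source += "( .*)?"
  // Anchored full-string match; case-insensitive only on Windows.
  return new RegExp("^" + source + "$", windows ? "si" : "s")
}

function wildcardMatch(str: string, pattern: string, windows = false): boolean {
  // Normalize backslashes to forward slashes before matching.
  return globToRegExp(pattern, windows).test(str.replace(/\\/g, "/"))
}
```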
从访问控制理论看Through the Lens of Access Control Theory
若用经典模型来归类,OpenCode 显然不是 MAC:它没有安全标签中心、没有 Bell-LaPadula 格、也没有任何用户不可逾越的强制策略,因为用户规则始终可以覆写系统默认。它又部分接近 RBAC:build、plan、explore 等 agent 角色确实各自绑定一组预制能力,但这些角色之间不存在层级继承,只有平面化组合。
它更接近 DAC,因为最终裁量权在用户手里;但 OpenCode 还有一个经典模型里较少出现的第三值:ask。这不是 allow,也不是 deny,而是把决策委托给 human-in-the-loop 的即时审批。于是系统形成了 allow | deny | ask 的三值逻辑,最准确的描述应当是:一套由规则集驱动、按角色预授能力、再由人类交互式收口的 capability system。纯 RBAC 要求预先枚举所有 tool × pattern 组合,而 ask 让系统在规则尚不完备时仍然可用。
In classical terms, OpenCode is clearly not MAC: there are no centralized security labels, no Bell-LaPadula lattice, and no immutable mandatory policy because user rules can always override system defaults. It is partially RBAC-like in that agent roles such as build, plan, and explore receive precomposed capability sets, yet those roles have no hierarchy; composition stays flat.
It is mostly DAC because ultimate discretion belongs to the user, but OpenCode adds a third value rare in classic models: ask. That is neither allow nor deny; it delegates the decision to the human in the loop at runtime. The result is a three-valued logic—allow | deny | ask—so the best description is a ruleset-driven interactive capability system: capabilities are pregranted by role, refined by glob patterns, and closed by human approval. A pure RBAC design would require enumerating every tool × pattern combination in advance; ask keeps the system usable before the rules are complete.
一旦把人类审批视为一等动作,权限系统就不再只是静态策略表,而变成可渐进收敛的交互式控制回路。这正是 Agent 时代比传统访问控制更务实的地方。
Once human approval becomes a first-class action, the permission system stops being a static policy table and becomes an interactive control loop that can converge over time. That is the pragmatic advantage of Agent-era access control.
21 Agent Teams:多智能体协作Agent Teams: Multi-Agent Coordination
2026 年初,OpenCode 上线了迄今最大的架构扩展:Agent Teams。这不是在现有单 Agent 循环上加个"并行"开关,而是一次正式的多智能体协调框架设计。它的核心洞察是:当任务超出单个上下文窗口的承载能力时,唯一出路不是更大的窗口,而是把认知分工变成显式协议。
In early 2026, OpenCode shipped its most significant architectural extension to date: Agent Teams. This is not a "parallel mode" switch bolted onto the existing single-agent loop. It is a formal multi-agent coordination framework. The core insight: when a task exceeds what a single context window can sustain, the solution is not a bigger window—it is making cognitive division of labor an explicit protocol.
Flat Team 模型:为什么不是层级树Flat Team Model: Why Not a Hierarchy
多数人第一直觉是用树形结构组织 Agent:Lead → Sub-Lead → Worker。但 OpenCode 选择了 flat team(一个 Lead + N 个平级 Teammate)。原因很实际:树结构的层级一多,中间节点既不能自己决策,也不能直接执行,变成纯粹的消息路由器——消耗 token 却不产生价值。flat model 里,Lead 直接与每个 Teammate 通信,减少一跳就少一轮 LLM 调用。
更关键的是跨 Provider 混合:同一个团队里,不同 Teammate 可以使用不同模型——GPT-5.3 做前端、Claude Opus 做后端、Gemini 3 跑测试。这在 Claude Code 的闭源框架里是不可能的,因为它被锁定在 Anthropic 的模型生态里。跨 Provider 团队的意义不只是"选最便宜的",而是让不同模型的能力特长在同一个任务图里互补。
Most people instinctively reach for a tree structure: Lead → Sub-Lead → Worker. OpenCode chose a flat team model instead (one Lead + N peer Teammates). The reason is pragmatic: in deep hierarchies, middle nodes can neither decide nor execute—they degenerate into pure message routers, burning tokens without producing value. In the flat model, the Lead communicates directly with each Teammate, and eliminating one hop saves an entire LLM round-trip.
More crucially, teams support cross-provider mixing: different Teammates within the same team can use different models—GPT-5.3 for frontend, Claude Opus for backend, Gemini 3 for tests. This is impossible in Claude Code's closed-source framework, which is locked to Anthropic's model ecosystem. Cross-provider teams are not just about picking the cheapest option; they let different models' strengths complement each other within a single task graph.
消息传递与状态机Messaging and State Machine
团队协作的底层是一个两层消息系统。第一层:每个成员有一个 JSONL 收件箱(team_inbox/{team}/{member}.jsonl),消息以 O(1) 追加写入,保证审计可追溯。第二层:收件箱内容被注入为接收者 session 中的合成用户消息,触发其 prompt loop 执行。这种设计避免了共享内存或消息队列的复杂性——文件系统就是消息总线。
每个 Teammate 有一个两级状态机:成员级(ready → active → idle → interrupted)和执行级(prompt-loop 内部状态)。分离这两层使得系统可以在 Teammate 崩溃时精确恢复:Team.recover() 在启动时扫描所有 active 状态的成员,将其标记为 interrupted,并向 Lead 注入一条合成通知。Lead 可以决定是重新分配任务还是等待人工介入。
Under the hood, team collaboration rests on a two-layer messaging system. Layer 1: each member has a JSONL inbox (team_inbox/{team}/{member}.jsonl) with O(1) append writes, guaranteeing an auditable trail. Layer 2: inbox contents are injected as synthetic user messages into the recipient's session, triggering their prompt loop. This design sidesteps the complexity of shared memory or message queues—the filesystem is the message bus.
Each Teammate runs a two-level state machine: member-level (ready → active → idle → interrupted) and execution-level (prompt-loop internal state). Separating these layers enables precise crash recovery: Team.recover() scans for active members on startup, marks them interrupted, and injects a synthetic notification into the Lead's session. The Lead can then decide whether to reassign the task or wait for human intervention.
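「文件系统即消息总线」的两层收件箱可以用 Node 标准库直接演示:第一层是一行一条消息的 JSONL 追加写,第二层是把收件箱排干、转为注入 session 的消息列表。路径布局与消息结构为示意性简化。
The "filesystem as message bus" two-layer inbox demonstrates directly with the Node standard library: layer 1 is one-JSON-line-per-message append writes, layer 2 drains the inbox into a list of messages for session injection. The path layout and message shape are illustrative simplifications.

```typescript
// JSONL inbox: O(1) append writes, auditable, drained in arrival order.

import * as fs from "node:fs"
import * as os from "node:os"
import * as path from "node:path"

type TeamMessage = { from: string; to: string; body: string; at: number }

// Sandbox root for the sketch; the real layout is team_inbox/{team}/{member}.jsonl.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "team_inbox-"))

function inboxPath(team: string, member: string): string {
  const dir = path.join(root, team)
  fs.mkdirSync(dir, { recursive: true })
  return path.join(dir, member + ".jsonl")
}

// Layer 1: append-only write — one JSON line per message.
function send(team: string, msg: TeamMessage): void {
  fs.appendFileSync(inboxPath(team, msg.to), JSON.stringify(msg) + "\n")
}

// Layer 2: drain the inbox; callers would inject these as synthetic
// user messages into the recipient's session to trigger its prompt loop.
function drain(team: string, member: string): TeamMessage[] {
  const file = inboxPath(team, member)
  if (!fs.existsSync(file)) return []
  return fs
    .readFileSync(file, "utf8")
    .split("\n")
    .filter(Boolean)
    .map((line) => JSON.parse(line) as TeamMessage)
}
```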
OpenCode 允许把 Lead 限制为纯协调角色(Delegate 模式),只保留 task、skills、todo 等编排工具,禁止直接调用文件读写和 shell。这迫使 Lead 把所有实际工作委托给 Teammate,形成清晰的关注点分离:Lead 负责"做什么"和"谁来做",Teammate 负责"怎么做"。这不是为了好看,而是在长任务中防止 Lead 上下文被执行细节污染、导致后期规划能力退化。
OpenCode allows the Lead to be restricted to a pure coordination role (Delegate mode), retaining only orchestration tools like task, skills, and todo, with no direct access to file I/O or shell. This forces all actual work to be delegated to Teammates, creating a clean separation of concerns: the Lead decides "what" and "who," Teammates handle "how." This is not aesthetic—it prevents the Lead's context from being polluted by execution details, which would degrade its planning capacity in long tasks.
22 上下文压缩:Token 预算管理Context Compaction: Token Budget Management
每一个长时间运行的 Agent session 都面临同一个物理约束:上下文窗口是有限的。即便模型支持 1M token,工具结果、诊断输出、文件内容和历史决策会以惊人的速度填满它。OpenCode 的上下文压缩(Compaction)子系统是对这个问题最成熟的工程回应之一。它的核心设计原则是:先裁剪,后压缩,永不删除。
Every long-running Agent session faces the same physical constraint: the context window is finite. Even with 1M-token models, tool results, diagnostics output, file contents, and historical decisions fill it at alarming rates. OpenCode's Compaction subsystem is one of the most mature engineering responses to this problem. Its core design principle: prune first, compress later, never delete.
Prune-First 策略The Prune-First Strategy
在调用 LLM 做摘要压缩之前,系统首先尝试零成本裁剪。它从消息队列尾部向前扫描,跳过受保护的内容(最近 40K token + 最后 2 轮用户消息 + 所有 skill 工具结果),对其余已完成的工具输出打上 compacted = Date.now() 时间戳。这不是删除——原始数据仍然存在,只是在构建下一轮 prompt 时被跳过。
// compaction.ts — 裁剪保护常量
const PRUNE_MINIMUM = 20_000 // 至少裁剪 20K token
const PRUNE_PROTECT = 40_000 // 保护最近 40K token
const PRUNE_PROTECTED_TOOLS = ["skill"]
// 非破坏性:只打时间戳,不删数据
part.state.time.compacted = Date.now()
如果裁剪不够,系统才启动 LLM 驱动的压缩:把已标记的旧消息发送给一个专门的"压缩 Agent",生成精炼摘要后替换原文。实验性标志 OPENCODE_EXPERIMENTAL_COMPACTION_PRESERVE_PREFIX 还可以复用原始 Agent 的 prefix cache(系统提示 + 工具定义),使 Anthropic 等支持 prefix caching 的 Provider 保持 99% 的缓存命中率。
Before invoking LLM-based summarization, the system first attempts zero-cost pruning. It scans backward from the message queue tail, skipping protected content (the most recent 40K tokens + the last 2 user turns + all skill tool results), and stamps remaining completed tool outputs with compacted = Date.now(). This is not deletion—the original data remains; it is simply skipped when constructing the next prompt.
// compaction.ts — pruning protection constants
const PRUNE_MINIMUM = 20_000 // prune at least 20K tokens
const PRUNE_PROTECT = 40_000 // protect most recent 40K tokens
const PRUNE_PROTECTED_TOOLS = ["skill"]
// non-destructive: timestamp only, data preserved
part.state.time.compacted = Date.now()
Only when pruning is insufficient does the system escalate to LLM-driven compression: marked old messages are sent to a dedicated "compaction Agent" that generates a condensed summary to replace the originals. An experimental flag OPENCODE_EXPERIMENTAL_COMPACTION_PRESERVE_PREFIX reuses the original Agent's prefix cache (system prompt + tool definitions), maintaining 99% cache hit rates with providers like Anthropic that support prefix caching.
如果裁剪与压缩之后上下文仍然溢出,系统会抛出 ContextOverflowError 并让用户决定是否开始新 session。这是「永不静默丢失数据」设计哲学的体现。
If the context still overflows after pruning and compaction, the system throws a ContextOverflowError and lets the user decide whether to start a new session. This embodies a "never silently lose data" design philosophy.
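Prune-first 的「只打时间戳、不删数据」可以用一个简化草图演示:从尾部向前扫描,保护最近 40K token 窗口与受保护工具的输出,其余已完成的工具输出打上 compacted 时间戳。真实实现还会保护最近 2 轮用户消息并要求最低裁剪量(PRUNE_MINIMUM),这里为简洁省略;消息结构为示意。
The prune-first "timestamp, never delete" pass demonstrates in a simplified sketch: scan backward from the tail, protect the recent 40K-token window and protected-tool outputs, and stamp remaining completed tool outputs with a compacted timestamp. The real implementation additionally protects the last 2 user turns and enforces a minimum prune amount (PRUNE_MINIMUM), omitted here for brevity; the message shape is illustrative.

```typescript
// Non-destructive prune pass: stamp old tool outputs, never delete them.

type Part = { role: "user" | "tool"; tool?: string; tokens: number; compacted?: number }

const PRUNE_PROTECT = 40_000            // recent-token window is untouchable
const PRUNE_PROTECTED_TOOLS = ["skill"] // skill results are always kept

// Returns the number of tokens reclaimed. Stamped parts still exist in
// storage; they are merely skipped when the next prompt is assembled.
function prune(parts: Part[], now: number): number {
  let seen = 0
  let pruned = 0
  for (let i = parts.length - 1; i >= 0; i--) {
    const part = parts[i]
    seen += part.tokens
    if (seen <= PRUNE_PROTECT) continue            // inside the protected window
    if (part.role !== "tool") continue             // only tool outputs get stamped
    if (part.tool && PRUNE_PROTECTED_TOOLS.includes(part.tool)) continue
    if (part.compacted === undefined) {
      part.compacted = now                         // timestamp only — no deletion
      pruned += part.tokens
    }
  }
  return pruned
}
```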
23 Effect-TS ChildProcess:流式 Shell 架构Effect-TS ChildProcess: Streaming Shell Architecture
当 Agent 需要执行 shell 命令时,它并不是简单地 exec() 然后等结果。OpenCode 在 2026 年 4 月完成了一次重大重构,将 shell 执行从原始的 Node.js child_process 迁移到 Effect 的实验性 ChildProcess API。这个 514 行的 cross-spawn-spawner.ts 文件,是理解 Effect-TS 如何桥接命令式系统 API 和函数式并发模型的最佳案例。
When the Agent needs to execute a shell command, it does not simply call exec() and wait. In April 2026, OpenCode completed a major refactor, migrating shell execution from raw Node.js child_process to Effect's experimental ChildProcess API. The 514-line cross-spawn-spawner.ts file is the best case study for understanding how Effect-TS bridges imperative system APIs with functional concurrency models.
Stream 桥接:从 Node.js 流到 Effect 流Stream Bridging: Node.js Streams to Effect Streams
核心挑战在于:Node.js 的 Readable/Writable 流是基于事件的、可变的、需要手动清理的;而 Effect 的 Stream/Sink 是惰性的、不可变的、通过 Scope 自动管理生命周期的。cross-spawn-spawner.ts 用 NodeStream.fromReadable 和 NodeSink.fromWritable 完成了这个转换,使 shell 输出变成可组合的、支持背压的 Effect Stream。
对于管道命令(cmd1 | cmd2 | cmd3),flatten() 函数将其分解为 StandardCommand 链,然后用 Stream.unwrap 和 source() 路由将前一个进程的 stdout 接到下一个进程的 stdin。每一步都在 Effect 的 Scope 内执行,确保即使中间进程失败,所有资源也会被确定性释放。
The core challenge: Node.js Readable/Writable streams are event-based, mutable, and require manual cleanup. Effect's Stream/Sink are lazy, immutable, and lifecycle-managed via Scope. The spawner bridges these via NodeStream.fromReadable and NodeSink.fromWritable, turning shell output into composable, backpressure-aware Effect Streams.
For piped commands (cmd1 | cmd2 | cmd3), a flatten() function decomposes them into a StandardCommand chain, then wires them together via Stream.unwrap and source() routing—each process's stdout connects to the next process's stdin. Every step runs within an Effect Scope, ensuring deterministic resource cleanup even if an intermediate process fails.
优雅终止层级Graceful Termination Hierarchy
进程终止不是 kill -9 那么简单。OpenCode 实现了一个分层终止协议:先发 SIGTERM,等待 forceKillAfter 超时,如果进程仍未退出再发 SIGKILL。整个逻辑用 Effect 的 timeoutOrElse 表达:
// 终止升级逻辑
const attempt = send(sig).pipe(
Effect.andThen(Deferred.await(signal)),
Effect.asVoid
)
const escalated = command.options.forceKillAfter
? Effect.timeoutOrElse(attempt, {
duration: command.options.forceKillAfter,
orElse: () => send("SIGKILL")
})
: attempt
在 Unix 上使用负 PID 发送进程组信号;在 Windows 上调用 taskkill /T /F。这种跨平台的终止语义被统一封装在 killGroup 中,上层调用者无需感知操作系统差异。
Process termination is not as simple as kill -9. OpenCode implements a layered termination protocol: send SIGTERM first, wait for a forceKillAfter timeout, then escalate to SIGKILL if the process is still alive. The entire logic is expressed via Effect's timeoutOrElse:
// termination escalation logic
const attempt = send(sig).pipe(
Effect.andThen(Deferred.await(signal)),
Effect.asVoid
)
const escalated = command.options.forceKillAfter
? Effect.timeoutOrElse(attempt, {
duration: command.options.forceKillAfter,
orElse: () => send("SIGKILL")
})
: attempt
On Unix, signals are sent via negative PIDs for process-group signaling; on Windows, taskkill /T /F is used. This cross-platform termination semantic is encapsulated in killGroup, freeing callers from OS-specific concerns.
传统的 try { spawn() } catch { kill() } finally { cleanup() } 模式有一个根本缺陷:finally 块无法区分正常退出、超时退出和被中断退出,因此清理逻辑要么过于激进(杀掉还在正常工作的进程),要么不够彻底(泄漏孤儿进程)。Effect 的 Scope.acquireRelease 让每个资源都有精确的释放语义,而 Deferred 让信号等待变成可组合的异步操作——这意味着"等进程退出 5 秒,超时则强杀"可以被表达为一行声明式管道,而不是嵌套的 setTimeout 回调。
The conventional try { spawn() } catch { kill() } finally { cleanup() } pattern has a fundamental flaw: the finally block cannot distinguish between normal exit, timeout, and interruption, so cleanup logic is either too aggressive (killing still-working processes) or too lax (leaking orphans). Effect's Scope.acquireRelease gives each resource precise release semantics, and Deferred turns signal-waiting into composable async operations—meaning "wait 5s for process exit, then force-kill" becomes a single declarative pipeline instead of nested setTimeout callbacks.
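不依赖 Effect,也能用普通 Promise 把 SIGTERM → SIGKILL 的升级协议写成声明式竞速。下面的 FakeProcess 是子进程的替身,超时数值为示意;这只是对 timeoutOrElse 形态的模拟,不是 OpenCode 的真实代码。
Even without Effect, the SIGTERM → SIGKILL escalation expresses as a declarative race over plain Promises. The FakeProcess below is a stand-in for a child process, and the timeout values are illustrative; this only mimics the shape of timeoutOrElse and is not OpenCode's actual code.

```typescript
// SIGTERM-first termination with timed escalation to SIGKILL.

type Signal = "SIGTERM" | "SIGKILL"

class FakeProcess {
  sent: Signal[] = []
  private exit!: () => void // assigned synchronously by the executor below
  readonly exited = new Promise<void>((resolve) => (this.exit = resolve))
  constructor(private ignoresSigterm: boolean) {}
  kill(sig: Signal): void {
    this.sent.push(sig)
    if (sig === "SIGKILL" || !this.ignoresSigterm) this.exit()
  }
}

// Send SIGTERM; if the process is still alive after forceKillAfter ms,
// escalate to SIGKILL — the same shape as Effect's timeoutOrElse.
async function terminate(proc: FakeProcess, forceKillAfter: number): Promise<void> {
  proc.kill("SIGTERM")
  const timedOut = await Promise.race([
    proc.exited.then(() => false),
    new Promise<boolean>((resolve) => setTimeout(() => resolve(true), forceKillAfter)),
  ])
  if (timedOut) proc.kill("SIGKILL")
  await proc.exited
}
```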
24 TUI 渲染架构TUI Rendering Architecture
OpenCode 的终端 UI 不是事后附加的外壳,而是由 terminal.shop 团队主导设计的一等公民。2026 年间,TUI 层经历了三次架构讨论,每一次都揭示了终端渲染在 AI Agent 场景下的独特挑战。
OpenCode's terminal UI is not an afterthought shell—it is a first-class citizen designed by the terminal.shop team. During 2026, the TUI layer went through three architectural discussions, each revealing unique challenges of terminal rendering in AI Agent contexts.
OpenTUI:SolidJS 驱动的终端OpenTUI: SolidJS-Powered Terminal
主 TUI 使用 @opentui/core 和 @opentui/solid,选择 SolidJS 而非 React 的原因是:SolidJS 的细粒度响应式系统在终端场景下比 React 的虚拟 DOM diff 更高效。终端渲染的本质是字符级别的增量更新——每次只重绘变化的字符,而不是整棵组件树。SolidJS 的 Signal/Effect 模型天然适合这种局部更新,因为它精确追踪哪些 signal 变了,只执行受影响的 effect。
The main TUI uses @opentui/core and @opentui/solid. SolidJS was chosen over React because its fine-grained reactivity system is more efficient for terminal rendering. Terminal output is fundamentally about character-level incremental updates—redrawing only changed characters, not diffing an entire component tree. SolidJS's Signal/Effect model is a natural fit: it precisely tracks which signals changed and executes only affected effects.
Ink 重写实验与 GapBuffer RFCThe Ink Rewrite Experiment and GapBuffer RFC
2026 年 2 月,社区发起了一次用 Ink 6.6.0 + React 19 重写轻量版 TUI (oclite) 的实验。主要动机是消除 pausedForProse hack——一个阻止流式输出期间进行实时 UI 更新的变通方案。Ink 的 React 组件模型让流式内容和状态面板可以共存,但代价是引入了 React 运行时的开销。
与此同时,RFC #13027 提出了一个完全不同的方向:用 GapBuffer + CanvasRenderer 替换现有文本缓冲区。GapBuffer 提供 O(1) 摊销插入——对于 LLM 流式输出(持续在末尾追加 token)来说,这是最优数据结构。配合 CanvasRenderer 的脏区追踪,原型测试显示 LLM 流式渲染帧时间从 450ms 降至 6ms(75x 提升)。这个提案通过 TextBufferAdapter 实现零破坏性迁移,两种实现可以共存。
In February 2026, the community launched an experimental Ink 6.6.0 + React 19 rewrite of the lightweight TUI variant (oclite). The primary motivation was eliminating the pausedForProse hack—a workaround that prevented live UI updates during streaming output. Ink's React component model lets streaming content and status panels coexist, at the cost of React runtime overhead.
Simultaneously, RFC #13027 proposed a radically different direction: replacing the existing text buffer with a GapBuffer + CanvasRenderer. GapBuffers provide O(1) amortized insertion—the optimal data structure for LLM streaming output (continuous token appending at the end). Combined with the CanvasRenderer's dirty-region tracking, prototype benchmarks showed LLM streaming frame times dropping from 450ms to 6ms (75x improvement). The proposal achieves zero-breaking-change migration via a TextBufferAdapter, allowing both implementations to coexist.
三种方案各自代表一个取舍:OpenTUI (SolidJS) 优先考虑细粒度响应性,但需要自定义框架;Ink (React) 优先考虑开发者熟悉度,但带来 vDOM 开销;GapBuffer 优先考虑原始性能,但牺牲了声明式 UI 的抽象层次。没有哪个方案是错误的——它们反映的是终端 AI Agent 尚未收敛的渲染范式。当 LLM 输出速度从 50 token/s 提升到 500 token/s 时,渲染层的性能瓶颈会从"可以忽略"变成"用户可感知",这正是 GapBuffer RFC 被认真讨论的根本原因。
Each approach embodies a different tradeoff: OpenTUI (SolidJS) prioritizes fine-grained reactivity but requires a custom framework; Ink (React) prioritizes developer familiarity but introduces vDOM overhead; GapBuffer prioritizes raw performance but sacrifices declarative UI abstraction. None is wrong—they reflect a rendering paradigm that has not yet converged for terminal AI Agents. When LLM output speeds increase from 50 tokens/s to 500 tokens/s, the rendering layer's performance bottleneck shifts from "negligible" to "user-perceptible," which is precisely why the GapBuffer RFC received serious consideration.
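GapBuffer 为什么适合流式追加,可以用一个最小实现直观感受:gap 停留在编辑发生的位置,末尾追加时完全不需要移动数据。下面是一个示意性的最小 GapBuffer,真实的渲染层还会在其上叠加脏区追踪。
Why a GapBuffer suits streaming appends is easiest to see from a minimal implementation: the gap sits where edits happen, so appending at the end moves no data at all. Below is an illustrative minimal GapBuffer; a real rendering layer would add dirty-region tracking on top.

```typescript
// Minimal gap buffer: O(1) amortized append — the LLM streaming pattern.

class GapBuffer {
  private buf: (string | undefined)[]
  private gapStart = 0
  private gapEnd: number

  constructor(capacity = 16) {
    this.buf = new Array(capacity)
    this.gapEnd = capacity
  }

  get length(): number {
    return this.buf.length - (this.gapEnd - this.gapStart)
  }

  // Move the gap to `pos`: O(distance moved), which is 0 for repeated appends.
  private moveGap(pos: number): void {
    while (this.gapStart > pos) this.buf[--this.gapEnd] = this.buf[--this.gapStart]
    while (this.gapStart < pos) this.buf[this.gapStart++] = this.buf[this.gapEnd++]
  }

  private grow(): void {
    const old = this.buf
    const tailLen = old.length - this.gapEnd
    this.buf = new Array(old.length * 2) // doubling keeps inserts amortized O(1)
    for (let i = 0; i < this.gapStart; i++) this.buf[i] = old[i]
    const newGapEnd = this.buf.length - tailLen
    for (let i = 0; i < tailLen; i++) this.buf[newGapEnd + i] = old[this.gapEnd + i]
    this.gapEnd = newGapEnd
  }

  insert(pos: number, text: string): void {
    this.moveGap(pos)
    for (const ch of text) {
      if (this.gapStart === this.gapEnd) this.grow()
      this.buf[this.gapStart++] = ch
    }
  }

  // Streaming path: the gap is already at the end, so nothing moves.
  append(text: string): void {
    this.insert(this.length, text)
  }

  toString(): string {
    return this.buf.slice(0, this.gapStart).join("") + this.buf.slice(this.gapEnd).join("")
  }
}
```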
25 弹性模式:重试、恢复与并发控制Resilience Patterns: Retry, Recovery, and Concurrency Control
对 Agent 来说,网络故障、API 限流、模型超时不是意外——它们是日常。OpenCode 的弹性设计不是在业务逻辑里到处撒 try/catch,而是用 Effect 的类型化错误模型和调度原语构建了一套可组合的恢复框架。
For Agents, network failures, API rate limits, and model timeouts are not exceptions—they are the norm. OpenCode's resilience design does not scatter try/catch throughout business logic. Instead, it uses Effect's typed error model and scheduling primitives to build a composable recovery framework.
Header-Aware 重试调度Header-Aware Retry Scheduling
session/retry.ts 实现了一个 Schedule.fromStepWithMetadata 调度器,不是简单的"失败就重试",而是能读懂 HTTP 响应头:解析 retry-after、retry-after-ms 和 HTTP-date 格式的重试提示。通用错误使用指数退避(因子 2,上限 30s);当服务端显式告知等待时间时,直接使用该值,上限设为 MAX_INT32。
配合的是一套错误分类体系:retryable() 函数根据错误类型做出精确判断——ContextOverflowError 不可重试(重试也不会让上下文变小),APIError 检查其 retryable 标志,限流错误匹配字符串模式,provider 特定的 JSON 错误结构也被纳入考量。这意味着系统不会对不可恢复的错误浪费重试次数。
session/retry.ts implements a Schedule.fromStepWithMetadata scheduler that does more than "retry on failure"—it reads HTTP response headers: parsing retry-after, retry-after-ms, and HTTP-date retry hints. Generic errors use exponential backoff (factor 2, capped at 30s); when the server explicitly specifies a wait duration, that value is used directly, capped at MAX_INT32.
Paired with this is an error taxonomy: the retryable() function makes precise judgments per error type—ContextOverflowError is non-retryable (retrying won't shrink the context), APIError checks its retryable flag, rate-limit errors match string patterns, and provider-specific JSON error shapes are also considered. This ensures the system never wastes retry attempts on unrecoverable errors.
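延迟计算部分可以独立成一个可运行草图:优先读取服务端的显式提示(retry-after 秒数、retry-after-ms、HTTP-date),否则退回因子 2、上限 30s 的指数退避。header 名称沿用上文;基础延迟 1s 是这里的假设,并非 OpenCode 的确切取值。
The delay computation isolates into a runnable sketch: prefer the server's explicit hint (retry-after seconds, retry-after-ms, or an HTTP-date), otherwise fall back to factor-2 exponential backoff capped at 30s. Header names follow the prose above; the 1s base delay is an assumption here, not OpenCode's exact value.

```typescript
// Header-aware retry delay: server hints win; otherwise exponential backoff.

const BASE_DELAY_MS = 1_000   // assumed base for the backoff sketch
const MAX_BACKOFF_MS = 30_000 // generic-error cap from the text
const MAX_INT32 = 2 ** 31 - 1 // cap for server-specified waits

function retryDelayMs(attempt: number, headers: Record<string, string>, now = Date.now()): number {
  const ms = headers["retry-after-ms"]
  if (ms !== undefined && !Number.isNaN(Number(ms))) {
    return Math.min(Number(ms), MAX_INT32)
  }
  const after = headers["retry-after"]
  if (after !== undefined) {
    const seconds = Number(after)
    if (!Number.isNaN(seconds)) return Math.min(seconds * 1000, MAX_INT32)
    const date = Date.parse(after) // HTTP-date form of Retry-After
    if (!Number.isNaN(date)) return Math.min(Math.max(date - now, 0), MAX_INT32)
  }
  // No hint: exponential backoff, factor 2, capped.
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_BACKOFF_MS)
}
```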
Runner 状态机The Runner State Machine
effect/runner.ts 是 Agent 执行循环的并发控制核心。它用 SynchronizedRef 维护一个四态状态机。
ensureRunning 保证任何时刻只有一个执行实例:如果已在 Running,新请求被排队而非丢弃。startShell 在已忙时直接拒绝。取消操作通过 Effect 的 Fiber 中断实现,能正确处理 ShellThenRun 状态下的清理——先中断 shell 进程,再恢复到 Idle,而不是让状态机卡在中间态。
effect/runner.ts is the concurrency control core of the Agent execution loop. It maintains a four-state machine via SynchronizedRef.
ensureRunning guarantees at most one execution instance at any time: if already Running, new requests are queued rather than dropped. startShell rejects outright when busy. Cancellation is implemented via Effect Fiber interruption, correctly handling cleanup in ShellThenRun state—first interrupting the shell process, then restoring to Idle, rather than leaving the state machine stuck in a transitional state.
传统方案是 mutex.lock() + 事件回调。问题在于:Agent 的执行周期不是毫秒级的——一次 prompt loop 可能持续几分钟,中间夹杂多次工具调用和 shell 命令。普通互斥锁会让其他请求饿死。SynchronizedRef 的优势在于它把「检查状态 + 转换状态 + 启动副作用」封装为一个原子操作,同时允许等待者通过 Deferred 被非阻塞地通知。效果是:并发控制的正确性由类型系统保证,而不是靠程序员记住"先锁后解"。
The conventional approach is mutex.lock() + event callbacks. The problem: Agent execution cycles are not millisecond-scale—a single prompt loop may last minutes, interleaving tool calls and shell commands. A standard mutex would starve other requests. SynchronizedRef's advantage is that it packages "check state + transition state + launch side effect" as a single atomic operation, while allowing waiters to be non-blockingly notified via Deferred. The result: concurrency correctness is guaranteed by the type system, not by programmers remembering to "lock then unlock."
26 配置级联:5 层合并策略Config Cascade: 5-Layer Merge Strategy
对一个面向企业和开源社区的 CLI 工具来说,配置管理不只是"读个 JSON 文件"。OpenCode 的配置系统实现了五层级联合并,每一层都有明确的语义和优先级。
后层覆盖前层——但 instructions 数组例外。它不是覆盖,而是拼接去重:来自不同层的指令被合并为一个完整列表。这个设计决策反映了一个真实需求:全局层可能设置"保持代码风格一致"的指令,项目层追加"使用 Effect-TS 的 pipe 风格"——两者应该叠加而非互斥。
For a CLI tool serving both enterprise and open-source audiences, config management is not just "read a JSON file." OpenCode's configuration system implements a five-layer cascade merge, each layer with clear semantics and priority.
Later layers override earlier ones—with one exception: instructions arrays are not overwritten but concatenated and deduplicated. Instructions from different layers are merged into a single list. This design decision reflects a real need: the global layer might set "maintain consistent code style" while the project layer adds "use Effect-TS pipe style"—both should stack, not conflict.
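「后层覆盖标量、instructions 拼接去重」的合并语义可以直接写成几行 TypeScript。下面的 Config 结构是对真实 schema 的示意性简化,层内容取自上文的例子。
The "later layers override scalars, instructions concatenate and dedupe" merge semantics write directly in a few lines of TypeScript. The Config shape below is an illustrative simplification of the real schema, and the layer contents follow the example in the text.

```typescript
// Cascade merge: scalars override by position; instructions accumulate.

type Config = { model?: string; instructions?: string[]; [k: string]: unknown }

function mergeConfigs(...layers: Config[]): Config {
  const out: Config = {}
  const instructions: string[] = []
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      if (key === "instructions" && Array.isArray(value)) {
        // Exception: concatenate and deduplicate, preserving first-seen order.
        for (const item of value as string[]) {
          if (!instructions.includes(item)) instructions.push(item)
        }
      } else {
        out[key] = value // later layers win for everything else
      }
    }
  }
  if (instructions.length > 0) out.instructions = instructions
  return out
}

// Illustrative layers echoing the example above.
const globalLayer: Config = { model: "anthropic/claude-opus", instructions: ["保持代码风格一致"] }
const projectLayer: Config = {
  model: "openai/gpt-5",
  instructions: ["使用 Effect-TS 的 pipe 风格", "保持代码风格一致"], // duplicate is deduped
}

const merged = mergeConfigs(globalLayer, projectLayer)
```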
MDM 集成与企业管控MDM Integration and Enterprise Control
OpenCode 对企业场景的支持超出了大多数开源 CLI 的范畴:在 macOS 上,它读取 /Library/Managed Preferences/ai.opencode.managed.plist,剥离 Apple 的 payload 元数据键,然后通过相同的 Zod schema 解析。这意味着企业 IT 部门可以通过 MDM(移动设备管理)平台统一推送 OpenCode 配置——包括允许的模型列表、默认权限策略、禁用的工具集——而开发者本地的项目级配置仍然可以在允许范围内自定义。
这种分层在企业环境中的价值是:合规团队控制「不可以做什么」(全局 deny 规则、审计日志要求、模型白名单),而工程团队控制「怎么做最高效」(项目级指令、自定义 skill、MCP 连接)。权力不是集中在某一层,而是每层有自己的管辖范围。
OpenCode's enterprise support goes beyond what most open-source CLIs offer. On macOS, it reads /Library/Managed Preferences/ai.opencode.managed.plist, strips Apple's payload metadata keys, and parses the result through the same Zod schema. This means enterprise IT departments can push OpenCode configurations via MDM platforms—including allowed model lists, default permission policies, and disabled tool sets—while developers' project-level configs can still customize within the allowed boundaries.
The value of this layering in enterprise contexts: compliance teams control "what must not happen" (global deny rules, audit logging requirements, model allowlists), while engineering teams control "how to work most efficiently" (project-level instructions, custom skills, MCP connections). Authority is not concentrated in one layer—each layer has its own jurisdiction.
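The "strip Apple payload metadata, keep the rest" step might look like the sketch below. The key-prefix heuristic is an assumption based on Apple's standard configuration-profile keys (PayloadType, PayloadUUID, PayloadVersion, and similar), not OpenCode's exact implementation:

```typescript
// Remove Apple's payload metadata keys before handing the dict to the
// normal config schema. The Payload* prefix heuristic is an assumption
// based on Apple's standard profile keys, not OpenCode's exact code.
function stripPayloadKeys(plist: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {}
  for (const [key, value] of Object.entries(plist)) {
    if (!key.startsWith("Payload")) out[key] = value
  }
  return out // what remains is parsed with the same schema as any other config layer
}
```

The design point worth noticing is the last comment: MDM-delivered config goes through the same Zod schema as a local JSON file, so there is no separate "enterprise config" code path to keep in sync.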
配置系统中的插件规范支持 [spec, options] 元组、file:// URL、相对路径和动态解析。但这里有一个有意思的设计张力:越灵活的插件系统,攻击面越大。一个恶意的 file:// 插件可以在 Agent 启动时执行任意代码。OpenCode 的应对不是限制灵活性,而是通过配置层级来控制信任:全局层的插件被视为「管理员信任」,项目层的插件被视为「开发者信任」,而 .opencode/ 目录中的插件则是「仓库级信任」。信任层级从上到下递减。
The config system's plugin spec supports [spec, options] tuples, file:// URLs, relative paths, and dynamic resolution. But there is an interesting design tension: the more flexible the plugin system, the larger the attack surface. A malicious file:// plugin can execute arbitrary code at Agent startup. OpenCode's response is not to restrict flexibility but to control trust via config layers: global-layer plugins are treated as "admin-trusted," project-layer plugins as "developer-trusted," and .opencode/ directory plugins as "repo-level trusted." Trust grades decrease from top to bottom.
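A normalizer for those spec shapes might look like the following. The function and field names are hypothetical; only the accepted shapes ([spec, options] tuples, file:// URLs, relative paths, bare package names) come from the text:

```typescript
// Illustrative normalizer for the plugin-spec shapes listed above.
// Names (resolvePlugin, ResolvedPlugin, local) are hypothetical.
type PluginSpec = string | [string, Record<string, unknown>]
type ResolvedPlugin = { source: string; options: Record<string, unknown>; local: boolean }

function resolvePlugin(spec: PluginSpec, configDir: string): ResolvedPlugin {
  const [raw, options]: [string, Record<string, unknown>] =
    Array.isArray(spec) ? spec : [spec, {}]
  if (raw.startsWith("file://")) {
    // file:// plugins execute in-process at startup, hence the trust tiers above
    return { source: raw.slice("file://".length), options, local: true }
  }
  if (raw.startsWith("./") || raw.startsWith("../")) {
    return { source: `${configDir}/${raw}`, options, local: true } // naive join, no normalization
  }
  return { source: raw, options, local: false } // bare registry package name
}
```

The `local` flag is where the trust-tier idea attaches: a resolver like this knows which config layer produced the spec, so local code paths can be gated differently per layer.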
27 构建系统:Bun 编译与原生二进制Build System: Bun Compile and Native Binaries
OpenCode 的构建管线是对「TypeScript 也能发布原生二进制」这个命题的完整证明。script/build.ts 使用 Bun.build() 的 compile 选项,将整个 TypeScript 项目——包括 SolidJS 组件、Effect-TS 运行时、SQL 迁移脚本——编译为无需 Node.js 的独立可执行文件,覆盖 12 个目标平台。
OpenCode's build pipeline is a complete proof that TypeScript can ship as native binaries. script/build.ts uses Bun.build()'s compile option to turn the entire TypeScript project—including SolidJS components, Effect-TS runtime, and SQL migration scripts—into standalone executables that require no Node.js, spanning 12 target platforms.
嵌入式 Web UI BundleEmbedded Web UI Bundle
最有趣的构建步骤是 Web UI 的嵌入。构建脚本先编译 app 包产生静态资源,然后生成一个合成 TypeScript 文件(opencode-web-ui.gen.ts),用 import ... with { type: "file" } 语法导入每个静态资源,再导出为一个映射对象。Bun 编译器会将这些文件嵌入到最终二进制中,实现「单文件分发包含完整 Web UI」。
```ts
// 生成的嵌入文件结构
import file_0 from "./dist/index.html" with { type: "file" }
import file_1 from "./dist/assets/app.js" with { type: "file" }
export default {
  "index.html": file_0,
  "assets/app.js": file_1,
}
```
The most interesting build step is Web UI embedding. The build script first compiles the app package to produce static assets, then generates a synthetic TypeScript file (opencode-web-ui.gen.ts) that imports every static asset via import ... with { type: "file" } and exports them as a mapping object. Bun's compiler embeds these files into the final binary, achieving "single-file distribution with a complete Web UI."
```ts
// generated embedding file structure
import file_0 from "./dist/index.html" with { type: "file" }
import file_1 from "./dist/assets/app.js" with { type: "file" }
export default {
  "index.html": file_0,
  "assets/app.js": file_1,
}
```
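The generator that emits such a file is essentially a string templater. Here is a hypothetical sketch (the real build script also walks dist/ on disk and handles more asset types):

```typescript
// Hypothetical sketch of generating opencode-web-ui.gen.ts from a list
// of built asset paths relative to dist/.
function generateEmbedModule(assets: string[]): string {
  const imports = assets.map(
    (a, i) => `import file_${i} from "./dist/${a}" with { type: "file" }`,
  )
  const entries = assets.map((a, i) => `  "${a}": file_${i},`)
  return [...imports, "export default {", ...entries, "}"].join("\n")
}
```

Because the output is ordinary TypeScript, Bun's compiler needs no special embedding mode: it sees regular `with { type: "file" }` imports and inlines the referenced files into the binary.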
编译时注入与冒烟测试Compile-Time Injection and Smoke Tests
数据库迁移不是运行时读取文件,而是编译时注入:SQL 迁移文件从 migration/YYYYMMDDhhmmss/migration.sql 加载,解析后通过 Bun 的 define 机制注入为 OPENCODE_MIGRATIONS 常量。这样用户拿到的二进制文件内已包含所有迁移脚本,不需要额外的 migration 目录。
构建完成后,如果当前平台匹配目标平台,脚本会自动运行 ./bin/opencode --version 做冒烟测试。失败则整个构建标红。这是一个简单但有效的"信任但验证"策略:你不需要完整的 E2E 测试套件,但至少要确认产物能正常启动。
Database migrations are not loaded at runtime—they are injected at compile time. SQL migration files from migration/YYYYMMDDhhmmss/migration.sql are loaded, parsed, and injected as OPENCODE_MIGRATIONS constants via Bun's define mechanism. The binary ships with all migrations baked in—no separate migration directory needed.
After compilation, if the current platform matches the target, the script automatically runs ./bin/opencode --version as a smoke test. Failure marks the entire build as red. This is a simple but effective "trust but verify" strategy: you don't need a full E2E test suite, but you must confirm the artifact can start.
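A sketch of the injection step: the directory layout (migration/YYYYMMDDhhmmss/migration.sql) comes from the text, while the helper below and the `Bun.build` wiring shown in its trailing comment are illustrative.

```typescript
// Collect SQL migrations keyed by their timestamped directory name so they
// apply in chronological order. The `files` map (path -> SQL text) stands in
// for reading the migration/ tree off disk.
function collectMigrations(files: Record<string, string>): [string, string][] {
  return Object.entries(files)
    .map(([path, sql]) => [path.split("/")[1], sql] as [string, string])
    .sort(([a], [b]) => a.localeCompare(b))
}

// The sorted list is then baked in at build time, roughly:
//   await Bun.build({
//     entrypoints: ["src/index.ts"],
//     define: { OPENCODE_MIGRATIONS: JSON.stringify(collectMigrations(files)) },
//   })
```

`define` performs compile-time constant substitution (the same mechanism esbuild exposes), so at runtime `OPENCODE_MIGRATIONS` is a literal already present in the binary, with no filesystem access involved.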
整个 monorepo 的构建编排使用 Turborepo:turbo.json 定义了 build、dev、typecheck 等 task 的依赖关系和缓存策略。SolidJS 组件在构建期通过 @opentui/solid/bun-plugin 编译为原生 JavaScript——不是 React 那种运行时 JSX 转换,而是真正的编译时优化,消除虚拟 DOM 层。这使得最终二进制中的 TUI 代码既保留了声明式组件模型的开发体验,又拥有接近手写命令式代码的运行时性能。
The entire monorepo's build orchestration uses Turborepo: turbo.json defines dependency relationships and caching strategies for tasks like build, dev, and typecheck. SolidJS components are compiled during the build phase via @opentui/solid/bun-plugin into native JavaScript—not React-style runtime JSX transforms, but true compile-time optimization that eliminates the virtual DOM layer. The result: TUI code in the final binary retains the declarative component model's developer experience while achieving near-hand-written imperative performance.
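A turbo.json along these lines expresses that dependency graph. Only the task names come from the text; the exact definitions below are an assumption:

```json
{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {
    "build": { "dependsOn": ["^build"], "outputs": ["dist/**"] },
    "typecheck": { "dependsOn": ["^build"] },
    "dev": { "cache": false, "persistent": true }
  }
}
```

Here `^build` means "build every upstream workspace dependency first," and the declared `outputs` are what Turborepo hashes and restores on cache hits.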
28 总结Conclusion
OpenCode 代表了 AI Coding Agent 发展的一个重要方向:开源、不绑定、可扩展、终端原生。它的技术架构做出了几个关键决策:
OpenCode represents an important direction in the evolution of AI Coding Agents: open source, provider-agnostic, extensible, and terminal-native. Its architecture is built around several key decisions:
选型指南:OpenCode vs 竞品Selection Guide: OpenCode vs Alternatives
没有「最好」的工具,只有「最合适」的场景。以下对比帮助你在不同约束下做出正确选择:
| 场景 / 诉求 | 推荐工具 | 核心原因 |
|---|---|---|
| 不想被 Anthropic 绑定,需要随时切换 LLM Provider | OpenCode ✓ | 21 个 Provider 无缝切换,配置即换模型 |
| 已深度集成 VS Code,依赖 GUI diff/review 工作流 | Cursor / Copilot | OpenCode 的 UI 是终端 TUI,无 GUI diff panel |
| 想要最「开箱即用」的体验,追求零配置 | Claude Code | Anthropic 官方维护,anthropic 模型效果经过深度调优 |
| 需要自建工具、私有 MCP 服务器,扩展 Agent 能力 | OpenCode ✓ | 三层扩展(Plugin/Skill/MCP),工具发现动态化 |
| 在受限网络环境运行,必须使用本地部署模型 | OpenCode ✓ | 支持 Ollama / 任意 OpenAI-compatible endpoint |
| 偶发性代码辅助需求,不需要 Agent 级自主操作 | GitHub Copilot | 补全级工具更轻量,Agent 模式对此场景是过度设计 |
| 希望通过源码学习生产级 AI Agent 架构 | OpenCode ✓ | MIT 开源,架构清晰,注释完整,极佳的教学样本 |
There is no universally “best” tool—only the tool that best fits a given set of constraints. The comparison below is meant to make that trade-off explicit:
| Scenario / Requirement | Recommended Tool | Why |
|---|---|---|
| You do not want to be locked into Anthropic and need to switch LLM Providers freely | OpenCode ✓ | It supports 21 Providers with seamless switching; changing models is primarily a configuration concern |
| You are deeply invested in VS Code and rely on GUI-based diff/review workflows | Cursor / Copilot | OpenCode’s primary UI is a terminal TUI and does not provide a GUI diff panel |
| You want the most turnkey experience possible and prefer near-zero configuration | Claude Code | It is maintained directly by Anthropic, and the Anthropic model stack is heavily tuned for it |
| You need custom tools, private MCP servers, and deeper Agent extensibility | OpenCode ✓ | Its three-layer extension model (Plugin / Skill / MCP) makes tool discovery and capability growth programmable |
| You operate in restricted network environments and must use locally deployed models | OpenCode ✓ | It supports Ollama and any OpenAI-compatible endpoint |
| You only need occasional coding assistance and do not need Agent-level autonomy | GitHub Copilot | A completion-oriented tool is lighter-weight; an Agent workflow would be overkill here |
| You want to learn production-grade AI Agent architecture directly from source code | OpenCode ✓ | It is MIT-licensed, architecturally clean, well-commented, and an excellent teaching specimen |
选择 OpenCode 意味着接受以下成本,值不值得由你的优先级决定:
- 内存 >1GB:Bun 运行时 + Language Server + Session 状态常驻内存,不适合 RAM < 4GB 的低端机器;
- 版本发布节奏极快:主线每周多次 commit,breaking change 偶发。最稳妥的策略是固定版本号而非总跟 latest;
- Effect-TS 学习曲线:源码大量使用 Effect 代数效果系统。阅读或 fork 修改时,不懂 Effect-TS 会感到极度陌生;
- 无沙箱隔离:权限系统是 UX 层,不是安全层。Agent 执行的 bash 命令与你的用户权限完全相同。
换来的收益:完全的工具链自主权、Provider 自由、可编程的扩展点——如果这些对你重要,上述成本完全值得。
Choosing OpenCode means accepting the following costs. Whether they are worth it depends entirely on your priorities:
- Memory > 1GB: the Bun runtime, Language Server processes, and Session state all stay resident, so it is not ideal for low-end machines with less than 4GB RAM;
- Very rapid release cadence: the main branch moves several times per week, and breaking changes do happen. Pinning versions is safer than always tracking latest;
- Effect-TS learning curve: the codebase relies heavily on algebraic effects. If you are reading or forking it without prior Effect-TS familiarity, it will feel alien at first;
- No sandbox isolation: the permission system is a UX layer, not a security layer. Any bash command the Agent runs inherits your user-level privileges.
What you get in return is full ownership of your toolchain, Provider freedom, and programmable extension points. If those matter to you, the cost is entirely reasonable.
OpenCode 揭示了哪些 AI Agent 设计的普遍规律?What General AI Agent Design Principles Does OpenCode Reveal?
透过 OpenCode 的技术决策,可以提炼出几条适用于所有 AI Agent 系统的设计原则:
Agent 的「大脑」(推理、状态管理、工具调用)与「界面」(TUI/GUI/API)天然解耦。强行合并会导致界面代码与核心逻辑互相污染,也无法让同一个 Agent 同时服务于终端、IDE 插件和 CI/CD 流水线。C/S 分离是扩展边界清晰化的前提。
支持多模型不等于「只能用所有模型都支持的功能」。OpenCode 通过 capability negotiation(运行时检测模型是否支持 tool_call、streaming、vision),让每个 Provider 充分发挥自身能力。这是 Provider 无关设计的正确姿势——抽象不应该削峰,而应该在通用接口上为专有能力留出扩展口。
Session Compaction 的本质是一个资源调度问题:当「短期工作记忆」(原始对话历史)接近上限时,必须将其转换为压缩的「结构化摘要」。这与操作系统的内存换页机制同构——不是丢弃,而是降密存储。这条原则适用于任何需要处理长序列的 Agent。
过于保守的权限设计(每个操作都需确认)会让用户把 Agent 当成负担;过于宽松则制造恐惧。OpenCode 的 glob 级别权限是一个合理的平衡点:足够精细(src/**/*.ts),又不至于每次写文件都弹框。这个粒度设计值得所有需要在自主性和安全感之间取舍的 Agent 系统借鉴。
OpenCode’s design choices can be distilled into several principles that apply broadly to AI Agent systems in general:
The Agent’s “brain” (reasoning, state management, tool invocation) is naturally decoupled from its “interface” (TUI / GUI / API). Forcing them together pollutes core logic with UI concerns and prevents the same Agent from serving terminals, IDE plugins, and CI/CD pipelines at once. C/S separation is what makes clean extension boundaries possible.
Supporting many models does not mean limiting yourself to whatever every model supports. OpenCode uses capability negotiation at runtime—checking whether a model supports tool calls, streaming, or vision—so each Provider can express its strengths. Good abstraction should not flatten peaks; it should preserve extension points for differentiated capabilities.
Session Compaction is fundamentally a resource scheduling problem. When short-term working memory—the raw conversation history—approaches the limit, it must be converted into a compressed structured summary. This is analogous to OS paging: not deletion, but lower-density storage. The principle applies to any Agent handling long sequences.
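A toy version of the paging analogy: when raw history exceeds a token budget, older entries are folded into one summary entry. The threshold, the split point, and the `summarize` callback are illustrative, not OpenCode's actual compaction logic.

```typescript
// When total tokens exceed the budget, replace the older half of the
// history with a single summary entry -- "lower-density storage," not deletion.
type Entry = { role: "user" | "assistant" | "system"; text: string; tokens: number }

function compact(history: Entry[], budget: number, summarize: (old: Entry[]) => Entry): Entry[] {
  const total = history.reduce((n, e) => n + e.tokens, 0)
  if (total <= budget) return history          // still fits in working memory
  const keep = Math.ceil(history.length / 2)   // keep the recent half verbatim
  const old = history.slice(0, history.length - keep)
  return [summarize(old), ...history.slice(history.length - keep)]
}
```

In a real Agent, `summarize` is itself an LLM call that produces the structured summary; the scheduling decision of when to pay that cost is the part this sketch isolates.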
If permissions are too conservative and every action requires confirmation, the Agent becomes a burden. If they are too loose, users become afraid to let it operate. OpenCode's glob-level permissions strike a practical balance: precise enough for patterns like src/**/*.ts, yet not so noisy that every file write triggers a prompt. That granularity is worth borrowing by any Agent system that has to trade autonomy against the user's sense of safety.
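As a rough illustration, glob-level rules in a config file might look like this. The field names and value vocabulary here are an assumption sketching the shape of such rules, not a guaranteed match for OpenCode's actual schema (see opencode.ai/docs for that):

```json
{
  "permission": {
    "edit": {
      "src/**/*.ts": "allow",
      "**/*.env": "deny",
      "*": "ask"
    },
    "bash": "ask"
  }
}
```

The pattern to note is the three-valued vocabulary: blanket allow for trusted paths, hard deny for secrets, and interactive confirmation as the fallback.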
如果你符合以下特征,OpenCode 很可能会成为你最重要的开发工具之一:
- 以终端为主要工作环境,SSH 进远程服务器是日常操作;
- 不信任单一 LLM 供应商,希望保留随时切换或混用模型的能力;
- 有定制化需求——想写自己的工具、接入内部 API、扩展 Agent 行为;
- 对 AI Agent 的底层原理感兴趣,愿意通过阅读生产代码来学习;
- 认为「控制工具链」比「UI 精致度」更重要的工程师文化。
对于希望深入理解 AI Agent 架构的开发者,OpenCode 是一个极好的学习对象——代码完全公开,架构清晰,文档完善。比起阅读论文中的抽象描述,直接阅读一个 141k star 的生产级项目源码,收获会大得多。
If the following describes you, OpenCode is very likely to become one of your most important development tools:
- Your primary working environment is the terminal, and SSH-ing into remote servers is routine;
- You do not trust a single LLM vendor and want the freedom to switch or mix models at any time;
- You need customization—writing your own tools, integrating internal APIs, and extending Agent behavior;
- You care about the underlying mechanics of AI Agents and are willing to learn by reading production code;
- You come from an engineering culture where controlling the toolchain matters more than UI polish.
For developers who want to understand AI Agent architecture in depth, OpenCode is an exceptional learning target: the code is fully public, the architecture is clear, and the documentation is solid. Compared with reading abstract descriptions in papers, studying the source of a 141k-star production system is dramatically more instructive.
```sh
curl -fsSL https://opencode.ai/install | bash
cd your-project && opencode
```

然后运行 /init 初始化项目。文档:opencode.ai/docs
Then run /init to initialize the project. Docs: opencode.ai/docs