main: Enhance tool calling and message flow

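- Support handling of the tool.call and tool.result message types
- Introduce Tool dispatch and execution logic, with timeout and result truncation
- Add ToolRegistry and ToolExecutor to manage tool definitions and execution
- Update context building and message mapping logic to fit the closed tool loop
- Extend configuration and environment variables with Tool call options
- Extend unit test coverage for tool call and execution scenarios
- Update the docs and OpenAPI spec with tool descriptions and model definitions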
2025-12-22 12:36:59 +08:00
parent dcbd0338e6
commit 59d4831f00
23 changed files with 1253 additions and 103 deletions
--- a/docs/ChatSession/chat-session-api.md
+++ b/docs/ChatSession/chat-session-api.md
@@ -10,14 +10,15 @@
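 - 2025-02-15: Agent Run MVP-0 (RunDispatcher + AgentRunJob + DummyProvider). A Run is triggered automatically after each user.prompt and persists run.status / agent.message.
 - 2025-12-18: Agent Run reliability improvements (concurrency idempotency, terminal-state deduplication, stricter cancel semantics, Provider timeout/retry/error normalization, SSE gap backfill and heartbeat).
 - 2025-12-19: AgentProvider streaming integration (unified ProviderEvent stream, new message.delta output, and an OpenAI-compatible adapter).
+- 2025-12-21: Tool sub-Run mode. The Provider supports tool.delta→tool.call; the parent Run dispatches a sub-Run to execute the tool and write tool.result.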

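-## Change summary (2025-12-19)
+## Change summary (2025-12-21)
 - RunDispatcher concurrency idempotency: a given trigger_message_id yields only one RUNNING Run, and dispatch happens only when the Run is newly created.
- RunLoop/OutputSink idempotency: agent.message and run.status use a dedupe_key; re-execution does not write duplicates.
- Cancel hardening: cancellation is checked at multiple checkpoints, ensuring that no agent.message is written and that the terminal state CANCELED is persisted.
+- RunLoop/OutputSink idempotency: agent.message, run.status, tool.call, and tool.result all use a dedupe_key.
+- Cancel hardening: cancellation is checked at multiple checkpoints, ensuring that no agent.message is written and that the terminal state CANCELED is persisted; canceling the parent Run terminates any sub-Run it is waiting on.
 - Provider reliability: timeout/retry/429/5xx handling; persisted errors include retryable/http_status/provider/latency_ms.
- SSE reliability: a gap triggers backfill, heartbeats keep the connection alive, and publish exceptions do not affect the main flow.
- Streaming: AgentProvider emits message.delta as an event stream; RunLoop aggregates the deltas and writes agent.message.
+- Streaming: AgentProvider emits message.delta / tool.delta / done; finish_reason=tool_calls triggers a sub-Run that executes the tool.
+- Closed tool loop: tool.call (role=AGENT) is persisted → a sub-Run is dispatched → tool.result (role=TOOL) is fed back → the next LLM round begins.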

 ## Domain model
 - `ChatSession`: `session_id` (UUID), `session_name`, `status` (`OPEN`/`LOCKED`/`CLOSED`), `last_seq`
@@ -25,6 +26,7 @@
 - Idempotency: `UNIQUE (session_id, dedupe_key)`; a repeated dedupe_key returns the existing message.
 - Status gating: `CLOSED` forbids appends, except for `role=SYSTEM && type in [run.status, error]`; `LOCKED` forbids `role=USER && type=user.prompt`.
 - Session cache: `chat_sessions.last_message_id` records the last message; `appendMessage` updates `last_seq`, `last_message_id`, and `updated_at` within the same transaction.
+- Tool messages: `tool.call` (role=AGENT, carrying tool_call_id/name/arguments) and `tool.result` (role=TOOL, carrying parent_run_id/run_id/status/result).

 ## API
 ### Create session
--- a/docs/ChatSession/chat-session-openapi.yaml
+++ b/docs/ChatSession/chat-session-openapi.yaml
@@ -436,6 +436,8 @@ components:
            - $ref: '#/components/schemas/MessageDeltaPayload'
            - $ref: '#/components/schemas/RunCancelPayload'
            - $ref: '#/components/schemas/RunErrorPayload'
+            - $ref: '#/components/schemas/ToolCallPayload'
+            - $ref: '#/components/schemas/ToolResultPayload'
            - type: object
        reply_to:
          type: string
@@ -544,6 +546,38 @@ components:
        raw_message:
          type: string
          nullable: true
+    ToolCallPayload:
+      type: object
+      properties:
+        run_id:
+          type: string
+        tool_run_id:
+          type: string
+        tool_call_id:
+          type: string
+        name:
+          type: string
+        arguments:
+          type: object
+          additionalProperties: true
+    ToolResultPayload:
+      type: object
+      properties:
+        run_id:
+          type: string
+        parent_run_id:
+          type: string
+        tool_call_id:
+          type: string
+        name:
+          type: string
+        status:
+          type: string
+        error:
+          type: string
+          nullable: true
+        truncated:
+          type: boolean
    PaginationLinks:
      type: object
      properties:
--- a/docs/tools-subrun.md
+++ b/docs/tools-subrun.md
@@ -0,0 +1,25 @@
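+# Tool calling (sub-Run mode): minimal closed loop
+
+This change adds a Tool subsystem. It keeps the event-driven RunLoop/Provider model unchanged, executing tools through a "sub-Run" and feeding the result back into the parent Run.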
+
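+## Key flow
+- The Provider produces `tool.call` (aggregated from `tool.delta` events during streaming); RunLoop persists `tool.call` and creates the sub-Run `run:{parent}:{tool_call_id}`.
+- `ToolRunJob` executes the actual tool (currently the built-in `get_time`) and writes `tool.result` along with the sub-Run's `run.status`.
+- The parent Run polls for the sub-Run result (terminating on timeout or failure), appends `tool.result` to the context, and calls the Provider again until the final `agent.message` is produced.
+- Idempotency: `tool.call`, the sub-Run's `run.status`, and `tool.result` all carry a dedupe_key; a given tool_call_id is executed only once.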
+
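+## Messages/events
+- New message types: `tool.call` (role=AGENT, payload contains tool_call_id/name/arguments) and `tool.result` (role=TOOL, payload contains parent_run_id/run_id/status/result).
+- New Provider event `tool.delta`: RunLoop aggregates the deltas internally before triggering the sub-Run; `finish_reason=tool_calls` ends the current streaming round and enters tool execution.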
+
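+## Configuration notes
+- `AGENT_TOOL_MAX_CALLS_PER_RUN`: number of tool calls allowed per parent Run (default 1; exceeding it fails the Run immediately).
+- `AGENT_TOOL_WAIT_TIMEOUT_MS` / `AGENT_TOOL_WAIT_POLL_MS`: timeout and polling interval for the parent Run while it waits for the sub-Run result.
+- `AGENT_TOOL_TIMEOUT_SECONDS` / `AGENT_TOOL_RESULT_MAX_BYTES`: threshold for marking a tool execution as timed out, and the size limit at which results are truncated.
+- `AGENT_TOOL_CHOICE`: the tool_choice passed to OpenAI (default auto).
+- ToolRunJob queue parameters: `AGENT_TOOL_JOB_TRIES` / `AGENT_TOOL_JOB_BACKOFF` / `AGENT_TOOL_JOB_TIMEOUT`.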
+
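+## Extension points/limitations
+- Only a single-tool-call closed loop is supported for now; the call limit is configurable, but multiple calls still run serially.
+- The tool list can be extended through `ToolRegistry` (currently only the built-in pure function `get_time`).
+- The result timeout is a soft timeout at the parent-Run level; the PHP layer does not forcibly interrupt long-running functions (external timeout control can be added later).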
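
The closed loop described in `docs/tools-subrun.md` can be illustrated with a minimal sketch. This is not the commit's code: it is written in Python rather than the project's PHP, it runs the sub-Run inline instead of queueing a `ToolRunJob` and polling, and `MessageStore`, `run_tool`, `parent_run`, `truncate`, the dedupe-key format, and the `SUCCEEDED`/`FAILED` status values are all hypothetical names chosen for illustration. Only the message types, roles, the `run:{parent}:{tool_call_id}` sub-Run id, and `get_time` come from the docs above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable

MAX_CALLS_PER_RUN = 1   # mirrors the documented default of AGENT_TOOL_MAX_CALLS_PER_RUN
RESULT_MAX_BYTES = 64   # stand-in for AGENT_TOOL_RESULT_MAX_BYTES; the value is arbitrary


@dataclass
class MessageStore:
    """Hypothetical in-memory store that mimics UNIQUE(session_id, dedupe_key)."""
    messages: list = field(default_factory=list)
    by_key: dict = field(default_factory=dict)

    def append(self, role, type_, payload, dedupe_key):
        # A repeated dedupe_key returns the existing message instead of writing again.
        if dedupe_key in self.by_key:
            return self.by_key[dedupe_key], False
        msg = {"role": role, "type": type_, "payload": payload}
        self.messages.append(msg)
        self.by_key[dedupe_key] = msg
        return msg, True


# Assumed registry shape: tool name -> callable taking the arguments dict.
TOOLS: dict[str, Callable[[dict], str]] = {
    "get_time": lambda args: datetime.now(timezone.utc).isoformat(),
}


def truncate(text, max_bytes):
    """Cut text to at most max_bytes of UTF-8; report whether it was cut."""
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text, False
    return raw[:max_bytes].decode("utf-8", errors="ignore"), True


def run_tool(store, parent_run_id, call):
    """Persist tool.call, execute the tool once, persist tool.result."""
    sub_run_id = f"run:{parent_run_id}:{call['tool_call_id']}"
    result_key = f"{sub_run_id}:tool.result"  # assumed key format
    store.append("AGENT", "tool.call",
                 {**call, "run_id": parent_run_id, "tool_run_id": sub_run_id},
                 dedupe_key=f"{sub_run_id}:tool.call")
    if result_key in store.by_key:  # this tool_call_id was already executed
        return store.by_key[result_key]
    payload = {"run_id": sub_run_id, "parent_run_id": parent_run_id,
               "tool_call_id": call["tool_call_id"], "name": call["name"]}
    tool = TOOLS.get(call["name"])
    if tool is None:
        payload.update(status="FAILED", error="unknown tool",
                       result=None, truncated=False)
    else:
        result, cut = truncate(tool(call["arguments"]), RESULT_MAX_BYTES)
        payload.update(status="SUCCEEDED", error=None, result=result, truncated=cut)
    msg, _ = store.append("TOOL", "tool.result", payload, dedupe_key=result_key)
    return msg


def parent_run(store, run_id, provider):
    """Call the provider until it stops asking for tools, then write agent.message."""
    calls = 0
    while True:
        event = provider(store.messages)
        if event["finish_reason"] != "tool_calls":
            msg, _ = store.append("AGENT", "agent.message", {"text": event["text"]},
                                  dedupe_key=f"{run_id}:agent.message")
            return msg
        calls += 1
        if calls > MAX_CALLS_PER_RUN:
            raise RuntimeError("tool call limit exceeded")
        run_tool(store, run_id, event["tool_call"])


def fake_provider(messages):
    """Stand-in provider: asks for get_time once, then answers with the result."""
    results = [m for m in messages if m["type"] == "tool.result"]
    if results:
        return {"finish_reason": "stop",
                "text": "The time is " + results[-1]["payload"]["result"]}
    return {"finish_reason": "tool_calls",
            "tool_call": {"tool_call_id": "call_1", "name": "get_time", "arguments": {}}}
```

Calling `parent_run(MessageStore(), "r1", fake_provider)` should leave three messages in the store, in the order `tool.call`, `tool.result`, `agent.message`. Calling `run_tool` again with the same `tool_call_id` writes nothing new, which is the behavior the dedupe_key bullet describes.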