resolve conflicts with main

This commit is contained in:
Re-bin 2026-02-06 07:08:29 +00:00
commit 71fc73ecc4
22 changed files with 889 additions and 75 deletions

2
.gitignore vendored
View File

@ -13,3 +13,5 @@ docs/
*.pyz
*.pywz
*.pyzz
.venv/
__pycache__/

View File

@ -18,7 +18,10 @@
## 📢 News
- **2026-02-01** 🎉 nanobot launched! Welcome to try 🐈 nanobot!
- **2026-02-05** ✨ Added Feishu channel, DeepSeek provider, and enhanced scheduled tasks support!
- **2026-02-04** 🚀 Released v0.1.3.post4 with multi-provider & Docker support! Check [release notes](https://github.com/HKUDS/nanobot/releases/tag/v0.1.3.post4) for details.
- **2026-02-03** ⚡ Integrated vLLM for local LLM support and improved natural language task scheduling!
- **2026-02-02** 🎉 nanobot officially launched! Welcome to try 🐈 nanobot!
## Key Features of nanobot:
@ -28,7 +31,7 @@
⚡️ **Lightning Fast**: Minimal footprint means faster startup, lower resource usage, and quicker iterations.
💎 **Easy-to-Use**: One-click to depoly and you're ready to go.
💎 **Easy-to-Use**: One-click to deploy and you're ready to go.
## 🏗️ Architecture
@ -108,9 +111,13 @@ nanobot onboard
"model": "anthropic/claude-opus-4-5" "model": "anthropic/claude-opus-4-5"
} }
}, },
"webSearch": { "tools": {
"web": {
"search": {
"apiKey": "BSA-xxx" "apiKey": "BSA-xxx"
} }
}
}
} }
``` ```
@ -162,13 +169,14 @@ nanobot agent -m "Hello from my local LLM!"
## 💬 Chat Apps
Talk to your nanobot through Telegram, Discord, or WhatsApp — anytime, anywhere.
Talk to your nanobot through Telegram, Discord, WhatsApp, or Feishu — anytime, anywhere.
| Channel | Setup |
|---------|-------|
| **Telegram** | Easy (just a token) |
| **Discord** | Easy (bot token + intents) |
| **WhatsApp** | Medium (scan QR) |
| **Feishu** | Medium (app credentials) |
<details>
<summary><b>Telegram</b> (Recommended)</summary>
@ -283,6 +291,55 @@ nanobot gateway
</details>
<details>
<summary><b>Feishu (飞书)</b></summary>
Uses **WebSocket** long connection — no public IP required.
```bash
pip install nanobot-ai[feishu]
```
**1. Create a Feishu bot**
- Visit [Feishu Open Platform](https://open.feishu.cn/app)
- Create a new app → Enable **Bot** capability
- **Permissions**: Add `im:message` (send messages)
- **Events**: Add `im.message.receive_v1` (receive messages)
- Select **Long Connection** mode (requires running nanobot first to establish connection)
- Get **App ID** and **App Secret** from "Credentials & Basic Info"
- Publish the app
**2. Configure**
```json
{
"channels": {
"feishu": {
"enabled": true,
"appId": "cli_xxx",
"appSecret": "xxx",
"encryptKey": "",
"verificationToken": "",
"allowFrom": []
}
}
}
```
> `encryptKey` and `verificationToken` are optional for Long Connection mode.
> `allowFrom`: Leave empty to allow all users, or add `["ou_xxx"]` to restrict access.
**3. Run**
```bash
nanobot gateway
```
> [!TIP]
> Feishu uses WebSocket to receive messages — no webhook or public IP needed!
</details>
## ⚙️ Configuration
Config file: `~/.nanobot/config.json`
@ -297,6 +354,7 @@ Config file: `~/.nanobot/config.json`
| `openrouter` | LLM (recommended, access to all models) | [openrouter.ai](https://openrouter.ai) |
| `anthropic` | LLM (Claude direct) | [console.anthropic.com](https://console.anthropic.com) |
| `openai` | LLM (GPT direct) | [platform.openai.com](https://platform.openai.com) |
| `deepseek` | LLM (DeepSeek direct) | [platform.deepseek.com](https://platform.deepseek.com) |
| `groq` | LLM + **Voice transcription** (Whisper) | [console.groq.com](https://console.groq.com) |
| `gemini` | LLM (Gemini direct) | [aistudio.google.com](https://aistudio.google.com) |
@ -332,6 +390,14 @@ Config file: `~/.nanobot/config.json`
},
"whatsapp": {
"enabled": false
},
"feishu": {
"enabled": false,
"appId": "cli_xxx",
"appSecret": "xxx",
"encryptKey": "",
"verificationToken": "",
"allowFrom": []
}
},
"tools": {
@ -438,7 +504,7 @@ PRs welcome! The codebase is intentionally small and readable. 🤗
### Contributors
<a href="https://github.com/HKUDS/nanobot/graphs/contributors">
<img src="https://contrib.rocks/image?repo=HKUDS/nanobot" />
<img src="https://contrib.rocks/image?repo=HKUDS/nanobot&max=100&columns=12" />
</a>

View File

@ -2,6 +2,7 @@
import base64
import mimetypes
import platform
from pathlib import Path
from typing import Any
@ -74,6 +75,8 @@ Skills with available="false" need dependencies installed first - you can try in
from datetime import datetime
now = datetime.now().strftime("%Y-%m-%d %H:%M (%A)")
workspace_path = str(self.workspace.expanduser().resolve())
system = platform.system()
runtime = f"{'macOS' if system == 'Darwin' else system} {platform.machine()}, Python {platform.python_version()}"
return f"""# nanobot 🐈
@ -87,6 +90,9 @@ You are nanobot, a helpful AI assistant. You have access to tools that allow you
## Current Time
{now}
## Runtime
{runtime}
## Workspace
Your workspace is at: {workspace_path}
- Memory files: {workspace_path}/memory/MEMORY.md
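The new Runtime section above is built from stdlib `platform` calls; a standalone sketch of that line (output varies by machine, so none is shown):

```python
import platform

# Mirror of the runtime string added to the system prompt:
# platform.system() returns "Darwin" on macOS, hence the friendly rename.
system = platform.system()
runtime = f"{'macOS' if system == 'Darwin' else system} {platform.machine()}, Python {platform.python_version()}"
print(runtime)
```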
@ -118,6 +124,8 @@ When remembering something, write to {workspace_path}/memory/MEMORY.md"""
current_message: str,
skill_names: list[str] | None = None,
media: list[str] | None = None,
channel: str | None = None,
chat_id: str | None = None,
) -> list[dict[str, Any]]:
"""
Build the complete message list for an LLM call.
@ -127,6 +135,8 @@ When remembering something, write to {workspace_path}/memory/MEMORY.md"""
current_message: The new user message.
skill_names: Optional skills to include.
media: Optional list of local file paths for images/media.
channel: Current channel (telegram, feishu, etc.).
chat_id: Current chat/user ID.
Returns:
List of messages including system prompt.
@ -135,6 +145,8 @@ When remembering something, write to {workspace_path}/memory/MEMORY.md"""
# System prompt
system_prompt = self.build_system_prompt(skill_names)
if channel and chat_id:
system_prompt += f"\n\n## Current Session\nChannel: {channel}\nChat ID: {chat_id}"
messages.append({"role": "system", "content": system_prompt})
# History

View File

@ -17,6 +17,7 @@ from nanobot.agent.tools.shell import ExecTool
from nanobot.agent.tools.web import WebSearchTool, WebFetchTool
from nanobot.agent.tools.message import MessageTool
from nanobot.agent.tools.spawn import SpawnTool
from nanobot.agent.tools.cron import CronTool
from nanobot.agent.subagent import SubagentManager
from nanobot.session.manager import SessionManager
@ -40,14 +41,20 @@ class AgentLoop:
workspace: Path,
model: str | None = None,
max_iterations: int = 20,
brave_api_key: str | None = None
brave_api_key: str | None = None,
exec_config: "ExecToolConfig | None" = None,
cron_service: "CronService | None" = None,
):
from nanobot.config.schema import ExecToolConfig
from nanobot.cron.service import CronService
self.bus = bus
self.provider = provider
self.workspace = workspace
self.model = model or provider.get_default_model()
self.max_iterations = max_iterations
self.brave_api_key = brave_api_key
self.exec_config = exec_config or ExecToolConfig()
self.cron_service = cron_service
self.context = ContextBuilder(workspace)
self.sessions = SessionManager(workspace)
@ -58,6 +65,7 @@ class AgentLoop:
bus=bus,
model=self.model,
brave_api_key=brave_api_key,
exec_config=self.exec_config,
)
self._running = False
@ -72,7 +80,11 @@ class AgentLoop:
self.tools.register(ListDirTool())
# Shell tool
self.tools.register(ExecTool(working_dir=str(self.workspace)))
self.tools.register(ExecTool(
working_dir=str(self.workspace),
timeout=self.exec_config.timeout,
restrict_to_workspace=self.exec_config.restrict_to_workspace,
))
# Web tools
self.tools.register(WebSearchTool(api_key=self.brave_api_key))
@ -86,6 +98,10 @@ class AgentLoop:
spawn_tool = SpawnTool(manager=self.subagents)
self.tools.register(spawn_tool)
# Cron tool (for scheduling)
if self.cron_service:
self.tools.register(CronTool(self.cron_service))
async def run(self) -> None:
"""Run the agent loop, processing messages from the bus."""
self._running = True
@ -149,11 +165,17 @@ class AgentLoop:
if isinstance(spawn_tool, SpawnTool):
spawn_tool.set_context(msg.channel, msg.chat_id)
cron_tool = self.tools.get("cron")
if isinstance(cron_tool, CronTool):
cron_tool.set_context(msg.channel, msg.chat_id)
# Build initial messages (use get_history for LLM-formatted messages)
messages = self.context.build_messages(
history=session.get_history(),
current_message=msg.content,
media=msg.media if msg.media else None,
channel=msg.channel,
chat_id=msg.chat_id,
)
# Agent loop
@ -247,10 +269,16 @@
if isinstance(spawn_tool, SpawnTool):
spawn_tool.set_context(origin_channel, origin_chat_id)
cron_tool = self.tools.get("cron")
if isinstance(cron_tool, CronTool):
cron_tool.set_context(origin_channel, origin_chat_id)
# Build messages with the announce content
messages = self.context.build_messages(
history=session.get_history(),
current_message=msg.content
current_message=msg.content,
channel=origin_channel,
chat_id=origin_chat_id,
)
# Agent loop (limited for announce handling)
@ -307,21 +335,29 @@
content=final_content
)
async def process_direct(self, content: str, session_key: str = "cli:direct") -> str:
async def process_direct(
self,
content: str,
session_key: str = "cli:direct",
channel: str = "cli",
chat_id: str = "direct",
) -> str:
"""
Process a message directly (for CLI usage).
Process a message directly (for CLI or cron usage).
Args:
content: The message content.
session_key: Session identifier.
channel: Source channel (for context).
chat_id: Source chat ID (for context).
Returns:
The agent's response.
"""
msg = InboundMessage(
channel="cli",
channel=channel,
sender_id="user",
chat_id="direct",
chat_id=chat_id,
content=content
)

View File

@ -33,12 +33,15 @@ class SubagentManager:
bus: MessageBus,
model: str | None = None,
brave_api_key: str | None = None,
exec_config: "ExecToolConfig | None" = None,
):
from nanobot.config.schema import ExecToolConfig
self.provider = provider
self.workspace = workspace
self.bus = bus
self.model = model or provider.get_default_model()
self.brave_api_key = brave_api_key
self.exec_config = exec_config or ExecToolConfig()
self._running_tasks: dict[str, asyncio.Task[None]] = {}
async def spawn(
@ -96,7 +99,11 @@ class SubagentManager:
tools.register(ReadFileTool())
tools.register(WriteFileTool())
tools.register(ListDirTool())
tools.register(ExecTool(working_dir=str(self.workspace)))
tools.register(ExecTool(
working_dir=str(self.workspace),
timeout=self.exec_config.timeout,
restrict_to_workspace=self.exec_config.restrict_to_workspace,
))
tools.register(WebSearchTool(api_key=self.brave_api_key))
tools.register(WebFetchTool())
@ -142,7 +149,8 @@ class SubagentManager:
# Execute tools
for tool_call in response.tool_calls:
logger.debug(f"Subagent [{task_id}] executing: {tool_call.name}")
args_str = json.dumps(tool_call.arguments)
logger.debug(f"Subagent [{task_id}] executing: {tool_call.name} with arguments: {args_str}")
result = await tools.execute(tool_call.name, tool_call.arguments)
messages.append({
"role": "tool",

View File

@ -12,6 +12,15 @@ class Tool(ABC):
the environment, such as reading files, executing commands, etc.
"""
_TYPE_MAP = {
"string": str,
"integer": int,
"number": (int, float),
"boolean": bool,
"array": list,
"object": dict,
}
@property
@abstractmethod
def name(self) -> str:
@ -43,6 +52,44 @@
"""
pass
def validate_params(self, params: dict[str, Any]) -> list[str]:
"""Validate tool parameters against JSON schema. Returns error list (empty if valid)."""
schema = self.parameters or {}
if schema.get("type", "object") != "object":
raise ValueError(f"Schema must be object type, got {schema.get('type')!r}")
return self._validate(params, {**schema, "type": "object"}, "")
def _validate(self, val: Any, schema: dict[str, Any], path: str) -> list[str]:
t, label = schema.get("type"), path or "parameter"
if t in self._TYPE_MAP and not isinstance(val, self._TYPE_MAP[t]):
return [f"{label} should be {t}"]
errors = []
if "enum" in schema and val not in schema["enum"]:
errors.append(f"{label} must be one of {schema['enum']}")
if t in ("integer", "number"):
if "minimum" in schema and val < schema["minimum"]:
errors.append(f"{label} must be >= {schema['minimum']}")
if "maximum" in schema and val > schema["maximum"]:
errors.append(f"{label} must be <= {schema['maximum']}")
if t == "string":
if "minLength" in schema and len(val) < schema["minLength"]:
errors.append(f"{label} must be at least {schema['minLength']} chars")
if "maxLength" in schema and len(val) > schema["maxLength"]:
errors.append(f"{label} must be at most {schema['maxLength']} chars")
if t == "object":
props = schema.get("properties", {})
for k in schema.get("required", []):
if k not in val:
errors.append(f"missing required {path + '.' + k if path else k}")
for k, v in val.items():
if k in props:
errors.extend(self._validate(v, props[k], path + '.' + k if path else k))
if t == "array" and "items" in schema:
for i, item in enumerate(val):
errors.extend(self._validate(item, schema["items"], f"{path}[{i}]" if path else f"[{i}]"))
return errors
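The recursive validator above checks values against a subset of JSON Schema (type, enum, ranges, required keys). A standalone sketch of the same logic, trimmed to types/enums/required so its behavior is easy to see (the real implementation lives on the `Tool` base class):

```python
TYPE_MAP = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "array": list, "object": dict,
}

def validate(val, schema, path=""):
    # Type mismatch short-circuits, matching the method above.
    t, label = schema.get("type"), path or "parameter"
    if t in TYPE_MAP and not isinstance(val, TYPE_MAP[t]):
        return [f"{label} should be {t}"]
    errors = []
    if "enum" in schema and val not in schema["enum"]:
        errors.append(f"{label} must be one of {schema['enum']}")
    if t == "object":
        props = schema.get("properties", {})
        for k in schema.get("required", []):
            if k not in val:
                errors.append(f"missing required {path + '.' + k if path else k}")
        for k, v in val.items():
            if k in props:
                errors.extend(validate(v, props[k], path + '.' + k if path else k))
    return errors

schema = {
    "type": "object",
    "properties": {
        "action": {"type": "string", "enum": ["add", "list", "remove"]},
        "every_seconds": {"type": "integer"},
    },
    "required": ["action"],
}
print(validate({"action": "add", "every_seconds": "60"}, schema))
# ['every_seconds should be integer']
print(validate({}, schema))
# ['missing required action']
```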
def to_schema(self) -> dict[str, Any]:
"""Convert tool to OpenAI function schema format."""
return {

114
nanobot/agent/tools/cron.py Normal file
View File

@ -0,0 +1,114 @@
"""Cron tool for scheduling reminders and tasks."""
from typing import Any
from nanobot.agent.tools.base import Tool
from nanobot.cron.service import CronService
from nanobot.cron.types import CronSchedule
class CronTool(Tool):
"""Tool to schedule reminders and recurring tasks."""
def __init__(self, cron_service: CronService):
self._cron = cron_service
self._channel = ""
self._chat_id = ""
def set_context(self, channel: str, chat_id: str) -> None:
"""Set the current session context for delivery."""
self._channel = channel
self._chat_id = chat_id
@property
def name(self) -> str:
return "cron"
@property
def description(self) -> str:
return "Schedule reminders and recurring tasks. Actions: add, list, remove."
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"action": {
"type": "string",
"enum": ["add", "list", "remove"],
"description": "Action to perform"
},
"message": {
"type": "string",
"description": "Reminder message (for add)"
},
"every_seconds": {
"type": "integer",
"description": "Interval in seconds (for recurring tasks)"
},
"cron_expr": {
"type": "string",
"description": "Cron expression like '0 9 * * *' (for scheduled tasks)"
},
"job_id": {
"type": "string",
"description": "Job ID (for remove)"
}
},
"required": ["action"]
}
async def execute(
self,
action: str,
message: str = "",
every_seconds: int | None = None,
cron_expr: str | None = None,
job_id: str | None = None,
**kwargs: Any
) -> str:
if action == "add":
return self._add_job(message, every_seconds, cron_expr)
elif action == "list":
return self._list_jobs()
elif action == "remove":
return self._remove_job(job_id)
return f"Unknown action: {action}"
def _add_job(self, message: str, every_seconds: int | None, cron_expr: str | None) -> str:
if not message:
return "Error: message is required for add"
if not self._channel or not self._chat_id:
return "Error: no session context (channel/chat_id)"
# Build schedule
if every_seconds:
schedule = CronSchedule(kind="every", every_ms=every_seconds * 1000)
elif cron_expr:
schedule = CronSchedule(kind="cron", expr=cron_expr)
else:
return "Error: either every_seconds or cron_expr is required"
job = self._cron.add_job(
name=message[:30],
schedule=schedule,
message=message,
deliver=True,
channel=self._channel,
to=self._chat_id,
)
return f"Created job '{job.name}' (id: {job.id})"
def _list_jobs(self) -> str:
jobs = self._cron.list_jobs()
if not jobs:
return "No scheduled jobs."
lines = [f"- {j.name} (id: {j.id}, {j.schedule.kind})" for j in jobs]
return "Scheduled jobs:\n" + "\n".join(lines)
def _remove_job(self, job_id: str | None) -> str:
if not job_id:
return "Error: job_id is required for remove"
if self._cron.remove_job(job_id):
return f"Removed job {job_id}"
return f"Job {job_id} not found"
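`_add_job` above dispatches on whichever schedule argument is provided: an interval in seconds wins, then a cron expression, otherwise an error. A standalone sketch of that dispatch, with `CronSchedule` replaced by a plain dict for illustration:

```python
def build_schedule(every_seconds=None, cron_expr=None):
    # Interval takes priority, matching CronTool._add_job.
    if every_seconds:
        return {"kind": "every", "every_ms": every_seconds * 1000}
    if cron_expr:
        return {"kind": "cron", "expr": cron_expr}
    raise ValueError("either every_seconds or cron_expr is required")

print(build_schedule(every_seconds=60))
# {'kind': 'every', 'every_ms': 60000}
print(build_schedule(cron_expr="0 9 * * *"))
# {'kind': 'cron', 'expr': '0 9 * * *'}
```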

View File

@ -54,6 +54,9 @@ class ToolRegistry:
return f"Error: Tool '{name}' not found"
try:
errors = tool.validate_params(params)
if errors:
return f"Error: Invalid parameters for tool '{name}': " + "; ".join(errors)
return await tool.execute(**params)
except Exception as e:
return f"Error executing {name}: {str(e)}"

View File

@ -2,6 +2,8 @@
import asyncio
import os
import re
from pathlib import Path
from typing import Any
from nanobot.agent.tools.base import Tool
@ -10,9 +12,28 @@ from nanobot.agent.tools.base import Tool
class ExecTool(Tool):
"""Tool to execute shell commands."""
def __init__(self, timeout: int = 60, working_dir: str | None = None):
def __init__(
self,
timeout: int = 60,
working_dir: str | None = None,
deny_patterns: list[str] | None = None,
allow_patterns: list[str] | None = None,
restrict_to_workspace: bool = False,
):
self.timeout = timeout
self.working_dir = working_dir
self.deny_patterns = deny_patterns or [
r"\brm\s+-[rf]{1,2}\b", # rm -r, rm -rf, rm -fr
r"\bdel\s+/[fq]\b", # del /f, del /q
r"\brmdir\s+/s\b", # rmdir /s
r"\b(format|mkfs|diskpart)\b", # disk operations
r"\bdd\s+if=", # dd
r">\s*/dev/sd", # write to disk
r"\b(shutdown|reboot|poweroff)\b", # system power
r":\(\)\s*\{.*\};\s*:", # fork bomb
]
self.allow_patterns = allow_patterns or []
self.restrict_to_workspace = restrict_to_workspace
@property
def name(self) -> str:
@ -41,6 +62,9 @@ class ExecTool(Tool):
async def execute(self, command: str, working_dir: str | None = None, **kwargs: Any) -> str:
cwd = working_dir or self.working_dir or os.getcwd()
guard_error = self._guard_command(command, cwd)
if guard_error:
return guard_error
try:
process = await asyncio.create_subprocess_shell(
@ -83,3 +107,35 @@ class ExecTool(Tool):
except Exception as e:
return f"Error executing command: {str(e)}"
def _guard_command(self, command: str, cwd: str) -> str | None:
"""Best-effort safety guard for potentially destructive commands."""
cmd = command.strip()
lower = cmd.lower()
for pattern in self.deny_patterns:
if re.search(pattern, lower):
return "Error: Command blocked by safety guard (dangerous pattern detected)"
if self.allow_patterns:
if not any(re.search(p, lower) for p in self.allow_patterns):
return "Error: Command blocked by safety guard (not in allowlist)"
if self.restrict_to_workspace:
if "..\\" in cmd or "../" in cmd:
return "Error: Command blocked by safety guard (path traversal detected)"
cwd_path = Path(cwd).resolve()
win_paths = re.findall(r"[A-Za-z]:\\[^\\\"']+", cmd)
posix_paths = re.findall(r"/[^\s\"']+", cmd)
for raw in win_paths + posix_paths:
try:
p = Path(raw).resolve()
except Exception:
continue
if cwd_path not in p.parents and p != cwd_path:
return "Error: Command blocked by safety guard (path outside working dir)"
return None
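The deny list checked by `_guard_command` above is a set of regexes matched against the lowercased command. A standalone sketch exercising a few of the same patterns (subset only, for illustration):

```python
import re

# Three of the deny patterns from the diff above.
DENY = [
    r"\brm\s+-[rf]{1,2}\b",             # rm -r, rm -rf, rm -fr
    r"\bdd\s+if=",                      # raw dd writes
    r"\b(shutdown|reboot|poweroff)\b",  # system power commands
]

def blocked(command: str) -> bool:
    lower = command.strip().lower()
    return any(re.search(p, lower) for p in DENY)

print(blocked("rm -rf /tmp/x"))  # True
print(blocked("sudo reboot"))    # True
print(blocked("ls -la"))         # False
```

Matching on the lowercased string keeps the guard case-insensitive (`RM -RF` is still caught), though as the docstring notes this is best-effort, not a sandbox.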

View File

@ -5,6 +5,7 @@ import json
import os
import re
from typing import Any
from urllib.parse import urlparse
import httpx
@ -12,6 +13,7 @@ from nanobot.agent.tools.base import Tool
# Shared constants
USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_2) AppleWebKit/537.36"
MAX_REDIRECTS = 5  # Limit redirects to prevent DoS attacks
def _strip_tags(text: str) -> str:
@ -28,6 +30,19 @@ def _normalize(text: str) -> str:
return re.sub(r'\n{3,}', '\n\n', text).strip()
def _validate_url(url: str) -> tuple[bool, str]:
"""Validate URL: must be http(s) with valid domain."""
try:
p = urlparse(url)
if p.scheme not in ('http', 'https'):
return False, f"Only http/https allowed, got '{p.scheme or 'none'}'"
if not p.netloc:
return False, "Missing domain"
return True, ""
except Exception as e:
return False, str(e)
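The URL check above rejects anything that is not http(s) or lacks a host, which blocks `file://` and bare strings before any request is made. A standalone copy of the same check:

```python
from urllib.parse import urlparse

# Same rules as _validate_url in the diff above.
def validate_url(url: str) -> tuple[bool, str]:
    p = urlparse(url)
    if p.scheme not in ("http", "https"):
        return False, f"Only http/https allowed, got '{p.scheme or 'none'}'"
    if not p.netloc:
        return False, "Missing domain"
    return True, ""

print(validate_url("https://example.com/page"))  # (True, '')
print(validate_url("file:///etc/passwd"))        # (False, "Only http/https allowed, got 'file'")
```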
class WebSearchTool(Tool):
"""Search the web using Brave Search API."""
@ -98,9 +113,18 @@ class WebFetchTool(Tool):
max_chars = maxChars or self.max_chars
# Validate URL before fetching
is_valid, error_msg = _validate_url(url)
if not is_valid:
return json.dumps({"error": f"URL validation failed: {error_msg}", "url": url})
try:
async with httpx.AsyncClient() as client:
r = await client.get(url, headers={"User-Agent": USER_AGENT}, follow_redirects=True, timeout=30.0)
async with httpx.AsyncClient(
follow_redirects=True,
max_redirects=MAX_REDIRECTS,
timeout=30.0
) as client:
r = await client.get(url, headers={"User-Agent": USER_AGENT})
r.raise_for_status()
ctype = r.headers.get("content-type", "")

263
nanobot/channels/feishu.py Normal file
View File

@ -0,0 +1,263 @@
"""Feishu/Lark channel implementation using lark-oapi SDK with WebSocket long connection."""
import asyncio
import json
import threading
from collections import OrderedDict
from typing import Any
from loguru import logger
from nanobot.bus.events import OutboundMessage
from nanobot.bus.queue import MessageBus
from nanobot.channels.base import BaseChannel
from nanobot.config.schema import FeishuConfig
try:
import lark_oapi as lark
from lark_oapi.api.im.v1 import (
CreateMessageRequest,
CreateMessageRequestBody,
CreateMessageReactionRequest,
CreateMessageReactionRequestBody,
Emoji,
P2ImMessageReceiveV1,
)
FEISHU_AVAILABLE = True
except ImportError:
FEISHU_AVAILABLE = False
lark = None
Emoji = None
# Message type display mapping
MSG_TYPE_MAP = {
"image": "[image]",
"audio": "[audio]",
"file": "[file]",
"sticker": "[sticker]",
}
class FeishuChannel(BaseChannel):
"""
Feishu/Lark channel using WebSocket long connection.
Uses WebSocket to receive events - no public IP or webhook required.
Requires:
- App ID and App Secret from Feishu Open Platform
- Bot capability enabled
- Event subscription enabled (im.message.receive_v1)
"""
name = "feishu"
def __init__(self, config: FeishuConfig, bus: MessageBus):
super().__init__(config, bus)
self.config: FeishuConfig = config
self._client: Any = None
self._ws_client: Any = None
self._ws_thread: threading.Thread | None = None
self._processed_message_ids: OrderedDict[str, None] = OrderedDict() # Ordered dedup cache
self._loop: asyncio.AbstractEventLoop | None = None
async def start(self) -> None:
"""Start the Feishu bot with WebSocket long connection."""
if not FEISHU_AVAILABLE:
logger.error("Feishu SDK not installed. Run: pip install lark-oapi")
return
if not self.config.app_id or not self.config.app_secret:
logger.error("Feishu app_id and app_secret not configured")
return
self._running = True
self._loop = asyncio.get_running_loop()
# Create Lark client for sending messages
self._client = lark.Client.builder() \
.app_id(self.config.app_id) \
.app_secret(self.config.app_secret) \
.log_level(lark.LogLevel.INFO) \
.build()
# Create event handler (only register message receive, ignore other events)
event_handler = lark.EventDispatcherHandler.builder(
self.config.encrypt_key or "",
self.config.verification_token or "",
).register_p2_im_message_receive_v1(
self._on_message_sync
).build()
# Create WebSocket client for long connection
self._ws_client = lark.ws.Client(
self.config.app_id,
self.config.app_secret,
event_handler=event_handler,
log_level=lark.LogLevel.INFO
)
# Start WebSocket client in a separate thread
def run_ws():
try:
self._ws_client.start()
except Exception as e:
logger.error(f"Feishu WebSocket error: {e}")
self._ws_thread = threading.Thread(target=run_ws, daemon=True)
self._ws_thread.start()
logger.info("Feishu bot started with WebSocket long connection")
logger.info("No public IP required - using WebSocket to receive events")
# Keep running until stopped
while self._running:
await asyncio.sleep(1)
async def stop(self) -> None:
"""Stop the Feishu bot."""
self._running = False
if self._ws_client:
try:
self._ws_client.stop()
except Exception as e:
logger.warning(f"Error stopping WebSocket client: {e}")
logger.info("Feishu bot stopped")
def _add_reaction_sync(self, message_id: str, emoji_type: str) -> None:
"""Sync helper for adding reaction (runs in thread pool)."""
try:
request = CreateMessageReactionRequest.builder() \
.message_id(message_id) \
.request_body(
CreateMessageReactionRequestBody.builder()
.reaction_type(Emoji.builder().emoji_type(emoji_type).build())
.build()
).build()
response = self._client.im.v1.message_reaction.create(request)
if not response.success():
logger.warning(f"Failed to add reaction: code={response.code}, msg={response.msg}")
else:
logger.debug(f"Added {emoji_type} reaction to message {message_id}")
except Exception as e:
logger.warning(f"Error adding reaction: {e}")
async def _add_reaction(self, message_id: str, emoji_type: str = "THUMBSUP") -> None:
"""
        Add a reaction emoji to a message (non-blocking).
        Common emoji types: THUMBSUP, OK, EYES, DONE, OnIt, HEART
        """
        if not self._client or not Emoji:
            return
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, self._add_reaction_sync, message_id, emoji_type)

    async def send(self, msg: OutboundMessage) -> None:
        """Send a message through Feishu."""
        if not self._client:
            logger.warning("Feishu client not initialized")
            return
        try:
            # Determine receive_id_type based on chat_id format:
            # open_id starts with "ou_", chat_id starts with "oc_"
            if msg.chat_id.startswith("oc_"):
                receive_id_type = "chat_id"
            else:
                receive_id_type = "open_id"

            # Build text message content
            content = json.dumps({"text": msg.content})
            request = CreateMessageRequest.builder() \
                .receive_id_type(receive_id_type) \
                .request_body(
                    CreateMessageRequestBody.builder()
                    .receive_id(msg.chat_id)
                    .msg_type("text")
                    .content(content)
                    .build()
                ).build()

            response = self._client.im.v1.message.create(request)
            if not response.success():
                logger.error(
                    f"Failed to send Feishu message: code={response.code}, "
                    f"msg={response.msg}, log_id={response.get_log_id()}"
                )
            else:
                logger.debug(f"Feishu message sent to {msg.chat_id}")
        except Exception as e:
            logger.error(f"Error sending Feishu message: {e}")

    def _on_message_sync(self, data: "P2ImMessageReceiveV1") -> None:
        """
        Sync handler for incoming messages (called from WebSocket thread).
        Schedules async handling in the main event loop.
        """
        if self._loop and self._loop.is_running():
            asyncio.run_coroutine_threadsafe(self._on_message(data), self._loop)

    async def _on_message(self, data: "P2ImMessageReceiveV1") -> None:
        """Handle incoming message from Feishu."""
        try:
            event = data.event
            message = event.message
            sender = event.sender

            # Deduplication check
            message_id = message.message_id
            if message_id in self._processed_message_ids:
                return
            self._processed_message_ids[message_id] = None
            # Trim cache: evict oldest entries once the cache exceeds 1000
            while len(self._processed_message_ids) > 1000:
                self._processed_message_ids.popitem(last=False)

            # Skip bot messages
            sender_type = sender.sender_type
            if sender_type == "bot":
                return

            sender_id = sender.sender_id.open_id if sender.sender_id else "unknown"
            chat_id = message.chat_id
            chat_type = message.chat_type  # "p2p" or "group"
            msg_type = message.message_type

            # Add reaction to indicate "seen"
            await self._add_reaction(message_id, "THUMBSUP")

            # Parse message content
            if msg_type == "text":
                try:
                    content = json.loads(message.content).get("text", "")
                except json.JSONDecodeError:
                    content = message.content or ""
            else:
                content = MSG_TYPE_MAP.get(msg_type, f"[{msg_type}]")
            if not content:
                return

            # Forward to message bus
            reply_to = chat_id if chat_type == "group" else sender_id
            await self._handle_message(
                sender_id=sender_id,
                chat_id=reply_to,
                content=content,
                metadata={
                    "message_id": message_id,
                    "chat_type": chat_type,
                    "msg_type": msg_type,
                },
            )
        except Exception as e:
            logger.error(f"Error processing Feishu message: {e}")
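The deduplication check in `_on_message` above follows a common bounded-FIFO cache pattern. A standalone sketch of that pattern (assuming `_processed_message_ids` is an `OrderedDict`, as the `popitem(last=False)` call implies):

```python
from collections import OrderedDict

# Bounded "seen ids" cache: insertion-ordered dict, evict oldest beyond a cap.
seen: OrderedDict[str, None] = OrderedDict()
CAP = 1000


def is_duplicate(message_id: str) -> bool:
    """Return True if the id was seen before; otherwise record it."""
    if message_id in seen:
        return True
    seen[message_id] = None
    while len(seen) > CAP:
        seen.popitem(last=False)  # drop the oldest entry (FIFO eviction)
    return False
```

Membership checks and eviction are both O(1), which is why an `OrderedDict` is preferred here over a list or deque scan.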

View File

@@ -67,6 +67,17 @@ class ChannelManager:
         except ImportError as e:
             logger.warning(f"Discord channel not available: {e}")
 
+        # Feishu channel
+        if self.config.channels.feishu.enabled:
+            try:
+                from nanobot.channels.feishu import FeishuChannel
+                self.channels["feishu"] = FeishuChannel(
+                    self.config.channels.feishu, self.bus
+                )
+                logger.info("Feishu channel enabled")
+            except ImportError as e:
+                logger.warning(f"Feishu channel not available: {e}")
+
     async def start_all(self) -> None:
         """Start WhatsApp channel and the outbound dispatcher."""
         if not self.channels:

View File

@@ -195,35 +195,40 @@ def gateway(
         default_model=config.agents.defaults.model
     )
 
-    # Create agent
+    # Create cron service first (callback set after agent creation)
+    cron_store_path = get_data_dir() / "cron" / "jobs.json"
+    cron = CronService(cron_store_path)
+
+    # Create agent with cron service
     agent = AgentLoop(
         bus=bus,
         provider=provider,
         workspace=config.workspace_path,
         model=config.agents.defaults.model,
         max_iterations=config.agents.defaults.max_tool_iterations,
-        brave_api_key=config.tools.web.search.api_key or None
+        brave_api_key=config.tools.web.search.api_key or None,
+        exec_config=config.tools.exec,
+        cron_service=cron,
     )
 
-    # Create cron service
+    # Set cron callback (needs agent)
     async def on_cron_job(job: CronJob) -> str | None:
        """Execute a cron job through the agent."""
         response = await agent.process_direct(
             job.payload.message,
-            session_key=f"cron:{job.id}"
+            session_key=f"cron:{job.id}",
+            channel=job.payload.channel or "cli",
+            chat_id=job.payload.to or "direct",
         )
-        # Optionally deliver to channel
         if job.payload.deliver and job.payload.to:
             from nanobot.bus.events import OutboundMessage
             await bus.publish_outbound(OutboundMessage(
-                channel=job.payload.channel or "whatsapp",
+                channel=job.payload.channel or "cli",
                 chat_id=job.payload.to,
                 content=response or ""
             ))
         return response
 
-    cron_store_path = get_data_dir() / "cron" / "jobs.json"
-    cron = CronService(cron_store_path, on_job=on_cron_job)
+    cron.on_job = on_cron_job
 
     # Create heartbeat service
     async def on_heartbeat(prompt: str) -> str:
@@ -309,7 +314,8 @@ def agent(
         bus=bus,
         provider=provider,
         workspace=config.workspace_path,
-        brave_api_key=config.tools.web.search.api_key or None
+        brave_api_key=config.tools.web.search.api_key or None,
+        exec_config=config.tools.exec,
     )
 
     if message:
@@ -405,7 +411,7 @@ def _get_bridge_dir() -> Path:
         raise typer.Exit(1)
 
     # Find source bridge: first check package data, then source dir
-    pkg_bridge = Path(__file__).parent / "bridge"  # nanobot/bridge (installed)
+    pkg_bridge = Path(__file__).parent.parent / "bridge"  # nanobot/bridge (installed)
     src_bridge = Path(__file__).parent.parent.parent / "bridge"  # repo root/bridge (dev)
 
     source = None
@@ -629,10 +635,10 @@ def cron_run(
 def status():
     """Show nanobot status."""
     from nanobot.config.loader import load_config, get_config_path
-    from nanobot.utils.helpers import get_workspace_path
 
     config_path = get_config_path()
-    workspace = get_workspace_path()
+    config = load_config()
+    workspace = config.workspace_path
 
     console.print(f"{__logo__} nanobot Status\n")
@@ -640,7 +646,6 @@ def status():
     console.print(f"Workspace: {workspace} {'[green]✓[/green]' if workspace.exists() else '[red]✗[/red]'}")
 
     if config_path.exists():
-        config = load_config()
         console.print(f"Model: {config.agents.defaults.model}")
 
         # Check API keys

View File

@@ -17,6 +17,17 @@ class TelegramConfig(BaseModel):
     enabled: bool = False
     token: str = ""  # Bot token from @BotFather
     allow_from: list[str] = Field(default_factory=list)  # Allowed user IDs or usernames
+    proxy: str | None = None  # HTTP/SOCKS5 proxy URL, e.g. "http://127.0.0.1:7890" or "socks5://127.0.0.1:1080"
+
+
+class FeishuConfig(BaseModel):
+    """Feishu/Lark channel configuration using WebSocket long connection."""
+    enabled: bool = False
+    app_id: str = ""  # App ID from Feishu Open Platform
+    app_secret: str = ""  # App Secret from Feishu Open Platform
+    encrypt_key: str = ""  # Encrypt Key for event subscription (optional)
+    verification_token: str = ""  # Verification Token for event subscription (optional)
+    allow_from: list[str] = Field(default_factory=list)  # Allowed user open_ids
 
 
 class DiscordConfig(BaseModel):
@@ -33,6 +44,7 @@ class ChannelsConfig(BaseModel):
     whatsapp: WhatsAppConfig = Field(default_factory=WhatsAppConfig)
     telegram: TelegramConfig = Field(default_factory=TelegramConfig)
     discord: DiscordConfig = Field(default_factory=DiscordConfig)
+    feishu: FeishuConfig = Field(default_factory=FeishuConfig)
 
 
 class AgentDefaults(BaseModel):
@@ -60,6 +72,7 @@ class ProvidersConfig(BaseModel):
     anthropic: ProviderConfig = Field(default_factory=ProviderConfig)
     openai: ProviderConfig = Field(default_factory=ProviderConfig)
     openrouter: ProviderConfig = Field(default_factory=ProviderConfig)
+    deepseek: ProviderConfig = Field(default_factory=ProviderConfig)
     groq: ProviderConfig = Field(default_factory=ProviderConfig)
     zhipu: ProviderConfig = Field(default_factory=ProviderConfig)
     vllm: ProviderConfig = Field(default_factory=ProviderConfig)
@@ -83,9 +96,16 @@ class WebToolsConfig(BaseModel):
     search: WebSearchConfig = Field(default_factory=WebSearchConfig)
 
 
+class ExecToolConfig(BaseModel):
+    """Shell exec tool configuration."""
+    timeout: int = 60
+    restrict_to_workspace: bool = False  # If true, block commands accessing paths outside workspace
+
+
 class ToolsConfig(BaseModel):
     """Tools configuration."""
     web: WebToolsConfig = Field(default_factory=WebToolsConfig)
+    exec: ExecToolConfig = Field(default_factory=ExecToolConfig)
 
 
 class Config(BaseSettings):
@@ -102,9 +122,10 @@ class Config(BaseSettings):
         return Path(self.agents.defaults.workspace).expanduser()
 
     def get_api_key(self) -> str | None:
-        """Get API key in priority order: OpenRouter > Anthropic > OpenAI > Gemini > Zhipu > Groq > vLLM."""
+        """Get API key in priority order: OpenRouter > DeepSeek > Anthropic > OpenAI > Gemini > Zhipu > Groq > vLLM."""
         return (
             self.providers.openrouter.api_key or
+            self.providers.deepseek.api_key or
             self.providers.anthropic.api_key or
             self.providers.openai.api_key or
             self.providers.gemini.api_key or
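For context, selecting the new DeepSeek provider slot from user config would look something like this — a sketch mirroring the JSON layout of the README example; the key casing and model id are illustrative, not confirmed by this commit:

```json
{
  "providers": {
    "deepseek": {
      "apiKey": "sk-xxx"
    }
  },
  "agents": {
    "defaults": {
      "model": "deepseek/deepseek-chat"
    }
  }
}
```

Since DeepSeek now sits second in the fallback chain, this key is used whenever no OpenRouter key is configured.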

View File

@@ -115,7 +115,7 @@ class HeartbeatService:
             response = await self.on_heartbeat(HEARTBEAT_PROMPT)
 
             # Check if agent said "nothing to do"
-            if HEARTBEAT_OK_TOKEN in response.upper().replace("_", ""):
+            if HEARTBEAT_OK_TOKEN.replace("_", "") in response.upper().replace("_", ""):
                 logger.info("Heartbeat: OK (no action needed)")
             else:
                 logger.info(f"Heartbeat: completed task")

View File

@ -43,6 +43,8 @@ class LiteLLMProvider(LLMProvider):
elif self.is_vllm: elif self.is_vllm:
# vLLM/custom endpoint - uses OpenAI-compatible API # vLLM/custom endpoint - uses OpenAI-compatible API
os.environ["OPENAI_API_KEY"] = api_key os.environ["OPENAI_API_KEY"] = api_key
elif "deepseek" in default_model:
os.environ.setdefault("DEEPSEEK_API_KEY", api_key)
elif "anthropic" in default_model: elif "anthropic" in default_model:
os.environ.setdefault("ANTHROPIC_API_KEY", api_key) os.environ.setdefault("ANTHROPIC_API_KEY", api_key)
elif "openai" in default_model or "gpt" in default_model: elif "openai" in default_model or "gpt" in default_model:
@ -88,13 +90,13 @@ class LiteLLMProvider(LLMProvider):
model = f"openrouter/{model}" model = f"openrouter/{model}"
# For Zhipu/Z.ai, ensure prefix is present # For Zhipu/Z.ai, ensure prefix is present
# Handle cases like "glm-4.7-flash" -> "zhipu/glm-4.7-flash" # Handle cases like "glm-4.7-flash" -> "zai/glm-4.7-flash"
if ("glm" in model.lower() or "zhipu" in model.lower()) and not ( if ("glm" in model.lower() or "zhipu" in model.lower()) and not (
model.startswith("zhipu/") or model.startswith("zhipu/") or
model.startswith("zai/") or model.startswith("zai/") or
model.startswith("openrouter/") model.startswith("openrouter/")
): ):
model = f"zhipu/{model}" model = f"zai/{model}"
# For vLLM, use hosted_vllm/ prefix per LiteLLM docs # For vLLM, use hosted_vllm/ prefix per LiteLLM docs
# Convert openai/ prefix to hosted_vllm/ if user specified it # Convert openai/ prefix to hosted_vllm/ if user specified it

View File

@@ -0,0 +1,40 @@
---
name: cron
description: Schedule reminders and recurring tasks.
---
# Cron
Use the `cron` tool to schedule reminders or recurring tasks.
## Two Modes
1. **Reminder** - message is sent directly to user
2. **Task** - message is a task description, agent executes and sends result
## Examples
Fixed reminder:
```
cron(action="add", message="Time to take a break!", every_seconds=1200)
```
Dynamic task (agent executes each time):
```
cron(action="add", message="Check HKUDS/nanobot GitHub stars and report", every_seconds=600)
```
List/remove:
```
cron(action="list")
cron(action="remove", job_id="abc123")
```
## Time Expressions
| User says | Parameters |
|-----------|------------|
| every 20 minutes | every_seconds: 1200 |
| every hour | every_seconds: 3600 |
| every day at 8am | cron_expr: "0 8 * * *" |
| weekdays at 5pm | cron_expr: "0 17 * * 1-5" |
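The `cron_expr` rows in the table combine with `add` the same way `every_seconds` does — for example, a daily 8am task (the message text here is illustrative):

```
cron(action="add", message="Check the weather and send a summary", cron_expr="0 8 * * *")
```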

View File

@@ -9,9 +9,9 @@ This skill provides guidance for creating effective skills.
 
 ## About Skills
 
-Skills are modular, self-contained packages that extend Codex's capabilities by providing
+Skills are modular, self-contained packages that extend the agent's capabilities by providing
 specialized knowledge, workflows, and tools. Think of them as "onboarding guides" for specific
-domains or tasks—they transform Codex from a general-purpose agent into a specialized agent
+domains or tasks—they transform the agent from a general-purpose agent into a specialized agent
 equipped with procedural knowledge that no model can fully possess.
 
 ### What Skills Provide
@@ -25,9 +25,9 @@ equipped with procedural knowledge that no model can fully possess.
 
 ### Concise is Key
 
-The context window is a public good. Skills share the context window with everything else Codex needs: system prompt, conversation history, other Skills' metadata, and the actual user request.
+The context window is a public good. Skills share the context window with everything else the agent needs: system prompt, conversation history, other Skills' metadata, and the actual user request.
 
-**Default assumption: Codex is already very smart.** Only add context Codex doesn't already have. Challenge each piece of information: "Does Codex really need this explanation?" and "Does this paragraph justify its token cost?"
+**Default assumption: the agent is already very smart.** Only add context the agent doesn't already have. Challenge each piece of information: "Does the agent really need this explanation?" and "Does this paragraph justify its token cost?"
 
 Prefer concise examples over verbose explanations.
@@ -41,7 +41,7 @@ Match the level of specificity to the task's fragility and variability:
 
 **Low freedom (specific scripts, few parameters)**: Use when operations are fragile and error-prone, consistency is critical, or a specific sequence must be followed.
 
-Think of Codex as exploring a path: a narrow bridge with cliffs needs specific guardrails (low freedom), while an open field allows many routes (high freedom).
+Think of the agent as exploring a path: a narrow bridge with cliffs needs specific guardrails (low freedom), while an open field allows many routes (high freedom).
 
 ### Anatomy of a Skill
@@ -64,7 +64,7 @@ skill-name/
 
 Every SKILL.md consists of:
 
-- **Frontmatter** (YAML): Contains `name` and `description` fields. These are the only fields that Codex reads to determine when the skill gets used, thus it is very important to be clear and comprehensive in describing what the skill is, and when it should be used.
+- **Frontmatter** (YAML): Contains `name` and `description` fields. These are the only fields that the agent reads to determine when the skill gets used, thus it is very important to be clear and comprehensive in describing what the skill is, and when it should be used.
 - **Body** (Markdown): Instructions and guidance for using the skill. Only loaded AFTER the skill triggers (if at all).
 
 #### Bundled Resources (optional)
@@ -76,27 +76,27 @@ Executable code (Python/Bash/etc.) for tasks that require deterministic reliabil
 
 - **When to include**: When the same code is being rewritten repeatedly or deterministic reliability is needed
 - **Example**: `scripts/rotate_pdf.py` for PDF rotation tasks
 - **Benefits**: Token efficient, deterministic, may be executed without loading into context
-- **Note**: Scripts may still need to be read by Codex for patching or environment-specific adjustments
+- **Note**: Scripts may still need to be read by the agent for patching or environment-specific adjustments
 
 ##### References (`references/`)
 
-Documentation and reference material intended to be loaded as needed into context to inform Codex's process and thinking.
+Documentation and reference material intended to be loaded as needed into context to inform the agent's process and thinking.
 
-- **When to include**: For documentation that Codex should reference while working
+- **When to include**: For documentation that the agent should reference while working
 - **Examples**: `references/finance.md` for financial schemas, `references/mnda.md` for company NDA template, `references/policies.md` for company policies, `references/api_docs.md` for API specifications
 - **Use cases**: Database schemas, API documentation, domain knowledge, company policies, detailed workflow guides
-- **Benefits**: Keeps SKILL.md lean, loaded only when Codex determines it's needed
+- **Benefits**: Keeps SKILL.md lean, loaded only when the agent determines it's needed
 - **Best practice**: If files are large (>10k words), include grep search patterns in SKILL.md
 - **Avoid duplication**: Information should live in either SKILL.md or references files, not both. Prefer references files for detailed information unless it's truly core to the skill—this keeps SKILL.md lean while making information discoverable without hogging the context window. Keep only essential procedural instructions and workflow guidance in SKILL.md; move detailed reference material, schemas, and examples to references files.
 
 ##### Assets (`assets/`)
 
-Files not intended to be loaded into context, but rather used within the output Codex produces.
+Files not intended to be loaded into context, but rather used within the output the agent produces.
 
 - **When to include**: When the skill needs files that will be used in the final output
 - **Examples**: `assets/logo.png` for brand assets, `assets/slides.pptx` for PowerPoint templates, `assets/frontend-template/` for HTML/React boilerplate, `assets/font.ttf` for typography
 - **Use cases**: Templates, images, icons, boilerplate code, fonts, sample documents that get copied or modified
-- **Benefits**: Separates output resources from documentation, enables Codex to use files without loading them into context
+- **Benefits**: Separates output resources from documentation, enables the agent to use files without loading them into context
 
 #### What to Not Include in a Skill
@@ -116,7 +116,7 @@ Skills use a three-level loading system to manage context efficiently:
 
 1. **Metadata (name + description)** - Always in context (~100 words)
 2. **SKILL.md body** - When skill triggers (<5k words)
-3. **Bundled resources** - As needed by Codex (Unlimited because scripts can be executed without reading into context window)
+3. **Bundled resources** - As needed by the agent (Unlimited because scripts can be executed without reading into context window)
 
 #### Progressive Disclosure Patterns
@@ -141,7 +141,7 @@ Extract text with pdfplumber:
 - **Examples**: See [EXAMPLES.md](EXAMPLES.md) for common patterns
 ```
 
-Codex loads FORMS.md, REFERENCE.md, or EXAMPLES.md only when needed.
+The agent loads FORMS.md, REFERENCE.md, or EXAMPLES.md only when needed.
 
 **Pattern 2: Domain-specific organization**
@@ -157,7 +157,7 @@ bigquery-skill/
 └── marketing.md (campaigns, attribution)
 ```
 
-When a user asks about sales metrics, Codex only reads sales.md.
+When a user asks about sales metrics, the agent only reads sales.md.
 
 Similarly, for skills supporting multiple frameworks or variants, organize by variant:
@@ -170,7 +170,7 @@ cloud-deploy/
 └── azure.md (Azure deployment patterns)
 ```
 
-When the user chooses AWS, Codex only reads aws.md.
+When the user chooses AWS, the agent only reads aws.md.
 
 **Pattern 3: Conditional details**
@@ -191,12 +191,12 @@ For simple edits, modify the XML directly.
 
 **For OOXML details**: See [OOXML.md](OOXML.md)
 ```
 
-Codex reads REDLINING.md or OOXML.md only when the user needs those features.
+The agent reads REDLINING.md or OOXML.md only when the user needs those features.
 
 **Important guidelines:**
 
 - **Avoid deeply nested references** - Keep references one level deep from SKILL.md. All reference files should link directly from SKILL.md.
-- **Structure longer reference files** - For files longer than 100 lines, include a table of contents at the top so Codex can see the full scope when previewing.
+- **Structure longer reference files** - For files longer than 100 lines, include a table of contents at the top so the agent can see the full scope when previewing.
 
 ## Skill Creation Process
@@ -293,7 +293,7 @@ After initialization, customize the SKILL.md and add resources as needed. If you
 
 ### Step 4: Edit the Skill
 
-When editing the (newly-generated or existing) skill, remember that the skill is being created for another instance of Codex to use. Include information that would be beneficial and non-obvious to Codex. Consider what procedural knowledge, domain-specific details, or reusable assets would help another Codex instance execute these tasks more effectively.
+When editing the (newly-generated or existing) skill, remember that the skill is being created for another instance of the agent to use. Include information that would be beneficial and non-obvious to the agent. Consider what procedural knowledge, domain-specific details, or reusable assets would help another agent instance execute these tasks more effectively.
 
 #### Learn Proven Design Patterns
@@ -321,10 +321,10 @@ If you used `--examples`, delete any placeholder files that are not needed for t
 Write the YAML frontmatter with `name` and `description`:
 
 - `name`: The skill name
-- `description`: This is the primary triggering mechanism for your skill, and helps Codex understand when to use the skill.
+- `description`: This is the primary triggering mechanism for your skill, and helps the agent understand when to use the skill.
   - Include both what the Skill does and specific triggers/contexts for when to use it.
-  - Include all "when to use" information here - Not in the body. The body is only loaded after triggering, so "When to Use This Skill" sections in the body are not helpful to Codex.
-  - Example description for a `docx` skill: "Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. Use when Codex needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks"
+  - Include all "when to use" information here - Not in the body. The body is only loaded after triggering, so "When to Use This Skill" sections in the body are not helpful to the agent.
+  - Example description for a `docx` skill: "Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. Use when the agent needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks"
 
 Do not include any other fields in YAML frontmatter.

View File

@ -1,6 +1,6 @@
[project] [project]
name = "nanobot-ai" name = "nanobot-ai"
version = "0.1.3.post3" version = "0.1.3.post4"
description = "A lightweight personal AI assistant framework" description = "A lightweight personal AI assistant framework"
requires-python = ">=3.11" requires-python = ">=3.11"
license = {text = "MIT"} license = {text = "MIT"}
@ -32,6 +32,9 @@ dependencies = [
] ]
[project.optional-dependencies] [project.optional-dependencies]
feishu = [
"lark-oapi>=1.0.0",
]
dev = [ dev = [
"pytest>=7.0.0", "pytest>=7.0.0",
"pytest-asyncio>=0.21.0", "pytest-asyncio>=0.21.0",

View File

@@ -0,0 +1,88 @@
from typing import Any

from nanobot.agent.tools.base import Tool
from nanobot.agent.tools.registry import ToolRegistry


class SampleTool(Tool):
    @property
    def name(self) -> str:
        return "sample"

    @property
    def description(self) -> str:
        return "sample tool"

    @property
    def parameters(self) -> dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "query": {"type": "string", "minLength": 2},
                "count": {"type": "integer", "minimum": 1, "maximum": 10},
                "mode": {"type": "string", "enum": ["fast", "full"]},
                "meta": {
                    "type": "object",
                    "properties": {
                        "tag": {"type": "string"},
                        "flags": {
                            "type": "array",
                            "items": {"type": "string"},
                        },
                    },
                    "required": ["tag"],
                },
            },
            "required": ["query", "count"],
        }

    async def execute(self, **kwargs: Any) -> str:
        return "ok"


def test_validate_params_missing_required() -> None:
    tool = SampleTool()
    errors = tool.validate_params({"query": "hi"})
    assert "missing required count" in "; ".join(errors)


def test_validate_params_type_and_range() -> None:
    tool = SampleTool()
    errors = tool.validate_params({"query": "hi", "count": 0})
    assert any("count must be >= 1" in e for e in errors)
    errors = tool.validate_params({"query": "hi", "count": "2"})
    assert any("count should be integer" in e for e in errors)


def test_validate_params_enum_and_min_length() -> None:
    tool = SampleTool()
    errors = tool.validate_params({"query": "h", "count": 2, "mode": "slow"})
    assert any("query must be at least 2 chars" in e for e in errors)
    assert any("mode must be one of" in e for e in errors)


def test_validate_params_nested_object_and_array() -> None:
    tool = SampleTool()
    errors = tool.validate_params(
        {
            "query": "hi",
            "count": 2,
            "meta": {"flags": [1, "ok"]},
        }
    )
    assert any("missing required meta.tag" in e for e in errors)
    assert any("meta.flags[0] should be string" in e for e in errors)


def test_validate_params_ignores_unknown_fields() -> None:
    tool = SampleTool()
    errors = tool.validate_params({"query": "hi", "count": 2, "extra": "x"})
    assert errors == []


async def test_registry_returns_validation_error() -> None:
    reg = ToolRegistry()
    reg.register(SampleTool())
    result = await reg.execute("sample", {"query": "hi"})
    assert "Invalid parameters" in result

View File

@@ -16,6 +16,7 @@ You have access to:
 - Shell commands (exec)
 - Web access (search, fetch)
 - Messaging (message)
+- Background tasks (spawn)
 
 ## Memory

View File

@@ -37,29 +37,31 @@ exec(command: str, working_dir: str = None) -> str
 ```
 
 **Safety Notes:**
-- Commands have a 60-second timeout
-- Dangerous commands are blocked (rm -rf, format, dd, shutdown, etc.)
+- Commands have a configurable timeout (default 60s)
 - Output is truncated at 10,000 characters
-- Use with caution for destructive operations
+- Optional `restrictToWorkspace` config to limit paths
 
 ## Web Access
 
 ### web_search
-Search the web using DuckDuckGo.
+Search the web using Brave Search API.
 ```
-web_search(query: str) -> str
+web_search(query: str, count: int = 5) -> str
 ```
-Returns top 5 search results with titles, URLs, and snippets.
+Returns search results with titles, URLs, and snippets. Requires `tools.web.search.apiKey` in config.
 
 ### web_fetch
 Fetch and extract main content from a URL.
 ```
-web_fetch(url: str) -> str
+web_fetch(url: str, extractMode: str = "markdown", maxChars: int = 50000) -> str
 ```
 **Notes:**
-- Content is extracted using trafilatura
-- Output is truncated at 8,000 characters
+- Content is extracted using readability
+- Supports markdown or plain text extraction
+- Output is truncated at 50,000 characters by default
 
 ## Communication
@@ -69,6 +71,16 @@ Send a message to the user (used internally).
 message(content: str, channel: str = None, chat_id: str = None) -> str
 ```
 
+## Background Tasks
+
+### spawn
+Spawn a subagent to handle a task in the background.
+```
+spawn(task: str, label: str = None) -> str
+```
+Use for complex or time-consuming tasks that can run independently. The subagent will complete the task and report back when done.
+
 ## Scheduled Reminders (Cron)
 
 Use the `exec` tool to create scheduled reminders with `nanobot cron add`: