Why Agent Operations Matters
The trend: agent demos are becoming operated systems
Current agent work is shifting toward observability, human approval, policy, and audit trails. The hard part is no longer proving an agent can act once. It is proving the agent can be watched, paused, corrected, updated, and recovered without losing context or widening access by accident.
OpenClaw is well suited to that style of operation because the Gateway is the control plane for channels, sessions, tools, the Control UI, nodes, and routing. Treat it like infrastructure.
What to observe
- Gateway reachability and WebSocket health
- Channel delivery, backlog, and pairing state
- Agent session ownership and recent tool use
- Provider auth, model routing, and spend surprises
- Security audit findings after config changes
What to control
- Who can message the bot
- Which agents receive which channels
- Which tools are available from non-local senders
- When human approval is required
- How updates and rollback are handled
Daily Operator Loop
Five-minute morning check
Run the fast checks first. They catch most failures without pulling you into log archaeology.
openclaw status
openclaw gateway status
openclaw health --json
openclaw doctor
- Green path: Gateway reachable, expected channels connected, no new critical doctor findings.
- Yellow path: One channel degraded, provider warning, or pending device approval. Fix before widening access.
- Red path: Gateway down, unexpected public exposure, open DM policy, failed auth, or tool permissions broader than intended.
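The four fast checks above can be wrapped in a small helper so the morning pass produces a one-line verdict. This is a minimal sketch, not part of OpenClaw: the function names run_check and morning_check are hypothetical, and it assumes each command exits non-zero when it finds a problem, which you should confirm against your installed version. It cannot tell yellow from red; it only flags which checks need a closer look.

```shell
#!/usr/bin/env bash
# Sketch: run the fast checks in order and summarize pass/fail.
# ASSUMPTION: openclaw commands exit non-zero on a problem (not confirmed here).
# run_check and morning_check are hypothetical helper names.

run_check() {
  # Run one openclaw subcommand quietly and report whether it succeeded.
  if openclaw "$@" >/dev/null 2>&1; then
    echo "ok:   openclaw $*"
  else
    echo "FAIL: openclaw $*"
    return 1
  fi
}

morning_check() {
  local failed=0
  run_check status         || failed=$((failed + 1))
  run_check gateway status || failed=$((failed + 1))
  run_check health --json  || failed=$((failed + 1))
  run_check doctor         || failed=$((failed + 1))

  if [ "$failed" -eq 0 ]; then
    echo "result: no failed checks"
  else
    echo "result: $failed failed check(s), fix before widening access"
  fi
  # The return value is the number of failed checks (0 means all passed).
  return "$failed"
}
```

Call morning_check from an interactive shell, then rerun any failing command directly to read its full output before deciding whether the situation is yellow or red.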
Check the browser Control UI
The Control UI is the quickest way to inspect live chat, activity, nodes, config, and sessions. For local operators, the default URL is:
http://127.0.0.1:18789/
New browsers or remote devices can require explicit device pairing. If a browser reports pairing required, list pending requests and approve the correct one.
openclaw devices list
openclaw devices approve <requestId>
Tail logs only after status narrows the problem
Logs are useful, but start with status and health so you know what you are looking for.
openclaw logs --follow
# If RPC is down, fall back to the newest local log:
tail -f "$(ls -t /tmp/openclaw/openclaw-*.log | head -1)"
Weekly Audit Loop
Run the security audit before and after access changes
Run the audit whenever you add a channel, change a DM policy, expose the Gateway beyond loopback, add a reverse proxy, install new plugins, or give an agent broader tools.
openclaw security audit
openclaw security audit --deep
openclaw security audit --json
Use --fix only for narrow, common repairs, and only after reviewing what the audit found.
openclaw security audit --fix
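One way to make the before-and-after habit concrete is to save the JSON audit on each side of a change and compare the two files. The sketch below uses only the audit command shown above; the helper names and file paths are hypothetical, and it assumes the JSON output is identical between runs when nothing changed. If your version embeds timestamps or run IDs, compare the findings fields instead of whole files.

```shell
#!/usr/bin/env bash
# Sketch: snapshot the audit before and after an access change.
# ASSUMPTION: `openclaw security audit --json` is byte-stable when nothing changed.
# audit_snapshot and audit_changed are hypothetical helper names.

audit_snapshot() {
  # $1 = file to write the JSON audit into
  openclaw security audit --json > "$1"
}

audit_changed() {
  # Succeeds (exit 0) when the two snapshots differ, fails when they match.
  ! cmp -s "$1" "$2"
}

# Example flow (paths are illustrative):
#   audit_snapshot /tmp/audit-before.json
#   ...add the channel, change the DM policy, or expose the Gateway...
#   audit_snapshot /tmp/audit-after.json
#   if audit_changed /tmp/audit-before.json /tmp/audit-after.json; then
#     diff /tmp/audit-before.json /tmp/audit-after.json
#   fi
```

Keeping both files also gives you a record to attach to the exposure inventory when access is widened.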
Review the exposure inventory
Keep a short written inventory for any Gateway that accepts messages from outside the host.
- Network exposure: loopback, LAN, tailnet, Tailscale Serve, trusted proxy, or public internet.
- Authentication: token, password, Tailscale identity headers, or trusted-proxy identity headers.
- Routing: which channels can wake which agents, and which sessions are used for DMs or groups.
- Tool access: browser, exec, file, message, node, and external account access available to those agents.
- Backups: where config and credentials are backed up before widening access.
Keep shared trust boundaries honest
OpenClaw's practical security model is a personal assistant boundary: one trusted operator, potentially many agents. If mutually untrusted people can message the same tool-enabled agent, treat them as sharing that agent's delegated tool authority. Split gateways, credentials, OS users, or hosts when trust boundaries differ.
Approvals and Human Control
Use approval gates where consequences leave the chat
Require human approval for work that sends external messages, changes production systems, spends money, modifies customer data, or exposes credentials. Routine read-only status checks can stay automatic; irreversible or public actions should pause.
- Low risk: status checks, local search, summaries, read-only reports.
- Medium risk: draft generation, file edits in a review branch, staging deploys.
- High risk: public posts, email sends, production deploys, broad filesystem access, shell commands from remote senders.
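The three tiers above can be written down as a small lookup so that a wrapper script or review checklist applies them consistently. This is an illustrative sketch only: the action labels and the function names risk_tier and needs_approval are hypothetical and are not OpenClaw identifiers or configuration keys. Treating unknown actions as high risk is a design choice of this sketch, and whether medium-risk work also pauses is left to your own policy.

```shell
#!/usr/bin/env bash
# Sketch: map an action label to the risk tier described above.
# ASSUMPTION: all labels below are made up for illustration.

risk_tier() {
  case "$1" in
    status-check|local-search|summary|read-only-report)
      echo low ;;
    draft|review-branch-edit|staging-deploy)
      echo medium ;;
    public-post|email-send|production-deploy|broad-filesystem|remote-shell)
      echo high ;;
    *)
      # Unknown actions fall into the strictest tier by default.
      echo high ;;
  esac
}

needs_approval() {
  # Succeeds when the action should pause for a human.
  [ "$(risk_tier "$1")" = "high" ]
}
```

For example, `needs_approval email-send` succeeds while `needs_approval status-check` fails, so a wrapper can branch on the exit status before letting an action proceed.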
Pair unknown senders instead of processing them
For Telegram, WhatsApp, Signal, iMessage, Microsoft Teams, Discord, Google Chat, and Slack-style channels, prefer pairing and allowlists over open public DMs. Unknown senders should get a pairing flow, not an agent with tools.
openclaw pairing approve
openclaw doctor
Make group chats mention-gated by default
Group chats create noisy, ambiguous input. Require mentions, narrow allowed groups, and keep tool-heavy work in a safer session unless the group is intentionally trusted.
Update and Rollback Routine
Preview first, then update
Use the built-in updater for supervised installs because it coordinates install type, Gateway service metadata, doctor checks, and restart behavior.
openclaw update --dry-run
openclaw update
openclaw doctor
openclaw status --deep
Know your channel choice
Stable is the default for working systems. Beta and dev are useful when you need a specific fix, but they deserve tighter monitoring after upgrade.
openclaw update --channel stable --dry-run
openclaw update --channel beta --dry-run
openclaw update status --json
Keep a recovery path
Before changing install channel, channel policy, or Gateway exposure, record the current config path, package root, managed service Node path, and most recent known-good version. If an npm package update fails part-way through, rerun the official installer rather than guessing which package tree is half-swapped.
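A short notes file written before each change makes that record routine. The sketch below is one possible shape, with a hypothetical function name and field names. It uses only standard shell tools plus the update status command shown earlier. It does not know where your config lives or which version was last known good, so those fields are left for you to fill in, and the Node found on PATH may differ from the one your managed service uses.

```shell
#!/usr/bin/env bash
# Sketch: write a pre-change recovery note.
# ASSUMPTION: field names are illustrative; config path and known-good
# version must be filled in by hand because this sketch cannot discover them.

record_recovery_notes() {
  # $1 = notes file to write
  {
    echo "recorded_at: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    echo "openclaw_binary: $(command -v openclaw || echo 'not found')"
    # PATH lookup only; the managed service may use a different Node.
    echo "node_binary: $(command -v node || echo 'not found')"
    echo "node_version: $(node --version 2>/dev/null || echo 'unknown')"
    echo "config_path: FILL IN MANUALLY"
    echo "known_good_version: FILL IN MANUALLY"
    echo "update_status:"
    openclaw update status --json 2>/dev/null || echo "  unavailable"
  } > "$1"
}
```

Run it with a dated file name of your choosing and keep the result alongside your config backup, so a failed update leaves you with a written starting point.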
Incident Response
First response checklist
- Do not add channels, tools, or proxy exposure while debugging.
- Run openclaw status --all, openclaw gateway status, and openclaw health --verbose.
- Return the Gateway to loopback-only access or disable the affected channel if exposure is unclear.
- Look for changed DM policy, new plugins, model/provider auth changes, and recently approved devices.
- Run doctor, security audit, a channel test message, and a small tool test before declaring recovery.
Common recovery commands
openclaw status --all
openclaw status --deep
openclaw gateway status --deep
openclaw doctor
openclaw security audit --deep
openclaw logs --follow
Source Notes
This runbook reflects the OpenClaw docs as of July 2026: Node 24 recommended, openclaw onboard for setup, the Gateway as the control plane, Control UI device pairing, openclaw status/doctor/health for diagnostics, openclaw security audit for hardening checks, and openclaw update for supervised updates.
It also tracks the broader agent operations trend: production agents need monitoring, auditability, approvals, and rollback discipline before they deserve more autonomy.