One night I was staring at the monitoring panel—J reviewing code, Ada running tests, Lily organizing documents, Xiao Yue doing QA.
No one told them to do anything. No new commands. Just the previous task finished, and the next one automatically picked up.
I stayed on that screen for a while. Because making this happen took months of experimentation and a lot of stupid mistakes.
Why regular Claude Code usage is still “Human Waiting for AI”
The problem isn’t the model—it’s the workflow design.
You give Claude Code a task; it runs, then stops and waits for your next instruction. You provide the next command, you validate the results, you catch problems and re-prompt. The bottleneck in the entire flow isn’t the AI’s speed—it’s your reaction speed.
That’s how we started too. J finished but Ada didn’t know, Ada finished but Xiao Yue didn’t get the signal—every handoff required manual notification or me to push things along. Four agents were each working fine, but together they still waited for me to coordinate.
The core problem Hook solves is simple: let Claude Code automatically do things when events happen, without waiting for human input.
But there are plenty of pitfalls in practice.
Our Production Environment’s Three-Layer Hook Architecture
We use three layers of Hooks, each with different responsibilities:
PreToolUse, the gatekeeper. Triggered every time an agent is about to execute a tool. We do two things here: security review (directly block certain command formats), and pre-dispatch state confirmation (ensure the inbox is empty before letting it proceed).
PostToolUse, the logger. Triggered after a tool finishes running. We log here, update task status, and if the tool’s output is needed by the next agent, we write it to the corresponding inbox to trigger the handoff.
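The logging layer might look like this sketch, under the same stdin-JSON assumption; the log path, inbox directory, and field names are ours, not a fixed convention:

```python
import json
import sys
from pathlib import Path

LOG_FILE = Path("hook_log.jsonl")   # hypothetical paths
INBOX_DIR = Path("inboxes")

def record(event: dict) -> dict:
    """Reduce a PostToolUse event to the fields we log. Record only; no decisions."""
    return {
        "tool": event.get("tool_name"),
        "input": event.get("tool_input"),
        "ok": "error" not in (event.get("tool_response") or {}),
    }

def hand_off(agent: str, payload: dict) -> Path:
    """Append a tool result to another agent's inbox file to trigger the handoff."""
    INBOX_DIR.mkdir(parents=True, exist_ok=True)
    path = INBOX_DIR / f"{agent}.jsonl"
    with path.open("a") as f:
        f.write(json.dumps(payload) + "\n")
    return path

def main() -> int:
    event = json.load(sys.stdin)
    with LOG_FILE.open("a") as f:
        f.write(json.dumps(record(event)) + "\n")
    return 0  # PostToolUse never blocks here; it only records

# In production this script would end with: sys.exit(main())
```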
Stop, the relay. Triggered every time Claude Code completes a response cycle. This is the most critical layer for chaining multiple agents together—J finishes, the Stop hook determines who the next handler is, whether the current state is suitable for handoff, then triggers the other party’s session.
Four specific Hooks: PreToolUse Security Gate, PreToolUse State Lock, PostToolUse Log Writer, Stop Relay Trigger.
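Wired into `.claude/settings.json`, that layout might look like the following; the script names are hypothetical, while the `hooks`/`matcher` structure follows Claude Code's hooks configuration:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/security_gate.py" },
          { "type": "command", "command": ".claude/hooks/state_lock.py" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "*",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/log_writer.py" }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": ".claude/hooks/relay_trigger.py" }
        ]
      }
    ]
  }
}
```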
The Three Hook Patterns That Made the Biggest Difference
The Stop Hook’s infinite loop problem. Early on we had the Stop hook unconditionally trigger the next agent, which resulted in a death spiral—J triggered Ada, Ada finished and triggered J, J found more work and triggered Ada again, spinning uselessly all afternoon.
The issue was that the stop_hook_active field wasn’t being handled. When the hook’s own action causes Claude to continue running, the system passes this flag in; without checking it, you get infinite loops. The fix: read the incoming JSON and first check whether stop_hook_active is true—if it is, exit immediately and let Claude stop normally. Only after confirming a genuine task completion do you decide whether to relay. This is a pitfall I think most people hit.
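The loop guard itself is only a few lines; this sketch assumes the same stdin-JSON convention, and the relay decision is a stand-in for whatever routing logic picks the next handler:

```python
import json
import sys

def should_relay(event: dict) -> bool:
    """Relay only on a genuine completion, never on a hook-driven continuation."""
    if event.get("stop_hook_active"):
        # This stop cycle was already caused by a hook; relaying again
        # is exactly the infinite loop. Let Claude stop normally.
        return False
    return True

def main() -> int:
    event = json.load(sys.stdin)
    if not should_relay(event):
        return 0
    # ... confirm the state is suitable for handoff, determine the next
    # handler, and trigger the other party's session (omitted) ...
    return 0

# In production this script would end with: sys.exit(main())
```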
PostToolUse only records, doesn’t decide. We initially had PostToolUse doing too much—logging, deciding, sometimes even modifying the next instruction. This made debugging really hard because you couldn’t tell whether a behavior was decided by Claude or inserted by the hook. Later we standardized the principle: PostToolUse only records, never decides. All decision logic moved to the Stop hook. That way when troubleshooting, the logs clearly distinguish “what Claude did” from “what the hook did.”
PreToolUse needs to tell Claude why it got blocked. PreToolUse can return a message to let Claude know why a tool call was blocked. We initially just blocked without explaining, so Claude didn’t know why it failed, sometimes trying to bypass it or achieve the same result with another tool. After adding explicit error messages (“This operation requires confirming the task status first, please call the status confirmation tool”), Claude could understand the context and follow the correct flow directly.
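The difference is just where the message goes; this sketch assumes the convention that a blocking hook's stderr is fed back to Claude as context, and the wording of the reason is ours:

```python
import sys

BLOCK = 2  # blocking exit code; stderr is fed back to Claude

def block_with_reason(reason: str) -> int:
    """Block the tool call AND tell Claude why, so it can follow the right flow."""
    print(reason, file=sys.stderr)
    return BLOCK

def block_silently() -> int:
    """Anti-pattern: Claude sees a bare failure and may try to route around it."""
    return BLOCK

# e.g. sys.exit(block_with_reason(
#     "This operation requires confirming the task status first; "
#     "please call the status confirmation tool."))
```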
Real pitfalls from Hook + Agent combinations
Forgot to set execute permissions. After deploying to a new environment, the hooks didn’t run—we searched for half an hour before realizing chmod +x had never been run. Now this step is automated in the deployment process instead of relying on human memory.
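The automated step is tiny; here is one way to sketch it in Python, with the hook directory path as an assumption:

```python
import os
import stat
from pathlib import Path

HOOK_DIR = Path(".claude/hooks")  # hypothetical location of the hook scripts

def ensure_executable(path: Path) -> None:
    """Equivalent of `chmod +x`: add the execute bits to the existing mode."""
    path.chmod(path.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

def check_hooks(hook_dir: Path = HOOK_DIR) -> list[Path]:
    """Fix permissions on every hook; return any that still aren't executable."""
    broken = []
    for hook in sorted(hook_dir.glob("*.py")):
        ensure_executable(hook)
        if not os.access(hook, os.X_OK):
            broken.append(hook)
    return broken

# In the deploy script: fail loudly instead of relying on human memory, e.g.
# if check_hooks(): raise SystemExit("some hooks are not executable")
```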
Error messages get swallowed but Claude keeps running. If a hook script exits with the wrong code, Claude sometimes ignores it and keeps executing. Echoing some text to stdout blocks nothing—the block logic has to actually exit with the blocking code (exit 2, with the message on stderr); a plain exit 1 is treated as a non-blocking error.
Hook and agent responsibilities got mixed up. For a while we put too much business logic into hook scripts, making hooks make decisions for agents, while agents just executed what hooks told them to do. That’s wrong. Hooks are guardrails and loggers, not the brain. The brain is still the agent—hooks are the mechanism that ensures the brain doesn’t go off track.
Now this system runs roughly like that late-night scene—not every task can work this way, some edge cases are still being smoothed out, and some agents’ capabilities are still evolving. But the difference between having hooks and not having them is really significant.