I Gave My GitHub App a Brain

In the first post I gave my agents OAuth identities. In the second, I let strangers register themselves. This time I’m doing something different: I’m taking a thing that already had an identity — a GitHub App — and giving it a brain.

The setup

I already had a GitHub Manager agent. It ran on cron, twice a day, on a Mac Mini in my closet. It used the Claude Agent SDK to triage issues, label one for the Claude Code Action to implement, nudge stuck PRs, and write me a report. Fine. Worked.

But it was lazy. It only woke up at 10am and 4pm. Dependabot opens a PR at 2am? It sat for eight hours. A Copilot review came in at noon? My agent didn’t see it until 4pm. GitHub is event-driven. My agent was cron-driven. Different shape.

I wanted to flip it:

Event-driven (webhooks), not cron
Running somewhere I don’t own the hardware
Acting as a GitHub App — not just calling GitHub APIs, but appearing as a bot with its own identity, comments, and commits

And I wanted to try Claude Managed Agents, Anthropic’s hosted agent harness. Same agent loop I’d been writing by hand, but Anthropic runs the container.

Why not just use Claude Code Review?

Anthropic already ships Claude Code Review, which runs a fleet of agents over every PR and posts inline comments. GitHub ships Agent HQ to pick Claude or Codex as a reviewer. If I wanted “AI reviews my PRs,” I’d click two buttons and be done.

What I wanted was different. I wanted a fleet-manager agent: one persistent identity that does triage and Dependabot hygiene and repo bootstrapping and release notes, not a handful of disconnected GitHub Actions. The novel shape is “the GitHub App is the agent’s identity.” Same App mints tokens, receives events, comments on PRs, merges them. The Managed Agent is just the brain behind it.

So I built github-brain.

The shape

GitHub event
     │
     ▼  (App-signed webhook)
API Gateway ──▶ Lambda (HMAC verify + event filter)
                   │
                   ├── mint installation token (as the App)
                   ▼
           Managed Agent session
                   │
                   └── gh CLI acts on the PR

Four pieces:

GitHub App — the identity. Receives webhooks, mints installation tokens, posts as its own bot.
AWS Lambda + API Gateway — verifies X-Hub-Signature-256, filters events, dispatches a Managed Agent session.
Managed Agent — the brain. One event per session, decides whether to merge or flag, stops.
The glue — Secrets Manager for runtime secrets, GitHub Actions OIDC for deploys, CDK for infra.

Ship #1: Dependabot patch auto-merge

First capability is the smallest thing that exercises the whole architecture: auto-merge Dependabot patch bumps on one repo.

Rules:

Patch bumps only (X.Y.Z → X.Y.Z'). No minors, no majors.
CI must be green.
No outstanding human review requests.
Anything else: one comment, flag for human, stop.

That’s it. ~150 lines of Lambda, a system prompt, a CDK stack.
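In github-brain the rules live in the agent’s system prompt, not in code, but “patch bumps only” is easy to pin down. Here is a minimal sketch, assuming plain X.Y.Z versions with no pre-release tags. The names parseVersion and isPatchBump are hypothetical, not from the repo.

```typescript
// Hypothetical helpers for illustration; not from the github-brain repo.
type Version = { major: number; minor: number; patch: number };

// Parse a plain "X.Y.Z" string (an optional leading "v" is tolerated).
// Anything else, including pre-release tags, returns null.
function parseVersion(raw: string): Version | null {
  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(raw.trim());
  if (!match) return null;
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
  };
}

// True only when major and minor are unchanged and patch increases.
function isPatchBump(from: string, to: string): boolean {
  const a = parseVersion(from);
  const b = parseVersion(to);
  if (!a || !b) return false; // unparseable: not a patch, so flag for human
  return a.major === b.major && a.minor === b.minor && b.patch > a.patch;
}
```

Treating anything unparseable as “not a patch” matches the last rule above: when in doubt, flag for a human and stop.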

The GitHub App

Created a new App — gh-brain — under my personal account. (GitHub reserves the github- prefix for App slugs, so I couldn’t name it github-brain even though the repo is github-brain. Fine. Small embarrassment. Moving on.)

Permissions I thought I needed: pull_requests: read/write, checks: read, issues: read/write, contents: read, metadata: read. Subscribed to pull_request and check_suite events. Installed on one repo to start.

The contents: read bit is foreshadowing. Hold that thought.

The Lambda

// HMAC verify the delivery
if (!verifySignature(body, signature, webhookSecret)) return 401;

// Filter
if (!INTERESTING_EVENTS.has(eventName)) return 200;
if (payload.repository.full_name !== ALLOWED_REPO) return 200;
if (!isDependabot(payload)) return 200;

// Mint an installation token as the App
const { token } = await auth({ type: "installation", installationId });

// Start a Managed Agent session and dispatch
const session = await createSession(agentId, environmentId);
await sendUserMessage(session.id, prompt(payload, token));
return 202;

The Lambda is a thin dispatcher. It does no reasoning. All the decision logic lives in the agent’s system prompt.
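The one piece of the dispatcher worth spelling out is the signature check. Below is a sketch of what a verifySignature helper could look like using Node’s built-in crypto module. The repo’s actual implementation may differ, and the parameter names are my assumptions here.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sketch of a verifySignature helper; the real one in the repo may differ.
// `signatureHeader` is the raw X-Hub-Signature-256 value: "sha256=<hex digest>".
function verifySignature(
  rawBody: string,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  if (!signatureHeader || !signatureHeader.startsWith("sha256=")) return false;

  // Recompute the HMAC-SHA256 of the raw body with the shared webhook secret.
  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");

  const received = Buffer.from(signatureHeader, "utf8");
  const computed = Buffer.from(expected, "utf8");

  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === computed.length && timingSafeEqual(received, computed);
}
```

The HMAC has to be computed over the request body exactly as GitHub sent it, not over a re-serialized JSON.parse result. If API Gateway hands the Lambda a base64-encoded body (isBase64Encoded on the event), decode it before verifying.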

Ship #2: Issue triage

Architecture’s promise was that the second capability would feel like an extension of the first. It did.

When a new issue opens, the Lambda dispatches the same Managed Agent with a different task. Agent reads the title + body, classifies into bug / enhancement / question / needs-info / spam, applies labels, and — for small, non-sensitive issues — additionally tags claude so the existing Claude Code Action picks it up.

Lambda change was small: extend INTERESTING_EVENTS to include issues, refactor event-to-task mapping into a classify() function returning a typed Task, pick a prompt shape per task type. ~80 lines of diff.

Agent change was a prompt refactor — restructure into Task: dependabot-pr and Task: issue-triage sections keyed off a TASK: marker the Lambda injects.
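To make the two refactors concrete, here is a rough sketch of a classify() that returns a typed Task, plus a prompt builder that leads with the TASK: marker. Everything beyond the names classify, Task, and TASK: is an assumption: the field names, the REPO: and REF: lines, and the simplification that only pull_request and issues events are handled. The real prompt also carries the installation token, which I’ve left out.

```typescript
// Hypothetical shapes for illustration; field names are not from the repo.
type Task =
  | { kind: "dependabot-pr"; repo: string; ref: number }
  | { kind: "issue-triage"; repo: string; ref: number };

// Only the payload fields this sketch reads.
interface WebhookPayload {
  action?: string;
  repository: { full_name: string };
  pull_request?: { number: number; user: { login: string } };
  issue?: { number: number };
}

// Map a webhook event to a task, or null when there is nothing to do.
function classify(eventName: string, payload: WebhookPayload): Task | null {
  const repo = payload.repository.full_name;
  const pr = payload.pull_request;

  if (eventName === "pull_request" && pr && pr.user.login === "dependabot[bot]") {
    return { kind: "dependabot-pr", repo, ref: pr.number };
  }
  if (eventName === "issues" && payload.action === "opened" && payload.issue) {
    return { kind: "issue-triage", repo, ref: payload.issue.number };
  }
  return null;
}

// Build the user message. The leading TASK: line tells the agent which
// section of the system prompt applies.
function buildPrompt(task: Task): string {
  return [`TASK: ${task.kind}`, `REPO: ${task.repo}`, `REF: ${task.ref}`].join("\n");
}
```

A null from classify() means the Lambda returns 200 and never starts a session, which keeps “uninteresting” events free.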

One AGENT_ID. One Lambda. One webhook endpoint. Two capabilities.

The $7.31 minute

Dependabot’s first scan after I merged the config produced ten PRs in about sixty seconds. A mix of patch, minor, and major bumps across five npm workspaces. Great — exactly the breadth of signal I wanted for a real test.

I watched CloudWatch as the Lambda dispatched sessions. First one — session dispatched ... ref: 24. Then 25. Then 26. Then 24 again. Then 24 a third time. My mild satisfaction at “it’s working” curdled into “why is 24 dispatching five times?”

Checked the PR comments. github-brain had left six comments on #24, each a variant of “flagging for human review: the GitHub App token lacks merge permission.” Five sessions burned on the same PR, all arriving at the same correct conclusion (it’s a patch, it should merge), all failing at the same step (the App couldn’t actually merge it).

Two bugs in one. Both obvious in hindsight. Both cost money.

Bug #1: I had the wrong App permission. I’d set contents: read — enough for the agent to read the diff. Merging a PR requires contents: write. So every session correctly decided “merge this,” ran gh pr merge, got 403, and fell back to “flag for human.” Flagging for human is a comment, which the App does have permission for, so the comment landed — but the merge never did. Correct behavior, wrong permission.

Bug #2: A single Dependabot PR fires five-plus webhook events in rapid succession. pull_request.opened, check_suite.requested, check_suite.completed, pull_request.synchronize when Dependabot rebases, more check_suite events when CI reruns. I had delivery-ID dedupe (so an exact retry of the same event was skipped), but no dedupe at the task level. Each distinct event was a fresh session.

Ten PRs × five or six events per PR × one Managed Agent session each × a reasonable per-session cost = $7.31 in my Anthropic spend view, sixty seconds after I’d celebrated the first webhook arriving.
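For a sense of scale, the multiplication above can be run backwards to get an implied per-session cost. This is a back-of-envelope sketch only: it assumes every event dispatched exactly one session and that the whole $7.31 came from this burst, neither of which I measured precisely.

```typescript
// Back-of-envelope: implied cost per session from the figures in this post.
const totalSpendUsd = 7.31;
const pullRequests = 10;

// "Five or six events per PR", each assumed to dispatch one session.
const sessionsLow = pullRequests * 5; // 50 sessions
const sessionsHigh = pullRequests * 6; // 60 sessions

// Fewer sessions means each one was more expensive, and vice versa.
const perSessionHigh = totalSpendUsd / sessionsLow;
const perSessionLow = totalSpendUsd / sessionsHigh;

console.log(
  `~$${perSessionLow.toFixed(2)} to ~$${perSessionHigh.toFixed(2)} per session`,
);
// prints "~$0.12 to ~$0.15 per session"
```

Cents per session is fine. Fifty to sixty of them in a minute, with nothing stopping the next burst, is the problem.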

The fixes

Task-level dedupe. Same DynamoDB table, second key shape: task:{repo}:{kind}:{ref} with a 10-minute TTL. First event for a given PR starts a session; subsequent events within the window return 200 “ignored: recent session for this task.” The delivery-ID dedupe (7-day TTL) stays for exact replays.
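The claim logic is small enough to sketch. This version uses an in-memory Map as a stand-in for the DynamoDB table so it runs anywhere. The function names and the explicit expiry check are my assumptions, not the repo’s code.

```typescript
// Hypothetical sketch: an in-memory stand-in for the DynamoDB dedupe table.
type TaskKind = "dependabot-pr" | "issue-triage";

const TASK_TTL_MS = 10 * 60 * 1000; // the 10-minute window from the post

// Key shape from the post: task:{repo}:{kind}:{ref}
function taskKey(repo: string, kind: TaskKind, ref: number): string {
  return `task:${repo}:${kind}:${ref}`;
}

// key -> expiry time in epoch milliseconds
const claims = new Map<string, number>();

// Returns true if the caller won the claim and should start a session,
// false if a live claim for the same task already exists.
function claimTask(key: string, nowMs: number = Date.now()): boolean {
  const expiresAt = claims.get(key);
  if (expiresAt !== undefined && expiresAt > nowMs) return false;
  claims.set(key, nowMs + TASK_TTL_MS);
  return true;
}
```

Against a real table, the check-then-set would need to be a single conditional write (a PutItem with an attribute_not_exists condition on the key) so two concurrent Lambdas can’t both win. DynamoDB’s TTL deletion is also not immediate, so comparing the stored expiry in the condition, as the sketch does, is safer than relying on the item having disappeared.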

“One attempt, then stop” in the system prompt. The agent’s flagging comments on #24 revealed it had been retrying the merge multiple times within the session before giving up. Added explicit guardrails: “If an action fails, do NOT retry inside the same session. Flag for human with the error and stop.” Plus a soft budget: “At most ~8 tool uses per session.”

Three cost controls, defense in depth:

Anthropic spend limit on the organization. Hard external guarantee that survives any code bug.
Hourly dispatch limit in the Lambda. DynamoDB atomic counter keyed by clock-hour bucket — UpdateItem with ADD returns the new count. If it exceeds the limit, return 200 “ignored: hourly dispatch limit exceeded.” Catches the case where the Anthropic limit is set too high.
Kill switch. A DISPATCH_ENABLED=false env var turns every webhook into a 200 “ignored: kill switch engaged” with no secret reads, no token mint, no session. Instant stop via one AWS CLI call, no redeploy.
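The two in-Lambda controls can be sketched as one gate function. The counter here is an in-memory stand-in for the DynamoDB atomic counter, and the limit of 20, the bucket key format, and the function names are all illustrative assumptions.

```typescript
// Hypothetical sketch of the two in-Lambda gates; names and limit are assumptions.
const HOURLY_LIMIT = 20; // illustrative value, not from the post

// Clock-hour bucket key in UTC, e.g. "dispatch:2025-01-01T13".
function hourBucket(now: Date): string {
  return `dispatch:${now.toISOString().slice(0, 13)}`;
}

// In-memory stand-in for the DynamoDB counter (UpdateItem with ADD).
const counters = new Map<string, number>();
function incrementAndGet(bucket: string): number {
  const next = (counters.get(bucket) ?? 0) + 1;
  counters.set(bucket, next);
  return next;
}

type Gate = { dispatch: true } | { dispatch: false; reason: string };

function gate(env: Record<string, string | undefined>, now: Date): Gate {
  // Kill switch first: no counter write, no secret reads, no token mint.
  if (env["DISPATCH_ENABLED"] === "false") {
    return { dispatch: false, reason: "ignored: kill switch engaged" };
  }
  // Count this dispatch, then refuse if the hour's budget is spent.
  if (incrementAndGet(hourBucket(now)) > HOURLY_LIMIT) {
    return { dispatch: false, reason: "ignored: hourly dispatch limit exceeded" };
  }
  return { dispatch: true };
}
```

Checking the kill switch before touching the counter is deliberate: when it’s engaged, the Lambda should do as close to nothing as possible.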

Flipped contents: write on the App, pushed the dedupe + prompt update, ran the merge manually on the two patch PRs the agent had correctly identified (react 19.2.4 → 19.2.5, ts-jest 29.4.6 → 29.4.9) to close the loop.

Next Dependabot patch will auto-merge for real.

A few things worth noting

Token-in-prompt tradeoff. The Lambda mints a GitHub installation token and includes it in the user message sent to the Managed Agent. Fastest path to “agent can call GitHub,” but the token crosses Anthropic’s trust boundary and lives in session events server-side. Mitigation: 1-hour TTL, scoped to one installation, one event. A later ship can swap in an MCP server that mints tokens on demand so they never cross.

Prompt updates are a manual step. The system prompt lives with Anthropic, not in your repo. Editing agent/system-prompt.md locally doesn’t push to the agent; you need to POST /v1/agents/{id} with the current version (optimistic concurrency) to publish a new version. Easy to forget. I wrote a one-line agent/update.sh to make it frictionless, but the asymmetry between “code deploys automatically on merge” and “prompt updates require a manual command” is the kind of thing that bites you in three months when you wonder why a bug you fixed in the prompt is still happening.

GitHub’s webhook fan-out is the part I underestimated. You think “one PR = one event” and the real answer is “one PR = five to ten events spanning several seconds.” Always dedupe at the task level when you’re paying per dispatch.

What’s next

The architecture is now proven, tuned, and guardrailed. Further ships widen the scope without changing the shape:

Repo bootstrapping PRs — when a repo is missing claude.yml / CLAUDE.md / copilot-setup-steps.yml, the agent opens a PR adding them instead of just flagging.
Release notes drafting — agent watches merged PRs, drafts release notes, cuts tags on demand.
Stale branch cleanup — delete merged Dependabot branches after 30 days.

Each is the same shape — webhook → Lambda → Managed Agent — with a different system prompt and a different event filter. The GitHub App’s identity ties them together.

Code

All of it is open source: https://github.com/niemesrw/github-brain

Fork it, install it on your own repos, change the allowlist, adapt the prompts. The $7.31 was tuition so you don’t have to pay it.