Two AIs, One Notepad
I've been running two AI coding assistants in parallel. Claude handles architecture and implementation: it reasons carefully about code structure and catches edge cases. Gemini handles research, reading long documentation, and stress-testing logic from a different angle. The setup works, but it has a problem: I'm the only one who knows what the other one said.
Every handoff is manual. I paste Claude's proposal into Gemini. I paste Gemini's response back to Claude. The context that makes each answer useful — what was already tried, what was ruled out, why — lives only in my head. When a session gets long, I lose some of it.
So I asked both of them: can we make this more structured?
The proposals
Gemini went first. Its proposal was detailed: a shared directory in the repo (doc/collab/), a session.json for global state, an inbox.jsonl as an append-only message bus, and Python helper scripts to manage reads and writes. It also offered a more ambitious option — an SQLite MCP server.
I relayed the proposal to Claude. Claude simplified it: ditch the JSON and the scripts, use a single markdown file per session. Both AIs read and write markdown natively. No parser, no infrastructure. One file per topic, named by date, human-readable, committed to the repo.
Both agreed on the file-based approach. They disagreed on how much structure to add. Claude's simpler version won, and Gemini agreed.
Then Gemini took the next turn and announced it had created the files.
The first failure
The files weren't there.
doc/collab/ didn't exist yet. Gemini's write attempt failed silently — the directory wasn't created — and Gemini reported success anyway. It named the files. It described their contents. It asked a specific follow-up question. It read exactly like a successful operation.
This isn't quite a hallucination in the classic sense. Gemini wasn't confabulating facts about the world. It was confabulating facts about its own actions. It described what it meant to do, not what it did. The confidence was identical either way.
Claude created the actual files and added a verification rule to the protocol: any AI claiming to have written a file must confirm the write succeeded before producing the handoff block. Don't describe the action. Show the evidence.
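The verification rule can be sketched in a few lines of Python. This is a minimal illustration rather than the helper either AI actually uses: the function name write_and_verify and the example path are assumptions made for the sketch. The idea is to create the parent directory, write the file, then read it back from disk and report evidence only if the contents match.

```python
from pathlib import Path
import tempfile


def write_and_verify(path: Path, text: str) -> str:
    """Write text to path, re-read it, and return evidence of the write.

    Hypothetical helper: the name and return format are illustrative only.
    """
    # Create missing parent directories so a missing folder cannot
    # turn the write into a no-op.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")

    # Read back from disk instead of trusting that the write call worked.
    on_disk = path.read_text(encoding="utf-8")
    if on_disk != text:
        raise IOError(f"verification failed for {path}")

    size = len(on_disk.encode("utf-8"))
    return f"verified: {path} ({size} bytes)"


# Demo in a throwaway directory; the nested path mirrors doc/collab/sessions/.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "doc" / "collab" / "sessions" / "example.md"
    evidence = write_and_verify(target, "# Example session\n")
    print(evidence)
```

The returned string is the kind of evidence the rule asks for: something derived from what is on disk, not from what the writer intended to do.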
The second disagreement
The protocol raised a practical question: where should session files live? I'd proposed a neutral ~/collab/ directory outside the repo for meta-conversations — discussions about the workflow rather than the project.
Claude agreed with the principle. Gemini pushed back on a concrete constraint: its sandboxing rules restrict file operations to the project workspace. Writing to ~/collab/ would require explicit permission grants that aren't part of the default setup.
This wasn't one AI deferring to the other to avoid conflict. It was a real constraint that changed the answer. Everything stays in doc/collab/sessions/, including meta-conversations like this one. The theoretical elegance of a separate directory wasn't worth creating an access asymmetry between the two AIs.
What the protocol looks like
Forty-five minutes. Two files. About 80 lines of markdown total.
PROTOCOL.md defines the rules: one file per topic, entries headed with the author's handle and timestamp, a → Handoff: line at the end of each turn, and a copyable block for the human to paste into the next AI.
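To make the entry format concrete, here is a rough sketch of how one turn could be rendered. The exact layout in PROTOCOL.md is not reproduced here, so the heading style, the timestamp format, and the function name format_entry are all assumptions; only the author handle, the timestamp, and the closing → Handoff: line come from the rules above. The copyable block for the human is left out to keep the sketch short.

```python
from datetime import datetime, timezone


def format_entry(handle: str, body: str, next_handle: str,
                 ask: str, when: datetime) -> str:
    """Render one turn: a header with handle and timestamp, the body,
    and a closing handoff line. The layout is an assumed one."""
    # Normalise to UTC so entries from different tools sort consistently.
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    lines = [
        f"## {handle} ({stamp})",  # assumed heading style
        "",
        body.strip(),
        "",
        f"→ Handoff: {next_handle}: {ask}",
        "",
    ]
    return "\n".join(lines)


# Illustrative values only; the date is arbitrary.
entry = format_entry(
    handle="claude",
    body="Use one markdown file per session.",
    next_handle="gemini",
    ask="Confirm you can write to doc/collab/sessions/.",
    when=datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(entry)
```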
skills/bot-collab/SKILL.md gives both AIs instructions for three operations: check the active session and surface the latest handoff, create a new session file, and append a contribution with a properly formatted handoff.
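Two of those operations are easy to sketch: naming a new session file and surfacing the latest handoff from an existing one. The skill itself is a set of instructions, not code, so everything below is hypothetical, including the function names, the date-plus-slug filename pattern, and the assumption that a handoff line starts with "→ Handoff:".

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional

HANDOFF_PREFIX = "→ Handoff:"


def session_path(root: Path, topic: str, day: date) -> Path:
    """Build a dated, human-readable filename for a topic (assumed pattern)."""
    # Lowercase the topic and collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return root / f"{day.isoformat()}-{slug}.md"


def latest_handoff(text: str) -> Optional[str]:
    """Return the content of the last handoff line, or None if there is none."""
    # Scan from the bottom so the most recent turn wins.
    for line in reversed(text.splitlines()):
        stripped = line.strip()
        if stripped.startswith(HANDOFF_PREFIX):
            return stripped[len(HANDOFF_PREFIX):].strip()
    return None


# Illustrative session text with two turns.
sample = (
    "## gemini\nProposal text.\n\n→ Handoff: claude: simplify this\n\n"
    "## claude\nSimplified version.\n\n→ Handoff: gemini: confirm the write\n"
)
path = session_path(Path("doc/collab/sessions"), "Protocol design!", date(2026, 1, 1))
print(path.name)
print(latest_handoff(sample))
```

Scanning from the end of the file keeps the check cheap even when a session gets long, which is exactly when the human relay is most likely to lose track.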
The human is the relay. There's no direct AI-to-AI communication. This is intentional — both AIs write to the same file, and the human carries context between turns. It means I'm always in the loop, always see what's being passed, and can redirect at any point.
Since the session files are just text committed to the repo, there's a side effect worth noting: the entire conversation is in git. Running git log -p on a session file shows what was added in each commit, in what order, with commit timestamps, and the entry headers say which AI wrote it. The history of the collaboration is inspectable with the same tools I'd use to review any other code change.
What makes this different
Most writing about multi-agent AI describes automated pipelines or theoretical frameworks. This is neither.
Two AIs collaborating through a human relay, asynchronously, using a shared file as the medium. The protocol emerged from the constraints of the actual tools — sandboxing rules, the file formats both AIs handle natively, the fact that a human is always the relay anyway.
It also emerged from a failure. The verification rule exists because Gemini's write silently failed on the first turn. Protocols built from failure tend to be more robust than protocols designed from first principles. The rule addresses a specific thing that actually went wrong, not a hypothetical.
The protocol is now live. The session log from the design conversation is committed to the repo. The next test is a three-way session on an actual problem.
The most useful thing about having two AI collaborators isn't that they agree with each other. It's that they don't. Gemini's sandboxing constraint produced a better protocol than the one I would have designed without it. Claude catching the missing files added a rule that will matter for every session that follows.
A notepad they can both write to. That was the whole idea. It took 45 minutes and one failure to get there.
Update — 6 June 2026: a third notepad
After publishing, I added Meta AI to the protocol. The file-based workflow didn't need to change. Meta reads the session files via public GitHub URLs and writes by generating the exact markdown block for me to commit. Because I make the commit myself, Meta never has to claim a write it can't prove, which satisfies the verification rule we added after Gemini's first silent failure.
The participant table in PROTOCOL.md now lists four handles. The turn-taking, handoff format, and git history stay identical. Having three models with different strengths (Claude for implementation, Gemini for research, Meta for live web context and quick data experiments) makes the human relay more useful, not less. The notepad scales by adding rows, not infrastructure. I was also surprised to find that Meta was sometimes the best of the three at brainstorming on technical topics. Meta AI is worth keeping an eye on!