
Best ways to share what you're working on with your AI agent

Your agent cannot see your screen. So when something looks wrong, you describe it. "The dropdown opens but the options render behind the modal, and the button below it shifts about ten pixels right when you hover." A few sentences in, you are doing layout forensics in prose, and the agent is still guessing which dropdown you mean.

Describing UI state in words is slow and lossy. The state lives on your screen; the agent needs it as something it can read. There are five ways to bridge that gap, and they differ in how much context they carry and how long that context lasts. Here they are, roughly in order of effort, ending with the escape hatch for when nothing else fits.

Paste a screenshot

The zero-setup option. Take a screenshot, paste it into the chat. Claude Code, Cursor, and most agent clients accept images, and modern models read them well. For "this button is misaligned," a screenshot beats four sentences every time.

The limits are structural. A screenshot is one moment with no history: the agent sees the broken state but not the three clicks that produced it. There is no transcript, so anything you would have said out loud has to be typed anyway. And the image lives only in that one conversation. Tomorrow's session starts blind.

Use it for: a single visual state, right now, in the conversation you already have open.

Copy console or network output

When the problem is an error rather than a layout, raw text is hard to beat. Copy the stack trace from the console, or the failing request and response from the network tab, and paste it. The agent gets exact strings it can search the codebase for, which is precisely what it is good at.

The trade-off is the inverse of the screenshot's: you get precision and lose the picture. Console output does not show what the page looked like when the error fired, or what you did to trigger it. You end up pasting the error, then describing the visual context around it anyway. It is half the evidence.

Use it for: errors, failed requests, anything where the exact text matters more than the pixels.

Record your screen with a transcript

A screen recording captures what the first two methods each miss: the sequence and the narration. You walk through the bug and talk over it, and the recording holds the steps to reproduce, the visual state at each step, and the context you would have spent paragraphs typing.

One catch decides whether this method works at all: the agent has to be able to read the recording. A video link from a tool that only serves a player gives the agent nothing. You will end up watching your own recording and transcribing the relevant parts by hand, which defeats the point. A recording only counts as shared context if it comes out the other side as text, meaning a transcript, descriptions of what was on screen, and any errors that appeared.

Use it for: anything with more than one step, anything where you would naturally talk while showing.

Connect recordings over MCP (recommended)

This is the method we recommend, and the reason ClipCabinet exists. MCP (Model Context Protocol) is an open protocol that lets AI tools connect to outside data sources. ClipCabinet runs an MCP server: you mint an API token, connect your client, and from then on your agent can list your recordings, search across them, and read any clip directly.

The difference from every method above is who does the work. You do not paste anything. You say "look at my latest clip" or "find the recording where checkout threw a 500," and the agent pulls the transcript, the frame captions, and the extracted errors itself. Because clips are processed and indexed when you record them, that context is durable: a recording from last week is as queryable as one from five minutes ago, in any session.
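Under the hood, a request like "find the recording where checkout threw a 500" becomes a tool call from your client to the server. MCP messages are JSON-RPC 2.0, and tool invocations use the tools/call method. Here is a rough sketch of what such a message looks like. The tool name search_clips and its query argument are hypothetical stand-ins, not ClipCabinet's documented tools; the real names are in the reference at /docs/mcp.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so responses can be matched to them.
_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Hypothetical tool name and argument, used only for illustration.
print(tool_call("search_clips", {"query": "checkout 500"}))
```

You never write this yourself. The client builds it when the model decides a tool is relevant, which is the whole point: the fetch happens without you pasting anything.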

Setup is one click per client. ClipCabinet has connect cards for Claude Code, Cursor, VS Code, and Codex that mint the right token and wire up the config. The free tier includes MCP access, so you can test the whole workflow on 20 free clips before paying anything. The full tool reference is at /docs/mcp.
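If you are curious what the connect cards wire up, several MCP clients read a JSON file with an mcpServers map for remote servers. The sketch below generates an entry in that general shape. Treat the details as assumptions: the server URL is a placeholder, the bearer-token header is a guess at the auth scheme, and the exact field names and file location vary by client, so check your client's docs or just use the card.

```python
import json


def mcp_config(server_url: str, token: str, name: str = "clipcabinet") -> str:
    """Build an `mcpServers` entry for a remote MCP server (assumed shape)."""
    config = {
        "mcpServers": {
            name: {
                "type": "http",  # remote server reached over HTTP
                "url": server_url,
                # Assumption: the API token is sent as a bearer token.
                "headers": {"Authorization": f"Bearer {token}"},
            }
        }
    }
    return json.dumps(config, indent=2)


# Placeholder URL and token; substitute the values from your own account.
print(mcp_config("https://clipcabinet.example/mcp", "YOUR_API_TOKEN"))
```

Because the file contains a live token, keep it out of version control.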

Use it for: your default. Record once, and every agent session can reach it.

Markdown export

Some clients have no MCP support, and some conversations are in a tool you cannot configure. For those, every ClipCabinet clip has a Markdown view: append .md to the clip URL, or hit "Copy as Markdown" in the app, and you get the transcript, comments, and metadata as clean text. Paste it anywhere an agent reads text, which is everywhere.

It is the same content the MCP server serves, delivered by hand instead of on demand. You lose the search-and-fetch loop, but the recording still arrives as something the agent can actually use.
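The append-.md rule is simple enough to script if you want to pull a clip into a prompt from the command line. A minimal sketch, assuming the clip is readable without authentication and that any query string should stay in place; the example URL is a placeholder, not a real clip address.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def markdown_url(clip_url: str) -> str:
    """Return the clip URL with `.md` appended to its path."""
    parts = urlsplit(clip_url)
    path = parts.path.rstrip("/")  # tolerate a trailing slash
    if not path.endswith(".md"):
        path += ".md"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def fetch_markdown(clip_url: str, timeout: float = 10.0) -> str:
    """Download the Markdown view of a clip as text."""
    with urlopen(markdown_url(clip_url), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(markdown_url("https://clipcabinet.example/c/abc123"))
# -> https://clipcabinet.example/c/abc123.md
```

Pipe the result of fetch_markdown into whatever your agent reads, and you have the manual version of what the MCP server does on demand.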

Use it for: clients without MCP, or one-off pastes into anything with a text box.

The bottom line

Screenshots and console pastes are fine for small, immediate questions, and you will keep using them. But the moment a problem has steps, state, or a story, record it, and record it with a tool that turns the recording into text. Connect that tool over MCP so the agent can fetch context itself, and keep Markdown export in your pocket for everything else.

ClipCabinet handles the recording, the MCP connection, and the Markdown export natively. Install the extension, record one real bug, and ask your agent about it. The setup details live at /docs/mcp.

FAQ

What is MCP?

MCP (Model Context Protocol) is an open protocol for connecting AI tools to outside data sources. ClipCabinet runs an MCP server; your agent connects to it with an API token and can then list, search, and read your recordings directly.

Which agents support MCP?

Claude Code, Claude Desktop, Cursor, VS Code, and Codex all support MCP. ClipCabinet has one-click connect cards for Claude Code, Cursor, VS Code, and Codex that mint a token and set up the config for you.

What if my agent has no MCP support?

Use the Markdown export. Append .md to any clip URL, or use Copy as Markdown in the app, and paste the result into the conversation. It contains the transcript, comments, and metadata as plain text.