Giving an AI Assistant Hands: Action + MCP Data Layers

Figure: the pre-treatment form. The assistant can pre-fill this form; I still confirm before anything saves. (Synthetic data.)

Once the assistant could read my data (part 5), the obvious next step was letting it do things. I built that in two layers, on purpose.

1. The on-device action layer

This one drives the app. The assistant can open a screen, pre-fill a form, start a flow. But it doesn't poke at the UI directly — every action goes through a small state machine, so "fill the pre-treatment form" is a defined transition, not the model clicking around hopefully.
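To make "a defined transition" concrete, here is a minimal sketch of what such a state machine could look like. The state and action names (`Screen`, `Action`, `TRANSITIONS`, `apply_action`) are hypothetical, invented for illustration; the real app's states are not shown in this post.

```python
from enum import Enum, auto


class Screen(Enum):
    # Hypothetical app states, for illustration only.
    HOME = auto()
    PRE_TREATMENT_FORM = auto()
    FORM_PREFILLED = auto()


class Action(Enum):
    # Hypothetical actions the assistant is allowed to request.
    OPEN_PRE_TREATMENT_FORM = auto()
    PREFILL_FORM = auto()
    GO_HOME = auto()


# Every move the assistant may make is listed here; anything else is rejected.
TRANSITIONS = {
    (Screen.HOME, Action.OPEN_PRE_TREATMENT_FORM): Screen.PRE_TREATMENT_FORM,
    (Screen.PRE_TREATMENT_FORM, Action.PREFILL_FORM): Screen.FORM_PREFILLED,
    (Screen.PRE_TREATMENT_FORM, Action.GO_HOME): Screen.HOME,
    (Screen.FORM_PREFILLED, Action.GO_HOME): Screen.HOME,
}


class IllegalTransition(Exception):
    """Raised when the assistant requests a move that is not in the table."""


def apply_action(state: Screen, action: Action) -> Screen:
    """Return the next state, or raise if the transition is not defined."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise IllegalTransition(
            f"{action.name} is not allowed from {state.name}"
        ) from None
```

The point of the table is that the model never gets a free-form "do something to the UI" call. It can only name an action, and the app decides whether that action is legal from the current state.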

And it stops short of committing. The assistant pre-fills; I confirm. The amber highlight on a pre-filled field is a deliberate tell — it shows me what the AI touched before I sign off on it.
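One way to model "pre-fill, then confirm" is a draft object that remembers which fields the assistant wrote and refuses to save until a human signs off. This is a sketch under assumptions: `FormDraft`, `ai_touched`, and `save` are hypothetical names, and the list used as a store stands in for whatever persistence the app really has.

```python
from dataclasses import dataclass, field


@dataclass
class FormDraft:
    values: dict[str, str] = field(default_factory=dict)
    ai_touched: set[str] = field(default_factory=set)  # fields the UI would highlight
    confirmed: bool = False

    def prefill(self, name: str, value: str) -> None:
        """Called by the assistant: set a value and mark it for review."""
        self.values[name] = value
        self.ai_touched.add(name)
        self.confirmed = False  # a new AI edit invalidates any earlier sign-off

    def edit(self, name: str, value: str) -> None:
        """Called by the human: the field is no longer AI-authored."""
        self.values[name] = value
        self.ai_touched.discard(name)

    def confirm(self) -> None:
        """Called only from the human's confirm button."""
        self.confirmed = True


def save(draft: FormDraft, store: list) -> None:
    """Persist the draft, but only after explicit human confirmation."""
    if not draft.confirmed:
        raise PermissionError("draft has not been confirmed by the user")
    store.append(dict(draft.values))
```

In this sketch the highlight is just a rendering of `ai_touched`: the UI can colour any field in that set, and the set shrinks as the human overwrites values.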

2. The cloud data-context layer

The second layer is a read-only /api/mcp endpoint. It exposes my data to a model through the Model Context Protocol — but read-only, by design. A model can look; it can't write to my records over the wire. Anything that changes data goes through the on-device path, where I'm in the loop.
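A rough sketch of how "read-only, by design" can be enforced at the dispatcher level. MCP messages are JSON-RPC 2.0, and `resources/list` and `resources/read` are read methods in the protocol. Everything else here is an assumption for illustration: the `RECORDS` data is synthetic, the `records://` URIs are made up, and this is a simplified handler, not a complete MCP server (no initialization handshake, no transport).

```python
import json

# Synthetic, hypothetical records standing in for the real data store.
RECORDS = {
    "records://clients/1": {"name": "Test Client", "last_visit": "2024-01-15"},
}

# Allow-list: only read methods are reachable over the wire.
READ_ONLY_METHODS = {"resources/list", "resources/read"}


def _error(req_id, code: int, message: str) -> dict:
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": req_id,
            "error": {"code": code, "message": message}}


def handle(request: dict) -> dict:
    """Dispatch one JSON-RPC request; anything not on the allow-list fails."""
    req_id = request.get("id")
    method = request.get("method")

    if method not in READ_ONLY_METHODS:
        # -32601 is the standard JSON-RPC "method not found" code.
        return _error(req_id, -32601, f"Method not available: {method}")

    if method == "resources/list":
        result = {"resources": [{"uri": uri, "name": uri} for uri in RECORDS]}
    else:  # resources/read
        uri = (request.get("params") or {}).get("uri")
        if uri not in RECORDS:
            # -32602 is the standard JSON-RPC "invalid params" code.
            return _error(req_id, -32602, f"Unknown resource: {uri}")
        result = {"contents": [{
            "uri": uri,
            "mimeType": "application/json",
            "text": json.dumps(RECORDS[uri]),
        }]}

    return {"jsonrpc": "2.0", "id": req_id, "result": result}
```

Using an allow-list rather than a deny-list means a new write-capable method cannot become reachable by accident; it has to be added to `READ_ONLY_METHODS` on purpose, which is the moment to ask whether it belongs in this layer at all.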

The rule: separate "look" from "touch"

Reading is lower-risk and can be broad. Acting is higher-risk and should be narrow, explicit, and confirmable. Splitting them into two layers, a read-only data context and a constrained, human-confirmed action layer, is how you get an agent that's genuinely useful without handing it unsupervised write access. I'd expect that split to hold for almost any agent in production.
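The rule can also be written down as data and checked mechanically. The sketch below is one possible shape, not how the app does it: the `Capability` type, the capability names, and the `violations` check are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    name: str
    layer: str                # "data" (cloud, read-only) or "action" (on-device)
    changes_data: bool        # can this lead to a change in stored records?
    needs_confirmation: bool  # does a human have to sign off first?


# Hypothetical registry of everything the assistant is allowed to do.
CAPABILITIES = [
    Capability("read_records", "data", changes_data=False, needs_confirmation=False),
    Capability("open_screen", "action", changes_data=False, needs_confirmation=False),
    Capability("prefill_form", "action", changes_data=True, needs_confirmation=True),
]


def violations(caps: list[Capability]) -> list[str]:
    """Return a description of every capability that breaks the look/touch rule."""
    problems = []
    for cap in caps:
        if cap.layer == "data" and cap.changes_data:
            problems.append(f"{cap.name}: data layer must be read-only")
        if cap.changes_data and not cap.needs_confirmation:
            problems.append(f"{cap.name}: changes data without confirmation")
    return problems
```

A check like `violations(CAPABILITIES) == []` can run in a test suite, so adding a capability that writes without confirmation fails the build instead of slipping through review.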

This is the agent question every team hits eventually: not "can the model call tools," but "what is it allowed to do unsupervised, and how do I prove it." Worth getting right early.
