Show your work
What AI-traceable reasoning looks like inside a .deepcell.
"Can I trust an agent's number?"
It's the question every analyst eventually asks, usually right before a
deliverable goes out. The cell says 78.4%. The agent that produced it is
gone — context window flushed, chat archived, prompt forgotten. The model is
a snapshot. The reasoning evaporated when the tab closed.
A confident number with no audit trail is worse than no number at all. It leaks trust the moment someone in the meeting asks why.
The diagnostic
ChatGPT-in-a-spreadsheet is a fluent stranger sitting next to you. It will fill any cell you point at. What it won't do — what no chat-shaped tool can do — is leave behind a structured record of how it got there. A cell comment is a paragraph. A paragraph is not a graph. You can't query it. You can't diff it. You can't ask which assumptions, if they broke, would force the number to move.
The same problem exists without agents. An analyst leaves the team and
their model becomes archaeology. The formulas survive; the why doesn't.
We've all inherited a workbook whose tabs are named final_v3_USE_THIS and
spent a Tuesday reverse-engineering somebody's worldview from cell
references.
The fix isn't a better comment system. It's giving reasoning the same first-class treatment we give formulas.
The design move
A .deepcell carries a typed reasoning graph alongside the values.
Anatomy of a .deepcell covers the full
section list; the one we care about here is Reasoning. Four node
types:
- Claim: a load-bearing statement about the model. Seven kinds: thesis, risk, catalyst, counter, question, market_consensus, knowledge.
- Assumption: an input you're choosing to believe. Lifecycle: holding, uncertain, broken, superseded.
- Evidence: an anchor to something outside the model. A sourceUri (filing:, url:, doc:, deepcell:), an optional effectiveDate, an excerpt.
- Argument: a typed edge between any of the above. Eight relations: supports, refutes, depends_on, derives_from, variant_of, supersedes, contradicts, references.
Claims link to the cells they justify — by itemRefs, contextRefs, and
statusRef. The graph isn't decoration. It's the part of the file that
explains the rest of the file.
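For readers who think in types, the four node kinds can be sketched as plain Python dataclasses. This is an illustrative in-memory model only, not the .deepcell schema: the enum values come from the lists above, while the class layout, the snake_case field names, and the id on Evidence are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ClaimKind(Enum):
    THESIS = "thesis"
    RISK = "risk"
    CATALYST = "catalyst"
    COUNTER = "counter"
    QUESTION = "question"
    MARKET_CONSENSUS = "market_consensus"
    KNOWLEDGE = "knowledge"


class AssumptionStatus(Enum):
    HOLDING = "holding"
    UNCERTAIN = "uncertain"
    BROKEN = "broken"
    SUPERSEDED = "superseded"


class Relation(Enum):
    SUPPORTS = "supports"
    REFUTES = "refutes"
    DEPENDS_ON = "depends_on"
    DERIVES_FROM = "derives_from"
    VARIANT_OF = "variant_of"
    SUPERSEDES = "supersedes"
    CONTRADICTS = "contradicts"
    REFERENCES = "references"


@dataclass
class Claim:
    id: str
    kind: ClaimKind
    body: str
    # Links back to the cells the claim justifies.
    item_refs: List[str] = field(default_factory=list)
    context_refs: List[str] = field(default_factory=list)
    status_ref: Optional[str] = None


@dataclass
class Assumption:
    id: str
    status: AssumptionStatus
    body: str


@dataclass
class Evidence:
    id: str  # hypothetical: the text does not say Evidence carries an id
    source_uri: str
    excerpt: str
    effective_date: Optional[str] = None


@dataclass
class Argument:
    source: str  # id of the node the edge starts from
    rel: Relation
    target: str  # id of the node the edge points at


thesis = Claim(
    id="t_main_v2",
    kind=ClaimKind("thesis"),
    body="Margin reaches 78% by mid-2027.",
    item_refs=["Gross_Margin_Pct"],
    context_refs=["Q1_2027", "Q2_2027"],
    status_ref="projected",
)
edge = Argument(source="t_main_v2", rel=Relation("depends_on"), target="a_inf_cost")
```

Using enums rather than bare strings means a typo such as "depend_on" fails loudly at construction time instead of silently producing an edge nothing will ever traverse.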
Here's what a thesis looks like in the wild, slightly fictionalized from a real working model:
<Claim id="t_main_v2" kind="thesis" status="active" strength="high"
       itemRefs="Gross_Margin_Pct" contextRefs="Q1_2027,Q2_2027" statusRef="projected">
  <Label>2027 GM expansion thesis (v2)</Label>
  <Body>Margin reaches 78% by mid-2027; Q3 call confirmed steeper
  inference cost curve than expected.</Body>
</Claim>
<Assumption id="a_inf_cost" status="holding">
  <Body>Per-token inference cost falls 30%/yr.</Body>
</Assumption>
<Argument from="t_main_v2" rel="depends_on" to="a_inf_cost"/>
Three nodes, one edge. The thesis points at two projected quarters of
Gross_Margin_Pct. It depends on an assumption about inference cost
decay. If that assumption breaks, the thesis is the first thing that
should be re-examined.
That's the contract: every load-bearing number in the model has a path back to the assumptions that produced it.
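To make that contract concrete, here is a rough Python sketch that parses a fragment like the one above and reports any Claim that points at cells but has no path back to an Assumption. Several things here are assumptions, not documented behavior: the Reasoning wrapper element (added only so the fragment parses as a single document), the choice to follow depends_on edges alone, and both helper function names.

```python
import xml.etree.ElementTree as ET

# The <Reasoning> wrapper is hypothetical; it exists so the fragment has one root.
SNIPPET = """
<Reasoning>
  <Claim id="t_main_v2" kind="thesis" status="active" strength="high"
         itemRefs="Gross_Margin_Pct" contextRefs="Q1_2027,Q2_2027" statusRef="projected">
    <Label>2027 GM expansion thesis (v2)</Label>
    <Body>Margin reaches 78% by mid-2027.</Body>
  </Claim>
  <Assumption id="a_inf_cost" status="holding">
    <Body>Per-token inference cost falls 30%/yr.</Body>
  </Assumption>
  <Argument from="t_main_v2" rel="depends_on" to="a_inf_cost"/>
</Reasoning>
"""


def assumptions_behind(root, claim_id):
    """Return ids of Assumptions reachable from claim_id via depends_on edges."""
    assumption_ids = {a.get("id") for a in root.iter("Assumption")}
    deps = {}
    for arg in root.iter("Argument"):
        if arg.get("rel") == "depends_on":
            deps.setdefault(arg.get("from"), []).append(arg.get("to"))

    seen, found, stack = set(), set(), [claim_id]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guards against cycles in the graph
        seen.add(node)
        if node in assumption_ids:
            found.add(node)
        stack.extend(deps.get(node, []))
    return found


def unanchored_claims(root):
    """Claims that reference cells (itemRefs) but reach no Assumption at all."""
    return [
        claim.get("id")
        for claim in root.iter("Claim")
        if claim.get("itemRefs") and not assumptions_behind(root, claim.get("id"))
    ]


root = ET.fromstring(SNIPPET)
print(assumptions_behind(root, "t_main_v2"))  # {'a_inf_cost'}
print(unanchored_claims(root))                # []
```

An empty list from unanchored_claims is the passing case: every Claim that justifies a cell can be traced to at least one stated assumption.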
What you can do with it
The graph is queryable from the CLI:
deepcell assumption impact a_inf_cost
# → lists every Claim that depends on this Assumption, transitively
Useful when a quarterly print lands and you want to know, before lunch, which parts of your model the new data point touches.
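The word "transitively" is doing the work in that command. As a sketch of the idea (not the CLI's actual implementation, which is not shown here), a breadth-first walk over reversed depends_on edges is enough. The edge list is a stand-in for a parsed graph, and the ids t_fy27_eps and r_price_floor are hypothetical, added only to show a second hop and an edge that should be ignored.

```python
from collections import defaultdict, deque

# (from, rel, to) triples. Only the first comes from the example in the text;
# the other two use hypothetical ids for illustration.
ARGUMENTS = [
    ("t_main_v2", "depends_on", "a_inf_cost"),
    ("t_fy27_eps", "depends_on", "t_main_v2"),
    ("r_price_floor", "refutes", "a_inf_cost"),
]


def impacted_by(node_id, arguments):
    """Ids that depend on node_id directly or through a chain of depends_on edges."""
    # Reverse index: for each target, which nodes depend on it.
    dependents = defaultdict(list)
    for source, rel, target in arguments:
        if rel == "depends_on":
            dependents[target].append(source)

    seen = set()
    order = []
    queue = deque([node_id])
    while queue:
        current = queue.popleft()
        for dep in dependents[current]:
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order


print(impacted_by("a_inf_cost", ARGUMENTS))  # ['t_main_v2', 't_fy27_eps']
```

The refutes edge is skipped on purpose: a risk that argues against an assumption is not downstream of it, so it would not need re-examination when the assumption moves.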
The more interesting command is the one that runs at commit time:
deepcell reasoning-diff model.deepcell
# → after a working-tree edit, shows Claims now flagged as drift candidates
Flip a_inf_cost from holding to broken (say a vendor announces a
price floor) and reasoning-diff walks the graph and surfaces every
Claim downstream of that assumption as a drift candidate. Not
automatically falsified. Flagged. The analyst still decides whether the
thesis survives the new reality or needs a supersedes edge to a v3.
Wire that into a git pre-commit hook and your model can't quietly drift out from under its own thesis without somebody noticing.
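The drift check itself is easy to picture. The sketch below is an approximation under stated assumptions rather than how reasoning-diff is built: it compares two snapshots of Assumption statuses, treats only a move into broken as a trigger, and follows depends_on edges to collect the Claims that need a second look. The function names and the snapshot shape (a dict of id to status) are invented for the example.

```python
def newly_broken(before, after):
    """Assumption ids whose status is 'broken' now but was not before."""
    return sorted(
        aid
        for aid, status in after.items()
        if status == "broken" and before.get(aid) != "broken"
    )


def drift_candidates(before, after, arguments):
    """Map each downstream id to the newly broken Assumptions it rests on."""
    dependents = {}
    for source, rel, target in arguments:
        if rel == "depends_on":
            dependents.setdefault(target, []).append(source)

    flagged = {}
    for assumption in newly_broken(before, after):
        seen = set()
        stack = [assumption]
        while stack:
            current = stack.pop()
            for dep in dependents.get(current, []):
                if dep not in seen:
                    seen.add(dep)
                    flagged.setdefault(dep, []).append(assumption)
                    stack.append(dep)
    return flagged


before = {"a_inf_cost": "holding"}
after = {"a_inf_cost": "broken"}
edges = [("t_main_v2", "depends_on", "a_inf_cost")]

print(drift_candidates(before, after, edges))  # {'t_main_v2': ['a_inf_cost']}
```

A hook script built on something like this could exit non-zero whenever the result is non-empty, which is what would stop the commit. Note that the output is a list of things to review, in keeping with "flagged, not falsified": nothing here edits a Claim.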
History is also queryable:
deepcell claim history t_main_v2
# → every version of this Claim, with the Arguments that superseded it
The graph survives an analyst leaving. The next person to open the file gets the worldview, not just the worksheet.
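Version history falls out of the supersedes relation. Here is a minimal sketch of how a lineage could be reconstructed from the edges, assuming each supersedes Argument runs from the newer Claim to the older one and that versions form a single unbranched chain. Neither assumption is confirmed by the format description above, and the ids t_main_v1 and t_main_v3 are hypothetical.

```python
def claim_history(claim_id, arguments):
    """Return the chain of Claim ids containing claim_id, oldest first."""
    # Assumed direction: (newer, "supersedes", older).
    older = {src: dst for src, rel, dst in arguments if rel == "supersedes"}
    newer = {dst: src for src, dst in older.items()}

    # Rewind to the oldest ancestor, guarding against accidental cycles.
    oldest = claim_id
    seen = {oldest}
    while oldest in older and older[oldest] not in seen:
        oldest = older[oldest]
        seen.add(oldest)

    # Replay forward to the newest version.
    chain = [oldest]
    while chain[-1] in newer and newer[chain[-1]] not in chain:
        chain.append(newer[chain[-1]])
    return chain


edges = [
    ("t_main_v2", "supersedes", "t_main_v1"),
    ("t_main_v3", "supersedes", "t_main_v2"),
]

print(claim_history("t_main_v2", edges))  # ['t_main_v1', 't_main_v2', 't_main_v3']
```

Asking about any version returns the whole lineage, so a successor who only knows the current thesis id can still find where it started.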
Coexistence, not replacement
An xlsx can carry a cell comment. It can't carry a graph. That's not a flaw in Excel — comments were designed for the casual reader, the colleague flipping through the tab. They do that job well.
The reasoning graph is for the audit trail. The two can live side-by-side:
comments stay in the workbook for the casual reader, the graph stays in
the .deepcell for the auditor, the successor analyst, and the agent
running pre-commit checks. Bring your Excel model in, layer reasoning on
top, export back out when you need to.
One aside
The Reasoning section is the most recent addition to the format — the
spec went in last sprint and is the part of .deepcell most actively
evolving. The node kinds and relations above are stable; expect more
queries to land on top of them.
Who signs the work
A reasoning graph is not absolution. The agent can populate Claims and
Evidence at scale, but a Claim with strength="high" is still a claim a
human is making. The graph makes the claim legible — to a reviewer, to
a regulator, to the analyst who inherits the model in eighteen months.
It doesn't make the claim correct. See
the analyst and Claude for how we think about
that division of labor: the agent drafts, the analyst signs.
The number on the page is still yours. Now there's a paper trail behind it.
See it for yourself — open a sample .deepcell in the playground. Edit a value, watch the dependents recalculate, inspect the reasoning behind any number.