Anatomy of a .deepcell
Eight sections of plain XML. Open one in any editor and read it like prose.
The first question a careful person asks about a new file format is the correct one: what's inside, and can I get out?
Black-box formats are a moat for the vendor and a trap for the user. You can't audit what you can't read. You can't extract what you can't parse. And "trust us, it's structured" is not a promise worth accepting from anyone whose business model depends on retention.
So .deepcell is the opposite of that. Open one in any text editor. It's
XML. It's UTF-8. It diffs in git. cat works. There are eight sections,
each doing one job, and you can read them top to bottom like prose.
The matrix laying out the post you're reading right now is itself a
.deepcell. Sixteen blog posts as Items, six finance roles plus three
customer segments as Contexts, audience fit as the cell value. No
formulas, no reasoning graph — minimal schema, because that's all this
particular use case needed. We'll walk it section by section.
The eight sections
Header → StatusDefs → ContextDefs → ItemDefs →
CalcDefs → PresentationDefs → Values → Reasoning
CalcDefs and Reasoning are both optional. The matrix file leaves the first empty and omits the second.
A 3-statement model uses every section.
1. Header
Who, what, when. Free text, no schema games.
<Header>
<Title>DeepCell Blog Content Plan — Role x Segment Coverage Matrix</Title>
<Description>16 planned blog posts mapped against finance role and customer segment...</Description>
<Author>DeepCell content planning</Author>
<CreatedDate>2026-05-13</CreatedDate>
</Header>
If you remember nothing else, remember this: the title and description are plain text in a plain element. No proprietary encoding. No vendor magic.
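To make the "plain text in a plain element" point concrete, here is a minimal sketch that reads the Header with Python's standard library. The `<DeepCell>` root element name and the `read_header` helper are assumptions for illustration; this post does not show the real root element.

```python
import xml.etree.ElementTree as ET

# Header fields copied from this post, wrapped in a placeholder root.
# <DeepCell> is an assumed name; the real root element is not shown here.
# Title and Description are left out to keep the sample short; they are
# read exactly the same way.
SAMPLE = """<DeepCell>
  <Header>
    <Author>DeepCell content planning</Author>
    <CreatedDate>2026-05-13</CreatedDate>
  </Header>
</DeepCell>"""


def read_header(xml_text: str) -> dict[str, str]:
    """Return every child of <Header> as a tag -> text mapping."""
    root = ET.fromstring(xml_text)
    header = root.find("Header")
    if header is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in header}


header = read_header(SAMPLE)
print(header["Author"])
print(header["CreatedDate"])
```

Nothing here is specific to DeepCell tooling: any XML parser in any language would do the same job.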
2. StatusDefs
Every value carries a status. In a finance model that's the difference
between actual (reported), projected (forecast), forecast (an
alternate forward path), and any custom label your team needs. The calc
engine uses status to decide which values are visible to which scenario —
actuals override projections, and so on.
The matrix file only needs one:
<StatusDefinitions>
<Status statusId="fit">
<Label>Audience fit</Label>
</Status>
</StatusDefinitions>
One axis of meaning. That's all this document does.
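For a finance model with several statuses, the "actuals override projections" rule could be sketched as a simple precedence lookup. This is a toy under stated assumptions, not the engine's logic: only "actual beats projected" comes from the text above, while the position of `forecast`, the `PRECEDENCE` list, the `visible_value` helper, and the sample numbers are all made up for illustration.

```python
from typing import Optional

# Hypothetical precedence order: earlier entries win. Only "actual beats
# projected" is stated in the post; where "forecast" ranks is an assumption.
PRECEDENCE = ["actual", "projected", "forecast"]


def visible_value(candidates: dict[str, float]) -> Optional[tuple[str, float]]:
    """Pick the value whose status ranks highest in PRECEDENCE."""
    for status in PRECEDENCE:
        if status in candidates:
            return status, candidates[status]
    return None


# One cell holding both a reported and a projected figure (made-up numbers).
cell = {"projected": 118.0, "actual": 121.5}
print(visible_value(cell))
```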
3. ContextDefs
The time-and-scenario axis. In a DCF this is where Q1_2027, Q2_2027,
base_case, bear_case live. In the matrix, the "context" axis isn't time
— it's audiences:
<ContextDefinitions>
<Context contextId="SS" level="0" statusRef="fit">
<Label>Sell-side analyst</Label>
</Context>
<Context contextId="BS" level="0" statusRef="fit">
<Label>Buy-side analyst (public markets)</Label>
</Context>
<Context contextId="PV" level="0" statusRef="fit">
<Label>PE / VC analyst</Label>
</Context>
<!-- FPA, CF, IB, Individual, Team, Enterprise ... -->
</ContextDefinitions>
A finance model lays its periods on this same axis. The shape is the same; the labels change.
4. ItemDefs
The row axis. The line items. In a model these are Revenue, COGS,
Gross_Margin — typed, hierarchical (level 0 through 3), with parent
references that the engine uses to validate roll-ups. In the matrix they
are blog posts:
<Item itemId="post_03" order="30" level="0">
<Label>[T1] Anatomy of a .deepcell</Label>
<Description>What's inside the file? Am I locked in?</Description>
<DataType>category</DataType>
</Item>
level="0" means top-level. A consolidated P&L would put Revenue at
level 0, Product_Revenue and Service_Revenue at level 1, and the
SKU-level lines under those. The hierarchy is enforced: you can't post a
value to a parent if it has children doing the work.
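The roll-up rule can be sketched as a check over the item definitions. The `parentRef` attribute and the `<ItemDefinitions>` wrapper are assumed names (the post mentions parent references but does not show how they are spelled), and `can_post_value` is a hypothetical helper, not part of any DeepCell API.

```python
import xml.etree.ElementTree as ET

# Hypothetical item definitions using the P&L example from the text.
# parentRef and <ItemDefinitions> are assumed names, not confirmed syntax.
ITEMS = """<ItemDefinitions>
  <Item itemId="Revenue" level="0"/>
  <Item itemId="Product_Revenue" level="1" parentRef="Revenue"/>
  <Item itemId="Service_Revenue" level="1" parentRef="Revenue"/>
</ItemDefinitions>"""


def parents_with_children(xml_text: str) -> set[str]:
    """Collect the ids of items that at least one other item points to."""
    root = ET.fromstring(xml_text)
    return {
        item.get("parentRef")
        for item in root.iter("Item")
        if item.get("parentRef")
    }


def can_post_value(item_id: str, xml_text: str) -> bool:
    """A value may be authored directly only on items with no children."""
    return item_id not in parents_with_children(xml_text)


print(can_post_value("Product_Revenue", ITEMS))  # leaf item
print(can_post_value("Revenue", ITEMS))          # parent with children
```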
5. CalcDefs
The matrix has none. It's a literal table; every cell was authored
directly. So <CalculationDefinitions/> sits empty:
<CalculationDefinitions/>
In a 3-statement model this section is where the document earns its name. A calculation looks roughly like this (illustrative; your model will have hundreds):
<Calculation itemRef="Gross_Margin"
formula="Revenue - COGS"
scenarioRef="base_case"/>
The engine ships NPV, IRR, SUMIF, IF, and about thirty other functions, resolves the dependency DAG, detects cycles, and recomputes on edit. The point isn't the function library; Excel has more. The point is that the formulas live in their own section, separately addressable from the values they produce. That's what makes them auditable.
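Dependency resolution and cycle detection over formulas like the one above can be sketched with Python's standard `graphlib`. This is a toy, not the engine: the `Gross_Margin_Pct` formula, the regex-based identifier extraction, the sample inputs, and the use of `eval` are all assumptions made to keep the example short.

```python
import re
from graphlib import CycleError, TopologicalSorter

# Formulas keyed by the item they produce. Gross_Margin comes from the post;
# Gross_Margin_Pct is made up to give the graph a second level.
FORMULAS = {
    "Gross_Margin": "Revenue - COGS",
    "Gross_Margin_Pct": "Gross_Margin / Revenue",
}

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def calc_order(formulas: dict[str, str]) -> list[str]:
    """Return calculated items in an order where inputs come before outputs."""
    graph = {item: set(IDENTIFIER.findall(expr)) for item, expr in formulas.items()}
    # Iterating static_order() raises CycleError if formulas form a loop.
    order = TopologicalSorter(graph).static_order()
    return [item for item in order if item in formulas]


def recompute(formulas: dict[str, str], inputs: dict[str, float]) -> dict[str, float]:
    """Evaluate each formula once, in dependency order."""
    values = dict(inputs)
    for item in calc_order(formulas):
        # eval is tolerable in a toy with trusted formulas; a real engine
        # would parse the expression instead.
        values[item] = eval(formulas[item], {"__builtins__": {}}, values)
    return values


result = recompute(FORMULAS, {"Revenue": 200.0, "COGS": 80.0})
print(result["Gross_Margin"], result["Gross_Margin_Pct"])

try:
    calc_order({"A": "B + 1", "B": "A + 1"})
except CycleError:
    print("cycle detected")
```

Because the formulas sit in their own mapping, apart from the values, the dependency graph can be inspected without evaluating anything.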
6. PresentationDefs
How the data is laid out for rendering. One sheet, one block, in this case:
<PresentationDefinitions>
<Sheet sheetId="matrix" name="Coverage Matrix">
<Block blockId="matrix_block"
name="Audience fit by post"
blockType="table"
xAxis="timecontext"
yAxis="item"
itemOrders="10-160"
contextRefs="SS,BS,PV,FPA,CF,IB,Individual,Team,Enterprise" />
</Sheet>
</PresentationDefinitions>
This is what tells the renderer (web grid, Excel add-in, or exporter; the same intermediate representation feeds all three) that items go down, contexts go across, and the block runs from order 10 through 160. Presentation is separate from data. Re-pivoting is a presentation edit, not a model edit.
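A small sketch of why re-pivoting leaves the data alone: the same cell dictionary is laid out two ways depending on which dimension runs across. The `render` function is hypothetical, and treating `"item"` as a valid `xAxis` value is an assumption; the post only shows `xAxis="timecontext"` with `yAxis="item"`.

```python
# Cell values keyed by (item, context), taken from the post_01 row in this post.
VALUES = {
    ("post_01", "SS"): "primary",
    ("post_01", "BS"): "primary",
    ("post_01", "FPA"): "secondary",
}


def render(values: dict, x_axis: str, items: list[str], contexts: list[str]) -> list[list[str]]:
    """Build a grid of rows; x_axis names the dimension that runs across."""
    if x_axis == "timecontext":
        # Items go down, contexts go across (the layout the Block above requests).
        header = [""] + contexts
        body = [[i] + [values.get((i, c), "") for c in contexts] for i in items]
    else:
        # Swapped layout: contexts go down, items go across.
        header = [""] + items
        body = [[c] + [values.get((i, c), "") for i in items] for c in contexts]
    return [header] + body


wide = render(VALUES, "timecontext", ["post_01"], ["SS", "BS", "FPA"])
tall = render(VALUES, "item", ["post_01"], ["SS", "BS", "FPA"])
for row in wide:
    print(row)
for row in tall:
    print(row)
```

`VALUES` is never modified; only the argument that stands in for the presentation attribute changes.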
7. Values
The cells themselves. Grouped by item and status:
<ItemGroup itemRef="post_01" statusRef="fit">
<Value contextRef="SS">primary</Value>
<Value contextRef="BS">primary</Value>
<Value contextRef="PV">primary</Value>
<Value contextRef="FPA">secondary</Value>
<Value contextRef="CF">primary</Value>
<Value contextRef="IB">secondary</Value>
<Value contextRef="Individual">primary</Value>
<Value contextRef="Team">primary</Value>
<Value contextRef="Enterprise">primary</Value>
</ItemGroup>
Read that block out loud and you have the spec for one row of the matrix. "Post 1 is a primary target for sell-side, buy-side, PE/VC, controllers, individuals, teams, and enterprises; a secondary target for FP&A and deal analysts." The file tells you exactly that, in exactly that order.
Every fact in a .deepcell is one of these <Value> elements, addressed
by Item × Time × Scenario × Status. Four dimensions. No hidden state.
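The addressing scheme can be sketched as a dictionary keyed by the attributes that appear in the file. In this sketch the context id stands in for both the time and scenario dimensions, since the matrix uses a single `contextRef`; `index_values` is a hypothetical helper and the sample is a shortened copy of the group above.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the post_01 group shown above.
GROUP = """<ItemGroup itemRef="post_01" statusRef="fit">
  <Value contextRef="SS">primary</Value>
  <Value contextRef="FPA">secondary</Value>
  <Value contextRef="IB">secondary</Value>
</ItemGroup>"""


def index_values(xml_text: str) -> dict[tuple[str, str, str], str]:
    """Map (item, context, status) to the cell text for one ItemGroup."""
    group = ET.fromstring(xml_text)
    item = group.get("itemRef")
    status = group.get("statusRef")
    return {
        (item, value.get("contextRef"), status): (value.text or "").strip()
        for value in group.iter("Value")
    }


cells = index_values(GROUP)
print(cells[("post_01", "FPA", "fit")])
```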
8. Reasoning
The matrix doesn't carry a reasoning graph. A real model does — typed nodes for Claims, Assumptions, Evidence, and Arguments that connect a number to the conviction behind it. The why for the discount rate. The source for the growth curve. The argument graph that says this catalyst supports that thesis, and here's the filing it cites.
That's a long enough subject to get its own post. "Show your work" covers it.
Am I locked in?
No. deepcell to-excel writes an xlsx anytime — formulas preserved if you
want them, values-only if you don't. The reasoning graph and the version
history don't survive the round-trip (Excel has nowhere to put them), but
the model itself comes out clean. You can leave whenever you want.
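Because the file is plain XML, there is also an exit that needs no DeepCell tooling at all. The sketch below flattens values to CSV with the Python standard library. It is independent of the `deepcell to-excel` command, the `<Values>` wrapper element name is an assumption based on the section name, and `to_csv` is a hypothetical helper.

```python
import csv
import io
import xml.etree.ElementTree as ET

# <Values> as the wrapper name is assumed; the rows come from this post.
DOC = """<Values>
  <ItemGroup itemRef="post_01" statusRef="fit">
    <Value contextRef="SS">primary</Value>
    <Value contextRef="FPA">secondary</Value>
  </ItemGroup>
</Values>"""


def to_csv(xml_text: str) -> str:
    """Flatten every <Value> into one CSV row: item, status, context, value."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["item", "status", "context", "value"])
    for group in root.iter("ItemGroup"):
        for value in group.iter("Value"):
            writer.writerow([
                group.get("itemRef"),
                group.get("statusRef"),
                value.get("contextRef"),
                (value.text or "").strip(),
            ])
    return out.getvalue()


print(to_csv(DOC))
```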
That's the difference between a file format and a platform. We picked the file format on purpose. The full argument is in "why a new file format".
See it for yourself — open a sample .deepcell in the playground. Edit a value, watch the dependents recalculate, inspect the reasoning behind any number.