Cognitive Graphs: A General Architecture for Replayable Reasoning
A Cognitive Graph is an enhanced chat log: one that doesn’t just record what was said, but structurally preserves how thoughts, decisions, alternatives, and artifacts evolve over time, making the whole history replayable and auditable.
TL;DR: A Cognitive Graph turns AI-assisted work from lossy chat logs into event-sourced, replayable reasoning state. Instead of preserving only the final output, it preserves the decisions, alternatives, rationales, mutations, hashes, and snapshots that explain how the output came to exist.
Most AI tools remember outputs.
They do not remember how those outputs came to exist.
A user asks a question, receives an answer, edits it, compares it with alternatives, rejects parts, accepts others, and moves on. The final artifact survives, but the reasoning path usually disappears.
That loss matters.
In serious work, the path is often as valuable as the artifact. A code patch matters, but so do the tests, rejected approaches, tradeoffs, review comments, and final decision. A written paragraph matters, but so do the variants, constraints, editorial judgments, and reasons it was chosen. A research claim matters, but so do the sources, objections, confidence levels, and competing interpretations behind it.
A Cognitive Graph is a way to preserve that process.
It is a graph-based, event-sourced structure for storing how thoughts evolve into artifacts.
The key idea is simple:
The final artifact matters.
The reasoning path that produced it often matters more.
This post describes the architecture as a general pattern. I will use Writer, my AI-assisted writing system, as the implementation example later, but the idea is not limited to writing.
A Cognitive Graph can be used for:
- code review
- research synthesis
- legal reasoning
- policy analysis
- agent workflows
- human-AI collaboration
- long-running design decisions
The common structure is similar in each case:
```mermaid
graph LR
    intent[💡 Intent]
    alternatives[🔀 Alternatives]
    critique[🔍 Critique]
    revision[✏️ Revision]
    decision[✅ Decision]
    rationale[📖 Rationale]
    artifact[🏆 Artifact]

    intent --> alternatives
    alternatives --> critique
    critique --> revision
    revision --> decision
    decision --> rationale
    rationale --> artifact
    rationale -.->|🔄 revisit| intent

    classDef start fill:#e1f5fe,stroke:#01579b,stroke-width:2px,color:#000
    classDef process fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000
    classDef decided fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#000
    classDef final fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#000

    class intent start
    class alternatives,critique,revision,rationale process
    class decision decided
    class artifact final
```
The goal is to make that structure durable, inspectable, and replayable.
The Problem: AI Workflows Lose Their Most Valuable Data
Most AI-assisted workflows are lossy.
A user asks for help. The model produces something. The user keeps part of it, rejects part of it, asks for a revision, compares it with another answer, accepts a final version, and moves on.
The final artifact survives.
The reasoning path usually disappears.
That means the system loses:
- why one variant was preferred over another
- which alternatives were explicitly rejected
- which instruction led to the useful output
- which branch failed
- which decision was human-made
- how the artifact evolved across multiple attempts
- whether the same reasoning could be replayed later
This problem is not limited to writing.
In code, the final patch survives, but the failed approaches, benchmark results, review comments, and tradeoffs often disappear.
In research, the final summary survives, but the rejected interpretations, source conflicts, and confidence judgments often disappear.
In design, the final decision survives, but the alternatives and constraints that shaped it are scattered across chat logs, tickets, documents, and memory.
The issue is not that AI systems cannot generate useful outputs. They can.
The issue is that most AI systems do not preserve the evolution of the work.
A typical AI workflow looks like this:
```python
answer = model.generate(prompt)

# User edits the answer.
# User asks another model.
# User rejects part of it.
# User accepts a final version.

save(final_answer)
```
That gives you the artifact.
It does not give you the reasoning history.
A Cognitive Graph changes the unit of storage. Instead of only saving the final output, it records the transitions that produced it:
```python
graph.append_node(type="intent", content="Improve this paragraph")

draft = graph.append_node(type="draft", content=first_answer)
revision = graph.append_node(type="revision", content=second_answer)
graph.link(draft, revision, type="replaces")

decision = graph.append_node(
    type="decision",
    content="Accepted the revision because it was clearer and shorter."
)
graph.link(decision, revision, type="accepts")
```
This is still simplified, but the shift is important.
The system no longer only knows:
Here is the final answer.
It also knows:
- Here was the intent.
- Here was the first attempt.
- Here was the revision.
- Here is what replaced what.
- Here is the decision.
- Here is why the decision was made.
That is the difference between storing output and storing cognition.
The core question becomes:
Can the system preserve not just the artifact, but the path that produced it?
A Cognitive Graph is one answer to that question.
It turns the hidden evolution of AI-assisted work into structured, replayable state.
The Core Idea: Reasoning as a Graph
A Cognitive Graph treats reasoning as a graph, not a transcript.
A transcript is chronological.
A graph is causal.
A transcript can tell you what was said. A graph can tell you what changed, what caused the change, which alternatives existed, and which path became canonical.
Instead of storing this:
```
prompt
  → answer
  → discarded context
```

we store this:

```
intent
  → draft
  → variant
  → critique
  → revision
  → decision
  → rationale
  → artifact
```
Every step can become graph state.
At the simplest level, a Cognitive Graph needs only a few primitives:
```python
from dataclasses import dataclass

@dataclass
class Node:
    id: str
    type: str
    content: str
    metadata: dict

@dataclass
class Edge:
    id: str
    source_id: str
    target_id: str
    type: str
    metadata: dict
```
A node is a unit of thought.
An edge is the relationship between thoughts.
For example:
```python
intent = graph.add_node(
    type="intent",
    content="Make this explanation clearer for a technical reader.",
)

draft = graph.add_node(
    type="draft",
    content="The system stores conversation history.",
)

revision = graph.add_node(
    type="revision",
    content="The system preserves the evolution of decisions, alternatives, and artifacts.",
)

graph.add_edge(
    source_id=draft.id,
    target_id=revision.id,
    type="replaces",
)

graph.add_edge(
    source_id=intent.id,
    target_id=revision.id,
    type="guides",
)
```
Now the system does not merely know the final sentence.
It knows:
- what the user wanted
- what the first attempt said
- what replaced it
- which intent guided the replacement
That is already more useful than a transcript.
But a Cognitive Graph becomes much more powerful when we add branches, events, snapshots, and decisions.
What This Is Not
| Chat Log / Transcript | Cognitive Graph |
|---|---|
| Chronological | Causal |
| Stores outputs | Stores transitions |
| Hard to replay | Event-sourced and deterministic |
| Passive memory | Executable memory |
| Context evaporates | Lineage persists |
A Cognitive Graph is not just a vector database. A vector database can retrieve similar content, but it does not usually preserve the causal evolution of decisions.
A Cognitive Graph is not just a knowledge graph. A knowledge graph stores entities and relationships. A Cognitive Graph adds mutation events, lifecycle state, replay, snapshots, hashes, and decision provenance.
A Cognitive Graph is not a chain-of-thought transcript. It is an external, structured, replayable system for preserving reasoning state.
The Basic Model
A practical Cognitive Graph has seven core parts:
```
CognitiveGraph
├── Nodes
├── Edges
├── Branches
├── Events
├── Snapshots
├── Decisions
└── Artifacts
```
Each one has a different job.
| Part | Purpose |
|---|---|
| Node | Stores a unit of thought, text, code, evidence, decision, or rationale |
| Edge | Stores a typed relationship between nodes |
| Branch | Stores an alternate reasoning path |
| Event | Stores an append-only mutation to the graph |
| Snapshot | Freezes a validated graph state |
| Decision | Records a selection among alternatives |
| Artifact | Stores an output that survived the process |
Example node types:
- `intent`
- `claim`
- `draft`
- `variant`
- `critique`
- `revision`
- `test_result`
- `review_note`
- `decision`
- `constitutional_reasoning`
- `artifact`
Example edge types:
- `refines`
- `replaces`
- `supports`
- `contradicts`
- `accepts`
- `rejects`
- `justifies`
- `depends_on`
- `canonicalizes`
This gives us a compact language for modeling reasoning.
A rewritten sentence is not just new text. It is a node that may replace another node.
A review comment is not just a note. It is a node that may critique a variant.
A decision is not just a label. It is a node that may accept one path and reject another.
A final artifact is not just a file. It is the canonical descendant of a reasoning process.
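The later snippets treat `graph` as a small container over these parts. A minimal sketch of that container, assuming the `Node` and `Edge` dataclasses shown earlier (the `CognitiveGraph` class and its `add_node`/`add_edge` signatures here are illustrative assumptions, not a fixed API):

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Node:
    id: str
    type: str
    content: str
    metadata: dict = field(default_factory=dict)

@dataclass
class Edge:
    id: str
    source_id: str
    target_id: str
    type: str
    metadata: dict = field(default_factory=dict)

@dataclass
class CognitiveGraph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def add_node(self, type: str, content: str) -> Node:
        # Every unit of thought gets its own durable identity.
        node = Node(id=str(uuid.uuid4()), type=type, content=content)
        self.nodes.append(node)
        return node

    def add_edge(self, source_id: str, target_id: str, type: str) -> Edge:
        # Relationships between thoughts are first-class records, too.
        edge = Edge(id=str(uuid.uuid4()), source_id=source_id,
                    target_id=target_id, type=type)
        self.edges.append(edge)
        return edge
```

Anything beyond this (branches, events, snapshots) layers on top of the same two lists.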
The Build Arc: From Storage to Runtime
A Cognitive Graph does not need to emerge fully formed.
It can be built in layers. Each layer turns passive records into something more executable:
| Layer | Focus | What Changes |
|---|---|---|
| Graph substrate | Nodes, edges, branches | Reasoning becomes structured instead of linear |
| Validation | Types, references, lifecycle | The graph can reject invalid cognitive state |
| Hashing | Structure/content/state identity | Reasoning states become comparable |
| Snapshots | Frozen graph checkpoints | Cognitive state can be preserved and restored |
| Mutations | Explicit units of change | The system can describe what changed |
| Event sourcing | Append-only event stream | State can be reconstructed from history |
| Replay verification | Replay vs snapshot | The history becomes executable and auditable |
| Event-first execution | Events become authoritative | State is derived from recorded mutations |
| Decisions | Human/editor choices | Judgment becomes graph state |
| Why Graphs | Structured rationales | Reasons become replayable state |
Why a Graph Instead of a Chat Log?
Because serious work does not move in a straight line.
A conversation can fork.
One branch may explore the idea as prose. Another as code. Another may test assumptions. Another may produce a simpler explanation for readers. All of those branches may start from the same intent.
```mermaid
graph LR
    intent[💡 Intent]
    prose_path[📝 Prose Path]
    code_path[💻 Code Path]
    test_path[🧪 Test Path]
    revision[✏️ Revision]
    patch[🔧 Patch]
    benchmark[📊 Benchmark]
    decision[✅ Decision]
    artifact[🏆 Artifact]

    intent --> prose_path
    intent --> code_path
    intent --> test_path
    prose_path --> revision
    code_path --> patch
    test_path --> benchmark
    revision --> decision
    patch --> decision
    benchmark --> decision
    decision --> artifact

    classDef start fill:#e1f5fe,stroke:#01579b,stroke-width:2px,color:#000
    classDef branch fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000
    classDef action fill:#fff8e1,stroke:#f57f17,stroke-width:2px,color:#000
    classDef converge fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#000
    classDef result fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#000

    class intent start
    class prose_path,code_path,test_path branch
    class revision,patch,benchmark action
    class decision converge
    class artifact result
```
A chat log flattens this structure.
A Cognitive Graph preserves it.
This matters because rejected branches are not always useless. A failed implementation may explain why a safer design was chosen. A rejected paragraph may contain the sentence that later becomes the opening. A failed hypothesis may become evidence for a better one. A graph lets the system keep those paths without pretending they all belong to one linear conversation.
But the difference runs deeper than forking. It changes the very nature of the record.
| | Chat Log | Cognitive Graph |
|---|---|---|
| Organisation | Chronological | Causal |
| Content | Outputs (what was said) | Transitions (what changed, why) |
| Reproducibility | Hard to replay | Event‑sourced and replayable |
| Evolution captured | A flat sequence | Branches, decisions, alternatives, lineage |
| Memory type | Passive memory | Executable memory |
That table is not just a comparison; it is the definitional boundary. A chat log can tell you what happened. A Cognitive Graph tells you what changed, who decided, which path survived, and why. It turns collaboration history into something you can audit, replay, and build on.
Validation: Turning a Graph Into a Runtime
A graph by itself is only structure.
It can store nodes. It can store edges. It can store branches.
But storage is not enough.
If a Cognitive Graph is going to preserve reasoning, it needs rules. Otherwise it becomes another pile of loosely connected records.
For example:
Can a rejected branch still receive new nodes?
Can an edge point to a missing node?
Can a graph with invalid references be snapshotted?
Can two nodes have the same identity?
Can a replay continue if the graph is structurally broken?
If the answer to those questions is vague, the graph cannot be trusted.
So the next step is validation.
A Cognitive Graph needs to know when it is valid, when it is unsafe to mutate, and when it is safe to freeze, replay, or compare.
At minimum, validation should check:
- valid node types
- valid edge types
- valid branch states
- missing node references
- duplicate IDs
- invalid artifact references
- cycle rules
- lifecycle constraints
A very small version might look like this:
```python
ALLOWED_NODE_TYPES = {
    "intent",
    "draft",
    "variant",
    "revision",
    "decision",
    "reasoning",
    "artifact",
}

ALLOWED_EDGE_TYPES = {
    "refines",
    "replaces",
    "supports",
    "rejects",
    "justifies",
    "accepted_into",
}

def validate_graph(nodes, edges):
    node_ids = {node.id for node in nodes}
    errors = []

    for node in nodes:
        if node.type not in ALLOWED_NODE_TYPES:
            errors.append(f"Invalid node type: {node.type}")

    for edge in edges:
        if edge.type not in ALLOWED_EDGE_TYPES:
            errors.append(f"Invalid edge type: {edge.type}")
        if edge.source_id not in node_ids:
            errors.append(f"Missing source node: {edge.source_id}")
        if edge.target_id not in node_ids:
            errors.append(f"Missing target node: {edge.target_id}")

    return errors
```
That is not complicated.
But it changes the nature of the system.
The graph is no longer just a place where reasoning is stored. It becomes a state object with integrity rules.
Now we can say:
This graph is valid.
This branch can be mutated.
This snapshot is safe to create.
This replay is structurally trustworthy.
That is the first step from memory toward runtime.
Branch Lifecycle
Validation also needs to apply to branches.
A branch is not just a folder for nodes. It is a reasoning path with lifecycle state.
A simple branch lifecycle might be:
- `active`
- `merged`
- `abandoned`
- `canonical`
Each state should mean something.
For example:
```python
def can_append_to_branch(branch):
    return branch.status == "active"
```
That sounds almost too simple, but it prevents a subtle class of errors.
If a branch was abandoned, and the system later appends new reasoning to it accidentally, the history becomes ambiguous. Did the branch really fail? Was it reopened? Did a later process corrupt old state?
Lifecycle rules remove that ambiguity.
- `active` → can receive new nodes
- `merged` → preserved but no longer independently mutating
- `abandoned` → preserved as history, not extended
- `canonical` → accepted as part of the final path
This matters because rejected or abandoned work is still valuable.
A failed branch may explain why a decision was made. A rejected variant may later become evidence. An abandoned implementation may document a constraint.
The system should preserve those branches, but it should not accidentally keep mutating them as if they were still live.
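One way to enforce that is to make the lifecycle itself data. A minimal sketch, assuming a `Branch` with a `status` field (the transition table here is illustrative, not a fixed spec):

```python
from dataclasses import dataclass

# Which states a branch may legally move to; terminal states cannot be reopened.
ALLOWED_TRANSITIONS = {
    "active": {"merged", "abandoned", "canonical"},
    "merged": set(),
    "abandoned": set(),
    "canonical": set(),
}

@dataclass
class Branch:
    id: str
    status: str = "active"

def transition_branch(branch: Branch, new_status: str) -> None:
    # Reject any transition the lifecycle does not allow,
    # e.g. reviving an abandoned branch.
    allowed = ALLOWED_TRANSITIONS.get(branch.status, set())
    if new_status not in allowed:
        raise ValueError(
            f"Illegal branch transition: {branch.status} -> {new_status}"
        )
    branch.status = new_status

def can_append_to_branch(branch: Branch) -> bool:
    # Only live reasoning paths may receive new nodes.
    return branch.status == "active"
```

With this in place, an abandoned branch stays readable forever but can never silently start mutating again.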
Traversal: Asking Questions of the Graph
Once a graph is valid, we can traverse it.
Traversal is what lets the system ask useful questions:
- What led to this decision?
- Which drafts did this artifact descend from?
- What did this variant replace?
- Which reasoning node justified this choice?
- Which branches contributed to the final artifact?
A minimal traversal function might look like this:
```python
def ancestors(node_id, edges):
    parents = {}
    for edge in edges:
        parents.setdefault(edge.target_id, []).append(edge.source_id)

    result = []
    stack = [node_id]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in result:
                result.append(parent)
                stack.append(parent)
    return result
```
With that, a final artifact is no longer just a blob of text or code.
It becomes the endpoint of a path.
```
intent
  → draft
  → critique
  → revision
  → decision
  → canonical artifact
```
That path is what makes the artifact explainable.
Without validation, snapshots can preserve broken state.
Without lifecycle rules, branches lose meaning.
Without traversal, lineage is just data you cannot query.
This is the point where the Cognitive Graph starts to become a runtime.
Not because it can generate anything.
Because it can enforce and inspect the structure of thought.
Deterministic Identity: Hashing Cognitive State
Once the graph can be validated, it needs identity.
Not just a database ID.
A database ID tells you where something is stored. It does not tell you what the thing is.
For a Cognitive Graph, that distinction matters.
Two graphs might live in different databases but represent the same reasoning. Two branches might have the same structure but different content. Two snapshots might contain the same artifact but different lifecycle state.
So the graph needs a way to say:
This reasoning state is identical to that reasoning state.
or:
This reasoning state has changed.
That is where deterministic hashing becomes useful.
A Cognitive Graph benefits from more than one identity:
| Hash | Meaning | Example Question |
|---|---|---|
| `structure` | The topology of the graph | Did the reasoning path change shape? |
| `content` | The semantic payload | Did the text, code, claim, or artifact change? |
| `state` | The exact runtime state | Is this graph exactly the same checkpoint? |
This distinction is important.
A graph can have the same structure but different content:
intent → draft → revision → decision
That path may be identical across two runs, even if the actual draft text differs.
A graph can have the same content but different runtime state. For example, the same artifact may exist in two graphs with different graph IDs, branch statuses, or evaluation scores.
So instead of treating identity as one thing, we separate it:
- structure hash = how it is connected
- content hash = what it says
- state hash = exact runtime identity
This leads to one of the most important distinctions in the system:
- `graph_id` = storage identity
- `hash` = cognitive identity
A graph_id is useful for lookup.
A hash is useful for comparison.
If two systems produce the same content hash, they may have arrived at the same reasoning payload even if they created it in different places.
If a snapshot’s state hash changes, the graph has drifted.
If a replay produces the same content hash but a different state hash, the reasoning content may match while runtime identity differs.
That is exactly the kind of distinction a serious reasoning system needs.
A Minimal Hashing Example
A tiny version might look like this:
```python
import hashlib
import json

def stable_json(value: dict) -> str:
    return json.dumps(
        value,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )

def sha256(value: dict) -> str:
    return hashlib.sha256(stable_json(value).encode("utf-8")).hexdigest()

def structure_projection(graph):
    return {
        "nodes": sorted(
            [{"id": n.id, "type": n.type} for n in graph.nodes],
            key=lambda n: n["id"],
        ),
        "edges": sorted(
            [
                {
                    "source": e.source_id,
                    "target": e.target_id,
                    "type": e.type,
                }
                for e in graph.edges
            ],
            key=lambda e: (e["source"], e["target"], e["type"]),
        ),
    }

structure_hash = sha256(structure_projection(graph))
```
The important part is not the specific code.
The important part is the rule:
Same graph state → same hash.
Different graph state → different hash.
To make that true, the hash input must be canonical:
- sort keys
- sort lists
- remove timestamps
- remove volatile metadata
- preserve meaningful content
Without canonicalization, hashes become noisy. The system starts detecting accidental formatting differences instead of real cognitive differences.
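One way to sketch that canonicalization step before hashing content (the `VOLATILE_KEYS` set and the dict-shaped nodes here are illustrative assumptions):

```python
import hashlib
import json

# Metadata keys treated as volatile for hashing purposes (illustrative list).
VOLATILE_KEYS = {"created_at", "updated_at", "session_id"}

def canonical_node(node: dict) -> dict:
    # Keep the meaningful content; drop timestamps and other volatile metadata.
    metadata = {
        k: v
        for k, v in sorted(node.get("metadata", {}).items())
        if k not in VOLATILE_KEYS
    }
    return {
        "id": node["id"],
        "type": node["type"],
        "content": node["content"],
        "metadata": metadata,
    }

def content_hash(nodes: list) -> str:
    # Sort the nodes and the keys so the same payload always serializes the same way.
    payload = sorted((canonical_node(n) for n in nodes), key=lambda n: n["id"])
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()
```

Two nodes that differ only in `created_at` now hash identically; a one-character content change does not.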
Hashing turns the Cognitive Graph from a structure into something that can be verified.
Now the system can ask:
- Did this branch actually change?
- Does this replay match the snapshot?
- Did this artifact drift?
- Are these two reasoning paths equivalent?
- Can I deduplicate this result?
This is the foundation for snapshots, replay verification, deduplication, branch comparison, cache keys, and audit trails.
Once a graph has deterministic identity, it becomes possible to freeze it and prove whether it changed later.
Snapshots: Freezing Cognitive State
Once a graph has deterministic identity, you can freeze it.
That is what a snapshot does.
A snapshot is not just a backup. A backup says:
Here is a copy of some data.
A Cognitive Graph snapshot says:
Here is a validated reasoning state,
with structure, content, and state hashes,
captured at this point in time.
That distinction matters.
A useful snapshot stores more than the graph payload. It should also store the hashes that prove what was captured:
```python
@dataclass
class GraphSnapshot:
    snapshot_id: str
    graph_id: str
    structure_hash: str
    content_hash: str
    state_hash: str
    node_count: int
    edge_count: int
    branch_count: int
    artifact_count: int
    graph_state_json: dict
```
The graph_state_json field contains the frozen state needed for restoration or inspection. The hashes tell us whether that state still matches what was originally captured.
In other words:
- payload = what we froze
- hashes = proof of what we froze
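That pairing can be sketched with a single state hash standing in for the three hashes (the `create_snapshot` and `snapshot_is_intact` helpers are assumptions for illustration):

```python
import hashlib
import json
import uuid
from dataclasses import dataclass

def _sha256(value: dict) -> str:
    blob = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()

@dataclass(frozen=True)
class GraphSnapshot:
    snapshot_id: str
    graph_id: str
    state_hash: str
    graph_state_json: dict

def create_snapshot(graph_id: str, graph_state: dict) -> GraphSnapshot:
    # Freeze the payload and record the hash that proves what was frozen.
    return GraphSnapshot(
        snapshot_id=str(uuid.uuid4()),
        graph_id=graph_id,
        state_hash=_sha256(graph_state),
        graph_state_json=graph_state,
    )

def snapshot_is_intact(snapshot: GraphSnapshot) -> bool:
    # Recompute the hash over the stored payload and compare with the proof.
    return _sha256(snapshot.graph_state_json) == snapshot.state_hash
```

If anything touches the frozen payload after the fact, `snapshot_is_intact` reports the drift.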
A snapshot should also have provenance.
The system should know why it was created:
- before canonicalization
- before merging branches
- after human approval
- before applying a patch
- after completing a review session
Without provenance, snapshots become anonymous checkpoints. They may still be technically valid, but they are harder to interpret.
A simple provenance event might look like this:
```python
@dataclass
class ProvenanceEvent:
    event_id: str
    graph_id: str
    event_type: str
    actor: str
    payload: dict

ProvenanceEvent(
    event_id="event_001",
    graph_id="graph_123",
    event_type="snapshot.created",
    actor="system",
    payload={
        "snapshot_id": "snap_001",
        "reason": "before_branch_merge",
        "state_hash": "sha256:...",
    },
)
```
Now the system can answer not only:
What was the graph state?
but also:
Why was this state preserved?
That is important for auditability.
Restoration: Proving the Snapshot Is Executable
A snapshot becomes much more powerful when it can be restored.
The basic loop is:
```
snapshot payload
  → restore graph
  → recompute hashes
  → compare with stored hashes
```
If the hashes match, the snapshot is not merely descriptive. It is executable.
A minimal restoration check might look like this:
```python
def verify_restored_snapshot(snapshot):
    restored_graph = restore_graph(snapshot.graph_state_json)
    return {
        "structure_match": hash_structure(restored_graph) == snapshot.structure_hash,
        "content_match": hash_content(restored_graph) == snapshot.content_hash,
        "state_match": hash_state(restored_graph) == snapshot.state_hash,
    }
```
This lets the system detect corruption, drift, or incomplete restoration.
The important invariant is:
```
snapshot payload
  → restored graph
  → same hashes
```
If that invariant holds, the snapshot can be trusted.
There is one subtle point.
If the state hash includes the graph’s storage identity, then restoring into the original graph and restoring into a new graph are not the same operation.
Restoring into the same graph can produce full equality:
```
structure match = true
content match = true
state match = true
```
Restoring into a new graph may still preserve structure and content, but the exact runtime identity changes:
```
structure match = true
content match = true
state match = false
```
That is not a failure.
It is an honest distinction.
The new graph may contain the same reasoning, but it is not the same runtime state. That is why separating structure, content, and state hashes matters.
Snapshots turn reasoning into something you can checkpoint.
A human or agent can experiment freely, knowing that the graph can return to a known valid state.
The next step is to record not just the frozen states, but the mutations that move the graph from one state to another.
Mutation: The Unit of Cognitive Change
Before we can talk about event sourcing, we need to define what actually changes.
In a Cognitive Graph, the basic unit of change is a mutation.
A mutation is any operation that changes the graph’s cognitive state.
For example:
- add a node
- link two nodes
- fork a branch
- record a decision
- canonicalize an artifact
- attach a rationale
- restore a snapshot
These are not just database writes.
They are cognitive transitions.
A node being added might mean a new idea entered the system. An edge being linked might mean two thoughts were related. A branch being forked might mean a new line of reasoning began. A decision being recorded might mean one option survived and another was rejected.
So the mutation is the point where reasoning changes shape.
A simple mutation function might look like this:
```python
def append_node(graph, node_type, content):
    node = Node(
        id=new_id(),
        type=node_type,
        content=content,
        metadata={},
    )
    graph.nodes.append(node)
    return node
```
That works for a local graph.
But it does not answer the important questions:
Who made this change?
What state existed before it?
What state existed after it?
Can this change be replayed?
Can this change be audited?
Did this change happen directly, or through a recorded event?
Those questions matter because a Cognitive Graph is not just a container.
It is trying to preserve the evolution of reasoning.
So a mutation needs more structure.
From Mutation to Transition
A better model treats a mutation as a transition between two cognitive states:
```
previous graph state
  → mutation
  → next graph state
```
That transition should be inspectable.
At minimum, it should carry:
- operation type
- actor
- payload
- previous state identity
- resulting state identity
In code, that starts to look like this:
```python
@dataclass(frozen=True)
class GraphMutation:
    mutation_id: str
    graph_id: str
    operation: str
    actor: str
    payload: dict
    parent_state_hash: str | None
    resulting_state_hash: str | None
```
Now a mutation does not just say:
a node was added
It says:
this actor added this node
to this graph
from this previous state
producing this resulting state
That is the bridge from graph editing to cognitive auditability.
If graph state is changed directly, the system can see the final result but not necessarily the path.
For example:
```python
graph.nodes.append(node)
```
After this, the graph has a new node.
But the system may not know:
- why the node was added
- which workflow added it
- what state existed before
- whether this addition can be replayed
- whether it bypassed validation
Direct mutation is easy.
But direct mutation is also where reasoning history disappears.
That is why the mutation itself has to become explicit.
A Cognitive Graph should not only store the result of a change. It should store the change as a first-class object.
This is where mutation events enter.
Mutation Events: Recording Cognitive Transitions
A mutation event is the durable record of a cognitive transition.
```
mutation intent
  → mutation event
  → event applier
  → new graph state
```
Once mutation events exist, the graph can stop asking only:
What is true now?
and start asking:
- What changed?
- Who changed it?
- Why did it change?
- Can we replay the change?
- Did the resulting state match the expected hash?
Different domains will have different mutations, but the core set is small.
For a general Cognitive Graph, the first useful mutation events are:
- `graph.created`
- `node.appended`
- `edge.linked`
- `branch.forked`
- `artifact.canonicalized`
- `decision.recorded`
- `reasoning.recorded`
- `snapshot.restored`
Each mutation has a payload.
For example, a node mutation might contain:
```json
{
  "node_id": "node_123",
  "node_type": "revision",
  "content": "The revised paragraph goes here.",
  "branch_id": "branch_004"
}
```
An edge mutation might contain:
```json
{
  "edge_id": "edge_456",
  "source_node_id": "node_123",
  "target_node_id": "node_789",
  "edge_type": "replaces"
}
```
A decision mutation might contain:
```json
{
  "decision_id": "decision_001",
  "decision_type": "accepted",
  "target_node_id": "node_789",
  "rejected_node_ids": ["node_123", "node_456"],
  "rationale": "Clearer, shorter, and more aligned with the project voice."
}
```
Each payload describes not just data, but intent.
It says what kind of cognitive transition occurred.
The rule is simple:
If a change matters to the reasoning process, it should be a mutation.
A variant being rejected matters. A branch being abandoned matters. A test result being attached matters. A rationale being recorded matters. A constraint being applied matters.
If these changes are not captured, the final artifact becomes harder to explain.
Snapshots preserve states.
Mutations explain how one state became another.
Events make those mutations durable and replayable.
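The durable part can be sketched as an append-only log that hands out sequence numbers, using a reduced event shape for brevity (the `EventLog` API here is an assumption, not a fixed interface):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class LoggedEvent:
    sequence_number: int
    event_type: str
    actor: str
    payload: dict

@dataclass
class EventLog:
    # Append-only: events are never updated or deleted, only added in order.
    events: list = field(default_factory=list)

    def append(self, event_type: str, actor: str, payload: dict) -> LoggedEvent:
        event = LoggedEvent(
            sequence_number=len(self.events) + 1,
            event_type=event_type,
            actor=actor,
            payload=payload,
        )
        self.events.append(event)
        return event

    def since(self, sequence_number: int) -> list:
        # Read back the tail of history, e.g. for incremental replay.
        return [e for e in self.events if e.sequence_number > sequence_number]
```

Because the log only grows, the same sequence can be read back in order as many times as needed.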
Event-Sourced Cognition
Snapshots gave us checkpoints.
But checkpoints only tell you where the graph was.
They do not fully explain how the graph got there.
A snapshot can say:
Here is the reasoning state at this moment.
But it cannot fully answer:
- What changed before this snapshot?
- Which branch introduced the new idea?
- Which decision selected this variant?
- Which event caused the replay mismatch?
- Which mutation moved this graph from one cognitive state to another?
To answer those questions, we need the thing that lives between snapshots:
events
A Cognitive Graph becomes much more powerful when every change to the graph is represented as an append-only event.
At first, the graph is state-first:
```
mutate graph state
  → optionally record what happened
```
That is useful, but it is not enough for replayable cognition.
The stronger model is event-first:
```
append event
  → apply event
  → derive graph state
```
In this model, the event is the authoritative fact. The graph state is the result of applying the event.
A simple event might look like this:
```python
@dataclass(frozen=True)
class MutationEvent:
    event_id: str
    graph_id: str
    sequence_number: int
    event_type: str
    actor: str
    payload: dict
    parent_state_hash: str | None
    resulting_state_hash: str | None
```
For example:
```python
MutationEvent(
    event_id="evt_042",
    graph_id="graph_123",
    sequence_number=42,
    event_type="node.appended",
    actor="editor",
    payload={
        "node_id": "node_revision_17",
        "node_type": "revision_text",
        "content": "The revised sentence goes here.",
        "branch_id": "branch_chapter_04",
    },
    parent_state_hash="sha256:before...",
    resulting_state_hash="sha256:after...",
)
```
That one event contains a cognitive transition:
```
before state
  → node.appended
  → after state
```
Now the graph can answer not just:
What is the current state?
but:
What sequence of changes produced this state?
That is the turn from memory into runtime.
Event Types as Cognitive Operations
A small event vocabulary can express a surprising amount of reasoning:
| Event | Meaning |
|---|---|
| `node.appended` | A new unit of thought entered the graph |
| `edge.linked` | Two thoughts were related |
| `branch.forked` | A new reasoning path began |
| `artifact.canonicalized` | Some output survived the process |
| `decision.recorded` | An actor selected, rejected, deferred, or promoted something |
| `reasoning.recorded` | The system preserved why a decision was made |
That gives the system a language of cognitive mutation.
Not just:
save this text
but:
- append this thought
- connect these ideas
- fork this line of reasoning
- promote this artifact
- record this decision
- explain this decision
Applying Events
An event by itself is only a record.
To make it executable, we need an applier.
The applier takes an event and mutates graph state deterministically.
class EventApplier:
    def apply(self, graph, event: MutationEvent):
        if event.event_type == "node.appended":
            graph.add_node(**event.payload)
        elif event.event_type == "edge.linked":
            graph.add_edge(**event.payload)
        elif event.event_type == "branch.forked":
            graph.add_branch(**event.payload)
        elif event.event_type == "artifact.canonicalized":
            graph.add_artifact(**event.payload)
        else:
            raise ValueError(f"Unsupported event: {event.event_type}")
The implementation can be more sophisticated, but the invariant is simple:
same initial state
+ same ordered event sequence
= same graph state
That invariant is what makes replay possible.
Replay means rebuilding the graph from the event log:
def replay(events):
    graph = CognitiveGraph()
    applier = EventApplier()
    for event in sorted(events, key=lambda e: e.sequence_number):
        applier.apply(graph, event)
    return graph
If replay works, then the event log is not just history. It is executable memory.
event stream
→ replay
→ reconstructed graph
A reasoning process can now be inspected, tested, compared, and restored from first principles.
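For concreteness, here is a minimal in-memory stand-in for the `CognitiveGraph` that `replay` rebuilds. It is a sketch only: the real structure also tracks branches, artifacts, and hashes.

```python
class CognitiveGraph:
    """Minimal in-memory sketch of the graph state that replay rebuilds."""

    def __init__(self):
        self.nodes: dict[str, dict] = {}
        self.edges: list[dict] = []

    def add_node(self, node_id: str, node_type: str, content: str, **extra):
        # Each node is a unit of thought keyed by its id.
        self.nodes[node_id] = {"node_type": node_type, "content": content, **extra}

    def add_edge(self, source_node_id: str, target_node_id: str, edge_type: str, **extra):
        # Each edge is a typed relationship between two nodes.
        self.edges.append(
            {"source": source_node_id, "target": target_node_id, "type": edge_type, **extra}
        )
```

Because these methods are deterministic functions of their arguments, applying the same ordered events always yields the same `nodes` and `edges`, which is exactly the replay invariant.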
Replay vs Snapshot: The Trust Boundary
This is where snapshots and events meet.
A snapshot says:
Here is the graph state we froze.
The event stream says:
Here is how the graph reached that state.
If both are correct, they should agree.
flowchart LR
%% Event replay path
A[📜 Event Stream] --> B[🔄 Replay Events]
B --> C[🧠 Reconstructed Graph]
%% Snapshot restore path
D[📸 Snapshot Payload] --> E[📂 Restore Snapshot]
E --> F[🧠 Restored Graph]
%% Hash computation
C --> G[🔢 Compute Hashes]
F --> H[🔢 Compute Hashes]
%% Comparison
G --> I{⚖️ Hashes Match?}
H --> I
%% Outcomes
I -->|✅ Yes| J[🎉 Replay Verified]
I -->|❌ No| K[⚠️ Drift Detected]
%% Styles
classDef eventPath fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#000
classDef snapshotPath fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px,color:#000
classDef compute fill:#fff3e0,stroke:#ef6c00,stroke-width:2px,color:#000
classDef decision fill:#f3e5f5,stroke:#6a1b9a,stroke-width:2px,color:#000
classDef positive fill:#c8e6c9,stroke:#1b5e20,stroke-width:2px,color:#000
classDef negative fill:#ffcdd2,stroke:#b71c1c,stroke-width:2px,color:#000
class A,B,C eventPath
class D,E,F snapshotPath
class G,H compute
class I decision
class J positive
class K negative
The important proof is:
replay(events) == restore(snapshot)
Not by manually inspecting the graph.
By comparing deterministic hashes:
- structure hash
- content hash
- state hash
If the hashes match, the reasoning history is executable and consistent.
If they do not, something drifted:
- an event was lost
- an event was applied differently
- the snapshot payload changed
- the hash projection changed
- state was mutated outside the event stream
That last case matters most.
Replay-vs-snapshot equivalence can detect hidden mutation.
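A sketch of that check, assuming graph state can be projected to a JSON-serializable payload. The names `state_hash` and `verify` are illustrative, not the system's real API.

```python
import hashlib
import json

def state_hash(payload: dict) -> str:
    # Deterministic: canonical JSON (sorted keys, fixed separators),
    # so the same logical state always hashes identically.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def verify(replayed: dict, snapshot: dict) -> bool:
    # Replay is verified only if both paths reach the same state.
    return state_hash(replayed) == state_hash(snapshot)
```

If `verify` returns False, one of the drift causes above applies, and hidden mutation is the one this comparison is uniquely positioned to catch.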
Event-First Execution
Once replay worked, the next step was to make events authoritative.
So we inverted the mutation model.
Before:
mutate state
→ emit event
After:
emit event
→ apply event
→ derive state
A simplified flow looks like this:
def append_node(graph_id, node_type, content):
    parent_hash = compute_state_hash(graph_id)
    event = append_event(
        graph_id=graph_id,
        event_type="node.appended",
        payload={
            "node_id": new_id(),
            "node_type": node_type,
            "content": content,
        },
        parent_state_hash=parent_hash,
        resulting_state_hash=None,
    )
    apply_event(event)
    resulting_hash = compute_state_hash(graph_id)
    update_event_resulting_hash(event.event_id, resulting_hash)
    return get_node(event.payload["node_id"])
This does two things.
First, every mutation becomes observable.
Second, every mutation becomes replayable.
The graph is no longer changed by invisible service calls. It is changed by events that can be inspected, ordered, hashed, and replayed.
The central invariant becomes:
events become authoritative
state becomes derived
That is the point where a Cognitive Graph stops being a memory system and starts acting like infrastructure.
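That invariant can be expressed as a pure fold over the log. This is a sketch: `apply_one` and `derive_state` are illustrative names, and only one event type is handled.

```python
from functools import reduce

def apply_one(state: dict, event: dict) -> dict:
    # Pure transition: returns a new state, never mutates in place.
    if event["type"] == "node.appended":
        payload = event["payload"]
        return {**state, payload["node_id"]: payload["content"]}
    return state

def derive_state(log: list[dict]) -> dict:
    # Events are authoritative; current state is just a fold over them.
    return reduce(apply_one, log, {})
```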
A Concrete Runtime Trace
Here is the architecture in one small trace.
Suppose a user asks the system to improve a paragraph. The system generates a revision, the user accepts it, and the graph records the reasoning path.
1. append_event(type="node.appended", payload={...})
2. EventApplier applies the event
3. node is created in graph state
4. compute_state_hash() returns sha256:8f3a...
5. update_event_resulting_hash(event_id, sha256:8f3a...)
6. create_snapshot() stores the graph payload and hashes
7. replay_events() rebuilds the graph
8. replay-vs-snapshot compares hashes
9. verified=True
The final paragraph is no longer isolated. It is the endpoint of a replayable reasoning path.
A Small Example: Choosing a Better Paragraph
Here is the whole idea in a small example.
Suppose the task is:
Make this paragraph clearer.
The system might produce two variants. The user accepts the second one because it is shorter and easier to read.
A normal tool stores only the final paragraph.
A Cognitive Graph stores the process:
events = [
    Event(
        type="node.appended",
        payload={
            "node_id": "intent_1",
            "node_type": "intent",
            "content": "Make this paragraph clearer.",
        },
    ),
    Event(
        type="node.appended",
        payload={
            "node_id": "draft_a",
            "node_type": "draft",
            "content": "The system keeps a record of previous messages.",
        },
    ),
    Event(
        type="node.appended",
        payload={
            "node_id": "draft_b",
            "node_type": "revision",
            "content": "The system preserves how decisions evolve over time.",
        },
    ),
    Event(
        type="edge.linked",
        payload={
            "source_node_id": "draft_a",
            "target_node_id": "draft_b",
            "edge_type": "replaces",
        },
    ),
    Event(
        type="decision.recorded",
        payload={
            "decision_type": "accepted",
            "target_node_id": "draft_b",
            "rejected_node_ids": ["draft_a"],
            "rationale": "Clearer, shorter, and closer to the intended meaning.",
        },
    ),
]
That event stream tells us:
- what the user wanted
- what the first attempt was
- what replaced it
- which option was accepted
- which option was rejected
- why the choice was made
Later, the system can replay the event stream and reconstruct the same graph.
That is the difference between storing a paragraph and preserving the reasoning that produced it.
Decisions Are Cognitive Events
Capturing suggestions is not enough.
The human decision process also needs to become graph state.
A decision can record:
- accepted
- rejected
- deferred
- preferred
- merged
- promoted
- archived
A variant comparison can record:
- variant A
- variant B
- comparison dimension
- preferred option
- rationale
This is where the graph becomes collaborative.
It no longer captures only what the AI generated.
It captures what the human chose.
That matters because the user’s decision is not metadata. It is part of the cognitive process.
A final artifact is not just the last output in a sequence.
It is the survivor of decisions.
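One way to sketch the variant-comparison record (a hypothetical shape: the class name and field names are assumptions, in the spirit of the events above):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VariantComparison:
    # Hypothetical record shape for a human comparison decision.
    variant_a: str
    variant_b: str
    dimension: str       # e.g. "clarity", "voice", "length"
    preferred: str
    rationale: str

comparison = VariantComparison(
    variant_a="draft_a",
    variant_b="draft_b",
    dimension="clarity",
    preferred="draft_b",
    rationale="Shorter and closer to the intended meaning.",
)
```

Stored as a `decision.recorded` payload, a record like this makes the human judgment a first-class, replayable part of graph state rather than metadata.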
Why Graphs: Capturing the Reason Behind the Decision
A Cognitive Graph should not only store what happened.
It should store why a decision was made.
For that, we add a structured reasoning node.
A “why” object might contain:
- target decision
- preferred option
- rejected options
- principle basis
- tradeoffs
- uncertainty assessment
- policy constraints
- identity alignment notes
- rationale
- confidence
- provenance hash
In code, the shape might be:
from dataclasses import dataclass

@dataclass(frozen=True)
class ConstitutionalReasoning:
    reasoning_id: str
    target_node_id: str | None
    target_decision_id: str | None
    preferred_option: str
    rejected_options: list[str]
    principles: list[str]
    tradeoffs: list[str]
    constraints: list[str]
    rationale: str
    confidence: float
    provenance_hash: str
This turns the graph from operational memory into epistemic memory.
Now it can answer:
Why did we accept this rewrite?
What principle won: clarity or voice?
Which alternatives were rejected?
What constraints mattered?
Do two decisions conflict?
Can we replay the justification?
That is a major shift.
The graph does not only preserve actions.
It preserves reasons.
Making the Graph Visible
A cognitive runtime is not useful if it remains invisible.
The system needs a way to inspect:
- graphs
- nodes
- branches
- review sessions
- editorial decisions
- variant comparisons
- canonical artifacts
- snapshots
- replay status
- lineage
- reasoning nodes
The first interface should be read-only.
That is important.
Before a system edits cognitive state, it should make cognitive state understandable.
A useful dashboard lets the user answer:
What changed?
Which branch did this come from?
Which variants were rejected?
Which decision accepted this?
Why was it accepted?
Does replay still match the snapshot?
This is how hidden AI collaboration becomes inspectable.
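A read-only lineage query might look like this. It is a sketch over plain edge dicts; the key names (`source`, `target`, `type`) and the traversal direction are assumptions.

```python
def lineage(edges: list[dict], node_id: str, edge_type: str = "replaces") -> list[str]:
    # Walks `replaces` edges backwards from a node to recover its
    # ancestry. Read-only: inspects state, never mutates it.
    chain = [node_id]
    seen = {node_id}
    current = node_id
    while True:
        parents = [
            e["source"]
            for e in edges
            if e["target"] == current and e["type"] == edge_type and e["source"] not in seen
        ]
        if not parents:
            return chain
        current = parents[0]
        seen.add(current)
        chain.append(current)
```

Given an edge like `{"source": "draft_a", "target": "draft_b", "type": "replaces"}`, `lineage(edges, "draft_b")` returns `["draft_b", "draft_a"]`: the accepted revision plus the draft it replaced.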
What We Actually Built
At this point, the Cognitive Graph has become something larger than a writing feature.
It is an event-sourced runtime for structured reasoning.
The architecture now has direct analogues to distributed systems:
| Distributed Systems Concept | Cognitive Graph Equivalent |
|---|---|
| append-only log | mutation events |
| state reconstruction | replay |
| snapshots | graph snapshots |
| deterministic rebuild | replay-vs-snapshot verification |
| provenance | event, decision, and reasoning lineage |
| branches | cognitive branches |
| consistency checks | structure/content/state hashes |
| event appliers | cognitive state transitions |
This is why “Cognitive Graph” is not just a name.
It is a real substrate.
From Generation to Continuity: The Real Shift
Most AI tools are optimized for generation.
Cognitive Graph is optimized for continuity.
That’s a deeper difference than it first appears.
Generation asks a single question:
Can I produce a useful answer?
Continuity asks a set of them:
Can I preserve how this answer came to exist?
Can I compare it to alternatives?
Can I replay the reasoning later?
Can I audit the decision that favored it?
Can I use it as a foundation tomorrow?
The diagram below makes the architectural split concrete:
graph LR
subgraph Generation Paradigm
prompt[💬 Prompt] --> model[🤖 Model] --> answer[💡 Answer]
style prompt fill:#f8d7da,stroke:#721c24
style model fill:#f8d7da,stroke:#721c24
style answer fill:#f8d7da,stroke:#721c24
end
subgraph Continuity Paradigm
intent[🎯 Intent] --> cg[🧠 Cognitive Graph]
cg --> branches[🌿 Branches & Variants]
cg --> events[⚡ Mutation Events]
cg --> snapshots[📸 Snapshots]
cg --> decisions[✅ Decisions & Rationale]
cg --> artifacts[🏆 Artifacts]
snapshots --> replay[🔄 Replay & Verify]
artifacts -->|feeds| intent
end
style intent fill:#d4edda,stroke:#155724
style cg fill:#d4edda,stroke:#155724
style branches fill:#d4edda,stroke:#155724
style events fill:#d4edda,stroke:#155724
style snapshots fill:#d4edda,stroke:#155724
style decisions fill:#d4edda,stroke:#155724
style artifacts fill:#d4edda,stroke:#155724
style replay fill:#d4edda,stroke:#155724
On the left: a transient pipeline. Once the answer is emitted, the process evaporates.
On the right: a persistent, replayable reasoning substrate where every decision, alternative, and rationale is preserved.
This is what separates a chatbot from a cognitive runtime.
Beyond Writing: A General Architecture for Structured Decisions
Because the Cognitive Graph captures the evolution of thinking, not just text, its shape generalizes far beyond prose.
Any domain where decisions unfold over time benefits from the same memory, lineage, and replay:
- Code review – patches, critiques, rejections, final merge.
- Research synthesis – hypotheses, evidence, competing interpretations, consensus notes.
- Legal reasoning – interpretations, precedents, accepted arguments, dissents.
- Policy analysis – proposals, tradeoffs, principles, final positions.
- Design review – mockups, feedback rounds, selection rationale.
- Agent workflows – sub‑task branches, outcome comparisons, final plans.
In each case, the core pattern is identical:
Structured evolving decisions, with memory.
A Cognitive Graph gives those decisions not just a record, but a provenance chain that can be replayed, verified, and learned from.
That’s the foundation for a new kind of AI‑assisted work: one where the reasoning doesn’t vanish when the answer arrives.
Conclusion: Auditable AI, Branchable Thought
The real promise of a Cognitive Graph is not that it creates a new kind of AI.
It is that it gives you a better way to work with the AI you already have.
Most AI conversations move forward and then disappear behind you. You can scroll back, but you cannot easily branch from a decision, replay a path, compare two alternatives, or ask why one answer survived and another did not.
A Cognitive Graph changes that.
It lets you audit the path:
What did we ask?
What did the system try?
What did we reject?
What did we accept?
What changed after that?
It also lets you branch the path:
What if we went back to this earlier decision?
What if we explored the code version instead of the prose version?
What if another model reviewed the rejected branch?
What if we resumed from the moment before the final artifact was chosen?
That is the practical value.
The graph turns an AI session from a single forward-moving transcript into a structure you can revisit, inspect, fork, and extend.
You can go deeper at any stage of the conversation. You can preserve failed branches instead of losing them. You can ask the AI to research an alternate path without destroying the current one. You can compare the branch you took with the branch you abandoned. You can return later and understand why the work ended where it did.
That is what makes the Cognitive Graph useful.
It gives AI collaboration memory with handles.
Not just a record of what was said, but a replayable map of where thought could have gone, where it did go, and why one path became real.
What comes next: Why Graphs add structured reasons to those branch points. They make it possible to audit not only what changed, but why the system or the human chose one direction over another. That is where branching, replay, and justification start to work together.
Core Glossary
Cognitive Graph
A graph-based structure for preserving how thoughts, decisions, alternatives, and artifacts evolve over time.
Node
A unit of thought or work inside the graph. Examples include an intent, draft, variant, critique, decision, artifact, or reasoning object.
Edge
A typed relationship between two nodes. Examples include refines, replaces, supports, rejects, justifies, and accepted_into.
Branch
An alternate reasoning path within the graph.
Mutation
A meaningful change to the graph’s cognitive state.
Mutation Event
An append-only record of a mutation. It stores what changed, who changed it, and the state before and after.
Snapshot
A frozen, validated graph state captured at a point in time.
Replay
The process of rebuilding graph state from the event stream.
State Hash
A deterministic hash of exact runtime state.
Why Graph
The part of the Cognitive Graph that stores structured reasons behind decisions.
References and Further Reading
This post sits at the intersection of several existing ideas: event sourcing, provenance, graph-based memory, constitutional reasoning, and human-AI collaboration. The Cognitive Graph is not identical to any one of these, but it borrows useful patterns from each.
Event Sourcing and Replayable State
Martin Fowler’s writing on Event Sourcing is the clearest starting point for the software architecture behind the Cognitive Graph. The key idea is that state changes are stored as a sequence of events, and system state can be reconstructed by replaying those events. That maps directly onto the Cognitive Graph idea of mutation events, replay, and snapshot verification. (martinfowler.com)
Fowler’s broader article on event-driven systems is also useful because it explains the important distinction between logging things that happened and treating events as the source of truth. The Cognitive Graph follows the latter idea: graph state becomes derived from events rather than merely accompanied by events. (martinfowler.com)
CQRS is also relevant, especially the separation between command-side mutation and query-side read models. The Cognitive Graph does something similar: event-first mutation produces graph state, while traversal, lineage, replay, and dashboard views operate as query-side interpretations. (martinfowler.com)
Provenance and Auditability
The W3C PROV-O ontology is a useful conceptual reference for provenance. PROV-O provides a formal way to describe entities, activities, agents, and the relationships between them. Cognitive Graph is not a PROV-O implementation, but its ideas around provenance events, actor attribution, and derivation are closely related. (W3C)
The most important shared principle is that outputs are not enough. A trustworthy system should also preserve information about how an output was produced, what activity generated it, and which actor or system was responsible.
Constitutional Reasoning and “Why” Objects
Anthropic’s Constitutional AI paper is relevant to the “Why Graphs” direction. Constitutional AI uses explicit principles to guide model behavior through critique and revision. Cognitive Graph takes a runtime-oriented version of that idea: instead of only shaping model behavior during training, it stores principles, tradeoffs, rejected options, and rationales as explicit graph state. (arXiv)
More recent work on Collective Constitutional AI extends the idea by sourcing principles from broader groups of people. That connects naturally to the Cognitive Graph idea of multiple actors, competing rationales, variant comparison, and human-reviewable decision lineage. (arXiv)
How These Ideas Connect
The Cognitive Graph borrows from event sourcing the idea that history should be executable. It borrows from provenance systems the idea that artifacts should carry derivation and attribution. It borrows from knowledge graphs the idea that relationships matter. It borrows from constitutional reasoning the idea that decisions should be linked to principles.
The new piece is the combination:
a replayable graph of reasoning state, where decisions, alternatives, artifacts, and justifications are preserved as event-sourced cognition.