ADR 0007: Epistemic Model — Preserving the Evolution of Understanding
- Status: Accepted
- Date: 2026-06-07
- Build target: v0.5.0 (Full scope)
- Related: ADR 0006 (Layered Architecture), ADR 0004 (Progressive Ontology); builds on the v0.5 provenance schema
Context
GraphForge is a system for preserving reasoning, not merely storing conclusions. Its analysts — in OSINT, intelligence, genealogy, fraud, due diligence, investigative journalism — do not proceed in a straight line. They form hypotheses, gather evidence, revise beliefs, reject theories, merge or split them, and sometimes revisit discarded explanations. The history of that process is often as important as the final conclusion.
The product principle is explicit:
GraphForge should preserve the evolution of understanding, not merely the current state of understanding. The disappearance of a hypothesis should not imply it never existed. The rejection of a theory should not destroy the evidence that caused the rejection.
The architecture today does the opposite. It is built for state, not evolution:
- Provenance (`provenance/events.parquet`, `lineage.parquet`) is forward-lineage only ("where did this come from") and is currently a 17-line stub that is never written.
- There is no notion of a fact's epistemic status (is this a hypothesis? supported? rejected? superseded?).
- There is no way for competing interpretations to coexist; a graph naturally collapses to the single surviving interpretation.
- There is no reasoning trace — why a conclusion was reached, or why an alternative was rejected.
- There is no time model for belief — "what did we believe, and when?"
This ADR defines an epistemic model that closes the gap. Per the layering decision (ADR 0006), it lives entirely in the knowledge layer, attached to graph objects by UUID reference. It does not touch graph topology or change graph-native query semantics.
This is a v0.5.0 deliverable at Full scope — it is the basic-but-complete embodiment of the analyst vision the project has aimed at from the start.
Decision
Introduce an assertion (claim) as the unit of analytical belief, layered on top of the existing provenance schema, with bitemporal valid-time as its substrate. Assertions reference graph objects by UUID; the graph layer is unchanged.
1. Assertion / claim
An assertion is a statement an analyst (or an inference rule) makes about a graph object — a node, an edge, or a property value — together with its epistemic status, supporting evidence, reasoning, and the time window over which it is believed to hold.
knowledge/assertions.parquet:
| Column | Type | Notes |
|---|---|---|
| `assertion_uuid` | `FixedSizeBinary(16)` | UUIDv7; identity of this claim |
| `subject_uuid` | `FixedSizeBinary(16)` | The graph object the claim is about (node_uuid / edge_uuid) |
| `subject_kind` | `Utf8` | `"node"` \| `"edge"` \| `"property"` |
| `claim` | `Utf8` | What is asserted (e.g. `"is the same person as <uuid>"`, `"employed_at <uuid>"`) |
| `status` | `Utf8` | `hypothesis` \| `supported` \| `refuted` \| `superseded` \| `disputed` |
| `confidence` | `Float64` | Confidence in this assertion (knowledge-layer policy; default `conservative_min`) |
| `hypothesis_group` | `FixedSizeBinary(16)` | Nullable; groups competing assertions about the same question |
| `provenance_uuid` | `FixedSizeBinary(16)` | The provenance event that produced this assertion |
| `analyst_uuid` | `FixedSizeBinary(16)` | Who made the claim |
| `valid_from` | `Timestamp(us, UTC)` | Assertion time: when the claim is believed to start holding (nullable) |
| `valid_to` | `Timestamp(us, UTC)` | Assertion time: when it stops (nullable = open) |
| `recorded_at` | `Timestamp(us, UTC)` | Transaction time: when GraphForge recorded the claim |
| `retracted_at` | `Timestamp(us, UTC)` | Transaction time of retraction (nullable; never deleted) |
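As a rough illustration of the columns above, one row could be modelled in memory as follows. This is a sketch under stated assumptions, not GraphForge code: the `Assertion` class and the `new_assertion` helper are hypothetical names, and `uuid.uuid4()` stands in for a UUIDv7 generator so the example runs on Python versions whose standard library does not provide one.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
import uuid

STATUSES = {"hypothesis", "supported", "refuted", "superseded", "disputed"}
SUBJECT_KINDS = {"node", "edge", "property"}
UUID_FIELDS = ("assertion_uuid", "subject_uuid", "provenance_uuid", "analyst_uuid")


@dataclass(frozen=True)  # frozen: a recorded row is not mutated in place
class Assertion:
    assertion_uuid: bytes            # 16 bytes, mirrors FixedSizeBinary(16)
    subject_uuid: bytes
    subject_kind: str
    claim: str
    status: str
    confidence: float
    provenance_uuid: bytes
    analyst_uuid: bytes
    recorded_at: datetime            # transaction time
    hypothesis_group: Optional[bytes] = None
    valid_from: Optional[datetime] = None   # assertion time, nullable
    valid_to: Optional[datetime] = None     # None = open
    retracted_at: Optional[datetime] = None

    def __post_init__(self) -> None:
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        if self.subject_kind not in SUBJECT_KINDS:
            raise ValueError(f"unknown subject_kind: {self.subject_kind!r}")
        for name in UUID_FIELDS:
            if len(getattr(self, name)) != 16:
                raise ValueError(f"{name} must be exactly 16 bytes")


def new_assertion(subject_uuid: bytes, subject_kind: str, claim: str, *,
                  status: str, confidence: float,
                  provenance_uuid: bytes, analyst_uuid: bytes,
                  **optional) -> Assertion:
    """Hypothetical helper: stamps identity and transaction time on a new claim."""
    return Assertion(
        assertion_uuid=uuid.uuid4().bytes,  # assumption: stand-in for UUIDv7
        subject_uuid=subject_uuid,
        subject_kind=subject_kind,
        claim=claim,
        status=status,
        confidence=confidence,
        provenance_uuid=provenance_uuid,
        analyst_uuid=analyst_uuid,
        recorded_at=datetime.now(timezone.utc),
        **optional,
    )
```

Validating `status` and `subject_kind` at construction time is a design choice of the sketch; the ADR itself only lists the allowed values.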
2. Status and supersession
Status transitions are append-only. A claim is never overwritten or deleted; it is superseded by a new assertion, and the link is preserved.
knowledge/supersession.parquet:
| Column | Type | Notes |
|---|---|---|
| `superseding_uuid` | `FixedSizeBinary(16)` | The new assertion |
| `superseded_uuid` | `FixedSizeBinary(16)` | The prior assertion it replaces |
| `reason` | `Utf8` | Why (free text or rule id) |
| `recorded_at` | `Timestamp(us, UTC)` | When the supersession was recorded |
A refuted or superseded assertion remains in assertions.parquet with its evidence and
reasoning intact. Rejecting a theory marks status and records supersession; it never destroys the
prior claim, its evidence, or its history.
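One way to read the append-only rule is sketched here: a status change is recorded as a new assertion row plus a supersession link, and the prior row is left exactly as written. The in-memory lists stand in for the two Parquet tables, and the helper names (`record`, `supersede`, `chain`) are hypothetical; `uuid.uuid4()` again stands in for UUIDv7.

```python
from datetime import datetime, timezone
import uuid

assertions = []    # stand-in for knowledge/assertions.parquet (append-only)
supersession = []  # stand-in for knowledge/supersession.parquet (append-only)


def record(claim, status):
    """Append a new assertion row and return it."""
    row = {
        "assertion_uuid": uuid.uuid4().bytes,
        "claim": claim,
        "status": status,
        "recorded_at": datetime.now(timezone.utc),
    }
    assertions.append(row)
    return row


def supersede(prior, claim, status, reason):
    """Append a new assertion and link it to the prior one; never edit the prior row."""
    new = record(claim, status)
    supersession.append({
        "superseding_uuid": new["assertion_uuid"],
        "superseded_uuid": prior["assertion_uuid"],
        "reason": reason,
        "recorded_at": datetime.now(timezone.utc),
    })
    return new


def chain(latest):
    """Walk supersession links from the newest assertion back to the oldest."""
    by_uuid = {a["assertion_uuid"]: a for a in assertions}
    prior_of = {s["superseding_uuid"]: s["superseded_uuid"] for s in supersession}
    out = [latest]
    while out[-1]["assertion_uuid"] in prior_of:
        out.append(by_uuid[prior_of[out[-1]["assertion_uuid"]]])
    return out


# Illustrative data only: a hypothesis is later refuted.
first = record("owner = Bob", "hypothesis")
later = supersede(first, "owner = Bob", "refuted", "illustrative reason text")
print([a["status"] for a in chain(later)])
```

Under this reading the effective status of a claim is the status of the newest row in its chain, while every earlier row remains available for audit.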
3. Competing hypotheses
Multiple assertions about the same question coexist by sharing a hypothesis_group. None is
privileged by storage; the analyst (or a confidence policy) chooses among them, and the choice is
itself an assertion (status transition + supersession), leaving the alternatives in place.
hypothesis_group G ("who owns ACME?")
├── assertion A: owner = Alice status=supported confidence=0.7
├── assertion B: owner = Bob status=refuted confidence=0.2 (evidence preserved)
└── assertion C: owner = Carol status=hypothesis confidence=0.4
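The group above can be sketched as plain rows to show that choosing among hypotheses is a read-side operation that leaves every alternative stored. The `leading` policy (highest confidence among claims that are neither refuted nor superseded) is a hypothetical example, not the confidence policy the ADR defers to; the short string ids replace 16-byte UUIDs for readability.

```python
from collections import defaultdict

rows = [
    {"id": "A", "hypothesis_group": "G", "claim": "owner = Alice",
     "status": "supported", "confidence": 0.7},
    {"id": "B", "hypothesis_group": "G", "claim": "owner = Bob",
     "status": "refuted", "confidence": 0.2},
    {"id": "C", "hypothesis_group": "G", "claim": "owner = Carol",
     "status": "hypothesis", "confidence": 0.4},
]


def by_group(rows):
    """Group assertions by hypothesis_group; ungrouped rows (None) are skipped."""
    groups = defaultdict(list)
    for r in rows:
        if r["hypothesis_group"] is not None:
            groups[r["hypothesis_group"]].append(r)
    return dict(groups)


def leading(group_rows):
    """Hypothetical policy: best-confidence claim that is not refuted or superseded."""
    live = [r for r in group_rows if r["status"] not in ("refuted", "superseded")]
    return max(live, key=lambda r: r["confidence"]) if live else None


group = by_group(rows)["G"]
print(leading(group)["id"])       # the current front-runner
print([r["id"] for r in group])   # all three alternatives are still present
```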
4. Evidence attachment
Evidence links an assertion to the observations/documents that support or contradict it. Evidence stays attached regardless of later conclusions — refuting a claim does not detach the evidence that informed it.
knowledge/evidence.parquet:
| Column | Type | Notes |
|---|---|---|
| `evidence_uuid` | `FixedSizeBinary(16)` | Identity |
| `assertion_uuid` | `FixedSizeBinary(16)` | The claim |
| `source_uuid` | `FixedSizeBinary(16)` | Document / observation / source ref |
| `role` | `Utf8` | `"supports"` \| `"contradicts"` \| `"context"` |
| `weight` | `Float64` | Contribution weight |
| `recorded_at` | `Timestamp(us, UTC)` | |
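A small sketch of how evidence rows might be read back per assertion, including for a claim that has since been refuted. Summing `weight` per `role` is an assumption about how the weight could be used; the ADR does not define an aggregation rule. The ids and sources are illustrative placeholders.

```python
evidence = [
    {"assertion": "B", "source": "doc-1", "role": "supports", "weight": 0.3},
    {"assertion": "B", "source": "doc-2", "role": "contradicts", "weight": 0.8},
    {"assertion": "A", "source": "doc-2", "role": "supports", "weight": 0.8},
]


def tally(assertion_id):
    """Sum evidence weight per role for one assertion, whatever its current status."""
    totals = {"supports": 0.0, "contradicts": 0.0, "context": 0.0}
    for e in evidence:
        if e["assertion"] == assertion_id:
            totals[e["role"]] += e["weight"]
    return totals


# Assertion B may be refuted, yet both of its evidence rows are still attached.
print(tally("B"))
```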
5. Reasoning preservation
Reasoning captures why a conclusion was reached or an alternative rejected — the step the provenance lineage cannot express. It references the assertion(s) it explains.
knowledge/reasoning.parquet:
| Column | Type | Notes |
|---|---|---|
| `reasoning_uuid` | `FixedSizeBinary(16)` | Identity |
| `assertion_uuid` | `FixedSizeBinary(16)` | The claim this reasoning concerns |
| `kind` | `Utf8` | `"justification"` \| `"rejection"` \| `"note"` |
| `text` | `Utf8` | The analyst's reasoning |
| `analyst_uuid` | `FixedSizeBinary(16)` | |
| `recorded_at` | `Timestamp(us, UTC)` | |
6. Bitemporal valid-time
Two independent time axes, both preserved:
- Assertion time (`valid_from` / `valid_to`): the period the world-fact is believed to hold ("Alice worked at ACME 2019–2021").
- Transaction time (`recorded_at` / `retracted_at`): when GraphForge knew the claim ("we recorded this on 2026-06-07; we retracted it on 2026-08-01").
Together these answer "what did we believe, when, and why did it change?" — point-in-time belief
reconstruction. Valid-time lives here, on assertions — not on raw topology nodes (resolving the
stale valid_from_ts/valid_to_ts columns that ADR 0006 removed from the graph layer).
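Point-in-time belief reconstruction can be sketched as a filter over both axes. The function names are hypothetical, and treating both intervals as half-open (start inclusive, end exclusive) is an assumption the ADR does not state; a null bound is read as open, as the column notes describe.

```python
from datetime import datetime, timezone


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


def known_at(row, t):
    """Transaction time: recorded on or before t and not yet retracted at t."""
    return row["recorded_at"] <= t and (
        row["retracted_at"] is None or t < row["retracted_at"]
    )


def holds_at(row, t):
    """Assertion time: the claimed fact covers world-time t (None = open bound)."""
    return (row["valid_from"] is None or row["valid_from"] <= t) and (
        row["valid_to"] is None or t < row["valid_to"]
    )


def beliefs(rows, as_of, about):
    """What did we believe at `as_of` about the state of the world at `about`?"""
    return [r for r in rows if known_at(r, as_of) and holds_at(r, about)]


# The two examples from the text, combined into one illustrative row.
rows = [{
    "claim": "Alice employed_at ACME",
    "valid_from": utc(2019, 1, 1),
    "valid_to": utc(2022, 1, 1),      # exclusive end: covers 2019 through 2021
    "recorded_at": utc(2026, 6, 7),
    "retracted_at": utc(2026, 8, 1),
}]

in_july = beliefs(rows, as_of=utc(2026, 7, 1), about=utc(2020, 6, 1))
in_sept = beliefs(rows, as_of=utc(2026, 9, 1), about=utc(2020, 6, 1))
print(len(in_july), len(in_sept))
```

Asking the same world-time question at two transaction times gives different answers (one row in July, none in September), while the row itself is still stored with its `retracted_at` set.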
Preservation-over-deletion (the law)
- No epistemic record is ever destructively updated or deleted. Changes append; supersession links.
- Retraction sets `retracted_at`; it does not remove the row.
- Refuting/superseding a claim preserves the claim, its evidence, and its reasoning.
Boundary and lightweight guarantees (per ADR 0006)
- Knowledge layer only. All epistemic tables live under `knowledge/` and reference graph objects by `*_uuid`. Graph topology, properties, and the adjacency index are untouched.
- Graph-native semantics unaffected. Cypher/traversal/algorithms read only the graph layer. Boundary regression test: a graph with full epistemic history returns identical Cypher results to the same graph without it. The epistemic model never changes graph query results.
- Capability-gated. `knowledge/` is a capability folder (ADR 0006 / project manifest). Absent = the epistemic model is simply not enabled; the graph works exactly as before.
- Bitemporal is opt-in. Valid-time columns are nullable; transaction-time is cheap. Full bitemporal querying is capability-gated and off by default, with a documented lighter fallback (assertion-time only, dropping transaction-time) if footprint or query complexity proves heavy on a given project. Even at Full scope it is local, single-user Parquet: no server, no infra.
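The capability gate can be sketched as a simple presence check on the `knowledge/` folder. This assumes folder presence alone signals the capability; the real mechanism may also consult the project manifest mentioned above. Both function names are hypothetical.

```python
from pathlib import Path


def epistemic_enabled(project_root):
    """Assumption: the capability is on exactly when knowledge/ exists."""
    return (Path(project_root) / "knowledge").is_dir()


def epistemic_tables(project_root):
    """List the knowledge-layer Parquet files, or nothing when the capability is off."""
    if not epistemic_enabled(project_root):
        return []
    return sorted((Path(project_root) / "knowledge").glob("*.parquet"))
```

Graph-layer code never needs to call either function, which is what keeps graph query results identical with or without epistemic history.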
Consequences
Positive
- GraphForge becomes a system of evolving understanding: competing hypotheses coexist, rejected theories and their evidence survive, reasoning is traceable, and belief is reconstructable at any point in time. This is the stated true-north of the product.
- The graph stays graph-native and fast; epistemic richness imposes nothing on traversal.
- Provenance is upgraded from forward-lineage-only to a foundation the epistemic model sits on.
Negative / Risks
- More tables and more analyst-facing concepts. Mitigated by capability-gating and sensible defaults; a minimal project never sees any of it.
- Bitemporal modelling is genuinely complex. Mitigated by opt-in, off-by-default, and the assertion-time-only fallback.
- Confidence policy must be real (no longer hardcoded `1.0`). Owned by the Knowledge Layer Foundation milestone, which this model depends on.
Dependencies
- Depends on the Knowledge Layer Foundation (provenance events actually written, confidence propagated, evidence links) — epistemic status sits on real provenance + evidence.
- Coordinates with the Conformance milestone's confidence-policy issues (`conservative_min`, inference-rule recording).
Alternatives Considered
| Alternative | Rejected because |
|---|---|
| State-only (status quo) | Collapses knowledge to the surviving interpretation; destroys competing/rejected hypotheses and reasoning. Directly contradicts the product principle. |
| Epistemic status as graph node/edge properties | Violates ADR 0006's boundary rule; contaminates graph semantics; bloats the traversal hot path; mutation/overwrite would destroy history. |
| Versioned graph snapshots | Heavy, not lightweight; answers "what did the graph look like" but not "why did we believe it / what competed with it." Doesn't model competing hypotheses or reasoning. |
| Defer to v0.6 | The user reorientation places the basic-but-complete analyst embodiment — including epistemic integrity — in v0.5.0. Deferral would ship a workbench that forgets its own reasoning. |
References
- ADR 0006: Layered Architecture — the knowledge-layer boundary
- Architecture Refactor v0.5 §6 — provenance model this builds on
- Storage Architecture — `knowledge/` capability tables
- ADR 0004: Progressive Ontology — exploration-first analyst workflow