
ADR 0007: Epistemic Model — Preserving the Evolution of Understanding

Status: Accepted
Date: 2026-06-07
Build target: v0.5.0 (Full scope)
Related: ADR 0006 (Layered Architecture), ADR 0004 (Progressive Ontology); builds on the v0.5 provenance schema


Context

GraphForge is a system for preserving reasoning, not merely storing conclusions. Its analysts — in OSINT, intelligence, genealogy, fraud, due diligence, investigative journalism — do not proceed in a straight line. They form hypotheses, gather evidence, revise beliefs, reject theories, merge or split them, and sometimes revisit discarded explanations. The history of that process is often as important as the final conclusion.

The product principle is explicit:

GraphForge should preserve the evolution of understanding, not merely the current state of understanding. The disappearance of a hypothesis should not imply it never existed. The rejection of a theory should not destroy the evidence that caused the rejection.

The architecture today does the opposite. It is built for state, not evolution:

  • Provenance (provenance/events.parquet, lineage.parquet) is forward-lineage only ("where did this come from"), and its implementation is currently a 17-line stub, so no provenance is ever written.
  • There is no notion of a fact's epistemic status (is this a hypothesis? supported? rejected? superseded?).
  • There is no way for competing interpretations to coexist; a graph naturally collapses to the single surviving interpretation.
  • There is no reasoning trace — why a conclusion was reached, or why an alternative was rejected.
  • There is no time model for belief — "what did we believe, and when?"

This ADR defines an epistemic model that closes the gap. Per the layering decision (ADR 0006), it lives entirely in the knowledge layer, attached to graph objects by UUID reference. It does not touch graph topology or change graph-native query semantics.

This is a v0.5.0 deliverable at Full scope — it is the basic-but-complete embodiment of the analyst vision the project has aimed at from the start.


Decision

Introduce an assertion (claim) as the unit of analytical belief, layered on top of the existing provenance schema, with bitemporal valid-time as its substrate. Assertions reference graph objects by UUID; the graph layer is unchanged.

1. Assertion / claim

An assertion is a statement an analyst (or an inference rule) makes about a graph object — a node, an edge, or a property value — together with its epistemic status, supporting evidence, reasoning, and the time window over which it is believed to hold.

knowledge/assertions.parquet:

Column            Type                 Notes
assertion_uuid    FixedSizeBinary(16)  UUIDv7; identity of this claim
subject_uuid      FixedSizeBinary(16)  The graph object the claim is about (node_uuid / edge_uuid)
subject_kind      Utf8                 "node" | "edge" | "property"
claim             Utf8                 What is asserted (e.g. "is the same person as <uuid>", "employed_at <uuid>")
status            Utf8                 hypothesis | supported | refuted | superseded | disputed
confidence        Float64              Confidence in this assertion (knowledge-layer policy; default conservative_min)
hypothesis_group  FixedSizeBinary(16)  Nullable; groups competing assertions about the same question
provenance_uuid   FixedSizeBinary(16)  The provenance event that produced this assertion
analyst_uuid      FixedSizeBinary(16)  Who made the claim
valid_from        Timestamp(us, UTC)   Assertion time: when the claim is believed to start holding (nullable)
valid_to          Timestamp(us, UTC)   Assertion time: when it stops (nullable = open)
recorded_at       Timestamp(us, UTC)   Transaction time: when GraphForge recorded the claim
retracted_at      Timestamp(us, UTC)   Transaction time of retraction (nullable; never deleted)
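
As a rough illustration of the row shape above, the following Python sketch models one assertion as an immutable in-memory record. It is not the GraphForge implementation. The class name `Assertion`, the validation rules, the assumed 0.0 to 1.0 confidence range, and the use of `uuid.uuid4()` as a stand-in for UUIDv7 are all assumptions made for the example.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Allowed values copied from the subject_kind and status columns above.
SUBJECT_KINDS = {"node", "edge", "property"}
STATUSES = {"hypothesis", "supported", "refuted", "superseded", "disputed"}


def _now() -> datetime:
    """Current time in UTC, matching the Timestamp(us, UTC) columns."""
    return datetime.now(timezone.utc)


@dataclass(frozen=True)  # frozen: a recorded claim is never edited in place
class Assertion:
    """Hypothetical in-memory mirror of one knowledge/assertions.parquet row."""

    subject_uuid: uuid.UUID
    subject_kind: str
    claim: str
    status: str
    confidence: float
    provenance_uuid: uuid.UUID
    analyst_uuid: uuid.UUID
    hypothesis_group: Optional[uuid.UUID] = None  # nullable
    valid_from: Optional[datetime] = None         # nullable
    valid_to: Optional[datetime] = None           # None = open-ended
    recorded_at: datetime = field(default_factory=_now)
    retracted_at: Optional[datetime] = None       # None = not retracted
    # Assumption: uuid4 stands in for the UUIDv7 the schema calls for.
    assertion_uuid: uuid.UUID = field(default_factory=uuid.uuid4)

    def __post_init__(self) -> None:
        if self.subject_kind not in SUBJECT_KINDS:
            raise ValueError(f"unknown subject_kind: {self.subject_kind!r}")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        # Assumption: confidence is a probability-like value in [0, 1].
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0.0, 1.0]")
        if self.valid_from and self.valid_to and self.valid_to < self.valid_from:
            raise ValueError("valid_to must not precede valid_from")


example = Assertion(
    subject_uuid=uuid.uuid4(),
    subject_kind="edge",
    claim="employed_at <uuid>",
    status="hypothesis",
    confidence=0.4,
    provenance_uuid=uuid.uuid4(),
    analyst_uuid=uuid.uuid4(),
)
```

Each UUID field serializes to 16 bytes via `.bytes`, which lines up with the `FixedSizeBinary(16)` column type.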

2. Status and supersession

Status transitions are append-only. A claim is never overwritten or deleted; it is superseded by a new assertion, and the link is preserved.

knowledge/supersession.parquet:

Column            Type                 Notes
superseding_uuid  FixedSizeBinary(16)  The new assertion
superseded_uuid   FixedSizeBinary(16)  The prior assertion it replaces
reason            Utf8                 Why (free text or rule id)
recorded_at       Timestamp(us, UTC)   When the supersession was recorded

A refuted or superseded assertion remains in assertions.parquet with its evidence and reasoning intact. Rejecting a theory marks status and records supersession; it never destroys the prior claim, its evidence, or its history.
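
The append-only rule can be sketched with two in-memory lists standing in for the two Parquet tables. This is a minimal illustration under stated assumptions: `EpistemicLog`, `assert_claim`, `supersede`, and `history` are hypothetical names, `uuid4` stands in for UUIDv7, and the sketch assumes each new assertion supersedes at most one prior assertion.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Assertion:
    assertion_uuid: uuid.UUID
    claim: str
    status: str


@dataclass(frozen=True)
class Supersession:
    superseding_uuid: uuid.UUID
    superseded_uuid: uuid.UUID
    reason: str
    recorded_at: datetime


class EpistemicLog:
    """Hypothetical stand-in for assertions.parquet and supersession.parquet."""

    def __init__(self) -> None:
        self.assertions: list[Assertion] = []
        self.supersessions: list[Supersession] = []

    def assert_claim(self, claim: str, status: str = "hypothesis") -> Assertion:
        """Append a new claim; nothing already recorded is touched."""
        new = Assertion(uuid.uuid4(), claim, status)
        self.assertions.append(new)
        return new

    def supersede(self, old: Assertion, claim: str, status: str, reason: str) -> Assertion:
        """Append the replacing claim plus a link row; `old` stays in the log."""
        new = self.assert_claim(claim, status)
        self.supersessions.append(
            Supersession(new.assertion_uuid, old.assertion_uuid, reason,
                         datetime.now(timezone.utc))
        )
        return new

    def history(self, latest: Assertion) -> list[Assertion]:
        """Follow supersession links backwards, newest claim first."""
        by_id = {a.assertion_uuid: a for a in self.assertions}
        prior = {s.superseding_uuid: s.superseded_uuid for s in self.supersessions}
        chain = [latest]
        while chain[-1].assertion_uuid in prior:
            chain.append(by_id[prior[chain[-1].assertion_uuid]])
        return chain


log = EpistemicLog()
first = log.assert_claim("owner = Bob")
second = log.supersede(first, "owner = Alice", "supported",
                       reason="registry filing names Alice")
```

In this sketch the earlier row is never rewritten, so whether a claim has been superseded is derived from the link table. How GraphForge records the status change on the prior claim is not specified here and is left open.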

3. Competing hypotheses

Multiple assertions about the same question coexist by sharing a hypothesis_group. None is privileged by storage; the analyst (or a confidence policy) chooses among them, and the choice is itself an assertion (status transition + supersession), leaving the alternatives in place.

hypothesis_group G ("who owns ACME?")
  ├── assertion A: owner = Alice   status=supported   confidence=0.7
  ├── assertion B: owner = Bob     status=refuted     confidence=0.2   (evidence preserved)
  └── assertion C: owner = Carol   status=hypothesis  confidence=0.4
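
The group shown above can be expressed as a short Python sketch in which selecting a preferred assertion is a read-only operation that leaves every alternative in place. The selection rule used here (highest confidence among assertions that are neither refuted nor superseded) is a hypothetical policy chosen for illustration; it is not the conservative_min policy, and the function names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assertion:
    name: str        # short label; stands in for assertion_uuid
    group: str       # stands in for hypothesis_group
    claim: str
    status: str
    confidence: float


# The three competing claims from the example group G.
ASSERTIONS = [
    Assertion("A", "G", "owner = Alice", "supported", 0.7),
    Assertion("B", "G", "owner = Bob", "refuted", 0.2),
    Assertion("C", "G", "owner = Carol", "hypothesis", 0.4),
]


def competing(assertions: list[Assertion], group: str) -> list[Assertion]:
    """Every assertion in the group, including refuted ones."""
    return [a for a in assertions if a.group == group]


def preferred(assertions: list[Assertion], group: str) -> Optional[Assertion]:
    """Hypothetical policy: most confident claim that is still live."""
    live = [a for a in competing(assertions, group)
            if a.status not in {"refuted", "superseded"}]
    return max(live, key=lambda a: a.confidence, default=None)


choice = preferred(ASSERTIONS, "G")
alternatives = competing(ASSERTIONS, "G")
```

Selecting `choice` does not filter or mutate `ASSERTIONS`, so the refuted claim about Bob remains queryable alongside the others.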

4. Evidence attachment

Evidence links an assertion to the observations/documents that support or contradict it. Evidence stays attached regardless of later conclusions — refuting a claim does not detach the evidence that informed it.

knowledge/evidence.parquet:

Column            Type                 Notes
evidence_uuid     FixedSizeBinary(16)  Identity
assertion_uuid    FixedSizeBinary(16)  The claim
source_uuid       FixedSizeBinary(16)  Document / observation / source ref
role              Utf8                 "supports" | "contradicts" | "context"
weight            Float64              Contribution weight
recorded_at       Timestamp(us, UTC)
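
To make the attachment rule concrete, here is a small Python sketch that totals evidence weight per role for one assertion. The lookup ignores the assertion's status, so a refuted claim returns the same evidence as before it was refuted. The names `Evidence` and `weight_by_role`, the string labels used in place of UUIDs, and the idea of summing weights are all assumptions for the example; the source does not say how weights are combined.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    assertion: str  # stands in for assertion_uuid
    source: str     # stands in for source_uuid
    role: str       # "supports" | "contradicts" | "context"
    weight: float


# Hypothetical status lookup: assertion B has been refuted.
STATUS = {"A": "supported", "B": "refuted"}

EVIDENCE = [
    Evidence("A", "doc-1", "supports", 0.5),
    Evidence("B", "doc-2", "supports", 0.5),
    Evidence("B", "doc-3", "contradicts", 0.75),
    Evidence("B", "doc-4", "context", 0.25),
]


def weight_by_role(evidence: list[Evidence], assertion: str) -> dict[str, float]:
    """Sum evidence weight per role; deliberately ignores assertion status."""
    totals: dict[str, float] = defaultdict(float)
    for item in evidence:
        if item.assertion == assertion:
            totals[item.role] += item.weight
    return dict(totals)


refuted_claim_evidence = weight_by_role(EVIDENCE, "B")
```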

5. Reasoning preservation

Reasoning captures why a conclusion was reached or an alternative rejected — the step the provenance lineage cannot express. It references the assertion(s) it explains.

knowledge/reasoning.parquet:

Column            Type                 Notes
reasoning_uuid    FixedSizeBinary(16)  Identity
assertion_uuid    FixedSizeBinary(16)  The claim this reasoning concerns
kind              Utf8                 "justification" | "rejection" | "note"
text              Utf8                 The analyst's reasoning
analyst_uuid      FixedSizeBinary(16)
recorded_at       Timestamp(us, UTC)

6. Bitemporal valid-time

Two independent time axes, both preserved:

  • Assertion time (valid_from / valid_to): the period the world-fact is believed to hold ("Alice worked at ACME 2019–2021").
  • Transaction time (recorded_at / retracted_at): when GraphForge knew the claim ("we recorded this on 2026-06-07; we retracted it on 2026-08-01").

Together these answer "what did we believe, when, and why did it change?", which is point-in-time belief reconstruction. Valid-time lives here, on assertions, not on raw topology nodes (resolving the stale valid_from_ts/valid_to_ts columns that ADR 0006 removed from the graph layer).
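
Point-in-time belief reconstruction over the two axes can be sketched as a filter on both. The following Python is illustrative only: `believed_as_of` is a hypothetical helper, intervals are assumed to be half-open (start inclusive, end exclusive), and the exact dates chosen for "2019–2021" are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Assertion:
    claim: str
    valid_from: Optional[datetime]   # None = unbounded start
    valid_to: Optional[datetime]     # None = open-ended
    recorded_at: datetime
    retracted_at: Optional[datetime] = None


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


def believed_as_of(rows: list[Assertion], known_at: datetime,
                   holds_at: Optional[datetime] = None) -> list[Assertion]:
    """Claims held at transaction time `known_at`.

    If `holds_at` is given, keep only claims whose assertion-time window
    covers that instant.
    """
    result = []
    for row in rows:
        if row.recorded_at > known_at:
            continue  # not yet recorded
        if row.retracted_at is not None and row.retracted_at <= known_at:
            continue  # already retracted by then (the row itself still exists)
        if holds_at is not None:
            if row.valid_from is not None and holds_at < row.valid_from:
                continue
            if row.valid_to is not None and holds_at >= row.valid_to:
                continue
        result.append(row)
    return result


ROWS = [
    Assertion(
        claim="Alice employed_at ACME",
        valid_from=utc(2019, 1, 1),   # assumed start of "2019"
        valid_to=utc(2022, 1, 1),     # assumed exclusive end of "2021"
        recorded_at=utc(2026, 6, 7),
        retracted_at=utc(2026, 8, 1),
    )
]

in_july = believed_as_of(ROWS, known_at=utc(2026, 7, 1))
in_september = believed_as_of(ROWS, known_at=utc(2026, 9, 1))
```

Querying as of July 2026 returns the claim, while querying as of September 2026 does not, even though the row is still present in `ROWS`.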

Preservation-over-deletion (the law)

  • No epistemic record is ever destructively updated or deleted. Changes append; supersession links.
  • Retraction sets retracted_at; it does not remove the row.
  • Refuting/superseding a claim preserves the claim, its evidence, and its reasoning.

Boundary and lightweight guarantees (per ADR 0006)

  • Knowledge layer only. All epistemic tables live under knowledge/ and reference graph objects by *_uuid. Graph topology, properties, and the adjacency index are untouched.
  • Graph-native semantics unaffected. Cypher/traversal/algorithms read only the graph layer. Boundary regression test: a graph with full epistemic history returns identical Cypher results to the same graph without it. The epistemic model never changes graph query results.
  • Capability-gated. knowledge/ is a capability folder (ADR 0006 / project manifest). Absent = the epistemic model is simply not enabled; the graph works exactly as before.
  • Bitemporal is opt-in. Valid-time columns are nullable; transaction-time is cheap. Full bitemporal querying is capability-gated and off by default, with a documented lighter fallback (assertion-time only, dropping transaction-time) if footprint or query complexity proves heavy on a given project. Even at Full scope it is local, single-user Parquet — no server, no infra.
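
The capability gate can be pictured as a check for the knowledge/ folder before any epistemic table is touched. This Python sketch assumes folder presence alone decides the capability. The real check may also consult the project manifest, whose format is not described here, and the function names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional


def epistemic_enabled(project_root: Path) -> bool:
    """Assumption: the capability is on when knowledge/ exists as a folder."""
    return (project_root / "knowledge").is_dir()


def assertions_path(project_root: Path) -> Optional[Path]:
    """Path to the assertions table, or None when the capability is absent."""
    if not epistemic_enabled(project_root):
        return None  # graph-only project: nothing epistemic to load
    return project_root / "knowledge" / "assertions.parquet"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = assertions_path(root)          # capability absent
    (root / "knowledge").mkdir()
    after = assertions_path(root)           # capability present
    enabled_after = epistemic_enabled(root)
```

Callers that receive `None` skip the epistemic model entirely, which matches the guarantee that a project without knowledge/ behaves exactly as before.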

Consequences

Positive

  • GraphForge becomes a system of evolving understanding: competing hypotheses coexist, rejected theories and their evidence survive, reasoning is traceable, and belief is reconstructable at any point in time. This is the stated true-north of the product.
  • The graph stays graph-native and fast; epistemic richness imposes nothing on traversal.
  • Provenance is upgraded from forward-lineage-only to a foundation the epistemic model sits on.

Negative / Risks

  • More tables and more analyst-facing concepts. Mitigated by capability-gating and sensible defaults; a minimal project never sees any of it.
  • Bitemporal modelling is genuinely complex. Mitigated by opt-in, off-by-default, and the assertion-time-only fallback.
  • Confidence policy must be real (no longer hardcoded 1.0). Owned by the Knowledge Layer Foundation milestone, which this model depends on.

Dependencies

  • Depends on the Knowledge Layer Foundation (provenance events actually written, confidence propagated, evidence links) — epistemic status sits on real provenance + evidence.
  • Coordinates with the Conformance milestone's confidence-policy issues (conservative_min, inference-rule recording).

Alternatives Considered

Alternative                                     Rejected because
State-only (status quo)                         Collapses knowledge to the surviving interpretation; destroys competing/rejected hypotheses and reasoning. Directly contradicts the product principle.
Epistemic status as graph node/edge properties  Violates ADR 0006's boundary rule; contaminates graph semantics; bloats the traversal hot path; mutation/overwrite would destroy history.
Versioned graph snapshots                       Heavy, not lightweight; answers "what did the graph look like" but not "why did we believe it / what competed with it." Doesn't model competing hypotheses or reasoning.
Defer to v0.6                                   The user reorientation places the basic-but-complete analyst embodiment, including epistemic integrity, in v0.5.0. Deferral would ship a workbench that forgets its own reasoning.

References