ADR 0002: Rust Core¶
Date: 2026-05-25
Status: Accepted
Branch: rust-core
Context¶
GraphForge v0.4.x is a pure-Python implementation. It passes 100% of the openCypher TCK, ships three namespaced API surfaces (db.execute, db.gds, db.search), and uses SQLite for persistence. Those surface boundaries were an artifact of the Python architecture — the v0.5.0 API collapses them into seven analyst-intent verbs (forge.execute, forge.rank, forge.cluster, forge.paths, forge.analyze, forge.similar, forge.find) with a unified Arrow result contract.
The next architectural goal is multi-language support — a first-class Rust crate API, a Python binding that returns Arrow results, a Node binding that returns Arrow IPC, and Swift and Kotlin bindings via UniFFI. Three requirements drive this:
- Language-independent results — downstream tools should not need to depend on the Python runtime or a Python-specific object model
- Performance — variable-length path expansion, provenance tracking, and large-graph algorithms benefit from native code
- Extensibility — the current pure-Python executor is difficult to extend with custom operators, storage providers, and planner rules
A secondary goal is to make the architecture explicit: the compiler pipeline, ontology model, and Graph IR should be first-class Rust artifacts rather than implicit Python conventions.
Decision¶
Refactor GraphForge to a Rust core with multi-language bindings.
All Rust work stages on the rust-core branch and merges to main only after the merge gates below are met.
Architecture Summary¶
Parser: Recursive Descent + Pratt expression parser¶
Chosen over LALRPOP, pest, and nom.
The parser is a hand-written recursive-descent clause parser with a Pratt expression parser subroutine, a long-established architecture for production language front ends (GCC and Clang, for example, use hand-written recursive-descent parsers). Because no parser generator builds LR tables from the grammar, shift/reduce conflicts cannot arise; every ambiguity is resolved explicitly in code.
See ADR 0003 for the full rationale: LALRPOP was the original plan, but 50 structural shift/reduce conflicts proved unresolvable within LALRPOP's state-merging model. The hand-written parser resolves this permanently and is directly auditable against the openCypher BNF.
- LALRPOP was the original approach; rejected because LALRPOP's state-merging produces structural conflicts in the full openCypher grammar that cannot be resolved without compromising correctness — see ADR 0003
- pest was rejected: PEG-based, different parsing model, cannot reuse the hand-written Tok lexer
- nom was rejected: parser-combinator library optimised for streaming/binary parsing; not a natural fit for a large declarative grammar
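To make the Pratt subroutine concrete, here is a minimal sketch of binding-power-driven expression parsing. The Tok and Expr types, the operator table, and the helper names are illustrative assumptions, not the GraphForge lexer or AST, and the precedence levels cover only a toy subset rather than the full openCypher operator set.

```rust
// Toy token type (assumption); the real GraphForge Tok lexer is not shown in this ADR.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Int(i64),
    Ident(String),
    Op(&'static str), // "+", "-", "*", "/", "=", "AND", "OR"
    LParen,
    RParen,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Var(String),
    Neg(Box<Expr>),
    Binary(Box<Expr>, &'static str, Box<Expr>),
}

// (left, right) binding powers; higher binds tighter.
// right = left + 1 makes every operator here left-associative.
fn binding_power(op: &str) -> Option<(u8, u8)> {
    match op {
        "OR" => Some((1, 2)),
        "AND" => Some((3, 4)),
        "=" => Some((5, 6)),
        "+" | "-" => Some((7, 8)),
        "*" | "/" => Some((9, 10)),
        _ => None,
    }
}

struct Parser {
    toks: Vec<Tok>,
    pos: usize,
}

impl Parser {
    fn peek(&self) -> Option<&Tok> {
        self.toks.get(self.pos)
    }

    fn bump(&mut self) -> Option<Tok> {
        let t = self.toks.get(self.pos).cloned();
        self.pos += 1;
        t
    }

    // Parse an expression whose operators all bind at least as tightly as min_bp.
    fn parse_expr(&mut self, min_bp: u8) -> Result<Expr, String> {
        // Prefix position: literal, variable, unary minus, or parenthesised expression.
        let mut lhs = match self.bump() {
            Some(Tok::Int(n)) => Expr::Int(n),
            Some(Tok::Ident(s)) => Expr::Var(s),
            Some(Tok::Op("-")) => Expr::Neg(Box::new(self.parse_expr(11)?)),
            Some(Tok::LParen) => {
                let inner = self.parse_expr(0)?;
                match self.bump() {
                    Some(Tok::RParen) => inner,
                    other => return Err(format!("expected ')', found {:?}", other)),
                }
            }
            other => return Err(format!("unexpected token {:?}", other)),
        };
        // Infix loop: keep extending lhs while the next operator binds tightly enough.
        loop {
            let op = match self.peek() {
                Some(Tok::Op(op)) => *op,
                _ => break,
            };
            let (l_bp, r_bp) =
                binding_power(op).ok_or_else(|| format!("unknown operator {}", op))?;
            if l_bp < min_bp {
                break;
            }
            self.bump();
            let rhs = self.parse_expr(r_bp)?;
            lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
        }
        Ok(lhs)
    }
}

// Parse a whole token stream, rejecting trailing tokens.
fn parse(toks: Vec<Tok>) -> Result<Expr, String> {
    let mut p = Parser { toks, pos: 0 };
    let expr = p.parse_expr(0)?;
    if p.pos < p.toks.len() {
        return Err(format!("trailing token {:?}", p.toks[p.pos]));
    }
    Ok(expr)
}

// Render the tree as an S-expression so the grouping is visible.
fn show(e: &Expr) -> String {
    match e {
        Expr::Int(n) => n.to_string(),
        Expr::Var(s) => s.clone(),
        Expr::Neg(x) => format!("(- {})", show(x)),
        Expr::Binary(l, op, r) => format!("({} {} {})", op, show(l), show(r)),
    }
}
```

Feeding the tokens for `a = 1 + 2 * 3 OR b` through parse and show groups the multiplication first, then the addition, then the comparison, with OR outermost. Precedence lives entirely in the binding_power table, which is what keeps this style easy to audit against a grammar's operator hierarchy.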
Execution: DataFusion + custom graph operators¶
Chosen over Polars-as-executor and a fully custom executor.
DataFusion exposes exactly the extension points GraphForge needs:
- Custom LogicalPlan and ExecutionPlan nodes for graph-native operators
- Custom TableProvider for Parquet, SQLite, DuckDB
- Custom optimizer rules for predicate pushdown and label-constraint hoisting
- A QueryPlanner hook so GraphForge IR is lowered directly rather than parsing SQL
Polars is an excellent dataframe engine but does not expose these extension points. A fully custom executor would require implementing scan interfaces, optimizer rules, batch streams, and partitioning from scratch — all of which DataFusion already provides.
Polars is retained as a storage-layer companion for IO/sinks (Parquet, CSV, JSON/NDJSON, IPC) and as an optional convenience layer in the Python binding. It is not the semantic owner of query execution.
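As a rough illustration of what a custom optimizer rule does, the sketch below pushes a label predicate into a node scan on a toy plan tree. It deliberately uses no DataFusion types: the Plan enum and push_label_into_scan are hypothetical names invented for this example, and a real rule would be written against DataFusion's logical plan and optimizer-rule interfaces instead.

```rust
// Toy logical plan (assumption); stands in for DataFusion's LogicalPlan.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    // Scan nodes, optionally restricted to one label at the source.
    NodeScan { label: Option<String> },
    // Keep only rows whose node carries `label`.
    LabelFilter { label: String, input: Box<Plan> },
    // Keep the first `n` rows.
    Limit { n: usize, input: Box<Plan> },
}

// Rewrite LabelFilter(NodeScan) into a label-restricted NodeScan.
// The filter is NOT pushed through Limit, because filtering before
// limiting would change which rows survive.
fn push_label_into_scan(plan: Plan) -> Plan {
    match plan {
        Plan::LabelFilter { label, input } => match push_label_into_scan(*input) {
            // Unrestricted scan directly below: fold the predicate into it.
            Plan::NodeScan { label: None } => Plan::NodeScan { label: Some(label) },
            // Anything else: keep the filter where it is.
            other => Plan::LabelFilter { label, input: Box::new(other) },
        },
        Plan::Limit { n, input } => Plan::Limit {
            n,
            input: Box::new(push_label_into_scan(*input)),
        },
        scan @ Plan::NodeScan { .. } => scan,
    }
}
```

The rule is a pure tree-to-tree function, so it can be unit-tested without running a query. That property is the main reason to express graph-specific rewrites as optimizer rules rather than as special cases inside operators.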
Wire contract: Arrow¶
Arrow provides:
- A stable, language-independent columnar memory format
- Zero-copy in-process exchange via the C Data Interface
- Python exchange via the PyCapsule Interface (no hard PyArrow dependency)
- Node consumption via Arrow IPC and tableFromIPC in Apache Arrow JS
Arrow schema metadata carries GraphForge-specific annotations (ir_version, ontology_version, query_id, provenance_policy) that survive IPC and Parquet round-trips.
The AST is not the cross-language contract. The Graph IR (semver-versioned) and the Arrow result schema are the stable boundaries.
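Because Arrow schema metadata is a map of string keys to string values, a consumer can check the ir_version annotation before interpreting a result. The sketch below models the metadata as a plain HashMap and assumes a compatibility policy (same major version, minor no newer than the reader's) that this ADR does not specify; the Version type and function names are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Version {
    major: u32,
    minor: u32,
    patch: u32,
}

// Parse a strict "major.minor.patch" string; pre-release tags are not handled.
fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.split('.');
    let major = parts.next()?.parse::<u32>().ok()?;
    let minor = parts.next()?.parse::<u32>().ok()?;
    let patch = parts.next()?.parse::<u32>().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(Version { major, minor, patch })
}

// Assumed policy (not stated in the ADR): the reader accepts a result whose
// IR major version matches its own and whose minor version is not newer.
fn check_ir_version(
    reader: Version,
    metadata: &HashMap<String, String>,
) -> Result<Version, String> {
    let raw = metadata
        .get("ir_version")
        .ok_or_else(|| "missing ir_version metadata".to_string())?;
    let found =
        parse_version(raw).ok_or_else(|| format!("malformed ir_version: {}", raw))?;
    if found.major != reader.major {
        Err(format!("incompatible IR major version {}", found.major))
    } else if found.minor > reader.minor {
        Err(format!("IR minor version {} is newer than reader", found.minor))
    } else {
        Ok(found)
    }
}
```

A binding would run a check like this once per result, before handing batches to the caller, so that version mismatches surface as a structured error rather than as misread columns.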
Storage: Parquet only (initial scope)¶
Parquet is the sole storage provider for the Rust core. Parquet file metadata carries GraphForge provenance annotations. The StorageProvider trait is designed for extension, but no additional backends (SQLite, DuckDB) are in scope for the initial implementation.
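One way to picture a trait that is designed for extension is sketched below. The method names, the Batch stand-in, and the in-memory provider are all assumptions made for illustration; the actual StorageProvider signature is not given in this ADR, and a real implementation would return Arrow record batches rather than this toy struct.

```rust
use std::collections::HashMap;

// Stand-in for an Arrow RecordBatch (assumption): rows of (node id, label).
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    rows: Vec<(u64, String)>,
}

// Hypothetical shape of the extension point; not the real trait definition.
trait StorageProvider {
    // Human-readable backend name, e.g. for explain output.
    fn name(&self) -> &str;
    // Scan nodes, optionally restricted to a single label.
    fn scan_nodes(&self, label: Option<&str>) -> Result<Vec<Batch>, String>;
    // Provenance annotations the backend attaches to its data.
    fn provenance(&self) -> HashMap<String, String>;
}

// In-memory test double; a Parquet-backed provider would implement the same trait.
struct InMemoryProvider {
    nodes: Vec<(u64, String)>,
}

impl StorageProvider for InMemoryProvider {
    fn name(&self) -> &str {
        "in-memory"
    }

    fn scan_nodes(&self, label: Option<&str>) -> Result<Vec<Batch>, String> {
        let rows: Vec<(u64, String)> = self
            .nodes
            .iter()
            .filter(|(_, l)| label.map_or(true, |want| l.as_str() == want))
            .cloned()
            .collect();
        Ok(vec![Batch { rows }])
    }

    fn provenance(&self) -> HashMap<String, String> {
        let mut m = HashMap::new();
        m.insert("provider".to_string(), self.name().to_string());
        m
    }
}

// Query-side code depends only on the trait, never on a concrete backend.
fn count_nodes(provider: &dyn StorageProvider, label: Option<&str>) -> Result<usize, String> {
    let batches = provider.scan_nodes(label)?;
    Ok(batches.iter().map(|b| b.rows.len()).sum())
}
```

Because count_nodes takes a trait object, adding a backend later means adding one more impl block; callers and query semantics stay unchanged.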
User-Facing API¶
v0.5.0 replaces the three namespaced surfaces (db.gds, db.search, db.execute) with seven analyst-intent verbs on a single forge instance:
| Method | What it does | Returns |
|---|---|---|
| forge.execute(cypher) | Run an openCypher query | Arrow Table |
| forge.rank(label, by=...) | Score every node (centrality, structural, link prediction) | Arrow Table + score |
| forge.cluster(label, by=...) | Assign community membership | Arrow Table + community_id |
| forge.paths(source, target, by=...) | Find paths, compute flow, or traverse | Arrow Table + cost, path |
| forge.analyze(label, by=...) | Spanning trees, DAG analysis, coloring, matching, embeddings | Arrow Table (varies) |
| forge.similar(label, by=...) | Pairwise node similarity | Arrow Table + similarity |
| forge.find(query, ...) | Text/vector/hybrid search with lazy indexing | Arrow Table + score + matched_on |
rank(), cluster(), and paths() accept write_property for opt-in graph mutation. find() indexes lazily on first call; forge.index() is available for explicit control.
Full algorithm catalog: docs/architecture/algorithms.md
Bindings¶
| Language | Mechanism | Result type |
|---|---|---|
| Python | PyO3 + maturin | pyarrow.Table or RecordBatchReader |
| Node | napi-rs | Arrow IPC Buffer → tableFromIPC(buf) |
| Swift | UniFFI | Arrow IPC Data → GraphForgeResult |
| Kotlin | UniFFI | Arrow IPC ByteArray → GraphForgeResult |
Ontology: runtime-loadable¶
The ontology is a runtime-loadable metadata model, not baked into Rust types at compile time. Authoring format is YAML or JSON (Serde); execution format is Arrow tables compiled at load time; persistence format is Parquet.
Migration Strategy¶
Fork-first¶
All Rust work is done on the rust-core branch. The Python 0.4.x codebase on main is the reference implementation and remains the production release until the merge gates below are met.
Merge gates¶
| Gate | Requirement |
|---|---|
| Parser parity | Existing corpus + syntax goldens pass against recursive-descent + Pratt parser |
| openCypher conformance | Agreed TCK subset passes end-to-end |
| Ontology runtime | Load/validate/migrate round-trips pass |
| Data contract | Arrow/Parquet/IPC round-trips pass |
| Provider baseline | Parquet provider passes core semantics |
| Binding baseline | Python, Node, Swift, and Kotlin can execute queries and consume Arrow results |
| Observability | explain, query IDs, provenance IDs, structured errors available |
Spike sequence (uncertainty reduction)¶
- Parser spike: implement a thin openCypher subset with the recursive-descent + Pratt parser
- Ontology spike: load YAML/JSON → Arrow tables → run validator
- Execution spike: lower one MATCH query to DataFusion → return Arrow batches
- Binding spike: deliver that Arrow payload to Python and Node
- SQLite spike: confirm WAL + BEGIN IMMEDIATE + partial indexes + generated columns
Consequences¶
Positive¶
- First-class Rust crate API, Python, Node, Swift, and Kotlin bindings sharing a single semantic core
- Arrow as a stable, language-independent result contract eliminates per-language serialization
- DataFusion's optimizer and execution framework accelerate complex query handling without building a full executor from scratch
- Pluggable storage providers start with Parquet and leave room for SQLite and DuckDB later without changing query semantics
- The ontology as a runtime-loadable model enables schema evolution without recompilation
Negative / Risks¶
- The Rust refactor is a significant engineering investment; the rust-core branch will diverge from main for an extended period
- The parity corpus and differential testing infrastructure need to be built alongside the new parser
- DataFusion's extension APIs are usable today but still change between releases; custom node implementations will need to track upstream changes
- The merge gates are strict by design; shipping pressure should not lower them
Mitigations¶
- Strict fork-first strategy with defined merge gates avoids "perpetual refactor" drift
- Spike tasks are sequenced to collapse uncertainty fastest before committing to the full implementation
- The Python 0.4.x release is not blocked; it continues to ship from main during the refactor
Alternatives Considered¶
| Alternative | Rejected because |
|---|---|
| Polars as primary executor | Does not expose compiler extension points (custom plan nodes, optimizer rules, table providers) needed for graph-native semantics |
| Fully custom Rust executor | Front-loads the highest-risk work (scan interfaces, optimizer, batch streams) that DataFusion already provides |
| Stay Python | Cannot deliver first-class Rust or Node bindings; path expansion and provenance tracking remain slow without native code |
| DuckDB as semantic owner | DuckDB is an excellent accelerator but its extension model is not designed for owning a graph IR and custom operator semantics |
References¶
- Architecture Overview
- AST & Planning
- Execution Model
- Storage Architecture
- ADR 0006: Layered Architecture — the graph/knowledge/workbench layer model and crate-boundary rule that organises the crates listed above