GraphForge Roadmap

Last Updated: 2026-06-07
Current Version: 0.4.0

Released

v0.3.8 — Full TCK Compliance

3,885/3,885 openCypher TCK scenarios passing (zero failures, zero expected failures)
First embedded Python graph database with complete openCypher TCK compliance
Full Cypher language support: MATCH, WHERE, RETURN, CREATE, SET, REMOVE, DELETE, MERGE, UNWIND, WITH, ORDER BY, SKIP, LIMIT, OPTIONAL MATCH, variable-length paths, pattern comprehension, temporal types, all standard functions

v0.3.9 — Performance

Theme: Maximize the performance of the v0.3.x feature set — O(n) parsing, node/edge indexes, bulk ingestion, SQLite durability tuning, and LIMIT short-circuit.

Area	Work	PR
LALR(1) parser migration	Linear-time parsing, unknown function detection	#430
Property equality index	O(1) node lookup by property value	#427
LIMIT short-circuit	Traversal stops as soon as the LIMIT is satisfied; UNWIND + WITH LIMIT optimised	#423, #443
Bulk ingestion API	`create_node_bulk`, `bulk_ingest()` context manager	#444
SQLite PRAGMA tuning	`synchronous=NORMAL`, 64 MB cache, `temp_store=MEMORY`	#446
Recursion limit fix	`sys.setrecursionlimit` is now set once at init, in a thread-safe way	#445
`elementId()` function	GQL-spec string form of element identity	#447
Test suite renovation	Zero-base fixture audit, marker consistency, parametrization	#438
Perf baseline suite	Real-dataset benchmarks (S/M/L/XL tiers), v0.3.8→v0.3.9 delta	#441
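
The PRAGMA values listed for #446 can be applied with Python's standard `sqlite3` module. The sketch below is only an illustration of those three settings, not GraphForge's actual initialization code; the helper names `open_tuned` and `read_pragmas` are hypothetical. SQLite interprets a negative `cache_size` as a size in KiB, so `-65536` requests a page cache of about 64 MB.

```python
import os
import sqlite3
import tempfile


def open_tuned(path: str) -> sqlite3.Connection:
    """Open a SQLite database and apply the three PRAGMAs from the table.

    Hypothetical helper for illustration; not part of the GraphForge API.
    """
    conn = sqlite3.connect(path)
    # NORMAL syncs to disk less often than the default FULL mode.
    conn.execute("PRAGMA synchronous=NORMAL")
    # Negative values are in KiB: -65536 KiB is a 64 MiB page cache.
    conn.execute("PRAGMA cache_size=-65536")
    # Keep temporary tables and indexes in memory rather than on disk.
    conn.execute("PRAGMA temp_store=MEMORY")
    return conn


def read_pragmas(conn: sqlite3.Connection) -> dict:
    """Read back the current value of each tuned PRAGMA."""
    names = ("synchronous", "cache_size", "temp_store")
    return {name: conn.execute(f"PRAGMA {name}").fetchone()[0] for name in names}
```

SQLite reports `synchronous` and `temp_store` as integers when read back (1 for NORMAL, 2 for MEMORY), which is a convenient way to confirm that the settings took effect on a given connection.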

v0.3.10 — Analytics Integration

Theme: Bridge GraphForge to the Python analytics ecosystem. Export graphs directly to NetworkX and igraph; add a parse/plan cache to eliminate repeated parse overhead in notebook loops.

Issue	Scope
#391	`gf.to_networkx()` and `gf.to_igraph()` with optional subgraph filtering
#504	Parse/plan LRU cache, `GraphForge(cache_size=128)`
#502	`add_graph_documents()` LangChain-compatible ingestion

Release tracker: #448
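
To illustrate why a parse/plan LRU cache helps in notebook loops, here is a minimal sketch of a cache keyed on the query text. Only the `cache_size=128` default comes from the roadmap; the `PlanCache` class, its `get` method, and the `parse` callable are hypothetical stand-ins, and the real cache may key on more than the raw query string.

```python
from collections import OrderedDict


class PlanCache:
    """Least-recently-used cache for parsed queries (illustrative sketch).

    `parse` is any callable that turns query text into a plan object; it is
    a placeholder for the real parser/planner, which is not shown here.
    """

    def __init__(self, parse, cache_size=128):
        self._parse = parse
        self._cache_size = cache_size
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, query):
        if query in self._entries:
            # Mark the entry as most recently used and skip parsing.
            self._entries.move_to_end(query)
            self.hits += 1
            return self._entries[query]
        self.misses += 1
        plan = self._parse(query)
        self._entries[query] = plan
        if len(self._entries) > self._cache_size:
            # Evict the least recently used entry (the oldest key).
            self._entries.popitem(last=False)
        return plan
```

A loop that runs the same query text many times would pay the parse cost once and hit the cache afterwards; queries that differ only in literal values would still miss unless parameters are passed separately from the query text.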

v0.4.0 — Three-Surface API (Algorithms + Search)

Released: 2026-05-07

Theme: Extend GraphForge beyond Cypher with two new API surfaces for graph algorithms and hybrid retrieval — without adding Cypher extensions.

db.execute(...)                         # Cypher — openCypher query engine
db.gds.pagerank(write_property="rank")  # Algorithms — igraph / NetworkX backends
db.search("query", vector=embedding)    # Retrieval — FTS5 + vector cosine, RRF fusion

db.gds — 8 compiled algorithms: pagerank, betweenness_centrality, closeness_centrality, degree_centrality, louvain, connected_components, clustering_coefficient, triangle_count.

db.search — FTS5 text search, vector cosine similarity, and RRF-fused hybrid. Returns list[SearchHit] with score provenance.
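
For readers unfamiliar with RRF (reciprocal rank fusion), the sketch below shows the general technique: each result list contributes `1 / (k + rank)` per item, and the contributions are summed. This is not GraphForge's implementation. The function name `rrf_fuse` is hypothetical, and `k=60` is the constant commonly used in the RRF literature, assumed here rather than taken from the project.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several best-first rankings with reciprocal rank fusion.

    rankings: list of lists of item ids, each ordered best-first.
    Returns a list of (item_id, score) pairs, highest score first.
    Illustrative sketch; k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            # Items near the top of any list receive a larger contribution.
            scores[item] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a stable order.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
```

An item that appears in both the text ranking and the vector ranking accumulates two contributions, so it tends to outrank an item that is first in only one list.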

graphforge.recipes — neighbourhood() n-hop context builder for LLM prompts.
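
An n-hop context builder is, at its core, a bounded breadth-first search. The sketch below shows that idea over a plain adjacency dictionary; it does not reproduce the signature or output format of `neighbourhood()`, and the function name `n_hop_neighbourhood` and the dict-based graph are assumptions made for illustration.

```python
from collections import deque


def n_hop_neighbourhood(adjacency, start, hops):
    """Return {node: distance} for every node within `hops` edges of `start`.

    adjacency: dict mapping a node to an iterable of neighbouring nodes.
    Illustrative sketch only; not the graphforge.recipes API.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        depth = distance[node]
        if depth == hops:
            # Nodes at the hop limit are included but not expanded further.
            continue
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = depth + 1
                queue.append(neighbour)
    return distance
```

The returned distances could then be used to order or truncate the context placed into a prompt, for example by listing closer nodes first.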

Release tracker: #394

Planned

v0.5.0 — Rust Core

Theme: Replace the Python executor with a Rust core and redesign the API around analyst intent. The three namespaced Python surfaces (db.gds, db.search, db.execute) are replaced by seven analyst-intent verbs — all returning Apache Arrow Tables:

forge.execute(cypher)              # openCypher → Arrow Table
forge.rank(label, by=...)          # centrality, structural scoring → Arrow Table + score
forge.cluster(label, by=...)       # community detection, components → Arrow Table + community_id
forge.paths(source, target, by=…)  # shortest paths, flow, reachability → Arrow Table
forge.analyze(label, by=…)         # spanning trees, DAG, coloring, matching, embeddings → Arrow Table
forge.similar(label, by=…)         # pairwise node similarity → Arrow Table
forge.find(query, …)               # text/vector/hybrid search → Arrow Table + score + matched_on

Full algorithm catalog: Algorithm Verbs

Architecture decisions: ADR 0002, ADR 0003

Milestones

The milestones below were renumbered when the adjacency index (ADR 0005) was adopted as a first-class layer and the Architecture Baseline was moved to its chronological slot. Because GitHub milestone IDs are immutable, the M## prefix in each milestone title, not the GitHub ID, is the canonical sequence number shown here. See refactor-v0.5 §8.

#	Milestone	State	Deliverable	Exit criteria
9	Compiler skeleton	closed	AST, spans, token model, LALRPOP parser harness	Differential parse tests passing on corpus
10	Ontology runtime	closed	YAML/JSON loader, normalized Arrow tables, validator	Load/validate/migrate round-trips passing
11	Architecture baseline	closed	UUID identity, typed edge tables, topology/properties split, project structure (docs)	Architecture reference adopted
12	Graph IR	closed	Typed graph IR + serde envelope + `explain` output	AST→IR golden tests stable
13	Relational lowering	closed	Lower core MATCH/WHERE/RETURN subset to DataFusion	Logical-plan tests stable
14	Execution baseline	open	DataFusion-backed execution on Parquet provider	End-to-end query tests passing; Arrow RecordBatch streams returned
15	Adjacency index baseline	open	Consolidated derived CSR adjacency index under `indexes/adjacency/`	No IR change; never alters results; gates M18
16	Bindings baseline	open	`forge.execute()` → PyArrow Table; Node IPC results	Smoke tests and packaging passing — Python + Node
17	Conformance hardening	open	openCypher TCK subset, fuzzing, provenance/confidence rules	TCK threshold met; all merge gates pass
18	Rank and cluster	open	`forge.rank/cluster` → Arrow Tables	All `by=` values tested; write-back working
19	Find	open	`forge.find()` + `forge.index()` → Arrow Tables	Text, vector, hybrid; lazy indexing working
20	Layering & boundary reconciliation	open	Layer docs/ADRs (0006/0007); boundary gate regression test	Graph-native query results independent of knowledge layer
21	Knowledge layer foundation	open	Write provenance events/lineage; propagate confidence	Knowledge attaches by UUID reference only; boundary gate green
22	Epistemic model	open	Assertions/status, supersession, evidence, bitemporal valid-time	Preservation-over-deletion; boundary gate green
23	v0.5.0 release	open	Agent-skills work + comprehensive close-out (#742)	M11–M22 complete; boundary gate green
24	Swift + Kotlin bindings (v0.5.1)	open	UniFFI-generated Swift Package + Kotlin JAR	Round-trip tests + CI packaging green for both languages
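
Milestone 15 refers to a derived CSR (compressed sparse row) adjacency index. As background, the sketch below builds the two CSR arrays from an edge list in pure Python. It illustrates the general layout only, not the format stored under `indexes/adjacency/`. The function names are hypothetical, and dense integer node ids are assumed (the roadmap specifies UUID identity, so a mapping from UUIDs to integers is assumed to exist elsewhere).

```python
def build_csr(num_nodes, edges):
    """Build CSR arrays (offsets, targets) from (src, dst) integer pairs.

    Neighbours of node i are targets[offsets[i]:offsets[i + 1]].
    Illustrative sketch; assumes 0 <= src, dst < num_nodes.
    """
    edges = list(edges)
    offsets = [0] * (num_nodes + 1)
    # Count the out-degree of each source node.
    for src, _ in edges:
        offsets[src + 1] += 1
    # A prefix sum turns the counts into start positions.
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot for each source node
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbours(offsets, targets, node):
    """Return the out-neighbours of `node` as a list slice."""
    return targets[offsets[node]:offsets[node + 1]]
```

Because the index is derived entirely from the edge list, it can be rebuilt at any time, which is consistent with the exit criterion that it never alters query results.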

Merge gates (all required before merging to `main`)

Parser parity — RD+Pratt corpus + syntax goldens pass
openCypher TCK subset — agreed compliance threshold met
Ontology round-trips — load/validate/migrate stable
Arrow/IPC round-trips — data contract stable across all five language bindings
Parquet provider — core semantics verified
Python + Node bindings — packaging and smoke tests pass
Swift + Kotlin bindings — UniFFI packaging and smoke tests pass
Observability — explain, query IDs, provenance IDs, structured errors
All seven analyst verbs — Arrow Tables, write-back, via/directed filters
forge.find() / forge.index() — lazy indexing, text + vector + hybrid

Language Binding Matrix

Language	Mechanism	Result	Milestone
Python	PyO3 + maturin	`pyarrow.Table`	16
Node / TypeScript	napi-rs	Arrow IPC `Buffer`	16
Swift	UniFFI	Arrow IPC `Data` → `GraphForgeResult`	24
Kotlin / JVM	UniFFI	Arrow IPC `ByteArray` → `GraphForgeResult`	24
Rust	Native crate	`ExecutionResult`	14

Version Numbering

GraphForge follows Semantic Versioning and is pre-v1.0. The 0.x version series signals that the API is still maturing.

Patch (0.x.y): Bug fixes, small improvements, no API changes
Minor (0.x.0): New features; backwards-compatible where practical
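
The patch/minor distinction can be checked mechanically by comparing version components as integers rather than as strings, which matters for sequences such as 0.3.9 followed by 0.3.10. The helpers below are a hypothetical sketch and not part of GraphForge; they assume plain `MAJOR.MINOR.PATCH` strings with an optional leading `v`.

```python
def parse_version(text):
    """Split 'v0.3.10' or '0.3.10' into a tuple of integers (0, 3, 10)."""
    major, minor, patch = (int(part) for part in text.lstrip("v").split("."))
    return major, minor, patch


def bump_kind(old, new):
    """Classify the step from `old` to `new` as 'major', 'minor', or 'patch'.

    Illustrative helper; raises ValueError if `new` is not a later version.
    """
    old_parts, new_parts = parse_version(old), parse_version(new)
    if new_parts <= old_parts:
        raise ValueError(f"{new} is not later than {old}")
    if new_parts[0] != old_parts[0]:
        return "major"
    if new_parts[1] != old_parts[1]:
        return "minor"
    return "patch"
```

Under this scheme the step from v0.3.9 to v0.3.10 is a patch release and the step from v0.3.10 to v0.4.0 is a minor release.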

v0.5.0 is the Rust core release. v1.0 will follow once the API is stable enough to support a long-term compatibility commitment.