GraphForge Roadmap

Last Updated: 2026-06-07
Current Version: 0.4.0


Released

v0.3.8 — Full TCK Compliance

  • 3,885/3,885 openCypher TCK scenarios passing (zero failures, zero expected failures)
  • First embedded Python graph database with complete openCypher TCK compliance
  • Full Cypher language support: MATCH, WHERE, RETURN, CREATE, SET, REMOVE, DELETE, MERGE, UNWIND, WITH, ORDER BY, SKIP, LIMIT, OPTIONAL MATCH, variable-length paths, pattern comprehension, temporal types, all standard functions

v0.3.9 — Performance

Theme: Maximize the performance of the v0.3.x feature set — O(n) parsing, node/edge indexes, bulk ingestion, SQLite durability tuning, and LIMIT short-circuit.

Area | Work | PR
LALR(1) parser migration | Linear-time parsing, unknown function detection | #430
Property equality index | O(1) node lookup by property value | #427
LIMIT short-circuit | Traversal stops at demand; UNWIND + WITH LIMIT optimised | #423, #443
Bulk ingestion API | create_node_bulk, bulk_ingest() context manager | #444
SQLite PRAGMA tuning | synchronous=NORMAL, 64 MB cache, temp_store=MEMORY | #446
Recursion limit fix | sys.setrecursionlimit thread-safe, set once at init | #445
elementId() function | GQL-spec string form of element identity | #447
Test suite renovation | Zero-base fixture audit, marker consistency, parametrization | #438
Perf baseline suite | Real-dataset benchmarks (S/M/L/XL tiers), v0.3.8→v0.3.9 delta | #441
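The SQLite PRAGMA tuning row can be illustrated with the standard library alone. This is a minimal sketch, not GraphForge's actual connection setup: the helper name open_tuned is hypothetical, and the only details taken from the roadmap are the three settings themselves. SQLite reads a negative cache_size as a size in KiB, so -65536 corresponds to the 64 MB cache.

```python
import sqlite3


def open_tuned(path: str = ":memory:") -> sqlite3.Connection:
    """Open a SQLite connection with the three PRAGMAs named in the table.

    Hypothetical helper for illustration; GraphForge's own setup may differ.
    """
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA synchronous=NORMAL")  # fewer fsyncs than the FULL setting
    conn.execute("PRAGMA cache_size=-65536")   # negative value = KiB, so 64 MiB
    conn.execute("PRAGMA temp_store=MEMORY")   # keep temp tables and indices in RAM
    return conn


conn = open_tuned()
# Read the settings back; SQLite reports them as integers.
settings = {
    "synchronous": conn.execute("PRAGMA synchronous").fetchone()[0],
    "cache_size": conn.execute("PRAGMA cache_size").fetchone()[0],
    "temp_store": conn.execute("PRAGMA temp_store").fetchone()[0],
}
```

These PRAGMAs are per-connection settings, so a helper like this would need to run every time a connection is opened.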

v0.3.10 — Analytics Integration

Theme: Bridge GraphForge to the Python analytics ecosystem. Export graphs directly to NetworkX and igraph; add a parse/plan cache to eliminate repeated parse overhead in notebook loops.

Issue | Scope
#391 | gf.to_networkx() and gf.to_igraph() with optional subgraph filtering
#504 | Parse/plan LRU cache, GraphForge(cache_size=128)
#502 | add_graph_documents() LangChain-compatible ingestion
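To show why a parse/plan cache helps in notebook loops, here is a minimal sketch built on functools.lru_cache. It assumes the cache is keyed on the raw query text; the roadmap does not say how GraphForge keys or stores its cache. The names make_cached_parser and toy_parse are hypothetical stand-ins, and only the default size of 128 comes from the table above.

```python
from functools import lru_cache


def make_cached_parser(parse, cache_size: int = 128):
    """Wrap a parse function in an LRU cache keyed on the query text."""
    return lru_cache(maxsize=cache_size)(parse)


calls = []


def toy_parse(query: str) -> tuple:
    # Hypothetical stand-in for a real parser: records each call
    # and returns a tuple of whitespace-separated tokens.
    calls.append(query)
    return tuple(query.split())


cached_parse = make_cached_parser(toy_parse, cache_size=128)

# A notebook-style loop that re-issues the same query text many times.
for _ in range(1000):
    cached_parse("MATCH (n) RETURN n")

info = cached_parse.cache_info()  # hits/misses counters kept by lru_cache
```

A cache keyed on text only pays off when the text repeats, so under this assumption queries that pass values as parameters would benefit more than queries with literals spliced into the string.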

Release tracker: #448


v0.4.0

Released: 2026-05-07

Theme: Extend GraphForge beyond Cypher with two new API surfaces for graph algorithms and hybrid retrieval — without adding Cypher extensions.

db.execute(...)                         # Cypher — openCypher query engine
db.gds.pagerank(write_property="rank")  # Algorithms — igraph / NetworkX backends
db.search("query", vector=embedding)    # Retrieval — FTS5 + vector cosine, RRF fusion

db.gds — 8 compiled algorithms: pagerank, betweenness_centrality, closeness_centrality, degree_centrality, louvain, connected_components, clustering_coefficient, triangle_count.

db.search — FTS5 text search, vector cosine similarity, and RRF-fused hybrid. Returns list[SearchHit] with score provenance.

graphforge.recipes: neighbourhood(), an n-hop context builder for LLM prompts.
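The RRF fusion mentioned for db.search can be sketched in a few lines. Reciprocal rank fusion scores each item as the sum of 1 / (k + rank) over the ranked lists it appears in. The constant k = 60 is a common default, not a value confirmed for GraphForge. The function rrf_fuse and its tuple return shape are hypothetical and do not represent the real SearchHit type.

```python
from __future__ import annotations

from collections import defaultdict


def rrf_fuse(
    rankings: dict[str, list[str]], k: int = 60
) -> list[tuple[str, float, dict[str, int]]]:
    """Fuse ranked id lists with reciprocal rank fusion.

    Returns (id, fused score, rank per source), best first. The per-source
    ranks act as a simple form of score provenance.
    """
    scores: dict[str, float] = defaultdict(float)
    provenance: dict[str, dict[str, int]] = defaultdict(dict)
    for source, ids in rankings.items():
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
            provenance[doc_id][source] = rank
    # Sort by descending score, breaking ties by id for determinism.
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return [(d, scores[d], dict(provenance[d])) for d in ordered]


hits = rrf_fuse({
    "text": ["a", "b", "c"],    # e.g. ids ranked by a full-text search
    "vector": ["b", "d", "a"],  # e.g. ids ranked by cosine similarity
})
```

Because RRF uses only ranks, text scores and cosine similarities never have to be put on a common scale before fusing.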

Release tracker: #394


Planned


v0.5.0 — Rust Core

Theme: Replace the Python executor with a Rust core and redesign the API around analyst intent. The three namespaced Python surfaces (db.gds, db.search, db.execute) are replaced by seven analyst-intent verbs — all returning Apache Arrow Tables:

forge.execute(cypher)                # openCypher → Arrow Table
forge.rank(label, by=...)            # centrality, structural scoring → Arrow Table + score
forge.cluster(label, by=...)         # community detection, components → Arrow Table + community_id
forge.paths(source, target, by=...)  # shortest paths, flow, reachability → Arrow Table
forge.analyze(label, by=...)         # spanning trees, DAG, coloring, matching, embeddings → Arrow Table
forge.similar(label, by=...)         # pairwise node similarity → Arrow Table
forge.find(query, ...)               # text/vector/hybrid search → Arrow Table + score + matched_on

Full algorithm catalog: Algorithm Verbs

Architecture decisions: ADR 0002, ADR 0003

Milestones

The milestone sequence below was renumbered when the adjacency index (ADR 0005) was adopted as a first-class layer and the Architecture Baseline moved to its chronological slot. GitHub milestone IDs are immutable; the M## prefix in each milestone title is the canonical sequence shown here. See refactor-v0.5 §8.

# | Milestone | State | Deliverable | Exit criteria
9 | Compiler skeleton | closed | AST, spans, token model, LALRPOP parser harness | Differential parse tests passing on corpus
10 | Ontology runtime | closed | YAML/JSON loader, normalized Arrow tables, validator | Load/validate/migrate round-trips passing
11 | Architecture baseline | closed | UUID identity, typed edge tables, topology/properties split, project structure (docs) | Architecture reference adopted
12 | Graph IR | closed | Typed graph IR + serde envelope + explain output | AST→IR golden tests stable
13 | Relational lowering | closed | Lower core MATCH/WHERE/RETURN subset to DataFusion | Logical-plan tests stable
14 | Execution baseline | open | DataFusion-backed execution on Parquet provider | End-to-end query tests passing; Arrow RecordBatch streams returned
15 | Adjacency index baseline | open | Consolidated derived CSR adjacency index under indexes/adjacency/ | No IR change; never alters results; gates M18
16 | Bindings baseline | open | forge.execute() → PyArrow Table; Node IPC results | Smoke tests and packaging passing (Python + Node)
17 | Conformance hardening | open | openCypher TCK subset, fuzzing, provenance/confidence rules | TCK threshold met; all merge gates pass
18 | Rank and cluster | open | forge.rank/cluster → Arrow Tables | All by= values tested; write-back working
19 | Find | open | forge.find() + forge.index() → Arrow Tables | Text, vector, hybrid; lazy indexing working
20 | Layering & boundary reconciliation | open | Layer docs/ADRs (0006/0007); boundary gate regression test | Graph-native query results independent of knowledge layer
21 | Knowledge layer foundation | open | Write provenance events/lineage; propagate confidence | Knowledge attaches by UUID reference only; boundary gate green
22 | Epistemic model | open | Assertions/status, supersession, evidence, bitemporal valid-time | Preservation-over-deletion; boundary gate green
23 | v0.5.0 release | open | Agent-skills work + comprehensive close-out (#742) | M11–M22 complete; boundary gate green
24 | Swift + Kotlin bindings (v0.5.1) | open | UniFFI-generated Swift Package + Kotlin JAR | Round-trip tests + CI packaging green for both languages
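Milestone 15 refers to a derived CSR (compressed sparse row) adjacency index. The general technique can be sketched as follows. This is not the planned Rust implementation, and it assumes dense integer node ids for brevity, whereas the architecture baseline specifies UUID identity. The function names build_csr and neighbours are hypothetical.

```python
def build_csr(
    num_nodes: int, edges: list[tuple[int, int]]
) -> tuple[list[int], list[int]]:
    """Build CSR arrays (offsets, targets) from directed (src, dst) pairs.

    Neighbours of node i are targets[offsets[i]:offsets[i + 1]].
    """
    # Count out-degree per source node, shifted by one slot.
    offsets = [0] * (num_nodes + 1)
    for src, _ in edges:
        offsets[src + 1] += 1
    # Prefix sum turns the counts into start positions.
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]
    # Place each destination in its source node's slice.
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot per source node
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbours(offsets: list[int], targets: list[int], node: int) -> list[int]:
    """Return the out-neighbours of a node as a contiguous slice."""
    return targets[offsets[node]:offsets[node + 1]]


edges = [(0, 1), (0, 2), (1, 2), (3, 0)]
offsets, targets = build_csr(4, edges)
```

Because the two arrays are computed purely from the edge list, an index of this kind can be dropped and rebuilt without changing query results, in line with the "never alters results" exit criterion.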

Merge gates (all required before merging to main)

  • Parser parity — RD+Pratt corpus + syntax goldens pass
  • openCypher TCK subset — agreed compliance threshold met
  • Ontology round-trips — load/validate/migrate stable
  • Arrow/IPC round-trips — data contract stable across all five language bindings
  • Parquet provider — core semantics verified
  • Python + Node bindings — packaging and smoke tests pass
  • Swift + Kotlin bindings — UniFFI packaging and smoke tests pass
  • Observability — explain, query IDs, provenance IDs, structured errors
  • All seven analyst verbs — Arrow Tables, write-back, via/directed filters
  • forge.find() / forge.index() — lazy indexing, text + vector + hybrid

Language Binding Matrix

Language | Mechanism | Result | Milestone
Python | PyO3 + maturin | pyarrow.Table | 16
Node / TypeScript | napi-rs | Arrow IPC Buffer | 16
Swift | UniFFI | Arrow IPC Data, GraphForgeResult | 24
Kotlin / JVM | UniFFI | Arrow IPC ByteArray, GraphForgeResult | 24
Rust | Native crate | ExecutionResult | 14

Version Numbering

GraphForge follows Semantic Versioning and is pre-v1.0. The 0.x version series signals that the API is still maturing.

  • Patch (0.x.y): Bug fixes, small improvements, no API changes
  • Minor (0.x.0): New features; backwards-compatible where practical

v0.5.0 is the Rust core release. A v1.0 release will happen when the API is stable enough to commit to long-term.