GraphForge¶
An embedded, openCypher-compatible graph database for Python, Node, Swift, and Kotlin
GraphForge lets you write openCypher queries against an in-memory or Parquet-backed graph with no external services. The v0.5.0 release passes all 3,885 openCypher TCK scenarios and provides seven analyst-intent methods — all returning Apache Arrow Tables.
from graphforge import GraphForge
forge = GraphForge() # in-memory
# forge = GraphForge("path/to/graph/") # or Parquet-backed
alice = forge.add_node("Person", name="Alice", age=30)
bob = forge.add_node("Person", name="Bob", age=25)
forge.add_edge(alice, "KNOWS", bob, since=2020)
# openCypher — returns an Arrow Table
table = forge.execute("""
MATCH (p:Person)-[:KNOWS]->(friend)
RETURN p.name AS person, friend.name AS friend
""")
# Consume with pandas, Polars, or iterate directly
df = table.to_pandas()
print(df)
# person friend
# 0 Alice Bob
Getting Started¶
| Installation | Install via pip or uv |
| Quick Start | Your first graph in five minutes |
| Tutorial | Step-by-step guided walkthrough |
User Guide¶
| Cypher Reference | Full openCypher language guide |
| Graph Construction | Build graphs with Python API and Cypher |
| Analytics Integration | Arrow, pandas, Polars, rank, cluster, find |
| Datasets | Load 100+ real-world networks |
Use Cases¶
| Knowledge Graph Construction | Extract entities from text, build and refine ontologies |
| Network Analysis | Degree, paths, communities — in notebooks |
| LLM-Powered Workflows | Store LLM extractions, build retrieval context |
| AI Agent Tool Recall | Graph-structured tool libraries for LLM agents |
| Agent Grounding | Ground agents in domain ontologies |
Reference¶
| API Reference | Full method reference — all seven verbs |
| Algorithm Catalog | All algorithms across rank/cluster/paths/analyze/similar |
| OpenCypher Compatibility | Feature matrix — v0.5.0 (100%) |
| TCK Compliance | 3,885 / 3,885 passing |
| Changelog | Release history |
Design Principles¶
- Correctness over performance — openCypher semantics verified against the full TCK
- Zero configuration —
pip install graphforge, no servers, no connection strings - Inspectable —
explainat every compiler stage; structured errors with source spans - Arrow-first results — every method returns an Apache Arrow Table
Architecture¶
GraphForge exposes seven analyst-intent methods that share a single Parquet-backed storage layer.
forge.execute("MATCH …") → Cypher path (parser → binder → Graph IR → DataFusion)
forge.rank("Person", by=…) → Algorithm path (centrality / structural scoring)
forge.cluster("Person", by=…) → Algorithm path (community detection, components)
forge.paths(alice, bob, by=…) → Algorithm path (shortest paths, flow, reachability)
forge.analyze(by=…) → Algorithm path (DAG, coloring, spanning trees, embeddings)
forge.similar("Person", by=…) → Algorithm path (pairwise node similarity)
forge.find("query", …) → Search path (text + vector hybrid search)
The Rust core uses a hand-written recursive-descent + Pratt expression parser, DataFusion-backed query execution, and Parquet storage. First-class bindings for Python (PyO3/maturin), Node (napi-rs), Swift (UniFFI), and Kotlin (UniFFI) all return Arrow results.