GraphForge

An embedded, openCypher-compatible graph database for Python, Node, Swift, and Kotlin

GraphForge lets you write openCypher queries against an in-memory or Parquet-backed graph with no external services. The v0.5.0 release passes all 3,885 openCypher TCK scenarios and provides seven analyst-intent methods — all returning Apache Arrow Tables.

from graphforge import GraphForge

forge = GraphForge()                    # in-memory
# forge = GraphForge("path/to/graph/") # or Parquet-backed

alice = forge.add_node("Person", name="Alice", age=30)
bob   = forge.add_node("Person", name="Bob",   age=25)
forge.add_edge(alice, "KNOWS", bob, since=2020)

# openCypher — returns an Arrow Table
table = forge.execute("""
    MATCH (p:Person)-[:KNOWS]->(friend)
    RETURN p.name AS person, friend.name AS friend
""")

# Consume with pandas, Polars, or iterate directly
df = table.to_pandas()
print(df)
#   person friend
# 0  Alice    Bob
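
To make the result above concrete, the following plain-Python sketch reproduces by hand what the pattern `(p:Person)-[:KNOWS]->(friend)` selects from the two-node graph. It does not call GraphForge and says nothing about how GraphForge executes the query; the `nodes`, `edges`, and `match_knows` names are hypothetical and exist only for illustration.

```python
# Illustration only: a plain-Python stand-in for the pattern match above.
# This is not GraphForge code; every name here is hypothetical.
nodes = {
    1: {"label": "Person", "name": "Alice", "age": 30},
    2: {"label": "Person", "name": "Bob", "age": 25},
}
# Directed edges as (source id, relationship type, target id, properties).
edges = [(1, "KNOWS", 2, {"since": 2020})]


def match_knows(nodes, edges):
    """Rows for MATCH (p:Person)-[:KNOWS]->(friend) RETURN p.name, friend.name."""
    rows = []
    for src, rel_type, dst, _props in edges:
        # The source must carry the Person label; the target is unconstrained.
        if rel_type == "KNOWS" and nodes[src]["label"] == "Person":
            rows.append(
                {"person": nodes[src]["name"], "friend": nodes[dst]["name"]}
            )
    return rows


rows = match_knows(nodes, edges)
print(rows)  # [{'person': 'Alice', 'friend': 'Bob'}]
```

Because the edge is directed from Alice to Bob, only one row comes back: Bob has no outgoing KNOWS edge, so he never appears in the `person` column.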

Getting Started

Installation: Install via pip or uv
Quick Start: Your first graph in five minutes
Tutorial: Step-by-step guided walkthrough

User Guide

Cypher Reference: Full openCypher language guide
Graph Construction: Build graphs with the Python API and Cypher
Analytics Integration: Arrow, pandas, Polars, rank, cluster, find
Datasets: Load 100+ real-world networks

Use Cases

Knowledge Graph Construction: Extract entities from text, build and refine ontologies
Network Analysis: Degree, paths, and communities, in notebooks
LLM-Powered Workflows: Store LLM extractions, build retrieval context
AI Agent Tool Recall: Graph-structured tool libraries for LLM agents
Agent Grounding: Ground agents in domain ontologies

Reference

API Reference: Full method reference for all seven verbs
Algorithm Catalog: All algorithms across rank/cluster/paths/analyze/similar
OpenCypher Compatibility: Feature matrix for v0.5.0 (100%)
TCK Compliance: 3,885 / 3,885 passing
Changelog: Release history

Design Principles

  1. Correctness over performance: openCypher semantics verified against the full TCK
  2. Zero configuration: pip install graphforge, no servers, no connection strings
  3. Inspectable: explain at every compiler stage; structured errors with source spans
  4. Arrow-first results: every method returns an Apache Arrow Table

Architecture

GraphForge exposes seven analyst-intent methods that share a single Parquet-backed storage layer.

forge.execute("MATCH …")         →  Cypher path     (parser → binder → Graph IR → DataFusion)
forge.rank("Person", by=…)       →  Algorithm path  (centrality / structural scoring)
forge.cluster("Person", by=…)    →  Algorithm path  (community detection, components)
forge.paths(alice, bob, by=…)    →  Algorithm path  (shortest paths, flow, reachability)
forge.analyze(by=…)              →  Algorithm path  (DAG, coloring, spanning trees, embeddings)
forge.similar("Person", by=…)    →  Algorithm path  (pairwise node similarity)
forge.find("query", …)           →  Search path     (text + vector hybrid search)
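
As a rough illustration of the kind of computation the algorithm paths cover, here is a minimal breadth-first shortest-path sketch in plain Python. It is an assumption-laden stand-in, not GraphForge's implementation: the source does not say which algorithms `forge.paths` runs or what values `by` accepts, and the `shortest_path` function and `adjacency` dict below are hypothetical.

```python
from collections import deque


def shortest_path(adjacency, source, target):
    """Return one path with the fewest edges from source to target, or None."""
    parents = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None  # target is unreachable from source


# Hypothetical directed graph, keyed by node name.
adjacency = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": [],
}
print(shortest_path(adjacency, "alice", "dave"))  # ['alice', 'bob', 'dave']
```

Breadth-first search visits nodes in order of distance from the source, so the first time the target is dequeued the recorded parent chain is a shortest path in an unweighted graph.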

The Rust core uses a hand-written recursive-descent + Pratt expression parser, DataFusion-backed query execution, and Parquet storage. First-class bindings for Python (PyO3/maturin), Node (napi-rs), Swift (UniFFI), and Kotlin (UniFFI) all return Arrow results.
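
For readers unfamiliar with the term, the sketch below shows the general idea of a Pratt expression parser on a toy arithmetic grammar. It is written in Python for brevity and is not taken from GraphForge, whose parser is written in Rust and handles the full openCypher grammar; the `BINDING_POWER` table, `tokenize`, and `parse` are hypothetical names chosen for this example.

```python
import re

# Binding power per infix operator: a higher number binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split an arithmetic expression into number, operator, and paren tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens, min_power=0):
    """Consume tokens from the front and return a nested-tuple AST."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        left = parse(tokens, 0)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
    elif token.isdigit():
        left = int(token)
    else:
        raise SyntaxError(f"unexpected token {token!r}")
    # Pratt loop: absorb operators that bind tighter than the caller's level.
    while (
        tokens
        and tokens[0] in BINDING_POWER
        and BINDING_POWER[tokens[0]] > min_power
    ):
        op = tokens.pop(0)
        # Passing the operator's own power makes it left-associative.
        right = parse(tokens, BINDING_POWER[op])
        left = (op, left, right)
    return left


print(parse(tokenize("1 + 2 * 3")))  # ('+', 1, ('*', 2, 3))
print(parse(tokenize("8 - 3 - 2")))  # ('-', ('-', 8, 3), 2)
```

The appeal of the technique is that operator precedence lives in a single table rather than in one grammar rule per precedence level, which keeps the expression parser small even when the language has many operators.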

See Architecture Overview.