GraphForge

Composable graph tooling for analysis, construction, and refinement

What is GraphForge?

GraphForge is a Python library for analyzing, constructing, and refining graphs. It implements the openCypher query language, so you can query graph data with a declarative, SQL-inspired pattern-matching syntax.

Key Features

  • openCypher Query Language - Industry-standard graph query language
  • Real-World Datasets - Built-in support for SNAP, Neo4j, and benchmark datasets
  • Type-Safe - Built with Pydantic for data validation
  • Pure Python - No external database dependencies
  • TCK Tested - Validated against the openCypher Technology Compatibility Kit (partial coverage; see Project Status)
  • Composable - Designed for Python workflows

Quick Example

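from graphforge import GraphForge

# Create an empty graph
graph = GraphForge()

# Add sample data so the query has something to match (assumes CREATE support)
graph.execute("""
    CREATE (:Person {name: 'Alice'})-[:KNOWS]->(:Person {name: 'Bob'})
""")

# Query with openCypher
result = graph.execute("""
    MATCH (p:Person {name: 'Alice'})-[:KNOWS]->(friend)
    RETURN friend.name
""")

# Each row is keyed by the RETURN expression
for row in result:
    print(row['friend.name'])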
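
If the pattern syntax is new to you, it may help to see roughly what the MATCH clause above selects. The following is a plain-Python sketch, not GraphForge code: the `nodes` and `edges` structures and the `friends_of` helper are hypothetical stand-ins invented for illustration, and the real engine is not assumed to work this way internally.

```python
# Hypothetical in-memory data standing in for a small graph.
nodes = {
    1: {"labels": {"Person"}, "name": "Alice"},
    2: {"labels": {"Person"}, "name": "Bob"},
    3: {"labels": {"Person"}, "name": "Carol"},
}
# Directed edges as (source id, relationship type, target id).
edges = [
    (1, "KNOWS", 2),
    (1, "KNOWS", 3),
    (2, "KNOWS", 3),
    (1, "WORKS_WITH", 2),
]


def friends_of(name):
    """Rows for (p:Person {name: name})-[:KNOWS]->(friend) RETURN friend.name."""
    rows = []
    for src, rel_type, dst in edges:
        start = nodes[src]
        # The start node must carry the Person label and the requested name,
        # and the edge must be an outgoing KNOWS relationship.
        if rel_type == "KNOWS" and "Person" in start["labels"] and start["name"] == name:
            rows.append({"friend.name": nodes[dst]["name"]})
    return rows


rows = friends_of("Alice")
for row in rows:
    print(row["friend.name"])
```

Note that the `WORKS_WITH` edge is ignored because the pattern only follows `KNOWS` relationships, and that the direction of the arrow matters: only edges leaving Alice are considered.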

Or Load a Real Dataset

# Load a SNAP dataset (auto-downloads and caches)
graph = GraphForge.from_dataset("snap-ego-facebook")

# Analyze immediately
result = graph.execute("""
    MATCH (n)-[r]->()
    RETURN n.id, count(r) AS connections
    ORDER BY connections DESC
    LIMIT 5
""")

Getting Started

Documentation

User Guide

Datasets

Architecture

Reference

Design Principles

  1. Spec-driven correctness - openCypher semantics over performance
  2. Deterministic behavior - Stable results across runs
  3. Inspectable - Observable query plans and execution
  4. Minimal dependencies - Keep the dependency tree small
  5. Python-first - Optimize for Python workflows

Project Status

GraphForge is under active development. Current openCypher TCK compliance: 16.6% (638/3,837 scenarios).

See the Changelog for recent updates.

Contributing

We welcome contributions! See our Contributing Guide to get started.

License

MIT License - see LICENSE for details.