Contributing to GraphForge

Thank you for your interest in contributing to GraphForge!

GraphForge has two active codebases:

Branch      Language        Status
main        Python          Current production release (0.4.x)
rust-core   Rust + Python   Rust core refactor (staging)

Most new feature work happens on rust-core. Bug fixes and documentation improvements may target either branch.


Development Setup

Python-only (main branch)

Prerequisites: Python 3.10+, uv or pip

git clone https://github.com/DecisionNerd/graphforge.git
cd graphforge

# Install dev dependencies
uv sync --all-extras

# Verify
pytest -m unit

Rust core (rust-core branch)

Prerequisites: Python 3.10+, Rust stable, uv, maturin

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup update stable

git clone https://github.com/DecisionNerd/graphforge.git
cd graphforge
git checkout rust-core

# Install Python dev dependencies
uv sync --dev

# Build and install the Rust extension in development mode
maturin develop --release

# Verify
cargo test --workspace   # Rust tests
pytest -m unit           # Python binding tests

Development Workflow

Before Pushing Code

Always run the full validation suite before pushing:

# Python (main)
make pre-push

# Rust core
cargo clippy --workspace -- -D warnings
cargo test --workspace
make pre-push

make pre-push runs: format-check, lint, type-check, tests with coverage, and coverage threshold validation (≥85%).

Running Tests

# Python
make test                # all tests
make test-unit           # fast unit tests
make test-integration    # integration tests
make test-tck            # TCK compliance tests

# Rust
cargo test --workspace                  # all crates
cargo test -p gf-cypher                 # one crate
cargo test --workspace -- --nocapture   # with output

# Coverage
make coverage            # run + validate thresholds
make coverage-report     # open HTML report

Code Quality

# Python
make format              # ruff format
make lint                # ruff check
make type-check          # mypy

# Rust
cargo fmt --all          # rustfmt
cargo clippy --workspace -- -D warnings

Project Structure

graphforge/
├── crates/                      # Rust workspace
│   ├── gf-core/                 # public engine facade
│   ├── gf-ast/                  # AST + spans
│   ├── gf-cypher/               # recursive-descent + Pratt parser
│   ├── gf-ontology/             # runtime ontology
│   ├── gf-ir/                   # graph IR
│   ├── gf-rel/                  # relational lowering
│   ├── gf-plan/                 # DataFusion integration
│   ├── gf-exec/                 # execution session, algorithms, search
│   ├── gf-storage/              # StorageProvider + Parquet
│   ├── gf-io/                   # IO sinks
│   ├── gf-provenance/           # lineage + confidence
│   ├── gf-bindings-py/          # PyO3 Python binding
│   ├── gf-bindings-node/        # napi-rs Node binding
│   ├── gf-bindings-uniffi/      # UniFFI shared binding (Swift + Kotlin)
│   └── gf-cli/                  # CLI
├── bindings/
│   ├── swift/                   # Swift Package Manager package
│   └── kotlin/                  # Gradle/Kotlin package
├── src/graphforge/          # Python package (main branch)
│   ├── api.py
│   ├── parser/
│   ├── planner/
│   ├── executor/
│   ├── storage/
│   ├── algorithms/
│   ├── search/
│   └── recipes/
├── tests/
│   ├── unit/
│   ├── integration/
│   ├── tck/
│   └── property/
├── docs/
└── pyproject.toml

Testing Guidelines

Writing Tests

Unit tests — test one component in isolation:

import pytest

# Node is assumed to be imported from the graphforge package.

@pytest.mark.unit
def test_node_creation():
    node = Node(labels={"Person"})
    assert "Person" in node.labels

Integration tests — test end-to-end behavior:

import pytest

# db is assumed to be a pytest fixture that provides an open database.

@pytest.mark.integration
def test_query_execution(db):
    result = db.execute("MATCH (n) RETURN n")
    assert result is not None

Rust unit tests — in the same file as the module under test:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_node_scan_op() {
        let op = GraphOp::NodeScan { var: VarId(0), ty: TypeId(1) };
        assert!(matches!(op, GraphOp::NodeScan { .. }));
    }
}

Test Quality Standards

  • Fast: unit tests < 1 ms
  • Isolated: no shared state between tests
  • Deterministic: same input = same output
  • Named descriptively
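
As an illustration of the isolation and determinism standards above, here is a minimal sketch that uses only the standard library. The helpers make_graph and sample_nodes are hypothetical and are not part of GraphForge; in this repository the tests would also carry the @pytest.mark.unit marker.

```python
import random


def make_graph():
    """Build a fresh adjacency map on every call (hypothetical helper)."""
    return {"a": ["b"], "b": []}


def sample_nodes(graph, k, seed):
    """Pick k node names with a locally seeded RNG so results are repeatable."""
    rng = random.Random(seed)  # local RNG: independent of global random state
    return rng.sample(sorted(graph), k)


def test_add_edge_does_not_leak_state():
    graph = make_graph()  # fresh object, not a module-level global
    graph["b"].append("a")
    # A second call still returns the original data, so tests stay isolated.
    assert make_graph()["b"] == []


def test_sample_nodes_is_deterministic():
    graph = make_graph()
    # Same input and same seed must give the same output.
    assert sample_nodes(graph, 1, seed=42) == sample_nodes(graph, 1, seed=42)
```

Building the data inside each test, and seeding a local random.Random instead of the global generator, keeps the result independent of test order.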

See testing.md for the full testing reference.


Code Style

Python

  • PEP 8, enforced by ruff
  • Type hints on all function signatures
  • Docstrings on public APIs (Google style)
  • No # type: ignore without explanation

Rust

  • cargo fmt enforced in CI
  • cargo clippy -- -D warnings enforced in CI
  • No #[allow(dead_code)] without explanation
  • Public items need doc comments

Pull Request Process

PR Size Guidelines

Keep PRs small and focused. CI tools work best with small, reviewable diffs.

Good:

  • Single feature or bug fix
  • 50–300 lines of code changed
  • 1–5 files modified
  • Clear, focused purpose

Too large:

  • Multiple unrelated changes
  • 1,000+ lines changed
  • Refactoring + new feature + bug fixes combined

Example — breaking up a Rust feature:

PR 1: "gf-cypher: add WITH clause grammar and AST node"
PR 2: "gf-ir: add WithOp graph IR operator"
PR 3: "gf-rel: lower WITH to DataFusion logical plan"
PR 4: "gf-exec: integrate WITH in execution pipeline"
PR 5: "tests: WITH clause unit + integration + TCK"
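
A rough way to check a branch against the size guidelines above is to total the output of git diff --numstat. The sketch below is not part of the project tooling. The function name classify_pr and the middle "consider splitting" category are assumptions made for illustration; the thresholds come from the guidelines in this section.

```python
def classify_pr(numstat: str) -> str:
    """Classify a diff from `git diff --numstat` text (added<TAB>deleted<TAB>path)."""
    files = 0
    lines = 0
    for row in numstat.splitlines():
        parts = row.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        added, deleted, _path = parts
        files += 1
        if added != "-":  # binary files report "-" for both counts
            lines += int(added) + int(deleted)
    if lines >= 1000:
        return "too large"
    if lines <= 300 and files <= 5:
        return "good"
    return "consider splitting"  # hypothetical middle category


sample = "120\t30\tcrates/gf-cypher/src/parser.rs\n40\t0\tcrates/gf-ast/src/lib.rs\n"
print(classify_pr(sample))
```

To try it on a real branch, the text could come from running git diff --numstat against the target branch and passing the captured output to classify_pr.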

No Bandaid Fixes

Fix problems properly, not with temporary workarounds. Investigate root causes, add regression tests, and keep CI checks enabled.

PR Requirements

All PRs must:

  • Pass all CI checks
  • Include tests for new functionality
  • Maintain or improve code coverage (≥85%)
  • Update relevant documentation
  • Have a clear description
  • Reference the issue number in the commit and PR body (Closes #XX)
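
The issue-reference requirement can be checked mechanically. The following is a minimal sketch and not an existing CI check in this repository. The function name referenced_issues is hypothetical, and the pattern accepts GitHub's close, fix, and resolve keyword variants as well as the Closes form shown above.

```python
import re

# GitHub closing keywords followed by an issue number such as "#123".
ISSUE_REF = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)",
    re.IGNORECASE,
)


def referenced_issues(text: str) -> list[int]:
    """Return the issue numbers referenced by closing keywords in text."""
    return [int(number) for number in ISSUE_REF.findall(text)]


message = "gf-cypher: add WITH clause grammar\n\nCloses #123"
print(referenced_issues(message))
```

An empty list means the commit message or PR body still needs its reference.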

Design Principles

  1. Spec-driven correctness — openCypher semantics over performance
  2. Arrow as the wire contract — results cross language boundaries as Arrow RecordBatch streams
  3. GraphForge owns the semantics — no binding or storage provider becomes the semantic owner
  4. Surfaces are independent: db.gds and db.search never modify the grammar
  5. Inspectable: explain at every compiler stage; structured errors with spans

openCypher TCK Compliance

When implementing openCypher features:

  1. Check the TCK coverage matrix: tests/tck/coverage_matrix.json
  2. Mark features as "supported", "planned", or "unsupported"
  3. Add corresponding TCK tests
  4. Ensure semantic correctness per the openCypher specification

All supported features must pass their TCK scenarios — this is a hard merge gate.
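
The status check in step 2 could be sketched as below. The actual schema of tests/tck/coverage_matrix.json is not described here, so this example assumes a flat mapping from feature name to status. The function name check_matrix and the sample data are hypothetical.

```python
import json

# The three status values named in step 2.
ALLOWED = {"supported", "planned", "unsupported"}


def check_matrix(text: str) -> list[str]:
    """Validate status values and return the features marked as supported.

    Assumes a flat {"feature": "status"} JSON object; the real file may differ.
    """
    matrix = json.loads(text)
    bad = {name: status for name, status in matrix.items() if status not in ALLOWED}
    if bad:
        raise ValueError(f"unknown status values: {bad}")
    return sorted(name for name, status in matrix.items() if status == "supported")


# Hypothetical sample data, not the contents of the real coverage matrix.
SAMPLE = '{"MATCH": "supported", "WITH": "planned", "CALL": "unsupported"}'
print(check_matrix(SAMPLE))
```

The returned list names the features whose TCK scenarios must pass before merge.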


Documentation

Code documentation

  • Rust: doc comments (///) on all public items; cargo doc must build cleanly
  • Python: Google-style docstrings on public APIs

Project documentation

When adding features, update:

  • docs/architecture/ — if the change affects the compiler pipeline, storage, or execution model
  • docs/reference/ — if the public API changes
  • CHANGELOG.md ([Unreleased] section)

Releases and Versioning

GraphForge follows Semantic Versioning.

When submitting PRs, update the [Unreleased] section of CHANGELOG.md:

## [Unreleased]

### Added
- New feature you implemented

### Fixed
- Bug you fixed
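
To confirm that a PR actually adds an [Unreleased] entry, a small parser is enough. The sketch below assumes the heading layout shown above, with ## for releases and - for entries. The function name unreleased_entries is hypothetical and the script is not part of the release tooling.

```python
def unreleased_entries(changelog: str) -> list[str]:
    """Collect bullet entries listed under the '## [Unreleased]' heading."""
    entries = []
    in_section = False
    for line in changelog.splitlines():
        if line.startswith("## "):
            # A level-2 heading starts a release section; only one is Unreleased.
            in_section = line.strip() == "## [Unreleased]"
            continue
        if in_section and line.startswith("- "):
            entries.append(line[2:].strip())
    return entries


sample = (
    "## [Unreleased]\n\n"
    "### Added\n- New feature you implemented\n\n"
    "### Fixed\n- Bug you fixed\n\n"
    "## [0.4.0]\n- Older entry\n"
)
print(unreleased_entries(sample))
```

Subsection headings such as ### Added are skipped because they do not begin with "## " followed by a space, so only the entries themselves are returned.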

See release-process.md for the full release procedure.


Getting Help

License

By contributing, you agree that your contributions will be licensed under the MIT License.