Contributing to GraphForge¶
Thank you for your interest in contributing to GraphForge!
GraphForge has two active codebases:
| Branch | Language | Status |
|---|---|---|
| main | Python | Current production release (0.4.x) |
| rust-core | Rust + Python | Rust core refactor (staging) |
Most new feature work happens on rust-core. Bug fixes and documentation
improvements may target either branch.
Development Setup¶
Python-only (main branch)¶
Prerequisites: Python 3.10+, uv or pip
git clone https://github.com/DecisionNerd/graphforge.git
cd graphforge
# Install dev dependencies
uv sync --all-extras
# Verify
pytest -m unit
Rust core (rust-core branch)¶
Prerequisites: Python 3.10+, Rust stable, uv, maturin
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup update stable
git clone https://github.com/DecisionNerd/graphforge.git
cd graphforge
git checkout rust-core
# Install Python dev dependencies
uv sync --dev
# Build and install the Rust extension in development mode
maturin develop --release
# Verify
cargo test --workspace # Rust tests
pytest -m unit # Python binding tests
Development Workflow¶
Before Pushing Code¶
Always run the full validation suite before pushing:
# Python (main)
make pre-push
# Rust core
cargo clippy --workspace -- -D warnings
cargo test --workspace
make pre-push
make pre-push runs: format-check, lint, type-check, tests with coverage,
and coverage threshold validation (≥85%).
Running Tests¶
# Python
make test # all tests
make test-unit # fast unit tests
make test-integration # integration tests
make test-tck # TCK compliance tests
# Rust
cargo test --workspace # all crates
cargo test -p gf-cypher # one crate
cargo test --workspace -- --nocapture # with output
# Coverage
make coverage # run + validate thresholds
make coverage-report # open HTML report
Code Quality¶
# Python
make format # ruff format
make lint # ruff check
make type-check # mypy
# Rust
cargo fmt --all # rustfmt
cargo clippy --workspace -- -D warnings
Project Structure¶
graphforge/
├── crates/ # Rust workspace
│ ├── gf-core/ # public engine facade
│ ├── gf-ast/ # AST + spans
│ ├── gf-cypher/ # recursive-descent + Pratt parser
│ ├── gf-ontology/ # runtime ontology
│ ├── gf-ir/ # graph IR
│ ├── gf-rel/ # relational lowering
│ ├── gf-plan/ # DataFusion integration
│ ├── gf-exec/ # execution session, algorithms, search
│ ├── gf-storage/ # StorageProvider + Parquet
│ ├── gf-io/ # IO sinks
│ ├── gf-provenance/ # lineage + confidence
│ ├── gf-bindings-py/ # PyO3 Python binding
│ ├── gf-bindings-node/ # napi-rs Node binding
│ ├── gf-bindings-uniffi/ # UniFFI shared binding (Swift + Kotlin)
│ └── gf-cli/ # CLI
├── bindings/
│ ├── swift/ # Swift Package Manager package
│ └── kotlin/ # Gradle/Kotlin package
├── src/graphforge/ # Python package (main branch)
│ ├── api.py
│ ├── parser/
│ ├── planner/
│ ├── executor/
│ ├── storage/
│ ├── algorithms/
│ ├── search/
│ └── recipes/
├── tests/
│ ├── unit/
│ ├── integration/
│ ├── tck/
│ └── property/
├── docs/
└── pyproject.toml
Testing Guidelines¶
Writing Tests¶
Unit tests — test one component in isolation:
@pytest.mark.unit
def test_node_creation():
    node = Node(labels={"Person"})
    assert "Person" in node.labels
Integration tests — test end-to-end behavior:
@pytest.mark.integration
def test_query_execution(db):
    result = db.execute("MATCH (n) RETURN n")
    assert result is not None
Rust unit tests — in the same file as the module under test:
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_node_scan_op() {
        let op = GraphOp::NodeScan { var: VarId(0), ty: TypeId(1) };
        assert!(matches!(op, GraphOp::NodeScan { .. }));
    }
}
Test Quality Standards¶
- Fast: unit tests < 1 ms
- Isolated: no shared state between tests
- Deterministic: same input = same output
- Named descriptively
See testing.md for the full testing reference.
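As a rough illustration of the isolation and determinism standards above, the sketch below builds a fresh object for every test and seeds any randomness explicitly. The helpers `make_graph` and `sample_node` are hypothetical stand-ins, not GraphForge APIs, and the `@pytest.mark.unit` marker is omitted only so the snippet runs without dependencies.

```python
import random


def make_graph() -> dict[str, set[str]]:
    """Build a fresh adjacency map so no test shares state with another."""
    # Hypothetical stand-in for a real graph object.
    return {"a": {"b"}, "b": set()}


def sample_node(graph: dict[str, set[str]], seed: int) -> str:
    """Pick a node with a seeded RNG so the result is reproducible."""
    rng = random.Random(seed)
    # Sort first: choosing from an unordered collection would not be stable.
    return rng.choice(sorted(graph))


def test_mutation_does_not_leak() -> None:
    graph = make_graph()
    graph["b"].add("a")
    # A second call returns an untouched map, so nothing leaked between uses.
    assert make_graph()["b"] == set()


def test_sample_node_is_deterministic() -> None:
    graph = make_graph()
    assert sample_node(graph, seed=7) == sample_node(graph, seed=7)
```

In a real test module the factory would more likely be a pytest fixture; a plain function is used here to keep the sketch self-contained.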
Code Style¶
Python¶
- PEP 8, enforced by ruff
- Type hints on all function signatures
- Docstrings on public APIs (Google style)
- No # type: ignore without explanation
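To make the Python conventions above concrete, here is a small sketch of a function with type hints on its signature and a Google-style docstring. The function `count_labels` is a hypothetical example written for this guide and does not exist in the GraphForge codebase.

```python
from collections.abc import Iterable


def count_labels(label_sets: Iterable[set[str]]) -> dict[str, int]:
    """Count how many nodes carry each label.

    Args:
        label_sets: One set of labels per node.

    Returns:
        A mapping from label to the number of nodes that carry it.

    Raises:
        TypeError: If a label is not a string.
    """
    counts: dict[str, int] = {}
    for labels in label_sets:
        for label in labels:
            if not isinstance(label, str):
                raise TypeError(f"label must be str, got {type(label).__name__}")
            counts[label] = counts.get(label, 0) + 1
    return counts
```

For example, `count_labels([{"Person"}, {"Person", "Admin"}])` yields a count of 2 for `"Person"` and 1 for `"Admin"`.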
Rust¶
- cargo fmt enforced in CI
- cargo clippy -- -D warnings enforced in CI
- No #[allow(dead_code)] without explanation
- Public items need doc comments
Pull Request Process¶
PR Size Guidelines¶
Keep PRs small and focused. CI tools work best with small, reviewable diffs.
Good:
- Single feature or bug fix
- 50–300 lines of code changed
- 1–5 files modified
- Clear, focused purpose
Too large:
- Multiple unrelated changes
- 1,000+ lines changed
- Refactoring + new feature + bug fixes combined
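One way to check a branch against these size guidelines before opening a PR is to total the output of `git diff --numstat`. The sketch below is only an illustration: the function names and the middle "consider splitting" band are assumptions, and the project does not necessarily enforce size this way.

```python
def changed_lines(numstat: str) -> tuple[int, int]:
    """Return (files, lines) from `git diff --numstat` output."""
    files = 0
    lines = 0
    for row in numstat.splitlines():
        parts = row.split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        added, deleted, _path = parts
        files += 1
        # Binary files report "-" for both counts: count the file, not lines.
        if added.isdigit() and deleted.isdigit():
            lines += int(added) + int(deleted)
    return files, lines


def classify(files: int, lines: int) -> str:
    """Map the totals onto the guideline bands (thresholds from this guide)."""
    if lines >= 1000:
        return "too large"
    if lines <= 300 and files <= 5:
        return "good"
    return "consider splitting"


# Hypothetical numstat output; the paths are made up for illustration.
sample = "120\t30\tcrates/gf-cypher/src/parser.rs\n-\t-\tdocs/img/plan.png\n"
files, lines = changed_lines(sample)
print(files, lines, classify(files, lines))
```

Changes smaller than 50 lines are treated as "good" here, since a one-line bug fix is still a focused PR.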
Example — breaking up a Rust feature:
PR 1: "gf-cypher: add WITH clause grammar and AST node"
PR 2: "gf-ir: add WithOp graph IR operator"
PR 3: "gf-rel: lower WITH to DataFusion logical plan"
PR 4: "gf-exec: integrate WITH in execution pipeline"
PR 5: "tests: WITH clause unit + integration + TCK"
No Bandaid Fixes¶
Fix problems properly, not with temporary workarounds. Investigate root causes, add regression tests, and keep CI checks enabled.
PR Requirements¶
All PRs must:
- Pass all CI checks
- Include tests for new functionality
- Maintain or improve code coverage (≥85%)
- Update relevant documentation
- Have a clear description
- Reference the issue number in the commit and PR body (Closes #XX)
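The issue-reference requirement above lends itself to a quick local check. The following is a minimal sketch, assuming the reference always takes the `Closes #<number>` form named in this guide. The helper `referenced_issues` and the issue number 123 are hypothetical.

```python
import re

# Matches "Closes #123" anywhere in a commit message or PR body.
CLOSES = re.compile(r"\bCloses #(\d+)\b", re.IGNORECASE)


def referenced_issues(text: str) -> list[int]:
    """Return every issue number referenced with a Closes keyword."""
    return [int(number) for number in CLOSES.findall(text)]


body = "Fix span offsets in parser errors.\n\nCloses #123"
print(referenced_issues(body))
```

An empty result means the commit message or PR body still needs its issue reference.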
Design Principles¶
- Spec-driven correctness: openCypher semantics over performance
- Arrow as the wire contract: results cross language boundaries as Arrow RecordBatch streams
- GraphForge owns the semantics: no binding or storage provider becomes the semantic owner
- Surfaces are independent: db.gds and db.search never modify the grammar
- Inspectable: explain at every compiler stage; structured errors with spans
openCypher TCK Compliance¶
When implementing openCypher features:
- Check the TCK coverage matrix: tests/tck/coverage_matrix.json
- Mark features as "supported", "planned", or "unsupported"
- Add corresponding TCK tests
- Ensure semantic correctness per the openCypher specification
All supported features must pass their TCK scenarios — this is a hard merge gate.
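A status typo in the coverage matrix is easy to miss in review, so a small validation pass can help. The sketch below assumes the matrix is a flat JSON object mapping feature names to status strings. That schema is a guess made for illustration, so check the real layout of tests/tck/coverage_matrix.json before adapting it.

```python
import json

# The three statuses named in this guide.
ALLOWED = {"supported", "planned", "unsupported"}


def invalid_entries(matrix: dict[str, str]) -> dict[str, str]:
    """Return the features whose status is not one of the allowed values."""
    return {
        feature: status
        for feature, status in matrix.items()
        if status not in ALLOWED
    }


# Hypothetical matrix content; the real file may be structured differently.
sample = json.loads('{"MATCH": "supported", "WITH": "planned", "CALL": "maybe"}')
print(invalid_entries(sample))
```

To run it against the repository, replace the inline sample with the parsed contents of the matrix file.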
Documentation¶
Code documentation¶
- Rust: doc comments (///) on all public items; cargo doc must build cleanly
- Python: Google-style docstrings on public APIs
Project documentation¶
When adding features, update:
- docs/architecture/: if the change affects the compiler pipeline, storage, or execution model
- docs/reference/: if the public API changes
- CHANGELOG.md ([Unreleased] section)
Releases and Versioning¶
GraphForge follows Semantic Versioning.
When submitting PRs, update the [Unreleased] section of CHANGELOG.md:
## [Unreleased]
### Added
- New feature you implemented
### Fixed
- Bug you fixed
See release-process.md for the full release procedure.
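Before pushing, it can be handy to confirm that the [Unreleased] section actually contains an entry. The sketch below assumes the heading layout shown above (`## [Unreleased]` followed by `###` subsections and `- ` bullets). The helper name and the 0.4.0 heading in the sample are hypothetical.

```python
def unreleased_entries(changelog: str) -> list[str]:
    """Collect the bullet entries listed under the [Unreleased] heading."""
    entries: list[str] = []
    in_unreleased = False
    for line in changelog.splitlines():
        # A level-2 heading starts a new release section.
        if line.startswith("## "):
            in_unreleased = line.strip() == "## [Unreleased]"
            continue
        if in_unreleased and line.startswith("- "):
            entries.append(line[2:].strip())
    return entries


sample = """## [Unreleased]
### Added
- New feature you implemented
### Fixed
- Bug you fixed
## [0.4.0]
### Added
- Older entry
"""
print(unreleased_entries(sample))
```

An empty list suggests the changelog has not been updated for the change.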
Getting Help¶
- Questions: GitHub Discussions
- Bugs: GitHub Issues
License¶
By contributing, you agree that your contributions will be licensed under the MIT License.