Testing Strategy & Infrastructure

Overview

GraphForge has two test suites that must both pass:

| Suite | Location | What it tests |
|---|---|---|
| Rust tests (cargo test) | crates/*/src/, crates/*/tests/ | Each Rust crate in isolation (inline unit tests) and in integration |
| Python tests (pytest) | tests/ | Python bindings, end-to-end queries, TCK compliance |

The testing principles are the same for both:

  1. Spec-driven correctness — openCypher semantics verified via TCK
  2. Fast feedback loops — unit tests run in milliseconds
  3. Hermetic tests — no shared state between tests
  4. Deterministic behavior — tests pass or fail consistently

Rust Tests

Structure

Each crate keeps its unit tests inline with the source and its integration tests in the crate's own tests/ directory:

crates/gf-cypher/
├── src/
│   ├── lexer.rs        # #[cfg(test)] inline unit tests
│   ├── parser.rs       # #[cfg(test)] inline unit tests
│   └── lib.rs
└── tests/
    └── parse_corpus.rs # end-to-end parse tests against golden corpus

Running

# All crates
cargo test --workspace

# One crate
cargo test -p gf-cypher

# With output
cargo test --workspace -- --nocapture

# Only doctests
cargo test --doc --workspace

Rust test example

#[cfg(test)]
mod tests {
    use super::*;

    // Round-trip a GraphOp through JSON: serialization must be lossless.
    #[test]
    fn node_scan_roundtrip() {
        let op = GraphOp::NodeScan { var: VarId(0), ty: TypeId(1) };
        let json = serde_json::to_string(&op).unwrap();
        let back: GraphOp = serde_json::from_str(&json).unwrap();
        assert_eq!(op, back);
    }
}

Parser differential corpus

The parser migration strategy requires differential testing between the Python LALR(1) parser and the LALRPOP Rust parser. The corpus lives in tests/parser_corpus/ and includes:

  • Valid queries (from the TCK and real-world examples)
  • Invalid queries (error recovery cases)
  • Precedence edge cases
  • Unicode identifiers
  • Parameter syntax
  • Comments

Differential tests run both parsers on the same input and assert AST parity:

cargo test -p gf-cypher -- differential
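The shape of such a differential check can be pictured with a minimal Python sketch. The two parse functions below are hypothetical stand-ins (simple tokenizers), not GraphForge's parsers, and the corpus entries are illustrative. Only the comparison harness is the point, including the assumption that invalid input should fail the same way in both parsers.

```python
import re
from typing import Callable, Iterable

# Hypothetical stand-ins for the two parsers under comparison. In the real
# suite these would be the Python LALR(1) parser and the LALRPOP Rust parser;
# here each just returns a tuple of tokens as its "AST".
def reference_parse(query: str) -> tuple:
    return tuple(query.split())

def candidate_parse(query: str) -> tuple:
    return tuple(re.findall(r"\S+", query))

def try_parse(parse: Callable[[str], tuple], query: str):
    """Return ("ok", ast) on success or ("error", exception type name)."""
    try:
        return ("ok", parse(query))
    except Exception as exc:  # failures are compared too, not just successes
        return ("error", type(exc).__name__)

def differential(corpus: Iterable[str]) -> list[str]:
    """Return the queries on which the two parsers disagree."""
    return [q for q in corpus
            if try_parse(reference_parse, q) != try_parse(candidate_parse, q)]

corpus = ["MATCH (n) RETURN n", "MATCH (n:Person)\tRETURN n.name", ""]
mismatches = differential(corpus)
assert mismatches == [], f"AST mismatch on: {mismatches}"
```

Returning the list of disagreeing queries, rather than stopping at the first one, makes a failing run easier to triage against the corpus categories listed above.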

Python Tests

Test Categories

1. Unit Tests (tests/unit/)

Test individual components in isolation.

tests/unit/
├── parser/
├── planner/
├── executor/
├── storage/
├── algorithms/
├── search/
└── recipes/

Characteristics: no I/O, < 1 ms per test, ≥90% coverage target.

2. Integration Tests (tests/integration/)

Test the full query pipeline (parse → plan → execute), persistence, transactions, and the Python API surface.

Characteristics: may use temporary databases, < 100 ms per test.

3. openCypher TCK Tests (tests/tck/)

The official openCypher Technology Compatibility Kit (TCK): 3,885 scenarios, 100% passing on main. The TCK is also a hard merge gate for the rust-core branch.

tests/tck/
├── conftest.py
├── coverage_matrix.json
└── features/

4. Property-Based Tests (tests/property/)

Hypothesis-driven generative tests for value semantics, expression evaluation, and storage consistency invariants.
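As an illustration of the kind of invariant such tests target, here is a dependency-free sketch. It does not use Hypothesis: because the input domain is tiny, it enumerates every combination instead of generating samples. The cypher_and, cypher_or, and cypher_not functions are hypothetical stand-ins that model openCypher's three-valued logic (None plays the role of null); they are not GraphForge's evaluator.

```python
from itertools import product

# Hypothetical stand-ins for an expression evaluator's boolean operators,
# following openCypher's three-valued logic (None models null).
def cypher_not(a):
    return None if a is None else not a

def cypher_and(a, b):
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def cypher_or(a, b):
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

VALUES = (True, False, None)

def check_invariants():
    """Check algebraic properties over every input combination."""
    for a, b in product(VALUES, repeat=2):
        # Commutativity of AND and OR
        assert cypher_and(a, b) == cypher_and(b, a)
        assert cypher_or(a, b) == cypher_or(b, a)
        # De Morgan's laws also hold in three-valued logic
        assert cypher_not(cypher_and(a, b)) == cypher_or(cypher_not(a), cypher_not(b))
        assert cypher_not(cypher_or(a, b)) == cypher_and(cypher_not(a), cypher_not(b))

check_invariants()
```

With a generative framework the same properties would be stated once and checked against sampled inputs, which matters when the domain (arbitrary values, nested expressions) is too large to enumerate.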

5. Performance Benchmarks (tests/benchmarks/)

Real-dataset benchmarks tracked over time. Not part of the standard CI run.

Pytest Configuration

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = ["-ra", "--strict-markers", "--tb=short", "-v"]

markers = [
    "unit: unit tests (fast, isolated)",
    "integration: integration tests (may use I/O)",
    "tck: openCypher TCK compliance tests",
    "property: property-based tests",
    "benchmark: performance benchmarks",
    "slow: tests that take >1s",
]

Running Python Tests

# All tests
make test

# By category
make test-unit
make test-integration
make test-tck

# With coverage
make coverage            # run + validate thresholds (≥85% total, ≥90% patch)
make coverage-report     # open HTML report
make coverage-diff       # changed files only

# Parallel, requires pytest-xdist (about 4× faster for TCK)
pytest tests/ -n auto

Core Fixtures (tests/conftest.py)

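import pytest
from graphforge import GraphForge  # import path is an assumption; adjust to the actual package

@pytest.fixture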
def db():
    """Fresh in-memory GraphForge instance."""
    return GraphForge()

@pytest.fixture
def tmp_db(tmp_path):
    """GraphForge instance backed by a temporary Parquet directory."""
    return GraphForge(str(tmp_path / "graph"))

Quality Gates

Coverage Requirements

| Scope | Threshold |
|---|---|
| Total codebase | ≥85% |
| Patch (new/changed lines) | ≥90% |
| Core modules (executor, planner, parser) | ≥90% |
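The gate logic implied by these thresholds is simple enough to sketch. The scope keys and the measured values below are hypothetical, and this is not the check that make coverage actually runs; it only shows how the three thresholds combine.

```python
# Thresholds listed above, in percent. The scope keys are illustrative names.
THRESHOLDS = {"total": 85.0, "patch": 90.0, "core": 90.0}

def failing_scopes(measured: dict[str, float]) -> list[str]:
    """Return the scopes whose measured coverage is below its threshold.

    A scope missing from `measured` is treated as 0% and therefore fails.
    """
    return [scope for scope, minimum in THRESHOLDS.items()
            if measured.get(scope, 0.0) < minimum]

# Hypothetical measurements for illustration only.
print(failing_scopes({"total": 87.2, "patch": 88.5, "core": 91.0}))  # prints ['patch']
```

A run passes only when the returned list is empty, so a PR with good overall coverage can still be blocked by under-tested new lines.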

Required Checks (all PRs)

  1. cargo clippy --workspace -- -D warnings — zero warnings
  2. cargo test --workspace — all Rust tests pass
  3. pytest -m unit — all Python unit tests pass
  4. pytest -m integration — all Python integration tests pass
  5. pytest -m tck — all non-skipped TCK scenarios pass
  6. make coverage — coverage thresholds met
  7. make lint and make type-check — zero issues

TCK Coverage Matrix

Maintain tests/tck/coverage_matrix.json:

{
  "tck_version": "2024.2",
  "features": {
    "Match1_Nodes": {
      "status": "supported",
      "scenarios": {
        "Match single node": "pass",
        "Match node with label": "pass"
      }
    },
    "Match3_VariableLength": {
      "status": "supported"
    }
  }
}

When the Rust core implements a feature, verify the corresponding TCK scenarios pass end-to-end before marking "status": "supported".
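A consistency check over the matrix could look like the sketch below. It loads the example document shown above and flags any feature marked "supported" that lists a scenario with a result other than "pass". The function name is hypothetical, and the existence of other result values (for example "fail") is an assumption, since the example only shows "pass".

```python
import json

MATRIX = json.loads("""
{
  "tck_version": "2024.2",
  "features": {
    "Match1_Nodes": {
      "status": "supported",
      "scenarios": {
        "Match single node": "pass",
        "Match node with label": "pass"
      }
    },
    "Match3_VariableLength": {
      "status": "supported"
    }
  }
}
""")

def inconsistent_features(matrix: dict) -> list[str]:
    """Features marked "supported" that list a scenario whose result is not "pass"."""
    bad = []
    for name, feature in matrix["features"].items():
        # Features without a "scenarios" map have nothing to contradict the status.
        results = feature.get("scenarios", {}).values()
        if feature.get("status") == "supported" and any(r != "pass" for r in results):
            bad.append(name)
    return bad

assert inconsistent_features(MATRIX) == []
```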


CI/CD

GitHub Actions runs the full suite on every PR:

jobs:
  rust:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - run: cargo clippy --workspace -- -D warnings
      - run: cargo test --workspace

  python:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: ["3.10", "3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install uv && uv sync --all-extras
      - run: maturin develop --release
      - run: make pre-push

Known Issues

pytest-xdist + pytest-cov deadlock on macOS / Python 3.13

Symptom: make pre-push hangs at the end of the test run. Progress reaches ~100%, then freezes, and CPU usage drops to 0%. The process can only be stopped with kill.

Root cause: pytest-cov collects coverage data from xdist workers via IPC sockets. When workers close their sockets, a coverage collection thread in the main process blocks on read(), deadlocking with the main thread. Reproduced on macOS (Darwin 25.x) + Python 3.13 + pytest-cov 7.0.0 + pytest-xdist 3.x.

Workaround (current Makefile): run coverage serially, without pytest-xdist's -n option, and skip the SNAP tests:

coverage:
    uv run pytest tests/unit tests/integration -m "not snap" \
        --cov=src --cov-branch \
        --cov-report=term-missing --cov-report=xml

The serial run is ~60 s slower than the parallel baseline but avoids the deadlock. Re-evaluate once a fix lands upstream in pytest-cov or pytest-xdist.


References