GraphForge Scale Limits¶
Python 0.4.x measurements
All benchmarks on this page were measured against the Python 0.4.x executor
on real datasets. The Rust core (rust-core branch) replaces the Python executor
with a DataFusion-backed Arrow pipeline; these numbers will not apply once
the Rust core ships.
Last updated: 2026-05-04 (v0.3.9)
GraphForge is designed for research and notebook workflows on small-to-medium graphs. This document describes the practical limits measured on real datasets, distinguishes between query types, and explains why edge count matters more than node count for most operations.
The Short Answer¶
| Query type | Practical limit | Notes |
|---|---|---|
| Traversal with LIMIT | ~20M edges | one_hop LIMIT 1000 takes <1s at 16M edges |
| Full-scan count / aggregation | ~1M edges | count_edges warm = 218s at 34M edges |
| ORDER BY over all nodes | ~500k nodes | topn without index = 155s at 4M nodes |
| Memory (RSS) | ~8 GB available RAM | ~360 bytes/edge; 16M edges ≈ 6 GB |
Why Edge Count, Not Node Count¶
The README historically said "< 10M nodes". That framing is misleading because the real ceiling depends on what you're doing and how many edges you have.
count_nodesiterates an in-memory dict: O(N nodes)count_edgesiterates every adjacency list entry: O(N edges)
A graph with 1M nodes and 50M edges will hit full-scan limits far earlier than a sparse graph with 9M nodes and 10M edges. For dense graphs the difference is dramatic:
| Dataset | Nodes | Edges | count_nodes warm |
count_edges warm |
|---|---|---|---|---|
| M-tier amazon | 335k | 926k | 622 ms | 2 509 ms |
| L-tier livejournal | 4M | 34M | 2 400 ms | 218 000 ms |
| XL-tier cit-patents | 3.8M | 16.5M | 2 539 ms | 55 818 ms |
At L-tier the edge scan is 91× slower than the node scan because livejournal has 8.7 edges per node on average versus the adjacency dict's O(1) node lookup.
Measured Performance by Tier¶
All times are wall-clock seconds on Apple Silicon (macOS 25.4, Python 3.13).
LIMIT-respecting queries (stop early — scale well)¶
These queries stop producing rows as soon as the LIMIT is satisfied. The LIMIT short-circuit (#423) makes them practical at every tier.
| Query | XS (78e) | S (88k e) | M (926k e) | XL (16.5M e) | L (34M e) |
|---|---|---|---|---|---|
one_hop LIMIT 1000 |
0.002s | 0.003s | 0.043s | 0.50s | 0.74s |
two_hop LIMIT 1000 |
0.001s | 0.003s | 0.044s | 0.66s | 0.76s |
L-tier one_hop is slightly slower than XL despite fewer edges because livejournal has higher average degree (8.7 vs 4.4) — each expansion produces more candidates.
Full-scan queries (must visit every node or edge)¶
These queries cannot stop early and scale poorly beyond 1M edges.
| Query | XS | S | M | XL | L |
|---|---|---|---|---|---|
count_nodes warm |
<1 ms | 2 ms | 622 ms | 2 539 ms | 2 400 ms |
count_edges warm |
<1 ms | 83 ms | 2 509 ms | 55 818 ms | 218 000 ms |
aggregation warm |
<1 ms | 15 ms | 1 820 ms | 20 833 ms | 39 000 ms |
topn ORDER BY cold |
<1 ms | 21 ms | 2 800 ms | 51 962 ms | 155 000 ms |
count_edges is catastrophically slow at L-tier because it iterates all 34M
adjacency list entries with no index and no caching.
Structural Bottlenecks¶
These are not tuning problems in the Python executor — they are addressed architecturally in the Rust core:
| Bottleneck | Python 0.4.x | Rust core approach |
|---|---|---|
| No O(1) edge counter | Full adjacency scan | Columnar COUNT(*) on edge_facts table |
| ORDER BY without top-N heap | Materialises all nodes before sorting | DataFusion top-N physical node |
| Single-row ingest | One INSERT per row |
Parquet bulk write via Arrow RecordBatch |
| ~360 bytes/edge in Python dicts | Hard memory ceiling at ~40M edges | Compact columnar layout in Parquet |
Recommended Size Ceilings (Python 0.4.x)¶
| Use case | Ceiling | Why |
|---|---|---|
| Interactive traversal (LIMIT-based) | ~20M edges | <1s response at XL-tier |
| Full-scan aggregation | ~1M edges | Beyond this, queries take seconds to minutes |
| ORDER BY all nodes | ~500k nodes | Full sort materialises everything |
| Memory budget (16 GB machine) | ~40M edges | ~360 bytes/edge |
| Memory budget (8 GB machine) | ~20M edges | Leave headroom for Python runtime |
For LIMIT-based queries — the dominant pattern in research notebooks — the ceiling is comfortably above 10M nodes for sparse graphs. For full-scan operations the practical limit is much lower and governed by edge count.
LIMIT Short-Circuit Effectiveness¶
MATCH (a:Node)-[]->(b)-[]->(c) RETURN id(c) AS cid LIMIT 1000
| Tier | Without LIMIT | With LIMIT 1000 | Speedup |
|---|---|---|---|
| S (88k edges) | ~3.8s | 0.003s | 1 270× |
| M (926k edges) | ~9s | 0.044s | 205× |
| L (34M edges) | hours | 0.76s | >>1 000× |
Further Reading¶
- CHANGELOG.md — version history including v0.3.9 performance improvements
- Architecture Overview — Rust core design and DataFusion execution model