GraphForge Scale Limits¶

Python 0.4.x measurements

All benchmarks on this page were measured against the Python 0.4.x executor on real datasets. The Rust core (rust-core branch) replaces the Python executor with a DataFusion-backed Arrow pipeline; these numbers will not apply once the Rust core ships.

Last updated: 2026-05-04 (v0.3.9)

GraphForge is designed for research and notebook workflows on small-to-medium graphs. This document describes the practical limits measured on real datasets, distinguishes between query types, and explains why edge count matters more than node count for most operations.

The Short Answer¶

Query type	Practical limit	Notes
Traversal with LIMIT	~20M edges	`one_hop LIMIT 1000` takes <1s at 16M edges
Full-scan count / aggregation	~1M edges	`count_edges` warm = 218s at 34M edges
ORDER BY over all nodes	~500k nodes	`topn` without index = 155s at 4M nodes
Memory (RSS)	~8 GB available RAM	~360 bytes/edge; 16M edges ≈ 6 GB

Why Edge Count, Not Node Count¶

The README historically said "< 10M nodes". That framing is misleading because the real ceiling depends on what you're doing and how many edges you have.

count_nodes iterates an in-memory dict: O(N nodes)
count_edges iterates every adjacency list entry: O(N edges)

A graph with 1M nodes and 50M edges will hit full-scan limits far earlier than a sparse graph with 9M nodes and 10M edges. For dense graphs the difference is dramatic:

Dataset	Nodes	Edges	`count_nodes` warm	`count_edges` warm
M-tier amazon	335k	926k	622 ms	2 509 ms
L-tier livejournal	4M	34M	2 400 ms	218 000 ms
XL-tier cit-patents	3.8M	16.5M	2 539 ms	55 818 ms

At L-tier the edge scan is 91× slower than the node scan because livejournal has 8.7 edges per node on average versus the adjacency dict's O(1) node lookup.

Measured Performance by Tier¶

All times are wall-clock seconds on Apple Silicon (macOS 25.4, Python 3.13).

LIMIT-respecting queries (stop early — scale well)¶

These queries stop producing rows as soon as the LIMIT is satisfied. The LIMIT short-circuit (#423) makes them practical at every tier.

Query	XS (78e)	S (88k e)	M (926k e)	XL (16.5M e)	L (34M e)
`one_hop LIMIT 1000`	0.002s	0.003s	0.043s	0.50s	0.74s
`two_hop LIMIT 1000`	0.001s	0.003s	0.044s	0.66s	0.76s

L-tier one_hop is slightly slower than XL despite fewer edges because livejournal has higher average degree (8.7 vs 4.4) — each expansion produces more candidates.

Full-scan queries (must visit every node or edge)¶

These queries cannot stop early and scale poorly beyond 1M edges.

Query	XS	S	M	XL	L
`count_nodes` warm	<1 ms	2 ms	622 ms	2 539 ms	2 400 ms
`count_edges` warm	<1 ms	83 ms	2 509 ms	55 818 ms	218 000 ms
`aggregation` warm	<1 ms	15 ms	1 820 ms	20 833 ms	39 000 ms
`topn ORDER BY` cold	<1 ms	21 ms	2 800 ms	51 962 ms	155 000 ms

count_edges is catastrophically slow at L-tier because it iterates all 34M adjacency list entries with no index and no caching.

Structural Bottlenecks¶

These are not tuning problems in the Python executor — they are addressed architecturally in the Rust core:

Bottleneck	Python 0.4.x	Rust core approach
No O(1) edge counter	Full adjacency scan	Columnar `COUNT(*)` on `edge_facts` table
ORDER BY without top-N heap	Materialises all nodes before sorting	DataFusion top-N physical node
Single-row ingest	One `INSERT` per row	Parquet bulk write via Arrow RecordBatch
~360 bytes/edge in Python dicts	Hard memory ceiling at ~40M edges	Compact columnar layout in Parquet

Recommended Size Ceilings (Python 0.4.x)¶

Use case	Ceiling	Why
Interactive traversal (LIMIT-based)	~20M edges	<1s response at XL-tier
Full-scan aggregation	~1M edges	Beyond this, queries take seconds to minutes
ORDER BY all nodes	~500k nodes	Full sort materialises everything
Memory budget (16 GB machine)	~40M edges	~360 bytes/edge
Memory budget (8 GB machine)	~20M edges	Leave headroom for Python runtime

For LIMIT-based queries — the dominant pattern in research notebooks — the ceiling is comfortably above 10M nodes for sparse graphs. For full-scan operations the practical limit is much lower and governed by edge count.

LIMIT Short-Circuit Effectiveness¶

MATCH (a:Node)-[]->(b)-[]->(c) RETURN id(c) AS cid LIMIT 1000

Tier	Without LIMIT	With LIMIT 1000	Speedup
S (88k edges)	~3.8s	0.003s	1 270×
M (926k edges)	~9s	0.044s	205×
L (34M edges)	hours	0.76s	>>1 000×