Algorithm Verbs
Status: Rust Core — active design
Last Updated: 2026-05-29
Overview
GraphForge exposes graph algorithms through five analyst-intent verbs that bypass the
Cypher compiler entirely. All return Arrow Tables; none require writing Cypher.
forge.rank(label, by=...) → node scores (centrality, structural)
forge.cluster(label, by=...) → community membership (partitioning, components)
forge.paths(source, target, by=...) → path results (shortest paths, flows, reachability)
forge.analyze(label, by=...) → graph-level metrics (planarity, coloring, matching, DAG)
forge.similar(label, by=...) → pairwise similarity (topology, vector)
The algorithm surface is explicitly designed for parity with:
OCG — pure-Rust openCypher engine (31+ algorithms via AlgorithmEngine trait)
Neo4j GDS — production graph data science library (60 algorithms across 7 categories)
igraph — the de-facto standard Python graph analysis library
Every algorithm that exists in any of these three libraries has a home in one of the four
verbs. Algorithms with no upstream analog (genealogy-specific traversals, provenance-aware
semijoins) are added as needed without breaking the verb design.
The AlgorithmBackend Trait
All algorithms dispatch through a single AlgorithmBackend trait in gf-exec. The trait
mirrors OCG's AlgorithmEngine interface but returns AlgorithmResult (an Arrow batch +
metadata) instead of bespoke result types.
Priority resolution: native Rust (rustworkx-core / petgraph) → igraph (if installed
via PyO3) → NetworkX (fallback). Backend is selected at runtime per algorithm call; no
user configuration required.
forge.rank() — Node Scoring
Returns an Arrow Table: all node properties + score: Float64.
write_property is opt-in; default is read-only.
Centrality (by=)
Value
Algorithm
Source reference
pagerank
PageRank
OCG / GDS / igraph
betweenness
Betweenness centrality (Brandes)
OCG / GDS / igraph
closeness
Closeness centrality
OCG / GDS / igraph
harmonic_closeness
Harmonic closeness (handles disconnected graphs)
GDS
degree
Degree centrality (normalized)
OCG / GDS / igraph
eigenvector
Eigenvector centrality
GDS
article_rank
ArticleRank (degree-penalized PageRank)
GDS
hits_hub
HITS hub score
GDS
hits_authority
HITS authority score
GDS
celf
Cost-Effective Lazy Forward influence maximization
GDS
Structural (by=)
Value
Algorithm
Source reference
clustering_coefficient
Local clustering coefficient
OCG / igraph
triangles
Triangle count per node
OCG / GDS
local_clustering_coefficient
Alias for clustering_coefficient (GDS name)
GDS
k_core
K-core decomposition number
igraph / GDS
Link Prediction Scores (by=)
These score pairs of nodes for likely edge formation. rank() emits a score per node
based on its aggregate link-prediction likelihood; for pairwise results use analyze().
Value
Algorithm
Source reference
preferential_attachment
Preferential attachment score
GDS
adamic_adar
Adamic-Adar score
GDS
common_neighbors
Common neighbor count
GDS
resource_allocation
Resource allocation index
GDS
total_neighbors
Total neighbors (Jaccard denominator)
GDS
Returns an Arrow Table: all node properties + community_id: Int64.
write_property is opt-in; default is read-only.
Value
Algorithm
Source reference
louvain
Louvain modularity maximization
OCG / GDS / igraph
leiden
Leiden (improved Louvain)
GDS / igraph
label_propagation
Label propagation
OCG / GDS / igraph
speaker_listener
Speaker-listener label propagation (SLPA)
GDS
girvan_newman
Girvan-Newman (edge betweenness)
OCG
modularity_optimization
Direct modularity optimization
GDS
fastgreedy
Clauset-Newman-Moore fastgreedy
igraph
infomap
Infomap flow-based clustering
igraph
leading_eigenvector
Newman leading eigenvector
igraph
walktrap
Walktrap random walk clustering
igraph
spinglass
Spinglass statistical mechanics
igraph
hdbscan
HDBSCAN density-based clustering
GDS
k_means
K-means on node embeddings
GDS
approximate_max_k_cut
Approximate max k-cut partitioning
GDS
Components (by=)
Value
Algorithm
Source reference
components
Weakly connected components
OCG / igraph
strongly_connected
Strongly connected components (Tarjan)
OCG / GDS / igraph
biconnected
Biconnected components
igraph
Structural Decomposition (by=)
Value
Algorithm
Source reference
k_core_decomposition
K-core decomposition (assigns shell number)
igraph / GDS
forge.paths() — Path Finding and Flow
Returns an Arrow Table whose schema depends on the algorithm:
- Shortest-path algorithms: source_id, target_id, cost, path (list of node IDs)
- Flow algorithms: source_id, sink_id, flow, edge flow columns
- Random walk: node_id, walk (list of node IDs)
source and target accept node IDs, NodeHandles, or property-match dicts.
Shortest Paths (by=)
Value
Algorithm
Source reference
bfs
Breadth-first search (hop-count distance)
OCG / GDS
dijkstra
Dijkstra single-source or source-target
OCG / GDS
dijkstra_all_pairs
All-pairs Dijkstra
OCG / GDS
astar
A* with caller-supplied heuristic
OCG / GDS
bellman_ford
Bellman-Ford (handles negative weights)
OCG / GDS
floyd_warshall
Floyd-Warshall all-pairs
OCG / GDS
delta_stepping
Delta-stepping parallel SSSP
GDS
yens
Yen's k-shortest paths
GDS
Flow (by=)
Value
Algorithm
Source reference
max_flow
Edmonds-Karp maximum flow
OCG / GDS
min_cut
Minimum cut capacity
OCG / GDS
min_cost_max_flow
Minimum cost maximum flow
GDS
gomory_hu_tree
Gomory-Hu tree (all-pairs min-cut)
igraph
Steiner Trees (by=)
Value
Algorithm
Source reference
min_steiner_tree
Minimum directed Steiner tree
GDS
prize_collecting_steiner_tree
Prize-collecting Steiner tree
GDS
Traversal / Sampling (by=)
Value
Algorithm
Source reference
dfs
Depth-first search traversal
GDS / igraph
random_walk
Random walk sequence
GDS
Reachability (by=)
Value
Algorithm
Source reference
transitive_closure
All (src, dst) reachability pairs
OCG
forge.analyze() — Graph-Level Metrics
Returns an Arrow Table whose schema reflects the analysis type.
No label parameter is required for graph-global metrics; it is optional for node-level
output (coloring, matching, cycle listing).
Spanning Trees (by=)
Value
Algorithm
Source reference
minimum_spanning_tree
Kruskal MST → edge list with weights
OCG
maximum_spanning_tree
Kruskal maximum spanning tree
OCG
minimum_k_spanning_tree
Minimum weight k-spanning tree
GDS
DAG Analysis (by=)
Value
Algorithm
Source reference
topological_sort
Kahn's topological ordering
OCG / GDS
is_dag
Boolean: is the graph a DAG?
OCG
find_cycles
Johnson's algorithm — list all simple cycles
OCG
dag_longest_path
Longest path by hop count
OCG
dag_longest_path_weighted
Longest path by edge weight
OCG / GDS
Coloring (by=)
Value
Algorithm
Source reference
node_coloring
Greedy vertex coloring
OCG
edge_coloring
Greedy edge coloring
OCG
chromatic_number
Approximate chromatic number
OCG
k1_coloring
K-1 coloring
GDS
Matching (by=)
Value
Algorithm
Source reference
max_weight_matching
Maximum weight matching (edges)
OCG
max_cardinality_matching
Maximum cardinality matching (edges)
OCG
max_bipartite_matching
Maximum bipartite matching
igraph
Eulerian Graph (by=)
Value
Algorithm
Source reference
euler_circuit
Eulerian circuit node sequence
OCG
euler_path
Eulerian path node sequence
OCG
has_euler_circuit
Boolean: has Eulerian circuit?
OCG
has_euler_path
Boolean: has Eulerian path?
OCG
Planarity and Structure (by=)
Value
Algorithm
Source reference
is_planar
Boyer-Myrvold / LR planarity test
OCG
articulation_points
Articulation point detection
GDS
bridges
Bridge (cut-edge) detection
GDS
triangle_count
Global triangle count
GDS
conductance
Conductance metric for a partition
GDS
modularity
Modularity of a given partition
igraph
transitivity
Global / average local transitivity
igraph
triad_census
MAN triad census
igraph
dyad_census
Dyad census (mutual/asymmetric/null)
igraph
count_automorphisms
Automorphism count (VF2)
igraph
Node Embeddings (by=)
These produce a vector column (embedding: List<Float32>) rather than a scalar score.
Value
Algorithm
Source reference
node2vec
Node2Vec random-walk embedding
GDS
graphsage
GraphSAGE inductive embedding
GDS
fast_random_projection
Fast random projection embedding
GDS
hashgnn
HashGNN feature-based embedding
GDS
Similarity
Node and subgraph similarity algorithms return a pairwise Arrow Table
(node1_id, node2_id, similarity: Float64). Exposed via forge.similar().
forge . similar ( label , by = "jaccard" , k = 10 ) # k-nearest neighbors
forge . similar ( label , by = "cosine" , vector_property = "embedding" , k = 10 )
Value
Algorithm
Source reference
node_similarity
Jaccard / overlap similarity
GDS
knn
K-nearest neighbors (property-based)
GDS
filtered_knn
Filtered KNN
GDS
filtered_node_similarity
Filtered node similarity
GDS
Verb Summary
Verb
Primary output column
Use when
forge.rank(label, by=...)
score: Float64
You want every node scored
forge.cluster(label, by=...)
community_id: Int64
You want nodes partitioned
forge.paths(source, target, by=...)
cost, path
You want routes between nodes
forge.analyze(label, by=...)
varies
Graph-level or structural metrics
forge.similar(label, by=...)
similarity: Float64
You want pairwise node similarity
All verbs accept via (relationship type filter), directed (bool), and write_property
(opt-in write-back). All return Arrow Tables.
Implementation Notes
Algorithm dispatch lives in crates/gf-exec/src/algorithms/
Backend trait: AlgorithmBackend in crates/gf-exec/src/algorithms/backend.rs
Adjacency export (Parquet → Arrow edge table): crates/gf-exec/src/algorithms/adjacency.rs
Backend resolution order: native Rust (rustworkx-core, petgraph) → igraph → NetworkX
All backends return AlgorithmResult { batches: Vec<RecordBatch>, schema: SchemaRef }
OCG's AlgorithmEngine trait is the primary cross-reference for correctness of algorithm
signatures; their Rust implementations use the same backends (rustworkx-core, petgraph)
References
API Reference — public verb signatures
Architecture Overview — how verbs fit into the pipeline
Execution Model — DataFusion integration
OCG crate: https://docs.rs/ocg/latest/ocg/ — algorithm interface reference (Apache-2.0)
Neo4j GDS: https://neo4j.com/docs/graph-data-science/current/algorithms/
igraph Python API: https://igraph.org/python/api/latest/