Skip to content

Algorithm Verbs

Status: Rust Core — active design Last Updated: 2026-05-29


Overview

GraphForge exposes graph algorithms through five analyst-intent verbs that bypass the Cypher compiler entirely. All return Arrow Tables; none require writing Cypher.

forge.rank(label, by=...)             →  node scores          (centrality, structural)
forge.cluster(label, by=...)          →  community membership  (partitioning, components)
forge.paths(source, target, by=...)   →  path results          (shortest paths, flows, reachability)
forge.analyze(label, by=...)          →  graph-level metrics   (planarity, coloring, matching, DAG)
forge.similar(label, by=...)          →  pairwise similarity   (topology, vector)

The algorithm surface is explicitly designed for parity with:

  • OCG — pure-Rust openCypher engine (31+ algorithms via AlgorithmEngine trait)
  • Neo4j GDS — production graph data science library (60 algorithms across 7 categories)
  • igraph — the de-facto standard Python graph analysis library

Every algorithm that exists in any of these three libraries has a home in one of the four verbs. Algorithms with no upstream analog (genealogy-specific traversals, provenance-aware semijoins) are added as needed without breaking the verb design.


The AlgorithmBackend Trait

All algorithms dispatch through a single AlgorithmBackend trait in gf-exec. The trait mirrors OCG's AlgorithmEngine interface but returns AlgorithmResult (an Arrow batch + metadata) instead of bespoke result types.

Priority resolution: native Rust (rustworkx-core / petgraph) → igraph (if installed via PyO3) → NetworkX (fallback). Backend is selected at runtime per algorithm call; no user configuration required.


forge.rank() — Node Scoring

Returns an Arrow Table: all node properties + score: Float64.

write_property is opt-in; default is read-only.

Centrality (by=)

Value Algorithm Source reference
pagerank PageRank OCG / GDS / igraph
betweenness Betweenness centrality (Brandes) OCG / GDS / igraph
closeness Closeness centrality OCG / GDS / igraph
harmonic_closeness Harmonic closeness (handles disconnected graphs) GDS
degree Degree centrality (normalized) OCG / GDS / igraph
eigenvector Eigenvector centrality GDS
article_rank ArticleRank (degree-penalized PageRank) GDS
hits_hub HITS hub score GDS
hits_authority HITS authority score GDS
celf Cost-Effective Lazy Forward influence maximization GDS

Structural (by=)

Value Algorithm Source reference
clustering_coefficient Local clustering coefficient OCG / igraph
triangles Triangle count per node OCG / GDS
local_clustering_coefficient Alias for clustering_coefficient (GDS name) GDS
k_core K-core decomposition number igraph / GDS

These score pairs of nodes for likely edge formation. rank() emits a score per node based on its aggregate link-prediction likelihood; for pairwise results use analyze().

Value Algorithm Source reference
preferential_attachment Preferential attachment score GDS
adamic_adar Adamic-Adar score GDS
common_neighbors Common neighbor count GDS
resource_allocation Resource allocation index GDS
total_neighbors Total neighbors (Jaccard denominator) GDS

forge.cluster() — Community Membership

Returns an Arrow Table: all node properties + community_id: Int64.

write_property is opt-in; default is read-only.

Community Detection (by=)

Value Algorithm Source reference
louvain Louvain modularity maximization OCG / GDS / igraph
leiden Leiden (improved Louvain) GDS / igraph
label_propagation Label propagation OCG / GDS / igraph
speaker_listener Speaker-listener label propagation (SLPA) GDS
girvan_newman Girvan-Newman (edge betweenness) OCG
modularity_optimization Direct modularity optimization GDS
fastgreedy Clauset-Newman-Moore fastgreedy igraph
infomap Infomap flow-based clustering igraph
leading_eigenvector Newman leading eigenvector igraph
walktrap Walktrap random walk clustering igraph
spinglass Spinglass statistical mechanics igraph
hdbscan HDBSCAN density-based clustering GDS
k_means K-means on node embeddings GDS
approximate_max_k_cut Approximate max k-cut partitioning GDS

Components (by=)

Value Algorithm Source reference
components Weakly connected components OCG / igraph
strongly_connected Strongly connected components (Tarjan) OCG / GDS / igraph
biconnected Biconnected components igraph

Structural Decomposition (by=)

Value Algorithm Source reference
k_core_decomposition K-core decomposition (assigns shell number) igraph / GDS

forge.paths() — Path Finding and Flow

Returns an Arrow Table whose schema depends on the algorithm: - Shortest-path algorithms: source_id, target_id, cost, path (list of node IDs) - Flow algorithms: source_id, sink_id, flow, edge flow columns - Random walk: node_id, walk (list of node IDs)

source and target accept node IDs, NodeHandles, or property-match dicts.

Shortest Paths (by=)

Value Algorithm Source reference
bfs Breadth-first search (hop-count distance) OCG / GDS
dijkstra Dijkstra single-source or source-target OCG / GDS
dijkstra_all_pairs All-pairs Dijkstra OCG / GDS
astar A* with caller-supplied heuristic OCG / GDS
bellman_ford Bellman-Ford (handles negative weights) OCG / GDS
floyd_warshall Floyd-Warshall all-pairs OCG / GDS
delta_stepping Delta-stepping parallel SSSP GDS
yens Yen's k-shortest paths GDS

Flow (by=)

Value Algorithm Source reference
max_flow Edmonds-Karp maximum flow OCG / GDS
min_cut Minimum cut capacity OCG / GDS
min_cost_max_flow Minimum cost maximum flow GDS
gomory_hu_tree Gomory-Hu tree (all-pairs min-cut) igraph

Steiner Trees (by=)

Value Algorithm Source reference
min_steiner_tree Minimum directed Steiner tree GDS
prize_collecting_steiner_tree Prize-collecting Steiner tree GDS

Traversal / Sampling (by=)

Value Algorithm Source reference
dfs Depth-first search traversal GDS / igraph
random_walk Random walk sequence GDS

Reachability (by=)

Value Algorithm Source reference
transitive_closure All (src, dst) reachability pairs OCG

forge.analyze() — Graph-Level Metrics

Returns an Arrow Table whose schema reflects the analysis type. No label parameter is required for graph-global metrics; it is optional for node-level output (coloring, matching, cycle listing).

Spanning Trees (by=)

Value Algorithm Source reference
minimum_spanning_tree Kruskal MST → edge list with weights OCG
maximum_spanning_tree Kruskal maximum spanning tree OCG
minimum_k_spanning_tree Minimum weight k-spanning tree GDS

DAG Analysis (by=)

Value Algorithm Source reference
topological_sort Kahn's topological ordering OCG / GDS
is_dag Boolean: is the graph a DAG? OCG
find_cycles Johnson's algorithm — list all simple cycles OCG
dag_longest_path Longest path by hop count OCG
dag_longest_path_weighted Longest path by edge weight OCG / GDS

Coloring (by=)

Value Algorithm Source reference
node_coloring Greedy vertex coloring OCG
edge_coloring Greedy edge coloring OCG
chromatic_number Approximate chromatic number OCG
k1_coloring K-1 coloring GDS

Matching (by=)

Value Algorithm Source reference
max_weight_matching Maximum weight matching (edges) OCG
max_cardinality_matching Maximum cardinality matching (edges) OCG
max_bipartite_matching Maximum bipartite matching igraph

Eulerian Graph (by=)

Value Algorithm Source reference
euler_circuit Eulerian circuit node sequence OCG
euler_path Eulerian path node sequence OCG
has_euler_circuit Boolean: has Eulerian circuit? OCG
has_euler_path Boolean: has Eulerian path? OCG

Planarity and Structure (by=)

Value Algorithm Source reference
is_planar Boyer-Myrvold / LR planarity test OCG
articulation_points Articulation point detection GDS
bridges Bridge (cut-edge) detection GDS
triangle_count Global triangle count GDS
conductance Conductance metric for a partition GDS
modularity Modularity of a given partition igraph
transitivity Global / average local transitivity igraph
triad_census MAN triad census igraph
dyad_census Dyad census (mutual/asymmetric/null) igraph
count_automorphisms Automorphism count (VF2) igraph

Node Embeddings (by=)

These produce a vector column (embedding: List<Float32>) rather than a scalar score.

Value Algorithm Source reference
node2vec Node2Vec random-walk embedding GDS
graphsage GraphSAGE inductive embedding GDS
fast_random_projection Fast random projection embedding GDS
hashgnn HashGNN feature-based embedding GDS

Similarity

Node and subgraph similarity algorithms return a pairwise Arrow Table (node1_id, node2_id, similarity: Float64). Exposed via forge.similar().

forge.similar(label, by="jaccard", k=10)   # k-nearest neighbors
forge.similar(label, by="cosine", vector_property="embedding", k=10)
Value Algorithm Source reference
node_similarity Jaccard / overlap similarity GDS
knn K-nearest neighbors (property-based) GDS
filtered_knn Filtered KNN GDS
filtered_node_similarity Filtered node similarity GDS

Verb Summary

Verb Primary output column Use when
forge.rank(label, by=...) score: Float64 You want every node scored
forge.cluster(label, by=...) community_id: Int64 You want nodes partitioned
forge.paths(source, target, by=...) cost, path You want routes between nodes
forge.analyze(label, by=...) varies Graph-level or structural metrics
forge.similar(label, by=...) similarity: Float64 You want pairwise node similarity

All verbs accept via (relationship type filter), directed (bool), and write_property (opt-in write-back). All return Arrow Tables.


Implementation Notes

  • Algorithm dispatch lives in crates/gf-exec/src/algorithms/
  • Backend trait: AlgorithmBackend in crates/gf-exec/src/algorithms/backend.rs
  • Adjacency export (Parquet → Arrow edge table): crates/gf-exec/src/algorithms/adjacency.rs
  • Backend resolution order: native Rust (rustworkx-core, petgraph) → igraph → NetworkX
  • All backends return AlgorithmResult { batches: Vec<RecordBatch>, schema: SchemaRef }
  • OCG's AlgorithmEngine trait is the primary cross-reference for correctness of algorithm signatures; their Rust implementations use the same backends (rustworkx-core, petgraph)

References

  • API Reference — public verb signatures
  • Architecture Overview — how verbs fit into the pipeline
  • Execution Model — DataFusion integration
  • OCG crate: https://docs.rs/ocg/latest/ocg/ — algorithm interface reference (Apache-2.0)
  • Neo4j GDS: https://neo4j.com/docs/graph-data-science/current/algorithms/
  • igraph Python API: https://igraph.org/python/api/latest/