AI Agent Grounding with Ontologies¶
Overview¶
Grounded agents use knowledge graphs and ontologies to structure domain knowledge, map capabilities to tools, and enable semantic reasoning. This approach helps LLM-based agents make better decisions by grounding their understanding in structured knowledge rather than relying solely on prompt context.
GraphForge suits agent grounding well because of its embedded architecture, Python-native design, and zero-configuration deployment model.
Why Knowledge Graphs for Agent Grounding?¶
The Problem¶
Modern LLM agents face several challenges:
- Context limitations: Cannot fit all domain knowledge in prompts
- Tool selection: Choosing the right tool from dozens or hundreds of options
- Semantic understanding: Knowing which tools work with which domain concepts
- Reasoning: Following relationships between concepts, tools, and actions
The Solution: Ontological Grounding¶
A knowledge graph provides:
- Structured domain model: Classes, properties, and relationships
- Tool annotations: Capabilities mapped to domain concepts
- Semantic queries: Find tools by intent, not just keywords
- Reasoning paths: Navigate concept hierarchies and relationships
GraphForge vs. Neo4j for Agent Development¶
| Feature | GraphForge | Neo4j |
|---|---|---|
| Deployment | Embedded in Python process | Requires server setup |
| Setup | `pip install graphforge` | Install server, configure ports, manage service |
| Integration | Native Python API | HTTP API or driver |
| Agent workflow | Direct in-process access | Network calls add latency |
| Development | Zero config, instant start | Configuration, connection strings |
| Portability | Runs anywhere Python runs | Requires server infrastructure |
| Query language | Full openCypher (same as Neo4j) | openCypher |
| Best for | AI/ML development, research, prototyping | Production deployments, massive scale |
Bottom line: GraphForge eliminates deployment complexity for agent development while providing full openCypher compatibility.
Architecture¶
┌─────────────────────────────────────┐
│   LLM Agent (GPT-4, Claude, etc.)   │
│                                     │
│   "Find tools to check inventory"   │
└───────────────┬─────────────────────┘
                │
                ├─ Semantic Query
                │
┌───────────────▼─────────────────────┐
│             GraphForge              │
│                                     │
│  ┌───────────────────────────────┐  │
│  │ Domain Ontology               │  │
│  │                               │  │
│  │ Product ─IS_A─> Item          │  │
│  │ Inventory ─TRACKS─> Product   │  │
│  └───────────────────────────────┘  │
│                                     │
│  ┌───────────────────────────────┐  │
│  │ Tool Definitions              │  │
│  │                               │  │
│  │ check_inventory()             │  │
│  │   ─OPERATES_ON─> Inventory    │  │
│  │   ─REQUIRES─> product_id      │  │
│  └───────────────────────────────┘  │
└───────────────┬─────────────────────┘
                │
                ├─ Matched Tools
                │
┌───────────────▼─────────────────────┐
│        Agent Execution Layer        │
│                                     │
│   Execute: check_inventory("123")   │
└─────────────────────────────────────┘
Implementation Guide¶
Step 1: Define Domain Ontology¶
Create a structured model of your domain:
from graphforge import GraphForge
gf = GraphForge()
# Define class hierarchy
gf.execute("""
CREATE (:Class {name: 'Entity', description: 'Base class for all domain entities'})
CREATE (:Class {name: 'Product', description: 'Physical or digital product'})
CREATE (:Class {name: 'Inventory', description: 'Stock tracking system'})
CREATE (:Class {name: 'Order', description: 'Customer purchase order'})
""")
# Define IS_A relationships
gf.execute("""
MATCH (product:Class {name: 'Product'}), (entity:Class {name: 'Entity'})
CREATE (product)-[:IS_A]->(entity)
""")
# Define domain properties
gf.execute("""
CREATE (:Property {name: 'price', type: 'float', class: 'Product'})
CREATE (:Property {name: 'quantity', type: 'int', class: 'Inventory'})
CREATE (:Property {name: 'status', type: 'string', class: 'Order'})
""")
Step 2: Annotate Tools with Capabilities¶
Map tools to domain concepts:
# Define a tool
gf.execute("""
CREATE (t:Tool {
name: 'check_inventory',
description: 'Check current stock levels for a product',
returns: 'int',
endpoint: 'api.inventory.check'
})
""")
# Define tool parameters
gf.execute("""
MATCH (t:Tool {name: 'check_inventory'})
CREATE (t)-[:HAS_PARAMETER]->(:Parameter {
name: 'product_id',
type: 'string',
required: true,
description: 'Unique identifier for the product'
})
""")
# Link tool to domain concepts
gf.execute("""
MATCH (t:Tool {name: 'check_inventory'}),
(inv:Class {name: 'Inventory'}),
(prod:Class {name: 'Product'})
CREATE (t)-[:OPERATES_ON]->(inv)
CREATE (t)-[:REQUIRES]->(prod)
""")
# Annotate tool capabilities
gf.execute("""
MATCH (t:Tool {name: 'check_inventory'})
CREATE (t)-[:CAN_DO]->(:Capability {name: 'query_stock'})
CREATE (t)-[:CAN_DO]->(:Capability {name: 'verify_availability'})
""")
Step 3: Query for Tool Selection¶
The agent queries the ontology to find relevant tools:
def find_tools_for_intent(gf, intent_description):
    """Find tools matching agent intent using semantic queries."""
    # Example: "I need to check product availability"
    # Note: the keywords below are hard-coded for illustration;
    # intent_description is not yet used to build the query.
    query = """
        MATCH (t:Tool)-[:CAN_DO]->(c:Capability)
        WHERE c.name CONTAINS 'availability' OR c.name CONTAINS 'stock'
        RETURN t.name AS tool,
               t.description AS description,
               collect(c.name) AS capabilities
    """
    results = gf.to_dicts(query)
    return results

tools = find_tools_for_intent(gf, "check product availability")
# Returns: [{'tool': 'check_inventory', 'description': '...', 'capabilities': [...]}]
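The function above hard-codes its keywords and never reads `intent_description`. One way to close that gap, sketched here under assumptions, is to derive keywords from the intent text and pass them as a query parameter. The names `STOPWORDS`, `intent_keywords`, and `find_tools_by_keywords` are illustrative, and the `any(...)` list predicate is standard openCypher whose support in GraphForge has not been verified here.

```python
import re

# Illustrative stopword list; tune it for your domain.
STOPWORDS = frozenset({"a", "an", "the", "i", "to", "need", "for", "of", "and", "or", "my", "me", "please"})


def intent_keywords(intent_description: str) -> list[str]:
    """Lowercase the intent, drop stopwords, and keep first occurrences in order."""
    keywords: list[str] = []
    for word in re.findall(r"[a-z0-9_]+", intent_description.lower()):
        if word not in STOPWORDS and word not in keywords:
            keywords.append(word)
    return keywords


def find_tools_by_keywords(gf, intent_description: str):
    """Query capabilities whose name contains any keyword from the intent."""
    keywords = intent_keywords(intent_description)
    if not keywords:
        return []  # nothing to search for, so skip the query entirely
    # Assumption: the engine supports the openCypher any() list predicate.
    query = """
        MATCH (t:Tool)-[:CAN_DO]->(c:Capability)
        WHERE any(k IN $keywords WHERE c.name CONTAINS k)
        RETURN t.name AS tool,
               t.description AS description,
               collect(c.name) AS capabilities
    """
    return gf.to_dicts(query, {"keywords": keywords})


print(intent_keywords("I need to check product availability"))
# ['check', 'product', 'availability']
```

Substring matching on keywords is still brittle with synonyms; it only removes the hard-coded terms.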
Step 4: Navigate Concept Hierarchies¶
Find tools that work with a class or its superclasses:
def find_tools_for_entity(gf, entity_class):
    """Find all tools that work with an entity or its parent classes."""
    query = """
        MATCH (c:Class {name: $entity_class})-[:IS_A*0..]->(parent:Class)
        WITH parent
        MATCH (t:Tool)-[:OPERATES_ON]->(parent)
        RETURN DISTINCT t.name AS tool,
               t.description AS description,
               parent.name AS operates_on
    """
    return gf.to_dicts(query, {'entity_class': entity_class})

# Find all tools that work with Products
tools = find_tools_for_entity(gf, 'Product')
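To make the `IS_A*0..` pattern concrete, the plain-Python sketch below mimics what it matches: the starting class itself (zero hops) plus every class reachable by following `IS_A` edges. The `ancestors_inclusive` helper and the `IS_A` dictionary are illustrative stand-ins for the graph, not part of GraphForge.

```python
from collections import deque


def ancestors_inclusive(cls: str, is_a: dict[str, list[str]]) -> list[str]:
    """Return cls followed by every class reachable over IS_A edges."""
    order = [cls]
    seen = {cls}
    queue = deque([cls])
    while queue:
        current = queue.popleft()
        for parent in is_a.get(current, []):
            if parent not in seen:  # guard against cycles and repeats
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


# Mirrors the single IS_A edge created in Step 1.
IS_A = {"Product": ["Entity"]}
print(ancestors_inclusive("Product", IS_A))  # ['Product', 'Entity']
```

With the sample data from Steps 1 and 2, `check_inventory` is linked by `OPERATES_ON` only to `Inventory`, which is not among the ancestors of `Product`, so the call for `'Product'` would likely return an empty list until a tool is attached to `Product` or `Entity`.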
Step 5: Get Tool Metadata¶
Retrieve complete tool signature for execution:
def get_tool_metadata(gf, tool_name):
    """Get complete tool definition including parameters."""
    query = """
        MATCH (t:Tool {name: $tool_name})
        OPTIONAL MATCH (t)-[:HAS_PARAMETER]->(p:Parameter)
        OPTIONAL MATCH (t)-[:OPERATES_ON]->(c:Class)
        OPTIONAL MATCH (t)-[:CAN_DO]->(cap:Capability)
        RETURN t.name AS name,
               t.description AS description,
               t.endpoint AS endpoint,
               collect(DISTINCT {
                   name: p.name,
                   type: p.type,
                   required: p.required,
                   description: p.description
               }) AS parameters,
               collect(DISTINCT c.name) AS operates_on,
               collect(DISTINCT cap.name) AS capabilities
    """
    result = gf.to_dicts(query, {'tool_name': tool_name})
    return result[0] if result else None
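Many LLM function-calling interfaces expect tool parameters as a JSON Schema style object. The sketch below converts a metadata dictionary of the shape returned by `get_tool_metadata` into such a structure. The `TYPE_MAP` table and the `to_function_spec` name are assumptions made for illustration, and the exact envelope varies by provider.

```python
# Assumed mapping from the ontology's type names to JSON Schema type names.
TYPE_MAP = {"string": "string", "int": "integer", "float": "number", "bool": "boolean"}


def to_function_spec(metadata: dict) -> dict:
    """Build a JSON Schema style function description from tool metadata."""
    properties: dict[str, dict] = {}
    required: list[str] = []
    for param in metadata.get("parameters") or []:
        # OPTIONAL MATCH can yield a map of nulls when a tool has no parameters.
        if not param or param.get("name") is None:
            continue
        properties[param["name"]] = {
            "type": TYPE_MAP.get(param.get("type"), "string"),
            "description": param.get("description") or "",
        }
        if param.get("required"):
            required.append(param["name"])
    return {
        "name": metadata["name"],
        "description": metadata["description"],
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


example = {
    "name": "check_inventory",
    "description": "Check current stock levels for a product",
    "parameters": [
        {"name": "product_id", "type": "string", "required": True,
         "description": "Unique identifier for the product"},
    ],
}
print(to_function_spec(example)["parameters"]["required"])  # ['product_id']
```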
Integration with Agent Frameworks¶
LangChain / LangGraph Integration¶
from langchain_core.tools import StructuredTool
from langgraph.prebuilt import ToolNode
# Load tools from GraphForge ontology
def load_tools_from_ontology(gf):
    tools_data = gf.to_dicts("""
        MATCH (t:Tool)
        RETURN t.name AS name, t.description AS description, t.endpoint AS endpoint
    """)
    tools = []
    for t in tools_data:
        # Create LangChain tool from ontology definition
        tool = StructuredTool.from_function(
            name=t['name'],
            description=t['description'],
            func=lambda **kwargs, ep=t['endpoint']: call_api(ep, kwargs),  # Your API caller
        )
        tools.append(tool)
    return tools

# Use in a LangGraph agent node
tools = load_tools_from_ontology(gf)
tool_node = ToolNode(tools)
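The loaders on this page bind the endpoint through a default argument (`ep=t['endpoint']`) rather than referring to `t` inside the lambda. The plain-Python sketch below shows why: closures created in a loop look up the loop variable when they are called, so without the default argument every tool would end up calling the last endpoint. The second endpoint string is a made-up value for illustration.

```python
endpoints = ["api.inventory.check", "api.orders.track"]  # second value is hypothetical

# Late binding: each lambda reads `ep` when called, after the loop has finished.
late_bound = [lambda: ep for ep in endpoints]

# Early binding: the default argument captures the value at definition time.
early_bound = [lambda ep=ep: ep for ep in endpoints]

print([f() for f in late_bound])   # ['api.orders.track', 'api.orders.track']
print([f() for f in early_bound])  # ['api.inventory.check', 'api.orders.track']
```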
LlamaIndex Integration¶
from llama_index.core.tools import FunctionTool
from llama_index.core.objects import ObjectRetriever
# Create tools from ontology
def create_llama_tools(gf):
    tools_data = gf.to_dicts("""
        MATCH (t:Tool)
        OPTIONAL MATCH (t)-[:HAS_PARAMETER]->(p:Parameter)
        RETURN t.name AS name,
               t.description AS description,
               collect({name: p.name, type: p.type, required: p.required}) AS params
    """)
    tools = []
    for t in tools_data:
        # Create function signature from ontology
        tool = FunctionTool.from_defaults(
            fn=lambda **kwargs, name=t['name']: execute_tool(name, kwargs),
            name=t['name'],
            description=t['description']
        )
        tools.append(tool)
    return tools

# For dynamic retrieval, subclass ObjectRetriever to use GraphForge as the backend
class GraphForgeToolRetriever(ObjectRetriever):
    def __init__(self, gf):
        self._gf = gf

    def retrieve(self, query_str: str) -> list[FunctionTool]:
        tools_data = self._gf.to_dicts("""
            MATCH (t:Tool)-[:CAN_DO]->(c:Capability)
            WHERE c.name CONTAINS $q OR t.description CONTAINS $q
            WITH DISTINCT t WHERE t.deprecated = false
            RETURN t.name AS name, t.description AS description, t.endpoint AS endpoint
            ORDER BY t.latency_ms ASC LIMIT 5
        """, {'q': query_str})
        return [
            FunctionTool.from_defaults(
                fn=lambda **kwargs, ep=t['endpoint']: call_api(ep, kwargs),
                name=t['name'], description=t['description']
            )
            for t in tools_data
        ]
Custom Agent Implementation¶
class OntologyGroundedAgent:
    """Agent that uses GraphForge ontology for tool grounding."""

    def __init__(self, gf: GraphForge, llm):
        self.gf = gf
        # `llm` is your own wrapper; extract_intent() and select_tool()
        # are methods it is expected to provide.
        self.llm = llm

    def execute(self, user_query: str):
        """Execute user query with ontology-grounded tool selection."""
        # Step 1: Extract intent from user query
        intent = self.llm.extract_intent(user_query)
        # Step 2: Query ontology for relevant tools
        tools = self.find_tools_for_intent(intent)
        # Step 3: LLM selects best tool and parameters
        selected_tool, params = self.llm.select_tool(user_query, tools)
        # Step 4: Execute tool (execute_tool is not defined here; supply your own dispatch)
        result = self.execute_tool(selected_tool, params)
        return result

    def find_tools_for_intent(self, intent):
        """Query ontology for tools matching intent."""
        query = """
            MATCH (t:Tool)-[:CAN_DO]->(c:Capability)
            WHERE c.name CONTAINS $intent OR t.description CONTAINS $intent
            RETURN t.name AS name,
                   t.description AS description,
                   collect(c.name) AS capabilities
        """
        return self.gf.to_dicts(query, {'intent': intent})
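`OntologyGroundedAgent.execute` calls `self.execute_tool`, which the class does not define. One possible dispatch mechanism, sketched here as an assumption rather than GraphForge functionality, is a registry that maps ontology tool names to Python callables. `TOOL_REGISTRY`, `register`, and the fake stock data are all illustrative.

```python
from typing import Any, Callable

# Maps ontology tool names to the Python callables that implement them.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {}


def register(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that records a function under its ontology tool name."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator


@register("check_inventory")
def check_inventory(product_id: str) -> int:
    """Stand-in implementation backed by fake data."""
    fake_stock = {"123": 7}
    return fake_stock.get(product_id, 0)


def execute_tool(tool_name: str, params: dict[str, Any]) -> Any:
    """Look up the implementation by name and call it with keyword arguments."""
    try:
        fn = TOOL_REGISTRY[tool_name]
    except KeyError:
        raise ValueError(f"No implementation registered for tool {tool_name!r}") from None
    return fn(**params)


print(execute_tool("check_inventory", {"product_id": "123"}))  # 7
```

Inside the agent class, `execute_tool` could simply delegate to this function.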
Advanced Patterns¶
Multi-Step Reasoning¶
Chain tools using relationship traversal:
# Find tools that can be composed
query = """
MATCH (t1:Tool)-[:PRODUCES]->(concept:Class)<-[:REQUIRES]-(t2:Tool)
WHERE t1.name = 'search_products'
RETURN t1.name AS first_tool,
concept.name AS intermediate,
t2.name AS next_tool
"""
# Result: search_products -> Product -> check_inventory
# Agent can chain: search_products() |> check_inventory()
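Once rows like these are available, an agent needs to turn them into an ordered plan. The sketch below follows `first_tool -> next_tool` links from a starting tool until no successor remains. It assumes the query is run without the `WHERE t1.name = ...` filter so that all composable pairs are returned; the second row and the `StockLevel` class are made-up sample data.

```python
def build_pipeline(rows: list[dict], start: str) -> list[str]:
    """Follow first_tool -> next_tool links from `start`, stopping on dead ends or cycles."""
    successor: dict[str, str] = {}
    for row in rows:
        # Keep the first successor seen for each tool; a real agent might rank alternatives.
        successor.setdefault(row["first_tool"], row["next_tool"])
    pipeline = [start]
    while pipeline[-1] in successor and successor[pipeline[-1]] not in pipeline:
        pipeline.append(successor[pipeline[-1]])
    return pipeline


rows = [
    {"first_tool": "search_products", "intermediate": "Product", "next_tool": "check_inventory"},
    {"first_tool": "check_inventory", "intermediate": "StockLevel", "next_tool": "place_order"},
]
print(build_pipeline(rows, "search_products"))
# ['search_products', 'check_inventory', 'place_order']
```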
Contextual Tool Selection¶
Select tools based on current conversation context:
def select_contextual_tools(gf, conversation_entities):
    """Find tools relevant to entities mentioned in conversation."""
    query = """
        MATCH (c:Class)<-[:IS_A*0..]-(entity)
        WHERE entity.name IN $entities
        WITH c, entity
        MATCH (t:Tool)-[:OPERATES_ON]->(c)
        RETURN DISTINCT t.name AS tool,
               t.description AS description,
               entity.name AS relevant_to
        ORDER BY tool
    """
    return gf.to_dicts(query, {'entities': conversation_entities})
Permission and Access Control¶
Model tool permissions in the ontology:
# Define user roles and permissions
gf.execute("""
CREATE (:Role {name: 'customer', level: 1})
CREATE (:Role {name: 'admin', level: 10})
""")
gf.execute("""
MATCH (t:Tool {name: 'check_inventory'}), (r:Role {name: 'customer'})
CREATE (r)-[:CAN_USE]->(t)
""")
gf.execute("""
MATCH (t:Tool {name: 'update_inventory'}), (r:Role {name: 'admin'})
CREATE (r)-[:CAN_USE]->(t)
""")
# Filter tools by user role
def get_authorized_tools(gf, user_role):
    return gf.to_dicts("""
        MATCH (r:Role {name: $role})-[:CAN_USE]->(t:Tool)
        RETURN t.name AS tool, t.description AS description
    """, {'role': user_role})
Benefits of GraphForge for Agent Grounding¶
1. Embedded Architecture¶
- No server overhead: GraphForge runs in your Python process
- No network latency: Direct in-memory queries, no network round-trips
- Simple deployment: No ports, no services, no configuration
2. Python-Native Integration¶
- Seamless: Import GraphForge like any Python library
- Type safety: Python objects in, Python objects out
- Debugging: Use standard Python debuggers and tools
3. Development Velocity¶
- Instant setup: `pip install graphforge` and you're running
- Rapid iteration: No server restarts or connection management
- Portable: Works in notebooks, scripts, containers, serverless
4. openCypher Compatibility¶
- Standard queries: Same Cypher syntax as Neo4j
- Transferable skills: Knowledge applies across graph databases
- Rich expressiveness: Full pattern matching, aggregations, paths
5. Research-Friendly¶
- Inspectable: Print graph state, examine queries interactively
- Lightweight: Perfect for experiments and prototypes
- Reproducible: Single-file persistence, easy to version control
Example Use Cases¶
E-commerce Agent¶
- Ontology: Products, Orders, Inventory, Customers
- Tools: search(), check_stock(), place_order(), track_shipment()
- Queries: Find tools to complete purchase flow
Customer Support Agent¶
- Ontology: Issues, Products, Solutions, Procedures
- Tools: search_kb(), create_ticket(), escalate(), get_status()
- Queries: Find resolution tools for issue type
Data Analysis Agent¶
- Ontology: Datasets, Metrics, Visualizations, Transformations
- Tools: load_data(), compute_metric(), plot(), export()
- Queries: Find analysis pipeline for metric
DevOps Agent¶
- Ontology: Services, Deployments, Monitors, Alerts
- Tools: deploy(), rollback(), check_health(), scale()
- Queries: Find remediation tools for alert type
Getting Started¶
Installation¶
pip install graphforge
Quick Example¶
from graphforge import GraphForge
# Create ontology
gf = GraphForge()
# Define domain
gf.execute("""
CREATE (:Class {name: 'Product'}),
(:Class {name: 'Inventory'})
""")
# Add tool (CREATE and MATCH must be separate execute() calls)
gf.execute("CREATE (:Tool {name: 'check_stock', description: 'Check product stock'})")
gf.execute("""
MATCH (t:Tool {name: 'check_stock'}), (i:Class {name: 'Inventory'})
CREATE (t)-[:OPERATES_ON]->(i)
""")
# Query for tool
results = gf.to_dicts("""
MATCH (t:Tool)-[:OPERATES_ON]->(c:Class {name: 'Inventory'})
RETURN t.name AS tool, t.description AS description
""")
print(results)
# [{'tool': 'check_stock', 'description': 'Check product stock'}]
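As the ontology grows, it may be easier to keep classes and `IS_A` links as plain Python data and load them in a loop, issuing the node-creating statements and the MATCH-then-CREATE statements as separate calls as the quick example does. The sketch below assumes `execute` accepts a parameter dictionary as its second argument, as the parameterized calls elsewhere on this page suggest. `CLASSES`, `IS_A_EDGES`, and `load_classes` are illustrative names.

```python
CLASSES = {
    "Entity": "Base class for all domain entities",
    "Product": "Physical or digital product",
    "Inventory": "Stock tracking system",
    "Order": "Customer purchase order",
}
IS_A_EDGES = [("Product", "Entity")]


def load_classes(gf, classes: dict[str, str], is_a: list[tuple[str, str]]) -> int:
    """Create Class nodes and IS_A edges; return the number of statements run."""
    # Validate edges up front so a typo does not leave a half-built graph.
    for child, parent in is_a:
        if child not in classes or parent not in classes:
            raise ValueError(f"IS_A edge references unknown class: {child} -> {parent}")
    statements = 0
    for name, description in classes.items():
        gf.execute(
            "CREATE (:Class {name: $name, description: $description})",
            {"name": name, "description": description},
        )
        statements += 1
    for child, parent in is_a:
        gf.execute(
            "MATCH (c:Class {name: $child}), (p:Class {name: $parent}) "
            "CREATE (c)-[:IS_A]->(p)",
            {"child": child, "parent": parent},
        )
        statements += 1
    return statements
```

Calling `load_classes(gf, CLASSES, IS_A_EDGES)` on a fresh graph would issue five statements: four node creations and one edge.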
Hybrid Search for Tool Selection¶
Cypher WHERE clauses match exactly, by substring (CONTAINS), or by range. For natural-language tool selection, where an agent might describe an intent in any phrasing, `db.search` adds fuzzy text and semantic vector retrieval on top of the ontology.
Index Tool Descriptions for Text Search¶
# After loading the tool ontology, index descriptions for FTS retrieval
# (`db` is assumed to be the GraphForge instance, called `gf` in earlier sections)
rows = db.execute("MATCH (t:Tool) RETURN id(t) AS nid, t.name AS name, t.description AS desc")
for row in rows:
    text = f"{row['name'].value} {row['desc'].value}"
    db.search.index_node(row["nid"].value, text)
Store Vector Embeddings (bring-your-own)¶
import openai
client = openai.OpenAI()
rows = db.execute("MATCH (t:Tool) RETURN id(t) AS nid, t.description AS desc")
for row in rows:
    vec = client.embeddings.create(
        input=row["desc"].value, model="text-embedding-3-small"
    ).data[0].embedding
    db.search.set_node_vector(row["nid"].value, vec, space="text-embedding-3-small")
Agent Tool Selection¶
def select_tools(db, user_intent: str, intent_vec: list[float], top_k: int = 5):
    """Return ranked tools for a given user intent."""
    results = db.search(user_intent, vector=intent_vec, top_k=top_k)
    tools = []
    for hit in results:
        tool_id = hit.ref.id
        # Fetch full tool metadata including parameters via Cypher
        rows = db.execute("""
            MATCH (t:Tool)-[:HAS_PARAMETER]->(p:Parameter)
            WHERE id(t) = $nid
            RETURN t.name AS tool, t.description AS desc,
                   collect(p.name) AS params
        """, {"nid": tool_id})
        if rows:
            tools.append({
                "tool": rows[0]["tool"].value,
                "description": rows[0]["desc"].value,
                "parameters": [v.value for v in rows[0]["params"].value],
                "relevance": hit.score,
                "matched_by": hit.sources,  # ("text",), ("vector",), or ("text", "vector")
            })
    return tools
Why Hybrid Over Pure-Cypher¶
| Approach | Handles phrasing variation | Requires exact keywords | Semantic proximity |
|---|---|---|---|
| `WHERE t.name = $name` | No | Yes | No |
| `WHERE t.name CONTAINS $term` | Partial | Yes | No |
| `db.search.text(intent)` | Yes (BM25) | No | No |
| `db.search(intent, vector=vec)` | Yes (BM25 + cosine) | No | Yes |
For ontology-grounded agents, hybrid retrieval narrows the candidate tool list before Cypher fetches the full structured metadata, which combines the strengths of both approaches.
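The "cosine" entry in the table refers to cosine similarity between the intent embedding and each stored tool embedding. As a reminder of what that measures, here is a minimal plain-Python version; it illustrates the metric only and makes no claim about how GraphForge computes or fuses scores internally.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Because the score depends on direction rather than magnitude, two descriptions with different wording but similar meaning can still rank close together.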
Next Steps¶
- Explore the example: See `examples/agent_grounding/` for a complete working example
- Design your ontology: Model your domain classes and relationships
- Annotate your tools: Map tools to domain concepts
- Build query patterns: Create semantic queries for tool selection
- Integrate with LLM: Connect to your agent framework
Conclusion¶
GraphForge provides the ideal foundation for AI agent grounding:
- Simple: Embedded architecture eliminates deployment complexity
- Powerful: Full openCypher query expressiveness
- Fast: Direct in-process access, no network latency
- Flexible: Python-native integration with any agent framework
Start building grounded agents today with zero infrastructure overhead.