OpenCypher Feature Graph - Example Queries

Practical Cypher queries for analyzing the OpenCypher feature knowledge graph.

Prerequisites: Build the graph with python scripts/build_feature_graph.py

Graph Location: docs/feature-graph.db

Getting Started

from graphforge import GraphForge

# Open the feature graph
db = GraphForge('docs/feature-graph.db')

# Run queries
results = db.execute("""
    MATCH (f:Feature)
    RETURN f.name, f.category, f.subcategory
    LIMIT 10
""")

for row in results:
    print(f"{row['f.name'].value} ({row['f.category'].value})")
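Unwrapping `.value` column by column gets repetitive once a query returns more than a few fields. The following is a minimal sketch of a helper that turns one result row into a plain dict. It assumes, as in the snippet above, that a row maps column names to wrapper objects exposing `.value`; `unwrap_row` is a hypothetical name and not part of GraphForge.

```python
def unwrap_row(row, columns):
    """Return {column: plain value} for the requested columns.

    Assumes row[column] is an object with a `.value` attribute, as in the
    Getting Started snippet. Cells without `.value` are passed through as-is.
    """
    return {col: getattr(row[col], "value", row[col]) for col in columns}
```

With the query above, `unwrap_row(row, ['f.name', 'f.category'])` would give a dict that is easier to print or serialize.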

Feature Discovery Queries

1. List All Features by Category

Find all features organized by their category.

MATCH (c:Category)<-[:BELONGS_TO_CATEGORY]-(f:Feature)
RETURN c.name AS category,
       collect(f.name) AS features,
       count(f) AS feature_count
ORDER BY feature_count DESC

Use Case: Get an overview of feature distribution across categories

2. Find All Complete Features

List all features that are fully implemented.

MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation {status: 'complete'})
RETURN f.name AS feature,
       f.category AS category,
       i.file_path AS implementation
ORDER BY category, feature

Use Case: Identify what's fully working for documentation or user guides

3. Find All Incomplete Features

List features that are partially implemented or not implemented.

MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation)
WHERE i.status IN ['partial', 'not_implemented']
RETURN f.name AS feature,
       f.category AS category,
       i.status AS status
ORDER BY i.status, category, feature

Use Case: Identify work remaining for roadmap planning

Implementation Status Queries

4. Implementation Status by Category

Calculate completion percentage for each category.

MATCH (c:Category)<-[:BELONGS_TO_CATEGORY]-(f:Feature)
OPTIONAL MATCH (f)-[:IMPLEMENTED_IN]->(i:Implementation)
WITH c,
     count(f) AS total_features,
     sum(CASE WHEN i.status = 'complete' THEN 1 ELSE 0 END) AS complete,
     sum(CASE WHEN i.status = 'partial' THEN 1 ELSE 0 END) AS partial,
     sum(CASE WHEN i.status = 'not_implemented' OR i IS NULL THEN 1 ELSE 0 END) AS not_impl
RETURN c.name AS category,
       total_features,
       complete,
       partial,
       not_impl,
       round(100.0 * complete / total_features, 1) AS completion_pct
ORDER BY completion_pct DESC

Use Case: Track progress toward full OpenCypher compliance by category

Expected Output:

| category             | total_features | complete | partial | not_impl | completion_pct |
|----------------------|----------------|----------|---------|----------|----------------|
| Temporal Functions   | 11             | 11       | 0       | 0        | 100.0          |
| Spatial Functions    | 2              | 2        | 0       | 0        | 100.0          |
| Comparison Operators | 8              | 8        | 0       | 0        | 100.0          |
| Reading Clauses      | 2              | 2        | 0       | 0        | 100.0          |
| ...                  | ...            | ...      | ...     | ...      | ...            |
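The percentage column is `100.0 * complete / total_features`, rounded to one decimal place. As a sketch, the same tally can be reproduced in Python for a single category. Each list entry is one feature's implementation status, and `None` stands for a feature with no `Implementation` node (the `i IS NULL` branch of the query). The function name and the sample data are illustrative only.

```python
def completion_stats(statuses):
    """Mirror the CASE/sum logic of query 4 for one category."""
    total = len(statuses)
    complete = sum(1 for s in statuses if s == "complete")
    partial = sum(1 for s in statuses if s == "partial")
    # A missing implementation record counts as not implemented.
    not_impl = sum(1 for s in statuses if s == "not_implemented" or s is None)
    pct = round(100.0 * complete / total, 1) if total else 0.0
    return {
        "total_features": total,
        "complete": complete,
        "partial": partial,
        "not_impl": not_impl,
        "completion_pct": pct,
    }


# Hypothetical category: 5 complete, 2 partial, 1 without a record.
sample = ["complete"] * 5 + ["partial"] * 2 + [None]
stats = completion_stats(sample)
print(stats["completion_pct"])  # 62.5
```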

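5. Find High-Priority Implementation Gaps

Not-implemented features listed with their category. The query does not use TCK scenario data (see Extending the Graph for adding it), so category is only a rough proxy for TCK coverage.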

MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation {status: 'not_implemented'})
MATCH (f)-[:BELONGS_TO_CATEGORY]->(c:Category)
RETURN f.name AS feature,
       c.name AS category,
       f.category AS type
ORDER BY category, feature
LIMIT 20

Use Case: Prioritize next features to implement

6. Find Partial Implementations

Features that are partially complete and need finishing.

MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation {status: 'partial'})
RETURN f.name AS feature,
       f.category AS category,
       i.file_path AS file,
       i.notes AS notes
ORDER BY category, feature

Use Case: Find incomplete work to finish

Category Analysis Queries

7. Most Complete Categories

Categories sorted by completion percentage.

MATCH (c:Category)<-[:BELONGS_TO_CATEGORY]-(f:Feature)
OPTIONAL MATCH (f)-[:IMPLEMENTED_IN]->(i:Implementation {status: 'complete'})
WITH c, count(DISTINCT f) AS total, count(DISTINCT i) AS complete
WHERE total > 0
RETURN c.name AS category,
       total AS features,
       complete,
       round(100.0 * complete / total, 1) AS pct
ORDER BY pct DESC, total DESC

Use Case: Identify strongest areas of OpenCypher support

8. Least Complete Categories

Categories needing the most work.

MATCH (c:Category)<-[:BELONGS_TO_CATEGORY]-(f:Feature)
OPTIONAL MATCH (f)-[:IMPLEMENTED_IN]->(i:Implementation {status: 'complete'})
WITH c, count(DISTINCT f) AS total, count(DISTINCT i) AS complete
WHERE total > 0
RETURN c.name AS category,
       total AS features,
       complete,
       total - complete AS remaining,
       round(100.0 * complete / total, 1) AS pct
ORDER BY pct ASC, remaining DESC
LIMIT 10

Use Case: Identify categories needing focus

Feature Comparison Queries

9. Compare Clause vs Function Implementation

Compare implementation rates across feature types, grouped by the f.category property.

MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation)
WITH f.category AS type,
     count(f) AS total,
     sum(CASE WHEN i.status = 'complete' THEN 1 ELSE 0 END) AS complete
RETURN type,
       total,
       complete,
       round(100.0 * complete / total, 1) AS pct
ORDER BY pct DESC

Use Case: Compare progress across feature types (clauses vs functions vs operators)

10. Find Features Without Implementations

Features that don't have any implementation records (data quality check).

MATCH (f:Feature)
WHERE NOT EXISTS((f)-[:IMPLEMENTED_IN]->())
RETURN f.name AS feature,
       f.category AS category,
       f.subcategory AS subcategory
ORDER BY category, feature

Use Case: Data quality validation - identify missing implementation status

File and Code Reference Queries

11. Features by Implementation File

Group features by their implementation file location.

MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation)
WHERE i.file_path IS NOT NULL
WITH i.file_path AS file,
     collect(f.name) AS features,
     count(f) AS feature_count
RETURN file, feature_count, features
ORDER BY feature_count DESC

Use Case: Understand code organization and feature clustering

12. Find All Features in a Specific File

Get all features implemented in a specific file (e.g., evaluator.py).

MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation)
WHERE i.file_path CONTAINS 'evaluator.py'
RETURN f.name AS feature,
       f.category AS category,
       i.status AS status,
       i.file_path AS file
ORDER BY category, feature

Use Case: Understand what a specific file implements
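To reuse this query for other files from Python, the file name has to be embedded as a Cypher string literal. The sketch below builds the query text with basic escaping of backslashes and single quotes. Both function names are hypothetical. If the engine supports query parameters, passing the file name as a parameter would be preferable to string building; whether GraphForge does is not assumed here.

```python
def cypher_string(text):
    """Quote text as a single-quoted Cypher string literal."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def features_in_file_query(filename):
    """Build query 12 for an arbitrary file name fragment."""
    return (
        "MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation)\n"
        f"WHERE i.file_path CONTAINS {cypher_string(filename)}\n"
        "RETURN f.name AS feature,\n"
        "       f.category AS category,\n"
        "       i.status AS status,\n"
        "       i.file_path AS file\n"
        "ORDER BY category, feature"
    )
```

The resulting string can be passed to `db.execute(...)` in the same way as the literal queries in this document.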

Roadmap Planning Queries

13. Generate v0.5.0 Roadmap

Not-implemented features grouped by category, largest groups first.

MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation {status: 'not_implemented'})
MATCH (f)-[:BELONGS_TO_CATEGORY]->(c:Category)
RETURN c.name AS category,
       collect(f.name) AS features,
       count(f) AS count
ORDER BY count DESC

Use Case: Plan next release features grouped by category

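14. Features to Complete for Full Temporal Support

Temporal Functions is reported as 100% complete. This query lists every feature in that category with its status, so any feature without an implementation record shows up as 'unknown'.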

MATCH (c:Category {name: 'Temporal Functions'})<-[:BELONGS_TO_CATEGORY]-(f:Feature)
OPTIONAL MATCH (f)-[:IMPLEMENTED_IN]->(i:Implementation)
RETURN f.name AS feature,
       COALESCE(i.status, 'unknown') AS status
ORDER BY status, feature

Use Case: Verify complete category coverage

15. Find "Low-Hanging Fruit" Features

Simple features that could be implemented quickly (predicates).

MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation {status: 'not_implemented'})
WHERE f.category = 'function' AND f.subcategory = 'predicate'
RETURN f.name AS feature,
       f.subcategory AS type
ORDER BY feature

Use Case: Find quick wins for next sprint

Data Quality and Validation Queries

16. Validate Feature-Category Relationships

Ensure all features belong to a category.

MATCH (f:Feature)
OPTIONAL MATCH (f)-[:BELONGS_TO_CATEGORY]->(c:Category)
WITH f, c
WHERE c IS NULL
RETURN f.name AS orphan_feature,
       f.category AS category,
       f.subcategory AS subcategory

Use Case: Data quality check - find orphaned features
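Queries 10 and 16 are both checks that should return no rows on a healthy graph, so they lend themselves to being run together. The following is a minimal sketch of such a runner. `run_checks` and the check names are hypothetical, and `execute` is assumed to be any callable that takes a Cypher string and returns an iterable of rows, for example `db.execute`.

```python
CHECKS = {
    # Query 10: features with no implementation record.
    "features_without_implementation": """
        MATCH (f:Feature)
        WHERE NOT EXISTS((f)-[:IMPLEMENTED_IN]->())
        RETURN f.name AS feature
    """,
    # Query 16: features that belong to no category.
    "features_without_category": """
        MATCH (f:Feature)
        OPTIONAL MATCH (f)-[:BELONGS_TO_CATEGORY]->(c:Category)
        WITH f, c
        WHERE c IS NULL
        RETURN f.name AS feature
    """,
}


def run_checks(execute, checks=CHECKS):
    """Return {check name: number of offending rows} for each check query."""
    return {name: sum(1 for _ in execute(query)) for name, query in checks.items()}
```

A call such as `run_checks(db.execute)` would then report a count per check, with any non-zero count pointing at records to fix.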

17. Count Nodes by Type

Basic graph statistics.

MATCH (n)
RETURN labels(n)[0] AS node_type,
       count(n) AS count
ORDER BY count DESC

Use Case: Graph health check

18. Count Relationships by Type

Relationship distribution.

MATCH ()-[r]->()
RETURN type(r) AS relationship_type,
       count(r) AS count
ORDER BY count DESC

Use Case: Graph structure validation

Advanced Analytical Queries

19. Feature Completion Trend Analysis

Identify which feature types are most/least complete.

MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation)
WITH f.category AS category,
     f.subcategory AS subcategory,
     count(f) AS total,
     sum(CASE WHEN i.status = 'complete' THEN 1 ELSE 0 END) AS complete,
     sum(CASE WHEN i.status = 'partial' THEN 1 ELSE 0 END) AS partial
WHERE total >= 3
RETURN category,
       subcategory,
       total,
       complete,
       partial,
       round(100.0 * complete / total, 1) AS pct
ORDER BY pct DESC, total DESC

Use Case: Detailed completion analysis by subcategory

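20. Generate Priority Matrix

Not-implemented features bucketed into HIGH, MEDIUM, and LOW priority. The priority is a heuristic derived from the category name alone; the query uses no stored complexity or importance data.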

MATCH (f:Feature)-[:IMPLEMENTED_IN]->(i:Implementation {status: 'not_implemented'})
MATCH (f)-[:BELONGS_TO_CATEGORY]->(c:Category)
WITH f, c,
     CASE
       WHEN c.name CONTAINS 'Predicate' THEN 'HIGH'
       WHEN c.name CONTAINS 'List' THEN 'MEDIUM'
       WHEN c.name CONTAINS 'Mathematical' THEN 'LOW'
       ELSE 'MEDIUM'
     END AS priority
RETURN priority,
       c.name AS category,
       collect(f.name) AS features,
       count(f) AS count
ORDER BY
  CASE priority
    WHEN 'HIGH' THEN 1
    WHEN 'MEDIUM' THEN 2
    ELSE 3
  END,
  count DESC

Use Case: Prioritized implementation backlog
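The priority rules are easier to adjust when written out procedurally. As a sketch, the same mapping and ordering in Python: the first matching rule wins, exactly as in the `CASE` expression, and substring matching is case-sensitive like `CONTAINS`. The function names and the category names in the usage note are illustrative assumptions, not values taken from the graph.

```python
def priority_for(category_name):
    """Mirror the CASE expression in query 20 (first matching rule wins)."""
    if "Predicate" in category_name:
        return "HIGH"
    if "List" in category_name:
        return "MEDIUM"
    if "Mathematical" in category_name:
        return "LOW"
    return "MEDIUM"


# Anything not listed sorts last, like the ELSE 3 branch of the ORDER BY.
PRIORITY_RANK = {"HIGH": 1, "MEDIUM": 2}


def sort_backlog(rows):
    """Sort (category_name, feature_count) pairs by priority, then count descending."""
    return sorted(
        rows,
        key=lambda row: (PRIORITY_RANK.get(priority_for(row[0]), 3), -row[1]),
    )
```

For example, `sort_backlog([("Mathematical Functions", 9), ("Predicate Functions", 1)])` puts the predicate category first even though it has fewer features.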

Usage Tips

Performance

For large graphs, add indexes on frequently queried properties
Use LIMIT when exploring to avoid large result sets
Profile queries with PROFILE or EXPLAIN (if supported)

Querying Patterns

Start with simple MATCH (n:Label) RETURN n LIMIT 10 to explore
Use OPTIONAL MATCH for features that may not have all relationships
Aggregate with collect() to group related features

Combining with Documentation

# Get incomplete functions and look up documentation
results = db.execute("""
    MATCH (f:Feature {category: 'function'})-[:IMPLEMENTED_IN]->(i:Implementation)
    WHERE i.status = 'not_implemented'
    RETURN f.name, f.subcategory
    ORDER BY f.subcategory, f.name
""")

for row in results:
    name = row['f.name'].value
    subcategory = row['f.subcategory'].value
    print(f"{name} ({subcategory})")
    # Then look up details in docs/reference/opencypher-features/02-functions.md
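When the list is long, grouping the names by subcategory makes the documentation lookup easier. The sketch below does the grouping on plain `(name, subcategory)` tuples. `group_by_subcategory` is a hypothetical helper, and it assumes every feature has a string subcategory.

```python
from collections import defaultdict


def group_by_subcategory(pairs):
    """Turn (name, subcategory) pairs into {subcategory: sorted names}."""
    groups = defaultdict(list)
    for name, subcategory in pairs:
        groups[subcategory].append(name)
    # Sort subcategories and names so the output is stable between runs.
    return {sub: sorted(names) for sub, names in sorted(groups.items())}
```

With the loop above, the pairs could be produced as `(row['f.name'].value, row['f.subcategory'].value) for row in results`.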

Extending the Graph

Adding TCK Scenario Nodes

To enhance the graph with actual TCK scenario data:

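# Parse the TCK inventory and create scenario nodes
from pathlib import Path

from graphforge import GraphForge

db = GraphForge('docs/feature-graph.db')
tck_inventory = Path('docs/reference/tck-inventory.md').read_text(encoding='utf-8')

# Extract scenarios from tck_inventory (parsing not shown),
# then create one TCKScenario node per scenario and link it
# to the feature it tests with a TESTED_BY relationship.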

db.execute("""
    CREATE (t:TCKScenario {
        name: 'Match simple node pattern',
        feature_file: 'tests/tck/features/official/clauses/match/Match1.feature',
        status: 'passing'
    })
""")

# Link to feature
db.execute("""
    MATCH (f:Feature {name: 'MATCH'}), (t:TCKScenario {name: 'Match simple node pattern'})
    CREATE (f)-[:TESTED_BY {coverage_type: 'basic'}]->(t)
""")

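The scenario name in the `CREATE` statement above has to come from somewhere. Since the layout of `tck-inventory.md` is not described here, the sketch below instead reads scenario titles straight from Gherkin `.feature` text, where each scenario starts with a `Scenario:` or `Scenario Outline:` line. `scenario_names` is a hypothetical helper.

```python
import re

# Matches "Scenario: <title>" and "Scenario Outline: <title>" lines.
SCENARIO_RE = re.compile(
    r"^[ \t]*Scenario(?: Outline)?:[ \t]*(.+?)[ \t]*$",
    re.MULTILINE,
)


def scenario_names(feature_text):
    """Return the scenario titles found in Gherkin feature text, in file order."""
    return SCENARIO_RE.findall(feature_text)
```

Each returned title could then feed one `CREATE (t:TCKScenario {...})` statement. Titles containing single quotes would need escaping before being placed in a Cypher string literal.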
Adding Feature Dependencies

Model dependencies between features:

// WITH and RETURN are reserved words, so the variables use short names instead
MATCH (w:Feature {name: 'WITH'}), (r:Feature {name: 'RETURN'})
CREATE (w)-[:DEPENDS_ON {
  dependency_type: 'required',
  reason: 'WITH requires RETURN-like projection syntax'
}]->(r)
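Once `DEPENDS_ON` relationships exist, they can be read back as `(feature, prerequisite)` name pairs and turned into an implementation order. The sketch below uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+). `implementation_order` is a hypothetical helper, and the pairs are assumed to come from a query such as `MATCH (a:Feature)-[:DEPENDS_ON]->(b:Feature) RETURN a.name, b.name`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def implementation_order(dependencies):
    """Order features so that every prerequisite comes before its dependents.

    dependencies: iterable of (feature, prerequisite) name pairs.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    graph = {}
    for feature, prerequisite in dependencies:
        graph.setdefault(feature, set()).add(prerequisite)
    return list(TopologicalSorter(graph).static_order())
```

For the single dependency modeled above, `implementation_order([("WITH", "RETURN")])` yields `RETURN` before `WITH`.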

References

Graph Schema: docs/reference/feature-graph-schema.md
Compatibility Matrix: docs/reference/opencypher-compatibility-matrix.md
Build Script: scripts/build_feature_graph.py
GraphForge Documentation: docs/