Rye — Overview
See how Rye unifies entities, relationships, events, and assertions for better decisions.
Source file: design/model/overview.md
Why a Graph, and Why in PostgreSQL
1. The Core Challenge
Operational systems store data in silos. A CRM tracks contacts. A project manager tracks tasks. A support tool tracks tickets. A billing system tracks subscriptions. Each system is optimized for its own workflow, but none of them answer the question that drives decisions:
“Given this entity, who is connected to it, what has happened, what do we currently believe about it, what has changed, and how does it relate to everything else?”
That question spans every silo. Answering it today requires a human to manually cross-reference systems, hold relationships in their head, and hope nothing was missed. When that person leaves, the institutional knowledge leaves with them.
Rye solves this by providing a single queryable structure that captures entities, their relationships, what happened, and what we believe — without replacing the systems that already work.
2. Why a Graph
The data is fundamentally a graph. Entities (people, companies, parcels, tickets, projects) connect through relationships (owns, employs, targets, assigned_to, references). Those relationships have properties — an ownership has a fractional interest, an employment has a start date. The relationships are as important as the entities.
A traditional normalized schema can model known, stable relationships. But operational domains have characteristics that make rigid schemas brittle:
- Entity types are not fully known upfront. Today it’s customers and tickets. Tomorrow it’s partners, integrations, and regulatory filings.
- Relationship types are not fully known upfront. “Referred by,” “escalated from,” “blocks,” “successor to” — these emerge as the domain is explored.
- The same real-world entity exists in multiple systems. A person is a CRM contact, a support ticket requester, a billing account owner, and a Slack user.
- Relationships change over time. People change roles. Deals advance. Subscriptions churn and reactivate.
- Later facts contradict earlier ones. A data correction reveals that a customer’s plan was miscategorized. The system must preserve both the old belief and the new one.
3. Why PostgreSQL
A dedicated graph database (Neo4j, Neptune) is a valid choice, but PostgreSQL with JSONB and lightweight graph tables offers significant advantages:
- Single operational database. No synchronization between a relational store and a graph store. Domain tables and the graph live in the same database, queryable in the same transaction.
- JSONB with GIN indexes. Flexible, schema-on-read storage with indexed access to any field. New properties require no migration.
- Row-Level Security. Native RLS enforces access control at the database engine level, not the application level.
- Mature ecosystem. Triggers, materialized views, CTEs, window functions, and the full SQL toolkit.
- Recursive CTEs handle graph traversal adequately for millions of edges. If traversal becomes a bottleneck, Apache AGE adds openCypher without leaving PostgreSQL.
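The recursive-CTE traversal mentioned above can be sketched in a few lines. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in so it runs without a PostgreSQL server, and the edges columns (src_id, dst_id, edge_type) are assumed names, not necessarily Rye's schema. The WITH RECURSIVE query has the same shape in PostgreSQL, although parameter placeholders depend on the driver.

```python
import sqlite3

# SQLite stands in for PostgreSQL so the sketch runs without a server.
# Table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edges (
        src_id    TEXT NOT NULL,
        dst_id    TEXT NOT NULL,
        edge_type TEXT NOT NULL
    );
    INSERT INTO edges (src_id, dst_id, edge_type) VALUES
        ('alice',    'acme',      'employed_by'),
        ('acme',     'parcel-7',  'owns'),
        ('parcel-7', 'ticket-42', 'references'),
        ('bob',      'acme',      'employed_by');
""")

# Follow outgoing edges from a start node, up to a maximum depth.
# UNION (not UNION ALL) drops duplicate rows, and the depth bound
# guarantees termination even if the graph contains cycles.
TRAVERSAL = """
WITH RECURSIVE reachable(node_id, depth) AS (
    SELECT ?, 0
    UNION
    SELECT e.dst_id, r.depth + 1
    FROM reachable r
    JOIN edges e ON e.src_id = r.node_id
    WHERE r.depth < ?
)
SELECT node_id, MIN(depth) AS min_depth
FROM reachable
GROUP BY node_id
ORDER BY min_depth, node_id;
"""

def reachable_from(start, max_depth):
    """Return (node_id, shortest_depth) pairs reachable from start."""
    return conn.execute(TRAVERSAL, (start, max_depth)).fetchall()

rows = reachable_from("alice", 3)
```

Because the edges are directed, bob does not appear in the result for alice even though both are linked to acme; traversing in both directions would need a second branch in the recursive term.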
4. Conceptual Architecture
The data model has six tables organized into three layers:
Layer 1: The Graph (Structure)
- nodes: Entities (vertices) such as people, companies, projects, tickets, parcels, and documents
- edges: Directed relationships between entities, with optional temporal bounds
Layer 2: The Event Log (Activity)
- events: Immutable record of things that happened, such as emails, calls, status changes, and imports
- event_participants: Junction table linking events to the nodes involved
Layer 3: The Knowledge Layer (Intelligence)
- assertions: Time-versioned facts that can be superseded, such as valuations, statuses, and opinions
- artifacts: Extracted content, document references, and structured data products
Supporting tables handle security, integration, and deduplication:
- access_grants: Runtime-configurable permissions
- field_classifications: Field-level sensitivity metadata
- node_source_map: Maps graph nodes to records in domain tables
- node_merges: Tracks entity deduplication decisions
- crm_code_counters: Human-readable code generation
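To make the three layers concrete, here is a rough sketch of how the six core tables might fit together to answer the question from section 1. Every column name below is an assumption made for illustration; the authoritative definitions live in the Schema Reference. SQLite (via Python's sqlite3) stands in for PostgreSQL so the example runs anywhere, which means TEXT replaces JSONB and timestamp types.

```python
import sqlite3

# Assumed, simplified columns for the six core tables; the real DDL may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (id TEXT PRIMARY KEY, node_type TEXT NOT NULL,
                    properties TEXT NOT NULL DEFAULT '{}');
CREATE TABLE edges (id INTEGER PRIMARY KEY, src_id TEXT NOT NULL,
                    dst_id TEXT NOT NULL, edge_type TEXT NOT NULL,
                    valid_from TEXT, valid_to TEXT);
CREATE TABLE events (id INTEGER PRIMARY KEY, event_type TEXT NOT NULL,
                     occurred_at TEXT NOT NULL);
CREATE TABLE event_participants (event_id INTEGER NOT NULL,
                                 node_id TEXT NOT NULL);
CREATE TABLE assertions (id INTEGER PRIMARY KEY, node_id TEXT NOT NULL,
                         claim TEXT NOT NULL, value TEXT NOT NULL,
                         asserted_at TEXT NOT NULL);
CREATE TABLE artifacts (id INTEGER PRIMARY KEY, node_id TEXT NOT NULL,
                        content TEXT NOT NULL);

INSERT INTO nodes (id, node_type) VALUES ('alice', 'person'), ('acme', 'company');
INSERT INTO edges (src_id, dst_id, edge_type, valid_from)
    VALUES ('acme', 'alice', 'employs', '2024-01-15');
INSERT INTO events (id, event_type, occurred_at)
    VALUES (1, 'call', '2024-03-01T10:00:00Z');
INSERT INTO event_participants (event_id, node_id) VALUES (1, 'alice'), (1, 'acme');
INSERT INTO assertions (node_id, claim, value, asserted_at)
    VALUES ('acme', 'plan', 'enterprise', '2024-03-02T09:00:00Z');
""")

def profile(node_id):
    """Collect one entity's neighbors, events, and assertions across the layers."""
    # Layer 1: edges in either direction.
    neighbors = conn.execute(
        """SELECT dst_id AS other, edge_type FROM edges WHERE src_id = ?
           UNION
           SELECT src_id AS other, edge_type FROM edges WHERE dst_id = ?
           ORDER BY other""",
        (node_id, node_id),
    ).fetchall()
    # Layer 2: events the node participated in.
    events = conn.execute(
        """SELECT e.event_type, e.occurred_at
           FROM event_participants p
           JOIN events e ON e.id = p.event_id
           WHERE p.node_id = ?
           ORDER BY e.occurred_at""",
        (node_id,),
    ).fetchall()
    # Layer 3: what has been asserted about the node.
    beliefs = conn.execute(
        "SELECT claim, value FROM assertions WHERE node_id = ? ORDER BY asserted_at",
        (node_id,),
    ).fetchall()
    return {"neighbors": neighbors, "events": events, "beliefs": beliefs}

acme = profile("acme")
```

The point of the sketch is that one entity identifier is enough to pull structure, activity, and beliefs from the same database, with no cross-system lookup.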
5. Design Principles
- Append-only safety. Assertions are never mutated — only superseded. Events are immutable. You can’t corrupt history, only build on it.
- Overlay architecture. The graph points to your domain tables. Your domain tables don’t know the graph exists. Drop the graph schema and all operational systems continue.
- Temporal by default. Every fact has a timestamp and provenance. You always know what you believed and when.
- Agent-native. The schema is structured for LLM agents to read, write, and traverse through natural language. Agents insert facts; they never overwrite or delete.
- Convention over schema. New entity types, relationship types, and properties require no migration. Write a new node_type value and it exists.
- Single auth model. Access control uses session variables (SET LOCAL "app.current_role" = ...) consistently. No mixing of session-based and database-role-based enforcement.
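The append-only principle can be illustrated with a minimal sketch of supersession. It assumes one possible design, in which a new assertion row points at the row it replaces through a hypothetical supersedes_id column, so the old row is never touched. The claim and source columns are also assumptions, and SQLite triggers stand in for whatever mechanism a PostgreSQL deployment would use to reject updates and deletes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assertions (
    id            INTEGER PRIMARY KEY,
    node_id       TEXT NOT NULL,
    claim         TEXT NOT NULL,
    value         TEXT NOT NULL,
    asserted_at   TEXT NOT NULL,   -- when the fact was recorded
    source        TEXT NOT NULL,   -- provenance
    supersedes_id INTEGER REFERENCES assertions(id)
);
-- Reject in-place changes so history can only grow.
CREATE TRIGGER assertions_no_update BEFORE UPDATE ON assertions
BEGIN SELECT RAISE(ABORT, 'assertions are append-only'); END;
CREATE TRIGGER assertions_no_delete BEFORE DELETE ON assertions
BEGIN SELECT RAISE(ABORT, 'assertions are append-only'); END;
""")

def assert_fact(node_id, claim, value, asserted_at, source, supersedes_id=None):
    """Insert a new assertion and return its id; never modifies old rows."""
    cur = conn.execute(
        """INSERT INTO assertions
               (node_id, claim, value, asserted_at, source, supersedes_id)
           VALUES (?, ?, ?, ?, ?, ?)""",
        (node_id, claim, value, asserted_at, source, supersedes_id),
    )
    return cur.lastrowid

def current_belief(node_id, claim):
    """Return the value of the assertion that no later row supersedes."""
    row = conn.execute(
        """SELECT a.value FROM assertions a
           WHERE a.node_id = ? AND a.claim = ?
             AND NOT EXISTS (SELECT 1 FROM assertions newer
                             WHERE newer.supersedes_id = a.id)""",
        (node_id, claim),
    ).fetchone()
    return row[0] if row else None

def history(node_id, claim):
    """Return every value ever asserted, oldest first."""
    rows = conn.execute(
        """SELECT value FROM assertions
           WHERE node_id = ? AND claim = ?
           ORDER BY asserted_at, id""",
        (node_id, claim),
    ).fetchall()
    return [r[0] for r in rows]

# A data correction: the plan was miscategorized, so a new row supersedes it.
first = assert_fact("acme", "plan", "starter", "2024-01-01T00:00:00Z", "import")
assert_fact("acme", "plan", "enterprise", "2024-02-01T00:00:00Z",
            "correction", supersedes_id=first)
```

After the correction, current_belief("acme", "plan") reflects the new row while history("acme", "plan") still shows both values, which is the "preserve both the old belief and the new one" behavior described in section 2.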
6. Related Documents
- Data Dictionary — Every table, view, and function: what it does and why
- Core Contract and Conformance Kit — Normative contract, implementation checklist, and test matrix
- Schema Reference — Table definitions and DDL
- Functions Reference — Utility functions and query patterns
- Security — RLS policies and field-level redaction
- Integration — Domain table overlay and change tracking
- CRM Conventions — Contact, opportunity, and pipeline conventions
- PM Conventions — Task, project, and sprint conventions