AI Layer
The AI layer is the platform's sovereign intelligence — a personal AI agent with privileged access to all user data, constrained entirely by user-configured consent rules. It never operates outside the user's explicit permissions, every LLM call is audited, and write actions require human confirmation.
The data model, event system, embeddings pipeline, and pgvector are built from the start — so when the AI agent ships, it has a full indexed history to work with.
Four Components
┌─────────────────────────────────────────────────────────────┐
│ AI LAYER │
│ │
│ ┌─────────────────┐ ┌──────────────────────────────┐ │
│ │ Agent Service │ │ LLM Router │ │
│ │ │ │ OpenRouter → Cloud APIs │ │
│ │ Conversation │───►│ Ollama → Local models │ │
│ │ RAG Engine │ │ User routing rules │ │
│ │ Context Mgmt │ │ Budget enforcement │ │
│ └─────────────────┘ └──────────────────────────────┘ │
│ │
│ ┌─────────────────┐ ┌──────────────────────────────┐ │
│ │ Workflow Engine │ │ Embedding Service │ │
│ │ │ │ │ │
│ │ YAML workflows │ │ Write-time: embed entities │ │
│ │ Step execution │ │ pgvector storage │ │
│ │ Human-in-loop │ │ Semantic search index │ │
│ └─────────────────┘ └──────────────────────────────┘ │
│ ▲ │
│ │ decomposes │
│ ┌─────────────────┐ │
│ │ Agentic Planner │ │
│ │ NL → workflow │ │
│ └─────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Agent Service
The Agent Service manages conversations and orchestrates the RAG pipeline.
Conversation Model
interface AgentConversation {
conversationId: string;
userId: string;
messages: AgentMessage[];
context: {
currentLocation?: string; // from Present plane
activeCalendarEvents?: Event[];
recentActivity?: Activity[];
};
}
interface AgentMessage {
role: 'user' | 'assistant' | 'system';
content: string;
citations?: Citation[]; // RAG source references
requiresConfirmation?: boolean; // for write actions
}
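The context object above is typically flattened into text before it reaches the model. A minimal sketch of that step, assuming simplified field shapes; `formatContext` is an illustrative helper, not a platform API:

```typescript
// Flatten the per-conversation context into a block prepended to the
// system prompt. Field names mirror AgentConversation.context above;
// the event/activity shapes are simplified for illustration.
interface ContextSnapshot {
  currentLocation?: string;
  activeCalendarEvents?: { title: string; startsAt: string }[];
  recentActivity?: { summary: string }[];
}

function formatContext(ctx: ContextSnapshot): string {
  const lines: string[] = [];
  if (ctx.currentLocation) lines.push(`Current location: ${ctx.currentLocation}`);
  for (const ev of ctx.activeCalendarEvents ?? []) {
    lines.push(`Active event: ${ev.title} (starts ${ev.startsAt})`);
  }
  for (const act of ctx.recentActivity ?? []) {
    lines.push(`Recent: ${act.summary}`);
  }
  return lines.join('\n');
}
```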
RAG Engine Pipeline
User query
│
▼
1. Query Expansion
└─ Generate structured sub-queries from natural language
e.g., "what did Sarah email me?" →
query: messages WHERE sender LIKE Sarah AND direction=inbound
query: persons WHERE name LIKE Sarah
▼
2. Parallel Retrieval
├── pgvector: cosine similarity search (semantic)
├── Meilisearch: full-text search
└── PostgreSQL: structured queries (date ranges, person IDs, channel types)
▼
3. Re-ranking
└─ Score all results by relevance to original query
Boost recent results, boost exact name matches
▼
4. Context Assembly
└─ Select top-k results within token budget
Format as structured context block for LLM
▼
5. LLM Call (via LLM Router)
└─ System prompt + user context + user query → response
▼
6. Response with Citations
└─ Stream response to user
Attach source references (message IDs, document IDs)
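The retrieval, re-ranking, and assembly stages above can be sketched as one async orchestrator. The retriever and LLM signatures, the recency-boost weights, and the 4-chars-per-token heuristic are illustrative assumptions, not the actual service contracts:

```typescript
// Stages 2-6 of the RAG pipeline: fan out to retrievers, re-rank with a
// recency boost, assemble context within a token budget, call the LLM,
// and return citations (the source entity IDs).
interface Retrieved { entityId: string; text: string; score: number; createdAt: number }

function recencyBoost(createdAt: number, now: number): number {
  const days = (now - createdAt) / 86_400_000;
  return days < 30 ? 0.1 : 0; // small boost for results from the last 30 days
}

async function answerQuery(
  query: string,
  retrievers: ((q: string) => Promise<Retrieved[]>)[], // pgvector, Meilisearch, SQL
  callLlm: (prompt: string) => Promise<string>,
  tokenBudget = 4000,
): Promise<{ answer: string; citations: string[] }> {
  // 2. Parallel retrieval
  const results = (await Promise.all(retrievers.map(r => r(query)))).flat();
  // 3. Re-rank by relevance score plus recency boost
  const now = Date.now();
  results.sort((a, b) =>
    (b.score + recencyBoost(b.createdAt, now)) - (a.score + recencyBoost(a.createdAt, now)));
  // 4. Context assembly within the token budget (~4 chars per token heuristic)
  const context: Retrieved[] = [];
  let used = 0;
  for (const r of results) {
    const tokens = Math.ceil(r.text.length / 4);
    if (used + tokens > tokenBudget) break;
    context.push(r);
    used += tokens;
  }
  // 5-6. LLM call with assembled context; citations point back to sources
  const prompt = `Context:\n${context.map(c => c.text).join('\n')}\n\nQuestion: ${query}`;
  return { answer: await callLlm(prompt), citations: context.map(c => c.entityId) };
}
```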
Example RAG Queries
| User Query | Data Sources Queried |
|---|---|
| "Where was I when I emailed Sarah last month?" | messages + location_history (joined by timestamp) |
| "Who called me from 727 area code last week?" | messages (voice) + persons |
| "Summarize my meetings this week" | events + attendees (persons) |
| "What documents did I share with Acme Corp?" | documents + persons (org) |
LLM Router
The LLM Router directs each query to the right model based on user-defined rules, privacy constraints, and cost controls.
Routing Architecture
AI query
│
▼
Router: evaluate routing rules
│
├─► Local model (Ollama) — sensitive data, budget exhausted, user preference
│ └── Llama 3, Mistral, Phi
│
└─► Cloud API (OpenRouter) — complex reasoning, creative generation
├── Claude (Anthropic) — default for reasoning
├── GPT-4 (OpenAI) — complex tasks
└── Gemini (Google) — multimodal, long context
OpenRouter Integration
OpenRouter provides a unified API for all cloud LLM providers. A single OpenRouter API key routes to any supported provider — no per-provider key management required.
// LLM Router call via OpenRouter (OpenAI-compatible API)
import OpenAI from 'openai';

const openrouter = new OpenAI({
  baseURL: 'https://openrouter.ai/api/v1',
  apiKey: process.env.OPENROUTER_API_KEY,
});
const response = await openrouter.chat.completions.create({
  model: 'anthropic/claude-3-5-sonnet', // or 'openai/gpt-4o', etc.
  messages: [...],
  stream: true,
});
User Routing Rules
Users define routing rules in the Consent UI. Rules are evaluated in order:
# Example user routing configuration
routing_rules:
- condition: "data_domain contains 'health'"
model: local/llama3
reason: "Health data never leaves device"
- condition: "query_type == 'reasoning'"
model: openrouter/claude-3-5-sonnet
- condition: "monthly_spend >= budget_limit"
model: local/llama3
reason: "Budget exhausted, fall back to local"
- default: openrouter/claude-3-haiku # fast, cheap default
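In-order, first-match evaluation of the rules above can be sketched as follows. The `RouteInput` fields and condition predicates are assumptions modeled on the YAML example, not the router's actual schema:

```typescript
// First-match routing: walk the rules in order, fall back to the default.
interface RouteInput { dataDomains: string[]; queryType: string; monthlySpend: number; budgetLimit: number }
interface RoutingRule { matches: (q: RouteInput) => boolean; model: string }

const DEFAULT_MODEL = 'openrouter/claude-3-haiku'; // fast, cheap default

const rules: RoutingRule[] = [
  // "Health data never leaves device"
  { matches: q => q.dataDomains.includes('health'), model: 'local/llama3' },
  { matches: q => q.queryType === 'reasoning', model: 'openrouter/claude-3-5-sonnet' },
  // "Budget exhausted, fall back to local"
  { matches: q => q.monthlySpend >= q.budgetLimit, model: 'local/llama3' },
];

function route(q: RouteInput): string {
  return rules.find(r => r.matches(q))?.model ?? DEFAULT_MODEL;
}
```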
Budget Control
interface LLMBudgetConfig {
monthlyLimitUSD: number; // e.g., 20.00
warningThresholdPct: number; // e.g., 0.80 (warn at 80%)
hardCutoff: boolean; // if true, switch to local at limit; if false, warn only
}
The router tracks token usage per provider and switches to local models when the monthly budget is reached.
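A sketch of how the router might apply `LLMBudgetConfig`; the in-memory ledger and per-million-token pricing are illustrative, and real tracking would persist per-provider counts:

```typescript
// Accumulate spend per call and decide whether cloud calls are still allowed.
interface BudgetState { spentUSD: number }

function recordUsage(state: BudgetState, tokens: number, usdPerMillionTokens: number): BudgetState {
  return { spentUSD: state.spentUSD + (tokens / 1_000_000) * usdPerMillionTokens };
}

function budgetDecision(
  state: BudgetState,
  cfg: { monthlyLimitUSD: number; warningThresholdPct: number; hardCutoff: boolean },
): 'ok' | 'warn' | 'local-only' {
  const pct = state.spentUSD / cfg.monthlyLimitUSD;
  if (pct >= 1) return cfg.hardCutoff ? 'local-only' : 'warn'; // at limit
  if (pct >= cfg.warningThresholdPct) return 'warn';           // approaching limit
  return 'ok';
}
```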
Embedding Service
The Embedding Service runs continuously in the background, keeping the vector index current.
Write-Time Embedding
When any text entity is created or updated in the Data Service, the Embedding Service generates a vector embedding and stores it in pgvector:
data.EntityCreated (NATS)
│
▼
IndexDocumentJob (background queue)
│
▼
Embedding Service
│ calls local embedding model (Sentence Transformers or Ollama)
▼
pgvector embeddings table
INSERT (entity_type, entity_id, model, embedding)
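The job handler in the flow above can be sketched as a pure function from event to row. The event shape, the `embed` signature, and the `nomic-embed-text` default are assumptions for illustration:

```typescript
// IndexDocumentJob: consume a data.EntityCreated event, call a local
// embedding model, and produce the row written to the pgvector table.
interface EntityCreatedEvent {
  entityType: 'message' | 'document' | 'person' | 'event';
  entityId: string;
  text: string;
}
interface EmbeddingRow { entity_type: string; entity_id: string; model: string; embedding: number[] }

async function indexDocumentJob(
  event: EntityCreatedEvent,
  embed: (text: string) => Promise<number[]>, // e.g. Sentence Transformers or Ollama
  model = 'nomic-embed-text',                 // hypothetical default model id
): Promise<EmbeddingRow> {
  const embedding = await embed(event.text);
  return { entity_type: event.entityType, entity_id: event.entityId, model, embedding };
}
```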
Vector Schema
embeddings (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
entity_type text NOT NULL, -- 'message' | 'document' | 'person' | 'event'
entity_id uuid NOT NULL,
model text NOT NULL, -- embedding model identifier
embedding vector(1536), -- OpenAI ada-002 dim; or 768 for local models
created_at timestamptz DEFAULT now()
)
CREATE INDEX ON embeddings USING ivfflat (embedding vector_cosine_ops);
Semantic Search Query
-- Find top-10 messages semantically similar to a query vector
SELECT em.entity_id, 1 - (em.embedding <=> $query_vector) AS similarity
FROM embeddings em
JOIN messages m ON m.id = em.entity_id
WHERE em.entity_type = 'message'
ORDER BY em.embedding <=> $query_vector
LIMIT 10;
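From application code, pgvector accepts the query vector as a bracketed text literal (e.g. `'[0.1,0.2,...]'`). A small sketch of the parameter formatting plus a simplified version of the query above (join omitted); the `pg` pool call is shown commented out to keep the sketch self-contained:

```typescript
// pgvector query parameters are passed as text literals and cast server-side.
function toPgvectorLiteral(v: number[]): string {
  return `[${v.join(',')}]`;
}

const SEMANTIC_SEARCH_SQL = `
  SELECT em.entity_id, 1 - (em.embedding <=> $1) AS similarity
  FROM embeddings em
  WHERE em.entity_type = $2
  ORDER BY em.embedding <=> $1
  LIMIT 10`;

// With a node-postgres pool (assumed setup):
// const { rows } = await pool.query(SEMANTIC_SEARCH_SQL, [toPgvectorLiteral(queryVector), 'message']);
```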
Workflow Engine
The Workflow Engine executes multi-step automated tasks defined in YAML. It handles state persistence, error recovery, and human-in-the-loop confirmation.
Workflow Definition
id: summarize-and-file
name: "Summarize and File Important Articles"
version: "1.0"
trigger:
event: data.EntityCreated
filter:
entity_type: document
tags: [article]
steps:
- id: summarize
type: ai.generate
model: openrouter/claude-3-haiku
prompt: |
Summarize this article in 3 bullet points:
{{ steps.input.content }}
output: summary
- id: classify
type: ai.classify
model: local/llama3
categories: [tech, science, business, culture, health]
input: "{{ steps.input.content }}"
output: topic
- id: tag
type: data.update
entity_type: document
entity_id: "{{ steps.input.entity_id }}"
fields:
tags: ["{{ steps.classify.output }}", "auto-summarized"]
notes: "{{ steps.summarize.output }}"
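The `{{ steps.<id>.<field> }}` references in the workflow above imply a template-resolution pass before each step runs. A minimal sketch of that interpolation, assuming a resolution grammar modeled on the examples (not the engine's actual one):

```typescript
// Resolve {{ steps.<id>.<field> }} placeholders against completed step results.
function interpolate(template: string, steps: Record<string, Record<string, unknown>>): string {
  return template.replace(/\{\{\s*steps\.(\w+)\.(\w+)\s*\}\}/g, (_, stepId: string, field: string) => {
    const value = steps[stepId]?.[field];
    if (value === undefined) throw new Error(`unresolved reference: steps.${stepId}.${field}`);
    return String(value);
  });
}
```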
Human-in-the-Loop Step
- id: confirm-send
type: human.confirm
message: |
About to send this email to {{ recipient_name }}:
{{ steps.compose.output }}
timeout: 300 # seconds to wait for confirmation
on_timeout: cancel
on_confirm:
- type: communication.send_email
...
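The confirm-or-timeout semantics of `human.confirm` can be sketched as a race between the user's response and the timeout. The step contract here is an assumption based on the YAML above:

```typescript
// Wait for the user's confirmation; cancel if the timeout elapses first
// (on_timeout: cancel). A denial also cancels.
type ConfirmOutcome = 'confirmed' | 'cancelled';

function humanConfirm(confirmation: Promise<boolean>, timeoutSeconds: number): Promise<ConfirmOutcome> {
  return new Promise<ConfirmOutcome>(resolve => {
    const timer = setTimeout(() => resolve('cancelled'), timeoutSeconds * 1000);
    confirmation.then(ok => {
      clearTimeout(timer);
      resolve(ok ? 'confirmed' : 'cancelled');
    });
  });
}
```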
AI-Powered Rules Integration
The Rules Service can use AI as a condition or action:
# AI condition: classify an email before routing
trigger: communication.EmailReceived
condition:
operator: ai.classify_text
params:
text: "{{ event.body }}"
category: urgent
# Returns true if classified as urgent
actions:
  - operator: ai.generate_text
    params:
      prompt: "Summarize in one sentence: {{ event.body }}"
    output: summary
  - operator: communication.send_notification
    params:
      body: "Urgent email: {{ summary }}"
Agentic Planner
The Agentic Planner accepts natural language goals and decomposes them into executable workflow definitions.
Planning Process
User: "Plan a weekend trip to San Francisco next month for me and my partner.
Find flights, a pet-friendly hotel, and Italian restaurant for Saturday night."
▼
1. Agentic Planner sends goal to powerful LLM (Claude Sonnet or GPT-4)
System prompt: available plugins, user preferences, current context
▼
2. LLM decomposes into steps:
- search_flights(from=user_home, to=SFO, dates=next_month_weekend, passengers=2)
- search_hotels(location=SFO, pet_friendly=true, dates=same)
- search_restaurants(location=SFO, cuisine=italian, night=saturday)
▼
3. Agentic Planner generates a dynamic workflow YAML
▼
4. Workflow Engine executes
- Each step calls the appropriate plugin/service
- Results collected
▼
5. Results presented to user for selection
▼
6. Human-in-the-loop confirmation before any booking action
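Step 3 above (plan to workflow) can be sketched as a simple transform from the LLM's decomposed step list to a workflow definition, always terminated by a human-in-the-loop confirmation. The `PlanStep` shape and step-type names are illustrative assumptions:

```typescript
// Turn an LLM-produced plan into a dynamic workflow the Workflow Engine can run.
interface PlanStep { tool: string; args: Record<string, string> }

function planToWorkflow(goal: string, plan: PlanStep[]) {
  return {
    id: `dynamic-${Date.now()}`,
    name: goal,
    version: '1.0',
    steps: [
      ...plan.map((s, i) => ({ id: `step-${i}`, type: s.tool, params: s.args })),
      // Booking actions never execute without explicit user confirmation.
      { id: 'confirm', type: 'human.confirm', message: 'Review results before any booking.' },
    ],
  };
}
```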
Privacy Guarantees
The AI layer enforces the platform's zero-trust principles:
| Guarantee | Mechanism |
|---|---|
| No data access without consent | Policy Service checked before every RAG retrieval |
| Every LLM call logged | Audit log: provider, model, token count, timestamp |
| PII redaction for cloud prompts | Policy Service strips names/phones/emails from prompt before routing |
| Write actions require confirmation | Human-in-the-loop step mandatory for email/call/purchase |
| Budget hard limit | LLM Router blocks cloud calls when limit reached |
| Local model fallback | Health and sensitive data always routed to local Ollama |
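The PII-redaction guarantee above implies a scrubbing pass on every cloud-bound prompt. A minimal sketch, assuming regex patterns for emails and phone numbers; real redaction would rely on NER for names, and these patterns are illustrative only:

```typescript
// Strip obvious PII from a prompt before it is routed to a cloud provider.
const REDACTIONS: [RegExp, string][] = [
  [/[\w.+-]+@[\w-]+\.[\w.]+/g, '[EMAIL]'],                                  // email addresses
  [/(?:\+?\d{1,3}[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}/g, '[PHONE]'], // US-style phone numbers
];

function redactForCloud(prompt: string): string {
  return REDACTIONS.reduce((text, [pattern, label]) => text.replace(pattern, label), prompt);
}
```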
Capability Rollout
| Stage | Capability |
|---|---|
| Embeddings foundation | Embedding Service active (all entities embedded at write time); pgvector schema in place |
| Semantic search | Basic semantic search in the web app (not the full AI agent) |
| Full AI agent | Full Agent Service; RAG Engine; LLM Router with OpenRouter + local Ollama; Workflow Engine; Agentic Planner; AI chat UI in web + Android |
| Advanced AI | Multi-agent workflows; federated AI (queries span multiple user hubs); on-device LLM on Android |