AI — AI Service Library
packages/ai is the AI service library shared by the hub and any future AI-facing services. It provides the RAG engine, LLM routing, embedding pipeline, workflow engine, and agentic planning — all with built-in privacy and consent controls.
The embedding pipeline activates as soon as the hub starts. By the time the full agent is enabled, it already has an indexed corpus of months of user data.
What It Is
A TypeScript package consumed by apps/hub. Not a standalone HTTP server — it's a library. The hub exposes the AI functionality through the Sovereign API at /api/ai/*.
packages/ai/
├── src/
│ ├── index.ts # Public API surface (named exports)
│ ├── sanitize.ts # Prompt/PII sanitization helpers
│ ├── rag/
│ │ ├── RagEngine.ts # Main RAG pipeline orchestrator
│ │ ├── Retriever.ts # pgvector + Meilisearch parallel retrieval
│ │ └── Reranker.ts # Result scoring and selection
│ ├── llm/
│ │ ├── LlmRouter.ts # Route to OpenRouter or local Ollama
│ │ ├── OpenRouterClient.ts # OpenRouter API client
│ │ └── OllamaClient.ts # Local Ollama API client
│ ├── embedding/
│ │ └── EmbeddingService.ts # Generate + store vector embeddings
│ ├── workflow/
│ │ ├── WorkflowEngine.ts # YAML workflow execution
│ │ ├── StepExecutor.ts # Individual step handler dispatch (exports HumanConfirmationRequired)
│ │ └── types.ts # Workflow/step/trigger types
│ └── planner/
│ └── AgenticPlanner.ts # NL goal → workflow decomposition
├── package.json
└── tsconfig.json
Note: an AgentService (agent/ module) is planned but not yet present — index.ts contains a commented-out placeholder export. Conversation management currently lives in the hub (apps/hub/src/services/agent/), not in this package.
Public API
The package exposes individual classes as named exports — there is no createAiService factory. Consumers (the hub) instantiate and wire components directly:
import {
EmbeddingService,
LlmRouter,
OpenRouterClient,
OllamaClient,
Retriever,
Reranker,
RagEngine,
WorkflowEngine,
StepExecutor,
AgenticPlanner,
} from '@made-open/ai';
// Embedding (called on EntityCreated / EntityUpdated)
const embeddings = new EmbeddingService(/* deps */);
await embeddings.embedEntity('message', 'msg-uuid', 'Hey, can we move the meeting to Thursday?');
// RAG query
const rag = new RagEngine(/* retriever, reranker, llmRouter */);
const response = await rag.query({ userId: 'user-id', text: 'Who called me last week?' });
// Workflow execution
const engine = new WorkflowEngine(/* deps */);
await engine.execute(definition, trigger);
// Agentic planning
const planner = new AgenticPlanner(/* deps */);
const workflow = await planner.plan('Book a restaurant for Saturday dinner', 'user-id');
RAG Engine
The RAG Engine retrieves relevant context from the user's data before generating an LLM response. It runs three retrieval strategies in parallel:
class RagEngine {
async retrieve(query: string, userId: string): Promise<RagContext> {
const [vectorResults, textResults, structuredResults] = await Promise.all([
// Semantic similarity via pgvector
this.retriever.vectorSearch(query, userId, { topK: 20 }),
// Full-text search via Meilisearch
this.retriever.textSearch(query, userId, { limit: 20 }),
// Structured queries (dates, names, channels)
this.retriever.structuredQuery(this.parseIntent(query), userId),
]);
// Re-rank and select top results within token budget
return this.reranker.select([...vectorResults, ...textResults, ...structuredResults], {
maxTokens: 4096,
query,
});
}
}
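The re-ranking step merges the three result sets, dedupes entities that surfaced in more than one strategy, and packs the best chunks under the token budget. A minimal sketch of that selection logic — the `RetrievedChunk` shape, the greedy packing, and the 4-chars-per-token heuristic are all assumptions, not the package's actual Reranker:

```typescript
// Hypothetical shape of one retrieval result (assumed, for illustration).
interface RetrievedChunk {
  entityId: string;
  text: string;
  score: number; // normalized relevance, higher is better
}

// Greedy token-budget packing: sort by score, skip duplicates (the same
// entity often comes back from both vector and full-text search), and
// take chunks until the rough token estimate would exceed maxTokens.
function selectWithinBudget(chunks: RetrievedChunk[], maxTokens: number): RetrievedChunk[] {
  const seen = new Set<string>();
  const sorted = [...chunks].sort((a, b) => b.score - a.score);
  const selected: RetrievedChunk[] = [];
  let used = 0;
  for (const chunk of sorted) {
    if (seen.has(chunk.entityId)) continue; // dedupe across strategies
    const estimate = Math.ceil(chunk.text.length / 4); // ~4 chars per token heuristic
    if (used + estimate > maxTokens) continue;
    seen.add(chunk.entityId);
    selected.push(chunk);
    used += estimate;
  }
  return selected;
}
```

A production reranker would also use a cross-encoder or the raw similarity scores for ordering; this sketch only shows the budget and dedupe mechanics.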
pgvector Query
SELECT em.entity_id, em.entity_type,
       1 - (em.embedding <=> $queryVector) AS similarity
FROM embeddings em
JOIN messages m ON m.id = em.entity_id
WHERE em.entity_type = 'message'
AND m.owner_id = $userId
ORDER BY em.embedding <=> $queryVector
LIMIT 20;
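For intuition: `<=>` is pgvector's cosine-distance operator, so the query's `1 - (… <=> …)` expression is cosine similarity. A small reimplementation of the same math in TypeScript (illustrative only — in production the computation happens inside Postgres):

```typescript
// Cosine distance, matching pgvector's `<=>` operator: 1 - cos(a, b).
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// The SQL's `1 - (em.embedding <=> $queryVector)` is then plain cosine similarity.
const similarity = (a: number[], b: number[]): number => 1 - cosineDistance(a, b);
```

Identical vectors score 1, orthogonal vectors score 0, which is why the SQL orders ascending by distance but reports `similarity` descending.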
LLM Router
The router applies user-defined routing rules to every query. Rules are evaluated in order; first match wins.
interface RoutingRule {
condition: (context: QueryContext) => boolean;
provider: 'openrouter' | 'ollama';
model: string;
reason?: string;
}
// Default routing rules (user can override in settings)
const defaultRules: RoutingRule[] = [
{
condition: (ctx) => ctx.dataDomains.includes('health'),
provider: 'ollama',
model: 'llama3.2',
reason: 'Health data stays local',
},
{
condition: (ctx) => ctx.monthlySpend >= ctx.budgetLimit,
provider: 'ollama',
model: 'llama3.2',
reason: 'Budget limit reached',
},
{
condition: () => true, // default
provider: 'openrouter',
model: 'anthropic/claude-3-5-haiku',
},
];
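The evaluation itself can be sketched in a few lines — first match wins, and the trailing `() => true` rule guarantees a match. The `QueryContext` fields below are taken from the rules above; the `route` helper is illustrative, not the router's actual method name:

```typescript
// Fields referenced by the default rules (assumed minimal context shape).
interface QueryContext {
  dataDomains: string[];
  monthlySpend: number;
  budgetLimit: number;
}

interface RoutingRule {
  condition: (context: QueryContext) => boolean;
  provider: 'openrouter' | 'ollama';
  model: string;
  reason?: string;
}

// Evaluate rules in order; the first rule whose condition holds wins.
function route(rules: RoutingRule[], ctx: QueryContext): RoutingRule {
  const match = rules.find((rule) => rule.condition(ctx));
  if (!match) throw new Error('No routing rule matched; keep a catch-all rule last');
  return match;
}
```

Because evaluation is ordered, user overrides only need to insert a rule ahead of the defaults rather than replace the whole list.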
OpenRouter
OpenRouter provides a single OpenAI-compatible endpoint that fronts many cloud LLMs:
class OpenRouterClient {
async complete(request: ChatRequest): Promise<ReadableStream> {
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${this.apiKey}`,
'HTTP-Referer': 'https://made-open.io',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: request.model,
messages: request.messages,
stream: true,
}),
    });
    if (!response.ok || !response.body) {
      throw new Error(`OpenRouter request failed: ${response.status}`);
    }
    return response.body;
  }
}
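With `stream: true`, the returned body is an OpenAI-style server-sent-event stream: each line is `data: <json chunk>` and the stream ends with a `data: [DONE]` sentinel. A minimal parser for one fully decoded text block — a sketch only, since real consumers must also handle events split across network reads:

```typescript
// Concatenate the incremental `delta.content` pieces from an
// OpenAI-compatible SSE transcript (as OpenRouter emits with stream: true).
function extractDeltas(sseText: string): string {
  let out = '';
  for (const line of sseText.split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const payload = line.slice('data: '.length).trim();
    if (payload === '[DONE]') break; // end-of-stream sentinel
    const parsed = JSON.parse(payload);
    out += parsed.choices?.[0]?.delta?.content ?? '';
  }
  return out;
}
```

The hub would normally pipe the `ReadableStream` straight to the client and run this kind of parsing incrementally, buffering partial lines between reads.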
Embedding Service
Runs as a background job handler. Every data.EntityCreated / data.EntityUpdated event triggers an IndexDocumentJob:
class EmbeddingService {
async embedEntity(entityType: string, entityId: string, text: string): Promise<void> {
// Generate embedding using configured model
const embedding = await this.model.embed(text);
// Upsert into pgvector
await this.supabase.from('embeddings').upsert({
entity_type: entityType,
entity_id: entityId,
model: this.model.id,
embedding: embedding, // vector(1536)
}, { onConflict: 'entity_type,entity_id' });
}
}
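One practical wrinkle the snippet above glosses over: long entities (email bodies, documents) can exceed an embedding model's context window, so the job handler would typically chunk before embedding. A naive overlapping chunker, purely as a sketch — the package's actual chunking strategy, if any, is not documented here:

```typescript
// Split long text into fixed-size character windows with overlap, so
// content near a chunk boundary still appears whole in one chunk.
// (Hypothetical helper; parameters are illustrative defaults.)
function chunkText(text: string, maxChars = 2000, overlap = 200): string[] {
  if (text.length <= maxChars) return [text];
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + maxChars));
    if (start + maxChars >= text.length) break;
    start += maxChars - overlap; // step back by `overlap` chars each window
  }
  return chunks;
}
```

Each chunk would then get its own `embeddings` row, keyed so retrieval can map hits back to the parent entity.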
Embedding models (configurable):
- text-embedding-3-small via OpenRouter (cloud, low cost)
- nomic-embed-text via Ollama (local, free, 768-dim)
Workflow Engine
YAML-defined workflows with step execution, state persistence, and human confirmation:
class WorkflowEngine {
  async execute(definition: WorkflowDefinition, trigger: TriggerContext): Promise<void> {
    const run = await this.createRun(definition, trigger);
    for (const step of definition.steps) {
      try {
        const result = await this.executeStep(step, run.context);
        run.context[step.id] = result;
      } catch (err) {
        if (err instanceof HumanConfirmationRequired) {
          // StepExecutor throws this for `human.confirm` steps: persist state
          // and stop; the run resumes once the user confirms.
          await this.persistRunState(run);
          return;
        }
        throw err;
      }
      await this.persistRunState(run);
    }
  }
}
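To make the confirmation flow concrete, here is what a definition with a `human.confirm` gate might look like. The `WorkflowDefinition` field names and the `llm.complete` / `message.send` step types are assumptions for illustration (the real shapes live in workflow/types.ts); only `human.confirm` appears in this document:

```typescript
// Assumed minimal shapes, mirroring workflow/types.ts (hypothetical fields).
interface WorkflowStep {
  id: string;
  type: string;
  params?: Record<string, unknown>;
}

interface WorkflowDefinition {
  name: string;
  trigger: { event: string };
  steps: WorkflowStep[];
}

// Draft a reply, pause for the user's approval, then send.
const replyWorkflow: WorkflowDefinition = {
  name: 'draft-and-confirm-reply',
  trigger: { event: 'data.EntityCreated' },
  steps: [
    { id: 'draft', type: 'llm.complete', params: { prompt: 'Draft a reply to {{trigger.text}}' } },
    { id: 'confirm', type: 'human.confirm', params: { message: 'Send this reply?' } },
    { id: 'send', type: 'message.send', params: { body: '{{draft.result}}' } },
  ],
};
```

When the engine reaches the `confirm` step it persists the run (including the drafted text in `run.context`) and returns; the `send` step only runs after the user approves.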
Agentic Planner
The Agentic Planner accepts a natural language goal and produces a workflow definition using a powerful LLM:
class AgenticPlanner {
async plan(goal: string, userId: string): Promise<WorkflowDefinition> {
const availableTools = await this.getAvailableTools(userId);
const response = await this.llmRouter.complete({
model: 'anthropic/claude-3-5-sonnet', // needs reasoning capability
messages: [
{ role: 'system', content: this.buildPlannerSystemPrompt(availableTools) },
{ role: 'user', content: goal },
],
});
return this.parseWorkflowFromResponse(response);
}
}
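`parseWorkflowFromResponse` has to cope with the fact that LLMs rarely return bare JSON — the workflow is usually wrapped in prose or markdown. A defensive extraction sketch (the real method may use structured output or a schema validator instead):

```typescript
// Extract the outermost JSON object from an LLM response that may wrap
// it in prose or code fences. Illustrative helper, not the package's API.
function parseWorkflowJson(raw: string): unknown {
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end === -1 || end < start) {
    throw new Error('No JSON object found in planner response');
  }
  return JSON.parse(raw.slice(start, end + 1));
}
```

A production planner would additionally validate the parsed object against the workflow schema before handing it to the WorkflowEngine, and retry the LLM call on parse failure.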
Privacy & Audit
Every AI operation is logged:
// In LlmRouter, after every call:
await policyService.log({
actorType: 'service',
actorId: 'ai-service',
action: 'execute',
resourceType: `llm:${provider}:${model}`,
outcome: 'allowed',
metadata: {
promptTokens: usage.promptTokens,
completionTokens: usage.completionTokens,
costUSD: usage.cost,
routingReason: matchedRule.reason,
},
});
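The `costUSD` entries written here are also what feeds the router's budget rule (`monthlySpend >= budgetLimit`). An in-memory sketch of that aggregation — a hypothetical helper; in practice the figure would presumably come from summing audit_log rows for the current month:

```typescript
// Track per-user spend, bucketed by UTC calendar month.
class SpendTracker {
  private totals = new Map<string, number>();

  private key(userId: string, date: Date): string {
    return `${userId}:${date.getUTCFullYear()}-${date.getUTCMonth()}`;
  }

  // Called with the costUSD from each audit entry.
  record(userId: string, costUSD: number, date: Date = new Date()): void {
    const k = this.key(userId, date);
    this.totals.set(k, (this.totals.get(k) ?? 0) + costUSD);
  }

  // Feeds QueryContext.monthlySpend for the routing rules.
  monthlySpend(userId: string, date: Date = new Date()): number {
    return this.totals.get(this.key(userId, date)) ?? 0;
  }
}
```

Bucketing by calendar month means the budget rule naturally resets on the 1st without any explicit reset job.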
First Testable Workflow
1. Start hub with AI package
2. Ensure Supabase has data (contacts, messages from MS Graph sync)
3. POST /api/ai/query
body: { "text": "Who called me last week?" }
└─ Verify: RAG retrieves call records from messages table
└─ Verify: LLM response names the callers
└─ Verify: audit_log has an entry for the LLM call
4. POST /api/ai/query
body: { "text": "What did Sarah and I discuss in email?" }
└─ Verify: email thread retrieved via pgvector semantic search
└─ Verify: response synthesizes across multiple messages
5. Check LLM routing:
└─ Query about health → routed to Ollama (verify in audit log)
└─ Complex reasoning query → routed to OpenRouter Claude
Capability Rollout
| Capability | Description |
|---|---|
| Embedding Service | Active at startup; all entities embedded at write time; pgvector index maintained |
| Semantic Search | Basic semantic search exposed in web/Android app via hub API |
| Full Agent | Full Agent Service; RAG Engine; LLM Router; Workflow Engine; Agentic Planner; AI chat UI |
| Advanced AI | Multi-agent workflows; on-device embedding on Android; federated AI |