AI — AI Service Library
packages/ai is the AI service library shared by the hub and any future AI-facing services. It provides the RAG engine, LLM routing, embedding pipeline, workflow engine, and agentic planning — all with built-in privacy and consent controls.
The embedding pipeline activates as soon as the hub starts. By the time the full agent is enabled, it already has an indexed corpus of months of user data.
What It Is
A TypeScript package consumed by apps/hub. Not a standalone HTTP server — it's a library. The hub exposes the AI functionality through the Sovereign API at /api/ai/*.
packages/ai/
├── src/
│ ├── index.ts # Public API surface (named exports)
│ ├── sanitize.ts # Prompt/PII sanitization helpers
│ ├── rag/
│ │ ├── RagEngine.ts # Main RAG pipeline orchestrator
│ │ ├── Retriever.ts # pgvector + Meilisearch parallel retrieval
│ │ └── Reranker.ts # Result scoring and selection
│ ├── llm/
│ │ ├── LlmRouter.ts # Route to OpenRouter or local Ollama
│ │ ├── OpenRouterClient.ts # OpenRouter API client
│ │ └── OllamaClient.ts # Local Ollama API client
│ ├── embedding/
│ │ └── EmbeddingService.ts # Generate + store vector embeddings
│ ├── workflow/
│ │ ├── WorkflowEngine.ts # YAML workflow execution
│ │ ├── StepExecutor.ts # Individual step handler dispatch (exports HumanConfirmationRequired)
│ │ └── types.ts # Workflow/step/trigger types
│ └── planner/
│ └── AgenticPlanner.ts # NL goal → workflow decomposition
├── package.json
└── tsconfig.json
Note: an AgentService (agent/ module) is planned but not yet present — index.ts contains a commented-out placeholder export. Conversation management currently lives in the hub (apps/hub/src/services/agent/), not in this package.
Public API
The package exposes individual classes as named exports — there is no createAiService factory. Consumers (the hub) instantiate and wire components directly:
import {
EmbeddingService,
LlmRouter,
OpenRouterClient,
OllamaClient,
Retriever,
Reranker,
RagEngine,
WorkflowEngine,
StepExecutor,
AgenticPlanner,
} from '@made-open/ai';
// Embedding (called on EntityCreated / EntityUpdated)
const embeddings = new EmbeddingService(/* deps */);
await embeddings.embedEntity('message', 'msg-uuid', 'Hey, can we move the meeting to Thursday?');
// RAG query
const rag = new RagEngine(/* retriever, reranker, llmRouter */);
const response = await rag.query({ userId: 'user-id', text: 'Who called me last week?' });
// Workflow execution
const engine = new WorkflowEngine(/* deps */);
await engine.execute(definition, trigger);
// Agentic planning
const planner = new AgenticPlanner(/* deps */);
const workflow = await planner.plan('Book a restaurant for Saturday dinner', 'user-id');
RAG Engine
The RAG Engine retrieves relevant context from the user's data before generating an LLM response. It runs three retrieval strategies in parallel:
class RagEngine {
async retrieve(query: string, userId: string): Promise<RagContext> {
const [vectorResults, textResults, structuredResults] = await Promise.all([
// Semantic similarity via pgvector
this.retriever.vectorSearch(query, userId, { topK: 20 }),
// Full-text search via Meilisearch
this.retriever.textSearch(query, userId, { limit: 20 }),
// Structured queries (dates, names, channels)
this.retriever.structuredQuery(this.parseIntent(query), userId),
]);
// Re-rank and select top results within token budget
return this.reranker.select([...vectorResults, ...textResults, ...structuredResults], {
maxTokens: 4096,
query,
});
}
}
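The re-ranking step merges the three result sets, dedupes entities that surfaced in more than one strategy, and packs the best chunks under the token budget. A minimal sketch of that selection logic — the `RetrievedChunk` shape, the greedy packing, and the 4-chars-per-token heuristic are all assumptions, not the package's actual Reranker:

```typescript
// Hypothetical shape of one retrieval result (assumed, for illustration).
interface RetrievedChunk {
  entityId: string;
  text: string;
  score: number; // normalized relevance, higher is better
}

// Greedy token-budget packing: sort by score, skip duplicates (the same
// entity often comes back from both vector and full-text search), and
// take chunks until the rough token estimate would exceed maxTokens.
function selectWithinBudget(chunks: RetrievedChunk[], maxTokens: number): RetrievedChunk[] {
  const seen = new Set<string>();
  const sorted = [...chunks].sort((a, b) => b.score - a.score);
  const selected: RetrievedChunk[] = [];
  let used = 0;
  for (const chunk of sorted) {
    if (seen.has(chunk.entityId)) continue; // dedupe across strategies
    const estimate = Math.ceil(chunk.text.length / 4); // ~4 chars per token heuristic
    if (used + estimate > maxTokens) continue;
    seen.add(chunk.entityId);
    selected.push(chunk);
    used += estimate;
  }
  return selected;
}
```

A production reranker would also use a cross-encoder or the raw similarity scores for ordering; this sketch only shows the budget and dedupe mechanics.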
pgvector Query
SELECT em.entity_id, em.entity_type,
       1 - (em.embedding <=> $queryVector) AS similarity
FROM embeddings em
JOIN messages m ON m.id = em.entity_id
WHERE em.entity_type = 'message'
AND m.owner_id = $userId
ORDER BY em.embedding <=> $queryVector
LIMIT 20;
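For intuition: `<=>` is pgvector's cosine-distance operator, so the query's `1 - (… <=> …)` expression is cosine similarity. A small reimplementation of the same math in TypeScript (illustrative only — in production the computation happens inside Postgres):

```typescript
// Cosine distance, matching pgvector's `<=>` operator: 1 - cos(a, b).
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// The SQL's `1 - (em.embedding <=> $queryVector)` is then plain cosine similarity.
const similarity = (a: number[], b: number[]): number => 1 - cosineDistance(a, b);
```

Identical vectors score 1, orthogonal vectors score 0, which is why the SQL orders ascending by distance but reports `similarity` descending.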
LLM Router
The router applies user-defined routing rules to every query. Rules are evaluated in order; first match wins.
interface RoutingRule {
condition: (context: QueryContext) => boolean;
provider: 'openrouter' | 'ollama';
model: string;
reason?: string;
}
// Default routing rules (user can override in settings)
const defaultRules: RoutingRule[] = [
{
condition: (ctx) => ctx.dataDomains.includes('health'),
provider: 'ollama',
model: 'llama3.2',
reason: 'Health data stays local',
},
{
condition: (ctx) => ctx.monthlySpend >= ctx.budgetLimit,
provider: 'ollama',
model: 'llama3.2',
reason: 'Budget limit reached',
},
{
condition: () => true, // default
provider: 'openrouter',
model: 'anthropic/claude-3-5-haiku',
},
];
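The evaluation itself can be sketched in a few lines — first match wins, and the trailing `() => true` rule guarantees a match. The `QueryContext` fields below are taken from the rules above; the `route` helper is illustrative, not the router's actual method name:

```typescript
// Fields referenced by the default rules (assumed minimal context shape).
interface QueryContext {
  dataDomains: string[];
  monthlySpend: number;
  budgetLimit: number;
}

interface RoutingRule {
  condition: (context: QueryContext) => boolean;
  provider: 'openrouter' | 'ollama';
  model: string;
  reason?: string;
}

// Evaluate rules in order; the first rule whose condition holds wins.
function route(rules: RoutingRule[], ctx: QueryContext): RoutingRule {
  const match = rules.find((rule) => rule.condition(ctx));
  if (!match) throw new Error('No routing rule matched; keep a catch-all rule last');
  return match;
}
```

Because evaluation is ordered, user overrides only need to insert a rule ahead of the defaults rather than replace the whole list.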
OpenRouter
OpenRouter provides a single OpenAI-compatible endpoint that fronts many cloud LLMs:
class OpenRouterClient {
async complete(request: ChatRequest): Promise<ReadableStream> {
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${this.apiKey}`,
'HTTP-Referer': 'https://made-open.io',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: request.model,
messages: request.messages,
stream: true,
}),
    });
    if (!response.ok || !response.body) {
      throw new Error(`OpenRouter request failed: ${response.status}`);
    }
    return response.body;
  }
}
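With `stream: true`, the returned body is an OpenAI-style server-sent-event stream: each line is `data: <json chunk>` and the stream ends with a `data: [DONE]` sentinel. A minimal parser for one fully decoded text block — a sketch only, since real consumers must also handle events split across network reads:

```typescript
// Concatenate the incremental `delta.content` pieces from an
// OpenAI-compatible SSE transcript (as OpenRouter emits with stream: true).
function extractDeltas(sseText: string): string {
  let out = '';
  for (const line of sseText.split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const payload = line.slice('data: '.length).trim();
    if (payload === '[DONE]') break; // end-of-stream sentinel
    const parsed = JSON.parse(payload);
    out += parsed.choices?.[0]?.delta?.content ?? '';
  }
  return out;
}
```

The hub would normally pipe the `ReadableStream` straight to the client and run this kind of parsing incrementally, buffering partial lines between reads.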
Embedding Service
Runs as a background job handler. Every data.EntityCreated / data.EntityUpdated event triggers an IndexDocumentJob:
class EmbeddingService {
async embedEntity(entityType: string, entityId: string, text: string): Promise<void> {
// Generate embedding using configured model
const embedding = await this.model.embed(text);
// Upsert into pgvector
await this.supabase.from('embeddings').upsert({
entity_type: entityType,
entity_id: entityId,
model: this.model.id,
embedding: embedding, // vector(1536)
}, { onConflict: 'entity_type,entity_id' });
}
}
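One practical wrinkle the snippet above glosses over: long entities (email bodies, documents) can exceed an embedding model's context window, so the job handler would typically chunk before embedding. A naive overlapping chunker, purely as a sketch — the package's actual chunking strategy, if any, is not documented here:

```typescript
// Split long text into fixed-size character windows with overlap, so
// content near a chunk boundary still appears whole in one chunk.
// (Hypothetical helper; parameters are illustrative defaults.)
function chunkText(text: string, maxChars = 2000, overlap = 200): string[] {
  if (text.length <= maxChars) return [text];
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + maxChars));
    if (start + maxChars >= text.length) break;
    start += maxChars - overlap; // step back by `overlap` chars each window
  }
  return chunks;
}
```

Each chunk would then get its own `embeddings` row, keyed so retrieval can map hits back to the parent entity.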
Embedding models (configurable):
- text-embedding-3-small via OpenRouter (cloud, low cost)
- nomic-embed-text via Ollama (local, free, 768-dim)
Workflow Engine
YAML-defined workflows with step execution, state persistence, and human confirmation:
class WorkflowEngine {
  async execute(definition: WorkflowDefinition, trigger: TriggerContext): Promise<void> {
    const run = await this.createRun(definition, trigger);
    for (const step of definition.steps) {
      try {
        const result = await this.executeStep(step, run.context);
        run.context[step.id] = result;
      } catch (err) {
        if (err instanceof HumanConfirmationRequired) {
          // StepExecutor throws this for `human.confirm` steps: persist state
          // and stop; the run resumes once the user confirms.
          await this.persistRunState(run);
          return;
        }
        throw err;
      }
      await this.persistRunState(run);
    }
  }
}
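To make the confirmation flow concrete, here is what a definition with a `human.confirm` gate might look like. The `WorkflowDefinition` field names and the `llm.complete` / `message.send` step types are assumptions for illustration (the real shapes live in workflow/types.ts); only `human.confirm` appears in this document:

```typescript
// Assumed minimal shapes, mirroring workflow/types.ts (hypothetical fields).
interface WorkflowStep {
  id: string;
  type: string;
  params?: Record<string, unknown>;
}

interface WorkflowDefinition {
  name: string;
  trigger: { event: string };
  steps: WorkflowStep[];
}

// Draft a reply, pause for the user's approval, then send.
const replyWorkflow: WorkflowDefinition = {
  name: 'draft-and-confirm-reply',
  trigger: { event: 'data.EntityCreated' },
  steps: [
    { id: 'draft', type: 'llm.complete', params: { prompt: 'Draft a reply to {{trigger.text}}' } },
    { id: 'confirm', type: 'human.confirm', params: { message: 'Send this reply?' } },
    { id: 'send', type: 'message.send', params: { body: '{{draft.result}}' } },
  ],
};
```

When the engine reaches the `confirm` step it persists the run (including the drafted text in `run.context`) and returns; the `send` step only runs after the user approves.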
Agentic Planner
The Agentic Planner accepts a natural language goal and produces a workflow definition using a powerful LLM:
class AgenticPlanner {
async plan(goal: string, userId: string): Promise<WorkflowDefinition> {
const availableTools = await this.getAvailableTools(userId);
const response = await this.llmRouter.complete({
model: 'anthropic/claude-3-5-sonnet', // needs reasoning capability
messages: [
{ role: 'system', content: this.buildPlannerSystemPrompt(availableTools) },
{ role: 'user', content: goal },
],
});
return this.parseWorkflowFromResponse(response);
}
}
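`parseWorkflowFromResponse` has to cope with the fact that LLMs rarely return bare JSON — the workflow is usually wrapped in prose or markdown. A defensive extraction sketch (the real method may use structured output or a schema validator instead):

```typescript
// Extract the outermost JSON object from an LLM response that may wrap
// it in prose or code fences. Illustrative helper, not the package's API.
function parseWorkflowJson(raw: string): unknown {
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end === -1 || end < start) {
    throw new Error('No JSON object found in planner response');
  }
  return JSON.parse(raw.slice(start, end + 1));
}
```

A production planner would additionally validate the parsed object against the workflow schema before handing it to the WorkflowEngine, and retry the LLM call on parse failure.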
Privacy & Audit
Every AI operation is logged:
// In LlmRouter, after every call:
await policyService.log({
actorType: 'service',
actorId: 'ai-service',
action: 'execute',
resourceType: `llm:${provider}:${model}`,
outcome: 'allowed',
metadata: {
promptTokens: usage.promptTokens,
completionTokens: usage.completionTokens,
costUSD: usage.cost,
routingReason: matchedRule.reason,
},
});
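The `costUSD` entries written here are also what feeds the router's budget rule (`monthlySpend >= budgetLimit`). An in-memory sketch of that aggregation — a hypothetical helper; in practice the figure would presumably come from summing audit_log rows for the current month:

```typescript
// Track per-user spend, bucketed by UTC calendar month.
class SpendTracker {
  private totals = new Map<string, number>();

  private key(userId: string, date: Date): string {
    return `${userId}:${date.getUTCFullYear()}-${date.getUTCMonth()}`;
  }

  // Called with the costUSD from each audit entry.
  record(userId: string, costUSD: number, date: Date = new Date()): void {
    const k = this.key(userId, date);
    this.totals.set(k, (this.totals.get(k) ?? 0) + costUSD);
  }

  // Feeds QueryContext.monthlySpend for the routing rules.
  monthlySpend(userId: string, date: Date = new Date()): number {
    return this.totals.get(this.key(userId, date)) ?? 0;
  }
}
```

Bucketing by calendar month means the budget rule naturally resets on the 1st without any explicit reset job.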
First Testable Workflow
1. Start hub with AI package
2. Ensure Supabase has data (contacts, messages from MS Graph sync)
3. POST /api/ai/query
body: { "text": "Who called me last week?" }
└─ Verify: RAG retrieves call records from messages table
└─ Verify: LLM response names the callers
└─ Verify: audit_log has an entry for the LLM call
4. POST /api/ai/query
body: { "text": "What did Sarah and I discuss in email?" }
└─ Verify: email thread retrieved via pgvector semantic search
└─ Verify: response synthesizes across multiple messages
5. Check LLM routing:
└─ Query about health → routed to Ollama (verify in audit log)
└─ Complex reasoning query → routed to OpenRouter Claude
Capability Rollout
| Capability | Description |
|---|---|
| Embedding Service | Active at startup; all entities embedded at write time; pgvector index maintained |
| Semantic Search | Basic semantic search exposed in web/Android app via hub API |
| Full Agent | Full Agent Service; RAG Engine; LLM Router; Workflow Engine; Agentic Planner; AI chat UI |
| Advanced AI | Multi-agent workflows; on-device embedding on Android; federated AI |