Data Federation

This document outlines AICO's federated device network architecture, which enables secure synchronization of user data across multiple trusted devices while maintaining privacy and local-first principles.

Federated Device Network

AICO implements a privacy-preserving federated device network that allows the AI companion to seamlessly roam between a user's trusted devices:

graph TD
    A[User's Phone] <-->|P2P Encrypted Sync| B[User's Laptop]
    A <-->|P2P Encrypted Sync| C[User's Tablet]
    B <-->|P2P Encrypted Sync| C

    A -.->|Fallback Only| D[Encrypted Cloud Relay]
    B -.->|Fallback Only| D
    C -.->|Fallback Only| D

    classDef device fill:#663399,stroke:#9370DB,color:#fff
    class A,B,C device
    classDef cloud fill:#9370DB,stroke:#663399,color:#fff
    class D cloud

Core Principles

  1. P2P Encrypted Mesh: Direct device-to-device synchronization without intermediaries
  2. Local Network Priority: Devices on the same network communicate directly
  3. Zero-Knowledge Design: No external parties can access user data
  4. Fallback Cloud Relay: Optional encrypted relay only when direct connection impossible
  5. User Control: Complete user control over trusted devices and sync policies
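The priority order implied by these principles can be sketched as a small path-selection function. This is only an illustration: the function name, its parameters, and the returned labels are hypothetical and do not come from AICO's codebase.

```python
from typing import Optional

def choose_transport(same_network: bool, direct_reachable: bool,
                     relay_enabled: bool) -> Optional[str]:
    """Pick a sync path in the priority order of the principles above."""
    if same_network:
        return "local_p2p"        # devices on the same network talk directly
    if direct_reachable:
        return "direct_p2p"       # direct connection across networks
    if relay_enabled:
        return "encrypted_relay"  # optional fallback; payload stays encrypted
    return None                   # no path available: keep changes local for now

print(choose_transport(same_network=False, direct_reachable=False, relay_enabled=True))
```

Returning `None` when the relay is disabled reflects the "Optional" wording of the fallback principle: the user can decline the relay, in which case changes simply wait for a direct path.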

Key Components

Device Registry

The device registry manages trusted device information and is stored in the primary libSQL database:

-- Example device registry schema
CREATE TABLE device_registry (
    device_id TEXT PRIMARY KEY,
    device_name TEXT NOT NULL,
    device_type TEXT NOT NULL,
    public_key BLOB NOT NULL,
    last_seen TIMESTAMP,
    trust_level INTEGER NOT NULL,
    sync_policy TEXT NOT NULL
);

Key Features:

  - Stored in libSQL for consistency with primary data layer
  - Manages trusted device information and relationships
  - Handles encryption keys and trust relationships
  - Supports device-specific sync policies
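As a minimal sketch of how the schema above might be used, the following example relies on Python's built-in `sqlite3` module as a stand-in for libSQL, which is SQLite-compatible. The `register_device` and `trusted_devices` helpers, the sample values, and the reading of `trust_level` (0 = untrusted) and `sync_policy` are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS device_registry (
    device_id TEXT PRIMARY KEY,
    device_name TEXT NOT NULL,
    device_type TEXT NOT NULL,
    public_key BLOB NOT NULL,
    last_seen TIMESTAMP,
    trust_level INTEGER NOT NULL,
    sync_policy TEXT NOT NULL
);
"""

def register_device(conn, device_id, name, device_type, public_key,
                    trust_level=1, sync_policy="full"):
    """Insert a device row (hypothetical helper; values are illustrative)."""
    conn.execute(
        "INSERT INTO device_registry VALUES (?, ?, ?, ?, ?, ?, ?)",
        (device_id, name, device_type, public_key,
         datetime.now(timezone.utc).isoformat(), trust_level, sync_policy),
    )
    conn.commit()

def trusted_devices(conn, min_trust=1):
    """Return names of devices at or above the given trust level."""
    rows = conn.execute(
        "SELECT device_name FROM device_registry "
        "WHERE trust_level >= ? ORDER BY device_name",
        (min_trust,),
    )
    return [name for (name,) in rows]

conn = sqlite3.connect(":memory:")  # in-memory database for the demo
conn.executescript(SCHEMA)
register_device(conn, "dev-1", "Laptop", "laptop", b"\x01" * 32)
register_device(conn, "dev-2", "Phone", "phone", b"\x02" * 32, trust_level=0)
print(trusted_devices(conn))
```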

Selective Sync

Each data type has its own synchronization policy, chosen according to its importance and data characteristics:

| Data Type            | Sync Priority | Policy                      | Database    |
|----------------------|---------------|-----------------------------|-------------|
| User Profile         | High          | Immediate sync              | libSQL      |
| Conversation History | Medium        | Configurable (Full/Summary) | libSQL      |
| Personality Model    | High          | Full sync                   | libSQL      |
| Vector Embeddings    | Low           | On-demand sync              | ChromaDB    |
| Analytics Data       | Low           | Periodic batch sync         | DuckDB      |
| Media Files          | Low           | Thumbnail only by default   | File system |
| Cache Data           | None          | Local only                  | RocksDB     |

Sync Characteristics:

  - Different sync policies per database and data type
  - Prioritization of critical vs. non-critical data
  - Bandwidth-efficient delta synchronization
  - Configurable sync intervals and triggers
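One way to express the per-type policies is a lookup table that a sync scheduler consults. The sketch below mirrors the data types and policies listed in this section; the `SyncPolicy` class, the key names, and the `sync_queue` function are hypothetical and only illustrate the prioritization idea.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SyncPolicy:
    priority: str   # "high", "medium", "low", or "none"
    policy: str
    database: str

# Mirrors the policies listed in this section; key names are illustrative.
SYNC_POLICIES = {
    "user_profile": SyncPolicy("high", "immediate", "libSQL"),
    "conversation_history": SyncPolicy("medium", "configurable", "libSQL"),
    "personality_model": SyncPolicy("high", "full", "libSQL"),
    "vector_embeddings": SyncPolicy("low", "on_demand", "ChromaDB"),
    "analytics_data": SyncPolicy("low", "periodic_batch", "DuckDB"),
    "media_files": SyncPolicy("low", "thumbnail_only", "file system"),
    "cache_data": SyncPolicy("none", "local_only", "RocksDB"),
}

PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}

def sync_queue(changed_types):
    """Order changed data types by priority, dropping local-only ones."""
    syncable = [t for t in changed_types
                if SYNC_POLICIES[t].priority in PRIORITY_ORDER]
    return sorted(syncable,
                  key=lambda t: PRIORITY_ORDER[SYNC_POLICIES[t].priority])

print(sync_queue(["cache_data", "media_files", "user_profile",
                  "conversation_history"]))
```

Cache data never enters the queue because its priority is "none", which matches the "Local only" policy.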

P2P Encrypted Sync

The federated sync mechanism implements secure peer-to-peer communication:

  1. Device Discovery:
     - Local network: mDNS/Bonjour for same-network devices
     - Remote: DHT (Distributed Hash Table) for internet-based discovery
     - Fallback to encrypted relay when direct connection impossible

  2. Authentication:
     - Mutual device authentication using public key cryptography
     - Trust establishment through user verification
     - Device-specific encryption keys for secure channels

  3. Data Transfer:
     - End-to-end encryption for all synced data
     - Direct device-to-device communication when possible
     - Delta synchronization for bandwidth efficiency
     - Resumable transfers for reliability
     - Fallback to encrypted relay when necessary

  4. Sync Protocol:
     - Merkle tree-based change detection
     - Conflict-free replicated data types (CRDTs) where applicable
     - Atomic transaction boundaries for consistency

Conflict Resolution

AICO resolves conflicts with a strategy chosen per data type:

  1. Last-Writer-Wins: For simple preference and configuration data
     - Timestamp-based resolution for simple conflicts
     - Used for user settings and device preferences

  2. Semantic Merging: For complex structured data
     - Conversation history merging with chronological ordering
     - Personality model updates with weighted averaging
     - Context-aware merging for relationship data

  3. Vector Merging: For embedding and analytical data
     - Vector data merging with deduplication
     - Similarity-based conflict detection
     - Automatic re-embedding for conflicted content

  4. Manual Resolution: For critical conflicts
     - User notification for unresolvable conflicts
     - Conflict queue with resolution interface
     - Audit trail for all conflict resolutions
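The first two strategies can be illustrated with a short sketch. The data shapes here (a `Versioned` record, message dicts with `id` and `ts` fields, trait dictionaries) and the default weight are assumptions chosen for the example, not AICO's actual data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: float
    device_id: str   # tie-breaker so every device picks the same winner

def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Last-writer-wins: keep the value with the newest timestamp."""
    return max(a, b, key=lambda v: (v.timestamp, v.device_id))

def merge_history(local, remote):
    """Union two message lists, deduplicated by id, in chronological order."""
    by_id = {m["id"]: m for m in local}
    for m in remote:
        by_id.setdefault(m["id"], m)
    return sorted(by_id.values(), key=lambda m: (m["ts"], m["id"]))

def weighted_average(local, remote, local_weight=0.5):
    """Blend two trait dictionaries; the weighting scheme is an assumption."""
    return {k: local_weight * local[k] + (1 - local_weight) * remote[k]
            for k in local}

phone_theme = Versioned("dark", 100.0, "phone")
laptop_theme = Versioned("light", 105.0, "laptop")
print(lww_merge(phone_theme, laptop_theme).value)
```

Including `device_id` in the comparison key makes the merge deterministic when two writes carry the same timestamp, so both devices converge on the same value regardless of the order in which they apply the merge.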

Implementation Timeline

The federated device network will be implemented in phases:

  1. MVP: Single device with local data only
  2. Post-MVP: Add federated device network as an enhancement
  3. Future: Advanced conflict resolution and offline operation

Security Considerations

  1. Device Authorization: New devices require explicit user approval
  2. Revocation: Compromised devices can be removed from trust network
  3. Encryption: All synchronized data is encrypted with device-specific keys
  4. Audit Trail: All sync operations are logged for transparency
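The interaction between authorization, revocation, and the audit trail might look like the following in-memory sketch. The `TrustNetwork` class, its method names, and the log entry format are hypothetical; a real implementation would presumably persist this state in the device registry and verify device keys cryptographically.

```python
from datetime import datetime, timezone

class TrustNetwork:
    """In-memory stand-in for the device registry plus an audit log."""

    def __init__(self):
        self._trusted = set()
        self.audit_log = []

    def _log(self, action, device_id):
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "device_id": device_id,
        })

    def request_join(self, device_id, user_approved):
        """Admit a new device only with explicit user approval."""
        if user_approved:
            self._trusted.add(device_id)
        self._log("approved" if user_approved else "rejected", device_id)
        return user_approved

    def revoke(self, device_id):
        """Remove a (possibly compromised) device from the trust network."""
        self._trusted.discard(device_id)
        self._log("revoked", device_id)

    def may_sync(self, device_id):
        """Check trust before a sync and record the decision."""
        allowed = device_id in self._trusted
        self._log("sync_allowed" if allowed else "sync_denied", device_id)
        return allowed

net = TrustNetwork()
net.request_join("tablet", user_approved=True)
net.revoke("tablet")
print(net.may_sync("tablet"))
```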

Conclusion

AICO's federated device network enables a seamless multi-device experience while maintaining the project's core privacy principles. By prioritizing direct P2P communication and implementing zero-knowledge encryption, users maintain complete control over their data while enjoying the convenience of a companion AI that follows them across devices.