Skip to content

Agency Metrics & State Visibility

This document lists key metrics, states, and KPIs of the Agency system.

  • User-Facing Metrics – values that can be surfaced directly (or with light explanation) to end users, including non-technical users.
  • Engineering & Debug Metrics – values primarily for developers, operators, and evaluators.

Each table uses these columns:

  • Name – Metric/state identifier.
  • Type – Counter, gauge, enum/state, list, time-series, etc.
  • Scope – Per-user, per-agent (global), per-session, per-goal, etc.
  • Purpose – Why we track it.

1. User-Facing Metrics

1.1 Goals, Intentions & Work in Progress

Name Type Scope Purpose
primary_focus_intention single goal summary per-user Represent the intention AICO currently treats as primary focus (top-scored by Goal Arbiter, with temporal smoothing).
active_intentions list of goal summaries per-user Show what AICO is currently working on (top goals/intentions).
open_goals_total gauge (count) per-user Indicate how many open projects/threads exist.
hobby_goals_active list of hobby summaries per-user Make AICO’s own hobbies and self-projects visible.

1.2 Curiosity, Intrinsic Motivation & Hobbies

Name Type Scope Purpose
curiosity_level scalar (e.g. low/medium/high) per-user Communicate how strong curiosity-driven behaviour is right now.
curiosity_opportunities short list of themes per-user Show what AICO is currently curious about (1–3 human-readable items).
hobby_activity_time coarse duration per hobby per-user, per-hobby Indicate recent engagement with each hobby (e.g., "spent time this week").
hobby_state enum + summary per-hobby Present the status of each hobby (active/paused/completed) with a short description.

1.3 Memory, World Model & Open Loops

Name Type Scope Purpose
open_loops_count gauge per-user Let users know how many unresolved threads AICO is tracking.
last_consolidation_time timestamp per-user Indicate when the last sleep-like consolidation ran ("last night", "2h ago").
life_area_coverage coarse breakdown (e.g. low/medium/high per LifeArea) per-user Give the user a simple view of how well AICO understands different areas of their life (Health, Work, Relationships, etc.) based on World Model facts.
known_conflicts_present boolean + short summary per-user Indicate whether the World Model currently has unresolved contradictions in important domains (e.g. job, relationship, routine) and, optionally, offer to clarify them.

1.4 Emotion, Relationship & Style

Name Type Scope Purpose
emotional_state_current enum + intensity per-user Provide a simple description of AICO’s current emotional stance, when appropriate.
relationship_strength scalar (coarsened) per-user Optionally summarise how close/stable AICO perceives the relationship (only if UX-appropriate).

1.5 Agency Style, Initiative & Safety

Name Type Scope Purpose
initiative_ratio scalar (e.g. "rarely / sometimes / often") per-user Communicate how often AICO initiates interactions vs just responding.
agency_initiated_messages count (aggregated over a period) per-user Provide history of how many conversations AICO started recently.
safety_profile config snapshot (coarse mode) per-user, per-deployment Show current safety/value mode (e.g., cautious / balanced / experimental).
consent_required_actions list of pending items per-user Surface actions waiting on explicit user approval/consent.

1.6 Self-Reflection, Change & Lifecycle

Name Type Scope Purpose
reflection_runs yes/no + last timestamp per-user Indicate that AICO has recently reflected on behaviour ("I recently reflected on our week").
behaviour_adjustments short list of changes per-user Summarise what AICO is trying to do differently (high-level strategy changes).
lifecycle_phase enum (ACTIVE / FOCUSED_WORK / IDLE_LIGHT / SLEEP_LIKE / MAINTENANCE) per-user, per-agent Expose the current LifecycleState (from the Lifecycle component) and drive room/posture in the 3D flat.
embodiment_state enum (room, posture, activity label) per-user Map internal state to visual representation in the avatar and flat.

2. Engineering & Debug Metrics

The following metrics are primarily intended for developers, operators, and evaluation dashboards. They provide deeper visibility into internal dynamics and performance.

2.1 Goals, Planning & Execution

Name Type Scope Purpose
goals_by_origin breakdown (user / curiosity / hobby / maintenance) per-user Analyse balance of goal sources.
goal_lifecycle_events event log per-goal Debug why goals were created/paused/completed/dropped.
plans_active list of plan IDs + goal IDs per-user Inspect which plans are currently executing.
plan_depth gauge (avg steps per plan) per-agent, time-series Monitor complexity of planning over time.
plan_execution_success_rate ratio (0-1) per-user, per-goal type Evaluate how often planned steps execute successfully.
replanning_events count / log per-user Detect instability or frequent context shifts.

2.2 Curiosity, World Model & AMS

Name Type Scope Purpose
curiosity_signals_total count/time-series per-user, per-agent Track overall rate of CuriositySignals emitted by detectors.
curiosity_signals_by_type breakdown by curiosity_type per-user Analyse which kinds of curiosity (knowledge_gap, novelty, self_performance, hobby_play) are most active.
curiosity_signals_promoted count/time-series per-user Count signals that passed all gates and became CuriositySignalEvents.
curiosity_signals_deferred_or_rejected breakdown (values/ethics / emotion / resources / low_score) per-user Understand why curiosity opportunities did not become events or goals.
curiosity_goals_created count/time-series per-user Track frequency of intrinsically motivated goals created from CuriositySignalEvents.
curiosity_goals_active gauge (count) per-user Monitor how many curiosity- or agent-self-origin goals are currently active.
curiosity_goal_outcomes breakdown (completed / user_rejected / auto_dropped) per-user Evaluate how curiosity-driven goals fare and whether users accept or dismiss them.
world_model_nodes gauge (count) per-agent Monitor size/growth of the world model.
world_model_edges gauge (count) per-agent Track relational complexity of the world model.
world_model_facts_by_life_area breakdown (fact counts per LifeArea) per-user Inspect how many WorldStateFacts exist per LifeArea to spot over- and under-represented domains.
world_model_conflicts_active gauge (count) per-user Monitor how many conflicting WorldStateFacts are currently unresolved.
hypotheses_open gauge (count) per-user Track how many open World Model hypotheses exist about the user.
hypothesis_lifecycle_events event log per-user, per-hypothesis Debug how hypotheses move between open/confirmed/rejected and what evidence drove the change.
episodic_writes_rate rate (events/time) per-agent Monitor memory write load.
semantic_summaries_created count/time-series per-user Track consolidation of long-term summaries.

2.3 Emotion, Personality & Social

Name Type Scope Purpose
emotion_transitions event log per-user Debug emotional dynamics around key events.
relationship_stability scalar per-user Capture volatility of relationship state.

2.4 Arbiter, Values/Ethics & Safety

Name Type Scope Purpose
arbiter_decisions_log event log (candidates + scores + chosen) per-user Debug why certain goals were selected or rejected.
goal_source_mix_over_time time-series breakdown per-user Analyse balance of user vs agent-self vs maintenance goals over time.
actions_blocked_by_policy count/time-series per-user, per-agent Monitor impact of value/ethics policies.

2.5 Self-Reflection, Scheduler, Resources, Lifecycle & Embodiment

Name Type Scope Purpose
lessons_generated count/time-series + list per-user Inspect how many lessons AICO is extracting and their content.
scheduled_agency_tasks list + states per-agent, per-user See what background work is queued and its status.
agency_resource_usage gauge (CPU/mem/battery share) per-agent Monitor resource cost of agency-related work.
lifecycle_state_time_distribution breakdown (time spent per LifecycleState) per-agent, time-series Analyse how much time AICO spends in ACTIVE / FOCUSED_WORK / IDLE_LIGHT / SLEEP_LIKE / MAINTENANCE to tune daily rhythm and background work policies.
lifecycle_transitions event log per-agent Debug when and why LifecycleState changed (including reason such as quiet_hours, manual_override, low_battery).
tasks_run_vs_deferred_by_lifecycle breakdown (counts per state and task class) per-agent Inspect how often tasks (user_facing, background_light, background_heavy, maintenance) run or are deferred due to LifecycleState flags, to validate Scheduler/Lifecycle integration.
embodiment_state_changes event log per-user Debug mapping between agency state transitions and visual updates.

2.6 Conversation & Embodiment

Name Type Scope Purpose
agency_initiated_messages count/time-series per-user Track how often AICO starts interactions.
embodiment_state enum (room, posture, activity) per-user Map internal state to visual representation.
embodiment_state_changes event log per-user Debug mapping between agency state and visuals.