Collects BEAM process metrics and serves multiple visualization features from a single data source.
Three tiers of data collection, each with different cost profiles:
Always-on (trivially cheap): Monitors named supervisors via Process.monitor/1. Tracks restart events and recovery times. Cost is one handle_info({:DOWN, ...}) per supervisor crash, which is rare. This powers Resilience-as-UX (#1109).
On-demand polling (activated when subscribers exist): Walks the supervision tree via Supervisor.which_children/1, calls Process.info/2 for each process, and stores snapshots in a circular buffer (last 300 samples = 5 minutes at 1Hz). Activated when a UI panel subscribes, deactivated when all subscribers disconnect. This powers BEAM Observatory (#1081) and Living Architecture (#1098).
Domain state queries (no collection): Downstream features read existing APIs (Agent.Session, etc.) directly. SystemObserver doesn't collect this data; it's listed here for completeness.
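The on-demand tier's tree walk can be sketched roughly as follows. This is a minimal illustration, not SystemObserver's actual code: the module name TreeWalkSketch, the choice of Process.info/2 keys, and the shape of the per-process map are all assumptions.

```elixir
defmodule TreeWalkSketch do
  # Hypothetical set of metrics; the real snapshot fields may differ.
  @info_keys [:memory, :message_queue_len, :reductions]

  @doc "Walks a supervision tree and returns a map of pid => metrics."
  def walk(supervisor) do
    supervisor
    |> Supervisor.which_children()
    |> Enum.reduce(%{}, fn
      {_id, pid, type, modules}, acc when is_pid(pid) ->
        acc = put_info(acc, pid, type, modules)
        # Recurse into nested supervisors so the whole tree is covered.
        if type == :supervisor, do: Map.merge(acc, walk(pid)), else: acc

      # Children that are :restarting or :undefined have no pid to inspect.
      {_id, _not_running, _type, _modules}, acc ->
        acc
    end)
  end

  defp put_info(acc, pid, type, modules) do
    case Process.info(pid, @info_keys) do
      # The process exited between which_children/1 and info/2.
      nil ->
        acc

      info ->
        entry = info |> Map.new() |> Map.merge(%{type: type, modules: modules})
        Map.put(acc, pid, entry)
    end
  end
end
```

For example, `TreeWalkSketch.walk(sup)` on a supervisor with a single Agent child returns a one-entry map keyed by the Agent's pid.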
Supervision placement
Lives as the last child under Minga.Supervisor (top-level). This
means it starts after Foundation, Services, and Runtime are all up,
giving it full visibility into the process tree. With rest_for_one,
a SystemObserver crash restarts only SystemObserver itself (nothing comes
after it), and a Foundation/Services/Runtime crash restarts SystemObserver
too (correct: this re-establishes its monitors).
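The rest_for_one behaviour described above can be tried out with stand-in children. This is a sketch under assumptions: PlacementSketch, its Worker module, and the :sketch_* registered names are hypothetical placeholders for Foundation, Services, Runtime, and SystemObserver.

```elixir
defmodule PlacementSketch do
  defmodule Worker do
    # Stand-in child: an Agent registered under the given name.
    def start_link(name), do: Agent.start_link(fn -> name end, name: name)

    def child_spec(name) do
      %{id: name, start: {__MODULE__, :start_link, [name]}}
    end
  end

  def start do
    children = [
      {Worker, :sketch_foundation},
      {Worker, :sketch_services},
      {Worker, :sketch_runtime},
      # Last child: a crash here restarts only this child.
      {Worker, :sketch_observer}
    ]

    Supervisor.start_link(children, strategy: :rest_for_one)
  end
end
```

Killing `:sketch_foundation` causes every later child, including `:sketch_observer`, to be restarted with a new pid, while killing `:sketch_observer` leaves the pids of the earlier children unchanged.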
Types
@type child_modules() :: [module()] | :dynamic
@type child_type() :: Minga.SystemObserver.ProcessSnapshot.child_type()
@type process_class() :: Minga.SystemObserver.ProcessSnapshot.process_class()
@type process_tree_snapshot() :: %{
  timestamp: integer(),
  processes: %{required(pid()) => Minga.SystemObserver.ProcessSnapshot.t()}
}
A snapshot of process metrics for the entire supervision tree.
@type t() :: %{
  monitors: %{required(reference()) => atom()},
  restart_history: [Minga.SystemObserver.RestartRecord.t()],
  subscribers: MapSet.t(pid()),
  subscriber_monitors: %{required(reference()) => pid()},
  samples: :queue.queue(process_tree_snapshot()),
  sample_count: non_neg_integer(),
  poll_timer: reference() | nil
}
Internal state for the SystemObserver GenServer.
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
@spec classify_process(atom() | nil, child_type()) :: process_class()
Classifies a process by registered name and child type.
@spec classify_process(pid(), atom() | nil, child_type()) :: process_class()
Classifies a process for Observatory rendering.
@spec classify_process(pid(), atom() | nil, child_type(), child_modules()) :: process_class()
Classifies a process for Observatory rendering using supervisor child modules when available.
@spec restart_history(GenServer.server()) :: [Minga.SystemObserver.RestartRecord.t()]
Returns the restart history as a list, most recent first.
The last 50 restart events are retained. This is always available (always-on tier), regardless of subscriber count.
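One way the always-on tier could keep a capped, most-recent-first history is sketched below. The module RestartTrackerSketch and the record fields are assumptions rather than the real RestartRecord shape, and the sketch omits re-monitoring a supervisor after it comes back up as well as recovery-time tracking.

```elixir
defmodule RestartTrackerSketch do
  use GenServer

  # Mirrors the documented retention of 50 events.
  @max_history 50

  def start_link(names), do: GenServer.start_link(__MODULE__, names)
  def history(server), do: GenServer.call(server, :history)

  @impl true
  def init(names) do
    # Monitor every name that is currently registered; skip the rest.
    monitors =
      for name <- names, pid = Process.whereis(name), into: %{} do
        {Process.monitor(pid), name}
      end

    {:ok, %{monitors: monitors, history: []}}
  end

  @impl true
  def handle_call(:history, _from, state), do: {:reply, state.history, state}

  @impl true
  def handle_info({:DOWN, ref, :process, _pid, reason}, state) do
    case Map.pop(state.monitors, ref) do
      {nil, _monitors} ->
        {:noreply, state}

      {name, monitors} ->
        record = %{name: name, reason: reason, at: System.monotonic_time(:millisecond)}
        # Prepend so the newest event is first, then cap the list.
        history = Enum.take([record | state.history], @max_history)
        {:noreply, %{state | monitors: monitors, history: history}}
    end
  end
end
```

The only work done per crash is one map lookup and one list prepend, which is consistent with the "trivially cheap" cost profile described for this tier.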
@spec samples(GenServer.server()) :: [process_tree_snapshot()]
Returns all collected process tree samples as a list, oldest first.
The maximum number of samples retained is 300 (5 minutes at 1Hz). Returns an empty list if polling has not been activated.
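The state type pairs a :queue with a sample_count, which suggests a bounded buffer along the lines of the sketch below. SampleBufferSketch and its function names are hypothetical, and the overridable max argument exists only to make the example easy to exercise.

```elixir
defmodule SampleBufferSketch do
  # Mirrors the documented cap of 300 samples.
  @max_samples 300

  def new, do: {:queue.new(), 0}

  @doc "Appends a sample, dropping the oldest one once the cap is reached."
  def push({queue, count}, sample, max \\ @max_samples) do
    if count < max do
      {:queue.in(sample, queue), count + 1}
    else
      {_oldest, rest} = :queue.out(queue)
      {:queue.in(sample, rest), count}
    end
  end

  @doc "Returns all samples, oldest first."
  def to_list({queue, _count}), do: :queue.to_list(queue)

  @doc "Returns the newest sample, or nil when the buffer is empty."
  def latest({queue, _count}) do
    case :queue.peek_r(queue) do
      {:value, sample} -> sample
      :empty -> nil
    end
  end
end
```

Tracking the count alongside the queue avoids calling :queue.len/1, which is linear in the queue length, on every insert.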
@spec snapshot(GenServer.server()) :: process_tree_snapshot() | nil
Returns the latest process tree snapshot, or nil if no samples have
been collected yet.
This is a one-shot query. For continuous monitoring, subscribe and read
samples/0 periodically.
@spec start_link(keyword()) :: GenServer.on_start()
Starts the SystemObserver GenServer.
@spec subscribe(GenServer.server()) :: :ok
Subscribes the calling process to process tree snapshots.
While at least one subscriber exists, SystemObserver polls the process
tree at 1Hz and stores snapshots. The subscriber receives no messages
from SystemObserver directly; use snapshot/0 or samples/0 to read
the collected data.
The subscriber is automatically unsubscribed when it exits.
@spec unsubscribe(GenServer.server()) :: :ok
Unsubscribes the calling process from process tree snapshots.
If this was the last subscriber, polling stops.
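The subscribe/unsubscribe lifecycle (poll only while subscribers exist, drop a subscriber when it exits) can be illustrated with a small stand-alone GenServer. GatedPollerSketch and polling?/1 are hypothetical names, and the :poll handler is a placeholder for the real snapshot collection.

```elixir
defmodule GatedPollerSketch do
  use GenServer

  @interval_ms 1_000

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)
  def subscribe(server), do: GenServer.call(server, {:subscribe, self()})
  def unsubscribe(server), do: GenServer.call(server, {:unsubscribe, self()})
  def polling?(server), do: GenServer.call(server, :polling?)

  @impl true
  def init(:ok), do: {:ok, %{subscribers: %{}, timer: nil}}

  @impl true
  def handle_call({:subscribe, pid}, _from, state) do
    # Monitor each subscriber once so its exit unsubscribes it automatically.
    subscribers = Map.put_new_lazy(state.subscribers, pid, fn -> Process.monitor(pid) end)
    {:reply, :ok, ensure_timer(%{state | subscribers: subscribers})}
  end

  def handle_call({:unsubscribe, pid}, _from, state) do
    {ref, subscribers} = Map.pop(state.subscribers, pid)
    if ref, do: Process.demonitor(ref, [:flush])
    {:reply, :ok, maybe_stop(%{state | subscribers: subscribers})}
  end

  def handle_call(:polling?, _from, state), do: {:reply, state.timer != nil, state}

  @impl true
  def handle_info(:poll, state) do
    # Placeholder: a real implementation would collect a snapshot here.
    # Cancel any pending timer first so a stale :poll cannot double-arm.
    if state.timer, do: Process.cancel_timer(state.timer)
    {:noreply, ensure_timer(%{state | timer: nil})}
  end

  def handle_info({:DOWN, _ref, :process, pid, _reason}, state) do
    {:noreply, maybe_stop(%{state | subscribers: Map.delete(state.subscribers, pid)})}
  end

  # Arms the timer only when idle and at least one subscriber exists.
  defp ensure_timer(%{timer: nil, subscribers: subs} = state) when map_size(subs) > 0 do
    %{state | timer: Process.send_after(self(), :poll, @interval_ms)}
  end

  defp ensure_timer(state), do: state

  # Cancels the timer once the last subscriber is gone.
  defp maybe_stop(%{timer: timer, subscribers: subs} = state)
       when map_size(subs) == 0 and timer != nil do
    Process.cancel_timer(timer)
    %{state | timer: nil}
  end

  defp maybe_stop(state), do: state
end
```

In this sketch, `polling?/1` returns true right after the first `subscribe/1` and false again after the last `unsubscribe/1` or after the last subscriber process exits.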