Telemetry Module¶
The telemetry module provides append-only recording and read-only aggregation
of inference metrics. Every engine call records timing, token counts, energy
usage, and cost to SQLite via the event bus. The TelemetryAggregator
provides per-model and per-engine statistics with time-range filtering.
TelemetryStore¶
TelemetryStore
¶
Append-only SQLite store for inference telemetry records.
Source code in src/openjarvis/telemetry/store.py
Functions¶
record
¶
record(rec: TelemetryRecord) -> None
Persist a single telemetry record.
Source code in src/openjarvis/telemetry/store.py
TelemetryAggregator¶
TelemetryAggregator
¶
ModelStats¶
ModelStats
dataclass
¶
ModelStats(model_id: str = '', call_count: int = 0, total_tokens: int = 0, prompt_tokens: int = 0, completion_tokens: int = 0, total_latency: float = 0.0, avg_latency: float = 0.0, total_cost: float = 0.0, avg_ttft: float = 0.0, total_energy_joules: float = 0.0, avg_gpu_utilization_pct: float = 0.0, avg_throughput_tok_per_sec: float = 0.0)
Aggregated statistics for a single model.
EngineStats¶
EngineStats
dataclass
¶
EngineStats(engine: str = '', call_count: int = 0, total_tokens: int = 0, total_latency: float = 0.0, avg_latency: float = 0.0, total_cost: float = 0.0, avg_ttft: float = 0.0, total_energy_joules: float = 0.0, avg_gpu_utilization_pct: float = 0.0, avg_throughput_tok_per_sec: float = 0.0)
Aggregated statistics for a single engine backend.
AggregatedStats¶
AggregatedStats
dataclass
¶
AggregatedStats(total_calls: int = 0, total_tokens: int = 0, total_cost: float = 0.0, total_latency: float = 0.0, per_model: List[ModelStats] = list(), per_engine: List[EngineStats] = list())
Top-level summary combining per-model and per-engine stats.
Instrumented Wrapper¶
instrumented_generate¶
instrumented_generate
¶
instrumented_generate(engine: InferenceEngine, messages: Sequence[Message], *, model: str, bus: EventBus, temperature: float = 0.7, max_tokens: int = 1024, **kwargs: Any) -> Dict[str, Any]
Call engine.generate() and publish telemetry events on bus.
Returns the raw result dict from the engine.