stormlog

Stormlog - A comprehensive memory profiling tool.

class stormlog.GPUMemoryProfiler(device=None, track_tensors=False, track_cpu_memory=True, collect_stack_traces=False, max_snapshots=10000)[source]

Bases: object

Comprehensive GPU memory profiler for PyTorch operations.

Parameters:

device (str | int | torch.device | None)
track_tensors (bool)
track_cpu_memory (bool)
collect_stack_traces (bool)
max_snapshots (int)

profile_function(func, *args, **kwargs)[source]

Profile a single function call.

Parameters:

func (Callable[[...], Any]) – Function to profile
*args (Any) – Arguments to pass to function
**kwargs (Any) – Keyword arguments to pass to function

Returns:

ProfileResult with profiling information

Return type:

ProfileResult

profile_context(name='context')[source]

Context manager for profiling a block of code.

Parameters: name (str) – Name for the profiled context
Yields: None. The completed ProfileResult is appended to results after exit.
Return type: Any
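The behavior described above (nothing is yielded, and the result is recorded only after the block exits) can be sketched with a stand-in class. `SketchProfiler` and its dict-based result are hypothetical and use only the standard library; this is not Stormlog's implementation, and it records wall time rather than GPU memory.

```python
import time
from contextlib import contextmanager


class SketchProfiler:
    """Hypothetical stand-in that mimics the documented yield/append behavior."""

    def __init__(self):
        self.results = []

    @contextmanager
    def profile_context(self, name="context"):
        start = time.perf_counter()
        try:
            yield  # yields None, matching the documented behavior
        finally:
            # The result is appended only once the block has finished.
            elapsed = time.perf_counter() - start
            self.results.append({"function_name": name, "execution_time": elapsed})


profiler = SketchProfiler()
with profiler.profile_context("warmup") as handle:
    inside_count = len(profiler.results)  # nothing recorded yet
    total = sum(range(1000))
```

Because the append happens in the `finally` clause, a result is recorded even when the profiled block raises.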

start_monitoring(interval=0.1)[source]

Start continuous memory monitoring.

Parameters: interval (float) – Monitoring interval in seconds
Return type: None

stop_monitoring()[source]

Stop continuous memory monitoring.

Return type: None
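A start/stop pair like the one above is commonly built on a background thread that samples at a fixed interval. The sketch below shows that pattern under stated assumptions: `SketchMonitor` and the `read_memory` callable are hypothetical names, and nothing here is taken from Stormlog's source.

```python
import threading
import time


class SketchMonitor:
    """Hypothetical periodic sampler; not Stormlog's implementation."""

    def __init__(self, read_memory):
        self._read_memory = read_memory  # callable returning a byte count
        self._stop = threading.Event()
        self._thread = None
        self.samples = []

    def start_monitoring(self, interval=0.1):
        if self._thread is not None:
            return  # already running
        self._stop.clear()

        def loop():
            while not self._stop.is_set():
                self.samples.append((time.time(), self._read_memory()))
                # Event.wait sleeps for `interval` but wakes early on stop.
                self._stop.wait(interval)

        self._thread = threading.Thread(target=loop, daemon=True)
        self._thread.start()

    def stop_monitoring(self):
        if self._thread is None:
            return
        self._stop.set()
        self._thread.join()
        self._thread = None


monitor = SketchMonitor(read_memory=lambda: 1024)
monitor.start_monitoring(interval=0.01)
time.sleep(0.05)
monitor.stop_monitoring()
```

Waiting on the event instead of calling `time.sleep` lets `stop_monitoring` return promptly even when the interval is long.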

get_summary()[source]

Get a summary of all profiling results.

Return type: Dict[str, Any]

clear_results()[source]

Clear all profiling results and reset state.

Return type: None

class stormlog.MemorySnapshot(timestamp, allocated_memory, reserved_memory, max_memory_allocated, max_memory_reserved, active_memory, inactive_memory, cpu_memory, device_id=0, operation=None, stack_trace=None)[source]

Bases: object

Represents a memory snapshot at a specific point in time.

Parameters:

timestamp (float)
allocated_memory (int)
reserved_memory (int)
max_memory_allocated (int)
max_memory_reserved (int)
active_memory (int)
inactive_memory (int)
cpu_memory (int)
device_id (int)
operation (str | None)
stack_trace (str | None)

timestamp: float

allocated_memory: int

reserved_memory: int

max_memory_allocated: int

max_memory_reserved: int

active_memory: int

inactive_memory: int

cpu_memory: int

device_id: int = 0

operation: str | None = None

stack_trace: str | None = None

to_dict()[source]

Convert snapshot to dictionary.

Return type: Dict[str, Any]
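As a rough illustration of the field layout above, the following stand-in dataclass mirrors the documented attributes and defaults. `SketchSnapshot` is a hypothetical name, and using `dataclasses.asdict` for `to_dict` is an assumption about how such a conversion could work, not a statement about Stormlog's code.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class SketchSnapshot:
    """Hypothetical mirror of the documented MemorySnapshot fields."""

    timestamp: float
    allocated_memory: int
    reserved_memory: int
    max_memory_allocated: int
    max_memory_reserved: int
    active_memory: int
    inactive_memory: int
    cpu_memory: int
    device_id: int = 0
    operation: Optional[str] = None
    stack_trace: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        # asdict converts every field into a plain dictionary entry.
        return asdict(self)


snap = SketchSnapshot(
    timestamp=0.0, allocated_memory=512, reserved_memory=1024,
    max_memory_allocated=512, max_memory_reserved=1024,
    active_memory=512, inactive_memory=0, cpu_memory=2048,
)
snap_dict = snap.to_dict()
```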

class stormlog.ProfileResult(function_name, execution_time, memory_before, memory_after, memory_peak, memory_allocated, memory_freed, tensors_created, tensors_deleted, call_count=1)[source]

Bases: object

Results from profiling a function or operation.

Parameters:

function_name (str)
execution_time (float)
memory_before (MemorySnapshot)
memory_after (MemorySnapshot)
memory_peak (MemorySnapshot)
memory_allocated (int)
memory_freed (int)
tensors_created (int)
tensors_deleted (int)
call_count (int)

function_name: str

execution_time: float

memory_before: MemorySnapshot

memory_after: MemorySnapshot

memory_peak: MemorySnapshot

memory_allocated: int

memory_freed: int

tensors_created: int

tensors_deleted: int

call_count: int = 1

memory_diff()[source]

Calculate memory difference between before and after.

Return type: int

peak_memory_usage()[source]

Get peak memory usage during execution.

Return type: int

to_dict()[source]

Convert result to dictionary.

Return type: Dict[str, Any]
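The reference does not say which snapshot fields memory_diff() and peak_memory_usage() read. The sketch below assumes both use allocated_memory; `TinySnapshot` and `SketchResult` are hypothetical stand-ins, so treat the arithmetic as one plausible reading rather than the library's definition.

```python
from dataclasses import dataclass


@dataclass
class TinySnapshot:
    allocated_memory: int  # bytes; the only field this sketch needs


@dataclass
class SketchResult:
    memory_before: TinySnapshot
    memory_after: TinySnapshot
    memory_peak: TinySnapshot

    def memory_diff(self) -> int:
        # Assumed definition: net change in allocated bytes.
        return self.memory_after.allocated_memory - self.memory_before.allocated_memory

    def peak_memory_usage(self) -> int:
        # Assumed definition: allocated bytes at the peak snapshot.
        return self.memory_peak.allocated_memory


result = SketchResult(TinySnapshot(100), TinySnapshot(160), TinySnapshot(400))
```

Under these assumptions a small net difference can coexist with a much larger peak, which is why the two values are reported separately.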

stormlog.profile_context(name='context', device=None, profiler=None)[source]

Context manager for profiling a block of code.

Parameters:

name (str) – Name for the profiled context
device (str | int | torch.device | None) – GPU device to use for profiling
profiler (GPUMemoryProfiler | None) – Custom profiler instance to use

Yields:

GPUMemoryProfiler instance used to profile the block

Return type:

Iterator[GPUMemoryProfiler]

Example

    with profile_context("model_forward") as prof:
        output = model(input)

stormlog.profile_function(func=None, *, name=None, device=None, profiler=None)[source]

Decorator to profile a function’s GPU memory usage.

Can be used as @profile_function or @profile_function(name="custom_name").

Parameters:

func (F | None) – Function to profile (when used as @profile_function)
name (str | None) – Custom name for the profiled function
device (str | int | torch.device | None) – GPU device to use for profiling
profiler (GPUMemoryProfiler | None) – Custom profiler instance to use

Returns:

Decorated function or ProfileResult if called directly

Return type:

Callable[[F], F] | F
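A decorator that works both bare and with keyword arguments usually relies on an optional first positional parameter, which matches the `func=None, *` shape of the signature above. The sketch below shows that pattern with a hypothetical `sketch_profile_function` that records wall time into a hypothetical `call_log` list; it is not Stormlog's decorator.

```python
import functools
import time

call_log = []  # hypothetical sink standing in for a profiler's results list


def sketch_profile_function(func=None, *, name=None):
    """Hypothetical dual-form decorator; records wall time only."""

    def decorate(target):
        label = name or target.__name__

        @functools.wraps(target)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return target(*args, **kwargs)
            finally:
                call_log.append((label, time.perf_counter() - start))

        return wrapper

    # Bare form: Python passes the function directly as `func`.
    if func is not None:
        return decorate(func)
    # Parenthesized form: return the decorator to be applied next.
    return decorate


@sketch_profile_function
def add(a, b):
    return a + b


@sketch_profile_function(name="custom_name")
def mul(a, b):
    return a * b


add_result = add(2, 3)
mul_result = mul(2, 3)
```

Making the options keyword-only keeps the two call forms unambiguous, since a positional argument can only ever be the decorated function.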

class stormlog.MemoryVisualizer(profiler=None)[source]

Bases: object

Comprehensive visualization tool for memory profiling data.

Parameters: profiler (GPUMemoryProfiler | None)

plot_memory_timeline(results=None, snapshots=None, save_path=None, interactive=True)[source]

Plot memory usage over time.

Parameters:

results (List[ProfileResult] | None) – List of ProfileResults to plot
snapshots (List[MemorySnapshot] | None) – List of MemorySnapshots to plot
save_path (str | None) – Path to save the plot
interactive (bool) – Whether to create interactive plot

Returns:

Matplotlib or Plotly figure

Return type:

matplotlib.pyplot.Figure | plotly.graph_objects.Figure

plot_cross_rank_timeline(events, save_path=None)[source]

Plot a merged, aligned cross-rank device-memory timeline.

Parameters:

events (List[TelemetryEventV2])
save_path (str | None)

Return type:

matplotlib.pyplot.Figure

plot_function_comparison(results=None, metric='memory_allocated', save_path=None, interactive=True)[source]

Compare memory usage across different functions.

Parameters:

results (List[ProfileResult] | None) – List of ProfileResults to compare
metric (str) – Metric to compare ('memory_allocated', 'execution_time', 'peak_memory')
save_path (str | None) – Path to save the plot
interactive (bool) – Whether to create interactive plot

Returns:

Matplotlib or Plotly figure

Return type:

matplotlib.pyplot.Figure | plotly.graph_objects.Figure

plot_memory_heatmap(results=None, save_path=None)[source]

Create a heatmap showing memory usage patterns.

Parameters:

results (List[ProfileResult] | None) – List of ProfileResults to analyze
save_path (str | None) – Path to save the plot

Returns:

Matplotlib figure

Return type:

matplotlib.pyplot.Figure

create_dashboard(results=None, snapshots=None, save_path=None)[source]

Create a comprehensive dashboard with multiple visualizations.

Parameters:

results (List[ProfileResult] | None) – List of ProfileResults
snapshots (List[MemorySnapshot] | None) – List of MemorySnapshots
save_path (str | None) – Path to save the dashboard

Returns:

Plotly figure with subplots

Return type:

plotly.graph_objects.Figure

export_data(results=None, snapshots=None, format='csv', save_path='memory_profile_data')[source]

Export profiling data to various formats.

Parameters:

results (List[ProfileResult] | None) – List of ProfileResults to export
snapshots (List[MemorySnapshot] | None) – List of MemorySnapshots to export
format (str) – Export format ('csv', 'json')
save_path (str) – Base path for saved files

Returns:

Path to saved file

Return type:

str

show(fig)[source]

Display a figure.

Parameters: fig (matplotlib.pyplot.Figure | plotly.graph_objects.Figure)
Return type: None

class stormlog.MemoryAnalyzer(profiler=None, collective_sensitivity='medium', collective_threshold_overrides=None)[source]

Bases: object

Advanced analyzer for memory profiling data.

Parameters:

profiler (GPUMemoryProfiler | None)
collective_sensitivity (str)
collective_threshold_overrides (Mapping[str, Any] | None)

analyze_memory_patterns(results=None)[source]

Detect memory usage patterns in profiling data.

Parameters: results (List[ProfileResult] | None) – List of ProfileResults to analyze
Returns: List of detected patterns
Return type: List[MemoryPattern]

generate_performance_insights(results=None)[source]

Generate performance insights from profiling data.

Parameters: results (List[ProfileResult] | None) – List of ProfileResults to analyze
Returns: List of performance insights
Return type: List[PerformanceInsight]

analyze_memory_gaps(events, *, phase_resolver=None)[source]

Classify allocator-vs-device hidden memory gaps over time.

Parameters:

events (List[TelemetryEventV2]) – Chronologically ordered telemetry samples.
phase_resolver (PhaseReplayIndex | None)

Returns:

Prioritized list of gap findings (severity desc, confidence desc).

Return type:

List[GapFinding]
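The ordering stated above (severity descending, then confidence descending) can be expressed as a single sort with a composite key. In this sketch the severity labels and their ranking in `SEVERITY_RANK` are assumptions, and plain dictionaries stand in for GapFinding objects.

```python
# Assumed severity labels; the reference only says that severity is a string.
SEVERITY_RANK = {"critical": 3, "high": 2, "medium": 1, "low": 0}

findings = [
    {"classification": "a", "severity": "medium", "confidence": 0.9},
    {"classification": "b", "severity": "high", "confidence": 0.4},
    {"classification": "c", "severity": "high", "confidence": 0.8},
]

# Sort by (severity rank, confidence), highest first; unknown labels sort last.
prioritized = sorted(
    findings,
    key=lambda f: (SEVERITY_RANK.get(f["severity"], -1), f["confidence"]),
    reverse=True,
)
order = [f["classification"] for f in prioritized]
```

Confidence only breaks ties within a severity level, so a high-confidence medium finding still sorts below a low-confidence high one.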

analyze_cross_rank_timeline(events, *, phase_resolver=None)[source]

Merge rank timelines and detect the earliest cluster-wide spike cause.

Parameters:

events (List[TelemetryEventV2])
phase_resolver (PhaseReplayIndex | None)

Return type:

Dict[str, Any]

analyze_collective_attribution(events, *, phase_resolver=None)[source]

Attribute hidden-memory spikes to collective communication phases.

Parameters:

events (List[TelemetryEventV2])
phase_resolver (PhaseReplayIndex | None)

Return type:

List[CollectiveAttributionResult]

generate_optimization_report(results=None, events=None)[source]

Generate a comprehensive optimization report.

Parameters:

results (List[ProfileResult] | None) – List of ProfileResults to analyze
events (List[TelemetryEventV2] | None) – Optional telemetry event series for gap analysis. When provided, the report includes a gap_analysis section.

Returns:

Comprehensive optimization report

Return type:

Dict[str, Any]

class stormlog.GapFinding(classification, severity, confidence, evidence, description, remediation, evidence_timestamp_ns=None, phase_attribution=None)[source]

Bases: object

A classified finding from hidden-memory gap analysis.

Parameters:

classification (str)
severity (str)
confidence (float)
evidence (dict[str, Any])
description (str)
remediation (List[str])
evidence_timestamp_ns (int | None)
phase_attribution (PhaseAttribution | None)

classification: str

severity: str

confidence: float

evidence: dict[str, Any]

description: str

remediation: List[str]

evidence_timestamp_ns: int | None = None

phase_attribution: PhaseAttribution | None = None

class stormlog.MemoryTracker(device=None, sampling_interval=0.1, max_events=10000, enable_alerts=True, enable_oom_flight_recorder=False, oom_dump_dir='oom_dumps', oom_buffer_size=None, oom_max_dumps=5, oom_max_total_mb=256, job_id=None, rank=None, local_rank=None, world_size=None, enable_native_cuda_history=False, native_history_max_entries=100000, telemetry_sink_config=None)[source]

Bases: object

Real-time memory tracker with alerts and monitoring.

Parameters:

device (str | int | torch.device | None)
sampling_interval (float)
max_events (int)
enable_alerts (bool)
enable_oom_flight_recorder (bool)
oom_dump_dir (str)
oom_buffer_size (int | None)
oom_max_dumps (int)
oom_max_total_mb (int)
job_id (str | None)
rank (int | None)
local_rank (int | None)
world_size (int | None)
enable_native_cuda_history (bool)
native_history_max_entries (int)
telemetry_sink_config (TelemetrySinkConfig | None)

get_session_summary()[source]

Return the current or most recent tracking session summary.

Return type: SessionSummary | None

property oom_buffer_size: int: Resolved OOM ring-buffer size.

start_tracking()[source]

Start real-time memory tracking.

Return type: None

stop_tracking()[source]

Stop real-time memory tracking.

Return type: None

enter_phase(name, *, metadata=None)[source]

Enter one structured workload phase while tracking is active.

Parameters:

name (str)
metadata (Dict[str, Any] | None)

Return type:

PhaseHandle

phase(name, *, metadata=None)[source]

Context manager that emits structured phase enter and exit records.

Parameters:

name (str)
metadata (Dict[str, Any] | None)

Return type:

Any

handle_exception(exc, context=None, metadata=None)[source]

Capture OOM diagnostics for recognized OOM exceptions.

Parameters:

exc (BaseException)
context (str | None)
metadata (Dict[str, Any] | None)

Return type:

str | None

capture_oom(context='runtime', metadata=None)[source]

Capture OOM diagnostic bundle if a tracked block raises OOM.

Parameters:

context (str)
metadata (Dict[str, Any] | None)

Return type:

Any

add_alert_callback(callback)[source]

Add a callback function to be called on alerts.

Parameters: callback (Callable[[TrackingEvent], None])
Return type: None

remove_alert_callback(callback)[source]

Remove an alert callback.

Parameters: callback (Callable[[TrackingEvent], None])
Return type: None

get_events(event_type=None, last_n=None, since=None)[source]

Get tracking events with optional filtering.

Parameters:

event_type (str | None) – Filter by event type
last_n (int | None) – Get last N events
since (float | None) – Get events since timestamp

Returns:

List of filtered events

Return type:

List[TrackingEvent]
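The three optional filters above can be combined as successive passes over the event list. The sketch below uses dictionaries in place of TrackingEvent objects, hypothetical event type names, and an assumed order of application (type, then timestamp, then last N); the reference does not specify how the filters interact.

```python
from typing import List, Optional


def filter_events(events: List[dict], event_type: Optional[str] = None,
                  last_n: Optional[int] = None,
                  since: Optional[float] = None) -> List[dict]:
    """Hypothetical filter; events are assumed to be in chronological order."""
    selected = events
    if event_type is not None:
        selected = [e for e in selected if e["event_type"] == event_type]
    if since is not None:
        selected = [e for e in selected if e["timestamp"] >= since]
    if last_n is not None:
        # Applied last so it keeps the newest N of the already-filtered events.
        selected = selected[-last_n:] if last_n > 0 else []
    return list(selected)


events = [
    {"timestamp": 1.0, "event_type": "allocation"},
    {"timestamp": 2.0, "event_type": "warning"},
    {"timestamp": 3.0, "event_type": "allocation"},
    {"timestamp": 4.0, "event_type": "allocation"},
]
recent_allocations = filter_events(events, event_type="allocation", last_n=2, since=2.0)
recent_timestamps = [e["timestamp"] for e in recent_allocations]
```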

get_memory_timeline(interval=1.0)[source]

Get memory usage timeline with specified interval.

Parameters: interval (float) – Time interval in seconds for aggregation
Returns: Dictionary with timeline data
Return type: Dict[str, List]
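Aggregating samples by a time interval typically means grouping them into fixed-width buckets and reducing each bucket to one value. The sketch below averages per bucket; the function name, the `timestamps` and `allocated` keys, and the choice of a mean are all assumptions, since the reference only states that a dictionary of lists is returned.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def bucket_timeline(samples: List[Tuple[float, int]],
                    interval: float = 1.0) -> Dict[str, List[float]]:
    """Hypothetical aggregation of (timestamp, bytes) samples into buckets."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    if not samples:
        return {"timestamps": [], "allocated": []}
    origin = min(ts for ts, _ in samples)
    buckets = defaultdict(list)
    for ts, value in samples:
        # Bucket index counts whole intervals elapsed since the first sample.
        buckets[int((ts - origin) // interval)].append(value)
    timestamps, averages = [], []
    for index in sorted(buckets):
        timestamps.append(origin + index * interval)
        averages.append(sum(buckets[index]) / len(buckets[index]))
    return {"timestamps": timestamps, "allocated": averages}


timeline = bucket_timeline([(10.0, 100), (10.4, 300), (11.2, 500)], interval=1.0)
```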

get_statistics()[source]

Get comprehensive tracking statistics.

Return type: Dict[str, Any]

export_events(filename, format='csv')[source]

Export tracking events to file.

Parameters:

filename (str) – Output filename
format (str) – Export format ('csv' or 'json')

Return type:

None

clear_events()[source]

Clear all tracking events.

Return type: None

set_threshold(threshold_name, value)[source]

Set alert threshold.

Parameters:

threshold_name (str) – Name of the threshold
value (int | float) – Threshold value

Return type:

None

get_alerts(last_n=None)[source]

Get all alert events (warnings, critical, errors).

Parameters: last_n (int | None)
Return type: List[TrackingEvent]

class stormlog.OOMFlightRecorder(config)[source]

Bases: object

Bounded recorder that writes dump bundles on OOM.

Parameters: config (OOMFlightRecorderConfig)

record_event(event)[source]

Append one event payload to the in-memory ring buffer.

Parameters: event (dict[str, Any])
Return type: None

snapshot_events()[source]

Return buffered events in chronological order.

Return type: list[dict[str, Any]]

clear()[source]

Discard buffered events for the next session/run.

Return type: None
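The three buffer methods above describe a bounded ring buffer: new events push out the oldest once capacity is reached. A minimal sketch of that data structure, using `collections.deque` with `maxlen`, follows. `SketchRingBuffer` is a hypothetical stand-in and says nothing about how the real recorder stores events.

```python
from collections import deque
from typing import Any, Dict, List


class SketchRingBuffer:
    """Hypothetical bounded event buffer; not Stormlog's implementation."""

    def __init__(self, buffer_size: int = 10000):
        # A deque with maxlen silently drops the oldest entry when full.
        self._events = deque(maxlen=buffer_size)

    def record_event(self, event: Dict[str, Any]) -> None:
        self._events.append(dict(event))  # copy so later mutation is not captured

    def snapshot_events(self) -> List[Dict[str, Any]]:
        return list(self._events)  # oldest first

    def clear(self) -> None:
        self._events.clear()


ring = SketchRingBuffer(buffer_size=3)
for step in range(5):
    ring.record_event({"step": step})
kept_steps = [e["step"] for e in ring.snapshot_events()]
ring.clear()
```

With a capacity of three and five recorded events, only the last three survive, which is the property that keeps memory use bounded during long runs.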

dump(*, reason, exception, context, backend, metadata=None, session_summary=None)[source]

Write an OOM diagnostic bundle and enforce retention constraints.

Parameters:

reason (str)
exception (BaseException)
context (str | None)
backend (str)
metadata (dict[str, Any] | None)
session_summary (SessionSummary | None)

Return type:

str | None

class stormlog.OOMFlightRecorderConfig(enabled=False, dump_dir='oom_dumps', buffer_size=10000, max_dumps=5, max_total_mb=256)[source]

Bases: object

Runtime configuration for OOM flight recorder dumps.

Parameters:

enabled (bool)
dump_dir (str)
buffer_size (int)
max_dumps (int)
max_total_mb (int)

enabled: bool = False

dump_dir: str = 'oom_dumps'

buffer_size: int = 10000

max_dumps: int = 5

max_total_mb: int = 256

class stormlog.OOMExceptionClassification(is_oom, reason)[source]

Bases: object

Normalized classification result for an exception.

Parameters:

is_oom (bool)
reason (str | None)

is_oom: bool

reason: str | None

stormlog.classify_oom_exception(exc)[source]

Classify whether an exception corresponds to an OOM condition.

Parameters: exc (BaseException)
Return type: OOMExceptionClassification
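The reference does not describe which exceptions are recognized as OOM. As an illustration only, the sketch below treats Python's built-in `MemoryError` and any exception whose message contains "out of memory" as OOM conditions. `SketchClassification`, `sketch_classify`, and the reason strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchClassification:
    is_oom: bool
    reason: Optional[str]


def sketch_classify(exc: BaseException) -> SketchClassification:
    """Hypothetical heuristic; the real classifier's rules are not documented here."""
    if isinstance(exc, MemoryError):
        return SketchClassification(True, "python_memory_error")
    # Many runtimes report device OOM through the exception message.
    if "out of memory" in str(exc).lower():
        return SketchClassification(True, "message_match")
    return SketchClassification(False, None)


host_oom = sketch_classify(MemoryError())
device_oom = sketch_classify(RuntimeError("CUDA out of memory. Tried to allocate 2.00 GiB"))
not_oom = sketch_classify(ValueError("bad shape"))
```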

class stormlog.TelemetryEventV2(schema_version, timestamp_ns, event_type, collector, sampling_interval_ms, pid, host, device_id, allocator_allocated_bytes, allocator_reserved_bytes, allocator_active_bytes, allocator_inactive_bytes, allocator_change_bytes, device_used_bytes, device_free_bytes, device_total_bytes, context, job_id=None, rank=0, local_rank=0, world_size=1, metadata=<factory>)[source]

Bases: object

Legacy v2 telemetry event payload retained for backward-compatible writes/tests.

Parameters:

schema_version (Literal[2])
timestamp_ns (int)
event_type (str)
collector (str)
sampling_interval_ms (int)
pid (int)
host (str)
device_id (int)
allocator_allocated_bytes (int)
allocator_reserved_bytes (int)
allocator_active_bytes (int | None)
allocator_inactive_bytes (int | None)
allocator_change_bytes (int)
device_used_bytes (int)
device_free_bytes (int | None)
device_total_bytes (int | None)
context (str | None)
job_id (str | None)
rank (int)
local_rank (int)
world_size (int)
metadata (dict[str, Any])

schema_version: Literal[2]

timestamp_ns: int

event_type: str

collector: str

sampling_interval_ms: int

pid: int

host: str

device_id: int

allocator_allocated_bytes: int

allocator_reserved_bytes: int

allocator_active_bytes: int | None

allocator_inactive_bytes: int | None

allocator_change_bytes: int

device_used_bytes: int

device_free_bytes: int | None

device_total_bytes: int | None

context: str | None

job_id: str | None = None

rank: int = 0

local_rank: int = 0

world_size: int = 1

metadata: dict[str, Any]

class stormlog.DeviceMemoryCollector[source]

Bases: ABC

Backend-specific collector contract for device memory signals.

abstract name()[source]

Return runtime backend name (cuda, rocm, mps).

Return type: str

abstract is_available()[source]

Return whether this collector can sample in the current runtime.

Return type: bool

abstract sample()[source]

Collect a single normalized memory sample.

Return type: DeviceMemorySample

sample_with_diagnostics()[source]

Collect a sample while preserving core-failure diagnostics.

Return type: DeviceMemorySampleResult

abstract capabilities()[source]

Describe backend capability signals for telemetry metadata.

Return type: Dict[str, Any]
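The four abstract methods above form the contract a backend must satisfy. The sketch below reproduces that contract with a stand-in base class and a fake backend that returns fixed numbers, so it runs without a GPU. `SketchCollector`, `SketchSample` (a reduced set of sample fields), `FakeCollector`, and the capability key are all hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class SketchSample:
    allocated_bytes: int
    reserved_bytes: int
    used_bytes: int
    free_bytes: Optional[int]
    total_bytes: Optional[int]
    device_id: int


class SketchCollector(ABC):
    """Stand-in for the documented abstract contract."""

    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def is_available(self) -> bool: ...

    @abstractmethod
    def sample(self) -> SketchSample: ...

    @abstractmethod
    def capabilities(self) -> Dict[str, Any]: ...


class FakeCollector(SketchCollector):
    """Returns fixed numbers so the contract can be exercised without a GPU."""

    def name(self) -> str:
        return "fake"

    def is_available(self) -> bool:
        return True

    def sample(self) -> SketchSample:
        return SketchSample(256, 512, 768, None, None, 0)

    def capabilities(self) -> Dict[str, Any]:
        return {"reports_free_bytes": False}


collector = FakeCollector()
sample = collector.sample()
```

A fixed-value collector of this kind can be handy in tests, because code written against the contract can run on machines with no accelerator.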

class stormlog.DeviceMemorySample(allocated_bytes, reserved_bytes, used_bytes, free_bytes, total_bytes, active_bytes, inactive_bytes, device_id)[source]

Bases: object

Normalized device-memory sample produced by a backend collector.

Parameters:

allocated_bytes (int)
reserved_bytes (int)
used_bytes (int)
free_bytes (int | None)
total_bytes (int | None)
active_bytes (int | None)
inactive_bytes (int | None)
device_id (int)

allocated_bytes: int

reserved_bytes: int

used_bytes: int

free_bytes: int | None

total_bytes: int | None

active_bytes: int | None

inactive_bytes: int | None

device_id: int

stormlog.build_device_memory_collector(device=None)[source]

Build a backend collector for CUDA/ROCm/MPS runtime environments.

Parameters: device (str | int | torch.device | None)
Return type: DeviceMemoryCollector

stormlog.detect_torch_runtime_backend()[source]

Return the active torch runtime backend in this environment.

Return type: str

class stormlog.CPUMemoryProfiler[source]

Bases: object

Lightweight CPU memory profiler mirroring the GPU API.

start_monitoring(interval=0.1)[source]

Parameters: interval (float)
Return type: None

stop_monitoring()[source]

Return type: None

profile_function(func, *args, **kwargs)[source]

Parameters:

func (Callable[[...], Any])
args (Any)
kwargs (Any)

Return type:

CPUProfileResult

profile_context(name='context')[source]

Parameters: name (str)
Return type: Any

clear_results()[source]

Return type: None

get_summary()[source]

Return type: Dict[str, Any]

class stormlog.CPUMemoryTracker(sampling_interval=0.5, max_events=10000, enable_alerts=True, job_id=None, rank=None, local_rank=None, world_size=None, telemetry_sink_config=None)[source]

Bases: object

CPU tracker mirroring the core of the GPU tracker interface.

Parameters:

sampling_interval (float)
max_events (int)
enable_alerts (bool)
job_id (Optional[str])
rank (Optional[int])
local_rank (Optional[int])
world_size (Optional[int])
telemetry_sink_config (Optional[TelemetrySinkConfig])

get_session_summary()[source]

Return type: SessionSummary | None

start_tracking()[source]

Return type: None

stop_tracking()[source]

Return type: None

enter_phase(name, *, metadata=None)[source]

Enter one structured CPU tracking phase.

Parameters:

name (str)
metadata (Dict[str, Any] | None)

Return type:

PhaseHandle

phase(name, *, metadata=None)[source]

Context manager that emits structured CPU phase telemetry.

Parameters:

name (str)
metadata (Dict[str, Any] | None)

Return type:

Any

get_events(event_type=None, last_n=None, since=None)[source]

Get tracking events with optional filtering.

Parameters:

event_type (str | None) – Filter by event type
last_n (int | None) – Get last N events
since (float | None) – Get events since timestamp

Returns:

List of filtered events

Return type:

List[TrackingEvent]

get_statistics()[source]

Return type: Dict[str, Any]

get_memory_timeline(interval=1.0)[source]

Parameters: interval (float)
Return type: Dict[str, List[float]]

clear_events()[source]

Return type: None

export_events(filename, format='csv')[source]

Parameters:

filename (str)
format (str)

Return type:

None

export_events_with_timestamp(directory, format)[source]

Parameters:

directory (str)
format (str)

Return type:

str

stormlog.telemetry_event_from_record(record, permissive_legacy=True, default_collector='legacy.unknown', default_sampling_interval_ms=0, default_session_id=None)[source]

Create a canonical telemetry event from v3, v2, or legacy records.

Parameters:

record (Mapping[str, Any])
permissive_legacy (bool)
default_collector (str)
default_sampling_interval_ms (int)
default_session_id (str | None)

Return type:

TelemetryEventV3

stormlog.telemetry_event_to_dict(event)[source]

Serialize a telemetry event to a plain dictionary.

Parameters: event (TelemetryEventV3 | TelemetryEventV2)
Return type: dict[str, Any]

stormlog.validate_telemetry_record(record)[source]

Validate a v2 or v3 telemetry record.

Raises: ValueError – if the record is invalid or partial.
Parameters: record (Mapping[str, Any])
Return type: None

stormlog.load_telemetry_events(path, permissive_legacy=True, events_key=None, session_id=None)[source]

Load telemetry events from JSON and return the selected session.

Parameters:

path (str | Path)
permissive_legacy (bool)
events_key (str | None)
session_id (str | None)

Return type:

list[TelemetryEventV3]

stormlog.resolve_distributed_identity(*, job_id=None, rank=None, local_rank=None, world_size=None, metadata=None, env=None)[source]

Normalize distributed identity fields from explicit, metadata, or env inputs.

Parameters:

job_id (Any)
rank (Any)
local_rank (Any)
world_size (Any)
metadata (Mapping[str, Any] | None)
env (Mapping[str, str] | None)

Return type:

dict[str, Any]
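Resolving identity "from explicit, metadata, or env inputs" suggests a fallback chain. The sketch below assumes the precedence explicit argument, then metadata, then environment, and assumes the environment keys `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` (the names used by common PyTorch launchers). The function name, the precedence, and the fallback defaults of 0, 0, and 1 are assumptions, and `job_id` is omitted for brevity.

```python
from typing import Any, Dict, Mapping, Optional


def sketch_resolve_identity(*, rank: Any = None, local_rank: Any = None,
                            world_size: Any = None,
                            metadata: Optional[Mapping[str, Any]] = None,
                            env: Optional[Mapping[str, str]] = None) -> Dict[str, int]:
    """Hypothetical resolver; precedence and env keys are assumptions."""
    metadata = metadata or {}
    env = env or {}  # a real resolver might fall back to os.environ here
    defaults = {"rank": 0, "local_rank": 0, "world_size": 1}
    explicit = {"rank": rank, "local_rank": local_rank, "world_size": world_size}
    resolved = {}
    for key, default in defaults.items():
        # First non-None candidate wins: explicit, then metadata, then env.
        for candidate in (explicit[key], metadata.get(key), env.get(key.upper())):
            if candidate is not None:
                resolved[key] = int(candidate)
                break
        else:
            resolved[key] = default
    return resolved


identity = sketch_resolve_identity(
    rank=3,
    metadata={"local_rank": "1"},
    env={"RANK": "7", "WORLD_SIZE": "8"},
)
```

Passing `env` as a mapping rather than reading the process environment directly keeps the function easy to test.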

class stormlog.TimelineMarker(session_id, start_ns, end_ns, kind, source, severity, label, rank=None, local_rank=None, world_size=None, event_type=None, metadata=<factory>)[source]

Bases: object

Normalized timeline landmark derived from telemetry or annotation sources.

Parameters:

session_id (str)
start_ns (int)
end_ns (int | None)
kind (str)
source (str)
severity (str)
label (str)
rank (int | None)
local_rank (int | None)
world_size (int | None)
event_type (str | None)
metadata (dict[str, Any])

session_id: str

start_ns: int

end_ns: int | None

kind: str

source: str

severity: str

label: str

rank: int | None = None

local_rank: int | None = None

world_size: int | None = None

event_type: str | None = None

metadata: dict[str, Any]

property is_interval: bool: Return whether the marker spans a non-point interval.
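One plausible reading of "spans a non-point interval" is that the marker has an end time strictly later than its start. The sketch below encodes that reading on a hypothetical `SketchMarker` with only the two timestamp fields; the exact rule used by the library is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchMarker:
    start_ns: int
    end_ns: Optional[int] = None

    @property
    def is_interval(self) -> bool:
        # Assumed rule: an interval needs an end strictly after its start.
        return self.end_ns is not None and self.end_ns > self.start_ns


point = SketchMarker(100)
span = SketchMarker(100, 250)
zero_length = SketchMarker(100, 100)
```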

stormlog.derive_timeline_markers(events, *, include_phase_markers=True)[source]

Derive normalized timeline markers from telemetry events.

Parameters:

events (Sequence[Any])
include_phase_markers (bool)

Return type:

list[TimelineMarker]

stormlog.derive_session_timeline_markers(session, *, include_phase_markers=True)[source]

Derive normalized markers from one loaded telemetry session.

Parameters:

session (LoadedTelemetrySession)
include_phase_markers (bool)

Return type:

list[TimelineMarker]

stormlog.timeline_marker_to_dict(marker)[source]

Serialize a marker into a JSON-safe mapping.

Parameters: marker (TimelineMarker)
Return type: dict[str, Any]

stormlog.get_gpu_info(device=None)[source]

Get comprehensive GPU information.

Parameters: device (str | int | torch.device | None) – GPU device to query (None for current device)
Returns: Dictionary with GPU information
Return type: Dict[str, Any]

stormlog.format_bytes(bytes_value, precision=2)[source]

Format bytes into human-readable format.

Parameters:

bytes_value (int) – Number of bytes
precision (int) – Decimal precision

Returns:

Formatted string (e.g., "1.25 GB")

Return type:

str
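Human-readable formatting of this kind usually divides by a fixed base until the value fits the next unit. The sketch below assumes a base of 1024 and the unit labels B through TB; `sketch_format_bytes` is a hypothetical name, and the real function may use a different base or unit set.

```python
def sketch_format_bytes(bytes_value: int, precision: int = 2) -> str:
    """Hypothetical formatter assuming 1024-based units."""
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(bytes_value)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if abs(value) < 1024 or unit == units[-1]:
            return f"{value:.{precision}f} {unit}"
        value /= 1024


formatted = sketch_format_bytes(1342177280)
small = sketch_format_bytes(512)
rounded = sketch_format_bytes(1536, precision=1)
```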

stormlog.convert_bytes(value, from_unit, to_unit)[source]

Convert between different byte units.

Parameters:

value (int | float) – Value to convert
from_unit (str) – Source unit (B, KB, MB, GB, TB)
to_unit (str) – Target unit (B, KB, MB, GB, TB)

Returns:

Converted value

Return type:

float
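Unit conversion can be done by scaling to bytes and then to the target unit. The sketch below assumes 1024-based factors for the listed units (B, KB, MB, GB, TB) and raises `ValueError` for anything else; the function name, the base, and the error behavior are assumptions rather than documented facts.

```python
# Assumed 1024-based factors for the units listed in the parameters.
_UNIT_FACTORS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def sketch_convert_bytes(value, from_unit: str, to_unit: str) -> float:
    """Hypothetical converter: scale to bytes, then to the target unit."""
    try:
        source = _UNIT_FACTORS[from_unit.upper()]
        target = _UNIT_FACTORS[to_unit.upper()]
    except KeyError as err:
        raise ValueError(f"unknown unit: {err.args[0]}") from None
    return value * source / target


gigabytes = sketch_convert_bytes(2048, "MB", "GB")
```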

Modules

`analyzer`	Advanced analysis tools for memory profiling data.
`attributed_viz`	Stormlog-native memory visualisation with tensor attribution.
`cli`	Command-line interface for Stormlog.
`collective_attribution`	Heuristics for attributing hidden-memory spikes to collective communication.
`collector_health`	Shared collector-health state and retry helpers.
`context_profiler`	Context profiler for easy function and code block profiling.
`correlation`	Derived correlation rows for local Stormlog artifact investigations.
`cpu_profiler`	CPU-only memory profiler and tracker.
`cuda_native_debug`	CUDA-native allocator history capture and attribution helpers.
`derived_fields`	Registry-driven derived-field layer for Stormlog telemetry.
`device_collectors`	Backend-aware device memory collector abstractions.
`diagnose`	Diagnostic bundle builder for the Stormlog diagnose command.
`distributed_analysis`	Distributed telemetry analysis helpers.
`entrypoint`	Top-level Stormlog console entrypoint.
`gap_analysis`	Shared hidden-memory gap analysis utilities.
`infer`	Inference profiling helpers for OpenAI-compatible serving endpoints.
`issues`	Durable issue fingerprints and grouped issue row models.
`jax`	JAX support for Stormlog.
`oom_flight_recorder`	OOM flight recorder helpers for bounded event capture and dump artifacts.
`phases`	Structured phase telemetry helpers for trackers and analysis.
`profiler`	Core Stormlog profiler for PyTorch.
`query`	Local query API for Stormlog artifact directories and telemetry files.
`query_cli`	Command-line interface for local Stormlog artifact queries.
`release_version`	Helpers for deriving the next release version from Git tags.
`run_catalog`	Stable public facade for run envelope and attachment catalog helpers.
`session`	Shared session identity and lifecycle helpers.
`telemetry`	Canonical telemetry event schema and legacy conversion helpers.
`telemetry_classification`	Shared classification helpers for canonical telemetry events.
`telemetry_model`	Backend-neutral projection over the persisted telemetry event schema.
`telemetry_rollups`	Derived compact rollups for append-only telemetry sinks.
`telemetry_sink`	Append-only telemetry sink with rollover and retention bounds.
`tensorflow`	TensorFlow support for Stormlog.
`timeline_markers`	Derived timeline marker helpers for telemetry sessions.
`tracker`	Real-time memory tracking and monitoring.
`tui`	Textual-based terminal UI and top-level Stormlog dispatcher.
`utils`	Utility functions for GPU memory profiling.
`visualizer`	Visualization tools for GPU memory profiling data.
`wandb_integration`	Optional Weights & Biases export helpers for Stormlog outputs.