stormlog.cuda_native_debug

CUDA-native allocator history capture and attribution helpers.

Functions

build_cuda_tensor_attribution_index([...])

Build a best-effort index from CUDA storage pointers to live tensors.

build_snapshot_allocation_attribution(...)

Cross-reference allocator addresses against live tensor storage pointers.

capture_cuda_snapshot_artifacts(output_dir, *)

Capture the current CUDA allocator snapshot and write debug artifacts.

cuda_memory_history([device, ...])

Context manager that records CUDA allocator history for a block.

cuda_memory_history_supported()

Return whether the current PyTorch runtime exposes CUDA history APIs.

start_cuda_memory_history([device, ...])

Enable CUDA allocator history recording for the selected device.

stop_cuda_memory_history([device])

Disable CUDA allocator history recording for the selected device.

write_cuda_snapshot_artifacts(output_dir, ...)

Write snapshot, attribution, and best-effort visualization artifacts.

stormlog.cuda_native_debug.build_cuda_tensor_attribution_index(device=None, *, skip_gc=False)

Build a best-effort index from CUDA storage pointers to live tensors.

Parameters:
  • device (int | torch.device | None)

  • skip_gc (bool)

Return type:

dict[str, Any]
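
As an illustration of what a pointer-to-tensor index can look like, here is a minimal sketch. It is not stormlog's implementation: FakeTensor, index_live_objects, and the returned layout are hypothetical, and reading skip_gc as "skip the garbage-collection pass before scanning" is an assumption based only on the parameter name. The stand-in class lets the sketch run without PyTorch or a GPU.

```python
import gc
from collections import defaultdict
from typing import Any


class FakeTensor:
    """Hypothetical stand-in for a CUDA tensor; not a PyTorch type."""

    def __init__(self, name: str, ptr: int, nbytes: int) -> None:
        self.name = name
        self.ptr = ptr  # plays the role of a storage data pointer
        self.nbytes = nbytes


def index_live_objects(kind: type, *, skip_gc: bool = False) -> dict[int, list[dict[str, Any]]]:
    """Group live instances of `kind` by their storage pointer."""
    if not skip_gc:
        gc.collect()  # assumption: drop unreachable objects before scanning
    index: dict[int, list[dict[str, Any]]] = defaultdict(list)
    for obj in gc.get_objects():
        try:
            if isinstance(obj, kind):
                index[obj.ptr].append({"name": obj.name, "nbytes": obj.nbytes})
        except Exception:
            continue  # best effort: skip objects that fail introspection
    return dict(index)


# Two objects sharing one pointer model a tensor and a view of it.
base = FakeTensor("weights", 0x1000, 4096)
view = FakeTensor("weights_view", 0x1000, 4096)
other = FakeTensor("bias", 0x2000, 256)

index = index_live_objects(FakeTensor)
shared_names = sorted(entry["name"] for entry in index[0x1000])
```

The index is "best-effort" in this sketch because only objects still reachable from Python appear in it; memory held by native code or already-released references has no entry.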

stormlog.cuda_native_debug.build_snapshot_allocation_attribution(snapshot, tensor_index)

Cross-reference allocator addresses against live tensor storage pointers.

Parameters:
  • snapshot (Any)

  • tensor_index (dict[str, Any])

Return type:

dict[str, Any]
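
The cross-referencing idea can be illustrated with a minimal sketch. The snapshot layout used here (a "segments" list whose blocks carry "address", "size", and "state") and the tensor index shape (a "by_ptr" mapping) are assumptions for illustration, not the documented structure of either argument. attribute_blocks is a hypothetical helper, not part of stormlog.

```python
from typing import Any


def attribute_blocks(snapshot: dict[str, Any], tensor_index: dict[str, Any]) -> dict[str, Any]:
    """Match allocated block addresses to known tensor storage pointers."""
    by_ptr = tensor_index.get("by_ptr", {})  # assumed shape: {ptr: [tensor info, ...]}
    matched: list[dict[str, Any]] = []
    unmatched: list[dict[str, Any]] = []
    for segment in snapshot.get("segments", []):
        for block in segment.get("blocks", []):
            if block.get("state") != "active_allocated":
                continue  # blocks that are not allocated have no live owner
            entry = {"address": block["address"], "size": block["size"]}
            owners = by_ptr.get(block["address"])
            if owners:
                matched.append({**entry, "tensors": owners})
            else:
                unmatched.append(entry)
    return {
        "matched": matched,
        "unmatched": unmatched,
        "matched_bytes": sum(e["size"] for e in matched),
        "unmatched_bytes": sum(e["size"] for e in unmatched),
    }


# Hand-written sample data standing in for a real snapshot and index.
sample_snapshot = {
    "segments": [
        {
            "address": 0x7000,
            "total_size": 4096,
            "blocks": [
                {"address": 0x7000, "size": 1024, "state": "active_allocated"},
                {"address": 0x7400, "size": 2048, "state": "active_allocated"},
                {"address": 0x7C00, "size": 1024, "state": "inactive"},
            ],
        }
    ]
}
sample_index = {"by_ptr": {0x7000: [{"shape": [256], "dtype": "float32"}]}}

report = attribute_blocks(sample_snapshot, sample_index)
```

In this sketch, allocated blocks with no matching pointer end up in "unmatched". Such blocks are the interesting ones when hunting for memory that no live Python tensor accounts for.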

stormlog.cuda_native_debug.capture_cuda_snapshot_artifacts(output_dir, *, device=None, history_recorded)

Capture the current CUDA allocator snapshot and write debug artifacts.

Parameters:
  • output_dir (Path)

  • device (int | torch.device | None)

  • history_recorded (bool)

Return type:

list[str]

stormlog.cuda_native_debug.cuda_memory_history(device=None, trace_alloc_max_entries=100000)

Context manager that records CUDA allocator history for a block.

Parameters:
  • device (int | torch.device | None)

  • trace_alloc_max_entries (int)

Return type:

Iterator[None]
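
A context manager of this kind typically pairs a start call with a stop call that runs even if the block raises. The sketch below shows that pattern with stand-in callables so it runs without PyTorch. The names recording, fake_start, and fake_stop are hypothetical, and whether stormlog implements cuda_memory_history this way is an assumption.

```python
from contextlib import contextmanager
from typing import Callable, Iterator

events: list[str] = []


def fake_start() -> None:
    """Stand-in for enabling allocator history recording."""
    events.append("start")


def fake_stop() -> None:
    """Stand-in for disabling allocator history recording."""
    events.append("stop")


@contextmanager
def recording(start: Callable[[], None], stop: Callable[[], None]) -> Iterator[None]:
    """Run `start`, yield to the block, and always run `stop` afterwards."""
    start()
    try:
        yield
    finally:
        stop()  # runs on normal exit and when the block raises


# Normal exit: recording is stopped after the block finishes.
with recording(fake_start, fake_stop):
    events.append("work")

# Error path: recording is still stopped before the exception propagates.
try:
    with recording(fake_start, fake_stop):
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass
```

Going by the signature above, the real context manager would be entered as "with cuda_memory_history(device=0):" around the code whose allocations should be recorded.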

stormlog.cuda_native_debug.cuda_memory_history_supported()

Return whether the current PyTorch runtime exposes CUDA history APIs.

Return type:

bool
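
A capability check like this can be approximated by probing for PyTorch's private history hooks. The sketch below is an assumption about how such a check might work, not stormlog's actual logic. history_apis_available is a hypothetical name, and the probed attributes (torch.cuda.memory._record_memory_history and torch.cuda.memory._snapshot) are private PyTorch APIs that may change between releases.

```python
def history_apis_available() -> bool:
    """Best-effort probe for PyTorch's allocator history hooks."""
    try:
        import torch
    except ImportError:
        return False  # PyTorch is not installed

    memory = getattr(torch.cuda, "memory", None)
    record = getattr(memory, "_record_memory_history", None)
    snapshot = getattr(memory, "_snapshot", None)
    # Both hooks must exist: one to record history, one to read it back.
    return callable(record) and callable(snapshot)


supported = history_apis_available()
```

The probe only reports that the functions exist. It does not confirm that a CUDA device is present or that recording will succeed.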

stormlog.cuda_native_debug.start_cuda_memory_history(device=None, trace_alloc_max_entries=100000)

Enable CUDA allocator history recording for the selected device.

Parameters:
  • device (int | torch.device | None)

  • trace_alloc_max_entries (int)

Return type:

None

stormlog.cuda_native_debug.stop_cuda_memory_history(device=None)

Disable CUDA allocator history recording for the selected device.

Parameters:
  • device (int | torch.device | None)

Return type:

None

stormlog.cuda_native_debug.write_cuda_snapshot_artifacts(output_dir, snapshot, tensor_index, *, history_recorded, device=None)

Write snapshot, attribution, and best-effort visualization artifacts.

Parameters:
  • output_dir (Path)

  • snapshot (Any)

  • tensor_index (dict[str, Any])

  • history_recorded (bool)

  • device (int | torch.device | None)

Return type:

list[str]
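
The list[str] return type suggests the function reports the paths it wrote. The sketch below shows one way to write a snapshot and an attribution report to a directory and return their paths. The file names, the choice of pickle and JSON, and the helper write_artifacts are all hypothetical. The visualization artifact is omitted because the source does not describe it.

```python
import json
import pickle
import tempfile
from pathlib import Path
from typing import Any


def write_artifacts(output_dir: Path, snapshot: Any, attribution: dict[str, Any]) -> list[str]:
    """Write both artifacts under `output_dir` and return the written paths."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written: list[str] = []

    snapshot_path = output_dir / "snapshot.pickle"  # hypothetical file name
    with snapshot_path.open("wb") as fh:
        pickle.dump(snapshot, fh)
    written.append(str(snapshot_path))

    attribution_path = output_dir / "attribution.json"  # hypothetical file name
    attribution_path.write_text(json.dumps(attribution, indent=2))
    written.append(str(attribution_path))

    return written


# Exercise the helper against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_artifacts(Path(tmp) / "debug", {"segments": []}, {"matched": []})
    names = [Path(p).name for p in paths]
    all_exist = all(Path(p).exists() for p in paths)
    reloaded = json.loads(Path(paths[1]).read_text())
```

Returning the paths as strings lets a caller log or upload the artifacts without knowing the naming scheme in advance.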