TensorFlow Guide
This guide covers the current TensorFlow workflow in Stormlog: profiling TensorFlow code directly, tracking TensorFlow memory usage from the CLI, and exporting artifacts for later review.
Before you start
Validate the environment:
tfmemprof info
If you are bringing up a GPU runtime, start with the workload-backed matmul recipe in TensorFlow Production Recipes before using training-backed examples.
The python -m examples.basic.tensorflow_demo example runs only from a source
checkout and exercises TensorFlow’s training path. Run it once the runtime is
aligned for training-backed ops, not as the first GPU smoke test.
These checks work on CPU-backed TensorFlow installs as well as GPU-backed ones.
Daily workflow: ML engineer
Use TFMemoryProfiler when you want snapshots and aggregate results around a real TensorFlow workload.
from stormlog.tensorflow import TFMemoryProfiler
profiler = TFMemoryProfiler(enable_tensor_tracking=True)
with profiler.profile_context("training"):
    model.fit(x_train, y_train, epochs=1, batch_size=32)
results = profiler.get_results()
print(f"Peak memory: {results.peak_memory_mb:.2f} MB")
print(f"Snapshots captured: {len(results.snapshots)}")
Decorator-style profiling also exists through the TensorFlow profiler API, but the context-manager flow above is the clearest daily-workflow path.
Daily workflow: investigate sustained growth
The current TensorFlow CLI is the simplest way to capture longer-running telemetry:
tfmemprof monitor --interval 0.5 --duration 30 --output tf_monitor.json
tfmemprof track --interval 0.5 --threshold 4096 --output tf_track.json
tfmemprof analyze --input tf_monitor.json --detect-leaks --optimize --report tf_report.txt
tfmemprof diagnose --duration 0 --output ./tf_diag
For CPU-backed TensorFlow or when the GPU backend is unavailable, add
--device /CPU:0 to monitor and track.
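The CPU fallback above can be scripted. The following is a minimal sketch, not part of Stormlog: build_capture_cmd is a hypothetical helper that assembles a tfmemprof monitor or tfmemprof track argument list from the flags shown in this guide and appends --device /CPU:0 only on request. It builds the list and nothing more.

```python
from typing import List, Optional


def build_capture_cmd(
    subcommand: str,
    output: str,
    interval: float = 0.5,
    duration: Optional[float] = None,
    threshold: Optional[int] = None,
    force_cpu: bool = False,
) -> List[str]:
    """Assemble a tfmemprof monitor/track argv (hypothetical helper).

    Only flags that appear in this guide are used. The guide pairs
    --duration with monitor and --threshold with track; any other
    pairing is an untested assumption left to the caller.
    """
    if subcommand not in ("monitor", "track"):
        raise ValueError("this guide documents --device for monitor and track only")

    cmd = ["tfmemprof", subcommand, "--interval", str(interval)]
    if duration is not None:
        cmd += ["--duration", str(duration)]
    if threshold is not None:
        cmd += ["--threshold", str(threshold)]
    if force_cpu:
        # Pin the capture to the CPU device when no GPU backend is usable.
        cmd += ["--device", "/CPU:0"]
    cmd += ["--output", output]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_capture_cmd("monitor", "tf_monitor.json", duration=30, force_cpu=True)))
```

To execute the result, pass the returned list to subprocess.run. Keeping command construction separate from execution makes the CPU fallback easy to check in CI without a TensorFlow install.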
Key difference from gpumemprof:
TensorFlow analysis uses --input
PyTorch analysis uses a positional file argument
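If a script has to drive both CLIs, the difference can be kept in one place. This is a sketch only: analyze_cmd is a hypothetical helper, and it assumes the gpumemprof subcommand is also named analyze, which this guide implies but does not show.

```python
from typing import List


def analyze_cmd(framework: str, artifact: str) -> List[str]:
    """Return the analyze argv for the given framework (hypothetical helper)."""
    if framework == "tensorflow":
        # tfmemprof takes the artifact through the --input flag.
        return ["tfmemprof", "analyze", "--input", artifact]
    if framework == "pytorch":
        # Assumption: gpumemprof exposes "analyze" with a positional file argument.
        return ["gpumemprof", "analyze", artifact]
    raise ValueError(f"unknown framework: {framework!r}")


if __name__ == "__main__":
    print(analyze_cmd("tensorflow", "tf_monitor.json"))
```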
Daily workflow: release or CI triage
Use the maintained example and scenario modules instead of inventing a one-off shell script:
Source checkout only. These commands require the repository
examples/ package.
python -m examples.cli.quickstart
python -m examples.scenarios.tf_end_to_end_scenario
python -m examples.cli.capability_matrix --mode smoke --target both --oom-mode simulated
This is the fastest route to a reproducible TensorFlow artifact set.
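This guide does not describe the schema of the JSON artifacts, so a first look at an artifact set is best kept schema-agnostic. The sketch below assumes only that some artifacts are JSON files (as tf_monitor.json is by extension). summarize_json_artifacts is a hypothetical helper, not a Stormlog API, and the directory you pass is whatever location your run wrote to.

```python
import json
from pathlib import Path
from typing import Dict, List


def summarize_json_artifacts(directory: str) -> Dict[str, List[str]]:
    """Map each *.json file under `directory` to its sorted top-level keys.

    Files whose top level is not a JSON object map to a one-item list
    naming the Python type; unreadable files are flagged, not raised.
    """
    summary: Dict[str, List[str]] = {}
    for path in sorted(Path(directory).rglob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError) as exc:
            # ValueError covers both JSON syntax errors and bad encodings.
            summary[str(path)] = [f"<unreadable: {type(exc).__name__}>"]
            continue
        if isinstance(data, dict):
            summary[str(path)] = sorted(data)
        else:
            summary[str(path)] = [f"<{type(data).__name__}>"]
    return summary
```

Calling summarize_json_artifacts("./tf_diag") after a diagnose run, for example, shows which files exist and what their top-level fields are called before you write anything that depends on specific keys.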
TUI-assisted TensorFlow workflow
The current TUI can help after or during a TensorFlow run:
TensorFlow tab for sample workloads and collected summaries
Monitoring for live tracking
Visualizations for plot export
Diagnostics for artifact review
CLI & Actions for command-driven flows
Launch:
pip install "stormlog[tui,torch]"
stormlog
The current TUI startup path imports PyTorch immediately, so TensorFlow-only environments still need the torch extra for stormlog to launch.
Recommended validation sequence
Use this when you need a compact TensorFlow confidence pass:
tfmemprof info
tfmemprof monitor --interval 0.5 --duration 15 --output tf_monitor.json
tfmemprof analyze --input tf_monitor.json --detect-leaks --optimize --report tf_report.txt
tfmemprof diagnose --duration 0 --output ./tf_diag
If you are validating a GPU-backed runtime, run the /GPU:0 matmul recipe from
TensorFlow Production Recipes before this sequence.
If you want the source-checkout training-backed example after that, run
python -m examples.basic.tensorflow_demo.
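The four-command sequence above can be wrapped so that a failed step stops the run. This is a sketch under one stated assumption: that tfmemprof exits with a nonzero status when a step fails, which this guide does not confirm. run_validation and its runner parameter are hypothetical names, and the commands are copied verbatim from the sequence.

```python
import subprocess
from typing import Callable, List, Sequence

# The validation sequence exactly as listed in this guide.
VALIDATION_STEPS: List[List[str]] = [
    ["tfmemprof", "info"],
    ["tfmemprof", "monitor", "--interval", "0.5", "--duration", "15",
     "--output", "tf_monitor.json"],
    ["tfmemprof", "analyze", "--input", "tf_monitor.json",
     "--detect-leaks", "--optimize", "--report", "tf_report.txt"],
    ["tfmemprof", "diagnose", "--duration", "0", "--output", "./tf_diag"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit status."""
    return subprocess.run(list(cmd)).returncode


def run_validation(
    steps: Sequence[Sequence[str]] = VALIDATION_STEPS,
    runner: Callable[[Sequence[str]], int] = _run,
) -> int:
    """Run steps in order and return how many completed.

    Stops at the first step whose exit status is nonzero
    (assumption: tfmemprof signals failure that way).
    """
    completed = 0
    for cmd in steps:
        status = runner(cmd)
        if status != 0:
            print(f"step failed (exit {status}): {' '.join(cmd)}")
            break
        completed += 1
    return completed
```

A return value equal to len(VALIDATION_STEPS) means every step ran. The runner parameter exists so the ordering logic can be exercised with a stub in environments where tfmemprof is not installed.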
Common issues
tfmemprof is installed but no GPU devices appear
Run:
tfmemprof info
If it still shows no GPU devices, treat it as an environment issue first. The TensorFlow flow still supports CPU-backed runs.
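To separate an environment problem from a tooling problem, it can help to ask TensorFlow directly which GPUs it sees. The sketch below uses TensorFlow's own tf.config.list_physical_devices call, independent of Stormlog. visible_gpu_names and describe are hypothetical helper names.

```python
from typing import List, Optional


def visible_gpu_names() -> Optional[List[str]]:
    """Return the GPU device names TensorFlow reports.

    Returns None when TensorFlow cannot be imported in this interpreter.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


def describe(names: Optional[List[str]]) -> str:
    """Turn the result of visible_gpu_names() into a one-line diagnosis."""
    if names is None:
        return "TensorFlow is not importable in this interpreter"
    if not names:
        return "TensorFlow imported, but no GPU devices are visible"
    return "Visible GPUs: " + ", ".join(names)
```

Running print(describe(visible_gpu_names())) in the same interpreter that runs tfmemprof gives a quick cross-check. If TensorFlow itself lists no GPUs, the issue sits below Stormlog (driver, CUDA libraries, or the TensorFlow build).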
tfmemprof analyze rejects positional input
That is expected. Use:
tfmemprof analyze --input tf_monitor.json --detect-leaks --optimize
Plot export fails
Install the visualization extra:
pip install "stormlog[viz]"