CI and Release Qualification
Use this page when the goal is not just to diagnose one run, but to qualify Stormlog’s operational behavior before release or as part of CI.
Source checkout only: the commands on this page use the repository
examples/ package and the benchmark assets under docs/benchmarks/.
Audience: release owners, CI maintainers. Difficulty: intermediate.
Prerequisites
install the checkout with the needed extras first:
pip install -e ".[dev,test,all]"
start from Installation, Examples Guide, and Benchmark Harness if the workflow is unfamiliar
a source checkout with the examples/ package available
benchmark asset files under docs/benchmarks/
enough time and artifact space for the chosen harness profile
Success signal:
the chosen smoke or benchmark command completes and writes its expected report artifact
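A CI job can check the success signal mechanically. The following is a minimal sketch, assuming the report artifact is a JSON file (the harness commands on this page write to .json paths). The helper name report_is_usable is hypothetical and not part of Stormlog. It checks only that the file exists, is non-empty, and parses; it does not validate the report's contents.

```python
import json
from pathlib import Path


def report_is_usable(path: str) -> bool:
    """Return True if `path` is an existing, non-empty file that parses as JSON.

    Hypothetical helper: it does not inspect the report's schema.
    """
    report = Path(path)
    if not report.is_file() or report.stat().st_size == 0:
        return False
    try:
        json.loads(report.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        return False
    return True
```

For example, a CI step could call report_is_usable("artifacts/benchmarks/latest_v0.4.json") after the harness finishes and fail the job when it returns False.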
Choose the qualification path
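| If the main goal is… | Start with… |
|---|---|
| fast CLI sanity signal | examples.cli.quickstart |
| broad smoke coverage across surfaces | examples.cli.capability_matrix |
| always-on operability qualification | examples.cli.benchmark_harness without --check |
| enforce regression or budget gates | benchmark harness with --check and --gate-mode |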
Fast smoke validation
python -m examples.cli.quickstart
Use this when you want one fast signal that the installed CLI entrypoints still behave as expected.
Cross-surface smoke validation
python -m examples.cli.capability_matrix --mode smoke --target both --oom-mode simulated
Use this when you want one command that touches the major PyTorch and TensorFlow validation paths.
Always-on operability qualification
python -m examples.cli.benchmark_harness \
--profile pr \
--mode all \
--output artifacts/benchmarks/latest_v0.4.json
This is the fastest source-checkout qualification path for always-on tracking, bounded history, and retention behavior.
Regression-gated benchmark run
python -m examples.cli.benchmark_harness \
--check \
--profile pr \
--mode all \
--gate-mode regression \
--iterations 5000 \
--baseline docs/benchmarks/v0.4_baseline.json \
--tolerances docs/benchmarks/v0.4_tolerances.json \
--output artifacts/benchmarks/latest_v0.4_regression.json
Budget-gated benchmark run
python -m examples.cli.benchmark_harness \
--check \
--profile pr \
--mode all \
--gate-mode budget \
--iterations 5000 \
--budgets docs/benchmarks/v0.4_operating_budget.json \
--output artifacts/benchmarks/latest_v0.4_budget.json
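The two gate modes differ in what a measurement is compared against. The sketch below illustrates the general idea only: it assumes flat name-to-number mappings, lower-is-better metrics, and relative tolerances. None of these assumptions are confirmed by this page. The function names, metric names, and data shapes are hypothetical and do not describe the harness's actual file schema or comparison logic.

```python
def regression_gate(current, baseline, tolerances):
    """Return metric names that exceed baseline * (1 + tolerance).

    Assumes lower is better; a metric missing from `current` counts as a failure.
    A metric with no tolerance entry gets a tolerance of 0.
    """
    failures = []
    for name, base in baseline.items():
        allowed = base * (1.0 + tolerances.get(name, 0.0))
        if current.get(name, float("inf")) > allowed:
            failures.append(name)
    return failures


def budget_gate(current, budgets):
    """Return metric names that exceed their absolute budget.

    Assumes lower is better; a metric missing from `current` counts as a failure.
    """
    return [
        name
        for name, limit in budgets.items()
        if current.get(name, float("inf")) > limit
    ]
```

Under these assumptions, a regression gate catches drift relative to a recorded run, while a budget gate enforces a fixed ceiling regardless of what earlier runs measured.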
What to archive from CI
benchmark harness JSON output
sink directories or diagnose bundles for failed runs
any saved analysis reports used during triage
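One way to gather these items before handing them to the CI system's artifact upload step is sketched below. This is a minimal example, not a Stormlog utility: collect_artifacts is a hypothetical helper, and the caller supplies whatever paths the run actually produced. Missing paths are skipped so that a failed run can still archive whatever exists.

```python
import shutil
from pathlib import Path


def collect_artifacts(sources, dest_dir):
    """Copy each existing file or directory in `sources` into `dest_dir`.

    Returns the names that were copied, in input order. Paths that do not
    exist are skipped silently.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in map(Path, sources):
        if src.is_dir():
            # Sink directories and diagnose bundles are copied recursively.
            shutil.copytree(src, dest / src.name, dirs_exist_ok=True)
        elif src.is_file():
            shutil.copy2(src, dest / src.name)
        else:
            continue  # path does not exist for this run
        copied.append(src.name)
    return copied
```

Sources with the same final path component would overwrite each other in dest_dir, so give distinct names to anything archived together.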
What to do next
If the harness fails on collector health or retention metrics, move to Always-on Tracking.
If the failure centers on a specific runtime incident, move to PyTorch Production Recipes or TensorFlow Production Recipes.
If the failure is distributed and rank-specific, move to Distributed Diagnostics Recipes.
Troubleshooting
Symptom: a benchmark command is too heavy for local iteration
Likely cause: the current profile is intended for PR or nightly gating.
Fix: start with smoke validation or a single harness mode before running the full gate.
Verify: the shorter command completes and writes a usable report artifact.
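The advice to run cheap checks before the full gate can be sketched as a small staged runner. The stage commands are the two smoke commands from this page. The run_stages helper is hypothetical, and it assumes each command signals failure with a non-zero exit code, which this page does not state explicitly.

```python
import subprocess
import sys

# Cheapest stage first; both commands are taken from this page.
STAGES = [
    ("quickstart", [sys.executable, "-m", "examples.cli.quickstart"]),
    (
        "capability_matrix",
        [
            sys.executable, "-m", "examples.cli.capability_matrix",
            "--mode", "smoke", "--target", "both", "--oom-mode", "simulated",
        ],
    ),
]


def run_stages(stages, run=subprocess.call):
    """Run stages in order and return the name of the first failing stage.

    Returns None if every stage exits with code 0. `run` is injectable so the
    ordering logic can be exercised without launching real processes.
    """
    for name, argv in stages:
        if run(argv) != 0:
            return name
    return None
```

Calling run_stages(STAGES) from the repository root stops at the first failing smoke stage, so the heavier harness run only needs to start once both have passed.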