CLI Playbook

KeepGPU now supports two operational styles:

  • Blocking mode (keep-gpu ...) for traditional shell workflows.
  • Service mode (keep-gpu start/status/stop) for agent workflows that must continue after arming keep-alive.

1) Blocking mode (compatibility)

keep-gpu --interval 120 --gpu-ids 0,1 --vram 2GiB --busy-threshold 25

This command blocks until you press Ctrl+C.

2) Service mode

Start a keep session

keep-gpu start --gpu-ids 0 --vram 1GiB --interval 60 --busy-threshold 25

The start command launches the local service if it is not already running, then returns immediately with a job_id. It also prints:

  • dashboard URL (http://<host>:<port>/),
  • follow-up status/stop command hints,
  • daemon shutdown hint (keep-gpu service-stop).

Check status

keep-gpu status
keep-gpu status --job-id <job_id>

Stop sessions

keep-gpu stop --job-id <job_id>
keep-gpu stop --all

Stop local daemon

keep-gpu service-stop

If sessions are still active, stop them first or use --force.
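
The shutdown order described here (stop the sessions, then the daemon) can be wrapped in a small shell function. This is a minimal sketch, not part of KeepGPU: keepgpu_shutdown is a hypothetical helper name, and the sketch assumes keep-gpu exits with a non-zero status when a subcommand fails, which this page does not confirm.

```shell
# Stop every keep session, then shut down the local daemon.
# keepgpu_shutdown is a hypothetical helper name; the keep-gpu
# subcommands and flags are the ones documented on this page.
keepgpu_shutdown() {
    # Assumption: keep-gpu returns a non-zero exit status on failure.
    if keep-gpu stop --all; then
        keep-gpu service-stop
    else
        echo "keep-gpu stop --all failed; daemon left running" >&2
        echo "to shut down anyway: keep-gpu service-stop --force" >&2
        return 1
    fi
}
```

Leaving --force out of the helper keeps the forced shutdown an explicit, manual decision.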

List telemetry

keep-gpu list-gpus

Run service explicitly

keep-gpu serve --host 127.0.0.1 --port 8765

3) Dashboard UI

When service mode is running, open:

http://127.0.0.1:8765/

The dashboard provides:

  • live GPU memory/utilization telemetry,
  • active keep sessions,
  • session creation form,
  • single-session and stop-all controls.

Command knobs

Option                              | Meaning                                                   | Default
--gpu-ids                           | Comma-separated GPU IDs. Omit to use all visible devices. | all
--vram                              | Per-GPU memory target (512MB, 1GiB, bytes).               | 1GiB
--interval                          | Seconds between keep-alive cycles.                        | 300
--busy-threshold / --util-threshold | Back off when utilization exceeds this value.             | -1
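
Because every option has a default, a wrapper only needs to pass the flags you actually want to override. The sketch below assembles a keep-gpu start command line from optional environment variables and prints it for review. The function name keepgpu_start_cmd and the KEEPGPU_* variable names are hypothetical and exist only in this example.

```shell
# Print a "keep-gpu start" command line built from optional settings.
# KEEPGPU_IDS, KEEPGPU_VRAM, KEEPGPU_INTERVAL and KEEPGPU_BUSY are
# hypothetical variable names used only in this sketch.
keepgpu_start_cmd() {
    cmd="keep-gpu start"
    # Omitting --gpu-ids targets all visible devices.
    if [ -n "${KEEPGPU_IDS:-}" ]; then
        cmd="$cmd --gpu-ids $KEEPGPU_IDS"
    fi
    if [ -n "${KEEPGPU_VRAM:-}" ]; then
        cmd="$cmd --vram $KEEPGPU_VRAM"
    fi
    if [ -n "${KEEPGPU_INTERVAL:-}" ]; then
        cmd="$cmd --interval $KEEPGPU_INTERVAL"
    fi
    if [ -n "${KEEPGPU_BUSY:-}" ]; then
        cmd="$cmd --busy-threshold $KEEPGPU_BUSY"
    fi
    printf '%s\n' "$cmd"
}
```

With KEEPGPU_IDS=0 and KEEPGPU_INTERVAL=60 set and the other variables unset, the function prints keep-gpu start --gpu-ids 0 --interval 60, leaving --vram and --busy-threshold at their defaults.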

Remote sessions

Preferred workflow for remote shells:

tmux new -s keepgpu
keep-gpu start --gpu-ids 0 --vram 1GiB --interval 120 --busy-threshold 25

Because start returns immediately, you can run follow-up commands in the same shell, or monitor the session with keep-gpu status.

Troubleshooting

  • --gpu-ids parse error: use only comma-separated integers (0,1).
  • start cannot reach the service: launch it explicitly with keep-gpu serve --host 127.0.0.1 --port 8765.
  • Need to close background service: run keep-gpu stop --all first, then keep-gpu service-stop (or use keep-gpu service-stop --force).
  • OOM during keep: reduce --vram or free GPU memory before starting.
  • No utilization data: ensure the nvidia-ml-py package is installed and importable, and that nvidia-smi is available.
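
To catch the --gpu-ids parse error before calling keep-gpu, a script can check the value itself. The function below is a minimal sketch of the "comma-separated integers" rule from the list above. valid_gpu_ids is a hypothetical helper, not a KeepGPU command, and KeepGPU's own parser may accept or reject other forms.

```shell
# Return 0 when the argument is a comma-separated list of non-negative
# integers such as "0" or "0,1"; return 1 otherwise.
# valid_gpu_ids is a hypothetical helper, not part of keep-gpu.
valid_gpu_ids() {
    case "$1" in
        # Reject: empty string, any character other than digits and
        # commas, a leading or trailing comma, or two commas in a row.
        ''|*[!0-9,]*|,*|*,|*,,*) return 1 ;;
        *) return 0 ;;
    esac
}
```

A typical guard is valid_gpu_ids "$ids" && keep-gpu start --gpu-ids "$ids", which skips the start call when the list contains spaces, letters, or stray commas.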