Published on March 29, 2025 1:10 PM GMT
I run a lot of one-off jobs on EC2 machines. This usually looks like:
- Stand up a machine
- Mess around for a while trying things and writing code
- Run my command under screen
If the machine costs a non-trivial amount and the job finishes in the middle of the night I'm not awake to shut it down.
I could, and sometimes do, forget to turn the machine off.
Ideally I could tell the machine to shut itself off if no one was logged in and there weren't any active jobs.
I didn't see anything like this (though I didn't look very hard) so I wrote something (github):
$ prevent-shutdown long-running-command
As long as that command is still running, or someone is logged in over ssh, the machine will stay on. Every five minutes a systemd timer will check if this is the case, and if not shut the machine down.
Note that you still need screen or something to prevent the long running command from exiting when you log out.
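The check the timer runs can be sketched roughly as follows. This is a Python illustration of the logic described above, not the actual scripts from the repository; the function names are made up here, and using `who` as the login check is an assumption (it lists every login session, not only ssh ones).

```python
import os
import subprocess


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def anyone_logged_in() -> bool:
    """Return True if `who` reports at least one login session (assumed check)."""
    result = subprocess.run(["who"], capture_output=True, text=True, check=False)
    return bool(result.stdout.strip())


def should_shut_down(job_pids: list[int], logged_in: bool) -> bool:
    """Shut down only when nobody is logged in and no tracked job is alive."""
    if logged_in:
        return False
    return not any(pid_is_alive(pid) for pid in job_pids)
```

A periodic job would call something like `should_shut_down(tracked_pids, anyone_logged_in())` and power the machine off only when it returns `True`. Treating a dead PID as "no job" is also one simple way to cope with stale entries left behind by jobs that exited without cleaning up.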
(This is an example of the kind of thing that I find goes a lot faster with an LLM. I used Claude 3.7, prompted it with essentially the beginning of this blog post, took the scripts it generated as a starting point, and then fixed some things. It did make some mistakes (the big ones: a typo of $ for $$, a regex looking for PID: that should have looked for ^PID:, didn't initially plan for handling stale jobs) but that's also about what I'd expect if I'd asked a junior engineer to write this for me. And with much faster turnaround on my code reviews!)
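To see why the anchor in ^PID: matters, here is a small Python illustration. The file contents are invented for the example, since the post does not show the format the scripts actually search; the point is only that an unanchored pattern can match inside a longer field name such as PPID.

```python
import re

# Hypothetical contents: the real file format is not shown in the post.
lock_file = "COMMAND: watch PPID: 1\nPID: 4242\n"

# Unanchored: the first match is the "PID: 1" inside "PPID: 1".
unanchored = re.search(r"PID: (\d+)", lock_file)

# Anchored to the start of a line: only the real PID field matches.
anchored = re.search(r"^PID: (\d+)", lock_file, re.MULTILINE)

print(unanchored.group(1))
print(anchored.group(1))
```

With this input the unanchored search picks up the parent PID rather than the job's own PID, which is the kind of quiet mistake the anchored version avoids.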