Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


shell - Is changing the bash script sent to sbatch in Slurm during a run a bad idea?

I want to run a Python script, main.py, multiple times with different arguments through an sbatch_run.sh script, as in:

#!/bin/bash
#SBATCH --job-name=sbatch_run
#SBATCH --array=1-1000
#SBATCH --exclude=node047

arg1=10  # argument to be changed between runs
arg2=12  # argument to be changed between runs
python main.py "$arg1" "$arg2"

The arguments are hard-coded in the bash file run by sbatch. I am worried that if I submit sbatch_run.sh several times in a row, changing the values of arg1 and arg2 before each submission, it might cause errors in my runs. For example, if I do:

sbatch sbatch_run.sh # with arg1=10 and arg2=12

and then immediately afterwards edit sbatch_run.sh and submit the file again, as in:

sbatch sbatch_run.sh # with arg1=69 and arg2=666

would that cause all my runs to use the last values (i.e. arg1=69 and arg2=666) instead of each run using its own arguments?

I know for sure that if I hard-code the arguments in main.py, submit the same sbatch script, and then change main.py, the jobs will run the latest version of main.py. I was wondering whether the same is true if I change the sbatch_run.sh script.


Just so you know, I did try this experiment: I submitted 1000 jobs so that some would be queued, put a sleep command in the script, and then changed sbatch_run.sh. It does not seem to change what my runs do. However, this is far too important to get wrong by accident, so I wanted to ask to make sure.

For the record I ran:

#!/bin/bash
#SBATCH --job-name=ECHO
#SBATCH --array=1-1000
#SBATCH --exclude=node047

sleep 15
echo helloworld
echo 5

and then changed the echo to echo 10 or echo byebyeworld.


1 Reply


When sbatch is run, Slurm copies the submission script to its internal database; you can convince yourself with the following experiment:

$ cat submit.sh
#!/bin/bash
#SBATCH  --hold
echo helloworld

The --hold option is there to make sure the job does not start. Submit it:

$ sbatch submit.sh

Then modify the submission script:

$ sed -i 's/hello/bye/' submit.sh
$ cat submit.sh
#!/bin/bash
#SBATCH  --hold
echo byeworld

and now use scontrol show job to see the script Slurm is planning to run:

$ scontrol show -ddd job YOURJOBID
JobId=******* JobName=submit.sh
[...]
BatchScript=
   #!/bin/bash
   #SBATCH  --hold
   echo helloworld
[...]

It hasn't changed although the original script has.
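
For readers without a cluster at hand, the same idea can be imitated locally. This is only an analogy, not how Slurm is implemented: a plain cp stands in for the copy made at submission time, and the file names stored_copy.sh and workdir are made up for the illustration.

```shell
#!/bin/bash
# Local analogy only: cp plays the role of the copy Slurm keeps at submission.
workdir=$(mktemp -d)
printf '#!/bin/bash\necho helloworld\n' > "$workdir/submit.sh"

# "Submission": snapshot the script as it is right now.
cp "$workdir/submit.sh" "$workdir/stored_copy.sh"

# Edit the original afterwards (same substitution as in the experiment above,
# written without -i so it works with both GNU and BSD sed).
sed 's/hello/bye/' "$workdir/submit.sh" > "$workdir/submit.sh.new"
mv "$workdir/submit.sh.new" "$workdir/submit.sh"

# The "job" runs the stored copy, so the later edit has no effect on it.
stored_output=$(bash "$workdir/stored_copy.sh")
edited_output=$(bash "$workdir/submit.sh")
echo "stored copy prints: $stored_output"
echo "edited file prints: $edited_output"

rm -rf "$workdir"
```

The stored copy still prints helloworld while the edited file prints byeworld, which mirrors what scontrol reports for the held job.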

[EDIT] Recent versions of Slurm use scontrol write batch_script YOURJOBID - rather than scontrol show -dd job to show the submission script.
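
Since the script is stored at submission time, editing sbatch_run.sh between submissions should be safe. If you would rather not edit the file at all, one option is to pass the values on the sbatch command line, because sbatch hands anything after the script name to the script as positional parameters. The following is a minimal sketch, assuming the sbatch_run.sh and main.py from the question; the defaults and the dry-run branch for use outside Slurm are additions for illustration only.

```shell
#!/bin/bash
#SBATCH --job-name=sbatch_run
#SBATCH --array=1-1000
#SBATCH --exclude=node047

# Values given after the script name on the sbatch command line arrive as
# $1 and $2; the defaults are the question's first pair of values.
arg1="${1:-10}"
arg2="${2:-12}"

if [ -n "${SLURM_JOB_ID:-}" ]; then
    # Inside a Slurm job: run the real command.
    python main.py "$arg1" "$arg2"
else
    # Outside Slurm (illustrative dry run): only show what would be run.
    echo "would run: python main.py $arg1 $arg2"
fi
```

Each submission then carries its own values, for example sbatch sbatch_run.sh 10 12 followed by sbatch sbatch_run.sh 69 666, and the file itself never changes.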

