PBS Job Scheduler: Important Commands and Architecture


PBS Pro (Portable Batch System Professional) is responsible for:
1) Job scheduling.
2) Resource management.
3) Supercomputer optimization.

pbs_sched and pbs_server run on the master node.
[root@master ~]# ps -ef | grep pbs
root 16714 1 0 Sep17 ? 00:00:40 /usr/pbs/default/sbin/pbs_sched
root 16716 1 0 Sep17 ? 00:00:27 /usr/pbs/default/sbin/pbs_server
Scheduler – It interacts with the MOMs on the compute nodes to query system resources, and with the server to learn which jobs are available to execute.

pbs_mom runs on the compute nodes.
[root@node1 ~]# ps -ef | grep --color pbs
root 8204 1 0 Jun29 ? 00:03:59 /usr/pbs/default/sbin/pbs_mom
1) MOM is the "mother" of all executing jobs.
2) It runs on all the client (compute) nodes.
3) MOM is also responsible for returning the job output to the user.
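
To check which PBS daemons are up on a host, the process list can be filtered by daemon name. The following is only a minimal sketch: the function name pbs_daemons is hypothetical, and it reads `ps -ef` style text from standard input so it can also be tried on saved output.

```shell
#!/bin/bash
# Hypothetical helper: print the PBS daemon names found in `ps -ef` style
# text read from standard input (one name per line, sorted, no duplicates).
pbs_daemons() {
    awk '{
        # In `ps -ef` output the command path is the 8th field.
        n = split($8, parts, "/")
        name = parts[n]
        if (name == "pbs_server" || name == "pbs_sched" || name == "pbs_mom")
            print name
    }' | sort -u
}

# Try it on the master-node listing shown earlier.
sample='root 16714 1 0 Sep17 ? 00:00:40 /usr/pbs/default/sbin/pbs_sched
root 16716 1 0 Sep17 ? 00:00:27 /usr/pbs/default/sbin/pbs_server'
printf '%s\n' "$sample" | pbs_daemons
```

On a live system the same function can be fed directly: ps -ef | pbs_daemons. The grep process itself is not reported, because only the three daemon names are matched.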

PBS Script File Example
1) #!/bin/bash (or) #!/bin/tcsh
2) #PBS directives:
-l define the resources (nodes, ppn, memory)
-e error_file -o output_file
-N Job_Name
-m abe or n (mail events), -M mail-id
-q queue

3) The commands to run, for example an MPI program:
export VARIABLE=path
cd Directory-Path
mpirun -np X -machinefile Y program-name
Example:
#PBS -l nodes=2:ppn=2,pmem=1gb,walltime=1:00:00
#PBS -m abe
The -m abe line sends mail when the job aborts (a), begins (b), and ends (e). Use n for no mail.
NOTE: 1) The job scheduler is not limited to MPI. It can run any program, including an ordinary shell script.
2) PBS Pro is the “pro” version of OpenPBS.
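
Putting the pieces above together, a complete job script might look like the sketch below. The file name job.pbs, the job name, the queue name workq, the mail address, and the program ./my_program are all placeholders chosen for illustration. PBS_O_WORKDIR and PBS_NODEFILE are environment variables that PBS sets inside a running job. The snippet only writes the file and lists its directives, so it can be tried on a machine without PBS.

```shell
#!/bin/bash
# Write a sample job script to job.pbs (hypothetical file name).
cat > job.pbs <<'EOF'
#!/bin/bash
#PBS -N sample_job
#PBS -q workq
#PBS -l nodes=2:ppn=2,pmem=1gb,walltime=1:00:00
#PBS -o sample_job.out
#PBS -e sample_job.err
#PBS -m abe
#PBS -M user@example.com

# Move to the directory the job was submitted from.
cd "$PBS_O_WORKDIR" || exit 1

# PBS_NODEFILE lists the hosts assigned to this job, one line per slot.
NP=$(wc -l < "$PBS_NODEFILE")
mpirun -np "$NP" -machinefile "$PBS_NODEFILE" ./my_program
EOF

# Show the directives that qsub would read from the file.
grep '^#PBS' job.pbs
```

The file would then be submitted with qsub job.pbs. Taking the process count from PBS_NODEFILE instead of hard-coding it keeps the mpirun line in step with the -l request.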

PBS Commands
#qstat -q
Job queue status.
#qstat -Q
Available queues.
#qstat -a
-a List all jobs | -u USER-ID List the jobs of the specified user
-r Running jobs
-s Status, including the job comment
-B Summary information about the PBS server
-f Detailed information about a job
#qhold job-id
Put the job on hold.
#qrls job-id
Release a held job.
#qsig -s signal job-id
Send a signal to the job.
#qdel job-id
Delete the job.
#qsub -q queue-name script-file
Submit the job
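
The submit, hold, release, and delete steps use the standard PBS commands qsub, qhold, qrls, and qdel. The sketch below strings them together behind a dry-run wrapper, so the sequence can be reviewed on a machine without PBS. The run function, the DRY_RUN variable, the queue name workq, and the file name job.pbs are assumptions made for this example. The job ID is only a placeholder in the format seen in the pbsnodes listing.

```shell
#!/bin/bash
# Dry-run wrapper (hypothetical): print each command instead of executing
# it, unless DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

JOBID="32606.master"          # placeholder; a real qsub prints the new job ID

run qsub -q workq job.pbs     # submit the job script to a queue
run qhold "$JOBID"            # put the job on hold
run qrls "$JOBID"             # release the hold
run qstat -f "$JOBID"         # detailed information about the job
run qdel "$JOBID"             # delete the job
```

With DRY_RUN=0 the same lines execute for real. In that case the job ID should be captured from the qsub output rather than hard-coded.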
#PBS -j eo
Merge the output and error streams into one file (-j oe merges into the output file, -j eo into the error file).
o – output, e – error
Job status codes
Q – Queued
R – Running
E – Exiting
H – Held
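
When qstat output is processed in a script, it can help to turn the state letters into words. A minimal sketch follows, with a hypothetical function name, covering only the four states listed above. PBS defines further state letters, which this example does not handle.

```shell
#!/bin/bash
# Hypothetical helper: translate a qstat state letter into a description.
# Only the four states listed above are covered.
pbs_state_name() {
    case "$1" in
        Q) echo "queued" ;;
        R) echo "running" ;;
        E) echo "exiting" ;;
        H) echo "held" ;;
        *) echo "other ($1)" ;;
    esac
}

for s in Q R E H; do
    printf '%s = %s\n' "$s" "$(pbs_state_name "$s")"
done
```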
#pbsnodes
-a List all the nodes.
-l List the nodes that are currently down or offline.

How to Get the Master and Client Server Communication Log File.
#pbsnodes <HOST-Name> gives detailed information about a particular node:
1) the pbs_version,
2) the state: free, job-busy, state-unknown, down, or offline,
3) the number of CPUs,
4) which jobs are running.
[root@master ~]# pbsnodes node1
Mom = node1.niper.in
Port = 15002
pbs_version = PBSPro_10.0.0.82981
ntype = PBS
state = job-busy
pcpus = 8
jobs = 32606.master/0, 32606.master/1, 32606.master/2, 32606.master/3, 32687.master/4, 32687.master/5, 32687.master/6, 32687.master/7
resources_available.amber = 0
resources_available.arch = linux
resources_available.cpmd = 8
resources_available.gaussian = 8
resources_available.gromacs = 0
resources_available.host = node1
resources_available.matlab = 0
resources_available.mem = 32830976kb
resources_available.namd = 0
resources_available.ncpus = 8
resources_available.software = MATLAB_Distrib_Comp_Engine:5
resources_available.vnode = node1
resources_assigned.amber = 0
resources_assigned.cpmd = 0
resources_assigned.fred2 = 0
resources_assigned.gaussian = 8
resources_assigned.glide = 0
resources_assigned.gromacs = 0
resources_assigned.impact = 0
resources_assigned.jaguar = 0
resources_assigned.ligprep = 0
resources_assigned.matlab = 0
resources_assigned.mem = 30720000kb
resources_assigned.namd = 0
resources_assigned.ncpus = 8
resources_assigned.qikprop = 0
resources_assigned.vasp = 0
resources_assigned.vmem = 0kb
resv_enable = True
sharing = default_shared
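
The listing above shows resources_available.ncpus and resources_assigned.ncpus both at 8, which is why the node is job-busy. A small filter can compute the free CPU count from that output. This is only a sketch: the function name pbs_free_cpus is hypothetical, and it assumes the "name = value" layout shown above.

```shell
#!/bin/bash
# Hypothetical helper: read `pbsnodes <host>` output on standard input and
# print "<available> <assigned> <free>" CPU counts.
pbs_free_cpus() {
    awk -F' = ' '
        { sub(/^[ \t]+/, "", $1) }                       # drop any indentation
        $1 == "resources_available.ncpus" { avail = $2 }
        $1 == "resources_assigned.ncpus"  { used  = $2 }
        END { print avail + 0, used + 0, avail - used }
    '
}

# Try it on a few lines taken from the node1 listing above.
pbs_free_cpus <<'EOF'
state = job-busy
pcpus = 8
resources_available.ncpus = 8
resources_assigned.ncpus = 8
EOF
# prints: 8 8 0
```

On the master node the live output can be piped in directly: pbsnodes node1 | pbs_free_cpus.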

