README FILE FOR JAEGER LAB CLUSTER SCRIPTS
-------------------------------------------
CONTENTS:
---------
1. INTRO/REQUIREMENTS
2. EXAMPLES
3. COMMONLY USED SGE COMMANDS
4. PRIORITIES
5. LIST OF ALL SCRIPTS
----------------------------------------
1. INTRO/REQUIREMENTS
----------------------------------------
Add the following line to your ~/.bashrc file to set up your
path for the Jaeger Lab scripts:
--- CUT FROM HERE ---
. ~jaegerlab/brute_scripts/rc-ellipse.sh
--- CUT TO HERE ---
(the above is for ellipse.hpc.emory.edu; use rc.sh for
the old cluster clust.cc.emory.edu)
To learn how to submit and monitor jobs with the Sun Grid Engine
(SGE), start with the "man sge_intro" manual page.
Use the "qsub" command to submit jobs to the SGE.
Genesis users must copy the /home/jaegerlab/.simrc file into their home directory.
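The setup step above can be scripted so that it is safe to run more than once. The following is a minimal sketch and not part of the lab scripts: the helper name add_line_once is hypothetical, and it appends a line only when that exact line is not already in the file.

```shell
# Hypothetical helper (not part of brute_scripts): append LINE to FILE
# only if FILE does not already contain exactly that line.
add_line_once() {
    file=$1
    line=$2
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example use for the setup described above (uncomment to apply):
# add_line_once ~/.bashrc '. ~jaegerlab/brute_scripts/rc-ellipse.sh'
# cp /home/jaegerlab/.simrc ~/     # GENESIS users only
```

Running the helper twice leaves a single copy of the line in the file.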
----------------------------------------
2. EXAMPLES
----------------------------------------
GENESIS Example with Parameter File 1:
---
$ sge_submit setup_newscan.g exc-subset-simple-scan-2.par
GENESIS script: setup_newscan.g; param. file: exc-subset-simple-scan-2.par
with 440 trials.
440 rows and 11 parameters in file exc-subset-simple-scan-2.par.
Your job 22538.1-440:1 ("sge_run") has been submitted.
---
GENESIS Example with Parameter File 1 (using the fast-run queue):
---
$ sge_submit setup_newscan.g exc-subset-simple-scan-2.par -l immediate=TRUE
---
GENESIS Example with Parameter File 2:
---
$ create_perlhash_param_db my_conductances.par
$ qsub -t 1:100 ~jaegerlab/brute_scripts/sge_perlhash.sh my_gen_script.g my_conductances.par
---
GENESIS Example w/o Parameter File:
---
$ qsub ~jaegerlab/brute_scripts/run_genesis.sh my_gen_script.g
---
MATLAB Example 1:
---
$ qsub -t 1:60 ~jaegerlab/brute_scripts/sge_matlab.sh 'calculate(%d)'
---
This will call the MATLAB functions calculate(1), calculate(2), ..., calculate(60), one in each job.
MATLAB Example 2:
---
$ qsub -t 1:60 ~jaegerlab/brute_scripts/sge_matlab.sh load_part%d
---
This will call load_part1, load_part2, ..., load_part60, one in each MATLAB process.
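How sge_matlab.sh substitutes the task number is not shown in this README. The following is only a sketch of the idea, assuming the %d placeholder is filled in from the SGE_TASK_ID environment variable that SGE sets for each task of an array job (qsub -t). The function name expand_task_command is hypothetical.

```shell
# Hypothetical illustration, not the actual sge_matlab.sh code.
# Expands a printf-style template such as 'calculate(%d)' with a task number.
expand_task_command() {
    template=$1
    # Use the explicit second argument if given, else SGE_TASK_ID, else 1.
    task_id=${2:-${SGE_TASK_ID:-1}}
    # The template is used as the printf format, so %d becomes the task number.
    printf "$template\n" "$task_id"
}

expand_task_command 'calculate(%d)' 3
expand_task_command 'load_part%d' 60
```

The single quotes around the template keep the shell from interpreting the parentheses.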
Example of the old method that uses ssh to the master node for locking:
---
$ qsub -t 1:100 ~jaegerlab/brute_scripts/sge_perllock.sh my_gen_script.g my_conductances.par
---
This will launch 100 jobs that process the given parameter file.
----------------------------------------
3. COMMONLY USED SGE COMMANDS
----------------------------------------
$ qcountcpus
will give you a list of queues and the number of CPUs currently available in each.
$ qstat
will give you a list of all scheduled jobs on the cluster.
$ qstat | grep yourusername
will only show the lines with your jobs.
$ qstat -f
will show the status of all nodes on the cluster.
$ qstat -j jobnumber
will give you detailed info about your job, including error messages.
$ qmod -cj jobnumber
will clear the error state of a job and let it re-run.
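When many array tasks are queued, a per-state summary can be easier to read than the full qstat listing. The sketch below assumes the default qstat layout, with two header lines and the job state (for example r, qw, or Eqw) in the fifth column. The function name count_job_states is hypothetical, and the layout may differ between SGE versions.

```shell
# Hypothetical helper: summarize qstat output (read from stdin) by job state.
# Assumes two header lines and the state in column 5.
count_job_states() {
    awk 'NR > 2 && NF >= 5 { n[$5]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Typical use on the cluster (-u limits the listing to one user):
# qstat -u "$USER" | count_job_states
```

Jobs reported in an error state such as Eqw are the ones to inspect with "qstat -j" and, once fixed, to clear with "qmod -cj".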
----------------------------------------
4. PRIORITIES
----------------------------------------
$ qsub -p <priority> ...
will specify the priority of the current job.
Priority convention for the fast_run queue:
Job time     Priority
------------------------
< 1 hr       0
1-5 hrs      -100
5-24 hrs     -200
> 24 hrs     not appropriate for fast_run
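The convention above can be expressed as a small helper that picks the priority from an estimated run time. This is a sketch only: the function name fast_run_priority is hypothetical, the run time is given in whole minutes, and a job of exactly 5 hours is assigned to the 1-5 hrs row, which is an assumption because the table's ranges share that boundary.

```shell
# Hypothetical helper: map an estimated run time in minutes to the
# fast_run priority convention from the table above.
fast_run_priority() {
    minutes=$1
    if [ "$minutes" -lt 60 ]; then
        printf '%s\n' 0                  # < 1 hr
    elif [ "$minutes" -le 300 ]; then
        printf '%s\n' -100               # 1-5 hrs (exactly 5 hrs lands here: assumption)
    elif [ "$minutes" -le 1440 ]; then
        printf '%s\n' -200               # 5-24 hrs
    else
        echo "jobs over 24 hrs are not appropriate for fast_run" >&2
        return 1
    fi
}

# Example use (uncomment on the cluster):
# qsub -p "$(fast_run_priority 90)" ~jaegerlab/brute_scripts/run_genesis.sh my_gen_script.g
```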
----------------------------------------
5. LIST OF ALL SCRIPTS
----------------------------------------
Scripts in use:
-----------------------------------------------------------------------------------
checkMissing.pl          - Cross-checks files and parameter lines to see
                           if any simulations have been missed.
create_perlhash_param_db - Creates a Perl database with a .db extension from a parameter file.
dosim                    - Marks the first available line in a parameter file and returns it.
dosimnum                 - Returns the desired row from a parameter file.
lockLinuxFile            - Requests a mutually exclusive lock on a file and runs a command.
paramRanges.txt          - Example parameter definition file for brute force search.
paramScan.pl             - Parameter file generator for brute force search.
                           Reads the parameter definition file.
qcountcpus               - Asks SGE how many CPUs are running.
rc.sh                    - Jaeger Lab startup script to set up paths, etc.
run_genesis.sh           - Simple SGE submission script that runs a Genesis script
                           without referring to a parameter file.
sge_matlab.sh            - SGE submission script for MATLAB jobs. See the MATLAB
                           examples in section 2.
sge_perllock.sh          - SGE submission script, same as sge_perllock_local.sh
                           (listed under deprecated scripts below), but uses ssh to
                           connect to the master node and request file locking.
sge_perllock_test.sh     - Empty SGE submission script for testing purposes. Waits
                           for a few seconds instead of running Genesis.
splitparfile             - Splits a parameter file into two parts.
takebacksim              - Unmarks the last executed parameter row.
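Several of the scripts above depend on mutually exclusive access to a shared parameter file. The sketch below illustrates the general idea only and is not the lockLinuxFile implementation: it uses mkdir, which fails if the directory already exists, as a simple lock on a local file system (behavior on network file systems may differ). The function name with_lock and the lock path are hypothetical.

```shell
# Hypothetical illustration of "take a lock, then run a command".
# mkdir either creates the directory or fails, so only one caller
# at a time can hold the lock.
with_lock() {
    lockdir=$1
    shift
    until mkdir "$lockdir" 2>/dev/null; do
        sleep 1                      # another process holds the lock; retry
    done
    "$@"                             # run the protected command
    status=$?
    rmdir "$lockdir"                 # release the lock
    return "$status"
}

# Example: print the first row of a parameter file while holding the lock.
# with_lock /tmp/my_conductances.lock head -n 1 my_conductances.par
```

The exit status of the protected command is passed back to the caller, so a failed command can still be detected after the lock is released.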
Deprecated scripts (but maybe still useful):
-----------------------------------------------------------------------------------
checksims                - Counts data files on each node of a cluster.
dist_params              - Distributes separate parameter files to each node.
forall                   - Deprecated, use dsh: run command on all nodes.
forallrsh                - Deprecated, use dsh: run command on all nodes using rsh.
runbatch                 - Deprecated: Runs Genesis repeatedly until all
                           parameter rows are processed. Use qsub as shown
                           above with one of the sge_* scripts.
runbatchenv              - Deprecated: Same as runbatch, but passes parameters
                           to Genesis through an environment variable.
runbatchremain           - Deprecated: Same as runbatch, but runs as many processes
                           as needed to finish the remaining rows in a parameter file.
sge_perllock_local.sh    - Deprecated: SGE submission script that uses network
                           file locking to access rows of a central parameter file.
---
Cengiz Gunay <cgunay@emory.edu>
Created: 2005/06/05
$Id: README,v 1.9 2007/08/13 18:49:09 cengiz Exp $