README FILE FOR JAEGER LAB CLUSTER SCRIPTS ------------------------------------------- CONTENTS: --------- 1. INTRO/REQUIREMENTS 2. EXAMPLES 3. LIST OF ALL SCRIPTS ---------------------------------------- 1. INTRO/REQUIREMENTS ---------------------------------------- Add the following line in your ~/.bashrc file to adjust your path for Jaegerlab settings: --- CUT FROM HERE --- . ~jaegerlab/brute_scripts/rc-ellipse.sh --- CUT TO HERE --- (the above is for ellipse.hpc.emory.edu; use rc.sh for the old cluster clust.cc.emory.edu) To learn about the Sun Grid Engine (SGE) on how to submit and monitor jobs, start from the "man sge_intro" manual page. Use the "qsub" command to submit jobs to the SGE. Genesis users must copy the /home/jaegerlab/.simrc file into their home directory. ---------------------------------------- 2. EXAMPLES ---------------------------------------- GENESIS Example with Parameter File 1: --- $ sge_submit setup_newscan.g exc-subset-simple-scan-2.par GENESIS script: setup_newscan.g; param. file: exc-subset-simple-scan-2.par with 440 trials. 440 rows and 11 parameters in file exc-subset-simple-scan-2.par. Your job 22538.1-440:1 ("sge_run") has been submitted. --- GENESIS Example with Parameter File 1 (using the fast-run queue): --- $ sge_submit setup_newscan.g exc-subset-simple-scan-2.par -l immediate=TRUE --- GENESIS Example with Parameter File 2: --- $ create_perlhash_param_db my_conductances.par $ qsub -t 1:100 ~jaegerlab/brute_scripts/sge_perlhash.sh my_gen_script.g my_conductances.par --- GENESIS Example w/o Parameter File: --- $ qsub ~jaegerlab/brute_scripts/run_genesis.sh my_gen_script.g --- MATLAB Example 1: --- $ qsub -t 1:60 ~jaegerlab/brute_scripts/sge_matlab.sh calculate(%d) --- This will call the matlab functions calculate(1), calculate(2), ... etc. in each job. MATLAB Example 2: --- $ qsub -t 1:60 ~jaegerlab/brute_scripts/sge_matlab.sh load_part%d --- This will call load_part1, load_part2, ..., load_part60 in each matlab process. Example of old method that sshs to master node for locking: $ qsub -t 1:100 ~jaegerlab/brute_scripts/sge_perllock.sh my_gen_script.g my_conductances.par will launch 100 jobs processing the given parameter file. ---------------------------------------- 3. COMMONLY USED SGE COMMANDS ---------------------------------------- $ qcountcpus will give you a list of queues and the number of CPUs currently available in each. $ qstat will give you a list of all scheduled jobs on the cluster. $ qstat | grep yourusername will only show the lines with your jobs. $ qstat -f will show the status of all nodes on the cluster. $ qstat -j jobnumber will give you detailed info about your job, including error messages. $ qmod -cj jobnumber will clear the error state of a job and let it re-run. ---------------------------------------- 4. PRIORITIES ---------------------------------------- $ qsub -p <priority> ... will specify that the priority of the current job. Priority convention for the fast_run queue: Job time Priority ------------------------ <1hr 0 1-5 hrs -100 5-24 hrs -200 > 24 hrs not appropriate for fast_run ---------------------------------------- 5. LIST OF ALL SCRIPTS ---------------------------------------- Scripts in use: ----------------------------------------------------------------------------------- checkMissing.pl - Cross-checks files and parameter lines to see if any simulations have been missed. create_perlhash_param_db- Creates a Perl database with a .db extension from a parameter file. dosim - Mark the first available line in a parameter file and return it. dosimnum - Return the desired row from a parameter file. lockLinuxFile - Request mutual exclusive lock on a file and run a command. paramRanges.txt - Example parameter definition file for brute force search. paramScan.pl - Parameter file generator for brute force search. Reads the parameter definition file. qcountcpus - Asks SGE how many CPUs are running. rc.sh - Jaeger Lab startup script to set up paths, etc. run_genesis.sh - Simple SGE submission script that runs a Genesis script without refering to a parameter file. sge_matlab.sh - SGE submission script for matlab jobs. See example. sge_perllock.sh - SGE submission script, same as *_local.sh above, but uses ssh to connect to master node and request file locking. sge_perllock_test.sh - Empty SGE submission script for testing purposes. Waits for a few seconds instead of running Genesis. splitparfile - Split a parameter file into two parts. takebacksim - Unmarks the last executed parameter row. Deprecated scripts (but maybe still useful): ----------------------------------------------------------------------------------- checksims - counts data files on each node of a cluster. dist_params - Distributes separate parameter files to each node. forall - Deprecated, use dsh: run command on all nodes. forallrsh - Deprecated, use dsh: run command on all nodes using rsh. runbatch - Deprecated: Runs genesis repeatedly until all parameter rows are processed. Use qsub as shown above with one of the sge_* scripts. runbatchenv - Deprecated: Same as runbatch, but passes parameters to Genesis through an environment variable. runbatchremain - Deprecated: Same as runbatch, but runs as many processes as to finish the remaining rows in a parameter file. sge_perllock_local.sh - Deprecated: SGE submission script that uses network file locking to access rows of a central parameter file. --- Cengiz Gunay <cgunay@emory.edu> Created: 2005/06/05 $Id: README,v 1.9 2007/08/13 18:49:09 cengiz Exp $