SGE on ce01

Note: This doesn't actually quite work yet.

The Twiki. (What is it with CERN's obsession with the letter 'T'?)

Go to the documentation and download all the RPMs:
wget http://www.lip.pt/grid/sgeV61u3toSLC4/sge-ckpt-V61u3-1.i386.rpm
wget http://www.lip.pt/grid/sgeV61u3toSLC4/sge-daemons-V61u3-1.i386.rpm
wget http://www.lip.pt/grid/sgeV61u3toSLC4/sge-devel-V61u3-1.i386.rpm
wget http://www.lip.pt/grid/sgeV61u3toSLC4/sge-docs-V61u3-1.i386.rpm
wget http://www.lip.pt/grid/sgeV61u3toSLC4/sge-parallel-V61u3-1.i386.rpm
wget http://www.lip.pt/grid/sgeV61u3toSLC4/sge-qmon-V61u3-1.i386.rpm
wget http://www.lip.pt/grid/sgeV61u3toSLC4/sge-utils-V61u3-1.i386.rpm
wget http://www.lip.pt/grid/sgeV61u3toSLC4/sge-V61u3-1.i386.rpm
Install them:
rpm -ivh sge*
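
The eight downloads above differ only in the package name, so they can be scripted. A minimal sketch, assuming the LIP URLs are still reachable; the helper name sge_rpm_urls is made up for illustration, and the actual wget call is left commented out so that the snippet only prints the URLs:

```shell
#!/bin/sh
# Base URL and version string copied from the wget commands above.
BASE=http://www.lip.pt/grid/sgeV61u3toSLC4
VER=V61u3-1.i386

# Hypothetical helper: print one URL per RPM.
sge_rpm_urls() {
    # The main "sge" package has no sub-package suffix, so list it separately.
    echo "$BASE/sge-$VER.rpm"
    for pkg in ckpt daemons devel docs parallel qmon utils; do
        echo "$BASE/sge-$pkg-$VER.rpm"
    done
}

sge_rpm_urls
# To actually download them: sge_rpm_urls | wget -i -
```

wget -i - reads the list of URLs from standard input, so nothing is fetched until that last line is uncommented.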

yum install glite-SGE_utils

Download and install the SGE server yaim interface from the ETICS repository:
rpm -ivh http://eticssoft.web.cern.ch/eticssoft/repository/org.glite/org.glite.yaim.sge-server/4.0.1/noarch/glite-yaim-sge-server-4.0.1-4.noarch.rpm


Try to configure SGE:
/opt/glite/yaim/bin/yaim -d6 -c -s site-info-ce01.def -n lcg-CE -n SGE_server -n SGE_utils
Open the ports:
iptables -I RH-Firewall-1-INPUT 11 -p tcp --dport 536 -j LOG    (for testing)
iptables -I RH-Firewall-1-INPUT 12 -p tcp -s 155.198.217.20 --dport 536 -j ACCEPT
Tests:
source /usr/local/sge/pro/default/common/settings.sh
qstat
qhost
Check the qmaster log: /usr/local/sge/pro/default/spool/qmaster/messages
/sbin/iptables -L -v

As dteam049:
[dteam049@ce01 siteinfo]$ more /home/dteam049/test.sh
hostname

[dteam049@ce01 siteinfo]$ qsub test.sh

The results will be in the user's home directory on the worker node.
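
A slightly more talkative test job makes it easier to see where a job ran and as whom. This is only a sketch: it assumes the standard qsub options -S and -j are accepted by this SGE version, and the function name report is made up for illustration.

```shell
#!/bin/sh
# Lines starting with "#$" are read by qsub as embedded options;
# to the shell they are ordinary comments.
#$ -S /bin/sh
#$ -j y

# Print where, and as whom, the job is running.
report() {
    echo "host: $(hostname)"
    echo "user: $(id -un)"
    echo "dir:  $(pwd)"
}

report
```

Saved under a name of your choice, it is submitted with qsub in the same way as test.sh. With -j y the error stream is merged into the output file, so there is only one file to look for on the worker node.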

Problems

SGE queue permissions
After adding another VO (supernemo), I get an error that dteam cannot submit to any queue. On closer inspection:
[root@ce01 functions]# more /usr/local/sge/V61u3/default/spool/qmaster/cqueues/30min
qname 30min
hostlist wntest00.hep.ph.ic.ac.uk
[....]
user_lists snemoprd snemo
The script seems to add only the last VO as users to a queue. It worked the first time around because I only asked for ops and dteam and only tested dteam.
This is set in yaim/functions/config_sge_server.
The code seems to expect queue names of the form
queue_name_vo_name
which we don't have.
Let's do a hack:
Complicated version:
Starting at the line
for QUEUE in `echo $QUEUES | sed 's/"//g'` ; do
add the line
j=""
Then, after the line
for i in `echo ${USERSETLIST[*]}`; do
add the line
j="$j $i"
and, after that loop, change QUEUE_USERSET=$i to QUEUE_USERSET=$j.
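
The effect of the complicated version can be tried outside yaim. The following is only a sketch of the idea, not the actual code from config_sge_server: the user set names in USERSETS are example values, and a plain string is used in place of the USERSETLIST array that the real function iterates over.

```shell
#!/bin/sh
# Stand-in for the list built by config_sge_server (example values only).
USERSETS="dteam ops snemoprd snemo"

# Original behaviour: after the loop, i holds only the last element.
for i in $USERSETS; do
    :
done
LAST_ONLY=$i

# Patched behaviour: j collects every element seen in the loop.
j=""
for i in $USERSETS; do
    j="$j $i"
done
QUEUE_USERSET=$j

# Unquoted on purpose, so the leading space in j is dropped.
echo before: user_lists $LAST_ONLY
echo after: user_lists $QUEUE_USERSET
```

The first line of output shows why only snemo ended up in the queue; the second shows all user sets on one user_lists line.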
Simple version:
change
user_lists $QUEUE_USERSET
to
user_lists NONE
as we allow all users in all queues.
(We can always do a fudge for ops later ;-)
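
The simple version is a one-line substitution, so it can be applied with sed. A sketch, assuming GNU sed (for -i) and that the text user_lists $QUEUE_USERSET appears literally in the function file; the helper name open_queues_to_all and the demo file are made up, and it is worth trying this on a copy before touching the real yaim/functions/config_sge_server:

```shell
#!/bin/sh
# Hypothetical helper: replace the per-queue user set with NONE in the given file.
open_queues_to_all() {
    # \$ makes sed match a literal dollar sign rather than end of line.
    sed -i 's/user_lists[[:space:]]*\$QUEUE_USERSET/user_lists NONE/' "$1"
}

# Demo on a throwaway file rather than the real yaim function.
demo=$(mktemp)
echo 'user_lists            $QUEUE_USERSET' > "$demo"
open_queues_to_all "$demo"
cat "$demo"
```

After rerunning yaim, the queue files under spool/qmaster/cqueues should then show user_lists NONE for every queue.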