EMI CREAM with SGE

This refers to EMI1, Update 10, fresh install.

(0) Documentation
System Administrator Guide for CREAM
Known issues
Troubleshooting

(1) Preliminaries
yum install yum-protectbase.noarch
yum install yum-priorities

(2) Certificates
wget http://repository.egi.eu/sw/production/cas/1/current/repo-files/EGI-trustanchors.repo -O /etc/yum.repos.d/EGI-trustanchors.repo
yum install ca-policy-egi-core

openssl pkcs12 -clcerts -nokeys -out /etc/grid-security/hostcert.pem -in cetest00.p12
openssl pkcs12 -nocerts -nodes -out /etc/grid-security/hostkey.pem -in cetest00.p12
chmod 600 /etc/grid-security/hostcert.pem
chmod 400 /etc/grid-security/hostkey.pem
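A quick sanity check after extracting the pair is to confirm that the certificate and the key belong together. The sketch below compares the RSA modulus of both files; the function name cert_matches_key is made up for illustration, and it assumes an unencrypted RSA key (which is what the -nodes export above produces for an RSA certificate).

```shell
# Succeeds only if certificate and private key carry the same RSA modulus.
# $1 = certificate (PEM), $2 = unencrypted RSA private key (PEM)
# (cert_matches_key is a hypothetical helper, not part of any EMI package)
cert_matches_key() {
    cert_mod=$(openssl x509 -noout -modulus -in "$1" 2>/dev/null) || return 2
    key_mod=$(openssl rsa -noout -modulus -in "$2" 2>/dev/null) || return 2
    [ -n "$cert_mod" ] && [ "$cert_mod" = "$key_mod" ]
}

# Example:
#   cert_matches_key /etc/grid-security/hostcert.pem /etc/grid-security/hostkey.pem && echo "pair OK"
```

A return code of 2 means one of the files could not be read at all, 1 means both were read but do not match.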

(3) EMI repos
cd /etc/pki/rpm-gpg/
wget http://emisoft.web.cern.ch/emisoft/dist/EMI/1/RPM-GPG-KEY-emi
cd /etc/yum.repos.d
wget http://emisoft.web.cern.ch/emisoft/dist/EMI/1/sl5/x86_64/updates/emi-release-1.0.1-1.sl5.noarch.rpm
rpm -i emi-release-1.0.1-1.sl5.noarch.rpm
I now have three emi repos: emi1-base.repo, emi1-third-party.repo and emi1-updates.repo, and two epel repos: epel.repo, epel-testing.repo.

(4) Install software
yum install xml-commons-apis
yum install emi-cream-ce
yum install emi-ge-utils
yum install gridengine-qmaster

(5) Make some special users
Note that the default uids/gids for the users in /opt/glite/yaim/examples/edgusers.conf are already in use on our system, so I just add 100 to each. On reflection, it would have been better to add 400 to each, to get around the 'root is not allowed to run cron jobs for users with uid < 500' setting (only relevant for user 'glite').
groupadd -g 200 glexec
useradd -m -g glexec glexec
groupadd -g 201 glite (groupadd -g 255 glite)
useradd -m -u 201 -g glite glite (useradd -m -u 255 -g glite glite)
groupadd -g 252 edguser
useradd -m -u 252 -g edguser edguser
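Since the whole point of shifting the ids is to avoid clashes, it may be worth checking that an id is actually unused before running groupadd/useradd. A minimal sketch, assuming getent is available (it is on glibc systems); the helper names uid_status and gid_status are invented here.

```shell
# Print "taken" or "free" for a numeric uid / gid, as seen by the system's
# passwd and group databases (local files, LDAP, ... whatever nsswitch uses).
# (uid_status and gid_status are hypothetical helper names)
uid_status() {
    if getent passwd "$1" >/dev/null 2>&1; then echo taken; else echo free; fi
}
gid_status() {
    if getent group "$1" >/dev/null 2>&1; then echo taken; else echo free; fi
}

# Example: check the ids used above before creating anything
#   for id in 200 201 252; do echo "uid $id: $(uid_status "$id"), gid $id: $(gid_status "$id")"; done
```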

(6) SGE specifics
Link port6444 to sge_qmaster (in /var/sgeCA/); ls -l then shows:
lrwxrwxrwx 1 root root 11 Nov 30 16:21 port6444 -> sge_qmaster
drwxr-xr-x 3 root root 4096 Oct 13 14:09 sge_qmaster
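The listing above corresponds to a relative symlink (its size, 11, is the length of the target name). One way to create it is sketched below; the function name link_port_dir is made up, and the directory is passed as an argument so the sketch can be tried somewhere harmless first.

```shell
# Create the relative symlink port6444 -> sge_qmaster inside the given directory.
# $1 = CA directory (/var/sgeCA in this setup)
# Fails without touching anything if sge_qmaster is missing or the link already exists.
# (link_port_dir is a hypothetical helper name)
link_port_dir() {
    ( cd "$1" && [ -d sge_qmaster ] && ln -s sge_qmaster port6444 )
}

# Example:
#   link_port_dir /var/sgeCA && ls -l /var/sgeCA
```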

Everybody and their grandmother needs to be able to run qstat:
chown -R ldap:sgeadmin /var/sgeCA/sge_qmaster/default/userkeys/ldap
chown -R edguser:sgeadmin /var/sgeCA/sge_qmaster/default/userkeys/edguser
chown -R tomcat:sgeadmin /var/sgeCA/sge_qmaster/default/userkeys/tomcat
When restarting the bdii, make sure PYTHONPATH is set first: export PYTHONPATH=/usr/lib/python:$PYTHONPATH
bdii
Our queue configuration is somewhat special as we have one machine which has a short version of our queue (for ops tests):
[root@ceprod07 ~]# qconf -sq grid.q | grep -E 's_rt|h_rt|s_cpu|h_cpu'
s_rt 49:05:00,[we000.grid.hep.ph.ic.ac.uk=1:00:00]
h_rt 49:05:00,[we000.grid.hep.ph.ic.ac.uk=1:02:00]
s_cpu 48:00:00
h_cpu 48:05:00
Regular users won't see this, but the bdii advertises the minimum of all walltimes (calculated in /usr/libexec/glite-info-dynamic-sge).
To get the bdii to advertise the correct values, I need to change (~ line 770)
$QUEUE_minlimits{$q}->{'rt'} = &minval( $QUEUE_minlimits{$q}->{'rt'}, $wallclocktime );
to
$QUEUE_minlimits{$q}->{'rt'} = &maxval( $QUEUE_minlimits{$q}->{'rt'}, $wallclocktime );
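The minval to maxval change can also be scripted, which helps when it has to be redone (a package update would presumably put the original file back). This is only a sketch: the function name patch_rt_limit is invented, and the pattern assumes the line looks exactly like the one quoted above. It touches only the 'rt' assignment and keeps a .bak copy of the file.

```shell
# Swap &minval( for &maxval( on the 'rt' (wallclock) assignment only.
# $1 = path to the script to patch; the original is saved as "$1.bak".
# (patch_rt_limit is a hypothetical helper name)
patch_rt_limit() {
    # The dots around rt match the single quotes; \& in the replacement is a literal &.
    sed -i.bak '/QUEUE_minlimits.*{.rt.} *= *&minval(/s/&minval(/\&maxval(/' "$1"
}

# Example:
#   patch_rt_limit /usr/libexec/glite-info-dynamic-sge
#   diff /usr/libexec/glite-info-dynamic-sge.bak /usr/libexec/glite-info-dynamic-sge
```

Running diff afterwards should show exactly one changed line; if it shows none, the line in the installed version differs from the one quoted here and needs editing by hand.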

(7) Run yaim
/opt/glite/yaim/bin/yaim -c -s /opt/glite/yaim/siteinfo/siteinfo_cetest00.def -n creamCE -n SGE_utils
Note that for cream-ce CMS requires the * notation in the groups.conf file (and here's the users.conf file for completeness).

(8) The bdii (selinux)
Add slapd: ALL to /etc/hosts.allow
(this is wrong, apparently) semanage fcontext -a -t slapd_db_t "/var/log/bdii(/.*)?"; restorecon -vR /var/log/bdii/
semanage port -a -t ldap_port_t -p tcp 2170
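To confirm that the port rule took effect, the output of semanage port -l can be searched for the type, protocol and port. The filter below is a rough sketch with an invented name (port_listed); it assumes the usual three-column layout (type, protocol, comma-separated ports) and only recognises ports listed individually, not ones hidden inside a range such as 1024-2000.

```shell
# Read "semanage port -l" output on stdin and succeed if the given port is
# listed for the given SELinux type and protocol.
# $1 = SELinux type, $2 = protocol, $3 = port number
# (port_listed is a hypothetical helper; port ranges are not expanded)
port_listed() {
    awk -v t="$1" -v p="$2" -v n="$3" '
        $1 == t && $2 == p {
            for (i = 3; i <= NF; i++) {
                gsub(/,/, "", $i)          # strip the list separators
                if ($i == n) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example:
#   semanage port -l | port_listed ldap_port_t tcp 2170 && echo "2170/tcp is labelled ldap_port_t"
```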

(9) Hacks (local configuration)
All combined in the post_yaim_hacks.sh script

(10) On the worker nodes
Edit the cream-sge.sh script located in /usr/bin on the worker nodes to recognize the new CE as a cream CE.

(11) mysql queries
[root@ceprod07 ~]# mysql -u creamdbuser -p
Enter password:
mysql> use creamdb;
mysql> select * from job_status WHERE jobId = 'CREAM890544290';
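The same lookup can be run without the interactive prompt by passing the statement with -e. The helper below only builds the SQL string; its name (job_status_query) is made up, and the check that a job id is "CREAM" followed by digits is an assumption based on the single id shown above.

```shell
# Print the job_status query for one CREAM job id.
# Refuses anything that is not "CREAM" followed by digits only, so that the
# id can be pasted into the SQL string safely.
# (job_status_query is a hypothetical helper; the id format is an assumption)
job_status_query() {
    num=${1#CREAM}
    [ "$num" != "$1" ] || return 1        # prefix CREAM was missing
    case "$num" in
        ''|*[!0-9]*) return 1 ;;          # empty or non-digit remainder
    esac
    printf "select * from job_status WHERE jobId = '%s';\n" "$1"
}

# Example (prompts for the password, then prints the rows):
#   mysql -u creamdbuser -p creamdb -e "$(job_status_query CREAM890544290)"
```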