EMI WMS


This log refers to EMI-1 WMS Update 19. The base OS is CentOS 5.8.

(0) Documentation
EMI 1
The EMI-WMS System Administrator Guide
EMI Generic Installation Guide for EMI-1


(1) Repositories and Preliminaries
yum install yum-protectbase
yum install yum-priorities
CAs: wget http://repository.egi.eu/sw/production/cas/1/current/repo-files/egi-trustanchors.repo -O /etc/yum.repos.d/egi-trustanchors.repo
EPEL: wget http://download.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
rpm -i epel-release-5-4.noarch.rpm
EMI: wget http://emisoft.web.cern.ch/emisoft/dist/EMI/1/sl5/x86_64/updates/emi-release-1.0.1-1.sl5.noarch.rpm
rpm -i emi-release-1.0.1-1.sl5.noarch.rpm

(2) Install the software
yum install ca-policy-egi-core
yum install emi-wms condor-emi (see notes for update 19)

(3) Install the hostcert
cd /etc/grid-security
openssl pkcs12 -clcerts -nokeys -out hostcert.pem -in wms.p12
openssl pkcs12 -nocerts -nodes -out hostkey.pem -in wms.p12
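
The two openssl commands above leave both files with whatever permissions the umask gives them. A minimal sketch of tightening them afterwards, assuming the usual grid convention of a world-readable certificate (644) and an owner-only key (400). The function name secure_hostcert is made up for this log and is not part of any EMI tool.

```shell
# Hypothetical helper: tighten permissions on an extracted host certificate pair.
# Assumes hostcert.pem and hostkey.pem already exist in the given directory.
secure_hostcert() {
    dir="${1:-/etc/grid-security}"
    [ -f "$dir/hostcert.pem" ] && [ -f "$dir/hostkey.pem" ] || {
        echo "hostcert.pem or hostkey.pem missing in $dir" >&2
        return 1
    }
    chmod 644 "$dir/hostcert.pem"   # certificate is public
    chmod 400 "$dir/hostkey.pem"    # private key: owner read-only
}
```

Called without arguments it works on /etc/grid-security; pass another directory to try it out on a copy first.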

(4) Draining a WMS
It's probably a good idea to set it into draining while configuring it. The .drain file goes in /var.
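
A small sketch of how the drain flag could be toggled, assuming that the flag is simply the file /var/.drain and that only its presence matters, not its content. The helper name wms_drain is hypothetical.

```shell
# Hypothetical helper: toggle the WMS drain flag.
# Assumes the flag is the file .drain in /var (per this log) and that
# its presence, not its content, is what matters.
wms_drain() {
    mode="$1"
    dir="${2:-/var}"
    case "$mode" in
        on)     touch "$dir/.drain" ;;
        off)    rm -f "$dir/.drain" ;;
        status) if [ -e "$dir/.drain" ]; then echo draining; else echo active; fi ;;
        *)      echo "usage: wms_drain on|off|status [dir]" >&2; return 2 ;;
    esac
}
```

For example, "wms_drain on" before running yaim and "wms_drain off" once the checks in the configuration section pass.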

(5) Configuration
(a) Turn SELinux off for the first run through: setenforce 0 (yes, it should run with selinux enabled, but that comes later)
(b) Change the default uids in /opt/glite/yaim/examples/edgusers.conf to values greater than 500, to avoid trouble with the cron jobs. Addendum: as it turns out, installing the software already creates a glite user, which is then not recreated with a uid > 500 (on the LB this is still the case). The actual problem (bug?) lies with pam/kerberos, and the clean solution is to modify /etc/pam.d/crond appropriately.
(c) mkdir /opt/glite/yaim/siteinfo; chmod 700 /opt/glite/yaim/siteinfo
(d) add SLAPD: ALL to /etc/hosts.allow to allow the bdii to run
(e) Open the relevant ports in the iptables (and don't forget /etc/init.d/iptables restart).
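
Since these steps tend to get repeated on every reinstall, the hosts.allow change can be made idempotent. A minimal sketch, with a made-up helper name ensure_line, that appends the line only if it is not already present:

```shell
# Hypothetical helper: append a line to a file only if it is not already there,
# so re-running the configuration steps does not pile up duplicates.
ensure_line() {
    line="$1"
    file="$2"
    # -x: match the whole line, -F: treat the text literally, -q: quiet
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Usage for the bdii step would be: ensure_line "SLAPD: ALL" /etc/hosts.allow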

/opt/glite/yaim/bin/yaim -c -s /opt/glite/yaim/siteinfo/siteinfo-wms01.def -n WMS

Check that bdii and fetch-crl are switched on at boot:
chkconfig --list | grep fetch
chkconfig fetch-crl-cron on
chkconfig --list | grep bdii
chkconfig bdii on
Check bdii is working: ldapsearch -LLL -x -H ldap://wms01.grid.hep.ph.ic.ac.uk:2170 -b o=glue
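
The ldapsearch output is long, so a quick entry count is an easier sanity check than reading it. A sketch, assuming only that every entry in the -LLL output starts with a "dn:" line; the helper name count_ldif_entries is hypothetical.

```shell
# Hypothetical helper: count the entries in LDIF text read from stdin.
# Each LDAP entry in ldapsearch -LLL output starts with a "dn:" line.
count_ldif_entries() {
    grep -c '^dn:' || true   # grep exits 1 when the count is 0; it still prints 0
}
```

Usage: ldapsearch -LLL -x -H ldap://wms01.grid.hep.ph.ic.ac.uk:2170 -b o=glue | count_ldif_entries
A count of 0 suggests the bdii is up but not publishing anything under o=glue; what counts as a healthy number is site-specific.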

(6) SELinux
This machine should obviously work with SELinux enabled (here is the official statement).
No, I can't work it out either.

(7) Odds and Ends

(8) Restarting services
If "JobController" fails to stop: rm /var/jobcontrol/lock
Sometimes the glite-lb-bkserver processes aren't killed even though no error shows up when stopping the service, only when trying to restart it ("Input/output error (Too many connections)" in /var/log/messages). Killing all processes by hand after stopping it resolves the issue. These leftover processes can cause a high load on the machine if the service is restarted without killing them first.



This log refers to EMI-1 WMS Update 15. The base OS is CentOS 5.8.

(0) Documentation
EMI 1
EMI docs
The EMI-WMS System Administrator Guide
EMI Generic Installation Guide for EMI-1


(1) Repositories and Preliminaries
yum install yum-protectbase.noarch
yum install yum-priorities
CAs: wget http://repository.egi.eu/sw/production/cas/1/current/repo-files/egi-trustanchors.repo -O /etc/yum.repos.d/egi-trustanchors.repo
EPEL: wget http://download.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
rpm -i epel-release-5-4.noarch.rpm
EMI: wget http://emisoft.web.cern.ch/emisoft/dist/EMI/1/sl5/x86_64/updates/emi-release-1.0.1-1.sl5.noarch.rpm
rpm -i emi-release-1.0.1-1.sl5.noarch.rpm

At the end of this, I have the following repos installed:
ICHEP.repo, CentOS-Base.repo, CentOS-Media.repo, CentOS-Vault.repo, CentOS-Debuginfo.repo, egi-trustanchors.repo, emi1-third-party.repo, emi1-base.repo, emi1-updates.repo, epel-testing.repo, epel.repo
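
A quick way to confirm the list above on a fresh machine is to check for the expected .repo files by name. A minimal sketch; the helper name missing_repos is made up, and which files to expect is taken from the list in this log.

```shell
# Hypothetical helper: print the names of expected .repo files
# that are missing from a directory (prints nothing if all are there).
missing_repos() {
    dir="$1"; shift
    for repo in "$@"; do
        [ -f "$dir/$repo" ] || echo "$repo"
    done
}
```

Usage: missing_repos /etc/yum.repos.d egi-trustanchors.repo emi1-base.repo emi1-updates.repo emi1-third-party.repo epel.repo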

(2) Install the software
yum install ca-policy-egi-core
yum install emi-wms

(3) Draining a WMS
It's probably a good idea to set it into draining while configuring it. The .drain file goes in /var.

(4) Install the hostcert
cd /etc/grid-security
openssl pkcs12 -clcerts -nokeys -out hostcert.pem -in wms.p12
openssl pkcs12 -nocerts -nodes -out hostkey.pem -in wms.p12

(5) Configuration
(a) Turn SELinux off for the first run through: setenforce 0
(b) Check httpd is running and make sure it comes back after a reboot:
/etc/init.d/httpd status
chkconfig --list | grep http
chkconfig httpd on
(c) Open the relevant ports in the iptables (and don't forget /etc/init.d/iptables restart).
(d) The default uids for edguser etc. are already taken on our system ("DEBUG: Executing... groupadd -g 156 infosys, ERROR: Group infosys with gid '156' failed to be created"), so I need to edit /opt/glite/yaim/examples/edgusers.conf. Adding 400 to each user and group id does the trick. This also avoids the problem that users with a uid < 500 cannot run proper cron jobs.
(e) add SLAPD: ALL to /etc/hosts.allow to allow the bdii to run.
(f) cd /opt/glite/yaim; mkdir siteinfo; chmod 0700 siteinfo
The siteinfo directory holds: users.conf, groups.conf, siteinfo-wms02.def, vo.d
/opt/glite/yaim/bin/yaim -c -s /opt/glite/yaim/siteinfo/siteinfo-wms02.def -n WMS
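
Step (d) above shifts every uid and gid by a fixed offset, which is tedious by hand. A sketch of doing it with awk, under the assumption that edgusers.conf follows the yaim users.conf layout UID:LOGIN:GID[,GID...]:GROUP[,GROUP...]:VO:FLAG: (check the file before trusting this). The helper name shift_ids is hypothetical.

```shell
# Hypothetical helper: add a fixed offset to every uid and gid in a yaim
# users.conf-style file read from stdin.
# Assumed layout per line: UID:LOGIN:GID[,GID...]:GROUP[,GROUP...]:VO:FLAG:
shift_ids() {
    awk -F: -v OFS=: -v off="$1" '
        /^#/ || NF < 3 { print; next }        # keep comments and blank lines as-is
        {
            $1 = $1 + off                     # uid
            n = split($3, gids, ",")          # one or more gids
            $3 = gids[1] + off
            for (i = 2; i <= n; i++) $3 = $3 "," (gids[i] + off)
            print
        }'
}
```

Usage: shift_ids 400 < edgusers.conf > edgusers.conf.new, then compare the two files before replacing the original. Writing to a separate file keeps the untouched defaults around in case the offset collides with something else.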

(6) SELinux
This machine should obviously work with SELinux enabled (here is the official statement) and I am slowly making my way there. So far I have the following, but something is still missing:
(a) relabeling of files (ls -Z to see labels):
semanage fcontext -a -t httpd_config_t "/home/glite/.certs/hostcert.pem"
semanage fcontext -a -t httpd_config_t "/home/glite/.certs/hostkey.pem"
semanage fcontext -a -t httpd_config_t "/home/glite/.certs"
restorecon -vR /home/glite/.certs
(b) check the security context (with -n, restorecon only reports what it would change; drop the -n to actually apply it):
restorecon -vrn /var/lib/mysql

(7) Restarting services
If "JobController" fails to stop: rm /var/jobcontrol/lock
Sometimes the glite-lb-bkserver processes aren't killed even though no error shows up when stopping the service, only when trying to restart it ("Input/output error (Too many connections)" in /var/log/messages). Killing all processes by hand after stopping it resolves the issue. These leftover processes can cause a high load on the machine if the service is restarted without killing them first.



This is the install log for the initial EMI-1 release of the WMS.

(0) Documentation
EMI 1
EMI docs
As there is no official install guide I am going to go with: The EMI-WMS System Administrator Guide
HowTo.

(1) Repositories
cd /etc/yum.repos.d
wget http://repository.egi.eu/sw/production/cas/1/current/repo-files/EGI-trustanchors.repo
wget http://emisoft.web.cern.ch/emisoft/dist/EMI/1/sl5/repos/emi1-base.repo
wget http://emisoft.web.cern.ch/emisoft/dist/EMI/1/sl5/repos/emi1-updates.repo
wget http://emisoft.web.cern.ch/emisoft/dist/EMI/1/sl5/repos/emi1-third-party.repo
cd /etc/pki/rpm-gpg/
wget http://emisoft.web.cern.ch/emisoft/dist/EMI/1/RPM-GPG-KEY-emi
epel: rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-4.noarch.rpm
(epel: see note on LB webpage)

(2) Install the software
yum install yum-protectbase.noarch
yum install yum-priorities
yum install ca-policy-egi-core
yum install emi-wms

(3) Install the hostcert
cd /etc/grid-security
openssl pkcs12 -clcerts -nokeys -out hostcert.pem -in wms.p12
openssl pkcs12 -nocerts -nodes -out hostkey.pem -in wms.p12

(4) Draining a WMS
It's probably a good idea to set it into draining while configuring it. The .drain file goes in /var.

(5) Configuration
(a) Open the relevant ports in the iptables (and don't forget /etc/init.d/iptables restart).
(b) The default uids for edguser etc. are already taken on our system ("DEBUG: Executing... groupadd -g 156 infosys, ERROR: Group infosys with gid '156' failed to be created"), so I need to edit /opt/glite/yaim/examples/edgusers.conf. Adding 100 to each user and group id does the trick.
(c) add SLAPD: ALL to /etc/hosts.allow to allow the bdii to run.
/opt/glite/yaim/bin/yaim -c -s /opt/glite/yaim/siteinfo/siteinfo-wms01.def -n WMS

(6) Miscellaneous
(a) Check httpd is running: /etc/init.d/httpd status
(b) The new fetch cron job needs to be turned on by hand: "chkconfig fetch-crl-cron on"
(c) If httpd fails with:
Syntax error on line 112 of /etc/glite-wms/glite_wms_wmproxy_httpd.conf:
SSLCertificateFile: file '/home/glite/.certs/hostcert.pem' does not exist or is empty
check whether the file exists. If it does, this is an SELinux issue; test with 'setenforce 0'.
(d) Users with uid < 500 (except for root) are not allowed to run cron jobs by default using /etc/cron.d. Cron jobs run by 'glite' need to be moved to a script which is then called from the cron job. The original cron job, the new one and the script it's calling.

(7) SELinux
This machine should obviously work with SELinux enabled (here is the official statement) and I am slowly making my way there. So far I have:
(a) relabeling of files (ls -Z to see labels):
semanage fcontext -a -t httpd_config_t "/home/glite/.certs/hostcert.pem"
semanage fcontext -a -t httpd_config_t "/home/glite/.certs/hostkey.pem"
semanage fcontext -a -t httpd_config_t "/home/glite/.certs"
restorecon -vR /home/glite/.certs
(b) check the security context (with -n, restorecon only reports what it would change; drop the -n to actually apply it):
restorecon -vrn /var/lib/mysql