Installing a new CE
As usual the official
instructions deserve a dishonourable mention.
We are aiming for the install
via yum here.
There is some more information on the Twiki.
Preparing the machine
-
Get a server
certificate.
-
Set up the repositories.
-
Comment out gpgcheck=1 in /etc/yum.conf. Some of the lcg packages don't provide
a key and yum will just sulk.
-
Read the section on java below to see which scenario applies here.
-
yum install lcg-CA
-
Install the hostkey.
[root@ce01 grid-security]# pwd
/etc/grid-security
get certificate from deathstar
openssl pkcs12 -clcerts -nokeys -out hostcert.pem -in ce01_cert.p12
openssl pkcs12 -nocerts -nodes -out hostkey.pem -in ce01_cert.p12
chmod 600 hostkey.pem
-
yum install lcg-CE
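The gpgcheck step above can be scripted rather than edited by hand. This is only a sketch: the function name disable_gpgcheck is made up for illustration, and it assumes the line in yum.conf reads exactly gpgcheck=1 with nothing else on it. It keeps a copy of the original file next to it.

```shell
# Comment out an active "gpgcheck=1" line in a yum config file.
# disable_gpgcheck is a hypothetical helper, not part of yum or yaim.
# Usage: disable_gpgcheck /etc/yum.conf   (keeps a copy as <file>.orig)
disable_gpgcheck() {
    conf=$1
    cp -p "$conf" "$conf.orig" || return 1
    # Only lines that consist of gpgcheck=1 are changed;
    # lines that are already commented out stay as they are.
    sed 's/^gpgcheck=1[[:space:]]*$/#gpgcheck=1/' "$conf.orig" > "$conf"
}
```

Once the install is finished, copying the .orig file back restores the key check.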
Configuring the CE
-
Make a directory for all config files to go into:
/opt/glite/yaim/siteinfo
-
When running yaim, use the -d6 debug flag. -d7 either spews out copious amounts of
unintelligible messages or crashes outright.
-
There is some dispute as to when the correct time to create user accounts is. This has
to be done via Kostas or Ray, hence in site-info.def set
CONFIG_USERS=no. However for the test setup here, we are going to make our own
users. The example users.conf file seems to have a syntax error, but this version works. It goes with a similarly
small groups.conf and wn-list.conf (i.e. only contains my dummy
worker node).
(Incidentally to get the list of existing pool accounts do 'getent passwd' on
sedsk09.)
Here is the documentation on pool accounts: Pool accounts
-
Get the certificates, vo.d and vomsdir directories from
/vols/grid/glite/(config), which are probably the most recent ones.
Dump them all in /opt/glite/yaim/siteinfo, so they are there in case you need them.
-
Here's a link to the site-info.def I've been using (minus the passwords).
-
Now go for it:
/opt/glite/yaim/bin/yaim -d6 -c -s site-info-ce01.def -n lcg-CE
-
Now there are two things needed:
A batch system.
At least one worker node.
Problems
Queue names cannot start with a number
Yaim is a gigantic shell script and somebody thought it would be a good idea to
introduce a compulsory variable called <queue_name>_GROUP_ENABLE
(ENABLE_GROUP_<queue_name> would have been too easy), and if your queue
happens to be called "30min" you are screwed. I tried to file a bug report, but
was basically told I was being stupid trying to use queues that start with a
number (as if it was my decision) and they wouldn't fix it.
The solution is to hack (in /opt/glite/yaim/functions) config_gip and
config_gip_ce (first line is the original script, next line is hacked):
qenablevar=${dnssafevar}_GROUP_ENABLE
qenablevar=`echo Q$qenablevar`
That way, it expects the variable in site-info.def to be called
Q<queue_name>_GROUP_ENABLE.
Similarly a bit later:
oqenvar=${dsv}_GROUP_ENABLE
oqenvar=`echo Q$oqenvar`
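The underlying reason is that a shell variable name cannot start with a digit, so 30min_GROUP_ENABLE can never be set in a sourced config file. The sketch below just illustrates that rule and the effect of the Q prefix. Both function names are made up for the example, and the name handed to hacked_enable_var stands in for whatever yaim derives from the queue name.

```shell
# Succeeds if the argument is usable as a shell variable name:
# non-empty, not starting with a digit, only letters, digits and '_'.
is_valid_shell_name() {
    case "$1" in
        ''|[0-9]*|*[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Name the hacked functions look for (hypothetical helper mirroring the hack).
hacked_enable_var() {
    printf 'Q%s_GROUP_ENABLE\n' "$1"
}
```

With these, is_valid_shell_name 30min_GROUP_ENABLE fails, while the name printed by hacked_enable_var 30min passes.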
Testing the CE
glite-wms-job-submit -a -r ce01.hep.ph.ic.ac.uk:2119/jobmanager-lcgsge-q72h -o c1.log glite-submit.jdl
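The contents of glite-submit.jdl are not recorded in these notes. As a stand-in, a minimal JDL that just runs /bin/hostname on the worker node and brings back its output could be written like this. The function name write_test_jdl and the file names std.out and std.err are arbitrary choices for the sketch.

```shell
# Write a minimal test JDL to the given file.
# write_test_jdl is a hypothetical helper; the job simply runs /bin/hostname.
write_test_jdl() {
    cat > "$1" <<'EOF'
Executable = "/bin/hostname";
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out", "std.err"};
EOF
}
```

Calling write_test_jdl glite-submit.jdl produces a file that can be passed to the submit command above.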
BDII (note: the CE per se is working at this stage)
I noticed that in the bdii.log I find the error message:
Grabbing port 2170 for 2171
==> slapadd: could not parse entry (line=20)
==> slapadd: could not add entry dn="Mds-Vo-name=resource,o=grid" (line=293): txn_aborted! DB_KEYEXIST: Key/data pair already exists (-30996)
ldapsearch -x -H ldap://ce01.hep.ph.ic.ac.uk:2170 -b "mds-vo-name=resource,o=grid"
seems to report OK, hmmmm....
I tried the DB_CONFIG file from bdii01, but to no avail; that just seems to
break ldap completely.
/opt/bdii/var/tmp/stderr.log seems to contain promising information
The first complaint is about
<= str2entry: str2ad(/opt/glite/etc/gip/plugin/glite-info-dynamic-ce): attribute description contains inappropriate characters
slapadd: could not parse entry (line=20)
which is true, given that the file reads:
[root@ce01 functions]# more /opt/glite/etc/gip/plugin/glite-info-dynamic-ce
#!/bin/sh
/opt/lcg/libexec/lcg-info-dynamic-pbs /opt/glite/etc/gip/ldif/static-file-CE.ldif ce01.hep.ph.ic.ac.uk
First, why does it want to read this file at all, and second, how did
the 'pbs' get here?
A closer look in functions/config_gip(_ce) reveals that there is no entry for sge
and it defaults to pbs if it can't find the batch system.
In ce00:
[root@ce00]# more /opt/lcg/var/gip/plugin/lcg-info-dynamic-ce
#!/bin/sh
/opt/lcg/libexec/lcg-info-dynamic-sge /opt/lcg/var/gip/ldif/static-file-CE.ldif
A quick grep shows that PBS is used in config_gip, config_gip_ce and
config_jobmanager.
/opt/lcg/libexec/lcg-info-dynamic-sge exists on ce01.
Let's hack the first two:
add line:
sge|SGE) plugin="${INSTALL_ROOT}/lcg/libexec/lcg-info-dynamic-sge ${INSTALL_ROOT}/glite/etc/gip/ldif/static-file-CE.ldif";;
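To see why a missing entry silently turns into pbs, here is a stripped-down imitation of that kind of switch. It is not the actual yaim code: the function name select_plugin, the /opt default for INSTALL_ROOT and the simplified pbs line are all assumptions made for the illustration.

```shell
# Illustrative stand-in for the batch-system switch, NOT the real yaim code.
# INSTALL_ROOT defaulting to /opt is an assumption for this example.
INSTALL_ROOT=${INSTALL_ROOT:-/opt}

select_plugin() {
    ldif="${INSTALL_ROOT}/glite/etc/gip/ldif/static-file-CE.ldif"
    case "$1" in
        # The added entry: without it, sge falls through to the default.
        sge|SGE) echo "${INSTALL_ROOT}/lcg/libexec/lcg-info-dynamic-sge $ldif" ;;
        # Anything unrecognised ends up with the pbs plugin.
        *)       echo "${INSTALL_ROOT}/lcg/libexec/lcg-info-dynamic-pbs $ldif" ;;
    esac
}
```

Deleting the sge line and calling select_plugin sge reproduces the behaviour seen above: the catch-all branch quietly hands back the pbs plugin.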
Rerun yaim.
[root@ce01 siteinfo]# more /opt/glite/etc/gip/plugin/glite-info-dynamic-ce
#!/bin/sh
/opt/glite/libexec/lcg-info-dynamic-sge /opt/glite/etc/gip/ldif/static-file-CE.ldif
This looks like ce00 now, but the ldap error in bdii.log remains unchanged.
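Since slapadd only reports a line number, a crude filter that lists every line that cannot be LDIF may help to narrow down what is being fed to it. The sketch below is exactly that and no more: find_bad_ldif_lines is a made-up name, the pattern is a rough approximation of an attribute description rather than a full LDIF parser, and it will also flag the "-" separator lines used in LDIF change records.

```shell
# Print "<line number>: <text>" for every line that does not look like LDIF.
# find_bad_ldif_lines is a hypothetical helper using a deliberately rough pattern.
find_bad_ldif_lines() {
    awk '
        # Blank lines, comments and continuation lines are fine.
        /^$/ || /^#/ || /^ / { next }
        # "attribute: value" lines (including "attr;option:" and "attr::").
        /^[A-Za-z][A-Za-z0-9;-]*:/ { next }
        # Everything else is suspicious.
        { printf "%d: %s\n", NR, $0 }
    ' "$1"
}
```

Run against the plugin script shown earlier, it flags the command line but not the #!/bin/sh line, because that one looks like an LDIF comment.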
Java
In this case java has been installed by Kostas.
To find it:
rpm -ql java-1.5.0-sun
In site-info.def JAVA_LOCATION should be set to
/usr/lib/jvm/java-1.5.0-sun-1.5.0.16
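If the version on your machine differs, the directory can be derived from the rpm file list instead of being read off by eye. This is a sketch that assumes the package ships a bin/java below its install directory. The function name java_home_from_filelist is invented, and it picks the shortest matching path so that a jre/bin/java entry does not win.

```shell
# Read a file list on stdin and print the directory above the top-level bin/java.
# java_home_from_filelist is a hypothetical helper.
# Usage: rpm -ql java-1.5.0-sun | java_home_from_filelist
java_home_from_filelist() {
    # Keep only paths ending in /bin/java, with that suffix stripped off,
    # then print the shortest one.
    sed -n 's|/bin/java$||p' |
        awk 'NR == 1 || length($0) < length(best) { best = $0 }
             END { if (NR > 0) print best }'
}
```

It prints nothing if the list contains no bin/java at all.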
If you need to install your own version, try to do it before you install anything else. Here are the
CERN instructions.
They are not that good. Here is what I think they meant:
-
This assumes that you have setup the jpackage repo in the step "Set up the
repositories" above.
-
Install the gpg key:
rpm --import http://www.jpackage.org/jpackage.asc
-
Heed the sentence in the instructions that reads "For the exact version of Java
to use it should match the 1.5 version recommended by JPackage in 1.7
SRPMS-non-free. If, for example, this contains java-1.5.0-sun-1.5.0.12-5jpp then
it is jdk 1.5.0_12 that you want to install as described below." If you follow
the link, the most current version (at least today, 26/11/08) is 1.5.0.15
(hint: you are looking for the nosrc rpms).
-
mkdir -p ~/redhat/BUILD ~/redhat/SOURCES ~/redhat/SPECS ~/redhat/RPMS/i586 ~/redhat/SRPMS
-
In your home dir (/root) make a file called .rpmmacros that looks like this
(note the '%' are important....):
[root@ce01 ~]# more .rpmmacros
% _topdir /root/redhat
% packager Firstname Lastname <firstname.lastname@example.org>
-
Now get the rpm:
rpm -Uvvh http://mirrors.dotsrc.org/jpackage/1.7/generic/non-free/SRPMS/java-1.5.0-sun-1.5.0.15-1jpp.nosrc.rpm
-
You still need a binary. Try the SUN archives.
JDK/JRE - 5.0 "5.0 Update 15"
Download JDK
"Linux"
jdk-1_5_0_15-linux-i586.bin (NOT jdk-1_5_0_15-linux-i586-rpm.bin)
This should go in ~/redhat/SOURCES/.
-
Now go and make that rpm:
rpmbuild -ba ~/redhat/SPECS/java-1.5.0-sun.spec
and install it
yum localinstall ~/redhat/RPMS/i586/java-1.5.0-sun-1.5.0.15-1jpp.i586.rpm
yum localinstall ~/redhat/RPMS/i586/java-1.5.0-sun-devel-1.5.0.15-1jpp.i586.rpm
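The build-tree preparation from the steps above can be wrapped up in one function, which is handy when this has to be repeated on another machine. This is only a sketch: setup_rpm_tree is a made-up name, and it writes the macros in the usual %name value form. It overwrites any existing .rpmmacros in the given directory.

```shell
# Create the rpm build tree and .rpmmacros under a given home directory.
# setup_rpm_tree is a hypothetical helper.
# Usage: setup_rpm_tree /root "Firstname Lastname <firstname.lastname@example.org>"
setup_rpm_tree() {
    home=$1
    packager=$2
    mkdir -p "$home/redhat/BUILD" "$home/redhat/SOURCES" "$home/redhat/SPECS" \
             "$home/redhat/RPMS/i586" "$home/redhat/SRPMS" || return 1
    # Point rpmbuild at the new tree and record who built the package.
    printf '%%_topdir %s\n%%packager %s\n' "$home/redhat" "$packager" \
        > "$home/.rpmmacros"
}
```

After that, the nosrc rpm, the Sun binary and the rpmbuild call go in exactly as described above.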