I was reading my email... now I'm blogging.
Zach Garner's blog
I've configured Ganglia on Medusa. See its native HTML output at http://medusa.lab.ac.uab.edu/ganglia/. It's also integrated with MDS, so grid-info-search queries will display information about the entire cluster.
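For example, a query like the following should list resources for the whole cluster (a sketch assuming a stock MDS setup, i.e. the default port 2135 and the "local" VO name):

grid-info-search -x -h medusa.lab.ac.uab.edu -p 2135 -b "mds-vo-name=local, o=grid"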
Ganglia seems to be about beta quality. No major problems with it... except that the dynamically generated icons aren't showing up. This doesn't matter too much to us, since we aren't going to be using their web front end.
To integrate Ganglia with MDS, I have to use the ganglia-python client, which seems to be alpha-quality software. I had to edit the Python code to change hardcoded paths (they referred to the developer's home directory).
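The edit itself is mechanical; something like this would do it (the hardcoded path and install prefix below are made up for illustration, not the actual strings in the source):

# rewrite the developer's hardcoded home directory to the real install path
sed -i 's|/home/developer/ganglia|/usr/local/ganglia|g' *.py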
I've installed GridPort, but so far have not configured it. I've put up their example/demo site at http://bechamel.lab.ac.uab.edu/gridport/. For comparison, OGCE (Open Grid Computing Environment) is also installed at http://bechamel.lab.ac.uab.edu:10081/uabgrid.
NWS has been installed and configured on bechamel. It's not tied into the rest of our systems just yet.
I've created an init.d startup script for it, as follows.
#!/bin/sh
#
# Startup script for NWS.
# Author: Zach Garner
# chkconfig: - 85 15
# description: NWS

# Start the name server, then the memory (state storage) daemon.
$NWS_HOME/bin/nws_nameserver -e /var/log/nws/nameserver.err -l /var/log/nws/nameserver.log -f &
$NWS_HOME/bin/nws_memory -d $SCRATCH_DIR -e /var/log/nws/memory.err -l /var/log/nws/memory.log -N $HOSTNAME &
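To have it start at boot, something like this should work on our Red Hat systems (assuming the script above is saved as /etc/init.d/nws; the file name is my choice):

chmod +x /etc/init.d/nws
chkconfig --add nws     # registers the script using the chkconfig header above
chkconfig nws on        # enable it for the default runlevels
service nws start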
The mpicc compiler is not compiling Globus 3 correctly. The problem looks to be confined to a single file in one package. I'm trying to find a workaround.
It looks like the build script supplied by Globus hard-codes the flavor. As far as I can tell, there is no way to create anything other than a gcc32dbg[pth] build without editing the build script. This is needed for non-debug builds, builds based on a vendor compiler (e.g. our Portland compilers), or to use MPI.
Note that this is not GPT's fault. Globus provides a script that calls GPT with the proper flavor arguments; it's this script (install-gt3) that is the problem.
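As a workaround, GPT itself can be driven directly with whatever flavor you want. Roughly (a sketch; the bundle file name is illustrative, and the flavor follows GPT's naming conventions rather than being the exact one we'll end up using):

# build a source bundle with an MPI flavor instead of the hardcoded gcc32dbg
$GPT_LOCATION/sbin/gpt-build gt3-core-src.tar.gz mpicc32dbg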
NMI-based MPICH-G2 is working, to some extent.
One problem is that it either needs a shared filesystem across all systems in the MPICH-G2 cluster, or the user has to manually copy executables to every system. The real problem is that the NMI documentation never mentions this requirement.
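Until that's sorted out, staging by hand looks roughly like this (the hostnames and the executable name are examples):

# copy the MPI executable to the same relative path on every node
for host in node1.example.edu node2.example.edu; do
    scp myapp $host:myapp
done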
We now have a few remaining issues to take care of:
# The current setup is only working with my GridN test systems
# MPICH-G2 applications are running on the head node of the Beowulf cluster, but not returning STDOUT/STDERR messages
# The applications are only running on the head node, not being distributed across the cluster.
OpenCA is now running. It will still take some time to integrate it with our Grid infrastructure.
I've created a small Condor Pool.
The master (nori) is running on a VM. My workstation (ceviche) is the only other machine in the pool. The following "screenshot" shows three jobs that were submitted. Two are running on my workstation's idle processors. The other is running on the VM (I've got to disable this; no one should be computing on the VM).
Name                            OpSys  Arch   State      Activity  LoadAv  Mem  ActvtyTime
firstname.lastname@example.org  LINUX  INTEL  Unclaimed  Idle      0.000   851
email@example.com               LINUX  INTEL  Claimed    Busy      0.000   851
firstname.lastname@example.org  LINUX  INTEL  Claimed    Busy      0.000   851
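For reference, the listing above is condor_status output. The jobs themselves were nothing fancy; a minimal sketch of such a test submission would look like this (the submit file below is illustrative, not the exact jobs I ran):

# queue three copies of a do-nothing job, then check the pool
cat > sleep.sub <<'EOF'
universe   = vanilla
executable = /bin/sleep
arguments  = 600
queue 3
EOF
condor_submit sleep.sub
condor_status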
If you install the Condor RPM, it comes pre-configured as a master. If you want to use it as an execute node instead, this is what you need to do:
run: "./condor_configure --central-manager=condor_master.example.domain --owner condor"
Make sure the user 'condor' exists. Then rerun condor_master (the controller of the daemons on the execute node, not the server).
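Putting the whole execute-node conversion together (the central manager hostname is a placeholder, as above):

# create the condor user if it doesn't already exist
id condor >/dev/null 2>&1 || useradd condor
# point this node at the central manager and set the daemon owner
./condor_configure --central-manager=condor_master.example.domain --owner=condor
# start the daemons on this node (the condor_master daemon supervises them;
# it is not the pool's central manager)
condor_master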