Index
Deployment
Deployment using RPM
Deployment on user's home directory
Deployment
Deployment using RPM
Installation as root via RPMs has now been quite simplified. These instructions assume Red Hat
Enterprise Linux 5.X (and derivates) and the system Python 2.4.3. Other distros and higher
Python versions should work with some extra work.
1) Install and enable a supported batch system. Condor is the current supported default.
Software available from http://www.cs.wisc.edu/condor/. Condor/Condor-G setup and
configuration is beyond the scope of this documentation. Ensure that it is working
properly before proceeding.
2) Install a grid client and set up the grid certificate+key under the user APF will run as.
Please read the CONFIGURATION documentation regarding the proxy.conf file, so you see what
will be needed. Make sure voms-proxy-* commands work properly.
3) Add the racf-grid YUM repo to your system
rpm -ivh http://dev.racf.bnl.gov/yum/grid/production/rhel/6Workstation/x86_64/racf-grid-release-latest.noarch.rpm
The warning about NOKEY is expected. This release RPM sets up YUM to point at our
repository, and installs the GPG key with which all our RPMs are signed. By default
the racf-grid-release RPM sets our production repository to enabled (see
/etc/yum.repos.d/racf-grid-production.repo ).
NOTE: If you are testing APF and want to run
a pre-release version, enable the racf-grid-development or racf-grid-testing repository.
4) If you will be performing *local* batch system submission (as opposed to remote submission
via grid interfaces) you must confirm that whatever account you'll be submitting as exists on
the batch cluster. This is also the user you should set APF to run as.
NOTE: You do not want local batch logs being written to NFS, so you will need to define a
local directory for logs and be sure the APF user can write there.
5) Install the APF RPM:
yum install panda-autopyfactory
This performs several setup steps that otherwise would need to be done manually:
-- Creates 'apf' user that APF will run under.
-- Enables the factory init script via chkconfig.
-- Pulls in the panda userinterface Python library RPM from our repository.
-- Pulls in the python-simplejson RPM from the standard repository.
6) Configure APF queues/job submission as desired. Read the CONFIGURATION documentation in
order to do this. Be sure to configure at least one queue in order to test function.
7) Start APF:
/etc/init.d/factory start
8) Confirm that everything is OK:
-- Check to see if APF is running:
/etc/init.d/factory status
-- Look at the output of ps to see that APF is running under the expected user, e.g.:
ps aux | grep factory | grep -v grep
This should show who it is running as, and the arguments in
/etc/sysconfig/factory.sysconfig:
apf 22106 1.3 0.1 318064 12580 pts/2 Sl 17:13 0:00 /usr/bin/python
/usr/bin/factory.py --conf /etc/apf/factory.conf --debug --sleep=60 --runas=apf
--log=/var/log/apf/apf.log
-- Tail the log output and look for problems.
tail -f /var/log/apf/apf.log
-- Check to be sure jobs are being submitted by whatever account APF is using by
executing condor_q manually:
condor_q | grep apf
Deployment on user's home directory
AutoPyFactory INSTALL-USER
User installation assumes that APF will be installed in the user's home directory using the
standard Python distutils setup commands. It assumes that pre-requisites have already
been installed and properly configured, either within the user's home directory or on
the general system.
Software Prerequisites
----------------------------------------------------------------------
* Python 2.4 or above:
-- Comes default with RHEL 5+ and most current operating systems.
* python-simplejson:
-- Available in stock RHEL 5+. yum install python-simplejson
-- http://pypi.python.org/pypi/simplejson/
* python-pycurl:
-- Available in stock RHEL 5+. yum install python-pycurl
(We will work to eliminate this prerequisite by using pure Python urllib.)
-- http://pycurl.sourceforge.net/download/
* voms-proxy-* tools:
-- Available in stock RHEL 5+. yum install voms-clients
-- Or install the OSG client
* Condor 7.6 or above:
-- http://research.cs.wisc.edu/condor/
* Panda Client userinterface Python library:
mkdir ~/lib/python
cd ~/lib/python
svn co http://svnweb.cern.ch/guest/panda/panda-server/current/pandaserver/userinterface
Installation
-----------------------------------------------------------------------
1) Check out the project from SVN, or download the tarball, and install
svn co http://svnweb.cern.ch/guest/panda/panda-autopyfactory/current/ panda-autopyfactory
cd panda-autopyfactory
python setup.py install --home=~/
2) Unless you are running a Python installed in your home directory, you must
add ~/lib/python to your PYTHONPATH. The preferred way is to add the following
to your ~/.bash_profile
export PYTHONPATH=~/lib/python:$PYTHONPATH
Configuration
-----------------------------------------------------------------------
1) Copy the configuration -example files to their final names. And adjust the variables
within to reasonable values. Each configuration file has detailed instructions on proper
settings for various purposes.
cd $HOME/etc
cp factory.conf-example factory.conf
cp queues.conf-example queues.conf
cp proxy.conf-example proxy.conf
cp factory.sysconfig-example factory.sysconfig
Notable issues for initial installation or upgrades:
-- Be aware that the factoryId in factory.conf will be what your factory appears under on the
APF monitor. Once chosen, it will be awkward to change it. See
http://apfmon.lancs.ac.uk/mon/
-- The integrated proxy management (proxy.conf) and log serving (logserver.* variables in
factory.conf) are new with 2.X. Be sure you understand them before moving beyond testing.
-- Because of the way gatekeepers cache job executables, changing the file used as a job wrapper
does not necessarily trigger an update of that filename on site gatekeepers. The only way to
guarantee a re-send of a wrapper file is to rename it (or make a unique symlink) and change
the executable var in queues.conf e.g.,
executable = /usr/libexec/wrapper -> executable/libexec/wrapper-0.9.4.sh
We intend to look into this issue to hide it from the factory administrator, but for now it
is an unavoidable problem.
2) Start the factory, and trace the log files to determine if behavior is as expected:
cd $HOME/etc
./factory start
tail -f ~/var/log/apf.log