sched plugin Activated deleted
bug fixed in method fill() in info.py to prevent an Exception when the dictionary has an unknown key
CondorEC2 BatchSubmit and BatchStatus plugins usable, supporting standard Condor ec2 grid type.
Assuming VMs join local pool as startds, features correlation of VM jobs and corresponding startd.
VM retirement via 'condor_off -peaceful' and VM shutdown when startd is retired.
KeepNRunning sched plugin to convert absolute to relative numbers.
utils not distributed anymore within the RPM. They will be distributed with a dedicated one.
Removed reference to Panda cloud status in StatusOfflineSchedPlugin
created apf-simulated-scheds in misc/
Proxymanager now able to be run standalone as standard system init daemon. This is so base certs need not be owned or readable by the APF user.
Added email notification of factory owner when no valid proxy can be retrieved from the proxymanager.
Eliminated all _readconfig() methods in submit plugins. Switched to full initialization during __init__.
MyProxy support in proxymanager. Allows retrieval of base proxy from MyProxy using passphrase, or using another proxymanager profile proxy for auth.
fixed bug in Scale sched plugin returning 0 when n*factor < 1.
minimum Condor version check. Allows particular plugins to specify minimum. Failure aborts.
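A minimal sketch of such a minimum-version check, not the plugin code itself. The function names, the regular expression, and the use of RuntimeError to abort are assumptions for illustration.

```python
import re


def parse_version(text):
    """Extract a (major, minor, patch) tuple from a version string.

    Accepts a bare '8.6.13' or a longer string that contains one.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.groups())


def check_minimum_version(found, minimum):
    """Raise RuntimeError when 'found' is older than 'minimum' (assumed abort path)."""
    if parse_version(found) < parse_version(minimum):
        raise RuntimeError("Condor %s is older than required %s" % (found, minimum))
```

Comparing integer tuples avoids the string-comparison trap where '8.10.0' would sort before '8.9.0'.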
multiproxy functionality. Allow multiple proxy profiles. proxymanager tries all until one is found. failure generates email.
scripts and config files added for logs monitor apf-search-failed
strip() after split by comma in configloader, to allow things like 'x = foo, bar' with whitespace after comma
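The effect of the strip() change can be sketched as follows. The helper name split_list is hypothetical and is not taken from configloader.

```python
def split_list(value):
    """Split a comma-separated config value and trim whitespace around each item."""
    return [item.strip() for item in value.split(",")]


# Without strip() the second item would be ' bar', with a leading space.
items = split_list("foo, bar")
```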
bug in MaxPending sched plugin logic fixed. Now no limit imposed when there are no pending pilots.
queues configuration files can be read from a directory (e.g. /etc/apf/queues.d/). configloader adapted to accept directories instead of a list of URIs
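A rough sketch of reading queue definitions from a directory, using the standard library ConfigParser. The function name load_queues, the '*.conf' file pattern, and the sample section names are assumptions; the real configloader may select files differently.

```python
import glob
import os
import tempfile
from configparser import ConfigParser


def load_queues(path):
    """Load queue sections from one file, or from every *.conf file in a directory."""
    if os.path.isdir(path):
        files = sorted(glob.glob(os.path.join(path, "*.conf")))
    else:
        files = [path]
    config = ConfigParser()
    config.read(files)
    return config


# Usage example with a throwaway directory standing in for /etc/apf/queues.d/
queues_dir = tempfile.mkdtemp()
for name in ("QUEUE_A", "QUEUE_B"):
    with open(os.path.join(queues_dir, name.lower() + ".conf"), "w") as handle:
        handle.write("[%s]\nenabled = True\n" % name)

sections = sorted(load_queues(queues_dir).sections())
```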
method name removed from log messages.
fixed bug in some sched plugins, mixing None and 0 in the logic
add documentation to install.html (in verbatim, no html format)
removed all methods for FactoryCLI from bin/factory, and mainLoop() renamed to run()
added new custom logging level -TRACE- and using it for some long messages, like the output of condor_q in XML format
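One way to register such a level with the standard logging module is sketched below. The numeric value 5 and the helper name trace are assumptions; the changelog only states that a TRACE level exists.

```python
import logging

TRACE = 5  # assumed value, chosen to sit below logging.DEBUG (10)
logging.addLevelName(TRACE, "TRACE")


def trace(logger, message, *args):
    """Log 'message' at TRACE level if the logger is enabled for it."""
    if logger.isEnabledFor(TRACE):
        logger.log(TRACE, message, *args)
```

A logger left at DEBUG then drops the long messages, while one set to TRACE keeps them.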
configloader converts None in conf to Python None object. Change in default_value logic--if no default_value is provided then generic_get returns None.
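The None handling described above might look roughly like this. The helper get_or_none is hypothetical and much simpler than generic_get; the section and option names are made up for the example.

```python
from configparser import ConfigParser


def get_or_none(config, section, option, default=None):
    """Return the option value, mapping the literal string 'None' to Python None.

    When the option is missing, return 'default', which is None unless given.
    """
    if not config.has_option(section, option):
        return default
    value = config.get(section, option)
    return None if value == "None" else value


config = ConfigParser()
config.read_string("[EXAMPLE_QUEUE]\nuser = None\ngrid = OSG\n")
```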
info.py: major refactoring of Info class hierarchy. Most objects now have overridden methods to avoid exception generation.
passing queryargs from queues.conf to query.querycondor() call at Condor batchstatus plugin.
siteid removed from several places and replaced by wmsqueue.
generic_get() simplified
InfoContainer removed.
fake info classes imports removed, and importing from the right place (info.py instead of factory.py)
log messages in sched plugins more homogeneous. Now all of them are "Return=123".
created a script in misc/ to search for pilots running (in theory) for more than 8 days.
Removed allowed variables for Test and Offline Sched plugins, and from the queues.conf. If the plugin is used, we assume it is allowed.
more robust code to deal with scenario where condor_q gives no output
all JSD.add() calls changed to use the new format (2 inputs instead of 1)
jsd.py and jsd3.py moved to attic/ and jsd4.py moved to jsd.py
logserver2.py moved to logserver.py, and old one moved to attic/
using method merge() instead of deepcopy() in configloader.py
method section2dict() created in configloader. It will be useful to generate the mappings from config files
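A sketch of what a section-to-dictionary helper can look like with the standard ConfigParser. This is not the configloader implementation; the function name and sample options are illustrative only.

```python
from configparser import ConfigParser


def section_to_dict(config, section):
    """Return the options of one section as a plain dict.

    Note that ConfigParser.items() also includes values inherited from [DEFAULT].
    """
    return dict(config.items(section))


config = ConfigParser()
config.read_string("[EXAMPLE_QUEUE]\nbatchqueue = short\nwmsqueue = SITE_A\n")
mapping = section_to_dict(config, "EXAMPLE_QUEUE")
```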
getConfig() method in configloader.py embedded in a try - except block and all config loader object creation in factory.py embedded in try - except blocks
Exception class ConfigFailure renamed ConfigFailureMandatoryAttr
Cleaning info.py code:
-- method dict() removed
-- method getconfig() removed
-- method __add__() removed
-- method set() removed
-- method get() removed
-- property total removed
-- class attribute valid = [] removed from all classes:
-- therefore method reset() removed
-- therefore method __setattr__() removed
-- method __getattribute__() is created
-- method valid() in class InfoContainer deleted
-- no longer importing Config from configloader
-- no longer self.default_value
-- all __init__() methods now hardcode the entire list of attributes
Removed logger from the input options in method generic_get()
As temporary solution, NotImplementedAttr defined in configloader.py
Documentation on old Sched Plugin SimpleNQueue deleted.
First draft for manpages created, and rpm-post script adapted to install them.
Printing env again after switching ID.
Bug fixed in CREAM example in queues.conf-example
created misc/apf-panda-jobs-info.py
INFO messages in configloader for missing non-mandatory variables moved to DEBUG.
all __XYZ__ module names deleted, except in factory.py
Fixed setup.cfg
Adjusted sysconfig, factory init, and logrotate to use a console.log for python interpreter-level debugging.
Added a log message with the APF version number.
All Sched Plugins return a tuple (number of pilots, message).
Sending to the monitor the messages returned by Sched Plugins.
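A minimal sketch of sched plugins that return (number of pilots, message) tuples and are applied one after another. The class bodies, the method name calc, the message text, and the helper run_chain are assumptions for illustration, not the factory code.

```python
class MinPerCycle:
    """Hypothetical plugin: never ask for fewer than 'minimum' pilots."""

    def __init__(self, minimum):
        self.minimum = minimum

    def calc(self, n):
        out = max(n, self.minimum)
        return out, "MinPerCycle: in=%s, minimum=%s, ret=%s" % (n, self.minimum, out)


class MaxPerCycle:
    """Hypothetical plugin: never ask for more than 'maximum' pilots."""

    def __init__(self, maximum):
        self.maximum = maximum

    def calc(self, n):
        out = min(n, self.maximum)
        return out, "MaxPerCycle: in=%s, maximum=%s, ret=%s" % (n, self.maximum, out)


def run_chain(plugins, n=0):
    """Feed each plugin the previous plugin's number and collect all messages."""
    messages = []
    for plugin in plugins:
        n, message = plugin.calc(n)
        messages.append(message)
    return n, "; ".join(messages)


pilots, report = run_chain([MinPerCycle(5), MaxPerCycle(3)], 0)
```

In this sketch the joined message string is what would be passed on to the monitor.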
Changed setup.py, factory.py and plugins/monitor/APFMonitorPlugin.py to use release version info directly
from factory.py rather than requiring that it be correctly placed in a config file.
versionTag removed from factory.conf
Added logserver2.py, to create directory listing similar to Apache one.
Port number is now taken from the URL instead of from the config variable baseLogHttpPort
Method setuppandaenv() deleted from factory.py
copy_to_spool set to True
Dumping the content of queues.conf on start
Allowing raw = True or False in getContent() method in configloader.py
Utils distributed via /usr/share/apf instead of /usr/bin, and have different names.
Created RELEASE_NOTES.
Everything related to euca and persistence removed from config files.
created test.py, to start creating unittest-like code
should_transfer_files = NO.
Split plugins into their own directories.
No more Config Plugins.
ReadySchedPlugin created -> Activated decommissioned
KeepNRunningSchedPlugin created.
CondorNordugrid Batch Submit plugin created.
CondorLSF Submit plugin created.
New configuration file for monitor plugins.
New type of plugins for monitor added.
ScaleSchedPlugin created.
Euca plugins improved. Still under development. NOTE: not distributed, most probably they will be removed and never used.
module condor.py created. It includes htcondor python bindings.
wrapper examples are placed in /etc/apf/ instead of /usr/libexec/
Non plugin modules (jsd.py, persistence.py) moved to main directory.
version number for panda userinterface package in setup.cfg
bugs in CondorLocalWMSStatusPlugin fixed
Removing self._valid from all plugins (work in progress)
ConfigFailure moved to apfexceptions.
Removed hardcoded setup of periodic_remove directive from CondorLocal Batch Submit plugin.
Not trying to create robots.txt if logserver is disabled.
Changing directory to new HOME directory after switching identity.
Checking condor daemon is running in CondorBase Batch Submit Plugin and Condor Batch Status Plugin.
Created a Singleton metaclass factory. It creates metaclasses for regular Singletons and multi Singletons.
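A sketch of a metaclass factory of this kind, written for Python 3. It assumes that a "multi Singleton" means one instance per distinct set of positional constructor arguments; the names singleton_factory, Monitor, and ProxyProfile are used here for illustration only.

```python
def singleton_factory(multi=False):
    """Return a new metaclass.

    multi=False: one instance per class.
    multi=True:  one instance per class and per tuple of positional arguments
                 (the arguments must be hashable).
    """

    class _Singleton(type):
        def __init__(cls, name, bases, namespace):
            super().__init__(name, bases, namespace)
            cls._instances = {}  # each class gets its own cache

        def __call__(cls, *args, **kwargs):
            key = args if multi else ()
            if key not in cls._instances:
                cls._instances[key] = super().__call__(*args, **kwargs)
            return cls._instances[key]

    return _Singleton


class Monitor(metaclass=singleton_factory()):
    def __init__(self, url=None):
        self.url = url


class ProxyProfile(metaclass=singleton_factory(multi=True)):
    def __init__(self, name):
        self.name = name
```

Repeated calls to Monitor() return the first instance, while ProxyProfile('a') and ProxyProfile('b') are distinct cached objects.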
Printing the path to executable condor_q and condor_submit in debug mode.
Added __add__() method to BatchQueueInfo and WMSQueueInfo classes.
Added factory config variable 'enablequeues'.
Sched plugins do not check for negative outputs. It has been left up to the Submit Plugin to decide what to do in that case.
Added some DEBUG logging messages.
Some messages in Activated Sched Plugin moved from DEBUG logging level to INFO level.
Bug fixed: getboolean() instead of get() when reading value of proxymanager.enabled and logmanager.enabled.
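The difference between the two calls can be seen with the standard ConfigParser. The section name Factory is an assumption for this example.

```python
from configparser import ConfigParser

config = ConfigParser()
config.read_string("[Factory]\nproxymanager.enabled = False\n")

# get() returns the text 'False', and any non-empty string is truthy.
as_text = config.get("Factory", "proxymanager.enabled")

# getboolean() parses the text and returns the Python bool False.
as_bool = config.getboolean("Factory", "proxymanager.enabled")
```

With get(), a test such as 'if enabled:' would succeed even when the option is set to False.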
Top logger configured as root, instead of "main".
Log messages format includes method name for python > 2.4
Bumped minor release version to reflect scale of several new features, and cloud submit plugin.
Two new Sched Plugins to handle test queues and offline queues
Bug in monitor.py fixed.
Creates robots.txt file in base of logserver docroot.
Added create_run_var() to init script factory.
It creates the subdirectory var/run/ if it does not exist, to place factory.pid
Added $APFHEAD to init script factory.
This allows for user deployment on any directory, not necessarily $HOME
Fixed the bug in Activated plugin, not returning anything under some circumstances.
Creating multiple Sched plugins (MinPerCycle, MaxPerCycle, MinPending, MaxPending, MaxPerFactory),
and code in factory.py to allow chaining more than one sched plugin.
PandaConfig Plugin refactored to query SchedConfig in a singleton thread.
This is possible because it now uses a URL that delivers the entire SchedConfig content.
Also more variables added: jdladd, environ, and special_pars.
First draft for Euca Submit Plugin created. NOTE: not distributed, most probably they will be removed and never used.
This release is 'The Hellion' v. 2.1.1
http://www.darylgregory.com/pandemonium/Review_NYRSF.aspx
Refactored scheduler log cleanup. Now handled in a separate thread, one for entire factory.
Defaults may be specified globally in factory.conf or per-queue in queues.conf
Added grid and vo queue attributes, and added executable.defaultarguments and executable.arguments
interpolation. These changes were to support wrapper.sh, runpilot3.sh, and arbitrary executables in
a general way. This feature uses standard Python ConfigParser interpolation. See
http://docs.python.org/library/configparser.html
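A small illustration of standard ConfigParser interpolation applied to these attributes. The queue name and the argument strings (--vo, --grid, --extra) are made up for the example and are not the wrapper's real options.

```python
from configparser import ConfigParser

SAMPLE = """
[DEFAULT]
vo = ATLAS
grid = OSG
executable.defaultarguments = --vo=%(vo)s --grid=%(grid)s

[EXAMPLE_QUEUE]
grid = EGI
executable.arguments = %(executable.defaultarguments)s --extra
"""

config = ConfigParser()
config.read_string(SAMPLE)

# Interpolation is resolved against the queue section, so the per-queue
# 'grid' value overrides the one in [DEFAULT].
arguments = config.get("EXAMPLE_QUEUE", "executable.arguments")
# arguments == "--vo=ATLAS --grid=EGI --extra"
```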
Made Monitor object a singleton, to avoid repeated timeout delays (during queue initialization)
when APF monitor is unresponsive. Now there is a single attempt, made when the single Monitor is initialized.
Move to wrapper 0.9.5, which checks that retrieved tarball is indeed a tarball (and not an HTML error message
from a misbehaving HTTP proxy).
Fixed logic problem in Activated plugin. Now correctly assuming that Running jobs are no longer
Activated.
Added specific Submit plugin for Cloud.
Configuration objects handled as python native ConfigParser objects instead of custom
APF objects.
Refined running as user rather than from RPM. 'setup.py install --home=/path/to/home' does a
user-based setup. All libs are in ~/lib/python, configs and init script in ~/etc/, and so on.
Added an external queue configuration information plugin mechanism to enable plugin-based
auto-fill/overriding of config information.
Consolidated all plugins into hierarchical inheritance tree, to eliminate duplicated code
(especially the Condor plugins).
This release is the 'Sleeper Service' v. 2.1.0
http://en.wikipedia.org/wiki/GSV_Sleeper_Service
Major refactoring to integrate BNL changes. Full object-oriented functional
plugin architecture, each running in a distinct thread. Allows for
end-user/third-party customization without touching core code.
Refined distutils usage to allow deployment as non-root in a home directory,
or as root in system paths. Added functionality to drop privs even when run as
root.
Added typical init script and sysconfig functionality to handle shell-level/UNIX
concerns.
Added standard Linux logrotate configuration.
Split config system into main source for APF instance (factory.conf), proxy
management (proxy.conf), individual queue configuration (queue.conf). The latter
can be accessed as URI, paving the way to queue configuration via remote DB+URL.
The previous change involved passing ConfigLoader objects to classes, rather than
passing through lists of config files.
Integrated batch system log export via HTTP; now uses embedded Python HTTP server.
Integrated proxy handling and renewal, integrated submit system log file rotation.
Adjusted module, file, and class names to follow Python recommended guidelines. See PEP 8:
http://www.python.org/dev/peps/pep-0008/
Moved all filehandling to factory.py script from Factory class. Factory is now purely
object-oriented and suitable for embedding (e.g. in a web application).
Added copy of Jose's generic top-level wrapper to libexec/.
IMPORTANT: For various reasons, it was very difficult to maintain the ability to automatically
reload configs by detecting file mtime changes. So for now simply restart if you change a file.
This feature can be re-enabled soon.
This release is "The Clockmaker". v2.0.0
http://en.wikipedia.org/wiki/List_of_Revelation_Space_characters#The_Clockmaker
Introduced "country" and "group" parameters for queues. These map to
the pilot's -o and -v options allowing for pilots which only pick up
particular types of jobs.
Polling for queues now intelligently uses the country and group
options above to only send pilots when activated jobs of that type
are present at a site.
Introduced "cloud" option to tag a queue to a panda cloud.
Site and cloud status are polled and if 'offline' then pilot
submission is suppressed; if a site/cloud is in 'test' status then
the pyfactory status flips to 'test' as well (limits pilot flow).
Introduced 'pilotDepthBoost' parameter (default 2), which allows
the factory to submit up to queueDepth * pilotDepthBoost into a
non-started state if there are sufficient jobs activated. This helps
a lot if the site has short jobs where the lags in job status
updates mean that sites can run short of pilots. You may want to set
this higher at T1s (especially when doing reco).
This is a more controlled version of the short job patch which was
released in June 2009.
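The arithmetic behind pilotDepthBoost can be sketched as below. Only the ceiling queueDepth * pilotDepthBoost comes from the notes above; the way the number of activated jobs is used to decide how far to go towards that ceiling is an assumption, as is the function name.

```python
def max_pending_pilots(queue_depth, activated_jobs, pilot_depth_boost=2):
    """Upper bound on pilots in a non-started state for one queue.

    Assumption: the normal bound is queue_depth, and it grows with the
    number of activated jobs up to queue_depth * pilot_depth_boost.
    """
    ceiling = queue_depth * pilot_depth_boost
    return max(queue_depth, min(activated_jobs, ceiling))
```

For a queueDepth of 10 and the default boost of 2, this sketch never allows more than 20 non-started pilots, however many jobs are activated.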
Corrected a small bug in the default queue specification so that it
works properly.
Renamed the 'suppression' option to 'idlePilotSuppression'.
Distributed wrapper script is now called "runpilot3-wrapper.sh",
which is a much more sensible name.
Default server is now pandaserver.cern.ch.
Updated INSTALLATION file for new panda client location in SVN.
Updates to README, INSTALLATION, Makefile corresponding to the above
changes.
Eilish wanted this release called 'Totoro', after one of her
favourite Hayao Miyazaki films,
http://en.wikipedia.org/wiki/My_Neighbour_Totoro.
Upgraded runpilot3.sh to UHURA 18f pilot.
Modified the order of the search for an LFC compatible python. The
ATLAS release python is now tested first (it's usually more recent
than the OS version). N.B. This may provoke a run time python
warning about an API mismatch between the python versions.
Added factoryId to configuration. This allows multiple factories to
run on the same machine. This factory ID is used as the PANDA_JSID
passed to the panda dashboard.
Added baseLogFileUrl to pass the correct URL for the pilot wrapper's
logfiles to the panda dashboard.
Added optional "user" field to allow pilots to be sent to pick up
particular types of jobs from panda, using the "-u USER" argument.
If this is absent or None then nothing is passed with the pilot
arguments (i.e., pickup normal production jobs).
Changed configuration so that each configuration section is a
"queue" which can have a gatekeeper section. This allows pilots to
be sent to the same queue on the CE, but having different "user"
parameters. If there is no "gatekeeper" specified then the name of
the configuration section name is the gatekeeper contact string, as
before.
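The gatekeeper fallback can be sketched with the standard ConfigParser as follows. The host name, section names, and user value are invented for the example, and gatekeeper_for is a hypothetical helper.

```python
from configparser import ConfigParser

SAMPLE = """
[ce.example.org/jobmanager-pbs]
user = None

[EXAMPLE_QUEUE_USER]
gatekeeper = ce.example.org/jobmanager-pbs
user = someuser
"""


def gatekeeper_for(config, section):
    """Return the 'gatekeeper' option, or the section name when it is absent."""
    return config.get(section, "gatekeeper", fallback=section)


config = ConfigParser()
config.read_string(SAMPLE)
```

Both sections in the sample resolve to the same gatekeeper contact string while carrying different "user" values.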
New example factory configuration file, factory.conf-example, makes
use of the above clearer.
Many internal changes to support the above feature.
Modified the main submit code to sort alphabetically by queue name.
cleanLogs.py now much improved. Will read factory log file directory
location from factory.conf. Options for verbosity, days to compress
after and days to delete after.
This release is dedicated to Gary Gygax. d20... 19 hit!
http://en.wikipedia.org/wiki/Gary_Gygax