Overview
GEXEC is a scalable cluster remote execution system that provides fast, RSA-authenticated remote execution of parallel and distributed jobs. It transparently forwards stdin, stdout, stderr, and signals to and from remote processes, propagates the local environment, and is designed to be robust and to scale to systems of over 1000 nodes. Internally, GEXEC operates by building an n-ary tree of TCP sockets and threads between gexec daemons and propagating control information up and down the tree. By using hierarchical control, GEXEC distributes both the work and the resource usage associated with massive amounts of parallelism across multiple nodes, thereby eliminating problems associated with single-node resource limits (e.g., limits on the number of file descriptors on front-end nodes). An initial release of the software (below) consists of a daemon, a client program, and a library that provides a programmatic interface to the GEXEC system.
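
The effect of hierarchical control on per-node resource usage can be illustrated with a small sketch. It is not taken from the GEXEC source: the array-style layout (the children of position i sit at positions fanout*i+1 through fanout*i+fanout), the names `build_tree` and `depth`, and the fanout of 8 are all assumptions chosen for illustration. The point is only that no node, including the front end, ever holds more than `fanout` downstream connections, however large the job is.

```python
def build_tree(nodes, fanout):
    """Map each node to its children in an array-style n-ary tree.

    nodes[0] is the root. This layout is an illustrative assumption,
    not necessarily the one GEXEC uses internally.
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    tree = {}
    for i, node in enumerate(nodes):
        first = fanout * i + 1
        # Slicing past the end of the list simply yields fewer (or no) children.
        tree[node] = nodes[first:first + fanout]
    return tree


def depth(num_nodes, fanout):
    """Number of levels below the root needed to reach every node."""
    levels, reached, width = 0, 1, 1
    while reached < num_nodes:
        width *= fanout      # nodes available on the next level
        reached += width
        levels += 1
    return levels


# Hypothetical 1000-node job with an assumed fanout of 8.
nodes = [f"node{i}" for i in range(1000)]
tree = build_tree(nodes, fanout=8)
max_children = max(len(children) for children in tree.values())
print("largest number of child connections on any node:", max_children)
print("tree levels below the root:", depth(len(nodes), 8))
```

With a flat design the front-end node would need one connection per remote node. In the tree, control messages instead travel a number of hops that grows only logarithmically with the job size.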
Software
Update: GEXEC source code and releases are now maintained as part of the Ganglia project. The GEXEC source can be checked out via svn at: http://sourceforge.net/svn/?group_id=43021. The source code can be browsed directly at: http://ganglia.svn.sf.net/viewvc/ganglia/trunk/gexec/gexec. See the Ganglia SourceForge page for more.
Documentation
GEXEC can be used interactively through the gexec client or programmatically through the GEXEC library, libgexec.a. With the client, node selection can be done in one of two ways. It can be done by explicitly naming a set of nodes using the GEXEC_SVRS environment variable:
# export LD_ASSUME_KERNEL="2.2.5"
Alternatively, node selection can be done using Ganglia by specifying one or more potential gmond servers to query. The first gmond server that is both up and returns a non-empty set of nodes is used to provide the list of nodes ('-n 0' means all nodes, which is a five-node cluster in the example below):
# export LD_ASSUME_KERNEL="2.2.5"
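
The gmond fallback rule described above can be sketched as follows. This is a model of the selection logic only, not GEXEC's implementation: the function `select_nodes`, the injected `query` callable, and the server and node names are hypothetical. The source also does not say how the client chooses a subset when n is non-zero, so taking the first n entries here is an assumption.

```python
def select_nodes(gmond_servers, query, n):
    """Return nodes from the first gmond server that is up and non-empty.

    query(server) is a hypothetical callable that returns a list of node
    names, or raises OSError if the server cannot be reached.
    n == 0 means "all nodes", mirroring the '-n 0' convention.
    """
    for server in gmond_servers:
        try:
            nodes = query(server)
        except OSError:
            continue  # server is down or unreachable, try the next one
        if nodes:
            # Assumption: a non-zero n takes the first n entries.
            return list(nodes) if n == 0 else list(nodes[:n])
    raise RuntimeError("no gmond server returned a usable node list")


def fake_query(server):
    """Stand-in for a real gmond query, for demonstration only."""
    if server == "gmond-a":
        raise ConnectionRefusedError("gmond-a is down")
    if server == "gmond-b":
        return []  # up, but reports no nodes
    return ["n1", "n2", "n3", "n4", "n5"]


servers = ["gmond-a", "gmond-b", "gmond-c"]
all_nodes = select_nodes(servers, fake_query, 0)
three_nodes = select_nodes(servers, fake_query, 3)
print(all_nodes)
print(three_nodes)
```

Passing the query function in as a parameter keeps the fallback logic separate from how a gmond server is actually contacted, which the sketch does not attempt to reproduce.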
License
BSD license.
Feedback
Send me email if you're having problems, find bugs, or have any random comments: Brent Chun. GEXEC is known to work unmodified on RedHat 7.2 and 7.3. With the latest 0.3.6 release, it is also known to work unmodified on RedHat 9.0. With some work, it is also known to work on Mandrake, RedHat Enterprise Linux 3.1, and under FreeBSD.
You might also be interested in GEXEC's web page on freshmeat.