NAME
nettee - a network "tee" program
SYNOPSIS
nettee [options]
DESCRIPTION
nettee passes a data stream to one or more child nodes using a daisychain method. On each node nettee may also direct the stream to a file or pipe. nettee allows large amounts of data to be quickly distributed to multiple nodes on a network at a rate limited only by the network bandwidth. The distribution chain is typically linear for each network switch but may branch when nodes utilize multiple switches. For maximum throughput only one instance of nettee should utilize each network interface.
When nettee starts it waits for a connection from the upstream node before attempting to connect to its downstream nodes. Consequently nettee may be started on the nodes in any order (by a script, rsh, ssh, and so forth). Typically only the node that reads the data stream from stdin or a file will be set to log messages, so that the progress of the transfer may be monitored. Transmission errors are detected by comparing the total number of bytes read by each child node with the number of bytes transmitted to that child.
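As an illustration of how a linear chain might be assembled, the following Python sketch builds one nettee argument list per node using only options documented on this page. The function name, the node names, and the file paths are hypothetical, and the choice of -out none and -v 9 on the head node is an assumption about a typical setup, not a requirement. Run nettee -hexamples for the author's own examples.

```python
def build_chain_commands(nodes, source_file, dest_file):
    """Return {node: argv} for a linear chain; nodes[0] reads source_file.

    Hypothetical helper: it only assembles argument lists and does not
    start nettee or contact any host.
    """
    if len(nodes) < 2:
        raise ValueError("a chain needs a source node and at least one receiver")
    commands = {}
    for i, node in enumerate(nodes):
        argv = ["nettee"]
        if i == 0:
            # Head of the chain: read the file, write nothing locally,
            # and log errors (1) plus periodic status messages (8).
            argv += ["-in", source_file, "-out", "none", "-v", "9"]
        else:
            # Receivers read from the upstream node by default (no -in).
            argv += ["-out", dest_file]
        if i + 1 < len(nodes):
            argv += ["-next", nodes[i + 1]]
        # The last node gets no -next, which indicates End of Chain.
        commands[node] = argv
    return commands


if __name__ == "__main__":
    chain = build_chain_commands(["node1", "node2", "node3"],
                                 "image.tar", "/tmp/image.tar")
    for node, argv in chain.items():
        print(node, " ".join(argv))
```

Because each node waits for its upstream connection, the resulting commands could be launched in any order.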
Error Handling
By default severe errors cause the entire chain to abort. By utilizing the -conwf and -colwf options nettee may be instructed to do its best to continue processing in the event of certain write failures of the data stream. Note that failures which occur while the distribution chain is forming are still fatal events. To allow the program to continue with a truncated or alternate chain if chain formation errors are encountered, use the -connf option, and optionally specify alternate targets in each hostlist. If the node above the failed node is allowed to emit messages and errors (for instance, -v 5), messages similar to these will be sent to the log destination (-log):
Failures detected in child 0 [node34]: NWF
Failures detected in child 1 [node35]: NONE
Failures detected in chain: NWF
The first type of message describes the failures that were detected in the named child node, that is, those named in the -next option. The second message describes failures that were detected anywhere further on in the chain. The error codes currently defined are: NONE no errors, NWF network write failure, LWF local write failure, BBC child returned incorrect byte count, BSTAT child returned unknown or bad status, and NNF could not connect to (one or more) downstream chain nodes.
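For post-run analysis, messages of the two kinds shown above could be picked out of a log with a small script. The Python sketch below is an assumption-laden illustration: the function and variable names are hypothetical, and it assumes each message carries exactly one error code in the layout of the sample lines.

```python
import re

# Error codes and meanings as listed in this section.
ERROR_CODES = {
    "NONE": "no errors",
    "NWF": "network write failure",
    "LWF": "local write failure",
    "BBC": "child returned incorrect byte count",
    "BSTAT": "child returned unknown or bad status",
    "NNF": "could not connect to (one or more) downstream chain nodes",
}

# Patterns modeled on the sample log lines; the real format may vary.
CHILD_RE = re.compile(r"Failures detected in child (\d+) \[([^\]]+)\]: (\S+)")
CHAIN_RE = re.compile(r"Failures detected in chain: (\S+)")


def parse_failure_line(line):
    """Return a dict describing a failure message, or None if no match."""
    m = CHILD_RE.search(line)
    if m:
        return {"scope": "child", "index": int(m.group(1)),
                "node": m.group(2), "code": m.group(3)}
    m = CHAIN_RE.search(line)
    if m:
        return {"scope": "chain", "code": m.group(1)}
    return None


if __name__ == "__main__":
    rec = parse_failure_line("Failures detected in child 0 [node34]: NWF")
    print(rec["node"], ERROR_CODES.get(rec["code"], "unknown code"))
```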
Exit Status
nettee will normally emit an EXIT_SUCCESS status (0 on Unix). This is true even if errors were detected and handled in the node itself or in a child node. nettee will emit an EXIT_FAILURE status if it was forced to close by an unhandled event such as a timeout, write failure, or unexpected socket closure.
OPTIONS
- -h
- Print help information.
- -hexamples
- Print examples.
- -herrors
- Print error status codes.
- -i
- Print version, license, and copyright information.
- -in <SRC>
- Reads data from <SRC>, which may have one of four values: nettee reads from the upstream node; - reads from stdin; socket reads the output of a command from a socket; filename reads from a file. If no -in option is present the program reads data from the upstream node.
- -out <DST>
- Writes data locally to <DST>, which may have one of four values: none writes nothing locally; - writes to stdout; socket writes the data stream to a command through a socket; filename writes to a file. If no -out option is present the program writes data to stdout.
- -next <HOSTLISTS>
- Writes data to downstream destination[s] hostlist1(,hostlist2(,hostlist3(...))) where the hostlist entries are separated by commas or spaces. A hostlist consists of either a single hostname, or a comma separated list of hostnames enclosed in square brackets. Example: node1,[node2,node3],[node4,node5,node6],node7. The bracketed form allows for automatic failover if unreachable nodes are encountered and if -connf is specified. The first hostname in the list is tried, then the next, and so on. There may be 1-8 hostlists. The number of hostlists controls the topology of the distribution chain. Use a linear distribution chain (a single hostlist) when all nodes share a single network switch. Use a forked distribution chain (multiple hostlists) when nodes are connected to two or more network switches. The End of Chain condition (no downstream write) is indicated by a <HOSTLISTS> value of ., , or _EOC_. An End of Chain condition is also indicated by the absence of a -next option. If End of Chain is indicated there may not be any other hostlists specified.
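The hostlist syntax can be made concrete with a small parser. The Python sketch below is an illustration only, not nettee's own parsing code: the function name is hypothetical, it recognizes only the . and _EOC_ End of Chain markers, and it does not check for unbalanced brackets.

```python
import re

EOC_MARKERS = {".", "_EOC_"}   # only the markers legible in this page
MAX_HOSTLISTS = 8              # "There may be 1-8 hostlists."


def parse_hostlists(spec):
    """Parse a -next value into a list of hostlists (lists of hostnames).

    Returns an empty list for an End of Chain marker.
    """
    spec = spec.strip()
    if spec in EOC_MARKERS:
        return []
    # A token is either a bracketed group or a bare hostname; commas and
    # whitespace outside brackets separate hostlists.
    tokens = re.findall(r"\[[^\]]*\]|[^\s,\[\]]+", spec)
    hostlists = []
    for token in tokens:
        if token.startswith("["):
            hosts = [h.strip() for h in token[1:-1].split(",") if h.strip()]
        else:
            hosts = [token]
        if not hosts:
            raise ValueError("empty hostlist")
        if any(h in EOC_MARKERS for h in hosts):
            raise ValueError("End of Chain cannot be combined with hostlists")
        hostlists.append(hosts)
    if not 1 <= len(hostlists) <= MAX_HOSTLISTS:
        raise ValueError("expected 1-8 hostlists, got %d" % len(hostlists))
    return hostlists


if __name__ == "__main__":
    print(parse_hostlists("node1,[node2,node3],[node4,node5,node6],node7"))
```

For the example value given above, the result is four hostlists, so the node would fork the stream to four downstream targets, two of which have failover alternates.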
- -cmd <COMMAND>
- Specifies the command to use in conjunction with an -in socket or -out socket option. Since only a single <COMMAND> may be specified, socket may not be applied to both -in and -out at the same time. When -cmd is used with -in socket a child process running <COMMAND> reads data from a disk or other device and writes the resulting data stream to stdout. When -cmd is used with -out socket a child process running <COMMAND> reads the data stream from stdin and writes the processed data to a disk or other device. Typically the <COMMAND> string invokes tar or some other archiving program. In some instances using sockets and -cmd will be faster than using the same command in a pipe due to the larger buffer size used for the socket. Run nettee -hexamples to see a usage example.
- -stm <EOS>
- Stream text through a nettee chain until the string <EOS> is encountered, then exit. This allows short text messages to traverse the chain without waiting for a buffer to fill. Since the text message can very rapidly traverse the nettee chain it can be piped into execinput (or any other program that will execute its stdin as commands) to produce essentially simultaneous execution on all target nodes. The <EOS> string is not passed through the data chain and its length is ignored. When used to start further nettee processes on the target nodes, <PORT> values must be chosen to avoid interference. While this mode may be convenient for setting up Beowulf nodes, it is exceedingly dangerous for general use since any command introduced into the command stream will execute on all chain nodes as if submitted by the owner of the nettee process on that node. Run nettee -hexamples to see a usage example.
- -name <STRING>
- Specify the node name used in messages (<=127 characters). If not supplied, the values of the environment variables MYHOSTNAME and HOSTNAME are first checked, and if those are not defined, the result of a gethostname() call is used.
- -log <LDST>
- Errors and messages are written to <LDST>, which may have one of two values: - writes to stderr; filename writes to a file. If no -log option is present the program writes messages to stderr.
- -p,-port <PORT>
- First of two consecutive ports used for communication. If no -port option is present the program uses the default value of 9997.
- -v <VERBOSE>
- <VERBOSE> is a bit mask which controls the types of warning and error messages which are sent to the -log destination. Bit values indicate: 1 show error messages; 2 show command line settings; 4 show messages; 8 show periodic status messages during transfer; 16 prepend nodename to all messages. Use a <VERBOSE> value of 0 to eliminate all messages. If no -v is present the program uses a default <VERBOSE> value of 1.
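Because <VERBOSE> is a bit mask, values are combined by adding the bits wanted. The Python sketch below decodes a value into the message types it enables; the function name is hypothetical and the labels paraphrase the list above.

```python
# Bit values and meanings as documented for -v.
VERBOSE_BITS = {
    1: "error messages",
    2: "command line settings",
    4: "messages",
    8: "periodic status messages during transfer",
    16: "prepend nodename to all messages",
}


def describe_verbose(value):
    """Return the list of behaviours enabled by a -v bit mask value."""
    return [label for bit, label in VERBOSE_BITS.items() if value & bit]


if __name__ == "__main__":
    # -v 5 is 1 + 4: error messages plus ordinary messages.
    print(describe_verbose(5))
```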
- -q
- Suppress "ignored signal" messages.
- -t <WAIT>
- Wait up to <WAIT> seconds for a connection from upstream in the chain to form or data to be received. If neither of these events occurs, exit with an error. A value of 0 waits forever and will only exit on an end of data condition. If no -t is present the program uses a default <WAIT> value of 0. The -connf <WAIT> and -w options control timeouts for downstream connections.
- -w
- Wait for the next node to boot or attach to the network. If not specified and the next node is not reachable, nettee will exit with an error no matter what the -t <WAIT> and -connf <WAIT> timeout values are.
- -colwf
- Continue on Local Write Failure. Normally the failure of a write of the data stream to the local output will be fatal and the entire distribution chain will collapse immediately. (Typically this happens when data is written to disk and a partition fills or there is an ownership problem. A complete disk failure may initially present this way but often goes on to crash the node, resulting also in a network write failure.) When -colwf is set and a local write failure occurs on a node, that node will continue to relay data down the chain. The node that failed will not have correctly processed the data stream locally but all other nodes will be unaffected by this failure. The top node will emit an error message when this occurs so that a subsequent analysis with other tools may locate the node(s) which failed. This option may only be employed on a node that reads data from an upstream node.
- -conwf
- Continue on Network Write Failure. Normally the failure of a write of the data stream to the next node will be fatal and the entire distribution chain will collapse immediately. (Typically this happens when a node crashes while nettee is running.) When -conwf is set and a network write failure occurs on a node (indicating that the next node has failed) the node will continue to process the data stream locally but will make no further attempts to transfer data to the next node in the chain. This allows the data transfer to complete on a chain down to the node above a failed node. The top node will emit an error message when this occurs so that a subsequent analysis with other tools may locate the node(s) which failed. This option may only be employed on a node that reads data from an upstream node.
- -connf <WAIT>
- Continue on Next Node Failure. Give each node in a hostlist <WAIT> seconds to join the chain. After that each successive host in the hostlist is given <WAIT> seconds to join, and if none succeed, no data will be sent to any of those hosts. If -connf is not specified or the wait time is set to zero seconds, the program will wait forever for a connection to the first node in each hostlist.
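The failover order described for -connf can be sketched as a selection loop. This Python example is a simplified model under stated assumptions: can_connect is a hypothetical stand-in for the real connection attempt, the function name is invented for illustration, and the "wait forever" behaviour for a zero wait is not modeled.

```python
def pick_targets(hostlists, can_connect, wait):
    """Choose one target per hostlist, trying hosts in listed order.

    can_connect(host, wait) -> bool is a hypothetical callable that
    reports whether host joined the chain within `wait` seconds.
    A result of None means no host in that hostlist joined, so no data
    would be sent to any of its hosts.
    """
    chosen = []
    for hosts in hostlists:
        target = next((h for h in hosts if can_connect(h, wait)), None)
        chosen.append(target)
    return chosen


if __name__ == "__main__":
    reachable = {"node3", "node7"}
    result = pick_targets([["node2", "node3"], ["node4"], ["node7"]],
                          lambda host, wait: host in reachable, 5)
    print(result)
```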
- -progress <INTERVAL>
- If -v 8 is used a status message is emitted every <INTERVAL> bytes transferred. The default value of 10000000 will be too small for a very fast network.
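To see why the default interval may be too small, it helps to estimate how often a status message would appear. The Python sketch below is plain arithmetic under the assumption that the transfer runs at the full nominal link rate; the function names are hypothetical.

```python
def seconds_between_status(interval_bytes, link_bits_per_second):
    """Time between status messages if the link is fully saturated."""
    return interval_bytes * 8 / link_bits_per_second


def interval_for_period(seconds, link_bits_per_second):
    """<INTERVAL> in bytes that yields about one message per `seconds`."""
    return int(seconds * link_bits_per_second / 8)


if __name__ == "__main__":
    default_interval = 10_000_000          # documented default, in bytes
    gigabit = 1_000_000_000                # 1 Gbit/s nominal rate
    print(seconds_between_status(default_interval, gigabit))
    print(interval_for_period(1, gigabit))
```

Under that assumption the default produces a message roughly every 0.08 seconds on a 1 Gbit/s link, while an interval of 125000000 bytes would give about one message per second.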
RELATED PROGRAMS
netcat(1).
nettee is derived from Felix Rauch's dolly which is available here: http://www.cs.inf.ethz.ch/CoPs/patagonia/#dolly
The nettee home page is: http://saf.bio.caltech.edu/nettee.html
COPYRIGHTS
Copyright: 2008 David Mathog and Caltech. Copyright: Felix Rauch and ETH Zurich
LICENSE
Freely distributed under version 2 of the GNU General Public License (GPL 2).
AUTHOR
David Mathog, Biology Division, Caltech