dbmultistats (1) - Linux Manuals
NAME
dbmultistats - run dbcolstats over each group of inputs identified by some key
SYNOPSIS
dbmultistats [-dm] [-c ConfidencePercent] [-f FormatForm] [-q NumberOfQuartiles] -k KeyField ValueField
DESCRIPTION
The input table is grouped by KeyField, and a separate set of column statistics is computed on ValueField for each group with a unique key.
Assumptions and requirements are the same as dbmapreduce (this program is just a wrapper around that program):
By default, data can be provided in arbitrary order, and the program consumes O(number of unique tags) memory and O(size of data) disk space.
With the -S option, data must arrive grouped by tag (not necessarily sorted), and the program consumes O(number of tags) memory and no disk space. The program checks this precondition and aborts if it is not met.
With two -S's, the program consumes O(1) memory but does not verify that the data-arrival precondition is met.
(Note that these semantics are exactly like
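As a rough illustration of the single -S precondition described above, the following Python sketch processes rows that arrive grouped by key and aborts if a key reappears after its group has ended. It is not the dbmapreduce implementation; the function name grouped_sums and the (key, count, total) tuple layout are hypothetical and chosen only for this example.

```python
from typing import Iterable, Iterator, Tuple


def grouped_sums(rows: Iterable[Tuple[str, float]]) -> Iterator[Tuple[str, int, float]]:
    """Yield (key, count, total) for each run of consecutive rows sharing a key.

    Hypothetical helper: raises ValueError if a key shows up again after its
    run has ended, mimicking the documented check-and-abort behaviour.
    """
    seen = set()  # one entry per key, so memory grows with the number of tags
    current_key = None
    count, total = 0, 0.0
    for key, value in rows:
        if key != current_key:
            if current_key is not None:
                yield current_key, count, total
            if key in seen:
                raise ValueError(f"key {key!r} is not contiguous in the input")
            seen.add(key)
            current_key, count, total = key, 0, 0.0
        count += 1
        total += value
    if current_key is not None:
        yield current_key, count, total


if __name__ == "__main__":
    rows = [
        ("ufs_mab_sys", 37.2),
        ("ufs_mab_sys", 37.3),
        ("ufs_rcp_real", 264.5),
        ("ufs_rcp_real", 277.9),
    ]
    for key, n, total in grouped_sums(rows):
        print(key, n, total)
```

Dropping the seen set would remove the check and leave only the running counters, which is the trade-off the double -S mode describes: constant memory, but no verification.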
This module also supports the standard fsdb options:
This program is distributed under the terms of the GNU General
Public License, version 2. See the file COPYING
with the distribution for details.
OPTIONS
Options are the same as dbcolstats.
SAMPLE USAGE
Input:
#fsdb experiment duration
ufs_mab_sys 37.2
ufs_mab_sys 37.3
ufs_rcp_real 264.5
ufs_rcp_real 277.9
Command:
cat DATA/stats.fsdb | dbmultistats -k experiment duration
Output:
#fsdb experiment mean stddev pct_rsd conf_range conf_low conf_high conf_pct sum sum_squared min max n
ufs_mab_sys 37.25 0.070711 0.18983 0.6353 36.615 37.885 0.95 74.5 2775.1 37.2 37.3 2
ufs_rcp_real 271.2 9.4752 3.4938 85.13 186.07 356.33 0.95 542.4 1.4719e+05 264.5 277.9 2
# | /home/johnh/BIN/DB/dbmultistats experiment duration
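To make the output columns easier to interpret, the Python sketch below recomputes several of them from the sample input. It is an independent illustration, not code from dbcolstats: the function name group_stats is hypothetical, the formulas are the usual sample statistics, and the confidence range uses a hard-coded two-sided 95% Student's t value for one degree of freedom, so it applies only to groups of exactly two values, as in this sample.

```python
import math
from collections import defaultdict

ROWS = [
    ("ufs_mab_sys", 37.2),
    ("ufs_mab_sys", 37.3),
    ("ufs_rcp_real", 264.5),
    ("ufs_rcp_real", 277.9),
]

# Two-sided 95% Student's t critical value for 1 degree of freedom (n = 2).
# Assumption for this sketch only; other group sizes need other values.
T_95_DF1 = 12.706


def group_stats(rows):
    """Group (key, value) rows by key and compute per-group statistics.

    Hypothetical helper; assumes every group has at least two values.
    """
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)

    stats = {}
    for key, values in groups.items():
        n = len(values)
        total = sum(values)
        mean = total / n
        # Sample standard deviation (divides by n - 1).
        stddev = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
        stats[key] = {
            "mean": mean,
            "stddev": stddev,
            "pct_rsd": 100.0 * stddev / mean,
            # Half-width of the confidence interval; only filled in for n == 2.
            "conf_range": T_95_DF1 * stddev / math.sqrt(n) if n == 2 else None,
            "sum": total,
            "sum_squared": sum(v * v for v in values),
            "min": min(values),
            "max": max(values),
            "n": n,
        }
    return stats


if __name__ == "__main__":
    for key, s in group_stats(ROWS).items():
        print(key, s)
```

In the sample output, conf_low and conf_high are the mean minus and plus conf_range (for example, 37.25 - 0.6353 = 36.615 rounded to five significant digits).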
AUTHOR and COPYRIGHT
Copyright (C) 1991-2015 by John Heidemann <johnh [at] isi.edu>