     shorten - compression for waveform files

     shorten [ -xh ] [ -a align bytes ] [ -b block size ] [ -c
     channels ] [ -d discard bytes ] [ -m blocks ] [ -p prediction
     order ] [ -q quantisation level ] [ -r bit rate ] [ -t
     file type ] [ -v version ] [ input file ] [ output file ]

     shorten reduces the size of  waveform  files  using  Huffman
     coding  of  prediction residuals.  The amount of compression
     obtained depends on the nature of the waveform.  Waveforms
     composed of low frequencies and low amplitudes give the best
     compression.  Compression  is  generally  better  than  that
     obtained by general purpose compression utilities applied to
     waveform files.

     If both file names are specified then these are used as  the
     input and output files.  The first file name can be replaced
     by "-" to read from standard  input  and  likewise  for  the
     second and standard output.  If only one file name is speci-
     fied, then that name is used for input and the  output  file
     name  is  generated  by adding a .shn suffix for compression
     and removing a .shn suffix for decompression.  In this  case
     the  input  file is removed on completion.  If no file names
     are specified, shorten reads from standard input and  writes
     to  standard  output.   Whenever  possible,  the output file
     inherits the permissions, owner, group, access and modifica-
     tion times of the input file.
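
     The single file name rule above can be sketched as follows.
     This is a minimal illustration, not shorten's own code: the
     function name default_output_name is hypothetical, and raising
     an error when the .shn suffix is missing on extraction is an
     assumption, since that case is not described here.

```python
def default_output_name(input_name: str, extract: bool) -> str:
    """Derive the output name when only one file name is given."""
    suffix = ".shn"
    if extract:
        # Decompression: remove the .shn suffix.
        if not input_name.endswith(suffix):
            # Assumption: this case is not documented, so refuse to guess.
            raise ValueError(f"{input_name!r} does not end in {suffix}")
        return input_name[: -len(suffix)]
    # Compression: add the .shn suffix.
    return input_name + suffix
```

     For example, default_output_name("speech.wav", False) gives
     "speech.wav.shn", and the reverse call recovers "speech.wav".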

     -a align bytes
          Specify the number of bytes needed to be copied  verba-
          tim  before  compression begins according to the speci-
          fied file type.  This is useful if  header  information
          is not a multiple of the number of bytes per sample.

     -b block size
          Specify the number of samples  to  be  grouped  into  a
          block  for  processing.  Within a block the signal ele-
          ments are expected to have the  same  spectral  charac-
          teristics.  The default option works reasonably well.

     -c channels
          Specify the number of independent interwoven  channels.
          For two signals, a(t) and b(t) the original data format
          is assumed to be a(0),b(0),a(1),b(1)...
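
          The interleaved layout a(0),b(0),a(1),b(1)... can be
          separated into per-channel sequences as in the sketch
          below.  The function name deinterleave is hypothetical,
          and rejecting a sample count that is not a multiple of
          the channel count is an assumption made for the example.

```python
def deinterleave(samples, channels):
    """Split sample-wise interleaved data into one list per channel."""
    if channels < 1:
        raise ValueError("channels must be at least 1")
    if len(samples) % channels:
        # Assumption: a partial final frame is treated as an error here.
        raise ValueError("sample count is not a multiple of the channel count")
    # Channel c owns every channels-th sample, starting at index c.
    return [list(samples[c::channels]) for c in range(channels)]
```

          With two channels, [a0, b0, a1, b1] becomes
          [[a0, a1], [b0, b1]].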

     -d discard bytes
          Specify the number of  bytes  to  be  discarded  before
          compression or decompression.  This may be used to
          delete header information from a file.  Refer to the -a
          option for storing header information in the compressed
          file.

     -h   Give a short message specifying usage options.

     -m blocks
          Specify the number of past blocks to be used  to  esti-
          mate the mean of the signal.  The default value of zero
          disables this prediction and the mean is assumed to lie
          in  the  middle  of the range of the relevant data type
          (i.e. at zero for signed quantities).

     -p prediction order
          Specify the order of the linear predictive filter.  The
          default  value of zero disables the use of linear pred-
          iction and a polynomial interpolation  method  is  used
          instead.   The use of the linear predictive filter gen-
          erally results in a small  improvement  in  compression
          ratio  at  the  expense of execution time.  Compression
          time is linear in the specified order whilst decompression
          time is about twice that of the default polynomial
          interpolation.

     -q quantisation level
          Specify the number of low order  bits  in  each  sample
          which  can  be discarded (set to zero).  This is useful
          if these bits carry no information,  for  example  when
          the signal is corrupted by noise.
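
          Setting the low order bits to zero can be illustrated
          with two shifts.  This is only a sketch: the function
          name quantise is hypothetical, and plain truncation is
          assumed, so any rounding that shorten itself may apply
          is not modelled.

```python
def quantise(sample: int, level: int) -> int:
    """Set the `level` lowest order bits of a sample to zero."""
    if level < 0:
        raise ValueError("level must be non-negative")
    # Assumption: plain truncation.  Python's >> is an arithmetic shift,
    # so negative samples round towards minus infinity.
    return (sample >> level) << level
```

          With a level of 2, the sample 13 becomes 12 and the
          sample -5 becomes -8.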

     -r bit rate
          Specify the expected maximum number of bits per sample.
          The  upper bound on the bit rate is achieved by setting
          the low order bits of the sample to  zero,  hence  max-
          imising  the segmental signal to noise ratio.  WARNING:

     -t file type
          Gives the type of the  sound  sample  file  as  one  of
          au is the natural file type of ulaw encoded files  (for
          lossless  compression)  and  ulaw  is the expanded file
          type for lossy compression.  All the other  types  have
          initial s or u for signed or unsigned data, followed by
          8 or 16 as the number of bits per sample.   No  further
          extension  means the data is in the natural byte order,
          a trailing x specifies byte swapped data, hl explicitly
          states the byte order as high byte followed by low byte
          and lh the  converse.   The  default  is  s16,  meaning
          signed 16 bit integers in the natural byte order.
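
          The naming scheme for the integer types can be read
          mechanically.  The sketch below maps such a name onto a
          format string for Python's struct module.  The function
          name struct_format is hypothetical, the au and ulaw
          types are deliberately not handled, and accepting a
          byte order suffix on 8 bit types is an assumption.

```python
import re
import sys


def struct_format(file_type: str) -> str:
    """Translate a name such as 's16hl' into a struct format string."""
    match = re.fullmatch(r"([su])(8|16)(x|hl|lh)?", file_type)
    if match is None:
        raise ValueError(f"unsupported file type: {file_type!r}")
    sign, bits, order = match.groups()
    native = "<" if sys.byteorder == "little" else ">"
    swapped = ">" if native == "<" else "<"
    # No suffix: natural order.  x: byte swapped.  hl: high byte first.
    # lh: low byte first.
    prefix = {None: native, "x": swapped, "hl": ">", "lh": "<"}[order]
    code = "b" if bits == "8" else "h"
    if sign == "u":
        code = code.upper()
    return prefix + code
```

          For example, struct_format("s16hl") gives ">h", a
          signed 16 bit integer with the high byte first.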

     -v version
          Specify the  binary  format  version  number.   At  the
          moment,  version  1 can write version 0 files, although
          continuation of this feature is not guaranteed.  It
          will  always  be possible to unpack files packed with a
          lower version number.

     -x   Extract.   Reconstruct the original  file.   All  other
          command line options are ignored.

     shorten works by blocking the signal, making a model of each
     block  in  order to remove temporal redundancy, then Huffman
     coding the prediction residual.

     The signal is read in a block of about 128 or  256  samples,
     and  converted  to ints with expected mean of zero.  Sample-
     wise-interleaved data is  converted  to  separate  channels,
     which are assumed independent.

     Four functions are computed, corresponding  to  the  signal,
     difference  signal, second and third order differences.  The
     one with the lowest variance  is  coded.   The  variance  is
     measured  by  summing absolute values for speed and to avoid
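
     The selection among the signal and its first, second and
     third order differences can be sketched as below.  The
     function names are hypothetical, and the samples preceding
     the block are assumed to be zero, which need not match how
     shorten carries history from one block to the next.

```python
def differences(block, order):
    """Return the order-th difference of a block of samples."""
    out = list(block)
    for _ in range(order):
        # Assumption: the sample before the start of the block is zero.
        out = [cur - prev for prev, cur in zip([0] + out[:-1], out)]
    return out


def best_order(block):
    """Pick the difference order (0 to 3) with the smallest cost."""
    # The cost is the sum of absolute values, standing in for the variance.
    costs = [sum(abs(v) for v in differences(block, k)) for k in range(4)]
    # Ties go to the lowest order.
    return min(range(4), key=costs.__getitem__)
```

     A constant block is best served by the first difference,
     and a linear ramp by the second.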

     It is assumed the signal has the Laplacian probability density
     function of exp(-abs(x)).  There is a computationally
     efficient way of mapping this density to Huffman codes.  The
     code is in two parts: a run of zeros ended by a bounding one,
     and a mantissa of a fixed number of bits.  The number of
     leading zeros gives the offset from zero.  Signed numbers are
     stored by calling the function for unsigned numbers with the
     sign in the lowest bit.  Some examples for a 2 bit mantissa:

     100       0
     101       1
     110       2
     111       3
     0100      4
     0111      7
     00100     8
     0000100   16
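
     The examples above follow from a short encoder, sketched
     here with bit strings for readability.  The function names
     are hypothetical.  The signed mapping (non-negative v to 2v,
     negative v to 2(-v-1)+1) is one way of putting the sign in
     the lowest bit and is an assumption, as is reusing the same
     mantissa width for signed values.

```python
def encode_unsigned(value: int, mantissa_bits: int) -> str:
    """Return the code word for a non-negative integer as a bit string."""
    if value < 0 or mantissa_bits < 0:
        raise ValueError("value and mantissa_bits must be non-negative")
    zeros = value >> mantissa_bits                  # run of leading zeros
    mantissa = value & ((1 << mantissa_bits) - 1)   # low order bits
    low = format(mantissa, "b").zfill(mantissa_bits) if mantissa_bits else ""
    return "0" * zeros + "1" + low                  # zeros, bounding one, mantissa


def decode_unsigned(bits: str, mantissa_bits: int) -> int:
    """Invert encode_unsigned for a single code word."""
    zeros = bits.index("1")
    low = bits[zeros + 1 : zeros + 1 + mantissa_bits]
    return (zeros << mantissa_bits) | (int(low, 2) if low else 0)


def encode_signed(value: int, mantissa_bits: int) -> str:
    """Encode a signed integer by placing the sign in the lowest bit."""
    # Assumed folding: 0, -1, 1, -2, 2, ... map to 0, 1, 2, 3, 4, ...
    folded = (value << 1) if value >= 0 else ((-value - 1) << 1) | 1
    return encode_unsigned(folded, mantissa_bits)
```

     With a 2 bit mantissa, encode_unsigned(8, 2) gives "00100",
     as in the examples, and decode_unsigned("00100", 2) gives
     back 8.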

     The structure of a compressed file is:

     1) four bytes for a magic number

     2) one byte for a format version

     3) The command line options

     4) a repeating sequence of <command> <args> ...


     Exit status is normally 0.  A warning is issued if the  file
     is  not  properly  aligned,  i.e.  a whole number of records
     could not be read at the end of the file.

     No check is made for increasing file size, but valid speech
     files  generally achieve some compression.  Even compressing
     a file of random bytes  (which  represents  the  worst  case
     waveform  file,  that of maximum amplitude white noise) only
     results in a small increase in the file length (about 6% for
     8 bit data and 3% for 16 bit data).

     Piped output does not work when run under DOS.

     Future enhancements that are likely to improve the compression
     are pitch period tracking and arithmetic coding.

     Please mail me with bugs, bug fixes and any other suggestions.

     The latest version can be obtained  by  anonymous  FTP  from
     svr-ftp.eng.cam.ac.uk, directory misc.

     Copyright    (C)    1992,1993,1994    by    Tony    Robinson

     The aim of the copying and usage restrictions is that I
     want people to be able to use the software freely and modify
     it if they like, but not sell it or claim it is their own work.

     Thanks to the following for providing motivation and valuable
     feedback: Dolf Grunbauer, Kris Huber, Dave Pallett,
     John Garofolo, Jon Fiscus and Steve Lowe.

