Manual.txt for Version 2.00 of ISO/MPEG Audio Layer 3 software only
encoder/decoder for Unix.


1. ENCODER V2.00
=============

l3enc is an ISO/MPEG Layer-3 software only encoder. It takes PCM audio
data files as input and delivers Layer-3 coded bit stream files as
output. Several options can be selected via command line switches.

Usage: l3enc [-switch1 [-switch2 [...]]]

PLEASE NOTE: Non-registered users may use the encoder only with the
following options:

  bitrate     sampling rate   mode              format
  256 kbps    44.1 kHz        stereo (mode 0)   MPEG-1
  128 kbps    44.1 kHz        stereo (mode 0)   MPEG-1
   64 kbps    22.05 kHz       stereo (mode 1)   MPEG-2
   32 kbps    22.05 kHz       mono (mode 4)     MPEG-2

For non-registered users, ancillary data processing is not supported.

1.1 PCM audio input file

The first command line argument specifies the name of the PCM audio
data file. Version 2.00 of the encoder accepts raw PCM audio data
files, PCM audio data files in RIFF/WAVE format as used by Microsoft
Windows, PCM audio data files in the Sun .au format, and PCM audio
data files in the Apple AIFF format. The samples must be 16 bit signed
integer values.

For raw PCM audio data:
By default the input file is assumed to contain raw PCM audio data.
Stereo audio data is input in interleaved format, the first channel
being the left channel.

  ...

Mono audio data has the format

  ....

Whether the input file is treated as mono or stereo audio data is set
by the encoding mode parameter (1.3). Default is stereo.

1.2 bitstream output file

The second command line argument specifies the name of the bitstream
output file. The extension of the file name should be .mp3. The format
of the bit stream is as defined in the ISO/MPEG publications IS11172-3
(MPEG-1) and IS13818-3 (MPEG-2).
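As an illustration of the raw input format described in 1.1, the
following Python sketch writes one second of an interleaved stereo test
tone as 16 bit signed integers. It is not part of the l3enc
distribution. The function names, the file name tone.pcm and the use of
the machine's native byte order are assumptions made for this example.

```python
import math
import struct


def interleave_stereo(left, right):
    """Return raw PCM bytes laid out as L0 R0 L1 R1 ... (16 bit signed)."""
    if len(left) != len(right):
        raise ValueError("both channels need the same number of samples")
    out = bytearray()
    for l_sample, r_sample in zip(left, right):
        # "=" selects native byte order without padding (an assumption;
        # see the '-tfs' switch if the byte order has to be swapped).
        out += struct.pack("=hh", l_sample, r_sample)
    return bytes(out)


def sine(freq_hz, rate_hz, n_samples, amplitude=10000):
    """Return n_samples of a sine tone as a list of integers."""
    step = 2.0 * math.pi * freq_hz / rate_hz
    return [int(round(amplitude * math.sin(step * i))) for i in range(n_samples)]


if __name__ == "__main__":
    rate = 44100
    left = sine(440.0, rate, rate)   # one second on the left channel
    right = sine(880.0, rate, rate)  # one second on the right channel
    with open("tone.pcm", "wb") as f:
        f.write(interleave_stereo(left, right))
```

A file produced this way could then be passed to the encoder in the
same way as the examples in 1.10, e.g.
'l3enc tone.pcm tone.mp3 -br 128000'.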
1.3 encoding mode

Depending on the setting of the '-mod' switch, the encoder will treat
the two input channels as:

  -mod 0   stereo (ms stereo)
  -mod 1   stereo (intensity stereo)
  -mod 2   dual channel
  -mod 3   stereo input -> mono output (L+R)/2 ("downmix")
  -mod 4   mono input

For bitrates <= 96 kbps, the default is intensity stereo (-mod 1).
For bitrates >= 112 kbps, the default is ms-stereo (-mod 0).
For more details about encoding modes, please refer to section 1.11
'Encoding Recommendations'.

For stereo, the first channel is the left channel and the second
channel is the right channel. If input files in SND/WAVE/AIFF format
are used, the number of channels is detected automatically.

1.4 sampling rate

Version 2.00 of the encoder can use the following sampling rates:

  o 16000 Hz
  o 22050 Hz
  o 24000 Hz
  o 32000 Hz
  o 44100 Hz
  o 48000 Hz

1.5 effective sampling rate

For registered users the following effective sampling rates are
supported:

  o 16000 Hz
  o 22050 Hz
  o 24000 Hz
  o 32000 Hz
  o 44100 Hz
  o 48000 Hz

If you apply an effective sampling rate using the '-esr' switch, the
input is downsampled from the sampling rate to that effective sampling
rate. Currently only downsampling by a factor of 2 or 3 works.

1.6 bitrate

The bitrate of the bit stream output is selected via the '-br' switch.
The bitrate is specified in bits/second. The bitrate is the total
bitrate for all encoded channels, i.e. if you select '-br 128000' and
'stereo', both channels will be stuffed into one bit stream of 128000
bits/second.

Valid bit rates are:

  o   8000 bit/s
  o  16000 bit/s
  o  24000 bit/s
  o  32000 bit/s
  o  40000 bit/s
  o  48000 bit/s
  o  56000 bit/s
  o  64000 bit/s
  o  80000 bit/s
  o  96000 bit/s
  o 112000 bit/s
  o 128000 bit/s
  o 144000 bit/s
  o 160000 bit/s
  o 192000 bit/s
  o 224000 bit/s
  o 256000 bit/s
  o 320000 bit/s

The default bitrate is 128000 bits/sec.

1.7 crc check

If '-crc' is asserted, ISO/MPEG1 crc checking is enabled. Without the
'-crc' switch, crc checking is disabled.
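The parameter rules from sections 1.3 to 1.6 can be summarized in a
short Python sketch. It is an illustration only and not the encoder's
own code: the function names are hypothetical, and treating an
effective sampling rate equal to the sampling rate as "no downsampling"
is an assumption.

```python
VALID_BITRATES = (
    8000, 16000, 24000, 32000, 40000, 48000, 56000, 64000, 80000,
    96000, 112000, 128000, 144000, 160000, 192000, 224000, 256000, 320000,
)
VALID_SAMPLING_RATES = (16000, 22050, 24000, 32000, 44100, 48000)


def default_stereo_mode(bitrate):
    """Return the default '-mod' value for stereo input at this bitrate."""
    if bitrate not in VALID_BITRATES:
        raise ValueError("not a valid bitrate: %d" % bitrate)
    # <= 96 kbps: intensity stereo (-mod 1); >= 112 kbps: ms stereo (-mod 0)
    return 1 if bitrate <= 96000 else 0


def downsampling_factor(sampling_rate, effective_rate):
    """Return 1, 2 or 3, or raise if the rate pair is not supported."""
    for rate in (sampling_rate, effective_rate):
        if rate not in VALID_SAMPLING_RATES:
            raise ValueError("not a valid sampling rate: %d" % rate)
    if effective_rate == sampling_rate:
        return 1  # assumption: equal rates mean no downsampling
    for factor in (2, 3):
        if effective_rate * factor == sampling_rate:
            return factor
    raise ValueError("only downsampling by 2 or 3 is supported")


if __name__ == "__main__":
    print(default_stereo_mode(96000))
    print(downsampling_factor(44100, 22050))
```

With the rates listed in 1.4, the only pairs that pass this check are
32000 -> 16000, 44100 -> 22050, 48000 -> 24000 (factor 2) and
48000 -> 16000 (factor 3).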
1.8 swap low and high byte of input samples

If the '-tfs' option is specified, the low and high bytes of each audio
data input sample are swapped. Use '-tfs' if you move your PCM audio
data from little endian to big endian machines (or vice versa). This
will only work if the input signal is a PCM file.

1.9 ancillary data

If the '-anc' option is specified, the named file is inserted into the
bitstream as ancillary data. The rate is given in bits/frame.

1.10 examples of switch settings

  l3enc infile.pcm out.mp3 -br 112000 -crc
  l3enc /music/pcm/newage.pcm /bitstr/l3/newage.mp3 -mod 2 -br 64000
  l3enc pop.wav pop.mp3 -esr 22050 -br 96000

1.11 Encoding Recommendations

Depending on the desired bitrate, the encoding process should be done
with different parameters. 'l3enc' supports two versions of Layer-3
bitstreams called MPEG-1 and MPEG-2. The basic difference is the use of
different sampling frequencies:

  MPEG-1 Layer 3 sampling frequencies   32, 44.1, 48 kHz
  MPEG-2 Layer 3 sampling frequencies   16, 22.05, 24 kHz

MPEG-1 supports higher audio bandwidth and is therefore the best choice
for high quality audio coding at bitrates >= 96 kbps (stereo) or
>= 48 kbps (mono). For bitrates <= 64 kbps (stereo) or <= 32 kbps
(mono), MPEG-2 offers better sound quality than MPEG-1.

l3enc supports downsampling of input files with MPEG-1 sampling
frequencies to MPEG-2 sampling frequencies using the '-esr' switch
(section 1.5). We recommend this for the lower bitrates mentioned
above. Input files with sampling frequencies <= 24 kHz can of course
only be encoded with MPEG-2.

For coding of stereo files with bitrates <= 96 kbps, the use of
intensity stereo is highly recommended. This is also the default
configuration of the encoder. Note, however, that the use of intensity
stereo will destroy information which is needed for sound processing
schemes like Dolby Surround.
For bitrates >= 112 kbps, intensity stereo is not used by default.
Since it may improve the audio quality at 112 and 128 kbps, you may try
intensity stereo by overriding the default settings with the '-mod'
switch (see section 1.3).

The following tables summarize the recommendations.

- Coding of Mono Input

  bitrate       coding recommendation
  ---------------------------------------------------------
  <=40 kbps     MPEG-2
  >=48 kbps     MPEG-1

- Coding of Stereo Input

  bitrate       coding           Intensity        Notes
                recommendation   recommendation
                                 (also default)
  -------------------------------------------------------------------------
  <=64 kbps     MPEG-2           on
    96 kbps     MPEG-1           on
   112 kbps     MPEG-1           off              intensity may improve quality
   128 kbps     MPEG-1           off              intensity may improve quality
  >=192 kbps    MPEG-1           off


2. DECODER V2.10
=============

l3dec is an ISO/MPEG Layer 3 software only decoder. It takes Layer 3
bit stream files as input and delivers PCM audio data files as output.
A number of options can be selected via command line switches.

Usage: l3dec [-switch1 [-switch2 [...]]]

If you specify no output file name and use the '-sto' option, the audio
data is written to stdout. If you specify '-sti', the decoder reads
from stdin instead of the bitstream file.

2.1 bit stream input file

The format of the bit stream input file must comply with ISO/IEC
IS11172-3 or IS13818-3. The decoder will process all valid MPEG1
Layer-3 bit stream data without restrictions on bit rate or sampling
frequency. It also supports the MPEG2 Layer-3 low sampling frequencies.

2.2 PCM audio data output file

Audio data is output as samples of 16 bit signed integer PCM data. The
default format is raw PCM data and can be either one channel or two
interleaved channels.

format of one (mono) channel PCM audio data:

  ....

format of two channel (stereo) PCM audio data:

  ...

Whether one or two audio channels are used depends on the encoded
information in the bit stream.
For stereo output data the first channel is the left channel.
Information about the sampling frequency and the number of channels
used is displayed at the beginning of the decoding process.

2.3 RIFF/WAVE format

If selected by the '-wav' switch, audio data is output in RIFF/WAVE
format (*.WAV) as used by Microsoft Windows. The audio data itself is
still written as 16 bit PCM data as described in 2.2, but it is
preceded by a WAVE header. The WAVE header contains information about
the number of channels (1 or 2), the sampling frequency
(32k/44.1k/48k) and the number of bits per sample (16).

2.4 SND format

If selected by the '-snd' switch, audio data files are output in the
SND format used on Sun and NeXT workstations.

2.5 AIFF format

If selected by the '-aif' switch, audio data files are output in the
AIFF format.

2.6 AIFC format

If selected by the '-aic' switch, audio data files are output in the
AIFC format.

2.7 skip frames

With the '-fb' option you can skip a number of frames in the bit stream
before the decoding starts. '-fb nnn' skips the first nnn frames. Each
frame contains 1152 samples of audio data. The duration of a frame is
1152 divided by the sampling frequency used: 24 msec (@ 48kHz),
26.1 msec (@ 44.1kHz) or 36 msec (@ 32kHz).

2.8 decode only nnn frames

If you want to decode only a certain number of frames, specify the
'-fn' option. '-fn xxx' will decode only xxx frames (see also 2.7).

2.9 search again after loss of synchronisation

Normally the decoding process is stopped if a loss of synchronisation
is detected, i.e. if the synch information is incorrect. To enable
decoding of partially damaged bit stream files, you may assert the
'-sa' option. In this mode the decoding is not stopped and the file is
searched for valid synch information until end of file is encountered.

2.10 write audio data as ascii hex 24bit output file

If the option '-h24 xxx' is specified, an (additional) output file with
the name 'xxx' is opened. PCM audio data is output as 24 bit ascii hex
values, each followed by carriage return and line feed.
Accuracy of the output values is 24 bit, compared to 16 bit in the raw
output mode. Files written in 'h24' format take four times the storage
space needed for the raw 16 bit output format.

2.11 ignore error messages

If errors in the bit stream are detected, the decoding process is
normally halted. If the '-ign' option is specified, the decoder tries
to continue with the decoding process.

2.12 accept free format bitstream

If the '-ff' option is specified, a free format bitstream is accepted.

2.13 ancillary data

If the bit stream contains ancillary data (user data integrated into
the bit stream), the decoder can write this data into an ancillary data
file. Use the switch '-a file' to specify the filename for the
ancillary data. The default alignment of ancillary data is byte aligned
('-aba'). You can also use the switch '-afh' for the FhG mode. In FhG
mode, ancillary data is framed: each frame begins with a sync and a
length byte and ends with a trailing checksum.

2.14 write to stdout

If the '-sto' option is specified, the PCM data output is written to
stdout.

2.15 read from stdin

If the '-sti' option is specified, the bitstream input is read from
stdin.


All brand names are registered trade marks of their respective owners.