Within an MPEG audio file, there is no main header, as an MPEG audio
file is just built up from a succession of smaller parts called frames.
Each frame is a datablock with its own header and audio information.
In the case of Layer I or Layer II, frames
are totally independent from each other, so you can cut any part of an MPEG
audio file and play it correctly. The player will then play the music starting
from the first full valid frame it will find. However, in the case of Layer
III, frames are not always independant. Due to the possible use of
the "byte reservoir", wich is a kind of internal buffer, frames are
often dependent of each other. In the worst case, 9 input frames may be
needed before beeing able to decode one single frame.
If you need to retrieve information about an MPEG
audio file, you might simply locate the first frame, and retrieve information
from its header. Information within other frames should be consistent with the first one,
except for the bitrate, as you might be retrieving information from a variable bitrate (VBR) file.
In a VBR file, the bitrate can be changed in each frame. It can
be used, as an exemple, to keep a constant sound quality during the
whole file, by using more bits when the music is more complex and thus requires
more bits to be encoded with a similar quality.
The frame header itself is 32 bits (4 bytes) length.
The first twelve bits (or first eleven bits in the case of the MPEG
2.5 extension) of a frame header are always set to 1 and are called
"frame sync". Frames may also feature an optional CRC checksum.
It is 16 bits long and, if it exists, immediately follows the frame header.
After the CRC comes the audio data. By re-calculating the CRC and
comparing its value to the sored one, you can check if the frame
has been altered during transmission of the bitstream.
Here are the details of what is within a frame header:
AAAAAAAA AAABBCCD EEEEFFGH IIJJKLMM
Sign |
Length
(bits) |
Position
(bits) |
Description |
A |
11 |
(31-21) |
Frame sync (all bits must be set) |
B |
2 |
(20,19) |
MPEG Audio version ID
00 - MPEG Version 2.5 (later extension of MPEG 2)
01 - reserved
10 - MPEG Version 2 (ISO/IEC 13818-3)
11 - MPEG Version 1 (ISO/IEC 11172-3)
Note: MPEG Version 2.5 was added lately to the MPEG 2 standard.
It is an extension used for very low bitrate files, allowing
the use of lower sampling frequencies. If your decoder does
not support this extension, it is recommended for you to use
12 bits for synchronization instead of 11 bits.
|
C |
2 |
(18,17) |
Layer description
00 - reserved
01 - Layer III
10 - Layer II
11 - Layer I |
D |
1 |
(16) |
Protection bit
0 - Protected by CRC (16bit CRC follows header)
1 - Not protected |
E |
4 |
(15,12) |
Bitrate index
bits |
V1,L1 |
V1,L2 |
V1,L3 |
V2,L1 |
V2, L2 & L3 |
0000 |
free |
free |
free |
free |
free |
0001 |
32 |
32 |
32 |
32 |
8 |
0010 |
64 |
48 |
40 |
48 |
16 |
0011 |
96 |
56 |
48 |
56 |
24 |
0100 |
128 |
64 |
56 |
64 |
32 |
0101 |
160 |
80 |
64 |
80 |
40 |
0110 |
192 |
96 |
80 |
96 |
48 |
0111 |
224 |
112 |
96 |
112 |
56 |
1000 |
256 |
128 |
112 |
128 |
64 |
1001 |
288 |
160 |
128 |
144 |
80 |
1010 |
320 |
192 |
160 |
160 |
96 |
1011 |
352 |
224 |
192 |
176 |
112 |
1100 |
384 |
256 |
224 |
192 |
128 |
1101 |
416 |
320 |
256 |
224 |
144 |
1110 |
448 |
384 |
320 |
256 |
160 |
1111 |
bad |
bad |
bad |
bad |
bad |
NOTES: All values are in kbps
V1 - MPEG Version 1
V2 - MPEG Version 2 and Version 2.5
L1 - Layer I
L2 - Layer II
L3 - Layer III
"free" means free format. The free bitrate must
remain constant, an must be lower than the maximum allowed
bitrate. Decoders are not required to support decoding of
free bitrate streams.
"bad" means that the value is unallowed.
MPEG files may feature variable bitrate (VBR). Each frame
may then be created with a different bitrate. It may be used
in all layers. Layer III decoders must support this method.
Layer I & II decoders may support it.
For Layer II there are some combinations of bitrate and mode
which are not allowed. Here is a list of allowed combinations.
bitrate |
single channel
|
stereo
|
intensity stereo
|
dual channel
|
free |
yes
|
yes
|
yes
|
yes
|
32 |
yes
|
no
|
no
|
no
|
48 |
yes
|
no
|
no
|
no
|
56 |
yes
|
no
|
no
|
no
|
64 |
yes
|
yes
|
yes
|
yes
|
80 |
yes
|
no
|
no
|
no
|
96 |
yes
|
yes
|
yes
|
yes
|
112 |
yes
|
yes
|
yes
|
yes
|
128 |
yes
|
yes
|
yes
|
yes
|
160 |
yes
|
yes
|
yes
|
yes
|
192 |
yes
|
yes
|
yes
|
yes
|
224 |
no
|
yes
|
yes
|
yes
|
256 |
no
|
yes
|
yes
|
yes
|
320 |
no
|
yes
|
yes
|
yes
|
384 |
no
|
yes
|
yes
|
yes
|
|
F |
2 |
(11,10) |
Sampling rate frequency index
bits |
MPEG1 |
MPEG2 |
MPEG2.5 |
00 |
44100 Hz |
22050 Hz |
11025 Hz |
01 |
48000 Hz |
24000 Hz |
12000 Hz |
10 |
32000 Hz |
16000 Hz |
8000 Hz |
11 |
reserv. |
reserv. |
reserv. |
|
G |
1 |
(9) |
Padding bit
0 - frame is not padded
1 - frame is padded with one extra slot
Padding is used to exactly fit the bitrate.As an example: 128kbps
44.1kHz layer II uses a lot of 418 bytes and some of 417 bytes
long frames to get the exact 128k bitrate. For Layer I slot
is 32 bits long, for Layer II and Layer III slot is 8 bits long. |
H |
1 |
(8) |
Private bit. This one is only informative. |
I |
2 |
(7,6) |
Channel Mode
00 - Stereo
01 - Joint stereo (Stereo)
10 - Dual channel (2 mono channels)
11 - Single channel (Mono)
Note: Dual channel files are made of two independant mono channel.
Each one uses exactly half the bitrate of the file. Most decoders
output them as stereo, but it might not always be the case.
One example of use would be some speech
in two different languages carried in the same bitstream, and
then an appropriate decoder would decode only the choosen language.
|
J |
2 |
(5,4) |
Mode extension (Only used in Joint stereo)
Mode extension is used to join informations that are of no
use for stereo effect, thus reducing needed bits. These bits
are dynamically determined by an encoder in Joint stereo mode,
and Joint Stereo can be changed from one frame to another,
or even switched on or off.
Complete frequency range of MPEG file is divided in subbands
There are 32 subbands. For Layer I & II these two bits determine
frequency range (bands) where intensity stereo is applied.
For Layer III these two bits determine which type of joint
stereo is used (intensity stereo or m/s stereo). Frequency
range is determined within decompression algorithm.
Layer I and II |
Layer III |
value |
Layer I & II |
00 |
bands 4 to 31 |
01 |
bands 8 to 31 |
10 |
bands 12 to 31 |
11 |
bands 16 to 31 |
|
Intensity stereo |
MS stereo |
off |
off |
on |
off |
off |
on |
on |
on |
|
|
K |
1 |
(3) |
Copyright
0 - Audio is not copyrighted
1 - Audio is copyrighted
The copyright has the same meaning as the copyright bit on CDs
and DAT tapes, i.e. telling that it is illegal to copy the contents
if the bit is set. |
L |
1 |
(2) |
Original
0 - Copy of original media
1 - Original media
The original bit indicates, if it is set, that the frame is
located on its original media. |
M |
2 |
(1,0) |
Emphasis
00 - none
01 - 50/15 ms
10 - reserved
11 - CCIT J.17
The emphasis indication is here to tell the decoder that the
file must be de-emphasized, ie the decoder must 're-equalize'
the sound after a Dolby-like noise supression. It is rarely
used. |
Some material used within this page originally came from
http://www.datavoyage.com/mpgscript/mpeghdr.htm by Predrag Supurovic
|