INTERNATIONAL TELECOMMUNICATION UNION
ITU-T H.261
TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU
(03/93)
{This document has included corrections to typographical errors listed in Annex  
5 to COM 15R 16-E dated June 1994. - Sakae OKUBO}  
LINE TRANSMISSION OF NON-TELEPHONE SIGNALS
VIDEO CODEC FOR AUDIOVISUAL SERVICES AT p × 64 kbit/s
ITU-T Recommendation H.261  
(Previously “CCITT Recommendation”)  
FOREWORD  
The ITU Telecommunication Standardization Sector (ITU-T) is a permanent organ of the International Telecommunication  
Union. The ITU-T is responsible for studying technical, operating and tariff questions and issuing Recommendations on them  
with a view to standardizing telecommunications on a worldwide basis.  
The World Telecommunication Standardization Conference (WTSC), which meets every four years, established the topics for  
study by the ITU-T Study Groups which, in their turn, produce Recommendations on these topics.  
ITU-T Recommendation H.261 was revised by the ITU-T Study Group XV (1988-1993) and was approved by the WTSC  
(Helsinki, March 1-12, 1993).  
___________________  
NOTES  
1 As a consequence of a reform process within the International Telecommunication Union (ITU), the CCITT ceased
to exist as of 28 February 1993. In its place, the ITU Telecommunication Standardization Sector (ITU-T) was created as of 1
March 1993. Similarly, in this reform process, the CCIR and the IFRB have been replaced by the Radiocommunication
Sector.
In order not to delay publication of this Recommendation, no change has been made in the text to references containing the
acronyms “CCITT, CCIR or IFRB” or their associated entities such as Plenary Assembly, Secretariat, etc. Future editions of
this Recommendation will contain the proper terminology related to the new ITU structure.
2 In this Recommendation, the expression “Administration” is used for conciseness to indicate both a
telecommunication administration and a recognized operating agency.
© ITU 1994
All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or  
mechanical, including photocopying and microfilm, without permission in writing from the ITU.  
CONTENTS  
Recommendation H.261 (03/93)  
1 Scope
2 Brief specification
  2.1 Video input and output
  2.2 Digital output and input
  2.3 Sampling frequency
  2.4 Source coding algorithm
  2.5 Bit rate
  2.6 Symmetry of transmission
  2.7 Error handling
  2.8 Multipoint operation
3 Source coder
  3.1 Source format
  3.2 Video source coding algorithm
  3.3 Coding control
  3.4 Forced updating
4 Video multiplex coder
  4.1 Data structure
  4.2 Video multiplex arrangement
  4.3 Multipoint considerations
5 Transmission coder
  5.1 Bit rate
  5.2 Video data buffering
  5.3 Video coding delay
  5.4 Forward error correction for coded video signal
Annex A – Inverse transform accuracy specification
Annex B – Hypothetical reference decoder
Annex C – Codec delay measurement method
Annex D – Still image transmission
Recommendation H.261  
VIDEO CODEC FOR AUDIOVISUAL SERVICES AT p × 64 kbit/s
(Geneva, 1990; revised at Helsinki, 1993)  
The CCITT,  
considering  
(a) that there is significant customer demand for videophone, videoconference and other audiovisual services;
(b) that circuits to meet this demand can be provided by digital transmission using the B, H0 rates or their multiples up
to the primary rate or H11/H12 rates;
(c) that ISDNs are likely to be available in some countries that provide a switched transmission service at the B, H0 or
H11/H12 rate;
(d) that the existence of different digital hierarchies and different television standards in different parts of the world
complicates the problems of specifying coding and transmission standards for international connections;
(e) that a number of audiovisual services are likely to appear using basic and primary rate ISDN accesses and that some
means of intercommunication among these terminals should be possible;
(f) that the video codec provides an essential element of the infrastructure for audiovisual services which allows such  
intercommunication in the framework of Recommendation H.200;  
(g) that Recommendation H.120 for videoconferencing using primary digital group transmission was the first in an  
evolving series of Recommendations,  
appreciating  
that advances have been made in research and development of video coding and bit rate reduction techniques which lead to  
the use of lower bit rates down to 64 kbit/s so that this may be considered as the second in the evolving series of  
Recommendations,  
and noting  
that it is the basic objective of the CCITT to recommend unique solutions for international connections,  
recommends  
that in addition to those codecs complying to Recommendation H.120, codecs having signal processing and transmission  
coding characteristics described below should be used for international audiovisual services.  
NOTES  
1 Codecs of this type are also suitable for some television services where full broadcast quality is not required.
2 Equipment for transcoding from and to codecs according to Recommendation H.120 is under study.
1 Scope
This Recommendation describes the video coding and decoding methods for the moving picture component of audiovisual  
services at the rates of p × 64 kbit/s, where p is in the range 1 to 30.  
2 Brief specification
An outline block diagram of the codec is given in Figure 1.  
[Figure: block diagram. a) Video coder: video signal, source coder, video multiplex coder, transmission buffer, transmission coder, coded bit stream, with coding control and external control. b) Video decoder: receiving decoder, receiving buffer, video multiplex decoder, source decoder.]
FIGURE 1/H.261
Outline block diagram of the video codec
2.1 Video input and output
To permit a single Recommendation to cover use in and between regions using 625- and 525-line television standards, the  
source coder operates on pictures based on a common intermediate format (CIF). The standards of the input and output  
television signals, which may, for example, be composite or component, analogue or digital and the methods of performing  
any necessary conversion to and from the source coding format are not subject to Recommendation.  
2.2 Digital output and input
The video coder provides a self-contained digital bit stream which may be combined with other multi-facility signals (for  
example as defined in Recommendation H.221). The video decoder performs the reverse process.  
2.3 Sampling frequency
Pictures are sampled at an integer multiple of the video line rate. This sampling clock and the digital network clock are  
asynchronous.  
2.4 Source coding algorithm
A hybrid of inter-picture prediction to utilize temporal redundancy and transform coding of the remaining signal to reduce  
spatial redundancy is adopted. The decoder has motion compensation capability, allowing optional incorporation of this  
technique in the coder.  
2.5 Bit rate
This Recommendation is primarily intended for use at video bit rates between approximately 40 kbit/s and 2 Mbit/s.  
2.6 Symmetry of transmission
The codec may be used for bidirectional or unidirectional visual communication.  
2.7 Error handling
The transmitted bit stream contains a BCH (Bose, Chaudhuri and Hocquenghem) (511,493) forward error correction
code. Use of this by the decoder is optional.
2.8 Multipoint operation
Features necessary to support switched multipoint operation are included.  
3 Source coder
3.1 Source format
The source coder operates on non-interlaced pictures occurring 30 000/1001 (approximately 29.97) times per second. The  
tolerance on picture frequency is ± 50 ppm.  
Pictures are coded as luminance and two colour difference components (Y, CB and CR). These components and the codes
representing their sampled values are as defined in CCIR Recommendation 601.
Black = 16  
White = 235  
Zero colour difference = 128  
Peak colour difference = 16 and 240.  
These values are nominal ones and the coding algorithm functions with input values of 1 through to 254.  
Two picture scanning formats are specified.  
In the first format (CIF), the luminance sampling structure is 352 pels per line, 288 lines per picture in an orthogonal  
arrangement. Sampling of each of the two colour difference components is at 176 pels per line, 144 lines per picture,  
orthogonal. Colour difference samples are sited such that their block boundaries coincide with luminance block boundaries as  
shown in Figure 2. The picture area covered by these numbers of pels and lines has an aspect ratio of 4:3 and corresponds to  
the active portion of the local standard video input.  
NOTE – The number of pels per line is compatible with sampling the active portions of the luminance and colour difference  
signals from 525- or 625-line sources at 6.75 and 3.375 MHz, respectively. These frequencies have a simple relationship to those in CCIR  
Recommendation 601.  
The second format, quarter-CIF (QCIF), has half the number of pels and half the number of lines stated above. All codecs  
must be able to operate using QCIF. Some codecs can also operate with CIF.  
Means shall be provided to restrict the maximum picture rate of encoders by having at least 0, 1, 2 or 3 non-transmitted  
pictures between transmitted ones. Selection of this minimum number and CIF or QCIF shall be by external means (for  
example via Recommendation H.221).  
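The figures in this subclause can be tabulated with a short sketch. The function names below are illustrative only and are not part of the Recommendation; the 16 by 16 macroblock size used in `macroblock_count` is taken from 4.2.3.

```python
from fractions import Fraction

# Luminance size (pels per line, lines per picture) from 3.1.
FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}

# Nominal picture frequency: 30 000/1001 pictures per second.
PICTURE_RATE = Fraction(30000, 1001)


def plane_sizes(fmt):
    """Return ((Y width, Y height), (C width, C height)) for a format.

    Each colour difference component has half the luminance resolution
    in both directions.
    """
    width, height = FORMATS[fmt]
    return (width, height), (width // 2, height // 2)


def macroblock_count(fmt):
    """Number of 16 x 16 luminance macroblocks in one picture."""
    width, height = FORMATS[fmt]
    return (width // 16) * (height // 16)


def max_picture_rate(min_skipped):
    """Maximum coded picture rate when at least `min_skipped` pictures
    (0, 1, 2 or 3) are left untransmitted between transmitted ones."""
    if min_skipped not in (0, 1, 2, 3):
        raise ValueError("min_skipped must be 0, 1, 2 or 3")
    return PICTURE_RATE / (min_skipped + 1)
```

For example, `max_picture_rate(1)` is 15 000/1001, roughly 14.99 pictures per second.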
3.2 Video source coding algorithm
The source coder is shown in generalized form in Figure 3. The main elements are prediction, block transformation and  
quantization.  
The prediction error (INTER mode) or the input picture (INTRA mode) is subdivided into 8 pel by 8 line blocks which are  
segmented as transmitted or non-transmitted. Further, four luminance blocks and the two spatially corresponding colour  
difference blocks are combined to form a macroblock as shown in Figure 10.  
[Figure: positions of luminance samples, chrominance samples and block edges.]
FIGURE 2/H.261
Positioning of luminance and chrominance samples
The criteria for choice of mode and transmitting a block are not subject to recommendation and may be varied dynamically as  
part of the coding control strategy. Transmitted blocks are transformed and resulting coefficients are quantized and variable  
length coded.  
3.2.1 Prediction
The prediction is inter-picture and may be augmented by motion compensation (see 3.2.2) and a spatial filter (see 3.2.3).
3.2.2 Motion compensation
Motion compensation (MC) is optional in the encoder. The decoder will accept one vector per macroblock. Both horizontal  
and vertical components of these motion vectors have integer values not exceeding ± 15. The vector is used for all four  
luminance blocks in the macroblock. The motion vector for both colour difference blocks is derived by halving the  
component values of the macroblock vector and truncating the magnitude parts towards zero to yield integer components.  
A positive value of the horizontal or vertical component of the motion vector signifies that the prediction is formed from pels  
in the previous picture which are spatially to the right or below the pels being predicted.  
Motion vectors are restricted such that all pels referenced by them are within the coded picture area.  
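The derivation of the colour difference vector can be sketched as follows. The function name is hypothetical. Note that Python's floor division (`//`) rounds towards minus infinity, so the sketch truncates with `int()` instead, which rounds towards zero as the text requires.

```python
def chroma_vector(mv_x, mv_y):
    """Derive the colour difference block vector from a macroblock vector.

    Each component is halved and the magnitude is truncated towards zero.
    """
    def halve(component):
        if not -15 <= component <= 15:
            raise ValueError("motion vector component outside +/-15")
        # int() truncates towards zero: int(-3.5) == -3, int(3.5) == 3.
        return int(component / 2)

    return halve(mv_x), halve(mv_y)
```

With floor division a component of -7 would give -4, whereas truncation towards zero gives -3.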
[Figure: block diagram of the source coder, from the video input to the video multiplex coder, built from the elements listed below.]
T   Transform
Q   Quantizer
P   Picture memory with motion compensated variable delay
F   Loop filter
CC  Coding control
p   Flag for INTRA/INTER
t   Flag for transmitted or not
qz  Quantizer indication
q   Quantizing index for transform coefficients
v   Motion vector
f   Switching on/off of the loop filter
FIGURE 3/H.261
Source coder
3.2.3 Loop filter
The prediction process may be modified by a two-dimensional spatial filter (FIL) which operates on pels within a predicted 8  
by 8 block.  
The filter is separable into one-dimensional horizontal and vertical functions. Both are non-recursive with coefficients of 1/4,  
1/2, 1/4 except at block edges where one of the taps would fall outside the block. In such cases the 1-D filter is changed to  
have coefficients of 0, 1, 0. Full arithmetic precision is retained with rounding to 8 bit integer values at the 2-D filter output.  
Values whose fractional part is one half are rounded up.  
The filter is switched on/off for all six blocks in a macroblock according to the macroblock type (see 4.2.3, MTYPE).  
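A minimal sketch of the filter described in this subclause, assuming the block is given as 8 rows of 8 non-negative integer pel values. The function name is illustrative. Integer arithmetic scaled by 4 in each direction keeps full precision until the single rounding step at the 2-D output.

```python
def loop_filter(block):
    """Apply the separable (1/4, 1/2, 1/4) filter to one 8 x 8 block.

    At block edges the 1-D filter becomes (0, 1, 0). Weights are scaled
    by 4 per pass, so the 2-D result is scaled by 16 and rounded once,
    with halves rounded up.
    """
    n = len(block)

    def pass_1d(line):
        out = []
        for i in range(n):
            if i == 0 or i == n - 1:
                out.append(4 * line[i])          # (0, 1, 0) scaled by 4
            else:
                out.append(line[i - 1] + 2 * line[i] + line[i + 1])
        return out

    # Horizontal pass on every row, then vertical pass on every column.
    horiz = [pass_1d(row) for row in block]
    cols = [pass_1d([horiz[y][x] for y in range(n)]) for x in range(n)]
    # Adding 8 before dividing by 16 rounds fractional halves up.
    return [[(cols[x][y] + 8) // 16 for x in range(n)] for y in range(n)]
```

A flat block passes through unchanged, and the four corner pels are never altered because both 1-D filters reduce to (0, 1, 0) there.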
3.2.4 Transformer  
Transmitted blocks are first processed by a separable two-dimensional discrete cosine transform of size 8 by 8. The output  
from the inverse transform ranges from –256 to +255 after clipping to be represented with 9 bits. The transfer function of the  
inverse transform is given by:  
$$f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, F(u,v) \cos\left[\frac{\pi (2x+1) u}{16}\right] \cos\left[\frac{\pi (2y+1) v}{16}\right]$$
with
u, v, x, y = 0, 1, 2, . . ., 7
where
x, y = spatial coordinates in the pel domain,
u, v = coordinates in the transform domain,
$C(u) = 1/\sqrt{2}$ for u = 0; otherwise 1,
$C(v) = 1/\sqrt{2}$ for v = 0; otherwise 1.
NOTE – Within the block being transformed, x = 0 and y = 0 refer to the pel nearest the left and top edges of the picture,  
respectively.  
The arithmetic procedures for computing the transforms are not defined, but the inverse one should meet the error tolerance  
specified in Annex A.  
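The inverse transform formula can be evaluated directly, as in the floating-point sketch below. This is a reference-style illustration, not a procedure defined by the Recommendation, and it is not claimed to satisfy Annex A. The array layout (`F[v][u]` for coefficients, `out[y][x]` for pels) and the rounding to the nearest integer before clipping are assumptions.

```python
import math


def idct_8x8(F):
    """Direct evaluation of the 8 x 8 inverse DCT formula.

    F[v][u] holds the coefficient F(u, v); the result out[y][x] holds
    f(x, y), rounded to the nearest integer (an assumption) and clipped
    to the 9-bit range -256..+255.
    """
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            total = 0.0
            for v in range(8):
                for u in range(8):
                    total += (c(u) * c(v) * F[v][u]
                              * math.cos(math.pi * (2 * x + 1) * u / 16)
                              * math.cos(math.pi * (2 * y + 1) * v / 16))
            value = math.floor(total / 4 + 0.5)
            out[y][x] = max(-256, min(255, value))
    return out
```

A block containing only the coefficient F(0,0) = 8 yields a flat output of 1, since (1/4) · (1/2) · 8 = 1.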
3.2.5 Quantization
The number of quantizers is 1 for the INTRA dc coefficient and 31 for all other coefficients. Within a macroblock the same  
quantizer is used for all coefficients except the INTRA dc one. The decision levels are not defined. The INTRA dc  
coefficient is nominally the transform value linearly quantized with a stepsize of 8 and no dead-zone. Each of the other 31  
quantizers is also nominally linear but with a central dead-zone around zero and with a step size of an even value in the range  
2 to 62.  
The reconstruction levels are as defined in 4.2.4.  
NOTE – For the smaller quantization step sizes, the full dynamic range of the transform coefficients cannot be represented.  
3.2.6 Clipping of reconstructed picture
To prevent quantization distortion of transform coefficient amplitudes causing arithmetic overflow in the encoder and  
decoder loops, clipping functions are inserted. The clipping function is applied to the reconstructed picture which is formed  
by summing the prediction and the prediction error as modified by the coding process. This clipper operates on resulting pel  
values less than 0 or greater than 255, changing them to 0 and 255, respectively.  
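As a small illustration of the clipper, assuming integer pel values and a hypothetical function name:

```python
def reconstruct_pel(prediction, error):
    """Sum the prediction and the decoded prediction error, then clip
    the result to the range 0..255."""
    return max(0, min(255, prediction + error))
```

Because the same clipping is applied in the encoder and decoder loops, both sides hold identical reconstructed pictures.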
3.3 Coding control
Several parameters may be varied to control the rate of generation of coded video data. These include processing prior to the  
source coder, the quantizer, block significance criterion and temporal sub-sampling. The proportions of such measures in the  
overall control strategy are not subject to recommendation.  
When invoked, temporal sub-sampling is performed by discarding complete pictures.  
3.4 Forced updating
This function is achieved by forcing the use of the INTRA mode of the coding algorithm. The update pattern is not defined.  
For control of accumulation of inverse transform mismatch error a macroblock should be forcibly updated at least once per  
every 132 times it is transmitted.  
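One possible way for an encoder to honour this limit is a per-macroblock counter, sketched below. The class and its interface are hypothetical, since the update pattern is not defined by this Recommendation. The sketch reads the rule as requiring at least one INTRA transmission in any 132 consecutive transmissions of a macroblock.

```python
class ForcedUpdateTracker:
    """Count non-INTRA transmissions of each macroblock since its last
    INTRA update, and flag when the next transmission must be INTRA."""

    LIMIT = 132

    def __init__(self, num_macroblocks):
        self.count = [0] * num_macroblocks

    def must_force_intra(self, mb):
        # After LIMIT - 1 non-INTRA transmissions, the next one has to
        # be INTRA to keep one update per LIMIT transmissions.
        return self.count[mb] >= self.LIMIT - 1

    def record(self, mb, intra):
        """Call once each time macroblock `mb` is transmitted."""
        self.count[mb] = 0 if intra else self.count[mb] + 1
```

Macroblocks that are not transmitted do not advance the counter, because the limit is expressed in transmissions rather than in pictures.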
4 Video multiplex coder
4.1 Data structure
Unless specified otherwise the most significant bit is transmitted first. This is bit 1 and is the leftmost bit in the code tables in  
this Recommendation. Unless specified otherwise all unused or spare bits are set to “1”. Spare bits must not be used until  
their functions are specified by the CCITT.  
4.2 Video multiplex arrangement
The video multiplex is arranged in a hierarchical structure with four layers. From top to bottom the layers are:  
Picture;
Group of blocks (GOB);  
Macroblock (MB);  
Block.  
A syntax diagram of the video multiplex coder is shown in Figure 4. Abbreviations are defined in later subclauses.  
4.2.1 Picture layer  
Data for each picture consists of a picture header followed by data for GOBs. The structure is shown in Figure 5. Picture  
headers for dropped pictures are not transmitted.  
4.2.1.1 Picture start code (PSC) (20 bits)  
A word of 20 bits. Its value is 0000 0000 0000 0001 0000.  
4.2.1.2 Temporal reference (TR) (5 bits)  
A 5-bit number which can have 32 possible values. It is formed by incrementing its value in the previously transmitted picture  
header by one plus the number of non-transmitted pictures (at 29.97 Hz) since that last transmitted one. The arithmetic is  
performed with only the five LSBs.  
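The temporal reference update amounts to addition modulo 32. A minimal sketch, with an illustrative function name:

```python
def next_temporal_reference(previous_tr, skipped):
    """Return TR for the next transmitted picture.

    `previous_tr` is the TR of the previously transmitted picture and
    `skipped` is the number of non-transmitted pictures since then.
    Only the five least significant bits are kept.
    """
    return (previous_tr + 1 + skipped) & 0x1F
```

For example, a previous TR of 30 followed by three skipped pictures gives (30 + 1 + 3) mod 32 = 2.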
4.2.1.3 Type information (PTYPE) (6 bits)  
Information about the complete picture:  
Bit 1  Split screen indicator, “0” off, “1” on;
Bit 2  Document camera indicator, “0” off, “1” on;
Bit 3  Freeze picture release, “0” off, “1” on;
Bit 4  Source format, “0” QCIF, “1” CIF;
Bit 5  Optional still image mode HI_RES defined in Annex D; “0” on, “1” off;
Bit 6  Spare.
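The PTYPE field can be unpacked as sketched below, assuming the six bits arrive as an integer whose most significant bit is bit 1 (see 4.1). The function and dictionary key names are illustrative.

```python
def parse_ptype(ptype):
    """Decode the 6-bit PTYPE field into named flags.

    Bit 1 is the most significant of the six bits. Bit 5 is inverted
    relative to the others: "0" means the still image mode is on.
    """
    if not 0 <= ptype < 64:
        raise ValueError("PTYPE is a 6-bit field")

    def bit(n):
        return (ptype >> (6 - n)) & 1

    return {
        "split_screen": bit(1) == 1,
        "document_camera": bit(2) == 1,
        "freeze_picture_release": bit(3) == 1,
        "source_format": "CIF" if bit(4) else "QCIF",
        "hi_res_still_image": bit(5) == 0,
        "spare": bit(6),
    }
```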
4.2.1.4 Extra insertion information (PEI) (1 bit)  
A bit which when set to “1” signals the presence of the following optional data field.  
4.2.1.5 Spare information (PSPARE) (0/8/16 . . . bits)  
If PEI is set to “1”, then 9 bits follow consisting of 8 bits of data (PSPARE) and then another PEI bit to indicate if a further 9  
bits follow and so on. Encoders must not insert PSPARE until specified by the CCITT. Decoders must be designed to discard  
PSPARE if PEI is set to 1. This will allow the CCITT to specify future backward compatible additions in PSPARE.  
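The PEI/PSPARE chain can be walked as in the sketch below. For readability the bit stream is represented as a string of "0" and "1" characters, which is an assumption of this example rather than anything defined by the Recommendation. The same loop applies to GEI/GSPARE in the GOB header.

```python
def skip_spare(bits, pos):
    """Consume the extra insertion chain starting at the first PEI bit.

    Returns (position after the final PEI bit, list of spare bytes).
    A decoder discards the spare bytes; they are returned here only to
    make the example easy to inspect.
    """
    spare = []
    while bits[pos] == "1":
        # PEI = 1: eight data bits follow, then another PEI bit.
        spare.append(int(bits[pos + 1:pos + 9], 2))
        pos += 9
    return pos + 1, spare
```

With the bits `1 10101010 1 00001111 0`, the function skips two spare bytes and returns position 19.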
[Figure: syntax diagram distinguishing fixed length and variable length fields. Picture layer: PSC, TR, PTYPE, PEI, PSPARE, GOB layer. GOB layer: GBSC, GN, GQUANT, GEI, GSPARE, MB layer. MB layer: MBA, MBA stuffing, MTYPE, MQUANT, MVD, CBP, block layer. Block layer: TCOEFF, EOB.]
FIGURE 4/H.261
Syntax diagram for the video multiplex coder
PSC | TR | PTYPE | PEI | PSPARE | PEI | GOB data
FIGURE 5/H.261
Structure of picture layer
4.2.2 Group of blocks layer
Each picture is divided into groups of blocks (GOBs). A group of blocks (GOB) comprises one twelfth of the CIF or one  
third of the QCIF picture areas (see Figure 6). A GOB relates to 176 pels by 48 lines of Y and the spatially corresponding 88  
pels by 24 lines of each of CB and CR.
Data for each group of blocks consists of a GOB header followed by data for macroblocks. The structure is shown in Figure  
7. Each GOB header is transmitted once between picture start codes in the CIF or QCIF sequence numbered in Figure 6, even  
if no macroblock data is present in that GOB.  
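The position of a GOB within the picture follows from its number. The sketch below assumes the Figure 6 numbering: in CIF, GOBs 1 to 12 run in two columns, left to right and then top to bottom; in QCIF, GOBs 1, 3 and 5 are stacked vertically. The function name is illustrative.

```python
def gob_origin(gn, fmt):
    """Return the (x, y) luminance coordinates of the top-left pel of
    GOB number `gn`. Each GOB covers 176 pels by 48 lines of Y."""
    valid = range(1, 13) if fmt == "CIF" else (1, 3, 5)
    if fmt not in ("CIF", "QCIF") or gn not in valid:
        raise ValueError("invalid GOB number for this format")
    index = gn - 1
    # Odd numbers fall in the left column, even numbers in the right
    # column; QCIF uses only the odd numbers, so x is always 0 there.
    return (index % 2) * 176, (index // 2) * 48
```

The same expression serves both formats because the QCIF numbering reuses the left-column numbers of the CIF layout.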
 1   2
 3   4          1
 5   6          3
 7   8          5
 9  10
11  12
 CIF          QCIF
FIGURE 6/H.261
Arrangement of GOBs in a picture
GBSC | GN | GQUANT | GEI | GSPARE | GEI | MB data
FIGURE 7/H.261
Structure of group of blocks layer
4.2.2.1 Group of blocks start code (GBSC) (16 bits)  
A word of 16 bits, 0000 0000 0000 0001.  
4.2.2.2 Group number (GN) (4 bits)  
Four bits indicating the position of the group of blocks. The bits are the binary representation of the number in Figure 6.  
Group numbers 13, 14 and 15 are reserved for future use. Group number 0 is used in the PSC.  
4.2.2.3 Quantizer information (GQUANT) (5 bits)  
A fixed length codeword of 5 bits which indicates the quantizer to be used in the group of blocks until overridden by any  
subsequent MQUANT. The codewords are the natural binary representations of the values of QUANT (see 4.2.4) which,  
being half the step sizes, range from 1 to 31.  
4.2.2.4 Extra insertion information (GEI) (1 bit)  
A bit which when set to “1” signals the presence of the following optional data field.  
4.2.2.5 Spare information (GSPARE) (0/8/16 . . . bits)  
If GEI is set to “1”, then 9 bits follow consisting of 8 bits of data (GSPARE) and then another GEI bit to indicate if a further  
9 bits follow and so on. Encoders must not insert GSPARE until specified by the CCITT. Decoders must be designed to  
discard GSPARE if GEI is set to 1. This will allow the CCITT to specify future “backward” compatible additions in  
GSPARE.  
NOTE – Emulation of start codes may occur if the future specification of GSPARE has no restrictions on the final GSPARE  
data bits.  
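The fixed-length part of the GOB header can be parsed as sketched below. The bit stream is represented as a string of "0" and "1" characters for readability, which is an assumption of this example, and the function name is illustrative.

```python
GBSC = "0000000000000001"  # 16-bit group of blocks start code


def parse_gob_header(bits, pos=0):
    """Parse GBSC, GN, GQUANT and the GEI/GSPARE chain.

    Returns (header dict, position of the first bit after the header).
    """
    if bits[pos:pos + 16] != GBSC:
        raise ValueError("no GOB start code at this position")
    pos += 16
    gn = int(bits[pos:pos + 4], 2)
    pos += 4
    if gn == 0:
        raise ValueError("group number 0 marks a picture start code")
    if gn > 12:
        raise ValueError("group numbers 13 to 15 are reserved")
    gquant = int(bits[pos:pos + 5], 2)
    pos += 5
    if gquant == 0:
        raise ValueError("QUANT ranges from 1 to 31")
    # Discard GSPARE: each GEI = 1 is followed by 8 data bits and
    # another GEI bit.
    while bits[pos] == "1":
        pos += 9
    pos += 1
    # QUANT is half the quantizer step size.
    return {"gn": gn, "gquant": gquant, "step_size": 2 * gquant}, pos
```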
4.2.3 Macroblock layer
Each GOB is divided into 33 macroblocks as shown in Figure 8. A macroblock relates to 16 pels by 16 lines of Y and the  
spatially corresponding 8 pels by 8 lines of each of CB and CR.
Data for a macroblock consists of an MB header followed by data for blocks (see Figure 9). MQUANT, MVD and CBP are  
present when indicated by MTYPE.  
 1  2  3  4  5  6  7  8  9 10 11
12 13 14 15 16 17 18 19 20 21 22
23 24 25 26 27 28 29 30 31 32 33
FIGURE 8/H.261
Arrangement of macroblocks in a GOB
MBA | MTYPE | MQUANT | MVD | CBP | Block data
FIGURE 9/H.261
Structure of macroblock layer
4.2.3.1 Macroblock address (MBA) (Variable length)  
A variable length codeword indicating the position of a macroblock within a group of blocks. The transmission order is as  
shown in Figure 8. For the first transmitted macroblock in a GOB, MBA is the absolute address in Figure 8. For subsequent  
macroblocks, MBA is the difference between the absolute addresses of the macroblock and the last transmitted macroblock.  
The code table for MBA is given in Table 1.  
An extra codeword is available in the table for bit stuffing immediately after a GOB header or a coded macroblock (MBA  
stuffing). This codeword should be discarded by decoders.  
The VLC for start code is also shown in Table 1.  
MBA is always included in transmitted macroblocks.  
Macroblocks are not transmitted when they contain no information for that part of the picture.  
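The differential addressing can be undone with a running sum, as in the sketch below. It assumes the MBA codewords have already been decoded to integers and that the macroblocks of a GOB are laid out in 3 rows of 11, as in Figure 8. Function names are illustrative.

```python
def absolute_addresses(mba_values):
    """Convert the decoded MBA values of one GOB to absolute addresses.

    The first MBA is an absolute address; each later one is the
    difference from the previously transmitted macroblock.
    """
    address = 0
    result = []
    for mba in mba_values:
        address += mba
        if not 1 <= address <= 33:
            raise ValueError("macroblock address outside 1..33")
        result.append(address)
    return result


def macroblock_origin(address):
    """Return (x, y) of the macroblock's top-left luminance pel,
    relative to the top-left pel of its GOB."""
    index = address - 1
    return (index % 11) * 16, (index // 11) * 16
```

For example, the MBA sequence 3, 1, 1, 5 addresses macroblocks 3, 4, 5 and 10; macroblocks 6 to 9 are not transmitted.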
TABLE 1/H.261
VLC table for macroblock addressing
MBA  Code               MBA            Code
 1   1                  17             0000 0101 10
 2   011                18             0000 0101 01
 3   010                19             0000 0101 00
 4   0011               20             0000 0100 11
 5   0010               21             0000 0100 10
 6   0001 1             22             0000 0100 011
 7   0001 0             23             0000 0100 010
 8   0000 111           24             0000 0100 001
 9   0000 110           25             0000 0100 000
10   0000 1011          26             0000 0011 111
11   0000 1010          27             0000 0011 110
12   0000 1001          28             0000 0011 101
13   0000 1000          29             0000 0011 100
14   0000 0111          30             0000 0011 011
15   0000 0110          31             0000 0011 010
16   0000 0101 11       32             0000 0011 001
                        33             0000 0011 000
                        MBA stuffing   0000 0001 111
                        Start code     0000 0000 0000 0001
4.2.3.2 Type information (MTYPE) (Variable length)  
Variable length codewords giving information about the macroblock and which data elements are present. Macroblock types,  
included elements and VLC words are listed in Table 2.  
MTYPE is always included in transmitted macroblocks.  
4.2.3.3 Quantizer (MQUANT) (5 bits)  
MQUANT is present only if so indicated by MTYPE.  
A codeword of 5 bits signifying the quantizer to be used for this and any following blocks in the group of blocks until  
overridden by any subsequent MQUANT.  
Codewords for MQUANT are the same as for GQUANT.  
TABLE 2/H.261
VLC table for MTYPE
Prediction         MQUANT  MVD  CBP  TCOEFF  VLC
Intra                                x       0001
Intra              x                 x       0000 001
Inter                           x    x       1
Inter              x            x    x       0000 1
Inter + MC                 x                 0000 0000 1
Inter + MC                 x    x    x       0000 0001
Inter + MC         x       x    x    x       0000 0000 01
Inter + MC + FIL           x                 001
Inter + MC + FIL           x    x    x       01
Inter + MC + FIL   x       x    x    x       0000 01
NOTES
1 “x” means that the item is present in the macroblock.
2 It is possible to apply the filter in a non-motion compensated macroblock by declaring it as MC + FIL but
with a zero vector.
4.2.3.4 Motion vector data (MVD) (Variable length)  
Motion vector data is included for all MC macroblocks. MVD is obtained from the macroblock vector by subtracting the  
vector of the preceding macroblock. For this calculation the vector of the preceding macroblock is regarded as zero in the  
following three situations:  
1) evaluating MVD for macroblocks 1, 12 and 23;  
2) evaluating MVD for macroblocks in which MBA does not represent a difference of 1;  
3) MTYPE of the previous macroblock was not MC.  
MVD consists of a variable length codeword for the horizontal component followed by a variable length codeword for the  
vertical component. Variable length codes are given in Table 3.  
Advantage is taken of the fact that the range of motion vector values is constrained. Each VLC word represents a pair of  
difference values. Only one of the pair will yield a macroblock vector falling within the permitted range.  
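The pairing works because the two candidate differences are 32 apart, while valid vector components span only 31 values (−15 to +15). The sketch below shows one component; it assumes the codeword has already been mapped to the first value of its pair (in the range −16 to 15), and the function names are illustrative.

```python
def mvd_predictor(prev_vector, mb_address, mba, prev_was_mc):
    """Vector of the preceding macroblock, or zero in the three
    situations listed in this subclause."""
    if mb_address in (1, 12, 23) or mba != 1 or not prev_was_mc:
        return (0, 0)
    return prev_vector


def encode_mvd_component(pred, value):
    """Map the true difference (value - pred) onto the table range
    -16..15 by adding or subtracting 32 when necessary."""
    diff = value - pred
    if diff > 15:
        diff -= 32
    elif diff < -16:
        diff += 32
    return diff


def decode_mvd_component(pred, diff):
    """Recover the vector component: of the two candidates, only one
    lies within -15..+15."""
    value = pred + diff
    if value > 15:
        value -= 32
    elif value < -15:
        value += 32
    if not -15 <= value <= 15:
        raise ValueError("no candidate falls within the permitted range")
    return value
```

For example, with a predictor of 10 and a true component of −12, the difference −22 shares a codeword with 10; the decoder computes 10 + 10 = 20, finds it out of range, and subtracts 32 to obtain −12.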
4.2.3.5 Coded block pattern (CBP) (Variable length)  
CBP is present if indicated by MTYPE. The codeword gives a pattern number signifying those blocks in the macroblock for  
which at least one transform coefficient is transmitted. The pattern number is given by:  
$$32 \cdot P_1 + 16 \cdot P_2 + 8 \cdot P_3 + 4 \cdot P_4 + 2 \cdot P_5 + P_6$$
where $P_n$ = 1 if any coefficient is present for block n, else 0. Block numbering is given in Figure 10.
The codewords for CBP are given in Table 4.  
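The pattern number is a 6-bit mask with block 1 in the most significant position. A minimal sketch, with illustrative function names:

```python
def cbp_from_blocks(coded):
    """Compute the pattern number from six flags, one per block in the
    order 1..6; a true flag means the block has coefficient data."""
    if len(coded) != 6:
        raise ValueError("a macroblock has six blocks")
    return sum(int(bool(flag)) << (5 - i) for i, flag in enumerate(coded))


def blocks_from_cbp(cbp):
    """Inverse mapping: list of six booleans for blocks 1..6."""
    if not 0 <= cbp <= 63:
        raise ValueError("pattern number is a 6-bit value")
    return [bool((cbp >> (5 - i)) & 1) for i in range(6)]
```

A pattern number of 60 (32 + 16 + 8 + 4) therefore means that blocks 1 to 4 are coded and blocks 5 and 6 are not.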
TABLE 3/H.261
VLC table for MVD
MVD         Code
–16 & 16    0000 0011 001
–15 & 17    0000 0011 011
–14 & 18    0000 0011 101
–13 & 19    0000 0011 111
–12 & 20    0000 0100 001
–11 & 21    0000 0100 011
–10 & 22    0000 0100 11
 –9 & 23    0000 0101 01
 –8 & 24    0000 0101 11
 –7 & 25    0000 0111
 –6 & 26    0000 1001
 –5 & 27    0000 1011
 –4 & 28    0000 111
 –3 & 29    0001 1
 –2 & 30    0011
 –1         011
  0         1
  1         010
  2 & –30   0010
  3 & –29   0001 0
  4 & –28   0000 110
  5 & –27   0000 1010
  6 & –26   0000 1000
  7 & –25   0000 0110
  8 & –24   0000 0101 10
  9 & –23   0000 0101 00
 10 & –22   0000 0100 10
 11 & –21   0000 0100 010
 12 & –20   0000 0100 000
 13 & –19   0000 0011 110
 14 & –18   0000 0011 100
 15 & –17   0000 0011 010
4.2.4 Block layer
A macroblock comprises four luminance blocks and one of each of the two colour difference blocks (see Figure 10).  
Data for a block consists of codewords for transform coefficients followed by an end of block marker (see Figure 11). The  
order of block transmission is as in Figure 10.  
4.2.4.1 Transform coefficients (TCOEFF)  
Transform coefficient data is always present for all six blocks in a macroblock when MTYPE indicates INTRA. In other  
cases MTYPE and CBP signal which blocks have coefficient data transmitted for them. The quantized transform coefficients  
are sequentially transmitted according to the sequence given in Figure 12.  
The most commonly occurring combinations of successive zeros (RUN) and the following value (LEVEL) are encoded with  
variable length codes. Other combinations of (RUN, LEVEL) are encoded with a 20-bit word consisting of 6 bits ESCAPE, 6  
bits RUN and 8 bits LEVEL. For the variable length encoding there are two code tables, one being used for the first  
transmitted LEVEL in INTER, INTER+MC and INTER+MC+FIL blocks, the second for all other LEVELs except the first  
one in INTRA blocks which is fixed length coded with 8 bits.  
TABLE 4/H.261
VLC table for CBP

CBP   Code            CBP   Code
60    111             35    0001 1100
4     1101            13    0001 1011
8     1100            49    0001 1010
16    1011            21    0001 1001
32    1010            41    0001 1000
12    1001 1          14    0001 0111
48    1001 0          50    0001 0110
20    1000 1          22    0001 0101
40    1000 0          42    0001 0100
28    0111 1          15    0001 0011
44    0111 0          51    0001 0010
52    0110 1          23    0001 0001
56    0110 0          43    0001 0000
1     0101 1          25    0000 1111
61    0101 0          37    0000 1110
2     0100 1          26    0000 1101
62    0100 0          38    0000 1100
24    0011 11         29    0000 1011
36    0011 10         45    0000 1010
3     0011 01         53    0000 1001
63    0011 00         57    0000 1000
5     0010 111        30    0000 0111
9     0010 110        46    0000 0110
17    0010 101        54    0000 0101
33    0010 100        58    0000 0100
6     0010 011        31    0000 0011 1
10    0010 010        47    0000 0011 0
18    0010 001        55    0000 0010 1
34    0010 000        59    0000 0010 0
7     0001 1111       27    0000 0001 1
11    0001 1110       39    0000 0001 0
19    0001 1101
1  2
3  4         5         6
  Y          CB        CR

FIGURE 10/H.261
Arrangement of blocks in a macroblock
TCOEFF   EOB

FIGURE 11/H.261
Structure of block layer
Columns, left to right: increasing cycles per picture width
Rows, top to bottom: increasing cycles per picture height

 1   2   6   7  15  16  28  29
 3   5   8  14  17  27  30  43
 4   9  13  18  26  31  42  44
10  12  19  25  32  41  45  54
11  20  24  33  40  46  53  55
21  23  34  39  47  52  56  61
22  35  38  48  51  57  60  62
36  37  49  50  58  59  63  64

FIGURE 12/H.261
Transmission order for transform coefficients
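The transmission order of Figure 12 follows a regular diagonal scan, so it can be generated rather than tabulated. The following C fragment is a minimal sketch of one way to do so. The function name h261_scan_order and the [row][column] indexing, with the column index following increasing cycles per picture width, are assumptions made for illustration and are not part of this Recommendation.

```c
#include <assert.h>

/* Fill order[v][u] with the 1-based transmission position of the
 * coefficient in row v (vertical frequency) and column u (horizontal
 * frequency). The name and layout are illustrative assumptions. */
void h261_scan_order(int order[8][8])
{
    int v = 0, u = 0;               /* start at the dc coefficient */

    assert(order != 0);
    for (int n = 1; n <= 64; n++) {
        order[v][u] = n;
        if ((v + u) % 2 == 0) {     /* even diagonal: travel up and right */
            if (u == 7)
                v++;
            else if (v == 0)
                u++;
            else {
                v--;
                u++;
            }
        } else {                    /* odd diagonal: travel down and left */
            if (v == 7)
                u++;
            else if (u == 0)
                v++;
            else {
                v++;
                u--;
            }
        }
    }
}
```

A decoder would normally invert this mapping once at start-up, so that the n-th decoded coefficient can be written directly to its position in the block.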
Codes are given in Table 5.
The most commonly occurring combinations of zero-run and the following value are encoded with variable length codes as
listed in Table 5. End of block (EOB) is in this set. Because CBP indicates those blocks with no coefficient data, EOB
cannot occur as the first coefficient. Hence EOB can be removed from the VLC table for the first coefficient.
The last bit “s” denotes the sign of the level, “0” for positive and “1” for negative.
The remaining combinations of (run, level) are encoded with a 20-bit word consisting of 6 bits escape, 6 bits run and 8 bits
level. Use of this 20-bit word for encoding the combinations listed in the VLC table is not prohibited.
TABLE 5/H.261
VLC table for TCOEFF

Run      Level   Code
EOB              10
0        1       1s a)                  If first coefficient in block
0        1       11s                    Not first coefficient in block
0        2       0100 s
0        3       0010 1s
0        4       0000 110s
0        5       0010 0110 s
0        6       0010 0001 s
0        7       0000 0010 10s
0        8       0000 0001 1101 s
0        9       0000 0001 1000 s
0        10      0000 0001 0011 s
0        11      0000 0001 0000 s
0        12      0000 0000 1101 0s
0        13      0000 0000 1100 1s
0        14      0000 0000 1100 0s
0        15      0000 0000 1011 1s
1        1       011s
1        2       0001 10s
1        3       0010 0101 s
1        4       0000 0011 00s
1        5       0000 0001 1011 s
1        6       0000 0000 1011 0s
1        7       0000 0000 1010 1s
2        1       0101 s
2        2       0000 100s
2        3       0000 0010 11s
2        4       0000 0001 0100 s
2        5       0000 0000 1010 0s
3        1       0011 1s
3        2       0010 0100 s
3        3       0000 0001 1100 s
3        4       0000 0000 1001 1s
4        1       0011 0s
4        2       0000 0011 11s
4        3       0000 0001 0010 s
5        1       0001 11s
5        2       0000 0010 01s
5        3       0000 0000 1001 0s
6        1       0001 01s
6        2       0000 0001 1110 s
7        1       0001 00s
7        2       0000 0001 0101 s
8        1       0000 111s
8        2       0000 0001 0001 s
9        1       0000 101s
9        2       0000 0000 1000 1s
10       1       0010 0111 s
10       2       0000 0000 1000 0s
11       1       0010 0011 s
12       1       0010 0010 s
13       1       0010 0000 s
14       1       0000 0011 10s
15       1       0000 0011 01s
16       1       0000 0010 00s
17       1       0000 0001 1111 s
18       1       0000 0001 1010 s
19       1       0000 0001 1001 s
20       1       0000 0001 0111 s
21       1       0000 0001 0110 s
22       1       0000 0000 1111 1s
23       1       0000 0000 1111 0s
24       1       0000 0000 1110 1s
25       1       0000 0000 1110 0s
26       1       0000 0000 1101 1s
Escape           0000 01

a) Never used in INTRA macroblocks.
Run is a 6 bit fixed length code        Level is an 8 bit fixed length code

Run     Code                            Level   Code
0       0000 00                         –128    FORBIDDEN
1       0000 01                         –127    1000 0001
2       0000 10                         ...     ...
...     ...                             –2      1111 1110
63      1111 11                         –1      1111 1111
                                        0       FORBIDDEN
                                        1       0000 0001
                                        2       0000 0010
                                        ...     ...
                                        127     0111 1111
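The 20-bit escape word can be assembled directly from the fixed length codes above. The following C fragment is a minimal sketch, assuming the word is held right-aligned in a 32-bit integer with the ESCAPE code in the most significant of the 20 bits. The function name h261_escape_word and the convention of returning 0 for inputs that have no valid code are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Build the 20-bit escape word: 6 bits ESCAPE (0000 01), 6 bits RUN,
 * 8 bits LEVEL in two's complement. Returns 0 when run or level has
 * no valid code (an assumed error convention for this sketch). */
uint32_t h261_escape_word(int run, int level)
{
    uint32_t word;

    if (run < 0 || run > 63)
        return 0;
    if (level == 0 || level < -127 || level > 127)
        return 0;                       /* 0000 0000 and 1000 0000 are forbidden */

    word = (UINT32_C(1) << 14)          /* ESCAPE = 0000 01 in bits 19..14 */
         | ((uint32_t)run << 8)         /* RUN in bits 13..8 */
         | ((uint32_t)level & 0xFFu);   /* LEVEL in bits 7..0 */
    assert(word < (UINT32_C(1) << 20));
    return word;
}
```

For example, run 2 with level –1 gives the bit pattern 0000 01, 0000 10, 1111 1111.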
For all coefficients other than the INTRA dc one, the reconstruction levels (REC) are in the range –2048 to 2047 and are
given by clipping the results of the following formulas:
REC = QUANT • (2 • level + 1);        level > 0,   QUANT = “odd”
REC = QUANT • (2 • level – 1);        level < 0,   QUANT = “odd”
REC = QUANT • (2 • level + 1) – 1;    level > 0,   QUANT = “even”
REC = QUANT • (2 • level – 1) + 1;    level < 0,   QUANT = “even”
REC = 0;                              level = 0
NOTE – QUANT ranges from 1 to 31 and is transmitted by either GQUANT or MQUANT.
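The reconstruction rule can be expressed compactly in code. The following C fragment is a minimal sketch of the formulas and the clipping step. The function name h261_reconstruct is an illustrative assumption, and the sketch expects level in the range –127 to 127 as allowed by the level codes.

```c
#include <assert.h>

/* Reconstruction level for any coefficient other than the INTRA dc one.
 * level: -127..127, quant: 1..31. The result is clipped to -2048..2047. */
int h261_reconstruct(int level, int quant)
{
    int rec;

    assert(quant >= 1 && quant <= 31);
    assert(level >= -127 && level <= 127);

    if (level == 0)
        return 0;
    if (level > 0)
        rec = quant * (2 * level + 1);
    else
        rec = quant * (2 * level - 1);
    if (quant % 2 == 0)                 /* even QUANT: move one step towards zero */
        rec += (level > 0) ? -1 : 1;

    if (rec > 2047)
        rec = 2047;
    if (rec < -2048)
        rec = -2048;
    return rec;
}
```

With this rule every unclipped reconstruction level is odd, whether QUANT is odd or even.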
Reconstruction levels (REC)

                                            QUANT
Level        1       2       3       4   ...       8       9   ...      17      18   ...      30      31
–127      –255    –509    –765   –1019   ...   –2039   –2048   ...   –2048   –2048   ...   –2048   –2048
–126      –253    –505    –759   –1011   ...   –2023   –2048   ...   –2048   –2048   ...   –2048   –2048
...        ...     ...     ...     ...             ...     ...           ...     ...           ...     ...
–2          –5      –9     –15     –19   ...     –39     –45   ...     –85     –89   ...    –149    –155
–1          –3      –5      –9     –11   ...     –23     –27   ...     –51     –53   ...     –89     –93
0            0       0       0       0   ...       0       0   ...       0       0   ...       0       0
1            3       5       9      11   ...      23      27   ...      51      53   ...      89      93
2            5       9      15      19   ...      39      45   ...      85      89   ...     149     155
3            7      13      21      27   ...      55      63   ...     119     125   ...     209     217
4            9      17      27      35   ...      71      81   ...     153     161   ...     269     279
5           11      21      33      43   ...      87      99   ...     187     197   ...     329     341
...        ...     ...     ...     ...             ...     ...           ...     ...           ...     ...
56         113     225     339     451   ...     903    1017   ...    1921    2033   ...    2047    2047
57         115     229     345     459   ...     919    1035   ...    1955    2047   ...    2047    2047
58         117     233     351     467   ...     935    1053   ...    1989    2047   ...    2047    2047
59         119     237     357     475   ...     951    1071   ...    2023    2047   ...    2047    2047
60         121     241     363     483   ...     967    1089   ...    2047    2047   ...    2047    2047
...        ...     ...     ...     ...             ...     ...           ...     ...           ...     ...
125        251     501     753    1003   ...    2007    2047   ...    2047    2047   ...    2047    2047
126        253     505     759    1011   ...    2023    2047   ...    2047    2047   ...    2047    2047
127        255     509     765    1019   ...    2039    2047   ...    2047    2047   ...    2047    2047

NOTE – Reconstruction levels are symmetrical with respect to the sign of level except for 2047/–2048.
For INTRA blocks the first coefficient is nominally the transform dc value linearly quantized with a step size of 8 and no  
dead-zone. The resulting values are represented with 8 bits. A nominally black block will give 0001 0000 and a nominally  
white one 1110 1011. The code 0000 0000 is not used. The code 1000 0000 is not used, the reconstruction level of 1024  
being coded as 1111 1111 (see Table 6).  
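The mapping from the 8-bit INTRA dc code to its reconstruction level can be sketched as follows. The function name h261_intra_dc_level and the use of –1 to flag the two unused codes are illustrative assumptions.

```c
#include <assert.h>

/* Map the 8-bit INTRA dc code to its reconstruction level.
 * Returns -1 for the two unused codes, 0000 0000 and 1000 0000
 * (an assumed error convention for this sketch). */
int h261_intra_dc_level(unsigned flc)
{
    assert(flc <= 255u);
    if (flc == 0u || flc == 128u)
        return -1;                  /* codes not used */
    if (flc == 255u)
        return 1024;                /* 1111 1111 stands in for level 1024 */
    return 8 * (int)flc;            /* linear quantizer, step size 8 */
}
```

The nominal black code 0001 0000 (16) then yields 128 and the nominal white code 1110 1011 (235) yields 1880.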
Coefficients after the last non-zero one are not transmitted. EOB (end of block code) is always the last item in blocks for  
which coefficients are transmitted.  
4.3 Multipoint considerations
The following facilities are provided to support switched multipoint operation.  
4.3.1 Freeze picture request  
Causes the decoder to freeze its displayed picture until a freeze picture release signal is received or a timeout period of at  
least six seconds has expired. The transmission of this signal is via external means (for example by Recommendation H.221).  
4.3.2 Fast update request
Causes the encoder to encode its next picture in INTRA mode with coding parameters such as to avoid buffer overflow. The  
transmission method for this signal is via external means (for example by Recommendation H.221).  
TABLE 6/H.261
Reconstruction levels for INTRA-mode dc coefficient

FLC                     Reconstruction level into inverse transform
0000 0001 (1)           8
0000 0010 (2)           16
0000 0011 (3)           24
...                     ...
0111 1111 (127)         1016
1111 1111 (255)         1024
1000 0001 (129)         1032
...                     ...
1111 1101 (253)         2024
1111 1110 (254)         2032

NOTE – The decoded value corresponding to FLC “n” is 8n except FLC 255 gives 1024.
4.3.3 Freeze picture release
A signal from an encoder which has responded to a fast update request and allows a decoder to exit from its freeze picture  
mode and display decoded pictures in the normal manner. This signal is transmitted by bit 3 of PTYPE (see 4.2.1) in the  
picture header of the first picture coded in response to the fast update request.  
5 Transmission coder
5.1 Bit rate
The transmission clock is provided externally (for example from an I.420 interface).
5.2 Video data buffering
The encoder must control its output bitstream to comply with the requirements of the hypothetical reference decoder defined  
in Annex B.  
When operating with CIF the number of bits created by coding any single picture must not exceed 256 Kbits (K = 1024).
When operating with QCIF the number of bits created by coding any single picture must not exceed 64 Kbits.  
In both the above cases the bit count includes the picture start code and all other data related to that picture including  
PSPARE, GSPARE and MBA stuffing. The bit count does not include error correction framing bits, fill indicator (Fi), fill  
bits or error correction parity information described in 5.4.  
Video data must be provided on every valid clock cycle. This can be ensured by the use of either the fill indicator (Fi) and
subsequent fill bits (all 1s) in the error corrector block framing (see Figure 13) or MBA stuffing (see 4.2.3) or both.
Transmission order: S1 S2 S3 ... S7 S8, with (S1 S2 S3 S4 S5 S6 S7 S8) = (00011011)

Frame:            S (1 bit) | Data (493 bits) | Parity (18 bits)
Data with Fi = 1: Fi (1 bit) | Coded data (492 bits)
Data with Fi = 0: Fi (1 bit) | Fill, all “1” (492 bits)

FIGURE 13/H.261
Error correcting frame
5.3 Video coding delay
This item is included in this Recommendation because the video encoder and video decoder delays need to be known to allow  
audio compensation delays to be fixed when H.261 is used to form part of a conversational service. This will allow lip  
synchronization to be maintained. Annex C recommends a method by which the delay figures are established. Other delay  
measurement methods may be used but they must be designed in a way to produce similar results to the method given in  
Annex C.  
5.4 Forward error correction for coded video signal
5.4.1 Error correcting code
The transmitted bitstream contains a BCH (511,493) forward error correction code. Use of this by the decoder is optional.
5.4.2 Generator polynomial
g(x) = (x^{9} + x^{4} + 1)(x^{9} + x^{6} + x^{4} + x^{3} + 1)
Example: For the input data of “01111 . . . 11” (493 bits) the resulting correction parity bits are
“011011010100011011” (18 bits).
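Multiplying out the two factors of g(x) over GF(2) gives x^18 + x^15 + x^12 + x^10 + x^8 + x^7 + x^6 + x^3 + 1, and the 18 parity bits are the remainder of the data polynomial, multiplied by x^18, after division by g(x). The following C fragment is a minimal sketch of both steps. The function names, the one-bit-per-byte input layout and the assumption that the first transmitted bit is the highest-order coefficient are illustrative choices that the text above does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply two binary polynomials over GF(2); bit i of each
 * argument is the coefficient of x^i. */
uint32_t gf2_mul(uint32_t a, uint32_t b)
{
    uint32_t p = 0;

    while (b != 0) {
        if (b & 1u)
            p ^= a;
        a <<= 1;
        b >>= 1;
    }
    return p;
}

/* Remainder of data(x) * x^18 divided by g(x), computed bit-serially.
 * bits[] holds one bit per element, highest-order coefficient first
 * (an assumed ordering). g is the full 19-bit generator polynomial. */
uint32_t bch_parity(const unsigned char *bits, size_t nbits, uint32_t g)
{
    uint32_t reg = 0;                   /* 18-bit remainder register */

    assert(bits != NULL);
    for (size_t i = 0; i < nbits; i++) {
        uint32_t feedback = ((reg >> 17) & 1u) ^ (bits[i] & 1u);

        reg = (reg << 1) & 0x3FFFFu;
        if (feedback)
            reg ^= g & 0x3FFFFu;        /* g(x) without its x^18 term */
    }
    return reg;
}
```

Calling gf2_mul(0x211, 0x259) yields the generator 0x495C9. Running bch_parity over a complete 511-bit codeword (493 data bits followed by their 18 parity bits) returns zero, which gives a decoder a simple error check.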
5.4.3 Error correction framing
To allow the video data and error correction parity information to be identified by a decoder an error correction framing  
pattern is included. This consists of a multiframe of eight frames, each frame comprising 1 bit framing, 1 bit fill indicator  
(Fi), 492 bits of coded data (or fill all 1s) and 18 bits parity. The frame alignment pattern is:  
(S1 S2 S3 S4 S5 S6 S7 S8) = (00011011).
See Figure 13 for the frame arrangement. The parity is calculated against the 493-bits including fill indicator (Fi).  
The fill indicator (Fi) can be set to zero by an encoder. In this case only 492 consecutive fill bits (fill all 1s) plus parity are  
sent and no coded data is transmitted. This may be used to meet the requirement in 5.2 to provide video data on every valid  
clock cycle.  
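The frame layout described above amounts to 1 + 1 + 492 + 18 = 512 bits per frame and 4096 bits per multiframe. The following C fragment is a minimal sketch that classifies a bit position within a multiframe and returns the framing bit carried by each frame. The enumeration and function names are illustrative assumptions.

```c
#include <assert.h>

enum fec_field { FEC_FRAMING, FEC_FILL_INDICATOR, FEC_DATA, FEC_PARITY };

/* Classify bit position pos (0..4095) within an eight-frame multiframe. */
enum fec_field fec_field_at(int pos)
{
    int in_frame;

    assert(pos >= 0 && pos < 8 * 512);
    in_frame = pos % 512;           /* 1 + 1 + 492 + 18 = 512 bits per frame */
    if (in_frame == 0)
        return FEC_FRAMING;
    if (in_frame == 1)
        return FEC_FILL_INDICATOR;
    if (in_frame < 494)
        return FEC_DATA;            /* coded data, or fill when Fi = 0 */
    return FEC_PARITY;
}

/* Framing bit carried by frame k of the multiframe (k = 0 for S1). */
int fec_framing_bit(int k)
{
    static const int pattern[8] = { 0, 0, 0, 1, 1, 0, 1, 1 };

    assert(k >= 0 && k < 8);
    return pattern[k];
}
```

Of every 512 transmitted bits at most 492 carry coded video, so the usable video rate is at most 492/512 of the rate allocated to the framed stream.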
5.4.4 Relock time for error corrector framing
Three consecutive error correction framing sequences (24 bits) should be received before frame lock is deemed to have been  
achieved. The decoder should be designed such that frame lock will be re-established within 34 000 bits after an error  
corrector framing phase change.  
NOTE – This assumes that the video data does not contain three correctly phased emulations of the error correction framing  
sequence during the relocking period.  
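One straightforward way to acquire frame lock is to test each candidate bit offset for three consecutive framing sequences, that is, 24 framing bits spaced 512 bits apart. The following C fragment is a minimal sketch of that search. The function names and the one-bit-per-byte buffer layout are illustrative assumptions, and the exhaustive search is chosen for clarity rather than speed.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if the 24 bits starting at offset start and spaced 512 bits
 * apart spell the framing pattern 00011011 three times over. */
int fec_lock_at(const unsigned char *bits, size_t nbits, size_t start)
{
    static const unsigned char pattern[8] = { 0, 0, 0, 1, 1, 0, 1, 1 };

    assert(bits != NULL);
    for (size_t k = 0; k < 24; k++) {
        size_t pos = start + k * 512;

        if (pos >= nbits || bits[pos] != pattern[k % 8])
            return 0;
    }
    return 1;
}

/* Return the first offset at which frame lock can be declared,
 * or nbits when no such offset exists in the buffer. */
size_t fec_find_lock(const unsigned char *bits, size_t nbits)
{
    for (size_t start = 0; start < nbits; start++)
        if (fec_lock_at(bits, nbits, start))
            return start;
    return nbits;
}
```

The search needs the last of the 24 framing bits to be present, so lock can be declared only once at least 23 × 512 + 1 bits beyond the alignment point have been received, which is well within the 34 000 bit limit.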
Annex A  
Inverse transform accuracy specification  
(This annex forms an integral part of this Recommendation)  
A.1 Generate random integer pel data values in the range –L to +H according to the random number generator given
below (“C” version). Arrange into 8 by 8 blocks. Data sets of 10 000 blocks should each be generated for (L = 256, H = 255),
(L = H = 5) and (L = H = 300).
A.2 For each 8 by 8 block, perform a separable, orthonormal, matrix multiply, forward discrete cosine transform using
at least 64-bit floating point accuracy.
F(u, v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y) \cos\left[\frac{\pi (2x + 1) u}{16}\right] \cos\left[\frac{\pi (2y + 1) v}{16}\right]
with
u, v, x, y = 0, 1, 2, . . ., 7
where
x, y = spatial coordinates in the pel domain,
u, v = coordinates in the transform domain,
C(u) = 1/\sqrt{2} for u = 0; otherwise 1,
C(v) = 1/\sqrt{2} for v = 0; otherwise 1.
A.3 For each block, round the 64 resulting transformed coefficients to the nearest integer values. Then clip them to the
range –2048 to +2047. This is the 12-bit input data to the inverse transform.
A.4 For each 8 by 8 block of 12-bit data produced by A.3, perform a separable, orthonormal, matrix multiply, inverse
discrete cosine transform (IDCT) using at least 64-bit floating point accuracy. Round the resulting pels to the nearest integer
and clip to the range –256 to +255. These blocks of 8 × 8 pels are the reference IDCT output data.
A.5 For each 8 by 8 block produced by A.3, apply the IDCT under test and clip the output to the range –256 to +255.
These blocks of 8 × 8 pels are the test IDCT output data.
A.6 For each of the 64 IDCT output pels, and for each of the 10 000 block data sets generated above, measure the peak,
mean and mean square error between the reference and the test data.
A.7 For any pel, the peak error should not exceed 1 in magnitude.  
For any pel, the mean square error should not exceed 0.06.  
Overall, the mean square error should not exceed 0.02.  
For any pel, the mean error should not exceed 0.015 in magnitude.  
Overall, the mean error should not exceed 0.0015 in magnitude.  
A.8 All zeros in must produce all zeros out.
A.9 Re-run the measurements using exactly the same data values of A.1, but change the sign on each pel.
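The limits of A.7 can be checked mechanically once the reference and test outputs are available. The following C fragment is a minimal sketch for one data set. The function name idct_accuracy_ok and the flat storage of 64 pels per block are illustrative assumptions, and the IDCTs themselves are outside its scope.

```c
#include <assert.h>
#include <stddef.h>

/* Compare reference and test IDCT outputs for nblocks blocks of 64 pels
 * each (block b, pel p at index b * 64 + p) and apply the limits of A.7.
 * Returns 1 when every limit is met, 0 otherwise. */
int idct_accuracy_ok(const short *ref, const short *tst, size_t nblocks)
{
    double sum_all = 0.0, sq_all = 0.0;
    double n = (double)nblocks;

    assert(ref != NULL && tst != NULL && nblocks > 0);
    for (int p = 0; p < 64; p++) {
        double sum = 0.0, sq = 0.0, mean;

        for (size_t b = 0; b < nblocks; b++) {
            int e = tst[b * 64 + p] - ref[b * 64 + p];

            if (e > 1 || e < -1)
                return 0;               /* peak error exceeds 1 */
            sum += e;
            sq += (double)(e * e);
        }
        mean = sum / n;
        if (mean < 0.0)
            mean = -mean;
        if (sq / n > 0.06 || mean > 0.015)
            return 0;                   /* per-pel limits */
        sum_all += sum;
        sq_all += sq;
    }
    sum_all /= 64.0 * n;
    if (sum_all < 0.0)
        sum_all = -sum_all;
    return (sq_all / (64.0 * n) <= 0.02) && (sum_all <= 0.0015);
}
```

The function would be called once per data set, that is, for each (L, H) pair of A.1 and again for the sign-inverted data of A.9.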
“C” program for random number generation
/* L and H must be long, that is 32 bits */
long rand(L, H)
long L, H;
{
    static long randx = 1;                  /* long is 32 bits */
    static double z = (double) 0x7fffffff;
    long i, j;
    double x;                               /* double is 64 bits */

    randx = (randx * 1103515245) + 12345;
    i = randx & 0x7ffffffe;                 /* keep 30 bits */
    x = ((double) i) / z;                   /* range 0 to 0.99999 ... */
    x *= (L + H + 1);                       /* range 0 to < L+H+1 */
    j = x;                                  /* truncate to integer */
    return (j - L);                         /* range -L to H */
}
Annex B  
Hypothetical reference decoder  
(This annex forms an integral part of this Recommendation)  
The hypothetical reference decoder (HRD) is defined as follows:  
B.1 The HRD and the encoder have the same clock frequency as well as the same CIF rate, and are operated  
synchronously.  
B.2 The HRD receiving buffer size is (B + 256 kbits). The value of B is defined as follows:
B = 4 R_{max} / 29.97 where R_{max} is the maximum video bit rate to be used in the connection.
B.3 The HRD buffer is initially empty.
B.4 The HRD buffer is examined at CIF intervals (approximately 33 ms). If at least one complete coded picture is in the
buffer then all the data for the earliest picture is instantaneously removed (e.g. at t_{n+1} in Figure B.1). Immediately after
removing the above data the buffer occupancy must be less than B. This is a requirement on the coder output bitstream
including coded picture data and MBA stuffing but not error correction framing bits, fill indicator (Fi), fill bits or error
correction parity information described in 5.4.
To meet this requirement the number of bits for the (n+1)th coded picture d_{n+1} must satisfy:
d_{n+1} \ge b_n + \int_{t_n}^{t_{n+1}} R(t)\,dt - B
where
b_n is buffer occupancy just after the time t_n;
t_n is the time the nth coded picture is removed from the HRD buffer;
R(t) is the video bit rate at the time t.
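For a constant bit rate the integral reduces to the rate multiplied by the elapsed number of CIF picture periods, which makes the bound easy to evaluate. The following C fragment is a minimal sketch under that constant-rate assumption. The function names and the choice to return a negative bound as zero are illustrative.

```c
#include <assert.h>

/* B = 4 * Rmax / 29.97, in bits, for Rmax in bit/s. */
double hrd_b(double r_max)
{
    assert(r_max > 0.0);
    return 4.0 * r_max / 29.97;
}

/* Lower bound, in bits, on the size of the next coded picture when the
 * bit rate is constant: b_n plus the bits arriving over the given number
 * of CIF picture periods, minus B. A negative bound is returned as 0. */
double hrd_min_picture_bits(double b_n, double rate, int periods, double b)
{
    double bound;

    assert(periods >= 1);
    bound = b_n + rate * (double)periods / 29.97 - b;
    return bound > 0.0 ? bound : 0.0;
}
```

B corresponds to four CIF picture periods of data at the maximum rate, so an encoder that lets more than four periods pass between coded pictures at that rate has to make up the difference with larger pictures or stuffing.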
[Figure: HRD buffer occupancy (bit) plotted against time (CIF interval), showing the occupancies b_n and b_{n+1}, the
level B, the picture size d_{n+1} and the bits \int R(t)\,dt arriving between t_n and t_{n+1}]
NOTE – Time (t_{n+1} – t_n) is an integer number of CIF picture periods (1/29.97, 2/29.97, 3/29.97, ...).
FIGURE B.1/H.261
HRD buffer occupancy
Annex C  
Codec delay measurement method  
(This annex forms an integral part of this Recommendation)  
The video encoder and video decoder delays will vary depending on implementation. The delay will also depend on the  
picture format (QCIF, CIF) and data rate in use. This annex specifies the method by which the delay figures are established  
for a particular design. To allow correct audio delay compensation the overall video delay needs to be established from a user  
perception point of view under typical viewing conditions.  
Point A is the video input to the video coder. Point B is the channel output from the video terminal (i.e. including any FEC,  
channel framing, etc.). Point C is the video output from the decoder.  
A video sequence lasting more than 100 seconds is connected to the video coder input (point A) in Figure C.1. The
video sequence should have the following characteristics:
– it should contain a typical moving scene consistent with the intended purpose of the video codec;
– it should produce a minimum coded picture rate of 7.5 Hz at the bit rate in use;
– it should contain a visible identification mark at intervals throughout the length of the sequence. The visible
identification should change every 97 video input frames and be located within the picture area represented by
the first GOB in the picture. For example, the first block in the picture could change from black to white at
intervals of 97 video frame periods. The identification mark should be chosen so that it can be detected at
point B and does not significantly contribute to the overall coding performance.
The codec and video sequence should be arranged so that the bitstream contains less than 10% stuffing (MBA stuffing +  
error correction fill bits).  
The encoder delay is obtained by measuring the time from when the visible identification changes at point A to the time that  
the change is detected at point B. Similarly, the decoder delay is obtained by taking measurements at points B and C.  
Several measurements should be made during the sequence length and the average period obtained. Several tests should be  
made to ensure that a consistent average figure can be obtained for both encoder and decoder delay times.  
Average results should be obtained for each combination of picture format and bit rate within the capability of the particular  
codec design.  
NOTE – Due to pre- and post-temporal processing it may be necessary to take a mid-level for establishing the transition of the  
identification mark at points B and C.  
A → Video coder → B → Video decoder → C
FIGURE C.1/H.261
Measuring points
Annex D  
Still image transmission  
(This annex forms an integral part of this Recommendation)
D.1 Introduction
This annex describes the procedure for transmitting still images within the framework of this Recommendation. This  
procedure enables an H.261 video coder to transmit still images at four times the normal video resolution by temporarily  
stopping the motion video. Administrations may use this optional procedure as a simple and inexpensive method to transmit  
still images. However, Recommendation T.81 (JPEG) is preferred when the procedures for using T.81 within audiovisual  
systems are standardized.  
This procedure can provide high quality image transmission with effects similar to those of progressive and hierarchical  
schemes. Minimal changes to H.261 (low cost), backward compatibility with existing terminals, and flexibility in image  
quality versus transmission speed were the key considerations in its development.  
NOTE – The encoder would set a previously unused bit in PTYPE to “0” when it transmits a still image (unused bits should be  
set to “1”). A decoder that ignores this bit would receive the image as normal video. A decoder that goes into an error condition when this  
bit is “0” would most likely freeze the previous video frame, and resume when this bit is reset to “1”. A decoder having this new capability  
could display the image in a higher resolution, transfer the image to a separate graphics display and hold the image when video resumes,  
print and/or save the image, etc.  
D.2 Still image format
The still image format is four times the currently transmitted video format. If the video format is QCIF, then the still image is  
a CIF frame. If the video format is CIF, which contains 352 × 288 luminance samples, then the still image contains 704 × 576  
luminance samples, and a corresponding increase in the number of chrominance samples (a CCIR-601 frame).  
For transmission using H.261, the still image is sub-sampled 2:1 horizontally and vertically into four sub-images in  
the currently transmitted video format. Figure D.1 shows the sub-sampling pattern on the still image. The samples labelled 0,  
1, 2 and 3 form the four sub-images 0, 1, 2 and 3, respectively.  
0 1 0 1 0 1
3 2 3 2 3 2
0 1 0 1 0 1
3 2 3 2 3 2
0 1 0 1 0 1
3 2 3 2 3 2
0 1 0 1 0 1
3 2 3 2 3 2
FIGURE D.1/H.261
Sub-sampling pattern
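Separating a still image into its four sub-images is a simple polyphase split. The following C fragment is a minimal sketch for one sample plane. It assumes that the labels of Figure D.1 read 0 1 along even rows and 3 2 along odd rows, counting rows and columns from zero; that reading, the function names and the raster storage order are all illustrative assumptions.

```c
#include <assert.h>

/* Sub-image label of the sample at column x, row y of the still image,
 * assuming even rows read 0 1 0 1 ... and odd rows read 3 2 3 2 ... */
int still_subimage_label(int x, int y)
{
    static const int label[2][2] = { { 0, 1 }, { 3, 2 } };

    assert(x >= 0 && y >= 0);
    return label[y % 2][x % 2];
}

/* Copy sub-image k (0..3) of a width x height plane stored in raster
 * order into dst, which must hold (width / 2) * (height / 2) samples. */
void still_extract(const unsigned char *src, int width, int height,
                   int k, unsigned char *dst)
{
    int n = 0;

    assert(src != 0 && dst != 0);
    assert(width % 2 == 0 && height % 2 == 0 && k >= 0 && k <= 3);
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            if (still_subimage_label(x, y) == k)
                dst[n++] = src[y * width + x];
}
```

Each sub-image produced this way has the dimensions of the currently transmitted video format and can be passed to the coder as an ordinary picture.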
D.3 Picture layer multiplex
When HI_RES is “0”, the two lower bits of the temporal reference (TR) identify one of the four sub-images 0, 1, 2 or 3. The  
three higher bits of the TR shall be set to “0”.  
The encoder transmits a still image by setting HI_RES to “0” and transmitting the four sub-images 0, 1, 2 and 3 in sequential  
order. It is allowed to transmit more than one frame for each sub-image, but should not go back once it starts transmitting the  
next sub-image. The encoder is allowed to resume motion video at any time by setting HI_RES back to “1”.  
NOTE – The reference memory for the current frame is always the previous frame, regardless of whether a frame is motion  
video or still image.  
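A receiver can validate the temporal references of a still image sequence with a few bit operations. The following C fragment is a minimal sketch that takes the TR values of consecutive pictures received while HI_RES is “0”. The function name is an illustrative assumption, and so is the strict reading that transmission starts at sub-image 0 and never skips a sub-image.

```c
#include <assert.h>
#include <stddef.h>

/* Check the TR values of consecutive still-image pictures: the three
 * higher bits of the 5-bit TR must be 0, and the sub-image number in
 * the two lower bits may only repeat or advance by one, starting at 0
 * (the starting value and no-skip rule are assumptions of this sketch). */
int still_tr_sequence_ok(const unsigned char *tr, size_t n)
{
    int prev = 0;

    assert(tr != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int k = tr[i] & 0x03;           /* two lower bits: sub-image number */

        if (tr[i] & 0x1C)               /* three higher bits must be zero */
            return 0;
        if (i == 0 ? (k != 0) : (k != prev && k != prev + 1))
            return 0;
        prev = k;
    }
    return 1;
}
```

Repeated values are accepted because more than one frame may be sent for each sub-image, for instance to refine its quality progressively.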
D.4 Multipoint considerations
A still image transmitted within the video bit-stream can be broadcast on a multipoint connection by broadcasting the video.  
The MCV (multipoint command visualization-forcing) and Cancel-MCV commands defined in Recommendation H.230  
provide for this capability. A terminal could force an MCU to broadcast its video by sending MCV, and then return to the  
previous mode of operation by sending Cancel-MCV. MCUs are required to implement these commands, but they are  
optional for terminals.  
D.5 Other considerations
– All the video coding modes are allowed (intra-frame, inter-frame, motion compensation, etc.);
– the multiplex arrangement below the picture layer remains the same (group of blocks, macroblocks, etc.);
– the maximum number of bits allowed per frame (sub-image) should not be exceeded (256 Kbits for CIF and 64
Kbits for QCIF);
– forward error correction is not affected.