Introduction to MPEG Video Compression
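John Wiseman

Introduction

Many existing and upcoming products use MPEG video compression. It is at the heart of digital television set-top boxes, DSS, HDTV decoders, DVD players, video conferencing, Internet video, and other related products. These applications benefit from video compression because they need less storage space for the video information, less bandwidth to transmit it from one point to another, or both. Besides working well in a wide variety of applications, MPEG video owes much of its popularity to the fact that it is defined in two finalized international standards, with a third standard currently in the definition process.

The purpose of this paper is to introduce the reader to the basics of MPEG video compression, from both an encoding and a decoding perspective. It covers the basic processing blocks, such as the DCT and motion estimation, but does not explain the MPEG syntax. MPEG-2 is a superset of MPEG-1, but the material is nevertheless presented against the background of both standards.

Example Video Compression Calculation

One of the formats defined for HDTV broadcasting in the United States is 1920x1080 pixels at 30 frames/sec. Multiplying these numbers together, along with 8 bits for each of the three colors of every pixel, gives a total bit-rate of approximately 1.5 Gb/sec. Because the channel bandwidth is only 6 MHz, each channel supports a data rate of only 19.2 Mb/sec, and since part of the channel carries audio, only 18 Mb/sec is left for video. This restriction in data rate means that the video must be compressed by approximately 83:1. Reaching such a high compression ratio while still delivering high quality pictures without obvious artifacts is remarkable. This paper introduces the basic techniques that make this compression possible.

MPEG Video Basics

MPEG stands for the Moving Picture Experts Group, a specification-producing committee under ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission). What is commonly referred to as "MPEG video" actually consists at present of two finalized standards, MPEG-1 [1] and MPEG-2 [2], with a third standard, MPEG-4, still being defined. The MPEG-1 & -2 standards are similar in basic concepts. Both are based on motion compensated, block-based transform coding techniques. MPEG-4, by contrast, deviates from these traditional approaches in favor of a software-based approach aimed at very low bit-rate (< 64 kb/sec) video compression. Because the MPEG-1 & -2 standards are finalized and already in wide use, this paper concentrates on the compression techniques relating to these two standards. Note that there is no reference to MPEG-3. MPEG-3 was originally intended for HDTV applications, but it was found that minor extensions to MPEG-2 would satisfy the higher bit-rate and higher resolution requirements of HDTV, so work on a separate MPEG-3 standard was abandoned.

MPEG-1 was finalized in 1991, and was originally optimized for video resolutions of 352x240 pixels at 30 frames/sec (NTSC based) or 352x288 pixels at 25 frames/sec (PAL based), commonly referred to as SIF video. It is often mistakenly thought that the MPEG-1 resolution is limited to the above sizes, but it in fact may go as high as 4095x4095 at 60 frames/sec. The bit-rate is optimized for applications of around 1.5 Mb/sec, but again can be used at higher rates if required. MPEG-1 is defined for progressive frames only, and has no direct provision for interlaced video applications, such as in broadcast television applications.

MPEG-2 was finalized in 1994, and addressed issues directly related to digital television broadcasting, such as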
the efficient coding of field-interlaced video and scalability. Also, the target bit-rate was raised to between 4 and 9 Mb/sec, resulting in potentially very high quality video. MPEG-2 consists of profiles and levels. The profile defines the bitstream scalability and the colorspace resolution, while the level defines the image resolution and the maximum bit-rate per profile. Probably the most common descriptor in use currently is Main Profile, Main Level (MP@ML), which refers to 720x480 resolution video at 30 frames/sec, at bit-rates up to 15 Mb/sec for NTSC video. Another example is the HDTV resolution of 1920x1080 pixels at 30 frames/sec, at a bit-rate of up to 80 Mb/sec. This is an example of the Main Profile, High Level (MP@HL) descriptor. A complete table of the various legal combinations can be found in reference [2].

MPEG Video Layers

MPEG video is broken up into a hierarchy of layers to help with error handling, random search and editing, and synchronization, for example with an audio bitstream. From the top level, the first layer is the video sequence layer, and is any self-contained bitstream, for example a coded movie or advertisement. The second layer down is the group of pictures, which is composed of one or more I frames and/or P and B frames. The third layer down is the picture layer itself, and the next layer beneath it is the slice layer. Each slice is a contiguous sequence of raster ordered macroblocks, most often on a row basis in typical video applications, but not limited to this by the specification. Each slice consists of macroblocks, which are 16x16 arrays of luminance pixels, or picture data elements, with two 8x8 arrays of associated chrominance pixels. The macroblocks can be further divided into distinct 8x8 blocks, for further processing such as transform coding. Each of these layers has its own unique 32-bit start code, defined in the syntax to consist of 23 zero bits followed by a one, then followed by 8 bits for the actual start code. These start codes may have as many zero bits as desired preceding them.

Intra Frame Coding Techniques

The term intra coding means that the various lossless and lossy compression techniques are applied to the current frame only, with no relation to any other frame in the video sequence. In other words,
no temporal redundancy outside of the current frame or picture is exploited. Intra coding is described first because it is simpler, and because non-intra coding techniques are extensions of these basics. Figure 1 shows a block diagram of a basic MPEG video encoder for intra frames only. Apart from slight implementation differences, it is essentially the same as a JPEG still image encoder. The potential ramifications of this similarity are discussed later in this paper. The basic processing blocks are the video filter, the discrete cosine transform (DCT), the DCT coefficient quantizer, and the run-length amplitude/variable length coder. These blocks are described individually below.

Video Filter

In the HDTV data rate example given earlier, each pixel was assumed to be represented by 8 bits for each of the primary colors red, green, and blue (RGB). This representation works well for computer graphics, but it is wasteful for most video compression applications. Research into the human visual system (HVS) has shown that the eye is most sensitive to changes in luminance and much less sensitive to variations in chrominance. Since absolute compression is the name of the game, it makes sense that MPEG should operate on a color space that can effectively take advantage of the eye's different sensitivity to luminance and chrominance information. Accordingly, MPEG uses the YCbCr color space rather than RGB to represent the data values, where Y is the luminance signal, Cb is the blue color difference signal, and Cr is the red color difference signal.

A macroblock can be represented in several different ways in the YCbCr color space. Figure 2 shows three formats, known as 4:4:4, 4:2:2, and 4:2:0 video. 4:4:4 is full bandwidth YCbCr video, and each macroblock consists of 4 Y blocks, 4 Cb blocks, and 4 Cr blocks. Being full bandwidth, this format contains as much information as the same data in the RGB color space. 4:2:2 contains half as much chrominance information as 4:4:4, and 4:2:0 contains one quarter of the chrominance information. Although MPEG-2 has provisions to handle the higher chrominance formats for professional applications, most consumer level products use the normal 4:2:0 format, which is also the focus of this paper.

Because of the efficient manner of luminance and chrominance representation, the 4:2:0 representation allows an immediate data reduction from 12 blocks/macroblock to 6 blocks/macroblock, or 2:1 compared to full bandwidth representations such as 4:4:4 or RGB. To generate this format without generating color aliases or artifacts requires that the chrominance signals be filtered. The pixel co-siting is as given in Figure 3, but this does not specify the actual filtering technique to be utilized. This is up to the system designer, as one of several parameters that may be optimized on a cost vs. performance basis. More details on video filtering may be found in reference [3].

Discrete Cosine Transform

In general, neighboring pixels within an image tend to be highly correlated. As such, it is desired to use an invertible transform to convert these correlated values into a smaller number of decorrelated parameters. The DCT is a near-optimal choice for this purpose. The DCT decomposes the signal into underlying spatial frequencies, which then allow further processing techniques to reduce the precision of the DCT coefficients consistent with the human visual system (HVS) model.

The DCT/IDCT transform operations are described by Equations 1 and 2 respectively [4]. For the 8x8 block size, with f(x,y) the pixel value at position (x,y), F(u,v) the coefficient at frequency index (u,v), and C(k) = 1/\sqrt{2} for k = 0 and C(k) = 1 otherwise:

Equation 1: Forward Discrete Cosine Transform

    F(u,v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}

Equation 2: Inverse Discrete Cosine Transform

    f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u) C(v) F(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}

In Fourier analysis, a signal is decomposed into weighted sums of orthogonal sines and cosines that when added together reproduce the original signal. The 2-dimensional DCT operation for an 8x8 pixel block generates an 8x8 block of coefficients that represent a "weighting" value for each of the 64 orthogonal basis patterns that are added together to produce the original image. Figure 4 shows a grayscale plot of these DCT basis patterns, and Figure 5 shows how the vertical and horizontal frequencies are mapped into the 8x8 block pattern.

Note again that the above equations are based on data blocks of an 8x8 size. It is certainly possible to compute the DCT for other block sizes, for example 4x4 or 16x16 pixels, but the 8x8 size has become the standard as it represents an ideal compromise between adequate data decorrelation and reasonable computability. Even so, these formidable-looking equations would each normally require 1024 multiplies and 896 additions if solved directly, but fortunately, as with the case of the Fast Fourier Transform, various fast algorithms exist that make the calculations considerably faster.

Besides decorrelation of signal data, the other important property of the DCT is its efficient energy compaction. This can be shown qualitatively by looking at a simple 1-dimensional example. Figure 6 shows an N-point increasing ramp function, where N in this case equals 4. If the Discrete Fourier Transform (DFT) of this signal were to be taken, then the implied periodicity of the signal is as shown in the top portion of the figure. Quite obviously, an adequate representation of this signal with sines and cosines will require substantial high frequency components. The bottom portion of the figure shows how the DCT operation overcomes this problem, by using reflective symmetry before being periodically repeated.
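The contrast between the two implied extensions can be checked numerically. The following is a minimal Python sketch and is not taken from the paper: it assumes the ramp has the sample values 0, 1, 2, 3 (the text gives only the shape and N = 4), it uses an orthonormal DCT-II, and the helper names `dft`, `dct2`, `wrap_jump`, and `energy_shares` are purely illustrative.

```python
import cmath
import math


def dft(signal):
    """Unnormalized N-point discrete Fourier transform."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dct2(signal):
    """Orthonormal N-point DCT (type II)."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(
            x * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i, x in enumerate(signal)
        )
        coeffs.append(scale * total)
    return coeffs


def wrap_jump(signal):
    """Size of the step where one period ends and the next one begins."""
    return abs(signal[0] - signal[-1])


def energy_shares(coeffs):
    """Fraction of the total energy carried by each coefficient."""
    energies = [abs(c) ** 2 for c in coeffs]
    total = sum(energies)
    return [e / total for e in energies]


ramp = [0.0, 1.0, 2.0, 3.0]    # assumed sample values, N = 4
mirrored = ramp + ramp[::-1]   # reflective extension implied by the DCT

print("jump when the ramp repeats:         ", wrap_jump(ramp))
print("jump when the mirrored ramp repeats:", wrap_jump(mirrored))

dft_shares = energy_shares(dft(ramp))   # bin 2 is the highest DFT frequency for N = 4
dct_shares = energy_shares(dct2(ramp))  # coefficient 3 is the highest DCT frequency

print("DFT energy shares:", [round(s, 4) for s in dft_shares])
print("DCT energy shares:", [round(s, 4) for s in dct_shares])
```

Under these assumptions, the repeated ramp has a step of 3 at every period boundary while the mirrored version has none. The highest-frequency DFT bin carries roughly 7% of the signal energy, whereas the two highest-frequency DCT coefficients together carry well under 1%.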

In this manner, the sharp time domain discontinuities are eliminated, allowing the energy to be concentrated more towards the lower end of the frequency spectrum. This example also illustrates an interesting fact, that the DCT of the N-point signal may be calculated by performing a 2N-point DFT [5].

To further demonstrate the effective energy concentration property of the DCT operation, a series of figures is given showing the deletion of a number of DCT coefficients. Figure 7 shows an 8-bit monochrome image, where an 8x8 DCT operation has been performed on all the blocks of the image, all of the coefficients are retained, and an 8x8 IDCT is then performed to reconstruct the image. Figure 8 is the same image with only the 10 DCT coefficients in the upper left-hand corner retained. The remaining 54 higher frequency DCT coefficients have all been set to zero. When the IDCT operation is applied and the image reconstructed, the image still retains a fairly high degree of quality compared to the original image that was reconstructed using all 64 DCT coefficients. Figure 9 eliminates another diagonal row of DCT coefficients such that only 6 are kept and used in the IDCT operation. Again, some degradation is apparent, but overall the picture quality is still fair. Figure 10 continues by eliminating another row, resulting in only 3 coefficients saved. At this point, fairly significant blockiness is observed, especially around sharp edges within the image. Figure 11 illustrates the extreme case where only the DC coefficient (extreme upper left-hand corner) is kept. Although dramatic blockiness is apparent, the image is still surprisingly recognizable considering that only 1 of the original 64 coefficients has been maintained.

Figures 12-14 show the above process in a slightly different light. These three figures clearly show the amount of energy that is missing when the higher frequency coefficients are deleted. It is also apparent that this energy is concentrated in areas of the image that are associated with edges, or high spatial frequencies. Because of this, it is desired that the total number and the degree of DCT coefficient deletion be controlled on a macroblock basis. This control is accomplished with a process called quantization.

DCT Coefficient Quantization

As was shown previously in Figure 5, the lower frequency DCT coefficients toward the upper left-hand corner of the coefficient matrix correspond to smoother spatial contours, while the DC coefficient corresponds to a solid luminance or color value for the entire block. Also, the higher frequency DCT coefficients toward the lower right-hand corner of the coefficient matrix correspond to finer spatial patterns, or even noise within the image. Since it is well known that the HVS is less sensitive to errors in high frequency coefficients than it is for lower frequencies, it is desired that the higher frequencies be more coarsely quantized in their representation.

The DCT coefficient quantization process is as follows. Each 12-bit coefficient is divided by a corresponding quantization matrix value that is supplied from an intra quantization matrix. The default matrix is given in Figure 15, and if the encoder decides it is warranted, it may substitute a new quantization matrix at a picture level and download it to the decoder via the bitstream. Each value in this matrix is pre-scaled by multiplying by a single value, known as the quantizer scale code. This value may range from 1 to 112, and is modifiable on a macroblock basis, making it useful as a fine-tuning parameter for the bit-rate control, since it would not be economical to send an entirely new matrix on a macroblock basis. The goal of this operation is to force as many of the DCT coefficients to zero, or near zero, as possible within the boundaries of the prescribed bit-rate and video quality parameters.

Run-Length Amplitude/Variable Length Coding

An example of a typical quantized DCT coefficient matrix is given in Figure 16. As desired, most of the energy is concentrated within the lower frequency portion of the matrix, and most of the higher frequency coefficients have been quantized to zero. Considerable savings can be had by representing the fairly large number of zero coefficients in a more effective manner, and that is the purpose of run-length amplitude coding of the quantized coefficients. But before that process is performed, more efficiency can be gained by reordering the DCT coefficients.

Since most of the non-zero DCT coefficients
will typically be concentrated in the upper left-hand corner of the matrix, it is apparent that a zigzag scanning pattern will tend to maximize the probability of achieving long runs of consecutive zero coefficients. This zigzag scanning pattern is shown in the upper portion of Figure 17. Note for the sake of completeness that a second, alternate scanning pattern defined in MPEG-2 is shown in the lower portion of the figure. This scanning pattern may be chosen by the encoder on a frame basis, and has been shown to be effective on interlaced video images. This paper will concentrate only on usage of the standard zigzag pattern, however.

Again, the block of quantized DCT coefficients as presented in Figure 16 is referenced. Scanning of the example coefficients in a zigzag pattern results in a sequence of numbers as follows: 8, 4, 4, 2, 2, 2, 1, 1, 1, 1, (12 zeroes), 1, (41 zeroes). This sequence is then represented as a run-length (representing the number of consecutive zeroes) and an amplitude (coefficient value following a run of zeroes). These values are then looked up in a fixed table of variable length codes [6], where the most probable occurrence is given a relatively short code, and the least probable occurrence is given a relatively long code. In this example, this becomes:

Zero Run-Length   Amplitude      MPEG Code Value
N/A               8 (DC value)   110 1000
0                 4              0000 1100
0                 4              0000 1100
0                 2              0100 0
0                 2              0100 0
0                 2              0100 0
0                 1              110
0                 1              110
0                 1              110
0                 1              110
12                1              0010 0010 0
EOB               EOB            10

Note that the first run of 12 zeroes has been very efficiently represented with only 9 bits, and the last run of 41 zeroes has been entirely eliminated, represented only with a 2-bit end of block (EOB) indicator. It can be seen from the table that the quantized DCT coefficients are now represented by a sequence of 61 binary bits. Considering that the original 8x8 block of 8-bit pixels required 512 bits for full representation, this is a compression of approximately 8.4:1 at this point.

Certain coefficient values that are not particularly likely to occur are coded with escape sequences to prevent the code tables from becoming too long. As an example, consider what would happen if the last isolated coefficient value of 1 was instead a value of 3. There is no code value for a run-length of 12 followed by an amplitude of 3, so it is instead coded with the escape sequence 0000 01, a 6-bit representation of the run-length (12 = 001100), and finally a 12-bit representation of the amplitude (3 = 000000000011). All of the other values in the table remain the same as before. In this case, the 9-bit code is replaced by 24 bits, so the total number of bits grows to 76 and the compression is lowered to approximately 6.7:1.

Video Buffering and Rate Control

Most of the applications mentioned in this paper transmit the compressed information at a fixed bit-rate. For HDTV, this fixed rate is 18 Mb/sec for the video signal. Unfortunately, individual coded video pictures may contain drastically varying amounts of information, so the coding efficiency can vary widely from picture to picture. The same may be true within a given picture, as some portions may be very smooth while others contain large amounts of high frequency information. Because of these variations, the coded bitstream must be buffered before it is transmitted. Since the buffer is necessarily limited in size (by physical and delay constraints), a feedback system acting as a rate controller is needed to keep the buffer from overflowing or underflowing. Buffering and rate control are necessary for intra coding and decoding, and become even more important for non-intra coding, because the total number of coded bits differs greatly among I, P, and B pictures.

As Figure 1 shows, the only block available for bit-rate control is the DCT
coefficient quantizer. Because the quantization matrices may be changed on a picture basis and the quantizer scale may be changed on a macroblock basis, these parameters can be used by the encoder's rate control algorithm to provide dynamic control of the buffer. In this way, the output of the encoder buffer can be held at a fixed rate, and buffer overflow and underflow can be prevented without resorting to costly measures such as the repeating or dropping of entire video frames. Note that although rate control algorithms are necessary in fixed bit-rate applications, neither the MPEG-1 nor the MPEG-2 standard defines their design. More information on such algorithms can be found in reference [3].

Non-Intra Frame Coding Techniques

The intra frame coding techniques discussed so far are limited to processing the video signal on a spatial basis, relative only to information within the current video frame. Considerably more compression can be achieved, however, if the inherent temporal redundancy is exploited as well. This temporal processing uses a technique known as block-based motion compensated prediction, which relies on motion estimation. Figure 18 shows a block diagram of an encoder that includes non-intra frame coding techniques. Of course, this encoder also supports intra frame coding.

P Frames

Starting with an intra, or I frame, the encoder can forward predict a future frame. This is commonly referred to as a P frame, and it may also be predicted from other P frames, although only in a forward time manner. As an example, consider a group of pictures that lasts for 6 frames. In this case, the frame ordering is given as I,P,P,P,P,P,I,P,P,P,P,...

Each P frame in this sequence is predicted from the frame immediately preceding it, whether it is an I frame or a P frame. As a reminder, I frames are coded spatially with no reference to any other frame in the sequence.

B Frames

The encoder also has the option of using forward/backward interpolated prediction. These frames are commonly referred to as bi-directional interpolated prediction frames, or B frames for short. As an example of the usage of I, P, and B frames, consider a group of pictures that lasts for 6 frames, and is given as I,B,P,B,P,B,I,B,P,B,P,B,...

As in the previous I & P only example, I frames are coded spatially only and the P frames are forward predicted based on previous I and P frames. The B frames, however, are coded based on a forward prediction from a previous I or P frame, as well as a backward prediction from a succeeding I or P frame. As such, the example sequence is processed by the encoder such that the first B frame is predicted from the first I frame and first P frame, the second B frame is predicted from the first and second P frames, and the third B frame is predicted from the second P frame and the first I frame of the next group of pictures.
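The prediction dependencies described for I, P, and B frames can be worked out mechanically. The following is a minimal Python sketch under stated assumptions: frames are listed in display order, a P frame uses the nearest previous I or P frame, and a B frame uses the nearest I or P frame on each side. The function name `reference_frames` is hypothetical and is not part of the MPEG syntax or of any library.

```python
def reference_frames(display_order):
    """Map each frame index to the indices of the frames it is predicted from.

    display_order is a string of frame types in display order, e.g. "IBPBPBI".
    """
    # I and P frames are the only frames that may serve as prediction references.
    anchors = [i for i, kind in enumerate(display_order) if kind in "IP"]
    refs = {}
    for i, kind in enumerate(display_order):
        previous = [a for a in anchors if a < i]
        following = [a for a in anchors if a > i]
        if kind == "I":
            refs[i] = []                             # coded spatially only
        elif kind == "P":
            refs[i] = previous[-1:]                  # forward prediction only
        elif kind == "B":
            refs[i] = previous[-1:] + following[:1]  # forward and backward
        else:
            raise ValueError(f"unknown frame type: {kind!r}")
    return refs


# One 6-frame group of pictures plus the first I frame of the next group.
gop = "IBPBPBI"
for index, sources in reference_frames(gop).items():
    print(index, gop[index], "predicted from", sources)
```

For the I, P only ordering, `reference_frames("IPPPPPI")` makes every P frame depend on the frame immediately before it, which matches the P frame description.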
