Speech Signal Processing, Chapter 5: Basic Features of Sound

Uploaded by: 我*** · IP location: Beijing · Upload date: 2022-03-08 · Format: PPTX · Pages: 52


Chapter 5: Extracting the Basic Features of Sound
Lv Danju

5.1 Volume

The loudness of an audio signal is the most prominent feature for human aural perception. Several terms are used to describe it, including volume, intensity, and energy; here we use the term "volume". Volume is a basic acoustic feature that is correlated with the sample amplitudes within each frame, that is, with how large the sampled signal is inside one frame.

Describing volume

There are two methods to compute the volume of each frame, where $s_i$ is the i-th sample in the frame and $n$ is the frame size:

1. Sum of absolute samples (integer arithmetic):
   $volume = \sum_{i=1}^{n} |s_i|$
2. Ten times the base-10 logarithm of the sum of squared samples (floating-point arithmetic, unit: decibels):
   $volume = 10 \log_{10} \sum_{i=1}^{n} s_i^2$

Properties of volume:
- The volume of voiced sounds is larger than that of unvoiced (breathy) sounds, which in turn is larger than that of noise.
- Volume is a relative measure and depends heavily on the microphone settings.

Application: volume is typically used in end-point detection, to estimate where voiced initials (consonants) or finals (vowels) start and end.

Tip: before computing volume, subtract the mean of the audio signal to avoid errors caused by a DC bias in the signal.
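The two formulas and the DC-bias tip can be checked on a tiny hand-made frame. The sketch below is in Python rather than the MATLAB used in the slides; the function name `frame_volume` and the sample values are made up for illustration and are not part of the original material.

```python
import math


def frame_volume(frame, method=1):
    """Volume of one frame after removing its mean (DC bias).

    method 1: sum of absolute samples
    method 2: 10 * log10(sum of squared samples), in decibels
    """
    mean = sum(frame) / len(frame)
    centered = [s - mean for s in frame]
    if method == 1:
        return sum(abs(s) for s in centered)
    energy = sum(s * s for s in centered)
    # Guard against log10(0) for a constant (for example silent) frame.
    return 10 * math.log10(energy) if energy > 0 else float("-inf")


# Hypothetical frame: samples alternating between +2 and -2.
frame = [2, -2, 2, -2, 2, -2, 2, -2]
# The same frame with a DC offset of 10 added to every sample.
biased = [s + 10 for s in frame]

# Without mean subtraction the offset dominates the result.
raw_abs_sum = sum(abs(s) for s in biased)

print(raw_abs_sum)              # abs sum of the biased frame, no correction
print(frame_volume(frame, 1))   # method 1 on the unbiased frame
print(frame_volume(biased, 1))  # method 1 on the biased frame, mean removed
print(frame_volume(frame, 2))   # method 2 (decibels)
```

With the mean removed, the biased and unbiased frames give the same volume, whereas the uncorrected absolute sum is several times larger.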

Example: volume01.m

    waveFile='my_sunday.wav';
    frameSize=256;
    overlap=128;
    [y, fs, nbits]=wavReadInt(waveFile);
    fprintf('Length of %s is %g sec.\n', waveFile, length(y)/fs);
    frameMat=buffer(y, frameSize, overlap);
    frameNum=size(frameMat, 2);
    volume1=zeros(frameNum, 1);
    volume2=zeros(frameNum, 1);
    for i=1:frameNum
        frame=frameMat(:,i);
        frame=frame-mean(frame);                 % zero-justified
        volume1(i)=sum(abs(frame));              % method 1
        volume2(i)=10*log10(sum(frame.^2));      % method 2
    end
    % Plot the waveform and both volume curves
    time=(1:length(y))/fs;
    frameTime=((0:frameNum-1)*(frameSize-overlap)+0.5*frameSize)/fs;
    subplot(3,1,1); plot(time, y); ylabel(waveFile);
    subplot(3,1,2); plot(frameTime, volume1, '.-'); ylabel('Volume (Abs. sum)');
    subplot(3,1,3); plot(frameTime, volume2, '.-'); ylabel('Volume (Decibels)'); xlabel('Time (sec)');

Computed volume and subjective volume

- Computed volume: volume represents how strong a sound is. The two methods above use mathematical formulas to approximate what the human ear perceives.
- Subjective volume: the computed value can at times differ considerably from what the human ear perceives.

To tell the two apart, the term "subjective volume" denotes the loudness that the human ear actually hears. For example, sounds with the same amplitude but different frequencies can produce very different subjective volumes.

Subjective volume curves

(Figure: curves of equal loudness, measured with human listeners as the test subjects.)

Effect of frequency on subjective volume

The figure above also shows how sensitive the human ear is to sounds of different frequencies, which is the frequency response of the ear. To test the frequency response of your own ears, try the "Equal Loudness Tester" web page (link truncated in the source: "... g.html").

Subjective volume test: effect of timbre on subjective volume

The perceived loudness is also greatly influenced by timbre. Pronounce several vowels at the same loudness level, then plot the volume curves to see how they relate to the timbre, that is, to the shapes and positions of the lips and tongue.
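Plotting a volume curve means cutting the signal into frames, computing one volume per frame, and placing each value at the centre time of its frame. The following Python sketch illustrates those three steps with made-up sample values; `split_frames`, `volume_curve`, and `frame_times` are hypothetical helper names, and the simple framing used here (no zero padding, trailing partial frame dropped) is an assumption that does not reproduce MATLAB's `buffer` exactly.

```python
def split_frames(samples, frame_size, overlap=0):
    """Cut samples into frames; consecutive frames share `overlap` samples.

    A trailing partial frame is dropped.
    """
    hop = frame_size - overlap
    return [samples[start:start + frame_size]
            for start in range(0, len(samples) - frame_size + 1, hop)]


def volume_curve(samples, frame_size, overlap=0):
    """Abs-sum volume of every frame, each frame mean-subtracted first."""
    curve = []
    for frame in split_frames(samples, frame_size, overlap):
        mean = sum(frame) / len(frame)
        curve.append(sum(abs(s - mean) for s in frame))
    return curve


def frame_times(frame_count, frame_size, overlap, fs):
    """Centre time (seconds) of each frame, using the frameTime formula."""
    hop = frame_size - overlap
    return [(i * hop + 0.5 * frame_size) / fs for i in range(frame_count)]


# Hypothetical signal: a quiet stretch followed by a loud one.
signal = [1, -1] * 4 + [8, -8] * 4
curve = volume_curve(signal, frame_size=4, overlap=2)
times = frame_times(len(curve), frame_size=4, overlap=2, fs=8)

print(curve)
print(times)
```

The volume rises across the frame that straddles the quiet and loud parts, which is the kind of transition that end-point detection looks for.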

Example: volume02.m

    waveFile='aeiou.wav';
    frameSize=512;
    overlap=0;
    [y, fs, nbits]=wavReadInt(waveFile);
    fprintf('Length of %s is %g sec.\n', waveFile, length(y)/fs);
    frameMat=buffer(y, frameSize, overlap);
    frameNum=size(frameMat, 2);
    volume1=frame2volume(frameMat, 1);       % method 1
    volume2=frame2volume(frameMat, 2);       % method 2
    % Plot the waveform and both volume curves
    time=(1:length(y))/fs;
    frameTime=((0:frameNum-1)*(frameSize-overlap)+0.5*frameSize)/fs;
    subplot(3,1,1); plot(time, y); ylabel(waveFile);
    subplot(3,1,2); plot(frameTime, volume1, '.-'); ylabel('Volume (Abs. sum)');
    subplot(3,1,3); plot(frameTime, volume2, '.-'); ylabel('Volume (Decibels)'); xlabel('Time (sec)');

(Figure: volume of a, e, i, o, u.)

Subjective volume is easily affected by frequency and timbre. For this reason, speech and singing synthesis often corrects the amplitude of the audio signal according to the frequency and content of the sound, so that the subjective volume does not fluctuate.

Zero Crossing Rate (ZCR)

Definition: ZCR is another basic acoustic feature that can be computed easily.

ZCR equals the number of zero crossings of the waveform within a given frame, that is, the number of times the audio signal passes through zero.

ZCR has the following characteristics:
- In general, the ZCR of both unvoiced sounds and environmental noise is larger than that of voiced sounds (which have observable fundamental periods).
- It is hard to distinguish unvoiced sounds from environmental noise by using ZCR alone, since they have similar ZCR values.
- ZCR is often used in conjunction with volume for end-point detection. In particular, ZCR is used for detecting the start and end positions of unvoiced sounds.
- Some people use ZCR for fundamental frequency estimation, but it is highly unreliable unless a further refinement procedure is applied.

Computing ZCR

Keep the following in mind when computing ZCR:
- Some samples may lie exactly at zero. ZCR can then be counted in two ways, and the results differ, so inspect the outcome carefully before choosing an approach.
- Most implementations use the raw integer values of the audio signal. This avoids the increase in ZCR that can occur when the DC bias is subtracted from a floating-point signal.

Example: zcr01.m

    waveFile='csNthu8b_S.wav';
    frameSize=256;
    overlap=0;
    [y, fs, nbits]=wavReadInt(waveFile);
    frameMat=buffer(y, frameSize, overlap);
    frameNum=size(frameMat, 2);              % must be set before the loop below
    for i=1:frameNum
        frameMat(:,i)=frameMat(:,i)-round(mean(frameMat(:,i)));   % Zero justification
    end
    zcr1=sum(frameMat(1:end-1, :).*frameMat(2:end, :)<0);         % Method 1
    zcr2=sum(frameMat(1:end-1, :).*frameMat(2:end, :)<=0);        % Method 2
    % Plotting
    time=(1:length(y))/fs;
    frameTime=((0:frameNum-1)*(frameSize-overlap)+0.5*frameSize)/fs;
    subplot(2,1,1); plot(time, y); ylabel(waveFile);
    subplot(2,1,2); plot(frameTime, zcr1, '.-', frameTime, zcr2, '.-');
    title('ZCR'); xlabel('Time (sec)');
    legend('Method 1', 'Method 2');

From the above example, it is clear that these two methods generate different ZCR curves.

The first method does not count zero positioning (a sample that is exactly zero) as a zero crossing, so its ZCR values are smaller. Moreover, silence is likely to have a low ZCR with method 1 and a high ZCR with method 2, since silence is likely to contain many zero-valued samples.

In the example above, the zero crossing rate was computed in two ways; the results differ, but the trend ...
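The difference between the two counting rules can be seen on a few hand-made frames. The sketch below is in Python rather than the MATLAB used in the slides; the function names and sample values are invented for illustration, and only the two rules themselves (product of neighbouring samples below zero, or below or equal to zero) come from the example above.

```python
def zero_crossing_rate(frame, count_zeros=False):
    """Count zero crossings between neighbouring samples in one frame.

    count_zeros=False -> method 1: product of neighbours < 0
    count_zeros=True  -> method 2: product of neighbours <= 0
    """
    count = 0
    for a, b in zip(frame, frame[1:]):
        product = a * b
        if product < 0 or (count_zeros and product == 0):
            count += 1
    return count


def remove_dc(frame):
    """Subtract the rounded mean so the samples stay integers.

    Note: Python's round() rounds halves to the nearest even integer,
    which differs from MATLAB's round() for means ending in .5.
    """
    offset = round(sum(frame) / len(frame))
    return [s - offset for s in frame]


# Hypothetical integer frames.
voiced_like = [3, 5, 2, -4, -6, -1, 4, 7]  # slow oscillation
noise_like = [1, -2, 3, -1, 2, -3, 1, -2]  # sign flips at every sample
silence = [0, 0, 1, 0, 0, -1, 0, 0]        # mostly zero-valued samples

for name, frame in [("voiced", voiced_like),
                    ("noise", noise_like),
                    ("silence", silence)]:
    centered = remove_dc(frame)
    zcr1 = zero_crossing_rate(centered)
    zcr2 = zero_crossing_rate(centered, count_zeros=True)
    print(name, zcr1, zcr2)
```

On these frames the two methods agree wherever no sample is exactly zero, and they diverge sharply on the silence frame, which matches the behaviour described above.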
