National Cheng Kung University Theses and Dissertations System

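Graduate student:	Wang, Wen-Sheng (王文生)
Thesis title:	頻譜參數之語音變速變調演算法及其應用 / Spectral Parametric Speech Time and Pitch Scaling Algorithms and Its Application System
Advisor:	Yang, Jar-Ferr (楊家輝)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering
Year of publication:	2007
Academic year of graduation:	95 (ROC calendar)
Language:	Chinese
Number of pages:	63
Chinese keywords:	口腔模型 (vocal tract model), 語音速度調變 (speech time-scale modification), 語音音高調變 (speech pitch-scale modification), 頻譜激發訊號 (spectral excitation signal)
English keywords:	Spectral Excited Signal, Vocal Tract Model, Pitch-Scale, Time-Scale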

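This thesis investigates techniques for time-scale and pitch-scale modification of continuous speech. Building on the vocal tract model and spectral excitation used in speech coding, we identify the techniques best suited to the signal's characteristics and develop spectral-parameter-based algorithms for speech time and pitch scaling. We then use the MIDI score of a song to establish the correspondence between the MIDI data and the singing voice, and apply the time and pitch scaling techniques to adjust the pitch and rhythm of the singer's voice. Finally, we modify the pitch according to the pitch characteristics of songs, completing a karaoke system that automatically corrects off-key singing.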

The main goal of this thesis is to develop time and pitch scaling algorithms for continuous speech. Using the vocoder model and spectral excitation employed in speech coding, we search for the corresponding parameters needed to achieve effective speech time and pitch scaling. Using the melody information in a MIDI signal, we adjust the singer's pitch and rhythm with the developed time and pitch scaling techniques. Finally, by modifying the pitch and timing of the song, we complete an intelligent karaoke system that automatically corrects the singer's erroneous pitch and rhythm.
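
As a rough illustration of how a MIDI melody could drive the correction described above, the sketch below converts a MIDI note number to its equal-tempered frequency and derives the pitch-scale and time-scale factors that would move a sung note onto the score. The function names, the A4 = 440 Hz tuning, and the example values are assumptions made for illustration only; the thesis's own spectral-parameter method is not reproduced here.

```python
import math

A4_MIDI = 69    # MIDI note number of A4 (standard MIDI convention)
A4_HZ = 440.0   # assumed concert pitch; the thesis may use another tuning


def midi_to_hz(note: int) -> float:
    """Equal-tempered frequency of a MIDI note number."""
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12.0)


def pitch_scale_factor(sung_hz: float, target_note: int) -> float:
    """Ratio by which the sung pitch must be scaled to reach the target note."""
    if sung_hz <= 0:
        raise ValueError("sung_hz must be positive (a voiced frame is expected)")
    return midi_to_hz(target_note) / sung_hz


def cents_off(sung_hz: float, target_note: int) -> float:
    """Signed deviation of the sung pitch from the target note, in cents."""
    if sung_hz <= 0:
        raise ValueError("sung_hz must be positive (a voiced frame is expected)")
    return 1200.0 * math.log2(sung_hz / midi_to_hz(target_note))


def time_scale_factor(sung_seconds: float, score_seconds: float) -> float:
    """Ratio by which a sung note must be stretched to match the score duration."""
    if sung_seconds <= 0:
        raise ValueError("sung_seconds must be positive")
    return score_seconds / sung_seconds


if __name__ == "__main__":
    sung_hz = 425.0   # hypothetical detected pitch of one sung note
    note = 69         # hypothetical target note taken from the MIDI score
    print("pitch-scale factor:", pitch_scale_factor(sung_hz, note))
    print("deviation in cents:", cents_off(sung_hz, note))
    print("time-scale factor:", time_scale_factor(0.40, 0.50))
```

A pitch-scale factor above 1 means the singer is flat and the voice must be raised; a time-scale factor above 1 means the sung note is shorter than the score calls for and must be lengthened.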

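Introduction
1 Background and Motivation
2 Thesis Outline
Speech Transformation Techniques
1 Introduction
2 PICOLA
3 Phase Vocoder
3.1 Analysis Stage
3.2 Synthesis Stage
3.3 Modification Stage
3.4 Pitch Shifting
3.5 Experimental Results
Vocal Tract Model-Based Speech Transformation Techniques
1 Linear Predictive Coding
2 Parameterization of the Speech Vocal Tract Model
2.1 Spectral Envelope Estimation
2.2 Open-Loop Pitch and Fine Pitch Search
2.3 Voiced/Unvoiced (V/UV) Decision
3 Parameter and Speed Adjustment
3.1 Temporal Parameter Interpolation
3.2 Speech Speed Adjustment
4 Parameter Synthesis at the Synthesizer
4.1 Generating an Oversampled Waveform One Pitch Period Long
4.2 Pitch Continuity Check
4.3 Cyclic Extension and Resampling
4.4 Speech Post-Processing
4.5 Smoothing of Transitions Between Voiced and Unvoiced Speech
5 Evaluation of the Performance and Complexity of the Algorithms
Karaoke System with Automatic Off-Key Correction
1 Introduction to the MIDI Format
2 Characteristics of the Singing Voice
3 Methods for Correcting Off-Key Singing
3.1 Pitch Adjustment
3.2 Time Alignment
Conclusions and Future Work
References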

On campus: publicly available from 2009-07-18
Off campus: publicly available from 2009-07-18
