| Graduate Student: | 黃春融 Huang, Chun-Rong |
|---|---|
| Thesis Title: | 應用電腦視覺技術於展場互動之研究 (The Vision-based Interaction on Augmented Exhibition Environments) |
| Advisors: | 陳祝嵩 Chen, Chu-Song; 詹寶珠 Chung, Pau-Choo |
| Degree: | 博士 Doctor (Ph.D.) |
| Department: | 電機資訊學院 - 電機工程學系 Department of Electrical Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication: | 2005 |
| Academic Year of Graduation: | 93 (ROC calendar) |
| Language: | English |
| Number of Pages: | 98 |
| Chinese Keywords: | 相機校正 (camera calibration), 物體辨識 (object recognition) |
| English Keywords: | Camera self-calibration, Object recognition |
While visiting a museum, visitors naturally interact with the exhibition environment by appreciating artifacts from different viewpoints and browsing their related information. How to provide the information a visitor wants in such an interactive environment is therefore an important consideration in exhibition guidance. In this thesis, we apply computer vision techniques to support the interaction between visitors and exhibition environments, aiming to enhance the visitors' sense of immersion and interactivity. Starting from the camera calibration problem in computer vision, we construct an interactive photo-realistic virtual museum system that combines 3D visual tracking with augmented panorama techniques. The system presents highly photo-realistic artifacts in a virtual space, so it is not limited by physical exhibition space, and visitors can appreciate the virtual museum on a computer screen even at home. Unlike a conventional mouse-driven interface, and to further improve the user's sense of immersion, the system uses vision-based camera parameter estimation to let the user control the entire virtual museum with a handheld control cube; it also provides voice and text guides so that users can better understand the historical background and related information of the artifacts. In addition, this thesis studies the object recognition problem in computer vision and applies the developed recognition algorithm to build an interactive exhibition guide system. When a visitor walks through an exhibition room, the camera mounted on the system captures the artifacts the visitor is interested in, recognizes the artifact indicated by the visitor, and displays the artifact's information in an augmented reality view to increase the visitor's sense of immersion. Because artifacts are usually placed statically inside showcases for protection, visitors can only see part of an artifact's appearance; our system therefore further provides a virtual reality display mode in which visitors can freely appreciate an image-based 3D model of the object. By combining the real world, augmented reality, and virtual reality, we hope visitors can gain a brand-new visiting experience and obtain the information they want. Through the interactive photo-realistic virtual museum system and the interactive exhibition guide system developed with computer vision techniques, we hope to offer visitors a visiting experience different from the past while letting them enjoy the fun of interacting with the exhibition environment.
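The virtual-museum interaction described above depends on estimating the pose of the handheld control cube from camera images. Purely as a hedged illustration of one building block such a tracker could use (not necessarily the exact method in the thesis; the function name and parameters are illustrative), the sketch below fits a rotation and translation between corresponding 3D point sets with the classical SVD-based least-squares solution.

```python
import numpy as np

def rigid_transform_3d(src, dst):
    """Illustrative SVD-based least-squares fit of R, t with dst ~= R @ src + t.
    src, dst: (N, 3) arrays of corresponding 3D points (N >= 3, non-degenerate)."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # 3x3 cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                   # guard against a reflection solution
        Vt[2, :] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t
```

Given the cube's model corners and their estimated positions in the camera frame, a rotation and translation recovered this way could drive the viewpoint of the virtual museum directly.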
This thesis discusses how to apply vision-based technologies to make visitors feel that exhibits and exhibition environments are immersive and interactive. We first focus on the camera self-calibration problem and present how to recover camera parameters from uncalibrated images. We then use these camera parameters to develop a tangible photo-realistic virtual museum, which builds a photo-realistic virtual environment for presenting stored collections. To give a museum "visitor" natural interaction and an immersive experience with exhibits in the virtual environment, we employ a vision-based tangible interface in which a handheld 3D physical control cube (PCC) controls the virtual museum. The system also provides text and voice guides for visitors. Besides the virtual museum, we design an interactive museum guide system that helps visitors obtain information about exhibits of interest in a museum. The main problem in this guide system is how to recognize the exhibits seen by visitors. To solve it, we develop a new invariant local feature, the contrast context histogram, which recognizes exhibits regardless of the visitor's position. As a visitor walks through an exhibition room, the guide system continuously captures images of the room. Once the visitor clicks on an exhibit in the image, the system recognizes the selected exhibit with our new object recognition algorithm. After recognition, the visitor can easily learn the name and key information of the clicked exhibit through an augmented reality view. Moreover, visitors can appreciate and interact with an image-based 3D model of the recognized exhibit in a virtual reality view by using a handheld stylus. Through this system, we aim to offer visitors the novel experience of traveling among a real world, an augmented world, and a virtual world during a museum tour. With these two vision-based systems, we present new interaction modes between visitors and exhibition environments, whether real or virtual, and we expect visitors to obtain rich information through an interesting, immersive, and interactive experience.
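The exhibit recognition step rests on matching local descriptors around salient keypoints. As a hedged illustration of the contrast-based idea only (a toy square-grid variant, not the thesis's contrast context histogram formulation; the function name, grid layout, and parameters are assumptions), the sketch below records, for each cell of a small grid around a keypoint, the mean positive and mean negative intensity contrast relative to the centre pixel and stacks them into a normalised vector.

```python
import numpy as np

def contrast_descriptor(img, y, x, radius=9, grid=4):
    """Toy contrast-based descriptor around keypoint (y, x) in a 2D grayscale array.
    Splits the (2*radius+1)^2 patch into grid x grid cells; each cell contributes
    its mean positive and mean negative contrast w.r.t. the centre pixel."""
    centre = float(img[y, x])
    patch = img[y - radius:y + radius + 1, x - radius:x + radius + 1].astype(float)
    contrast = patch - centre
    step = (2 * radius + 1) / grid
    desc = []
    for gy in range(grid):
        for gx in range(grid):
            cell = contrast[int(gy * step):int((gy + 1) * step),
                            int(gx * step):int((gx + 1) * step)]
            pos, neg = cell[cell > 0], cell[cell < 0]
            desc.append(pos.mean() if pos.size else 0.0)   # mean positive contrast
            desc.append(neg.mean() if neg.size else 0.0)   # mean negative contrast
    desc = np.asarray(desc)
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```

Descriptors computed this way for database exhibits and for the clicked region could then be compared with, for example, Euclidean distance and a nearest-neighbour ratio test.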