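| Graduate student: | 楊豐溥 Yang, Feng-Pu |
|---|---|
| Thesis title: | 群眾外包的重用 The Reuse of Crowdsourcing |
| Advisor: | 焦惠津 Jiau, Hewi-jin |
| Degree: | Doctor |
| Department: | College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering |
| Year of publication: | 2013 |
| Academic year of graduation: | 101 |
| Language: | English |
| Number of pages: | 117 |
| Keywords (Chinese): | 群眾外包 (crowdsourcing), 重用 (reuse), 繼承 (inheritance), 應用程式介面文件 (API document), 語意 (semantics) |
| Keywords (English): | Crowdsourcing, reuse, inheritance, evolution, API document, semantics |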
Crowdsourcing is one of the most important human computation practices. However, motivating a crowd to work on crowdsourcing tasks is expensive, and crowds are also unreliable in terms of quality and availability. This dissertation confirms that insufficient quality and low availability are frequent, universally observed problems, and that the severe inequality of crowdsource allocation is their major cause. Although existing crowdsourcing platforms provide some support for re-use, the improvement is limited by this severe inequality.

This dissertation presents a solution based on crowdsourcing reuse, which leverages the inequality by reusing the results of tasks that attract most of the crowd (the majorities) for tasks that attract very little attention (the minorities). However, the extra effort that reuse requires might deter the crowd from reusing, so that they keep crowdsourcing directly and miss the benefits of reuse. To reduce the impact of this extra effort, the dissertation automates the reuse process. The automation is achieved with two techniques: semantics relaxation patterns and operational semantics. In addition to automating crowdsourcing reuse, a framework called Semantics Forge is studied to improve the reusability of existing crowd documents. Semantics Forge integrates three processes of crowdsourcing for reuse: (1) the Semantics Accumulation Process, (2) the Crowdsource Allocation Monitoring Process, and (3) the Reuse Opportunities Exploration Process.

To evaluate the actual effects of crowdsourcing reuse, several empirical studies were conducted in the context of crowd-generated API documents, which are the most promising remedy for the lack of API documentation, a major cause of API learning barriers. The studies showed that both the Inheritance-based Semantics Relaxation Pattern and the Evolution-aware Semantics Relaxation Pattern could improve the reuse rate of crowd documents dramatically. Finally, a prototype of Semantics Forge, called IDEA, is presented to demonstrate the feasibility of the major features of Semantics Forge.
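
The abstract does not spell out how a semantics relaxation pattern works, so the following is only a minimal sketch of the general idea of reusing from majorities to minorities along an inheritance hierarchy. It assumes that a crowd document written for a method of a superclass can be reused for a subclass that inherits the method without overriding it. The class names, the document identifiers, and the function `find_reusable_docs` are hypothetical and are not taken from the dissertation.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical class hierarchy: class name -> parent class (None for a root).
PARENT: Dict[str, Optional[str]] = {
    "Widget": None,
    "Button": "Widget",
    "ToggleButton": "Button",
}

# Hypothetical crowd documents, keyed by (class, method).
# Popular classes attract many documents; rarely used subclasses attract none.
CROWD_DOCS: Dict[Tuple[str, str], List[str]] = {
    ("Widget", "setVisible"): ["doc-1", "doc-2", "doc-3"],
    ("Button", "setLabel"): ["doc-4"],
}


def find_reusable_docs(cls: str, method: str) -> Tuple[Optional[str], List[str]]:
    """Walk up the inheritance chain until a class with documents for `method` is found.

    Returns (owner_class, documents), or (None, []) when no ancestor has any.
    Assumption: `method` is inherited unchanged, so documents written for the
    ancestor still apply to `cls`.
    """
    current: Optional[str] = cls
    while current is not None:
        docs = CROWD_DOCS.get((current, method))
        if docs:
            return current, docs
        current = PARENT.get(current)
    return None, []


if __name__ == "__main__":
    # ToggleButton has no documents of its own; it borrows from its ancestors.
    print(find_reusable_docs("ToggleButton", "setVisible"))
    print(find_reusable_docs("ToggleButton", "setLabel"))
```

In this sketch the lookup itself is the automated part: the rarely documented subclass gets documents without anyone in the crowd doing extra work. A real system would also have to check whether the subclass overrides the method, which this toy version ignores.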