
Author: Yang, Feng-Pu (楊豐溥)
Thesis title: The Reuse of Crowdsourcing (群眾外包的重用)
Advisor: Jiau, Hewi-jin (焦惠津)
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering
Year of publication: 2013
Academic year of graduation: 101 (ROC calendar)
Language: English
Number of pages: 117
Keywords: Crowdsourcing, reuse, inheritance, evolution, API document, semantics
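  • Crowdsourcing is currently one of the most important forms of human computation. However, motivating people to carry out crowdsourcing tasks remains very expensive. Beyond the cost, crowds are also insufficiently reliable and have low availability. This dissertation confirms that insufficient quality and low availability of crowdsourced work are widespread problems, and that their main cause is a severe imbalance in how crowd effort is distributed (the inequality of crowdsource allocation). Although existing crowdsourcing platforms offer some support for re-use, the improvement they bring is strongly limited by this imbalance. This dissertation proposes a solution based on crowdsourcing reuse that takes advantage of the existing imbalance: the results of crowdsourcing tasks that attract more crowd effort are reused for tasks that attract very little attention. However, the extra overhead that reuse requires may deter the crowd from reusing, so that they continue to crowdsource directly as before. This dissertation therefore automates the reuse process. The automated reuse process is built on two techniques: semantics relaxation patterns and operational semantics. Besides automating the reuse process, another focus of this dissertation is how to make existing crowdsourcing produce more reusable results. Semantics Forge is a framework proposed to increase reusability. It integrates three processes to this end: (1) the semantics accumulation process, (2) the crowdsource allocation monitoring process, and (3) the reuse opportunities exploration process. To verify the practical effect of crowdsourcing reuse, this dissertation carries out a series of case studies in the domain of application programming interface (API) documentation. The lack of API documentation is the main cause of API learning barriers. The case studies show that both semantics relaxation patterns proposed in this dissertation, the inheritance-based semantics relaxation pattern and the evolution-aware semantics relaxation pattern, significantly increase the reusability of crowdsourcing. Finally, IDEA, a prototype that demonstrates the practicality of Semantics Forge, is introduced and used to show the main functions of Semantics Forge.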

    Crowdsourcing is one of the most important human computation practices. However, motivating a crowd to work on crowdsourcing tasks is expensive, and crowds are unreliable in terms of both quality and availability. This dissertation confirmed that insufficient quality and low availability are frequent and universally observed problems, and that the severe inequality of crowdsource allocation is their major cause. Although existing crowdsourcing platforms provide some support for re-use, the improvement is limited by this severe inequality. This dissertation presented a solution based on crowdsourcing reuse, which leverages the inequality by reusing results from the majority (tasks that receive much crowd effort) for the minority (tasks that receive very little). However, the extra effort that crowdsourcing reuse requires might keep the crowd from the benefits of reuse. One feasible way to reduce the impact of this extra effort is to increase the degree of automation of crowdsourcing reuse. The automation was achieved with two techniques: semantics relaxation patterns and operational semantics. Beyond automating crowdsourcing reuse, the dissertation also studied a framework, Semantics Forge, to improve the reusability of existing crowd documents. Semantics Forge integrates three processes of crowdsourcing for reuse: (1) the Semantics Accumulation Process, (2) the Crowdsource Allocation Monitoring, and (3) the Reuse Opportunities Exploration Process. To evaluate the actual effects of crowdsourcing reuse, several empirical studies were conducted in the context of crowd-generated API documents, which are the most promising solution to the important issue of API learning barriers. The studies showed that both the Inheritance-based Semantics Relaxation Pattern and the Evolution-aware Semantics Relaxation Pattern could improve the reuse rate of crowd documents dramatically. Finally, a prototype of Semantics Forge, called IDEA, was presented to demonstrate the feasibility of the major features of Semantics Forge.

    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    1 Introduction
        1.1 Dissertation Structure
    2 Conceptual Framework of Crowdsourcing Reuse
        2.1 Observation
        2.2 Problem
        2.3 Solution
    3 Methodology of Crowdsourcing Reuse
        3.1 Motivation: The Inequality of Crowdsource Allocation
            3.1.1 Pilot Study of Crowdsource Allocation
            3.1.2 Results of the Pilot Study
            3.1.3 Discussion
        3.2 Crowdsourcing with Reuse
            3.2.1 The Reuse Process
            3.2.2 Semantics Relaxation Pattern
            3.2.3 Operational Semantics
            3.2.4 Automating the Reuse Process
        3.3 Crowdsourcing for Reuse
            3.3.1 The Semantics Accumulation Process
            3.3.2 The Crowdsource Allocation Monitoring
            3.3.3 The Reuse Opportunities Exploration Process
            3.3.4 Semantics Forge
    4 Crowdsourcing Reuse for API Documents Shortage
        4.1 The API Learning Barriers and Documents Shortage
        4.2 The Role of Crowdsourcing in API Documents Shortage
            4.2.1 Existing Studies for Applying Crowd Documents for API Learning Barriers
        4.3 Apply Crowdsourcing Reuse Methodology for API Documents Shortage
            4.3.1 Operational Semantics: Primitive Semantics - API Methods
            4.3.2 Operational Semantics: Composite Semantics - API Versions
            4.3.3 Operational Semantics: Improve the Primitive Semantics by the Feedback from Composite Semantics
            4.3.4 The Reuse Opportunities Exploration Process: Inheritance hierarchy of Object-Oriented API
            4.3.5 The Crowdsource Allocation Monitor: Co-evolution between API and Crowd Documents
            4.3.6 Semantics Forge Prototyping: Interest-Driven Semantics Accumulation Environment
            4.3.7 Empirical Study on the Effects of Semantics Relaxation Patterns
    5 Conclusion
        5.1 Summary of Contributions
        5.2 Future Work
    References
    Vita


    Public release: on campus 2015-08-28; off campus 2016-08-28