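Graduate student: Melo Peralta, Tomas (湯義信)
Thesis title: Long sequence transformer encoders for prediction of scientific reproducibility
Advisor: Torbjörn E. M. Nordling (吳馬丁)
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: English
Number of pages: 74
Keywords (translated from Chinese): Reproducibility, Natural language processing, Large language models
Keywords (English): Reproducibility, Natural language processing, Language models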
    Introduction: The US Congress tasked the National Academies of Sciences, Engineering, and Medicine with evaluating the state of reproducibility in science. Reproducing scientific work can be challenging when the methods lack sufficient detail or when the data and code are unavailable or difficult to access. Manual evaluation of a paper’s reproducibility consumes time and resources, so an automated reproducibility assessment method would be beneficial. Our literature review found 10 alternative methods for predicting reproducibility; none of them predicts whether a paper will replicate with more than 75% accuracy. None of these methods uses transformer models, the state-of-the-art architecture for natural language processing tasks.
    Objectives: We aim to test the performance of pre-trained transformer encoder models for reproducibility prediction. Accuracy (percentage of correct predictions) is the metric used to evaluate performance.
    Methods: We collected 158 articles from previous reproducibility projects in the fields of psychology, economics, feature selection, and network inference. Each article has been assessed in a reproducibility study and has a binary reproducibility outcome; only 67 of the 158 articles are labeled as reproducible. We extract the methods section of each article, remove non-letter words, numbers, and citations, and then train four pre-trained long-sequence transformer encoder models (Big Bird, Big Bird Pegasus, Longformer, and Long T5) to predict reproducibility. We evaluate the models with five-fold cross-validation.
    Results: All the pre-trained models consistently overfit the training data (80% to 100%), but only two models surpass the baseline accuracy (∼58%), each on one of its validation sets. The baseline accuracy is the accuracy achieved by predicting the majority class every time. Long T5 and Big Bird achieve ∼69% and ∼65% accuracy, respectively, after training for 200 epochs.
    Conclusion: Long T5 and Big Bird exceed the baseline validation accuracy in only one out of five folds, so we cannot conclude that these models generalize to unseen data. We need to increase the size of our dataset for further testing.
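
    The majority-class baseline and the five-fold split mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis code: the label vector is built only from the counts given above (158 articles, 67 labeled reproducible), the folds are stratified by label alone, and the helper names `clean_methods_text`, `majority_baseline`, and `stratified_folds`, as well as the cleaning patterns, are hypothetical.

```python
import random
import re
from collections import Counter


def clean_methods_text(text):
    """Drop citations, then keep only purely alphabetic words.

    The patterns below are illustrative assumptions, not the
    preprocessing rules used in the thesis.
    """
    # Remove citations such as "(Smith et al., 2015)" or "[12]".
    text = re.sub(r"\([^()]*\d{4}[^()]*\)|\[\d+(?:,\s*\d+)*\]", " ", text)
    # Strip surrounding punctuation and keep letter-only tokens.
    tokens = (w.strip(".,;:") for w in text.split())
    return " ".join(w for w in tokens if w.isalpha())


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def stratified_folds(labels, k=5, seed=0):
    """Assign each index to one of k folds with similar class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == label]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin into folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Labels reconstructed from the stated counts only: 1 = reproducible.
labels = [1] * 67 + [0] * (158 - 67)
folds = stratified_folds(labels)

print(f"baseline accuracy: {majority_baseline(labels):.3f}")
for n, fold in enumerate(folds):
    fold_labels = [labels[i] for i in fold]
    print(n, len(fold), f"{majority_baseline(fold_labels):.3f}")
```

    With 91 of 158 articles in the majority (not reproducible) class, the overall baseline is 91/158, which matches the ∼58% quoted in the results. Each validation fold holds only 31 to 33 articles, so a single prediction shifts fold accuracy by about three percentage points.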

    Table of Contents
    Chinese abstract
    Abstract
    Acknowledgment
    Table of Contents
    List of Tables
    List of Figures
    List of Symbols
    1 Introduction
      1.1 Literature review
      1.2 Transformers for sequence classification
      1.3 Motivation
      1.4 Problem statement
      1.5 Thesis outline
    2 Methods
      2.1 Theory of transformers
      2.2 Practical implementation of transformers
      2.3 Model selection
      2.4 Data collection
      2.5 Data preprocessing
      2.6 Experimental setup
    3 Results and discussion
      3.1 Pre-trained models
      3.2 Dataset stratified by project and label
      3.3 General discussion
    4 Conclusion
      4.1 Future work
    References
    Appendix A Validation accuracy
    Appendix B Results with a different random seed
    Appendix C Transformer decoder

    Aarts, A., Anderson, J., and Anderson, C. (2015). Estimating the reproducibility of psychological science. Science, 349(6251).
    Abeler, J. B., Falk, A., Goette, L., and Huffman, D. (2011). Replication of reference points and effort provision. The American Economic Review, 101:470–492.
    Ackerman, J. M., Nocera, C. C., and Bargh, J. A. (2010). Incidental haptic sensations influence social judgments and decisions. Science, 328(5986):1712–1715.
    Alammar, J. (2018). The illustrated transformer. https://jalammar.github.io/ illustrated-transformer. Accessed: (2023-03-20).
    Albarracín, D., Handley, I. M., Noguchi, K., McCulloch, K. C., Li, H., Leeper, J., Brown, R. D., Earl, A., and Hart, W. P. (2008). Increasing and decreasing motor and cognitive output: A model of general action and inaction goals. Journal of Personality and Social Psychology, 95:510–523.
    Alter, A. L. and Oppenheimer, D. M. (2008). Effects of fluency on psychological distance and mental construal (or why new york is a large city, but new york is a civilized jungle). Psychological Science, 19(2):161–167. PMID: 18271864.
    Alter, A. L., Oppenheimer, D. M., Epley, N., and Eyre, R. N. (2007). Overcoming intuition: Metacognitive difficulty activates analytic reasoning. Journal of Experimental Psychology: General, 136:569–576.
    Altmejd, A., Dreber, A., Forsell, E., Huber, J., Imai, T., Johannesson, M., Kirchler, M., Nave, G., and Camerer, C. (2019). Predicting the replicability of social science lab experiments. PloS one, 14(12):e0225826.
    Ambrus, A. and Greiner, B. (2012). Imperfect public monitoring with costly punishment: An experimental study. American Economic Review, 102:3317–3332.
    Amodio, D. M., Devine, P. G., and Harmon-Jones, E. (2008). Individual differences in the regulation of intergroup bias: The role of conflict monitoring and neural signals for control. Journal of Personality and Social Psychology, 94:60–74.
    Anderson, C., Kraus, M. W., Galinsky, A. D., and Keltner, D. (2012). The local-ladder effect: Social status and subjective well-being. Psychological Science, 23:764–771.
    Aviezer, H., Yaacov, T., and Todorov, A. (2012). Body cues, not facial expressions, discriminate between intense positive and negative emotions. Science, 338:1225–1229.
    Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604):452–454.
    Balafoutas, L. and Sutter, M. (2012). Affirmative action policies promote women and do not harm efficiency in the laboratory. Science, 335(6068):579–582.
    Bartling, B., Fehr, E., and Schmidt, K. M. (2012). Screening, competition, and job design: Economic origins of good jobs. American Economic Review, 102:834–864.
    Bauer, M. A., Wilkie, J. E., Kim, J. K., and Bodenhausen, G. V. (2012). Cuing consumerism: Situational materialism undermines personal and social well-being. PsychologicalScience, 23:517–523.
    Beaman, C. P., Neath, I., and Surprenant, A. M. (2008). Modeling distributions of immediate memory effects: No strategies needed? Journal of Experimental Psychology: Learning Memory and Cognition, 34:219–229.
    Beltagy, I., Peters, M. E., and Cohan, A. (2020). Longformer: The long-document transformer. arXiv:2004.05150.
    Belz, A., Popović, M., and Mille, S. (2022). Quantified reproducibility assessment of nlp results.
    Berry, C. J., Shanks, D. R., and Henson, R. N. (2008). A single-system account of the relationship between priming, recognition, and fluency. Journal of Experimental Psychology: Learning Memory and Cognition, 34:97–111.
    Blankenship, K. L. and Wegener, D. T. (2008). Opening the mind to close it: Considering a message in light of important values increases message processing and later resistance to change. Journal of Personality and Social Psychology, 94:196–213.
    Boroditsky, L. (2000). Metaphoric structuring: understanding time through spatial metaphors. Cognition, 75(1):1–28.
    Bressan, P. and Stranieri, D. (2008). The best men are (not always) already taken: Female preference for single versus attached males depends on conception risk: Research article. Psychological Science, 19:145–151.
    Cacioppo, J. T., Petty, R. E., and Morris, K. J. (1983). Effects of need for cognition on message evaluation, recall, and persuasion. Journal of Personality and Social Psychology, 45:805–818.
    Camerer, C. F., Dreber, A., Forsell, E., Ho, T.-H., Huber, J., Johannesson, M., Kirchler, M., Almenberg, J., Altmejd, A., Chan, T., et al. (2016). Evaluating replicability of laboratory experiments in economics. Science, 351(6280):1433–1436.
    Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T.-H., Huber, J., Johannesson, M., Kirchler, M., Nave, G., Nosek, B. A., Pfeiffer, T., et al. (2018). Evaluating the replicability of social science experiments in nature and science between 2010 and 2015. Nature Human Behaviour, 2(9):637–644.
    Carter, T. J., Ferguson, M. J., and Hassin, R. R. (2011). A single exposure to the american flag shifts support toward republicanism up to 8 months later. Source: Psychological Science, 22:1011–1018.
    Caruso, E. M., Vohs, K. D., Baxter, B., and Waytz, A. (2013). Mere exposure to money increases endorsement of free-market systems and social inequality. Journal of Experimental Psychology: General, 142:301–306.
    Centerbar, D. B., Schnall, S., Clore, G. L., and Garvin, E. D. (2008). Affective incoherence: When affective concepts and embodied reactions clash. Journal of Personality and Social Psychology, 94:560–578.
    Charness, G. and Dufwenberg, M. (2011). Participation. American Economic Review, 101:1211–1237.
    Chen, R. and Chen, Y. (2011). The potential of social identity for equilibrium selection. American Economic Review, 101:2562–2589.
    Clippel, G. D., Eliaz, K., and Knight, B. (2014). On the selection of arbitrators. American Economic Review, 104:3434–3458.
    Code, P. W. (2023). Machine learning datasets | papers with code. https:// paperswithcode.com/datasets?mod=texts&task=text-classification. [Online; last accessed 16-Arpil-2023].
    Colzato, L. S., Bajo, M. T., van den Wildenberg, W., Paolieri, D., Nieuwenhuis, S., Heij, W. L., and Hommel, B. (2008). How does bilingualism improve executive control? a comparison of active and reactive inhibition mechanisms. Journal of Experimental Psychology: Learning Memory and Cognition, 34:302–312.
    Cox, C. R., Arndt, J., Pyszczynski, T., Greenberg, J., Abdollahi, A., and Solomon, S. (2008). Terror management and adults’ attachment to their parents: The safe haven remains. Journal of Personality and Social Psychology, 94:696–717.
    Critcher, C. R. and Gilovich, T. (2008). Incidental environmental anchors. Journal of Behavioral Decision Making, 21:241–251.
    Derex, M., Beugin, M. P., Godelle, B., and Raymond, M. (2013). Experimental evidence for the influence of group size on cultural complexity. Nature, 503:389–391.
    Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
    Dijksterhuis, A. and Knippenberg, A. V. (1998). The relation between perception and behavior, or how to win a game of trivial pursuit. Journal of Personality and Social Psychology, 74:865–877.
    Dodson, C. S., Darragh, J., and Williams, A. (2008). Stereotypes and retrieval-provoked illusory source recollections. Journal of Experimental Psychology: Learning Memory and Cognition, 34:460–477.
    Dreber, A., Pfeiffer, T., Almenberg, J., Isaksson, S., Wilson, B., Chen, Y., Nosek, B. A., and Johannesson, M. (2015). Using prediction markets to estimate the reproducibility of scientific research. Proceedings of the National Academy of Sciences, 112(50):15343– 15347.
    Duffy, J. and Puzzello, D. (2014). Gift exchange versus monetary exchange: Theory and evidence. American Economic Review, 104:1735–1776.
    Dulleck, U., Kerschbamer, R., and Sutter, M. (2011). The economics of credence goods: An experiment on the role of liability, verifiability, reputation, and competition. American Economic Review, 101:526–555.
    Duncan, K., Sadanand, A., and Davachi, L. (2012). Memory’s penumbra: Episodic memory decisions induce lingering mnemonic biases. Science, 337:485–487.
    Eastwick, P. W. and Finkel, E. J. (2008). Sex differences in mate preferences revisited: Do people know what they initially desire in a romantic partner? Journal of Personality and Social Psychology, 94:245–264.
    Ebersole, C. R., Atherton, O. E., Belanger, A. L., Skulborstad, H. M., Allen, J. M., Banks, J. B., Baranski, E., Bernstein, M. J., Bonfiglio, D. B., Boucher, L., et al. (2016). Many labs 3: Evaluating participant pool quality across the academic semester via replication. Journal of Experimental Social Psychology, 67:68–82.
    Epley, N., Akalis, S., Waytz, A., and Cacioppo, J. T. (2008). Creating social connection through inferential reproduction: Loneliness and perceived agency in gadgets, gods, and hreyhounds: Research article. Psychological Science, 19:114–120.
    Ericson, K. M. M. and Fuster, A. (2011). Expectations as endowments: Evidence on reference-dependent preferences from exchange and valuation experiments. Quarterly Journal of Economics, 126:1879–1907.
    Ersner-Hershfield, H., Mikels, J. A., Sullivan, S. J., and Carstensen, L. L. (2008). Poignancy: Mixed emotional experience in the face of meaningful endings. Journal of Personality and Social Psychology, 94:158–167.
    Estes, Z., Verges, M., and Barsalou, L. W. (2008). Head up, foot down: Object words orient attention to the objects’ typical location: Research report. Psychological Science, 19:93– 97.
    Exline, J. J., Baumeister, R. F., Zell, A. L., Kraft, A. J., and Witvliet, C. V. (2008). Not so innocent: Does seeing one’s own capability for wrongdoing predict forgiveness? Journal of Personality and Social Psychology, 94:495–515.
    Farrell, S. (2008). Multiple roles for time in short-term memory: Evidence from serial recall of order and timing. Journal of Experimental Psychology: Learning Memory and Cognition, 34:128–145.
    Farris, C., Treat, T. A., Viken, R. J., and McFall, R. M. (2008). Perceptual mechanisms that characterize gender differences in decoding women’s sexual intent: Research article. Psychological Science, 19:348–354.
    Fehr, E., Herz, H., and Wilkening, T. (2013). The lure of authority: Motivation and incentive effects of power. American Economic Review, 103:1325–1359.
    Finkel, E. J., Rusbult, C. E., Kumashiro, M., and Hannon, P. A. (2002). Dealing with betrayal in close relationships: Does commitment promote forgiveness? Journal of Personality and Social Psychology, 82:956–974.
    Friedman, D. and Oprea, R. (2012). A continuous dilemma. American Economic Review, 102:337–363.
    Fruyt, F. D., Wiele, L. V. D., and Heeringen, C. V. (2000). Cloninger’s psychobiological model of temperament and character and the five-factor model of personality. Personality and Individual Differences, 29:441–452.
    Fudenberg, D., Rand, D. G., and Dreber, A. (2012). Slow to anger and fast to forgive: Cooperation in an uncertain world. American Economic Review, 102:720–749.
    Förster, J., Liberman, N., and Kuschel, S. (2008). The effect of global versus local processing styles on assimilation versus contrast in social judgment. Journal of Personality and Social Psychology, 94:579–599.
    Galinsky, A. D., Magee, J. C., Inesi, M. E., and Gruenfeld, D. H. (2006). Power and perspectives not taken. Psychological Science, 17(12):1068–1074. PMID: 17201789.
    Gervais, W. M. and Norenzayan, A. (2012). Analytic thinking promotes religious disbelief. Science, 336(6080):493–496.
    Giessner, S. R. and Schubert, T. W. (2007). High in the hierarchy: How vertical location and judgments of leaders’ power are interrelated. Organizational Behavior and Human Decision Processes, 104:30–44.
    Gneezy, U., Keenan, E. A., and Gneezy, A. (2014). Avoiding overhead aversion in charity. Science, 346(6209):632–635.
    Graham, J., Haidt, J., and Nosek, B. A. (2009). Liberals and conservatives rely on different sets of moral foundations. Journal of Personality and Social Psychology, 96:1029–1046.
    Gray, K. and Wegner, D. M. (2009). Moral typecasting: Divergent perceptions of moral agents and moral patients. Journal of Personality and Social Psychology, 96:505–520.
    Gundersen, O. E. and Kjensmo, S. (2018). State of the art: Reproducibility in artificial intelligence. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1).
    Guo, M., Ainslie, J., Uthus, D., Ontañón, S., Ni, J., Sung, Y.-H., and Yang, Y. (2022). LongT5: Efficient text-to-text transformer for long sequences. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 724–736.
    Guo, S., Jiang, Q., Chen, L., and Guo, D. (2016). Gene regulatory network inference using pls-based methods. BMC bioinformatics, 17(1):545.
    Halevy, N., Bornstein, G., and Sagiv, L. (2008). “in-group love"and “out-group hate"as motives for individual participation in intergroup conflict: A new game paradigm. Psychological Science, 19(4):405–411. PMID: 18399895.
    Hancer, E., Xue, B., Karaboga, D., and Zhang, M. (2015). A binary abc algorithm based on advanced similarity scheme for feature selection. Applied Soft Computing, 36:334–348.
    Harmon-Jones, E., Harmon-Jones, C., Fearn, M., Sigelman, J. D., and Johnson, P. (2008). Left frontal cortical activation and spreading of alternatives: Tests of the action-based model of dissonance. Journal of Personality and Social Psychology, 94:1–15.
    Hart, W. and Albarracín, D. (2011). Learning about what others were doing: Verb aspect and attributions of mundane and criminal intent for past actions. Psychological Science, 22:261–266.
    Hauser, M., Cushman, F., Young, L., Kang-Xing Jin, R., and Mikhail, J. (2007). A dissociation between moral judgments and justifications. Mind & Language, 22(1):1–21.
    Hauser, O. P., Rand, D. G., Peysakhovich, A., and Nowak, M. A. (2014). Cooperating with the future. Nature, 511:220–223.
    Henderson, M. D., de Liver, Y., and Gollwitzer, P. M. (2008). The effects of an implemental mind-set on attitude strength. Journal of Personality and Social Psychology, 94:396–411.
    Hippler, H., Deutsch, B., and Strack, F. (1985). Response scales: Effects of category range on reported behavior and comparative judgments. Public Opinion Quarterly, 49:388–395.
    Hsee, C. K. (1998). Less is better: When low-value options are valued more highly than high-value options. Ltd. Journal of Behavioral Decision Making, 11:107–121.
    Huang, J.-H., Yan, J., Wu, Q.-H., Ferro, M. D., Yi, L.-Z., Lu, H.-M., Xu, Q.-S., and Liang, Y.-Z. (2013). Selective of informative metabolites using random forests based on model population analysis. Talanta, 117:549–555.
    Huang, Y., Tse, C. S., and Cho, K. W. (2014). Living in the north is not necessarily favorable: Different metaphoric associations between cardinal direction and valence in hong kong and in the united states. European Journal of Social Psychology, 44:360–369.
    Huck, S., Seltzer, A. J., and Wallace, B. (2011). Deferred compensation in multiperiod labor contracts: An experimental test of lazear’s model. American Economic Review, 101:819– 843.
    Husnu, S. and Crisp, R. J. (2010). Elaboration enhances the imagined contact effect. Journal of Experimental Social Psychology, 46:943–950.
    Ifcher, J. and Zarghamee, H. (2011). Happiness and time preference: The effect of positive affect in a random-assignment experiment. American Economic Review, 101:3109–3129.
    Imangaliyev, S., Keijser, B., Crielaard, W., and Tsivtsivadze, E. (2015). Personalized microbial network inference via co-regularized spectral clustering. Methods, 83:28–35.
    Inbar, Y., Pizarro, D. A., Knobe, J., and Bloom, P. (2009). Disgust sensitivity predicts intuitive disapproval of gays. Emotion, 9:435–439.
    Jacowitz, K. E. and Kahneman, D. (1995). Measures of anchoring in estimation tasks. Personality and Social Psychology Bulletin, 21(11):1161–1166.
    Janssen, M. A., Holahan, R., Lee, A., and Ostrom, E. (2010). Lab experiments for the study of social-ecological systems. Science, 328(5978):613–617.
    Janssen, N., Alario, F.-X., and Caramazza, A. (2008a). A word-order constraint on phonological activation. Psychological Science, 19(3):216–220. PMID: 18315791.
    Janssen, N., Schirm, W., Mahon, B. Z., and Caramazza, A. (2008b). Semantic interference in a delayed naming task: Evidence for the response exclusion hypothesis. Journal of Experimental Psychology: Learning Memory and Cognition, 34:249–256.
    Ji, Z.-H., Zhang, L., Tang, D.-M., Chen, C.-M., Nordling, T. E. M., Zhang, Z.-D., Ren, C.-L., Da, B., Li, X., Guo, S.-Y., et al. (2021). High-throughput screening and machine learning for the efficient growth of high-quality single-wall carbon nanotubes. Nano Research, pages 1–6.
    Jostmann, N. B., Lakens, D., and Schubert, T. W. (2009). Weight as an embodiment of importance. Psychological Science, 20(9):1169–1174. PMID: 19686292.
    Karpicke, J. D. and Blunt, J. R. (2011). Retrieval practice produces more learning than elaborative studying with concept mapping. Science, 331:768–772.
    Kay, A. C., Laurin, K., Fitzsimons, G. M., and Landau, M. J. (2014). A functional basis for structure-seeking: Exposure to structure promotes willingness to engage in motivated action. Journal of Experimental Psychology: General, 143:486–491.
    Kessler, J. B. and Roth, A. E. (2012). Organ allocation policy and the decision to donate. American Economic Review, 102:2018–2047.
    Kidd, D. C. and Castano, E. (2013). Reading literary fiction improves theory of mind. Science, 342(6156):377–380.
    Kirchler, M., Huber, J., and Stockl, T. (2012). Thar she bursts: Reducing confusion reduces bubbles. American Economic Review, 102:865–883.
    Klein, R., Ratliff, K., Vianello, M., Adams Jr, R., Bahník, S., Bernstein, M., Bocian, K., Brandt, M., Brooks, B., Brumbaugh, C., et al. (2014). Data from investigating variation in replicability: A “many labs” replication project. Journal of Open Psychology Data, 2(1).
    Klein, R. A., Vianello, M., Hasselman, F., Adams, B. G., Adams Jr, R. B., Alper, S., Aveyard, M., Axt, J. R., Babalola, M. T., Bahník, Š., et al. (2018). Many labs 2: Investigating variation in replicability across samples and settings. Advances in Methods and Practices in Psychological Science, 1(4):443–490.
    Kogan, S., Kwasnica, A. M., and Weber, R. A. (2011). Coordination in the presence of asset markets. American Economic Review, 101:927–947.
    Koo, M. and Fishbach, A. (2008). Dynamics of self-regulation: How (un)accomplished goal actions affect motivation. Journal of Personality and Social Psychology, 94:183–195.
    Kuziemko, I., Buell, R. W., Reich, T., and Norton, M. I. (2014). Last-place aversion: Evidence and redistributive implications. Quarterly Journal of Economics, 129:105–149.
    Lange, P. A. V., Bruin, E. M. D., Otten, W., and Joireman, J. A. (1997). Development of prosocial, individualistic, and competitive orientations: Theory and preliminary evidence. Journal of Personality and Social Psychology, 73:733–746.
    Lau, G. P., Kay, A. C., and Spencer, S. J. (2008). Loving those who justify inequality: The effects of system threat on attraction to women who embody benevolent sexist ideals. Psychological Science, 19(1):20–21. PMID: 18181786.
    Lee, S. W. and Schwarz, N. (2010). Washing away postdecisional dissonance. Science, 328:709.
    Lemay, E. P. and Clark, M. S. (2008a). How the head liberates the heart: Projection of communal responsiveness guides relationship promotion. Journal of Personality and Social Psychology, 94:647–671.
    Lemay, E. P. and Clark, M. S. (2008b). ”walking on eggshells”: How expressing relationship insecurities perpetuates them. Journal of Personality and Social Psychology, 95:420–441.
    Liefooghe, B., Barrouillet, P., Vandierendonck, A., and Camos, V. (2008). Working memory costs of task switching. Journal of Experimental Psychology: Learning Memory and Cognition, 34:478–494.
    LoBue, V. and DeLoache, J. S. (2008). Detecting the snake in the grass: Attention to fearrelevant stimuli by adults and young children: Research article. Psychological Science, 19:284–289.
    Luo, T., Li, X., Wang, H., and Liu, Y. (2020). Research replication prediction using weakly supervised learning. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1464–1474, Online. Association for Computational Linguistics.
    Luo, T., Meng, R., Wang, X. E., and Liu, Y. (2022). Interpretable research replication prediction via variational contextual consistency sentence masking.
    Makovski, T., Sussman, R., and Jiang, Y. V. (2008). Orienting attention in visual working memory reduces interference from memory probes. Journal of Experimental Psychology: Learning Memory and Cognition, 34:369–380.
    Marsh, J. E., Vachon, F., and Jones, D. M. (2008). When does between-sequence phonological similarity promote irrelevant sound disruption? Journal of Experimental Psychology: Learning Memory and Cognition, 34:243–248.
    Masicampo, E. and Baumeister, R. F. (2008). Toward a physiology of dual-process reasoning and judgment: Lemonade, willpower, and expensive rule-based analysis. Psychological Science, 19(3):255–260. PMID: 18315798.
    Mazar, N. and Ariely, D. (2008). The dishonesty of honest people: A theory of self-concept maintenance. Source: Journal of Marketing Research, 45:633–644.
    McIntosh, L., Juehne, A., Vitale, C., Liu, X., Alcoser, R., Lukas, J., and Evanoff, B. (2017). Repeat: A framework to assess empirical reproducibility in biomedical research. BMC Medical Research Methodology, 17.
    Mei, Y., Zhang, M., and Nyugen, S. (2016). Feature selection in evolving job shop dispatching rules with genetic programming. In Proceedings of the Genetic and Evolutionary Computation Conference 2016, pages 365–372.
    Melo Peralta, T. and Nordling, T. E. M. (2022). A literature review of methods for assessment of reproducibility in science. Research Square preprint.
    Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space.
    Mikolov, T., Grave, E., Bojanowski, P., Puhrsch, C., and Joulin, A. (2018). Advances in pretraining distributed word representations. In Proceedings of the International Conference on Language Resources and Evaluation (LREC 2018).
    Mirman, D. and Magnuson, J. S. (2008). Attractor dynamics and semantic neighborhood density: Processing is slowed by near neighbors and speeded by distant neighbors. Journal of Experimental Psychology: Learning Memory and Cognition, 34:65–79.
    Mitchell, C., Nash, S., and Hall, G. (2008). The intermixed-blocked effect in human perceptual learning is not the consequence of trial spacing. Journal of Experimental Psychology: Learning Memory and Cognition, 34:237–242.
    Miyamoto, Y. and Kitayama, S. (2002). Cultural variation in correspondence bias: The critical role of attitude diagnosticity of socially constrained behavior. Journal of Personality and Social Psychology, 83:1239–1248.
    Monin, B. and Miller, D. T. (2001). Moral credentials and the expression of prejudice. Journal of Personality and Social Psychology, 81:33–43.
    Morewedge, C. K., Huh, E. Y., and Vosgerau, J. (2010). Thought for food: Imagined consumption reduces actual consumption. Science, 330:1030–1033.
    Murray, S. L., Derrick, J. L., Leder, S., and Holmes, J. G. (2008). Balancing connectedness and self-protection goals in close relationships: A levels-of-processing perspective on risk regulation. Journal of Personality and Social Psychology, 94:429–459.
    Nairne, J. S., Pandeirada, J. N., and Thompson, S. R. (2008). Adaptive memory: The comparative value of survival processing. Psychological Science, 19(2):176–180. PMID: 18271866.
    Nishi, A., Shirado, H., Rand, D. G., and Christakis, N. A. (2015). Inequality and visibility of wealth in experimental social networks. Nature, 526:426–429.
    Norenzayan, A., Smith, E. E., Kim, B. J., and Nisbett, R. E. (2002). Cultural preferences for formal versus intuitive reasoning. Cognitive Science, 26:653–684.
    Nosek, B., Banaji, M., and Greenwald, A. (2002). Math = male, me = female, therefore math ≠ me. Journal of personality and social psychology, 83:44–59.
    Nurmsoo, E. and Bloom, P. (2008). Preschoolers’ perspective taking in word learning: Do they blindly follow eye gaze?: Research report. Psychological Science, 19:211–215.
    Oberauer, K. (2008). How to say no: Single- and dual-process theories of short-term recognition tested on negative probes. Journal of Experimental Psychology: Learning Memory and Cognition, 34:439–459.
    of Sciences Engineering, N. A. and Medicine (2019). Reproducibility and replicability in science. National Academies Press, Washington, DC.
    OpenAI (2022). Introducing chatgpt. https://openai.com/blog/chatgpt. [Online; accessed 04-Arpil-2023].
    Oppenheimer, D. M., Meyvis, T., and Davidenko, N. (2009). Instructional manipulation checks: Detecting satisficing to increase statistical power. Journal of Experimental Social Psychology, 45:867–872.
    Oppenheimer, D. M. and Monin, B. (2009). The retrospective gambler’s fallacy: Unlikely events, constructing the past, and multiple universes. Judgment and Decision Making, 4:326–334.
    Pacton, S. and Perruchet, P. (2008). An attention-based associative account of adjacent and nonadjacent dependency learning. Journal of Experimental Psychology: Learning Memory and Cognition, 34:80–96.
    Palmer, S. E. and Ghose, T. (2008). Extremal edge: A powerful cue to depth perception and figure-ground organization. Psychological Science, 19(1):77–83. PMID: 18181795.
    Payne, B. K., Burkley, M. A., and Stokes, M. B. (2008). Why do implicit and explicit attitude tests diverge? the role of structural fit. Journal of Personality and Social Psychology, 94:16–31.
    Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830.
    Pennington, J., Socher, R., and Manning, C. D. (2014). Glove: Global vectors for word representation. In Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543.
    Pleskac, T. J. (2008). Decision making and learning while taking sequential risks. Journal of Experimental Psychology: Learning Memory and Cognition, 34:167–185.
    Purdie-Vaughns, V., Steele, C. M., Davies, P. G., Ditlmann, R., and Crosby, J. R. (2008). Social identity contingencies: How diversity cues signal threat or safety for african americans in mainstream institutions. Journal of Personality and Social Psychology, 94:615–630.
    Pyc, M. A. and Rawson, K. A. (2010). Why testing improves memory: Mediator effectiveness hypothesis. Science, 330:335.
    Ramirez, G. and Beilock, S. L. (2011). Writing about testing worries boosts exam performance in the classroom. Science, 331:211–213.
    Rand, D. G., Greene, J. D., and Nowak, M. A. (2012). Spontaneous giving and calculated greed. Nature, 489:427–430.
    Reynolds, M. and Besner, D. (2008). Contextual effects on reading aloud: Evidence for pathway control. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34:50–64.
    Richeson, J. A. and Trawalter, S. (2008). The threat of appearing prejudiced and race-based attentional biases. Psychological Science, 19(2):98–102. PMID: 18271854.
    Ross, L., Greene, D., and House, P. (1977). The "false consensus effect": An egocentric bias in social perception and attribution processes. Journal of Experimental Social Psychology, 13(3):279–301.
    Ross, M. and Wilson, A. E. (2002). It feels like yesterday: Self-esteem, valence of personal past experiences, and judgments of subjective distance. Journal of Personality and Social Psychology, 82:792–803.
    Rottenstreich, Y. and Hsee, C. K. (2001). Money, kisses, and electric shocks: On the affective psychology of risk. Psychological Science, 12.
    Roy, D., Murty, K. S. R., and Mohan, C. K. (2015). Feature selection using deep neural networks. In 2015 International Joint Conference on Neural Networks (IJCNN), pages 1–6. IEEE.
    Rule, N. O. and Ambady, N. (2008). The face of success: Inferences from chief executive officers’ appearance predict company profits. Psychological Science, 19(2):109–111. PMID: 18271856.
    Sahakyan, L., Delaney, P. F., and Waldum, E. R. (2008). Intentional forgetting is easier after two "shots" than one. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34:408–414.
    Savani, K., Markus, H. R., Naidu, N. V., Kumar, S., and Berlia, N. (2010). What counts as a choice? U.S. Americans are more likely than Indians to construe actions as choices. Psychological Science, 21:391–398.
    Schmidt, J. R. and Besner, D. (2008). The Stroop effect: Why proportion congruent has nothing to do with congruency and everything to do with contingency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34:514–523.
    Schnall, S., Benton, J., and Harvey, S. (2008). With a clean conscience: Cleanliness reduces the severity of moral judgments. Psychological Science, 19(12):1219–1222. PMID: 19121126.
    Schooler, J. W. and Engstler-Schooler, T. Y. (1990). Verbal overshadowing of visual memories: Some things are better left unsaid. Cognitive Psychology, 22(1):36–71.
    Schwarz, N., Strack, F., and Mai, H.-P. (1991). Assimilation and contrast effects in part-whole question sequences: A conversational logic analysis. Public Opinion Quarterly, 55(1):3–23.
    Shafir, E. (1993). Choosing versus rejecting: Why some options are both better and worse than others. Memory & Cognition, 21:546–556.
    Shah, A. K., Mullainathan, S., and Shafir, E. (2012). Some consequences of having too little. Science, 338(6107):682–685.
    Shnabel, N. and Nadler, A. (2008). A needs-based model of reconciliation: Satisfying the differential emotional needs of victim and perpetrator as a key to promoting reconciliation. Journal of Personality and Social Psychology, 94:116–132.
    Simons, D. J., Holcombe, A. O., and Spellman, B. A. (2014). An introduction to Registered Replication Reports at Perspectives on Psychological Science. Perspectives on Psychological Science, 9(5):552–555. PMID: 26186757.
    Soto, C. J., John, O. P., Gosling, S. D., and Potter, J. (2008). The developmental psychometrics of Big Five self-reports: Acquiescence, factor structure, coherence, and differentiation from ages 10 to 20. Journal of Personality and Social Psychology, 94:718–737.
    Sparrow, B., Liu, J., and Wegner, D. M. (2011). Google effects on memory: Cognitive consequences of having information at our fingertips. Science, 333(6043):776–778.
    Sripada, C., Kessler, D., and Jonides, J. (2014). Methylphenidate blocks effort-induced depletion of regulatory control in healthy volunteers. Psychological Science, 25:1227–1234.
    Srull, T. K. and Wyer, R. S. (1979). The role of category accessibility in the interpretation of information about persons: Some determinants and implications. Journal of Personality and Social Psychology, 37:1660–1672.
    Stagge, J. H., Rosenberg, D. E., Abdallah, A. M., Akbar, H., Attallah, N. A., and James, R. (2019). Assessing data availability and research reproducibility in hydrology and water resources. Scientific Data, 6(1):190030.
    Stanovich, K. E. and West, R. F. (2008). On the relative independence of thinking biases and cognitive ability. Journal of Personality and Social Psychology, 94:672–695.
    Storm, B. C., Bjork, E. L., and Bjork, R. A. (2008). Accelerated relearning after retrieval-induced forgetting: The benefit of being forgotten. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34:230–236.
    Strack, F., Martin, L. L., and Stepper, S. (1988). Inhibiting and facilitating conditions of the human smile: A nonobtrusive test of the facial feedback hypothesis. Journal of Personality and Social Psychology, 54:768–777.
    Szymkow, A., Chandler, J., IJzerman, H., Parzuchowski, M., and Wojciszke, B. (2013). Warmer hearts, warmer rooms: How positive communal traits increase estimates of ambient temperature. volume 44, pages 167–176.
    Tamir, M., Mitchell, C., and Gross, J. J. (2008). Hedonic and instrumental motives in anger regulation. Psychological Science, 19(4):324–328. PMID: 18399883.
    Taşkın, G., Kaya, H., and Bruzzone, L. (2017). Feature selection based on high dimensional model representation for hyperspectral images. IEEE Transactions on Image Processing, 26(6):2918–2928.
    Tjärnberg, A., Morgan, D. C., Studham, M., Nordling, T. E. M., and Sonnhammer, E. L. L. (2017). GeneSPIDER–gene regulatory network inference benchmarking with controlled network and data properties. Molecular BioSystems, 13(7):1304–1312.
    Tracy, J. L. and Robins, R. W. (2008). The nonverbal expression of pride: Evidence for cross-cultural recognition. Journal of Personality and Social Psychology, 94:516–530.
    Tversky, A. and Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5:207–232.
    Ud-Dean, S. M. and Gunawan, R. (2016). Optimal design of gene knockout experiments for gene regulatory network inference. Bioinformatics, 32(6):875–883.
    van Dijk, E., van Kleef, G. A., Steinel, W., and van Beest, I. (2008). A social functional approach to emotions in bargaining: When communicating anger pays and when it backfires. Journal of Personality and Social Psychology, 94:600–614.
    Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems, 30:5999–6009.
    Villaverde, A. F., Becker, K., and Banga, J. R. (2017). PREMER: a tool to infer biological networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 15(4):1193–1202.
    Vohs, K. D. and Schooler, J. W. (2008). The value of believing in free will: Encouraging a belief in determinism increases cheating. Psychological Science, 19(1):49–54. PMID: 18181791.
    Vul, E., Nieuwenstein, M., and Kanwisher, N. (2008). Temporal selection is suppressed, delayed, and diffused during the attentional blink. Psychological Science, 19(1):55–61. PMID: 18181792.
    Williams, L. E. and Bargh, J. A. (2008). Keeping one’s distance: The influence of spatial distance cues on affect and evaluation. Psychological Science, 19(3):302–308. PMID: 18315805.
    Wilson, T. D., Reinhard, D. A., Westgate, E. C., and Gilbert, D. T. (2014). Just think: The challenges of the disengaged mind. Science, 345:72–75.
    Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Davison, J., Shleifer, S., von Platen, P., Ma, C., Jernite, Y., Plu, J., Xu, C., Scao, T. L., Gugger, S., Drame, M., Lhoest, Q., and Rush, A. M. (2020). Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online. Association for Computational Linguistics.
    Wu, J., Nivargi, R., Lanka, S. S. T., Menon, A. M., Modukuri, S. A., Nakshatri, N., Wei, X., Wang, Z., Caverlee, J., Rajtmajer, S. M., and Giles, C. L. (2021). Predicting the reproducibility of social and behavioral science papers using supervised learning models.
    Yang, Y., Youyou, W., and Uzzi, B. (2020). Estimating the deep replicability of scientific findings using human and artificial intelligence. Proceedings of the National Academy of Sciences, 117(20):10762–10768.
    Yap, M. J., Balota, D. A., Tse, C. S., and Besner, D. (2008). On the additive effects of stimulus quality and word frequency in lexical decision: Evidence for opposing interactive influences revealed by RT distributional analyses. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34:495–513.
    Zaheer, M., Guruganesh, G., Dubey, K. A., Ainslie, J., Alberti, C., Ontanon, S., Pham, P., Ravula, A., Wang, Q., Yang, L., et al. (2020). Big Bird: Transformers for longer sequences. Advances in Neural Information Processing Systems, 33.
    Zaval, L., Keenan, E. A., Johnson, E. J., and Weber, E. U. (2014). How warm days increase belief in global warming. Nature Climate Change, 4:143–147.
    Zhong, C. B. and Liljenquist, K. (2006). Washing away your sins: Threatened morality and physical cleansing. Science, 313:1451–1452.
    Kovács, Á. M., Téglás, E., and Endress, A. D. (2010). The social sense: Susceptibility to others' beliefs in human infants and adults. Science, 330:1830–1833.

    Full-text availability: on campus, open immediately; off campus, open immediately.