| Graduate Student: | 紀永昌 Chi, Yung-Chang |
|---|---|
| Thesis Title: | 以深度學習及資料擴增預測新興技術專利侵權風險 Forecasting New Technology Patent Infringement Risks Using Deep Learning and Data Augmentation |
| Advisor: | 王惠嘉 Wang, Hei-Chia |
| Degree: | Doctor |
| Department: | College of Management - Department of Industrial and Information Management |
| Year of Publication: | 2022 |
| Graduation Academic Year: | 110 (ROC calendar; 2021-2022) |
| Language: | English |
| Number of Pages: | 96 |
| Keywords (Chinese): | 深度學習, 資料擴增, 專利風險, 專利侵權 |
| Keywords (English): | Deep Learning, Data Augmentation, Patent Risk, Patent Infringement |
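Technology patents are regarded as the source of emerging technologies. Developing a new patent requires substantial investment, yet the result may turn out to be similar to existing patents, lack the conditions for patentability, and fail patent examination, causing investment losses. On the other hand, determining in advance whether a new product technology infringes an existing patent is an important issue for reducing the risk of damages. To date, patent examination and comparison have been carried out manually, and limits on manpower and time make the examination period very long. Existing research on patent similarity comparison mostly uses text-mining classification algorithms to analyze the likelihood that an application will pass examination, but says little about the likelihood of infringement.
This study attempts to address the problem of assessing patent application and infringement risk from existing patent databases. For each patent application, the study adopts a prediction model built on Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, together with published patent applications and examination results retrieved by keyword search from the United States Patent and Trademark Office (USPTO). After data augmentation and before training, 10% of the approved and rejected applications are randomly drawn as test cases, and the remaining 90% are used to train the prediction model, with the aim of producing a model that can predict patent infringement and examination outcomes. In the experiments, the model's prediction accuracy is at least 87.7% for every category, and the categories of likely rejection reasons for a patent application can be identified.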
Technology patents are considered the source and bedrock of emerging technologies, and they create value for any enterprise. However, obtaining a patent is time-consuming, expensive, and risky, especially if the application is rejected. Developing a new patent requires extensive costs and resources, yet once the technology is fully developed it may turn out to be similar to existing patents. Such an application may lack the relevant patentable features and fail the patent examination, resulting in investment losses. Patent infringement is also an important topic, because identifying it early reduces the risk of legal damages for patent holders, applicants, and manufacturers. Patent examinations have so far been performed manually; because of manpower and time limitations, the examination process is exceedingly long and inefficient. Current research on patent similarity comparison mostly employs text-mining classification algorithms to analyze the likelihood of examination approval, but gives insufficient attention to the likelihood of infringement. If it can be accurately determined in advance whether a new technology or innovation is likely to pass or fail examination (and why), or whether it is at risk of patent infringement, losses can be mitigated.
This research addresses the problem of evaluating patent application and infringement risks using existing patent databases. For each patent application, it uses a prediction model based on Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, together with public utility patent applications and their examination results retrieved by keyword search from the United States Patent and Trademark Office (USPTO). Data augmentation is applied before model training; 10% of the approved and rejected applications are randomly selected as test cases, and the remaining 90% are used to train the prediction model, with the aim of obtaining a model that can predict patent infringement and examination outcomes. In the experiments, the model achieves an accuracy of at least 87.7% for every classification category, and it can be used to identify the category of the reason an application is likely to be rejected.
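The 90%/10% hold-out described above can be illustrated with a minimal sketch. It assumes each application is a `(text, label)` pair labeled `"approved"` or `"rejected"`, and that the 10% is drawn separately from each label so both outcomes appear in the test set. The function name `stratified_holdout`, the toy records, and the fixed seed are hypothetical and are not taken from the thesis, which does not specify its splitting code in this abstract.

```python
import random


def stratified_holdout(records, test_fraction=0.10, seed=0):
    """Split (text, label) records so each label gives ~test_fraction to the test set."""
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the split repeatable

    # Group records by label so approved and rejected cases are sampled separately.
    by_label = {}
    for record in records:
        by_label.setdefault(record[1], []).append(record)

    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        # Hold out at least one case per label.
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical toy data standing in for USPTO applications and their outcomes.
applications = (
    [(f"approved application {i}", "approved") for i in range(50)]
    + [(f"rejected application {i}", "rejected") for i in range(30)]
)

train_set, test_set = stratified_holdout(applications)
print(len(train_set), len(test_set))
```

Sampling within each label is one possible reading of "10% of the approved and rejected applications"; a single random draw over the pooled cases would be an equally valid interpretation of the text.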