A Comparison of Pre-trained Models for Extractive Summarization of Mobile App User Reviews
Subject area: Natural Language Processing
Mehrdad Razavi Dehkordi 1 *, Hamid Rastegari 2, Akbar Nabiollahi Najafabadi 3, Taghi Javdani Gandomani 4
1 - Computer Engineering, Faculty of Engineering and Basic Sciences, Islamic Azad University, Najafabad Branch, Najafabad, Iran
2 - Department of Computer and Information Technology, Islamic Azad University, Najafabad, Iran
3 - Najafabad Branch, Islamic Azad University
4 - Faculty member, Shahrekord University
Keywords: mobile apps, review summarization, Google store analysis, pre-trained model
Abstract:
Since the emergence of mobile apps, user reviews have been extremely valuable to app developers because they convey users' sentiments, bug reports, and new requirements. Given the large volume of reviews, summarizing them is difficult and error-prone. Much work has been done on extractive summarization of user reviews; however, most studies rely on older machine learning or natural language processing methods, and where a transformer-based model has been trained for summarization, it has not been determined whether that model is suitable for summarizing mobile app user reviews. In other words, such models were presented for general-purpose text summarization, and their use in domain-specific summarization has not been examined. In this article, 1000 reviews were first randomly selected from a Kaggle dataset of user reviews and then given to four pre-trained models, bart_large_cnn, bart_large_xsum, mT5_multilingual_XLSum, and Falconsai Text_Summarization, for summarization. The ROUGE-1, ROUGE-2, and ROUGE-L metrics were calculated separately for each model. The pre-trained Falconsai model, with scores of 0.6464 on ROUGE-1, 0.6140 on ROUGE-2, and 0.6346 on ROUGE-L, proved to be the best model for summarizing Play Store user reviews.
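The evaluation procedure described in the abstract (sample reviews at random, summarize each one, score the summary with ROUGE-1, ROUGE-2, and ROUGE-L) can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not say how reference summaries were obtained or which ROUGE implementation was used, so the `(review, reference)` pairs, the `summarize` callable, the lowercase whitespace tokenization, and the use of F1 scores are all assumptions made for illustration.

```python
import random
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, candidate_total, reference_total):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0 or candidate_total == 0 or reference_total == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 from clipped n-gram overlap (assumed: lowercase, whitespace tokens)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


def evaluate(summarize, pairs, sample_size=1000, seed=0):
    """Average ROUGE F1 of a summarizer over a random sample of pairs.

    `summarize` is a hypothetical callable (text -> summary) standing in for
    any of the pre-trained models; `pairs` holds (review, reference) tuples.
    """
    rng = random.Random(seed)
    sample = rng.sample(pairs, min(sample_size, len(pairs)))
    totals = {"rouge1": 0.0, "rouge2": 0.0, "rougeL": 0.0}
    if not sample:
        return totals
    for review, reference in sample:
        summary = summarize(review)
        totals["rouge1"] += rouge_n(summary, reference, 1)
        totals["rouge2"] += rouge_n(summary, reference, 2)
        totals["rougeL"] += rouge_l(summary, reference)
    return {name: total / len(sample) for name, total in totals.items()}
```

To compare several models under this sketch, one would call `evaluate` once per model with the same `pairs` and `seed`, so that every model is scored on the same sample. Established ROUGE packages typically add stemming and other normalization, so their scores can differ from this simplified version.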