Combined Use of Pearson's Linear Correlation and a Combination of Data Mining Algorithms to Improve Prediction of Tumor Type in Cancer Patients

Subject areas: Electronic Engineering

1 - Master's student, Islamic Azad University, Bushehr Branch
2 - Department of Computer Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran

Received: 1399/05/11 Accepted: 1399/05/11 Published: 1398/08/01

Keywords: Pearson's correlation coefficient, AdaBoost, classification algorithms, Naive Bayes

Abstract:

Breast cancer is one of the most common cancers among women. Statistics indicate a six percent increase in this type of cancer in Iran, which underlines how serious the threat is. Yet prevention or early diagnosis can avert much of the danger. Advances in medical science have paved the way for systems that use new technologies to prevent, predict, and treat disease. Medical data mining models and discovers relationships among risk factors in order to predict the condition of future patients from the data at hand. In this study, we compare several data mining algorithms and combinations of them in order to develop a new, efficient, high-accuracy method that can be applied to local data. The proposed method, which improves the performance of the Naive Bayes algorithm by means of the AdaBoost algorithm, predicts whether a tumor is benign or malignant with 96.67 percent accuracy. The data used for this purpose were obtained from the UCI repository for tumor type diagnosis and comprise 569 records and 32 variables.
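The abstract names the ingredients of the method (Pearson's correlation, Naive Bayes, AdaBoost) but not how they are wired together. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes Pearson's coefficient is used as a filter that keeps features whose absolute correlation with the class label reaches a threshold, and it assumes the standard discrete two-class AdaBoost with a weighted Gaussian Naive Bayes as the base learner. The function names, the threshold of 0.3, the number of rounds, and the synthetic three-column data set are all hypothetical; the sketch does not use the UCI data and does not reproduce the reported accuracy.

```python
import math
import random


def pearson(xs, ys):
    """Pearson's linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_features(X, y, threshold):
    """Assumed filter rule: keep columns whose |r| with the label >= threshold."""
    return [j for j in range(len(X[0]))
            if abs(pearson([row[j] for row in X], y)) >= threshold]


def fit_weighted_nb(X, y, w):
    """Gaussian Naive Bayes on weighted samples: class -> (prior, means, variances)."""
    total, cols, model = sum(w), range(len(X[0])), {}
    for c in (0, 1):
        idx = [i for i, label in enumerate(y) if label == c]
        wc = sum(w[i] for i in idx)
        means = [sum(w[i] * X[i][j] for i in idx) / wc for j in cols]
        variances = [sum(w[i] * (X[i][j] - means[j]) ** 2 for i in idx) / wc + 1e-9
                     for j in cols]
        model[c] = (wc / total, means, variances)
    return model


def predict_nb(model, row):
    """Return the class with the larger log-posterior under the fitted model."""
    def log_post(c):
        prior, means, variances = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, means, variances))
    return max(model, key=log_post)


def adaboost_nb(X, y, rounds=10):
    """Discrete AdaBoost for labels in {0, 1} with weighted Naive Bayes learners."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        model = fit_weighted_nb(X, y, w)
        preds = [predict_nb(model, row) for row in X]
        err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
        if err >= 0.5:          # learner no better than chance: stop boosting
            break
        err = max(err, 1e-10)   # avoid division by zero for a perfect learner
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, model))
        if err <= 1e-10:        # nothing left to correct
            break
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(alpha if p != t else -alpha)
             for wi, p, t in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict_ensemble(ensemble, row):
    """Weighted vote of the base learners (votes are +1 for class 1, -1 for class 0)."""
    score = sum(a * (1 if predict_nb(m, row) == 1 else -1) for a, m in ensemble)
    return 1 if score > 0 else 0


# Synthetic stand-in data (hypothetical): two informative columns, one pure noise.
random.seed(0)
y = [i % 2 for i in range(200)]
X_all = [[random.gauss(4.0 * t, 1.0), random.gauss(-3.0 * t, 1.0),
          random.gauss(0.0, 1.0)] for t in y]

selected = select_features(X_all, y, threshold=0.3)
X = [[row[j] for j in selected] for row in X_all]
ensemble = adaboost_nb(X, y, rounds=10)
accuracy = sum(predict_ensemble(ensemble, r) == t for r, t in zip(X, y)) / len(y)
print("selected columns:", selected)
print("training accuracy:", round(accuracy, 3))
```

The accuracy printed here is measured on the training data of a toy problem, so it says nothing about the 96.67 percent figure reported above; a faithful evaluation would use the UCI records with a held-out test set or cross-validation.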

