Using the International Statistical Classification of Diseases (ICD) to Prepare Medical Data for Decision Support Systems
Subject areas: multimedia processing, communication systems, intelligent systems
Forough Alsadat Hosseini 1, Mehdi Afzali 2, Mahmoud Moradi 3
1 - Zanjan University of Medical Sciences and Health Services, Zanjan, Iran
2 - Assistant Professor, Department of Information Technology Engineering, Zanjan Branch, Islamic Azad University, Zanjan, Iran
3 - Assistant Professor, Department of Knowledge and Information Science, Razi University, Kermanshah, Iran
Keywords: ICD, data warehouse, ILTEC, data preparation, business intelligence
Abstract:
Data quality is critical to the success of data analysis. Data loaded into a data warehouse must be correct, accurate, and of high quality, because high-quality data leads to better analysis and better decision making. Data quality issues must therefore be addressed before the data is loaded into the warehouse. Data cleaning finds and removes errors, and it also detects and handles redundant and inconsistent data. Performing data cleaning during the extract, transform, load (ETL) stage ensures the quality of the data in the warehouse and thereby makes business intelligence effective. The purpose of data cleaning is to detect so-called dirty data (incorrect, irrelevant, or incomplete data) and to either correct or delete it, so that the data set is accurate and consistent. This research aims to describe and explain a data cleaning method for removing dirty data. The sample database consists of disease records from the provinces of Zanjan, Ilam, and Hamedan. To resolve the data problems in the sample database, C# forms and SQL stored procedures were used. The main results show that applying the data cleaning method reduced the error rate of the database to 0.008%.
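The cleaning step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' C# and SQL implementation; it is written in Python, and the field names (`patient_id`, `icd_code`, `province`), the sample rows, and the simplified ICD-10-style pattern are all assumptions made for illustration. It shows one way to separate incomplete, incorrect, and redundant records from clean ones before loading.

```python
import re

# Assumed, simplified shape of an ICD-10-style code (e.g. "E11" or "E11.9").
# This is a format check only, not a lookup against the official code list.
ICD_PATTERN = re.compile(r"[A-Z][0-9]{2}(\.[0-9]{1,2})?")


def clean_records(records):
    """Split records into clean rows and rejected rows tagged with a reason."""
    clean, rejected, seen = [], [], set()
    for rec in records:
        code = (rec.get("icd_code") or "").strip().upper()
        province = (rec.get("province") or "").strip()
        key = (rec.get("patient_id"), code)
        if not code or not province:
            rejected.append((rec, "incomplete"))   # a required field is missing
        elif not ICD_PATTERN.fullmatch(code):
            rejected.append((rec, "incorrect"))    # code does not look like ICD
        elif key in seen:
            rejected.append((rec, "redundant"))    # same patient and code seen before
        else:
            seen.add(key)
            clean.append({**rec, "icd_code": code, "province": province})
    return clean, rejected


# Hypothetical sample rows, invented for this sketch.
sample = [
    {"patient_id": 1, "icd_code": "e11.9", "province": "Zanjan"},
    {"patient_id": 1, "icd_code": "E11.9", "province": "Zanjan"},
    {"patient_id": 2, "icd_code": "11E", "province": "Ilam"},
    {"patient_id": 3, "icd_code": "", "province": "Hamedan"},
    {"patient_id": 4, "icd_code": "J45", "province": "Hamedan"},
]

clean, rejected = clean_records(sample)
reasons = [reason for _, reason in rejected]
```

Keeping the rejected rows together with a reason, rather than silently dropping them, lets them be corrected later or counted when estimating an error rate.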