Double Clustering Method in Hiding Association Rules
Subject areas: B. Computer Systems Organization
Zahra Kiani Abari 1, Mohammad Naderi Dehkordi 2
1 - Faculty of Computer Engineering, Najafabad Branch, Islamic Azad University, Najafabad, Isfahan, Iran
2 - Faculty of Computer Engineering, Najafabad Branch, Islamic Azad University, Najafabad, Isfahan, Iran
Keywords: Clustering, Data Mining, Association Rules, Frequent Item-sets, Privacy Preserving Data Mining
Abstract:
Association rule mining is one of the important techniques in data mining, used to extract hidden patterns and knowledge from large volumes of data. Association rules help individuals and organizations make strategic decisions and improve their business processes. However, association rules extracted from a database may contain important and confidential information whose publication could threaten the privacy of individuals. Therefore, sensitive association rules should be hidden before the database is shared. This is done by modifying the database transactions. The modifications must be made in such a way that all sensitive association rules are hidden while as many non-sensitive association rules as possible remain extractable from the sanitized database. In other words, a balance must be struck between hiding the sensitive rules and preserving the non-sensitive ones. This paper presents a new algorithm that balances privacy preservation against knowledge extraction. The proposed algorithm clusters the items of the sensitive rules in order to reduce the number of changes made to the database. Reducing changes and clustering rules in this way limits the side effects of the hiding process on non-sensitive rules.
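To make the idea of hiding a rule by changing transactions concrete, the following is a minimal sketch in Python. It is not the double clustering algorithm proposed in the paper. It only illustrates the general principle, assuming that a rule counts as hidden once its confidence falls below a chosen threshold. The function names (`support_count`, `confidence`, `hide_rule`), the `min_conf` parameter, and the choice to delete one consequent item from supporting transactions are all hypothetical and introduced here for illustration only.

```python
from typing import List, Set, Tuple

Transaction = Set[str]


def support_count(db: List[Transaction], itemset: Set[str]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def confidence(db: List[Transaction], lhs: Set[str], rhs: Set[str]) -> float:
    """Confidence of the rule lhs -> rhs: count(lhs and rhs) / count(lhs)."""
    lhs_count = support_count(db, lhs)
    if lhs_count == 0:
        return 0.0
    return support_count(db, lhs | rhs) / lhs_count


def hide_rule(
    db: List[Transaction], lhs: Set[str], rhs: Set[str], min_conf: float
) -> Tuple[List[Transaction], int]:
    """Naive sanitization (illustrative assumption, not the paper's method).

    Removes one consequent item from supporting transactions, one
    transaction at a time, until the rule's confidence drops below
    `min_conf`. Returns the sanitized copy and the number of changes.
    """
    sanitized = [set(t) for t in db]  # work on a copy; leave `db` untouched
    victim = sorted(rhs)[0]           # deterministic choice of item to remove
    changes = 0
    for t in sanitized:
        if confidence(sanitized, lhs, rhs) < min_conf:
            break  # rule is already hidden; stop changing the database
        if (lhs | rhs) <= t:
            t.discard(victim)
            changes += 1
    return sanitized, changes
```

Every deleted item can also lower the support or confidence of non-sensitive rules that involve it, which is the side effect the abstract refers to. Keeping the number of changes small is therefore the quantity a hiding algorithm tries to minimize.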