INDUCING VALUABLE RULES FROM IMBALANCED DATA: THE CASE OF AN IRANIAN BANK EXPORT LOANS

International Journal of Information, Security and Systems Management

Issue 1 Vol. 2 Winter 2013

Manuscript ID: 553735 Pages: 130-135

Article Type: Original Research

Subject Areas : International Journal of Information, Security and Systems Management

Received: 2014-12-15 Accepted : 2014-12-15 Published : 2013-06-01

Keywords: Credit Scoring, Banking Industry, Rule Extraction, Imbalanced Data Sampling

Abstract :

Credit scoring is a classification problem for which numerous techniques have been introduced, such as support vector machines, neural networks and rule-based classifiers. Rule bases are a top priority in credit decision making because they explicitly distinguish between good and bad applicants. In a credit-scoring context, imbalanced data sets occur frequently, as the number of good loans in a portfolio is usually much higher than the number of loans that default. This paper explores the suitability of RIPPER, OneR, Decision Table, PART and C4.5 for extracting loan default prediction rules. A real database of export loans from an Iranian bank is used, and class imbalance is investigated by randomly oversampling the minority class of defaulters along with three samplings of the majority class of non-defaulters. The performance criteria chosen to measure this effect are the area under the receiver operating characteristic curve (AUC), accuracy and the number of rules. Friedman's statistic is used to test for significant differences between techniques and datasets. The results show that PART is the best classifier on all of the balanced and imbalanced datasets.
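
As an illustration of two ideas named in the abstract, the following minimal Python sketch shows random oversampling of a minority class and a pairwise AUC computation. It is not the authors' implementation: the function names, the toy rows, the labels "good" and "default", and the choice to oversample until the classes are equal in size are all assumptions made for this example. In practice, oversampling would normally be applied only to training data.

```python
import random


def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size.

    Hypothetical helper: the 1:1 target ratio is an assumption for this sketch.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    if not minority or len(minority) >= len(majority):
        # Nothing to balance: return copies of the inputs unchanged.
        return list(rows), list(labels)
    # Sample minority indices with replacement to fill the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(labels))) + extra
    return [rows[i] for i in idx], [labels[i] for i in idx]


def auc(scores, labels, positive_label):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == positive_label]
    neg = [s for s, y in zip(scores, labels) if y != positive_label]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration): one feature per loan.
rows = [[0.1], [0.2], [0.3], [0.4], [0.9], [0.8]]
labels = ["good", "good", "good", "good", "default", "default"]

bal_rows, bal_labels = random_oversample(rows, labels, "default", seed=42)

# Use the single feature as a stand-in for a classifier's default score.
toy_scores = [r[0] for r in rows]
toy_auc = auc(toy_scores, labels, "default")
```

The pairwise form of AUC is quadratic in the number of examples, which is fine for a sketch but slow for large portfolios; a rank-based computation gives the same value more efficiently.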
