shasan7/Credit_Fraud

Detecting Fraudulent Transactions using the Credit Card Fraud Detection dataset

Used machine learning models from scikit-learn, plus XGBoost (a separate library with a scikit-learn-compatible interface), for prediction. Logistic Regression, Random Forest, Support Vector Machine, XGBoost, Ridge Classifier and SGD Classifier were run.
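A minimal sketch of how such a comparison loop could look. The synthetic data from `make_classification`, the split sizes, the `class_weight="balanced"` setting on every model, and all variable names are assumptions made for illustration, not the repository's actual code. `XGBClassifier` is left out so the sketch depends only on scikit-learn; it exposes the same `fit`/`predict` interface and could be added to the dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier, SGDClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the real dataset (assumption): ~2% positives.
X, y = make_classification(
    n_samples=4000, n_features=10, weights=[0.98, 0.02], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# class_weight="balanced" on every model is an assumption for this sketch.
models = {
    "LogisticRegression": LogisticRegression(class_weight="balanced", max_iter=1000),
    "RandomForestClassifier": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0
    ),
    "RidgeClassifier": RidgeClassifier(class_weight="balanced"),
    "SGDClassifier": SGDClassifier(class_weight="balanced", random_state=0),
    "SVC": SVC(class_weight="balanced"),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "macro_f1": f1_score(y_test, y_pred, average="macro"),
        "confusion": confusion_matrix(y_test, y_pred),
    }
    print(f"{name}: accuracy={results[name]['accuracy']:.3f}, "
          f"macro F1={results[name]['macro_f1']:.3f}")
```

Stratifying the split keeps the rare class present in both the training and test portions, which matters when positives are this scarce.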

The Random Forest, Support Vector Machine and XGBoost classifiers all reached nearly 100% accuracy, but accuracy says little here: only 98 of the 56,962 evaluated transactions are fraudulent, so predicting "not fraud" for every transaction would already score about 99.8%. The macro-average F1-score separates the models better: XGBoost got 0.93, Random Forest 0.92 and SVM 0.79. XGBoost was trained with 100 estimators using the "logloss" evaluation metric, and the "balanced" value was passed as the class weight setting to give more attention to the minority class, which is crucial for a dataset like ours with a huge class imbalance.
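To make the "balanced" idea concrete, here is a small sketch of the weighting heuristic that scikit-learn documents for `class_weight="balanced"`: `n_samples / (n_classes * class_count)`. The class counts are taken from the support column of the reports in this README, used only as an illustration because the training-set counts are not given. The helper name `balanced_class_weights` is hypothetical. How the weights were actually passed to XGBoost in this project is not shown here; XGBoost's own documented parameter for binary imbalance is `scale_pos_weight`, commonly set to negatives divided by positives.

```python
def balanced_class_weights(class_counts):
    """Return per-class weights using n_samples / (n_classes * class_count)."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in class_counts.items()
    }


# Counts from the support column of the reports (illustrative only;
# the training-set counts are not stated in the README).
counts = {0: 56864, 1: 98}

weights = balanced_class_weights(counts)

# Ratio of negatives to positives, the usual starting value for
# XGBoost's scale_pos_weight parameter.
scale_pos_weight = counts[0] / counts[1]

print(f"weight for class 0: {weights[0]:.4f}")
print(f"weight for class 1: {weights[1]:.4f}")
print(f"negatives per positive: {scale_pos_weight:.1f}")
```

With these counts a single fraudulent sample weighs roughly 580 times as much as a legitimate one, which is the same ratio `scale_pos_weight` would express.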

Obtained the following results on the 56,962 evaluated transactions (accuracy scores are rounded to the nearest percent):

Classifier: LogisticRegression has an accuracy score of 98.0 %.

Classification Report: 

              precision    recall  f1-score   support

           0       1.00      0.98      0.99     56864
           1       0.06      0.92      0.11        98

    accuracy                           0.98     56962
   macro avg       0.53      0.95      0.55     56962
weighted avg       1.00      0.98      0.99     56962

Confusion Matrix: 
 [[55478  1386]
 [    8    90]] 
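As a worked check, the fraud-class (class 1) figures in the LogisticRegression report can be recomputed from its confusion matrix. This assumes the scikit-learn layout, where rows are true classes and columns are predicted classes, so that TN = 55478, FP = 1386, FN = 8 and TP = 90; the numbers agree with the report under that reading.

```latex
\begin{aligned}
\text{precision}_1 &= \frac{TP}{TP + FP} = \frac{90}{90 + 1386} \approx 0.06 \\
\text{recall}_1 &= \frac{TP}{TP + FN} = \frac{90}{90 + 8} \approx 0.92 \\
F_1^{(1)} &= \frac{2\,TP}{2\,TP + FP + FN} = \frac{180}{180 + 1386 + 8} \approx 0.11 \\
F_1^{(0)} &= \frac{2\,TN}{2\,TN + FN + FP} = \frac{110956}{110956 + 8 + 1386} \approx 0.99 \\
\text{macro } F_1 &= \tfrac{1}{2}\left(F_1^{(0)} + F_1^{(1)}\right) \approx \tfrac{1}{2}(0.988 + 0.114) \approx 0.55
\end{aligned}
```

The 1386 false positives are what pull precision down to 0.06 even though 90 of the 98 fraud cases are caught.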

Classifier: RandomForestClassifier has an accuracy score of 100.0 %.

Classification Report: 

              precision    recall  f1-score   support

           0       1.00      1.00      1.00     56864
           1       0.96      0.74      0.84        98

    accuracy                           1.00     56962
   macro avg       0.98      0.87      0.92     56962
weighted avg       1.00      1.00      1.00     56962

Confusion Matrix: 
 [[56861     3]
 [   25    73]] 

Classifier: RidgeClassifier has an accuracy score of 99.0 %.

Classification Report: 

              precision    recall  f1-score   support

           0       1.00      0.99      0.99     56864
           1       0.10      0.85      0.17        98

    accuracy                           0.99     56962
   macro avg       0.55      0.92      0.58     56962
weighted avg       1.00      0.99      0.99     56962

Confusion Matrix: 
 [[56076   788]
 [   15    83]] 

Classifier: SGDClassifier has an accuracy score of 93.0 %.

Classification Report: 

              precision    recall  f1-score   support

           0       1.00      0.93      0.96     56864
           1       0.02      0.93      0.04        98

    accuracy                           0.93     56962
   macro avg       0.51      0.93      0.50     56962
weighted avg       1.00      0.93      0.96     56962

Confusion Matrix: 
 [[53003  3861]
 [    7    91]] 

Classifier: SVC has an accuracy score of 100.0 %.

Classification Report: 

              precision    recall  f1-score   support

           0       1.00      1.00      1.00     56864
           1       0.48      0.76      0.58        98

    accuracy                           1.00     56962
   macro avg       0.74      0.88      0.79     56962
weighted avg       1.00      1.00      1.00     56962

Confusion Matrix: 
 [[56783    81]
 [   24    74]] 

Classifier: XGBClassifier has an accuracy score of 100.0 %.

Classification Report: 

              precision    recall  f1-score   support

           0       1.00      1.00      1.00     56864
           1       0.91      0.81      0.85        98

    accuracy                           1.00     56962
   macro avg       0.95      0.90      0.93     56962
weighted avg       1.00      1.00      1.00     56962

Confusion Matrix: 
 [[56856     8]
 [   19    79]]
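The six confusion matrices above are enough to recompute accuracy and macro-average F1 and to rank the models. The sketch below does this in plain Python. The matrices are copied from the results; the helper names (`f1`, `summarize`) and the row/column reading (rows = true class, columns = predicted class, as in scikit-learn) are assumptions of the sketch.

```python
# Confusion matrices copied from the results above.
# Assumed layout: [[TN, FP], [FN, TP]] with class 1 = fraud.
CONFUSION = {
    "LogisticRegression": [[55478, 1386], [8, 90]],
    "RandomForestClassifier": [[56861, 3], [25, 73]],
    "RidgeClassifier": [[56076, 788], [15, 83]],
    "SGDClassifier": [[53003, 3861], [7, 91]],
    "SVC": [[56783, 81], [24, 74]],
    "XGBClassifier": [[56856, 8], [19, 79]],
}


def f1(tp, fp, fn):
    """F1 for one class from its true positives, false positives, false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def summarize(cm):
    """Return accuracy and macro-average F1 for a 2x2 confusion matrix."""
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    f1_fraud = f1(tp, fp, fn)
    # For class 0 the roles swap: TN are its hits, FN its false positives,
    # FP its false negatives.
    f1_legit = f1(tn, fn, fp)
    return {
        "accuracy": (tn + tp) / total,
        "macro_f1": (f1_fraud + f1_legit) / 2,
    }


summary = {name: summarize(cm) for name, cm in CONFUSION.items()}

# Rank models by macro-average F1, best first.
ranking = sorted(summary, key=lambda name: summary[name]["macro_f1"], reverse=True)

for name in ranking:
    s = summary[name]
    print(f"{name:24s} accuracy={s['accuracy']:.4f}  macro F1={s['macro_f1']:.2f}")
```

Ranked this way, XGBClassifier and RandomForestClassifier lead while SGDClassifier comes last, and the unrounded accuracies show that the "100.0 %" models still misclassify a few dozen transactions each.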