✨ HR Analytics: Employee Attrition & Performance Analysis ✨

📍 Objective

Employee attrition refers to the rate at which employees leave a company. The goal of this project is to model employee attrition and identify the most significant factors influencing turnover. This analysis helps HR professionals predict how many employees are likely to leave and which employees are at the highest risk, thus informing retention strategies.

😇 Motivation

This project aims to leverage data analytics to improve employee satisfaction, reduce operational costs, and enhance overall organizational performance. Using data-driven insights allows organizations to create a positive work environment and retain talent.

📐 System Architecture

The analysis was performed as follows:

1. Load the Dataset: The IBM HR Analytics Attrition Dataset is loaded.
2. Data Exploration: Basic information about the dataset is gathered and key attributes identified.
3. Data Cleaning: Missing values are handled, and the dataset is cleaned for further analysis.
4. Data Visualization: Visualizations are created using Matplotlib and Seaborn to explore trends in attrition.
5. Statistical Analysis:
   - ANOVA Test for numerical feature importance.
   - Chi-Square Test for categorical feature importance.
6. Data Preprocessing:
   - The target variable, Attrition, is mapped to binary values.
   - Features are selected and encoded using one-hot encoding.
7. Train-Test Split: Data is split into training and testing sets using train_test_split.
8. Modeling: Various machine learning algorithms are implemented, including:
   - Logistic Regression
   - Random Forest
   - Support Vector Machine
   - XGBoost
   - LightGBM
   - CatBoost
   - AdaBoost
9. Model Evaluation: Accuracy scores and confusion matrices are computed.
10. Comparison: Model performance is compared using ROC curves.
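
The preprocessing, split, and evaluation steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the data frame is a small synthetic stand-in filled with random values (only a few column names are borrowed from the dataset), and the 70/30 split, the `random_state`, the scaling step, and the use of Logistic Regression as the single example model are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (random values, NOT the real IBM dataset).
rng = np.random.default_rng(0)
n_rows = 200
df = pd.DataFrame({
    "Age": rng.integers(18, 60, size=n_rows),
    "MonthlyIncome": rng.integers(1000, 20000, size=n_rows),
    "OverTime": rng.choice(["Yes", "No"], size=n_rows),
    "Department": rng.choice(
        ["Sales", "Research & Development", "Human Resources"], size=n_rows
    ),
    "Attrition": rng.choice(["Yes", "No"], size=n_rows, p=[0.2, 0.8]),
})

# Map the target variable to binary values.
y = df["Attrition"].map({"Yes": 1, "No": 0})

# One-hot encode the categorical features; numeric columns pass through.
X = pd.get_dummies(df.drop(columns="Attrition")).astype(float)

# Assumed 70/30 split, stratified so both sets keep the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# One example model; scaling is an added assumption to help convergence.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Accuracy on both sets and a confusion matrix on the test set.
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
cm = confusion_matrix(y_test, model.predict(X_test), labels=[0, 1])

print(f"train accuracy: {train_acc:.4f}")
print(f"test accuracy:  {test_acc:.4f}")
print(cm)
```

Because the labels here are random, the printed scores say nothing about the real dataset; the sketch only shows how the steps connect.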

📁 Dataset

The dataset used in this project is a fictional dataset created by IBM data scientists. It contains 1,470 rows and 35 columns, including both numeric and categorical features related to employee characteristics.

Dataset Link

Dataset Attributes:

Age
Attrition
BusinessTravel
Department
DistanceFromHome
Education
EducationField
EnvironmentSatisfaction
Gender
JobInvolvement
JobLevel
JobSatisfaction
MaritalStatus
MonthlyIncome
OverTime
TotalWorkingYears
WorkLifeBalance
YearsAtCompany
... and more.
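
Since the attributes mix numeric columns (such as MonthlyIncome) with categorical ones (such as OverTime), the ANOVA and Chi-Square tests named in the system architecture apply to different column types. The sketch below shows one way to run both with SciPy. All numbers are made up for illustration and are not taken from the dataset.

```python
from scipy.stats import chi2_contingency, f_oneway

# Made-up MonthlyIncome samples for employees who left vs. stayed.
income_left = [2100, 2500, 2300, 2900, 2700]
income_stayed = [5200, 6100, 4800, 7000, 5600]

# One-way ANOVA: does the mean of a numeric feature differ between groups?
f_stat, anova_p = f_oneway(income_left, income_stayed)

# Made-up contingency table of counts.
# Rows: OverTime (Yes, No); columns: Attrition (Yes, No).
overtime_vs_attrition = [[30, 20], [15, 85]]

# Chi-square test of independence between two categorical features.
chi2_stat, chi2_p, dof, expected = chi2_contingency(overtime_vs_attrition)

print(f"ANOVA:      F = {f_stat:.2f}, p = {anova_p:.4g}")
print(f"Chi-square: chi2 = {chi2_stat:.2f}, p = {chi2_p:.4g}, dof = {dof}")
```

A small p-value suggests the feature is associated with attrition; in practice the groups and the contingency table would be built from the real columns, for example with `pandas.crosstab`.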

📝 Libraries Used:

Pandas
NumPy
Matplotlib
Seaborn
hvPlot
SciPy
scikit-learn (sklearn)
XGBoost
LightGBM
CatBoost
warnings (Python standard library)

⚠️ Prerequisites:

Python Programming
Data Science
Data Analysis
Data Pre-processing
Data Visualization
Statistical Analysis
Machine Learning Algorithms

✨ Model Evaluation:

Algorithm	Training Data Accuracy	Testing Data Accuracy
Logistic Regression	0.9271	0.8639
Random Forest	0.8902	0.8413
Support Vector Machine	0.9349	0.8662
XGBoost	1.0000	0.8526
LightGBM	1.0000	0.8390
CatBoost	0.9845	0.8503
AdaBoost	0.9077	0.8322
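
One way to read the table is to look at the gap between training and testing accuracy, since a large gap can indicate overfitting. The short sketch below copies the values from the table and computes that gap per model; the variable names are illustrative only.

```python
# (training accuracy, testing accuracy), copied from the table above.
results = {
    "Logistic Regression": (0.9271, 0.8639),
    "Random Forest": (0.8902, 0.8413),
    "Support Vector Machine": (0.9349, 0.8662),
    "XGBoost": (1.0000, 0.8526),
    "LightGBM": (1.0000, 0.8390),
    "CatBoost": (0.9845, 0.8503),
    "AdaBoost": (0.9077, 0.8322),
}

# Train-minus-test gap per model, rounded to the table's precision.
gaps = {name: round(train - test, 4) for name, (train, test) in results.items()}

# Model with the highest testing accuracy and model with the largest gap.
best_test = max(results, key=lambda name: results[name][1])
largest_gap = max(gaps, key=gaps.get)

for name in sorted(gaps, key=gaps.get, reverse=True):
    print(f"{name:<24} gap = {gaps[name]:.4f}")
print("highest testing accuracy:", best_test)
print("largest train-test gap:  ", largest_gap)
```

On these numbers, Support Vector Machine has the highest testing accuracy, while LightGBM and XGBoost fit the training data perfectly and show the largest gaps.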

📈 Comparing Model Performance Using ROC Curve:
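
As a hedged illustration of how such a comparison can be computed, the sketch below derives ROC points and the area under the curve (AUC) for two hypothetical models with scikit-learn. The labels, scores, and model names are toy values invented for the example, not results from this project.

```python
from sklearn.metrics import auc, roc_curve

# Toy ground-truth labels (1 = employee left) and predicted scores.
y_true = [0, 0, 1, 1]
model_scores = {
    "model_a": [0.1, 0.4, 0.35, 0.8],  # hypothetical model with one ranking error
    "model_b": [0.2, 0.3, 0.6, 0.9],   # hypothetical model that ranks perfectly
}

auc_by_model = {}
for name, scores in model_scores.items():
    # False-positive and true-positive rates at each score threshold.
    fpr, tpr, _ = roc_curve(y_true, scores)
    auc_by_model[name] = auc(fpr, tpr)

for name, value in auc_by_model.items():
    print(f"{name}: AUC = {value:.2f}")
```

Plotting `fpr` against `tpr` for each model on the same axes (for example with Matplotlib) gives the comparison chart; a curve closer to the top-left corner, and a higher AUC, indicates better ranking of at-risk employees.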

🔑 Conclusion

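This project analyzed employee attrition using the IBM HR Analytics dataset. Seven machine learning models were trained and compared: Support Vector Machine (0.8662) and Logistic Regression (0.8639) achieved the highest testing accuracy, while XGBoost and LightGBM reached 1.0000 training accuracy with lower testing accuracy, which suggests overfitting. Together with the statistical tests used to identify the most effective predictors of employee turnover, these insights can help HR teams implement targeted retention strategies and optimize workforce performance.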

📩 Feedback

If you have any feedback, please reach out to me on LinkedIn.

