A collaborative filtering-based book recommendation system built with Python and Streamlit. The application uses the K-Nearest Neighbors (KNN) algorithm to suggest books similar to a user's selection, based on user rating data.
Live Demo: https://books-recommender-system-vineet416.streamlit.app/
- Collaborative Filtering: Recommendations based on user rating patterns
- Interactive UI: Clean and intuitive Streamlit interface
- Visual Recommendations: Displays book covers for recommended titles
- Real-time Suggestions: Get 5 book recommendations instantly
- Large Dataset: Built on the Book-Crossing dataset (271,360 books and 1,149,780 ratings before filtering)
- Python 3.8+
- Streamlit - Web application framework
- NumPy - Numerical computing
- Pandas - Data manipulation and analysis
- Scikit-learn - Machine learning (KNN algorithm)
- Pickle - Model serialization
Books Recommender Project/
│
├── app.py                      # Main Streamlit application
├── research.ipynb              # Jupyter notebook for EDA and model training
├── requirements.txt            # Project dependencies
│
├── artifacts/                  # Serialized models and data
│   ├── model.pkl               # Trained KNN model
│   ├── book_names.pkl          # List of book titles
│   ├── book_pivot.pkl          # Pivot table of user-book ratings
│   └── final_rating.pkl        # Processed ratings with metadata
│
├── Datasets/                   # Raw datasets (Book-Crossing)
│   ├── BX-Books.csv            # Book information
│   ├── BX-Users.csv            # User information
│   └── BX-Book-Ratings.csv     # User ratings
│
└── env/                        # Virtual environment (not tracked in git)
- Python 3.8 or higher
- pip package manager
- Clone the repository
    git clone https://github.com/vineet416/Books_Recommender_System.git
    cd Books_Recommender_System
- Create a virtual environment (recommended)
    conda create -n env python=3.11
    conda activate env
- Install dependencies
    pip install -r requirements.txt
- Ensure the artifacts folder exists
  Make sure the artifacts/ folder contains all the required pickle files: model.pkl, book_names.pkl, book_pivot.pkl, final_rating.pkl
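The presence of the required pickle files can be confirmed with a short script before launching the app. This is an illustrative sketch, not part of the repository: the helper name `missing_artifacts` is hypothetical, and the file names are the ones listed in the project structure.

```python
from pathlib import Path

# File names as listed in the project structure.
REQUIRED_ARTIFACTS = (
    "model.pkl",
    "book_names.pkl",
    "book_pivot.pkl",
    "final_rating.pkl",
)


def missing_artifacts(folder="artifacts"):
    """Return the required pickle files that are absent from `folder`."""
    base = Path(folder)
    return [name for name in REQUIRED_ARTIFACTS if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing artifacts:", ", ".join(missing))
    else:
        print("All artifacts found.")
```

If any file is reported missing, running research.ipynb should regenerate it.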
- Start the Streamlit app
    streamlit run app.py
- Open your browser
  The application opens automatically in your default browser at http://localhost:8501
- Use the application
  - Select a book from the dropdown menu
  - Click "Show Recommendation"
  - View 5 similar book recommendations with cover images
- Data Preprocessing:
  - Loads the Book-Crossing dataset containing books, users, and ratings
  - Keeps only users with a significant rating history
  - Keeps only popular books with sufficient ratings
  - Creates a user-book rating matrix
- Model Training:
  - Uses the K-Nearest Neighbors (KNN) algorithm
  - Calculates similarity between books based on user rating patterns
  - Books rated similarly by the same users are considered similar
- Recommendation Generation:
  - Takes a selected book as input
  - Finds the 6 nearest neighbors (including the input book)
  - Returns the 5 most similar books with cover images
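The three stages above can be sketched end to end on toy data. Everything specific here is an assumption made for illustration: the ratings are invented, the column names (`user_id`, `title`, `rating`) and both thresholds are hypothetical, and the cosine metric is one of the two options this README mentions. The notebook's actual values may differ.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors

# Invented ratings standing in for the merged Book-Crossing tables.
ratings = pd.DataFrame({
    "user_id": [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5],
    "title": ["Dune", "Foundation", "Emma", "Ulysses",
              "Dune", "Foundation", "Persuasion",
              "Emma", "Persuasion", "Dune",
              "Emma", "Persuasion", "Foundation",
              "Walden"],
    "rating": [9, 8, 2, 5, 8, 9, 1, 9, 8, 1, 8, 9, 2, 7],
})

MIN_USER_RATINGS = 3  # assumed threshold, not taken from the notebook
MIN_BOOK_RATINGS = 2  # assumed threshold, not taken from the notebook

# 1. Preprocessing: keep active users, then popular books.
user_counts = ratings["user_id"].value_counts()
active = ratings[ratings["user_id"].isin(
    user_counts[user_counts >= MIN_USER_RATINGS].index)]
book_counts = active["title"].value_counts()
popular = active[active["title"].isin(
    book_counts[book_counts >= MIN_BOOK_RATINGS].index)]

# Books as rows, users as columns; unrated cells become 0.
book_pivot = popular.pivot_table(
    index="title", columns="user_id", values="rating").fillna(0)

# 2. Training: brute-force KNN over the book rows.
model = NearestNeighbors(algorithm="brute", metric="cosine")
model.fit(book_pivot.to_numpy())


# 3. Recommendation: ask for n + 1 neighbors and drop the query book.
def similar_books(title, n=5):
    query = book_pivot.loc[[title]].to_numpy()
    _, idx = model.kneighbors(query, n_neighbors=n + 1)
    names = [book_pivot.index[i] for i in idx[0]]
    return [name for name in names if name != title][:n]


print(similar_books("Dune", n=2))
```

With the default `n=5` the function requests 6 neighbors, matching the description above; the toy matrix only has four books, so the example call uses `n=2`.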
- Algorithm: K-Nearest Neighbors (KNN)
- Similarity Metric: Cosine similarity / Euclidean distance
- Number of Neighbors: 6 (returns 5 recommendations)
- Approach: Item-based collaborative filtering over a user-book rating matrix
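The list above names both cosine similarity and Euclidean distance without saying which one the trained model uses. As a hedged illustration of how the two differ, the sketch below computes each on two invented rating vectors; the function names are hypothetical and not taken from the project.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (1.0 = same direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def euclidean_distance(a, b):
    """Straight-line distance between two rating vectors (0.0 = identical)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))


# Invented vectors: one entry per user, 0 means the user did not rate the book.
book_x = [8, 0, 6, 0]
book_y = [4, 0, 3, 0]  # same users, same preference pattern, lower scores

print(cosine_similarity(book_x, book_y))   # 1.0
print(euclidean_distance(book_x, book_y))  # 5.0
```

Cosine similarity treats the two books as identical because it ignores the overall rating level, while Euclidean distance penalizes the gap in scores. In scikit-learn, `NearestNeighbors` defaults to the Minkowski metric with `p=2`, which is Euclidean distance, and accepts `metric="cosine"` for the alternative.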
The project uses the Book-Crossing Dataset, which contains:
- Books: 271,360 books with metadata (ISBN, title, author, publisher, etc.)
- Users: 278,858 users (anonymized)
- Ratings: 1,149,780 ratings (scale 0-10, where 0 marks an implicit rating)
Dataset Source: Book Recommendation Dataset on Kaggle
Original Source: Book-Crossing dataset collected by Cai-Nicolas Ziegler
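Loading the raw files might look like the sketch below. It assumes the semicolon separator and Latin-1 encoding of the original Book-Crossing dump, and `on_bad_lines="skip"` requires pandas 1.3 or newer. The helper name `load_bx_csv` is hypothetical, and a different copy of the dataset may need different settings.

```python
import pandas as pd


def load_bx_csv(path):
    """Read one Book-Crossing CSV file into a DataFrame.

    Assumes ';' as the separator and Latin-1 encoding; malformed rows
    are skipped rather than raising an error.
    """
    return pd.read_csv(path, sep=";", encoding="latin-1", on_bad_lines="skip")
```

For example, `load_bx_csv("Datasets/BX-Book-Ratings.csv")` would return the ratings table, provided the file sits at the path shown in the project structure.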
To retrain the model or explore the data analysis process:
- Open research.ipynb in Jupyter Notebook or VS Code
- Run all cells sequentially
- The notebook will:
  - Load and explore the datasets
  - Perform data cleaning and preprocessing
  - Create the user-item matrix
  - Train the KNN model
  - Save artifacts to the artifacts/ folder
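The final save step can be sketched with the standard `pickle` module. The helper names `save_artifacts` and `load_artifact` are hypothetical; the notebook may write the files differently.

```python
import pickle
from pathlib import Path


def save_artifacts(objects, folder="artifacts"):
    """Pickle each value in `objects` to `<folder>/<key>.pkl`."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, obj in objects.items():
        with open(out_dir / f"{name}.pkl", "wb") as fh:
            pickle.dump(obj, fh)


def load_artifact(name, folder="artifacts"):
    """Load `<folder>/<name>.pkl` back into memory."""
    with open(Path(folder) / f"{name}.pkl", "rb") as fh:
        return pickle.load(fh)
```

A call such as `save_artifacts({"model": model, "book_names": book_names})` would produce `artifacts/model.pkl` and `artifacts/book_names.pkl`, assuming those variables exist in the notebook. Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.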
- Input: Book title (string)
- Output: List of recommended books and their poster URLs
- Process: Finds nearest neighbors using KNN model
- Input: Book indices from KNN suggestions
- Output: List of poster/cover image URLs
- Process: Maps book indices to image URLs from dataset
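The two functions described above might be structured as in the following sketch. The function names (`recommend_book`, `fetch_poster`), the `title` and `image_url` column names, and the toy data are all assumptions for illustration; the code in app.py may differ.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors


def fetch_poster(suggestion, book_pivot, final_rating):
    """Map row indices of `book_pivot` to cover image URLs."""
    urls = []
    for idx in suggestion:
        title = book_pivot.index[idx]
        # final_rating may hold many rows per title; take the first URL.
        match = final_rating.loc[final_rating["title"] == title, "image_url"]
        urls.append(match.iloc[0])
    return urls


def recommend_book(title, model, book_pivot, final_rating, n_neighbors=6):
    """Return (titles, poster_urls) for the books nearest to `title`."""
    row = book_pivot.index.get_loc(title)
    query = book_pivot.iloc[row].to_numpy().reshape(1, -1)
    _, suggestion = model.kneighbors(query, n_neighbors=n_neighbors)
    # Drop the query book itself, keeping at most n_neighbors - 1 results.
    neighbors = [i for i in suggestion[0] if i != row][: n_neighbors - 1]
    titles = [book_pivot.index[i] for i in neighbors]
    return titles, fetch_poster(neighbors, book_pivot, final_rating)


# Toy stand-ins for the pickled artifacts (invented values).
book_pivot = pd.DataFrame(
    [[9, 8, 1, 0], [2, 0, 9, 8], [8, 9, 0, 2], [0, 1, 8, 9]],
    index=["Dune", "Emma", "Foundation", "Persuasion"],
)
final_rating = pd.DataFrame({
    "title": ["Dune", "Emma", "Foundation", "Persuasion"],
    "image_url": [
        "http://example.com/dune.jpg",
        "http://example.com/emma.jpg",
        "http://example.com/foundation.jpg",
        "http://example.com/persuasion.jpg",
    ],
})
model = NearestNeighbors(algorithm="brute").fit(book_pivot.to_numpy())

titles, posters = recommend_book(
    "Dune", model, book_pivot, final_rating, n_neighbors=3)
print(titles)
print(posters)
```

A title that is not in `book_pivot` raises `KeyError` from `get_loc`, which a caller could catch and turn into a user-facing message.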
The Streamlit interface includes:
- Clean header with emoji icon
- Descriptive subtitle and methodology info
- Searchable dropdown for book selection
- Recommendation button
- 5-column grid layout for displaying results
- Book titles and cover images for each recommendation
- Error handling with user-friendly messages
Contributions are welcome! Here are some ways you can contribute:
- Report bugs and issues
- Suggest new features or improvements
- Improve documentation
- Add new recommendation algorithms
- Enhance the UI/UX
This project is open-source and available for educational purposes.
- Book-Crossing Dataset - For providing the comprehensive books and ratings data
- Streamlit - For the excellent web framework
- Scikit-learn - For machine learning tools
Vineet Patel
- Email: [email protected]
- GitHub: @vineet416
- LinkedIn: @vineet416
For questions or feedback, please open an issue in the repository or reach out via email.
Note: Make sure to have all pickle files generated before running the app. Run the research.ipynb notebook first to generate these files if they don't exist.
If you found this project helpful, please give it a star!