Sebastian Quirarte | [email protected] | in/sebastianquirarte | Last Updated: Nov 29 2023
The aim of this project is to extract, transform, and visualize data from any YouTube channel and all of its videos, using Python with the YouTube API and PowerBI. In this case, I analyzed one of my favorite channels: Kurzgesagt – In a Nutshell.
- Data Extraction and Transformation: Extracting data directly from YouTube through its API and storing it in dataframes.
- Data Preprocessing: Converting data types, creating new columns, and cleaning the dataframes.
- Visualization and Analysis: Visualizing and analyzing the data obtained from all the videos uploaded by the channel of interest.
- Dashboard: Creating a PowerBI dashboard from the collected data to present the project results concisely.
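
The extraction step can be sketched as a paginated request loop. This is a minimal illustration, not the code in 'youtubeAPI.py' (which is not shown here): the function name `fetch_video_ids` is hypothetical, and `youtube` is assumed to be a client created with `googleapiclient.discovery.build("youtube", "v3", developerKey=...)`. The `playlistItems().list(...)` call returns at most 50 items per page, so the loop follows `nextPageToken` until no further page is reported.

```python
def fetch_video_ids(youtube, playlist_id):
    """Collect every video ID in a playlist by following nextPageToken.

    `youtube` is assumed to be a YouTube Data API v3 client object;
    `playlist_id` is assumed to be the channel's uploads playlist.
    """
    video_ids = []
    page_token = None  # None requests the first page
    while True:
        response = youtube.playlistItems().list(
            part="contentDetails",
            playlistId=playlist_id,
            maxResults=50,  # the API's maximum page size
            pageToken=page_token,
        ).execute()
        for item in response.get("items", []):
            video_ids.append(item["contentDetails"]["videoId"])
        # The last page carries no nextPageToken, which ends the loop.
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return video_ids
```

The uploads playlist ID for a channel can be read from the `contentDetails.relatedPlaylists.uploads` field of a `channels().list(...)` response. The resulting IDs can then be passed in batches to `videos().list(...)` to retrieve per-video statistics.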
- googleapiclient: the official Python client library for Google's discovery-based APIs.
- pandas: data structures and operations for manipulating and analyzing numerical tables and time series.
- IPython: interactive command-line terminal for Python. Used in this case for JSON formatting.
- dateutil: helps you manipulate and work with dates and time stamps.
- isodate: implements ISO 8601 date, time and duration parsing.
- youtubeAPI: functions written specifically for this project, available in the file 'youtubeAPI.py'.
- seaborn: data visualization library based on matplotlib.
- matplotlib: comprehensive library for creating static, animated, and interactive visualizations in Python.
- nltk: symbolic and statistical natural language processing (NLP) for English written in Python.
- wordcloud: word cloud generator in Python.
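
As an illustration of the preprocessing that the date and duration libraries above support, here is a dependency-free sketch of cleaning one video record. It is a stand-in, not the project's code: the project uses isodate and dateutil for parsing, whereas this version swaps in a small regular expression and `datetime.strptime` so it runs with the standard library alone. The names `parse_duration`, `clean_video`, and `sample_item` are hypothetical, and the sample values are made up. Only the field layout (`snippet`, `statistics`, `contentDetails`, with counts delivered as strings) follows the YouTube Data API v3 video resource.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the subset of ISO 8601 durations made of days, hours, minutes, seconds,
# e.g. "PT8M12S" or "P1DT2H".
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(value):
    """Convert an ISO 8601 duration string into a timedelta."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    return timedelta(**parts)


def clean_video(item):
    """Flatten one API video item into a row with typed values."""
    stats = item["statistics"]
    published = datetime.strptime(
        item["snippet"]["publishedAt"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    duration = parse_duration(item["contentDetails"]["duration"])
    return {
        "title": item["snippet"]["title"],
        "published_at": published,
        "publish_day": published.strftime("%A"),  # new derived column
        "duration_seconds": int(duration.total_seconds()),
        # Counts arrive as strings; a missing count is treated as 0 here.
        "view_count": int(stats.get("viewCount", 0)),
        "like_count": int(stats.get("likeCount", 0)),
    }


# Made-up record in the shape of a videos().list(...) item.
sample_item = {
    "snippet": {"title": "Example video", "publishedAt": "2023-11-29T15:00:00Z"},
    "statistics": {"viewCount": "1200", "likeCount": "34"},
    "contentDetails": {"duration": "PT8M12S"},
}
row = clean_video(sample_item)
```

A list of such rows can be passed directly to `pandas.DataFrame(...)`, which yields one column per key with the converted types.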
Dashboard PowerBI (LINK)
