au500/visualize_dataset

# How to Best Visualize a Dataset Easily

# Overview

This is the code for this video by Siraj Raval on YouTube. The human activities dataset contains 5 classes (sitting-down, standing-up, standing, walking, and sitting) collected over 8 hours of activity from 4 healthy subjects. The dataset is downloaded from here. This code downloads the dataset, cleans it, creates feature vectors, then uses t-SNE to reduce the feature vectors to just 2 dimensions. Finally, it uses matplotlib to plot the resulting 2D points.
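The reduce-then-plot step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `data_visualization.py`: it assumes scikit-learn is installed, and it uses synthetic random features as a stand-in for the real activity data. The feature count, the sample count per class, the `perplexity` value, and the output filename are all assumptions chosen for the example.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Class names taken from the dataset description; everything else is synthetic.
class_names = ["sitting-down", "standing-up", "standing", "walking", "sitting"]
samples_per_class = 60   # assumption, for illustration only
n_features = 12          # assumption, not the real feature count

# Build a stand-in feature matrix: one Gaussian blob per class.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(loc=3.0 * i, scale=1.0, size=(samples_per_class, n_features))
    for i in range(len(class_names))
])
labels = np.repeat(np.arange(len(class_names)), samples_per_class)

# Standardize the features, then reduce them to 2 dimensions with t-SNE.
scaled = StandardScaler().fit_transform(features)
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(scaled)

# Scatter plot of the 2D embedding, one color per activity class.
fig, ax = plt.subplots(figsize=(7, 6))
for i, name in enumerate(class_names):
    mask = labels == i
    ax.scatter(embedding[mask, 0], embedding[mask, 1], s=10, label=name)
ax.set_title("t-SNE embedding (synthetic stand-in data)")
ax.legend()
fig.savefig("tsne_sketch.png", dpi=150)
```

With real data, `features` and `labels` would come from the cleaned dataset instead of the random generator. t-SNE results vary with `perplexity` and the random seed, so it is worth trying a few values before reading much into the clusters.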

## Dependencies

Install dependencies with 'pip install' (e.g. pip install pandas).

**Note:** An updated dataset is available here if the other link is broken: http://rstudio-pubs-static.s3.amazonaws.com/19668_2a08e88c36ab4b47876a589bb1d61c37.html

## Usage

To run this code, run the following in a terminal:

python data_visualization.py

## Challenge

The challenge for this video is to visualize this Game of Thrones dataset. Use t-SNE to lower the dimensionality of the data and plot it using matplotlib. In your README, write 2-3 sentences about something you discovered in the data after visualizing it. This is good practice for understanding why dimensionality reduction matters and for analyzing data visually.
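Before t-SNE can be applied to a new tabular dataset, the table usually has to be turned into a clean numeric matrix. The helper below is one possible way to do that with pandas. It is a sketch under stated assumptions: `prepare_features` is a hypothetical name, the demo column names are invented for illustration and are not taken from the Game of Thrones dataset, and median imputation is just one simple choice for missing values.

```python
import pandas as pd


def prepare_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a standardized numeric feature table (hypothetical helper)."""
    # Keep only numeric columns; text columns need separate encoding.
    numeric = df.select_dtypes(include="number")
    # Drop columns that are entirely missing.
    numeric = numeric.dropna(axis=1, how="all")
    # Fill remaining gaps with each column's median.
    numeric = numeric.fillna(numeric.median())
    # Drop constant columns, which carry no information and would divide by zero.
    numeric = numeric.loc[:, numeric.std(ddof=0) > 0]
    # Standardize to zero mean and unit variance per column.
    return (numeric - numeric.mean()) / numeric.std(ddof=0)


# Tiny demo table with invented columns, for illustration only.
demo = pd.DataFrame({
    "name": ["a", "b", "c"],
    "age": [10.0, None, 30.0],
    "constant": [1, 1, 1],
})
prepared = prepare_features(demo)
print(prepared)
```

The returned table can be passed to t-SNE as `prepared.to_numpy()`. Categorical columns are dropped here for simplicity; one-hot encoding them first is a common alternative.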

## Due Date is December 29th, 2016

## Credits

The credits for this code go to Yifeng-He. I've merely created a wrapper around the code to make it easy for people to get started.
