Abstract

YouTube videos are categorized into a number of classes, such as music, entertainment, and sports. This paper explores how these categories are related, how strongly they are related, and the techniques used to obtain this information. Data mining, graph analysis, and visualization techniques are leveraged to accomplish this goal.

Introduction

YouTube is the preeminent public video sharing platform on the internet today. Its vast array of examples of human behavior can provide insight into how we as a society categorize activities, and how those categorizations are related to each other. Each video has a number of attributes we can mine, including a variable number of related videos. We can use the related-video information to construct a social network of videos that shows their relationships.

The related videos are determined by YouTube algorithmically, not by hand, so this approach is not as organic as human-labeled data. However, machine labeling allows access to a much larger data set.

This work seeks to explore the relationships between videos through first-order attributes and to infer second-order relationships, such as how similar or dissimilar video categories are.
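As a rough illustration of moving from first-order attributes to a second-order relationship, the sketch below aggregates related-video links into link counts between categories. It is a minimal example under stated assumptions, not the method used in this paper: the `videos` records, the function names, and the normalization by total link count are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical toy records: video id -> (category, list of related video ids).
# Real data would come from crawled video metadata.
videos = {
    "v1": ("Music", ["v2", "v3"]),
    "v2": ("Music", ["v1", "v4"]),
    "v3": ("Entertainment", ["v1", "v4"]),
    "v4": ("Sports", ["v3"]),
}


def category_link_counts(videos):
    """Count related-video links between each unordered pair of categories."""
    counts = Counter()
    for category, related in videos.values():
        for other_id in related:
            if other_id not in videos:
                continue  # related video lies outside the collected set
            other_category = videos[other_id][0]
            # Sort so that (A, B) and (B, A) are counted as the same pair.
            pair = tuple(sorted((category, other_category)))
            counts[pair] += 1
    return counts


def normalized_strength(counts):
    """Express each category pair's link count as a share of all links."""
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


counts = category_link_counts(videos)
strength = normalized_strength(counts)
```

Here each related-video link is a first-order attribute of a single video, while the per-pair share in `strength` is one possible second-order measure of how closely two categories are tied. Other weightings, such as normalizing by the number of videos in each category, would be equally reasonable.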

Full Paper