This project opened my eyes to the realities of data science, especially how hard it is to get the data we want from open sources. It also exposed me to real-world patterns and trend analytics, in this case for movies and YouTube videos.
It also equipped me with the skills needed to conduct cohesive research and data analysis on various topics, from preprocessing and visualization to statistical inference and rule mining.
Overall, the results we got from this project were, to say the least, interesting for a casual movie-goer, and they also give movie production companies a guide on which genres to release in which month to achieve higher user ratings.
My Roles
Exploratory Data Analysis
Pre-processed, formatted, and visualized The Movies Database and YouTube Trending Videos datasets
Statistical Inferencing
Applied a Chi-Square Test of Independence to movie genre and release month, uncovering a significant dependence between a movie's genre and the month it was released
Association Rule Mining
Combined movie genre ratings with release month to derive rules on which genres are more likely to score high in which month
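
To make the last two roles more concrete, here is a minimal sketch of the two techniques in plain Python. It is not the project's actual code: the records are made up for illustration, the function names `chi_square_statistic` and `rule_metrics` are hypothetical, and the real analysis most likely relied on a statistics or rule-mining library rather than hand-rolled functions.

```python
from collections import Counter

# Toy records, made up for illustration: (genre, release_month, high_rating)
movies = [
    ("Horror", "Oct", True),
    ("Horror", "Oct", True),
    ("Horror", "Oct", False),
    ("Horror", "Jun", False),
    ("Family", "Dec", True),
    ("Family", "Dec", True),
    ("Family", "Jun", True),
    ("Family", "Oct", False),
    ("Action", "Jun", True),
    ("Action", "Jun", False),
    ("Action", "Jun", True),
    ("Action", "Dec", False),
]


def chi_square_statistic(pairs):
    """Return (chi2, degrees_of_freedom) for a list of (row, col) labels."""
    n = len(pairs)
    observed = Counter(pairs)
    row_totals = Counter(r for r, _ in pairs)
    col_totals = Counter(c for _, c in pairs)
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            # Expected count if genre and month were independent
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


def rule_metrics(records, genre, month):
    """Support, confidence, and lift for the rule {genre, month} -> high rating."""
    n = len(records)
    matches = [high for g, m, high in records if g == genre and m == month]
    if not matches:
        return None  # the antecedent never occurs, so the rule is undefined
    both = sum(matches)
    support = both / n
    confidence = both / len(matches)
    base_rate = sum(high for _, _, high in records) / n
    lift = confidence / base_rate
    return support, confidence, lift


chi2, dof = chi_square_statistic([(g, m) for g, m, _ in movies])
print(f"chi2={chi2:.3f}, dof={dof}")
print(rule_metrics(movies, "Horror", "Oct"))
```

The chi-square statistic is compared against a chi-square distribution with the reported degrees of freedom to decide whether genre and month are independent. A lift above 1 suggests that a genre released in that month is rated highly more often than movies are on average.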