1 of 18

IMDB Movie Analysis

By: Ravi Verma

2 of 18

Project Description:

  • Analyse the IMDB movie dataset to answer the question:

“What factors influence the success of a movie on IMDB?”

Here, success is defined as a high IMDB rating.

  • The answer matters to
    • movie producers,
    • directors, and
    • investors.
  • Understanding what makes a movie successful helps them make informed decisions about future projects.

3 of 18

APPROACH:

  • First, explore the dataset to understand its tables, columns and rows, then prepare it:
    1. Cleaning data
    2. Handling missing data
    3. Combining (clubbing) columns
    4. Removing outliers
  • Work out which Excel formulas and functions each task needs.
  • Use those formulas and functions to extract the data each task asks for.
  • Create visualizations from the extracted data.

4 of 18

Cleaning and Handling Data:

Found and removed 2697 blank cells. Rows missing data in any necessary column were removed.

127 duplicates were found in the movie_title column.
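
The cleaning steps above were done in Excel. As a rough cross-check outside Excel, the same logic can be sketched in plain Python. The sample rows, the `REQUIRED` column set and the `clean` helper are illustrative assumptions; only the `movie_title` column name comes from the slides.

```python
# Hypothetical sample rows; the real sheet has many more rows and columns.
rows = [
    {"movie_title": "Movie A", "director_name": "Director A", "imdb_score": "7.9"},
    {"movie_title": "Movie A", "director_name": "Director A", "imdb_score": "7.9"},  # duplicate title
    {"movie_title": "Movie B", "director_name": "", "imdb_score": "6.1"},            # blank cell
    {"movie_title": "Movie C", "director_name": "Director C", "imdb_score": "6.8"},
]

# Assumed set of "necessary" columns; adjust to match the actual analysis.
REQUIRED = ("movie_title", "director_name", "imdb_score")


def clean(rows, required=REQUIRED, key="movie_title"):
    """Drop rows with a blank required field, then keep the first row per title."""
    seen = set()
    cleaned = []
    for row in rows:
        # Skip the row if any required cell is missing or only whitespace.
        if any(not (row.get(col) or "").strip() for col in required):
            continue
        title = row[key].strip()
        # Skip later rows that repeat a title already kept.
        if title in seen:
            continue
        seen.add(title)
        cleaned.append(row)
    return cleaned


cleaned = clean(rows)
print([r["movie_title"] for r in cleaned])
```

Keeping the first occurrence of each title mirrors what Excel's Remove Duplicates does on a single column.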

5 of 18

Tech Stack Used:

Using MS Excel for analysing the dataset:

Microsoft Excel is a versatile tool for data analysis and visualization. It offers functions for organizing and manipulating data, pivot tables for summarizing large datasets, and various chart types for visual representation. With features like data validation and What-If Analysis, users can ensure data accuracy and explore different scenarios. Excel's sharing capabilities enable collaboration, making it a go-to choice across industries for effective data-driven decision-making.

6 of 18

Insights:

Task A: Movie Genre Analysis

Task B: Movie Duration Analysis

Task C: Language Analysis

Task D: Director Analysis

Task E: Budget Analysis

7 of 18

Task A. Movie Genre Analysis: Analyze the distribution of movie genres and their impact on the IMDB score.
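
A minimal sketch of how a genre distribution and per-genre average score could be computed, assuming each movie lists its genres in one cell separated by `|` and has an `imdb_score` value. Both column names, the separator and the sample rows are assumptions, not taken from the slides.

```python
from collections import defaultdict

# Hypothetical rows: "genres" is assumed to be a pipe-separated string.
movies = [
    {"genres": "Action|Drama", "imdb_score": 8.0},
    {"genres": "Drama", "imdb_score": 7.0},
    {"genres": "Comedy|Drama", "imdb_score": 6.0},
    {"genres": "Comedy", "imdb_score": 5.0},
]


def genre_summary(movies):
    """Return {genre: (movie_count, mean_score)}, most common genre first."""
    scores = defaultdict(list)
    for movie in movies:
        # A movie with several genres contributes its score to each of them.
        for genre in movie["genres"].split("|"):
            scores[genre].append(movie["imdb_score"])
    summary = {g: (len(v), sum(v) / len(v)) for g, v in scores.items()}
    return dict(sorted(summary.items(), key=lambda kv: kv[1][0], reverse=True))


summary = genre_summary(movies)
for genre, (count, mean_score) in summary.items():
    print(genre, count, round(mean_score, 2))
```

Because a multi-genre movie is counted once per genre, the genre counts add up to more than the number of movies.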

8 of 18

Task A: Descriptive Statistics of the IMDB scores
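
For reference, the usual descriptive statistics can be reproduced with Python's standard `statistics` module. This is only a sketch: the `scores` list is made-up sample data, and the slides do not say which statistics were reported.

```python
import statistics

# Hypothetical IMDB scores, for illustration only.
scores = [6.0, 7.0, 7.0, 8.0, 9.0]

stats = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "min": min(scores),
    "max": max(scores),
    "range": max(scores) - min(scores),
    "variance": statistics.variance(scores),  # sample variance (n - 1 denominator)
    "stdev": statistics.stdev(scores),        # sample standard deviation
}

for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

`statistics.variance` and `statistics.stdev` use the sample (n - 1) formulas, which correspond to Excel's `VAR.S` and `STDEV.S`.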

9 of 18

Task B. Movie Duration Analysis: Analyze the distribution of movie durations and their impact on the IMDB score.

B.1: Distribution of the number of movies by duration.

10 of 18

Task B. Movie Duration Analysis: B.2: Distribution vs IMDB Score

11 of 18

Task B: Tables and Formulas

Formula for distributing movie durations by class intervals:

=COUNTIF($A$2:$A$3715, "<60")

Formula for calculating average IMDB Score as per duration distribution:

=AVERAGEIFS($B$2:$B$3715,$A$2:$A$3715,">=60",$A$2:$A$3715,"<90")
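
The COUNTIF and AVERAGEIFS pattern can be sketched in Python as follows. The class boundaries in `BINS` and the sample `(duration, score)` pairs are assumptions for illustration. The sketch uses half-open intervals `[low, high)` so that a boundary duration, such as exactly 60 minutes, falls into exactly one class.

```python
# Hypothetical (duration in minutes, IMDB score) pairs.
pairs = [(45, 6.0), (60, 7.0), (75, 6.5), (95, 7.5), (110, 8.0), (130, 7.0)]

# Assumed class intervals, each half-open: low <= duration < high.
BINS = [(0, 60), (60, 90), (90, 120), (120, float("inf"))]


def bin_summary(pairs, bins=BINS):
    """Return [((low, high), movie_count, mean_score_or_None), ...] per class."""
    result = []
    for low, high in bins:
        in_bin = [score for duration, score in pairs if low <= duration < high]
        mean_score = sum(in_bin) / len(in_bin) if in_bin else None
        result.append(((low, high), len(in_bin), mean_score))
    return result


summary = bin_summary(pairs)
for (low, high), count, mean_score in summary:
    print(f"[{low}, {high}): count={count}, mean={mean_score}")
```

An empty class returns `None` for the mean, much as AVERAGEIFS returns `#DIV/0!` when no cell matches.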

12 of 18

Task C. Language Analysis: Examine the distribution of movies based on their language.
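
A small sketch of a language distribution, assuming each movie has a single language value. The `languages` list is made-up sample data.

```python
from collections import Counter

# Hypothetical language column values, one per movie.
languages = ["English", "English", "French", "English", "Spanish", "French"]

counts = Counter(languages)

# Three most frequent languages with their movie counts.
top3 = counts.most_common(3)

# Share of all movies per language.
share = {lang: n / len(languages) for lang, n in counts.items()}

print(top3)
print({lang: round(s, 2) for lang, s in share.items()})
```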

13 of 18

Task D. Director Analysis: Influence of directors on movie ratings.

Top 15 directors by IMDB score

Bottom 15 directors by IMDB score
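
One way to rank directors is by the mean IMDB score of their movies. The slides do not state which aggregation was used, so the mean, the `rank_directors` helper and the sample data below are all assumptions.

```python
from collections import defaultdict

# Hypothetical (director, IMDB score) pairs, one per movie.
films = [
    ("Director A", 8.0),
    ("Director A", 9.0),
    ("Director B", 6.0),
    ("Director C", 4.0),
    ("Director C", 5.0),
]


def rank_directors(films, n):
    """Return (top_n, bottom_n) lists of (director, mean_score)."""
    by_director = defaultdict(list)
    for director, score in films:
        by_director[director].append(score)
    means = {d: sum(v) / len(v) for d, v in by_director.items()}
    ranked = sorted(means.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:n]
    bottom = ranked[-n:][::-1]  # lowest mean score first
    return top, bottom


top, bottom = rank_directors(films, 2)
print("Top:", top)
print("Bottom:", bottom)
```

With the real data, `n` would be 15. A director with a single movie can dominate either list, so a minimum-movie filter may be worth considering.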

14 of 18

Task E. Budget Analysis: Explore the relationship between movie budgets and their financial success.

15 of 18

Task E. Budget Analysis: Table representing Correlation coefficient, gross profit margin, movie title.
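
The quantities in this table can be sketched as below. The definitions are assumptions, since the slides do not spell them out: profit is taken as gross minus budget, gross profit margin as profit divided by gross, and the correlation as Pearson's coefficient between budget and gross. The sample movies are made up.

```python
import math

# Hypothetical movies; budget and gross in the same currency unit.
movies = [
    {"title": "Movie A", "budget": 1.0, "gross": 2.0},
    {"title": "Movie B", "budget": 2.0, "gross": 4.0},
    {"title": "Movie C", "budget": 3.0, "gross": 6.0},
    {"title": "Movie D", "budget": 4.0, "gross": 3.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


for m in movies:
    m["profit"] = m["gross"] - m["budget"]   # assumed definition of profit
    m["margin"] = m["profit"] / m["gross"]   # assumed definition of gross profit margin

r = pearson([m["budget"] for m in movies], [m["gross"] for m in movies])
best = max(movies, key=lambda m: m["profit"])
worst = min(movies, key=lambda m: m["profit"])

print(f"correlation(budget, gross) = {r:.3f}")
print("highest profit:", best["title"], "| highest loss:", worst["title"])
```

`pearson` computes the same statistic as Excel's `CORREL`. It divides by zero if either column is constant.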

16 of 18

Links to the Presentation and Excel Sheet

  • Excel: https://docs.google.com/spreadsheets/d/1l56CGUd7IyVk5PA-ZB59h5EMs2brZobW/edit?usp=sharing&ouid=112358832982115230109&rtpof=true&sd=true
  • Presentation: https://drive.google.com/file/d/1ho0KTbjwFlL-Ko8XCdgaeSqhWFqfnolO/view?usp=sharing

17 of 18

Results:

  1. The five most common movie genres are Drama, Comedy, Thriller, Action and Romance.
  2. Most movies run between 1.5 and 2 hours.
  3. The top three movie languages are English, French and Spanish.
  4. Identified the top 15 and bottom 15 directors by IMDB rating.
  5. Analysed budgets and identified the movies with the highest profit and the highest loss.

18 of 18

THANK YOU!