1 of 7

AI Mini-Project: ReelRatings

Group 16: Jeff Suliga, Tanya Acharya, Rishi Patel, Parth Mittal

2 of 7

Problem Statement & Analysis

  • In a world flooded with content, it's hard to figure out what's worth your time.
  • Reviews help you decide whether a movie or show matches your preferences, but on review sites like Letterboxd, "joke" reviews can make the overall picture hard to read.
  • ReelRatings is a tool that analyzes the text of user reviews to predict the rating each review actually implies, helping users see whether a movie or show is received positively or negatively and navigate the platform more effectively.

3 of 7

Use-Case Scenarios

  • Help users decide which movies/shows to watch
  • Filter out “joke” reviews by analyzing sentiment, so users see reviews interpreted by what they actually mean
  • Can be adopted by companies that want to improve their recommendation algorithms
  • Can give filmmakers and producers insight into how the public perceives their work

4 of 7

AI Algorithm & Model

  • Support Vector Regression (SVR) Model:
    • Applied a traditional machine learning approach, a Support Vector Regression (SVR) model, to 50,000 IMDB reviews.
    • Predicts a numerical sentiment score and serves as the baseline for performance comparison.
    • Preprocessing steps included text normalization, tokenization, and TF-IDF vectorization for feature extraction.
  • DistilBERT Model:
    • Leveraged DistilBERT, a transformer-based model optimized for efficiency.
    • Although smaller and faster than BERT, DistilBERT is reported by its authors to retain about 97% of BERT's language understanding capabilities.
    • Fine-tuned on 15,000 IMDB reviews for sentiment classification, a size chosen to fit our computational limits.
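
The SVR preprocessing steps above (normalization, tokenization, TF-IDF) can be sketched roughly as follows. This is a minimal standard-library illustration, not the project's actual code: the slides do not say which library or TF-IDF weighting was used, so the helper names `normalize_and_tokenize` and `tfidf_vectors`, the `tf * (ln(N / df) + 1)` weighting, and the two sample reviews are all assumptions made for the example. In practice a library vectorizer such as scikit-learn's `TfidfVectorizer` would more likely produce the features fed to the SVR.

```python
import math
import re
from collections import Counter


def normalize_and_tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical helper)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) using an assumed tf * (ln(N / df) + 1) weighting."""
    tokenized = [normalize_and_tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {term: math.log(n_docs / doc_freq[term]) + 1.0 for term in vocabulary}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty reviews
        vectors.append([counts[term] / total * idf[term] for term in vocabulary])
    return vocabulary, vectors


# Made-up sample reviews, for illustration only.
reviews = [
    "A brilliant, moving film.",
    "A dull film with a dull plot.",
]
vocab, vecs = tfidf_vectors(reviews)
```

Each review becomes one fixed-length vector over the shared vocabulary. Terms that are frequent in a review but rare across the collection (here, "dull") receive the largest weights.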

5 of 7

Results and Demonstration

  • SVR Model Performance
    • The SVR model achieved an RMSE of 0.3085 on predicted sentiment scores. An error this low suggests the model captures gradations of sentiment rather than only its polarity.
  • DistilBERT Model
    • Training loss fell threefold across epochs.
    • The RMSE values of the SVR (0.3085) and DistilBERT (0.3271) models are close, with the SVR slightly ahead (lower RMSE is better).
    • High reliability on the classification task.
  • Combined Approach Analysis
    • Leveraging the strengths of both models, our combined approach offers a more balanced sentiment rating.
    • DistilBERT first classifies the sentiment (positive/negative), which then guides the SVR model to adjust its rating within a specific range, providing a more context-aware sentiment score.
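
One way to read the combined approach is sketched below. It is only an illustration under stated assumptions: the slides do not give the rating scale, the range assigned to each class, or how the SVR score is "adjusted". The function name `combined_rating`, the 0-to-1 scale, the midpoint split, and the clamping rule are therefore all hypothetical.

```python
def combined_rating(label, svr_score, low=0.0, high=1.0):
    """Clamp an SVR score into the half of the scale implied by the class label.

    Assumptions (not from the slides): ratings lie in [low, high], a
    "negative" label maps to the lower half, a "positive" label to the
    upper half, and "adjusting" means clamping into that half.
    """
    midpoint = (low + high) / 2
    if label == "positive":
        range_low, range_high = midpoint, high
    elif label == "negative":
        range_low, range_high = low, midpoint
    else:
        raise ValueError(f"unknown label: {label!r}")
    # Keep the SVR score when it agrees with the classifier; otherwise
    # pull it to the nearest edge of the allowed range.
    return min(max(svr_score, range_low), range_high)
```

For example, under these assumptions `combined_rating("positive", 0.3)` is pulled up to the midpoint because the classifier and the regressor disagree, while `combined_rating("negative", 0.3)` is returned unchanged.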

6 of 7

Lessons Learned

  • Increased Epochs:
    • Training for more epochs could help the model learn more from the data during fine-tuning.
  • Expanded Training Dataset:
    • Using a larger and more diverse collection of reviews can help the model understand a wider range of opinions and become more accurate.
  • Enhanced Hardware Resources:
    • More powerful hardware would speed up training, making larger datasets and more complex models practical.
  • Ensemble Methods:
    • Combining multiple models or techniques can improve overall prediction accuracy by leveraging diverse approaches.

7 of 7

Q & A