1 of 29

Purple 7: Andrew, Billy, Pallavi, Shivangi, Slim, Vanessa

DATA ANALYSIS

MSIS 510

2 of 29

TABLE OF CONTENTS

01 About Kickstarter

02 Visualization Tools

03 Multiple Linear Regression

04 Decision Tree Modeling

05 Logistic Regression

06 Conclusion

3 of 29

Introduction

01


4 of 29

Kickstarter’s Mission

To bring more creativity into the world

5 of 29

What is Kickstarter?

  • Do you have a great idea and need help funding a project? Kickstarter can help!

  • Kickstarter helps artists, musicians, filmmakers, designers, and other creators find the resources and support they need to make their ideas a reality.

  • Many creative projects, big and small, have come to life with the support of the Kickstarter community.

ABOUT US

6 of 29

How Kickstarter Works

Creators (fund seekers)

Backers (Community members)

Micro-payments

Funding minus 5% commission

Exclusive products/services

Exposure on platform

7 of 29

Dataset & Market Trends

[Chart: successfully funded vs. unsuccessfully funded projects]

  • The Kickstarter dataset captures details on almost 20,000 projects across all categories, including design and technology.

  • For each project, the dataset records fields such as title, category, location, and fundraising goal.

  • The data also includes performance metrics for each project, along with its outcome.

8 of 29

Visualization Tools


02

9 of 29

Understand & Clean Data

Numeric Data

Categorical Data

Launch Duration

Prepare Duration

Title

Blurb

Staff Picked

Category (popularity..)

Location (City, Country)

(Language..)

Deadline (Week, Month…)

Supporter Number

Goal

Pledge Amount

Status

More Data?

COVID?

Text Data
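Several of the variables above are derived rather than stored directly. The sketch below shows one way they might be computed. The field names (`created_at`, `launched_at`, `deadline`, and so on) and the sample record are hypothetical, and it assumes "Prepare Duration" means the days from page creation to launch and "Launch Duration" means the days from launch to deadline; the deck does not define either term.

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical record; field names are assumptions, not the dataset's real columns.
project = {
    "name": "Handmade Walnut Desk Organizer",
    "blurb": "A compact organizer cut from a single walnut board.",
    "created_at": datetime(2021, 1, 4),
    "launched_at": datetime(2021, 1, 18),
    "deadline": datetime(2021, 2, 17),
}

def derive_features(p):
    """Return variables derived from the date and text fields of one project."""
    return {
        # Assumed definition: days between creating the page and launching it.
        "prepare_duration": (p["launched_at"] - p["created_at"]).days,
        # Assumed definition: days the campaign stays open.
        "launch_duration": (p["deadline"] - p["launched_at"]).days,
        # Word counts of the two text fields.
        "title_length": len(p["name"].split()),
        "blurb_length": len(p["blurb"].split()),
        # Calendar breakdowns of the deadline.
        "deadline_weekday": WEEKDAYS[p["deadline"].weekday()],
        "deadline_month": p["deadline"].month,
    }

features = derive_features(project)
print(features)
```

Word counts here split on whitespace only; a real pipeline might also strip punctuation or handle non-English titles differently.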

10 of 29

What are the important variables?

Text: Title Length, Blurb Length

Classification: Category, Country, City

Time: Launch Time, Project Duration, Prepare Duration, Deadline

Others: COVID, Goal, Staff Pick, Spotlight, Pledged

Correlation Coefficient Matrix

11 of 29

Successful vs. Failed Projects Across Categories

  • The Comics and Design categories have the highest success rates.

  • The Technology and Theater categories have much lower success rates.
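A per-category success rate like the one charted here could be computed along the following lines. This is a minimal sketch: the rows are invented for illustration and do not come from the dataset, and the state labels `"successful"` and `"failed"` are assumptions.

```python
from collections import defaultdict

# Hypothetical (category, state) rows; the values are invented for illustration.
rows = [
    ("Comics", "successful"), ("Comics", "successful"), ("Comics", "failed"),
    ("Design", "successful"), ("Design", "failed"),
    ("Technology", "failed"), ("Technology", "failed"), ("Technology", "successful"),
]

def success_rate_by(rows):
    """Map each group key to the share of its projects that succeeded."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for key, state in rows:
        totals[key] += 1
        if state == "successful":
            wins[key] += 1
    return {key: wins[key] / totals[key] for key in totals}

rates = success_rate_by(rows)
# Print categories from highest to lowest success rate.
for category, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category}: {rate:.0%}")
```

The same helper works for any grouping key, such as country or launch weekday, by changing the first element of each row.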

12 of 29

Success Rate Comparison Across the Top 10 Countries

  • Projects from English-speaking countries have much higher success rates.

  • Success and failure rates are more evenly balanced for projects in many other countries.

  • Some countries have higher failure rates than success rates.

13 of 29

Success Rate by Word Count of Project Name

  • Success rate increases as project names get longer.

  • Projects with 13-word names outperform all others.

  • Name length and success rate appear to be positively correlated.
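One way to put a number on the relationship between name length and success is a Pearson correlation between word count and a 0/1 success flag. The sketch below is illustrative only: the two lists are invented, and the deck does not say which correlation measure, if any, was used.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: words in each project name and whether it succeeded (1) or not (0).
word_counts = [2, 3, 5, 8, 10, 13]
succeeded = [0, 0, 1, 0, 1, 1]

r = pearson(word_counts, succeeded)
print(f"r = {r:.2f}")
```

A value near +1 would indicate that longer names go with success, a value near 0 would indicate no linear relationship, and neither would establish that longer names cause success.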

14 of 29

Success and Failure Percentages by Launch Day of the Week

  • Tuesday has the highest volume of projects launched.

  • On Tuesday, the success rate rises more sharply than the failure rate.

15 of 29

COVID’s Effect on Project Performance

  • Projects ending during the pandemic have a greater chance of success.

  • Projects ending during the pandemic can also attract pledges of 5 times their funding goals.
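The two measures behind this slide, a "ended during COVID" flag and pledges as a multiple of the goal, might be derived as sketched below. The cutoff date is an assumption (11 March 2020, when the WHO declared COVID-19 a pandemic); the deck does not state which date it used, and the function names are hypothetical.

```python
from datetime import date

# Assumption: the pandemic window starts on 11 March 2020 (WHO declaration).
# The deck does not state its own cutoff date.
PANDEMIC_START = date(2020, 3, 11)

def ended_during_covid(deadline):
    """True if the project's deadline falls on or after the assumed cutoff."""
    return deadline >= PANDEMIC_START

def pledge_ratio(pledged, goal):
    """Pledged amount as a multiple of the goal (1.0 means exactly funded)."""
    return pledged / goal

print(ended_during_covid(date(2020, 6, 1)))
print(ended_during_covid(date(2019, 12, 1)))
print(pledge_ratio(25_000, 5_000))
```

A ratio of 5.0 corresponds to the "5 times the goal" figure quoted above.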

16 of 29

Multiple Linear Regression


03

17 of 29

What is Multiple Linear Regression?

  • Focuses on explaining the relationship between a scalar response and multiple explanatory variables.

  • Useful for estimating the impact of one variable while accounting for a range of other variables.

  • With Kickstarter, we can analyze the impact of a variable such as Staff Pick, controlling for other variables such as duration or start date, on whether a project ultimately succeeds.
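For reference, the general form of a multiple linear regression model can be written as follows. The symbols are generic textbook notation rather than anything taken from the deck; reading $y_i$ as a 1/0 success indicator for project $i$ is an assumption about how the outcome was coded.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i
```

Here $x_{i1}, \dots, x_{ip}$ are the explanatory variables for project $i$ (for example Staff Pick or Project Duration), $\beta_0$ is the intercept, and $\varepsilon_i$ is the error term. Each $\beta_j$ is the expected change in $y$ for a one-unit increase in $x_j$ with the other variables held fixed.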

THE MODEL

18 of 29

Pre-Processing our Data

  • A linear regression model accepts only numeric inputs, which limits the variables we can use directly.

  • To incorporate variables such as launch season, deadline season, and success status, we first create numeric columns that encode them.

  • Finally, multiple linear regression cannot be fit on rows with N/A values, so we exclude them with na.action = na.exclude.
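The pre-processing steps above can be sketched as follows. The analysis itself appears to have been done in R (`na.action = na.exclude` is an argument to R's modeling functions); this Python version only illustrates the idea. The field names, the 1-4 season coding, and the Northern Hemisphere season boundaries are all assumptions, since the deck does not say how the numeric columns were built.

```python
# Assumption: seasons are coded 1-4; the deck does not specify its encoding.
SEASON_CODE = {"winter": 1, "spring": 2, "summer": 3, "fall": 4}

def season_of(month):
    """Map a month number (1-12) to a Northern Hemisphere season name."""
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "fall"

def to_numeric(record):
    """Convert one raw record to numeric columns, or None if a field is missing."""
    required = ("launch_month", "deadline_month", "state", "staff_pick")
    if any(record.get(field) is None for field in required):
        return None
    return {
        "launch_season": SEASON_CODE[season_of(record["launch_month"])],
        "deadline_season": SEASON_CODE[season_of(record["deadline_month"])],
        "success": 1 if record["state"] == "successful" else 0,
        "staff_pick": 1 if record["staff_pick"] else 0,
    }

# Hypothetical raw rows; the second has a missing deadline month.
raw = [
    {"launch_month": 1, "deadline_month": 3, "state": "successful", "staff_pick": True},
    {"launch_month": 7, "deadline_month": None, "state": "failed", "staff_pick": False},
    {"launch_month": 11, "deadline_month": 12, "state": "failed", "staff_pick": False},
]

# Drop rows with missing values before fitting.
clean = [row for row in map(to_numeric, raw) if row is not None]
print(clean)
```

Two caveats: coding seasons as 1-4 imposes an ordering on them, so one indicator column per season is a common alternative; and R's `na.exclude` also pads residuals and fitted values with NA so they line up with the original rows, which this sketch does not replicate.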

19 of 29

Launch Season

Project Duration

Goal

Deadline Season

Staff Pick

Results

Coefficient: 0.04472, Pr(>|t|): <2e-16

Coefficient: 0.02777, Pr(>|t|): <2e-16

Coefficient: -0.00389, Pr(>|t|): 0.14

Coefficient: 0.01482, Pr(>|t|): <2e-16

Coefficient: 0.09114, Pr(>|t|): <2e-16

20 of 29

Decision Tree Modeling


04

21 of 29

Predictors Considered

____________________________________________________________________________________________

  • Days to Launch
  • Deadline Season
  • Deadline Weekday
  • Project Duration
  • Launch Season
  • Launch Weekday
  • Staff Pick
  • Usd_Type
  • End During Covid
  • Values

Target

_______________________________________

  • State

Decision Tree

Significant Variables

____________________________________________________________________________________________

  • Goal in 1000s
  • Staff Pick
  • Days to Launch

Accuracy

___________________________________________________

73.3%
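To illustrate how a decision tree might pick a variable such as Staff Pick as an important split, the sketch below scores candidate splits by weighted Gini impurity and computes accuracy as the share of correct predictions. The deck does not say which splitting criterion or tool was used, so Gini is an assumption; the rows and the `goal_under_10k` feature are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 means the group is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def split_impurity(rows, feature):
    """Weighted Gini impurity after splitting rows on a boolean feature."""
    left = [r["success"] for r in rows if r[feature]]
    right = [r["success"] for r in rows if not r[feature]]
    n = len(rows)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def accuracy(predicted, actual):
    """Share of predictions that match the actual labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical projects; the values are invented for illustration.
rows = [
    {"staff_pick": True, "goal_under_10k": True, "success": 1},
    {"staff_pick": True, "goal_under_10k": False, "success": 1},
    {"staff_pick": False, "goal_under_10k": True, "success": 1},
    {"staff_pick": False, "goal_under_10k": True, "success": 0},
    {"staff_pick": False, "goal_under_10k": False, "success": 0},
    {"staff_pick": False, "goal_under_10k": False, "success": 0},
]

candidates = ["staff_pick", "goal_under_10k"]
# The tree splits first on the feature that leaves the purest groups.
best = min(candidates, key=lambda f: split_impurity(rows, f))
print(best, {f: round(split_impurity(rows, f), 3) for f in candidates})
print(accuracy([1, 1, 0, 0], [1, 0, 0, 0]))
```

A full tree repeats this selection recursively within each resulting group; the accuracy reported on the slide would then be measured on projects the tree was not trained on, assuming a held-out test set was used.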

22 of 29

Logistic Regression


05

23 of 29

Logistic Regression 1

Question

When is the best time to launch a project, and how long should the campaign run?

Outcome Variable

______________________________________________________________________________________________________

Successful Project

Failed Project

Input Variables

______________________________________________________________________________________________________

Launch Season

Project Duration

Launch Day of Week

24 of 29

Results & Analysis

Significant Variables

______________________________________________________________________________________________________

Project Duration: 60-70 days: -1.5

Launch Season: Winter: 0.14

Launch Season: Summer: -0.32

Launch Day: Tuesday: 0.16

Accuracy: 67%

Winter

Tuesday

Shorter
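Logistic regression coefficients are on the log-odds scale, so exponentiating them gives odds ratios that are easier to read. The sketch below does this for the four values on this slide, under two assumptions: that the values are listed in the same order as the four variables, and that they are log-odds coefficients relative to some baseline level that the deck does not name.

```python
from math import exp

# Values from the slide; treating them as log-odds coefficients, paired with
# the variables in listed order, is an assumption.
coefficients = {
    "Project Duration: 60-70 days": -1.5,
    "Launch Season: Winter": 0.14,
    "Launch Season: Summer": -0.32,
    "Launch Day: Tuesday": 0.16,
}

# exp(beta) is the multiplicative change in the odds of success.
odds_ratios = {name: exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    direction = "raises" if ratio > 1 else "lowers"
    print(f"{name}: {direction} the odds of success (x{ratio:.2f})")
```

Under these assumptions, a 60-70 day campaign multiplies the odds of success by roughly 0.22 relative to the baseline duration, which matches the slide's takeaway that shorter campaigns do better.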

25 of 29

Logistic Regression 2

Question

How should you categorize and describe your product?

Outcome Variable

______________________________________________________________________________________________________

Successful Project

Failed Project

Input Variables

______________________________________________________________________________________________________

Project Category

Project Name Word Count

26 of 29

Results & Analysis

Significant Variables

______________________________________________________________________________________________________

12/15 Categories: -1 to 1.6

12/16 Word Count Values: 0.2 to 1.85

Accuracy: 60%

Comics

Dance

Design

Music

More words in the title (up to 16): more success

27 of 29

Conclusion

06


28 of 29

Conclusion

Kickstarter should use its data to give project creators suggestions and guidance that help them avoid unnecessary mistakes.

A project’s success largely depends on the idea and delivery.

Our goal is to help creators make better decisions when launching a project on Kickstarter.

29 of 29

Thank You!

Questions?