1 of 7

Brazil Ecommerce Sales Prediction using Prophet

Mazi Prima Reza

2 of 7

2017-2018 sales show a sudden spike on Black Friday.

This is the daily sales time series used for analysis and forecasting. It shows a sudden spike, about three times the level of a normal day, on 24 Nov 2017, which was Black Friday.

24 Nov 2017

3 of 7

Prophet reaches an RMSE of 20.42 on the training set

[Figure: sales series with the train and test periods marked]
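The train and test scores quoted on these slides can be computed along the following lines. This is a minimal sketch, not the author's code: the helper names `rmse` and `fit_and_score` are hypothetical, the model uses Prophet's default settings, and the input frames are assumed to carry Prophet's required `ds` (date) and `y` (value) columns.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def fit_and_score(train_df, test_df):
    """Fit Prophet on train_df and return (train_rmse, test_rmse).

    Assumption: both arguments are pandas DataFrames with columns
    'ds' and 'y'. Prophet is imported here so that rmse() can be
    used without Prophet installed.
    """
    from prophet import Prophet

    model = Prophet()  # default settings, an assumption
    model.fit(train_df)

    # predict() returns a frame whose 'yhat' column holds the forecast
    train_pred = model.predict(train_df[["ds"]])["yhat"]
    test_pred = model.predict(test_df[["ds"]])["yhat"]

    return (
        rmse(list(train_df["y"]), list(train_pred)),
        rmse(list(test_df["y"]), list(test_pred)),
    )
```

A large gap between the two returned values is the overfitting signal discussed on the next slide.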

4 of 7

Yet Prophet overfits the training set and fails to predict the trend in the test set

During May - July 2018 there was a continuous drop in sales that did not occur in the previous year. A simple Google search did not explain why this drop happened. This unusual trend pushes the prediction error on the test set up to an RMSE of 70.42.

But is this a good model?

5 of 7

Variant G performs better than the others

Nine experiments were run to find the best model for predicting future sales. Variant G turns out to be the best of them, with the lowest test RMSE (74.06).

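Variant Name  | MinMax Scaler | Correcting Outliers | Hyperparameter Tuning | Holiday Context | RMSE (Train) | RMSE (Test)
--------------+---------------+---------------------+-----------------------+-----------------+--------------+------------
Control Group | FALSE         | FALSE               | FALSE                 | FALSE           | 58.99        | 75.14
A             | TRUE          | FALSE               | FALSE                 | FALSE           | 58.99        | 75.41
B             | TRUE          | TRUE                | FALSE                 | FALSE           | 37.44        | 82.92
C             | FALSE         | FALSE               | TRUE                  | FALSE           | 59.98        | 76.20
D             | TRUE          | FALSE               | TRUE                  | FALSE           | 59.89        | 77.21
E             | TRUE          | TRUE                | TRUE                  | FALSE           | 36.56        | 81.88
F             | FALSE         | FALSE               | TRUE                  | TRUE            | 20.48        | 77.61
G             | TRUE          | FALSE               | TRUE                  | TRUE            | 20.43        | 74.06
H             | TRUE          | TRUE                | TRUE                  | TRUE            | 20.02        | 92.01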
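The comparison can be made explicit with a few lines of arithmetic on the reported scores. The sketch below only re-uses the RMSE values from the table; the variable names are illustrative.

```python
# (train RMSE, test RMSE) per variant, copied from the table
results = {
    "Control Group": (58.99, 75.14),
    "A": (58.99, 75.41),
    "B": (37.44, 82.92),
    "C": (59.98, 76.20),
    "D": (59.89, 77.21),
    "E": (36.56, 81.88),
    "F": (20.48, 77.61),
    "G": (20.43, 74.06),
    "H": (20.02, 92.01),
}

# Best variant = lowest error on unseen (test) data
best = min(results, key=lambda name: results[name][1])

# Gap between test and train error, a rough indicator of overfitting
gaps = {name: round(test - train, 2) for name, (train, test) in results.items()}
widest_gap = max(gaps, key=gaps.get)

print("lowest test RMSE:", best)
print("largest train/test gap:", widest_gap, gaps[widest_gap])
```

Ranking by test RMSE rather than train RMSE matters here: variant H has the lowest training error but the highest test error of all nine runs.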

6 of 7

Future Improvement

Gathering information about events would help Prophet understand the trends in the data better.
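One way to feed event information to Prophet is through its `holidays` argument. The sketch below is an illustration under stated assumptions, not the setup used in the experiments: the helper names `black_friday` and `build_model_with_events` are hypothetical, and the one-day `upper_window` is an arbitrary choice.

```python
from datetime import date


def black_friday(year):
    """Return Black Friday: the day after the fourth Thursday of November."""
    first_of_november = date(year, 11, 1)
    # weekday(): Monday = 0 ... Thursday = 3
    days_to_first_thursday = (3 - first_of_november.weekday()) % 7
    fourth_thursday = 1 + days_to_first_thursday + 21
    return date(year, 11, fourth_thursday + 1)


def build_model_with_events(years):
    """Build an unfitted Prophet model that knows about Black Friday.

    pandas and Prophet are imported here so that black_friday() can be
    used on its own.
    """
    import pandas as pd
    from prophet import Prophet

    holidays = pd.DataFrame({
        "holiday": "black_friday",
        "ds": pd.to_datetime([black_friday(y) for y in years]),
        "lower_window": 0,
        "upper_window": 1,  # assumption: the effect spills into the next day
    })

    model = Prophet(holidays=holidays)
    # Built-in Brazilian public holidays; must be added before fit()
    model.add_country_holidays(country_name="BR")
    return model
```

Other known events, such as promotions or logistics disruptions, could be appended to the same `holidays` frame as extra rows with their own `holiday` label.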

7 of 7

Winsorizing is used to correct the outliers

Since Black Friday sales are about three times the normal level, that day is an obvious outlier that has to be handled. Winsorizing does not drop the outlier; it caps it at the maximum value observed on normal days.

[Figure: sales series before and after winsorizing]
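The capping step can be sketched in plain Python. This is a minimal illustration that assumes "normal days" means everything at or below a chosen upper quantile; the function name `winsorize_upper`, the quantile value, and the sample numbers are made up for the example.

```python
import math


def winsorize_upper(values, upper_quantile=0.99):
    """Cap every value above the given quantile at that quantile's value.

    Only the upper tail is treated, since the outlier here is a sales spike.
    """
    ordered = sorted(values)
    # Position of the largest value that is still considered normal
    cap_index = max(0, math.ceil(upper_quantile * len(ordered)) - 1)
    cap = ordered[cap_index]
    return [min(v, cap) for v in values]


# Illustrative daily sales with one spike (made-up numbers)
daily_sales = [100, 110, 95, 105, 300, 98, 102, 107, 99, 101]
capped = winsorize_upper(daily_sales, upper_quantile=0.9)
print(capped)
```

With ten values and a 0.9 quantile, the spike of 300 is replaced by 110, the largest of the remaining days, and every other value is left untouched.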