1 of 18

Hotel Booking Cancellations

By: Huong Lenoch

2 of 18

Content

Hotel Booking System

Problem Statement

Hypothesis

Data Set

Tools and Methods

Data Visualization

Modeling and Evaluation

Conclusion

3 of 18

Problem Statement

Use the available hotel booking data set to predict cancellations, in order to produce better forecasts and reduce uncertainty in management decisions.

Hypothesis

A guest with a longer lead time* and previous cancellations is more likely to cancel their booking.

Hotel Booking System Diagram

Note: * Lead time is the number of days between the date the booking was placed with the hotel and the arrival date.
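As a small illustration of the lead-time definition in the note, the sketch below computes it from two dates. The function name and the example dates are hypothetical and are not taken from the data set.

```python
from datetime import date


def lead_time_days(booked_on: date, arrival: date) -> int:
    """Number of days between placing the booking and the arrival date."""
    if arrival < booked_on:
        raise ValueError("arrival date precedes booking date")
    return (arrival - booked_on).days


# Hypothetical booking: placed on 1 March 2016 for arrival on 15 July 2016.
example_lead_time = lead_time_days(date(2016, 3, 1), date(2016, 7, 15))
print(example_lead_time)
```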

4 of 18

Data

  • Data set obtained from Kaggle.com, posted by Jesse Mostipak
  • Contains 65,535 bookings and 32 variables, covering the years 2015 to 2017

5 of 18

Machine learning was used to build a model that predicts booking cancellations. The target, “is canceled”, is binary (0: no; 1: yes), so two-class classification algorithms were chosen: Tree, Neural Network, and Logistic Regression.
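The models in this deck were built in Orange without code. For readers who want to see what a two-class classifier does, here is a minimal sketch of logistic regression trained by gradient descent in plain Python. The toy rows, the two features chosen (lead time in hundreds of days, previous cancellations), the learning rate, and the epoch count are all assumptions made for illustration; this is not the author's workflow or data.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, features):
    """Estimated probability that a booking is canceled (class 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(rows, labels):
            error = predict_proba(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy rows (illustrative only): (lead time in hundreds of days, previous cancellations).
toy_features = [(0.05, 0), (0.10, 0), (0.30, 0), (0.60, 0),
                (2.00, 1), (2.50, 0), (3.00, 1), (1.80, 1)]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = canceled, 0 = not canceled

weights, bias = fit_logistic(toy_features, toy_labels)
short_lead = predict_proba(weights, bias, (0.10, 0))
long_lead = predict_proba(weights, bias, (2.50, 1))
print(short_lead < 0.5, long_lead > 0.5)
```

A predicted probability above 0.5 is read as "will cancel" and below 0.5 as "will not cancel", which matches the binary target described above.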

Orange: Friendly user interface for charts and graphs; no coding required.

Excel: Some “ninja tricks” will be used to make the message clearer.

Methods and Tools

6 of 18

Data Visualizations and Analytics

Figure 1: The Resort Hotel has more bookings but fewer cancellations.

7 of 18

Figure 2: The hotels have more cancellations during the high season from July to October.

8 of 18

Figure 3: Canceled bookings had a higher average daily rate ($102.62) than bookings that were not canceled ($90.05).

9 of 18

Effect of deposit type on cancellations

Figure 4: Over 99% of customers with the non-refund deposit type canceled their bookings.
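A per-group cancellation rate like the one behind Figure 4 can be computed as sketched below. The column names `deposit_type` and `is_canceled` and the toy rows are assumptions for illustration; the rows are not drawn from the data set and do not reproduce the 99% figure.

```python
from collections import defaultdict


def cancellation_rate_by(rows, key):
    """Share of canceled bookings within each value of the given column."""
    totals = defaultdict(int)
    canceled = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        canceled[row[key]] += row["is_canceled"]
    return {group: canceled[group] / totals[group] for group in totals}


# Toy rows (illustrative only).
toy_rows = [
    {"deposit_type": "No Deposit", "is_canceled": 0},
    {"deposit_type": "No Deposit", "is_canceled": 1},
    {"deposit_type": "No Deposit", "is_canceled": 0},
    {"deposit_type": "No Deposit", "is_canceled": 0},
    {"deposit_type": "Non Refund", "is_canceled": 1},
    {"deposit_type": "Non Refund", "is_canceled": 1},
]

rates = cancellation_rate_by(toy_rows, "deposit_type")
print(rates)
```

The same helper works for any categorical column, for example hotel type or arrival month.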

10 of 18

Figure 5: Effect of booking history on cancellations

  • Most cancellations were made by guests who already had one previous cancellation and fewer than 5 previous bookings that were not canceled.
  • Customers with a high number of previous bookings that were not canceled rarely canceled.
  • In contrast, all customers with a high number of previous cancellations (more than 13) canceled their bookings.

11 of 18

Figure 6: Bookings made a few days before the arrival date are rarely canceled; however, bookings made more than 200 days before the arrival date are canceled very often.

12 of 18

Figure 7: Effect of lead time on cancellations

The trend line shows a positive correlation between lead time and cancellation probability. The longer the lead time, the higher the chance that the booking gets canceled.
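A trend line of this kind can be sketched as an ordinary least-squares fit of the 0/1 cancellation flag on lead time. Whether Figure 7 was fitted this way is an assumption; the toy values below are illustrative and not taken from the data set.

```python
def trend_line(xs, ys):
    """Least-squares slope and intercept of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy values (illustrative only): lead time in days and the canceled flag.
lead_times = [2, 10, 30, 90, 150, 220, 300, 365]
canceled = [0, 0, 0, 1, 0, 1, 1, 1]

slope, intercept = trend_line(lead_times, canceled)
print(slope > 0)
```

A positive slope means the estimated cancellation probability rises as lead time grows, which is the pattern the figure describes.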

13 of 18

Figure 5

Figure 6

Figure 7

Hypothesis

A guest with a longer lead time and previous cancellations has a higher probability of canceling their booking.

14 of 18

  • Deposit type
  • Previous cancellations
  • Required car parking spaces
  • Previous bookings not canceled
  • Total of special requests
  • Lead time

15 of 18

All models reached accuracy values above 80%

-> Booking cancellations can be predicted by these models.

 

Metric       Tree      Logistic Regression   Neural Network
Accuracy     0.8338    0.8099                0.8376
Precision    0.7931    0.8340                0.8110
Recall       0.7458    0.6078                0.7321
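For reference, accuracy, precision, and recall are all derived from the counts of true and false positives and negatives. The sketch below shows the standard definitions, treating "canceled" (1) as the positive class; the toy labels are illustrative and unrelated to the reported scores.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall with class 1 (canceled) as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy labels (illustrative only): 4 canceled and 6 not canceled bookings.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall)
```

Recall is the share of actual cancellations that the model catches, so a model with high precision but lower recall flags cancellations reliably while missing a larger share of them.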

16 of 18

Conclusions

City Hotel

From July to October

High daily rate

Non-refund deposit type

Long lead-time

Previous cancellations

17 of 18

Lessons learned

  • Booking cancellations can be predicted.
  • Data visualization and analytics helped reveal the predictive relevance of each feature.
  • These prediction models enable hotels to have more precise demand forecasts.

18 of 18

References