Advanced AI
Final Report
First-year M.S. student, Computer Science
R09922188 曾泓硯
Outline
1. Project Proposal
Team Members
ID / Name: R09922188 曾泓硯
Education: NTHU EE -> NTU CS
Research Topic: Deep Learning for HDR Tone Mapping
Project Experience related to AI:
Adaptive Learning in Education, Facial Super-Resolution,
Lensless Device & Blind Deblurring, Style Transfer on Android Devices
Motivation/Background
What is the state of the art?
2. Introduction
Look into the data
Column Name | Description |
dt | Consumption month |
chid | Customer ID |
shop_tag | Category type |
txn_cnt | Number of consumptions |
txn_amt | Total consumption amount |
Location_cnt (domestic/overseas, offline/online) | Number of consumptions at each location type |
Location_amt (domestic/overseas, offline/online) | Share of total amount at each location type |
Card_txn (1~14, other) | Number of consumptions with each card |
Card_amt (1~14, other) | Share of total amount with each card |
Other personal data | Marital status, education, nationality, … |
Evaluation
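The results in the later sections are all reported as NDCG. As a concrete reference, here is a minimal sketch of NDCG@k in plain Python; the cutoff k=3 and using consumption amounts as relevance gains are assumptions about the competition setup, not stated on this slide:

```python
import math

def ndcg_at_k(predicted, actual_gains, k=3):
    """NDCG@k for one customer.

    predicted:    ranked list of item ids (best first).
    actual_gains: dict mapping item id -> true relevance
                  (e.g. actual consumption amount).
    """
    # Discounted cumulative gain of the predicted ranking.
    dcg = sum(actual_gains.get(item, 0.0) / math.log2(i + 2)
              for i, item in enumerate(predicted[:k]))
    # Ideal DCG: the k largest true gains, perfectly ordered.
    ideal = sorted(actual_gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```

A perfect ranking scores 1.0; any misordering of the top items lowers the score.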
Problem Definition
3. Classification
Data Preprocessing
Long format (one row per customer × shop tag):

chid | Shop_tag | total | Other |
123456 | 0 | 20 | … |
123456 | 1 | 60 | … |
123456 | 2 | 50 | … |
123456 | 4 | 5 | … |

Wide format, one table per month (dt_1, …, dt_5): one row per customer, one column per shop tag, with tags that saw no consumption filled with 0 (tag_3 here):

chid | tag_0 | tag_1 | tag_2 | tag_3 | tag_4 | other |
123456 | 20 | 60 | 50 | 0 | 5 | … |
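The long-to-wide transform above can be sketched with a pandas pivot. The frame below uses hypothetical values matching the example row; the column names `shop_tag` and `total` are illustrative, not the dataset's exact schema:

```python
import pandas as pd

# Long-format consumption records for one customer (hypothetical values).
long_df = pd.DataFrame({
    "chid":     [123456, 123456, 123456, 123456],
    "shop_tag": [0, 1, 2, 4],
    "total":    [20, 60, 50, 5],
})

# Pivot to one row per customer and one column per shop tag.
# reindex guarantees a column for every tag, so tags with no
# consumption (tag_3 here) become 0 instead of missing columns.
wide = (long_df.pivot_table(index="chid", columns="shop_tag",
                            values="total", fill_value=0)
        .reindex(columns=range(5), fill_value=0)
        .add_prefix("tag_")
        .reset_index())
```

Repeating this per month yields the five per-month tables shown above.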
Experiment setting
Results
Model (settings) | NDCG |
#1 Random Forest, max_depth=10 | 0.4834 |
#1 Random Forest, max_depth=100 | 0.6538 |
#2 Random Forest ensemble with different ranks as ground truth, max_depth=100 | 0.6539 |
#3 Random Forest ensemble over per-month models, max_depth=100 | 0.6656 |
#3 XGBoost classifier ensemble over per-month models, n_estimators=100, learning_rate=0.3 | 0.6744 |
#N: setting N from the previous slide
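A minimal scikit-learn sketch of the per-month ensemble idea in setting #3, on synthetic stand-in data (the real features, tag labels, month splits, and the XGBoost variant are not reproduced here): train one classifier per month slice, then average the predicted probabilities.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))              # stand-in customer features
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # stand-in label: "consumes tag t"

# One forest per "month" slice of the data, as in setting #3.
month_slices = np.array_split(np.arange(len(X)), 3)
models = [
    RandomForestClassifier(max_depth=10, n_estimators=50, random_state=0)
    .fit(X[idx], y[idx])
    for idx in month_slices
]

# Ensemble by averaging the per-month predicted probabilities.
avg_proba = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
pred = (avg_proba > 0.5).astype(int)
```

The averaged probabilities can then be ranked per customer to produce the top-3 tag prediction.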
Feature Importance
4. Regression
Data Preprocessing
Long format (one row per customer × month):

chid | dt | total | Other |
123456 | 1 | 20 | … |
123456 | 2 | 60 | … |
123456 | 5 | 50 | … |
123456 | 4 | 5 | … |

Wide format, one table per shop tag (tag_0, tag_1, …): one row per customer, one column per month, with months that saw no consumption filled with 0 (dt_3 here):

chid | dt_1 | dt_2 | dt_3 | dt_4 | dt_5 | other |
123456 | 20 | 60 | 0 | 5 | 50 | … |
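As in the classification preprocessing, the monthly records can be pivoted so each month becomes one lag feature for the regressor. A pandas sketch with hypothetical values matching the example row:

```python
import pandas as pd

# Monthly totals for one customer on one shop tag (hypothetical values).
monthly = pd.DataFrame({
    "chid":  [123456, 123456, 123456, 123456],
    "dt":    [1, 2, 5, 4],
    "total": [20, 60, 50, 5],
})

# One column per month dt_1..dt_5; months with no consumption
# (dt_3 here) are filled with 0 via reindex.
wide = (monthly.pivot_table(index="chid", columns="dt",
                            values="total", fill_value=0)
        .reindex(columns=range(1, 6), fill_value=0)
        .add_prefix("dt_")
        .reset_index())
```

Repeating this per shop tag yields one training table per tag.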
Experiment setting
Results
Model (settings) | NDCG |
#1 XGBoost Regressor, n_estimators=100, max_depth=15, eta=0.1 | 0.651 |
#2 XGBoost Regressor, n_estimators=100, max_depth=15, eta=0.1 | 0.702 |
#2 XGBoost Regressor with selected features & only the related shop tag, n_estimators=100, max_depth=15, eta=0.1 | 0.705 |
#2 CatBoost Regressor, depth=10, iterations=1000, learning_rate=0.1 | 0.708 |
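A sketch of the regression setup on synthetic stand-in data. Scikit-learn's GradientBoostingRegressor is used here purely as a stand-in for the XGBoost/CatBoost regressors in the table (the knobs correspond in spirit: n_estimators ≈ iterations, learning_rate ≈ eta, max_depth ≈ depth); the features and target are illustrative, not the real data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))   # stand-in for the dt_1..dt_5 lag features
# Stand-in target: next-month amount, mostly driven by the latest months.
y = 2.0 * X[:, -1] + 0.5 * X[:, 0] + rng.normal(scale=0.1, size=300)

# Gradient-boosted trees regressor, one model per shop tag in the report.
model = GradientBoostingRegressor(n_estimators=100, max_depth=3,
                                  learning_rate=0.1,
                                  random_state=0).fit(X, y)
```

Per customer, the predicted amounts across tags are then sorted to get the top-3 ranking.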
Feature Importance (XGBoost)
5. Ensemble
How to Do It?
Results
Model (settings) | NDCG |
#1 CatBoost Regressor (predicted tag data / non-predicted tag data) | 0.706 |
#2 Classification (0.67) + Regression (0.70) + Ranking (0.51) | 0.6927 |
#2 Classification (0.67) + Regression (0.70) | 0.7027 |
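One simple way to blend the classification and regression outputs into a final top-3 list is rank averaging. The sketch below uses hypothetical per-tag scores for a single customer; the equal 0.5/0.5 weights and the rank-average choice are illustrative, not the report's exact recipe:

```python
import numpy as np

# Hypothetical per-tag scores for one customer from the two models.
tags = np.array(["tag_0", "tag_1", "tag_2", "tag_3", "tag_4"])
clf_proba = np.array([0.10, 0.80, 0.60, 0.05, 0.30])   # P(consumes tag)
reg_amount = np.array([5.0, 40.0, 90.0, 1.0, 20.0])    # predicted amount

def to_rank(scores):
    """Map raw scores to normalized ranks in [0, 1] (1 = best)."""
    order = scores.argsort().argsort()   # rank position of each entry
    return order / (len(scores) - 1)

# Average the two rank vectors, then take the top-3 tags.
blend = 0.5 * to_rank(clf_proba) + 0.5 * to_rank(reg_amount)
top3 = tags[np.argsort(blend)[::-1][:3]]
```

Rank averaging sidesteps the scale mismatch between probabilities and amounts, which is one reason a naive score average of such heterogeneous models can underperform.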
6. Failure Cases
Methods & Results
Method | NDCG |
Deep Learning | failed (accuracy only 0.11 during training) |
Learning to Rank | 0.5169 |
7. Conclusions
Classification
Regression
Ensemble & Failure Cases
Next Step?
Reference
[1] Chapelle, O., & Chang, Y. (2011). Yahoo! Learning to Rank Challenge Overview. In Proceedings of the Learning to Rank Challenge, JMLR Workshop and Conference Proceedings, 14, 1–24.
[2] Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794). New York, NY, USA: ACM. https://doi.org/10.1145/2939672.2939785
[3] Pedregosa, F., et al. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
[4] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., … Chintala, S. (2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32 (pp. 8024–8035). Curran Associates, Inc. Retrieved from http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
Q&A