KaggleX 2024 Showcase: ReguGuard AI
Author: Shijun Ju
Advisor: Himaja Vadaga
Google | Proprietary & Confidential
KaggleX Showcase
Personal Background
Project Definition – Problem Statement
Data Science Topic(s) Applied
Project Architecture and Methods
Project Details – Data Preparation
Project Details – Training and Evaluation
Project Details – Response Generation
Project Details – Results
Variables Tested
| | Gemma-2b, 3 versions of questions, QLoRA 4-bit, Rank 6 | Gemma-7b, 3 versions of questions, TPU, LoRA, Rank 6 | Gemma-7b, 4 versions of questions, TPU, LoRA, Rank 6 | Gemma-7b, 4 versions of questions, TPU, LoRA, Rank 10 |
| Normal | 20.7% | 39.6% | 70.5% | 66.6% |
| Beam Search (3) | - | 48.0% | 78.6% | 73.0% |
| Training Loss | SCC: 0.3495 | SCC: 0.0519, SCA: 0.9572 | SCC: 0.0444, SCA: 0.9623 | SCC: 0.0474, SCA: 0.958 |
| Training Data | 7,711 | 8,730 | 11,640 | 11,640 |
| Testing Data | 900*** | 2,910 | 2,910 | 2,910 |
SCC: SparseCategoricalCrossentropy; SCA: SparseCategoricalAccuracy
*** To save evaluation time, 900 of the 2,910 test examples were randomly selected.
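As a rough illustration of the two training metrics abbreviated in the table, the sketch below computes sparse categorical cross-entropy and sparse categorical accuracy in plain Python, and draws a random evaluation subset of the kind described in the footnote. It assumes the standard definitions of these metrics (integer class labels scored against softmax-normalized logits). The function names, the toy logits, and the seed are hypothetical and are not taken from the project code, which may have used library implementations instead.

```python
import math
import random


def sparse_categorical_crossentropy(logits, labels):
    """Mean negative log-probability of each integer label under softmax(logits)."""
    total = 0.0
    for row, label in zip(logits, labels):
        # Log-sum-exp with the row maximum subtracted for numerical stability.
        row_max = max(row)
        log_norm = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_norm - row[label]
    return total / len(labels)


def sparse_categorical_accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class equals the integer label."""
    correct = sum(
        1 for row, label in zip(logits, labels) if row.index(max(row)) == label
    )
    return correct / len(labels)


def sample_eval_subset(n_total, n_keep, seed=0):
    """Pick n_keep distinct example indices out of n_total (seed is an assumption)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_total), n_keep))


# Toy data (hypothetical): three examples, three classes.
toy_logits = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 0.0]]
toy_labels = [0, 1, 2]

scc = sparse_categorical_crossentropy(toy_logits, toy_labels)
sca = sparse_categorical_accuracy(toy_logits, toy_labels)
subset = sample_eval_subset(2910, 900)
print(f"SCC={scc:.4f} SCA={sca:.4f} subset size={len(subset)}")
```

A lower SCC means the model assigns more probability to the correct token or class, while SCA only counts whether the top-scoring choice is correct, so the two can move somewhat independently.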
Conclusion
Summary
Future work
Things learned
Project Links
Data
Finetuned Gemma-2b: full model
Finetuned Gemma-7b: LoRA adaptor files only
Training / Experiment notes
References: TPU Training
Thank you!
Advisor: Himaja Vadaga
Organizer: Kaggle.com