Applied Deep Learning HW2
Natural Language Generation
Deadline: 2024/10/24 23:59:59
Links
NTU COOL (To be modified)
TA Hours:
9/30 10/7 @ 德田524
10/14 @ Online https://meet.google.com/rhj-ugax-tpu
10/21: 10:00~11:00 @ 德田524
Updates
10/21 office hour changed to 10:00~11:00 @ 德田524
Task Description
Chinese News Summarization (Title Generation)
Example input (news article): 從小就很會念書的李悅寧, 在眾人殷殷期盼下,以榜首之姿進入臺大醫學院, 但始終忘不了對天文的熱情。大學四年級一場遠行後,她決心遠赴法國攻讀天文博士。 從小沒想過當老師的她,再度跌破眾人眼鏡返台任教,...... (Li Yueh-Ning, a strong student since childhood, entered NTU's College of Medicine as the top scorer amid high expectations, but never lost her passion for astronomy. After a trip in her senior year, she resolved to pursue an astronomy PhD in France. Having never planned to teach, she surprised everyone again by returning to Taiwan as a professor, ...)
Example output (title): 榜首進台大醫科卻休學 、27歲拿到法國天文博士 李悅寧跌破眾人眼鏡返台任教 (Entered NTU Medicine as top scorer but withdrew; earned a French astronomy PhD at 27: Li Yueh-Ning surprises everyone by returning to Taiwan to teach)
Data
Data (cont.)
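The dataset is typically distributed as JSON Lines (one JSON object per line; note `jsonlines` in the allowed-package list). A minimal stdlib-only reading sketch — the field names `maintext`, `title`, and `id` are assumptions for illustration, not the guaranteed schema:

```python
import json
from io import StringIO

# Stand-in for an open train.jsonl file; each line is one JSON record.
sample = StringIO(
    '{"maintext": "article body ...", "title": "headline", "id": "1"}\n'
    '{"maintext": "another article", "title": "another headline", "id": "2"}\n'
)

# Parse line by line, skipping blank lines.
records = [json.loads(line) for line in sample if line.strip()]
titles = [r["title"] for r in records]
print(titles)  # ['headline', 'another headline']
```

For real files, replace the `StringIO` with `open("train.jsonl", encoding="utf-8")`.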
Metrics
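Summarization quality is usually scored with ROUGE. As a toy illustration of what ROUGE-1 measures (unigram overlap between reference and candidate), here is a character-level F1 sketch — the official grader likely uses a word-segmented Chinese ROUGE implementation, so treat this as a simplification:

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Character-level ROUGE-1 F1: harmonic mean of unigram
    precision and recall between candidate and reference."""
    ref, cand = Counter(reference), Counter(candidate)
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("天文博士", "法國天文博士"))  # precision 4/6, recall 4/4 -> F1 0.8
```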
Objective
Bonus: Apply GPT-2 to Summarization
[Figure: GPT-2 autoregressive decoding. At time step 1 the decoder receives the tokens so far ("I") and predicts "like"; at each later step it re-consumes the growing prefix ("I like", "I like to", ...) and predicts the next token, until it emits <eos>. Final generated output: "I like to read".]
https://towardsdatascience.com/language-models-gpt-and-gpt-2-8bdb9867c50a
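The decoding loop in the figure can be sketched in a few lines. Here `next_token_scores` is a hypothetical stand-in for a real language-model head (a trained GPT-2 would produce these scores); the greedy loop itself is the actual algorithm:

```python
def next_token_scores(prefix):
    # Hypothetical bigram-style lookup table standing in for a
    # trained model's next-token distribution.
    table = {
        "<s>": {"I": 0.9, "like": 0.1},
        "I": {"like": 0.8, "read": 0.2},
        "like": {"to": 0.9, "<eos>": 0.1},
        "to": {"read": 0.7, "like": 0.3},
        "read": {"<eos>": 0.95, "to": 0.05},
    }
    return table[prefix[-1]]

def greedy_decode(max_steps=10):
    tokens = ["<s>"]
    for _ in range(max_steps):
        scores = next_token_scores(tokens)
        best = max(scores, key=scores.get)  # greedy: pick the argmax
        if best == "<eos>":
            break
        tokens.append(best)
    return tokens[1:]  # drop the start symbol

print(greedy_decode())  # ['I', 'like', 'to', 'read']
```

With the Hugging Face library, this whole loop corresponds to a single `model.generate(...)` call.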
Bonus: Apply GPT-2 to Summarization (cont.)
Report
Q1: Model (2%)
Q2: Training (2%)
Q3: Generation Strategies (6%)
Bonus: Apply GPT-2 to Summarization (2%)
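For the generation-strategies question, it helps to see the mechanics in code. A stdlib-only sketch of temperature scaling and top-k sampling — the token strings and logits below are made-up illustration values, not model output:

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Temperature < 1 sharpens the distribution; > 1 flattens it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_sample(tokens, logits, k=2, temperature=1.0, rng=random):
    # Keep only the k highest-scoring tokens, renormalize, then sample.
    ranked = sorted(zip(tokens, logits), key=lambda p: p[1], reverse=True)[:k]
    kept = [t for t, _ in ranked]
    probs = softmax([l for _, l in ranked], temperature)
    return rng.choices(kept, weights=probs, k=1)[0]

tokens = ["台大", "天文", "醫學", "法國"]
logits = [2.0, 1.5, 0.3, 0.1]
random.seed(0)
print(top_k_sample(tokens, logits, k=2))  # only "台大" or "天文" can be drawn
```

Greedy decoding is the `temperature -> 0` / `k = 1` limit of this; beam search instead keeps several candidate prefixes per step.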
Rules
What You Can Do
gdown==5.2.0, tqdm==4.66.5, pandas==2.0.3, jsonlines==4.0.0, protobuf==4.25.5
What You Can NOT Do
Logistics
Grading
Submission - Format
Submission - File Layout
Submission - Scripts
Submission - Scripts (cont.)
Submission - Reproducibility
Execution Environment
Late Submission Penalty
Guide
Text-to-Text Transformer (T5)
HW1: BERT
HW2: T5
[Diagram: T5 encoder-decoder. A bidirectional encoder maps <input> to hidden states; its self-attention takes Q, K, V all from the encoder side. The decoder is trained with teacher forcing: given <s>, y1, y2, y3 as input, it predicts y1, y2, y3, </s> as <output>. Decoder self-attention uses Q, K, V from the decoder, while cross-attention takes Q from the decoder and K, V from the encoder hidden states.]
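The teacher-forcing setup above comes down to a one-token shift between decoder input and labels. A minimal sketch on symbolic tokens (real T5 code does the same shift on token IDs, with the pad token acting as the decoder start symbol):

```python
def shift_right(target_tokens, bos="<s>", eos="</s>"):
    # Decoder input: gold sequence shifted right, start symbol prepended.
    decoder_input = [bos] + target_tokens
    # Labels: gold sequence with the end-of-sequence symbol appended,
    # so position t of the input is trained to predict position t of labels.
    labels = target_tokens + [eos]
    return decoder_input, labels

dec_in, labels = shift_right(["y1", "y2", "y3"])
print(dec_in)  # ['<s>', 'y1', 'y2', 'y3']
print(labels)  # ['y1', 'y2', 'y3', '</s>']
```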
Training
Some Reminders
How to Fix T5 FP16 Training
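The root cause of broken T5 FP16 training is dynamic range: half precision tops out at 65504, and T5's attention/feed-forward activations can exceed that and turn into inf/NaN. Common workarounds are training in BF16 or keeping the overflow-prone layers in FP32. A stdlib-only demonstration of the FP16 ceiling (using `struct`'s half-precision `"e"` format):

```python
import struct

FP16_MAX = 65504.0  # largest finite value representable in IEEE half precision

struct.pack("<e", FP16_MAX)  # fits in FP16

try:
    struct.pack("<e", 70000.0)  # a modestly large activation value
except OverflowError:
    print("70000.0 overflows FP16 -> would become inf during training")
```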
Documents
Q&A