Course motivation: Harvard Business Review, 2012: Sexiest job of the 21st century: data scientist. 2022: Sexiest job of the 21st century: Web3 data scientist.
Course goal: This course, "Blockchain & data science", trains students to become Web3 data scientists.
Key milestones to check on students: Analyzing your data with other data (cross-dataset analysis)
Key milestones to check on students: Data Privacy & Security
Key milestones to check on students: Data Governance, data as labor, Identity, DAO vs. RegTech
Note: Back to the roots: Web3 is built on Web2. Don't learn only Web3, or you will just be an "angry youth". Be solid on Web2 data science too.
Note: This NTU course is listed in both CS and Smart MHI (Medicine & Health Informatics), as Prof. Liao is appointed in both Computer Science and Smart Medicine and designs the data science syllabus for the program.
To meet the minimum data science requirement, students should finish 4 modules, each consisting of 2 weeks:
Module 1: What is Web3 (blockchain) and what is data science, taught in Weeks 1 and 2, respectively. Bootstrap the course with healthcare InsurTech blockchain data
Module 2: Level 1 in Web2 data science
Module 3: Level 2 in Web2 data science
Module 4: Level 3 in Web2 data science
After the 4 modules and the midterm, we are ready to tackle 3 applications (medical, machine doctor, money).
These 3 Ms have the most important datasets to analyze for this decade: without health, nothing else matters. People are the leading 1. So 3M = people (medical) using machines (machine doctor) to make money (money). All three need AI data science.
Module 5 is on medical AI, telemedicine, Web3 medicine, and people using machines to make money.
Module 6 concludes the course:
Module 6 is on Web3 data science.
Logistics: This course has doctors and Smart Medicine students. Some are from Japan, Southeast Asia, and the West. To accommodate them, we hold class in the medical school on Saturdays. If you are a capable CS student, you can leave right after each class (8:10 to 11 am).
Details are at: csie.ntu.edu.tw/~data
Video | Date | Week | Theme | Content | Assignment | Assignment_Info | Due date
Module 1: What is Web3 (blockchain) and data science. Bootstrap the course with Healthcare InsurTech blockchain data
Date: 9/17 | Week: 2
Course: Web3 & data science
Class logistics
Analyzing what data. Analyzing blockchained data.
This week: What is blockchain. Next week: What is data.
Web3 & blockchain | Web3 and Blockchain. Case study: Medical InsurTech
Case study: Medical InsurTech
Date: 9/24 | Week: 3
Web3 & blockchain: Nexus Mutual & privacy-preserving
Web3 and Blockchain. Case study: Medical InsurTech
Be a Web3 data scientist: Sexiest job in 21st century
Web3 & data science: Data Science pro-bono
1. Data manipulation & blindspots (Dark data)
2. From Fermi estimation rule to data science
Module 2: Level 1 in Web2 data science | Assignment: HW#1
Video: 20221001_lecture | Date: 10/1 | Week: 4
Game theory and Adaptive Markets
Science of Unknown unknowns
AI & Book of Why | Bayesian belief network & causality: AI & beyond.
Module 2-4: 2 textbooks each week: 1. Berkeley's Computational & Inferential Thinking. 2. Stanford's Elements of Statistical Learning. Followed by NTU's problem-driven approach.
Read Berkeley's textbook:
Opening: Data engineering 1. How to be a data engineer: Case study with "T" below.
Berkeley TV Set approach | T: Table Operations: Data join and pivot
V: Visualization
S: Simulation
e: Estimation
t: CART by Berkeley & Stanford
Read Stanford's textbook:
Statistics: Foundation
Date: 10/8 | Week: 5 | Holiday | Holiday (Prof. Liao is also out of town for the FinTeG meetup.)
Optional: FinTeG. You can skip it: | Assignment: HW#2
Prof. Liao's talk at FinTeG (audio only; no video file was available, but the audio will do.)
Video: 20221015_lecture | Date: 10/15 | Week: 6
Read Berkeley's textbook:
Opening: Data engineering 2.
Data visualization trilogy: Table ops, Visualization, What makes good visualization
Cloud computing: Colab | Cloud computing: Google Colab as a service
Berkeley TV Set approach | T: Table Operations
Pandas dataframe
Read Stanford's textbook:
Statistics: Foundation | Statistics vs. data science
Statistics vs. probability
3M
Law of large numbers & central limit theorem (CLT)
PCA, ANOVA, SEM, Regression:
NTU's problem-driven approach
Problem-driven: Eco data:
PCA
Google Colab PCA
ANOVA
Google Colab ANOVA
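The law of large numbers and the central limit theorem listed for Week 6 can be illustrated with a small simulation. This is only a sketch using fair die rolls and the Python standard library; the helper name sample_means, the seed, and the sample sizes are illustrative assumptions, not course material.

```python
import random
import statistics


def sample_means(n, trials, rng):
    """Return `trials` sample means, each taken over `n` fair die rolls."""
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n))
        for _ in range(trials)
    ]


rng = random.Random(0)  # fixed seed (an arbitrary choice) so runs repeat

# Law of large numbers: the mean of many rolls approaches the true mean 3.5.
big_mean = statistics.fmean(rng.randint(1, 6) for _ in range(100_000))

# CLT: sample means spread out with standard deviation close to sigma / sqrt(n).
sigma = statistics.pstdev(range(1, 7))  # population std. dev. of one roll
means_n25 = sample_means(25, 2000, rng)
observed_se = statistics.stdev(means_n25)
predicted_se = sigma / 25 ** 0.5

print(f"mean of 100,000 rolls: {big_mean:.3f} (true mean 3.5)")
print(f"std. dev. of sample means, n=25: {observed_se:.3f}")
print(f"sigma / sqrt(25):                {predicted_se:.3f}")
```

Plotting a histogram of means_n25 in Colab should show the familiar bell shape even though a single die roll is uniformly distributed.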
Module 3: Level 2 in Web2 data science
Video: 20221022_lecture_part1 | Date: 10/22 | Week: 7 | Data science in the cloud: TV set approach
Read Berkeley's textbook:
Berkeley TV Set approach | V: Visualization
Read Stanford's textbook:
SEM
Video: 20221022_lecture_part2 | SEM
NTU's problem-driven approach
SEM
Google Colab SEM
Video: 20221029_lecture | Date: 10/29 | Week: 8
Read Berkeley's textbook:
Berkeley TV Set approach | S: Simulation, T: Ends with more on CART next week
Midterm-Recap
Candidate confusion
Google Trends
Difference-in-means
Read Stanford's textbook:
Stanford Statistics
NTU's problem-driven approach
COVID-19 Regression
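The "Simulation" and "Difference-in-means" items for Week 8 fit together naturally: one common way to judge a difference in means is a permutation test, which shuffles the group labels many times and counts how often a difference at least as large shows up by chance. The sketch below assumes that approach; the function names, the seed, and the two small groups of numbers are made up for illustration and are not course data.

```python
import random
import statistics


def diff_in_means(a, b):
    """Mean of group a minus mean of group b."""
    return statistics.fmean(a) - statistics.fmean(b)


def permutation_p_value(a, b, trials=5000, seed=0):
    """Share of label shuffles whose |difference| is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(diff_in_means(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # reassign group labels at random
        shuffled_a, shuffled_b = pooled[:len(a)], pooled[len(a):]
        if abs(diff_in_means(shuffled_a, shuffled_b)) >= observed:
            hits += 1
    return hits / trials


# Hypothetical measurements for two groups (illustrative values only).
group_a = [12.1, 11.8, 12.6, 13.0, 12.4, 12.9, 13.3, 12.2]
group_b = [10.9, 11.2, 10.5, 11.6, 11.0, 10.7, 11.4, 11.1]

observed_diff = diff_in_means(group_a, group_b)
p_value = permutation_p_value(group_a, group_b)
print(f"observed difference in means: {observed_diff:.4f}")
print(f"permutation p-value: {p_value:.4f}")
```

A small p-value says the observed gap would rarely arise from random labeling alone; it does not by itself say why the groups differ.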
Module 4: Level 3 in Web2 data science
Video: 20221105_lecture | Date: 11/5 | Week: 9
Read Berkeley's textbook:
Berkeley TV Set approach | e: Estimation
Linear Regression Demo
https://github.com/yu-to-chen/data-science/blob/master/data8/functions/Functions_and_Tables.ipynb
Read Stanford's textbook:
Berkeley's Difference-in-means and Logistic Regression
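For the Week 9 estimation and linear regression topics, the least-squares line can be computed by hand in a few lines. This is a minimal sketch, not the linked notebook's code: the function name fit_line and the five data points are assumptions chosen for illustration.

```python
import statistics


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical points scattered around y = 2x + 1 (illustrative values only).
xs = [1, 2, 3, 4, 5]
ys = [2.9, 5.2, 6.8, 9.1, 11.0]

slope, intercept = fit_line(xs, ys)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The same formula is what a regression routine in a dataframe or statistics library computes for one predictor, so the hand version is useful mainly as a check on understanding.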
Video: 20221112_lecture | Date: 11/12 | Week: 10 | Berkeley TV Set approach | T: CART: Tree-based approach
+ Neural network approach
Case study: Smart Explorer. Hadoop too. Integrating human feedback.
NTU's problem-driven approach
Web3 data science
smart_contract_lecture
Module 5: Medical AI / telemedicine / Web3 medicine.
3M: People use machines to make money
Video: 20221119_lecture | Date: 11/19 | Week: 11 | Web3 Data Science | DataFi: Web3 Data Science & Security
Case study: Smart Explorer. Hadoop too. Integrating human feedback.
Smart Medicine | Care group
Video: 20221126_lecture | Date: 11/26 | Week: 12 | Smart contract | Assignment: HW#3
DataFi | DataFi: Web3 Data Science & Security
Taiwan election day!
Video: 20221203_lecture | Date: 12/3 | Week: 13 | Blockchain teaching | DataFi