Data Science Topics

SV: Statistics & Visualization

SV I: Statistics Basics

SV II: Visualization Basics

SV III: Lying with Statistics

DM: Data Mining / Machine Learning

DM I: Decision Tree 1 (C4.5)

DM II: Clustering (K-MEANS)

DM III: Classification 1 (SVM)

DM IV: Frequent Item Sets (Apriori)

DM V: Maximum Likelihood (EM)

DM VI: Graph Mining (PageRank)

DM VII: AdaBoost

DM VIII: Classification 2 (kNN)

DM IX: Classification 3 (Naive Bayes)

DM X: Optimization: Gradient Descent

DM XI: Evaluation

SML: Systems for Machine Learning

SML I: Hazy

SML II: MAD Skills & MLbase

SML III: Graph Processing

DI: Data Integration & CrowdSourcing

DI I: Intro to Data Integration

DI II: Data Wrangling

DI III: CrowdSourcing Overview & Quality Control

DI IV: Entity Resolution

DI V: Declarative Crowd-Sourcing

AF: Analytic Frameworks, Storage & Databases

AF I: Map/Reduce - Basics

AF II: Map/Reduce Extensions

AF III: R & Julia

AF IV: Languages for Hadoop

AF V: Spark

AF VI: Scope & Reef

AF VII: NoSQL

AF VIII: Other NoSQL Systems

AF IX: Column Databases

SV: Statistics & Visualization

SV I: Statistics Basics

SV II: Visualization Basics

SV III: Lying with Statistics

DM: Data Mining / Machine Learning

Reading for whole class:

Wu, Xindong and Kumar, Vipin and Ross Quinlan, J. and Ghosh, Joydeep and Yang, Qiang and Motoda, Hiroshi and McLachlan, Geoffrey J. and Ng, Angus and Liu, Bing and Yu, Philip S. and Zhou, Zhi-Hua and Steinbach, Michael and Hand, David J. and Steinberg, Dan: Top 10 algorithms in data mining, Knowledge and Information Systems, Volume 14, Issue 1, 2008

DM I: Decision Tree 1 (C4.5)

DM II: Clustering (K-MEANS)

DM III: Classification 1 (SVM)

DM IV: Frequent Item Sets (Apriori)

DM V: Maximum Likelihood (EM)

DM VI: Graph Mining (PageRank)

DM VII: AdaBoost

DM VIII: Classification 2 (kNN)

DM IX: Classification 3 (Naive Bayes)

DM X: Optimization: Gradient Descent

DM XI: Evaluation

SML: Systems for Machine Learning

SML I: Hazy

SML II: MAD Skills & MLbase

SML III: Graph Processing

DI: Data Integration & CrowdSourcing

DI I: Intro to Data Integration

DI II: Data Wrangling

DI III: CrowdSourcing Overview & Quality Control

DI IV: Entity Resolution

DI V: Declarative Crowd-Sourcing

AF: Analytic Frameworks, Storage & Databases

AF I: Map/Reduce - Basics

AF II: Map/Reduce Extensions

AF III: R & Julia

AF IV: Languages for Hadoop

AF V: Spark

AF VI: Scope & Reef

AF VII: NoSQL

AF VIII: Other NoSQL Systems

AF IX: Column Databases