Make Learning Data Science Fun
A reasonable timeline for learning Data Science
Month 1-2
Familiarize yourself with programming in R, Python, and SQL (nothing deep, just the basics)
Read articles about the job market
Take a course on Exploratory Data Analysis and Data Validation (Excel)
Learn descriptive statistics: measures of dispersion and measures of central tendency
Play around with some visualizations to discover fun insights (Tableau, Excel, Power BI)
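To make the descriptive statistics item above concrete, here is a minimal sketch using only Python's built-in `statistics` module, so it runs without installing anything. The `page_views` numbers are made up for illustration; they are not from any real dataset.

```python
import statistics

# Hypothetical sample: daily page views for one week (made-up numbers).
# The 980 is a deliberate outlier.
page_views = [120, 135, 150, 135, 160, 980, 140]

# Measures of central tendency
mean_views = statistics.mean(page_views)      # arithmetic average
median_views = statistics.median(page_views)  # middle value when sorted
mode_views = statistics.mode(page_views)      # most frequent value

# Measures of dispersion
value_range = max(page_views) - min(page_views)
sample_std = statistics.stdev(page_views)            # sample standard deviation
q1, q2, q3 = statistics.quantiles(page_views, n=4)   # quartiles
iqr = q3 - q1                                        # interquartile range

print("mean:", mean_views, "median:", median_views, "mode:", mode_views)
print("range:", value_range, "std:", round(sample_std, 1), "IQR:", iqr)
```

Comparing the mean with the median is a quick way to see how a single outlier can distort an average, which is exactly the kind of "fun insight" worth looking for early on.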
Month 3 – 6 (this is ongoing)
To Do | Python | R |
Download an IDE for your preferred programming language | Spyder, VS Code, Jupyter | RStudio, Jupyter (I prefer doing all my analysis in Jupyter, even when using R) |
Understand the basics of different data types (strings, integers, floats) and data structures | N/A | N/A |
Get familiar with “Data” libraries | NumPy, pandas, Matplotlib, statsmodels | dplyr, tidyverse, ggplot2, caret |
Build basic visualizations | Matplotlib, Plotly (my absolute favorite), Bokeh | ggplot2, Plotly, Bokeh |
Manipulate and wrangle data with SQL/pandas/R | pandas | tidyverse |
Dig deeper into statistical concepts (correlation analysis, hypothesis testing, distribution types, linear regression) and learn how these techniques fit into the data analysis ecosystem | statsmodels, SciPy | car, ggpubr (truth be told, I slightly prefer R for statistical analysis) |
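As a rough illustration of two of the statistical concepts in the last row, correlation analysis and linear regression, the sketch below computes both by hand in plain Python. The `hours` and `scores` values and the helper names `pearson_r` and `fit_line` are made up for this example; in practice you would more likely reach for something like `scipy.stats.pearsonr` and `scipy.stats.linregress` in Python, or `cor()` and `lm()` in R.

```python
import math

# Hypothetical data: hours studied vs. exam score (made-up numbers).
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


r = pearson_r(hours, scores)
slope, intercept = fit_line(hours, scores)

print(f"correlation: {r:.3f}")
print(f"fitted line: score = {intercept:.1f} + {slope:.1f} * hours")
```

Writing these out once by hand makes it easier to understand what the library functions return later.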
Month 6 - 9
Month 9 - 18
More Statistical concepts
More Supervised Learning
Unsupervised Models
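To give a feel for what an unsupervised model does, here is a minimal sketch of k-means clustering on one-dimensional data in plain Python. This is only one example of an unsupervised technique, chosen here for illustration; the function name `kmeans_1d`, the starting-centroid rule, and the data are assumptions made for this sketch. A real project would more likely use a library implementation such as scikit-learn's `KMeans`.

```python
def kmeans_1d(values, k, iterations=10):
    """Group numbers into k clusters (k >= 2) without using any labels."""
    ordered = sorted(values)
    # Naive initialisation (an assumption of this sketch): spread the
    # starting centroids evenly across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]

    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids

    return centroids, clusters


# Hypothetical measurements that visibly fall into two groups.
data = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centroids, clusters = kmeans_1d(data, k=2)
print("centroids:", centroids)
print("clusters:", clusters)
```

The key point is that no labels are supplied: the algorithm discovers the two groups from the values alone.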
Month 18-24
And their applications (not limited to):
Text Mining
Image Classification
High-level familiarity with:
Autoencoders
Convolutional Neural Networks (CNNs)
Recurrent Neural Networks (RNNs)