Royal Government of Bhutan, Ministry of Education and Skills Development
Department of School Education
Ministry of Education & Skills Development
Online Training for ICT Teachers
28 February 2023
Classes XI & XII ICT Curriculum
Data Science SESSION II
Data Science Part II
Presentation Outline
Discussion
Activity
Explain
ICT Curriculum
Objectives
Competency
Present a visual representation of a dataset by applying data analysis modules in a programming language to communicate a message.
Content Scope
Career Opportunities
Introduction to Pandas
What?
Extremely versatile tool for manipulating datasets
Why?
Python Data Analysis Library
NumPy Vs Pandas
NumPy | Pandas |
| |
Getting Started with Pandas
In Windows | In Mac |
pip install pandas | pip3 install pandas |
Video Link
Pandas can be installed using the Python package manager pip
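After installation, a quick check from Python can confirm that the library is importable. This is a minimal sketch; the alias `pd` is a common convention, not a requirement.

```python
import pandas as pd

# If the import succeeds, Pandas is installed.
# Printing the version shows which release is in use.
print(pd.__version__)
```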
Pandas Series
Sample
Output
Problem
Write a Python program to display at least 4 names of your family members or friends.
Python Code
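One possible sketch for the Series problem above. The names are hypothetical placeholders; replace them with your own.

```python
import pandas as pd

# Hypothetical names, used only for illustration
names = pd.Series(["Dorji", "Pema", "Karma", "Sonam"])

# A Series prints each value next to its index label (0, 1, 2, ... by default)
print(names)
```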
Pandas DataFrame
Python Code
Problem
Write a Python program to display the population data of any five Dzongkhags. Include the following information:
Sample Output
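A rough sketch for the Dzongkhag population problem above. The column names are assumptions, and the population values are placeholders, not real census figures; substitute actual data before use.

```python
import pandas as pd

# Column names are assumed for illustration.
# Population values are PLACEHOLDERS, not real census figures.
data = {
    "Dzongkhag": ["Thimphu", "Paro", "Punakha", "Bumthang", "Trongsa"],
    "Population": [50000, 40000, 30000, 20000, 10000],
}

# Each dictionary key becomes a column; each list supplies that column's rows
df = pd.DataFrame(data)
print(df)
```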
Series and DataFrame
Series | DataFrame |
Stores only one type of data, as in a single column of a table | Stores various types of data in the form of rows and columns |
Holds small data | Holds large data (external data) |
Pandas stores datasets as either a Series or a DataFrame
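The relationship between the two structures can be seen by selecting a single column from a DataFrame, which yields a Series. The small table below is hypothetical.

```python
import pandas as pd

# Hypothetical two-column table
df = pd.DataFrame({"name": ["A", "B"], "age": [30, 25]})

# Selecting one column of a DataFrame gives a Series
col = df["age"]

print(type(df).__name__)   # DataFrame
print(type(col).__name__)  # Series
```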
Data Analysing Methods
info() → Displays a summary of the DataFrame
max() → Returns the maximum value
min() → Returns the minimum value
sort_values() → Returns the DataFrame sorted by the given column(s)
describe() → Returns basic statistics for the numerical columns in the DataFrame
head() → Returns the first five rows of the DataFrame
head(N) → Returns the first N rows of the DataFrame
tail() → Returns the last five rows of the DataFrame
tail(N) → Returns the last N rows of the DataFrame
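The analysing methods listed above can be tried on a small table. This is a minimal sketch with hypothetical student marks; sorting uses `sort_values()`, which is the sorting method in current Pandas releases.

```python
import pandas as pd

# Hypothetical marks, used only for illustration
df = pd.DataFrame({
    "Student": ["A", "B", "C", "D", "E", "F"],
    "Marks": [72, 85, 64, 91, 58, 77],
})

df.info()                       # prints column names, types and non-null counts
print(df["Marks"].max())        # 91
print(df["Marks"].min())        # 58
print(df.sort_values("Marks"))  # rows ordered from lowest to highest marks
print(df.describe())            # count, mean, std, min, quartiles, max
print(df.head())                # first five rows
print(df.tail(2))               # last two rows
```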
Data Analysing Methods (Activity 1)
Problem Statement
Write a Python program to:
Link to csv file
Sample Output
Data Analysing Methods (Activity 1)
Python Code
Data Referencing & Cleaning Methods
shape → Returns the number of rows and columns as a tuple (an attribute, used without parentheses)
notna() → Marks non-null values as True, which can be used to select them
dropna() → Removes rows or columns with missing data
fillna() → Fills missing values with a specified value
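isin() → Checks whether each element is in a given list of values, which can be used to filter rows
loc[] → Access a group of rows and columns by labels (used with square brackets)
iloc[] → Access a group of rows and columns by integer position (used with square brackets)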
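A short sketch of the referencing and cleaning tools listed above, using a hypothetical table with one missing score. Note that `shape` is written without parentheses, and `loc` and `iloc` take square brackets.

```python
import pandas as pd

# Hypothetical data with one missing value (NaN) and custom row labels
df = pd.DataFrame(
    {"Name": ["A", "B", "C", "D"],
     "Score": [10.0, float("nan"), 30.0, 40.0]},
    index=["w", "x", "y", "z"],
)

print(df.shape)                         # (4, 2)
print(df["Score"].notna())              # True where a score is present
print(df.dropna())                      # drops row "x", which has a missing score
print(df.fillna(0))                     # replaces the missing score with 0
print(df[df["Name"].isin(["A", "C"])])  # keeps only rows named A or C
print(df.loc["y", "Score"])             # 30.0 (row label "y", column "Score")
print(df.iloc[0, 0])                    # A (first row, first column)
```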
Data Referencing & Cleaning Methods (Activity 2)
Problem Statement
Write a Python program using Pandas to collect all data with non-empty values under the ‘Calories’ column.
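One way to approach this problem is sketched below. The linked CSV is not reproduced here, so a small hypothetical table stands in for it; only the 'Calories' column name comes from the problem statement, and the other column and all values are assumptions. With the real file, the DataFrame would be loaded with `pd.read_csv()` instead.

```python
import pandas as pd

# Hypothetical stand-in for the CSV file (values are illustrative only)
df = pd.DataFrame({
    "Duration": [60, 45, 30, 60],
    "Calories": [409.1, None, 195.1, None],
})

# Keep only the rows where 'Calories' is not empty
non_empty = df[df["Calories"].notna()]
print(non_empty)

# dropna() limited to the 'Calories' column gives the same rows
same = df.dropna(subset=["Calories"])
```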
Link to csv file
Sample Output
Statistical Data Analysing Methods
sum() → Returns the sum of the values in a column or row
mean() → Returns the average of the values in a column or row
median() → Returns the median value of the values in a column or row
mode() → Returns the mode (most common value) of the values in a column or row
std() → Returns the standard deviation of the values in a column or row
corr() → Returns the pairwise correlation coefficients between the numerical columns
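The statistical methods above can be illustrated on a small table. The study hours and scores below are hypothetical.

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({
    "Hours": [1, 2, 3, 4, 5],
    "Score": [50, 60, 60, 80, 90],
})

print(df["Score"].sum())     # 340
print(df["Score"].mean())    # 68.0
print(df["Score"].median())  # 60.0
print(df["Score"].mode())    # a Series holding the most common value, 60
print(df["Score"].std())     # sample standard deviation
print(df.corr())             # correlation table for Hours and Score
```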
Statistical Data Analysing Methods (Activity 3)
Problem Statement
Write a Python program to:
Link to csv file.
Sample Output
Pandas in Summary
Wes McKinney
Powerful and open-source library for data manipulation and data analysis in Python
Pandas Resources
GitHub Repository
YouTube Video
Pandas Official Site
Pandas Activity Book