1 of 90

DATA VISUALIZATION

PYTHON

SYLLABUS (2024.25)

Data Visualization: Purpose of plotting; drawing and saving following types of plots using Matplotlib – line plot, bar graph, Histogram.

Customizing plots: adding label, title, and legend in plots.

CBSE -XII – INFORMATICS PRACTICES (065)

2 of 90

LEARNING OUTCOMES

  • About Plotting.
  • Line Chart and its customization.
  • Bar Chart and its customization.
  • Histogram and its customization.

RECORD PROGRAMS

3 of 90

The results obtained after analysis are used to make inferences or draw conclusions about data as well as to make important business decisions.

Sometimes, it is not easy to infer by merely looking at the results. In such cases, visualisation helps in better understanding of results of the analysis.

Data visualisation means graphical or pictorial representation of the data using graph, chart, etc. The purpose of plotting data is to visualise variation or show relationships between variables.

Visualisation also helps to effectively communicate information to intended users.

Traffic symbols, ultrasound reports, an atlas of maps, the speedometer of a vehicle and the tuners of instruments are a few examples of visualisation that we come across in our daily lives.

Visualisation of data is effectively used in fields like health, finance, science, mathematics, engineering, etc.

PyPlot is a collection of methods within the matplotlib library, which allows the user to construct 2D plots easily and interactively.

4 of 90

Plotting using Matplotlib:

The Matplotlib library is used for creating static, animated, and interactive 2D plots or figures in Python.

It can be installed using the following pip command from the command prompt:

pip install matplotlib

For plotting using Matplotlib, we need to import its Pyplot module using the following command:

import matplotlib.pyplot as plt

 

Import PyPlot: (Use one of the following)

import matplotlib.pyplot

import matplotlib.pyplot as plt (or any valid identifier in place of plt)

from matplotlib import pyplot

5 of 90

Figure 4.1: Components of a plot

6 of 90

The pyplot module of matplotlib contains a collection of functions that can be used to work on a plot.

The plot() function of the pyplot module is used to create a figure. A figure is the overall window where the outputs of pyplot functions are plotted.

A figure contains a plotting area, legend, axis labels, ticks, title, etc.

Each function makes some change to a figure: for example, it creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, or decorates the plot with labels. The data presented through charts is always expected to be easily understood.

Hence, while presenting data we should always give a chart title, label the axes of the chart and provide a legend in case we have more than one set of plotted data.

To plot x versus y, we can write plt.plot(x,y). The show() function is used to display the figure created using the plot() function.

7 of 90

Customisation of Plots

Pyplot library gives us numerous functions, which can be used to customise charts such as adding titles or legends.

Function                               Description

grid([b, which, axis])                 Configure the grid lines.
legend(*args, **kwargs)                Place a legend on the axes.
savefig(*args, **kwargs)               Save the current figure.
show(*args, **kw)                      Display all figures.
title(label[,fontdict,loc,pad])        Set a title for the axes.
xlabel(xlabel[,fontdict,labelpad])     Set the label for the x-axis.
xticks([ticks, labels])                Get or set the current tick locations and labels of the x-axis.
ylabel(ylabel[,fontdict,labelpad])     Set the label for the y-axis.
yticks([ticks, labels])                Get or set the current tick locations and labels of the y-axis.

List of Pyplot functions to customise plots
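As a rough sketch of how several of the functions listed above work together, the following draws one line and then decorates it. The sample data and the use of the non-interactive Agg backend (chosen only so that the code runs without opening a window) are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical sample data (not taken from the slides)
tests = [1, 2, 3, 4]
marks = [15, 30, 22, 35]

plt.figure()
plt.plot(tests, marks, label="Marks")             # label is used by legend()
plt.title("Class test marks")                     # title of the axes
plt.xlabel("Class test")                          # label of the x-axis
plt.ylabel("Marks")                               # label of the y-axis
plt.xticks(tests, ["CT1", "CT2", "CT3", "CT4"])   # tick locations and their labels
plt.yticks([0, 10, 20, 30, 40])                   # tick locations on the y-axis
plt.grid(True)                                    # switch the grid lines on
plt.legend()                                      # place a legend on the axes
# plt.show()  # uncomment (and drop the Agg line) to display the figure
```

Each call acts on the current figure, so the order of the decorating calls does not matter as long as they come before show() or savefig().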

8 of 90

Demo Program:

Program to display 4 CT marks of a student using line chart.

import matplotlib.pyplot as plt

ctno=['CT1','CT2','CT3','CT4']

marks=[15,30,22,35]

plt.plot(ctno,marks)

plt.show()

9 of 90

Marker:

(Changing marker type, size and colour)

It is also possible to specify each point in the line through a marker.

A marker is any symbol that represents a data value in a line chart or a scatter plot. (The data points being plotted on a graph/chart are called markers.)

We can give following additional optional arguments in plot() function:

 

marker=<valid marker type>,markersize=<in points>, markeredgecolor=<valid color>

We can specify the marker type as dots, crosses, diamonds, etc. If no marker type is specified, the data points are not marked separately on the line chart; only the line joining them is drawn.

10 of 90

Marker   Description        Marker   Description

'.'      point marker       '8'      octagon
','      pixel marker       's'      square
'o'      circle marker      'p'      pentagon
'v'      triangle_down      'P'      plus (filled)
'^'      triangle_up        '*'      star
'<'      triangle_left      'h'      hexagon1
'>'      triangle_right     'H'      hexagon2
'1'      tri_down           '+'      plus
'2'      tri_up             'x'      x
'3'      tri_left           'X'      x (filled)
'4'      tri_right          'D'      diamond

Some of the Matplotlib Markers

11 of 90

Marker Types for Plotting

12 of 90

Colour : It is also possible to format the plot further by changing the colour of the plotted data.

We can either use character codes or the color names as values to the parameter color in the plot( ).

Colour abbreviations for plotting

Important Point: Continuous data are measured while discrete data are obtained by counting. Height and weight are examples of continuous data; they can take decimal values. The total number of students in a class is discrete; it can never be a decimal.

13 of 90

The Pandas Plot function (Pandas Visualisation):

In previous Programs, we learnt that the plot( ) function of the pyplot module of matplotlib can be used to plot a chart.

However, starting from version 0.17.0, the Pandas objects Series and DataFrame come equipped with their own .plot() methods.

This plot() method is just a simple wrapper around the plot() function of pyplot.

Thus, if we have a Series or DataFrame type object (let's say 's' or 'df') we can call the plot method by writing:

s.plot() or df.plot()

The plot( ) method of Pandas accepts a considerable number of arguments that can be used to plot a variety of graphs.

It allows customising different plot types by supplying the kind keyword argument.

The general syntax is: df.plot(kind=<plot type>), where kind accepts a string indicating the type of plot, as listed in the following table.

In addition, we can also use the matplotlib.pyplot methods and functions along with the plot() method of Pandas objects.

14 of 90

kind =     Plot type

line       Line plot (default)
bar        Vertical bar plot
barh       Horizontal bar plot
hist       Histogram

Others (the following are not in the syllabus)

box        Boxplot
area       Area plot
pie        Pie plot
scatter    Scatter plot

Arguments accepted by kind for different plots

We will learn to use the plot() function to create various types of charts with respect to the type of data stored in DataFrames.
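The following is a minimal sketch of the kind argument in use. The DataFrame contents, the student names and the Agg backend line are assumptions made for illustration; only df.plot(kind=...) and the pyplot calls come from the discussion above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical marks of three students in two class tests
df = pd.DataFrame({"CT1": [15, 30, 22], "CT2": [18, 27, 25]},
                  index=["Asha", "Ravi", "Meena"])

ax_line = df.plot()             # kind='line' is the default: one line per column
ax_bar = df.plot(kind="bar")    # vertical bars: one group per row, one bar per column

# pyplot functions act on the figure that the last df.plot() call created
plt.title("Class test marks")
plt.ylabel("Marks")
# plt.show()  # uncomment (and drop the Agg line) to display the figures
```

Each df.plot() call on a DataFrame opens a new figure and returns the axes it drew on, so the title and label set afterwards belong to the bar chart.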

Ex: ctno=['CT1','CT2','CT3','CT4']

marks=[15,30,22,35]

pl.plot(ctno,marks,'b',linewidth=10,marker='s',

markersize=20,markeredgecolor='r')

15 of 90

Note: (1) We can combine the marker type with the colour code in a single string.

e.g., 'r+' sets the colour to red and the marker type to plus ('+'); 'b3' sets the colour to blue and the marker type to tri_left.

(2) When you do not specify markeredgecolor separately in plot(), the marker takes the same colour as the line.

(3) If you give only a colour-marker combination (e.g., 'r+') without a line style, Python plots only the markers and not the line. To get the line as well, specify the linestyle argument also. Ex: pl.plot(ctno,marks,'rd') draws only the red markers.
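Note (3) can be checked with a short sketch. The variable names markers_only and with_line and the Agg backend line are assumptions for illustration; the data is the CT marks list used in the example above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as pl

ctno = ['CT1', 'CT2', 'CT3', 'CT4']
marks = [15, 30, 22, 35]

# 'rd' = red colour + 'd' marker (thin diamond; 'D' is the full diamond).
# No line style is given, so only the markers are drawn.
markers_only, = pl.plot(ctno, marks, 'rd')

# 'rd--' adds a dashed line style, so the markers are also joined by a line.
with_line, = pl.plot(ctno, marks, 'rd--')

print(markers_only.get_linestyle())  # line style of the first plot
print(with_line.get_linestyle())     # line style of the second plot
```

Passing linestyle='dashed' as a keyword argument together with 'rd' would have the same effect as the combined string 'rd--'.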

16 of 90

title: To add a title to the plot, we call the function title( ).

Syntax:<matplotlib.pyplot>.title(<title string>)

Ex: pl.title("Vegetable Rates at various places")

pl.plot(<x-axis values sequence>,<y-axis values sequence>)

pl.xlabel("Label here") #To display x-axis label

pl.ylabel("Label here") #To display y-axis label

pl.show( ) # To display the chart/plot.

 

Setting limits for X-axis and Y-axis:

By default, PyPlot tries to find the best-fitting range for the X-axis and Y-axis depending on the data being plotted.

We can set the x limits and y limits as follows:

<matplotlib.pyplot>.xlim(<xmin>,<xmax>)

<matplotlib.pyplot>.ylim(<ymin>,<ymax>)

17 of 90

Setting ticks for Axes:

 

By default, PyPlot will automatically decide which data points will have ticks on the axes, but we can also decide which data points will have tick marks on X and Y-axes.

Syntax (for X-axis): xticks(<sequence containing tick data points>,[<optional sequence containing tick labels>])

Syntax (for Y-axis): yticks(<sequence containing tick data points>,[<optional sequence containing tick labels>])

 

18 of 90

Adding Legends:

A legend is a color or mark linked to a specific data range plotted. When we plot multiple ranges on a single plot, it becomes necessary that legends are specified.

To add a legend,

<matplotlib.pyplot>.legend(loc=<position number or string>)

Position Numbers – 1.upper right, 2.upper left,

3.lower left, 4.lower right.

 

19 of 90

Saving a Figure:

savefig( ) function is used to save a plot created using pyplot functions for later use or for keeping records.

Syntax:<matplotlib.pyplot>.savefig(<string with filename and path>)

We can save figures in formats like .pdf, .png, .eps, etc.

Ex: pl.savefig("myfile.pdf")

#stores the plot in the current directory

pl.savefig("D:\\data\\myfile.pdf")

#stores the pdf file in the data folder of D drive
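A small sketch of savefig() in use is given below. The temporary output folder, the file names and the Agg backend line are assumptions for illustration, chosen so that the code does not depend on a D drive being present.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as pl

pl.figure()
pl.plot([1, 2, 3, 4], [15, 30, 22, 35])
pl.title("Saving a figure")

out_dir = tempfile.mkdtemp()                     # hypothetical output folder
png_path = os.path.join(out_dir, "myfile.png")
pdf_path = os.path.join(out_dir, "myfile.pdf")

pl.savefig(png_path)   # the file format is taken from the extension
pl.savefig(pdf_path)

print(os.path.exists(png_path), os.path.exists(pdf_path))
```

It is usually safer to call savefig() before show(): with an interactive backend the figure may already be closed once the window has been dismissed, and the saved file can then turn out blank.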

20 of 90

1. LINE CHART

A line chart or line graph is a type of chart which displays information as a series of data points called ‘markers’ connected by straight line segments.

A line plot is a graph that shows the frequency of data along a number line. It is used to show continuous dataset.

A line plot is used to visualise growth or decline in data over a time interval.

With PyPlot, a line chart is created using plot( ) function.

21 of 90

Program to display 4 CT marks of a student using line chart.

import matplotlib.pyplot as pl

ctno=['CT1','CT2','CT3','CT4']

marks=[29,32,34,35]

pl.xlabel("CT Number")

pl.ylabel("CT Marks")

pl.plot(ctno,marks)

pl.show()

22 of 90

23 of 90

Specifying plot size: We can change the plot size as per our requirements.

Syntax:

<matplotlib.pyplot>.figure(figsize=(<width>,<height>))

Ex:

matplotlib.pyplot.figure(figsize=(16,8))

(or) pl.figure(figsize=(16,8))

Here the figure is 16 units (inches) wide, i.e., along the x-axis, and 8 units (inches) tall, i.e., along the y-axis.

To show grid: pl.grid(True)

24 of 90

Program to display 4 CT marks of a student using line chart.

(With desired plotsize and grid)

import matplotlib.pyplot as pl

ctno=['CT1','CT2','CT3','CT4']

marks=[29,32,34,35]

pl.figure(figsize=(16,8))

pl.xlabel("CT Number")

pl.ylabel("CT Marks")

pl.plot(ctno,marks)

pl.grid(True)

pl.show()

25 of 90

Applying various settings in plot() function:

Changing Line Colour:

Syntax: <matplotlib.pyplot>.plot(<data1>[,<data2>],<colour code>)

Ex: pl.plot(ctno,marks,'r')

Note: 1. If we skip the colour information, Python plots multiple lines in the same plot with different colours.

2. We can also write full colour names like 'red' or 'lightgreen', or use hex strings like '#008000', etc.

26 of 90

Changing Line Width: Ex: pl.plot(ctno,marks,linewidth=2)

Changing Line Style: linestyle (or) ls = 'solid', 'dashed', 'dashdot' or 'dotted'

Ex: pl.plot(ctno,marks,linewidth=3,linestyle='dashed')

Program to display 4 CT marks of a student using line chart.

(With different line width and line style)

import matplotlib.pyplot as pl

ctno=['CT1','CT2','CT3','CT4']

marks=[29,32,34,35]

pl.figure(figsize=(16,8))

pl.plot(ctno,marks,'r',linewidth=10,linestyle='dashed')

pl.xlabel("CT Number")

pl.ylabel("CT Marks")

pl.grid(True)

pl.show()

27 of 90

ls='solid'

ls='dashed'

linestyle='dashdot'

ls='dotted'

Note: We can use either linestyle or ls; the default line style is solid.

28 of 90


29 of 90


30 of 90


31 of 90


32 of 90

Demo Program:

import matplotlib.pyplot as pp

import numpy as np

X=np.arange(4) #[0,1,2,3]

Y=[5.0,25.0,45.0,20.0]

pp.xlim(-3.0,3.5)

pp.ylim(4,70)

pp.bar(X,Y)

pp.title("A sample Bar Chart")

pp.show()

Setting limits for X-axis and Y-axis:

By default, PyPlot tries to find the best-fitting range for the X-axis and Y-axis depending on the data being plotted.

We can set the x limits and y limits as follows:

<matplotlib.pyplot>.xlim(<xmin>,<xmax>)

<matplotlib.pyplot>.ylim(<ymin>,<ymax>)

33 of 90

Note: 1. While setting the limits for the axes, keep in mind that only the data that falls within the limits of the X and Y axes is plotted; the rest of the data does not show in the plot.

2. If we swap the limits from (min,max) to (max,min), the plot gets flipped.

import matplotlib.pyplot as pp

X=[0,1,2,3]

Y=[5.0,25.0,45.0,20.0]

pp.xlim(-2,4)

pp.plot(X,Y)

pp.show()

import matplotlib.pyplot as pp

X=[0,1,2,3]

Y=[5.0,25.0,45.0,20.0]

pp.xlim(4,-2)

pp.plot(X,Y)

pp.show()

34 of 90

Setting ticks for Axes:

By default, PyPlot will automatically decide which data points will have ticks on the axes, but we can also decide which data points will have tick marks on X and Y-axes.

Syntax (for X-axis): xticks(<sequence containing tick data points>,

[<optional sequence containing tick labels>])

Syntax (for Y-axis): yticks(<sequence containing tick data points>,

[<optional sequence containing tick labels>])

35 of 90

import matplotlib.pyplot as pp

X=[0,1,2,3]

Y=[5.0,25.0,45.0,20.0]

pp.plot(X,Y)

pp.show()

import matplotlib.pyplot as pp

X=[0,1,2,3]

Y=[5.0,25.0,45.0,20.0]

pp.xticks([0,1,2,3])

pp.plot(X,Y)

pp.show()

import matplotlib.pyplot as pp

X=[0,1,2,3]

Y=[5.0,25.0,45.0,20.0]

pp.xticks([0.5,1,5])

pp.yticks([10,15,40])

pp.plot(X,Y)

pp.show()

36 of 90

Write a program to compare rates of vegetables in Raithubazar and Sunday Market using line charts.

 

Vegetable   RBazar   SMarket

Brinjal     35       50
Onion       25       35
Potato      50       40
Chilly      60       80

Program:

import matplotlib.pyplot as plt

Veg=["Brinjal","Onion","Potato","Chilly"]

RBazar=[35,25,50,60]

SMarket=[50,35,40,80]

plt.plot(Veg,RBazar,label='RB',color='r')

plt.plot(Veg,SMarket,label='SM',color='g')

plt.xlabel("Vegetable Names")

plt.ylabel("Vegetable Rates")

plt.title("Vegetable Rates Comparison")

plt.legend(loc=3) #add the legend before saving so that it appears in the saved image

plt.savefig('D:/rates.jpg')

plt.show()

USING LEGENDS

37 of 90

NCERT - EXAMPLES

Let us consider that in a city, the maximum temperature of a day is recorded for three consecutive days.

Program 4-1 demonstrates how to plot temperature values for the given dates. The output generated is a line chart.

Program 4-1 Plotting temperature against date

import matplotlib.pyplot as plt

#list storing date in string format

date=["25/12","26/12","27/12"]

#list storing temperature values

temp=[8.5,10.5,6.8]

#create a figure plotting temp versus date

plt.plot(date, temp)

#show the figure

plt.show()

38 of 90

In Program 4-1, plot() is provided with two parameters, which indicate the values for the x-axis and y-axis, respectively.

The x and y ticks are displayed accordingly. As shown in Figure 4.2, the plot() function by default plots a line chart. We can click on the save button on the output window and save the plot as an image. A figure can also be saved by using the savefig() function. The name of the figure is passed to the function as a parameter.

 

For example: plt.savefig('x.png').

 

In the previous example, we used plot() function to plot a line graph. There are different types of data available for analysis. The plotting methods allow for a handful of plot types other than the default line plot, as listed in Table 4.1. (from our syllabus) Choice of plot is determined by the type of data we have.

39 of 90

Program 4-2 Plotting a line chart of date versus temperature by adding Label on X and Y axis, and adding a Title and Grids to the chart.

import matplotlib.pyplot as plt

date=["25/12","26/12","27/12"]

temp=[8.5,10.5,6.8]

plt.plot(date, temp)

plt.xlabel("Date") #add the Label on x-axis

plt.ylabel("Temperature") #add the Label on y-axis

plt.title("Date wise Temperature") #add the title to the chart

plt.grid(True) #add gridlines to the background

plt.yticks(temp)

plt.show()

In this example, we have used the xlabel, ylabel, title and yticks functions. We can see that compared to Figure 1, the Figure 2 conveys more meaning, easily. We will learn about customisation of other plots in later sections.

40 of 90

Let us write the Program 4-3 applying some of the customisations.

Program 4-3 Consider the average heights and weights of persons aged 8 to 16 stored in the following two lists:

height = [121.9,124.5,129.5,134.6,139.7,147.3, 152.4, 157.5,162.6]

weight= [19.7,21.3,23.5,25.9,28.5,32.1,35.7,39.6, 43.2]

Let us plot a line chart where:

i. x axis will represent weight

ii. y axis will represent height

iii. x axis label should be “Weight in kg”

iv. y axis label should be “Height in cm”

v. colour of the line should be green

vi. use * as marker

vii. Marker size should be 10

viii. The title of the chart should be “Average weight with respect to average height”.

ix. Line style should be dashed

x. Linewidth should be 2.

41 of 90

import matplotlib.pyplot as plt

import pandas as pd

height=[121.9,124.5,129.5,134.6,139.7,147.3,152.4, 157.5,162.6]

weight=[19.7,21.3,23.5,25.9,28.5,32.1,35.7,39.6,43.2]

df=pd.DataFrame({"height":height,"weight":weight})

plt.xlabel('Weight in kg') #Set xlabel for the plot

plt.ylabel('Height in cm') #Set ylabel for the plot

plt.title('Average weight with respect to average height') #Set chart title

#plot using marker '*', dashed line style and line colour green

plt.plot(df.weight,df.height,marker='*',markersize=10,color='green',

linewidth=2, linestyle='dashed')

plt.show()

42 of 90

In the above program, we created the DataFrame using two lists, and in the plot() function we passed the weight and height columns of the DataFrame.

The output is shown in following figure.

Line chart showing average weight against average height

43 of 90

2. BAR CHARTS

A bar graph or a bar chart is a graphical display of data using bars of different heights. A bar chart can be drawn vertically or horizontally using rectangles or bars of different heights/widths.

Each y value is plotted as a bar at the corresponding x value on the x-axis.

If you want multiple commands to affect a common bar chart, store all the related statements in a Python script (.py file) with the last statement being <matplotlib.pyplot>.show().

44 of 90

Rakesh went to Raithu Bazar to buy vegetables. Program to display the vegetable names and their rates per KG using a bar chart.

import matplotlib.pyplot as pp

vegetables=['Brinjal','Tamota','Onion','Beetroot','Chilly']

rates=[60,45,28,52,80]

pp.xlabel("Vegetable Names")

pp.ylabel("Vegetable Rates Per KG")

pp.bar(vegetables,rates)

pp.show()

45 of 90

46 of 90

import matplotlib.pyplot as pp

vegetables=['Brinjal','Tamota','Onion','Beetroot']

rbazarrates=[60,45,28,52]

smarketrates=[95,70,45,35]

pp.xlabel("Vegetable Names")

pp.ylabel("Raithu Bazar,Sunday Market Rates Per KG")

pp.bar(vegetables,smarketrates)

pp.bar(vegetables,rbazarrates)

pp.show()

Bharat compared the rates of vegetables in Raithu Bazar and in Sunday Market. He found a lot of variation in the rates of vegetables per KG. Write a program for the comparison using a bar chart.

47 of 90

pp.bar(vegetables,rbazarrates)

pp.bar(vegetables,smarketrates)

Observe the following:

48 of 90

Changing widths of the Bars in a Bar Chart:

Default width = 0.8 units

To specify common width (using a scalar value):

<matplotlib.pyplot>.bar(<x-sequence>,

<y-sequence>,width=<float value>)

Ex: pp.bar(vegetables,rates,width=0.4)

If you specify a scalar value (a single value) for width argument, then that width is applied to all the bars of the bar chart.

 

49 of 90

To specify different widths for different bars:

<matplotlib.pyplot>.bar(<x-sequence>,

<y-sequence>,

width=<width values sequence>)

 

Ex:

pp.bar(vegetables,rates,

width=[0.4,0.2,0.5,0.8,0.4])

Note: The width values’ sequence in a bar( ) must have widths for all the bars, i.e., its length must match the length of data sequences being plotted, otherwise Python will report an error.

50 of 90

Changing the colors of the Bars: By default, a bar chart draws all the bars with the same default color.

To specify common color:

<matplotlib.pyplot>.bar(<x-sequence>,

<y-sequence>,color=<color code/name>)

When we specify single color name or single color code with color argument of the bar( ) function, the specified color is applied to all the bars of the bar chart i.e., all bars of the bar chart have the same common color.

To specify different colors for different Bars:

<matplotlib.pyplot>.bar(<x-sequence>,

<y-sequence>,color=<sequence of color codes/color names>)

51 of 90

Ex: pp.bar(vegetables,rates,width=[0.4,0.2,0.5,0.8,0.4],color=['b','red','k','g','y'])

52 of 90

CREATING MULTIPLE BARS CHART

CTMarks is a list having 5 subject marks for CT1 & CT2. Create a bar chart that plots these two sub lists of CTMarks in a single chart. Keep the width of each bar as 0.3

import matplotlib.pyplot as pl

import numpy as np

CTMarks=[[32,37,39,29,25],[33,38,37,30,28]]

X=np.arange(5) #it gives 0,1,2,3,4

pl.bar(X+0.00,CTMarks[0],color='b',width=0.30)

pl.bar(X+0.30,CTMarks[1],color='g',width=0.30)

pl.show()

53 of 90

CTMarks=[[32,37,39,29,25],[33,38,37,30,28]]

X=np.arange(5) #it gives 0,1,2,3,4

pl.bar(X+0.00,CTMarks[0],color='b',width=0.30)

54 of 90

CTMarks=[[32,37,39,29,25],[33,38,37,30,28]]

X=np.arange(5) #it gives 0,1,2,3,4

pl.bar(X+0.30,CTMarks[1],color='g',width=0.30)

55 of 90

pl.bar(X+0.00,CTMarks[0],color='b',width=0.30)

pl.bar(X+0.50,CTMarks[1],color='g',width=0.30)

56 of 90

pl.bar(X+0.00,CTMarks[0],color='b',width=0.30)

pl.bar(X+0.10,CTMarks[1],color='g',width=0.30)
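The offsets tried on the last few slides can be summarised in a sketch: when the offset of the second series equals the bar width, the bars sit exactly side by side; a larger offset (0.50) leaves a gap and a smaller one (0.10) makes them overlap. The subject names, the legend labels and the Agg backend line are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as pl
import numpy as np

CTMarks = [[32, 37, 39, 29, 25], [33, 38, 37, 30, 28]]
subjects = ['Eng', 'Maths', 'Phy', 'Chem', 'IP']   # hypothetical subject names
X = np.arange(5)                                   # 0,1,2,3,4
width = 0.30

pl.figure()
ct1 = pl.bar(X, CTMarks[0], color='b', width=width, label='CT1')
ct2 = pl.bar(X + width, CTMarks[1], color='g', width=width, label='CT2')  # offset = width

pl.xticks(X + width / 2, subjects)   # put each tick label under the middle of its pair
pl.legend()

# Left edge of the first CT2 bar minus left edge of the first CT1 bar
print(ct2[0].get_x() - ct1[0].get_x())
# pl.show()  # uncomment (and drop the Agg line) to display the figure
```

Moving the ticks to X + width/2 is a design choice for readability; without it the labels would sit under the CT1 bars only.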

57 of 90

Creating a Horizontal Bar Chart:

Use barh( ) instead of bar( ).

The label that you give to x-axis in bar( ), will become y-axis label in barh( )

Ex:

import matplotlib.pyplot as pp

vegetables=['Brinjal','Tamota','Onion','Beetroot','Chilly']

rates=[60,45,28,52,80]

pp.xlabel("Vegetable Rates Per KG")

pp.ylabel("Vegetable Names")

pp.barh(vegetables,rates)

pp.show()

58 of 90

import matplotlib.pyplot as pp

vegetables=['Brinjal','Tamota','Onion',\

'Beetroot','Chilly']

rates=[60,45,28,52,80]

pp.xlabel("Vegetable Names")

pp.ylabel("Vegetable Rates Per KG")

pp.title("Vegetable Rates at various places")

pp.bar(vegetables,rates)

pp.show()

Rakesh went to Raithu Bazar to buy vegetables. Write a program to display the vegetable names and their rates per KG using a bar chart. Show the title also.


59 of 90

import matplotlib.pyplot as pp

import numpy as np

X=np.arange(4) #[0,1,2,3]

Y=[5.0,25.0,45.0,20.0]

pp.bar(X,Y)

pp.title("A sample Bar Chart")

pp.show()

Demo Program:

60 of 90


61 of 90

import matplotlib.pyplot as pp

X=[0,1,2,3]

Y=[5.0,25.0,45.0,20.0]

pp.bar(X,Y)

pp.show()

import matplotlib.pyplot as pp

X=[0,1,2,3]

Y=[5.0,25.0,45.0,20.0]

pp.xticks([0.5,1,5])

pp.yticks([10,15,40])

pp.bar(X,Y)

pp.show()

62 of 90

import matplotlib.pyplot as pp

import numpy as np

amount=[5000,4500,6000,3200,5500,6200]

X=np.arange(6) #0,1,2,3,4,5

pp.title("Donations - Week Collection")

pp.bar(X,amount,color='blue',width=0.3)

pp.xticks(X,['Mon','Tue','Wed','Thu','Fri','Sat'])

pp.xlabel("Days")

pp.ylabel("Donation Amount Collected")

pp.show()

Program: “ABC” school celebrated a volunteering week in which each section of class VI dedicated a day to collecting money for a charity supported by the school. Section A volunteered on Monday, B on Tuesday, and so on. There are six sections in class VI. The amounts collected by sections A to F are 5000, 4500, 6000, 3200, 5500 and 6200. Write a program to plot the collected amount vs. days using a bar chart. The ticks on the X-axis should have day names. The graph should have a proper title and axes titles.

63 of 90


64 of 90

Legends Demo Program: 5 subject marks for 3 CT Exams of a student.

import matplotlib.pyplot as pl

import numpy as np

CTMarks=[[32,37,39,29,25],[33,38,37,30,28],[34,33,39,40,35]]

X=np.arange(5) #it gives 0,1,2,3,4

pl.bar(X+0.00,CTMarks[0],color='b',width=0.20,label='CT 1 Marks')

pl.bar(X+0.20,CTMarks[1],color='g',width=0.20,label='CT 2 Marks')

pl.bar(X+0.40,CTMarks[2],color='k',width=0.20,label='CT 3 Marks')

pl.legend(loc='upper right') #or 1 instead of 'upper right'

pl.title("3 CT Marks of a student")

pl.xlabel("Subjects")

pl.ylabel("CTs")

pl.show()

65 of 90


66 of 90

3. HISTOGRAMS

A histogram is a summarisation tool for discrete or continuous data.

A histogram provides a visual interpretation of numerical data by showing the number of data points that fall within a specified range of values (called bins).

It is similar to a vertical bar graph. A histogram, unlike a vertical bar graph, shows no gaps between the bars.

It gives a visual representation of the data distribution.

It can display a large set of data.

67 of 90


Histograms are column-charts, where each column represents a range of values, and the height of a column corresponds to how many values are in that range.

To make a histogram, the data is sorted into "bins" and the number of data points in each bin is counted. The height of each column in the histogram is then proportional to the number of data points its bin contains.

The df.plot(kind='hist') function automatically selects the size of the bins based on the spread of values in the data.

Point: Bins are the number of intervals you want to divide all of your data into, such that it can be displayed as bars on a histogram. If we do not specify bins, 10 bins are used by default.

68 of 90

hist( ) function

69 of 90

Write a program to plot ages of 20 citizens using histogram

import matplotlib.pyplot as pl

x=[23,45,21,13,34,45,56,67,87,57,83,89,45,56,67,4,1,56,67,45]

pl.hist(x)

pl.xlabel("ages")

pl.ylabel("count")

pl.show()

70 of 90

pl.hist(x,ec='red')

#ec means edge color

71 of 90

pl.hist(x,bins=5,ec='red')

72 of 90

pl.hist(x,bins=[1,13,20,40,60,100],ec='red')

Taking bins as a sequence

Intervals

[1,13) - 1,2,….12

[13,20) – 13,14,….19

[20,40) – 20,21…..39

[40,60) – 40,41,…..59

[60,100] – 60,61,…100
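The interval list above can be checked against the data. hist() returns the count in each bin together with the bin edges, so the following sketch prints how many of the 20 ages fall into each interval. The variable names counts, edges and bars and the Agg backend line are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as pl

x = [23, 45, 21, 13, 34, 45, 56, 67, 87, 57, 83, 89, 45, 56, 67, 4, 1, 56, 67, 45]

pl.figure()
# hist() returns the per-bin counts, the bin edges and the drawn bars
counts, edges, bars = pl.hist(x, bins=[1, 13, 20, 40, 60, 100], ec='red')

# Every interval includes its left edge; only the last one also includes its right edge
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:g} to {right:g}: {int(n)} values")
```

The five counts add up to 20, the number of ages in the list. The age 13 is counted in the second interval, not the first, because each interval excludes its right edge.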

pl.hist(x,"auto",ec='red')

Auto – the number of bins is chosen automatically.

73 of 90

pl.hist(x,20,ec='red')

74 of 90

pl.hist(x,"auto",(1,200),ec='red')

Range: the third argument (1,200) gives the minimum and maximum values of x to be covered by the bins.

75 of 90

cumulative = True

Each interval shows its own count plus the counts of all the smaller intervals.

76 of 90

cumulative = -1

Each interval shows its own count plus the counts of all the bigger intervals.
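A short sketch comparing the two cumulative settings with the plain counts is given below. It reuses the ages list from the earlier program; the explicit bins sequence, the variable names and the Agg backend line are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as pl

x = [23, 45, 21, 13, 34, 45, 56, 67, 87, 57, 83, 89, 45, 56, 67, 4, 1, 56, 67, 45]
bins = [1, 13, 20, 40, 60, 100]

pl.figure()
plain, _, _ = pl.hist(x, bins=bins)                      # count in each interval

pl.figure()
upward, _, _ = pl.hist(x, bins=bins, cumulative=True)    # running total from the left

pl.figure()
downward, _, _ = pl.hist(x, bins=bins, cumulative=-1)    # running total from the right

print([int(n) for n in plain])
print([int(n) for n in upward])
print([int(n) for n in downward])
```

With cumulative=True the last bar equals the total number of values, and with cumulative=-1 the first bar does.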

77 of 90

pl.hist(x,ec='red',histtype='step')

histtype : Type of histogram to draw

bar – default

barstacked – if there are multiple sets of data, the bars are stacked one above another.

step – line plot, i.e., unfilled

stepfilled – line plot, i.e., filled

78 of 90

histtype='stepfilled'

histtype='step'

histtype='barstacked'

histtype='bar'

79 of 90

align : Horizontal alignment of the histogram bars (left, right, mid)

mid – bars are centred between the bin edges (default)

pl.hist(x,ec='red',histtype='bar',align='mid')

80 of 90

align='right'

align='left'

81 of 90

orientation : horizontal (or) vertical.

Default value is 'vertical'.

pl.hist(x,ec='red',orientation='horizontal')

82 of 90

NCERT TEXT – EXAMPLES

Program 4-8

import pandas as pd

import matplotlib.pyplot as plt

data = {'Name':['Arnav', 'Sheela', 'Azhar', 'Bincy', 'Yash','Nazar'],'Height' : [60,61,63,65,61,60],

'Weight' : [47,89,52,58,50,47]}

df=pd.DataFrame(data)

df.plot(kind='hist')

plt.show()

Figure 4.9: A histogram as output of Program 4-8

It is also possible to set value for the bins parameter,

for example,

df.plot(kind='hist',bins=20)

df.plot(kind='hist',bins=[18,19,20,21,22])

df.plot(kind='hist',bins=range(18,25))

83 of 90

Customising Histogram:

Taking the same data as above, let us now see how the histogram can be customised.

Let us change the edgecolor, which is the border of each hist, to green. Also, let us change the line style to ":" and line width to 2. Let us try another property called fill, which takes boolean values.

The default True means each hist will be filled with color and False means each hist will be empty.

Another property called hatch can be used to fill each hist with a pattern ('-', '+', 'x', '\\', '*', 'o', 'O', '.').

84 of 90

Program 4-9

import pandas as pd

import matplotlib.pyplot as plt

data = {'Name':['Arnav', 'Sheela', 'Azhar','Bincy','Yash','Nazar'],'Height' : [60,61,63,65,61,60],

'Weight' : [47,89,52,58,50,47]}

df=pd.DataFrame(data)

df.plot(kind='hist',edgecolor='Green',linewidth=2,

linestyle=':',fill=False,hatch='o')

plt.show()

85 of 90

RECORD PROGRAMS – 16 to 20

16. Given the school result data, write a program to analyse the performance of the students subject-wise and plot it using a bar chart.

import matplotlib.pyplot as plt

import pandas as pd

marks = { "English" :[45,50,48],

"Maths":[65,70,55],

"Physics":[75,85,52],

"Chemistry" :[45,50,53],

"IP":[95,100,90]}

df = pd.DataFrame(marks,

index=['Rajesh','Naveen','Sunitha'])

print("************Marksheet************")

print(df)

df.plot(kind='bar')

plt.title("Students and their Marks")

plt.xlabel("Student Names")

plt.ylabel("Marks")

plt.savefig('D:/marks.pdf')

plt.show()

86 of 90

17. Write a program to compare rates of vegetables in Raithubazar and Sunday Market using line charts.

 

Vegetable   RBazar   SMarket

Brinjal     35       50
Onion       25       35
Potato      50       40
Chilly      60       80

Program:

import matplotlib.pyplot as plt

Veg=["Brinjal","Onion","Potato","Chilly"]

RBazar=[35,25,50,60]

SMarket=[50,35,40,80]

plt.plot(Veg,RBazar,label='RB',color='r')

plt.plot(Veg,SMarket,label='SM',color='g')

plt.xlabel("Vegetable Names")

plt.ylabel("Vegetable Rates")

plt.title("Vegetable Rates Comparison")

plt.legend(loc=3) #add the legend before saving so that it appears in the saved image

plt.savefig('D:/rates.jpg')

plt.show()

87 of 90

18. Rakesh went to a vegetable shop to purchase vegetables. Write a program to display the vegetable names and their rates per KG using a bar chart (give a different colour to each bar).

Given Data: Vegetable names are Brinjal, Tamota, Onion, Beetroot, Chilly

Their Corresponding Rates are 60,45,28,52,80.

Program:

import matplotlib.pyplot as pp

vegetables=['Brinjal','Tamota','Onion','Beetroot','Chilly']

rates=[60,45,28,52,80]

pp.title ("Vegetables and their Rates")

pp.xlabel("Vegetable Names")

pp.ylabel("Vegetable Rates Per KG")

pp.bar(vegetables,rates,color=['b','red','k','g','y'])

pp.savefig('D:/veg.jpg')

pp.show()

OUTPUT

88 of 90

19. Plot the following data on a line chart and customise the chart according to the instructions given below.

Month   January   February   March   April   May

Sales   500       350        450     550     600

Write a program which includes all the following:

(a) Write a title for the chart ‘The Monthly Sales Report’

(b) Write the appropriate titles of both the axes

(c) Write code to display legends

(d) Display blue color for the line

(e) Use line style-dashed

(f) Display diamond style markers on data points.

Program:

#Importing matplotlib library

import matplotlib.pyplot as pt

months=['January','February','March','April','May']

sales=[500,350,450,550,600]

#Plotting a line graph

pt.plot(months,sales,label='Sales',color='b', linestyle='dashed',marker='D')

pt.title("The Monthly Sales Report")

pt.xlabel("Months")

pt.ylabel("Sales")

pt.legend()

pt.savefig('D:/monthlysales.jpg')

#Displaying a line chart

pt.show()

Output

89 of 90

20. Plot the following age details of 20 students using a histogram.

X=[16,15,14,18,17,16,14,16,15,18,15,16,18,14,16,15,14,16,17,14]

Program:

#Importing matplotlib library

import matplotlib.pyplot as pl

X=[16,15,14,18,17,16,14,16,15,18,15,16,18,14,16,15,14,16,17,14]

#Plotting a histogram with edge color red

pl.hist(X,ec='red')

#Displaying title of the graph

pl.title("Histogram showing Countwise Ages in an Inter College of 20 students")

#Displaying X axis label

pl.xlabel("Ages")

#Displaying Y axis label

pl.ylabel("Count")

pl.savefig('D:/ageshistogram.pdf')

#Displaying histogram

pl.show()

Output

90 of 90

THANK YOU

ALL THE BEST MY DEAR….