DATA VISUALIZATION
PYTHON
SYLLABUS (2024-25)
Data Visualization: Purpose of plotting; drawing and saving following types of plots using Matplotlib – line plot, bar graph, Histogram.
Customizing plots: adding label, title, and legend in plots.
CBSE -XII – INFORMATICS PRACTICES (065)
The results obtained after analysis are used to make inferences or draw conclusions about data, as well as to make important business decisions.
Sometimes, it is not easy to infer by merely looking at the results. In such cases, visualisation helps in better understanding of results of the analysis.
Data visualisation means graphical or pictorial representation of the data using graph, chart, etc. The purpose of plotting data is to visualise variation or show relationships between variables.
Visualisation also helps to effectively communicate information to intended users.
Traffic symbols, ultrasound reports, an atlas of maps, the speedometer of a vehicle and tuners of instruments are a few examples of visualisation that we come across in our daily lives.
Visualisation of data is effectively used in fields like health, finance, science, mathematics, engineering, etc.
PyPlot is a collection of methods within the matplotlib library that allows users to construct 2D plots easily and interactively.
Plotting using Matplotlib:
The Matplotlib library is used for creating static, animated, and interactive 2D plots or figures in Python.
It can be installed using the following pip command from the command prompt:
pip install matplotlib
For plotting using Matplotlib, we need to import its Pyplot module using the following command:
import matplotlib.pyplot as plt
Import PyPlot: (use one of the following)
import matplotlib.pyplot
import matplotlib.pyplot as plt (or any valid identifier in place of plt)
from matplotlib import pyplot
Figure 4.1: Components of a plot
The pyplot module of matplotlib contains a collection of functions that can be used to work on a plot.
The plot() function of the pyplot module is used to create a figure. A figure is the overall window where the outputs of pyplot functions are plotted.
A figure contains a plotting area, legend, axis labels, ticks, title, etc.
Each function makes some change to a figure: for example, it creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, or decorates the plot with labels. The data presented through charts should always be easy to understand.
Hence, while presenting data we should always give a chart title, label the axis of the chart and provide legend in case we have more than one plotted data.
To plot x versus y, we can write plt.plot(x,y). The show() function is used to display the figure created using the plot() function.
Customisation of Plots
Pyplot library gives us numerous functions, which can be used to customise charts such as adding titles or legends.
Function | Description |
grid([b, which, axis]) | Configure the grid lines. |
legend(*args, **kwargs) | Place a legend on the axes. |
savefig(*args, **kwargs) | Save the current figure. |
show(*args, **kw) | Display all figures. |
title(label[,fontdict,loc,pad]) | Set a title for the axes. |
xlabel(xlabel[,fontdict, labelpad]) | Set the label for the x-axis. |
xticks([ticks, labels]) | Get or set the current tick locations and labels of the x-axis. |
ylabel(ylabel[,fontdict, labelpad]) | Set the label for the y-axis. |
yticks([ticks, labels]) | Get or set the current tick locations and labels of the y-axis. |
List of Pyplot functions to customise plots
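The functions listed above can be tried together in one short program. The following is only a sketch: the marks are sample values, and the last lines read the settings back from the plot to confirm what was applied.

```python
import matplotlib.pyplot as plt

# Sample data: marks scored in four class tests
tests = ['CT1', 'CT2', 'CT3', 'CT4']
marks = [15, 30, 22, 35]

plt.plot(tests, marks, label='Marks')   # label is picked up by legend()
plt.title("Class Test Marks")           # chart title
plt.xlabel("Class Test")                # x-axis label
plt.ylabel("Marks")                     # y-axis label
plt.grid(True)                          # show grid lines
plt.legend()                            # place the legend

ax = plt.gca()                          # the current plotting area (axes)
print(ax.get_title(), '|', ax.get_xlabel(), '|', ax.get_ylabel())
```

Add plt.show() as the last statement to display the chart on screen.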
Demo Program:
Program to display 4 CT marks of a student using line chart.
import matplotlib.pyplot as plt
ctno=['CT1','CT2','CT3','CT4']
marks=[15,30,22,35]
plt.plot(ctno,marks)
plt.show()
Marker:
(Changing marker type, size and colour)
It is also possible to specify each point in the line through a marker.
A marker is any symbol that represents a data value in a line chart or a scatter plot. (The data points being plotted on a graph/chart are called markers.)
We can give following additional optional arguments in plot() function:
marker=<valid marker type>,markersize=<in points>, markeredgecolor=<valid color>
We can specify the marker type as dots, crosses, diamonds, etc. If the marker type is not specified, the individual data points are not marked on the line chart; only the line joining them is drawn.
Marker | Description |
'.' | point |
',' | pixel |
'o' | circle |
'v' | triangle_down |
'^' | triangle_up |
'<' | triangle_left |
'>' | triangle_right |
'1' | tri_down |
'2' | tri_up |
'3' | tri_left |
'4' | tri_right |
'8' | octagon |
's' | square |
'p' | pentagon |
'P' | plus (filled) |
'*' | star |
'h' | hexagon1 |
'H' | hexagon2 |
'+' | plus |
'x' | x |
'X' | x (filled) |
'D' | diamond |
Some of the Matplotlib markers (marker types for plotting)
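A short sketch of the three marker arguments in use, with sample marks. The returned line object is read back only to show which settings were stored.

```python
import matplotlib.pyplot as plt

tests = ['CT1', 'CT2', 'CT3', 'CT4']
marks = [15, 30, 22, 35]

# Square markers, 12 points in size, with a red edge, on a blue line
line, = plt.plot(tests, marks, color='b', marker='s',
                 markersize=12, markeredgecolor='r')

print(line.get_marker(), line.get_markersize(), line.get_markeredgecolor())
```

Any marker code from the table can be used in place of 's'.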
Colour : It is also possible to format the plot further by changing the colour of the plotted data.
We can either use character codes or the color names as values to the parameter color in the plot( ).
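Character | Colour |
'b' | blue |
'c' | cyan |
'g' | green |
'm' | magenta |
'r' | red |
'y' | yellow |
'k' | black |
'w' | white |
Colour abbreviations for plotting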
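As a small illustration, the same red colour can be given as a character code, as a colour name, or as a hex string. The sketch below uses sample marks and the to_hex() helper from matplotlib.colors to show that all three describe the same colour.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

tests = ['CT1', 'CT2', 'CT3', 'CT4']
marks = [15, 30, 22, 35]

line_code, = plt.plot(tests, marks, color='r')                          # character code
line_name, = plt.plot(tests, [m + 2 for m in marks], color='red')       # colour name
line_hex, = plt.plot(tests, [m + 4 for m in marks], color='#ff0000')    # hex string

# to_hex() converts any valid colour value to a hex string
for line in (line_code, line_name, line_hex):
    print(to_hex(line.get_color()))
```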
Important Point: Continuous data are measured, while discrete data are obtained by counting. Height and weight are examples of continuous data; they can take decimal values. The total number of students in a class is discrete; it can never be a decimal value.
The Pandas Plot function (Pandas Visualisation):
In previous Programs, we learnt that the plot( ) function of the pyplot module of matplotlib can be used to plot a chart.
However, starting from version 0.17.0, the Pandas objects Series and DataFrame come equipped with their own .plot() methods.
This plot() method is just a simple wrapper around the plot() function of pyplot.
Thus, if we have a Series or DataFrame type object (let's say 's' or 'df') we can call the plot method by writing:
s.plot() or df.plot()
The plot( ) method of Pandas accepts a considerable number of arguments that can be used to plot a variety of graphs.
It allows customising different plot types by supplying the kind keyword arguments.
The general syntax is: df.plot(kind=<string>), where kind accepts a string indicating the type of plot, as listed in the following table.
In addition, we can use the matplotlib.pyplot methods and functions along with the plot() method of Pandas objects.
kind = | Plot type |
line | Line plot (default) |
bar | Vertical bar plot |
barh | Horizontal bar plot |
hist | Histogram |
Others (Following are not in syllabus) | |
box | Boxplot |
area | Area plot |
pie | Pie plot |
scatter | Scatter plot |
Arguments accepted by kind for different plots
We will learn to use the plot() function to create various types of charts, depending on the type of data stored in DataFrames.
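A minimal sketch of the kind argument, assuming a small DataFrame of marks for three students in two subjects. It also shows that pyplot functions can be used on the chart drawn by the Pandas plot() method.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample marks of three students in two subjects
df = pd.DataFrame({'Maths': [65, 70, 55], 'IP': [95, 100, 90]},
                  index=['Rajesh', 'Naveen', 'Sunitha'])

ax = df.plot(kind='bar')        # returns the Matplotlib axes it drew on
ax.set_title("Marks by student")
plt.ylabel("Marks")             # pyplot functions act on the same axes

print(len(ax.patches))          # one bar per value in the DataFrame
```

Replacing 'bar' with 'barh', 'line' or 'hist' gives the other plot types from the table.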
Ex: ctno=['CT1','CT2','CT3','CT4']
marks=[15,30,22,35]
pl.plot(ctno,marks,'b',linewidth=10,marker='s',
markersize=20,markeredgecolor='r')
Note: (1) We can combine the marker type with the colour code, e.g., 'r+' sets the colour to red and the marker type to plus ('+'); 'b3' sets the colour to blue and the marker type to tri_left.
(2) When markeredgecolor is not specified separately in plot(), the marker takes the same colour as the line.
(3) If the line style is not specified along with a colour-marker combination (e.g., 'r+'), only the markers are plotted and not the line. To get the line as well, also specify the linestyle argument. Ex: pl.plot(ctno,marks,'rd') plots red diamond markers without a line.
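Note (3) can be checked with a short sketch. It plots the same sample marks twice with the 'rd' format string, once without and once with a separate linestyle argument, and reads back the line style Matplotlib stored for each.

```python
import matplotlib.pyplot as plt

ctno = ['CT1', 'CT2', 'CT3', 'CT4']
marks = [15, 30, 22, 35]

# 'rd' = red colour + diamond marker, no line style given
only_markers, = plt.plot(ctno, marks, 'rd')

# Same format string, with the line style supplied separately
with_line, = plt.plot(ctno, marks, 'rd', linestyle='solid')

print(only_markers.get_linestyle())   # markers only, no connecting line
print(with_line.get_linestyle())      # markers joined by a solid line
```

The line style can also be written inside the format string itself, for example 'rd-' for a solid line or 'rd--' for a dashed one.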
title: To add a title to a plot, call the function title( ).
Syntax: <matplotlib.pyplot>.title(<title string>)
Ex: pl.title("Vegetable Rates at various places")
pl.plot(<x-axis values sequence>, <y-axis values sequence>)
pl.xlabel("Label here") #To display x-axis label
pl.ylabel("Label here") #To display y-axis label
pl.show( ) # To display the chart/plot.
Setting limits for X-axis and Y-axis:
By default, PyPlot tries to find the best-fitting range for the X-axis and Y-axis depending on the data being plotted.
We can give the X and Y limits as follows:
<matplotlib.pyplot>.xlim(<xmin>,<xmax>)
<matplotlib.pyplot>.ylim(<ymin>,<ymax>)
Setting ticks for Axes:
By default, PyPlot will automatically decide which data points will have ticks on the axes, but we can also decide which data points will have tick marks on X and Y-axes.
Syntax (for X-axis): xticks(<sequence containing tick data points>,[<optional sequence containing tick labels>])
Syntax (for Y-axis): yticks(<sequence containing tick data points>,[<optional sequence containing tick labels>])
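A brief sketch of both forms of the tick functions: xticks( ) with positions and labels, and yticks( ) with positions only. The labels 'Q1' to 'Q4' are made up for illustration.

```python
import matplotlib.pyplot as plt

X = [0, 1, 2, 3]
Y = [5.0, 25.0, 45.0, 20.0]

plt.plot(X, Y)
plt.xticks(X, ['Q1', 'Q2', 'Q3', 'Q4'])   # tick positions, then their labels
plt.yticks([0, 25, 50])                    # positions only; the numbers become the labels

labels = [t.get_text() for t in plt.gca().get_xticklabels()]
print(labels)
```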
Adding Legends:
A legend is a colour or mark linked to a specific data range plotted. When we plot multiple ranges on a single plot, it becomes necessary to specify legends.
To add a legend, give each range a label (label=<string> in plot( ) or bar( )) and then call:
<matplotlib.pyplot>.legend(loc=<position number or string>)
Position Numbers – 1. upper right, 2. upper left, 3. lower left, 4. lower right.
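The sketch below shows the two steps together: each line gets a label, and legend( ) then places the legend box. The position is given here as a string; the position number 2 would give the same result.

```python
import matplotlib.pyplot as plt

Veg = ["Brinjal", "Onion", "Potato", "Chilly"]

plt.figure()
plt.plot(Veg, [35, 25, 50, 60], label='RB')   # label= names each line
plt.plot(Veg, [50, 35, 40, 80], label='SM')

legend = plt.legend(loc='upper left')          # same as loc=2
print([t.get_text() for t in legend.get_texts()])
```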
Saving a Figure:
The savefig( ) function is used to save a plot created using pyplot functions for later use or for keeping records.
Syntax: <matplotlib.pyplot>.savefig(<string with filename and path>)
We can save figures in formats like .pdf, .png, .eps, etc.
Ex: pl.savefig("myfile.pdf") #stores the plot in the current directory
pl.savefig("D:\\data\\myfile.pdf") #stores the pdf file in the data folder of the D drive
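A small sketch of saving the same figure in two formats. A temporary folder is used here as a stand-in for a real path, so the file names are assumptions made only for illustration.

```python
import os
import tempfile
import matplotlib.pyplot as plt

plt.figure()
plt.plot(['CT1', 'CT2', 'CT3', 'CT4'], [15, 30, 22, 35])
plt.title("Class Test Marks")

# A temporary folder stands in for a real folder on the disk
folder = tempfile.mkdtemp()
png_path = os.path.join(folder, "marks.png")
pdf_path = os.path.join(folder, "marks.pdf")

plt.savefig(png_path)   # the file extension decides the format
plt.savefig(pdf_path)

print(os.path.exists(png_path), os.path.exists(pdf_path))
```

When show( ) is also used, it is safer to call savefig( ) first, because the figure may no longer be available once its window is closed.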
1. LINE CHART
A line chart or line graph is a type of chart which displays information as a series of data points called 'markers' connected by straight line segments.
A line plot is a graph that shows the frequency of data along a number line. It is used to show a continuous dataset.
A line plot is used to visualise growth or decline in data over a time interval.
With PyPlot, a line chart is created using the plot( ) function.
Program to display 4 CT marks of a student using line chart.
import matplotlib.pyplot as pl
ctno=['CT1','CT2','CT3','CT4']
marks=[29,32,34,35]
pl.xlabel("CT Number")
pl.ylabel("CT Marks")
pl.plot(ctno,marks)
pl.show()
Specifying plot size: We can change the plot size as per our requirements.
Syntax:
<matplotlib.pyplot>.figure(figsize=(<width>,<height>))
Ex:
matplotlib.pyplot.figure(figsize=(16,8))
(or pl.figure(figsize=(16,8)))
Here the figure is 16 inches wide (along the x-axis) and 8 inches high (along the y-axis).
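As a quick check, the size given through figsize can be read back from the figure object. This is only a sketch with sample marks.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(16, 8))      # width 16 inches, height 8 inches
plt.plot(['CT1', 'CT2', 'CT3', 'CT4'], [29, 32, 34, 35])

width, height = fig.get_size_inches()
print(width, height)
```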
To show grid: pl.grid(True)
Program to display 4 CT marks of a student using line chart.
(With desired plotsize and grid)
import matplotlib.pyplot as pl
ctno=['CT1','CT2','CT3','CT4']
marks=[29,32,34,35]
pl.figure(figsize=(16,8))
pl.xlabel("CT Number")
pl.ylabel("CT Marks")
pl.plot(ctno,marks)
pl.grid(True)
pl.show()
Applying various settings in plot() function:
Changing Line Colour:
Syntax: <matplotlib.pyplot>.plot(<data1>[,<data2>],<colour code>)
Ex: pl.plot(ctno,marks,'r')
Note: 1. If we skip the colour information, Matplotlib plots multiple lines in the same plot with different colours.
2. We can also write full colour names like 'red', 'lightgreen' or use hex strings like '#008000', etc.
Changing Line Width: Ex: pl.plot(ctno,marks,linewidth=2)
Changing Line Style: linestyle (or) ls = one of 'solid', 'dashed', 'dashdot', 'dotted'
Ex: pl.plot(ctno,marks,linewidth=3,linestyle='dashed')
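The four line styles can be compared in one chart. In this sketch each line is shifted up a little (an arbitrary choice) so that the styles do not overlap, and the short codes Matplotlib stores for them are printed.

```python
import matplotlib.pyplot as plt

ctno = ['CT1', 'CT2', 'CT3', 'CT4']
marks = [29, 32, 34, 35]

plt.figure()
styles = ['solid', 'dashed', 'dashdot', 'dotted']
for shift, style in enumerate(styles):
    # Shift each line up a little so the four styles do not overlap
    shifted = [m + 2 * shift for m in marks]
    plt.plot(ctno, shifted, linewidth=3, linestyle=style, label=style)
plt.legend()

lines = plt.gca().get_lines()
print([ln.get_linestyle() for ln in lines])   # short codes for the four styles
```

The short codes '-', '--', '-.' and ':' can be passed to linestyle in place of the full names.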
Program to display 4 CT marks of a student using line chart.
(With different line width and line style)
import matplotlib.pyplot as pl
ctno=['CT1','CT2','CT3','CT4']
marks=[29,32,34,35]
pl.figure(figsize=(16,8))
pl.plot(ctno,marks,'r',linewidth=10,
linestyle='dashed')
pl.xlabel("CT Number")
pl.ylabel("CT Marks")
pl.grid(True)
pl.show()
Line styles that can be used in place of 'dashed': ls='solid', ls='dashed', linestyle='dashdot', ls='dotted'.
Note: We can use either linestyle or ls; the default line style is solid.
Demo Program:
import matplotlib.pyplot as pp
import numpy as np
X=np.arange(4) #[0,1,2,3]
Y=[5.0,25.0,45.0,20.0]
pp.xlim(-3.0,3.5)
pp.ylim(4,70)
pp.bar(X,Y)
pp.title("A sample Bar Chart")
pp.show()
Setting limits for X-axis and Y-axis:
By default, PyPlot tries to find the best-fitting range for the X-axis and Y-axis depending on the data being plotted.
We can give the X and Y limits as follows:
<matplotlib.pyplot>.xlim(<xmin>,<xmax>)
<matplotlib.pyplot>.ylim(<ymin>,<ymax>)
Note: 1. While setting the limits for the axes, keep in mind that only the data that falls within the limits of the X and Y axes is plotted; the rest of the data does not show in the plot.
2. If we swap the limits (min,max) to (max,min), the plot gets flipped, as the two programs below show.
import matplotlib.pyplot as pp
X=[0,1,2,3]
Y=[5.0,25.0,45.0,20.0]
pp.xlim(-2,4)
pp.plot(X,Y)
pp.show()
import matplotlib.pyplot as pp
X=[0,1,2,3]
Y=[5.0,25.0,45.0,20.0]
pp.xlim(4,-2)
pp.plot(X,Y)
pp.show()
Setting ticks for Axes:
By default, PyPlot automatically decides which data points have ticks on the axes, but we can also decide which data points have tick marks on the X and Y axes.
Syntax (for X-axis): xticks(<sequence containing tick data points>,[<optional sequence containing tick labels>])
Syntax (for Y-axis): yticks(<sequence containing tick data points>,[<optional sequence containing tick labels>])
import matplotlib.pyplot as pp
X=[0,1,2,3]
Y=[5.0,25.0,45.0,20.0]
pp.plot(X,Y)
pp.show()
import matplotlib.pyplot as pp
X=[0,1,2,3]
Y=[5.0,25.0,45.0,20.0]
pp.xticks([0,1,2,3])
pp.plot(X,Y)
pp.show()
import matplotlib.pyplot as pp
X=[0,1,2,3]
Y=[5.0,25.0,45.0,20.0]
pp.xticks([0.5,1,5])
pp.yticks([10,15,40])
pp.plot(X,Y)
pp.show()
USING LEGENDS
Write a program to compare rates of vegetables in Raithubazar and Sunday Market using line charts.
Vegetable | RBazar | SMarket |
Brinjal | 35 | 50 |
Onion | 25 | 35 |
Potato | 50 | 40 |
Chilly | 60 | 80 |
Program:
import matplotlib.pyplot as plt
Veg=["Brinjal","Onion","Potato","Chilly"]
RBazar=[35,25,50,60]
SMarket=[50,35,40,80]
plt.plot(Veg,RBazar,label='RB',color='r')
plt.plot(Veg,SMarket,label='SM',color='g')
plt.xlabel("Vegetable Names")
plt.ylabel("Vegetable Rates")
plt.title("Vegetable Rates Comparison")
plt.legend(loc=3)
plt.savefig('D:/rates.jpg') #saved after adding the legend so that it appears in the file
plt.show()
NCERT - EXAMPLES
Let us consider that in a city, the maximum temperature of a day is recorded for three consecutive days.
Program 4-1 demonstrates how to plot temperature values for the given dates. The output generated is a line chart.
Program 4-1 Plotting temperature against date
import matplotlib.pyplot as plt
#list storing date in string format
date=["25/12","26/12","27/12"]
#list storing temperature values
temp=[8.5,10.5,6.8]
#create a figure plotting temp versus date
plt.plot(date, temp)
#show the figure
plt.show()
In program 4-1, plot() is provided with two parameters, which indicates values for x-axis and y-axis, respectively.
The x and y ticks are displayed accordingly. As shown in Figure 4.2, the plot() function by default plots a line chart. We can click on the save button on the output window and save the plot as an image. A figure can also be saved by using the savefig() function. The name of the figure is passed to the function as a parameter.
For example: plt.savefig('x.png').
In the previous example, we used plot() function to plot a line graph. There are different types of data available for analysis. The plotting methods allow for a handful of plot types other than the default line plot, as listed in Table 4.1. (from our syllabus) Choice of plot is determined by the type of data we have.
Program 4-2 Plotting a line chart of date versus temperature by adding Label on X and Y axis, and adding a Title and Grids to the chart.
import matplotlib.pyplot as plt
date=["25/12","26/12","27/12"]
temp=[8.5,10.5,6.8]
plt.plot(date, temp)
plt.xlabel("Date") #add the Label on x-axis
plt.ylabel("Temperature") #add the Label on y-axis
plt.title("Date wise Temperature") #add the title to the chart
plt.grid(True) #add gridlines to the background
plt.yticks(temp)
plt.show()
In this example, we have used the xlabel, ylabel, title, grid and yticks functions. Compared to Figure 1, Figure 2 conveys more meaning and is easier to read. We will learn about customisation of other plots in later sections.
Let us write the Program 4-3 applying some of the customisations.
Program 4-3 Consider the average heights and weights of persons aged 8 to 16 stored in the following two lists:
height = [121.9,124.5,129.5,134.6,139.7,147.3, 152.4, 157.5,162.6]
weight= [19.7,21.3,23.5,25.9,28.5,32.1,35.7,39.6, 43.2]
Let us plot a line chart where:
i. x axis will represent weight
ii. y axis will represent height
iii. x axis label should be “Weight in kg”
iv. y axis label should be “Height in cm”
v. colour of the line should be green
vi. use * as marker
vii. Marker size should be 10
viii. The title of the chart should be “Average weight with respect to average height”.
ix. Line style should be dashed
x. Linewidth should be 2.
import matplotlib.pyplot as plt
import pandas as pd
height=[121.9,124.5,129.5,134.6,139.7,147.3,152.4, 157.5,162.6]
weight=[19.7,21.3,23.5,25.9,28.5,32.1,35.7,39.6,43.2]
df=pd.DataFrame({"height":height,"weight":weight})
plt.xlabel('Weight in kg') #Set xlabel for the plot
plt.ylabel('Height in cm') #Set ylabel for the plot
plt.title('Average weight with respect to average height') #Set chart title
#plot using marker '*', a dashed line and line colour green
plt.plot(df.weight,df.height,marker='*',markersize=10,color='green',
linewidth=2, linestyle='dashed')
plt.show()
In the above we created the DataFrame using 2 lists, and in the plot function we have passed the height and weight columns of the DataFrame.
The output is shown in following figure.
Line chart showing average weight against average height
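The same chart can also be drawn with the plot() method of the DataFrame itself. The following is a sketch of that alternative, not part of the NCERT program: x= and y= name the columns, and the remaining keyword arguments are passed on to Matplotlib.

```python
import pandas as pd
import matplotlib.pyplot as plt

height = [121.9, 124.5, 129.5, 134.6, 139.7, 147.3, 152.4, 157.5, 162.6]
weight = [19.7, 21.3, 23.5, 25.9, 28.5, 32.1, 35.7, 39.6, 43.2]
df = pd.DataFrame({"height": height, "weight": weight})

# x= and y= name the columns; the other keywords go to Matplotlib
ax = df.plot(x='weight', y='height', kind='line', color='green',
             marker='*', markersize=10, linewidth=2, linestyle='dashed',
             legend=False)
ax.set_xlabel('Weight in kg')
ax.set_ylabel('Height in cm')
ax.set_title('Average weight with respect to average height')

line = ax.get_lines()[0]
print(line.get_linestyle(), line.get_marker(), line.get_linewidth())
```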
2. BAR CHARTS
A bar graph or a bar chart is a graphical display of data using bars of different heights. A bar chart can be drawn vertically or horizontally using rectangles or bars of different heights/widths.
Each y value is plotted as a bar at the corresponding x value on the x-axis.
If multiple commands should affect a common bar chart, store all the related statements in a Python script (.py file) with the last statement being <matplotlib.pyplot>.show().
Rakesh went to Raithu Bazar to buy vegetables. Write a program to display the vegetable names and their rates per KG using a bar chart.
import matplotlib.pyplot as pp
vegetables=['Brinjal','Tamota','Onion','Beetroot','Chilly']
rates=[60,45,28,52,80]
pp.xlabel("Vegetable Names")
pp.ylabel("Vegetable Rates Per KG")
pp.bar(vegetables,rates)
pp.show()
Bharat compared rates of vegetables in Raithu Bazar and in Sunday market. He found a lot of variation in the rates of vegetables per KG. Write a program for the comparison using a bar chart.
import matplotlib.pyplot as pp
vegetables=['Brinjal','Tamota','Onion','Beetroot']
rbazarrates=[60,45,28,52]
smarketrates=[95,70,45,35]
pp.xlabel("Vegetable Names")
pp.ylabel("Raithu Bazar,Sunday Market Rates Per KG")
pp.bar(vegetables,smarketrates)
pp.bar(vegetables,rbazarrates)
pp.show()
Observe the following: both sets of bars are drawn at the same x positions, so the bars of the second bar( ) call are drawn over those of the first. Swapping the order of the two calls changes which bars stay visible:
pp.bar(vegetables,rbazarrates)
pp.bar(vegetables,smarketrates)
Changing widths of the Bars in a Bar Chart:
Default width = 0.8 units
To specify common width (using a scalar value):
<matplotlib.pyplot>.bar(<x-sequence>,
<y-sequence>,width=<float value>)
Ex: pp.bar(vegetables,rates,width=0.4)
If you specify a scalar value (a single value) for width argument, then that width is applied to all the bars of the bar chart.
To specify different widths for different bars:
<matplotlib.pyplot>.bar(<x-sequence>,
<y-sequence>,
width=<width values sequence>)
Ex:
pp.bar(vegetables,rates,
width=[0.4,0.2,0.5,0.8,0.4])
Note: The width values’ sequence in a bar( ) must have widths for all the bars, i.e., its length must match the length of data sequences being plotted, otherwise Python will report an error.
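The note above can be tried out with a short sketch. The first call gives one width per bar and reads the widths back; the second call passes a width sequence of the wrong length on purpose, to show the error being reported.

```python
import matplotlib.pyplot as plt

vegetables = ['Brinjal', 'Tamota', 'Onion', 'Beetroot', 'Chilly']
rates = [60, 45, 28, 52, 80]

plt.figure()
bars = plt.bar(vegetables, rates, width=[0.4, 0.2, 0.5, 0.8, 0.4])
widths = [bar.get_width() for bar in bars]
print(widths)

# A width sequence of the wrong length is rejected
try:
    plt.bar(vegetables, rates, width=[0.4, 0.2])
except ValueError as err:
    print("Error:", err)
```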
Changing the colors of the Bars:
By default, a bar chart draws all the bars with the same default color.
To specify common color:
<matplotlib.pyplot>.bar(<x-sequence>,
<y-sequence>,color=<color code/name>)
When we specify single color name or single color code with color argument of the bar( ) function, the specified color is applied to all the bars of the bar chart i.e., all bars of the bar chart have the same common color.
To specify different colors for different Bars:
<matplotlib.pyplot>.bar(<x-sequence>,
<y-sequence>,color=<sequence of color codes/names>)
Ex: pp.bar(vegetables,rates,width=[0.4,0.2,0.5,0.8,0.4],color=['b','red','k','g','y'])
CREATING MULTIPLE BARS CHART
CTMarks is a list having 5 subject marks for CT1 & CT2. Create a bar chart that plots these two sub lists of CTMarks in a single chart. Keep the width of each bar as 0.3
import matplotlib.pyplot as pl
import numpy as np
CTMarks=[[32,37,39,29,25],[33,38,37,30,28]]
X=np.arange(5) #it gives 0,1,2,3,4
pl.bar(X+0.00,CTMarks[0],color='b',width=0.30)
pl.bar(X+0.30,CTMarks[1],color='g',width=0.30)
pl.show()
Observe the effect of the second argument (X + offset) by changing the bar( ) calls in the program above:
Only the CT1 bars:
pl.bar(X+0.00,CTMarks[0],color='b',width=0.30)
Only the CT2 bars:
pl.bar(X+0.30,CTMarks[1],color='g',width=0.30)
Offset 0.50 (leaves a gap of 0.20 between the two bars of each pair):
pl.bar(X+0.00,CTMarks[0],color='b',width=0.30)
pl.bar(X+0.50,CTMarks[1],color='g',width=0.30)
Offset 0.10 (the green bars partly overlap the blue bars):
pl.bar(X+0.00,CTMarks[0],color='b',width=0.30)
pl.bar(X+0.10,CTMarks[1],color='g',width=0.30)
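When bars are shifted like this, the tick labels can be moved to the middle of each pair with xticks( ). The sketch below assumes five subject names, which are made up for illustration; with a bar width of 0.30, each pair is centred at X + 0.15.

```python
import numpy as np
import matplotlib.pyplot as pl

CTMarks = [[32, 37, 39, 29, 25], [33, 38, 37, 30, 28]]
subjects = ['Eng', 'Maths', 'Phy', 'Chem', 'IP']   # assumed subject names
X = np.arange(5)          # 0,1,2,3,4
width = 0.30

pl.figure()
pl.bar(X, CTMarks[0], color='b', width=width, label='CT1')
pl.bar(X + width, CTMarks[1], color='g', width=width, label='CT2')

# Each pair spans X-0.15 to X+0.45, so its centre is X + width/2
pl.xticks(X + width / 2, subjects)
pl.legend()

print(pl.gca().get_xticks())
```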
Creating a Horizontal Bar Chart:
Use barh( ) instead of bar( ).
The label that you give to x-axis in bar( ), will become y-axis label in barh( )
Ex:
import matplotlib.pyplot as pp
vegetables=['Brinjal','Tamota','Onion','Beetroot','Chilly']
rates=[60,45,28,52,80]
pp.xlabel("Vegetable Rates Per KG")
pp.ylabel("Vegetable Names")
pp.barh(vegetables,rates)
pp.show()
Rakesh went to Raithu Bazar to buy vegetables. Write a program to display the vegetable names and their rates per KG using a bar chart. Show the title also.
Title: To add a title to a plot, call the function title( ).
Syntax: <matplotlib.pyplot>.title(<title string>)
Ex: pp.title("Vegetable Rates at various places")
import matplotlib.pyplot as pp
vegetables=['Brinjal','Tamota','Onion',
'Beetroot','Chilly']
rates=[60,45,28,52,80]
pp.xlabel("Vegetable Names")
pp.ylabel("Vegetable Rates Per KG")
pp.title("Vegetable Rates at various places")
pp.bar(vegetables,rates)
pp.show()
import matplotlib.pyplot as pp
import numpy as np
X=np.arange(4) #[0,1,2,3]
Y=[5.0,25.0,45.0,20.0]
pp.bar(X,Y)
pp.title("A sample Bar Chart")
pp.show()
import matplotlib.pyplot as pp
X=[0,1,2,3]
Y=[5.0,25.0,45.0,20.0]
pp.bar(X,Y)
pp.show()
import matplotlib.pyplot as pp
X=[0,1,2,3]
Y=[5.0,25.0,45.0,20.0]
pp.xticks([0.5,1,5])
pp.yticks([10,15,40])
pp.bar(X,Y)
pp.show()
Program: "ABC" school celebrated volunteering week, where each section of class VI dedicated a day to collecting money for a charity supported by the school. Section A volunteered on Monday, B on Tuesday, and so on. There are six sections in class VI. The amounts collected by sections A to F are 5000, 4500, 6000, 3200, 5500 and 6200. Write a program to plot the collected amount vs. days using a bar chart. The ticks on the X-axis should have day names. The graph should have a proper title and axes titles.
import matplotlib.pyplot as pp
import numpy as np
amount=[5000,4500,6000,3200,5500,6200]
X=np.arange(6) #0,1,2,3,4,5
pp.title("Donations - Week Collection")
pp.bar(X,amount,color='blue',width=0.3)
pp.xticks(X,['Mon','Tue','Wed','Thu','Fri','Sat'])
pp.xlabel("Days")
pp.ylabel("Donation Amount Collected")
pp.show()
Legends Demo Program: 5 subject marks for 3 CT Exams of a student.
import matplotlib.pyplot as pl
import numpy as np
CTMarks=[[32,37,39,29,25],[33,38,37,30,28],[34,33,39,40,35]]
X=np.arange(5) #it gives 0,1,2,3,4
pl.bar(X+0.00,CTMarks[0],color='b',width=0.20,label='CT 1 Marks')
pl.bar(X+0.20,CTMarks[1],color='g',width=0.20,label='CT 2 Marks')
pl.bar(X+0.40,CTMarks[2],color='k',width=0.20,label='CT 3 Marks')
pl.legend(loc='upper right') #or 1 instead of 'upper right'
pl.title("3 CT Marks of a student")
pl.xlabel("Subjects")
pl.ylabel("Marks")
pl.show()
3. HISTOGRAMS
A histogram is a summarisation tool for discrete or continuous data.
A histogram provides a visual interpretation of numerical data by showing the number of data points that fall within a specified range of values (called bins).
It is similar to a vertical bar graph. A histogram, unlike a vertical bar graph, shows no gaps between the bars.
It gives a visual representation of the data distribution and can display a large set of data.
Histograms are column-charts, where each column represents a range of values, and the height of a column corresponds to how many values are in that range.
To make a histogram, the data is sorted into "bins" and the number of data points in each bin is counted. The height of each column in the histogram is then proportional to the number of data points its bin contains.
The df.plot(kind=’hist’) function automatically selects the size of the bins based on the spread of values in the data.
Point: Bins are the intervals into which all of the data is divided, so that it can be displayed as bars on a histogram. If we do not specify bins, the data is divided into 10 bins of equal width by default.
hist( ) function
Write a program to plot ages of 20 citizens using histogram
import matplotlib.pyplot as pl
x=[23,45,21,13,34,45,56,67,87,57,83,89,45,56,67,4,1,56,67,45]
pl.hist(x)
pl.xlabel("ages")
pl.ylabel("count")
pl.show()
pl.hist(x,ec='red') #ec means edge color
pl.hist(x,bins=5,ec='red') #5 bins of equal width
Taking bins as a sequence:
pl.hist(x,bins=[1,13,20,40,60,100],ec='red')
Intervals
[1,13) – 1,2,…,12
[13,20) – 13,14,…,19
[20,40) – 20,21,…,39
[40,60) – 40,41,…,59
[60,100] – 60,61,…,100
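The counts for these intervals can be checked directly, because hist( ) returns the counts along with the bin edges and the bars it drew. A small sketch using the same ages and bin edges:

```python
import matplotlib.pyplot as pl

x = [23, 45, 21, 13, 34, 45, 56, 67, 87, 57, 83, 89, 45, 56, 67, 4, 1, 56, 67, 45]
edges = [1, 13, 20, 40, 60, 100]

pl.figure()
# hist() returns the counts, the bin edges and the drawn bars
counts, bin_edges, bars = pl.hist(x, bins=edges, ec='red')

for low, high, n in zip(edges[:-1], edges[1:], counts):
    print(f"{low:>3} to {high:<3}: {int(n)} values")
```

The age 13 is counted in [13,20) and not in [1,13), because every interval except the last one excludes its right edge.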
pl.hist(x,"auto",ec='red')
Auto – Matplotlib chooses the number of bins on its own.
pl.hist(x,20,ec='red') #20 bins
pl.hist(x,"auto",(1,200),ec='red')
Range: the third argument gives the minimum and maximum values of x to be covered by the bins.
cumulative = True
Each bar shows the count of its own interval plus the counts of all intervals with smaller values.
cumulative = -1
Each bar shows the count of its own interval plus the counts of all intervals with bigger values.
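A short sketch comparing the plain counts with the two cumulative forms, using the ages of the 20 citizens and the bin edges from the earlier example:

```python
import matplotlib.pyplot as pl

x = [23, 45, 21, 13, 34, 45, 56, 67, 87, 57, 83, 89, 45, 56, 67, 4, 1, 56, 67, 45]
edges = [1, 13, 20, 40, 60, 100]

pl.figure()
plain, _, _ = pl.hist(x, bins=edges)                     # count in each interval
running, _, _ = pl.hist(x, bins=edges, cumulative=True)  # running total from the left
reverse, _, _ = pl.hist(x, bins=edges, cumulative=-1)    # running total from the right

print([int(n) for n in plain])
print([int(n) for n in running])
print([int(n) for n in reverse])
```

With cumulative=True the last bar equals the total number of values (20); with cumulative=-1 the first bar does.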
histtype: Type of histogram to draw
bar – default
barstacked – if there are multiple sets of data, the bars are stacked one above another
step – line plot, i.e., unfilled
stepfilled – line plot, i.e., filled
Ex: pl.hist(x,ec='red',histtype='step')
align: Horizontal alignment of the histogram bars (left, right, mid)
mid – bars are centred between the bin edges (default)
left – bars are centred on the left bin edges
right – bars are centred on the right bin edges
Ex: pl.hist(x,ec='red',histtype='bar',align='mid')
Orientation
horizontal (or) vertical.
Default value is “vertical”
pl.hist(x,ec='red',orientation='horizontal')
NCERT TEXT – EXAMPLES
Program 4-8
import pandas as pd
import matplotlib.pyplot as plt
data = {'Name':['Arnav', 'Sheela', 'Azhar', 'Bincy', 'Yash','Nazar'],'Height' : [60,61,63,65,61,60],
'Weight' : [47,89,52,58,50,47]}
df=pd.DataFrame(data)
df.plot(kind='hist')
plt.show()
Figure 4.9: A histogram as output of Program 4-8
It is also possible to set value for the bins parameter,
for example,
df.plot(kind='hist',bins=20)
df.plot(kind='hist',bins=[18,19,20,21,22])
df.plot(kind='hist',bins=range(18,25))
Customising Histogram:
Taking the same data as above, now let see how the histogram can be customised.
Let us change the edgecolor, which is the border of each hist, to green. Also, let us change the line style to ":" and line width to 2. Let us try another property called fill, which takes boolean values.
The default True means each hist will be filled with color and False means each hist will be empty.
Another property called hatch can be used to fill each hist with a pattern ( '-', '+', 'x', '\\', '*', 'o', 'O', '.').
Program 4-9
import pandas as pd
import matplotlib.pyplot as plt
data = {'Name':['Arnav', 'Sheela', 'Azhar','Bincy','Yash','Nazar'],'Height' : [60,61,63,65,61,60],
'Weight' : [47,89,52,58,50,47]}
df=pd.DataFrame(data)
df.plot(kind='hist',edgecolor='Green',linewidth=2,
linestyle=':',fill=False,hatch='o')
plt.show()
RECORD PROGRAMS – 16 to 20
16. Write a program to analyse the subject-wise performance of the students from the given school result data and plot it using a bar chart.
import matplotlib.pyplot as plt
import pandas as pd
marks = { "English" :[45,50,48],
"Maths":[65,70,55],
"Physics":[75,85,52],
"Chemistry" :[45,50,53],
"IP":[95,100,90]}
df = pd.DataFrame(marks,
index=['Rajesh','Naveen','Sunitha'])
print("************Marksheet************")
print(df)
df.plot(kind='bar')
plt.title("Students and their Marks")
plt.xlabel("Student Names")
plt.ylabel("Marks")
plt.savefig('D:/marks.pdf')
plt.show()
17. Write a program to compare rates of vegetables in Raithubazar and Sunday Market using line charts.
| RBazar | SMarket |
Brinjal | 35 | 50 |
Onion | 25 | 35 |
Potato | 50 | 40 |
Chilly | 60 | 80 |
Program:
import matplotlib.pyplot as plt
Veg=["Brinjal","Onion","Potato","Chilly"]
RBazar=[35,25,50,60]
SMarket=[50,35,40,80]
plt.plot(Veg,RBazar,label='RB',color='r')
plt.plot(Veg,SMarket,label='SM',color='g')
plt.xlabel("Vegetable Names")
plt.ylabel("Vegetable Rates")
plt.title("Vegetable Rates Comparison")
plt.legend(loc=3)
plt.savefig('D:/rates.jpg') #saved after adding the legend so that it appears in the file
plt.show()
18. Rakesh went to Vegetables shop to purchase vegetables. Write a program to display him vegetable names and its rates per KG using a bar chart (Give different colour to each bar).
Given Data: Vegetable names are Brinjal, Tamota, Onion, Beetroot, Chilly
Their Corresponding Rates are 60,45,28,52,80.
Program:
import matplotlib.pyplot as pp
vegetables=['Brinjal','Tamota','Onion','Beetroot','Chilly']
rates=[60,45,28,52,80]
pp.title ("Vegetables and their Rates")
pp.xlabel("Vegetable Names")
pp.ylabel("Vegetable Rates Per KG")
pp.bar(vegetables,rates,color=['b','red','k','g','y'])
pp.savefig('D:/veg.jpg')
pp.show()
19. Plot the following data on line chart and customize chart according to below given instructions.
Month | January | February | March | April | May |
Sales | 500 | 350 | 450 | 550 | 600 |
Write a program which includes all the following:
(a) Write a title for the chart ‘The Monthly Sales Report’
(b) Write the appropriate titles of both the axes
(c) Write code to display legends
(d) Display blue color for the line
(e) Use line style-dashed
(f) Display diamond style markers on data points.
Program:
#Importing matplotlib library
import matplotlib.pyplot as pt
months=['January','February','March','April','May']
sales=[500,350,450,550,600]
#Plotting a line graph
pt.plot(months,sales,label='Sales',color='b', linestyle='dashed',marker='D')
pt.title("The Monthly Sales Report")
pt.xlabel("Months")
pt.ylabel("Sales")
pt.legend()
pt.savefig('D:/monthlysales.jpg')
#Displaying a line chart
pt.show()
20. Plot the following ages details of 20 students using Histogram.
X=[16,15,14,18,17,16,14,16,15,18,15,16,18,14,16,15,14,16,17,14]
Program:
#Importing matplotlib library
import matplotlib.pyplot as pl
X=[16,15,14,18,17,16,14,16,15,18,15,16,18,14,16,15,14,16,17,14]
#Plotting a histogram with edge color red
pl.hist(X,ec='red')
#Displaying title of the graph
pl.title("Histogram showing age-wise count of 20 students of an Inter College")
#Displaying X axis label
pl.xlabel("Ages")
#Displaying Y axis label
pl.ylabel("Count")
pl.savefig('D:/ageshistogram.pdf')
#Displaying histogram
pl.show()