1 of 108

Matplotlib

Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility.

2 of 108

Installation of Matplotlib

pip install matplotlib

3 of 108

Import Matplotlib

  • Once Matplotlib is installed, import it in your applications by adding the import statement:

import matplotlib

4 of 108

Matplotlib: Pyplot

  • Most of the Matplotlib utilities lie under the pyplot submodule and are usually imported under the plt alias:

import matplotlib.pyplot as plt

  • Now the pyplot submodule can be referred to as plt.

5 of 108

Example: draw a line in a diagram from position (0, 0) to position (6, 250):

import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([0, 6])
ypoints = np.array([0, 250])

plt.plot(xpoints, ypoints)
plt.show()

6 of 108

Plotting graph

  • Plotting x and y points
  • The plot() function is used to draw points (markers) in a diagram.
  • By default, the plot() function draws a line from point to point.
  • The function takes parameters for specifying points in the diagram.
  • Parameter 1 is an array containing the points on the x-axis.
  • Parameter 2 is an array containing the points on the y-axis.

7 of 108

If we need to plot a line from (1, 3) to (8, 10), we have to pass two arrays, [1, 8] and [3, 10], to the plot() function:

import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 8])
ypoints = np.array([3, 10])

plt.plot(xpoints, ypoints)
plt.show()

8 of 108

Plotting Without Line

To plot only the markers, you can use the shortcut string notation parameter 'o', which means 'rings'.

import matplotlib.pyplot as plt
import numpy as np

xpt = np.array([1, 8])
ypt = np.array([3, 10])

plt.plot(xpt, ypt, 'o')
plt.show()

9 of 108

Multiple Points

  • We can plot as many points as we like; just make sure there is the same number of points on both axes.
  • Example:

Draw a line in a diagram from position (1, 3) to (2, 8), then to (6, 1), and finally to position (8, 10):

import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 2, 6, 8])
ypoints = np.array([3, 8, 1, 10])

plt.plot(xpoints, ypoints)
plt.show()

10 of 108

Default X-Points

  • If we do not specify the points on the x-axis, they will get the default values 0, 1, 2, 3 etc., depending on the length of the y-points.

  • So, if we take the same example as above, and leave out the x-points, the diagram will look like this:

11 of 108

Default X-Points

import matplotlib.pyplot as plt
import numpy as np

ypt = np.array([3, 8, 1, 10, 5, 7])

plt.plot(ypt)
plt.show()
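To see which x-values Matplotlib fills in when only y-points are given, the plotted line can be inspected directly. This is a minimal sketch; the Agg backend line and the variable name default_x are assumptions added for illustration and are not part of the slides.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

ypt = np.array([3, 8, 1, 10, 5, 7])

fig, ax = plt.subplots()
(line,) = ax.plot(ypt)  # only the y-values are passed

# x-values that Matplotlib generated: one index per y-point, starting at 0
default_x = line.get_xdata()
print(default_x)
```

The generated x-values run from 0 to len(ypt) - 1, which is why the six y-points above land on x = 0 through x = 5.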

12 of 108

Matplotlib Markers

  • You can use the keyword argument marker to emphasize each point with a specified marker:

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o')
plt.show()

13 of 108

  • plt.plot(ypoints, marker = '*')

14 of 108

15 of 108

Format Strings fmt

  • You can also use the shortcut string notation parameter to specify the marker.

  • This parameter is also called fmt, and is written with this syntax:

  • marker|line|color

16 of 108

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, 'o:r')
plt.show()
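The marker|line|color syntax can be checked by reading the properties back from the line that plot() returns. The sketch below assumes the format string 'o:r' (circle marker, dotted line, red); the Agg backend line is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

fig, ax = plt.subplots()
(line,) = ax.plot(ypoints, "o:r")  # fmt = marker 'o', line ':', color 'r'

# Read back what the format string was parsed into
marker = line.get_marker()
linestyle = line.get_linestyle()
color = line.get_color()
print(marker, linestyle, color)
```

Each of the three parts is optional, so 'o' alone sets only the marker and draws no connecting line.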

17 of 108

Line Reference

18 of 108

Color Reference

19 of 108

  • You can use the keyword argument markersize, or the shorter version ms, to set the size of the markers:

import matplotlib.pyplot as plt
import numpy as np

ypt = np.array([3, 8, 1, 10])

plt.plot(ypt, marker = 'o', ms = 20)
plt.show()

20 of 108

Marker Color

  • You can use the keyword argument markeredgecolor, or the shorter mec, to set the color of the edge of the markers. Example: set the edge color to red:

plt.plot(ypoints, marker = 'o', ms = 20, mec = 'r')

21 of 108

You can use the keyword argument markerfacecolor, or the shorter mfc, to set the color inside the edge of the markers:

plt.plot(ypoints, marker = 'o', ms = 20, mfc = 'r')

22 of 108

23 of 108

import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o', ms = 20, mec = 'r', mfc = 'r')
plt.show()

24 of 108

plt.plot(ypoints, marker = 'o', ms = 20, mec = '#4CAF50', mfc = '#4CAF50')

25 of 108

Adding text on matplotlib plot

26 of 108

  • The output plot looks very simple. Now, let's see some commands for adding text to our plot.

  • title() is used to add a title to the axes. The first, mandatory argument is the title text; the remaining arguments are optional and control its formatting.

  • Similarly, xlabel() and ylabel() are used to add labels to the x-axis and y-axis. They also take the label text as their first argument.

27 of 108

plt.plot(ypoints, marker = 'o', ms = 20, mfc = 'r')
plt.title("Simple graph")
plt.show()

28 of 108

plt.plot(ypoints, marker = 'o', ms = 20, mfc = 'r')
plt.title("Simple graph")
plt.xlabel('Location')
plt.ylabel('Number of Restaurants')
plt.show()
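The optional formatting arguments of title(), xlabel() and ylabel() can be sketched as follows. The specific choices here (loc="left", the font sizes) are illustrative assumptions, not values from the slides, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker="o", ms=20, mfc="r")

# The first argument is the text; the keyword arguments format it
plt.title("Simple graph", loc="left", fontsize=14)
plt.xlabel("Location", fontsize=12)
plt.ylabel("Number of Restaurants", fontsize=12)

ax = plt.gca()  # the axes that the calls above acted on
print(ax.get_title(loc="left"), "|", ax.get_xlabel(), "|", ax.get_ylabel())
```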

29 of 108

Types of graphs

  • Bar Graph.
  • Pie Chart.
  • Scatter Plot.
  • Histogram.
  • Line Chart and Subplots.

30 of 108

Bar chart

  • With Pyplot, you can use the bar() function to draw bar graphs:

import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

plt.bar(x, y)
plt.show()

31 of 108

Horizontal Bars

If you want the bars to be displayed horizontally instead of vertically, use the barh() function. Example:

plt.barh(x, y)

32 of 108

Bar Color

The bar() and barh() functions take the keyword argument color to set the color of the bars:

plt.bar(x, y, color = "red")

33 of 108

Bar Width

The bar() function takes the keyword argument width to set the width of the bars (for horizontal bars, barh() uses height instead):

plt.bar(x, y, width = 0.1)

34 of 108

plt.barh(x, y, height = 0.1)

35 of 108

plt.barh(x, y, height = 0.6)

36 of 108

Creating Pie Charts

  • With Pyplot, you can use the pie() function to draw pie charts:

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])

plt.pie(y)
plt.show()

37 of 108

Labels

Add labels to the pie chart with the labels parameter.

The labels parameter must be an array with one label for each wedge.

Example:

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels = mylabels)
plt.show()

38 of 108

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
myexplode = [0.2, 0, 0, 0]

plt.pie(y, labels = mylabels, explode = myexplode)
plt.show()

39 of 108

plt.pie(y, labels = mylabels, explode = myexplode, shadow = True)

40 of 108

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels = mylabels)
plt.legend()
plt.show()

41 of 108

Scatter Plots

  • With Pyplot, you can use the scatter() function to draw a scatter plot.

  • The scatter() function plots one dot for each observation.

  • It needs two arrays of the same length, one for the values of the x-axis, and one for values on the y-axis:

42 of 108

import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])

plt.scatter(x, y)
plt.show()

43 of 108

# first group of observations, drawn in blue
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y, color = 'blue')

# second group of observations, drawn in red
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y, color = 'red')

plt.show()

44 of 108

45 of 108

Color Each Dot

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array(["red","green","blue","yellow","pink","black","orange","purple","beige","brown","gray","cyan","magenta"])

plt.scatter(x, y, c=colors)

plt.show()

46 of 108

47 of 108

Size

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
sizes = np.array([20,50,100,200,500,1000,60,90,10,300,600,800,75])

plt.scatter(x, y, s=sizes)

plt.show()

48 of 108

49 of 108

Alpha

  • You can adjust the transparency of the dots with the alpha argument.

  • Just like colors, make sure the array for sizes has the same length as the arrays for the x- and y-axis:

50 of 108

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
sizes = np.array([20,50,100,200,500,1000,60,90,10,300,600,800,75])

plt.scatter(x, y, s=sizes, alpha=0.5)

plt.show()

51 of 108

52 of 108

Histogram

53 of 108

Histogram

  • A histogram is a graph showing frequency distributions.
  • It is a graph showing the number of observations within each given interval.

  • Example: Say you ask for the height of 250 people, you might end up with a histogram like this:

54 of 108

  • In Matplotlib, we use the hist() function to create histograms.

  • The hist() function uses an array of numbers to create a histogram; the array is passed to the function as an argument.

55 of 108

import matplotlib.pyplot as plt
import numpy as np

# 250 values drawn from a normal distribution with mean 170 and standard deviation 10
x = np.random.normal(170, 10, 250)

plt.hist(x)
plt.show()
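To make "the number of observations within each given interval" concrete, the values that hist() returns can be inspected. This is a sketch under stated assumptions: the seeded generator (seed 0) is added only for reproducibility, and the Agg backend line is there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # assumption: fixed seed for repeatable output
x = rng.normal(170, 10, 250)    # 250 simulated heights

# hist() returns the count per bin, the bin edges, and the drawn bars
counts, bin_edges, bars = plt.hist(x)

total = int(counts.sum())
print("bins:", len(counts), "edges:", len(bin_edges), "observations:", total)
```

With the default settings hist() uses 10 bins, so there are 11 bin edges, and the counts add up to the number of values passed in.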

56 of 108

Subplot()

  • The pyplot.subplots() method provides a way to draw multiple plots on a single figure.

  • With the subplot() function you can draw multiple plots in one figure.

57 of 108

import matplotlib.pyplot as plt
import numpy as np

# plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])

plt.subplot(1, 2, 1)
plt.plot(x, y)

# plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])

plt.subplot(1, 2, 2)
plt.plot(x, y)

plt.show()

58 of 108

59 of 108

  • import numpy as np: Imports the NumPy library and assigns it the alias 'np' for easier reference.
  • import matplotlib.pyplot as plt: Imports the pyplot module from the Matplotlib library and assigns it the alias 'plt' for easier reference.
  • x = np.array([0, 1, 2, 3]): Creates a NumPy array 'x' with values 0, 1, 2, and 3.
  • y = np.array([3, 8, 1, 10]): Creates a NumPy array 'y' with values 3, 8, 1, and 10.
  • plt.subplot(2, 3, 1): Creates a subplot grid with 2 rows and 3 columns and selects the first subplot (top-left corner). The subsequent plot commands will be applied to this subplot.
  • plt.plot(x, y): Plots the values of 'x' against the corresponding values of 'y' in the selected subplot.
  • plt.show(): Displays the plot. Without this line, the plot would not be shown.
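The statements explained above can be assembled into one short script. This is a sketch of what they do together; the grid inspection at the end and the Agg backend line are additions for illustration, and plt.show() is left out so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt

x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])

# 2 rows, 3 columns, select position 1 (top-left)
ax = plt.subplot(2, 3, 1)
plt.plot(x, y)

# Inspect the grid that the selected subplot belongs to
spec = ax.get_subplotspec()
nrows, ncols = spec.get_gridspec().get_geometry()
print(nrows, ncols, spec.is_first_row(), spec.is_first_col())
```

The remaining five positions of the grid stay empty until plt.subplot(2, 3, n) is called for them.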

60 of 108

Patches

A patch is a 2D artist with a face color and an edge color.

61 of 108

62 of 108

63 of 108

  • The matplotlib.axes.Axes.add_patch() method in the axes module of the Matplotlib library adds a Patch to the axes' patches and returns the patch.

  • Syntax: 

Axes.add_patch(self, p)

64 of 108

import matplotlib.pyplot as plt

# create the figure and a single set of axes
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)

# define three patches: a rectangle, a circle, and a polygon
pp1 = plt.Rectangle((0.2, 0.75), 0.4, 0.15)
pp2 = plt.Circle((0.7, 0.2), 0.15)
pp3 = plt.Polygon([[0.15, 0.15], [0.35, 0.4], [0.2, 0.6]])

# add the patches to the axes
ax.add_patch(pp1)
ax.add_patch(pp2)
ax.add_patch(pp3)

plt.show()

65 of 108

66 of 108

import matplotlib.patches as mpatches

  • Matplotlib is an amazing visualization library in Python for 2D plots of arrays.
  • Matplotlib is a multi-platform data visualization library built on NumPy arrays
  • The Axes Class contains most of the figure elements: Axis, Tick, Line2D, Text, Polygon, etc., and sets the coordinate system.

67 of 108

  • The following example draws an unfilled circle with a blue edge on a simple Matplotlib plot using patches:

import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

fig, ax = plt.subplots()

# circle centered at (1, 0) with radius 5, blue outline, no fill
circ = mpatches.Circle((1, 0), 5, linestyle='solid', edgecolor='b', facecolor='none')
ax.add_patch(circ)

# set the view limits and keep the circle round
ax.set_xlim(-10, 10)
ax.set_ylim(-10, 10)
ax.set_aspect('equal')

plt.show()

68 of 108

69 of 108

70 of 108

  • The same circle, this time filled with a pink face color:

import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

fig, ax = plt.subplots()

# circle centered at (1, 0) with radius 5, blue outline, pink fill
circ = mpatches.Circle((1, 0), 5, linestyle='solid', edgecolor='b', facecolor='pink')
ax.add_patch(circ)

# set the view limits and keep the circle round
ax.set_xlim(-10, 10)
ax.set_ylim(-10, 10)
ax.set_aspect('equal')

plt.show()

71 of 108

matplotlib.patches.Rectangle

import matplotlib.patches
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Rectangle(xy, width, height): xy is the anchor (lower-left) corner
rect1 = matplotlib.patches.Rectangle((-200, -100), 400, 200, color ='green')
rect2 = matplotlib.patches.Rectangle((0, 150), 300, 20, color ='pink')
rect3 = matplotlib.patches.Rectangle((-300, -50), 40, 200, color ='yellow')

ax.add_patch(rect1)
ax.add_patch(rect2)
ax.add_patch(rect3)

plt.xlim([-400, 400])
plt.ylim([-400, 400])

plt.show()

72 of 108

  • rect1 is a green rectangle with a width of 400 and a height of 200, with its lower-left corner at (-200, -100).
  • rect2 is a pink rectangle with a width of 300 and a height of 20, with its lower-left corner at (0, 150).
  • rect3 is a yellow rectangle with a width of 40 and a height of 200, with its lower-left corner at (-300, -50).
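Rectangle takes an anchor point (the lower-left corner when width and height are positive) rather than a center, so the center has to be derived from the anchor and the size. The sketch below shows that arithmetic; centered_rectangle is a hypothetical helper introduced here for illustration, not a Matplotlib function.

```python
import matplotlib.patches as mpatches

# The green rectangle from the example: anchor (-200, -100), width 400, height 200
rect = mpatches.Rectangle((-200, -100), 400, 200, color="green")

x0, y0 = rect.get_xy()  # anchor (lower-left) corner
center = (x0 + rect.get_width() / 2, y0 + rect.get_height() / 2)
print("anchor:", (x0, y0), "center:", center)


def centered_rectangle(cx, cy, width, height, **kwargs):
    """Hypothetical helper: build a Rectangle whose center is (cx, cy)."""
    return mpatches.Rectangle((cx - width / 2, cy - height / 2), width, height, **kwargs)


# A 300 x 20 rectangle centered at (0, 150) is anchored at (-150, 140)
pink = centered_rectangle(0, 150, 300, 20, color="pink")
print("anchor of centered rectangle:", pink.get_xy())
```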

73 of 108

Seaborn

74 of 108

Install Seaborn

pip install seaborn

Import Seaborn

import seaborn as sns

75 of 108

  • Seaborn is one of the most popular statistical visualization libraries in Python.

  • Seaborn is a library for making statistical graphics in Python.

  • It builds on top of matplotlib and integrates closely with pandas data structures.

  • Seaborn helps you explore and understand your data.

76 of 108

  • It provides a high-level interface for creating attractive and informative statistical graphics.

  • Seaborn comes with several built-in themes and color palettes to make it easy to create visually appealing plots.

  • It is particularly useful for visualizing complex datasets with multiple variables.

77 of 108

Some key features of Seaborn include:

  1. Statistical Plots: Seaborn includes functions to create various statistical plots such as scatter plots, bar plots, box plots, violin plots, pair plots, and more. These functions often make it easier to generate informative visualizations with less code compared to Matplotlib.
  2. Built-in Themes and Color Palettes: Seaborn comes with built-in themes that improve the aesthetics of plots. It also provides a variety of color palettes to make it easy to choose visually pleasing color schemes for your visualizations.
  3. Integration with Pandas DataFrames: Seaborn is designed to work with Pandas DataFrames, making it convenient to visualize datasets directly without extensive data manipulation.
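The built-in themes, the color palettes and the DataFrame integration listed above can be sketched together in a few lines. The style name "whitegrid", the palette name "pastel" and the small DataFrame are illustrative assumptions; the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Feature 2: pick a built-in theme and color palette for all following plots
sns.set_theme(style="whitegrid", palette="pastel")
palette = sns.color_palette("pastel")  # list of RGB tuples

# Feature 3: plot straight from DataFrame columns by name
data = pd.DataFrame({"X": [1, 2, 3, 4, 5], "Y": [2, 4, 1, 3, 5]})

# Feature 1: one call produces a labeled statistical plot
ax = sns.scatterplot(x="X", y="Y", data=data)
print(len(palette), ax.get_xlabel(), ax.get_ylabel())
```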

78 of 108

import pandas as pd

import seaborn as sns

import matplotlib.pyplot as plt

# Create a sample dataset

data = pd.DataFrame({'X': [1, 2, 3, 4, 5], 'Y': [2, 4, 1, 3, 5]})

# Scatter plot using Seaborn

sns.scatterplot(x='X', y='Y', data=data)

# Show the plot

plt.show()

79 of 108

80 of 108

Box plot

import seaborn as sns

import matplotlib.pyplot as plt

# Create a sample dataset

data = sns.load_dataset("tips")

# Box plot

sns.boxplot(x='day', y='total_bill', data=data)

# Show the plot

plt.show()

81 of 108

82 of 108

Pair Plot

import seaborn as sns

import matplotlib.pyplot as plt

# Create a sample dataset

data = sns.load_dataset("iris")

# Pair plot

sns.pairplot(data, hue='species')

# Show the plot

plt.show()

83 of 108

84 of 108

Violin Plot

import seaborn as sns

import matplotlib.pyplot as plt

# Create a sample dataset

data = sns.load_dataset("tips")

# Violin plot

sns.violinplot(x='day', y='total_bill', data=data)

# Show the plot

plt.show()

85 of 108

86 of 108

Scatter Plot with Regression Line

import seaborn as sns

import matplotlib.pyplot as plt

# Create a sample dataset

data = sns.load_dataset("tips")

# Scatter plot with regression line

sns.regplot(x='total_bill', y='tip', data=data)

# Show the plot

plt.show()

87 of 108

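import matplotlib.pyplot as plt
import seaborn as sns

# distplot() is deprecated in recent Seaborn releases;
# histplot() with kde=True draws a comparable histogram with a density curve
sns.histplot([0, 1, 2, 3, 4, 5], kde=True)

plt.show()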

88 of 108

import matplotlib.pyplot as plt
import seaborn as sns

sns.set(style="dark")
fmri = sns.load_dataset("fmri")

# Plot the responses for different events and regions
sns.lineplot(x="timepoint",
             y="signal",
             hue="region",
             style="event",
             data=fmri)

plt.show()

89 of 108

  • x="timepoint": The x-axis variable is "timepoint."
  • y="signal": The y-axis variable is "signal."
  • hue="region": Different colors are used for different regions.
  • style="event": Different line styles are used for different events.
  • data=fmri: The data comes from the loaded "fmri" dataset.

90 of 108

Logistic regression

91 of 108

Logistic regression is very similar to linear regression.

When do we use logistic regression?

We use it when we have a binary outcome of interest and a number of explanatory variables.

Outcome: e.g. the presence or absence of a symptom, or the presence or absence of a disease.

92 of 108

From the equation of the logistic regression model we can:

1- Determine which explanatory variables influence the outcome, that is, which variables have the highest odds ratio (OR), or risk, of producing the outcome.

(1 = has the disease, 0 = doesn't have the disease)

93 of 108

From the equation of the logistic regression model we can:

2- Use an individual's values of the explanatory variables to evaluate how likely he or she is to have a particular outcome.

94 of 108

We start the logistic regression model by creating a binary variable to represent the outcome (the dependent variable): 1 = has the disease, 0 = doesn't have the disease.

We take the probability P that an individual falls in the highest coded category (has the disease) as the dependent variable.

We use the logit (logistic) transformation in the regression equation.

95 of 108

The logit is the natural logarithm of the odds of 'disease':

$$\operatorname{logit}(p) = \ln\left(\frac{p}{1 - p}\right)$$

The logistic regression equation:

$$\operatorname{logit}(p) = a + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_i X_i$$

X = explanatory variables.

p = estimated value of the true probability that an individual with a particular set of values for X has the disease. p corresponds to the proportion with the disease; it has an underlying binomial distribution.

b = estimated logistic regression coefficients. The exponential of a particular coefficient, for example $e^{b_1}$, is an estimate of the odds ratio.
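The logit transformation and its inverse can be written in a few lines of Python. This is a minimal sketch using only the standard library; the function names logit and inverse_logit are chosen here for illustration.

```python
import math


def logit(p):
    """Natural logarithm of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1 - p))


def inverse_logit(z):
    """Map a value on the logit scale back to a probability."""
    return 1 / (1 + math.exp(-z))


# A probability of 0.5 means odds of 1, so the logit is 0
print(logit(0.5))

# Going to the logit scale and back recovers the original probability
p = 0.2
round_trip = inverse_logit(logit(p))
print(round_trip)
```

The inverse is what turns the fitted right-hand side a + b1X1 + ... back into an estimated probability.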

96 of 108

For a particular value of X1, this odds ratio is the estimated odds of the disease relative to the estimated odds for (X1 - 1), while adjusting for all other X's in the equation.

Because the logistic regression is fitted on a log scale, the effects of the X's are multiplicative on the odds of the disease. This means that their combined effect is the product of their separate effects.

This is unlike linear regression, where the effects of the X's on the dependent variable are additive.

97 of 108

Plain English:

  1. Take the variables that are significant in the univariate analysis.
  2. Set the P value threshold for entering those variables into the model, e.g. 0.05 or 0.1.
  3. If all variables in the univariate analysis are insignificant, don't bother doing logistic regression: none of those variables is a candidate for predicting the disease.

98 of 108

Plain English:

  1. The idea behind logistic regression: we have too many variables that are significantly associated with the outcome we are looking for, and we want to know which is the stronger predictor of the disease outcome.

  • We look in the output of the statistical program for the odds ratio and CI and the significance of each variable, then adjust the model to select the best combination of explanatory variables.

99 of 108

Plain English:

Logistic regression is a mathematical model that describes the relationship between an outcome and one or more explanatory variables.

100 of 108

Example:

A study was done to test the relationship between HHV8 infection and the sexual behavior of men. The men were asked about their histories of sexually transmitted diseases (gonorrhea, syphilis, HSV2, and HIV).

The explanatory variables were the presence of each of the four infections, coded as 0 if the patient had no history or 1 if the patient had a history of that infection, and the patient's age in years.

101 of 108

Dependent outcome: HHV8 infection

Variable     Parameter estimate   P        OR      95% CI
Intercept    -2.2242              0.006
Gonorrhea     0.5093              0.243    1.664   0.71-3.91
Syphilis      1.1924              0.093    3.295   0.82-13.8
HSV2          0.7910              0.0410   2.206   1.03-4.71
HIV           1.6357              0.0067   5.133   1.57-16.73
Age           0.0062              0.76     1.006   0.97-1.05
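The OR column of the table is the exponential of the parameter estimate column, which can be checked with a short calculation. This sketch uses only the standard library and the estimates from the table; the dictionary names are chosen here for illustration.

```python
import math

# Parameter estimates (logistic regression coefficients) from the table
estimates = {
    "Gonorrhea": 0.5093,
    "Syphilis": 1.1924,
    "HSV2": 0.7910,
    "HIV": 1.6357,
    "Age": 0.0062,
}

# Odds ratio for each variable: e raised to its coefficient
odds_ratios = {name: math.exp(b) for name, b in estimates.items()}

for name, value in odds_ratios.items():
    print(f"{name:10s} OR = {value:.3f}")
```

The results agree with the tabulated odds ratios to three decimal places.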

102 of 108

Example:

Chi-square for covariates = 24.5, P = 0.002

This indicates that at least one of the covariates is significantly associated with HHV8 serostatus.

HSV2 is positively associated with HHV8 infection (P = 0.04).

HIV is positively associated with HHV8 infection (P = 0.007).

103 of 108

Those with a history of HSV2 have 2.21 times the odds of being HHV8 positive compared to those with a negative history, after adjusting for the other infections.

Those with a history of HIV have 5.1 times the odds of being HHV8 positive compared to those with a negative history, after adjusting for the other infections.

104 of 108

The multiplicative effect of the model suggests that a man who is both HSV2 and HIV seropositive is estimated to have 2.206 × 5.133 = 11.3 times the odds of HHV8 infection compared to a man negative for both, after adjusting for the other two infections.

In this example, gonorrhea had a significant chi-square, but when entered in the model it was not significant (no indication of an independent relationship between a history of gonorrhea and HHV8 seropositivity).
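The multiplicative effect follows from the log scale: adding two coefficients and then exponentiating gives the same result as multiplying the two odds ratios. A minimal sketch with the HSV2 and HIV coefficients from the example (variable names chosen here for illustration):

```python
import math

b_hsv2 = 0.7910  # coefficient for HSV2
b_hiv = 1.6357   # coefficient for HIV

or_hsv2 = math.exp(b_hsv2)
or_hiv = math.exp(b_hiv)

# Product of the separate odds ratios
combined_product = or_hsv2 * or_hiv

# Same quantity computed on the log scale: add the coefficients, then exponentiate
combined_from_sum = math.exp(b_hsv2 + b_hiv)

print(round(combined_product, 1), round(combined_from_sum, 1))
```

Both routes give a combined odds ratio of about 11.3, matching the figure quoted above.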

105 of 108

There is no significant relationship between HHV8 seropositivity and age. The odds ratio indicates that the estimated odds of HHV8 seropositivity increase by 0.6% for each additional year of age.

106 of 108

What is the probability that a 51-year-old man has HHV8 infection if he is gonorrhea positive and HSV2 positive but doesn't have the two other diseases (syphilis and HIV)?

Add up the regression coefficients: constant + gonorrhea coefficient + HSV2 coefficient + (age coefficient × age)

$$z = -2.2242 + 0.5093 + 0.7910 + (0.0062 \times 51) = -0.6077$$

107 of 108

Probability for this person, where z is the sum of the regression coefficients:

$$p = \frac{e^{z}}{1 + e^{z}}$$

$$p = \frac{e^{-0.6077}}{1 + e^{-0.6077}} = 0.35$$
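The worked example can be reproduced in Python. This is a sketch using only the standard library and the coefficients from the results table; the function name predicted_probability and the dictionary layout are assumptions made for illustration.

```python
import math

intercept = -2.2242
coefficients = {
    "gonorrhea": 0.5093,
    "syphilis": 1.1924,
    "hsv2": 0.7910,
    "hiv": 1.6357,
    "age": 0.0062,
}


def predicted_probability(values):
    """Return (z, p): the linear predictor and the estimated probability."""
    z = intercept + sum(coefficients[name] * x for name, x in values.items())
    p = math.exp(z) / (1 + math.exp(z))
    return z, p


# 51-year-old man: gonorrhea and HSV2 positive, syphilis and HIV negative
man = {"gonorrhea": 1, "syphilis": 0, "hsv2": 1, "hiv": 0, "age": 51}
z, p = predicted_probability(man)
print(round(z, 4), round(p, 2))
```

The absent infections are coded 0, so their coefficients drop out of the sum, which is why only the gonorrhea, HSV2 and age terms appear in the hand calculation.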

108 of 108

THANK YOU