Matplotlib
Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility.
Installation of Matplotlib
pip install matplotlib
Import Matplotlib
import matplotlib
Matplotlib Pyplot
import matplotlib.pyplot as plt
Example:
Draw a line in a diagram from position (0, 0) to position (6, 250):
import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([0, 6])
ypoints = np.array([0, 250])

plt.plot(xpoints, ypoints)
plt.show()
Plotting graph
To plot a line from (1, 3) to (8, 10), pass two arrays, [1, 8] and [3, 10], to the plot function.
import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 8])
ypoints = np.array([3, 10])

plt.plot(xpoints, ypoints)
plt.show()
Plotting Without Line
To plot only the markers, you can use the shortcut string notation parameter 'o', which means 'rings'.
import matplotlib.pyplot as plt
import numpy as np

xpt = np.array([1, 8])
ypt = np.array([3, 10])

plt.plot(xpt, ypt, 'o')
plt.show()
Multiple Points
Draw a line in a diagram from position (1, 3) to (2, 8), then to (6, 1), and finally to position (8, 10):
import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 2, 6, 8])
ypoints = np.array([3, 8, 1, 10])

plt.plot(xpoints, ypoints)
plt.show()
Default X-Points
If the x-points are not specified, they get the default values 0, 1, 2, 3, and so on, one for each y-point.
import matplotlib.pyplot as plt
import numpy as np

ypt = np.array([3, 8, 1, 10, 5, 7])
plt.plot(ypt)
plt.show()
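The default x-points can be checked directly. The following is a minimal sketch that inspects the line object returned by plt.plot(); the variable names line and default_x are illustrative, not part of the original example.

```python
import matplotlib.pyplot as plt
import numpy as np

ypt = np.array([3, 8, 1, 10, 5, 7])

# plot() returns a list of Line2D objects; one data set was passed, so take the first
line = plt.plot(ypt)[0]

# With no x-points given, Matplotlib uses the indices 0, 1, 2, ... of the y-array
default_x = line.get_xdata()
print(default_x)
```

Calling plt.show() afterwards displays the same figure as the example above.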
Matplotlib Markers
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o')
plt.show()
Format Strings fmt
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, 'o:r')
plt.show()
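A format string combines up to three parts in the order marker, line, color. In 'o:r', 'o' is a circle marker, ':' is a dotted line, and 'r' is red. As a rough sketch, the three parts can be read back from the line object that plt.plot() returns (the variable name line is illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

# 'o:r' = circle marker ('o'), dotted line (':'), red color ('r')
line = plt.plot(ypoints, 'o:r')[0]

print(line.get_marker())     # marker part of the format string
print(line.get_linestyle())  # line part of the format string
print(line.get_color())      # color part of the format string
```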
Line Reference
Color Reference
import matplotlib.pyplot as plt
import numpy as np
ypt = np.array([3, 8, 1, 10])
# ms (short for markersize) sets the size of the markers
plt.plot(ypt, marker = 'o', ms = 20)
plt.show()
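Marker Color
You can use the keyword argument markeredgecolor or the shorter mec to set the color of the edge of the markers:
plt.plot(ypoints, marker = 'o', ms = 20, mec = 'r')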
You can use the keyword argument markerfacecolor or the shorter mfc to set the color inside the edge of the markers:
plt.plot(ypoints, marker = 'o', ms = 20, mfc = 'r')
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o', ms = 20,
         mec = 'r', mfc = 'r')
plt.show()
plt.plot(ypoints, marker = 'o', ms = 20,
mec = '#4CAF50', mfc = '#4CAF50')
Adding text on matplotlib plot
plt.plot(ypoints, marker = 'o', ms = 20,
         mfc = 'r')
plt.title("Simple graph")
plt.show()
plt.plot(ypoints, marker = 'o',
         ms = 20, mfc = 'r')
plt.title("Simple graph")
plt.xlabel('Location')
plt.ylabel('Number of Restaurants')
plt.show()
Types of graphs
Bar chart
import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

plt.bar(x, y)
plt.show()
Horizontal Bars
If you want the bars to be displayed horizontally instead of vertically, use the barh() function. Example:
plt.barh(x, y)
Bar Color
The bar() and barh() functions take the keyword argument color to set the color of the bars:
plt.bar(x, y, color = "red")
Bar Width
The bar() function takes the keyword argument width to set the width of the bars:
plt.bar(x, y, width = 0.1)
For horizontal bars, barh() uses the keyword argument height instead of width:
plt.barh(x, y, height = 0.1)
plt.barh(x, y, height = 0.6)
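The difference between width and height can be checked on the bar objects themselves. This is a small sketch, assuming the same x and y arrays as the bar chart example; the names vbars and hbars are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

# Vertical bars: width sets the thickness, the data value sets the height
fig1, ax1 = plt.subplots()
vbars = ax1.bar(x, y, width=0.1)

# Horizontal bars: height sets the thickness, the data value sets the width
fig2, ax2 = plt.subplots()
hbars = ax2.barh(x, y, height=0.1)

print(vbars[0].get_width(), vbars[0].get_height())
print(hbars[0].get_height(), hbars[0].get_width())
```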
Creating Pie Charts
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])

plt.pie(y)
plt.show()
Labels
Add labels to the pie chart with the labels parameter.
The labels parameter must be an array with one label for each wedge.
Example:
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
plt.show()

import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
# explode offsets each wedge from the center; here only the first wedge is pulled out
myexplode = [0.2, 0, 0, 0]
plt.pie(y, labels = mylabels, explode = myexplode)
plt.show()

# shadow = True draws a shadow beneath the pie
plt.pie(y, labels = mylabels, explode = myexplode, shadow = True)

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels = mylabels)
# legend() adds a list explaining each wedge
plt.legend()
plt.show()
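By default, each wedge's share of the circle is its value divided by the sum of all values. The sketch below reads the wedge angles back from the objects that plt.pie() returns; the loop and variable names are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

# pie() returns the wedge patches and the label Text objects
wedges, texts = plt.pie(y, labels=mylabels)

# Each wedge's angle is its value's share of the total, times 360 degrees
for wedge, text in zip(wedges, texts):
    angle = wedge.theta2 - wedge.theta1
    print(text.get_text(), round(angle, 1))
```

There is one label per wedge, which is why the labels list must have the same length as the data array.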
Scatter Plots
import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])

plt.scatter(x, y)
plt.show()

# Two data sets drawn on the same plot, each with its own color
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y, color = 'blue')

x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y, color = 'red')

plt.show()
Color Each Dot
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array(["red","green","blue","yellow","pink","black","orange","purple","beige","brown","gray","cyan","magenta"])

plt.scatter(x, y, c=colors)

plt.show()
Size
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
sizes = np.array([20,50,100,200,500,1000,60,90,10,300,600,800,75])

plt.scatter(x, y, s=sizes)
plt.show()
Alpha
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
sizes = np.array([20,50,100,200,500,1000,60,90,10,300,600,800,75])

plt.scatter(x, y, s=sizes, alpha=0.5)

plt.show()
Histogram
import matplotlib.pyplot as plt
import numpy as np

# 250 random values from a normal distribution with mean 170 and standard deviation 10
x = np.random.normal(170, 10, 250)

plt.hist(x)
plt.show()
Subplot()
import matplotlib.pyplot as plt
import numpy as np
#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(1, 2, 1)
plt.plot(x,y)
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(1, 2, 2)
plt.plot(x,y)

plt.show()
Patches
A patch is a 2D artist with a face color and an edge color.
Axes.add_patch(self, p)
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
# create the figure and a single set of axes
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
# define a rectangle, a circle, and a three-point polygon
pp1 = mpatches.Rectangle((0.2, 0.75), 0.4, 0.15)
pp2 = mpatches.Circle((0.7, 0.2), 0.15)
pp3 = mpatches.Polygon([[0.15, 0.15], [0.35, 0.4], [0.2, 0.6]])
# add the patches to the axes
ax.add_patch(pp1)
ax.add_patch(pp2)
ax.add_patch(pp3)
plt.show()
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
fig, ax = plt.subplots()
# circle with a blue edge and no fill
circ = mpatches.Circle((1, 0), 5, linestyle='solid', edgecolor='b', facecolor='none')
ax.add_patch(circ)
ax.set_xlim(-10, 10)
ax.set_ylim(-10, 10)
ax.set_aspect('equal')
plt.show()
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
fig, ax = plt.subplots()
# circle with a blue edge and a pink fill
circ = mpatches.Circle((1, 0), 5, linestyle='solid', edgecolor='b', facecolor='pink')
ax.add_patch(circ)
ax.set_xlim(-10, 10)
ax.set_ylim(-10, 10)
ax.set_aspect('equal')
plt.show()
matplotlib.patches.Rectangle
import matplotlib.patches
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
# Rectangle((x, y), width, height): (x, y) is the anchor corner
rect1 = matplotlib.patches.Rectangle((-200, -100), 400, 200, color='green')
rect2 = matplotlib.patches.Rectangle((0, 150), 300, 20, color='pink')
rect3 = matplotlib.patches.Rectangle((-300, -50), 40, 200, color='yellow')
ax.add_patch(rect1)
ax.add_patch(rect2)
ax.add_patch(rect3)
plt.xlim([-400, 400])
plt.ylim([-400, 400])
plt.show()
Seaborn
Install Seaborn
pip install seaborn
Import Seaborn
import seaborn as sns
Some key features of Seaborn include:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Create a sample dataset
data = pd.DataFrame({'X': [1, 2, 3, 4, 5], 'Y': [2, 4, 1, 3, 5]})
# Scatter plot using Seaborn
sns.scatterplot(x='X', y='Y', data=data)
# Show the plot
plt.show()
Box plot
import seaborn as sns
import matplotlib.pyplot as plt
# Load a sample dataset
data = sns.load_dataset("tips")
# Box plot
sns.boxplot(x='day', y='total_bill', data=data)
# Show the plot
plt.show()
Pair Plot
import seaborn as sns
import matplotlib.pyplot as plt
# Load a sample dataset
data = sns.load_dataset("iris")
# Pair plot
sns.pairplot(data, hue='species')
# Show the plot
plt.show()
Violin Plot
import seaborn as sns
import matplotlib.pyplot as plt
# Load a sample dataset
data = sns.load_dataset("tips")
# Violin plot
sns.violinplot(x='day', y='total_bill', data=data)
# Show the plot
plt.show()
Scatter Plot with Regression Line
import seaborn as sns
import matplotlib.pyplot as plt
# Load a sample dataset
data = sns.load_dataset("tips")
# Scatter plot with regression line
sns.regplot(x='total_bill', y='tip', data=data)
# Show the plot
plt.show()
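import matplotlib.pyplot as plt
import seaborn as sns
# distplot() is deprecated in recent Seaborn releases;
# histplot() with kde=True draws a comparable histogram with a density curve
sns.histplot([0, 1, 2, 3, 4, 5], kde=True)
plt.show()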
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="dark")
fmri = sns.load_dataset("fmri")
# Plot the responses for different events and regions
sns.lineplot(x="timepoint",
             y="signal",
             hue="region",
             style="event",
             data=fmri)
plt.show()
Logistic regression
Logistic regression is very similar to linear regression.
When do we use logistic regression?
We use it when we have a binary outcome of interest and a number of explanatory variables.
Outcome: e.g. the presence or absence of a symptom, or the presence or absence of a disease
From the equation of the logistic regression model we can:
1- Determine which explanatory variables influence the outcome, that is, which variables have the highest odds ratio (OR), or risk, for producing the outcome (1 = has the disease, 0 = doesn't have the disease).
2- Use an individual's values of the explanatory variables to evaluate the probability that he or she will have a particular outcome.
We start the logistic regression model by creating a binary variable to represent the outcome (dependent variable) (1 = has the disease, 0 = doesn't have the disease).
We take the probability P that an individual is in the highest coded category (has the disease) as the dependent variable.
We use the logit (logistic) transformation in the regression equation.
The logit is the natural logarithm of the odds of 'disease':
$\operatorname{logit}(P) = \ln\left(\frac{P}{1 - P}\right)$
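The logit transformation and its inverse can be sketched in a few lines of Python. The function names logit and inverse_logit are illustrative choices, not taken from any particular library.

```python
import math

def logit(p):
    """Natural logarithm of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1 - p))

def inverse_logit(z):
    """Convert a logit value z back to a probability."""
    return math.exp(z) / (1 + math.exp(z))

# A probability of 0.5 means even odds, so its logit is 0
for p in (0.1, 0.5, 0.9):
    print(p, round(logit(p), 4))
```

Probabilities below 0.5 give negative logits and probabilities above 0.5 give positive ones, so the logit maps the interval (0, 1) onto the whole real line.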
The logistic regression equation
$\operatorname{logit}(P) = a + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_i X_i$
X = explanatory variables
a = estimated constant term (intercept)
P = estimated value of the true probability that an individual with a particular set of values for X has the disease. P corresponds to the proportion with the disease; it has an underlying binomial distribution.
b = estimated logistic regression coefficients. The exponential of a particular coefficient, for example $e^{b_1}$, is an estimate of the odds ratio.
For a particular value of X1, it is the estimated odds of the disease for (X1 + 1) relative to the estimated odds of the disease for X1, while adjusting for all other X's in the equation.
As the logistic regression is fitted on a log scale, the effects of the X's are multiplicative on the odds of the disease. This means that their combined effect is the product of their separate effects.
This is unlike linear regression, where the effects of the X's on the dependent variable are additive.
Plain English:
A mathematical model that describes the relationship between an outcome variable and one or more explanatory variables, and that can be manipulated to select the best combination of explanatory variables.
Example:
A study was done to test the relationship between HHV8 infection and the sexual behavior of men, who were asked about their histories of sexually transmitted diseases (gonorrhea, syphilis, HSV2, and HIV).
The explanatory variables were the presence of each of the four infections, coded as 0 if the patient had no history or 1 if the patient had a history of that infection, and the patient's age in years.
Dependent outcome: HHV8 infection
Variable | Parameter estimate | P | OR | 95% CI |
Intercept | -2.2242 | 0.006 | | |
Gonorrhea | 0.5093 | 0.243 | 1.664 | 0.71-3.91 |
Syphilis | 1.1924 | 0.093 | 3.295 | 0.82-13.8 |
HSV2 | 0.7910 | 0.0410 | 2.206 | 1.03-4.71 |
HIV | 1.6357 | 0.0067 | 5.133 | 1.57-16.73 |
Age | 0.0062 | 0.76 | 1.006 | 0.97-1.05 |
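The OR column can be reproduced from the parameter estimates, since each odds ratio is the exponential of its coefficient. A minimal sketch using the values from the table (the dictionary names are illustrative):

```python
import math

# Parameter estimates (logistic regression coefficients) from the table
coefficients = {
    "Gonorrhea": 0.5093,
    "Syphilis": 1.1924,
    "HSV2": 0.7910,
    "HIV": 1.6357,
    "Age": 0.0062,
}

# The odds ratio for each variable is the exponential of its coefficient
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, odds_ratio in odds_ratios.items():
    print(f"{name}: OR = {odds_ratio:.3f}")
```

The confidence intervals cannot be reproduced this way because they also require the standard errors of the coefficients, which the table does not list.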
Chi-square for covariates = 24.5, P = 0.002, indicating that at least one of the covariates is significantly associated with HHV-8 serostatus.
HSV-2 is positively associated with HHV-8 infection (P = 0.04).
HIV is positively associated with HHV-8 infection (P = 0.007).
Those with a history of HSV-2 have 2.21 times the odds of being HHV-8 positive compared to those with a negative history, after adjusting for the other infections.
Those with a history of HIV have 5.1 times the odds of being HHV-8 positive compared to those with a negative history, after adjusting for the other infections.
The multiplicative effect of the model suggests that a man who is both HSV2 and HIV seropositive is estimated to have 2.206 × 5.133 = 11.3 times the odds of HHV-8 infection compared to a man who is negative for both, after adjusting for the other two infections.
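The multiplicative effect follows from the log scale: adding two coefficients and then exponentiating gives the same result as multiplying the two odds ratios. A short sketch with the HSV2 and HIV estimates from the table (variable names are illustrative):

```python
import math

b_hsv2 = 0.7910  # coefficient for a history of HSV2
b_hiv = 1.6357   # coefficient for a history of HIV

# Adding coefficients on the log-odds scale...
combined_from_sum = math.exp(b_hsv2 + b_hiv)
# ...is the same as multiplying the separate odds ratios
combined_from_product = math.exp(b_hsv2) * math.exp(b_hiv)

print(round(combined_from_sum, 1), round(combined_from_product, 1))
```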
In this example, gonorrhea had a significant chi-square, but when entered in the model it was not significant (no indication of an independent relationship between a history of gonorrhea and HHV8 seropositivity).
There is no significant relationship between HHV8 seropositivity and age; the odds ratio indicates that the estimated odds of HHV8 seropositivity increase by 0.6% for each additional year of age.
What is the probability that a 51-year-old man has HHV8 infection if he is gonorrhea positive and HSV2 positive but does not have the two other diseases (syphilis and HIV)?
Add up the regression coefficients: z = constant + b(gonorrhea) + b(HSV2) + b(age) × age
$z = -2.2242 + 0.5093 + 0.7910 + (0.0062 \times 51) = -0.6077$
Probability for this person:
$P = \frac{e^{z}}{1 + e^{z}} = \frac{e^{-0.6077}}{1 + e^{-0.6077}} = 0.35$
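The same calculation can be written as a short Python sketch. The coefficients come from the table; the variable names are illustrative, and the syphilis and HIV terms are left out because both are coded 0 for this man.

```python
import math

intercept = -2.2242
b_gonorrhea = 0.5093
b_hsv2 = 0.7910
b_age = 0.0062

age = 51
# Gonorrhea and HSV2 are coded 1; syphilis and HIV are coded 0, so their terms drop out
z = intercept + b_gonorrhea * 1 + b_hsv2 * 1 + b_age * age

# Convert the linear predictor z back to a probability
probability = math.exp(z) / (1 + math.exp(z))

print(round(z, 4), round(probability, 2))
```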