COSC3000:
Visualisation
Week 2 Lecture 1: Univariate data
Tuesday - 27/02/2024
Ben Roberts [b.roberts@uq.edu.au]
Univariate data
Histogram
Histogram bins
[Sturges' rule]
[default in Excel]
[Rice rule]
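The bin-count rules named on this slide can be sketched as below. The slide's own formulas did not survive in the text, so the standard textbook forms are assumed here: Sturges' rule as ceil(log2 n) + 1, the Rice rule as ceil(2 n^(1/3)), and the rule labelled "default in Excel" is assumed to be the square-root rule, ceil(sqrt n). The function names are illustrative only.

```python
import math

def sturges_bins(n):
    # Sturges' rule: k = ceil(log2(n)) + 1
    return math.ceil(math.log2(n)) + 1

def sqrt_bins(n):
    # Square-root rule (assumed to be the "default in Excel" label): k = ceil(sqrt(n))
    return math.ceil(math.sqrt(n))

def rice_bins(n):
    # Rice rule: k = ceil(2 * n^(1/3))
    return math.ceil(2 * n ** (1 / 3))

n = 500  # illustrative sample size
print(sturges_bins(n), sqrt_bins(n), rice_bins(n))
```

For the same sample size the three rules can suggest quite different bin counts, so it is worth trying more than one and looking at the resulting histograms.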
Example: height data
Source: WS Cleveland “Visualizing data”
Thoughts?
Thoughts?
Thoughts?
Thoughts?
plt.hist(data, density=True)  # density=True rescales bar heights so the total area is 1
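A minimal sketch of what `density=True` does, using `np.histogram` (which applies the same normalisation) so that the numbers can be inspected directly. The synthetic `data` array is an assumption for illustration, not the lecture's data set.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=170.0, scale=10.0, size=1000)  # synthetic, illustrative only

# Raw counts per bin, and the same bins normalised to a density
counts, edges = np.histogram(data, bins=20)
density, _ = np.histogram(data, bins=20, density=True)

# density = counts / (N * bin width), so the bar areas sum to 1
widths = np.diff(edges)
area = np.sum(density * widths)
print(area)
```

Because the heights are divided by both the sample size and the bin width, a density histogram can be compared directly with a probability density curve.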
Probability distributions
Cumulative Distribution Function (CDF)
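For a finite sample, the CDF can be estimated by the empirical CDF: sort the data and assign the i-th smallest value the cumulative fraction i/N. The sketch below assumes that convention; the helper name `ecdf` is hypothetical.

```python
import numpy as np

def ecdf(sample):
    # Sorted data on the x-axis, cumulative fraction i/N on the y-axis
    x = np.sort(np.asarray(sample, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y

x, y = ecdf([3.0, 1.0, 2.0, 2.0])
print(x, y)
```

Unlike a histogram, the empirical CDF needs no choice of bin width.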
Gaussian: Normal distribution
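As a sketch, the Gaussian probability density and its CDF can be written with the standard library alone. The formulas are the usual textbook ones; the function names and default parameters are illustrative assumptions.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    # p(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    # CDF expressed through the error function
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Probability of falling within one standard deviation of the mean
within_one_sigma = normal_cdf(1.0) - normal_cdf(-1.0)
print(within_one_sigma)
```

The last line reproduces the familiar result that roughly 68% of a Gaussian lies within one standard deviation of the mean.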
KDE - Kernel density estimation
Two random samples from same (Gaussian) distribution
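A minimal NumPy-only sketch of a Gaussian kernel density estimate, applied to two random samples drawn from the same Gaussian. The bandwidth of 0.4, the sample sizes, and the helper name `gaussian_kde_1d` are assumptions chosen for illustration; library implementations such as `scipy.stats.gaussian_kde` choose the bandwidth automatically.

```python
import numpy as np

def gaussian_kde_1d(sample, grid, bandwidth):
    # Place one Gaussian kernel on every data point and average them
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    z = (grid[:, None] - sample[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernels.sum(axis=1) / (sample.size * bandwidth)

rng = np.random.default_rng(1)
sample_a = rng.normal(size=200)  # two independent samples from the
sample_b = rng.normal(size=200)  # same standard normal distribution

grid = np.linspace(-6.0, 6.0, 601)
kde_a = gaussian_kde_1d(sample_a, grid, bandwidth=0.4)
kde_b = gaussian_kde_1d(sample_b, grid, bandwidth=0.4)

# Each estimate is a proper density: non-negative with area close to 1
area_a = np.sum(kde_a) * (grid[1] - grid[0])
print(area_a)
```

The two curves will differ in detail even though the underlying distribution is identical, which is a reminder not to over-interpret small bumps in a KDE.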
Example:
Credit: Ciaran O’Hare
Example:
Credit: Ciaran O’Hare
Don’t get carried away!
Quantiles (CDF)
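import numpy as np
# data: 1-D array of observations, assumed to be defined already
x = np.linspace(0.0, 1.0, 20)  # 20 evenly spaced probabilities from 0 to 1
q = np.quantile(data, x)       # data value at each probability (inverse of the empirical CDF)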
Comparing two univariate data sets: Q-Q plots
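A Q-Q plot compares two samples by plotting the quantiles of one against the quantiles of the other at the same probabilities. The sketch below computes those quantile pairs for two synthetic samples (an assumption for illustration) and fits a straight line through them; plotting `q_a` against `q_b` with `plt.plot(q_a, q_b, "o")` would draw the Q-Q plot itself.

```python
import numpy as np

rng = np.random.default_rng(2)
sample_a = rng.normal(loc=0.0, scale=1.0, size=300)  # synthetic, illustrative
sample_b = rng.normal(loc=1.0, scale=2.0, size=500)  # same shape, shifted and stretched

# Evaluate both samples at the same probabilities; sample sizes may differ
probs = np.linspace(0.01, 0.99, 50)
q_a = np.quantile(sample_a, probs)
q_b = np.quantile(sample_b, probs)

# Points on a straight line indicate the same distribution shape;
# the slope and intercept reflect the relative scale and shift
slope, intercept = np.polyfit(q_a, q_b, 1)
print(slope, intercept)
```

Systematic curvature away from a straight line would indicate that the two distributions differ in shape, not just in location and scale.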
Box-and-whisker plot
plt.boxplot(data, notch=True, showmeans=True)
plt.boxplot(data, notch=True, showmeans=True)
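The quantities a box plot draws can be computed by hand, as sketched below. This assumes Matplotlib's default whisker convention (`whis=1.5`): the whiskers reach the most extreme data points within 1.5 times the interquartile range of the box, and anything beyond is drawn as an outlier. The small `data` array is made up for illustration.

```python
import numpy as np

data = np.array([2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0])  # illustrative

# The box spans the first to third quartile, with a line at the median
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the box
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points inside the fences
inside = data[(data >= lower_fence) & (data <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()
outliers = data[(data < lower_fence) | (data > upper_fence)]
print(q1, median, q3, whisker_low, whisker_high, outliers)
```

Here the value 30 lies above the upper fence, so it would appear as a separate outlier marker and not stretch the whisker.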
Example:
Descriptive Statistics
Distribution measures: descriptive statistics
Descriptive statistics: central tendency
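The three usual measures of central tendency (mean, median, mode) can be compared on a small made-up data set with one extreme value. It is an assumption that these are the measures covered on the slide, and the numbers are purely illustrative.

```python
import statistics

data = [1, 2, 2, 3, 4, 100]  # illustrative; 100 is an extreme value

mean = statistics.mean(data)      # arithmetic average, pulled up by the extreme value
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)
```

The mean is dragged far above most of the data by the single extreme value, while the median and mode stay with the bulk of the sample.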
Descriptive statistics: variation
A measure of the width (spread) of the distribution:
“Sample” mean, standard deviation
sigma = np.std(sample)            # divides by N (ddof=0, NumPy's default)
sigma_c = np.std(sample, ddof=1)  # divides by N - 1 (Bessel's correction for a sample)
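The difference between the two calls is only the divisor, which the sketch below makes explicit by computing both by hand. The `sample` values are made up for illustration.

```python
import numpy as np

sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # illustrative
n = sample.size
mean = sample.mean()
sum_sq = np.sum((sample - mean) ** 2)  # sum of squared deviations from the mean

sigma = np.sqrt(sum_sq / n)          # matches np.std(sample)
sigma_c = np.sqrt(sum_sq / (n - 1))  # matches np.std(sample, ddof=1)
print(sigma, sigma_c)
```

The corrected value is always slightly larger, and the two converge as the sample size grows.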
Standard Error (in the Mean): SEM
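A minimal sketch of the standard error in the mean, assuming the usual definition SEM = s / sqrt(N), where s is the sample standard deviation. The synthetic `sample` is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
sample = rng.normal(loc=10.0, scale=2.0, size=400)  # synthetic, illustrative

s = np.std(sample, ddof=1)      # sample standard deviation
sem = s / np.sqrt(sample.size)  # uncertainty on the estimated mean
print(sample.mean(), sem)
```

The standard deviation describes the spread of individual values and stays roughly constant as more data are collected, whereas the SEM shrinks as 1/sqrt(N).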
Example:
Accuracy vs. precision
Random error
Systematic error
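The distinction can be illustrated with a small simulation, sketched here under made-up numbers: one hypothetical instrument has a constant offset but little scatter (systematic error: precise but not accurate), the other has no offset but a lot of scatter (random error: accurate but not precise).

```python
import numpy as np

rng = np.random.default_rng(4)
true_value = 50.0  # illustrative "true" quantity being measured

# Systematic error: constant offset of +2, small random scatter
precise_but_biased = true_value + 2.0 + rng.normal(scale=0.1, size=1000)
# Random error: no offset, large random scatter
accurate_but_noisy = true_value + rng.normal(scale=2.0, size=1000)

bias_a = precise_but_biased.mean() - true_value
bias_b = accurate_but_noisy.mean() - true_value
spread_a = precise_but_biased.std(ddof=1)
spread_b = accurate_but_noisy.std(ddof=1)
print(bias_a, spread_a, bias_b, spread_b)
```

Averaging more measurements reduces the effect of random error but leaves a systematic offset unchanged.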
Uncertainty, error bars, significance
Example:
Some basic plotting tips
Credit: Ciaran O’Hare
Some general tips:
Credit: Ciaran O’Hare
Credit: Ciaran O’Hare
Matplotlib cheat sheets