1 of 16

Descriptive Statistics

Analyzing Distributions


Business Analytics

Lecture # 05

2 of 16

Topics to Be Covered

01

z-Scores

02

Empirical Rule

03

Box Plots

© 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use.

3 of 16

z-Scores

  • A z-score allows us to measure the relative location of a value in the data set.
  • More specifically, a z-score helps us determine how far a particular value is from the mean relative to the data set’s standard deviation.
  • The z-score is often called the standardized value.

4 of 16

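  • The z-score for the i-th observation is $z_i = \dfrac{x_i - \bar{x}}{s}$, where $x_i$ is the observation, $\bar{x}$ is the sample mean, and $s$ is the sample standard deviation.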

5 of 16

  • A z-score tells us how many standard deviations a value is from the mean.
  • In this example, the value 1.7 is 2 standard deviations above the mean of 1.4, so 1.7 has a z-score of 2.
  • Similarly, 1.85 has a z-score of 3, since each standard deviation here is 0.15.
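
The arithmetic in this example can be checked with a short Python sketch. The standard deviation of 0.15 is not stated on the slide; it is inferred from the example (a distance of 0.3 from the mean equals 2 standard deviations). The function name `z_score` is illustrative only.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / std_dev


# Mean taken from the example; the standard deviation is inferred, not given.
MEAN = 1.4
STD_DEV = 0.15

for value in (1.7, 1.85):
    # Rounding hides tiny floating-point noise in the division.
    print(value, round(z_score(value, MEAN, STD_DEV), 2))
```

A positive result means the value lies above the mean; a negative result would mean it lies below.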

6 of 16

  • For example, z1 = 1.2 indicates that x1 is 1.2 standard deviations greater than the sample mean.
  • Similarly, z2 = −0.5 indicates that x2 is 0.5, or 1/2, standard deviation less than the sample mean.
  • A z-score of zero indicates that the value of the observation is equal to the mean.
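
The three cases above (positive, negative, and zero z-scores) can be sketched in Python as follows. The sample values are hypothetical and chosen only to keep the arithmetic simple; the function name `z_scores` is an assumption, and the sample standard deviation (n - 1 in the denominator) is used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize each observation with the sample mean and sample standard deviation."""
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(x - x_bar) / s for x in values]


# Hypothetical sample, not data from the lecture.
sample = [10, 12, 14, 16, 18]

for x, z in zip(sample, z_scores(sample)):
    if z > 0:
        position = "above the sample mean"
    elif z < 0:
        position = "below the sample mean"
    else:
        position = "equal to the sample mean"
    print(f"x = {x}: z = {z:+.2f} ({position})")
```

In this sample the middle value equals the mean, so its z-score is zero, and the z-scores of the whole sample sum to zero.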

7 of 16

The z-scores for the class size data are computed in Table 2.13.

8 of 16

Figure 2.20: Calculating z-Scores for the Home Sales Data in Excel

9 of 16

Empirical Rule

  • When the data exhibit a symmetric, bell-shaped distribution, as shown in Figure 2.21, the empirical rule can be used to determine the percentage of data values that are within a specified number of standard deviations of the mean.

10 of 16

Empirical Rule

    • For data having a bell-shaped distribution:
      • Within 1 standard deviation:
        • Approximately 68% of the data values
      • Within 2 standard deviations:
        • Approximately 95% of the data values
      • Within 3 standard deviations:
        • Almost all the data values (about 99.7% for a normal distribution)
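
The percentages in the empirical rule can be reproduced with a minimal Python sketch, assuming the data follow an exactly normal distribution (the idealized bell shape). Real data are only approximately bell-shaped, so observed shares will differ somewhat. The function name `share_within` is illustrative.

```python
from statistics import NormalDist


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

Because the calculation uses the standard normal distribution, the result does not depend on the particular mean or standard deviation of the data.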

11 of 16

The height of adult males in the United States has a bell-shaped distribution similar to that shown in Figure 2.21, with a mean of approximately 69.5 inches and standard deviation of approximately 3 inches.

Using the empirical rule, we can draw the following conclusions.

  • Approximately 68% of adult males in the United States have heights between 69.5 - 3 = 66.5 and 69.5 + 3 = 72.5 inches.

  • Approximately 95% of adult males in the United States have heights between 69.5 - 2(3) = 63.5 and 69.5 + 2(3) = 75.5 inches.

  • Almost all adult males in the United States have heights between 69.5 - 3(3) = 60.5 and 69.5 + 3(3) = 78.5 inches.
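
The three intervals above follow directly from the mean and standard deviation given in the example. A minimal Python sketch (the function name `empirical_rule_intervals` is illustrative):

```python
def empirical_rule_intervals(mean, std_dev):
    """Return the (low, high) interval for 1, 2, and 3 standard deviations."""
    return {k: (mean - k * std_dev, mean + k * std_dev) for k in (1, 2, 3)}


# Values from the example: mean 69.5 inches, standard deviation 3 inches.
intervals = empirical_rule_intervals(69.5, 3)
labels = {1: "approximately 68%", 2: "approximately 95%", 3: "almost all"}

for k, (low, high) in intervals.items():
    print(f"{labels[k]}: {low} to {high} inches")
```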

12 of 16

Identifying Outliers

    • Outliers: Extreme values in a data set
    • They can be identified using standardized values (z-scores)
    • As a rule of thumb, any data value with a z-score less than –3 or greater than +3 is treated as an outlier
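
The z-score rule for outliers can be sketched in Python as follows. The data set is hypothetical (nineteen ordinary values and one extreme value), and the function name `find_outliers` and its `threshold` parameter are assumptions made for illustration.

```python
from statistics import mean, stdev


def find_outliers(values, threshold=3.0):
    """Return the values whose z-score is below -threshold or above +threshold."""
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation
    return [x for x in values if abs((x - x_bar) / s) > threshold]


# Hypothetical data, not taken from the lecture.
data = [12, 14, 13, 15, 11, 14, 12, 13, 15, 14,
        13, 12, 14, 13, 15, 11, 13, 14, 12, 40]

print(find_outliers(data))
```

Here only the value 40 is flagged; its z-score is a little above 4, while every other value lies within one standard deviation of the mean.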

13 of 16

Box Plots

  • A box plot is a graphical summary of the distribution of data. A box plot is developed from the quartiles for a data set. Figure 2.22 is a box plot for the home sales data.
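
Since a box plot is built from the quartiles, the numbers behind one can be computed before any drawing is done. The sketch below uses hypothetical selling prices, not the home sales data from the lecture. Quartile conventions differ between textbooks and software, so the values from Python's `statistics.quantiles` (default "exclusive" method) may not match Excel or the textbook exactly. The function name `box_plot_summary` is illustrative.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the quartiles, extremes, and interquartile range of a data set."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4)  # default "exclusive" method
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "iqr": q3 - q1,  # width of the box
    }


# Hypothetical selling prices in $1000s.
prices = [110, 125, 140, 150, 165, 180, 200, 220, 250, 300, 450]
print(box_plot_summary(prices))
```

The box spans the first to the third quartile, with a line at the median. How far the whiskers extend depends on the convention used, so the minimum and maximum are reported here only as the simplest choice.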

14 of 16

Figure 2.23: Box Plots Comparing Home Sale Prices in Different Communities

15 of 16

EXAMPLE

16 of 16

Thank You!
