1 of 49

Chapter 14: Measurement and Data Quality

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

2 of 49

Measurement

  • The assignment of numbers to represent the amount of an attribute present in an object or person, using specific rules
  • Advantages:
      • Removes subjectivity and guesswork; e.g., two people measuring a person’s weight on the same scale would likely get identical results.
      • Provides precise information; e.g., instead of describing Ahmad as “rather tall,” we can depict him as being 6 feet 2 inches tall.
      • Less vague than words; e.g., reporting an average temperature of 99.6°F leaves no ambiguity.

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

3 of 49

Levels of Measurement

  • There are four levels (classes) of measurement:
    • Nominal (assigning numbers to classify characteristics into categories)
    • Ordinal (ranking objects based on their relative standing on an attribute)
    • Interval (objects ordered on a scale that has equal distances between points on the scale)
    • Ratio (equal distances between score units; there is a rational, meaningful zero)
  • A variable’s level of measurement determines what mathematical operations can be performed in a statistical analysis, as illustrated in the sketch below.
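
A minimal, illustrative Python sketch (not from the text): the example variables and operation labels are assumptions chosen only to show how each level maps to the statistics it supports.

```python
# Illustrative mapping of measurement levels to example variables and permissible operations.
LEVELS = {
    "nominal":  {"examples": ["gender", "blood type"],
                 "permits": ["counting", "mode"]},
    "ordinal":  {"examples": ["pain intensity", "income bracket"],
                 "permits": ["ranking", "median"]},
    "interval": {"examples": ["temperature (F/C)", "calendar year"],
                 "permits": ["addition/subtraction", "mean", "SD"]},
    "ratio":    {"examples": ["weight", "pulse rate"],
                 "permits": ["all arithmetic, including meaningful ratios"]},
}

for level, info in LEVELS.items():
    print(f"{level:>8}: e.g. {', '.join(info['examples'])}; permits {', '.join(info['permits'])}")
```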

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

4 of 49

Nominal Measurement

  • The value does not imply any ordering of the cases.

  • Variables cannot be manipulated mathematically.

  • Examples include assigning numbers to gender, marital status, health status, blood type, and nursing specialty.


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

5 of 49

Example: Gender

1 = Male

2 = Female


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

6 of 49

Example: Marital Status

1 = Married

2 = Divorced

3 = Separated

4 = Widowed


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

7 of 49

Ordinal Measurement

  • When attributes can be rank-ordered

  • The researcher assigns numbers to categories and sorts variables based on their relative ranks.

  • Intervals between ranks are not equal, and the values cannot be meaningfully manipulated mathematically.

  • Examples of ordinal variables include height ranking (e.g., tallest, next tallest) and pain intensity (mild, moderate, or severe).


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

8 of 49

Ordinal

  • Order/ranking imposed on categories

  • Numbers must preserve order
    • 1 = Tallest
    • 2 = Next tallest
    • 3 = Third tallest


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

9 of 49

Example of ordinal: Income

1 = Low income

2 = Middle income

3 = Upper income


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

10 of 49

Interval Variables

  • Interval variables allow us to rank-order the items measured and to quantify and compare the sizes of the differences between them. For example, temperature, measured in degrees Fahrenheit or Celsius, constitutes an interval scale.
  • Numerical distances between values are meaningful.
  • Interval scales have equal distances between units of measure but no absolute zero.
  • There is no absolute zero point; the zero is arbitrary rather than meaningful.
  • Examples: temperature (Fahrenheit/Celsius), blood sugar, pH, calendar years


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

11 of 49

Interval Measurement (cont)

  • When the distance between attributes has meaning; for example, with temperature in Fahrenheit, the distance from 30° to 40° is the same as the distance from 70° to 80°.

  • Note that ratios do not make sense: 80 degrees is not twice as hot as 40 degrees (even though the numeric value is twice as large).


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

12 of 49

Ratio Variables

  • Ratio variables are very similar to interval variables;

  • In addition to all the properties of interval variables, they have an identifiable absolute zero point; thus, they allow statements such as “x is twice as much as y.”


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

13 of 49

Ratio Variable (cont.)

  • Continuum of values; consistent intervals.
  • There is an absolute zero point that is meaningful; the zero is not arbitrary, and there is a clear definition of zero.
  • Examples: Weight, Height, Pulse Rate, Vital Capacity, Energy and Electrical Charge, Time, Distance, and Space.


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

14 of 49

Ratio Measurement (cont.)

  • Has an absolute zero which is meaningful
  • Can construct a meaningful ratio (fraction), for example, number of clients in past six months
  • It is meaningful to say that we had twice as many clients in this six-month period as we did in the previous six months.


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

15 of 49

The Hierarchy of Levels

  • Nominal: attributes are only named; weakest level
  • Ordinal: attributes can be ordered
  • Interval: distances between attributes are meaningful
  • Ratio: there is an absolute zero

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

16 of 49

Errors of Measurement

  • Obtained Score = True score ± Error
    • Obtained score: An actual data value for a participant (e.g., anxiety scale score)
    • True score: The score that would be obtained with an infallible (perfect) measure; it can never be known because measures are not infallible.
    • Error: The error of measurement; the difference between the obtained score and the true score, caused by factors that distort the measurement (see the sketch below)
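
A tiny simulation (illustrative only, with invented numbers) of the obtained score = true score ± error idea: repeated fallible measurements scatter around an unknowable true score.

```python
import random

random.seed(1)

true_score = 50          # the (unknowable) true score on, say, an anxiety scale
# five fallible administrations: each obtained score = true score + random error
obtained = [true_score + random.gauss(0, 3) for _ in range(5)]

for i, score in enumerate(obtained, start=1):
    print(f"administration {i}: obtained = {score:5.1f}, error = {score - true_score:+.1f}")
```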

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

17 of 49

Factors That Contribute to Errors of Measurement

  1. Situational contaminants; e.g., participants’ awareness of being observed or measured, the location, and environmental factors (temperature, lighting, and time of day)
  2. Transitory personal factors (temporary personal states such as fatigue, hunger, anxiety, or mood)
  3. Response-set biases, such as social desirability, acquiescence, and extreme responses, which are potential problems in self-report measures

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

18 of 49

Factors That Contribute to Errors of Measurement (cont.)

  4. Administration variations: alterations in the methods of collecting data from one person to the next can result in score variations unrelated to variations in the target attribute, for example, if observers alter their coding categories or interviewers improvise question wording.
  5. Item sampling: errors can be introduced by the particular sample of items used in the measure. For example, a person might get 95 questions correct on one test but only 92 on another similar test.

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

19 of 49

Question

Is the following statement True or False?

  • The true score is data obtained from the actual research study.

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

20 of 49

Answer

  • False
    • The true score is the score that would be obtained with an infallible measure. The obtained score is an actual value (datum) for a participant.

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

21 of 49

Psychometric Assessments

  • A psychometric assessment is an evaluation of the quality of a measuring instrument.
  • Key criteria in a psychometric assessment:
    • Reliability
    • Validity

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

22 of 49

Reliability

  • The consistency and accuracy with which an instrument measures the target attribute
  • Reliability assessments involve computing a reliability coefficient.
    • Reliability coefficients can range from .00 to 1.00.
    • Coefficients below .70 are considered unsatisfactory.
    • Coefficients of .80 or higher are desirable.

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

23 of 49

Three Aspects of Reliability Can Be Evaluated

  • Stability
  • Internal consistency
  • Equivalence

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

24 of 49

Stability

  • The extent to which scores are similar on two separate administrations of an instrument
  • Evaluated by test–retest reliability
    • Requires participants to complete the same instrument on two occasions
    • Appropriate for relatively enduring attributes (e.g., creativity)
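
Test–retest reliability is typically estimated by correlating scores from the two administrations. A minimal sketch with hypothetical scores (statistics.correlation requires Python 3.10 or later):

```python
from statistics import correlation  # Pearson r; available in Python 3.10+

time1 = [12, 15, 9, 20, 17, 11, 14]   # hypothetical scale scores, first administration
time2 = [13, 14, 10, 19, 18, 12, 15]  # same participants, second administration

print(f"test-retest reliability r = {correlation(time1, time2):.2f}")
```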

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

25 of 49

Internal Consistency

  • The extent to which all the items on an instrument are measuring the same unitary attribute
  • Evaluated by administering instrument on one occasion
  • Appropriate for most multi-item instruments
  • The most widely used approach to assessing reliability
  • Assessed by computing coefficient alpha (Cronbach’s alpha)
  • Alphas ≥.80 are highly desirable.
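
A minimal sketch of the standard Cronbach’s alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), using invented item scores:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per item, all for the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]     # total scale score per respondent
    item_var_sum = sum(pvariance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / pvariance(totals))

# hypothetical 4-item scale answered by six respondents
items = [
    [3, 4, 2, 5, 4, 3],
    [3, 5, 2, 4, 4, 3],
    [4, 4, 1, 5, 5, 2],
    [3, 4, 2, 5, 4, 4],
]
print(f"Cronbach's alpha = {cronbach_alpha(items):.2f}")
```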

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

26 of 49

Question

When determining the reliability of a measurement tool, which value would indicate that the tool is most reliable?

  a. 0.50
  b. 0.70
  c. 0.90
  d. 1.10

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

27 of 49

Answer

c. 0.90

  • Reliability coefficients can range from 0.0 to 1.00. Coefficients of 0.80 or higher are desirable. Thus, a coefficient of 0.90 would indicate that the tool is very reliable. A value greater than 1.00 for a coefficient would be an error.

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

28 of 49

Equivalence

  • The degree of similarity between alternative forms of an instrument or between multiple raters/observers using an instrument
  • Most relevant for structured observations
  • Assessed by comparing the observations or ratings of two or more observers (interobserver/interrater reliability). The data can then be used to compute an index of equivalence or agreement, as in the sketch below.
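
A minimal sketch (hypothetical ratings) of two common indexes of equivalence: percent agreement and Cohen’s kappa, which corrects agreement for chance.

```python
from collections import Counter

rater_a = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
rater_b = ["yes", "no", "yes", "no",  "no", "yes", "no", "yes", "yes", "yes"]
n = len(rater_a)

observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n   # proportion of agreements

# chance-expected agreement, based on each rater's marginal frequencies
freq_a, freq_b = Counter(rater_a), Counter(rater_b)
expected = sum(freq_a[c] * freq_b[c] for c in set(rater_a) | set(rater_b)) / n ** 2

kappa = (observed - expected) / (1 - expected)
print(f"percent agreement = {observed:.0%}, Cohen's kappa = {kappa:.2f}")
```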

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

29 of 49

Reliability Principles

  1. Low reliability can undermine adequate testing of hypotheses.
  2. Reliability estimates vary depending on procedure used to obtain them.
  3. Reliability is lower in homogeneous samples (whose members have similar scores) than in heterogeneous samples.
  4. Reliability is lower in shorter multi-item scales than in longer ones.

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

30 of 49

Validity

  • The degree to which an instrument measures what it is supposed to measure
  • Four aspects of validity:
    1. Face validity
    2. Content validity
    3. Criterion-related validity
    4. Construct validity

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

31 of 49

Face Validity

  • Refers to whether the instrument looks as though it is an appropriate measure of the construct
  • Based on judgment; no objective criteria for assessment

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

32 of 49

Content Validity

  • The degree to which an instrument has an adequate sample of items for the construct being measured
  • Evaluated by a panel of experts, often via a quantitative index, the content validity index (CVI); see the sketch below
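
A minimal sketch of one common way a CVI is computed (item-level CVI = proportion of experts rating an item 3 or 4 on a 4-point relevance scale; scale-level CVI = average of the item CVIs). The ratings below are invented.

```python
# hypothetical relevance ratings (1 = not relevant ... 4 = highly relevant) from five experts
ratings = {
    "item 1": [4, 3, 4, 4, 3],
    "item 2": [4, 4, 3, 2, 4],
    "item 3": [3, 4, 4, 4, 4],
}

i_cvi = {item: sum(r >= 3 for r in scores) / len(scores) for item, scores in ratings.items()}
s_cvi_ave = sum(i_cvi.values()) / len(i_cvi)   # average of the item-level CVIs

print("item-level CVIs:", i_cvi)
print(f"scale-level CVI (average) = {s_cvi_ave:.2f}")
```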

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

33 of 49

Question

Is the following statement True or False?

  • Face validity of an instrument is based on judgment.

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

34 of 49

Answer

  • True
    • Face validity refers to whether the instrument looks like it is an appropriate measure of the construct. There are no objective criteria for assessment; it is based on judgment.

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

35 of 49

Criterion-Related Validity

  • The degree to which the instrument is related to an external criterion
  • Validity coefficient is calculated by analyzing the relationship between scores on the instrument and the criterion.

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

36 of 49

Criterion-Related Validity (cont.)

Two types:

    • Predictive validity: the instrument’s ability to distinguish people whose performance differs on a future criterion; e.g., a school of nursing correlating incoming students’ high school grades with their subsequent grade-point averages.
    • Concurrent validity: the instrument’s ability to distinguish individuals who differ on a present criterion; e.g., a psychological test to differentiate between patients in a mental institution who can and cannot be released could be correlated with current behavioral ratings by health care personnel.

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

37 of 49

Construct Validity

  • Concerned with these questions:
    • What is this instrument really measuring?
    • Does it adequately measure the construct of interest?

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

38 of 49

Some Methods of Assessing Construct Validity

  1. Known-groups technique
    • In this procedure, the instrument is administered to groups expected to differ on the critical attribute because of some known characteristic. For example, in validating a measure of fear of the labor experience, we could contrast the scores of primiparas and multiparas. We would expect that women who had never given birth would be more anxious than women who had done so, and so we might question the instrument’s validity if such differences did not emerge.
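
An illustrative sketch of the known-groups logic, with invented fear-of-labor scores: an independent-samples t test (here via SciPy, assumed to be installed) checks whether the groups differ in the expected direction.

```python
from scipy import stats

primiparas = [68, 72, 75, 70, 66, 74, 71]   # hypothetical scores for women with no prior births
multiparas = [58, 62, 55, 60, 64, 57, 61]   # hypothetical scores for women who have given birth

t, p = stats.ttest_ind(primiparas, multiparas)
print(f"t = {t:.2f}, p = {p:.4f}")
# a significant difference in the predicted direction supports construct validity
```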

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

39 of 49

Some Methods of Assessing Construct Validity (cont.)

  2. Testing relationships based on theoretical predictions: for example, if scores on measures A and B correlate positively, as predicted by theory, it is inferred that A and B are valid measures of the constructs X and Y.

  3. Factor analysis: a method for identifying clusters of related variables (items). Each cluster, called a factor, represents a relatively unitary attribute. A brief sketch follows.
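
A brief sketch of the idea using scikit-learn’s FactorAnalysis (assumed to be installed) on simulated data; the data-generation step is an assumption purely for illustration.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# simulate 200 respondents answering 6 items driven by two latent factors
f1, f2 = rng.normal(size=(200, 1)), rng.normal(size=(200, 1))
items = np.hstack([f1, f1, f1, f2, f2, f2]) + rng.normal(scale=0.5, size=(200, 6))

fa = FactorAnalysis(n_components=2).fit(items)
print(np.round(fa.components_, 2))  # loadings: items 1-3 should cluster on one factor, items 4-6 on the other
```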

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

40 of 49

Criteria for Assessing Screening/Diagnostic Instruments

  • Sensitivity: the instrument’s ability to identify a case correctly, that is, to screen in or diagnose a condition correctly, yielding “true positives.”
  • Specificity: the instrument’s ability to identify noncases correctly, that is, to screen out those without the condition correctly, yielding “true negatives.”
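
A minimal sketch with invented screening results cross-tabulated against a gold-standard diagnosis:

```python
# hypothetical 2 x 2 results: screening instrument vs. gold-standard diagnosis
true_pos, false_neg = 45, 5     # among the 50 people who have the condition
false_pos, true_neg = 10, 90    # among the 100 people who do not

sensitivity = true_pos / (true_pos + false_neg)   # proportion of cases correctly screened in
specificity = true_neg / (true_neg + false_pos)   # proportion of noncases correctly screened out
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```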

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

41 of 49

Criteria for Assessing Screening/Diagnostic Instruments (cont.)

  • Likelihood ratio: Summarizes the relationship between sensitivity and specificity in a single number
    • LR+: the ratio of true positives to false positives
    • LR-: the ratio of false negatives to true negatives
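
Continuing the previous sketch, the likelihood ratios follow directly from sensitivity and specificity (LR+ = sensitivity / (1 − specificity); LR− = (1 − sensitivity) / specificity):

```python
sensitivity, specificity = 0.90, 0.90   # values carried over from the previous sketch

lr_pos = sensitivity / (1 - specificity)    # LR+: true-positive rate over false-positive rate
lr_neg = (1 - sensitivity) / specificity    # LR-: false-negative rate over true-negative rate
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
```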

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

42 of 49

Guidelines for Evaluating Data Quality in Quantitative Studies

1. Is there a congruence between the research variables as conceptualized (i.e., as discussed in the introduction) and as operationalized (i.e., as described in the methods section)?

2. If operational definitions (or scoring procedures) are specified, do they clearly indicate the rules of measurement? Do the rules seem sensible? Were data collected in such a way that measurement errors were minimized?

3. Does the report offer evidence of the reliability of measures? Does the evidence come from the research sample itself, or is it based on other studies? If the latter, is it reasonable to conclude that data quality for the research sample and the reliability sample would be similar (e.g., are sample characteristics similar)?

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

43 of 49

Guidelines for Evaluating Data Quality in Quantitative Studies (cont.)

4. If reliability is reported, which estimation method was used? Was this method appropriate? Should an alternative or additional method of reliability appraisal have been used? Is the reliability sufficiently high?

5. Does the report offer evidence of the validity of the measures? Does the evidence come from the research sample itself, or is it based on other studies? If the latter, is it reasonable to believe that data quality for the research sample and the validity sample would be similar (e.g., are the sample characteristics similar)?

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

44 of 49

Guidelines for Evaluating Data Quality in Quantitative Studies (cont.)

6. If validity information is reported, which validity approach was used? Was this method appropriate? Does the validity of the instrument appear to be adequate?

7. If there is no reliability or validity information, what conclusion can you reach about the quality of the data in the study?

8. Were the research hypotheses supported? If not, might data quality play a role in the failure to confirm the hypotheses?

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

45 of 49

Exercises


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

46 of 49

Age

  • How old are you? ____ years
  • What Level of Measurement (LOM)? ____


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

47 of 49

Age

  • How old are you?
    • 25–34
    • 35–44
    • 45–54
    • 55 or older
  • What Level of Measurement (LOM)? _____


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

48 of 49

Income

  • What is your level of income?
    • 1 = under $35,000
    • 2 = $35,000–$50,000
    • 3 = $50,000–$100,000
  • LOM? ____


Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins

49 of 49

End of Presentation

Copyright © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins