2 of 50

Learning Targets

38-1 Describe the characteristics of an intelligence test, and distinguish between achievement and aptitude tests.

38-2 Discuss when and why intelligence tests were created, and explain how today’s tests differ from early intelligence tests.

38-3 Describe the normal curve, and explain standardization, reliability, and validity.

3 of 50

What is an intelligence test?

a method for assessing an individual’s

mental aptitudes and comparing

them with those of others, using

numerical scores

Psychologists classify intelligence tests as either achievement tests, intended to reflect what you have learned, or aptitude tests, intended to predict your ability to learn a new skill.

4 of 50

What is the difference between an achievement test and an aptitude test?

achievement test

Exams covering what you have learned in this course are achievement tests.

Examples include the AP® exam, chapter or unit tests in your courses, final exams in college, etc.

aptitude test

A college entrance exam, which seeks to predict your ability to do college work, is an aptitude test.

Examples include the SAT or ACT or career tests that help predict what future job might best fit your interests.

5 of 50

Aptitude and achievement tests.

What achievement or aptitude tests have you taken?

In your opinion, how well did these tests reflect what you’d learned or predict what you were capable of learning?

Talk with your partner.

6 of 50

1. What Would You Answer?

Which of the following is the best example of an aptitude test?

A. Atul answers questions about the rules of the road.

B. Mr. Anderson’s AP® psychology test covers the material from the current unit.

C. Sherjeel takes the ACT for college admission.

D. Jeffrey is required to translate 50 Mandarin sentences for his final exam.

E. Lucy and Meghan discuss what they might study in

college.

7 of 50

Interpret the graph.

Use your understanding of statistics to explain the data on the graph above.

8 of 50

Thinking critically.

Research indicates that there is a strong positive correlation between SAT scores and intelligence scores. Many consider the modern SAT to be more of an achievement test, measuring the rigor of courses taken in high school, the access to preparation courses, and other social factors.

9 of 50

Consider the quote below.

Plato, a pioneer of the individualist

tradition, wrote more than 2000 years ago in

The Republic that “no two persons are

born exactly alike; but each differs from the other in natural endowments, one being suited for one occupation and the other for another.”

Do you agree with Plato? Talk about it.

10 of 50

AP® Exam Tip

Become familiar with the key contributors

in intelligence testing and be able to identify

how they differ

(e.g., Galton, Binet, Terman, Wechsler, and Stern).

11 of 50

How were individual differences in mental abilities historically researched?

English scientist Francis Galton was fascinated with measuring human traits.

Galton wondered if it might be possible to measure “natural ability” and to encourage those of high ability to mate with one another.

He devised methods to measure “intellectual strengths” based on such things as reaction time, sensory acuity, muscular power, and body proportions.

12 of 50

What were the results of Galton’s research?

Galton’s quest for a simple intelligence measure failed, and the measurements he gathered did not correlate with intelligence.

Galton did, however, leave the field of psychology with statistical techniques that are still used, the phrase nature and nurture, and the belief in the inheritance of genius.

13 of 50

How did Alfred Binet contribute to the field?

French psychologist Alfred Binet was commissioned by the French government to design fair and unbiased intelligence tests to administer to French schoolchildren.

Alfred Binet

(1857-1911)

14 of 50

What was Binet’s assumption about intellectual development?

Binet and his student, Théodore Simon, began by assuming that all children follow the same

course of intellectual development but that some develop more rapidly.

A “dull” child should score much like a typical younger child, and a “bright” child like a typical older child.

Thus, their goal became measuring each child’s mental age, the level of performance typically associated with a certain chronological age.

15 of 50

What is meant by mental age?

Binet assumed the average 9-year-old has a

mental age of 9.

Children with below-average mental ages, such as

9-year-olds who perform at the level of typical

7-year-olds, would struggle

with age-appropriate schoolwork.

Although the child had a chronological age of 9,

Binet would say they have a mental age of 7.

16 of 50

How did Binet test for mental age?

To measure mental age, Binet and Simon theorized that mental aptitude, like athletic aptitude, is a general capacity that shows up in various ways.

They tested a variety of reasoning and problem-solving questions on Binet’s two daughters, and then on “bright” and “backward” Parisian schoolchildren.

Items answered correctly could then predict how well

other French children would handle their schoolwork.

17 of 50

How were Binet’s tests modified by Lewis Terman?

Stanford University professor Lewis Terman modified Binet’s tests for use as a numerical measure of inherited intelligence. Adapting some of Binet’s original items, adding others, and establishing new age norms, Terman extended the upper end of the test’s range from teenagers to “superior adults.”

Terman also gave his revision the name today’s version retains—the Stanford-Binet.

For Terman, intelligence tests revealed the intelligence with which a person was born.

18 of 50

What is the intelligence quotient (IQ) and how was it derived?

From such tests, German psychologist William Stern derived the famous term intelligence quotient, or IQ. The IQ was simply a person’s mental age divided by

chronological age and multiplied by 100 to get rid of the decimal point.

IQ was defined as the ratio of mental age (ma) to chronological age (ca) multiplied by 100

(thus, IQ = ma/ca × 100).

On contemporary intelligence tests, the average performance for a given age is assigned a score of 100.
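
The ratio formula on this slide can be sketched in a few lines of Python. The function name ratio_iq and the sample ages are illustrative assumptions, not part of Stern's or Binet's materials.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ as defined on the slide: (ma / ca) * 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# Hypothetical case: a 9-year-old performing like a typical 7-year-old.
print(round(ratio_iq(7, 9)))   # 78

# A child whose mental age matches chronological age scores 100.
print(round(ratio_iq(9, 9)))   # 100
```

Because the score is a ratio of ages, it only makes sense while mental age keeps rising with chronological age, which is one way to see why the formula fits children better than adults.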

19 of 50

What were the limits of IQ calculating?

The original IQ formula worked fairly well for children but not for adults.

Most current intelligence tests, including the

Stanford-Binet, no longer compute an IQ in this manner.

20 of 50

How did the Army utilize the intelligence tests?

With Terman’s help, the U.S. government developed new tests to evaluate both newly arriving immigrants and World War I army recruits—the world’s first mass

administration of an intelligence test.

The Army Alpha and Army Beta tests (the Beta was the version for illiterate or non-English-speaking recruits) were intended to measure verbal and numerical abilities, the ability to follow directions, and general knowledge.

To some psychologists, the results indicated the inferiority

of people not sharing their Anglo-Saxon heritage.

21 of 50

What were the problems with the early intelligence tests?

Sweeping judgments based on intelligence test scores became an embarrassment to most of those who championed testing.

Lewis Terman came to appreciate that test scores

reflected not only people’s innate mental abilities but also their education, native language, and familiarity with the culture assumed by the test.

Abuses of the early intelligence tests, such as in immigrant screening, remind us that science can be value-laden.

22 of 50

What intelligence test did David Wechsler design?

Psychologist David Wechsler created what is now the most widely used individual intelligence test, the Wechsler Adult Intelligence Scale (WAIS), together with a version

for school-age children (the Wechsler Intelligence Scale for Children [WISC]) and another

for preschool children (the WPPSI) (Evers et al., 2012).

23 of 50

What are some of the subtests of the WAIS?

Recognizing similarities
Vocabulary
Letter-number sequencing
Block design

(use four blocks to make the image shown)

24 of 50

What information does a WAIS provide?

The WAIS yields not only an overall intelligence score, as does the Stanford-Binet, but also individual scores for verbal comprehension, perceptual organization, working memory, and processing speed.

Striking differences among these individual scores can provide clues to cognitive strengths or weaknesses.

For example, a low verbal comprehension score

combined with high scores on other subtests could indicate a reading or language disability.

25 of 50

What three criteria must an intelligence test meet to be accepted?

standardized

To make scores meaningful they are compared to a pretested sample population.

reliable

The test yields consistent scores each time it is given, for example when the same people are retested.

valid

The test measures or predicts what it is supposed to.

26 of 50

What is the normal curve?

If a graph is constructed of test-takers’ scores, the scores typically form a bell-shaped pattern called the bell curve, or normal curve.

27 of 50

How is the normal curve defined?

the bell-shaped curve that describes the distribution

of many physical and psychological attributes

Most scores fall near the average, and fewer and fewer scores lie near the extremes.

28 of 50

What is a characteristic of a normal curve distribution?

Remember that in a normal distribution the mean, median, and mode are all the same and at the center.

29 of 50

What is another characteristic of the normal curve?

~68% of scores fall within 1 standard deviation of the mean

~95% of scores fall within 2 standard deviations of the mean

~99.7% of scores fall within 3 standard deviations of the mean
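
These percentages can be checked directly. The sketch below uses NormalDist from Python's standard library; the helper name share_within is an illustrative assumption.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k: float) -> float:
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```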

30 of 50

What does the test score indicate?

For both the Stanford-Binet and Wechsler scales, a score indicates whether that person’s performance fell above or below the average.

31 of 50

How is an intelligence score derived using the normal curve?

A performance higher than all but 2.5% of all scores earns an intelligence score of 130.

A performance lower than 97.5% of all scores earns an intelligence score of 70.
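
The link between a score and a percentile can be sketched as follows. The mean of 100 comes from the slides; the standard deviation of 15 is an assumption added here (the slides do not state it), and percent_below is a hypothetical helper name.

```python
from statistics import NormalDist

# Assumption: scores are scaled to mean 100 and standard deviation 15.
iq_scale = NormalDist(mu=100, sigma=15)


def percent_below(score: float) -> float:
    """Percentage of test-takers expected to score below the given score."""
    return iq_scale.cdf(score) * 100


print(f"{percent_below(130):.1f}")  # 97.7
print(f"{percent_below(70):.1f}")   # 2.3
```

Under these assumptions a score of 130 sits two standard deviations above the mean and a score of 70 two below, so each tail holds a little over 2% of scores, close to the rounded 2.5% on the slide.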

32 of 50

How do the tests remain standardized?

To keep the average score near 100, the Stanford-Binet and Wechsler scales are periodically restandardized.

The WAIS, 4th ed., was standardized on a sample who took the test during 2007, not on David Wechsler’s initial 1930s sample.

33 of 50

Thinking critically:

Why is there a need to restandardize tests?

If you compared the performance of the most

recent standardization sample with that of the 1930s sample, do you suppose you would

find rising or declining test performance?

Discuss with your partner.

34 of 50

What is the Flynn effect?

It turns out that intelligence test performance has improved.

This worldwide phenomenon is called the Flynn effect, in honor of New Zealand researcher James Flynn, who first calculated its magnitude.

The average person’s intelligence test score in 1920 was, by today’s standard, only a 76.

35 of 50

What is reliability and how is it determined?

Reliability is the extent to which a test yields consistent results and can be assessed three ways:

Split-half: scores on two halves of the test (even items vs. odd items) are compared.
Alternative form: varying versions of the test are given and results are compared.
Test-retest: the same test is readministered and results are compared.

The higher the correlation between the two scores, the higher the test’s reliability.
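
As a minimal sketch of the split-half approach, the code below correlates odd-item and even-item totals. The response data are invented for illustration, and pearson_r and split_half_correlation are hypothetical helper names.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical results for five test-takers on an eight-item test
# (1 = correct, 0 = incorrect). These are not real test data.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
]


def split_half_correlation(rows):
    """Correlate each person's odd-item total with their even-item total."""
    odd_totals = [sum(row[0::2]) for row in rows]   # items 1, 3, 5, 7
    even_totals = [sum(row[1::2]) for row in rows]  # items 2, 4, 6, 8
    return pearson_r(odd_totals, even_totals)


r = split_half_correlation(responses)
print(f"split-half r = {r:.2f}")
```

The same pearson_r helper would apply to test-retest or alternate-form comparisons: only the two lists of scores being compared change.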

36 of 50

2. What Would You Answer?

If the same test yields consistent results upon retesting, it can be said to have a high degree of

A. reliability.

B. validity.

C. content validity.

D. predictive validity.

E. normal curve.

37 of 50

What is validity?

the extent to which a test measures or predicts

what it is supposed to

For example, if your environmental science teacher spent several weeks discussing global warming trends, then gave an assessment on that subject, the test would be valid if it contained questions on global warming trends.

38 of 50

What is the difference between content validity and predictive validity?

content validity

the extent to which a test samples the behavior that is of interest

For example, the road test

for a driver’s license has content validity because it samples the tasks a driver routinely faces.

predictive validity

the success with which a test predicts the behavior it is designed to predict

For example, some academic aptitude tests can predict success in school at certain ages.

39 of 50

When can predictive validity yield little information?

Consider a correlation

between football linemen’s body weight and their success on the field.

Note how insignificant the relationship becomes when the range of weight is narrowed to 280 to 320 pounds.
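
The effect of narrowing the range can be illustrated with simulated data. Everything in this sketch is an assumption made for illustration (the weight range, the noise level, the "success" scale, and the helper name pearson_r); it is not the data behind the slide's graph.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simulated players: "success" rises with weight, plus random noise.
weights = [rng.uniform(150, 350) for _ in range(5000)]
success = [w + rng.gauss(0, 40) for w in weights]

r_full = pearson_r(weights, success)

# Keep only players in the narrow 280 to 320 pound band.
narrow = [(w, s) for w, s in zip(weights, success) if 280 <= w <= 320]
r_narrow = pearson_r([w for w, _ in narrow], [s for _, s in narrow])

print(f"full range r = {r_full:.2f}, 280-320 lb r = {r_narrow:.2f}")
```

The underlying relationship between weight and success is identical in both calculations; only the spread of weights differs, and the correlation shrinks when that spread is restricted.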

40 of 50

3. What Would You Answer?

Which of the following can be used to demonstrate

that only about 2 percent of the population scores at least two standard deviations above the mean on an intelligence test?

a. reliability test

b. aptitude test

c. predictive validity test

d. test-retest procedure

e. normal curve

41 of 50

The limits of prediction.

As the range of

data under consideration narrows,

its predictive power diminishes.

42 of 50

Think about it.

Are you working to the potential reflected in your standardized test scores?

What, other than your aptitude, is affecting your

school performance?

Write down your thoughts.

43 of 50

Learning Target 38-1 Review

Describe the characteristics of an

intelligence test, and distinguish

between achievement and aptitude tests.

An intelligence test assesses people’s mental aptitudes and compares them with those of others, using numerical scores.
Achievement tests are designed to assess what you have learned.

44 of 50

Learning Target 38-1 Review cont.

Describe the characteristics of an

intelligence test, and distinguish

between achievement and aptitude tests.

Aptitude tests are designed to predict what you can learn.
The WAIS (Wechsler Adult Intelligence Scale), an aptitude test, is the most widely used intelligence test for adults.

45 of 50

Learning Target 38-2 Review

Discuss when and why intelligence

tests were created.

In the late 1800s, Francis Galton, who believed that genius was inherited, attempted but failed to construct a simple intelligence test. His hope had been to identify those with exceptional abilities and encourage them to reproduce.

46 of 50

Learning Target 38-2 Review part II

Discuss when and why intelligence

tests were created.

In France in 1904, Alfred Binet, who tended toward an environmental explanation of intelligence differences, started the modern intelligence-testing movement by developing questions to measure children’s mental age and thus predict progress in the school system. Binet hoped his test would be used to improve children’s education rather than to limit their opportunities.

47 of 50

Learning Target 38-2 Review part III

Discuss when and why intelligence tests

were created, and explain how today’s

tests differ from early intelligence tests.

During the early twentieth century, Lewis Terman of Stanford University revised Binet’s work for use in the United States.
Terman believed intelligence is inherited, and he thought his revision of Binet’s test, the Stanford-Binet, could help guide people toward appropriate opportunities.

48 of 50

Learning Target 38-2 Review part IV

Discuss when and why intelligence tests

were created, and explain how today’s

tests differ from early intelligence tests.

During this period, intelligence tests were sometimes used to document scientists’ misguided assumptions about the innate inferiority of certain ethnic and immigrant groups.

49 of 50

Learning Target 38-3 Review

Describe the normal curve, and explain standardization, reliability, and validity.

The distribution of test scores often forms a normal (bell-shaped) curve around the central average score, with fewer and fewer scores at the extremes.
Standardization establishes a basis for meaningful score comparisons by giving a test to a representative sample of future test-takers.
Reliability is the extent to which a test yields consistent results (on two halves of the test, or when people are retested).

50 of 50

Learning Target 38-3 Review cont.

Describe the normal curve, and explain standardization, reliability, and validity.

Validity is the extent to which a test measures or predicts what it is supposed to measure.
A test has content validity if it samples the pertinent behavior (as a driving test measures driving ability).
It has predictive validity if it predicts a behavior it was designed to predict. (Aptitude tests have predictive ability if they can predict future achievements.)