Observations Gained by Watching Young People Search the Internet

Please send comments & questions to dave //at// plml \\dot\\ org.

The Origins of the Question

A lengthy discussion on the ALA Librarians Listserv prompted the question of whether searchers’ search strategies differ for content they know much about (high prior knowledge) and content they know little about (low prior knowledge).

Specifically, the following thread indicated the need for initial research in the area:

From: Debbie Abilock [e-mail removed]

Will we qualify the novice / expert distinction to be discipline or topic specific? - I'm a novice at ship navigation, more expert at earthquakes - so will my behavior change with the topic or will I transfer strategies...?

-----Original Message-----

From: Dave H Crusoe [e-mail removed]

Sent: Wednesday, June 06, 2012 8:30 AM

Subject: [INFOLIT] Re: FW: Re: Re: Search, and 'slow vs fast' exploration

Debbie (all),

I concur with your assessment that teaching novices to pause PRIOR to conducting a search isn't likely to significantly impact behavior. Nonetheless, I wonder if there are differences between novice and expert approaches to searching that weigh on this assessment.

Drawing from other literature, I'd expect a novice to exhibit an imprecise, scattershot method - "george washington" - and to subsequently visually parse the results for something that looks most appropriate (or click on the very first thing). On the other hand, I'd expect an expert to approach methodically, if briefly, with a more precise initial query (e.g., using specific phrases, URL filters, etc.) and to intentionally explore the results in more depth.

A little observation could bear this out. Maybe someone has tried this?

It would also be curious to observe whether the responsiveness of search results (e.g., the dynamic refresh, search-while-you-type) is shifting the expert pattern to be more fluid and less - as above - pre-determined and intentional.

Intrigued by the possibility that (A) searchers may exhibit different behavior based upon expert and novice patterns and (B) that expert-novice patterns may be domain-specific, we created a brief observation and evaluation instrument [link] to explore matters further.

Earlier Research

Information literacy researchers have conducted and published only a limited number of studies about how people search the web; presumably, more of this research sits in the archives of search companies and is not publicly accessible.

How the Survey was Conducted

To date, the evaluation instrument has been used with four middle-school students. [Note: this report will be amended, and will evolve, as the number of participants increases - several other subjects are being readied for interviews.]

Researchers met with students at their homes or in public spaces. We planned 60 minutes for each meeting; the instrument requires approximately 40 minutes from start to finish. Due to participants’ young ages, we required a parental consent form [link] for each child.

The instrument was delivered in accordance with general survey practices, although questions were amended in some cases as students displayed differing expertise, skill, and knowledge profiles. Open-ended ‘think-aloud’ sections were the source of most variation, owing to the subjects participants were asked to explore, the nature of the questions that arise when asking young people to ‘think aloud’, and the subsequent follow-on questions used to probe what each respondent might have been considering.

At the conclusion of each session, respondents were provided with a thank-you gift (a Harvard COOP or iTunes gift card with a $15 value).

Observations with Respect to the Research Question

To characterize the difference between novice and expert, we take guidance from Chi, Glaser and Rees (1982)[1]. Chi, Glaser and Rees describe expertise as a combination of wide and deep subject/domain knowledge (“procedural and declarative”, p. 70) and the ability to call upon, and retrieve, a broad range of “schemata” for interpreting the information one is given or required to explore. Novices, on the other hand, are thought to have shallower background knowledge about a subject and to display less focused strategies for interpreting the information they are given (pp. 70-71).

A treatment of web search expertise can also be found in Kuiper et al. (2008, p. 683, as cited in Gasser, Cortesi, Malik and Lee, 2012[2], p. 105).

In our case, we did not conduct specific measurements of students’ domain knowledge. Instead, subjects were asked to rate their familiarity with the content they selected for search and with the content provided to them for search.

And, although limited in number, the observations (A) make it clear that expert and novice patterns may differ in some important ways and (B) indicate that further, more thorough research may yield concrete and statistically significant results in the area.

Findings include:

Each respondent referenced Google as their go-to search engine, and almost always used a single search to obtain their results. Few participants spent significant time on the results page, and most selected just one result to explore. Only once, across all searches, did a participant scan to the second results page.

The result most often selected was Wikipedia, followed by Ask.com, and most respondents saw no trouble in using Wikipedia as their sole source of information[3].

On the nature of novice and expert patterns

There does seem to be an exception, visible when learners had already identified sources of expertise beyond Wikipedia and Ask.com. The difference, however, may lie not in the approach (as was suspected in the listserv discussion) but in the secondary step of identifying the right reference/information material and site.

In these cases, learners had identified corresponding “go-to” sites for learning about topics. For instance, one participant matched several “Magic: The Gathering” sites with specific questions I asked about where he might find information about Magic. It occurred to me that, through prior searching, he had developed some subject-matter expertise for understanding both what he needed to find and how he wanted to use the content on the sites.

In one other case, I was able to observe the reference-site behavior in formation. I had asked a participant to show me information about how caves are formed, and she identified a site that provided this information[4]. We cleared the screen and started a new task - to search for ‘what a cave pearl is’. Her search results presented a normal profile, and from the list, she selected the same site that she had selected earlier, despite there being other worthy candidates for the same information.

When I asked her why she had selected that result, she explained that "it was on a site that I know".

Background knowledge plays another role in the understanding of content. As one respondent mentioned, knowing something about a subject is helpful for identifying and discerning “good” from “bad” information on a page. In one of his searches (the benefits of skateboarding), he identified a page that reinforced his beliefs and understanding of some topics, but he disagreed with the quality of its information about others.

Thus, not only are sites “good” and “bad”, but it is possible that an expert searcher can draw upon background understanding to identify the “good” and “bad” within any single resource.

On the nature of search iteration

How I wish I were able to report that searchers frequently iterated their searches with keyword refinements! They did not - in fact, over the several hours of combined searching, just two searches (performed by a single participant) were iterative.

In one case, I asked the participant to find the population of a small Pacific island country. Whereas her initial search yielded a wide range of information, her second search (including the word “estimated”) yielded a page that gave her the information she needed to report a number. The participant added the modifier “estimated” because, as she stated, nobody knew the real answer, and most resources only “estimated” the population figures.

On the nature of keyword selection

Most participants merely re-typed my question into the search engine as a full sentence. One participant went as far as to explain that my question was too long, and that the search he was conducting might fail. This finding seems an evolution from research in 2007/8, in which keywords, rather than sentences, may have been the primary interface for information retrieval (Gasser, Cortesi, Malik and Lee, 2012[5], p. 43).

The use of full sentences seems to have an adverse effect on results. Answer sites optimized for specific sentences appeared - often Ask.com, About.com and Wikipedia. In cases where students used keywords more selectively, they also seemed to select more specific corresponding sites.

When asked why they changed the words used in a search, participants explained: “I use words I know the meaning of” and “I use words that will give me the exact result I'm looking for.”

So there was some mental manipulation worth exploring further. Under which conditions will a learner consider the question and the desired output, and form search-engine-esque keyword structures?

Implications for Search Education

Although the sample size was small, we can learn from the experience in the following ways:

(1) Most participants were unable to identify a search engine, or to differentiate between search engines, question/answer sites and even web browsers. Students should receive explicit instruction about what a search engine is, and about the differences between a browser, a search engine, a community answer site, an online community and a wiki.

(2) Most participants were unable to explain what they trusted, or did not trust, about sites. Few subjects correctly read the URLs associated with sites, and many were fooled by a site's <title>, mistaking the <title>’s seeming correspondence to the research topic for the true site content. Teaching students to critically evaluate the multiple components of web search results AND strategies to evaluate the trustworthiness of content ON a site seems critical, if not fundamental.

(3) Keyword abstraction may be important. Strategies for teaching students to think carefully about how to parse the research question may require more emphasis, and an explanation of why this practice is important (given the context of natural language searching).

(4) Helping students identify quality “go to” sites may aid in future related searches. Helping students understand WHY sites qualify as good “go to” sites, and working through activities that critique the quality of a reference site, may be productive. For example, both Wikipedia and Ask.com were used as “go to” sites -- critiquing the merits of these resources may be productive if alternatives are presented.

Limitations with the Current Study

As was stated, this study was limited in number and non-scientific in its execution. It was designed to present exploratory results in the area to inform and drive a discussion online.

Additionally, participant selection is likely biased as a result of where outreach and recruitment were conducted. Economically, participants likely came from a variety of backgrounds. Educationally speaking, however, each respondent had a parent with significant ties to a top-tier university school of education. All of the respondents, and their parents, cared deeply about education.

Recommendations for Future Study

These preliminary observations indicate that future research is likely to be productive in teaching search educators and researchers more about how young people find and evaluate information, about how young people apply skills learned in one search context to a new search context, and about how novices and experts differ in their approach to finding information. This is particularly important for those who want to understand the range of skills a novice must acquire to become an expert (but does not indicate how best to teach these skills).

The following are (some) approaches worth considering:

  1. An increase in the sophistication of this research model and number of participants for the current study;
  2. An exploration of whether learners apply skills utilized in one graphical context (e.g., Google) to a new graphical context (e.g., Bing). Participants mentioned that Bing was “different” from Google, and “weren’t sure” how to use it. One participant was asked explicitly to try Bing for his searches, after he had completed earlier searches using Google. He opted to “google for bing” to access Bing.com - he did not type the address into the URL bar. No noticeable difference was observed in his approach to using Bing vs. Google.
  3. Under which conditions will a learner consider the question and the desired output, and form search-engine-esque keyword structures?  Is this engine-dependent?
  4. What does a person’s anticipated results set contain? E.g., what are they expecting to find?
  5. What are criteria that children naturally use to identify a site as “good” or “bad”?

[1] http://www.public.asu.edu/~mtchi/papers/ChiGlaserRees.pdf

[2] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2005272

[3] In fact, most respondents indicated that Wikipedia and Ask.com are search engines. One respondent provided an exception to the rule ("someone can just enter Sponge Bob for anything in there"... "my teacher told me never to use it") and displayed a slightly different search profile.

[4] She did not, however, know how to explain why she thought it was factual or trustworthy, although some basic author research would have been a relatively simple task, as the author is a noted cave photographer and writer. I led her through an author research task, but no strong connections were drawn between qualifications (e.g., PhD and educational background) and content.

[5] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2005272