1 of 18

Lesson 4: Searching the web

Year 8 – Developing for the web

2 of 18

Key vocabulary

Starter activity

You probably use them every day, but what parts of a search engine can you identify?

Visited hyperlink

Search bar

Search term

Unvisited hyperlink

Child pages

Categories

4 of 18

Lesson 4: Searching the web

Objectives

In this lesson, you will:

  • Describe what a search engine is
  • Explain how search engines ‘crawl’ through the World Wide Web and index the content that they find
  • Analyse how search engines select and rank results when searches are made

5 of 18

How do search engines work?

Activity 1

Search engines use keywords to categorise the web pages that they find.

When a user wants to find a useful web page, they enter keywords, and the search engine returns hyperlinks to the matching pages so that the user can visit them.

6 of 18

Gathering information

Activity 1

Search engines use programs known as crawlers or spiders to find content on the World Wide Web.

These crawlers follow links from one web page to another, recording the common keywords that they find.

By travelling along these links, the crawlers can eventually find newly created content.
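
The idea of travelling along links can be sketched in a few lines of Python. This is only a toy model, not how a real search engine is built: the "web" here is a made-up dictionary (TINY_WEB) in which each page name maps to the pages it links to, and the page names are invented for illustration.

```python
from collections import deque

# Hypothetical mini web: each page maps to the pages it links to.
TINY_WEB = {
    "home.html": ["about.html", "news.html"],
    "about.html": ["home.html"],
    "news.html": ["new_story.html"],
    "new_story.html": [],
}

def crawl(start, web):
    """Follow links from page to page; return pages in the order visited."""
    visited = []
    to_visit = deque([start])
    while to_visit:
        page = to_visit.popleft()
        if page in visited:
            continue  # this page has already been crawled
        visited.append(page)
        for link in web.get(page, []):
            if link not in visited:
                to_visit.append(link)
    return visited

print(crawl("home.html", TINY_WEB))
```

Starting from home.html, the crawler reaches new_story.html only because news.html links to it, which is how newly created content can eventually be found.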

7 of 18

Crawling

Activity 1

8 of 18

Crawling

Activity 1

Step 1: The page's source code is searched for meta tags that explain what the page is about
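
As a rough illustration of Step 1, the sketch below uses Python's built-in html.parser module to read the meta tags from a page's source code. The sample page (SAMPLE_PAGE) and the class name MetaTagReader are invented for this example; a real crawler does far more than this.

```python
from html.parser import HTMLParser

class MetaTagReader(HTMLParser):
    """Collects <meta name="..." content="..."> tags from HTML source."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            if "name" in attrs and "content" in attrs:
                self.meta[attrs["name"]] = attrs["content"]

# Made-up page source for illustration.
SAMPLE_PAGE = """
<html><head>
<title>Ladders for sale</title>
<meta name="description" content="Buy strong, safe ladders online">
<meta name="keywords" content="ladder, step ladder, DIY">
</head><body><h1>Ladders</h1></body></html>
"""

reader = MetaTagReader()
reader.feed(SAMPLE_PAGE)
print(reader.meta)
```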

9 of 18

Crawling

Activity 1

Step 2: Important keywords are recorded (headings and words near the top of the document are flagged as more important for search results)
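
One possible way to picture Step 2 is to count every word on a page, but give words inside headings a bigger score. In the sketch below, the weight of 3 for heading words is an assumption chosen for the example, not a value any real search engine is known to use, and the bonus for words near the top of the page is left out to keep the code short.

```python
import re
from collections import Counter
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3"}
HEADING_WEIGHT = 3  # assumed value, for illustration only

class KeywordRecorder(HTMLParser):
    """Scores each word; words inside headings count for more."""

    def __init__(self):
        super().__init__()
        self.scores = Counter()
        self.in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self.in_heading = False

    def handle_data(self, data):
        weight = HEADING_WEIGHT if self.in_heading else 1
        for word in re.findall(r"[a-z]+", data.lower()):
            self.scores[word] += weight

recorder = KeywordRecorder()
recorder.feed("<h1>Ladders</h1><p>Our ladders are strong. Buy ladders today.</p>")
print(recorder.scores.most_common(1))
```

The word "ladders" scores 3 for the heading plus 1 for each of its two uses in the paragraph, so it stands out from words that appear only once in the body text.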

10 of 18

Crawling

Activity 1

Step 3: Hyperlinks are added to a queue, ready to be visited by the crawler once the search of the page is complete.
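
Step 3 could look something like the sketch below: every hyperlink found on the page is collected and then placed at the back of a queue. The page source (PAGE) and the class name LinkCollector are made up for this example.

```python
from collections import deque
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the address (href) of every hyperlink on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

# Made-up page source for illustration.
PAGE = ('<p>See our <a href="prices.html">prices</a> and '
        '<a href="contact.html">contact page</a>.</p>')

queue = deque()            # pages waiting to be visited
collector = LinkCollector()
collector.feed(PAGE)       # search the whole page first...
queue.extend(collector.links)  # ...then add its links to the queue
print(list(queue))
```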

11 of 18

Indexing

Activity 1

When crawlers finish their journey, the information that they have gathered is stored in a data structure called an index.

The index records the following about each web page:

  • Frequently used keywords
  • Type of content found (images, text, etc.)
  • Date of last update

Other useful information is also recorded, which may be used later when users run their searches.
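
To make the idea of an index more concrete, here is a minimal sketch of one way it could be stored in Python: a dictionary that maps each page address to a small record. The record name IndexEntry, the page names, and all the values are invented for illustration; real indexes are much larger and more complex.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class IndexEntry:
    """What the index remembers about one web page."""
    keywords: list       # frequently used keywords
    content_types: list  # type of content found, e.g. "text", "images"
    last_updated: date   # date of last update

# Hypothetical index: page address -> what the crawler recorded.
index = {
    "ladders.html": IndexEntry(["ladder", "buy"], ["text", "images"], date(2024, 1, 15)),
    "diy_tips.html": IndexEntry(["ladder", "paint"], ["text"], date(2023, 6, 2)),
}

entry = index["ladders.html"]
print(entry.keywords, entry.content_types, entry.last_updated)
```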

12 of 18

Crawl and index

Activity 1

  1. Open the ‘Crawl and index’ worksheet.
  2. Read through the HTML source code for two different web pages.
  3. Think about what a crawler might pick up as it reads the page.
  4. Complete the index table to predict what a crawler might summarise about each page.

13 of 18

Needle in a haystack

Activity 2

A search engine index could hold millions of web pages that match a single keyword.

Searches query the index database to find pages with those keywords in them.
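
A search like this can be sketched as a simple lookup. The example below assumes the index is arranged so that each keyword points to the set of pages containing it; the dictionary KEYWORD_INDEX and its page names are made up, and a real search engine would not be limited to exact word matches.

```python
# Hypothetical index: keyword -> pages that contain that keyword.
KEYWORD_INDEX = {
    "ladder": {"ladders.html", "diy_tips.html"},
    "buy": {"ladders.html", "shop.html"},
    "paint": {"diy_tips.html"},
}

def search(query, index):
    """Return the pages that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())  # keep pages that also have this word
    return results

print(search("buy ladder", KEYWORD_INDEX))
```

Notice that this only finds matching pages. It says nothing yet about which page should appear first.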

If you are looking to buy a ladder, why might the web page on the right appear at the top of the search results?

14 of 18

Spam

Activity 2

Web designers can use this knowledge to their advantage.

By filling a web page with many repeated keywords, they can trick crawlers into treating the page as more useful than it actually is.
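
The short sketch below shows why this trick could work against a search engine that only counts keywords. Both pieces of text are invented for the example, and the function keyword_count is a deliberately naive assumption about how pages are judged.

```python
def keyword_count(text, keyword):
    """Count how many times a keyword appears in some text."""
    return text.lower().split().count(keyword.lower())

# Made-up page text for illustration.
honest = "This ladder guide compares three ladder types for home use"
spam = "ladder ladder ladder ladder cheap ladder ladder best ladder"

print(keyword_count(honest, "ladder"))
print(keyword_count(spam, "ladder"))
```

Judged on keyword count alone, the spam text would beat the genuinely useful one.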

15 of 18

Ranking algorithms

Activity 2

Search engine designers create complex algorithms that attempt to rank the importance of web pages, looking beyond how frequently keywords appear.

How might ranking algorithms consider the following factors when judging the relevance of a web page?

  • When the page was last updated
  • Web pages that link to the crawled page
  • Other web pages that the crawled page links to
  • How long visitors to the page tend to stay
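
As a starting point for discussion, the sketch below combines some of these factors into a single score. Every weight, the formula itself, and the two example pages are assumptions made up for illustration; real ranking algorithms are far more complex and are not published in this form. The factor about links going out from the page is left out to keep the example short.

```python
from dataclasses import dataclass

@dataclass
class PageStats:
    url: str
    keyword_matches: int      # how often the search keyword appears
    days_since_update: int    # when the page was last updated
    inbound_links: int        # other web pages that link to this page
    avg_visit_seconds: float  # how long visitors tend to stay

def score(page):
    """Hypothetical ranking score: higher means more relevant."""
    freshness = 1 / (1 + page.days_since_update / 30)
    return (page.keyword_matches * 1.0
            + page.inbound_links * 2.0
            + freshness * 5.0
            + min(page.avg_visit_seconds, 300) / 60)

# Made-up pages for illustration.
pages = [
    PageStats("spam.html", 7, 400, 0, 5),
    PageStats("guide.html", 2, 10, 12, 180),
]

ranked = sorted(pages, key=score, reverse=True)
print([page.url for page in ranked])
```

With these made-up numbers, the page that many others link to and that visitors stay on ranks above the page that simply repeats the keyword.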

16 of 18

Build a high-quality web page

Activity 2

Considering all that you have learnt, you now need to create a web page that would rank near the top of a list of search results.

Your page needs to summarise how search engines work, including:

  • How crawlers work
  • How web pages are indexed
  • How web pages could be ranked and why this is necessary

Use the ‘What makes a quality web page’ handout to give you some ideas about high-quality designs.

Save your web page as ‘search_engines.html’.

17 of 18

Plenary

Swap seats with the person next to you.

Use the ‘Plenary review criteria’ handout to see if you think their web page has:

  • Clear headings using the heading tags
  • Important keywords near the top of each section of the page
  • Suitable meta tags
  • Suitable images
  • No unnecessary information
  • A complementary colour scheme that isn’t too strong
  • Key information obvious on the page (e.g. uses bold, italics, etc.)

18 of 18

Next lesson

Summary

In this lesson, you...

Explored how search engines find and rank the content of web pages in order to provide more appropriate web pages for searchers

Next lesson, you will...

Consider how users can tailor searches to narrow the results and begin linking web pages you create using hyperlinks