Diagram 1 of 2: servers S1, S2, S3, S4.
Example URLs: /wiki/Mano_Solo, /wiki/The_Entire_History_of_You.
Labels: "URL is fetched by Server 2", "URL is fetched by Server 3".

Diagram 2 of 2: en.wikipedia.org, S0, S1..N.
Arrow labels: "GET URL", "POST /fetch_urls".

  1. GET wikipedia URL
  2. Extract URLs from doc
  3. POST http://N.crawler.company.com/fetch_urls
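
The three steps above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the crawler's actual code. The diagrams do not say how server N is chosen or what the POST body looks like, so the hash-modulo rule in `server_for`, the 1-based server numbering, and the JSON payload `{"urls": [...]}` are all hypothetical, as are the function names.

```python
import hashlib
import json
import urllib.request
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_urls(html, base_url):
    """Step 2: return absolute, fragment-free, de-duplicated URLs."""
    parser = LinkParser()
    parser.feed(html)
    urls = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute and absolute not in urls:
            urls.append(absolute)
    return urls


def server_for(url, num_servers):
    """Assumption: pick server N (1-based) by a stable hash of the URL."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_servers + 1


def group_by_server(urls, num_servers):
    """Bucket URLs by the server assumed to be responsible for them."""
    batches = defaultdict(list)
    for url in urls:
        batches[server_for(url, num_servers)].append(url)
    return dict(batches)


def crawl_once(url, num_servers):
    """Run steps 1 to 3 for a single URL (performs real network I/O)."""
    # Step 1: GET the Wikipedia URL.
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    # Step 2: extract URLs from the document.
    batches = group_by_server(extract_urls(html, url), num_servers)
    # Step 3: POST each batch to http://N.crawler.company.com/fetch_urls.
    for n, batch in batches.items():
        request = urllib.request.Request(
            f"http://{n}.crawler.company.com/fetch_urls",
            data=json.dumps({"urls": batch}).encode("utf-8"),  # hypothetical body
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        urllib.request.urlopen(request).close()
```

A stable hash such as MD5 is used here instead of Python's built-in `hash()`, which is randomized per process for strings, so every server would compute the same owner for a given URL. Calling `group_by_server(urls, 4)` on the links of a page yields one batch per responsible server, which `crawl_once` then posts to that server's `/fetch_urls` endpoint.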