1 of 11

Vision-based Page Rank Estimation with Graph Networks

Timo I. Denk, Samed Güner (TINF16 B1)

May 20, 2019

DHBW Karlsruhe

Report: https://timodenk.com/arxiv/201905-pagerank.pdf

2 of 11

Problem Statement

Does the appearance of a website correlate with its popularity?

3 of 11

Appearance-Rank Correlation

good look

poor look

high rank

low rank

4 of 11

Dataset

We crawled the world's top 100k websites and generated a dataset of graphs

desktop screenshot

mobile screenshot

hyperlink

540 ms, 1337 KB, HTTPS, errors ...

560 ms, 1326 KB, HTTPS, errors ...

489 ms, 1557 KB, HTTPS, errors ...
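
One way to picture a single sample of such a dataset is as a small graph structure. The sketch below is illustrative only: the class and field names (`PageNode`, `SiteGraph`, `load_time_ms`, `size_kb`) are hypothetical, the file names are made up, and reading the figure's values as load time and page size is an assumption rather than the published dataset schema.

```python
from dataclasses import dataclass, field


@dataclass
class PageNode:
    """One crawled web page (hypothetical schema, not the published format)."""
    desktop_screenshot: str   # path to the desktop screenshot image
    mobile_screenshot: str    # path to the mobile screenshot image
    load_time_ms: int         # assumed meaning of the "540 ms" style values
    size_kb: int              # assumed meaning of the "1337 KB" style values
    https: bool               # whether the page was served over HTTPS


@dataclass
class SiteGraph:
    """Pages of one website, connected by hyperlinks."""
    rank: int                                   # rank of the website (the label)
    nodes: list = field(default_factory=list)   # list of PageNode
    edges: set = field(default_factory=set)     # directed (source, target) indices

    def add_page(self, page: PageNode) -> int:
        """Append a page and return its node index."""
        self.nodes.append(page)
        return len(self.nodes) - 1

    def add_hyperlink(self, source: int, target: int) -> None:
        """Record a hyperlink between two pages that are already in the graph."""
        if not (0 <= source < len(self.nodes) and 0 <= target < len(self.nodes)):
            raise IndexError("hyperlink endpoint is not a known page")
        self.edges.add((source, target))


# Toy usage with made-up file names.
graph = SiteGraph(rank=1)
home = graph.add_page(PageNode("home_desktop.png", "home_mobile.png", 540, 1337, True))
about = graph.add_page(PageNode("about_desktop.png", "about_mobile.png", 560, 1326, True))
graph.add_hyperlink(home, about)
```

Storing edges as index pairs keeps the screenshots and the hyperlink structure separate, which is convenient if a feature extractor later replaces each node's images with a vector.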

5 of 11

Datacrawler

6 of 11

Model Architecture

Nodes: desktop screenshot, mobile screenshot; edges: hyperlink

screenshot graph → Screenshot Feature Extractor → feature vector graph (feature vectors) → Graph Network → estimated rank
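
The second stage of this pipeline can be sketched in a few lines. The example below is a toy stand-in, not the architecture from the report: it assumes the feature vectors have already been produced by the feature extractor, uses plain mean aggregation over incoming hyperlinks as the graph step, and uses a linear readout with hand-picked weights. All function names are hypothetical.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def message_passing_step(features, edges):
    """Average each node's vector with the mean of its in-neighbors' vectors.

    features: list of feature vectors, one per page (node).
    edges: iterable of (source, target) hyperlinks between node indices.
    """
    incoming = {i: [] for i in range(len(features))}
    for source, target in edges:
        incoming[target].append(features[source])
    updated = []
    for i, own in enumerate(features):
        if incoming[i]:
            neighbor_mean = mean(incoming[i])
            updated.append([(a + b) / 2 for a, b in zip(own, neighbor_mean)])
        else:
            updated.append(list(own))  # no incoming links: keep own features
    return updated


def estimate_score(features, edges, weights, steps=1):
    """Propagate along hyperlinks, pool over all nodes, apply a linear readout."""
    for _ in range(steps):
        features = message_passing_step(features, edges)
    pooled = mean(features)  # one graph-level vector
    return sum(w * x for w, x in zip(weights, pooled))


# Toy input: in the pipeline above these vectors would come from the CNN.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 2)]
score = estimate_score(features, edges, weights=[0.5, -0.5])
```

The point of the graph step is that a page's representation is influenced by the pages linking to it, so the final score depends on the whole graph rather than on a single screenshot.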

7 of 11

Training Objective

Ranking problem

Pairwise ranking loss, Burges et al. (2005)

Given two samples, guess which one is higher ranked

→ Parameter update with stochastic gradient descent
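
A pairwise ranking loss of this kind can be written down compactly. The sketch below assumes the common RankNet-style formulation, where the probability that one sample outranks the other is the sigmoid of the score difference; it also assumes that a larger score means "ranked higher". For brevity the gradient step is applied directly to two scalar scores, whereas in a real model the gradients would flow back into the network parameters. The function names and the learning rate are illustrative.

```python
import math


def pairwise_ranking_loss(score_higher, score_lower):
    """Return -log(sigmoid(score_higher - score_lower)).

    score_higher: model score of the sample known to be ranked higher.
    score_lower: model score of the other sample.
    """
    diff = score_higher - score_lower
    # log(1 + exp(-diff)), arranged to stay numerically stable for large |diff|
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def loss_gradients(score_higher, score_lower):
    """Gradients of the loss with respect to (score_higher, score_lower)."""
    diff = score_higher - score_lower
    # sigmoid(-diff), computed without overflowing exp()
    if diff >= 0:
        e = math.exp(-diff)
        sig_neg = e / (1.0 + e)
    else:
        sig_neg = 1.0 / (1.0 + math.exp(diff))
    return -sig_neg, sig_neg


# Toy gradient descent on two raw scores that start out in the wrong order.
s_high, s_low = -1.0, 1.0
learning_rate = 0.5
initial_loss = pairwise_ranking_loss(s_high, s_low)
for _ in range(100):
    g_high, g_low = loss_gradients(s_high, s_low)
    s_high -= learning_rate * g_high
    s_low -= learning_rate * g_low
final_loss = pairwise_ranking_loss(s_high, s_low)
```

The loss depends only on the difference between the two scores, so the model is never asked to predict an absolute rank, only which of two samples is ranked higher.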

8 of 11

Visualization of the intermediate representations of the feature extractor ("activation maps")

column layout

natural images

9 of 11

Human Score

Baseline for performance comparison

#68,636 (top)

#15,898 (bottom)

screenshots of the first page

screenshots of the second page

10 of 11

Results

Mode               Accuracy  Description
Random guessing    50.0%     Random guessing in pairwise prediction
Human score        57.8%     Experiments with nine participants
Feature extractor  60.7%     CNN only
Graph network      62.7%     CNN feature extractor + graph network
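
Accuracy here refers to pairwise prediction: for two samples, was the higher-ranked one identified correctly? A minimal sketch of such a metric follows. It is not the evaluation code behind the numbers above; the function name, the handling of ties, and the toy values are assumptions.

```python
from itertools import combinations


def pairwise_accuracy(scores, ranks):
    """Fraction of sample pairs whose order is predicted correctly.

    scores: model scores, where a larger score means "predicted to rank higher".
    ranks: true ranks, where a smaller number means a higher rank (#1 is the top).
    Pairs with equal true ranks are skipped; equal scores count as half correct,
    which matches a coin flip in expectation.
    """
    correct = 0.0
    total = 0
    for i, j in combinations(range(len(scores)), 2):
        if ranks[i] == ranks[j]:
            continue
        total += 1
        if scores[i] == scores[j]:
            correct += 0.5
        elif (scores[i] > scores[j]) == (ranks[i] < ranks[j]):
            correct += 1.0
    if total == 0:
        raise ValueError("no comparable pairs")
    return correct / total


# Toy values: the model orders the first two pairs correctly and the last one wrongly.
accuracy = pairwise_accuracy(scores=[0.9, 0.5, 0.1], ranks=[1, 3, 2])
```

Under this definition a model that assigns the same score to every sample lands at 0.5, which corresponds to the random-guessing row.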

11 of 11

Contributions

Publication of our screenshot dataset

Page ranks can be inferred from web page screenshots; a correlation between appearance and rank exists

ML model for vision-based page rank estimation that combines a convolutional network with a graph network