A Unified Framework for Rank-based Evaluation Metrics for Link Prediction in Knowledge Graphs
Workshop on Graph Learning Benchmarks (GLB 2022)
Presented by Charles Tapley Hoyt on April 26th, 2022
Download at https://bit.ly/glb2022-ranking-metrics or https://zenodo.org/record/6489211, licensed under CC BY-4.0
Max Berrendorf* (LMU Munich)
Mikhail Galkin (Mila, McGill University)
Volker Tresp (LMU Munich, Siemens)
Benjamin M. Gyori (Harvard Medical School)
Charles Tapley Hoyt* (Harvard Medical School)
Common Rank-based Metrics Have Some Issues
Mean Rank (MR)
Better reflects average, but susceptible to outliers
Co-domain = [1, ∞)
Mean Reciprocal Rank (MRR)
Biased towards low ranks, but doesn't completely disregard high ones à la hits at k
Co-domain = (0, 1]
Hits at K
Does not differentiate between misses at k + 1 and k + d for a large d
Co-domain = [0, 1]
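The three metrics above can be sketched in a few lines of plain Python. This is a minimal illustration, not the PyKEEN implementation; the function names and the example ranks are made up for this sketch.

```python
from statistics import mean


def mean_rank(ranks):
    """Arithmetic mean of the ranks; lies in [1, inf), lower is better."""
    return mean(ranks)


def mean_reciprocal_rank(ranks):
    """Mean of 1/rank; lies in (0, 1], higher is better."""
    return mean(1 / r for r in ranks)


def hits_at_k(ranks, k):
    """Fraction of ranks that are at most k; lies in [0, 1], higher is better."""
    return sum(r <= k for r in ranks) / len(ranks)


# Hypothetical ranks of the true entity for five test triples.
ranks = [1, 2, 4, 11, 1000]

print(mean_rank(ranks))             # pulled up strongly by the outlier 1000
print(mean_reciprocal_rank(ranks))  # driven mostly by the ranks 1 and 2
print(hits_at_k(ranks, k=10))       # 11 and 1000 both count as plain misses
```

With these ranks, a miss at rank 11 and a miss at rank 1000 contribute identically to hits at 10, while the single rank of 1000 moves the mean rank far away from the typical rank.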
General Form of a Rank-based Metric
Desiderata for Improved Rank-based Metrics
Motivation for Improved Rank-based Metrics
Metrics' Expectation Depends on Dataset Size
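To illustrate the dependence on dataset size, here is a minimal sketch of what each metric is expected to be for a model that ranks at random. It assumes every rank is drawn uniformly from 1..n with no ties, where n is the number of candidates; the function names and the chosen values of n are illustrative.

```python
def expected_mean_rank(n):
    """E[rank] for a rank drawn uniformly from 1..n."""
    return (n + 1) / 2


def expected_reciprocal_rank(n):
    """E[1/rank] for a rank drawn uniformly from 1..n, i.e. H_n / n."""
    return sum(1 / i for i in range(1, n + 1)) / n


def expected_hits_at_k(n, k):
    """P(rank <= k) for a rank drawn uniformly from 1..n."""
    return min(k, n) / n


# Hypothetical candidate-set sizes, from a toy graph to a large one.
for n in (10, 1_000, 100_000):
    print(n, expected_mean_rank(n), expected_reciprocal_rank(n), expected_hits_at_k(n, k=10))
```

Under this assumption, the same raw value (say, hits at 10 of 0.5) is far easier to reach on a small graph than on a large one, so raw values are not directly comparable across datasets.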
New Metrics through Statistical Adjustment
Solution: introduce affine statistical adjustments (like the Bonferroni correction in statistics). A metric can be adjusted in three ways:
Adjust by expectation
Adjust by expectation and optimum
Adjust by expectation and variance (z-score)
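One way to read the three adjustments, as a minimal sketch. The exact forms below (a ratio, a re-indexing between expectation and optimum, and a z-score) are assumptions for illustration, as are the function names and the example numbers; the expectation and variance of the mean rank assume independent ranks drawn uniformly from 1..n.

```python
from math import sqrt


def adjust_by_expectation(value, expectation):
    """Ratio to the expected value; 1.0 means 'as good as random'."""
    return value / expectation


def adjust_by_expectation_and_optimum(value, expectation, optimum):
    """Affine map sending the expectation to 0 and the optimum to 1."""
    return (value - expectation) / (optimum - expectation)


def adjust_by_expectation_and_variance(value, expectation, variance):
    """z-score: distance from the expectation in standard deviations."""
    return (value - expectation) / sqrt(variance)


# Assumed setting: mean rank over m test triples, each rank uniform on 1..n.
n, m = 1_000, 500
expectation = (n + 1) / 2          # E[MR]
variance = (n**2 - 1) / (12 * m)   # Var[MR] for m independent ranks
observed_mr = 50.0                 # hypothetical result

print(adjust_by_expectation(observed_mr, expectation))
print(adjust_by_expectation_and_optimum(observed_mr, expectation, optimum=1.0))
# Negative here because a lower mean rank is better; a sign flip may be preferred.
print(adjust_by_expectation_and_variance(observed_mr, expectation, variance))
```

All three are affine in the raw metric value, so they change its scale and origin without changing how models on the same dataset are ordered (up to the sign convention).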
Proposed New Metrics
Each new metric comes with a reference implementation already available in PyKEEN v1.8.0. Use the following code to get started (it accepts lots of synonyms, too):
Case Study: Rank-based Evaluation Metrics
☢️ Brief Derivations for MRR ☢️
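A minimal sketch of such a derivation, assuming each rank $r_i$ is drawn independently and uniformly from $\{1, \dots, n_i\}$ with no ties. The symbols $n_i$ (number of candidates for test triple $i$), $m$ (number of test triples), and $H_n$ (the $n$-th harmonic number) are introduced here for illustration.

```latex
\mathbb{E}\left[r_i^{-1}\right]
  = \frac{1}{n_i} \sum_{j=1}^{n_i} \frac{1}{j}
  = \frac{H_{n_i}}{n_i},
\qquad
\operatorname{Var}\left[r_i^{-1}\right]
  = \frac{1}{n_i} \sum_{j=1}^{n_i} \frac{1}{j^2}
  - \left(\frac{H_{n_i}}{n_i}\right)^2

\mathbb{E}\left[\mathrm{MRR}\right]
  = \frac{1}{m} \sum_{i=1}^{m} \frac{H_{n_i}}{n_i},
\qquad
\operatorname{Var}\left[\mathrm{MRR}\right]
  = \frac{1}{m^2} \sum_{i=1}^{m} \operatorname{Var}\left[r_i^{-1}\right]
```

The second line follows from linearity of expectation and from the independence assumption, which lets the variances of the individual reciprocal ranks add.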
Post-facto Adjustments
We pre-computed the expectations and variances for 34 datasets in PyKEEN for each:
Download from https://zenodo.org/record/6369163 as a gzipped TSV and apply as an "affine" transformation to existing results.
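Applying such a table to existing results could look like the sketch below. The column names, the dataset name, and all numbers are hypothetical stand-ins, not the layout of the published file; the gzipped TSV is built in memory so the example is self-contained.

```python
import csv
import gzip
import io
from math import sqrt

# Stand-in for the downloaded file. The column names and numbers below are
# hypothetical and only illustrate the shape of such a table.
TSV = (
    "dataset\tmetric\texpectation\tvariance\n"
    "toy\tinverse_harmonic_mean_rank\t0.0075\t0.000004\n"
)
compressed = gzip.compress(TSV.encode("utf-8"))


def load_adjustments(raw):
    """Map (dataset, metric) to (expectation, variance) from gzipped TSV bytes."""
    with gzip.open(io.BytesIO(raw), mode="rt", encoding="utf-8", newline="") as file:
        reader = csv.DictReader(file, delimiter="\t")
        return {
            (row["dataset"], row["metric"]): (float(row["expectation"]), float(row["variance"]))
            for row in reader
        }


def z_adjust(value, expectation, variance):
    """Affine map value -> scale * value + offset, equal to the z-score."""
    scale = 1 / sqrt(variance)
    offset = -expectation / sqrt(variance)
    return scale * value + offset


adjustments = load_adjustments(compressed)
expectation, variance = adjustments["toy", "inverse_harmonic_mean_rank"]
reported_mrr = 0.25  # hypothetical previously published value
print(z_adjust(reported_mrr, expectation, variance))
```

Because the adjustment only needs the reported value plus a per-dataset expectation and variance, it can be applied to published numbers without re-running any model.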
Paper | 📜
Reference Implementation | 🧑‍🔬
Analysis and Results | 📊
Website | 🌐
Post-facto Adjustments | 🎛️
PyKEEN Advisors
PyKEEN Contributors
See also: https://pykeen.github.io/organization
Generalized Hölder Means
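The generalized (Hölder, or power) mean with exponent p is M_p(x) = ((1/n) Σ x_i^p)^(1/p), with the geometric mean as the limit p → 0. The sketch below shows how familiar rank aggregations fall out as special cases; the function name `power_mean` and the example ranks are illustrative.

```python
from math import exp, log


def power_mean(values, p):
    """Generalized (Hölder) mean of positive values with exponent p."""
    n = len(values)
    if p == 0:
        # Limit p -> 0: the geometric mean.
        return exp(sum(log(v) for v in values) / n)
    return (sum(v**p for v in values) / n) ** (1 / p)


ranks = [1, 2, 4, 11, 1000]  # hypothetical ranks

arithmetic = power_mean(ranks, 1)   # mean rank
geometric = power_mean(ranks, 0)    # geometric mean rank
harmonic = power_mean(ranks, -1)    # harmonic mean rank
mrr = 1 / harmonic                  # mean reciprocal rank

print(arithmetic, geometric, harmonic, mrr)
```

Smaller exponents put more weight on small ranks, which matches the observation that the mean reciprocal rank is biased towards low ranks while the mean rank is sensitive to large outliers.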