Semi-Supervised Community Detection Using Structure and Size
Arjun Bakshi, Srinivasan Parthasarathy, and Kannan Srinivasan
Presented at: IEEE International Conference on Data Mining
November 17-20, 2018 in Singapore
Community Detection
Source: http://snap.stanford.edu/agm/
Source: Learning Community Embedding with Community Detection and Node Embedding on Graphs, CIKM’17
Semi-Supervised Community Detection
Size Information
Data source: http://snap.stanford.edu/data/
Structure in Communities
Image source: RolX/Discovering Roles and Anomalies in Graphs: Theory and Applications, Eliassi-Rad et al. ECML PKDD 2013
Defining Structure of a Community
*
*
Generate labels using existing techniques
un
Patterns in Communities vs Random Subgraphs
More common in random subgraphs
More common in communities
Semi-Supervised Community Detection
Summarizing Community Patterns
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
DBLP Communities
DBLP Community Patterns
CP1
CP2
CP3
CP4
CP5
Identifying Good Seed Nodes
Image source: RolX/Discovering Roles and Anomalies in Graphs: Theory and Applications, Eliassi-Rad et al. ECML PKDD 2013
Communities
Node ego nets
✅
✅
Putting it together: The Bespoke Algorithm
Input:
Graph + known communities
Node CP1 CP2 CP3 CP4
n1 0.9 0.2 0.5 0.3
n2 0.1 0.9 0.4 0.2
n3 0.3 0.2 0.1 0.1
CP1
CP4
Size of communities in CP1
Size of communities in CP4
Size of communities in CP3
Size of communities in CP2
CP2
CP3
Training
Putting it together: The Bespoke Algorithm
CP1
CP4
Size of communities in CP1
Size of communities in CP4
Size of communities in CP3
Size of communities in CP2
CP2
CP3
Node CP1 CP2 CP3 CP4
n1 0.9 0.2 0.5 0.3
n2 0.1 0.9 0.4 0.2
n3 0.3 0.2 0.1 0.1
Results: Datasets
| Nodes | Edges | Communities |
Amazon product network | 334,863 | 935,873 | 5,000 |
DBLP collaborations | 317,080 | 1,049,866 | 5,000 |
Youtube | 1,134,890 | 2,987,624 | 5,000 |
Twitter (ego) | 81,306 | 1,768,149 | 3,662 |
Facebook (ego) | 4,039 | 88.234 | 191 |
1: http://snap.stanford.edu/data/
Results: F1 scores
Results: F1 Scores
Results: Community Patterns (Amazon)
Bespoke
Actual Communities
BigClam
Random Subgraphs
Conclusion