1 of 51

Community Detection in Networks

January 15th, 2026

CS60078

2 of 51

An Example: Communities in Belgium

3 of 51

Communities in Social Networks

The employees of a company are more likely to interact with their coworkers than with employees of other companies. Consequently, workplaces appear as densely interconnected communities within the social network.
Communities could also represent circles of friends, or a group of individuals who pursue the same hobby together, or individuals living in the same neighborhood.

Zachary’s Karate Club

4 of 51

Defining Communities

Communities are locally dense connected subgraphs in a network. This expectation relies on two distinct hypotheses:

Connectedness Hypothesis

Each community corresponds to a connected subgraph, like the subgraphs formed by the orange, green or the purple nodes.

Density Hypothesis

Nodes in a community are more likely to connect to other members of the same community than to nodes in other communities.

5 of 51

Defining Communities

Network communities are groups of vertices such that vertices inside a group are connected by many more edges than vertices in different groups.

What makes a community?

Mutuality of ties: Almost everyone in the group has ties to each other.
Compactness: Closeness or reachability of group members in a small number of steps.
Density of edges
Separation: Higher frequency of ties among group members than with non-group members.

https://www.nature.com/articles/srep05739

6 of 51

Maximal Cliques

One of the first papers on community structure, published in 1949, defined a community as a group of individuals whose members all know each other.
In graph theoretic terms this means that a community is a complete subgraph, or a clique.

Any drawbacks of this notion?

While triangles are frequent in networks, larger cliques are rare.
Requiring a community to be a complete subgraph may be too restrictive, missing many other legitimate communities.
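
To make the clique-based definition concrete, here is a minimal sketch that enumerates maximal cliques with the basic Bron-Kerbosch recursion (no pivoting). The graph `toy_graph` and the function name `maximal_cliques` are hypothetical, chosen only for illustration; the graph is assumed to be undirected and stored as a dict of neighbour sets.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting).

    adj: dict mapping each node to the set of its neighbours (undirected).
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it,
        # x: nodes already explored (used to avoid reporting non-maximal cliques)
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Hypothetical toy graph: a triangle {0, 1, 2} with a path 2-3-4 attached.
toy_graph = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4},
    4: {3},
}

cliques = maximal_cliques(toy_graph)
print(cliques)
```

On this toy graph the only clique larger than a single link is the triangle, which illustrates the drawback above: nodes 3 and 4 can never join a larger community under a strict clique definition.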

7 of 51

Formally

8 of 51

9 of 51

Strong vs Weak Community

Strong Community

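C is a strong community if each node within C has more links within the community than with the rest of the graph. Thus, for each node i in C,

k_i^int(C) > k_i^ext(C)

where k_i^int(C) is the number of links from i to other nodes in C and k_i^ext(C) is the number of links from i to nodes outside C.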

Weak Community

C is a weak community if the total internal degree of the subgraph exceeds its total external degree:

Σ_{i in C} k_i^int(C) > Σ_{i in C} k_i^ext(C)
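
The two definitions can be checked directly. The sketch below is illustrative only: the helper names and the graph `two_triangles` (two triangles joined by the link 2-3) are assumptions, and the graph is assumed to be undirected and stored as a dict of neighbour sets. Here `k_int` counts a node's links to other members of the candidate community and `k_ext` counts its links to the rest of the graph.

```python
def internal_external_degree(adj, community, node):
    """Return (k_int, k_ext) of `node` with respect to `community`."""
    k_int = sum(1 for nbr in adj[node] if nbr in community)
    k_ext = len(adj[node]) - k_int
    return k_int, k_ext


def is_strong_community(adj, community):
    """Every member must have more internal than external links."""
    for node in community:
        k_int, k_ext = internal_external_degree(adj, community, node)
        if k_int <= k_ext:
            return False
    return True


def is_weak_community(adj, community):
    """Total internal degree must exceed total external degree."""
    total_int = total_ext = 0
    for node in community:
        k_int, k_ext = internal_external_degree(adj, community, node)
        total_int += k_int
        total_ext += k_ext
    return total_int > total_ext


# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by link 2-3.
two_triangles = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

print(is_strong_community(two_triangles, {0, 1, 2}))     # strong (and therefore weak)
print(is_strong_community(two_triangles, {0, 1, 2, 3}))  # node 3 breaks the strong condition
print(is_weak_community(two_triangles, {0, 1, 2, 3}))    # still weak in aggregate
```

The group {0, 1, 2, 3} shows why the weak definition is more permissive: node 3 has more links outside the group than inside, yet the totals still favour internal links.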

10 of 51

A simple version: division of network into two groups

Do you see an issue?

11 of 51

Ratio Cut

Ratio Cut Partitioning: Instead of minimizing the cut size R alone, we minimize the ratio R / (n₁n₂), where n₁ and n₂ are the sizes of the two groups. However, because the product n₁n₂ is largest when n₁ = n₂, this objective is biased toward groups of equal size.
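
A small numerical sketch of the difference between the two objectives. The function names and the edge list `toy_edges` are hypothetical; the graph is two triangles joined by one link, plus a single extra node 6 hanging off node 5.

```python
def cut_size(edges, group):
    """R: number of links with exactly one endpoint in `group`."""
    return sum(1 for u, v in edges if (u in group) != (v in group))


def ratio_cut(edges, nodes, group):
    """R / (n1 * n2) for the split of `nodes` into `group` and the rest."""
    n1 = len(group)
    n2 = len(nodes) - n1
    return cut_size(edges, group) / (n1 * n2)


# Hypothetical toy graph: triangles {0,1,2} and {3,4,5}, link 2-3, pendant node 6.
toy_nodes = set(range(7))
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6)]

trivial_split = {6}         # cuts off a single node
balanced_split = {0, 1, 2}  # separates the two triangles

for split in (trivial_split, balanced_split):
    print(sorted(split), cut_size(toy_edges, split), ratio_cut(toy_edges, toy_nodes, split))
```

Both splits have cut size R = 1, so the plain cut cannot distinguish them, while the ratio cut (1/6 versus 1/12) prefers the more balanced split.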

Pawan

12 of 51

Modularity: Random Hypothesis

Random Hypothesis: Randomly wired networks lack an inherent community structure.

In a randomly wired network the connection pattern between the nodes is expected to be uniform, independent of the network's degree distribution.
Consequently these networks are not expected to display systematic local density fluctuations that we could interpret as communities.

How to use this?

By comparing the link density of a community with the link density obtained for the same group of nodes for a randomly rewired network, we could decide if the original community corresponds to a dense subgraph, or its connectivity pattern emerged by chance.

13 of 51

Formal Definition

Consider a network with N nodes and L links and a partition into n_c communities, each community c having N_c nodes connected to each other by L_c links, where c = 1, ..., n_c.

We measure the difference between the network’s real wiring diagram (A_ij) and the expected number of links between i and j if the network is randomly wired (p_ij)
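
One common way to write this difference for a single community C_c is sketched below. The normalization by 2L is an assumption of this sketch; it is the usual choice because each link is counted twice when summing over ordered node pairs.

```latex
M_c = \frac{1}{2L} \sum_{(i,j) \in C_c} \left( A_{ij} - p_{ij} \right)
```

Under this form, a positive M_c means the subgraph has more links than expected by chance, and a value near zero means its connectivity is consistent with random wiring.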

14 of 51

What will be the expected number of links?

p_ij can be determined by randomizing the original network, while keeping the expected degree of each node unchanged
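
Assuming a degree-preserving randomization, the expected number of links can be sketched as follows. The symbol k_i (the degree of node i) is introduced here for illustration. Each of the k_i link ends of node i attaches to one of the 2L link ends in the network, and k_j of those belong to node j, which gives approximately:

```latex
p_{ij} = \frac{k_i k_j}{2L}
```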

15 of 51

16 of 51

Modularity for the network

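M = 1/(2L) Σ_{i,j} [ A_ij − (k_i k_j)/(2L) ] δ(C_i, C_j)

where k_i is the degree of node i, C_i is the label of the community to which node i belongs, and δ(C_i, C_j) is 1 if i and j are in the same community and 0 otherwise.

This can be simplified further to

M = Σ_c [ L_c/L − (k_c/(2L))² ]

where L_c is the number of links inside community c and k_c is the total degree of the nodes in c.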

17 of 51

Modularity for the network
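
A minimal sketch of computing network modularity in two equivalent ways: as a sum over node pairs in the same community, and as a sum over communities. It assumes an undirected simple graph with at least one link, given as an edge list, and a `labels` dict mapping each node to its community; all names and the toy graph are hypothetical.

```python
def modularity_pairwise(edges, labels):
    """M = 1/(2L) * sum over i,j of (A_ij - k_i*k_j/(2L)) * delta(C_i, C_j)."""
    L = len(edges)
    linked = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    degree = {n: 0 for n in labels}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total = 0.0
    for i in labels:
        for j in labels:
            if labels[i] == labels[j]:
                a_ij = 1 if (i, j) in linked else 0
                total += a_ij - degree[i] * degree[j] / (2 * L)
    return total / (2 * L)


def modularity_by_community(edges, labels):
    """M = sum over communities c of L_c/L - (k_c/(2L))**2."""
    L = len(edges)
    links_inside = {}   # L_c: links with both endpoints in community c
    total_degree = {}   # k_c: summed degree of the nodes in community c
    for u, v in edges:
        for node in (u, v):
            total_degree[labels[node]] = total_degree.get(labels[node], 0) + 1
        if labels[u] == labels[v]:
            links_inside[labels[u]] = links_inside.get(labels[u], 0) + 1
    return sum(links_inside.get(c, 0) / L - (k_c / (2 * L)) ** 2
               for c, k_c in total_degree.items())


# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by link 2-3.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {n: "all" for n in range(6)}

print(modularity_pairwise(toy_edges, two_groups))
print(modularity_by_community(toy_edges, two_groups))
print(modularity_by_community(toy_edges, one_group))
```

For the two-triangle split both functions give 5/14 (about 0.357), and placing every node in one community gives M = 0.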

18 of 51

Maximum Modularity Hypothesis

For a given network the partition with maximum modularity corresponds to the optimal community structure.

The maximum modularity hypothesis is the starting point of several community detection algorithms, each seeking the partition with the largest modularity.
In principle we could identify the best partition by checking M for all possible partitions, selecting the one for which M is largest.
However, the number of possible partitions is exceptionally large, so this brute-force approach is computationally infeasible.
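
To get a feel for how quickly the search space grows, the sketch below counts the ways to split N labelled nodes into non-empty groups. This count is the Bell number; the sketch assumes every such grouping counts as a candidate partition, and the function name `bell_number` is chosen for illustration.

```python
def bell_number(n):
    """Number of ways to partition a set of n labelled nodes (Bell triangle)."""
    row = [1]
    for _ in range(n - 1):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry to its left and the one above-left.
        next_row = [row[-1]]
        for value in row:
            next_row.append(next_row[-1] + value)
        row = next_row
    return row[-1]


for n in (5, 10, 20, 50):
    print(n, bell_number(n))
```

Already at N = 10 there are 115975 partitions, and the count grows faster than exponentially with N.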

19 of 51

Greedy Algorithm

Step 1. Assign each node to a community of its own, starting with N communities of single nodes.

Step 2. Inspect each community pair connected by at least one link and compute the modularity difference ∆M obtained if we merge them.

Identify the community pair for which ∆M is the largest and merge them. Note that modularity is always calculated for the full network.

Step 3. Repeat Step 2 until all nodes merge into a single community, recording M for each step.

Step 4. Select the partition for which M is maximal.
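
The four steps above can be sketched as follows. This is a deliberately simple, unoptimized illustration that recomputes M for every candidate merge rather than updating ∆M incrementally; the function names and the toy graph are hypothetical, and node labels are assumed to be sortable (for example integers) in a graph with at least one link.

```python
def modularity(edges, labels):
    """M = sum over communities c of L_c/L - (k_c/(2L))**2, for the full network."""
    L = len(edges)
    links_inside, total_degree = {}, {}
    for u, v in edges:
        for node in (u, v):
            total_degree[labels[node]] = total_degree.get(labels[node], 0) + 1
        if labels[u] == labels[v]:
            links_inside[labels[u]] = links_inside.get(labels[u], 0) + 1
    return sum(links_inside.get(c, 0) / L - (k_c / (2 * L)) ** 2
               for c, k_c in total_degree.items())


def merge(labels, a, b):
    """Return new labels with community b absorbed into community a."""
    return {n: (a if c == b else c) for n, c in labels.items()}


def greedy_modularity(nodes, edges):
    # Step 1: every node starts in a community of its own.
    labels = {n: n for n in nodes}
    best_m, best_labels = modularity(edges, labels), dict(labels)
    history = [best_m]
    while len(set(labels.values())) > 1:
        # Step 2: candidates are community pairs connected by at least one link.
        pairs = {tuple(sorted((labels[u], labels[v])))
                 for u, v in edges if labels[u] != labels[v]}
        if not pairs:
            break  # disconnected graph: no linked pair left to merge
        # The current M is the same for all candidates, so the largest
        # post-merge M is also the largest delta M.
        a, b = max(sorted(pairs),
                   key=lambda p: modularity(edges, merge(labels, *p)))
        labels = merge(labels, a, b)
        # Step 3: record M after each merge.
        m = modularity(edges, labels)
        history.append(m)
        # Step 4: remember the partition with the largest M seen so far.
        if m > best_m:
            best_m, best_labels = m, dict(labels)
    return best_labels, best_m, history


# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by link 2-3.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
best_labels, best_m, history = greedy_modularity(range(6), toy_edges)
print(best_labels, best_m)
```

On this toy graph the recorded M rises until the two triangles are recovered (M = 5/14) and then drops to 0 when they are merged into a single community, which is why Step 4 selects from the recorded history rather than taking the final partition.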

20 of 51

Issue with the Greedy Algorithm

The computational complexity of the greedy algorithm, O(N²) on a sparse network, can be prohibitive for very large networks.
A modularity optimization algorithm with better scalability, the Louvain algorithm, was proposed by Blondel and collaborators.

21 of 51

Louvain Algorithm

Slide Courtesy: Jure Leskovec, Stanford CS224W