blog banner

What is Community Detection in graphs?

What is Community Detection in complex networks

The problem of community detection in graphs does not have a single, globally accepted definition; rather, it is entirely dependent on the application type and the system under consideration. After a careful examination of these definitions, it can be concluded that communities are subgraphs in which nodes are densely connected to each other compared to the rest of the network. For example, in social networks, individuals naturally tend to form groups in their work environment, family, and friends. All these groups are examples of subgraphs where the connections between members within them are more significant than outside them. Network analysts, computer scientists, and physicists have provided various definitions of community detection, generally categorized into local, global, and similarity-based.

Local Definitions

In social networks, a community often refers to a group where all participants are friends with each other. In graph theory, such a group is called a clique, where every two separate nodes are adjacent. A clique is essentially a graph where all pairs of nodes are connected, considered as a complete graph, which is a relatively strict definition of a community. According to this definition, if a pair of nodes is not connected to other nodes in the network, it can be considered a community. However, this definition has limitations, such as the inability to define hierarchical roles for nodes within a community.

The concept of a clique can be generalized by defining an n-clique, which is a subgraph that does not exceed a distance of n between any pair of nodes. For n equal to 1, this subgraph is a complete graph, equivalent to the previous definition of a clique.

Clique of Graph

Global Definitions

Unlike local definitions, in global community definitions, we specify not only the relationships between nodes within the community but also their connections to nodes outside the community. A community is defined as a subgraph where nodes are densely connected to each other and sparsely connected to the rest of the network. Algorithms in this context use a general graph feature to identify and deliver communities. The key idea is that a network with a community structure differs from a random graph. Several definitions extract from this concept, and one common model is the empty model proposed by Newman and Girvan, which redraws edges randomly while preserving node degrees.

The empty model introduces the concept of modularity, a crucial parameter representing the quality of the division of the graph into communities. According to Newman and Girvan, a subgraph is a community only if the total number of internal edges in the subgraph is greater than the expected number of edges based on the empty model. Modularity serves as a qualitative function mapping each section of the graph to a numerical value, indicating how well it maximizes modularity.

community structure in graph

Definitions Based on Node Similarity

It is natural to assume that communities are groups of nodes with similar characteristics. Similarity between any pair of nodes can be calculated based on some reference features, whether local or global, regardless of whether they are connected by an edge. Similarity metrics are based on traditional clustering methods such as hierarchical, partitional, and spectral clustering. In this definition, communities are identified not only by the topological structure of the network but also by the distance between nodes, which can be geographical distance, node degrees, etc.

Leave a Reply

Your email address will not be published. Required fields are marked *