
Topic: Graph Theory in Social Network Analysis

Research question: How can graph theory be used to analyze social networks?

Contents
Abstract
Introduction: History & Clarification of Terminology
    Graph Theory in Social Network Analysis
    A Brief History of Social Network Analysis
    Definitions
Applications of Graph Theory in Social Network Analysis
    Levels of Analysis
    Centrality and Density
    Identifying Subgroups: Cliques, n-Cliques, and Cohesiveness
    Structural Equivalence
    Ramsey's Theorem
A Case Study
Conclusion
Works Cited

Abstract

Social network analysis refers to a methodical analysis of social networks. (Reis 4) In its present condition, it is a somewhat disjointed field of knowledge, comprising distinct (but related) analysis techniques. To date, no cohesive theories of social network analysis have been developed. In lieu of indigenous theories, the field has imported a number of theories from related areas of study. In particular, it has drawn heavily from the mathematical field of graph theory. (Kilduff 38) This essay attempts to illustrate some of the ways in which graph theory is an essential component of social network analysis. Thus the research question: How can graph theory be used to analyze social networks?

Several quantitative techniques are examined, most of which belong to an existing canon of social network analysis approaches. In order, these topics are: levels of analysis, centrality, density, cliques, n-cliques, cohesiveness, structural equivalence, and Ramsey's Theorem. All of them build on graph theory concepts, and therefore address the stated research question.

In practice, there is not one single approach to the rather daunting task of analyzing social networks. Instead, the way in which one confronts an analysis varies with the constraints of the experiment or study being conducted, and with the goals of the researcher in conducting the study. This essay discusses quantitative methods and concepts that are widely applicable to social network analysis problems. This essay does not attempt to provide a comprehensive overview of either social network analysis or the graph theory from which the former inherits. It does, however, provide limited insight into the issues that a social network analyst using graph theory might be concerned with.

Word count: 274

Word count: 3723

Introduction: History & Clarification of Terminology


Graph Theory in Social Network Analysis

In popular culture, "social network" has come to be synonymous with "social networking service". This conflation is further complicated by the fact that online social networking services do indeed deal with social networks, as they are understood in the context of social network analysis.

Social networking services, as their name suggests, facilitate the cultivation of social networks: those structures that represent people and the connections between them. (Scott 8) Just as one can represent a 2-dimensional figure either as a series of listed coordinates, or graphically on a Cartesian plane, social networks can be represented either in tabular form, or in graph form. (Scott 49)

Graphs represent an attractive alternative to the sociomatrix, that is, a social network in matrix form. While the latter is useful for representing large sets of analyses, it lacks the visual immediacy of graphs. This leads back to the question at the heart of this essay: how can graph theory be used to analyze social networks?

The emphasis on relationships in social network analysis, as opposed to isolated properties, is highly valuable in the context of sociology. Social network analysis, and the graph theory that it makes use of, prioritize the relations between entities. It is assumed that social structure is more often determined by the connections between entities than it is by the individual properties of entities. This preference is supported by a study that categorized elite families in 15th-century Florence as supporting either the Medici or the oligarchic political faction. As the study showed, the marital and economic relations between families aligned more closely with each family's political leanings than did individual status attributes. (Scott 4)

A Brief History of Social Network Analysis

The Swiss mathematician Leonhard Euler laid the initial foundation for modern social network analysis in earnest. (It must be pointed out that ancient Greek scholars had discussed network analysis ideas well before the eighteenth century.) (History of Social Network Analysis) In his renowned analysis of the Seven Bridges of Königsberg problem, Euler developed a mathematical notation of points and lines, from which he derived proofs. Euler's formalization of paths across bridges was an important precursor to modern graph theory. (Konigsberg Bridge Problem)

Fig. 1

In the 1930s, ideas from diverse fields such as psychology, anthropology, and mathematics coalesced into the field now known as social network analysis. (Scott 2) While earlier sociology paradigms used cultural ideas to explain social patterns, social network analysis instead placed attention on patterns of interaction and interconnection. Like Euler, pioneering social network theorists adopted a terminology of points and lines to represent the webs of social structure. In the 1950s, researchers in group dynamics, a specialism of social psychology, began to use systematic mathematical arguments to analyze and interpret group structure. Those researchers embraced the mathematical approach known as graph theory, which enabled them to operationalize concepts about social networks, like the centrality of individuals. (Bloomsbury Academic)

Definitions

Before proceeding in the investigation, it is necessary to clarify several graph theory and social network analysis terms.

The fundamental elements of any social network are actors and relations.

Actors may represent individual persons or groups. An actor may be an individual, such as a child in a daycare, or a group, such as a crew of construction workers. In graph theory, actors are represented as vertices, or points on a graph. (Scott 45)

Relations are connections between actors. The relation between two actors in conversation is "conversing". In graph theory, relations are represented as lines between vertices, commonly called edges. Relations can be either directed or nondirected. Where mutuality occurs, as in the example of two conversing persons, the relation is considered nondirected. By contrast, the relation between a teacher and a student is directed because the teacher teaches the student, and not the reverse. A graph whose edges are directed is known as a digraph. (Scott 51)

The degree of a vertex is the number of other vertices to which it is adjacent. In social network analysis, the degree of a vertex is used as a basic measure of its centrality. In a social network, centrality is commonly taken as an indicator of actor prestige. (Scott 62)

The order of a graph is the number of vertices on the graph.

A graph is connected when there exists a path from an arbitrary point to any other arbitrary point on the graph. (Graph Theory Glossary)

A sociomatrix is a tabular representation of a social network. (Scott 49)

A complete graph is an undirected graph in which every pair of vertices is connected by an edge. That is to say, a connection exists between nodes where a connection is possible. (Graph Theory Glossary)

Cliques will likely exist in a large graph. A clique is a complete subgraph (a graph within a graph) of three or more constituent vertices. Cliques are distinct from the larger network in that no other node in the network has a connection to every node in the clique. If that were the case, then that node would be included in the clique. (Scott 72)
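The definitions above can be made concrete with a short Python sketch. The four actors and their ties are hypothetical and chosen only for illustration, and the helper names (is_connected, is_complete) are assumptions rather than anything taken from the cited sources.

```python
from collections import deque

# Hypothetical actors and nondirected relations (illustration only).
actors = ["Ann", "Ben", "Cara", "Dev"]
relations = [("Ann", "Ben"), ("Ann", "Cara"), ("Ben", "Cara"), ("Cara", "Dev")]

index = {name: i for i, name in enumerate(actors)}
order = len(actors)  # order of the graph: its number of vertices

# Sociomatrix: entry [i][j] is 1 when actors i and j are tied, else 0.
sociomatrix = [[0] * order for _ in range(order)]
for a, b in relations:
    sociomatrix[index[a]][index[b]] = 1
    sociomatrix[index[b]][index[a]] = 1  # nondirected, so the matrix is symmetric

# Degree of a vertex: the number of vertices adjacent to it.
degree = {name: sum(sociomatrix[index[name]]) for name in actors}

def is_connected(matrix):
    """Breadth-first search from vertex 0; connected if every vertex is reached."""
    seen, queue = {0}, deque([0])
    while queue:
        i = queue.popleft()
        for j, tie in enumerate(matrix[i]):
            if tie and j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(matrix)

def is_complete(matrix):
    """Complete if every pair of distinct vertices is tied."""
    n = len(matrix)
    return all(matrix[i][j] for i in range(n) for j in range(n) if i != j)

print(degree)
print(is_connected(sociomatrix), is_complete(sociomatrix))
```

In this hypothetical network, Ann, Ben, and Cara form a complete subgraph of three vertices, while the network as a whole is connected but not complete.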

Applications of Graph Theory in Social Network Analysis

Levels of Analysis

Network analysts have several options when beginning an analysis of structures in a dataset. They choose among four conceptually distinct levels of analysis: the egocentric level, the dyadic level, the triadic level, and the complete level.

An egocentric network considers one actor (the ego), and the other actors with which the ego has direct relations. One would use an egocentric network to examine the personal network of a middle school teacher: the co-workers, family members, and friends she interacts with. The next level, the dyadic network, considers pairs of actors. An analysis of a married couple (a dyadic network) might seek to explain the effect of contrasting economic backgrounds on marital strength. Put in other words, the study would seek to explain the variation in dyadic relations as a function of pair characteristics. At the third level, the triadic level, one searches for triangular structures within a survey of actors.

Triadic structures hold interesting transitive implications. For instance, if actor A is friends with actor B, and B is friends with actor C, what is the likelihood that actor A is friends with actor C? (It is worth noting that triadic structures are the simplest case of cliques, which will be elaborated on later.) Lastly, the complete network analysis takes into consideration every possible relation in a set of actors.
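As a rough illustration of the egocentric and triadic levels, the following Python sketch extracts an ego network and lists the closed triads in a small network. The actors A to E and their ties are hypothetical, and the function names are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical friendship ties (illustration only).
ties = {("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")}

def tied(a, b):
    """True if a nondirected tie exists between a and b."""
    return (a, b) in ties or (b, a) in ties

def neighbours(actor):
    """Actors with a direct tie to the given actor."""
    return {b if a == actor else a for a, b in ties if actor in (a, b)}

def ego_network(ego):
    """Ties among the ego and the actors directly tied to the ego."""
    members = neighbours(ego) | {ego}
    return {t for t in ties if t[0] in members and t[1] in members}

def closed_triads(actors):
    """Triples of actors in which all three pairs are tied."""
    return [t for t in combinations(sorted(actors), 3)
            if all(tied(a, b) for a, b in combinations(t, 2))]

print(ego_network("C"))
print(closed_triads({"A", "B", "C", "D", "E"}))
```

Here A, B, and C form the only closed triad: A is tied to B and B to C, and the tie between A and C completes the triangle. By contrast, C is tied to D and D to E, yet C and E are not tied.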

Each level of analysis provides a distinct perspective. Suppose one initiates a study of the 2009 global recession. On one hand, researchers may gain invaluable insights from an ego network analysis, which in this case would examine the effect of the collapse of the US housing market on the economies of other nations. But there is undoubtedly value in examining relations on the dyadic, triadic, and complete levels. Analyzing the interactions between various nations yields a particular class of insights, as does an analysis of the interactions between citizens. The point is that phenomena may be obvious on one level of analysis, and less than obvious on another. (Scott 12)

Centrality and Density

We now consider a network of friends, depicted on the following graph:


Fig. 2

Each connection represents a friendship between actors. Some actors contribute more to the health of the network than others. The measure of an actor's connectedness is centrality, which is calculated as follows:

$$C_D(i) = \sum_{j=1,\; j \neq i}^{g} x_{ij}$$

where g is the number of actors in the network, $C_D(i)$ is the centrality of actor i, and $x_{ij}$ evaluates to 1 if a connection exists between actor i and j, and to 0 if no connection exists. (Scott 63)

Note that the summation excludes the actor in question, hence the condition $j \neq i$.

We can calculate the centrality of each actor using the equation above.

Sample calculation for Micah:

Centrality(Micah) = 1 + 1 + 0 + 0 = 2

Actor     Centrality
Connor    2
Erica     3
Micah     2
Sandy     4
Devon     1
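The centrality table can be reproduced with a minimal Python sketch. The list of ties is an assumption: it is inferred from the table itself (Sandy is tied to all four other actors, Devon only to Sandy, which leaves Erica tied to Connor and Micah), and it appears to be the only simple graph consistent with those five centralities. The function names x and degree_centrality are illustrative.

```python
# Actors of fig. 2.
actors = ["Connor", "Erica", "Micah", "Sandy", "Devon"]

# Assumption: ties inferred from the centrality table, since the figure
# itself is not reproduced here.
ties = {
    ("Sandy", "Connor"), ("Sandy", "Erica"), ("Sandy", "Micah"),
    ("Sandy", "Devon"), ("Erica", "Connor"), ("Erica", "Micah"),
}

def x(i, j):
    """Return 1 if a tie exists between actors i and j, else 0."""
    return 1 if (i, j) in ties or (j, i) in ties else 0

def degree_centrality(i):
    """Sum x(i, j) over every actor j other than i."""
    return sum(x(i, j) for j in actors if j != i)

centrality = {actor: degree_centrality(actor) for actor in actors}
print(centrality)
```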

As Sandy is the actor with the most connections, her centrality is the highest. She is therefore the most prestigious actor in the network. Sandy is what is known as a cutpoint, insofar as the removal of her connections would disconnect the graph into several constituent graphs. (Scott 49)

The removal of Sandy's connections:

Fig. 3

Recall that a connected graph contains a path from any point to any other point in the graph. Once the cutpoint's ties are removed, the graph is disconnected, which cripples the communicative ability of an actor like Devon.
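The effect of severing Sandy's ties can be checked with a small connected-components routine. This is a sketch under the assumption that the ties of fig. 2 are those implied by the centrality table; the components helper is a hypothetical name, not part of any cited source.

```python
def components(actors, ties):
    """Return the connected components of a nondirected graph as a list of sets."""
    adjacency = {a: set() for a in actors}
    for a, b in ties:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, groups = set(), []
    for start in actors:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:  # depth-first traversal from the starting actor
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups

actors = ["Connor", "Erica", "Micah", "Sandy", "Devon"]
# Assumption: ties inferred from the centrality table for fig. 2.
ties = [("Sandy", "Connor"), ("Sandy", "Erica"), ("Sandy", "Micah"),
        ("Sandy", "Devon"), ("Erica", "Connor"), ("Erica", "Micah")]

before = components(actors, ties)
after = components(actors, [t for t in ties if "Sandy" not in t])
print(len(before), len(after))
```

Under these assumed ties, the graph falls from one component to three, and Devon is left with no path to any other actor.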

While centrality quantifies connectedness of any one actor, how does one evaluate the connectedness of the graph as a whole?

Density provides a concrete measure of a graph's connectedness. It is the ratio of the dyadic ties (ties between two actors) present to the maximum possible number of dyadic ties:

$$D = \frac{L}{N(N-1)/2}$$

where L is the number of dyadic ties present, and $N(N-1)/2$ is the maximum possible number of dyadic ties for a graph of N nodes. (Scott 53)

In fig. 2, the graph density is calculated as follows:

$$D = \frac{6}{5(5-1)/2} = \frac{6}{10} = 0.6 \text{ or } 60\%$$

In fig. 3, the graph density is:

$$D = \frac{2}{5(5-1)/2} = \frac{2}{10} = 0.2 \text{ or } 20\%$$
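The two density values can be checked with a few lines of Python. The tie counts are derived from the centrality table (each tie is counted once at each of its two ends), and the function name density is an illustrative choice.

```python
def density(num_ties, num_actors):
    """Ties present divided by the maximum possible number of ties."""
    max_ties = num_actors * (num_actors - 1) / 2
    return num_ties / max_ties

degrees_fig2 = [2, 3, 2, 4, 1]       # centralities reported for fig. 2
ties_fig2 = sum(degrees_fig2) // 2   # each tie is counted at both of its ends
ties_fig3 = ties_fig2 - 4            # Sandy's four ties removed

density_fig2 = density(ties_fig2, 5)
density_fig3 = density(ties_fig3, 5)
print(density_fig2, density_fig3)
```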

The drop in density after the cutpoint's ties were severed illustrates the importance of locating central actors. In this example network, Sandy's connections accounted for 40 percentage points of the graph's density (the drop from 60% to 20%). As such, she occupies a pivotal position whereby her removal significantly compromises the connectivity of the graph.

Identifying Subgroups: Cliques, n-Cliques, and Cohesiveness

Fig. 4

It is of interest to social network analysts to identify sub-structures within a network.

Recall that a clique is (formally speaking) a complete subgraph, distinct from the network in that no other node in the network is connected to every node in the clique. (Scott 72) If every complete subgraph of three or more vertices is counted, including those nested within larger cliques, the above network contains 24 such structures.

Cliques in Fig. 4: {7, 11, 12, 13, 14}, {7, 12, 13, 14}, {11, 12, 13, 14}, {7, 11, 12, 13}, {7, 11, 13, 14}, {7, 11, 12, 14}, {1, 2, 3, 4}, {1, 2, 4}, {2, 3, 4}, {1, 2, 3}, {1, 3, 4}, {5, 6, 10}, {6, 7, 11}, {7, 8, 9}, {7, 13, 14}, {7, 11, 14}, {11, 12, 14}, {11, 12, 13}, {12, 13, 14}, {7, 12, 14}, {7, 12, 13}, {11, 13, 14}, {7, 11, 12}, {7, 11, 13}

In fig. 4, many vertices belong to more than one clique. This reflects the tendency for social groups to have overlapping members.

The majority of the cliques in fig. 4 are triadic, meaning that they are composed of three connected actors. Of these triangle-shaped cliques, 14 out of 17 (~82%) exist within larger cliques. In fig. 4, the two larger cliques are {7, 11, 12, 13, 14} and {1, 2, 3, 4}.
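The counts above can be reproduced by brute force. The sketch below assumes that the only ties in fig. 4 are those implied by the five largest groups in the list; the figure may contain further ties that belong to no clique, which would not change the counts. Variable names are illustrative.

```python
from itertools import combinations

# Assumption: fig. 4 is not reproduced here, so the edge set is rebuilt from
# the largest listed cliques only.
largest_groups = [(7, 11, 12, 13, 14), (1, 2, 3, 4), (5, 6, 10), (6, 7, 11), (7, 8, 9)]
edges = {frozenset(pair) for group in largest_groups for pair in combinations(group, 2)}
nodes = sorted({v for edge in edges for v in edge})

def is_complete(group):
    """True if every pair of vertices in the group is joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(group, 2))

# Every complete subgraph of three or more vertices.
complete_subgraphs = [group
                      for size in range(3, len(nodes) + 1)
                      for group in combinations(nodes, size)
                      if is_complete(group)]

triangles = [g for g in complete_subgraphs if len(g) == 3]
nested = [t for t in triangles
          if any(len(g) > 3 and set(t) <= set(g) for g in complete_subgraphs)]

print(len(complete_subgraphs), len(triangles), len(nested))
```

Exhaustive enumeration is practical here because the network has only 14 vertices; it would not scale to large networks.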

Fig. 5

The above figure shows the subgraph {7, 11, 12, 13, 14}, which is a complete graph of 5 vertices. 10 three-node cliques exist within this subgraph. The number of cliques within a clique (by definition a complete subgraph) can be calculated by

$$\binom{N}{J} = \frac{N!}{J!\,(N-J)!}$$

where J is the order of the sub-clique, and N is the order of the larger clique.

J = 3, cliques within: 10
Names of cliques: {7, 13, 14}, {7, 11, 14}, {11, 12, 14}, {11, 12, 13}, {12, 13, 14}, {7, 12, 14}, {7, 12, 13}, {11, 13, 14}, {7, 11, 12}, {7, 11, 13}

J = 4, cliques within: 5
Names of cliques: {7, 12, 13, 14}, {11, 12, 13, 14}, {7, 11, 12, 13}, {7, 11, 13, 14}, {7, 11, 12, 14}
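The counting rule can be cross-checked against a direct enumeration. This brief sketch uses the standard library's math.comb (available from Python 3.8) and itertools.combinations; the variable names are illustrative.

```python
from itertools import combinations
from math import comb

clique = (7, 11, 12, 13, 14)  # the complete subgraph of order N = 5
N = len(clique)

# Number of sub-cliques of order J, by formula and by direct enumeration.
counts = {J: comb(N, J) for J in (3, 4)}
enumerated = {J: len(list(combinations(clique, J))) for J in (3, 4)}

print(counts, enumerated)
```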

Cliques have implications in Ramsey's Theorem, which will be touched upon later.

When one deals with real-world sociological data, the complete graph requirement generally proves too stringent. Where one wishes to identify looser groupings, that is, connected subgraphs that aren't complete, the n-clique approach is preferable.

The n-clique approach is concerned with the path distance, n, between the members of any dyad in the subgraph. A 2-clique requires that every node be at most two units of path length away from every other node in the subgraph. A 3-clique requires that every node be at most three units of path length away from every other node in the subgraph. Notice that fig. 5 is a 1-clique, according to the n-clique definition. (Scott 74)

Fig. 6

In order to classify the above graph, a table has been generated which lists the geodesic distance (minimum distance) between all possible pairs of nodes.

Pair name: (labels not recoverable)
Geodesic distance between nodes, for the 15 pairs in table order: 1, 1, 1, 2, 2, 2, 2, 1, 2, 1, 2, 2, 1, 1, 1

Since in no pair does the geodesic distance exceed 2, fig. 6 shows a 2-clique. A second subgraph of order 6 is shown below.

Fig. 7

Pair name: (labels not recoverable)
Geodesic distance between nodes, for the 15 pairs in table order: 1, 3, 2, 2, 3, 2, 1, 1, 2, 3, 1, 2, 2, 3, 1

As shown in the table above, the maximum geodesic distance between nodes in fig. 7 is 3. So, fig. 7 is a 3-clique.
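The classification procedure (tabulate every geodesic distance, then take the maximum) can be sketched in Python with a breadth-first search. The three six-node graphs below are hypothetical stand-ins, since the edges of fig. 6 and fig. 7 are not available here, and each graph is treated as the whole subgroup being classified.

```python
from collections import deque
from itertools import combinations

def geodesic_distances(nodes, edges):
    """Map each pair (a, b) with a < b to its shortest path length (None if unreachable)."""
    adjacency = {v: set() for v in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    distances = {}
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search visits nodes in order of distance
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        for target in nodes:
            if source < target:
                distances[(source, target)] = dist.get(target)
    return distances

def n_value(nodes, edges):
    """Smallest n for which the graph qualifies as an n-clique (None if disconnected)."""
    values = geodesic_distances(nodes, edges).values()
    return None if None in values else max(values)

nodes = [1, 2, 3, 4, 5, 6]
complete = list(combinations(nodes, 2))            # every pair tied
star = [(1, v) for v in nodes[1:]]                 # node 1 tied to all others
tree = [(1, 2), (2, 3), (3, 4), (2, 5), (3, 6)]    # hypothetical looser grouping

print(n_value(nodes, complete), n_value(nodes, star), n_value(nodes, tree))
```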

fig. 8

Fig. 8 shows three graphs of order six: from left to right, a 1-clique, a 2-clique, and a 3-clique. As the n value increases, the graph becomes less tightly connected. Fig. 8 is meant to illustrate the property of cohesion within subgroups, which is relatively intuitive with the graph representation. An n-clique that has a relatively high n is less cohesive than an n-clique with a relatively low n. Note that a cohesive graph will tend to be denser than a less cohesive graph of the same order. (Scott 72)

Identifying subgroups in graphs (cliques and n-cliques) is of interest to social scientists who wish to understand the ways in which smaller structures form within a larger network. Furthermore, classifying these smaller groups according to their cohesiveness speaks to the degree to which actors trust each other, and to the amount of social capital, or collective value, present in the group.

Structural Equivalence

While cliques naturally extend from ideas of cooperation within groups, structural equivalence, generally speaking, refers to competition between actors in a network. Actors that are structurally equivalent have identical sets of connections to other nodes. In precise terms, nodes i and j are structurally equivalent if, for every other node k in the network, i has a tie to k if and only if j has a tie to k, and k has a tie to i if and only if k has a tie to j.


fig. 9

While the above example is perhaps facile, it serves to illustrate an elementary case of structural equivalence. The digraph represents a school classroom, where every student has identical connections with their teacher. The students are structurally equivalent to one another because all of the student-teacher relationships are identical. The structural equivalence present in the above graph could indicate competition among students for the limited resources and attention of the teacher.

But as with cliques, structural equivalence is better understood as a phenomenon on a spectrum, as opposed to one that is more rigidly defined. In a real world situation, two actors are more likely to be approximately structurally equivalent than they are to be precisely structurally equivalent, as the formal definition prescribes. (Scott 76)


fig. 10

Sociomatrix Representation of Fig. 10

     1  2  3  4  5  6  7  8
1    0  0  0  0  0  0  0  0
2    0  0  0  0  0  0  0  0
3    0  0  0  1  0  0  0  1
4    0  1  0  0  1  0  0  0
5    0  0  0  0  0  0  0  0
6    0  0  0  0  0  0  0  1
7    0  0  0  0  0  0  0  0
8    1  0  0  0  1  0  1  0

In a digraph, the degree to which two nodes are structurally equivalent can be measured using a Euclidean distance formula:

$$d_{ij} = \sqrt{\sum_{k \neq i,j} \left[ (x_{ik} - x_{jk})^2 + (x_{ki} - x_{kj})^2 \right]}$$

Here k iterates through all possible vertices, excluding those being measured for structural equivalence. The first subscript refers to a row in the sociomatrix, and the second subscript refers to a column. (Scott 77)

The structural equivalence of two points and $d_{ij}$ are inversely related. Two nodes that are structurally equivalent yield a $d_{ij}$ value of 0, and as $d_{ij}$ increases, structural equivalence decreases.

For the above graph, the structural equivalence of nodes 4 and 8 (highlighted in orange) is calculated as follows:

The distance formula measures the dissimilarities in the actors' structural relations. In the above case, the formula yielded a distance greater than zero. Therefore, nodes 4 and 8 on fig. 10 are not structurally equivalent. That being said, other nodes on the graph are more structurally equivalent than nodes 4 and 8, without being precisely structurally equivalent.

To illustrate this, the structural equivalence of nodes 1 and 3 is calculated below:
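Both calculations can be reproduced with a short Python sketch. It assumes the Euclidean distance takes its usual form for digraphs (for every other node k, the squared differences between the two nodes' row entries and between their column entries are summed, and the square root is taken), and it reads the sociomatrix of fig. 10 row by row as transcribed. The function name distance is illustrative.

```python
from math import sqrt

# Sociomatrix of fig. 10 as transcribed: X[i][j] is 1 for a tie from
# node i + 1 to node j + 1 (rows are assumed to be senders).
X = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0, 1, 0],
]

def distance(i, j):
    """Euclidean distance between the tie profiles of nodes i and j (numbered from 1)."""
    a, b = i - 1, j - 1
    total = 0
    for k in range(len(X)):
        if k in (a, b):
            continue  # skip the two nodes being compared
        total += (X[a][k] - X[b][k]) ** 2  # ties sent to k (row entries)
        total += (X[k][a] - X[k][b]) ** 2  # ties received from k (column entries)
    return sqrt(total)

print(distance(4, 8), distance(1, 3))
```

With the matrix as transcribed, nodes 4 and 8 are at distance 2, and nodes 1 and 3 are at distance √3 ≈ 1.73, so the second pair is the closer of the two to structural equivalence. Because the formula treats rows and columns symmetrically, the result would be the same if the rows were read as receivers rather than senders.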

Ramsey's Theorem

Since social network graphs are analogous to graphs in graph theory, many ideas from graph theory have interesting sociological implications. One such idea was proposed by the mathematician F.P. Ramsey. Broadly, Ramsey's Theorem makes statements about the existence of complete subgraphs within a graph of a given size. (Ramsey Theory)

Ramsey's Theorem states:

For each pair of positive integers k and l there exists an integer R(k, l) such that any graph with R(k, l) nodes contains a clique with at least k nodes or an independent set with at least l nodes. [6] Stated differently, R(k, l), the Ramsey number, is the minimum number of vertices a graph must have in order to be guaranteed to contain either a clique of order k, or l mutually unconnected vertices.

The party problem is Ramsey's Theorem applied to a common social situation. The problem asks for the minimum number of guests, R(k, l), that must be invited to a party so that either at least k guests all know each other, or at least l guests are all strangers to each other. In graph theory terms, to satisfy the party problem, either a clique of order k, or an independent set of l vertices, must exist in any graph of order R(k, l). (Ramsey's Theorem)

While a proof of Ramsey's Theorem is beyond the scope of this paper, the case of R(3, 3) = 6 will be shown.

In the following graph, let blue edges represent friendships, and black edges represent the relation between strangers.

fig. 11

In R(3, 3), both k and l are 3, meaning that either a clique of order 3 or an independent set of 3 vertices must exist. On fig. 11, therefore, the presence of a black or blue triadic structure is consistent with Ramsey's Theorem and the result R(3, 3) = 6. Indeed, by inspection, 7 black triangles exist: {1, 5, 6}, {2, 5, 6}, {1, 4, 5}, {2, 3, 5}, {3, 4, 5}, {4, 5, 6}, and {1, 4, 6}. Note that no blue triangles need appear on the graph, as Ramsey's Theorem states only that either a clique or an independent set must exist, with the order prescribed by k and l respectively.

In terms of social network analysis, Ramsey's Theorem has interesting sociological implications. As shown above, in any group of six people, at least one group of three will be mutual acquaintances, or at least one group of three will be mutually unacquainted. This is significant because in either case, there is organization within the graph. Ramsey numbers allow us to make specific predictions about the substructures of a graph of a given order.
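Because a group of six has only 15 pairs, the claim R(3, 3) = 6 can be checked exhaustively. The sketch below tries every way of labelling each pair as friends (1) or strangers (0) and looks for three people whose pairs all carry the same label. It is a brute-force check rather than a proof technique from the cited sources, and the function names are illustrative.

```python
from itertools import combinations, product

def has_uniform_triple(n, label):
    """True if some three guests are all mutual friends or all mutual strangers."""
    return any(label[(a, b)] == label[(a, c)] == label[(b, c)]
               for a, b, c in combinations(range(n), 3))

def every_party_has_uniform_triple(n):
    """Check all 2 ** (number of pairs) friend/stranger labellings of n guests."""
    pairs = list(combinations(range(n), 2))
    return all(has_uniform_triple(n, dict(zip(pairs, labels)))
               for labels in product((0, 1), repeat=len(pairs)))

print(every_party_has_uniform_triple(6))  # six guests always suffice
print(every_party_has_uniform_triple(5))  # five guests do not
```

The two results together give the Ramsey number: six guests always produce such a triple, while five guests can be arranged (for example, with friendships forming a five-cycle) so that no such triple exists.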

Although the exact value of many Ramsey numbers is unknown, those that have been found by mathematicians can provide insight into the dynamics of a correspondingly sized group, represented in graph form. As proven by Greenwood and Gleason in 1955, for example, any graph of 18 actors has at least one clique of order 4, or one independent set of 4 vertices. (Greenwood 7) While previously described techniques for social network analysis have relied on information about specific instances of graphs, Ramsey's Theorem makes generalizations about graphs of a certain size. One gains insights from known Ramsey numbers by applying deductive reasoning, moving from the general to the specific.

A Case Study

To summarize the previous sections, and to demonstrate how one might apply a combination of the aforementioned techniques to a social network analysis, a case study is presented.

fig. 12

fig. 13


Levels of Analysis:

Fig. 12 shows a network at the egocentric level of analysis. In this case, the ego is the actor named Nesbit, highlighted in orange. This study considers Nesbit and his immediate circle of friends.

Cliques and Ramsey's Theorem:

By virtue of the fact that fig. 12 is a graph of order 6, Ramsey's Theorem can be used to make predictions about its relational content. As explained in the previous section about Ramsey's Theorem, 6 is the Ramsey number R(3, 3). This means that in any graph of order 6, there will be either a clique of order 3 or an independent set of 3 vertices. Currently, the network abides by these rules. Fig. 12 shows cliques of order 3, and fig. 13, which superimposes the stranger relations on the previous graph, shows independent sets of order 3 as well. The insight to be gained from the application of Ramsey's Theorem to this network is that at any point in the future, either 3 actors out of the set of Nesbit and his friends will be mutual friends, or 3 actors out of that set will be mutual strangers.
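The cliques and independent sets of order 3 can be listed by checking every triple of actors. The ties used below are an assumption: fig. 12 is not reproduced here, so they are inferred from the centralities reported for this case study (Nesbit 5, Kim 1, Drake 2, Lonnie 4, Norman 2, Kaitlyn 2), which appear to determine the graph uniquely.

```python
from itertools import combinations

actors = ["Nesbit", "Kim", "Drake", "Lonnie", "Norman", "Kaitlyn"]

# Assumption: friendships inferred from the case study's centrality values.
friendships = {frozenset(pair) for pair in [
    ("Nesbit", "Kim"), ("Nesbit", "Drake"), ("Nesbit", "Lonnie"),
    ("Nesbit", "Norman"), ("Nesbit", "Kaitlyn"),
    ("Lonnie", "Drake"), ("Lonnie", "Norman"), ("Lonnie", "Kaitlyn"),
]}

def pairs_tied(triple):
    """For each of the three pairs in the triple, whether a friendship exists."""
    return [frozenset(pair) in friendships for pair in combinations(triple, 2)]

triples = list(combinations(actors, 3))
cliques_of_3 = [t for t in triples if all(pairs_tied(t))]
independent_sets_of_3 = [t for t in triples if not any(pairs_tied(t))]

print(cliques_of_3)
print(independent_sets_of_3)
```

Under these assumed ties, every clique of order 3 contains both Nesbit and Lonnie, and every independent set of order 3 is drawn from Kim, Drake, Norman, and Kaitlyn.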

Centrality

In that the graph shows Nesbit's egocentric network, it is not surprising that Nesbit is the most central actor in the network. But note that the ego is not always the most central actor; an egocentric analysis of a clique would yield actors with identical centralities.

Actor      Centrality
Nesbit     5
Kim        1
Drake      2
Lonnie     4
Norman     2
Kaitlyn    2

Density

$$\text{Density} = \frac{L}{N(N-1)/2} = \frac{8}{6(6-1)/2} = \frac{8}{15} \approx 0.5333 \text{ or about } 53\%$$

Nesbit's egocentric network is just slightly more than 50% dense, indicating that there is potential for the network to become significantly more connected or cohesive. By comparing this network's density to that of others, one might judge the relative health of the network of Nesbit and his cohorts.

Structural Equivalence:

Let the subscript i represent the actor Drake, and the subscript j represent the actor Norman.

Drake and Norman are structurally equivalent, since the Euclidean distance between their sets of connections is 0. We might conjecture that Drake and Norman actively compete for the attention of their mutual connections, namely Lonnie, and the ego Nesbit.
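The case-study figures for centrality, density, and structural equivalence can be recomputed together in one Python sketch. As before, the ties are an assumption inferred from the centrality table rather than read from fig. 12, and the Euclidean distance is taken in its usual row-and-column form; all names are illustrative.

```python
from math import sqrt

actors = ["Nesbit", "Kim", "Drake", "Lonnie", "Norman", "Kaitlyn"]

# Assumption: ties inferred from the case study's centrality table.
ties = [("Nesbit", other) for other in actors[1:]] + [
    ("Lonnie", "Drake"), ("Lonnie", "Norman"), ("Lonnie", "Kaitlyn"),
]

index = {name: i for i, name in enumerate(actors)}
n = len(actors)

# Symmetric sociomatrix, since friendship is treated as nondirected.
X = [[0] * n for _ in range(n)]
for a, b in ties:
    X[index[a]][index[b]] = 1
    X[index[b]][index[a]] = 1

centrality = {name: sum(X[index[name]]) for name in actors}
density = len(ties) / (n * (n - 1) / 2)

def distance(a, b):
    """Euclidean distance between the tie profiles of actors a and b."""
    i, j = index[a], index[b]
    total = sum((X[i][k] - X[j][k]) ** 2 + (X[k][i] - X[k][j]) ** 2
                for k in range(n) if k not in (i, j))
    return sqrt(total)

print(centrality)
print(round(density, 4))
print(distance("Drake", "Norman"))
```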

Conclusion
Graph theory's emphasis on relationships is highly valuable in the context of sociology. Social network analysis emphasizes the relationships between actors, as represented by edges on a graph. In a given research task, one chooses to operate at a certain level of analysis, which may entail a focus on the egocentric level, or on the dyadic level.

In this essay, several analysis techniques were examined. Two of these measures, centrality and structural equivalence, focused on the connections between individual actors. Centrality refers to the prestige of an individual, while structural equivalence alludes to competition for resources. Central actors that have the potential to break a graph into several subgraphs are known as cutpoints. Removal of a cutpoint reduces the graph's average connections, hence making it less dense overall.

Other techniques attempted to locate substructures within a graph. Cliques identified those subgroups that were completely connected, or complete. N-cliques relaxed the requirement somewhat, prescribing looser criteria for grouping. Although cliques have mathematical definitions, they are also social structures, as understood by the conversational usage of the word "clique". Ramsey's Theorem made statements regarding substructures as well: it predicted the size of a graph that must contain either a clique or an independent set of a given order.

In applying the above techniques to an actual network study, one gains deeper insight into the subtleties of social networks. One begins to form a coherent understanding of networks in terms of the connectedness of actors, and dominant groupings.

Works Cited

Fox, Jacob. "Ramsey Theory." N.p., n.d. Web.

"Graph Theory Glossary." Graph Theory Glossary. N.p., n.d. Web. 1 Jan. 2013.

Greenwood, R. E. and Gleason, A. M. "Combinatorial Relations and Chromatic Graphs." Canad. J. Math. 7, 1955.

Kilduff, Martin, and Wenpin Tsai. Social Networks and Organizations. London: SAGE, 2003. Print.

"Konigsberg Bridge Problem." 2010. 3 Jan. 2013 <http://www.contracosta.edu/legacycontent/math/konig.htm>

"Notes on the History of Social Network Analysis." History of Social Network Analysis. N.p., n.d. Web. 10 Jan. 2013.

"Ramsey's Theorem." Wolfram MathWorld. N.p., n.d. Web.

Reis, Pinheiro Carlos Andre. Social Network Analysis in Telecommunications. Hoboken, NJ: Wiley, 2011. Print.

Scott, John. "History of Social Network Analysis : What Is Social Network Analysis? : Bloomsbury Academic." History of Social Network Analysis : What Is Social Network Analysis? : Bloomsbury Academic. N.p., n.d. Web. 1 Jan. 2013.

Scott, John. Social network analysis: A handbook. Sage Publications Limited, 2000.

Weisstein, Eric W. "Ramsey Number." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/RamseyNumber.html