Social Network Theory
You are probably Facebook friends with your parents, even your grandparents. They “get” social networking. But the concept of social networks is considerably older than Mark Zuckerberg and goes back to before even your oldest relatives were in diapers. Social network analysis grew out of graph theory, a relatively obscure branch of 19th-century mathematics that sought to understand the math of relationships. Beginning in the 1930s and continuing through the 1950s, psychologists, sociologists, and other social scientists used these mathematical developments to stop talking abstractly about social “structure” and start collecting actual data about structures in which individual people (nodes) are connected to each other by friendship, communication, or membership (links). From these early developments, social network analysis has grown into an entire field of scholarship that includes established theories, common methods, and long-standing unanswered questions. But social networks are far from abstract ivory-tower musings: blockbuster technology companies like Facebook and Google are fundamentally designed around social network theories and algorithms.
Social Network Perspective
Social network data contain records of who is connected to whom. Crucially, the patterns of connections vary enormously from one individual to the next. Combining all these records creates a social network of all the nodes and all their relationships. The resulting network can be visualized to illustrate patterns, or analyzed using algorithms to identify individuals who occupy important positions, reveal groups, or understand how the network changes over time. Compared to traditional analytics, social network analysis requires a shift in perspective, new types of data, and new approaches to analysis.
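To make the idea of combining records concrete, here is a minimal sketch in Python. The names and friendship pairs are invented for illustration, and the `build_network` helper is a hypothetical function written for this example, not part of any particular library. It assumes friendships are mutual, so each record links both people to each other.

```python
from collections import defaultdict

# Hypothetical friendship records: each pair is one observed relationship (link).
records = [
    ("Ana", "Ben"),
    ("Ana", "Cam"),
    ("Ben", "Cam"),
    ("Cam", "Dee"),
    ("Eli", "Fay"),
]

def build_network(pairs):
    """Combine pairwise records into one undirected network.

    The result maps each person (node) to the set of people they are
    directly connected to.
    """
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)  # a is connected to b ...
        network[b].add(a)  # ... and, because friendship is mutual, b to a
    return dict(network)

network = build_network(records)
for person in sorted(network):
    print(person, "->", sorted(network[person]))
```

Even in this tiny example the patterns differ from person to person: Cam is linked to three people, Dee to only one, and Eli and Fay are linked only to each other.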
A social network perspective requires abandoning some of the core assumptions in traditional analytics and starting from a different set of premises:
Data about relationships between individuals are often more important than data about the individuals themselves.
Individual nodes can be directly connected to, indirectly linked to, or completely disconnected from other nodes.
Some nodes have considerably more connections than other nodes.
Some parts of the network clump together into tightly knit clusters.
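The last three observations in the list above can each be measured. The following sketch, using an invented six-person network, is one plausible way to do so: it counts connections per node (degree), finds how many links separate two nodes using breadth-first search, and computes a local clustering coefficient as a simple measure of how tightly knit a node's neighborhood is. The function names are hypothetical and chosen for this illustration.

```python
from collections import deque

# Hypothetical undirected network: each node maps to its direct connections.
network = {
    "Ana": {"Ben", "Cam"},
    "Ben": {"Ana", "Cam"},
    "Cam": {"Ana", "Ben", "Dee"},
    "Dee": {"Cam"},
    "Eli": {"Fay"},
    "Fay": {"Eli"},
}

def degree(net, node):
    """Number of direct connections a node has."""
    return len(net[node])

def distance(net, source, target):
    """Links on the shortest path from source to target (breadth-first search).

    Returns 1 for a direct connection, 2 or more for an indirect one,
    and None when the two nodes are not connected at all.
    """
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return hops
        for neighbor in net[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None

def local_clustering(net, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    neighbors = sorted(net[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    linked = sum(
        1
        for i, a in enumerate(neighbors)
        for b in neighbors[i + 1:]
        if b in net[a]
    )
    return linked / (k * (k - 1) / 2)

print(distance(network, "Ana", "Ben"))   # 1: directly connected
print(distance(network, "Ana", "Dee"))   # 2: indirectly linked through Cam
print(distance(network, "Ana", "Eli"))   # None: no path between them
print(degree(network, "Cam"), degree(network, "Dee"))  # 3 1
print(local_clustering(network, "Ana"))  # 1.0: Ana's two friends know each other
```

Ana, Ben, and Cam form a small cluster in which everyone knows everyone, while Eli and Fay sit in a separate component with no path to the rest of the network.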
Traditional analytical approaches treat each individual user as an atom, separate and independent: my height doesn’t depend on the height of my close friends. In some contexts this individualistic assumption is appropriate. But in the vast majority of applications, it’s incomplete or just wrong. A social network perspective says my connections to others influence my own behavior: research has suggested that my weight is actually influenced by the weight of my close friends. Ignoring these relationships obscures important processes by which information and influence travel between individuals.
The data collected for social network analysis must reflect observations about relationships. Who is friends with whom? Who teams up with whom? Who buys what? Who plays where? These types of relationships are pervasive in many types of data, but the relational implications are often overlooked. Fortunately, tools for social network analysis are widely available even if they are not widely adopted. Algorithms modeled on Google’s PageRank, which identifies relevant webpages, can also be applied to social networks to identify influential individuals. Algorithms modeled on Facebook’s EdgeRank, which ranks content in the News Feed, can also be applied to other social network data to recommend products to new customers. The key insight is that the value of these data comes from the patterns of relationships between people, not the properties of the people themselves. Most analytics stop at the individual and never examine the web of connections that ties them together.
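As a rough illustration of how a PageRank-style algorithm can be pointed at people instead of webpages, the sketch below runs a simplified power iteration over an invented “who follows whom” network. It is a textbook-style approximation, not Google's production algorithm. The damping value of 0.85, the iteration count, the names, and the `pagerank` function itself are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each node to the list of nodes it points to, for example
    the people a person follows. Returns a dict of scores that sum to 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        # Each node splits its current score among the nodes it points to.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical directed network: each person maps to the people they follow.
follows = {
    "Ana": ["Cam"],
    "Ben": ["Cam"],
    "Cam": ["Dee"],
    "Dee": ["Cam"],
    "Eli": ["Ana"],
}

scores = pagerank(follows)
for person in sorted(scores, key=scores.get, reverse=True):
    print(person, round(scores[person], 3))
```

Cam comes out on top because three people follow her. Dee ranks second despite having only one follower, because that follower is the highly ranked Cam. Neither result depends on any attribute of Cam or Dee themselves, only on the pattern of connections around them.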