Social Network Analysis (week three)

Different notions of centrality

Indegree: the number of incoming edges to a specific node
Outdegree: the number of outgoing edges from a specific node
Betweenness: the number of shortest paths between other pairs of nodes that pass through a node
Closeness: the number of hops the node you're observing needs, on average, to reach the other nodes in the network

Calculating degree centrality

For nodes within **undirected networks**
Degree (in an undirected network, indegree and outdegree are the same number)
The nodes with more connections are more central
Normalization: divide degree by the max possible (N-1).
Normalization is not done often; however, software typically returns normalized values. To get the raw number, you'll need to multiply the normalized value by (N-1).
Centralization: characterizes the entire network; it captures the inequality in the distribution of centrality across nodes.
Centralization can be measured with the Gini coefficient (varies from 0 to 1) or the standard deviation. You can also use Freeman's general formula, which sums each node's shortfall from the most central node and divides by the largest such sum possible.
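To make the normalization and centralization steps concrete, here is a minimal sketch in plain Python. It assumes the degree form of Freeman's formula as it is usually stated (sum of each node's gap to the maximum degree, divided by (N-1)(N-2)); the function names and the two toy networks are my own, not from the course.

```python
def degree_centrality(adj):
    """adj maps each node to the set of its neighbours (undirected network)."""
    n = len(adj)
    raw = {v: len(nbrs) for v, nbrs in adj.items()}
    # Normalize by the maximum possible degree, N-1.
    normalized = {v: d / (n - 1) for v, d in raw.items()}
    return raw, normalized


def freeman_degree_centralization(adj):
    """Assumed form: sum of (max degree - degree) over nodes, over (N-1)(N-2)."""
    n = len(adj)
    raw, _ = degree_centrality(adj)
    d_max = max(raw.values())
    return sum(d_max - d for d in raw.values()) / ((n - 1) * (n - 2))


# A star is as unequal as a network can be; a ring is perfectly equal.
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}
ring = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}

raw, normalized = degree_centrality(star)
print(raw["hub"], normalized["hub"], normalized["a"])
print(freeman_degree_centralization(star), freeman_degree_centralization(ring))
```

Multiplying a normalized value by (N-1) recovers the raw degree, which is the conversion mentioned above.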

Betweenness

Captures brokerage
There can be multiple shortest paths between the same pair of nodes, with different nodes lying on each. To calculate the betweenness of a specific node, go through every pair of other nodes and add up the fraction of that pair's shortest paths that pass through the node (if a pair has two shortest paths, each running through a different intermediate node, each of those nodes receives 0.5 points for that pair).
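The fractional counting is easier to see in code. This is a rough sketch for small undirected, unweighted networks, not an efficient implementation: it counts shortest paths with a breadth-first search from every node and then credits each intermediate node with its share. The helper names are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Return (dist, sigma): hop distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time we reach w
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u is a predecessor of w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(adj):
    """Raw (unnormalized) betweenness for an undirected, unweighted network."""
    info = {s: bfs_counts(adj, s) for s in adj}
    score = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):    # each unordered pair once
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue                     # s and t are not connected
        for v in adj:
            if v == s or v == t:
                continue
            dist_v, sigma_v = info[v]
            if v in dist_s and t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                # Fraction of the s-t shortest paths that run through v.
                score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return score


# In a square, opposite corners are joined by two shortest paths,
# so each intermediate corner earns 0.5 for that pair.
square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
print(betweenness(square))
```

Every corner of the square ends up with 0.5: it sits on one of the two shortest paths for exactly one pair of opposite corners.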

Closeness

Captures the number of hops (how many nodes it needs to pass through) a node needs to take to reach the other nodes in the network
It’s based on the length of the average shortest path between a node and all other nodes in a network.
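As a small illustration, closeness can be sketched as the inverse of the average shortest-path length from a node to everyone else. This assumes a connected, undirected, unweighted network; the function names and the five-node path are made up for the example.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: number of hops from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness(adj, node):
    """(N-1) divided by the sum of distances, i.e. 1 / average distance.

    Assumes every other node is reachable from `node`.
    """
    dist = hop_distances(adj, node)
    total = sum(d for v, d in dist.items() if v != node)
    return (len(adj) - 1) / total


# A path a-b-c-d-e: the middle node is closest to everyone on average.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
print(closeness(path, "c"), closeness(path, "a"))
```

The middle node c is 6 hops from the other four nodes in total, while the end node a is 10 hops away in total, so c scores higher.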
Another measure of centrality:
Eigenvector centrality – you are as important as your neighbors, and vice versa
Bonacich formula
A small beta results in high attenuation, meaning that only your immediate friends / connections matter
A large beta results in low attenuation, meaning that the global network structure matters (the larger your extended network of friends of friends of friends, the better)
When beta is 0, your result is the degree centrality
If beta is positive, you are as important as your neighbors (the central node will have high centrality, as it benefits from having important neighbors)
If beta is negative, nodes have higher centrality when they’re connected to less central nodes (the central node will have low centrality, as it is harmed by having important neighbors)
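To see how the sign of beta changes the ranking, here is a hedged sketch of the Bonacich measure computed by simple iteration. It assumes the common form c = A·1 + beta·A·c (each node's score is its degree plus beta times its neighbors' scores), leaves out the overall scaling constant, and only converges when the magnitude of beta is below 1 over the largest eigenvalue of the adjacency matrix. The function name and test network are mine.

```python
def bonacich(adj, beta, iterations=200):
    """Iterate c <- degree + beta * (sum of neighbours' scores).

    Unscaled sketch; assumes |beta| < 1 / (largest adjacency eigenvalue).
    """
    degree = {v: float(len(adj[v])) for v in adj}
    c = dict(degree)
    for _ in range(iterations):
        c = {v: degree[v] + beta * sum(c[w] for w in adj[v]) for v in adj}
    return c


# Path a-b-c-d-e: b, c and d all have degree 2, but c sits between two
# well-connected nodes while b has a weakly connected endpoint as a neighbour.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}

zero = bonacich(path, 0.0)       # reduces to degree
positive = bonacich(path, 0.3)   # c benefits from its central neighbours
negative = bonacich(path, -0.3)  # c is penalized for them, b comes out ahead

print(zero["b"], zero["c"])
print(positive["c"] > positive["b"], negative["c"] < negative["b"])
```

With beta at 0 every node simply gets its degree; with a positive beta the middle node c outranks b, and with a negative beta the order flips.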
For nodes within directed networks

Betweenness

When normalizing, we have (N-1)(N-2) instead of (N-1)(N-2)/2 because there end up being twice as many ordered pairs as unordered pairs. Getting from J to K is likely different from getting from K to J and this needs to be accounted for.
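A quick way to convince yourself of the two denominators is to count the pairs directly. This sketch (with a hypothetical helper name) counts, for one focal node, the unordered and ordered pairs of other nodes whose shortest paths it could possibly sit on.

```python
from itertools import combinations, permutations


def pair_counts(nodes, focal):
    """Return (unordered, ordered) counts of pairs of nodes other than `focal`."""
    others = [v for v in nodes if v != focal]
    unordered = len(list(combinations(others, 2)))  # {J, K}: undirected case
    ordered = len(list(permutations(others, 2)))    # (J, K) and (K, J): directed case
    return unordered, ordered


nodes = ["a", "b", "c", "d", "e"]   # N = 5
unordered, ordered = pair_counts(nodes, "a")
n = len(nodes)
print(unordered, (n - 1) * (n - 2) // 2)
print(ordered, (n - 1) * (n - 2))
```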

Closeness

Direction matters, meaning you must calculate in-closeness or out-closeness (similar to in-degree and out-degree directionality)
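Here is a minimal sketch of the two directed variants. It assumes a strongly connected, unweighted network stored as a dictionary from each node to the set of nodes it points to; in-closeness is computed by running the same search on the reversed edges. All names are illustrative.

```python
from collections import deque


def reverse(succ):
    """Flip every edge: u -> w becomes w -> u."""
    pred = {v: set() for v in succ}
    for u, targets in succ.items():
        for w in targets:
            pred[w].add(u)
    return pred


def closeness_from(succ, node):
    """(N-1) / sum of hop distances when following edges out of `node`."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for w in succ[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return (len(succ) - 1) / sum(dist.values())


def out_closeness(succ, node):
    return closeness_from(succ, node)            # how quickly node reaches others


def in_closeness(succ, node):
    return closeness_from(reverse(succ), node)   # how quickly others reach node


# Edges: a->b, a->c, b->c, c->a
g = {"a": {"b", "c"}, "b": {"c"}, "c": {"a"}}
print(out_closeness(g, "a"), in_closeness(g, "a"))
```

Node a reaches both other nodes in one hop, but b needs two hops to get back to a, so a's out-closeness is higher than its in-closeness.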
Eigenvector centrality
This was applied by Google's founders to search ranking, as PageRank.
They created an algorithm that lets a random surfer occasionally jump to a random page (teleportation), so that pages not reached by following links can still be found and gain rank.

Note: As the teleportation probability increases, the relative PageRank scores of the nodes begin to equalize. The random user (in this case, a Google search user) makes more random jumps from node to node (webpage to webpage) instead of following edges. Following edges alone could leave the user stuck in an endless loop (the original limitation of eigenvector centrality); with frequent random jumps, the user instead ends up visiting all nodes with nearly equal probability.
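The equalizing effect can be checked with a small power-iteration sketch. This is a simplified PageRank under my own assumptions (uniform teleportation, dangling nodes treated as linking to every page, a fixed number of iterations), not Google's actual implementation; the function name, parameter names and four-page web are invented.

```python
def pagerank(succ, teleport=0.15, iterations=100):
    """succ maps each page to the set of pages it links to."""
    nodes = list(succ)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # With probability `teleport` the surfer jumps to a uniformly random page.
        new = {v: teleport / n for v in nodes}
        for u in nodes:
            targets = succ[u] or nodes   # dangling page: treat as linking everywhere
            share = (1 - teleport) * rank[u] / len(targets)
            for w in targets:
                new[w] += share          # otherwise follow one of u's links
        rank = new
    return rank


def spread(rank):
    """Gap between the highest and lowest score."""
    return max(rank.values()) - min(rank.values())


# a -> b -> c -> a form a loop; d links into it but nothing links to d.
web = {"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"c"}}

for t in (0.15, 0.5, 0.85, 1.0):
    print(t, round(spread(pagerank(web, teleport=t)), 4))
```

The gap between the best and worst page shrinks as the teleportation probability grows, and at a probability of 1 every page gets exactly 1/4.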

Above are notes I've taken to keep track of the material covered in the Social Network Analysis course I am taking on Coursera. Notes are derived from week-based video lectures and supplemental information I've found online to help clarify certain concepts. If you are taking this course, most of the information above will be found in the Week 3 material. If you are not taking the course, I hope you find the information above helpful in your search.