Different notions of centrality

  • Indegree: the number of incoming edges to a specific node
  • Outdegree: the number of outgoing edges from a specific node
  • Betweenness: the number of shortest paths between other nodes that pass through a node
  • Closeness: how few steps the node you’re observing needs to take to reach the other nodes in the network

Calculating degree centrality

  • For nodes within **undirected networks**
  • Indegree and outdegree are the same here, since edges have no direction, so each node has a single degree
  • The nodes with more connections are more central
  • Normalization: divide degree by the max possible (N-1).
  • Normalization is not done often; however, software typically returns normalized values. To get the raw number, multiply the normalized value by (N-1).
  • Centralization characterizes the entire network: it captures the inequality in the distribution of centrality.
  • Centralization can be measured with the Gini coefficient (which ranges from 0 to 1), the standard deviation, or Freeman’s General Formula.
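To make the normalization and centralization bullets concrete, here is a minimal sketch in plain Python. The toy star network and the function names are my own assumptions, and the centralization function uses the degree version of Freeman's formula: the sum of differences from the most central node, divided by (N-1)(N-2), the largest value that sum can take (reached by a star).

```python
# Hypothetical undirected toy network: a star with hub "a".
graph = {
    "a": {"b", "c", "d", "e"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a"},
    "e": {"a"},
}

def degree_centrality(graph, normalized=True):
    """Degree of each node, optionally divided by the max possible (N-1)."""
    n = len(graph)
    return {
        node: (len(nbrs) / (n - 1) if normalized else len(nbrs))
        for node, nbrs in graph.items()
    }

def freeman_degree_centralization(graph):
    """Freeman-style degree centralization (assumes N >= 3, no self-loops)."""
    n = len(graph)
    degrees = [len(nbrs) for nbrs in graph.values()]
    max_degree = max(degrees)
    # Numerator: how far every node falls short of the most central node.
    # Denominator: the largest possible shortfall, which a star network reaches.
    return sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))

normalized = degree_centrality(graph)
raw = degree_centrality(graph, normalized=False)
centralization = freeman_degree_centralization(graph)
```

In this star, the hub has normalized degree 1.0 and every leaf 0.25; multiplying by (N-1) = 4 recovers the raw degrees, and the centralization comes out at its maximum of 1.0 because one node holds all the connections.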

Betweenness

  • Captures brokerage
  • There can be multiple shortest paths between a pair of nodes, with different nodes lying on each. To calculate the betweenness of a specific node, take every pair of other nodes, give the node the fraction of that pair's shortest paths that pass through it, and sum those fractions (if a pair has two shortest paths that run through two different nodes, each node receives .5 points for that pair).
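The "split the credit" rule above can be sketched in code. This is not from the course; it is a small plain-Python version of the standard Brandes approach for unweighted, undirected networks, with a hypothetical four-node ring as the test case.

```python
from collections import deque

def betweenness(graph):
    """Raw betweenness for an unweighted, undirected adjacency dict."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and remembering which nodes precede each node on those paths.
        dist = {s: 0}
        sigma = {v: 0 for v in graph}
        sigma[s] = 1
        preds = {v: [] for v in graph}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, handing each predecessor
        # its fractional share of the shortest paths.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Undirected: every pair was counted once from each end, so halve.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical ring a-b-c-d-a: opposite corners have two shortest paths.
ring = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
ring_scores = betweenness(ring)
```

In the ring, the pair (a, c) has two shortest paths, one through b and one through d, so each gets .5; the same happens for (b, d), which leaves every node with a betweenness of .5.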

Closeness

  • Captures the number of hops (how many nodes it needs to pass) a node needs to take to reach the other nodes in the network
  • It’s based on the length of the average shortest path between a node and all other nodes in a network.
  • Another measure of centrality:
  • Eigenvector centrality – you are as important as your neighbors, and vice versa
  • Bonacich formula
  • A small beta results in high attenuation, meaning that only your immediate friends / connections matter
  • A high beta results in low attenuation, meaning that a global network structure matters (the larger your network – friends of friends of… - the better)
  • When beta is 0, your result is the degree centrality
  • If beta is positive, you are as important as your neighbors (the central node will have high centrality, as it benefits from having important neighbors)
  • If beta is negative, nodes have higher centrality when they’re connected to less central nodes (the central node will have low centrality, as it is harmed by having important neighbors)
  • For nodes within directed networks
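Before moving on to the directed case, the beta behavior of the Bonacich formula described above can be checked on a small undirected example. This is a rough sketch under my own assumptions: the formula is taken as c_i = sum over neighbors j of (alpha + beta * c_j), it is solved by simple repeated substitution (which only settles when the absolute value of beta is below 1 over the largest eigenvalue of the adjacency matrix), and the five-node line network is hypothetical.

```python
def bonacich_power(graph, beta, alpha=1.0, iterations=200):
    """Iterate c_i = sum_j (alpha + beta * c_j) over each node's neighbors j.

    Assumption: |beta| is small enough for the iteration to converge.
    """
    c = {v: 0.0 for v in graph}
    for _ in range(iterations):
        c = {v: sum(alpha + beta * c[w] for w in graph[v]) for v in graph}
    return c

# Hypothetical line network a - b - c - d - e, with c in the middle.
line = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

zero_beta = bonacich_power(line, beta=0.0)       # reduces to plain degree
positive_beta = bonacich_power(line, beta=0.3)   # well-connected neighbors help
negative_beta = bonacich_power(line, beta=-0.3)  # well-connected neighbors hurt
```

With beta at 0 the scores are just the degrees. With beta at 0.3 the middle node c scores highest, since both of its neighbors are well connected. With beta at -0.3 nodes b and d overtake c, because each of them has a weak end node as a neighbor.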

Betweenness

  • When normalizing, we have (N-1)(N-2) instead of (N-1)(N-2)/2 because there end up being twice as many ordered pairs as unordered pairs. Getting from J to K is likely different from getting from K to J and this needs to be accounted for.
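The pair counting behind the two denominators can be verified by brute force. The sketch below is only an illustration; the function name and the five-node example are assumptions of mine.

```python
from itertools import combinations, permutations

def betweenness_normalizer(n, directed):
    """Number of (j, k) pairs that could route through one node among n nodes."""
    ordered_pairs = (n - 1) * (n - 2)
    return ordered_pairs if directed else ordered_pairs // 2

# Five hypothetical nodes; count the pairs that exclude the node "i" itself.
others = ["j", "k", "l", "m"]
ordered = len(list(permutations(others, 2)))    # (j, k) and (k, j) both count
unordered = len(list(combinations(others, 2)))  # {j, k} counts once
```

For N = 5 this gives 12 ordered pairs and 6 unordered pairs, matching (N-1)(N-2) and (N-1)(N-2)/2.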

Closeness

  • Direction matters, meaning you must calculate in-closeness or out-closeness (similar to in-degree and out-degree directionality)
  • Eigenvector centrality
  • This was applied by Google’s founders to search ranking, as PageRank
  • They created an algorithm in which random pages can be reached (through teleportation) and so increase in ranking.
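To illustrate the point about direction in closeness, here is a small sketch of out-closeness and in-closeness. It assumes closeness is defined as (N-1) divided by the sum of shortest-path lengths, and that every node can reach every other node; the three-page directed network and the function names are hypothetical.

```python
from collections import deque

def bfs_distances(graph, source):
    """Hop counts from source to every node reachable along edge direction."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def out_closeness(graph, node):
    """(N-1) / total distance from node to the others (assumes all reachable)."""
    dist = bfs_distances(graph, node)
    return (len(graph) - 1) / sum(dist.values())

def reverse(graph):
    """Flip every edge so that distances *to* a node become distances *from* it."""
    rev = {v: set() for v in graph}
    for v, targets in graph.items():
        for w in targets:
            rev[w].add(v)
    return rev

def in_closeness(graph, node):
    return out_closeness(reverse(graph), node)

# Hypothetical directed network: a -> b, a -> c, b -> c, c -> a.
web = {"a": {"b", "c"}, "b": {"c"}, "c": {"a"}}
```

Here a reaches both other nodes in one hop, so it has the highest out-closeness, while c can be reached from both other nodes in one hop, so it has the highest in-closeness. The two rankings differ even on three nodes.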

Note: As the teleportation probability increases, the relative PageRank scores of the nodes begin to equalize. The random user (in this case a Google search user) makes a lot of random jumps from node to node (webpage to webpage) instead of following edges. Following edges alone could leave the user stuck in an endless loop (the original limitation of eigenvector centrality); with frequent jumps, the user instead ends up visiting all nodes with nearly equal probability.
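The equalizing effect in the note can be seen with a tiny PageRank sketch. This is a simplified power iteration of my own, not Google's implementation; the `teleport` parameter name, the iteration count, and the three-page network are assumptions.

```python
def pagerank(graph, teleport=0.15, iterations=100):
    """Power iteration: jump to a random page with probability `teleport`,
    otherwise follow a random outgoing link."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iterations):
        new_rank = {v: teleport / n for v in graph}
        for v, targets in graph.items():
            if targets:
                share = (1 - teleport) * rank[v] / len(targets)
                for w in targets:
                    new_rank[w] += share
            else:
                # A page with no links: treat it as linking to every page.
                for w in graph:
                    new_rank[w] += (1 - teleport) * rank[v] / n
        rank = new_rank
    return rank

def spread(scores):
    """Gap between the highest and lowest score."""
    return max(scores.values()) - min(scores.values())

# Hypothetical directed network: a -> b, a -> c, b -> c, c -> a.
web = {"a": {"b", "c"}, "b": {"c"}, "c": {"a"}}
low_teleport = pagerank(web, teleport=0.15)
high_teleport = pagerank(web, teleport=0.8)
only_teleport = pagerank(web, teleport=1.0)
```

The gap between the best and worst page shrinks as `teleport` grows, and at 1.0 every page gets exactly 1/N because links are never followed.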

Above are notes I’ve taken to keep track of the material covered in the Social Network Analysis course I am taking on Coursera. The notes are derived from the weekly video lectures and supplemental information I’ve found online to help clarify certain concepts. If you are taking this course, most of the information above can be found in the Week 3 material. If you are not taking the course, I hope you find it helpful in your search.