Monday, October 20, 2008

Internet Topology

Michalis Faloutsos, Petros Faloutsos, Christos Faloutsos, "On Power-Law Relationships of the Internet Topology"

Lun Li, David Alderson, Walter Willinger, John Doyle, "A First-Principle Approach to Understanding the Internet's Router-level Topology"

Introduction

There have been developed various measures to classify graphs (=networks) which are coarser than graph isomorphism and various algorithms to generate graphs with different characteristics. These have been applied in numerous areas like neuroscience to understand the wiring of neurons or the web as hypertext, where one particular eigenvector problem called PageRank had tremendous success as a search engine. Among them are:

Measures:
  • Diameter = longest shortest path between any two vertices, can be an unstable measure
  • Average path length = shortest path averaged between any pair of vertices
  • Shortest path length distribution
  • Clustering Coefficient, CC of a vertex is defined by taking the subgraph generated by the vertex and its neighbors and divide its number of edges by the number of edges of a complete graph. CC of a graph is the CC averaged over the vertices
  • degree distributions of vertices
  • eigenvalues of wiring matrix (symmetric with positive entries only, so diagonalizable over R with only positive eigenvalues)
Typical graphs:
  • random graph (short average path length, low CC, normal distribution of degrees)
  • small-world (short average path length, high CC)
  • hierarchical networks
  • scale-free networks where the degree distribution follows a power law
Algorithms:
  • small-world network generation due to Watts/Strogatz observation that little rewiring of a highly clustered network drops the path length [Watts, Strogatz, "Collective dynamics of 'small-world' networks"]
  • scale-free networks can be archived using degree-preserving rewiring: pick four vertices A, B, C, D with edges A-B and C-D. Drop edges A-B and C-D and create edges A-C and B-D. This procedure is ergodic.
On Power-Law Relationships of the Internet Topology

  • Regarding that there have been hundreds of papers who take some real world graph like the highway systems or a friend network and show that it is a scale-free network, it came as little surprise to me that also the degree distributions of Internet router (as of 1995) and AS graph (as measured three times in 1997-1998) are well approximated as scale-free.
  • The paper presents the rank degree distribution and the outdegree exponent as two power-laws when in fact they are the same: the rank of a node is given by integrating the frequency of an outdegree over all outdegrees higher than the node. But integration and inverting functions all turn power-laws into power-laws.
  • P(h) the number of pairs of nodes with distance
  • The eigenvalues follow a power law (it is not clear to me what the eigenvalues of a random graph are, so I don't know how meaningful this is)
A first-Principles Approach to Understanding the Internet's Router-level Topology

  • show that the degree distribution is an incomplete measure of network: by rewiring graphs with the same degree distribution can be obtained which have very different routing properties
  • stresses that low level(physical) links are more important than virtual (IP, AS) links: they determine performance, reliability and robustness
  • generation of realistic graphs has to go beyond creating abstract graphs and consider technological constraints: edges have bandwidths, economic routers only have a certain number of links, trade-off how many links versus BW/link, flow conservation. And economic considerations: degree of nodes decreases and BW increases from edge to core, few long-haul connections
  • new measures:

    • network performance: maximal (in alpha) total throughput where traffic is modeled as X_ij=alpha*x_i*x_j [Criticism: unrealistic regarding that most traffic goes from clients->websites]
    • network likelihood: gives an edge weight=product of degrees of vertices, likelihood=total weight, networks where edges are between high-degree links score higher
  • tested networks:
    • new: heuristically optimal (tree-like at edge, low-degree loose mesh at core) inspired by real networks
    • real networks
    • preferential attachment
    • random network with scale-free distribution
    • suboptimal networks
  • Results:
    • all networks have the same degree-distribution
    • but score differently with respect to new metric
    • performance decreases drastically with the likelihood
    • real networks have low likelihood and high performance
    • a "generic" graph with given degree distribution is having bad performance
This was a really great paper to read!!! It breaks down the assumptions of how relevant the degree-distribution really is, and then comes up with new measurements which yield good results - results which are actually corresponding to real links (physical) and not abstracted networks like IP layer.

No comments: