Monday, October 27, 2008

RON

David Andersen, Hari Balakrishnan, Frans Kaashoek, Robert Morris, "Resilient Overlay Networks"

Shortcomings of today's routing structure
  • A quarter of randomly picked connections between two hosts suffer outages of more than half an hour over a three-day period
  • BGP filters and summarizes routing information in the interest of scalability and policy enforcement, and delays propagating it to prevent oscillations
  • BGP deems a connection alive as long as the BGP session is alive; however, persistent congestion on a link might degrade E2E performance
  • BGP policies often do not use private peering links to provide failover capabilities
Proposal
  • Overlay Network of nodes routing packets between them based on RON header
  • Size: a couple of dozen nodes (cluster maintained through RON membership manager)
  • Virtual link between each pair of distinct nodes
  • optimizes different application-specific metrics: the API provides an explicit way for applications to prioritize among throughput, delay, and packet loss rate, and to define what counts as a failure
  • fast failure detection and recovery (< 20 sec) through constant active and passive monitoring of virtual links
  • expressive policy routing
  • Data path: application --conduit API--> entry node --> RON routers --> exit node
  • link state, hop-by-hop routing
  • RON routers and RON membership managers communicate through RON network (versus IP)
  • RON headers carry a tag identifying the flow, which keeps the flow on its chosen path
  • hysteresis to avoid route flapping
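
The path choice and the hysteresis bullet above can be illustrated with a minimal sketch. It assumes latency is the only metric, that a path uses at most one intermediate RON node, and that a route is switched only when the candidate is at least 5% better. The function names, the `latency` table, and the 5% margin are hypothetical choices for illustration, not values from the paper.

```python
import math

def best_path(src, dst, latency, nodes):
    """Return (cost, path) for the lowest-latency path from src to dst
    that uses the direct virtual link or exactly one intermediate node."""
    best = (latency[src].get(dst, math.inf), [src, dst])
    for mid in nodes:
        if mid in (src, dst):
            continue
        cost = latency[src].get(mid, math.inf) + latency[mid].get(dst, math.inf)
        if cost < best[0]:
            best = (cost, [src, mid, dst])
    return best

def choose_with_hysteresis(current, candidate, margin=0.05):
    """Keep the current (cost, path) unless the candidate is better by
    more than `margin` (assumed 5%), which damps route flapping."""
    if candidate[0] < current[0] * (1 - margin):
        return candidate
    return current

# Hypothetical measured latencies in milliseconds between three nodes.
latency = {
    "A": {"B": 100, "C": 20},
    "B": {"A": 100, "C": 30},
    "C": {"A": 20, "B": 30},
}
route = best_path("A", "B", latency, ["A", "B", "C"])
```

With these made-up numbers the detour through C (20 + 30 ms) beats the direct link (100 ms). A later candidate costing 49 ms would not displace the 50 ms route, because the gain is below the assumed margin.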
Results
  • outages last no longer than 20 sec
  • maximal 30% packet loss rate
  • in most cases RON reduces latency
  • in some cases, RON can significantly improve TCP performance

Comments
  • The exact routing mechanism is unclear. It looks like hop-by-hop routing, but wouldn't source routing be better for optimizing the application-specific metric, since it would be optimized only once, at the source? (This doesn't really matter, because most of the time the chosen route has only one RON hop between the endpoints.)
  • What would large-scale deployment of RON mean for the Internet? Will RON further inflate data paths? Will RON nodes deployed in two peering ASs turn these into ASs that de facto transit traffic?
  • How does RON's forwarding compare to a TCP split connection (split at RON forwarder)?
  • User experience contradicts the statement that 1/4 of the routes experience outages of half an hour in three days. Does this mean that the Internet is really partitioned into Clients and Servers, with reliable routes between Clients and Servers but less reliable communication between Clients and Clients? Or is the user experience due to the fact that websites deploy servers across multiple different networks, so that our queries in fact do not go through the "Inter-"net?

2 comments:

Matei Zaharia said...

I agree with most of your comments. I think it's especially important to note that serious servers generally have multiple ISPs and thus are highly reachable. If people spend most of their time on the Internet accessing such servers, then outages between random pairs of hosts are a poor measure of the problem.

About the routing - they take either 1 or 2 hops and plan it at the first node; there's nothing done beyond that. This was one of the cool aspects of the paper, that just one extra hop is enough to improve reliability and performance.

Randy H. Katz said...

Note that this isn't about connectivity to popular servers so much as how to construct your own application-specific overlay network. As Matei points out, there are other mechanisms for popular sites (multi-homing, content distribution networks, etc.), but RON is targeted at something like multipoint conferencing or specialized content sharing. The n^2 probing really limits the scalability of RON. So a small scale approach can make sense.
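
To make the quadratic growth concrete, here is a small sketch that counts the virtual links a full mesh of n nodes has to monitor. It assumes one monitored link per unordered pair of nodes; the function name and the sample sizes are hypothetical.

```python
def virtual_links(n):
    """Number of distinct node pairs in a full mesh of n nodes."""
    return n * (n - 1) // 2

# Print how the monitoring burden grows with overlay size.
for n in (24, 50, 100, 1000):
    print(n, "nodes:", virtual_links(n), "virtual links")
```

A couple of dozen nodes means a few hundred links to probe, while a thousand nodes would mean roughly half a million.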