Van Jacobson, Michael J. Karels, "Congestion Avoidance and Control"
Dah-Ming Chiu, Raj Jain, "Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks"
Underlying both papers is the observation that when a link is overutilized there is a very drastic degradation and the throughput can drop suddenly by several orders of magnitude. This can happen when several links converge to one such that the sum of the bandwidths is larger than the bandwidth of the link they converge to. The buffer at the respective router overflows, and it is by pure chance which packets are dropped or make it onto the buffer. ACK packets are dropped, forcing retransmissions, making more traffic. This phenomenon is called congestion.
Both papers are trying to use close to the full bandwidth of links and at the same time avoid congestion. Both papers work at the TCP level on the end-points of a connection. These end-points perceive whether a link is congested and act by changing the amount of data they send through different mechanisms. Both papers assume that the end-points behave cooperatively to avoid congestion and converge to fair resource usage.
The "Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks" makes simplyfing assumptions to make a precise mathematical model of the abstracted network.
Some of these assumptions are:
- time is discretized into time steps
- the endpoint perceives that a link is congested through a simple reliable binary signal send by the router in one time step
- a host is simply sending x_i(t) amount of data/time unit where x_i(t) is main subject of discussion of the paper
The paper analyzes a linear model, i.e. x_i(t) is equal to one of two linear function in x_i(t-1) depending on whether the line is congested or not. It then shows for which pair of linear functions the model will converge to fairness and optimal resource. The easiest example is to multiply x_i(t-1) by <1 class="nfakPe">Congestion
Avoidance implements seven changes to TCP. They mostly come
from the principle of "conservation of packets", meaning that the
number of packets traveling through the network from one machine to
another one should stay constant, a new packet is put into the network
exactly when an old ones leaves.
Criticism: the equations in Figure 1 are also true if I just send a
package every ten minutes, which is under the equlibrium.
TCP can be self-clocked. If I measure the frequency of ACK packages, I
know with what frequency packages leave the pipe, hence with what
frequency I should send.
In order to get to this equlibrium point, I should steadily increase
the frequency of packages just until I don't get as many ACK packages
per time unit as I am sending packages out. They call this slow-start.
The retransmit is a very sensitive issue. If retransmits are done to
frequently, they further congest the network, yielding to more
retransmits, and ending in a chain reaction. First: get a good timeout
when to retransmit (chapter 2), second: avoid a chain reaction using
exponentially increasing retransmit times. Assume that every second a
package gets lost and has to be retransmit infinitely many often.
Since the retransmit times are increasing exponentially, the average
bandwidth for retransmitting one package declines exponentially,
meaning that even in the above case where I have to retransmit
infinitely often, the total bandwidth is still finite. Criticism: why
does the paper talk about linear system theory and Lyanpow exponents
when you can do the calculation in three lines.
Criticism: the paper assumes that all hosts are cooperative. When an
ACK package is lost, it causes unnecessary retransmits. In a congested
network, ACK packages should be preferred. Assuming your retransmits
are far apart so that redundant data aren't traversing the pipe, you
still get the full effective bandwidth if ACKs are delivered in a
congested network.