Wednesday, October 5, 2011

Network Coding

In conventional communications networks the active network elements (e.g., Ethernet switches or IP routers) are store-and-forward devices. They perform no nontrivial computation. It turns out that in certain cases it is possible to optimize network operation (to conserve some network resource or to improve some network performance measure) by embedding more intelligence in the network elements.

In order to understand how this is done, it is useful to start with two special cases.

First case : Two individuals communicating via a satellite having a single downlink/uplink coverage beam.

A transmits to B via satellite S and B transmits back to A via the same satellite. Since A and B must share satellite resources (namely time and frequency), the uplink transmissions must be separated in either time or frequency. In the conventional case the downlink transmissions are separated as well. Thus, if it costs one cent to transmit an uplink message from A or B to the satellite, and similarly one cent for the downlink message from the satellite to A or B, then the exchange of two messages, one from A to B and one from B to A costs 4 cents.

But, this does not need to be the case! Rather than S transmitting A’s message to B and afterwards (or on another frequency) B’s message to A, it can transmit only once the message A xor B on a frequency and at a time when both A and B are listening. A retrieves B’s message by xoring the received message with his own message (since B = (A xor B) xor A), and B performs the same operation to retrieve the message from A (since A = (A xor B) xor B).

This reduces the price from 4 units to 3 units (since there are only three transmissions: A-S, B-S, S-A+B) at the cost of the satellite having to perform the simple operation of xoring two messages. The xor operation performed by the satellite is a kind of “coding” operation that leads to reduction of required network resources.

Second case : Using coding to protect real-time or broadcast transmissions against packet loss.

In real-time and broadcast transmissions it is not possible for a receiver to request retransmission of a lost packet, as TCP and ARQ systems do. Some critical control protocols send each packet multiple times (three times is common), but this is extremely wasteful in network resources. RFC 2198 proposes repeating the audio data from the previous RTP packet in the present one, thus maintain the number of packets per second, but still doubling bandwidth requirements. The FECFRAME working group in the IETF standardized more efficient mechanisms in RFCs 5053 and 6015. I will explain only the simplest possible coding.

Assume that we know that there will never be more than 1 packet loss in 4 consecutive packets. Then for every four packets transmitted, a fifth “protection” packet consisting of the xor of these four packets is sent. If all four packets are received then this fifth packet is discarded. If any single packet is lost then it can be recovered by xoring the received three packets with the fifth “protection” packet. Thus, packet loss can be mitigated with only an increase of 25% in bandwidth, and an increase in delay.

But what does this have to do with network coding? In both of the above cases an information source performed some nontrivial operation in order to conserve some resource or to protect against some network defect. The extension to full network coding requires simply that the computation be performed by some network element along the information path that is able to perform the network coding. Unfortunately, examples of network coding can be quite complex. The simplest one is the “butterfly network” (see Figure 1) presented in a paper by Ahlswede, Cai, Li, and Yeung entitled “Network Information Flow”.





In this example a source S needs to multicast two packets of information P1 and P2 to two destinations A and B over the particular network of network elements U, V, W and X shown in the figure. All of the links have the same bandwidth, which is precisely the bandwidth needed to transmit the packets in the desired time.





It turns out that S can send P1 and P2 to both A and B at once, as shown in Figure 2. Network elements U, V, and X are multicast devices that are able to replicate a packet received on its input port to both of its output port. Network element W performs network coding by calculating the xor of two packets received on its two input ports and sending this to its output port.



Were W not able to perform this operation, it would need first to send P1 and then P2, thus taking twice the time, or alternatively would require twice the bandwidth on the link to X (contrary to our assumption on link bandwidths). It is not hard to convince yourself that without the network coding it is not possible to perform the desired task.

Network coding can be used for purposes other than bandwidth or delay minimization and packet loss protection. Recent research has explored applications to energy reduction, information security, file sharing, congestion control, and fairness.

Y(J)S