The internet's patch job
Practically every year, networking experts claim the internet is about to run out of room, but the truth is that the internet has been close to running out of room since the mid-1990s.
In the wake of the Netscape-driven stock bubble, the Internet Engineering Task Force (IETF) put forward a couple of sticking plasters meant to tide the industry over until it could finally move on with a wholesale upgrade from Internet Protocol version 4 (IPv4) to IPv6 or, as it was when it was proposed in 1994, IP Next Generation (the reinvented Star Trek series was drawing to a close that year).
If you were wondering whatever happened to IPv5, it never existed – at least technically. The version number comes from one of the fields of the IP packet header. IPv4, naturally, used the number '4'. But, in the 1970s, companies supplying internet hardware trialled a rather different form of IP, one aimed at passing voice, video and real-time simulation data – all useful for distributed simulations of battlefield activities. To mark these packets as being different from regular IP packets, they chose to use '5' in the version-number field. As it did not make sense to call IPng version 5 and use the number '6' to identify the packets, the IETF decided to move straight from IPv4 to IPv6. For a brief period, it was going to be IPv7, until the IETF established that no-one had successfully snaffled the ID '6' during the previous 20 years.
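The version field sits in the top four bits of the first byte of the packet header, so reading it takes a single shift. The following is a minimal Python sketch; the function name `ip_version` and the sample bytes are illustrative assumptions rather than anything taken from captured traffic.

```python
def ip_version(packet: bytes) -> int:
    """Return the number stored in the top four bits of the first header byte."""
    if not packet:
        raise ValueError("empty packet")
    return packet[0] >> 4


# Illustrative first bytes only (assumed values, not captured traffic):
# 0x45 begins a typical IPv4 header, 0x60 begins an IPv6 header, and a
# leading nibble of 5 is what marked the experimental streaming packets.
for first_byte in (0x45, 0x60, 0x50):
    print(hex(first_byte), "-> version", ip_version(bytes([first_byte])))
```

Because a router looks at this one field before anything else, two protocols cannot safely share a number, which is why '5' was off limits for IPng.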
IPv4 has the capacity to handle 4 billion addresses, in theory. Even before you consider the possibility of giving each toaster, fridge and light bulb its own IP address, simply totting up the global population quickly reveals the heart of the problem that was spotted more than 15 years ago. There would be a shortfall of more than 2.5 billion if everybody on the planet demanded just one IP address. Not only that, IP addresses are not distributed evenly; because of the way in which IP addresses were handed out in the early days, North America lays claim to 60% of available addresses. Only 20% are available to those outside the Western nations.
With its 128-bit address field, instead of one with just 32 bits, IPv6 would sweep aside any concerns over available addresses. With roughly 340 undecillion (3.4 x 10^38) IDs, you could give an internet address to practically anything (see fig 1). But the shift to IPv6 has not happened.
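The arithmetic behind these figures is easy to check. The sketch below uses Python's standard `ipaddress` module to count every address in each family; the population figure is an assumed round number chosen to match the shortfall quoted above, not one given in this article.

```python
import ipaddress

# Every possible address in each family: 2**32 for IPv4, 2**128 for IPv6.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

# Assumed round figure for world population, chosen to match the shortfall
# quoted in the text; it is not a number given in the article.
ASSUMED_POPULATION = 6_800_000_000

shortfall = ASSUMED_POPULATION - ipv4_total

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:.2e}")
print(f"Shortfall at one address per person: {shortfall:,}")
```

In practice the usable IPv4 pool is smaller than the theoretical total, because some ranges are reserved for private networks, multicast and other special purposes.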
Many organisations will have hundreds, maybe thousands, of machines sitting behind a network address translation (NAT) server so that traffic, as seen by routers on the internet, seems to come from just one machine. Some countries, starved of IP addresses, even force companies to share a single address behind a NAT server.
Managing the mapping between local and internet addresses looks straightforward, but is fraught with difficulties: the NAT server has to maintain a table of incoming and outgoing addresses so it can direct packets to the right machines.
IP itself is a 'stateless' protocol. Every packet is a single entity with no data about packets that arrive before or after it in a stream; the transmission of that information is left to higher-layer protocols. These, in turn, provide the mechanism for a NAT server to determine how packets sent to the same IP address can be distributed to multiple machines on the local side of the network (see fig 3).
The Transmission Control Protocol (TCP), which is normally carried by IP packets, provides a stateful connection. TCP takes care of setting up, controlling and tearing down connections using a series of packets carried using IP. The User Datagram Protocol (UDP), on the other hand, provides a similar service to IP, making it possible for different applications on a machine to send packets out to the network using the same IP address. The operating system can direct the incoming packets to the right application by monitoring which application is using which UDP port number.
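Port-based delivery can be seen on a single machine. In the hedged sketch below, two sockets stand in for two applications sharing the loopback address 127.0.0.1; the helper names `open_listener` and `demo` are made up for illustration, and the operating system picks the port numbers.

```python
import socket


def open_listener() -> socket.socket:
    """Bind a UDP socket on loopback; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(2.0)
    return sock


def demo():
    """Send two datagrams to the same IP address and let the ports separate them."""
    app_a, app_b = open_listener(), open_listener()
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Same destination IP address, different port numbers.
        sender.sendto(b"for app A", app_a.getsockname())
        sender.sendto(b"for app B", app_b.getsockname())
        data_a, _ = app_a.recvfrom(1024)
        data_b, _ = app_b.recvfrom(1024)
        return data_a, data_b
    finally:
        for sock in (app_a, app_b, sender):
            sock.close()


print(demo())
```

Each application receives only the datagram addressed to its own port, even though both were sent to the same IP address.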
Address translation has been cited as one of the mechanisms that shorten the battery life of portable devices: without special provision, they have to keep sending packets in order to maintain a connection on the internet, allowing mail servers to 'push' messages to them.
The NAT server maintains as little state information as it can. Until a computer on the local side tries to talk to a machine on the internet, the NAT server has no idea it exists. But when it receives the packet, the server inspects the contents, adds the local IP address of the sending machine to its translation table, then reformats the packet to contain the IP address of the NAT server itself. By taking a note of which TCP or UDP ports the packet uses, the NAT server can trace back to the original sender any reply from the remote internet machine. Because the NAT server has no knowledge of clients on the local side until they start sending data, it cannot relay unsolicited packets from the internet side to local machines unless they have initiated contact already. This has become a key security feature for many users. If a machine on the local side cannot be found, it is extremely hard to hack.
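The translation table can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how any particular NAT product works: the class and method names are hypothetical, the addresses come from the ranges reserved for documentation, and the sketch keys replies on the public port alone without checking which remote machine sent them.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass(frozen=True)
class Packet:
    src: Endpoint
    dst: Endpoint
    payload: bytes


class NatServer:
    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.by_local: Dict[Endpoint, int] = {}  # local endpoint -> public port
        self.by_port: Dict[int, Endpoint] = {}   # public port -> local endpoint

    def outbound(self, packet: Packet) -> Packet:
        """Record the sender, then rewrite the source to the NAT's own address."""
        if packet.src not in self.by_local:
            self.by_local[packet.src] = self.next_port
            self.by_port[self.next_port] = packet.src
            self.next_port += 1
        public_src = (self.public_ip, self.by_local[packet.src])
        return Packet(public_src, packet.dst, packet.payload)

    def inbound(self, packet: Packet) -> Optional[Packet]:
        """Rewrite a reply back to the local sender, or drop it if no entry exists."""
        local = self.by_port.get(packet.dst[1])
        if local is None:
            return None  # unsolicited: nobody on the local side asked for this
        return Packet(packet.src, local, packet.payload)


nat = NatServer("203.0.113.1")
request = nat.outbound(Packet(("192.168.0.10", 5000), ("192.0.2.50", 80), b"hello"))
reply = nat.inbound(Packet(("192.0.2.50", 80), request.src, b"hi back"))
stray = nat.inbound(Packet(("198.51.100.9", 4444), ("203.0.113.1", 41234), b"probe"))
print(request.src, reply.dst, stray)
```

The stray packet is dropped because no local machine ever created an entry for that port, which is the accidental security property described above.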
The problem for mobile devices comes from the need to keep the table entries alive in the NAT server. Normally, UDP translations are only stored for a few minutes; those for TCP can last up to an hour. In many cases, the short-lived nature of UDP translation is not problematic. UDP/IP is used for voice-over-IP and gaming protocols, where activity is often quite high for relatively short periods and once machines have stopped communicating they are unlikely to start again without initiating a new connection. These protocols also tend to use UDP because they can tolerate some packet loss and it is better, for latency reasons, to compensate for lost data using forward error correction than to force the remote machine to resend packets.
However, UDP is also used for some IP tunnelling protocols that implement virtual private networks. In this situation, the virtual network connection needs to be sustained for some time, with bursts of activity separated by relatively long silences. To stop the channel from being 'lost' when the NAT server cleans up, the mobile device has to keep sending 'keepalive' packets to the virtual network. Mobile companies have come up with techniques that try to maintain UDP connections without demanding high-frequency keepalives.
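The cost of keepalives follows from simple arithmetic on the timeout. In the sketch below the exact timeouts are assumptions (the text says only "a few minutes" for UDP and "up to an hour" for TCP), and the function names and traffic pattern are invented for illustration.

```python
# Assumed timeouts, in seconds: the text says only "a few minutes" for UDP
# and "up to an hour" for TCP, so these exact values are illustrative.
UDP_TIMEOUT = 120
TCP_TIMEOUT = 3600


def mapping_survives(send_times, timeout):
    """True if no silence between consecutive packets exceeds the timeout."""
    ordered = sorted(send_times)
    return all(later - earlier <= timeout
               for earlier, later in zip(ordered, ordered[1:]))


def keepalives_per_day(interval):
    """Number of keepalive packets sent in 24 hours at a fixed interval."""
    return 86400 // interval


# A tunnel with a burst, ten minutes of silence, then another burst.
bursty = [0, 5, 10, 610, 615]
# The same traffic padded with a keepalive every 100 seconds of silence.
padded = sorted(set(bursty) | set(range(10, 611, 100)))

print(mapping_survives(bursty, UDP_TIMEOUT))  # False: the 600 s gap is too long
print(mapping_survives(padded, UDP_TIMEOUT))  # True
print(mapping_survives(bursty, TCP_TIMEOUT))  # True
print(keepalives_per_day(100))                # 864
```

Under these assumed numbers an idle device would wake its radio hundreds of times a day purely to hold the table entry open, which is where the battery drain comes from.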
Skype and other VoIP systems use a method called 'hole punching' to make machines on the local side of a NAT server visible to internet-based computers. The technique works by having clients maintain long-term connections with the Skype server that relays call requests from one user to another. If both users are behind NAT firewalls, they cannot otherwise contact each other directly.
Hole punching (see fig 4) works by having the Skype server tell one client that an internet user at a given address is trying to call. The receiver then 'punches' a hole in the firewall by sending a packet to that machine, even though it has no chance of reaching the initiator directly. However, when the Skype server tells the calling machine about the receiver's location, the caller then 'punches' a corresponding hole in its firewall with the receiver's known IP address and port. Because the Skype server has coordinated the requests with what are now known ports and addresses, the NAT firewalls will direct the Skype packets to the correct destinations.
Because hole punching allows local machines to set up connections that listen for packets from callers, it is not popular with network administrators. It provides a route that hackers can exploit to attack machines sitting behind a firewall. A similar problem arises for users of peer-to-peer networks.
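The sequence can be modelled with two toy NAT objects. This is a sketch under assumptions, not a description of Skype's actual protocol: it assumes each NAT reuses the same public port for a local machine whatever the destination (not all do), and the class, function and address values are hypothetical.

```python
class Nat:
    """Toy NAT: one public port per local endpoint, replies only from hosts it has sent to."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.port_of = {}   # local endpoint -> public port
        self.local_of = {}  # public port -> local endpoint
        self.holes = set()  # (public port, remote endpoint) pairs allowed inbound

    def send(self, local, remote):
        """Outbound packet: create a mapping if needed and open a hole for replies."""
        if local not in self.port_of:
            self.port_of[local] = self.next_port
            self.local_of[self.next_port] = local
            self.next_port += 1
        port = self.port_of[local]
        self.holes.add((port, remote))
        return (self.public_ip, port)

    def receive(self, source, public_port):
        """Inbound packet: return the local endpoint, or None if it is dropped."""
        if (public_port, source) in self.holes:
            return self.local_of[public_port]
        return None


def punch():
    server = ("192.0.2.50", 3478)
    caller, receiver = ("192.168.0.10", 5000), ("10.0.0.7", 5000)
    nat_c, nat_r = Nat("203.0.113.1"), Nat("198.51.100.2")

    # Both clients keep a connection to the server, which learns their public endpoints.
    caller_pub = nat_c.send(caller, server)
    receiver_pub = nat_r.send(receiver, server)

    log = {}
    # Without punching, the caller cannot reach the receiver.
    log["before"] = nat_r.receive(caller_pub, receiver_pub[1])
    # The receiver punches first; its packet is dropped at the caller's NAT.
    nat_r.send(receiver, caller_pub)
    log["receiver_punch"] = nat_c.receive(receiver_pub, caller_pub[1])
    # The caller punches; the receiver's NAT now lets the packet through.
    nat_c.send(caller, receiver_pub)
    log["caller_to_receiver"] = nat_r.receive(caller_pub, receiver_pub[1])
    # Traffic in the other direction now passes as well.
    nat_r.send(receiver, caller_pub)
    log["receiver_to_caller"] = nat_c.receive(receiver_pub, caller_pub[1])
    return log


print(punch())
```

The first punch is lost by design: its only job is to leave an entry in the receiver's NAT so that the caller's later packet looks like a reply.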
Many machines sitting behind NAT servers and firewalls will be used to surf to information stored on servers on the world wide web, and these have their own IP address conservation mechanism. Before the release of version 1.1 of the HyperText Transfer Protocol (HTTP), you could only associate one website address with each IP address. Because companies wanted to buy lots of vanity addresses and to support advertising campaigns, there were legitimate concerns that addresses would run out even more quickly than expected. HTTP 1.1 made it possible to map multiple uniform resource identifiers (URIs) onto one IP address – leaving servers behind a machine sitting on that address to determine which one should handle a request.
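HTTP 1.1 achieves this by requiring each request to carry a `Host` header naming the site the browser wants. A rough Python sketch of how a server might use it follows; the function `pick_site`, the site names and the directory paths are all hypothetical, and the parsing is deliberately minimal (it does not handle IPv6 literals in the header, for instance).

```python
from typing import Dict, Optional


def pick_site(raw_request: bytes, sites: Dict[str, str]) -> Optional[str]:
    """Return the document root for the site named in the Host header, if any."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower().split(":", 1)[0]  # drop any port number
            return sites.get(host)
    return None  # no Host header, as in older HTTP 1.0 requests


# Hypothetical sites sharing one IP address.
SITES = {"news.example.com": "/srv/news", "shop.example.com": "/srv/shop"}

print(pick_site(b"GET / HTTP/1.1\r\nHost: shop.example.com\r\n\r\n", SITES))
print(pick_site(b"GET / HTTP/1.1\r\nHost: news.example.com:8080\r\n\r\n", SITES))
print(pick_site(b"GET / HTTP/1.0\r\n\r\n", SITES))
```

A request without the header gives the server nothing to choose between sites, which is why the older protocol tied each site to its own address.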
IPv6 potentially opens up possibilities for all forms of networking, but if the past 15 years are anything to go by, and thanks to address conservation, it would be a brave person who bets on IPv6 coming into widespread use anytime soon. The addresses can be applied to anything with a network interface, letting you control the light bulbs and thermostat in your home from anywhere. Unfortunately, it also means the end of NAT-based security, so it could mean letting anyone have access to your home's electronics.
NAT-based security is hardly perfect, as hole punching demonstrates, but IPv6 provides the possibility of adding end-to-end security by enabling IPsec encryption by default. However, this, in turn, demands that every peer on the network has effective protection against hacking and worm attacks, which is going to be difficult for highly resource-restricted clients based on simple microcontrollers.
The future for the traditional protocols that sit on top of IP is even more assured. Researchers keep coming up with new protocols aimed at transferring video or replicating the telephony system on an IP infrastructure, but TCP and UDP remain the primary vehicles for transferring data, despite their flaws. For example, TCP is vulnerable to denial-of-service attacks through a process called SYN flooding, in which an attacker sends multiple startup messages to a server then disappears. The series of half-open connections this creates can deprive other users of access to the server.
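The effect of SYN flooding can be illustrated with a toy model of a server's queue of half-open connections. This is a sketch only: the `Listener` class, the backlog of three and the addresses are assumptions for illustration, and real TCP stacks add timeouts and other defences that the model omits.

```python
class Listener:
    """Toy server that can hold only a fixed number of half-open connections."""

    def __init__(self, backlog):
        self.backlog = backlog
        self.half_open = set()
        self.established = set()

    def on_syn(self, client):
        """Startup message: reserve a slot if one is free."""
        if len(self.half_open) >= self.backlog:
            return False  # queue full, request refused
        self.half_open.add(client)
        return True

    def on_ack(self, client):
        """Final handshake message: promote a half-open connection."""
        if client in self.half_open:
            self.half_open.remove(client)
            self.established.add(client)
            return True
        return False


server = Listener(backlog=3)

# The attacker sends startup messages from made-up addresses and never completes them.
flood = [server.on_syn(("198.51.100.%d" % n, 1024)) for n in range(1, 4)]

# A genuine user now finds every slot taken.
genuine = server.on_syn(("203.0.113.7", 5000))

print(flood, genuine, len(server.half_open))
```

Because the attacker never sends the final message, the slots stay occupied until the server gives up on them, and genuine users are turned away in the meantime.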
Another problem is that some applications do not fit well on top of either TCP or UDP. Using UDP for relaying telephone calls over IP is not ideal and prompted the IETF to create a new transport protocol, an alternative to TCP, called the Stream Control Transmission Protocol (SCTP).
SCTP makes it possible to send streams of data in independent channels to a client. For example, text and pictures for a web page can go in different streams. This helps overcome a problem with TCP, where one stream can block others because they have to be sent one after the other, and transmission may stall for the entire sequence if one packet gets lost and needs to be retransmitted.
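The difference can be shown with a small simulation of one lost packet. The sketch below is a simplified model, not an implementation of either protocol: the function names and the packet list are invented, and retransmission is ignored so that only the stall is visible.

```python
def single_stream_delivery(packets, lost):
    """One ordered sequence: nothing after the first lost packet is delivered."""
    delivered = []
    for index, packet in enumerate(packets):
        if index in lost:
            break
        delivered.append(packet)
    return delivered


def multi_stream_delivery(packets, lost):
    """Independent streams: a loss holds up only the stream it belongs to."""
    blocked = set()
    delivered = []
    for index, (stream, seq) in enumerate(packets):
        if index in lost:
            blocked.add(stream)
        elif stream not in blocked:
            delivered.append((stream, seq))
    return delivered


# Text and picture data for one page, interleaved in sending order.
packets = [("text", 0), ("image", 0), ("text", 1), ("image", 1), ("text", 2)]
lost = {1}  # the first image packet goes missing

print(single_stream_delivery(packets, lost))
print(multi_stream_delivery(packets, lost))
```

With a single ordered sequence only the first text packet gets through before the stall, whereas with independent streams all of the text arrives while the image waits for its retransmission.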
In common with other improved protocols for the internet, SCTP is not used widely, even though implementations exist for a variety of operating systems. Its main application today is as the protocol that relays telephony signalling, such as SS7 packets, over IP networks.
The core problem is that applications need to be rewritten to take account of SCTP – it's not possible to build support into operating systems such that the transport protocol can be switched transparently from TCP to SCTP. And without a growing list of applications demanding support, no-one is in a hurry to build SCTP handling into internet equipment other than that installed in telecom companies' core networks.
Despite many meetings, proposals and promises, the internet is likely to carry on working in just the way that it has for some time to come. Change is only going to come when the sticking plasters cannot stop the bleeding and governments and corporations come together to order major surgery.