Introduction to TCP/IP

TCP/IP, an acronym for Transmission Control Protocol/Internet Protocol, is the suite of communications protocols used to connect hosts on the Internet.

TCP and IP were developed by a Department of Defense (DOD) research project to connect a number of different networks designed by different vendors into a network of networks (the "Internet"). It was initially successful because it delivered a few basic services that everyone needs (file transfer, electronic mail, remote logon) across a very large number of client and server systems. Several computers in a small department can use TCP/IP (along with other protocols) on a single LAN. The IP component provides routing from the department to the enterprise network, then to regional networks, and finally to the global Internet. On the battlefield a communications network will sustain damage, so the DOD designed TCP/IP to be robust and automatically recover from any node or phone line failure. This design allows the construction of very large networks with less central management. However, because of the automatic recovery, network problems can go undiagnosed and uncorrected for long periods of time.

The Army puts out a bid on a computer and DEC wins the bid. The Air Force puts out a bid and IBM wins. The Navy bid is won by Unisys. Then the President decides to invade Grenada and the armed forces discover that their computers cannot talk to each other. The DOD must build a "network" out of systems each of which, by law, was delivered by the lowest bidder on a single contract.

The Internet Protocol was developed to create a Network of Networks (the "Internet"). Individual machines are first connected to a LAN (Ethernet or Token Ring). TCP/IP shares the LAN with other uses (a Novell file server, Windows for Workgroups peer systems). One device provides the TCP/IP connection between the LAN and the rest of the world.

To ensure that all types of systems from all vendors can communicate, TCP/IP is absolutely standardized on the LAN. However, larger networks based on long distances and phone lines are more volatile. In the US, many large corporations wish to reuse large internal networks based on IBM's SNA. In Europe, the national phone companies traditionally standardize on X.25. However, the sudden explosion of high speed microprocessors, fiber optics, and digital phone systems has created a burst of new options: ISDN, frame relay, FDDI, Asynchronous Transfer Mode (ATM). New technologies arise and become obsolete within a few years. With cable TV and phone companies competing to build the National Information Superhighway, no single standard can govern citywide, nationwide, or worldwide communications.

The original design of TCP/IP as a Network of Networks fits nicely within the current technological uncertainty. TCP/IP data can be sent across a LAN, or it can be carried within an internal corporate SNA network, or it can piggyback on the cable TV service. Furthermore, machines connected to any of these networks can communicate with any other network through gateways supplied by the network vendor.

TCP / IP LAYERS

As with all other communications protocols, TCP/IP is composed of layers:

TCP - establishes a connection and guarantees delivery of packets.

IP - moves packets of data from node to node.

Sockets - name given to the package of subroutines that provide access to TCP/IP on most systems.

TCP Layer

It is responsible for verifying the correct delivery of data from client to server. Data can be lost in the intermediate network. TCP adds support to detect errors or lost data and to trigger retransmission until the data is correctly and completely received.

TCP enables two hosts to establish a connection and exchange streams of data. TCP guarantees delivery of data and also guarantees that packets will be delivered in the same order in which they were sent.
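The connection-and-stream behaviour described above can be tried out through the sockets interface. The following is only a minimal sketch, assuming a loopback address (127.0.0.1) and a port chosen by the operating system; the function names echo_once and round_trip are made up for illustration and are not part of any standard API.

```python
import socket
import threading

def echo_once(server_sock):
    """Accept one connection and echo back whatever bytes arrive."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:              # the client closed its sending side
                break
            conn.sendall(data)

def round_trip(payload):
    """Send payload over a loopback TCP connection and return the echo."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()
        chunks = []
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(payload)
            client.shutdown(socket.SHUT_WR)   # signal the end of the stream
            while True:
                data = client.recv(1024)
                if not data:
                    break
                chunks.append(data)
        worker.join()
    return b"".join(chunks)
```

Calling round_trip(b"hello") should return the same bytes in the same order. The sketch is meant for short payloads only; a real server would handle many clients and much larger transfers.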

TCP itself does not route messages; it relies on IP to route them based on the IP address of the destination.

There are many types of computer networks, including:

 local-area networks (LANs) : The computers are geographically close together (typically in the same building).

 wide-area networks (WANs) : The computers are farther apart and are connected by telephone lines or radio waves.

In addition to these types, the following characteristics are also used to categorize different types of networks:

 topology : The geometric arrangement of a computer system. Common topologies include a bus, star, and ring.

 protocol : The protocol defines a common set of rules and signals that computers on the network use to communicate. One of the most popular protocols for LANs is called Ethernet. Another popular LAN protocol for PCs is the IBM token-ring network.

 architecture : Networks can be broadly classified as using either a peer-to-peer or client/server architecture.

IP Layer

It is responsible for moving packets of data from node to node. IP specifies the format of packets, also called datagrams, and the addressing scheme. An IP address is a 32-bit numeric address written as four numbers (the IP number) separated by periods. Each number can be zero to 255. For example, 1.160.10.240 could be an IP address. IP forwards each packet based on this four-byte destination address.

IP by itself is something like the postal system. It allows you to address a package and drop it in the system, but there's no direct link between you and the recipient. TCP/IP, on the other hand, establishes a connection between two hosts so that they can send messages back and forth for a period of time.

Within an isolated network, you can assign IP addresses at random as long as each one is unique. However, connecting a private network to the Internet requires using registered IP addresses (called Internet addresses) to avoid duplicates. The Internet authorities assign ranges of numbers to different organizations. The organizations assign groups of their numbers to departments. IP operates on gateway machines that move data from department to organization to region and then around the world.

These four numbers in an IP address are used in different ways to identify a particular network and a host on that network. The InterNIC Registration Service assigns Internet addresses from the following three classes.

 Class A - supports 16 million hosts on each of 126 networks – Large Networks

 Class B - supports 65,000 hosts on each of 16,000 networks – Medium Networks

 Class C - supports 254 hosts on each of 2 million networks – Small Networks

In addition to the three most popular classes, there are two additional classes. Class D addresses have their leading four-bits set to 1-1-1-0 and are used to support IP Multicasting. Class E addresses have their leading four-bits set to 1-1-1-1 and are reserved for experimental use.

 Class D - Multicast

 Class E - Reserved
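The class of an address can be read off the leading bits of its first byte. The sketch below assumes the leading-bit patterns 0, 1-0, and 1-1-0 for Classes A, B, and C (as given in the per-class sections of this document) together with the 1-1-1-0 and 1-1-1-1 patterns above. The function name address_class is made up for illustration.

```python
def address_class(address):
    """Return 'A'..'E' from the leading bits of the first byte."""
    first = int(address.split(".")[0])
    if first & 0b10000000 == 0b00000000:    # 0xxxxxxx
        return "A"
    if first & 0b11000000 == 0b10000000:    # 10xxxxxx
        return "B"
    if first & 0b11100000 == 0b11000000:    # 110xxxxx
        return "C"
    if first & 0b11110000 == 0b11100000:    # 1110xxxx
        return "D"
    return "E"                              # 1111xxxx
```

For example, address_class("130.132.59.234") returns "B" because 130 is 10000010 in binary.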

Dotted-Decimal Notation

To make Internet addresses easier for human users to read and write, IP addresses are often expressed as four decimal numbers, each separated by a dot. This format is called "dotted-decimal notation."

Dotted-decimal notation divides the 32-bit Internet address into four 8-bit (byte) fields and specifies the value of each field independently as a decimal number with the fields separated by dots. The figure below shows how a typical /16 (Class B) Internet address can be expressed in dotted decimal notation.
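The relationship between the 32-bit value and its dotted-decimal form can be sketched as a pair of conversions. This is an illustrative sketch only; the function names dotted_to_int and int_to_dotted are made up for this example.

```python
def dotted_to_int(address):
    """Pack a dotted-decimal string into a single 32-bit integer."""
    fields = address.split(".")
    if len(fields) != 4:
        raise ValueError("expected four fields: %r" % address)
    value = 0
    for field in fields:
        octet = int(field)
        if not 0 <= octet <= 255:
            raise ValueError("field out of range: %r" % field)
        value = (value << 8) | octet      # shift in one 8-bit field at a time
    return value

def int_to_dotted(value):
    """Unpack a 32-bit integer into dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

Each field is converted independently, so int_to_dotted(dotted_to_int(x)) gives back x for any valid address.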

The table below displays the range of dotted-decimal values that can be assigned to each of the three principal address classes. The "xxx" represents the host-number field of the address, which is assigned by the local network administrator.

Dotted-Decimal Ranges for Each Address Class

Classification of IP Address Class

IP Addresses of Class A ( /8 Prefixes ) - Large networks

Each Class A network address has an 8-bit network-prefix with the highest order bit set to 0 and a seven-bit network number, followed by a 24-bit host-number. Today, it is no longer considered 'modern' to refer to a Class A network. Class A networks are now referred to as "/8s" (pronounced "slash eight" or just "eights") since they have an 8-bit network-prefix.

A maximum of 126 (2^7 -2) /8 networks can be defined. The calculation requires that the 2 is subtracted because the /8 network 0.0.0.0 is reserved for use as the default route and the /8 network 127.0.0.0 (also written 127/8 or 127.0.0.0/8) has been reserved for the "loopback" function. Each /8 supports a maximum of 16,777,214 (2^24 -2) hosts per network. The host calculation requires that 2 is subtracted because the all-0s ("this network") and all-1s ("broadcast") host-numbers may not be assigned to individual hosts.

Since the /8 address block contains 2^31 (2,147,483,648 ) individual addresses and the IPv4 address space contains a maximum of 2^32 (4,294,967,296) addresses, the /8 address space is 50% of the total IPv4 unicast address space.

IP addresses of Class A have the following format:

Bit no.  0 1 2 3 4 5 6 7 8               16              24            31
        +---------------+---------------+---------------+---------------+
Class A |0|  netid (7)  |                  hostid (24)                  |
        +---------------+---------------+---------------+---------------+

Class A addresses are identified by a zero in bit 0 of the address. This means that 50% of all available IP addresses are of this class. However, the networks with all binary zeros and all binary ones, and network address 127 are special network addresses. The range of Class A addresses expressed in dotted decimal notation is from 1.h.h.h to 126.h.h.h, where the 'h's represent the host part of the address. Each of the 'h's is a number from 0 to 255. This limits the number of Class A networks to 126.

The host part is 24 bits long, but as for the network part, addresses with all binary zeros or ones in the host part are special host addresses. This gives more than 16 million IP addresses within a single Class A network.

Special IP Address Conventions

Some of the available IP addresses are used for special purposes. In these cases a network part or host part containing all 0s means 'this', while all 1s means 'all'.

Table of Special Addresses:

Network part  Host part  Example          Meaning
------------  ---------  ---------------  --------------------------
All 0s        All 0s     0.0.0.0          This host
All 0s        Host       0.0.0.34         Host on this net
All 1s        All 1s     255.255.255.255  Broadcast to local net
Net           All 1s     197.21.12.255    Broadcast to net
127           Anything   127.0.0.0        Loopback, internal in host

IP Addresses of Class B ( /16 Prefixes ) - Medium sized networks

Each Class B network address has a 16-bit network-prefix with the two highest order bits set to 1-0 and a 14-bit network number, followed by a 16-bit host-number. Class B networks are now referred to as "/16s" since they have a 16-bit network-prefix.

A maximum of 16,384 (2^14 ) /16 networks can be defined with up to 65,534 (2^16 -2) hosts per network. Since the entire /16 address block contains 2^30 (1,073,741,824) addresses, it represents 25% of the total IPv4 unicast address space.

IP addresses of Class B have the following format:

Bit no.  0 1 2 3 4 5 6 7 8               16              24            31
        +---------------+---------------+---------------+---------------+
Class B |1 0|        netid (14)         |          hostid (16)          |
        +---------------+---------------+---------------+---------------+

Class B addresses are identified by a one in bit 0 and a zero in bit 1 of the address. This means that 25% of all available IP addresses are of this class.

There are 14 bits left to identify the Class B networks. This allows more than 16,000 Class B networks. The range of Class B addresses expressed in dotted decimal notation is from 128.0.h.h to 191.255.h.h, where the 'h's represent the host part of the address. Each of the 'h's is a number from 0 to 255.

The host part is 16 bits long, but addresses with all binary zeros or ones in the host part are special host addresses. This gives more than 65000 IP addresses within a single Class B network.

IP Addresses of Class C ( /24 Prefixes ) - Small networks

Each Class C network address has a 24-bit network-prefix with the three highest order bits set to 1-1-0 and a 21-bit network number, followed by an 8-bit host-number. Class C networks are now referred to as "/24s" since they have a 24-bit network-prefix.

A maximum of 2,097,152 (2^21 ) /24 networks can be defined with up to 254 (2^8 -2) hosts per network. Since the entire /24 address block contains 2^29 (536,870,912) addresses, it represents 12.5% (or 1/8th) of the total IPv4 unicast address space.

IP addresses of Class C have the following format:

Bit no.  0 1 2 3 4 5 6 7 8               16              24            31
        +---------------+---------------+---------------+---------------+
Class C |1 1 0|               netid (21)                |  hostid (8)   |
        +---------------+---------------+---------------+---------------+

Class C addresses are identified by ones in bits 0 and 1 and a zero in bit 2 of the address. This means that 12.5% of all available IP addresses are of this class.

There are 21 bits left to identify the Class C networks. This allows more than 2 million Class C networks to be used. The range of Class C addresses expressed in dotted decimal notation is from 192.0.0.h to 223.255.255.h, where the 'h's represent the host part of the address. Each of the 'h's is a number from 0 to 255.

The host part is 8 bits long, but addresses with all binary zeros or ones in the host part are special host addresses. This gives a maximum of 254 IP addresses within a single Class C network.
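The network counts, host counts, and address-space shares quoted for Classes A, B, and C all follow from the same arithmetic. The sketch below reproduces it under the rules stated in this document: the class-identifying bits are taken out of the network-prefix, two host-numbers (all-0s and all-1s) are reserved in every network, and two Class A networks (0 and 127) are reserved. The function name class_capacity is made up for illustration.

```python
def class_capacity(prefix_bits, class_bits, reserved_networks=0):
    """Compute the capacity of an address class.

    prefix_bits       -- length of the network-prefix (8, 16 or 24)
    class_bits        -- leading bits that identify the class (1, 2 or 3)
    reserved_networks -- network numbers that may not be assigned
    """
    network_bits = prefix_bits - class_bits
    host_bits = 32 - prefix_bits
    return {
        "networks": 2 ** network_bits - reserved_networks,
        "hosts_per_network": 2 ** host_bits - 2,   # all-0s and all-1s reserved
        "share_of_ipv4": 2 ** (32 - class_bits) / 2 ** 32,
    }

class_a = class_capacity(8, 1, reserved_networks=2)   # networks 0 and 127
class_b = class_capacity(16, 2)
class_c = class_capacity(24, 3)
```

With these inputs class_a["networks"] is 126 and class_c["hosts_per_network"] is 254, matching the figures in the text.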

IP Addresses of Class D - Multicasting

IP addresses of Class D have the following format:

Bit no.  0 1 2 3 4 5 6 7 8               16              24            31
        +---------------+---------------+---------------+---------------+
Class D |1 1 1 0|                multicast address (28)                 |
        +---------------+---------------+---------------+---------------+

Class D addresses are identified by ones in bits 0, 1, and 2 and a zero in bit 3 of the address. This means that 6.25% of all available IP addresses are of this class.

The range of Class D addresses in dotted decimal notation is from 224.h.h.h to 239.h.h.h, where h is a number from 0 to 255. Address 224.0.0.0 is reserved and cannot be used, while address 224.0.0.1 is used to address all hosts that take part in IP multicasting.

Class D addresses are used for multicasting and do not have a network part and a host part.

IP multicasting makes it possible to send IP datagrams to a group of hosts, which may be spread across many networks.

IP Addresses of Class E - Reserved

IP addresses of Class E are reserved for future use. They have the following format:

Bit no.  0 1 2 3 4 5 6 7 8               16              24            31
        +---------------+---------------+---------------+---------------+
Class E |1 1 1 1 0|                    reserved (27)                    |
        +---------------+---------------+---------------+---------------+

Class E addresses are identified by ones in bits 0, 1, 2, and 3 and a zero in bit 4 of the address. This means that 3.125% of all available IP addresses are of this class.

The range of Class E addresses in dotted decimal notation is from 240.0.0.0 to 247.255.255.255. Under the four-bit definition given earlier (leading bits 1-1-1-1), the reserved range instead extends to 255.255.255.255.

Remember that each technology has its own convention for transmitting messages between two machines within the same network. On a LAN, messages are sent between machines by supplying the six byte unique identifier (the "MAC" address). In an SNA network, every machine has Logical Units with their own network address. DECNET, Appletalk, and Novell IPX all have a scheme for assigning numbers to each local network and to each workstation attached to the network.

On top of these local or vendor specific network addresses, TCP/IP assigns a unique number to every workstation in the world. This "IP number" is a four byte value that, by convention, is expressed by converting each byte into a decimal number (0 to 255) and separating the bytes with a period. For example, the PC Lube and Tune server is 130.132.59.234.

An organization begins by sending electronic mail to Hostmaster@INTERNIC.NET requesting assignment of a network number. It is still possible for almost anyone to get assignment of a number for a small "Class C" network in which the first three bytes identify the network and the last byte identifies the individual computer.

The number of unassigned Internet addresses is running out, so a new classless scheme called CIDR is gradually replacing the system based on classes A, B, and C. The longer-term remedy is the adoption of IPv6.

IPng, short for Internet Protocol next generation, is a new version of the Internet Protocol (IP) currently being reviewed in IETF standards committees. The official name of IPng is IPv6, where the v6 stands for version 6. The current version of IP is version 4, so it is sometimes referred to as IPv4.

IPng is designed as an evolutionary upgrade to the Internet Protocol and will, in fact, coexist with the older IPv4 for some time. IPng is designed to allow the Internet to grow steadily, both in terms of the number of hosts connected and the total amount of data traffic transmitted.

Subnets or Subnetting

Although the individual subscribers do not need to tabulate network numbers or provide explicit routing, it is convenient for most Class B networks to be internally managed as a much smaller and simpler version of the larger network organizations. It is common to subdivide the two bytes available for internal assignment into a one byte department number and a one byte workstation ID.

In 1985, RFC 950 defined a standard procedure to support the subnetting, or division, of a single Class A, B, or C network number into smaller pieces. Subnetting was introduced to overcome some of the problems that parts of the Internet were beginning to experience with the classful two-level addressing hierarchy:

 Internet routing tables were beginning to grow.

 Local administrators had to request another network number from the Internet before a new network could be installed at their site.

The address shortage problem is aggravated by the fact that portions of the IP address space have not been efficiently allocated. Also, the traditional model of classful addressing does not allow the address space to be used to its maximum potential. The Address Lifetime Expectancy (ALE) Working Group of the IETF has expressed concerns that if the current address allocation policies are not modified, the Internet will experience a near to medium term exhaustion of its unallocated address pool. If the Internet's address supply problem is not solved, new users may be unable to connect to the global Internet!

The second problem is caused by the rapid growth in the size of the Internet routing tables. Internet backbone routers are required to maintain complete routing information for the Internet. Over recent years, routing tables have experienced exponential growth as increasing numbers of organizations connect to the Internet - in December 1990 there were 2,190 routes, in December 1992 there were 8,500 routes, and in December 1995 there were 30,000+ routes.

Both of these problems were attacked by adding another level of hierarchy to the IP addressing structure. Instead of the classful two-level hierarchy, subnetting supports a three-level hierarchy. The figure below illustrates the basic idea of subnetting which is to divide the standard classful host-number field into two parts - the subnet-number and the host-number on that subnet.

Subnetting attacked the expanding routing table problem by ensuring that the subnet structure of a network is never visible outside of the organization's private network. The route from the Internet to any subnet of a given IP address is the same, no matter which subnet the destination host is on. This is because all subnets of a given network number use the same network-prefix but different subnet numbers. The routers within the private organization need to differentiate between the individual subnets, but as far as the Internet routers are concerned, all of the subnets in the organization are collected into a single routing table entry. This allows the local administrator to introduce arbitrary complexity into the private network without affecting the size of the Internet's routing tables.

Subnetting overcame the registered number issue by assigning each organization one (or at most a few) network number(s) from the IPv4 address space. The organization was then free to assign a distinct subnetwork number for each of its internal networks. This allows the organization to deploy additional subnets without needing to obtain a new network number from the Internet.

In the figure above, a site with several logical networks uses subnet addressing to cover them with a single /16 (Class B) network address. The router accepts all traffic from the Internet addressed to network 130.5.0.0, and forwards traffic to the interior subnetworks based on the third octet of the classful address. The deployment of subnetting within the private network provides several benefits:

 The size of the global Internet routing table does not grow because the site administrator does not need to obtain additional address space and the routing advertisements for all of the subnets are combined into a single routing table entry.

 The local administrator has the flexibility to deploy additional subnets without obtaining a new network number from the Internet.

 Route flapping (i.e., the rapid changing of routes) within the private network does not affect the Internet routing table since Internet routers do not know about the reachability of the individual subnets - they just know about the reachability of the parent network number.
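The single-entry view from outside and the per-subnet view from inside can be sketched with Python's standard ipaddress module. The example assumes the 130.5.0.0 /16 site from the text, with the whole third octet used as the subnet-number; the function name subnet_for is made up for illustration.

```python
import ipaddress

# What Internet routers see: one routing table entry for the whole site.
site = ipaddress.ip_network("130.5.0.0/16")

# What the site's own routers see: one /24 per value of the third octet.
subnets = list(site.subnets(new_prefix=24))

def subnet_for(host):
    """Return the interior /24 that contains host, or None if outside the site."""
    if ipaddress.ip_address(host) not in site:
        return None
    # strict=False discards the host bits, leaving only the subnet address.
    return ipaddress.ip_network(host + "/24", strict=False)
```

Adding or removing entries in subnets changes nothing about site, which is why the interior structure never shows up in the Internet's routing tables.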

Extended-Network-Prefix

Internet routers use only the network-prefix of the destination address to route traffic to a subnetted environment. Routers within the subnetted environment use the extended-network-prefix to route traffic between the individual subnets. The extended-network-prefix is composed of the classful network-prefix and the subnet-number.

The extended-network-prefix has traditionally been identified by the subnet mask. For example, if you have the /16 address of 130.5.0.0 and you want to use the entire third octet to represent the subnet-number, you need to specify a subnet mask of 255.255.255.0. The bits in the subnet mask and the Internet address have a one-to-one correspondence. The bits of the subnet mask are set to 1 if the system examining the address should treat the corresponding bit in the IP address as part of the extended-network-prefix. The bits in the mask are set to 0 if the system should treat the bit as part of the host-number. This is illustrated in the figure below:
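The bit-for-bit correspondence between mask and address amounts to a bitwise AND. The following sketch, with made-up helper names to_int, to_dotted, and split_by_mask, separates an address into its extended-network-prefix and host-number.

```python
def to_int(dotted):
    """Convert dotted-decimal notation to a 32-bit integer."""
    a, b, c, d = (int(x) for x in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def to_dotted(value):
    """Convert a 32-bit integer to dotted-decimal notation."""
    return ".".join(str((value >> s) & 0xFF) for s in (24, 16, 8, 0))

def split_by_mask(address, mask):
    """Return (extended-network-prefix, host-number) for address under mask."""
    addr, m = to_int(address), to_int(mask)
    prefix = addr & m                    # keep the bits where the mask is 1
    host = addr & ~m & 0xFFFFFFFF        # keep the bits where the mask is 0
    return to_dotted(prefix), host
```

For the address 130.5.5.25 with mask 255.255.255.0, the prefix is 130.5.5.0 and the host-number is 25.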

The standards describing modern routing protocols often refer to the extended-network-prefix-length rather than the subnet mask. The prefix length is equal to the number of contiguous one-bits in the traditional subnet mask. This means that specifying the network address 130.5.5.25 with a subnet mask of 255.255.255.0 can also be expressed as 130.5.5.25/24. The / notation is more compact and easier to understand than writing out the mask in its traditional dotted-decimal format. This is illustrated in the figure below:

However, it is important to note that modern routing protocols still carry the subnet mask. There are no Internet standard routing protocols that have a one-byte field in their header that contains the number of bits in the extended-network prefix. Rather, each routing protocol is still required to carry the complete four-octet subnet mask.
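Converting between the two notations is a matter of counting the leading one-bits. The sketch below is illustrative only; the function names mask_to_prefix_length and prefix_length_to_mask are made up, and the check that rejects non-contiguous masks is an assumption of this example.

```python
def mask_to_prefix_length(mask):
    """Count the contiguous one-bits at the top of a dotted-decimal mask."""
    value = 0
    for field in mask.split("."):
        value = (value << 8) | int(field)
    length = 0
    while length < 32 and value & (1 << (31 - length)):
        length += 1
    # Every bit after the leading run of ones must be zero.
    if value & ((1 << (32 - length)) - 1):
        raise ValueError("mask bits are not contiguous: %s" % mask)
    return length

def prefix_length_to_mask(length):
    """Build the four-octet mask that a routing protocol would carry."""
    value = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
    return ".".join(str((value >> s) & 0xFF) for s in (24, 16, 8, 0))
```

So 130.5.5.25 with mask 255.255.255.0 and 130.5.5.25/24 describe the same thing, and either form can be derived from the other.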

Subnet Design Considerations

The deployment of an addressing plan requires careful thought on the part of the network administrator. There are four key questions that must be answered before any design should be undertaken:

1) How many total subnets does the organization need today?

2) How many total subnets will the organization need in the future?

3) How many hosts are there on the organization's largest subnet today?

4) How many hosts will there be on the organization's largest subnet in the future?

The first step in the planning process is to take the maximum number of subnets required and round up to the nearest power of two. For example, if an organization needs 9 subnets, 2^3 (or 8) will not provide enough subnet addressing space, so the network administrator will need to round up to 2^4 (or 16). When performing this assessment, it is critical that the network administrator always allow adequate room for future growth. For example, if 14 subnets are required today, then 16 subnets might not be enough in two years when the 17th subnet needs to be deployed. In this case, it might be wise to allow for more growth and select 2^5 (or 32) as the maximum number of subnets.

The second step is to make sure that there are enough host addresses for the organization's largest subnet. If the largest subnet needs to support 50 host addresses today, 2^5 (or 32) will not provide enough host address space so the network administrator will need to round up to 2^6 (or 64).

The final step is to make sure that the organization's address allocation provides enough bits to deploy the required subnet addressing plan. For example, if the organization has a single /16, it could easily deploy 4-bits for the subnet-number and 6-bits for the host number. However, if the organization has several /24s and it needs to deploy 9 subnets, it may be required to subnet each of its /24s into four subnets (using 2 bits) and then build the internet by combining the subnets of 3 different /24 network numbers. An alternative solution would be to deploy network numbers from the private address space (RFC 1918) for internal connectivity and use a Network Address Translator (NAT) to provide external Internet access.
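The three planning steps can be sketched as a small calculation. One assumption goes beyond the text: the host calculation here also sets aside the all-0s and all-1s host-numbers, following the rule given earlier for classful networks. The function names bits_for_subnets, bits_for_hosts, and plan_fits are made up for illustration.

```python
def bits_for_subnets(count):
    """Smallest n such that 2**n subnets cover count."""
    bits = 0
    while 2 ** bits < count:
        bits += 1
    return bits

def bits_for_hosts(count):
    """Smallest n such that 2**n - 2 usable host-numbers cover count."""
    bits = 0
    while 2 ** bits - 2 < count:
        bits += 1
    return bits

def plan_fits(prefix_length, subnets, hosts):
    """Check whether the allocation leaves enough bits for the plan."""
    needed = bits_for_subnets(subnets) + bits_for_hosts(hosts)
    return prefix_length + needed <= 32
```

With 9 subnets and 50 hosts on the largest subnet, 4 + 6 = 10 bits are needed. A /16 leaves 16 bits, so the plan fits; a single /24 leaves only 8, so it does not.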

An Uncertain Path

Every time a message arrives at an IP router, the router makes an individual decision about where to send it next. There is no concept of a session with a preselected path for all traffic. Consider a company with facilities in New York, Los Angeles, Chicago and Atlanta. It could build a network from four phone lines forming a loop (NY to Chicago to LA to Atlanta to NY). A message arriving at the NY router could go to LA via either Chicago or Atlanta. The reply could come back the other way.

How does the router make a decision between routes? There is no correct answer. Traffic could be routed by the "clockwise" algorithm (go NY to Atlanta, LA to Chicago). The routers could alternate, sending one message to Atlanta and the next to Chicago. More sophisticated routing measures traffic patterns and sends data through the least busy link.

If one phone line in this network breaks down, traffic can still reach its destination through a roundabout path. After losing the NY to Chicago line, data can be sent NY to Atlanta to LA to Chicago. This provides continued service though with degraded performance. This kind of recovery is the primary design feature of IP. The loss of the line is immediately detected by the routers in NY and Chicago, but somehow this information must be sent to the other nodes. Otherwise, LA could continue to send NY messages through Chicago, where they arrive at a "dead end." Each network adopts some Router Protocol which periodically updates the routing tables throughout the network with information about changes in route status.
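The rerouting in the four-city loop can be illustrated with a breadth-first search over the surviving lines. This is only a sketch of the idea: real routers exchange routing table updates rather than searching a complete map, and the names LINES and shortest_path are made up for this example.

```python
from collections import deque

# The four phone lines forming the loop described in the text.
LINES = {("NY", "Chicago"), ("Chicago", "LA"), ("LA", "Atlanta"), ("Atlanta", "NY")}

def shortest_path(lines, source, target):
    """Return the list of cities on a fewest-hop path, or None if unreachable."""
    neighbors = {}
    for a, b in lines:                       # lines carry traffic both ways
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for city in sorted(neighbors.get(path[-1], [])):
            if city not in seen:
                seen.add(city)
                queue.append(path + [city])
    return None

normal = shortest_path(LINES, "NY", "Chicago")
degraded = shortest_path(LINES - {("NY", "Chicago")}, "NY", "Chicago")
```

Here normal is the direct hop, while degraded is the roundabout path NY, Atlanta, LA, Chicago: service continues, but over three lines instead of one.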

If the size of the network grows, then the complexity of the routing updates will increase as will the cost of transmitting them. Building a single network that covers the entire US would be unreasonably complicated. Fortunately, the Internet is designed as a Network of Networks. This means that loops and redundancy are built into each regional carrier. The regional network handles its own problems and reroutes messages internally. Its Router Protocol updates the tables in its own routers, but no routing updates need to propagate from a regional carrier to the NSF spine or to the other regions (unless, of course, a subscriber switches permanently from one region to another).

Undiagnosed Problems

IBM designs its SNA networks to be centrally managed. If any error occurs, it is reported to the network authorities. By design, any error is a problem that should be corrected or repaired. IP networks, however, were designed to be robust. In battlefield conditions, the loss of a node or line is a normal circumstance. Casualties can be sorted out later on, but the network must stay up. So IP networks are robust. They automatically (and silently) reconfigure themselves when something goes wrong. If there is enough redundancy built into the system, then communication is maintained.

In 1975 when SNA was designed, such redundancy would have been prohibitively expensive, or it might have been argued that only the Defense Department could afford it. Today, however, simple routers cost no more than a PC. Yet the TCP/IP design philosophy that "errors are normal and can be largely ignored" produces problems of its own.

Data traffic is frequently organized around "hubs," much like airline traffic. One could imagine an IP router in Atlanta routing messages for smaller cities throughout the Southeast. The problem is that data arrives without a reservation. Airline companies experience the problem around major events, like the Super Bowl. Just before the game, everyone wants to fly into the city. After the game, everyone wants to fly out. Imbalance occurs on the network when something new gets advertised. Adam Curry announced the server at "mtv.com" and his regional carrier was swamped with traffic the next day. The problem is that messages come in from the entire world over high speed lines, but they go out to mtv.com over what was then a slow speed phone line.

Occasionally a snow storm cancels flights and airports fill up with stranded passengers. Many go off to hotels in town. When data arrives at a congested router, there is no place to send the overflow. Excess packets are simply discarded. It becomes the responsibility of the sender to retry the data a few seconds later and to persist until it finally gets through. This recovery is provided by the TCP component of the Internet protocol.

TCP was designed to recover from node or line failures where the network propagates routing table changes to all router nodes. Since the update takes some time, TCP is slow to initiate recovery. The TCP algorithms are not tuned to optimally handle packet loss due to traffic congestion. Instead, the traditional Internet response to traffic problems has been to increase the speed of lines and equipment in order to stay ahead of growth in demand.

TCP treats the data as a stream of bytes. It logically assigns a sequence number to each byte. The TCP packet has a header that says, in effect, "This packet starts with byte 379642 and contains 200 bytes of data." The receiver can detect missing or incorrectly sequenced packets. TCP acknowledges data that has been received and retransmits data that has been lost. The TCP design means that error recovery is done end-to-end between the Client and Server machine. There is no formal standard for tracking problems in the middle of the network, though each network has adopted some ad hoc tools.
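The way byte sequence numbers let a receiver detect gaps can be sketched as follows. This is a simplified illustration, not TCP itself: it assumes each segment is a (sequence number, payload) pair and that retransmitted segments start on the same byte boundaries as the originals. The function name reassemble is made up for this example.

```python
def reassemble(segments, first_byte):
    """Rebuild the byte stream from (sequence_number, payload) pairs.

    Segments may arrive out of order or duplicated. Delivery stops at the
    first gap. Returns (data, next_expected), where next_expected is the
    sequence number the receiver would acknowledge.
    """
    pending = dict(segments)          # duplicates collapse onto one entry
    data = b""
    expected = first_byte
    while expected in pending:
        payload = pending.pop(expected)
        data += payload
        expected += len(payload)      # the next segment must start here
    return data, expected

# Two segments arriving out of order are still delivered in sequence.
in_order = reassemble([(379842, b"B" * 100), (379642, b"A" * 200)], 379642)

# A missing segment (bytes 379842 to 379941) halts delivery at the gap.
with_gap = reassemble([(379642, b"A" * 200), (379942, b"C" * 50)], 379642)
```

In the second case the receiver holds the later segment but acknowledges only up to byte 379842, which tells the sender what to retransmit.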

Need to Know

There are three levels of TCP/IP knowledge. Those who administer a regional or national network must design a system of long distance phone lines, dedicated routing devices, and very large configuration files. They must know the IP numbers and physical locations of thousands of subscriber networks. They must also have a formal network monitoring strategy to detect problems and respond quickly.

Each large company or university that subscribes to the Internet must have an intermediate level of network organization and expertise. A half dozen routers might be configured to connect several dozen departmental LANs in several buildings. All traffic outside the organization would typically be routed to a single connection to a regional network provider.

However, the end user can install TCP/IP on a personal computer without any knowledge of either the corporate or regional network. Three pieces of information are required:

1. The IP address assigned to this personal computer

2. The part of the IP address (the subnet mask) that distinguishes other machines on the same LAN (messages can be sent to them directly) from machines in other departments or elsewhere in the world (which are sent to a router machine)

3. The IP address of the router machine that connects this LAN to the rest of the world.

In the case of the PCLT server, the IP address is 130.132.59.234. Since the first three bytes designate this department, a "subnet mask" is defined as 255.255.255.0 (255 is the largest byte value and represents the number with all bits turned on). It is a Yale convention (which we recommend to everyone) that the router for each department have station number 1 within the department network. Thus the PCLT router is 130.132.59.1. The PCLT server is therefore configured with the values:

 My IP address: 130.132.59.234
 Subnet mask: 255.255.255.0
 Default router: 130.132.59.1

The subnet mask tells the server that any other machine with an IP address beginning 130.132.59.* is on the same department LAN, so messages are sent to it directly. Any IP address beginning with a different value is accessed indirectly by sending the message through the router at 130.132.59.1 (which is on the departmental LAN).
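The decision described above can be sketched with Python's standard `ipaddress` module. The function name `next_hop` and the two destination addresses are illustrative assumptions; the three configuration values are the PCLT ones given in the text.

```python
import ipaddress


def next_hop(my_ip, subnet_mask, default_router, destination):
    """Return the address a packet for `destination` is handed to first."""
    # strict=False lets us pass a host address; the host bits are masked off.
    local_net = ipaddress.ip_network(f"{my_ip}/{subnet_mask}", strict=False)
    if ipaddress.ip_address(destination) in local_net:
        return destination      # same LAN: send directly
    return default_router       # anywhere else: send through the router


config = ("130.132.59.234", "255.255.255.0", "130.132.59.1")
print(next_hop(*config, "130.132.59.17"))  # hypothetical machine on the same LAN
print(next_hop(*config, "130.132.1.5"))    # hypothetical machine elsewhere
```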

Basic IP Routing

Classed IP Addressing and the Use of ARP

Consider a small internal TCP/IP network consisting of one Ethernet segment and three nodes. The IP network number of this Ethernet segment is 200.1.2. The host numbers for A, B, and C are 1, 2, and 3 respectively. These are Class C addresses, and therefore allow for up to 254 nodes on this network segment.
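The figure of 254 nodes follows from the one-byte host field of a Class C address: 256 values, less the all-zeros value (the network itself) and the all-ones value (broadcast). A small sketch of that arithmetic, using the hypothetical helper names `address_class` and `usable_hosts`:

```python
import ipaddress


def address_class(ip):
    """Return the classful address class, decided by the first byte."""
    first = int(ip.split(".")[0])
    if first < 128:
        return "A"   # network number is 1 byte
    if first < 192:
        return "B"   # network number is 2 bytes
    if first < 224:
        return "C"   # network number is 3 bytes
    if first < 240:
        return "D"   # multicast
    return "E"       # reserved


def usable_hosts(network):
    """Host addresses left after removing the network and broadcast values."""
    return ipaddress.ip_network(network).num_addresses - 2


print(address_class("200.1.2.1"))    # node A in the example
print(usable_hosts("200.1.2.0/24"))  # the example segment written in prefix form
```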

Each of these nodes has a corresponding Ethernet address, which is six bytes long. Ethernet addresses are normally written in hexadecimal form separated by dashes (02-FE-87-4A-8C-A9 for example).

In the diagram above and subsequent diagrams, we have emphasized the network number portion of the IP address.

Suppose that A wanted to send a packet to C for the first time, and that it knows C's IP address. To send this packet over Ethernet, A would need to know C's Ethernet address. The Address Resolution Protocol (ARP) is used for the dynamic discovery of these addresses.

ARP keeps an internal table of IP addresses and their corresponding Ethernet addresses. When A attempts to send the IP packet destined for C, the ARP module looks up C's IP address in its table and finds no entry. ARP then broadcasts a special request packet over the Ethernet segment, which all nodes receive. The node that has the specified IP address, in this case C, returns its Ethernet address in a reply packet to A. Once A receives this reply, it updates its table and uses the Ethernet address to direct its packet to C. ARP table entries may be configured statically in some cases; otherwise ARP keeps each entry until it becomes "stale", at which point it is flushed.
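The lookup, broadcast, and cache steps can be modeled in a few lines. This is a toy simulation, not real ARP: the `Segment` and `Node` classes are hypothetical, the Ethernet addresses are made up for illustration, and entry aging is left out.

```python
class Segment:
    """One Ethernet segment: every attached node sees every broadcast."""

    def __init__(self):
        self.nodes = []
        self.broadcasts = 0

    def arp_request(self, sender, target_ip):
        self.broadcasts += 1
        for node in self.nodes:
            if node is not sender and node.ip == target_ip:
                return node.mac   # only the owner of target_ip replies
        return None               # nobody on this wire has that address


class Node:
    def __init__(self, ip, mac, segment):
        self.ip, self.mac, self.segment = ip, mac, segment
        self.arp_table = {}       # IP address -> Ethernet address
        segment.nodes.append(self)

    def resolve(self, target_ip):
        if target_ip in self.arp_table:          # table hit: no traffic
            return self.arp_table[target_ip]
        mac = self.segment.arp_request(self, target_ip)
        if mac is not None:
            self.arp_table[target_ip] = mac      # remember the reply
        return mac


lan = Segment()
a = Node("200.1.2.1", "02-00-00-00-00-0A", lan)  # illustrative addresses
b = Node("200.1.2.2", "02-00-00-00-00-0B", lan)
c = Node("200.1.2.3", "02-00-00-00-00-0C", lan)

print(a.resolve("200.1.2.3"))   # first time: broadcast, then cached
print(a.resolve("200.1.2.3"))   # second time: answered from A's table
print(lan.broadcasts)
```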

Consider now two separate Ethernet networks that are joined by a PC, C, acting as an IP router (for instance, if you have two Ethernet segments on your server).

Device C is acting as a router between these two networks. A router is a device that chooses different paths for the network packets, based on the addressing of the IP frame it is handling. Different routes connect to different networks. The router will have more than one address as each route is part of a different network.

Since there are two separate Ethernet segments, each network has its own Class C network number. This is necessary because the router must know which network interface to use to reach a specific node, and each interface is assigned a network number. If A wants to send a packet to E, it must first send it to C, which can then forward the packet to E. This is accomplished by having A use C's Ethernet address, but E's IP address. C will receive a packet destined for E and will then forward it using E's Ethernet address. These Ethernet addresses are obtained using ARP as described earlier.
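A short sketch may make the two layers of addressing easier to see: the IP addresses stay fixed end to end, while the Ethernet addresses are rewritten at each hop. The `Frame` class, the `forward` function, and all Ethernet addresses below are assumptions for illustration; the IP addresses are those used for A and E in this example network.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    eth_src: str
    eth_dst: str
    ip_src: str
    ip_dst: str


def forward(frame, out_interface_mac, next_hop_mac):
    """What the router does: new Ethernet addresses, same IP addresses."""
    return replace(frame, eth_src=out_interface_mac, eth_dst=next_hop_mac)


# Made-up Ethernet addresses; C has one per network interface.
A_MAC = "02-00-00-00-00-0A"
C_MAC_NET1 = "02-00-00-00-00-C1"
C_MAC_NET2 = "02-00-00-00-00-C2"
E_MAC = "02-00-00-00-00-0E"

# A addresses the frame to C's Ethernet address but E's IP address.
first_hop = Frame(A_MAC, C_MAC_NET1, "200.1.2.1", "200.1.3.2")
# C forwards it on its other interface using E's Ethernet address.
second_hop = forward(first_hop, C_MAC_NET2, E_MAC)

print(first_hop)
print(second_hop)
```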

If E were assigned the same network number as A, 200.1.2, A would then try to reach E in the same way it reached C in the previous example - by sending an ARP request and hoping for a reply. However, because E is on a different physical wire, it will never see the ARP request and so the packet cannot be delivered. By specifying that E is on a different network, the IP module in A will know that E cannot be reached without having the packet forwarded by some node on the same network as A.

Direct vs. Indirect Routing

Direct routing was observed in the first example when A communicated with C. It is also used in the last example for A to communicate with B. If the packet does not need to be forwarded, i.e. both the source and destination addresses have the same network number, direct routing is used.

Indirect routing is used when the network numbers of the source and destination do not match. This is the case where the packet must be forwarded by a node that knows how to reach the destination (a router).

In the last example, A wanted to send a packet to E. For A to know how to reach E, it must be given routing information that tells it who to send the packet to in order to reach E. This special node is the "gateway" or router between the two networks. A Unix-style method for adding a routing entry to A is

route add [destination_ip] [gateway] [metric]

Where the metric value is the number of hops to the destination. In this case,

route add 200.1.3.2 200.1.2.3 1

will tell A to use C as the gateway to reach E. Similarly, for E to reach A,

route add 200.1.2.1 200.1.3.10 1

will be used to tell E to use C as the gateway to reach A.

It is necessary that C have two IP addresses - one for each network interface. This way, A knows from C's IP address that it is on its own network, and similarly for E. Within C, the routing module will know from the network number of each interface which one to use for forwarding IP packets.

In most cases it will not be necessary to manually add this routing entry. It would normally be sufficient to set up C as the default gateway for all other nodes on both networks. The default gateway is the IP address of the machine to send all packets to that are not destined to a node on the directly-connected network. The routing table in the default gateway will be set up to forward the packets properly, which will be discussed in detail later.
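Putting direct routing, explicit route entries, and the default gateway together, node A's choice of next hop might be sketched as follows. The function name `choose_next_hop` and the dictionary layout are assumptions made for illustration; real routing tables also carry metrics, interfaces, and network-wide routes.

```python
import ipaddress


def choose_next_hop(destination, local_network, host_routes, default_gateway):
    """Pick where a node sends a packet first.

    local_network:   the directly connected network, e.g. "200.1.2.0/24"
    host_routes:     {destination_ip: gateway_ip}, as added by "route add"
    default_gateway: used when nothing else matches (may be None)
    """
    if ipaddress.ip_address(destination) in ipaddress.ip_network(local_network):
        return destination                   # direct routing
    if destination in host_routes:
        return host_routes[destination]      # explicit routing entry
    return default_gateway                   # everything else


# Node A with the manual entry "route add 200.1.3.2 200.1.2.3 1":
print(choose_next_hop("200.1.3.2", "200.1.2.0/24", {"200.1.3.2": "200.1.2.3"}, None))
# Node A with no manual entries, only C as its default gateway:
print(choose_next_hop("200.1.3.2", "200.1.2.0/24", {}, "200.1.2.3"))
# A to B stays on the local wire either way:
print(choose_next_hop("200.1.2.2", "200.1.2.0/24", {}, "200.1.2.3"))
```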

Static vs. Dynamic Routing

Static routing is performed using a preconfigured routing table which remains in effect indefinitely, unless it is changed manually by the user. This is the most basic form of routing, and it usually requires that all machines have statically configured addresses, and definitely requires that all machines remain on their respective networks. Otherwise, the user must manually alter the routing tables on one or more machines to reflect the change in network topology or addressing. Usually at least one static entry exists for the network interface, and is normally created automatically when the interface is configured.

Dynamic routing uses special routing information protocols to automatically update the routing table with routes known by peer routers. These protocols are grouped according to whether they are Interior Gateway Protocols (IGPs) or Exterior Gateway Protocols. Interior gateway protocols are used to distribute routing information inside of an Autonomous System (AS). An AS is a set of routers inside the domain administered by one authority. Examples of interior gateway protocols are OSPF and RIP. Exterior gateway protocols are used for inter-AS routing, so that each AS may be aware of how to reach others throughout the Internet. Examples of exterior gateway protocols are EGP and BGP. See RFC 1716 [11] for more information on IP router operations.

In Summary

We can define a ROUTER as:

A device that connects two or more networks, such as LANs.

Routers are similar to bridges, but provide additional functionality, such as the ability to filter messages and forward them to different places based on various criteria.

The Internet uses routers extensively to forward packets from one host to another.