
Retransmission behaviour of Solaris 2.5.1 and 2.6


Last update: 20.09.98

  1. Solaris 2.5.1 initial SYN segment loss test
  2. Solaris 2.6 initial SYN segment loss test
  3. Solaris 2.5.1 data retransmission test
  4. Solaris 2.6 data retransmission test

Most settings shown on this page are for experimental purposes!
Do not attempt to use them in a production environment!

For the experiments on this page, I use two Solaris hosts, A and B. Host A runs Solaris 2.5.1 and host B runs 2.6. As far as the tests are concerned, both hosts are connected by a switched 100BaseT ethernet.

The first set of tests routes PDUs to an inactive interface on another host. Thus the PDUs are delivered to the TCP/IP stack of the remote host, but are never acted upon. This has the same effect as losing the initial SYN segment. The second set of tests temporarily shuts down a previously active interface, thus causing data segments to be lost. Retransmission tests with the FIN segment can be done, but not as easily as the other experiments.

1. Solaris 2.5.1 initial SYN segment loss test

1.1 Setup of tunable parameters on A

The Solaris 2.5.1 host A has the following settings which are of interest to the reader. Since host B is not actively involved in this experiment, its tunables are shown later, when appropriate.

     tcp_ip_abort_cinterval = 60000
     tcp_ip_abort_linterval = 180000
      tcp_ip_abort_interval = 600000
    tcp_ip_notify_cinterval = 10000
     tcp_ip_notify_interval = 10000
tcp_rexmit_interval_initial = 3000
    tcp_rexmit_interval_max = 240000
    tcp_rexmit_interval_min = 200
  tcp_deferred_ack_interval = 200

1.2 Setup of the test environment

I set up a virtual interface with a private network IP address as an alias of the regular ethernet interface on host B, and tell A about the route to B. For the first experiment, this virtual interface on host B has to exist, but must be down. Note that using the 192.168.0.0/16 address space does not hurt anybody, since it is reserved for private networks (see RFC 1918).

A # route add host 192.168.4.17 B 1
B # ifconfig hme0:1 192.168.4.17 down
B # ifconfig -a
...
hme0:1: flags=842<BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.4.17 netmask ffffff00 broadcast 192.168.4.255
...

1.3 The experiment

On either host A or B I trace the network PDUs with the help of the tcpdump tool. You can use snoop instead, but tcpdump has more concise output: depending on the verbosity level, snoop shows either too little or too much. Using another terminal on host A, I try to connect to the daytime port of the private address. Because I use ATM to connect to both hosts A and B, the terminal traffic does not show up in the TCP dump. gtod is a small tool that calls gettimeofday() internally and displays the current time in 24-hour format with milliseconds. It could display microseconds, but Solaris timers are only accurate to 4 ms anyway.

A $ gtod ; telnet 192.168.4.17 daytime ; gtod
14:17:45.919
Trying 192.168.4.17...
telnet: Unable to connect to remote host: Connection timed out
14:19:11.284
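
The source of gtod is not listed on this page. As a rough illustration only, a comparable helper could be sketched in Python as follows; the function name format_time_of_day and the nanosecond input are assumptions made for this sketch, not part of the original tool.

```python
import time


def format_time_of_day(nanoseconds, use_utc=False):
    """Format a nanosecond epoch timestamp as HH:MM:SS.mmm (24-hour clock)."""
    # Integer arithmetic avoids floating-point rounding of the milliseconds.
    seconds, remainder = divmod(nanoseconds, 1_000_000_000)
    millis = remainder // 1_000_000
    broken_down = time.gmtime(seconds) if use_utc else time.localtime(seconds)
    return "%02d:%02d:%02d.%03d" % (
        broken_down.tm_hour, broken_down.tm_min, broken_down.tm_sec, millis)


if __name__ == "__main__":
    # Print the current local time, similar in spirit to what gtod shows.
    print(format_time_of_day(time.time_ns()))
```

Bracketing a command with two such calls gives the wall-clock duration of a connection attempt, which is how the gtod output above is used.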

A # tcpdump -Ni hme0 host 192.168.4.17
tcpdump: listening on hme0
14:17:45.954893 A.11195 > 192.168.4.17.daytime: S 764681606:764681606(0) win 64240 <mss 1460> (DF)
14:17:58.136084 A.11195 > 192.168.4.17.daytime: S 764681606:764681606(0) win 64240 <mss 1460> (DF)
14:18:22.516140 A.11195 > 192.168.4.17.daytime: S 764681606:764681606(0) win 64240 <mss 1460> (DF)

Since both outputs originate from the same host A, the timestamps are comparable.
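
The intervals between successive SYN segments can be read off the tcpdump timestamps by hand, or computed. The following Python sketch shows one way to do it; the helper names are made up for illustration, and it assumes the trace does not cross midnight.

```python
import re

# tcpdump prints HH:MM:SS.uuuuuu at the start of every packet line.
TIMESTAMP = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\.(\d{6})\s")


def timestamp_micros(line):
    """Return the time of day in microseconds, or None for non-packet lines."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    hours, minutes, seconds, micros = (int(group) for group in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1_000_000 + micros


def retransmission_intervals(lines):
    """Return the gaps in seconds between successive timestamped lines."""
    stamps = [t for t in map(timestamp_micros, lines) if t is not None]
    return [(later - earlier) / 1_000_000
            for earlier, later in zip(stamps, stamps[1:])]


trace = [
    "tcpdump: listening on hme0",
    "14:17:45.954893 A.11195 > 192.168.4.17.daytime: S 764681606:764681606(0) win 64240 <mss 1460> (DF)",
    "14:17:58.136084 A.11195 > 192.168.4.17.daytime: S 764681606:764681606(0) win 64240 <mss 1460> (DF)",
    "14:18:22.516140 A.11195 > 192.168.4.17.daytime: S 764681606:764681606(0) win 64240 <mss 1460> (DF)",
]

if __name__ == "__main__":
    for gap in retransmission_intervals(trace):
        print("%.6f" % gap)
```

For the trace above this yields gaps of roughly 12.18 and 24.38 seconds.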

1.4 Results

The Solaris 2.5.1 host A retransmits its initial SYN segment only after 12.18 seconds, which is 4 * tcp_rexmit_interval_initial (confirmed by further experiments) plus some unknown constant C. Decreasing the parameter shortens the retransmission interval for the initial segment. For instance, with 500 milliseconds the first retransmission follows after about 2 seconds, the second about 4 seconds after that, and so on. This admittedly odd behaviour may be among the reasons older versions of Solaris defaulted the parameter to 500 ms. Solaris 2.6 does not display this odd behaviour, as shown below.

In the given experiment, Solaris 2.5.1 sends the initial segment three times (the original SYN plus two retransmissions) before giving up. Note that Solaris uses the required exponential backoff: the time between successive connection attempts doubles. The expiration is triggered by tcp_ip_abort_cinterval, which says to give up after 60 seconds. If it is set to a higher value, every Solaris version keeps trying to connect for a longer time, and thus sends more segments. This may be of importance on very bad lines.
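
The rule observed above (first interval of 4 * tcp_rexmit_interval_initial, then doubling) can be written down as a small Python sketch. It is only a model of the measurements on this page: it ignores the unknown constant C and any upper bound on the interval, and the function name is made up for illustration.

```python
def nominal_syn_backoff_251(rexmit_initial_ms, retransmissions):
    """Nominal SYN retransmission intervals in seconds for Solaris 2.5.1.

    Models only what was observed: the first interval is four times
    tcp_rexmit_interval_initial, and every further interval doubles.
    The unknown constant C is deliberately left out.
    """
    first_ms = 4 * rexmit_initial_ms
    return [first_ms * 2 ** n / 1000.0 for n in range(retransmissions)]


if __name__ == "__main__":
    print(nominal_syn_backoff_251(3000, 3))  # default setting used above
    print(nominal_syn_backoff_251(500, 3))   # older 500 ms default
```

With the default of 3000 ms the model gives 12, 24 and 48 seconds, against measured gaps of 12.18 and 24.38 seconds. With 500 ms it gives 2, 4 and 8 seconds.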

1.5 Retesting with different settings

tcp_rexmit_interval_initial = 500
    tcp_rexmit_interval_min = 5000
    tcp_rexmit_interval_max = 20000
     tcp_ip_abort_cinterval = 120000

The setting of tcp_rexmit_interval_min didn't seem particularly important so far. In order to get a good look at the behaviour of this parameter, we need to raise it above four times the initial interval. We also decrease tcp_rexmit_interval_max to see its effect, and raise tcp_ip_abort_cinterval to 120 seconds so that more retransmissions can be observed.

A $ gtod ; sock 192.168.4.17 daytime ; gtod
16:33:17.738
connect() error: Connection timed out
16:35:11.233

A # tcpdump -Ni hme0 host 192.168.4.17
tcpdump: listening on hme0
16:33:17.771977 A.13448 > 192.168.4.17.daytime: S 2113264372:2113264372(0) win 21900 <mss 1460> (DF)
16:33:20.587369 A.13448 > 192.168.4.17.daytime: S 2113264372:2113264372(0) win 21900 <mss 1460> (DF)
16:33:26.217241 A.13448 > 192.168.4.17.daytime: S 2113264372:2113264372(0) win 21900 <mss 1460> (DF)
16:33:37.467032 A.13448 > 192.168.4.17.daytime: S 2113264372:2113264372(0) win 21900 <mss 1460> (DF)
16:33:56.216666 A.13448 > 192.168.4.17.daytime: S 2113264372:2113264372(0) win 21900 <mss 1460> (DF)
16:34:14.966287 A.13448 > 192.168.4.17.daytime: S 2113264372:2113264372(0) win 21900 <mss 1460> (DF)
16:34:33.715931 A.13448 > 192.168.4.17.daytime: S 2113264372:2113264372(0) win 21900 <mss 1460> (DF)
16:34:52.465540 A.13448 > 192.168.4.17.daytime: S 2113264372:2113264372(0) win 21900 <mss 1460> (DF)

sock is a tool well known to all readers of TCP/IP Illustrated, Volume 1. In this instance, it tries to establish a TCP connection in the same way telnet does. Of interest are the intervals between the tcpdump timestamps.

  1. retransmission after 2.8154 seconds (4*initial + C).
  2. retransmission after 5.6299 seconds (double the first interval).
  3. retransmission after 11.2499 seconds (doubled again).
  4. retransmission after 18.7496 seconds (below the configured 20 seconds).
  5. retransmission after 18.7496 seconds (does not rise further).
  6. retransmission after 18.7496 seconds.
  7. retransmission after 18.7496 seconds.

tcp_rexmit_interval_min does not seem to play any part in deciding the retransmission interval for the initial segment. The interval starts from the first value used (regardless of how that value is calculated) and doubles until the upper boundary given by tcp_rexmit_interval_max would be crossed. How Solaris arrives at the value of 18.75 seconds is still unknown to me. The application took a trifle less than the configured 120 seconds to run into the connection timeout.
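
To check the doubling claim against the numbers, the growth ratio between successive intervals can be computed. The short Python sketch below uses the seven measured intervals listed above; the function name is made up for illustration.

```python
# Intervals in seconds, as measured in the retest above.
observed = [2.8154, 5.6299, 11.2499, 18.7496, 18.7496, 18.7496, 18.7496]


def growth_ratios(intervals):
    """Return each interval divided by its predecessor, rounded to 2 places."""
    return [round(later / earlier, 2)
            for earlier, later in zip(intervals, intervals[1:])]


if __name__ == "__main__":
    print(growth_ratios(observed))
```

The first two ratios come out at 2.0, confirming the doubling. The third is about 1.67, where the interval is held below the 20 second maximum, and the remaining ratios are 1.0.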

2. Solaris 2.6 initial SYN segment loss test

2.1 Setup of tunable parameters on host B

The Solaris 2.6 host B has the following settings which are of interest to the reader. This time, host A is not actively involved in the experiment. The parameter tcp_rexmit_interval_extra is new with Solaris 2.6, and most likely also connected to retransmission. By default, it is set to zero.

     tcp_ip_abort_cinterval = 180000
     tcp_ip_abort_linterval = 180000
      tcp_ip_abort_interval = 480000
    tcp_ip_notify_cinterval = 10000
     tcp_ip_notify_interval = 10000
tcp_rexmit_interval_initial = 3000
    tcp_rexmit_interval_max = 240000
    tcp_rexmit_interval_min = 2000
  tcp_deferred_ack_interval = 500
  tcp_rexmit_interval_extra = 0

2.2 Setup of the test environment and experiment

A # ifconfig hme0:1 192.168.4.18 down
B # route add 192.168.4.18 A 1

The setup is similar to the previous one, except that I now use another (new) private address on host A and add a route to it on host B.

B $ gtod ; telnet 192.168.4.18 daytime ; gtod
16:58:42.559
Trying 192.168.4.18...
telnet: Unable to connect to remote host: No route to host
17:01:52.135

B # tcpdump -Ni hme0 host 192.168.4.18
tcpdump: listening on hme0
16:58:42.621773 B.9609 > 192.168.4.18.daytime: S 2960697070:2960697070(0) win 49640 <mss 1460> (DF)
16:58:46.114378 B.9609 > 192.168.4.18.daytime: S 2960697070:2960697070(0) win 49640 <mss 1460> (DF)
16:58:52.114612 B.9609 > 192.168.4.18.daytime: S 2960697070:2960697070(0) win 49640 <mss 1460> (DF)
16:59:04.115096 B.9609 > 192.168.4.18.daytime: S 2960697070:2960697070(0) win 49640 <mss 1460> (DF)
16:59:28.115897 B.9609 > 192.168.4.18.daytime: S 2960697070:2960697070(0) win 49640 <mss 1460> (DF)
17:00:16.117663 B.9609 > 192.168.4.18.daytime: S 2960697070:2960697070(0) win 49640 <mss 1460> (DF)

For a change, telnet dies with a different error message. When I try to ping the down interface on host A from host B, the default route is used instead of the host route, and the router complains that the host is unreachable. In the mirrored situation, trying to reach B from host A with ping, Solaris 2.5.1 only tells me that there are no replies to my probes.

The first retransmission takes place after 3.5 seconds, which looks as if this is the sum of tcp_rexmit_interval_initial and tcp_rexmit_interval_min. But as further experimentation showed, the 0.5 second part is the anonymous constant C again. In Solaris 2.6 the factor four is no longer present. The final timeout on the connection is a little above the configured 180 seconds.

2.3 Retesting with different parameters

tcp_rexmit_interval_initial = 500
    tcp_rexmit_interval_min = 5000
    tcp_rexmit_interval_max = 20000
     tcp_ip_abort_cinterval = 120000

The setting of tcp_rexmit_interval_min didn't seem particularly important for connection initiation on Solaris 2.6, either. To stay comparable with the first set of experiments, the same parameters are adjusted, even though not all of the original reasons still apply to Solaris 2.6.

B $ gtod ; telnet 192.168.4.18 daytime ; gtod
11:10:10.384
Trying 192.168.4.18...
telnet: Unable to connect to remote host: No route to host
11:12:21.425

B # tcpdump -Ni hme0 host 192.168.4.18
tcpdump: listening on hme0
11:10:10.412102 B.32849 > 192.168.4.18.daytime: S 3156818858:3156818858(0) win 49640 <mss 1460> (DF)
11:10:11.404921 B.32849 > 192.168.4.18.daytime: S 3156818858:3156818858(0) win 49640 <mss 1460> (DF)
11:10:21.406536 B.32849 > 192.168.4.18.daytime: S 3156818858:3156818858(0) win 49640 <mss 1460> (DF)
11:10:41.406214 B.32849 > 192.168.4.18.daytime: S 3156818858:3156818858(0) win 49640 <mss 1460> (DF)
11:11:01.406879 B.32849 > 192.168.4.18.daytime: S 3156818858:3156818858(0) win 49640 <mss 1460> (DF)
11:11:21.407646 B.32849 > 192.168.4.18.daytime: S 3156818858:3156818858(0) win 49640 <mss 1460> (DF)
11:11:41.408414 B.32849 > 192.168.4.18.daytime: S 3156818858:3156818858(0) win 49640 <mss 1460> (DF)
11:12:01.409192 B.32849 > 192.168.4.18.daytime: S 3156818858:3156818858(0) win 49640 <mss 1460> (DF)

Again, the final connection timeout stops telnet a little over 10 seconds beyond the configured tcp_ip_abort_cinterval. tcp_ip_notify_cinterval does not play a part in that particular story, as further experiments revealed.

  1. retransmission after 0.993 seconds (initial + C).
  2. retransmission after 10.002 seconds (2 * tcp_rexmit_interval_min).
  3. retransmission after 20.000 seconds (doubled, and maximum).
  4. retransmission after 20.000 seconds (at maximum, no growth).
  5. retransmission after 20.000 seconds.
  6. retransmission after 20.000 seconds.
  7. retransmission after 20.000 seconds.

As usual, we are interested in the interval between two connection attempts. This time, tcp_rexmit_interval_min does play its stated role in determining the length of the retransmission interval. My guess is that, with TCP being part of the kernel again in 2.6, Solaris is able to time things much more accurately.
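
Both Solaris 2.6 traces on this page can be reproduced by one simple model, sketched in Python below. It is fitted to the measurements only and is not taken from the Solaris sources. The assumptions are: the first interval is tcp_rexmit_interval_initial plus a constant C of about 500 ms, and every later interval is the larger of the initial and minimum settings, doubled per retransmission and capped at tcp_rexmit_interval_max. The function name and the c_ms parameter are made up for illustration.

```python
def syn_backoff_26(initial_ms, min_ms, max_ms, count, c_ms=500):
    """Model of the Solaris 2.6 SYN retransmission intervals, in seconds.

    Assumption fitted to the two traces above, not a confirmed algorithm:
      interval 1 = initial + C
      interval n = min(max(initial, min) * 2**(n-1), max)   for n >= 2
    """
    if count <= 0:
        return []
    base_ms = max(initial_ms, min_ms)
    intervals = [(initial_ms + c_ms) / 1000.0]
    for n in range(1, count):
        intervals.append(min(base_ms * 2 ** n, max_ms) / 1000.0)
    return intervals


if __name__ == "__main__":
    # Default settings from section 2.1.
    print(syn_backoff_26(3000, 2000, 240000, 5))
    # Modified settings from section 2.3.
    print(syn_backoff_26(500, 5000, 20000, 7))
```

For the default settings the model gives 3.5, 6, 12, 24 and 48 seconds, and for the modified settings 1, 10, 20, 20, 20, 20 and 20 seconds, which agrees with both tcpdump traces to within a few milliseconds.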

3. Solaris 2.5.1 data retransmission test

[TBD]

4. Solaris 2.6 data retransmission test

[TBD]


Please send your suggestions, bugfixes, comments, and ideas for new items to voeckler@rvs.uni-hannover.de
In hope of supplying useful information, Jens-S. Vöckler

Last Modified: Tuesday, 22-Sep-1998 10:20:23 MET DST