
Connection Management in Transport Protocols

Carl A. Sunshine
Information Sciences Department, The Rand Corporation,
1700 Main Street, Santa Monica, California 90406, USA

Yogen K. Dalal
Xerox Systems Development Department, 3408 Hillview
Avenue, Palo Alto, California 94304, USA

Transport protocols are designed to provide fully reliable communication between processes which must communicate over a less reliable medium such as a packet switching network (which may damage, lose, or duplicate packets, or deliver them out of order). This is typically accomplished by assigning a sequence number and checksum to each packet transmitted, and retransmitting any packets not positively acknowledged by the other side. The use of such mechanisms requires the maintenance of state information describing the progress of data exchange. The initialization and maintenance of this state information constitutes a connection between the two processes, provided by the transport protocol programs on each side of the connection. Since a connection requires significant resources, it is desirable to maintain a connection only while processes are communicating. This requires mechanisms for opening a connection when needed, and for closing a connection after ensuring that all user data have been properly exchanged. These connection management procedures form the main subject of this paper. Mechanisms for establishing connections, terminating connections, recovering from crashes or failures of either side, and for resynchronizing a connection are presented. Connection management functions are intimately involved in protocol reliability, and if not designed properly may result in deadlocks or old data being erroneously delivered in place of current data. Some protocol modeling techniques useful in analyzing connection management are discussed, using verification of connection establishment as an example. The paper is based on experience with the Transmission Control Protocol (TCP), and examples throughout the paper are taken from TCP.

Carl Sunshine received a PhD in computer science from Stanford University in 1975 where he worked on analysis, design, and implementation of communication protocols for computer networks. Since 1975 he has been with the Rand Corporation, where he is involved in research on computer network protocols, network interconnection, network planning, distributed systems, and operating systems. Dr. Sunshine is active in IFIP TC6.1, the Internetwork Working Group.

Yogen K. Dalal received his BTech degree in Electrical Engineering from the Indian Institute of Technology, Bombay, in 1972, and his MS and PhD degrees in Electrical Engineering from Stanford University in 1973 and 1977 respectively. He is currently with the Xerox Systems Development Department in Palo Alto working on the design, analysis and implementation of computer communication protocols, and the interconnection of computer networks. His research interests also include local networks, distributed systems architecture, packet switching, broadcast protocols and operating systems. Dr. Dalal is a member of the ACM and IEEE.

Keywords: Computer network, host-to-host protocol, transport protocols, interprocess communication, connections, connection management, reliability, synchronization, 3 way handshake, resynchronization, correctness, verification.

North-Holland Publishing Company
Computer Networks 2 (1978) 454-473
C.A. Sunshine, Y.K. Dalal / Connection management in transport protocols

1. Introduction

Distributed computing systems, consisting of computers connected to one another by a communication network, require processes to interact by means of explicit interprocess communication protocols. These protocols are often composed of several layers, with higher layers making use of the functions performed by lower layers. A typical protocol architecture consists of a partially reliable "best effort" communication service at the lowest level, followed by a general purpose fully reliable transport protocol, and finally various higher level protocols that provide specialized services such as file transfer, remote job entry, interactive terminal support, graphics, and resource coordination. Each protocol layer provides an augmented service by using appropriate mechanisms on top of the lower level protocol.
This paper focuses on transport protocols (TP) for use on packet switching networks which may damage, duplicate, lose, and delay packets, or deliver them in a different sequence than submitted. The basic technique used by transport protocols to achieve reliable communication over such an unreliable transmission medium is to assign a sequence number and checksum to each packet transmitted, to verify the checksum and positively acknowledge successfully received packets, and to retransmit packets that remain unacknowledged beyond a timeout period. Further discussion on the motivation for and use of these techniques in transport protocols may be found in refs. [4,17,22]. Transport protocols can also perform a multiplexing function to allow many processes in a host to share the network access path.
The salient point for this paper is that the use of
these mechanisms requires the maintenance of state
information describing the progress of data exchange.
The initialization and maintenance of this state
information constitutes a connection between the
two processes, provided by the protocol machines on
each side of the connection. Connection oriented
protocols typically include an initialization phase
during which necessary parameters (e.g. sequence
numbers) are synchronized at each end, a data
transfer phase, and a termination phase. Such connection oriented or virtual circuit protocols are in
contrast to message oriented interprocess communication protocols [25] which are inherently unreliable.
Although a connection potentially exists between
every pair of processes, only some processes will need
to converse at any given time. Since a connection
requires significant resources, it is desirable to maintain a connection only while processes are communicating, and then to terminate the connection and free the resources when the processes are done. This leads to mechanisms for opening a connection when needed, and for closing a connection after ensuring that all user data have been properly exchanged.
In opening and then closing a connection, it is convenient to describe the protocol machines as going through a number of states. Each machine starts in the NotActive state in which no connection exists. On command from the user process, a machine may actively initiate connection establishment (Opening state), or passively wait for connection establishment to be initiated from the remote end (Listening state). Once the connection is established (Established state), either user may terminate it, placing the protocol machine in a Closing state. Once the connection has been closed, the protocol machine is in the NotActive state again. Figure 1 illustrates the life-cycle of a connection as it passes from one state to another. The Opening and Closing states may in reality consist of several substates depending on the protocol used for accomplishing the desired effects.
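The life-cycle just described can be sketched as a small state table. This is only an illustrative sketch: the state names and the user commands OPEN, LISTEN, and CLOSE come from the text, while the event name `exchange_done` (standing for the completion of an opening or closing packet exchange) and the helper functions `step` and `run` are assumptions introduced here.

```python
from enum import Enum


class State(Enum):
    NOT_ACTIVE = "NotActive"
    OPENING = "Opening"
    LISTENING = "Listening"
    ESTABLISHED = "Established"
    CLOSING = "Closing"


# (current state, event) -> next state.
# "exchange_done" is a hypothetical event name for a completed packet exchange.
TRANSITIONS = {
    (State.NOT_ACTIVE, "OPEN"): State.OPENING,
    (State.NOT_ACTIVE, "LISTEN"): State.LISTENING,
    (State.OPENING, "exchange_done"): State.ESTABLISHED,
    (State.LISTENING, "exchange_done"): State.ESTABLISHED,
    (State.ESTABLISHED, "CLOSE"): State.CLOSING,
    (State.CLOSING, "exchange_done"): State.NOT_ACTIVE,
}


def step(state, event):
    """Return the next state, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state.value}")


def run(events, state=State.NOT_ACTIVE):
    """Apply a sequence of events starting from the given state."""
    for event in events:
        state = step(state, event)
    return state
```

For example, `run(["OPEN", "exchange_done", "CLOSE", "exchange_done"])` walks one full life-cycle and ends in `State.NOT_ACTIVE` again. The substates of Opening and Closing mentioned above are deliberately left out.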
These connection management procedures form the main subject of this paper. They include procedures for dealing with crashes or failures of either side of a connection as well as opening and closing connections under normal circumstances. This paper does not discuss the multiplexing function of transport protocols, which is independent of connection management functions for individual connections. Connection management functions are intimately involved in protocol reliability, and if not designed properly may result in deadlocks or old data being erroneously accepted in place of current data.

[Figure 1: state diagram with the states NotActive, Opening, Listening, Established, and Closing; transitions are labeled OPEN, LISTEN, CLOSE, and packet exchange.]
Fig. 1. Connection life-cycle.

This paper is based on our experience with the Transmission Control Program (TCP) [5,7,8]. Our efforts in designing TCP have resulted in continuing changes to the original Internetwork Protocol described by Cerf and Kahn [4]. We will use TCP to designate the class of transport protocols based on the Internetwork Protocol. Lessons are drawn from various stages of TCP development throughout the paper, so the reader is cautioned that some examples do not reflect the current TCP. In particular, the formal analysis in Section 6 is based on an early version of TCP. We will use TP and TCP to stand for both the protocol and the program that implements it.
TCP was designed with strong "worst case" assumptions about the underlying transmission medium. In particular it was assumed that packets could be damaged, lost, duplicated, and delivered out of order, with a widely varying delay. All of these events are known to occur through various "natural" causes in packet switching networks. However, TCP was not designed to solve the additional problems imposed by "malicious intruders" [15] which require procedures involving encryption.
In considering connection management mechanisms within a TP, it will be helpful to use examples of the dialogue between two TPs. All of our examples will be drawn from TCP, but similar scenarios exist in other transport protocols. These dialogues consist of the exchange of packets between TCPs A and B and will be illustrated as follows.
Each packet contains a sequence number, some optional control information, an optional acknowledgement, and some optional data, represented in the following notation:
(Seq #)(Control)(Ack #)(data).
Each line of the dialogue consists of a packet label in parentheses followed by the activity at A, where "<--" signifies the packet being received at A, "-->" signifies the packet being transmitted at A, "....." signifies that A is unaware of the packet at that time, and "|" indicates no activity. Next appears a description of the packet in the notation described earlier. Lastly, the activity at process B is described, where "-->" signifies that the packet is received at B, "<--" signifies that the packet is transmitted at B, and "....." signifies that B is unaware of the packet at this particular time. Comments appear on the extreme right. The vertical direction towards the bottom of the page represents the time axis.
The label of a packet is simply a reference for purposes of discussion. When a packet is duplicated or retransmitted, it is given the same label, the superscript after the closing parenthesis indicating the number of the copy. For brevity, the original copy of the packet does not contain the superscript 1. Packets that are damaged or lost by the communication system will have their label superscripted with an asterisk. When the delay between transmission and reception of a packet is unimportant, the packet will be shown to be transmitted and received on the same line. This notation is a modification of the notation introduced by Tomlinson [24]. When a packet arrives at a TP it may cause a state change at that TP, and in addition may cause the TP to perform some actions. In examples of dialogues between two TPs we will use a small font to represent the state of the TP and a normal font to describe actions a TP may perform. User process commands to the TP will be shown USING THIS FONT. Figure 2 illustrates these conventions.

2. Connection establishment
Since transport protocols use sequence numbers to validate arriving packets (check for duplicates and correct order), one of the main functions of connection establishment is to properly initialize sequence number parameters used by the protocol. Each TP has a Next Sequence Number (NSN) that it will assign to the next (new) packet transmitted, and an Expected Sequence Number (ESN) which it expects to see on the next arriving packet. NSN must be initialized to some Initial Sequence Number (ISN) when a connection is started, and ESN in the opposite TP must be set equal to ISN. Sequence numbers may be assigned to various size units of information from a full packet to a bit. In TCP, each octet is assigned a sequence number, and each packet carries the sequence number of the first octet contained and a count of the number of octets in the packet. This facilitates fragmentation of packets at an intermediate point and their later reassembly at their destination as described in [4].
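The relationship between NSN, ESN, and per-octet numbering can be illustrated with a minimal sketch. The class names `Packet`, `Sender`, and `Receiver` are hypothetical and introduced only for this example; acknowledgements, checksums, retransmission, and sequence number wraparound are all omitted.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int     # sequence number of the first octet carried
    data: bytes  # len(data) is the octet count


class Sender:
    def __init__(self, isn):
        self.nsn = isn  # Next Sequence Number, initialized to the ISN

    def make_packet(self, data):
        packet = Packet(self.nsn, data)
        self.nsn += len(data)  # one sequence number per octet
        return packet


class Receiver:
    def __init__(self, isn):
        self.esn = isn  # Expected Sequence Number, set equal to the sender's ISN
        self.delivered = b""

    def receive(self, packet):
        """Accept the packet only if it starts at the expected sequence number."""
        if packet.seq != self.esn:
            return False  # duplicate or out of order
        self.delivered += packet.data
        self.esn += len(packet.data)
        return True
```

With both sides initialized to ISN 0, sending "ABC" and then "DE" produces packets with sequence numbers 0 and 3, and a second copy of the "DE" packet is rejected because ESN has already advanced to 5.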

       A                                         B
       OPEN                                      LISTEN
       Opening                                   Listening
(1)    -->     (Seq 0)(data ABC)        -->      accept
(2)    <--     (Seq 0)(Ack 3)           <--
(3)    -->     (Seq 3)(data DE)         .....    delayed
(3)^2  -->     (Seq 3)(data DE)         -->      retransmission
(4)    <--     (Seq 0)(Ack 5)           <--
       CLOSE                                     CLOSE        terminate connection
       Not active                                Not active
       OPEN                                      LISTEN       start new connection
       Opening                                   Listening
(5)    -->     (Seq 0)(data GHI)        -->      accept
(6)    <--     (Seq 0)(Ack 3)           <--
(7)    -->     (Seq 3)(data JK)         .....    new data delayed
(3)    .....   (Seq 3)(data DE)         -->      old duplicate arrives
(8)    <--     (Seq 0)(Ack 5)           <--      old data accepted

Fig. 2. Old data delivered instead of current data.

Many transport protocols employ the simple approach of using a fixed number (zero) for ISN (and ESN) whenever a connection is established. This is satisfactory for opening a single instance of a connection, but if the connection is opened and closed in rapid succession then such a strategy may lead to errors as shown in figure 2 when packets are retransmitted or duplicates are generated inside the transmission medium.
If the duplicate packet is delayed in the network
until another connection has started between the
processes, it may look like a valid new data packet
and be accepted, causing the data to be delivered
twice. Solutions to this problem revolve around a
more careful selection of ISN for new connections.
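The failure in figure 2 can be replayed in a few lines. This is a simplified sketch under stated assumptions: the helper `receive` is hypothetical, it models only the receiver's check against ESN, and the packet contents follow the figure.

```python
def receive(esn, delivered, seq, data):
    """Accept a packet only if it starts at the expected sequence number.

    Returns the updated ESN, the updated delivered data, and whether
    the packet was accepted.
    """
    if seq != esn:
        return esn, delivered, False
    return esn + len(data), delivered + data, True


# First connection: ISN (and therefore ESN) is fixed at zero.
esn, out = 0, ""
esn, out, _ = receive(esn, out, 0, "ABC")
# The retransmitted copy of "DE" arrives; the original copy is still in the network.
esn, out, _ = receive(esn, out, 3, "DE")
first_connection = out

# The connection is closed and reopened, so ESN starts from zero again.
esn, out = 0, ""
esn, out, _ = receive(esn, out, 0, "GHI")
# The delayed original copy of "DE" from the old connection arrives first.
esn, out, accepted_old = receive(esn, out, 3, "DE")
# The genuine new data "JK" arrives afterwards and no longer matches ESN.
esn, out, accepted_new = receive(esn, out, 3, "JK")
second_connection = out
```

After this runs, `second_connection` holds "GHIDE": the old duplicate was accepted and the current data "JK" was rejected, which is the error the fixed ISN allows.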

2.1. ISN selection


The fundamental problem shown in fig. 2 stems from two causes: (1) the possibility for the network to delay and subsequently deliver duplicate packets out of order, and (2) the inability of the TP to differentiate packets on an old connection from packets on a current connection between the same pair of processes. Accordingly, there are two avenues for solution: (1) prevent packets from old connections from arriving during later connections, or (2) make sure that old packets can be distinguished from new packets.
Solutions of the first type require that a packet may not be stored in the transmission medium beyond a maximum packet lifetime L. This may be enforced by using "self-destructing" packets [12], or by the physical characteristics of the medium. If no connection is opened before time L after its last closing, all old packets will be gone, and any ISN may be used to initialize the connection.
This solution requires TPs to remember for time L that a connection was closed, and hence runs counter to the goal of minimizing state information maintained for inactive connections. Furthermore, if a TP fails and forgets which connections were recently closed, it must prevent opening connections for all its processes for time L since any of them might have recently closed connections. The cost of this type of solution is then storage of status information, and delay in reestablishing connections after failures. When L is large, as in multinetwork systems, these costs may be high.
Ways to achieve the second type of solution require a careful selection of ISN or some other identifier to unambiguously distinguish packets of different connections. The other identifier may be an incarnation number [13,22] which is derived from some global counter. If the counter has cycle time greater than L, no confusion is possible. However, another field on every packet sent is required, increasing overhead. The address field may also serve as a unique identifier if a new port address is used by a process each time it opens a connection [4,19]. Some port addresses may be reused to facilitate addressing "well-known" services, but two reusable ports can never be connected, guaranteeing that the pair of port addresses will be unique for every connection.
If the ISN is used to distinguish packets from different connections, it must be carefully selected based on memory of previous connection sequence numbers, or based on a clock. In the memory approach, the ISN is set to the last sequence number used in the previous connection, plus one. This requires maintaining state information for inactive connections because the last sequence number used must be remembered for time L on every connection. Once time L has passed, any value for ISN may be used. In the clock approach, the ISN is set from a single clock for all connections at a host. The clock value is the only state information that must be preserved through inactive connections and host crashes. However, use of a cyclic clock requires resetting the sequence number if the clock is about to "catch up" with the sequence number as described in Section 3. The process of resetting the sequence number is called resynchronization. An additional cost of this mechanism is testing to see if it is time to resynchronize the connection.
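The memory approach to ISN selection can be sketched as follows. The class `IsnMemory`, its method names, the connection identifier, and the 32 bit modulus are assumptions made for illustration; the only behavior taken from the text is that the last sequence number is remembered for time L and the next ISN is that number plus one.

```python
class IsnMemory:
    """Remembers the last sequence number of closed connections for time L."""

    def __init__(self, lifetime, modulus=2 ** 32):
        self.lifetime = lifetime  # maximum packet lifetime L, in seconds
        self.modulus = modulus    # size of the sequence space (assumed 32 bits)
        self.closed = {}          # connection id -> (last seq used, close time)

    def close(self, conn, last_seq, now):
        """Record the last sequence number used when a connection closes."""
        self.closed[conn] = (last_seq, now)

    def pick_isn(self, conn, now):
        """Pick an ISN that cannot collide with packets still in the network."""
        record = self.closed.get(conn)
        if record is not None:
            last_seq, closed_at = record
            if now - closed_at < self.lifetime:
                return (last_seq + 1) % self.modulus
            del self.closed[conn]  # older than L: the record is no longer needed
        return 0  # once L has passed, any value may be used
```

A connection closed at sequence number 4 and reopened 10 seconds later with L = 30 gets ISN 5, while the same connection reopened after L has elapsed may start from any value (zero here). The sketch also shows the cost named above: one record per recently closed connection must survive until L has passed.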

In general, all solutions of the second type may fail if the state information which distinguished connections is lost. In this case the TP must resort to a solution of the first type and wait time L before initiating new connections. To reduce the likelihood of failure, the state information can be reduced to a minimum and maintained by some specially reliable mechanism like an independent clock or counter.
A combination of mechanisms is also possible. For example, Garlick et al. [13] propose remembering sequence numbers for a time L after closing connections, plus a clock-based incarnation number that is changed only after crashes and thus avoids the need to wait before restarting.

2.2. Validation of connection requests

The techniques for ISN selection described above define procedures for the sending portion of TP that will allow the receiving modules to correctly reject old packets. Once a connection is established, the ESN, in combination with the incarnation number or unique port address pair if they are used, allows the receiver to tell old packets from new ones. The major function of connection establishment is to set ESN and any additional connection identifiers in the receiving modules to match ISN values used by the sending modules on the other side of the connection.
To accomplish this, each TP may try to maintain enough state information about closed connections to set its own ESN and recognize old packets. This would require remembering sequence numbers and/or incarnation numbers for every connection for a time L [12], and may again impose an unacceptable burden. In the event of failure with loss of memory, no connections may be accepted for time L.

       A                                                  B
       OPEN                                               LISTEN
       Opening                                            Listening
       pick ISN = x
(1)    -->     (Seq x)(SYN)                       -->     set ESN = x + 1
(2)    <--     (Ack x + 1)                        <--     pick ISN = y
(3)    <--     (Seq y)(SYN)                       <--
       set ESN = y + 1
(4)    -->     (Seq x + 1)(Ack y + 1)(data AB)    -->
       Established                                        Established

Fig. 3. Simple connection establishment using SYN control packet.

Alternatively, the sending modules responsible for selecting ISN may also inform the remote TP of the ISN they will use for each new connection. The sending modules transmit a Synchronization control packet (SYN) containing the ISN value as the first packet on the connection (Figure 3). The SYN packet is assigned a sequence number and must be acknowledged just as a data packet to ensure its reliable transmission. The receiving module can set ESN to this value without maintaining any state information about the connection. The receiving TP returns a SYN giving its own ISN, or can reject any SYN that arrives when the protocol machine is in an inappropriate state. Inappropriately timed arrivals are either old retransmissions, protocol errors, or attempts to establish a conversation with an unwilling partner.
Unfortunately, this simple system of a credulous TP is inadequate when packets may arrive out of order. Once the connection is established, sequence numbers serve to validate incoming packets. But there is no mechanism for validating an arriving SYN control packet. Suppose a SYN packet is delayed in the network and retransmitted by A. It may arrive at B just at the moment when B is ready to establish a new connection (Figure 4). B will accept the SYN as a new connection request, reply with its own SYN, and consider the connection established. A will receive the replying SYN and interpret it as a new connection request. A's attempt to reply will be discarded by B who thinks the connection is already established, and a deadlock will occur which prevents successful data transfer.
2.3. Three way handshake

To avoid this problem, a more reliable means of transmitting the current ISN to a remote TP must be used. Tomlinson [24] has presented such a scheme called the 3 way handshake which is used in TCP. Instead of simply accepting an arriving SYN, the receiving TP must ask the sending TP to verify the SYN as current. The receiving TP returns a SYN-ACK control packet to the sending TP which refers to the ISN from the SYN (Figure 5). If the SYN was a current packet, the sender returns a positive acknowledgement, and only then does the receiver accept the SYN and set ESN. This synchronization must occur in both directions, with the SYN-ACK also carrying the ISN of the receiver in the other direction. Any packet of the 3 way handshake may also carry data which will be processed after the connection is established, but this is omitted from the figures for simplicity.
If B receives an old SYN packet, it returns a SYN-ACK referencing the ISN in the old packet (Figure 6). A returns a negative acknowledgement (Reset packet) since it is not trying to initiate a connection. Upon receiving the Reset (RST), B knows that the original SYN was an old packet, so B returns to the Opening or Listening state. A similar scenario takes care of old SYN-ACK packets. The use of RST packets is described in greater detail in section 5 on recovering from crashes.

       A                                             B
       Listening                                     Listening
(1)    .....   (Seq z)(SYN)                  -->     old connection request
                                                     set ESN = z + 1
(2)    <--     (Ack z + 1)                   <--
       discard                                       pick ISN = y
(3)    <--     (Seq y)(SYN)                  <--
       set ESN = y + 1
(4)    -->     (Ack y + 1)                   -->
       pick ISN = x                                  Established
(5)    -->     (Seq x)(SYN)                  -->     discard
(6)    <--     (Seq y + 1)(Ack z + 1)        <--
       discard

Fig. 4. Deadlock with simple establishment.

       A                                             B
       OPEN                                          LISTEN
       Opening                                       Listening
       pick ISN = x
(1)    -->     (Seq x)(SYN)                  -->     remember x
                                                     pick ISN = y
(2)    <--     (Seq y)(SYN)(Ack x + 1)       <--     wait for verification
       set ESN = y + 1
       Established
(3)    -->     (Seq x + 1)(Ack y + 1)        -->     set ESN = x + 1
                                                     Established

Fig. 5. Three way handshake connection establishment.

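The verification step of the 3 way handshake can be sketched with two small functions, one for each side. This is an illustrative model only: the function names, the tuple layout of packets, and the state strings are assumptions, and timeouts, retransmission, and data carried in handshake packets are left out.

```python
def initiator_reply(a_state, a_isn, syn_ack):
    """A's response to a SYN-ACK, given as (seq, ack).

    Returns (kind, seq, ack). A confirms only a SYN-ACK that acknowledges
    the ISN it is currently trying to open with; anything else gets a RST.
    """
    seq_y, ack = syn_ack
    if a_state == "Opening" and ack == a_isn + 1:
        return ("ACK", a_isn + 1, seq_y + 1)  # the SYN was current
    return ("RST", ack, seq_y + 1)            # the SYN was old: reject it


def responder(syn_seq, b_isn, reply_fn):
    """B receives a SYN, answers with a SYN-ACK, and waits for verification.

    reply_fn stands in for the network round trip to A. Returns B's new
    state and the ESN it set (None if the SYN was rejected).
    """
    remembered = syn_seq  # remember the offered ISN, but do not set ESN yet
    kind, seq, ack = reply_fn((b_isn, remembered + 1))
    if kind == "ACK" and seq == remembered + 1 and ack == b_isn + 1:
        return "Established", remembered + 1  # verified: set ESN
    return "Listening", None                  # RST received: SYN was old
```

With A opening at ISN 100 and B choosing ISN 300, `responder(100, 300, lambda sa: initiator_reply("Opening", 100, sa))` ends with B Established and ESN 101. An old SYN with sequence number 42 arriving while A is not active is answered with a RST, and B returns to Listening without ever setting ESN.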
The basic 3 way handshake mechanism for establishing connections is inherently asymmetric, with one side initiating the attempt by sending a SYN, and the other side waiting to respond to a SYN from the active side. Occasionally a pair of processes may simultaneously attempt to initiate a connection to each other, in which case a "collision" is said to occur. Early versions of TCP simply gave up and retried after a random delay to resolve such collisions. More recent versions of TCP will accept a SYN packet after having sent their own SYN by replying with an ACK to successfully establish the connection. However, if two different SYN packets (one an old duplicate) arrive while a connection is being established, the approach of retrying after a delay may still have to be used.

The 3 way handshake mechanism is adequate to reliably establish connections under the worst case assumptions of packet switching network transmission characteristics without requiring any memory of previous connections to set ESN. It also accommodates failure recovery as discussed in Section 5. Techniques for verifying these mechanisms more rigorously than the informal discussion in this section has allowed are presented in Section 6.
Incarnation numbers are essentially equivalent to sequence numbers for purposes of connection establishment, and the 3 way handshake may also be used to reliably exchange incarnation numbers [13]. Other mechanisms requiring memory of previous activity by the receiving modules [12], or use of unique port addresses [19] may provide the required reliability without a 3 way handshake. If end-to-end encryption is incorporated into the transport protocol, then the initialization of a new encryption key for each connection serves a similar role to the 3 way handshake in guaranteeing that old packets will be rejected [15].

       A                                                 B
       Not active                                        Listening
(1)    .....   (Seq z)(SYN)                      -->     old connection request
                                                         remember z
                                                         pick ISN = y
(2)    <--     (Seq y)(SYN)(Ack z + 1)           <--     wait for verification
       invalid
(3)    -->     (Seq z + 1)(RST)(Ack y + 1)       -->     reject SYN
       Not active                                        Listening

Fig. 6. Rejection of old SYN packet with 3 way handshake.

3. Clock driven ISN and resynchronization

A basic goal of the ISN selection mechanism is to prevent packets from being emitted with sequence numbers which duplicate those that are still in the multinetwork. This should be assured even when TPs crash and lose all knowledge of the sequence numbers they have been using. In TCP, when new connections are created, an ISN generator is employed which selects a new 32 bit ISN. The generator is bound to a (possibly fictitious) clock that is assumed to keep running even if the TCP or its supporting host crashes. We now describe the implementation of such a clock driven scheme, and the necessary relationships between the clock tick, the maximum packet lifetime L, the size of the sequence number, the transmission rate, and the notion of resynchronization [6,9].
Refer to Fig. 7 for a pictorial representation of the analysis to follow. Let D = duration of the clock tick in seconds, C = period of the clock in seconds, L = maximum lifetime of packets in the multinetwork in seconds, R = time in seconds until resynchronization is necessary under average bandwidth b, b = average bandwidth of transmission in octets/sec, B = maximum average bandwidth in octets/sec, S = length of sequence numbers in bits, p = length of clock time in bits (p < S), q = S - p.
The figure illustrates a curve, the ISN curve, representing a clock ticking every D units of time, and wrapping around in a time C. The ISN + 1 curve is a curve similar to the ISN curve, but displaced by D, and the forbidden zone boundary is another curve similar to the ISN curve, but displaced from it by a time L + D. Sequence numbers are S bits long. (This theory and analysis holds for any radix system; we have chosen binary only for convenience.) When a connection is to be established, a unique ISN is picked, and from then on sequence numbers are incremented as and when data is transmitted. In the proposed scheme [24], when a connection is to be established, the clock is allowed to tick at least once, and then the ISN is a number which has the current value of the clock in the high order p bits and zeros in the low order q bits of S (i.e. the ISN is picked from the ISN curve). The actual rate b at which data is transmitted, and sequence numbers assigned, is illustrated by the curve marked actual transmission rate. The connection must be resynchronized when the actual transmission rate curve touches the forbidden zone boundary. Otherwise should a connection be closed and reopened within the forbidden zone, then old duplicate packets might be accepted by the receiver, as follows (Figure 8).

[Figure 7: plot of sequence number s against time t, showing the ISN curve, the ISN + 1 curve, the forbidden zone boundary, the forbidden zone between them, the actual transmission rate curve, and the point at which the connection must resynchronize.]
Fig. 7. The forbidden zone and ISN curve.

Suppose that at time t_b, sequence numbers s_a through s_b have been used, at which point the connection is closed. A new connection is started at time t_0 using ISN s_0 derived from the clock. But packets carrying sequence numbers s_a through s_b can persist in the net for L seconds, and the new connection can use sequence numbers s_0 through s_1 during time t_0 to t_1. Since this includes s_a to s_b, the old duplicates can be confused with new packets. The same arguments hold for the case where the transmitting TCP crashed while generating packets with sequence numbers in the forbidden zone, and then immediately restarted.

[Figure 8: plot of sequence number s against time t, showing the ISN curve, the ISN + 1 curve, the forbidden zone boundary, the actual transmission curve, the interval L, the sequence numbers s_0, s_1, s_a, s_b, and the times t_0, t_b, t_1.]
Fig. 8. The need to resynchronize.
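The clock driven ISN selection described in this section, with the clock value in the high order p bits and zeros in the low order q bits, can be sketched as below. The values chosen for p and D are purely illustrative assumptions and are not the parameters proposed in the paper; the function names are likewise hypothetical, and the caller is assumed to have let the clock tick at least once since the last connection.

```python
S_BITS = 32               # sequence number length S (TCP's ISN is 32 bits)
P_BITS = 16               # clock length p; an illustrative choice, p < S
Q_BITS = S_BITS - P_BITS  # q = S - p
TICK = 0.25               # clock tick duration D in seconds; illustrative


def clock_value(t):
    """Value of the cyclic p bit clock at time t (seconds)."""
    return int(t // TICK) % (1 << P_BITS)


def pick_isn(t):
    """ISN with the clock in the high order p bits and zeros in the low q bits."""
    return clock_value(t) << Q_BITS


def clock_period():
    """Clock period C: the time for the p bit clock to wrap around."""
    return TICK * (1 << P_BITS)
```

Each tick advances the ISN by 2^q, so a connection may use up to 2^q sequence numbers per tick before it overtakes the ISN curve. With these illustrative values the clock wraps after `clock_period()` = 16384 seconds, after which the same ISN values recur, which is why a long-lived, slow connection eventually has to resynchronize.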
This example shows that the TCP must not be allowed to generate packets with sequence numbers in the forbidden zone. This can happen (1) if the clock catches up with the TCP, thus causing entry into the forbidden zone from the left, or (2) if the TCP transmits too fast and enters the forbidden zone from the bottom. Therefore two tests are required:
1) Is the sequence number of the packet about to be transmitted (NSN) in the forbidden zone to the right?
2) Will any of the sequence numbers assigned to the data in the packet lie inside the forbidden zone (i.e. will any of the sequence numbers used exceed the maximum given by the ISN + 1 curve at this instant)?
If the first condition is true, then the NSN at the transmitting TCP must be resynchronized in order to permit continued transmission without assigning sequence numbers in the forbidden zone. A protocol by which this can be achieved can be found in [7]. Since the protocol involves the exchange of control packets, the forbidden zone boundary is made wider to allow the exchange to be completed before the real forbidden zone is entered. If the second condition is true, then the TCP must either delay transmitting the packet, or must transmit less data in the packet. If both conditions are false then the packet may be transmitted.
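The two tests can be sketched in modular arithmetic. This is not the formulation from the Appendix: it assumes the caller already knows the first and last forbidden sequence numbers at the current instant (here the hypothetical parameters `zone_low` and `zone_high`, which would be derived from the ISN + 1 curve and the forbidden zone boundary), and it treats the forbidden zone as a single cyclic range in a 32 bit sequence space.

```python
MOD = 1 << 32  # sequence space for S = 32 bits


def in_zone(seq, zone_low, zone_high):
    """True if seq lies in the cyclic range zone_low..zone_high (inclusive)."""
    return (seq - zone_low) % MOD <= (zone_high - zone_low) % MOD


def check_packet(nsn, length, zone_low, zone_high):
    """Decide what to do with a packet of `length` octets starting at NSN."""
    # Test 1: the clock has caught up and NSN itself is forbidden.
    if in_zone(nsn, zone_low, zone_high):
        return "resynchronize"
    # Test 2: the packet starts below the zone but would run into it.
    room = (zone_low - nsn) % MOD  # sequence numbers still free below the zone
    if length > room:
        return "delay or shorten"
    return "send"
```

For a zone covering 1000 through 2000, a packet of 10 octets at NSN 990 uses 990 through 999 and may be sent, 11 octets at the same NSN would touch 1000 and must be delayed or shortened, and any packet at NSN 1500 requires resynchronization first. The widened boundary used to leave time for the resynchronization exchange is not modeled.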
The test for resynchronization need only be made
when a packet is to be transmitted. Hence, if a
process does not transmit anything while the forbidden zone moves past its NSN, then no resynchronization is necessary. However, if a sender is inactive
for a period of time and then tries to transmit while
in the forbidden zone, the sender must be prohibited
from sending until the forbidden zone has been
crossed. Hence, some implementations may decide to
periodically check the need to resynchronize connections, or to perform resynchronization at fixed intervals such as C/2 to avoid such difficulties.
An exact formulation of rules (1) and (2) in terms
of the clock parameters is presented in the Appendix.
It is also shown that the time until resynchronization
is needed on a connection ranges from C - L for small
b, to infinity as b approaches B (if data transmission
proceeds at the clock rate, the clock never catches up
with packet sequence numbers). A set of reasonable
parameters for S = 32 bits and L = 30 seconds is
presented, allowing a maximum bandwidth of
MBits/sec with resynchronization required at most
every 4.5 hours.
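
The range quoted above follows from the Appendix relation R = (C - L)/(1 - b/B). The short Python sketch below evaluates that relation. It is illustrative only: the function name is hypothetical, C is taken to be the clock cycle time and L the maximum packet lifetime (which is how the surrounding text appears to use them), and the sample numbers are arbitrary assumptions, not the parameters proposed in the Appendix.

```python
import math

def time_until_resync(C, L, b, B):
    """Time R until resynchronization, from R = (C - L) / (1 - b/B).

    C: clock cycle time in seconds (assumed meaning)
    L: maximum packet lifetime in seconds (assumed meaning)
    b: average bandwidth actually used
    B: maximum bandwidth, i.e. the clock rate
    """
    if not 0 <= b <= B:
        raise ValueError("b must lie between 0 and B")
    if b == B:
        # Transmission keeps pace with the clock, so the clock never catches up.
        return math.inf
    return (C - L) / (1 - b / B)

# Arbitrary sample values (assumptions for illustration only).
for b in (0, 500, 1000):
    print(b, time_until_resync(C=16200, L=30, b=b, B=1000))
```

With b = 0 the result is C - L, and it grows without bound as b approaches B, matching the two end conditions stated in the text.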



4. Connection termination
Once a connection between two communicating
processes has become established, reliable communication can take place over it. Eventually one or both
of the processes will decide that the connection
should be closed because there is no more to say.
There are a number of different ways by which an
established connection may be closed.
4.1. Using a higher level protocol
The simplest way to close a connection is for both
processes to have decided, using a higher level protocol, that they are going to stop communicating and
then to inform their local TP that the present connection should no longer exist by giving the Abort
command. The TP would then simply remove knowledge of that connection from its local tables.
4.2. Using TP supplied mechanisms
Alternatively, the connection management aspect
of the TP could provide a mechanism by which one
of the processes can inform the other that communication over the connection is to cease. A special
control packet indicating that the conversation is
finished (FIN) is used to achieve this. A FIN is assigned
a sequence number for reliability, and travels in a
packet with no accompanying data or control. There
are three cases of connection closing as seen by a TP:
1) A user process initiates connection termination by
telling its TP to close the connection.
2) The remote TP initiates termination (on request
from the remote process) by sending a FIN packet.
3) Both users initiate termination simultaneously by
telling their respective TPs to close the connection.
There are a number of different protocols by which
FIN packets may be exchanged. These protocols must
accommodate all three cases described above. We now
describe some possible protocols and indicate their
suitability for various purposes.
4.3. Simple FIN exchange
In the simplest case, the user would like to
terminate communication immediately without
caring about the state of previously transmitted data
which may as yet be unacknowledged, or data as yet
not received. Instead of just aborting the connection,
the TPs could conclude that the connection has
closed when each has transmitted and received a FIN
command. The reason for requiring receipt of a FIN
in addition to transmitting one, is that the receipt of
the FIN indicates that the other end has noticed that
the connection should be terminated and will proceed
to do so. The initiator of the termination has assurance that the remote end will not continue computing and/or charging the user as in a timesharing
system. If a FIN is not received in response to one
sent within a timeout, then the connection is closed
none the less.
When one of the user processes decides that the
connection should be closed, it tells its local TP to
close the connection. The TP sends a FIN to the
other TP with the usual next sequence number and
current acknowledgement. Upon receipt of an
unsolicited FIN packet with an acceptable sequence
number, the TP informs its local user that the connection is closing and replies with a FIN packet of its
own (Figure 9). A TP that has both transmitted and
received a FIN packet may destroy all knowledge of
the connection. If both TPs transmit a FIN simultaneously, then they will interpret the other's as a
response to their own.
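
The simple FIN exchange can be sketched as a small state object. The Python below is a minimal illustration under stated assumptions: the class and method names are hypothetical, sequence number checking is omitted (every FIN passed in is assumed acceptable), and packets are represented only by the string "FIN".

```python
from dataclasses import dataclass

@dataclass
class SimpleCloser:
    """One end of a connection using the simple FIN exchange (sketch)."""
    fin_sent: bool = False
    fin_received: bool = False
    active: bool = True

    def _maybe_finish(self):
        # Knowledge of the connection is destroyed once a FIN has been
        # both transmitted and received.
        if self.fin_sent and self.fin_received:
            self.active = False

    def user_close(self):
        """Local user asks to close; returns the packets to transmit."""
        out = []
        if self.active and not self.fin_sent:
            self.fin_sent = True
            out.append("FIN")
        self._maybe_finish()
        return out

    def receive_fin(self):
        """A FIN with an acceptable sequence number arrives."""
        out = []
        if not self.active:
            return out
        self.fin_received = True
        if not self.fin_sent:
            # Unsolicited FIN: reply with a FIN of our own.
            self.fin_sent = True
            out.append("FIN")
        self._maybe_finish()
        return out

    def timeout(self):
        """No FIN arrived in response within the timeout: close anyway."""
        if self.fin_sent:
            self.active = False

a, b = SimpleCloser(), SimpleCloser()
for pkt in a.user_close():      # A requests termination
    for reply in b.receive_fin():   # B complies with its own FIN
        a.receive_fin()
print(a.active, b.active)
```

If both ends call `user_close()` before either FIN arrives, each treats the other's FIN as the response to its own, so neither sends a second FIN.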
This simple scheme is not very reliable. The data
transmitted prior to the transmission of FINs is not
guaranteed to be delivered (data in packet 1 of Figure
9). If full delivery is important, the high level protocol must wait to close the connection until all data
has been successfully transmitted. More seriously, if
the initiating TP times out it cannot be certain
whether the other side has received the FIN and the
reply was lost, or whether the other side never
received the FIN and is still open.

Fig. 9. Connection termination using simple FIN exchange.
Both sides start Established; A's user issues CLOSE.
(1) B to A: <Seq 78><Ack 29><data HI> (data in transit)
(2) A to B: <Seq 29><FIN><Ack 78> (A requests termination, Closing)
(3) B to A: <Seq 80><FIN><Ack 30> (B complies)
Both sides end Not active.
4.4. Acknowledged FIN exchange

To guarantee that the connection will be closed on
both sides (without necessarily guaranteeing delivery
of data in transit), a protocol requiring acknowledgement of FINs may be used. Each TP transmits a
FIN packet and waits for it to be acknowledged by
the other end, in addition to waiting for a FIN from
the other end. This scheme is very much like the 3
way handshake used to exchange SYNs reliably.
In the first of the three cases mentioned earlier,
when the user process closes the connection, the TP
transmits a FIN packet and will not accept any more
data from the user process. All packets preceding
and including the FIN will be retransmitted until
acknowledged. When the remote TP has both
acknowledged the FIN and sent a FIN of its own, the
first TCP can acknowledge the remote FIN and delete
knowledge of this connection from its state tables.
In the second case, a TP receives an unsolicited
FIN from the remote TP. The receiving TP then
acknowledges the FIN and sends back one of its own
even if intervening data have not been received. (Data
packet 1 in Figure 10 is never delivered.) It also
informs its user of the remote close request, and does
not accept any more data to send. The TP waits for
an acknowledgement of its FIN packet before it can
delete knowledge of the connection (Figure 10). If
an acknowledgement is not forthcoming, after a timeout the connection is deleted none the less.

In the third case, a simultaneous close by both
processes will cause FIN packets to be exchanged.
Each TP acknowledges the FIN it received. Both TPs
upon receiving these acknowledgements can delete
the connection (Figure 11). In the event that the
acknowledgements are lost, one or both TPs will
continue retransmitting the FIN until they timeout.
By requiring FINs to be acknowledged, both sides
can be sure that the connection will close, once the
initiating FIN has been received. In particular, if the
replying TP times out it may be that the replying FIN
never reached the initiating TP, or that the final
acknowledgement was lost as in Figure 12. In the first
case, both TPs will timeout, and in the second, the
initiating TP closes normally and the replying TP
times out. In either case both sides will close the
connection.
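
The three cases of the acknowledged FIN exchange can be traced with a small sketch. The Python below is an assumption-laden illustration, not the authors' implementation: the `Packet` and `AckedCloser` names are hypothetical, sequence numbers and retransmission are left out, and a packet is reduced to two flags (whether it carries a FIN and whether it acknowledges the peer's FIN).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    fin: bool = False        # packet carries a FIN
    acks_fin: bool = False   # packet acknowledges the peer's FIN

class AckedCloser:
    """One end of a connection using the acknowledged FIN exchange (sketch)."""

    def __init__(self):
        self.state = "Established"
        self.fin_sent = False
        self.fin_acked = False
        self.fin_received = False

    def user_close(self):
        if self.state != "Established":
            return []
        self.state = "Closing"
        self.fin_sent = True
        return [Packet(fin=True)]

    def receive(self, pkt):
        if self.state == "NotActive":
            return []
        out = []
        if pkt.acks_fin and self.fin_sent:
            self.fin_acked = True
        if pkt.fin:
            self.fin_received = True
            if self.fin_sent:
                # We already sent our FIN: just acknowledge the peer's.
                out.append(Packet(acks_fin=True))
            else:
                # Unsolicited FIN: acknowledge it and send our own FIN.
                self.fin_sent = True
                self.state = "Closing"
                out.append(Packet(fin=True, acks_fin=True))
        if self.fin_acked and self.fin_received:
            self.state = "NotActive"
        return out

    def timeout(self):
        # If no acknowledgement arrives, the connection is deleted anyway.
        if self.state == "Closing":
            self.state = "NotActive"

# The lost final acknowledgement case (as in Figure 12).
a, b = AckedCloser(), AckedCloser()
fin = a.user_close()[0]
reply = b.receive(fin)[0]       # FIN plus acknowledgement from B
final_ack = a.receive(reply)    # A acknowledges and closes normally
b.timeout()                     # final_ack never arrives at B
print(a.state, b.state)
```

Both ends finish in the NotActive state even though the last acknowledgement was never delivered, which is the property the text claims for this scheme.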
4.5. Graceful acknowledged FIN exchange

Users may in addition desire that the data in the
pipe during connection termination be reliably
delivered. Close is an operation meaning "I have no
more data to send." The notion of closing a full duplex
connection is subject to ambiguous interpretation,
since it may not be obvious how to treat the receiving
side of the connection. The TP interprets close in a
half duplex fashion. The user who closes may
continue to receive data until it is told that the other
end has closed too. The TP will reliably deliver all
data sent prior to closing the connection, so a user
that expects no data in return need only wait to hear
that the connection was closed successfully to know
that all its data was received at the destination end.
This has often been termed a graceful termination. The
algorithms by which this is achieved are very similar
to those described for the acknowledged FIN
exchange. The receiver of a FIN does not acknowledge the FIN until all data transmitted before the
FIN has been successfully received. In addition, a FIN
is not returned until the local user indicates that it
too wants the connection closed. On receipt of a FIN
the TCP informs its local user so that the user may
request that the connection be closed from its end as
well.

Fig. 10. Connection termination using acknowledged FIN exchange.
Both sides start Established.
(1) A to B: <Seq 29><Ack 78><data HIJKL> (data in transit)
(2) A to B: <Seq 34><FIN><Ack 78> (A requests termination, Closing)
(3) B to A: <Seq 78><FIN><Ack 35> (B complies)
(4) A to B: <Seq 35><Ack 79>
Both sides end Not active.

Fig. 11. Simultaneous exchange of confirming FINs.
Both sides start Established and issue CLOSE.
(1) left to right: <Seq 29><FIN><Ack 78>
(2) right to left: <Seq 78><FIN><Ack 29>
Both sides are Closing; each then acknowledges the FIN it received:
left to right: <Seq 30><Ack 79>
right to left: <Seq 79><Ack 30>
Both sides end Not active.
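
The two extra rules of the graceful close in Section 4.5 (withhold the acknowledgement of a FIN until all earlier data has arrived, and return a FIN only when the local user closes) can be sketched from the receiver's side. The Python below is a minimal illustration; the class and method names are hypothetical, out-of-order data is simply dropped rather than held, and the FIN is assumed to occupy one sequence number, as the packet numbers in Figure 10 suggest.

```python
class GracefulEnd:
    """Receiving side of a graceful acknowledged FIN exchange (sketch)."""

    def __init__(self, esn):
        self.esn = esn               # next sequence number expected from the peer
        self.remote_closed = False   # set when the user is told the peer closed
        self.local_closed = False

    def receive_data(self, seq, data):
        """Return the acknowledgement number, or None if nothing is acknowledged."""
        if seq != self.esn:
            return None              # out of order: dropped in this sketch
        self.esn += len(data)
        return self.esn

    def receive_fin(self, seq):
        """Acknowledge a FIN only when all earlier data has been received."""
        if seq != self.esn:
            return None              # earlier data still missing
        self.esn += 1                # the FIN occupies one sequence number
        self.remote_closed = True    # inform the local user
        return self.esn

    def user_close(self):
        """The local user closes too; only now is a FIN returned."""
        self.local_closed = True
        return "FIN"

b = GracefulEnd(esn=29)
print(b.receive_fin(34))             # FIN arrives before the data: not acknowledged
print(b.receive_data(29, "HIJKL"))   # data arrives: acknowledged up to 34
print(b.receive_fin(34))             # retransmitted FIN: acknowledged with 35
print(b.user_close())
```

The sequence numbers reuse those of Figure 10 purely for familiarity; in that figure the exchange is the non-graceful variant, where the FIN is acknowledged even though the data is lost.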

5. Failure recovery

The protocol mechanisms described so far adequately manage connections as long as the protocol
machines on each side continue to function normally.
Unfortunately, hosts and their TPs occasionally crash
with loss of state information. When a host fails and
subsequently restarts, all knowledge of TP connections is lost. Connections which were established
become half open with the failed side thinking they
are not active, while the other side believes the
connection is still established. Half open connections
may also result from protocol design flaws without
requiring crashes.
Under these circumstances a TP cannot guarantee
whether data in transit during the crash was delivered
or not [1,22] and higher level error recovery must be
invoked.
Since the established side is consuming resources,
it is desirable to reset half open connections
promptly. One way to close half open connections is
to have a timer in each TP that goes off if an established connection has not actively received any
correct packets for some time, or if a retransmission
timeout exceeds some predefined value. The occurrence of such a timeout will prompt the TP to remove
knowledge of the half open connection once the user
process owning the connection agrees. This
mechanism was used to terminate connections owing
to timeouts in Section 4.
Alternatively, a control packet indicating that the
connection should be Reset (RST) can be used. An
example of this was illustrated in Figure 6. The
general use of this command is now described.

Fig. 12. Acknowledged FIN exchange terminated by timeout.
Both sides start Established.
(1) initiating TP to replying TP: <Seq 29><Ack 78><data ABCDE>
(2) initiating TP to replying TP: <Seq 34><FIN><Ack 78> (after CLOSE, Closing)
(3) replying TP to initiating TP: <Seq 78><FIN><Ack 35> (Closing)
(4) initiating TP to replying TP: <Seq 35><Ack 79> (lost); initiating TP becomes Not active
(3) retransmitted by replying TP: <Seq 78><FIN><Ack 35>
The replying TP reaches Timeout and becomes Not active.

Fig. 13. Half-open connection discovery and reset.
A crashes; B remains Established with NSN = 300, ESN = 100. A issues OPEN and is Opening.
(1) A to B: <Seq 6><SYN>
(2) B to A: <Seq 300><Ack 100> (rejected at A)
(3) A to B: <Seq 100><RST><Ack 300> (believable; B becomes Not active)
(1) retransmitted by A: <Seq 6><SYN>


Assume that two TPs A and B are communicating
with one another when a crash occurs causing loss of
memory to A. When A is up again it is likely to
restart from the beginning, or continue from some
recovery point. As a result the user process at A will
probably try to open the connection again or try to
send on the connection it believes is established. In
the latter case it receives the error message "connection not open" from its TP. In an attempt to establish
the connection, A will send a packet containing SYN.
When the SYN arrives, B, being in the Established
state, ignores the SYN but responds with an acknowledgement indicating what sequence it next expects
to hear. A sees that this packet does not acknowledge
anything it sent and, being unsynchronized, sends a
RST because it has detected a half open connection.
B can believe the RST since it refers to its ESN and
NSN and aborts the connection, notifying the user.
This scenario is illustrated in Figure 13. If the
acknowledgement packet was instead an old duplicate, the RST referring to it would not be acceptable
at B. A will continue to retransmit its SYN and if the
user at B reopens the connection, it will eventually be
established.
The case when A crashes and B tries to send data
on what it thinks is an established connection is illustrated in Figure 14. The data arriving at A from B is
unacceptable because no connection exists. So A
sends back a RST. The RST is acceptable and B
processes it and aborts the connection.
A variety of other cases are possible, all of which
are accounted for, by the following rules for RST
generation and processing:
1) If the connection is not yet Established (or does
not exist), a RST should be formed and sent for any
packet that does not acknowledge something the
receiver sent earlier. The RST should take its
sequence number field from the acknowledgement
field of the offending packet (if it has one) and its
acknowledgement field should acknowledge all data
and control in the offending packet.
2) If the connection has been Established, any
unacceptable packet should elicit only an empty
acknowledgement packet containing the current NSN
and an acknowledgement indicating the ESN.

3) All RST packets are validated by first checking
their sequence field as for other packets, then if the
RST acknowledges something the receiver sent (but
has not yet received acknowledgement for), the RST
must be valid. After validating the RST, the TCP
changes state. If the connection was not Established
it returns to the Opening or Listening state. If the
connection was Established, it is aborted, placed in
the NotActive state, and the local user process is notified.

Fig. 14. Half-open connection discovery and reset.
A crashes and is Not active; B remains Established with NSN = 300, ESN = 100.
(1) B to A: <Seq 300><Ack 100><data AB> (rejected at A)
(2) A to B: <Seq 100><RST><Ack 302> (believable at B)
Both sides end Not active.
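
Rules 1) and 2) for generating replies can be checked against the packet values in Figures 13 and 14 with a short sketch. The Python below is an illustration under assumptions: the `Packet` fields and the `reply_to` function are hypothetical names, the decision of whether a packet "acknowledges something the receiver sent" is passed in as a flag rather than computed, the caller is assumed to pass only unacceptable packets in the Established case, and an offending packet without an acknowledgement field yields sequence number 0, which is a choice made here and not stated in the text.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Packet:
    seq: int
    ack: Optional[int] = None
    length: int = 0      # sequence numbers consumed by data and control
    rst: bool = False

def reply_to(pkt, established, nsn=0, esn=0, acks_something_we_sent=False):
    """Reply elicited by an offending packet, following rules 1) and 2)."""
    if established:
        # Rule 2: only an empty acknowledgement carrying NSN and ESN.
        return Packet(seq=nsn, ack=esn)
    if acks_something_we_sent:
        # Not an offending packet under rule 1; normal processing applies.
        return None
    # Rule 1: the RST takes its sequence number from the offending packet's
    # acknowledgement field and acknowledges everything in that packet.
    rst_seq = pkt.ack if pkt.ack is not None else 0   # 0 is an assumption
    return Packet(seq=rst_seq, ack=pkt.seq + pkt.length, rst=True)

# Figure 14: data from B arrives at A, which has no connection.
print(reply_to(Packet(seq=300, ack=100, length=2), established=False))
# Figure 13, step 2: a SYN arrives at B, which is Established.
print(reply_to(Packet(seq=6, length=1), established=True, nsn=300, esn=100))
# Figure 13, step 3: B's acknowledgement arrives at A, which is Opening.
print(reply_to(Packet(seq=300, ack=100), established=False))
```

The three calls reproduce the RST with Seq 100 and Ack 302, the empty acknowledgement with Seq 300 and Ack 100, and the RST with Seq 100 and Ack 300 shown in the figures.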

6. Modeling and analysis techniques


Since overall protocol reliability depends heavily
on the operation of connection management mechanisms, it is particularly important to verify that they
function correctly. Most often this has been done by
the sort of informal case studies performed in Sections
3 to 5. In this form of analysis, the designer tries to
identify all situations of interest, and to verify that
the protocol "does the right thing" in each case.
These narrative analyses are very valuable to provide
motivation for and intuitive understanding of protocol mechanisms, and have successfully uncovered a
number of design flaws. In the case of TCP, the
problems associated with the credulous connection
opening mechanism described in Section 2 were
found this way, leading to the development of the 3
way handshake.
More rigorous analysis techniques must be used to
verify the reliability of transport protocols with
greater certainty. This requires a precise model of the
protocol whose detailed operation can be analyzed.
This model typically consists of a pair of protocol
machines connected by a transmission medium. The
machines receive commands from their respective
users, and messages from each other via the transmission medium. The transmission medium is itself a
simple kind of machine that may introduce errors,
delays, and other perturbations between its input and
its output.
This model may be approached from two viewpoints, local and global [14]. The local viewpoint
focuses on an individual protocol machine and is
most useful for specification and implementation
purposes. The global viewpoint considers the entire
system as a black box with inputs and outputs to
each user. The global view is most useful for verification which must show that the outputs to each user
are as desired in all cases.


Protocol verification techniques fall into two main
classes, program proofs and state models [23]. In the
former, each machine is specified algorithmically, and
assertions which reflect the desired reliability goals
must be formulated and proved. This approach has
been effectively applied to verifying the data transfer
features of transport protocols where large or infinite
numbers of interaction sequences are possible due to
large sequence number spaces and retransmissions
[2,21].
State models have been developed using such
formalisms as Petri nets, state diagrams, state transition matrixes, flow charts, and programs to define the
protocol machines [14,16,18,20]. Some form of
reachability analysis is typically performed to
generate all possible interactions of the system,
followed by a check for undesirable states. This
approach has been successful in verifying data
transfer features where simplifying assumptions are
made to keep the number of states tractable, but is
most applicable to connection management where the
number of states is inherently small. Several mixed
systems have also been developed in which state
models are used for the basic states of the protocol
machine, augmented by context variables which are
not part of the basic state [3,11].
A comprehensive treatment of these alternative
techniques is beyond the scope of this paper. As an
illustration, we present a brief description of the
technique used to formally verify portions of the TCP
connection management procedures in [22]. This
technique falls into the state model class, and takes a
global view.
Each protocol machine is modeled as a classic state
machine with an input set, an output set, a set of
internal states, and functions giving the next state and
output for each combination of input and current
state. Inputs consist of user commands, messages
from the other protocol machine (or the network),
and internally generated events such as timeouts. The
two machines operate independently, with synchronization achieved by one machine waiting for a particular type of packet from the other machine.
We define the composite state of the system as the
state of the protocol machine on each side of the
connection, plus any relevant packets in the transmission medium between them. Transitions from one
composite state to another are derived from the state
transitions of the individual protocol machines by
including all possible transitions of either protocol
machine, given the state of the transmission medium.
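
The idea of generating composite states by reachability and then checking for undesirable ones can be shown on a deliberately reduced toy model. The Python sketch below is not the model of [22]: it assumes only side A may issue OPEN, the medium neither loses nor duplicates packets, there are no old duplicates, and a packet is removed from the medium as soon as it is processed. All names (`MACHINE`, `successors`, `reachable`) are hypothetical.

```python
from collections import deque

# (state, event) -> (next state, packet type sent or None)
# NA = NotActive, SS = SYN-Sent, SR = SYN-Received, ES = Established
MACHINE = {
    ("NA", "OPEN"):    ("SS", "SYN"),
    ("NA", "SYN"):     ("SR", "SYN-ACK"),
    ("SS", "SYN-ACK"): ("ES", "ACK"),
    ("SR", "ACK"):     ("ES", None),
}

def successors(composite):
    """All composite states reachable in one transition of either machine."""
    a, b, in_transit = composite
    result = []
    # User command: in this sketch only side A may OPEN.
    if (a, "OPEN") in MACHINE:
        nxt, out = MACHINE[(a, "OPEN")]
        result.append((nxt, b, in_transit | {(out, "A->B")}))
    # Packet arrivals at either side.
    for kind, direction in in_transit:
        rest = in_transit - {(kind, direction)}
        if direction == "A->B" and (b, kind) in MACHINE:
            nxt, out = MACHINE[(b, kind)]
            sent = {(out, "B->A")} if out else set()
            result.append((a, nxt, rest | sent))
        if direction == "B->A" and (a, kind) in MACHINE:
            nxt, out = MACHINE[(a, kind)]
            sent = {(out, "A->B")} if out else set()
            result.append((nxt, b, rest | sent))
    return result

def reachable(start):
    """Breadth-first generation of every reachable composite state."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

START = ("NA", "NA", frozenset())
states = reachable(START)
terminal = {s for s in states if not successors(s)}
# Undesirable: a terminal state with exactly one side Established.
bad = {s for s in terminal if (s[0] == "ES") != (s[1] == "ES")}
print(len(states), terminal, bad)
```

Even in this reduced form the structure matches the description: composite transitions are derived from the individual machine tables, and the safety check is a filter over the terminal states. A faithful model would add collisions, retransmission, and old duplicate arrivals.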


The number of composite states is then the
number of protocol machine states squared times the
number of different states of the transmission
medium. In a straightforward approach, every packet
(including retransmissions) since the time the composite system was created would have to be included in
the state, making the state space unworkably large.
This potential state explosion may be limited in two
ways: by reducing the number of protocol machine
states, and by treating as equivalent different sets of
packets in transit as described below.
In order to verify connection management
functions, the protocol machine is defined with basic
states for connection establishment and termination
functions only. These include a NotActive state, an
Established state, and several intermediate states for
going from one to the other as connections are
opened and closed. All data transfer takes place in the
Established state, and is therefore not modeled by the
basic protocol machine. Information about sequence
numbers is part of the context information maintained outside the basic states of the protocol. This
context information is used along with the basic state
to determine the processing of inputs.
To limit the number of transmission medium
states, all packets in the transmission medium can be
classified as either current or old by considering the
protocol's use of sequence numbers. Packets are
current if either
1) The packet is pending (waiting for retransmission) at the sender. Normally this condition holds
until the sender receives some form of acknowledgement. For packets which are not retransmitted (ACK,
RST) it does not hold.
2) The packet refers to a current packet traveling
in the opposite direction (e.g. ACK or RST of another
packet). When the opposite packet is no longer
current, both packets are removed from the composite state.
Since we are primarily interested in worst case
analysis, we assume that duplicate packets from
previous connections may be held in the transmission
medium and emerge during or after establishment of
the current connection (subject to packet lifetime
constraints). We group all old packets into a single
class for each packet type (since their processing will
be equivalent), and implicitly attach this set of old
packet types to the transmission medium state. This
means that we allow all old packet arrival events to
occur in every state, which is a more rigorous but
simpler to represent test for the protocol than any
real transmission medium. Hence only current
packets must be explicitly represented as part of the
composite state which determines which packets can
arrive, since all old packets are implicitly part of the
state.

Fig. 15. Three way handshake protocol machine.
States: NotActive, SYN-Sent, SYN-Received, Established.
Transition labels: OPEN; SYN arrives; set timeout; timeout or Retry; SYN-ACK arrives, send ACK; ACK arrives.
Once a packet is generated by a protocol transition, it remains in the composite state until it is no
longer current, despite the assumption that the transmission medium can lose or damage packets. Since
every current packet is either being retransmitted, or
is a response to a packet being retransmitted, we can
assume that current packets are always available to
cause transitions. In reality, packets may temporarily
disappear from the composite state if lost or
damaged, but will always reappear due to retransmissions.
By taking advantage of these reductions, a
relatively compact model of connection establishment in TCP may be developed. Figure 15 shows the
basic protocol machine states and state transitions in
opening a connection. Each transition is marked by
the input causing the transition, and the output (if
any) accompanying it. Four states are needed to
represent a simple 3 way handshake protocol which
retries on collision. The figure shows only normal
events for clarity. The set of operations {detect
event, take appropriate action, move to new state}
is assumed to be atomic or uninterruptable so that
no confusion can result from nearly simultaneous
events.
For a complete model, the processing of all possible input events in each state must be specified. As
an example, the table for the SYN-Received state is
shown in Figure 16. The set of input events consists
of packets (SYN, SYN-ACK, Data, ACK, RST), user
commands (OPEN), and internal timeouts (Retransmit, Quit, Retry-after-collision). Note that the same
input event results in different outcomes depending
on the context information.
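
The table for the SYN-Received state in Figure 16 translates directly into a function of the event and the context. The Python below is a sketch of that table only: the function name and the two boolean context flags are hypothetical, and the action strings paraphrase the table entries.

```python
def syn_received_step(event, refers_to_syn_ack=False, duplicate_of_current=False):
    """Return (next state, action) for one input event in the SYN-Received state.

    refers_to_syn_ack: the ACK or RST refers to the SYN-ACK we transmitted.
    duplicate_of_current: the SYN is a duplicate for the current connection.
    """
    SELF = "SYN-Received"
    if event == "SYN":
        action = "send ACK" if duplicate_of_current else "send RST"
        return SELF, action
    if event == "SYN-ACK":
        return SELF, "send RST referencing SYN-ACK"
    if event == "Data":
        return SELF, "discard as out of order (or hold)"
    if event == "ACK":
        # Third part of the 3 way handshake when it refers to our SYN-ACK.
        return ("Established", None) if refers_to_syn_ack else (SELF, "ignore")
    if event == "RST":
        # The SYN previously received was an old duplicate.
        return ("NotActive", None) if refers_to_syn_ack else (SELF, "ignore")
    if event == "OPEN":
        return SELF, "ignore"                 # already in progress
    if event == "Retransmit":
        return SELF, "retransmit SYN-ACK"
    if event == "Quit":
        return SELF, "notify local process"
    if event == "Retry":
        return SELF, "ignore"                 # other side has already started
    raise ValueError(f"unknown event: {event}")

print(syn_received_step("ACK", refers_to_syn_ack=True))
print(syn_received_step("ACK", refers_to_syn_ack=False))
```

The two calls show the point made in the text: the same ACK event leads to Established or is ignored, depending on the context information.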
Fig. 17 shows the composite state diagram resulting
from the connection establishment protocol of Figure
15. Each composite state is represented by a pair of
process states and a list of current packets. Some
context is represented along with the basic state of
each process. This consists of the sequence number
for outgoing packets in the SYN-Sent, SYN-Received,
and Established states, and also the sequence number
for incoming packets in the Established state. This
allows us to determine whether the protocol has
correctly initialized sequence numbers when the Established state is reached.
Current packets are represented by their packet
types, with a subscript giving their own sequence
number if relevant, followed by the sequence number
of another packet they may refer to (in parentheses).
An arrow above the packet shows its direction of
travel. Thus SYN-ACKy(x) under a rightward arrow represents a SYN-ACK
packet with sequence number y, referring to another
packet with sequence number x, and traveling from
left to right.

Event      Next State   Action and Context
SYN        self         Send ACK if SYN is duplicate for current connection.
SYN        self         Send RST if SYN is not from current connection.
SYN-ACK    self         Send RST referencing SYN-ACK.
Data       self         Discard as out of order (or hold).
ACK        ES           If ACK refers to transmitted SYN-ACK. (Third part of 3 way handshake.)
RST        NA           If RST refers to transmitted SYN-ACK. (SYN previously received was an old duplicate.)
ACK, RST   self         Ignore if does not refer to transmitted SYN-ACK.
OPEN       self         Ignore since already in progress.
Retrans    self         Retransmit SYN-ACK.
Quit       self         Notify local process.
Retry      self         Ignore since other side has already started.

ES = Established, NA = Not Active

Fig. 16. State transition table for SYN-Received state.
Symmetric states (identical except for switching
process identities and packet directions) have been
eliminated to simplify the figure. Transitions to the
same state such as retransmissions are not shown.
Composite transitions resulting from simultaneous
transitions of both protocol machines are perfectly
legal, but are shown as sequential individual transitions to reduce the number of arrows.

Fig. 17. Composite state diagram for "three way handshake".
Each composite state is written (Process)(Process)(Current packets).
(NA) Not active
(SSx) SYN sent with seq # x
(SRy) SYN received and a SYN sent with seq # y
(ES) Established
x: incoming (expected) seq. #; y: outgoing seq. #

This composite state model demonstrates several
aspects of protocol correctness for the normal case
where both protocol machines start in the NotActive
state and function according to their definition (no
failures). This includes safety considerations (absence
of deadlocks, correctness of outcome), and liveness
considerations (progress and eventual termination) as
follows.
There are no terminal states with one machine
Established and the other not Established. The only
terminal states have both processes NotActive (if a
connection was rejected) or both processes Established. Furthermore, when both processes are Established, sequence numbers for both directions are
properly initialized. Hence there is no deadlock in the
procedure for connection establishment.
All paths leading back to the NotActive state for
either process are caused by collisions (simultaneous
open requests) which will cause a later retry to establish the connection. Assuming that perpetual collisions are avoided by the random retry interval, and
that the transmission medium provides a nonzero
probability of delivering any packet, the protocol will
eventually succeed in establishing a connection
(unless the attempt is rejected).
These results show the sufficiency of the connection establishment mechanisms embodied in the 3
way handshake. The inadequacy of simpler techniques
(given worst case transmission medium behavior) was
demonstrated informally in the discussion accompanying Figure 2, and is shown formally using the
above techniques in [22]. The mechanisms for
recovering from protocol failures described in Section
5 can also be incorporated into the model by adding
corresponding transitions to the protocol machine.
These introduce transitions out of the half closed
composite state which lead to reestablishing the connection [22].

7. Conclusions
Connection management functions are intimately
involved in transport protocol reliability and require
careful design if errors are to be avoided under unusual
but possible circumstances. Special care must be
taken to avoid the possibility of confusing packets
from different connections between the same pair of
processes since old packets may persist in the network and be delivered during later connections. This
requires a unique identifier for each new connection,
or careful selection of the initial sequence number
(ISN) to be used. A clock may be used to reliably
determine the ISN even after protocol failures, but
clock-based schemes require resynchronization of the
connection at certain points. The 3 way handshake
provides a means for reliably synchronizing the two
sides of the protocol to establish a connection while
minimizing the amount of state information that
must be maintained for inactive connections. This
involves the exchange of special synchronization
control packets (SYN) at the start of a connection.
Control packets (FIN) may also be used to
terminate a connection. In a graceful closing, the
protocol guarantees that all data sent before the close
command is issued will be delivered. In an immediate
closing, data in transit may or may not be delivered,
but both sides know that the connection has
terminated. A unilateral closing or abort does not
guarantee that resources on the other side of the
connection have been released. Half open connections
resulting from failure of one side of the protocol can
also be terminated by an exchange of control packets.
While informal analysis of scenarios is a very useful
tool for designing connection management mechanisms, it is not adequate for a fully reliable analysis.
More precise models defining the detailed operation
of the protocol machines on each side of the connection must be developed for this purpose. One such
model specifying the machines as finite automata
augmented by context information is presented and
was successfully used to verify connection establishment in TCP.

Acknowledgements
The development of the Transmission Control Protocol
which provided the basis for this work was carried out while
the authors were at the Stanford University Digital Systems
Laboratory, supported in part by the Defense Advanced
Research Projects Agency (under contract number MDA903-76C-0093) and by the National Science Foundation Graduate
Fellowship Program. Vint Cerf and Robert Kahn provided
many of the original ideas behind TCP and Vint Cerf has
been a constant participant in discussions on improvements.
The Cyclades research group under Louis Pouzin also contributed many early ideas through discussions in IFIP TC 6.1.
Ray Tomlinson suggested several important modifications
including the 3 way handshake. Bill Plummer, Dick Karp, Jim
Mathis, and Bob Metcalfe also contributed significantly to
the work. The continuing support of TCP development by
ARPA has made this work possible.

Appendix
The following relationships between the clock parameters
presented in section 5 are useful in analyzing resynchronization.

$S = 32$ bits    (1)

D - This is a design parameter which is chosen primarily on
the basis of the time the TCP is willing to wait before the
same processes can communicate again. Since it affects some
of the other parameters too, it should not be chosen completely arbitrarily.

$B = 2^q / D$ octets/sec    (2)

$C = 2^p D$ sec    (3)

$2^S > 2BL$ - The maximum rate at which sequence numbers
are used is related to L and S [4]. This prevents packet
sequence numbers from cycling and hence being reused while
old packets with identical sequence numbers may still be in
the network.
Time until clock resynchronization

The connection must be resynchronized when packet
sequence numbers about to be assigned lie in the forbidden
zone to the right. We now show how the resynchronization
time R, measured from when the connection was synchronized or last resynchronized, depends on the actual average bandwidth b being
achieved.
Assume that the step curves in Figure 7 have been linearized, and have the same average slope. Let s be the sequence
number at any time t. The equation of the line giving the
(linearized) forbidden zone boundary is

$$s = B(t - (C - L)),$$

because the line is displaced in time from the origin by
(C - L), and has a slope B.
The equation of the line giving the actual use of sequence
numbers is

$$s = bt.$$

Hence the point of intersection gives t = R:

$$bR = B(R - C + L)$$

$$R = \frac{C - L}{1 - b/B} \qquad (4)$$

so that $R = \infty$ when b = B, and R = C - L when b = 0.

The end conditions are intuitively correct, because if b = B
then the clock never catches up with the actual rate of transmission and therefore resynchronization never has to be
performed, or if b = 0 then resynchronization has to be
performed at a time corresponding to the potential assignment of a sequence number that lies within the forbidden
zone.

Forbidden zone tests

We now describe the two tests for determining whether
sequence numbers assigned to data will lie in the forbidden
zone or not, in terms of the clock parameters. Let x be the
current sequence number and n be the current value of the
clock. In order to determine whether x is in the forbidden
zone to the right, the TCP tests if x lies within the range [forbidden zone boundary, ISN + 1 curve], i.e.
$[2^q(n + 1 + \lceil L/D \rceil) - 1,\ 2^q(n + 1)]$ [9].
Therefore, if

$$(\{2^q(n + 1 + \lceil L/D \rceil) - 1 - x\} \bmod 2^S) < 2^q \lceil L/D \rceil \qquad (5)$$

then the connection must be resynchronized.

In order to determine whether sequence number assignment will enter the forbidden zone from the bottom, the
TCP tests whether any of the sequence numbers assigned to
data in the packet lie in the range [ISN + 1 curve, forbidden
zone boundary].
Let d be the length of the data in the packet in octets and
define

$$y = (x + d - 1) \bmod 2^S.$$

The test can be reformulated as whether the ISN + 1
curve lies within the range [x, y].
Therefore, if

$$(\{y - 2^q(n + 1)\} \bmod 2^S) < d \qquad (6)$$

then only the portion between x and $2^q(n + 1) - 1$ can be
transmitted in this packet. Note that this test can be omitted
if it can be guaranteed that sequence number assignment will never enter the forbidden zone from the bottom.
This is possible if B is larger than the bandwidth attainable
through the network.
Assuming that both tests must be performed, each packet
must be tested as follows before it can be transmitted:
IF eqn (5) is true THEN resynchronize the connection
ELSE IF eqn (6) is true THEN transmit partial packet or wait
for the clock to tick
ELSE transmit entire packet.
We assume a zero time for putting a packet on the transmit
queue, and take into account the maximum packet length
and finite time to transmit by increasing the effective forbidden zone. We also assume that resynchronization occurs
before entry into the real forbidden zone.

Example parameters

A suitable choice of the various parameters is

p = 14 bits, D = 1 sec, q = 18 bits, L = 30 secs.

This gives B ≈ 2 Mbits/sec and C ≈ 4.5 hours and permits
opening new connections every 1 second.
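
The resynchronization time of eqn (4) and the per-packet tests of eqns (5) and (6) can be sketched in code. This is an illustrative reading of the appendix under the example parameters, not the authors' implementation: all function names and return strings are hypothetical, integer arithmetic is assumed for sequence numbers, and eqn (4) is only evaluated for 0 <= b <= B.

```python
import math


def resync_time(b: float, B: float, C: float, L: float) -> float:
    """Eqn (4): R = (C - L) / (1 - b/B); infinite when b equals B."""
    if not 0 <= b <= B:
        raise ValueError("eqn (4) is only applied here for 0 <= b <= B")
    if b == B:
        return math.inf
    return (C - L) / (1 - b / B)


def in_forbidden_zone(x: int, n: int, q: int, S: int, L: int, D: int) -> bool:
    """Eqn (5): is sequence number x in the forbidden zone to the right?"""
    ticks = math.ceil(L / D)
    boundary = 2 ** q * (n + 1 + ticks) - 1
    return (boundary - x) % 2 ** S < 2 ** q * ticks


def crosses_isn_curve(x: int, d: int, n: int, q: int, S: int) -> bool:
    """Eqn (6): does the ISN + 1 curve lie within [x, y]?"""
    y = (x + d - 1) % 2 ** S
    return (y - 2 ** q * (n + 1)) % 2 ** S < d


def sendable_octets(x: int, n: int, q: int, S: int) -> int:
    """Octets from x up to 2^q(n + 1) - 1, the portion allowed by eqn (6)."""
    return (2 ** q * (n + 1) - x) % 2 ** S


def packet_action(x: int, d: int, n: int, q: int, S: int, L: int, D: int) -> str:
    """The IF / ELSE IF / ELSE procedure applied to one packet."""
    if in_forbidden_zone(x, n, q, S, L, D):
        return "resynchronize"
    if crosses_isn_curve(x, d, n, q, S):
        return "partial-or-wait"
    return "transmit"


# Example parameters: p = 14, q = 18, S = 32, D = 1 sec, L = 30 secs.
Q, S_BITS, D_SEC, L_SEC = 18, 32, 1, 30
B_RATE = 2 ** Q / D_SEC
C_CYCLE = 2 ** 14 * D_SEC

print(resync_time(0, B_RATE, C_CYCLE, L_SEC))           # idle connection
print(resync_time(B_RATE / 2, B_RATE, C_CYCLE, L_SEC))  # half of B
print(packet_action(262140, 10, 0, Q, S_BITS, L_SEC, D_SEC))
```

The modulo arithmetic is what lets both tests work across wraparound of the sequence space. Python's `%` returns a non-negative result for a positive modulus, which matches the mod $2^S$ in the equations; in a language where the remainder can be negative, unsigned S-bit arithmetic would be needed instead.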

References
[1] D. Belsnes, Single Message Communication, IEEE Trans.
on Communication, Vol 24, No 2, February 1976,
pp. 190-194.
[2] G.V. Bochmann, Logical Verification and Implementation of Protocols, Proc. Fourth Data Communications
Symp., Quebec, Canada, October 1975, pp. 7.15-7.20.
(IEEE 75CH1001-7 DATA)
[3] G.V. Bochmann and J. Gecsei, A Unified Method for
the Specification and Verification of Protocols, Proc.
IFIP Congress, Toronto, Canada, August 1977, pp.
229-234.
[4] V.G. Cerf and Robert E. Kahn, A Protocol for Packet
Network Intercommunication, IEEE Trans. on Communication, COM-22, May 1974, pp. 637-648.
[5] V.G. Cerf, Y.K. Dalal, and C.A. Sunshine, Specification
of Internet Transmission Control Program, INWG Note
72, revised December 1974.
[6] V.G. Cerf, TCP Resynchronization, Digital Systems Lab
Technical Note Number 79, Stanford University,
January 1976.
[7] V.G. Cerf, Specification of Internet Transmission Control Program, TCP (Version 2), March 1977, (available
from DARPA/IPTO).
[8] V.G. Cerf and J.B. Postel, Specification of Internet
Transmission Control Program, TCP (Version 3),
January 1978, (available from USC/ISI).
[9] Y.K. Dalal, More on Selecting Sequence Numbers, Proc.
ACM SIGCOMM/SIGOPS Interprocess Communications
Workshop, Santa Monica, March 1975, pp. 25-36.
(ACM Operating Systems Review, 9, 3, July 1975). Also
INWG Protocol Note 4, October 1974.
[10] Y.K. Dalal, Establishing a Connection, INWG Protocol
Note 14, March 1975.
[11] A. Danthine and J. Bremer, Modeling and Verification
of End-to-End Transport Protocol, Proc. Symp. on
Computer Network Protocols, Liege, Belgium, February
1978.
[12] J.G. Fletcher and R.W. Watson, Mechanisms for a
Reliable Timer-Based Protocol, Proc. Symp. on Computer Network Protocols, Liege, Belgium, February
1978.

[13] L.L. Garlick and R. Rom, Reliable Host-to-Host Protocols: Problems and Techniques, Proc. Fifth Data Communications Symp., Snowbird, Utah, September 1977,
pp. 4.58-4.65. (IEEE 77CH1260-9C).

[14] M.G. Gouda and E.G. Manning, On the Modelling,
Analysis, and Design of Protocols - A Special Class of
Software Structures, Proc. Int. Conf. on Software Engineering, October 1976, pp. 256-262.
[15] S.T. Kent, Encryption Based Protection for Interactive
User/Computer Communication, Proc. Fifth Data Communications Symp., Snowbird, Utah, September 1977,
pp. 5.7-5.13.
[16] P.M. Merlin and D.J. Farber, Recoverability of Communication Protocols - Implications of a Theoretical
Study, IEEE Trans. Communications, September 1976,
pp. 1036-1043.
[17] R.M. Metcalfe, Packet Communication, MIT Project
MAC Report TR-114, December 1973. (PhD Thesis,
Harvard Univ.)
[18] J.B. Postel and D.J. Farber, Graph Modeling of Computer Communications Protocols, Proc. Fifth Texas
Conf. on Computing Systems, Austin, Texas, October
1976, pp. 66-77.
[19] D.P. Reed, The Initial Connection Mechanism in DSP,
MIT Lab for Computer Science LNN 10, August 1977.
[20] H. Rudin, C.H. West, and P. Zafiropulo, Automated
Protocol Validation: One Chain of Development, Proc.
Symp. on Computer Network Protocols, Liege, Belgium, February 1978.
[21] N.V. Stenning, A Data Transfer Protocol, Computer
Networks, 1, 2, September 1976, pp. 99-110.
[22] C.A. Sunshine, Interprocess Communication Protocols
for Computer Networks, Digital Systems Lab Technical
Report No. 105, Stanford Univ., December 1975. (PhD
Thesis).
[23] C.A. Sunshine, Survey of Communication Protocol
Verification Techniques, Proc. Symp. on Computer Networks: Trends and Applications, Gaithersburg, Maryland, November 1976, pp. 24-26. (IEEE 76CH1143-7C).
[24] R.S. Tomlinson, Selecting Sequence Numbers, Proc.
ACM SIGCOMM/SIGOPS Interprocess Communications
Workshop, Santa Monica, California, March 1975, pp.
11-23. (ACM Operating Systems Review, Vol. 9, No. 3,
July 1975) Also INWG Protocol Note No. 2, August
1974.
[25] D.C. Walden, A System for Interprocess Communication in a Resource Sharing Computer Network, Comm.
ACM 15, 4, April 1972, pp. 221-230.