rfc 2460 pdf

AI Investment Up, ROI Remains Iffy

By George Leopold

Real-world applications for artificial intelligence are emerging in areas such as boosting the productivity of dispersed workforces. However, early adopters are still struggling to determine the return on initial AI investments, according to a pair of new vendor reports.

Red Hat released research this week indicating that AI deployments have yielded some tangible results in areas such as transportation and utilities that rely heavily on field workers. A separate forecast released Wednesday (Jan.17) by Narrative Science found growing enterprise adoption of AI technologies but little in the way of investment returns.

Chicago-based Narrative Science, which sells natural language generation technology, found that 61 percent of those companies it surveyed deployed AI technologies in 2017. Early deployments focused on business intelligence, finance and product management. “In 2018, the focus will be on ensuring enterprises get value from their AI investments,” company CEO Stuart Frankel noted in releasing the survey.

Early adopters are also encountering many of the hurdles associated with a “first mover” advantage. “More and more organizations are deploying AI-powered technologies, with goals such as improving worker productivity and enhancing the customer experience that are not only laudable, but achievable,” Narrative Science concluded. “A focus on realistic deployment timeframes and accurately measuring the effectiveness and [return on investment] of AI is critical to keeping the current momentum around the technology moving forward.”

Meanwhile, the Red Hat (NYSE: RHT) survey also found an uptick in AI deployments, with 30 percent of respondents planning to implement AI for “field service workers” this year. Other applications include predictive analytics, machine learning and robotics.

While issues such as securing data access and a lack of standards persist, Red Hat found that field workers are “now at the forefront of digital transformation where artificial intelligence, smart mobile devices, the Internet of Things (IoT) and business process management technologies have created new opportunities to better streamline and transform traditional workflows and workforce management practices.”

A predicted 25 percent increase in AI investment through November 2018 is seen transforming field service operations, Red Hat noted in a blog posted on Thursday (Jan. 18). Early movers cited increase field worker productivity (46 percent), streamlining field operations (40 percent) and improving customer service (37 percent) as the top business factors for investing in AI.

Along with a lack of standards, respondents said deployment challenges include keeping pace with technological change and integrating AI deployments with legacy systems. The survey notes that industry groups are focusing on standards and interoperability among IoT devices along with data security while improving integration technologies.

Earlier vendor surveys also have identified barriers to implementation ranging from a lack of IT infrastructure suited to AI applications to a lack AI expertise. For instance, a survey released last fall by data analytics vendor Teradata Corp. (NYSE: TDC) found that 30 percent of those it polled said greater investments would be required to expand AI deployments.

Despite the promise and pitfalls of AI—ranging from freeing workers from drudgery to displacing those same workers—early AI deployments appear to underscore the reality that the technology remains a solution in search of a problem.

Recent items:

AI Seen Better Suited to IoT Than Big Data

AI Adopters Confront Barriers to ROI

The post AI Investment Up, ROI Remains Iffy appeared first on Datanami.

Read more here:: www.datanami.com/feed/

RFC 2401 – Security Architecture for the Internet Protocol

 
Network Working Group                                            S. Kent
Request for Comments: 2401 BBN Corp
Obsoletes: 1825 R. Atkinson
Category: Standards Track @Home Network
November 1998

Security Architecture for the Internet Protocol

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1998). All Rights Reserved.

Table of Contents

1. Introduction........................................................3
1.1 Summary of Contents of Document..................................3
1.2 Audience.........................................................3
1.3 Related Documents................................................4
2. Design Objectives...................................................4
2.1 Goals/Objectives/Requirements/Problem Description................4
2.2 Caveats and Assumptions..........................................5
3. System Overview.....................................................5
3.1 What IPsec Does..................................................6
3.2 How IPsec Works..................................................6
3.3 Where IPsec May Be Implemented...................................7
4. Security Associations...............................................8
4.1 Definition and Scope.............................................8
4.2 Security Association Functionality..............................10
4.3 Combining Security Associations.................................11
4.4 Security Association Databases..................................13
4.4.1 The Security Policy Database (SPD).........................14
4.4.2 Selectors..................................................17
4.4.3 Security Association Database (SAD)........................21
4.5 Basic Combinations of Security Associations.....................24
4.6 SA and Key Management...........................................26
4.6.1 Manual Techniques..........................................27
4.6.2 Automated SA and Key Management............................27
4.6.3 Locating a Security Gateway................................28
4.7 Security Associations and Multicast.............................29

5. IP Traffic Processing..............................................30
5.1 Outbound IP Traffic Processing..................................30
5.1.1 Selecting and Using an SA or SA Bundle.....................30
5.1.2 Header Construction for Tunnel Mode........................31
5.1.2.1 IPv4 -- Header Construction for Tunnel Mode...........31
5.1.2.2 IPv6 -- Header Construction for Tunnel Mode...........32
5.2 Processing Inbound IP Traffic...................................33
5.2.1 Selecting and Using an SA or SA Bundle.....................33
5.2.2 Handling of AH and ESP tunnels.............................34
6. ICMP Processing (relevant to IPsec)................................35
6.1 PMTU/DF Processing..............................................36
6.1.1 DF Bit.....................................................36
6.1.2 Path MTU Discovery (PMTU)..................................36
6.1.2.1 Propagation of PMTU...................................36
6.1.2.2 Calculation of PMTU...................................37
6.1.2.3 Granularity of PMTU Processing........................37
6.1.2.4 PMTU Aging............................................38
7. Auditing...........................................................39
8. Use in Systems Supporting Information Flow Security................39
8.1 Relationship Between Security Associations and Data Sensitivity.40
8.2 Sensitivity Consistency Checking................................40
8.3 Additional MLS Attributes for Security Association Databases....41
8.4 Additional Inbound Processing Steps for MLS Networking..........41
8.5 Additional Outbound Processing Steps for MLS Networking.........41
8.6 Additional MLS Processing for Security Gateways.................42
9. Performance Issues.................................................42
10. Conformance Requirements..........................................43
11. Security Considerations...........................................43
12. Differences from RFC 1825.........................................43
Acknowledgements......................................................44
Appendix A -- Glossary................................................45
Appendix B -- Analysis/Discussion of PMTU/DF/Fragmentation Issues.....48
B.1 DF bit..........................................................48
B.2 Fragmentation...................................................48
B.3 Path MTU Discovery..............................................52
B.3.1 Identifying the Originating Host(s)........................53
B.3.2 Calculation of PMTU........................................55
B.3.3 Granularity of Maintaining PMTU Data.......................56
B.3.4 Per Socket Maintenance of PMTU Data........................57
B.3.5 Delivery of PMTU Data to the Transport Layer...............57
B.3.6 Aging of PMTU Data.........................................57
Appendix C -- Sequence Space Window Code Example......................58
Appendix D -- Categorization of ICMP messages.........................60
References............................................................63
Disclaimer............................................................64
Author Information....................................................65
Full Copyright Statement..............................................66

1. Introduction

1.1 Summary of Contents of Document

This memo specifies the base architecture for IPsec compliant
systems. The goal of the architecture is to provide various security
services for traffic at the IP layer, in both the IPv4 and IPv6
environments. This document describes the goals of such systems,
their components and how they fit together with each other and into
the IP environment. It also describes the security services offered
by the IPsec protocols, and how these services can be employed in the
IP environment. This document does not address all aspects of IPsec
architecture. Subsequent documents will address additional
architectural details of a more advanced nature, e.g., use of IPsec
in NAT environments and more complete support for IP multicast. The
following fundamental components of the IPsec security architecture
are discussed in terms of their underlying, required functionality.
Additional RFCs (see Section 1.3 for pointers to other documents)
define the protocols in (a), (c), and (d).

a. Security Protocols -- Authentication Header (AH) and
Encapsulating Security Payload (ESP)
b. Security Associations -- what they are and how they work,
how they are managed, associated processing
c. Key Management -- manual and automatic (The Internet Key
Exchange (IKE))
d. Algorithms for authentication and encryption

This document is not an overall Security Architecture for the
Internet; it addresses security only at the IP layer, provided
through the use of a combination of cryptographic and protocol
security mechanisms.

The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
document, are to be interpreted as described in RFC 2119 [Bra97].

1.2 Audience

The target audience for this document includes implementers of this
IP security technology and others interested in gaining a general
background understanding of this system. In particular, prospective
users of this technology (end users or system administrators) are
part of the target audience. A glossary is provided as an appendix

to help fill in gaps in background/vocabulary. This document assumes
that the reader is familiar with the Internet Protocol, related
networking technology, and general security terms and concepts.

1.3 Related Documents

As mentioned above, other documents provide detailed definitions of
some of the components of IPsec and of their inter-relationship.
They include RFCs on the following topics:

a. "IP Security Document Roadmap" [TDG97] -- a document
providing guidelines for specifications describing encryption
and authentication algorithms used in this system.
b. security protocols -- RFCs describing the Authentication
Header (AH) [KA98a] and Encapsulating Security Payload (ESP)
[KA98b] protocols.
c. algorithms for authentication and encryption -- a separate
RFC for each algorithm.
d. automatic key management -- RFCs on "The Internet Key
Exchange (IKE)" [HC98], "Internet Security Association and
Key Management Protocol (ISAKMP)" [MSST97],"The OAKLEY Key
Determination Protocol" [Orm97], and "The Internet IP
Security Domain of Interpretation for ISAKMP" [Pip98].

2. Design Objectives

2.1 Goals/Objectives/Requirements/Problem Description

IPsec is designed to provide interoperable, high quality,
cryptographically-based security for IPv4 and IPv6. The set of
security services offered includes access control, connectionless
integrity, data origin authentication, protection against replays (a
form of partial sequence integrity), confidentiality (encryption),
and limited traffic flow confidentiality. These services are
provided at the IP layer, offering protection for IP and/or upper
layer protocols.

These objectives are met through the use of two traffic security
protocols, the Authentication Header (AH) and the Encapsulating
Security Payload (ESP), and through the use of cryptographic key
management procedures and protocols. The set of IPsec protocols
employed in any context, and the ways in which they are employed,
will be determined by the security and system requirements of users,
applications, and/or sites/organizations.

When these mechanisms are correctly implemented and deployed, they
ought not to adversely affect users, hosts, and other Internet
components that do not employ these security mechanisms for

protection of their traffic. These mechanisms also are designed to
be algorithm-independent. This modularity permits selection of
different sets of algorithms without affecting the other parts of the
implementation. For example, different user communities may select
different sets of algorithms (creating cliques) if required.

A standard set of default algorithms is specified to facilitate
interoperability in the global Internet. The use of these
algorithms, in conjunction with IPsec traffic protection and key
management protocols, is intended to permit system and application
developers to deploy high quality, Internet layer, cryptographic
security technology.

2.2 Caveats and Assumptions

The suite of IPsec protocols and associated default algorithms are
designed to provide high quality security for Internet traffic.
However, the security offered by use of these protocols ultimately
depends on the quality of the their implementation, which is outside
the scope of this set of standards. Moreover, the security of a
computer system or network is a function of many factors, including
personnel, physical, procedural, compromising emanations, and
computer security practices. Thus IPsec is only one part of an
overall system security architecture.

Finally, the security afforded by the use of IPsec is critically
dependent on many aspects of the operating environment in which the
IPsec implementation executes. For example, defects in OS security,
poor quality of random number sources, sloppy system management
protocols and practices, etc. can all degrade the security provided
by IPsec. As above, none of these environmental attributes are
within the scope of this or other IPsec standards.

3. System Overview

This section provides a high level description of how IPsec works,
the components of the system, and how they fit together to provide
the security services noted above. The goal of this description is
to enable the reader to "picture" the overall process/system, see how
it fits into the IP environment, and to provide context for later
sections of this document, which describe each of the components in
more detail.

An IPsec implementation operates in a host or a security gateway
environment, affording protection to IP traffic. The protection
offered is based on requirements defined by a Security Policy
Database (SPD) established and maintained by a user or system
administrator, or by an application operating within constraints

established by either of the above. In general, packets are selected
for one of three processing modes based on IP and transport layer
header information (Selectors, Section 4.4.2) matched against entries
in the database (SPD). Each packet is either afforded IPsec security
services, discarded, or allowed to bypass IPsec, based on the
applicable database policies identified by the Selectors.

3.1 What IPsec Does

IPsec provides security services at the IP layer by enabling a system
to select required security protocols, determine the algorithm(s) to
use for the service(s), and put in place any cryptographic keys
required to provide the requested services. IPsec can be used to
protect one or more "paths" between a pair of hosts, between a pair
of security gateways, or between a security gateway and a host. (The
term "security gateway" is used throughout the IPsec documents to
refer to an intermediate system that implements IPsec protocols. For
example, a router or a firewall implementing IPsec is a security
gateway.)

The set of security services that IPsec can provide includes access
control, connectionless integrity, data origin authentication,
rejection of replayed packets (a form of partial sequence integrity),
confidentiality (encryption), and limited traffic flow
confidentiality. Because these services are provided at the IP
layer, they can be used by any higher layer protocol, e.g., TCP, UDP,
ICMP, BGP, etc.

The IPsec DOI also supports negotiation of IP compression [SMPT98],
motivated in part by the observation that when encryption is employed
within IPsec, it prevents effective compression by lower protocol
layers.

3.2 How IPsec Works

IPsec uses two protocols to provide traffic security --
Authentication Header (AH) and Encapsulating Security Payload (ESP).
Both protocols are described in more detail in their respective RFCs
[KA98a, KA98b].

o The IP Authentication Header (AH) [KA98a] provides
connectionless integrity, data origin authentication, and an
optional anti-replay service.
o The Encapsulating Security Payload (ESP) protocol [KA98b] may
provide confidentiality (encryption), and limited traffic flow
confidentiality. It also may provide connectionless

integrity, data origin authentication, and an anti-replay
service. (One or the other set of these security services
must be applied whenever ESP is invoked.)
o Both AH and ESP are vehicles for access control, based on the
distribution of cryptographic keys and the management of
traffic flows relative to these security protocols.

These protocols may be applied alone or in combination with each
other to provide a desired set of security services in IPv4 and IPv6.
Each protocol supports two modes of use: transport mode and tunnel
mode. In transport mode the protocols provide protection primarily
for upper layer protocols; in tunnel mode, the protocols are applied
to tunneled IP packets. The differences between the two modes are
discussed in Section 4.

IPsec allows the user (or system administrator) to control the
granularity at which a security service is offered. For example, one
can create a single encrypted tunnel to carry all the traffic between
two security gateways or a separate encrypted tunnel can be created
for each TCP connection between each pair of hosts communicating
across these gateways. IPsec management must incorporate facilities
for specifying:

o which security services to use and in what combinations
o the granularity at which a given security protection should be
applied
o the algorithms used to effect cryptographic-based security

Because these security services use shared secret values
(cryptographic keys), IPsec relies on a separate set of mechanisms
for putting these keys in place. (The keys are used for
authentication/integrity and encryption services.) This document
requires support for both manual and automatic distribution of keys.
It specifies a specific public-key based approach (IKE -- [MSST97,
Orm97, HC98]) for automatic key management, but other automated key
distribution techniques MAY be used. For example, KDC-based systems
such as Kerberos and other public-key systems such as SKIP could be
employed.

3.3 Where IPsec May Be Implemented

There are several ways in which IPsec may be implemented in a host or
in conjunction with a router or firewall (to create a security
gateway). Several common examples are provided below:

a. Integration of IPsec into the native IP implementation. This
requires access to the IP source code and is applicable to
both hosts and security gateways.

b. "Bump-in-the-stack" (BITS) implementations, where IPsec is
implemented "underneath" an existing implementation of an IP
protocol stack, between the native IP and the local network
drivers. Source code access for the IP stack is not required
in this context, making this implementation approach
appropriate for use with legacy systems. This approach, when
it is adopted, is usually employed in hosts.

c. The use of an outboard crypto processor is a common design
feature of network security systems used by the military, and
of some commercial systems as well. It is sometimes referred
to as a "Bump-in-the-wire" (BITW) implementation. Such
implementations may be designed to serve either a host or a
gateway (or both). Usually the BITW device is IP
addressable. When supporting a single host, it may be quite
analogous to a BITS implementation, but in supporting a
router or firewall, it must operate like a security gateway.

4. Security Associations

This section defines Security Association management requirements for
all IPv6 implementations and for those IPv4 implementations that
implement AH, ESP, or both. The concept of a "Security Association"
(SA) is fundamental to IPsec. Both AH and ESP make use of SAs and a
major function of IKE is the establishment and maintenance of
Security Associations. All implementations of AH or ESP MUST support
the concept of a Security Association as described below. The
remainder of this section describes various aspects of Security
Association management, defining required characteristics for SA
policy management, traffic processing, and SA management techniques.

4.1 Definition and Scope

A Security Association (SA) is a simplex "connection" that affords
security services to the traffic carried by it. Security services
are afforded to an SA by the use of AH, or ESP, but not both. If
both AH and ESP protection is applied to a traffic stream, then two
(or more) SAs are created to afford protection to the traffic stream.
To secure typical, bi-directional communication between two hosts, or
between two security gateways, two Security Associations (one in each
direction) are required.

A security association is uniquely identified by a triple consisting
of a Security Parameter Index (SPI), an IP Destination Address, and a
security protocol (AH or ESP) identifier. In principle, the
Destination Address may be a unicast address, an IP broadcast
address, or a multicast group address. However, IPsec SA management
mechanisms currently are defined only for unicast SAs. Hence, in the

discussions that follow, SAs will be described in the context of
point-to-point communication, even though the concept is applicable
in the point-to-multipoint case as well.

As noted above, two types of SAs are defined: transport mode and
tunnel mode. A transport mode SA is a security association between
two hosts. In IPv4, a transport mode security protocol header
appears immediately after the IP header and any options, and before
any higher layer protocols (e.g., TCP or UDP). In IPv6, the security
protocol header appears after the base IP header and extensions, but
may appear before or after destination options, and before higher
layer protocols. In the case of ESP, a transport mode SA provides
security services only for these higher layer protocols, not for the
IP header or any extension headers preceding the ESP header. In the
case of AH, the protection is also extended to selected portions of
the IP header, selected portions of extension headers, and selected
options (contained in the IPv4 header, IPv6 Hop-by-Hop extension
header, or IPv6 Destination extension headers). For more details on
the coverage afforded by AH, see the AH specification [KA98a].

A tunnel mode SA is essentially an SA applied to an IP tunnel.
Whenever either end of a security association is a security gateway,
the SA MUST be tunnel mode. Thus an SA between two security gateways
is always a tunnel mode SA, as is an SA between a host and a security
gateway. Note that for the case where traffic is destined for a
security gateway, e.g., SNMP commands, the security gateway is acting
as a host and transport mode is allowed. But in that case, the
security gateway is not acting as a gateway, i.e., not transiting
traffic. Two hosts MAY establish a tunnel mode SA between
themselves. The requirement for any (transit traffic) SA involving a
security gateway to be a tunnel SA arises due to the need to avoid
potential problems with regard to fragmentation and reassembly of
IPsec packets, and in circumstances where multiple paths (e.g., via
different security gateways) exist to the same destination behind the
security gateways.

For a tunnel mode SA, there is an "outer" IP header that specifies
the IPsec processing destination, plus an "inner" IP header that
specifies the (apparently) ultimate destination for the packet. The
security protocol header appears after the outer IP header, and
before the inner IP header. If AH is employed in tunnel mode,
portions of the outer IP header are afforded protection (as above),
as well as all of the tunneled IP packet (i.e., all of the inner IP
header is protected, as well as higher layer protocols). If ESP is
employed, the protection is afforded only to the tunneled packet, not
to the outer header.

In summary,
a) A host MUST support both transport and tunnel mode.
b) A security gateway is required to support only tunnel
mode. If it supports transport mode, that should be used
only when the security gateway is acting as a host, e.g.,
for network management.

4.2 Security Association Functionality

The set of security services offered by an SA depends on the security
protocol selected, the SA mode, the endpoints of the SA, and on the
election of optional services within the protocol. For example, AH
provides data origin authentication and connectionless integrity for
IP datagrams (hereafter referred to as just "authentication"). The
"precision" of the authentication service is a function of the
granularity of the security association with which AH is employed, as
discussed in Section 4.4.2, "Selectors".

AH also offers an anti-replay (partial sequence integrity) service at
the discretion of the receiver, to help counter denial of service
attacks. AH is an appropriate protocol to employ when
confidentiality is not required (or is not permitted, e.g , due to
government restrictions on use of encryption). AH also provides
authentication for selected portions of the IP header, which may be
necessary in some contexts. For example, if the integrity of an IPv4
option or IPv6 extension header must be protected en route between
sender and receiver, AH can provide this service (except for the
non-predictable but mutable parts of the IP header.)

ESP optionally provides confidentiality for traffic. (The strength
of the confidentiality service depends in part, on the encryption
algorithm employed.) ESP also may optionally provide authentication
(as defined above). If authentication is negotiated for an ESP SA,
the receiver also may elect to enforce an anti-replay service with
the same features as the AH anti-replay service. The scope of the
authentication offered by ESP is narrower than for AH, i.e., the IP
header(s) "outside" the ESP header is(are) not protected. If only
the upper layer protocols need to be authenticated, then ESP
authentication is an appropriate choice and is more space efficient
than use of AH encapsulating ESP. Note that although both
confidentiality and authentication are optional, they cannot both be
omitted. At least one of them MUST be selected.

If confidentiality service is selected, then an ESP (tunnel mode) SA
between two security gateways can offer partial traffic flow
confidentiality. The use of tunnel mode allows the inner IP headers
to be encrypted, concealing the identities of the (ultimate) traffic
source and destination. Moreover, ESP payload padding also can be

invoked to hide the size of the packets, further concealing the
external characteristics of the traffic. Similar traffic flow
confidentiality services may be offered when a mobile user is
assigned a dynamic IP address in a dialup context, and establishes a
(tunnel mode) ESP SA to a corporate firewall (acting as a security
gateway). Note that fine granularity SAs generally are more
vulnerable to traffic analysis than coarse granularity ones which are
carrying traffic from many subscribers.

4.3 Combining Security Associations

The IP datagrams transmitted over an individual SA are afforded
protection by exactly one security protocol, either AH or ESP, but
not both. Sometimes a security policy may call for a combination of
services for a particular traffic flow that is not achievable with a
single SA. In such instances it will be necessary to employ multiple
SAs to implement the required security policy. The term "security
association bundle" or "SA bundle" is applied to a sequence of SAs
through which traffic must be processed to satisfy a security policy.
The order of the sequence is defined by the policy. (Note that the
SAs that comprise a bundle may terminate at different endpoints. For
example, one SA may extend between a mobile host and a security
gateway and a second, nested SA may extend to a host behind the
gateway.)

Security associations may be combined into bundles in two ways:
transport adjacency and iterated tunneling.

o Transport adjacency refers to applying more than one
security protocol to the same IP datagram, without invoking
tunneling. This approach to combining AH and ESP allows
for only one level of combination; further nesting yields
no added benefit (assuming use of adequately strong
algorithms in each protocol) since the processing is
performed at one IPsec instance at the (ultimate)
destination.

Host 1 --- Security ---- Internet -- Security --- Host 2
| | Gwy 1 Gwy 2 | |
| | | |
| -----Security Association 1 (ESP transport)------- |
| |
-------Security Association 2 (AH transport)----------

o Iterated tunneling refers to the application of multiple
layers of security protocols effected through IP tunneling.
This approach allows for multiple levels of nesting, since
each tunnel can originate or terminate at a different IPsec

site along the path. No special treatment is expected for
ISAKMP traffic at intermediate security gateways other than
what can be specified through appropriate SPD entries (See
Case 3 in Section 4.5)

There are 3 basic cases of iterated tunneling -- support is
required only for cases 2 and 3.:

1. both endpoints for the SAs are the same -- The inner and
outer tunnels could each be either AH or ESP, though it
is unlikely that Host 1 would specify both to be the
same, i.e., AH inside of AH or ESP inside of ESP.

Host 1 --- Security ---- Internet -- Security --- Host 2
| | Gwy 1 Gwy 2 | |
| | | |
| -------Security Association 1 (tunnel)---------- | |
| |
---------Security Association 2 (tunnel)--------------

2. one endpoint of the SAs is the same -- The inner and
uter tunnels could each be either AH or ESP.

Host 1 --- Security ---- Internet -- Security --- Host 2
| | Gwy 1 Gwy 2 |
| | | |
| ----Security Association 1 (tunnel)---- |
| |
---------Security Association 2 (tunnel)-------------

3. neither endpoint is the same -- The inner and outer
tunnels could each be either AH or ESP.

Host 1 --- Security ---- Internet -- Security --- Host 2
| Gwy 1 Gwy 2 |
| | | |
| --Security Assoc 1 (tunnel)- |
| |
-----------Security Association 2 (tunnel)-----------

These two approaches also can be combined, e.g., an SA bundle could
be constructed from one tunnel mode SA and one or two transport mode
SAs, applied in sequence. (See Section 4.5 "Basic Combinations of
Security Associations.") Note that nested tunnels can also occur
where neither the source nor the destination endpoints of any of the
tunnels are the same. In that case, there would be no host or
security gateway with a bundle corresponding to the nested tunnels.

For transport mode SAs, only one ordering of security protocols seems
appropriate. AH is applied to both the upper layer protocols and
(parts of) the IP header. Thus if AH is used in a transport mode, in
conjunction with ESP, AH SHOULD appear as the first header after IP,
prior to the appearance of ESP. In that context, AH is applied to
the ciphertext output of ESP. In contrast, for tunnel mode SAs, one
can imagine uses for various orderings of AH and ESP. The required
set of SA bundle types that MUST be supported by a compliant IPsec
implementation is described in Section 4.5.

4.4 Security Association Databases

Many of the details associated with processing IP traffic in an IPsec
implementation are largely a local matter, not subject to
standardization. However, some external aspects of the processing
must be standardized, to ensure interoperability and to provide a
minimum management capability that is essential for productive use of
IPsec. This section describes a general model for processing IP
traffic relative to security associations, in support of these
interoperability and functionality goals. The model described below
is nominal; compliant implementations need not match details of this
model as presented, but the external behavior of such implementations
must be mappable to the externally observable characteristics of this
model.

There are two nominal databases in this model: the Security Policy
Database and the Security Association Database. The former specifies
the policies that determine the disposition of all IP traffic inbound
or outbound from a host, security gateway, or BITS or BITW IPsec
implementation. The latter database contains parameters that are
associated with each (active) security association. This section
also defines the concept of a Selector, a set of IP and upper layer
protocol field values that is used by the Security Policy Database to
map traffic to a policy, i.e., an SA (or SA bundle).

Each interface for which IPsec is enabled requires nominally separate
inbound vs. outbound databases (SAD and SPD), because of the
directionality of many of the fields that are used as selectors.
Typically there is just one such interface, for a host or security
gateway (SG). Note that an SG would always have at least 2
interfaces, but the "internal" one to the corporate net, usually
would not have IPsec enabled and so only one pair of SADs and one
pair of SPDs would be needed. On the other hand, if a host had
multiple interfaces or an SG had multiple external interfaces, it
might be necessary to have separate SAD and SPD pairs for each
interface.

4.4.1 The Security Policy Database (SPD)

Ultimately, a security association is a management construct used to
enforce a security policy in the IPsec environment. Thus an
essential element of SA processing is an underlying Security Policy
Database (SPD) that specifies what services are to be offered to IP
datagrams and in what fashion. The form of the database and its
interface are outside the scope of this specification. However, this
section does specify certain minimum management functionality that
must be provided, to allow a user or system administrator to control
how IPsec is applied to traffic transmitted or received by a host or
transiting a security gateway.

The SPD must be consulted during the processing of all traffic
(INBOUND and OUTBOUND), including non-IPsec traffic. In order to
support this, the SPD requires distinct entries for inbound and
outbound traffic. One can think of this as separate SPDs (inbound
vs. outbound). In addition, a nominally separate SPD must be
provided for each IPsec-enabled interface.

An SPD must discriminate among traffic that is afforded IPsec
protection and traffic that is allowed to bypass IPsec. This applies
to the IPsec protection to be applied by a sender and to the IPsec
protection that must be present at the receiver. For any outbound or
inbound datagram, three processing choices are possible: discard,
bypass IPsec, or apply IPsec. The first choice refers to traffic
that is not allowed to exit the host, traverse the security gateway,
or be delivered to an application at all. The second choice refers
to traffic that is allowed to pass without additional IPsec
protection. The third choice refers to traffic that is afforded
IPsec protection, and for such traffic the SPD must specify the
security services to be provided, protocols to be employed,
algorithms to be used, etc.

For every IPsec implementation, there MUST be an administrative
interface that allows a user or system administrator to manage the
SPD. Specifically, every inbound or outbound packet is subject to
processing by IPsec and the SPD must specify what action will be
taken in each case. Thus the administrative interface must allow the
user (or system administrator) to specify the security processing to
be applied to any packet entering or exiting the system, on a packet
by packet basis. (In a host IPsec implementation making use of a
socket interface, the SPD may not need to be consulted on a per
packet basis, but the effect is still the same.) The management
interface for the SPD MUST allow creation of entries consistent with
the selectors defined in Section 4.4.2, and MUST support (total)
ordering of these entries. It is expected that through the use of
wildcards in various selector fields, and because all packets on a

single UDP or TCP connection will tend to match a single SPD entry,
this requirement will not impose an unreasonably detailed level of
SPD specification. The selectors are analogous to what are found in
a stateless firewall or filtering router and which are currently
manageable this way.

In host systems, applications MAY be allowed to select what security
processing is to be applied to the traffic they generate and consume.
(Means of signalling such requests to the IPsec implementation are
outside the scope of this standard.) However, the system
administrator MUST be able to specify whether or not a user or
application can override (default) system policies. Note that
application specified policies may satisfy system requirements, so
that the system may not need to do additional IPsec processing beyond
that needed to meet an application's requirements. The form of the
management interface is not specified by this document and may differ
for hosts vs. security gateways, and within hosts the interface may
differ for socket-based vs. BITS implementations. However, this
document does specify a standard set of SPD elements that all IPsec
implementations MUST support.

The SPD contains an ordered list of policy entries. Each policy
entry is keyed by one or more selectors that define the set of IP
traffic encompassed by this policy entry. (The required selector
types are defined in Section 4.4.2.) These define the granularity of
policies or SAs. Each entry includes an indication of whether
traffic matching this policy will be bypassed, discarded, or subject
to IPsec processing. If IPsec processing is to be applied, the entry
includes an SA (or SA bundle) specification, listing the IPsec
protocols, modes, and algorithms to be employed, including any
nesting requirements. For example, an entry may call for all
matching traffic to be protected by ESP in transport mode using
3DES-CBC with an explicit IV, nested inside of AH in tunnel mode
using HMAC/SHA-1. For each selector, the policy entry specifies how
to derive the corresponding values for a new Security Association
Database (SAD, see Section 4.4.3) entry from those in the SPD and the
packet (Note that at present, ranges are only supported for IP
addresses; but wildcarding can be expressed for all selectors):

a. use the value in the packet itself -- This will limit use
of the SA to those packets which have this packet's value
for the selector even if the selector for the policy entry
has a range of allowed values or a wildcard for this
selector.
b. use the value associated with the policy entry -- If this
were to be just a single value, then there would be no
difference between (b) and (a). However, if the allowed
values for the selector are a range (for IP addresses) or

wildcard, then in the case of a range,(b) would enable use
of the SA by any packet with a selector value within the
range not just by packets with the selector value of the
packet that triggered the creation of the SA. In the case
of a wildcard, (b) would allow use of the SA by packets
with any value for this selector.

For example, suppose there is an SPD entry where the allowed value
for source address is any of a range of hosts (192.168.2.1 to
192.168.2.10). And suppose that a packet is to be sent that has a
source address of 192.168.2.3. The value to be used for the SA could
be any of the sample values below depending on what the policy entry
for this selector says is the source of the selector value:

source for the example of
value to be new SAD
used in the SA selector value
--------------- ------------
a. packet 192.168.2.3 (one host)
b. SPD entry 192.168.2.1 to 192.168.2.10 (range of hosts)

Note that if the SPD entry had an allowed value of wildcard for the
source address, then the SAD selector value could be wildcard (any
host). Case (a) can be used to prohibit sharing, even among packets
that match the same SPD entry.

As described below in Section 4.4.3, selectors may include "wildcard"
entries and hence the selectors for two entries may overlap. (This
is analogous to the overlap that arises with ACLs or filter entries
in routers or packet filtering firewalls.) Thus, to ensure
consistent, predictable processing, SPD entries MUST be ordered and
the SPD MUST always be searched in the same order, so that the first
matching entry is consistently selected. (This requirement is
necessary as the effect of processing traffic against SPD entries
must be deterministic, but there is no way to canonicalize SPD
entries given the use of wildcards for some selectors.) More detail
on matching of packets against SPD entries is provided in Section 5.

Note that if ESP is specified, either (but not both) authentication
or encryption can be omitted. So it MUST be possible to configure
the SPD value for the authentication or encryption algorithms to be
"NULL". However, at least one of these services MUST be selected,
i.e., it MUST NOT be possible to configure both of them as "NULL".

The SPD can be used to map traffic to specific SAs or SA bundles.
Thus it can function both as the reference database for security
policy and as the map to existing SAs (or SA bundles). (To
accommodate the bypass and discard policies cited above, the SPD also

MUST provide a means of mapping traffic to these functions, even
though they are not, per se, IPsec processing.) The way in which the
SPD operates is different for inbound vs. outbound traffic and it
also may differ for host vs. security gateway, BITS, and BITW
implementations. Sections 5.1 and 5.2 describe the use of the SPD
for outbound and inbound processing, respectively.

Because a security policy may require that more than one SA be
applied to a specified set of traffic, in a specific order, the
policy entry in the SPD must preserve these ordering requirements,
when present. Thus, it must be possible for an IPsec implementation
to determine that an outbound or inbound packet must be processed
thorough a sequence of SAs. Conceptually, for outbound processing,
one might imagine links (to the SAD) from an SPD entry for which
there are active SAs, and each entry would consist of either a single
SA or an ordered list of SAs that comprise an SA bundle. When a
packet is matched against an SPD entry and there is an existing SA or
SA bundle that can be used to carry the traffic, the processing of
the packet is controlled by the SA or SA bundle entry on the list.
For an inbound IPsec packet for which multiple IPsec SAs are to be
applied, the lookup based on destination address, IPsec protocol, and
SPI should identify a single SA.

The SPD is used to control the flow of ALL traffic through an IPsec
system, including security and key management traffic (e.g., ISAKMP)
from/to entities behind a security gateway. This means that ISAKMP
traffic must be explicitly accounted for in the SPD, else it will be
discarded. Note that a security gateway could prohibit traversal of
encrypted packets in various ways, e.g., having a DISCARD entry in
the SPD for ESP packets or providing proxy key exchange. In the
latter case, the traffic would be internally routed to the key
management module in the security gateway.

4.4.2 Selectors

An SA (or SA bundle) may be fine-grained or coarse-grained, depending
on the selectors used to define the set of traffic for the SA. For
example, all traffic between two hosts may be carried via a single
SA, and afforded a uniform set of security services. Alternatively,
traffic between a pair of hosts might be spread over multiple SAs,
depending on the applications being used (as defined by the Next
Protocol and Port fields), with different security services offered
by different SAs. Similarly, all traffic between a pair of security
gateways could be carried on a single SA, or one SA could be assigned
for each communicating host pair. The following selector parameters
MUST be supported for SA management to facilitate control of SA
granularity. Note that in the case of receipt of a packet with an
ESP header, e.g., at an encapsulating security gateway or BITW

implementation, the transport layer protocol, source/destination
ports, and Name (if present) may be "OPAQUE", i.e., inaccessible
because of encryption or fragmentation. Note also that both Source
and Destination addresses should either be IPv4 or IPv6.

- Destination IP Address (IPv4 or IPv6): this may be a single IP
address (unicast, anycast, broadcast (IPv4 only), or multicast
group), a range of addresses (high and low values (inclusive),
address + mask, or a wildcard address. The last three are used
to support more than one destination system sharing the same SA
(e.g., behind a security gateway). Note that this selector is
conceptually different from the "Destination IP Address" field
in the <Destination IP Address, IPsec Protocol, SPI> tuple used
to uniquely identify an SA. When a tunneled packet arrives at
the tunnel endpoint, its SPI/Destination address/Protocol are
used to look up the SA for this packet in the SAD. This
destination address comes from the encapsulating IP header.
Once the packet has been processed according to the tunnel SA
and has come out of the tunnel, its selectors are "looked up" in
the Inbound SPD. The Inbound SPD has a selector called
destination address. This IP destination address is the one in
the inner (encapsulated) IP header. In the case of a
transport'd packet, there will be only one IP header and this
ambiguity does not exist. [REQUIRED for all implementations]

- Source IP Address(es) (IPv4 or IPv6): this may be a single IP
address (unicast, anycast, broadcast (IPv4 only), or multicast
group), range of addresses (high and low values inclusive),
address + mask, or a wildcard address. The last three are used
to support more than one source system sharing the same SA
(e.g., behind a security gateway or in a multihomed host).
[REQUIRED for all implementations]

- Name: There are 2 cases (Note that these name forms are
supported in the IPsec DOI.)
1. User ID
a. a fully qualified user name string (DNS), e.g.,
mozart@foo.bar.com
b. X.500 distinguished name, e.g., C = US, SP = MA,
O = GTE Internetworking, CN = Stephen T. Kent.
2. System name (host, security gateway, etc.)
a. a fully qualified DNS name, e.g., foo.bar.com
b. X.500 distinguished name
c. X.500 general name

NOTE: One of the possible values of this selector is "OPAQUE".

[REQUIRED for the following cases. Note that support for name
forms other than addresses is not required for manually keyed
SAs.
o User ID
- native host implementations
- BITW and BITS implementations acting as HOSTS
with only one user
- security gateway implementations for INBOUND
processing.
o System names -- all implementations]

- Data sensitivity level: (IPSO/CIPSO labels)
[REQUIRED for all systems providing information flow security as
per Section 8, OPTIONAL for all other systems.]

- Transport Layer Protocol: Obtained from the IPv4 "Protocol" or
the IPv6 "Next Header" fields. This may be an individual
protocol number. These packet fields may not contain the
Transport Protocol due to the presence of IP extension headers,
e.g., a Routing Header, AH, ESP, Fragmentation Header,
Destination Options, Hop-by-hop options, etc. Note that the
Transport Protocol may not be available in the case of receipt
of a packet with an ESP header, thus a value of "OPAQUE" SHOULD
be supported.
[REQUIRED for all implementations]

NOTE: To locate the transport protocol, a system has to chain
through the packet headers checking the "Protocol" or "Next
Header" field until it encounters either one it recognizes as a
transport protocol, or until it reaches one that isn't on its
list of extension headers, or until it encounters an ESP header
that renders the transport protocol opaque.

- Source and Destination (e.g., TCP/UDP) Ports: These may be
individual UDP or TCP port values or a wildcard port. (The use
of the Next Protocol field and the Source and/or Destination
Port fields (in conjunction with the Source and/or Destination
Address fields), as an SA selector is sometimes referred to as
"session-oriented keying."). Note that the source and
destination ports may not be available in the case of receipt of
a packet with an ESP header, thus a value of "OPAQUE" SHOULD be
supported.

The following table summarizes the relationship between the
"Next Header" value in the packet and SPD and the derived Port
Selector value for the SPD and SAD.

Next Hdr Transport Layer Derived Port Selector Field
in Packet Protocol in SPD Value in SPD and SAD
-------- --------------- ---------------------------
ESP ESP or ANY ANY (i.e., don't look at it)
-don't care- ANY ANY (i.e., don't look at it)
specific value specific value NOT ANY (i.e., drop packet)
fragment
specific value specific value actual port selector field
not fragment

If the packet has been fragmented, then the port information may
not be available in the current fragment. If so, discard the
fragment. An ICMP PMTU should be sent for the first fragment,
which will have the port information. [MAY be supported]

The IPsec implementation context determines how selectors are used.
For example, a host implementation integrated into the stack may make
use of a socket interface. When a new connection is established the
SPD can be consulted and an SA (or SA bundle) bound to the socket.
Thus traffic sent via that socket need not result in additional
lookups to the SPD/SAD. In contrast, a BITS, BITW, or security
gateway implementation needs to look at each packet and perform an
SPD/SAD lookup based on the selectors. The allowable values for the
selector fields differ between the traffic flow, the security
association, and the security policy.

The following table summarizes the kinds of entries that one needs to
be able to express in the SPD and SAD. It shows how they relate to
the fields in data traffic being subjected to IPsec screening.
(Note: the "wild" or "wildcard" entry for src and dst addresses
includes a mask, range, etc.)

Field Traffic Value SAD Entry SPD Entry
-------- ------------- ---------------- --------------------
src addr single IP addr single,range,wild single,range,wildcard
dst addr single IP addr single,range,wild single,range,wildcard
xpt protocol* xpt protocol single,wildcard single,wildcard
src port* single src port single,wildcard single,wildcard
dst port* single dst port single,wildcard single,wildcard
user id* single user id single,wildcard single,wildcard
sec. labels single value single,wildcard single,wildcard

* The SAD and SPD entries for these fields could be "OPAQUE"
because the traffic value is encrypted.

NOTE: In principle, one could have selectors and/or selector values
in the SPD which cannot be negotiated for an SA or SA bundle.
Examples might include selector values used to select traffic for

discarding or enumerated lists which cause a separate SA to be
created for each item on the list. For now, this is left for future
versions of this document and the list of required selectors and
selector values is the same for the SPD and the SAD. However, it is
acceptable to have an administrative interface that supports use of
selector values which cannot be negotiated provided that it does not
mislead the user into believing it is creating an SA with these
selector values. For example, the interface may allow the user to
specify an enumerated list of values but would result in the creation
of a separate policy and SA for each item on the list. A vendor
might support such an interface to make it easier for its customers
to specify clear and concise policy specifications.

4.4.3 Security Association Database (SAD)

In each IPsec implementation there is a nominal Security Association
Database, in which each entry defines the parameters associated with
one SA. Each SA has an entry in the SAD. For outbound processing,
entries are pointed to by entries in the SPD. Note that if an SPD
entry does not currently point to an SA that is appropriate for the
packet, the implementation creates an appropriate SA (or SA Bundle)
and links the SPD entry to the SAD entry (see Section 5.1.1). For
inbound processing, each entry in the SAD is indexed by a destination
IP address, IPsec protocol type, and SPI. The following parameters
are associated with each entry in the SAD. This description does not
purport to be a MIB, but only a specification of the minimal data
items required to support an SA in an IPsec implementation.

For inbound processing: The following packet fields are used to look
up the SA in the SAD:

o Outer Header's Destination IP address: the IPv4 or IPv6
Destination address.
[REQUIRED for all implementations]
o IPsec Protocol: AH or ESP, used as an index for SA lookup
in this database. Specifies the IPsec protocol to be
applied to the traffic on this SA.
[REQUIRED for all implementations]
o SPI: the 32-bit value used to distinguish among different
SAs terminating at the same destination and using the same
IPsec protocol.
[REQUIRED for all implementations]

For each of the selectors defined in Section 4.4.2, the SA entry in
the SAD MUST contain the value or values which were negotiated at the
time the SA was created. For the sender, these values are used to
decide whether a given SA is appropriate for use with an outbound
packet. This is part of checking to see if there is an existing SA

that can be used. For the receiver, these values are used to check
that the selector values in an inbound packet match those for the SA
(and thus indirectly those for the matching policy). For the
receiver, this is part of verifying that the SA was appropriate for
this packet. (See Section 6 for rules for ICMP messages.) These
fields can have the form of specific values, ranges, wildcards, or
"OPAQUE" as described in section 4.4.2, "Selectors". Note that for
an ESP SA, the encryption algorithm or the authentication algorithm
could be "NULL". However they MUST not both be "NULL".

The following SAD fields are used in doing IPsec processing:

o Sequence Number Counter: a 32-bit value used to generate the
Sequence Number field in AH or ESP headers.
[REQUIRED for all implementations, but used only for outbound
traffic.]
o Sequence Counter Overflow: a flag indicating whether overflow
of the Sequence Number Counter should generate an auditable
event and prevent transmission of additional packets on the
SA.
[REQUIRED for all implementations, but used only for outbound
traffic.]
o Anti-Replay Window: a 32-bit counter and a bit-map (or
equivalent) used to determine whether an inbound AH or ESP
packet is a replay.
[REQUIRED for all implementations but used only for inbound
traffic. NOTE: If anti-replay has been disabled by the
receiver, e.g., in the case of a manually keyed SA, then the
Anti-Replay Window is not used.]
o AH Authentication algorithm, keys, etc.
[REQUIRED for AH implementations]
o ESP Encryption algorithm, keys, IV mode, IV, etc.
[REQUIRED for ESP implementations]
o ESP authentication algorithm, keys, etc. If the
authentication service is not selected, this field will be
null.
[REQUIRED for ESP implementations]
o Lifetime of this Security Association: a time interval after
which an SA must be replaced with a new SA (and new SPI) or
terminated, plus an indication of which of these actions
should occur. This may be expressed as a time or byte count,
or a simultaneous use of both, the first lifetime to expire
taking precedence. A compliant implementation MUST support
both types of lifetimes, and must support a simultaneous use
of both. If time is employed, and if IKE employs X.509
certificates for SA establishment, the SA lifetime must be
constrained by the validity intervals of the certificates,
and the NextIssueDate of the CRLs used in the IKE exchange

for the SA. Both initiator and responder are responsible for
constraining SA lifetime in this fashion.
[REQUIRED for all implementations]

NOTE: The details of how to handle the refreshing of keys
when SAs expire is a local matter. However, one reasonable
approach is:
(a) If byte count is used, then the implementation
SHOULD count the number of bytes to which the IPsec
algorithm is applied. For ESP, this is the encryption
algorithm (including Null encryption) and for AH,
this is the authentication algorithm. This includes
pad bytes, etc. Note that implementations SHOULD be
able to handle having the counters at the ends of an
SA get out of synch, e.g., because of packet loss or
because the implementations at each end of the SA
aren't doing things the same way.
(b) There SHOULD be two kinds of lifetime -- a soft
lifetime which warns the implementation to initiate
action such as setting up a replacement SA and a
hard lifetime when the current SA ends.
(c) If the entire packet does not get delivered during
the SAs lifetime, the packet SHOULD be discarded.

o IPsec protocol mode: tunnel, transport or wildcard.
Indicates which mode of AH or ESP is applied to traffic on
this SA. Note that if this field is "wildcard" at the
sending end of the SA, then the application has to specify
the mode to the IPsec implementation. This use of wildcard
allows the same SA to be used for either tunnel or transport
mode traffic on a per packet basis, e.g., by different
sockets. The receiver does not need to know the mode in
order to properly process the packet's IPsec headers.

[REQUIRED as follows, unless implicitly defined by context:
- host implementations must support all modes
- gateway implementations must support tunnel mode]

NOTE: The use of wildcard for the protocol mode of an inbound
SA may add complexity to the situation in the receiver (host
only). Since the packets on such an SA could be delivered in
either tunnel or transport mode, the security of an incoming
packet could depend in part on which mode had been used to
deliver it. If, as a result, an application cared about the
SA mode of a given packet, then the application would need a
mechanism to obtain this mode information.

o Path MTU: any observed path MTU and aging variables. See
Section 6.1.2.4
[REQUIRED for all implementations but used only for outbound
traffic]

4.5 Basic Combinations of Security Associations

This section describes four examples of combinations of security
associations that MUST be supported by compliant IPsec hosts or
security gateways. Additional combinations of AH and/or ESP in
tunnel and/or transport modes MAY be supported at the discretion of
the implementor. Compliant implementations MUST be capable of
generating these four combinations and on receipt, of processing
them, but SHOULD be able to receive and process any combination. The
diagrams and text below describe the basic cases. The legend for the
diagrams is:

==== = one or more security associations (AH or ESP, transport
or tunnel)
---- = connectivity (or if so labelled, administrative boundary)
Hx = host x
SGx = security gateway x
X* = X supports IPsec

NOTE: The security associations below can be either AH or ESP. The
mode (tunnel vs transport) is determined by the nature of the
endpoints. For host-to-host SAs, the mode can be either transport or
tunnel.

Case 1. The case of providing end-to-end security between 2 hosts
across the Internet (or an Intranet).

====================================
| |
H1* ------ (Inter/Intranet) ------ H2*

Note that either transport or tunnel mode can be selected by the
hosts. So the headers in a packet between H1 and H2 could look
like any of the following:

Transport Tunnel
----------------- ---------------------
1. [IP1][AH][upper] 4. [IP2][AH][IP1][upper]
2. [IP1][ESP][upper] 5. [IP2][ESP][IP1][upper]
3. [IP1][AH][ESP][upper]

Note that there is no requirement to support general nesting,
but in transport mode, both AH and ESP can be applied to the
packet. In this event, the SA establishment procedure MUST
ensure that first ESP, then AH are applied to the packet.

Case 2. This case illustrates simple virtual private networks
support.

===========================
| |
---------------------|---- ---|-----------------------
| | | | | |
| H1 -- (Local --- SG1* |--- (Internet) ---| SG2* --- (Local --- H2 |
| Intranet) | | Intranet) |
-------------------------- ---------------------------
admin. boundary admin. boundary

Only tunnel mode is required here. So the headers in a packet
between SG1 and SG2 could look like either of the following:

Tunnel
---------------------
4. [IP2][AH][IP1][upper]
5. [IP2][ESP][IP1][upper]

Case 3. This case combines cases 1 and 2, adding end-to-end security
between the sending and receiving hosts. It imposes no new
requirements on the hosts or security gateways, other than a
requirement for a security gateway to be configurable to pass
IPsec traffic (including ISAKMP traffic) for hosts behind it.

===============================================================
| |
| ========================= |
| | | |
---|-----------------|---- ---|-------------------|---
| | | | | | | |
| H1* -- (Local --- SG1* |-- (Internet) --| SG2* --- (Local --- H2* |
| Intranet) | | Intranet) |
-------------------------- ---------------------------
admin. boundary admin. boundary

Case 4. This covers the situation where a remote host (H1) uses the
Internet to reach an organization's firewall (SG2) and to then
gain access to some server or other machine (H2). The remote
host could be a mobile host (H1) dialing up to a local PPP/ARA
server (not shown) on the Internet and then crossing the
Internet to the home organization's firewall (SG2), etc. The

details of support for this case, (how H1 locates SG2,
authenticates it, and verifies its authorization to represent
H2) are discussed in Section 4.6.3, "Locating a Security
Gateway".

======================================================
| |
|============================== |
|| | |
|| ---|----------------------|---
|| | | | |
H1* ----- (Internet) ------| SG2* ---- (Local ----- H2* |
^ | Intranet) |
| ------------------------------
could be dialup admin. boundary (optional)
to PPP/ARA server

Only tunnel mode is required between H1 and SG2. So the choices
for the SA between H1 and SG2 would be one of the ones in case
2. The choices for the SA between H1 and H2 would be one of the
ones in case 1.

Note that in this case, the sender MUST apply the transport
header before the tunnel header. Therefore the management
interface to the IPsec implementation MUST support configuration
of the SPD and SAD to ensure this ordering of IPsec header
application.

As noted above, support for additional combinations of AH and ESP is
optional. Use of other, optional combinations may adversely affect
interoperability.

4.6 SA and Key Management

IPsec mandates support for both manual and automated SA and
cryptographic key management. The IPsec protocols, AH and ESP, are
largely independent of the associated SA management techniques,
although the techniques involved do affect some of the security
services offered by the protocols. For example, the optional anti-
replay services available for AH and ESP require automated SA
management. Moreover, the granularity of key distribution employed
with IPsec determines the granularity of authentication provided.
(See also a discussion of this issue in Section 4.7.) In general,
data origin authentication in AH and ESP is limited by the extent to
which secrets used with the authentication algorithm (or with a key
management protocol that creates such secrets) are shared among
multiple possible sources.

The following text describes the minimum requirements for both types
of SA management.

4.6.1 Manual Techniques

The simplest form of management is manual management, in which a
person manually configures each system with keying material and
security association management data relevant to secure communication
with other systems. Manual techniques are practical in small, static
environments but they do not scale well. For example, a company
could create a Virtual Private Network (VPN) using IPsec in security
gateways at several sites. If the number of sites is small, and
since all the sites come under the purview of a single administrative
domain, this is likely to be a feasible context for manual management
techniques. In this case, the security gateway might selectively
protect traffic to and from other sites within the organization using
a manually configured key, while not protecting traffic for other
destinations. It also might be appropriate when only selected
communications need to be secured. A similar argument might apply to
use of IPsec entirely within an organization for a small number of
hosts and/or gateways. Manual management techniques often employ
statically configured, symmetric keys, though other options also
exist.

4.6.2 Automated SA and Key Management

Widespread deployment and use of IPsec requires an Internet-standard,
scalable, automated, SA management protocol. Such support is
required to facilitate use of the anti-replay features of AH and ESP,
and to accommodate on-demand creation of SAs, e.g., for user- and
session-oriented keying. (Note that the notion of "rekeying" an SA
actually implies creation of a new SA with a new SPI, a process that
generally implies use of an automated SA/key management protocol.)

The default automated key management protocol selected for use with
IPsec is IKE [MSST97, Orm97, HC98] under the IPsec domain of
interpretation [Pip98]. Other automated SA management protocols MAY
be employed.

When an automated SA/key management protocol is employed, the output
from this protocol may be used to generate multiple keys, e.g., for a
single ESP SA. This may arise because:

o the encryption algorithm uses multiple keys (e.g., triple DES)
o the authentication algorithm uses multiple keys
o both encryption and authentication algorithms are employed

The Key Management System may provide a separate string of bits for
each key or it may generate one string of bits from which all of them
are extracted. If a single string of bits is provided, care needs to
be taken to ensure that the parts of the system that map the string
of bits to the required keys do so in the same fashion at both ends
of the SA. To ensure that the IPsec implementations at each end of
the SA use the same bits for the same keys, and irrespective of which
part of the system divides the string of bits into individual keys,
the encryption key(s) MUST be taken from the first (left-most, high-
order) bits and the authentication key(s) MUST be taken from the
remaining bits. The number of bits for each key is defined in the
relevant algorithm specification RFC. In the case of multiple
encryption keys or multiple authentication keys, the specification
for the algorithm must specify the order in which they are to be
selected from a single string of bits provided to the algorithm.

4.6.3 Locating a Security Gateway

This section discusses issues relating to how a host learns about the
existence of relevant security gateways and once a host has contacted
these security gateways, how it knows that these are the correct
security gateways. The details of where the required information is
stored is a local matter.

Consider a situation in which a remote host (H1) is using the
Internet to gain access to a server or other machine (H2) and there
is a security gateway (SG2), e.g., a firewall, through which H1's
traffic must pass. An example of this situation would be a mobile
host (Road Warrior) crossing the Internet to the home organization's
firewall (SG2). (See Case 4 in the section 4.5 Basic Combinations of
Security Associations.) This situation raises several issues:

1. How does H1 know/learn about the existence of the security
gateway SG2?
2. How does it authenticate SG2, and once it has authenticated
SG2, how does it confirm that SG2 has been authorized to
represent H2?
3. How does SG2 authenticate H1 and verify that H1 is authorized
to contact H2?
4. How does H1 know/learn about backup gateways which provide
alternate paths to H2?

To address these problems, a host or security gateway MUST have an
administrative interface that allows the user/administrator to
configure the address of a security gateway for any sets of
destination addresses that require its use. This includes the ability
to configure:

o the requisite information for locating and authenticating the
security gateway and verifying its authorization to represent
the destination host.
o the requisite information for locating and authenticating any
backup gateways and verifying their authorization to represent
the destination host.

It is assumed that the SPD is also configured with policy information
that covers any other IPsec requirements for the path to the security
gateway and the destination host.

This document does not address the issue of how to automate the
discovery/verification of security gateways.

4.7 Security Associations and Multicast

The receiver-orientation of the Security Association implies that, in
the case of unicast traffic, the destination system will normally
select the SPI value. By having the destination select the SPI
value, there is no potential for manually configured Security
Associations to conflict with automatically configured (e.g., via a
key management protocol) Security Associations or for Security
Associations from multiple sources to conflict with each other. For
multicast traffic, there are multiple destination systems per
multicast group. So some system or person will need to coordinate
among all multicast groups to select an SPI or SPIs on behalf of each
multicast group and then communicate the group's IPsec information to
all of the legitimate members of that multicast group via mechanisms
not defined here.

Multiple senders to a multicast group SHOULD use a single Security
Association (and hence Security Parameter Index) for all traffic to
that group when a symmetric key encryption or authentication
algorithm is employed. In such circumstances, the receiver knows only
that the message came from a system possessing the key for that
multicast group. In such circumstances, a receiver generally will
not be able to authenticate which system sent the multicast traffic.
Specifications for other, more general multicast cases are deferred
to later IPsec documents.

At the time this specification was published, automated protocols for
multicast key distribution were not considered adequately mature for
standardization. For multicast groups having relatively few members,
manual key distribution or multiple use of existing unicast key
distribution algorithms such as modified Diffie-Hellman appears
feasible. For very large groups, new scalable techniques will be
needed. An example of current work in this area is the Group Key
Management Protocol (GKMP) [HM97].

5. IP Traffic Processing

As mentioned in Section 4.4.1 "The Security Policy Database (SPD)",
the SPD must be consulted during the processing of all traffic
(INBOUND and OUTBOUND), including non-IPsec traffic. If no policy is
found in the SPD that matches the packet (for either inbound or
outbound traffic), the packet MUST be discarded.

NOTE: All of the cryptographic algorithms used in IPsec expect their
input in canonical network byte order (see Appendix in RFC 791) and
generate their output in canonical network byte order. IP packets
are also transmitted in network byte order.

5.1 Outbound IP Traffic Processing

5.1.1 Selecting and Using an SA or SA Bundle

In a security gateway or BITW implementation (and in many BITS
implementations), each outbound packet is compared against the SPD to
determine what processing is required for the packet. If the packet
is to be discarded, this is an auditable event. If the traffic is
allowed to bypass IPsec processing, the packet continues through
"normal" processing for the environment in which the IPsec processing
is taking place. If IPsec processing is required, the packet is
either mapped to an existing SA (or SA bundle), or a new SA (or SA
bundle) is created for the packet. Since a packet's selectors might
match multiple policies or multiple extant SAs and since the SPD is
ordered, but the SAD is not, IPsec MUST:

1. Match the packet's selector fields against the outbound
policies in the SPD to locate the first appropriate
policy, which will point to zero or more SA bundles in the
SAD.

2. Match the packet's selector fields against those in the SA
bundles found in (1) to locate the first SA bundle that
matches. If no SAs were found or none match, create an
appropriate SA bundle and link the SPD entry to the SAD
entry. If no key management entity is found, drop the
packet.

3. Use the SA bundle found/created in (2) to do the required
IPsec processing, e.g., authenticate and encrypt.

In a host IPsec implementation based on sockets, the SPD will be
consulted whenever a new socket is created, to determine what, if
any, IPsec processing will be applied to the traffic that will flow
on that socket.

NOTE: A compliant implementation MUST not allow instantiation of an
ESP SA that employs both a NULL encryption and a NULL authentication
algorithm. An attempt to negotiate such an SA is an auditable event.

5.1.2 Header Construction for Tunnel Mode

This section describes the handling of the inner and outer IP
headers, extension headers, and options for AH and ESP tunnels. This
includes how to construct the encapsulating (outer) IP header, how to
handle fields in the inner IP header, and what other actions should
be taken. The general idea is modeled after the one used in RFC
2003, "IP Encapsulation with IP":

o The outer IP header Source Address and Destination Address
identify the "endpoints" of the tunnel (the encapsulator and
decapsulator). The inner IP header Source Address and
Destination Addresses identify the original sender and
recipient of the datagram, (from the perspective of this
tunnel), respectively. (see footnote 3 after the table in
5.1.2.1 for more details on the encapsulating source IP
address.)
o The inner IP header is not changed except to decrement the TTL
as noted below, and remains unchanged during its delivery to
the tunnel exit point.
o No change to IP options or extension headers in the inner
header occurs during delivery of the encapsulated datagram
through the tunnel.
o If need be, other protocol headers such as the IP
Authentication header may be inserted between the outer IP
header and the inner IP header.

The tables in the following sub-sections show the handling for the
different header/option fields (constructed = the value in the outer
field is constructed independently of the value in the inner).

5.1.2.1 IPv4 -- Header Construction for Tunnel Mode

<-- How Outer Hdr Relates to Inner Hdr -->
Outer Hdr at Inner Hdr at
IPv4 Encapsulator Decapsulator
Header fields: -------------------- ------------
version 4 (1) no change
header length constructed no change
TOS copied from inner hdr (5) no change
total length constructed no change
ID constructed no change
flags (DF,MF) constructed, DF (4) no change
fragmt offset constructed no change

TTL constructed (2) decrement (2)
protocol AH, ESP, routing hdr no change
checksum constructed constructed (2)
src address constructed (3) no change
dest address constructed (3) no change
Options never copied no change

1. The IP version in the encapsulating header can be different
from the value in the inner header.

2. The TTL in the inner header is decremented by the
encapsulator prior to forwarding and by the decapsulator if
it forwards the packet. (The checksum changes when the TTL
changes.)

Note: The decrementing of the TTL is one of the usual actions
that takes place when forwarding a packet. Packets
originating from the same node as the encapsulator do not
have their TTL's decremented, as the sending node is
originating the packet rather than forwarding it.

3. src and dest addresses depend on the SA, which is used to
determine the dest address which in turn determines which src
address (net interface) is used to forward the packet.

NOTE: In principle, the encapsulating IP source address can
be any of the encapsulator's interface addresses or even an
address different from any of the encapsulator's IP
addresses, (e.g., if it's acting as a NAT box) so long as the
address is reachable through the encapsulator from the
environment into which the packet is sent. This does not
cause a problem because IPsec does not currently have any
INBOUND processing requirement that involves the Source
Address of the encapsulating IP header. So while the
receiving tunnel endpoint looks at the Destination Address in
the encapsulating IP header, it only looks at the Source
Address in the inner (encapsulated) IP header.

4. configuration determines whether to copy from the inner
header (IPv4 only), clear or set the DF.

5. If Inner Hdr is IPv4 (Protocol = 4), copy the TOS. If Inner
Hdr is IPv6 (Protocol = 41), map the Class to TOS.

5.1.2.2 IPv6 -- Header Construction for Tunnel Mode

See previous section 5.1.2 for notes 1-5 indicated by (footnote
number).

<-- How Outer Hdr Relates Inner Hdr --->
Outer Hdr at Inner Hdr at
IPv6 Encapsulator Decapsulator
Header fields: -------------------- ------------
version 6 (1) no change
class copied or configured (6) no change
flow id copied or configured no change
len constructed no change
next header AH,ESP,routing hdr no change
hop limit constructed (2) decrement (2)
src address constructed (3) no change
dest address constructed (3) no change
Extension headers never copied no change

6. If Inner Hdr is IPv6 (Next Header = 41), copy the Class. If
Inner Hdr is IPv4 (Next Header = 4), map the TOS to Class.

5.2 Processing Inbound IP Traffic

Prior to performing AH or ESP processing, any IP fragments are
reassembled. Each inbound IP datagram to which IPsec processing will
be applied is identified by the appearance of the AH or ESP values in
the IP Next Protocol field (or of AH or ESP as an extension header in
the IPv6 context).

Note: Appendix C contains sample code for a bitmask check for a 32
packet window that can be used for implementing anti-replay service.

5.2.1 Selecting and Using an SA or SA Bundle

Mapping the IP datagram to the appropriate SA is simplified because
of the presence of the SPI in the AH or ESP header. Note that the
selector checks are made on the inner headers not the outer (tunnel)
headers. The steps followed are:

1. Use the packet's destination address (outer IP header),
IPsec protocol, and SPI to look up the SA in the SAD. If
the SA lookup fails, drop the packet and log/report the
error.

2. Use the SA found in (1) to do the IPsec processing, e.g.,
authenticate and decrypt. This step includes matching the
packet's (Inner Header if tunneled) selectors to the
selectors in the SA. Local policy determines the
specificity of the SA selectors (single value, list,
range, wildcard). In general, a packet's source address
MUST match the SA selector value. However, an ICMP packet
received on a tunnel mode SA may have a source address

other than that bound to the SA and thus such packets
should be permitted as exceptions to this check. For an
ICMP packet, the selectors from the enclosed problem
packet (the source and destination addresses and ports
should be swapped) should be checked against the selectors
for the SA. Note that some or all of these selectors may
be inaccessible because of limitations on how many bits of
the problem packet the ICMP packet is allowed to carry or
due to encryption. See Section 6.

Do (1) and (2) for every IPsec header until a Transport
Protocol Header or an IP header that is NOT for this
system is encountered. Keep track of what SAs have been
used and their order of application.

3. Find an incoming policy in the SPD that matches the
packet. This could be done, for example, by use of
backpointers from the SAs to the SPD or by matching the
packet's selectors (Inner Header if tunneled) against
those of the policy entries in the SPD.

4. Check whether the required IPsec processing has been
applied, i.e., verify that the SA's found in (1) and (2)
match the kind and order of SAs required by the policy
found in (3).

NOTE: The correct "matching" policy will not necessarily
be the first inbound policy found. If the check in (4)
fails, steps (3) and (4) are repeated until all policy
entries have been checked or until the check succeeds.

At the end of these steps, pass the resulting packet to the Transport
Layer or forward the packet. Note that any IPsec headers processed
in these steps may have been removed, but that this information,
i.e., what SAs were used and the order of their application, may be
needed for subsequent IPsec or firewall processing.

Note that in the case of a security gateway, if forwarding causes a
packet to exit via an IPsec-enabled interface, then additional IPsec
processing may be applied.

5.2.2 Handling of AH and ESP tunnels

The handling of the inner and outer IP headers, extension headers,
and options for AH and ESP tunnels should be performed as described
in the tables in Section 5.1.

6. ICMP Processing (relevant to IPsec)

The focus of this section is on the handling of ICMP error messages.
Other ICMP traffic, e.g., Echo/Reply, should be treated like other
traffic and can be protected on an end-to-end basis using SAs in the
usual fashion.

An ICMP error message protected by AH or ESP and generated by a
router SHOULD be processed and forwarded in a tunnel mode SA. Local
policy determines whether or not it is subjected to source address
checks by the router at the destination end of the tunnel. Note that
if the router at the originating end of the tunnel is forwarding an
ICMP error message from another router, the source address check
would fail. An ICMP message protected by AH or ESP and generated by
a router MUST NOT be forwarded on a transport mode SA (unless the SA
has been established to the router acting as a host, e.g., a Telnet
connection used to manage a router). An ICMP message generated by a
host SHOULD be checked against the source IP address selectors bound
to the SA in which the message arrives. Note that even if the source
of an ICMP error message is authenticated, the returned IP header
could be invalid. Accordingly, the selector values in the IP header
SHOULD also be checked to be sure that they are consistent with the
selectors for the SA over which the ICMP message was received.

The table in Appendix D characterize ICMP messages as being either
host generated, router generated, both, unknown/unassigned. ICMP
messages falling into the last two categories should be handled as
determined by the receiver's policy.

An ICMP message not protected by AH or ESP is unauthenticated and its
processing and/or forwarding may result in denial of service. This
suggests that, in general, it would be desirable to ignore such
messages. However, it is expected that many routers (vs. security
gateways) will not implement IPsec for transit traffic and thus
strict adherence to this rule would cause many ICMP messages to be
discarded. The result is that some critical IP functions would be
lost, e.g., redirection and PMTU processing. Thus it MUST be
possible to configure an IPsec implementation to accept or reject
(router) ICMP traffic as per local security policy.

The remainder of this section addresses how PMTU processing MUST be
performed at hosts and security gateways. It addresses processing of
both authenticated and unauthenticated ICMP PMTU messages. However,
as noted above, unauthenticated ICMP messages MAY be discarded based
on local policy.

6.1 PMTU/DF Processing

6.1.1 DF Bit

In cases where a system (host or gateway) adds an encapsulating
header (ESP tunnel or AH tunnel), it MUST support the option of
copying the DF bit from the original packet to the encapsulating
header (and processing ICMP PMTU messages). This means that it MUST
be possible to configure the system's treatment of the DF bit (set,
clear, copy from encapsulated header) for each interface. (See
Appendix B for rationale.)

6.1.2 Path MTU Discovery (PMTU)

This section discusses IPsec handling for Path MTU Discovery
messages. ICMP PMTU is used here to refer to an ICMP message for:

IPv4 (RFC 792):
- Type = 3 (Destination Unreachable)
- Code = 4 (Fragmentation needed and DF set)
- Next-Hop MTU in the low-order 16 bits of the second
word of the ICMP header (labelled "unused" in RFC
792), with high-order 16 bits set to zero

IPv6 (RFC 1885):
- Type = 2 (Packet Too Big)
- Code = 0 (Fragmentation needed)
- Next-Hop MTU in the 32 bit MTU field of the ICMP6
message

6.1.2.1 Propagation of PMTU

The amount of information returned with the ICMP PMTU message (IPv4
or IPv6) is limited and this affects what selectors are available for
use in further propagating the PMTU information. (See Appendix B for
more detailed discussion of this topic.)

o PMTU message with 64 bits of IPsec header -- If the ICMP PMTU
message contains only 64 bits of the IPsec header (minimum for
IPv4), then a security gateway MUST support the following options
on a per SPI/SA basis:

a. if the originating host can be determined (or the possible
sources narrowed down to a manageable number), send the PM
information to all the possible originating hosts.
b. if the originating host cannot be determined, store the PMTU
with the SA and wait until the next packet(s) arrive from the
originating host for the relevant security association. If

the packet(s) are bigger than the PMTU, drop the packet(s),
and compose ICMP PMTU message(s) with the new packet(s) and
the updated PMTU, and send the ICMP message(s) about the
problem to the originating host. Retain the PMTU information
for any message that might arrive subsequently (see Section
6.1.2.4, "PMTU Aging").

o PMTU message with >64 bits of IPsec header -- If the ICMP message
contains more information from the original packet then there may
be enough non-opaque information to immediately determine to which
host to propagate the ICMP/PMTU message and to provide that system
with the 5 fields (source address, destination address, source
port, destination port, transport protocol) needed to determine
where to store/update the PMTU. Under such circumstances, a
security gateway MUST generate an ICMP PMTU message immediately
upon receipt of an ICMP PMTU from further down the path.

o Distributing the PMTU to the Transport Layer -- The host mechanism
for getting the updated PMTU to the transport layer is unchanged,
as specified in RFC 1191 (Path MTU Discovery).

6.1.2.2 Calculation of PMTU

The calculation of PMTU from an ICMP PMTU MUST take into account the
addition of any IPsec header -- AH transport, ESP transport, AH/ESP
transport, ESP tunnel, AH tunnel. (See Appendix B for discussion of
implementation issues.)

Note: In some situations the addition of IPsec headers could result
in an effective PMTU (as seen by the host or application) that is
unacceptably small. To avoid this problem, the implementation may
establish a threshold below which it will not report a reduced PMTU.
In such cases, the implementation would apply IPsec and then fragment
the resulting packet according to the PMTU. This would result in a
more efficient use of the available bandwidth.

6.1.2.3 Granularity of PMTU Processing

In hosts, the granularity with which ICMP PMTU processing can be done
differs depending on the implementation situation. Looking at a
host, there are 3 situations that are of interest with respect to
PMTU issues (See Appendix B for additional details on this topic.):

a. Integration of IPsec into the native IP implementation
b. Bump-in-the-stack implementations, where IPsec is implemented
"underneath" an existing implementation of a TCP/IP protocol
stack, between the native IP and the local network drivers

c. No IPsec implementation -- This case is included because it
is relevant in cases where a security gateway is sending PMTU
information back to a host.

Only in case (a) can the PMTU data be maintained at the same
granularity as communication associations. In (b) and (c), the IP
layer will only be able to maintain PMTU data at the granularity of
source and destination IP addresses (and optionally TOS), as
described in RFC 1191. This is an important difference, because more
than one communication association may map to the same source and
destination IP addresses, and each communication association may have
a different amount of IPsec header overhead (e.g., due to use of
different transforms or different algorithms).

Implementation of the calculation of PMTU and support for PMTUs at
the granularity of individual communication associations is a local
matter. However, a socket-based implementation of IPsec in a host
SHOULD maintain the information on a per socket basis. Bump in the
stack systems MUST pass an ICMP PMTU to the host IP implementation,
after adjusting it for any IPsec header overhead added by these
systems. The calculation of the overhead SHOULD be determined by
analysis of the SPI and any other selector information present in a
returned ICMP PMTU message.

6.1.2.4 PMTU Aging

In all systems (host or gateway) implementing IPsec and maintaining
PMTU information, the PMTU associated with a security association
(transport or tunnel) MUST be "aged" and some mechanism put in place
for updating the PMTU in a timely manner, especially for discovering
if the PMTU is smaller than it needs to be. A given PMTU has to
remain in place long enough for a packet to get from the source end
of the security association to the system at the other end of the
security association and propagate back an ICMP error message if the
current PMTU is too big. Note that if there are nested tunnels,
multiple packets and round trip times might be required to get an
ICMP message back to an encapsulator or originating host.

Systems SHOULD use the approach described in the Path MTU Discovery
document (RFC 1191, Section 6.3), which suggests periodically
resetting the PMTU to the first-hop data-link MTU and then letting
the normal PMTU Discovery processes update the PMTU as necessary.
The period SHOULD be configurable.

7. Auditing

Not all systems that implement IPsec will implement auditing. For
the most part, the granularity of auditing is a local matter.
However, several auditable events are identified in the AH and ESP
specifications and for each of these events a minimum set of
information that SHOULD be included in an audit log is defined.
Additional information also MAY be included in the audit log for each
of these events, and additional events, not explicitly called out in
this specification, also MAY result in audit log entries. There is
no requirement for the receiver to transmit any message to the
purported transmitter in response to the detection of an auditable
event, because of the potential to induce denial of service via such
action.

8. Use in Systems Supporting Information Flow Security

Information of various sensitivity levels may be carried over a
single network. Information labels (e.g., Unclassified, Company
Proprietary, Secret) [DoD85, DoD87] are often employed to distinguish
such information. The use of labels facilitates segregation of
information, in support of information flow security models, e.g.,
the Bell-LaPadula model [BL73]. Such models, and corresponding
supporting technology, are designed to prevent the unauthorized flow
of sensitive information, even in the face of Trojan Horse attacks.
Conventional, discretionary access control (DAC) mechanisms, e.g.,
based on access control lists, generally are not sufficient to
support such policies, and thus facilities such as the SPD do not
suffice in such environments.

In the military context, technology that supports such models is
often referred to as multi-level security (MLS). Computers and
networks often are designated "multi-level secure" if they support
the separation of labelled data in conjunction with information flow
security policies. Although such technology is more broadly
applicable than just military applications, this document uses the
acronym "MLS" to designate the technology, consistent with much
extant literature.

IPsec mechanisms can easily support MLS networking. MLS networking
requires the use of strong Mandatory Access Controls (MAC), which
unprivileged users or unprivileged processes are incapable of
controlling or violating. This section pertains only to the use of
these IP security mechanisms in MLS (information flow security
policy) environments. Nothing in this section applies to systems not
claiming to provide MLS.

As used in this section, "sensitivity information" might include
implementation-defined hierarchic levels, categories, and/or
releasability information.

AH can be used to provide strong authentication in support of
mandatory access control decisions in MLS environments. If explicit
IP sensitivity information (e.g., IPSO [Ken91]) is used and
confidentiality is not considered necessary within the particular
operational environment, AH can be used to authenticate the binding
between sensitivity labels in the IP header and the IP payload
(including user data). This is a significant improvement over
labeled IPv4 networks where the sensitivity information is trusted
even though there is no authentication or cryptographic binding of
the information to the IP header and user data. IPv4 networks might
or might not use explicit labelling. IPv6 will normally use implicit
sensitivity information that is part of the IPsec Security
Association but not transmitted with each packet instead of using
explicit sensitivity information. All explicit IP sensitivity
information MUST be authenticated using either ESP, AH, or both.

Encryption is useful and can be desirable even when all of the hosts
are within a protected environment, for example, behind a firewall or
disjoint from any external connectivity. ESP can be used, in
conjunction with appropriate key management and encryption
algorithms, in support of both DAC and MAC. (The choice of
encryption and authentication algorithms, and the assurance level of
an IPsec implementation will determine the environments in which an
implementation may be deemed sufficient to satisfy MLS requirements.)
Key management can make use of sensitivity information to provide
MAC. IPsec implementations on systems claiming to provide MLS SHOULD
be capable of using IPsec to provide MAC for IP-based communications.

8.1 Relationship Between Security Associations and Data Sensitivity

Both the Encapsulating Security Payload and the Authentication Header
can be combined with appropriate Security Association policies to
provide multi-level secure networking. In this case each SA (or SA
bundle) is normally used for only a single instance of sensitivity
information. For example, "PROPRIETARY - Internet Engineering" must
be associated with a different SA (or SA bundle) from "PROPRIETARY -
Finance".

8.2 Sensitivity Consistency Checking

An MLS implementation (both host and router) MAY associate
sensitivity information, or a range of sensitivity information with
an interface, or a configured IP address with its associated prefix
(the latter is sometimes referred to as a logical interface, or an

interface alias). If such properties exist, an implementation SHOULD
compare the sensitivity information associated with the packet
against the sensitivity information associated with the interface or
address/prefix from which the packet arrived, or through which the
packet will depart. This check will either verify that the
sensitivities match, or that the packet's sensitivity falls within
the range of the interface or address/prefix.

The checking SHOULD be done on both inbound and outbound processing.

8.3 Additional MLS Attributes for Security Association Databases

Section 4.4 discussed two Security Association databases (the
Security Policy Database (SPD) and the Security Association Database
(SAD)) and the associated policy selectors and SA attributes. MLS
networking introduces an additional selector/attribute:

- Sensitivity information.

The Sensitivity information aids in selecting the appropriate
algorithms and key strength, so that the traffic gets a level of
protection appropriate to its importance or sensitivity as described
in section 8.1. The exact syntax of the sensitivity information is
implementation defined.

8.4 Additional Inbound Processing Steps for MLS Networking

After an inbound packet has passed through IPsec processing, an MLS
implementation SHOULD first check the packet's sensitivity (as
defined by the SA (or SA bundle) used for the packet) with the
interface or address/prefix as described in section 8.2 before
delivering the datagram to an upper-layer protocol or forwarding it.

The MLS system MUST retain the binding between the data received in
an IPsec protected packet and the sensitivity information in the SA
or SAs used for processing, so appropriate policy decisions can be
made when delivering the datagram to an application or forwarding
engine. The means for maintaining this binding are implementation
specific.

8.5 Additional Outbound Processing Steps for MLS Networking

An MLS implementation of IPsec MUST perform two additional checks
besides the normal steps detailed in section 5.1.1. When consulting
the SPD or the SAD to find an outbound security association, the MLS
implementation MUST use the sensitivity of the data to select an

appropriate outbound SA or SA bundle. The second check comes before
forwarding the packet out to its destination, and is the sensitivity
consistency checking described in section 8.2.

8.6 Additional MLS Processing for Security Gateways

An MLS security gateway MUST follow the previously mentioned inbound
and outbound processing rules as well as perform some additional
processing specific to the intermediate protection of packets in an
MLS environment.

A security gateway MAY act as an outbound proxy, creating SAs for MLS
systems that originate packets forwarded by the gateway. These MLS
systems may explicitly label the packets to be forwarded, or the
whole originating network may have sensitivity characteristics
associated with it. The security gateway MUST create and use
appropriate SAs for AH, ESP, or both, to protect such traffic it
forwards.

Similarly such a gateway SHOULD accept and process inbound AH and/or
ESP packets and forward appropriately, using explicit packet
labeling, or relying on the sensitivity characteristics of the
destination network.

9. Performance Issues

The use of IPsec imposes computational performance costs on the hosts
or security gateways that implement these protocols. These costs are
associated with the memory needed for IPsec code and data structures,
and the computation of integrity check values, encryption and
decryption, and added per-packet handling. The per-packet
computational costs will be manifested by increased latency and,
possibly, reduced throughout. Use of SA/key management protocols,
especially ones that employ public key cryptography, also adds
computational performance costs to use of IPsec. These per-
association computational costs will be manifested in terms of
increased latency in association establishment. For many hosts, it
is anticipated that software-based cryptography will not appreciably
reduce throughput, but hardware may be required for security gateways
(since they represent aggregation points), and for some hosts.

The use of IPsec also imposes bandwidth utilization costs on
transmission, switching, and routing components of the Internet
infrastructure, components not implementing IPsec. This is due to
the increase in the packet size resulting from the addition of AH
and/or ESP headers, AH and ESP tunneling (which adds a second IP
header), and the increased packet traffic associated with key
management protocols. It is anticipated that, in most instances,

this increased bandwidth demand will not noticeably affect the
Internet infrastructure. However, in some instances, the effects may
be significant, e.g., transmission of ESP encrypted traffic over a
dialup link that otherwise would have compressed the traffic.

Note: The initial SA establishment overhead will be felt in the first
packet. This delay could impact the transport layer and application.
For example, it could cause TCP to retransmit the SYN before the
ISAKMP exchange is done. The effect of the delay would be different
on UDP than TCP because TCP shouldn't transmit anything other than
the SYN until the connection is set up whereas UDP will go ahead and
transmit data beyond the first packet.

Note: As discussed earlier, compression can still be employed at
layers above IP. There is an IETF working group (IP Payload
Compression Protocol (ippcp)) working on "protocol specifications
that make it possible to perform lossless compression on individual
payloads before the payload is processed by a protocol that encrypts
it. These specifications will allow for compression operations to be
performed prior to the encryption of a payload by IPsec protocols."

10. Conformance Requirements

All IPv4 systems that claim to implement IPsec MUST comply with all
requirements of the Security Architecture document. All IPv6 systems
MUST comply with all requirements of the Security Architecture
document.

11. Security Considerations

The focus of this document is security; hence security considerations
permeate this specification.

12. Differences from RFC 1825

This architecture document differs substantially from RFC 1825 in
detail and in organization, but the fundamental notions are
unchanged. This document provides considerable additional detail in
terms of compliance specifications. It introduces the SPD and SAD,
and the notion of SA selectors. It is aligned with the new versions
of AH and ESP, which also differ from their predecessors. Specific
requirements for supported combinations of AH and ESP are newly
added, as are details of PMTU management.

Acknowledgements

Many of the concepts embodied in this specification were derived from
or influenced by the US Government's SP3 security protocol, ISO/IEC's
NLSP, the proposed swIPe security protocol [SDNS, ISO, IB93, IBK93],
and the work done for SNMP Security and SNMPv2 Security.

For over 3 years (although it sometimes seems *much* longer), this
document has evolved through multiple versions and iterations.
During this time, many people have contributed significant ideas and
energy to the process and the documents themselves. The authors
would like to thank Karen Seo for providing extensive help in the
review, editing, background research, and coordination for this
version of the specification. The authors would also like to thank
the members of the IPsec and IPng working groups, with special
mention of the efforts of (in alphabetic order): Steve Bellovin,
Steve Deering, James Hughes, Phil Karn, Frank Kastenholz, Perry
Metzger, David Mihelcic, Hilarie Orman, Norman Shulman, William
Simpson, Harry Varnis, and Nina Yuan.

Appendix A -- Glossary

This section provides definitions for several key terms that are
employed in this document. Other documents provide additional
definitions and background information relevant to this technology,
e.g., [VK83, HA94]. Included in this glossary are generic security
service and security mechanism terms, plus IPsec-specific terms.

Access Control
Access control is a security service that prevents unauthorized
use of a resource, including the prevention of use of a resource
in an unauthorized manner. In the IPsec context, the resource
to which access is being controlled is often:
o for a host, computing cycles or data
o for a security gateway, a network behind the gateway
or
bandwidth on that network.

Anti-replay
[See "Integrity" below]

Authentication
This term is used informally to refer to the combination of two
nominally distinct security services, data origin authentication
and connectionless integrity. See the definitions below for
each of these services.

Availability
Availability, when viewed as a security service, addresses the
security concerns engendered by attacks against networks that
deny or degrade service. For example, in the IPsec context, the
use of anti-replay mechanisms in AH and ESP support
availability.

Confidentiality
Confidentiality is the security service that protects data from
unauthorized disclosure. The primary confidentiality concern in
most instances is unauthorized disclosure of application level
data, but disclosure of the external characteristics of
communication also can be a concern in some circumstances.
Traffic flow confidentiality is the service that addresses this
latter concern by concealing source and destination addresses,
message length, or frequency of communication. In the IPsec
context, using ESP in tunnel mode, especially at a security
gateway, can provide some level of traffic flow confidentiality.
(See also traffic analysis, below.)

Encryption
Encryption is a security mechanism used to transform data from
an intelligible form (plaintext) into an unintelligible form
(ciphertext), to provide confidentiality. The inverse
transformation process is designated "decryption". Oftimes the
term "encryption" is used to generically refer to both
processes.

Data Origin Authentication
Data origin authentication is a security service that verifies
the identity of the claimed source of data. This service is
usually bundled with connectionless integrity service.

Integrity
Integrity is a security service that ensures that modifications
to data are detectable. Integrity comes in various flavors to
match application requirements. IPsec supports two forms of
integrity: connectionless and a form of partial sequence
integrity. Connectionless integrity is a service that detects
modification of an individual IP datagram, without regard to the
ordering of the datagram in a stream of traffic. The form of
partial sequence integrity offered in IPsec is referred to as
anti-replay integrity, and it detects arrival of duplicate IP
datagrams (within a constrained window). This is in contrast to
connection-oriented integrity, which imposes more stringent
sequencing requirements on traffic, e.g., to be able to detect
lost or re-ordered messages. Although authentication and
integrity services often are cited separately, in practice they
are intimately connected and almost always offered in tandem.

Security Association (SA)
A simplex (uni-directional) logical connection, created for
security purposes. All traffic traversing an SA is provided the
same security processing. In IPsec, an SA is an internet layer
abstraction implemented through the use of AH or ESP.

Security Gateway
A security gateway is an intermediate system that acts as the
communications interface between two networks. The set of hosts
(and networks) on the external side of the security gateway is
viewed as untrusted (or less trusted), while the networks and
hosts and on the internal side are viewed as trusted (or more
trusted). The internal subnets and hosts served by a security
gateway are presumed to be trusted by virtue of sharing a
common, local, security administration. (See "Trusted
Subnetwork" below.) In the IPsec context, a security gateway is
a point at which AH and/or ESP is implemented in order to serve

a set of internal hosts, providing security services for these
hosts when they communicate with external hosts also employing
IPsec (either directly or via another security gateway).

SPI
Acronym for "Security Parameters Index". The combination of a
destination address, a security protocol, and an SPI uniquely
identifies a security association (SA, see above). The SPI is
carried in AH and ESP protocols to enable the receiving system
to select the SA under which a received packet will be
processed. An SPI has only local significance, as defined by
the creator of the SA (usually the receiver of the packet
carrying the SPI); thus an SPI is generally viewed as an opaque
bit string. However, the creator of an SA may choose to
interpret the bits in an SPI to facilitate local processing.

Traffic Analysis
The analysis of network traffic flow for the purpose of deducing
information that is useful to an adversary. Examples of such
information are frequency of transmission, the identities of the
conversing parties, sizes of packets, flow identifiers, etc.
[Sch94]

Trusted Subnetwork
A subnetwork containing hosts and routers that trust each other
not to engage in active or passive attacks. There also is an
assumption that the underlying communications channel (e.g., a
LAN or CAN) isn't being attacked by other means.

Appendix B -- Analysis/Discussion of PMTU/DF/Fragmentation Issues

B.1 DF bit

In cases where a system (host or gateway) adds an encapsulating
header (e.g., ESP tunnel), should/must the DF bit in the original
packet be copied to the encapsulating header?

Fragmenting seems correct for some situations, e.g., it might be
appropriate to fragment packets over a network with a very small MTU,
e.g., a packet radio network, or a cellular phone hop to mobile node,
rather than propagate back a very small PMTU for use over the rest of
the path. In other situations, it might be appropriate to set the DF
bit in order to get feedback from later routers about PMTU
constraints which require fragmentation. The existence of both of
these situations argues for enabling a system to decide whether or
not to fragment over a particular network "link", i.e., for requiring
an implementation to be able to copy the DF bit (and to process ICMP
PMTU messages), but making it an option to be selected on a per
interface basis. In other words, an administrator should be able to
configure the router's treatment of the DF bit (set, clear, copy from
encapsulated header) for each interface.

Note: If a bump-in-the-stack implementation of IPsec attempts to
apply different IPsec algorithms based on source/destination ports,
it will be difficult to apply Path MTU adjustments.

B.2 Fragmentation

If required, IP fragmentation occurs after IPsec processing within an
IPsec implementation. Thus, transport mode AH or ESP is applied only
to whole IP datagrams (not to IP fragments). An IP packet to which
AH or ESP has been applied may itself be fragmented by routers en
route, and such fragments MUST be reassembled prior to IPsec
processing at a receiver. In tunnel mode, AH or ESP is applied to an
IP packet, the payload of which may be a fragmented IP packet. For
example, a security gateway, "bump-in-the-stack" (BITS), or "bump-
in-the-wire" (BITW) IPsec implementation may apply tunnel mode AH to
such fragments. Note that BITS or BITW implementations are examples
of where a host IPsec implementation might receive fragments to which
tunnel mode is to be applied. However, if transport mode is to be
applied, then these implementations MUST reassemble the fragments
prior to applying IPsec.

NOTE: IPsec always has to figure out what the encapsulating IP header
fields are. This is independent of where you insert IPsec and is
intrinsic to the definition of IPsec. Therefore any IPsec
implementation that is not integrated into an IP implementation must
include code to construct the necessary IP headers (e.g., IP2):

o AH-tunnel --> IP2-AH-IP1-Transport-Data
o ESP-tunnel --> IP2-ESP_hdr-IP1-Transport-Data-ESP_trailer

*********************************************************************

Overall, the fragmentation/reassembly approach described above works
for all cases examined.

AH Xport AH Tunnel ESP Xport ESP Tunnel
Implementation approach IPv4 IPv6 IPv4 IPv6 IPv4 IPv6 IPv4 IPv6
----------------------- ---- ---- ---- ---- ---- ---- ---- ----
Hosts (integr w/ IP stack) Y Y Y Y Y Y Y Y
Hosts (betw/ IP and drivers) Y Y Y Y Y Y Y Y
S. Gwy (integr w/ IP stack) Y Y Y Y
Outboard crypto processor *

* If the crypto processor system has its own IP address, then it
is covered by the security gateway case. This box receives
the packet from the host and performs IPsec processing. It
has to be able to handle the same AH, ESP, and related
IPv4/IPv6 tunnel processing that a security gateway would have
to handle. If it doesn't have it's own address, then it is
similar to the bump-in-the stack implementation between IP and
the network drivers.

The following analysis assumes that:

1. There is only one IPsec module in a given system's stack.
There isn't an IPsec module A (adding ESP/encryption and
thus) hiding the transport protocol, SRC port, and DEST port
from IPsec module B.
2. There are several places where IPsec could be implemented (as
shown in the table above).
a. Hosts with integration of IPsec into the native IP
implementation. Implementer has access to the source
for the stack.
b. Hosts with bump-in-the-stack implementations, where
IPsec is implemented between IP and the local network
drivers. Source access for stack is not available;
but there are well-defined interfaces that allows the
IPsec code to be incorporated into the system.

c. Security gateways and outboard crypto processors with
integration of IPsec into the stack.
3. Not all of the above approaches are feasible in all hosts.
But it was assumed that for each approach, there are some
hosts for whom the approach is feasible.

For each of the above 3 categories, there are IPv4 and IPv6, AH
transport and tunnel modes, and ESP transport and tunnel modes -- for
a total of 24 cases (3 x 2 x 4).

Some header fields and interface fields are listed here for ease of
reference -- they're not in the header order, but instead listed to
allow comparison between the columns. (* = not covered by AH
authentication. ESP authentication doesn't cover any headers that
precede it.)

IP/Transport Interface
IPv4 IPv6 (RFC 1122 -- Sec 3.4)
---- ---- ----------------------
Version = 4 Version = 6
Header Len
*TOS Class,Flow Lbl TOS
Packet Len Payload Len Len
ID ID (optional)
*Flags DF
*Offset
*TTL *Hop Limit TTL
Protocol Next Header
*Checksum
Src Address Src Address Src Address
Dst Address Dst Address Dst Address
Options? Options? Opt

? = AH covers Option-Type and Option-Length, but
might not cover Option-Data.

The results for each of the 20 cases is shown below ("works" = will
work if system fragments after outbound IPsec processing, reassembles
before inbound IPsec processing). Notes indicate implementation
issues.

a. Hosts (integrated into IP stack)
o AH-transport --> (IP1-AH-Transport-Data)
- IPv4 -- works
- IPv6 -- works
o AH-tunnel --> (IP2-AH-IP1-Transport-Data)
- IPv4 -- works
- IPv6 -- works

o ESP-transport --> (IP1-ESP_hdr-Transport-Data-ESP_trailer)
- IPv4 -- works
- IPv6 -- works
o ESP-tunnel --> (IP2-ESP_hdr-IP1-Transport-Data-ESP_trailer)
- IPv4 -- works
- IPv6 -- works

b. Hosts (Bump-in-the-stack) -- put IPsec between IP layer and
network drivers. In this case, the IPsec module would have to do
something like one of the following for fragmentation and
reassembly.
- do the fragmentation/reassembly work itself and
send/receive the packet directly to/from the network
layer. In AH or ESP transport mode, this is fine. In AH
or ESP tunnel mode where the tunnel end is at the ultimate
destination, this is fine. But in AH or ESP tunnel modes
where the tunnel end is different from the ultimate
destination and where the source host is multi-homed, this
approach could result in sub-optimal routing because the
IPsec module may be unable to obtain the information
needed (LAN interface and next-hop gateway) to direct the
packet to the appropriate network interface. This is not
a problem if the interface and next-hop gateway are the
same for the ultimate destination and for the tunnel end.
But if they are different, then IPsec would need to know
the LAN interface and the next-hop gateway for the tunnel
end. (Note: The tunnel end (security gateway) is highly
likely to be on the regular path to the ultimate
destination. But there could also be more than one path
to the destination, e.g., the host could be at an
organization with 2 firewalls. And the path being used
could involve the less commonly chosen firewall.) OR
- pass the IPsec'd packet back to the IP layer where an
extra IP header would end up being pre-pended and the
IPsec module would have to check and let IPsec'd fragments
go by.
OR
- pass the packet contents to the IP layer in a form such
that the IP layer recreates an appropriate IP header

At the network layer, the IPsec module will have access to the
following selectors from the packet -- SRC address, DST address,
Next Protocol, and if there's a transport layer header --> SRC
port and DST port. One cannot assume IPsec has access to the
Name. It is assumed that the available selector information is
sufficient to figure out the relevant Security Policy entry and
Security Association(s).

o AH-transport --> (IP1-AH-Transport-Data)
- IPv4 -- works
- IPv6 -- works
o AH-tunnel --> (IP2-AH-IP1-Transport-Data)
- IPv4 -- works
- IPv6 -- works
o ESP-transport --> (IP1-ESP_hdr-Transport-Data-ESP_trailer)
- IPv4 -- works
- IPv6 -- works
o ESP-tunnel --> (IP2-ESP_hdr-IP1-Transport-Data-ESP_trailer)
- IPv4 -- works
- IPv6 -- works

c. Security gateways -- integrate IPsec into the IP stack

NOTE: The IPsec module will have access to the following
selectors from the packet -- SRC address, DST address, Next
Protocol, and if there's a transport layer header --> SRC port
and DST port. It won't have access to the User ID (only Hosts
have access to User ID information.) Unlike some Bump-in-the-
stack implementations, security gateways may be able to look up
the Source Address in the DNS to provide a System Name, e.g., in
situations involving use of dynamically assigned IP addresses in
conjunction with dynamically updated DNS entries. It also won't
have access to the transport layer information if there is an ESP
header, or if it's not the first fragment of a fragmented
message. It is assumed that the available selector information
is sufficient to figure out the relevant Security Policy entry
and Security Association(s).

o AH-tunnel --> (IP2-AH-IP1-Transport-Data)
- IPv4 -- works
- IPv6 -- works
o ESP-tunnel --> (IP2-ESP_hdr-IP1-Transport-Data-ESP_trailer)
- IPv4 -- works
- IPv6 -- works

**********************************************************************

B.3 Path MTU Discovery

As mentioned earlier, "ICMP PMTU" refers to an ICMP message used for
Path MTU Discovery.

The legend for the diagrams below in B.3.1 and B.3.3 (but not B.3.2)
is:

==== = security association (AH or ESP, transport or tunnel)

---- = connectivity (or if so labelled, administrative boundary)
.... = ICMP message (hereafter referred to as ICMP PMTU) for

IPv4:
- Type = 3 (Destination Unreachable)
- Code = 4 (Fragmentation needed and DF set)
- Next-Hop MTU in the low-order 16 bits of the second
word of the ICMP header (labelled unused in RFC 792),
with high-order 16 bits set to zero

IPv6 (RFC 1885):
- Type = 2 (Packet Too Big)
- Code = 0 (Fragmentation needed and DF set)
- Next-Hop MTU in the 32 bit MTU field of the ICMP6

Hx = host x
Rx = router x
SGx = security gateway x
X* = X supports IPsec

B.3.1 Identifying the Originating Host(s)

The amount of information returned with the ICMP message is limited
and this affects what selectors are available to identify security
associations, originating hosts, etc. for use in further propagating
the PMTU information.

In brief... An ICMP message must contain the following information
from the "offending" packet:
- IPv4 (RFC 792) -- IP header plus a minimum of 64 bits

Accordingly, in the IPv4 context, an ICMP PMTU may identify only the
first (outermost) security association. This is because the ICMP
PMTU may contain only 64 bits of the "offending" packet beyond the IP
header, which would capture only the first SPI from AH or ESP. In
the IPv6 context, an ICMP PMTU will probably provide all the SPIs and
the selectors in the IP header, but maybe not the SRC/DST ports (in
the transport header) or the encapsulated (TCP, UDP, etc.) protocol.
Moreover, if ESP is used, the transport ports and protocol selectors
may be encrypted.

Looking at the diagram below of a security gateway tunnel (as
mentioned elsewhere, security gateways do not use transport mode)...

H1 =================== H3
\ | | /
H0 -- SG1* ---- R1 ---- SG2* ---- R2 -- H5
/ ^ | \
H2 |........| H4

Suppose that the security policy for SG1 is to use a single SA to SG2
for all the traffic between hosts H0, H1, and H2 and hosts H3, H4,
and H5. And suppose H0 sends a data packet to H5 which causes R1 to
send an ICMP PMTU message to SG1. If the PMTU message has only the
SPI, SG1 will be able to look up the SA and find the list of possible
hosts (H0, H1, H2, wildcard); but SG1 will have no way to figure out
that H0 sent the traffic that triggered the ICMP PMTU message.

original after IPsec ICMP
packet processing packet
-------- ----------- ------
IP-3 header (S = R1, D = SG1)
ICMP header (includes PMTU)
IP-2 header IP-2 header (S = SG1, D = SG2)
ESP header minimum of 64 bits of ESP hdr (*)
IP-1 header IP-1 header
TCP header TCP header
TCP data TCP data
ESP trailer

(*) The 64 bits will include enough of the ESP (or AH) header to
include the SPI.
- ESP -- SPI (32 bits), Seq number (32 bits)
- AH -- Next header (8 bits), Payload Len (8 bits),
Reserved (16 bits), SPI (32 bits)

This limitation on the amount of information returned with an ICMP
message creates a problem in identifying the originating hosts for
the packet (so as to know where to further propagate the ICMP PMTU
information). If the ICMP message contains only 64 bits of the IPsec
header (minimum for IPv4), then the IPsec selectors (e.g., Source and
Destination addresses, Next Protocol, Source and Destination ports,
etc.) will have been lost. But the ICMP error message will still
provide SG1 with the SPI, the PMTU information and the source and
destination gateways for the relevant security association.

The destination security gateway and SPI uniquely define a security
association which in turn defines a set of possible originating
hosts. At this point, SG1 could:

a. send the PMTU information to all the possible originating hosts.
This would not work well if the host list is a wild card or if
many/most of the hosts weren't sending to SG1; but it might work
if the SPI/destination/etc mapped to just one or a small number of
hosts.
b. store the PMTU with the SPI/etc and wait until the next packet(s)
arrive from the originating host(s) for the relevant security
association. If it/they are bigger than the PMTU, drop the
packet(s), and compose ICMP PMTU message(s) with the new packet(s)
and the updated PMTU, and send the originating host(s) the ICMP
message(s) about the problem. This involves a delay in notifying
the originating host(s), but avoids the problems of (a).

Since only the latter approach is feasible in all instances, a
security gateway MUST provide such support, as an option. However,
if the ICMP message contains more information from the original
packet, then there may be enough information to immediately determine
to which host to propagate the ICMP/PMTU message and to provide that
system with the 5 fields (source address, destination address, source
port, destination port, and transport protocol) needed to determine
where to store/update the PMTU. Under such circumstances, a security
gateway MUST generate an ICMP PMTU message immediately upon receipt
of an ICMP PMTU from further down the path. NOTE: The Next Protocol
field may not be contained in the ICMP message and the use of ESP
encryption may hide the selector fields that have been encrypted.

B.3.2 Calculation of PMTU

The calculation of PMTU from an ICMP PMTU has to take into account
the addition of any IPsec header by H1 -- AH and/or ESP transport, or
ESP or AH tunnel. Within a single host, multiple applications may
share an SPI and nesting of security associations may occur. (See
Section 4.5 Basic Combinations of Security Associations for
description of the combinations that MUST be supported). The diagram
below illustrates an example of security associations between a pair
of hosts (as viewed from the perspective of one of the hosts.) (ESPx
or AHx = transport mode)

Socket 1 -------------------------|
|
Socket 2 (ESPx/SPI-A) ---------- AHx (SPI-B) -- Internet

In order to figure out the PMTU for each socket that maps to SPI-B,
it will be necessary to have backpointers from SPI-B to each of the 2
paths that lead to it -- Socket 1 and Socket 2/SPI-A.

B.3.3 Granularity of Maintaining PMTU Data

In hosts, the granularity with which PMTU ICMP processing can be done
differs depending on the implementation situation. Looking at a
host, there are three situations that are of interest with respect to
PMTU issues:

a. Integration of IPsec into the native IP implementation
b. Bump-in-the-stack implementations, where IPsec is implemented
"underneath" an existing implementation of a TCP/IP protocol
stack, between the native IP and the local network drivers
c. No IPsec implementation -- This case is included because it is
relevant in cases where a security gateway is sending PMTU
information back to a host.

Only in case (a) can the PMTU data be maintained at the same
granularity as communication associations. In the other cases, the
IP layer will maintain PMTU data at the granularity of Source and
Destination IP addresses (and optionally TOS/Class), as described in
RFC 1191. This is an important difference, because more than one
communication association may map to the same source and destination
IP addresses, and each communication association may have a different
amount of IPsec header overhead (e.g., due to use of different
transforms or different algorithms). The examples below illustrate
this.

In cases (a) and (b)... Suppose you have the following situation.
H1 is sending to H2 and the packet to be sent from R1 to R2 exceeds
the PMTU of the network hop between them.

==================================
| |
H1* --- R1 ----- R2 ---- R3 ---- H2*
^ |
|.......|

If R1 is configured to not fragment subscriber traffic, then R1 sends
an ICMP PMTU message with the appropriate PMTU to H1. H1's
processing would vary with the nature of the implementation. In case
(a) (native IP), the security services are bound to sockets or the
equivalent. Here the IP/IPsec implementation in H1 can store/update
the PMTU for the associated socket. In case (b), the IP layer in H1
can store/update the PMTU but only at the granularity of Source and
Destination addresses and possibly TOS/Class, as noted above. So the
result may be sub-optimal, since the PMTU for a given
SRC/DST/TOS/Class will be the subtraction of the largest amount of
IPsec header used for any communication association between a given
source and destination.

In case (c), there has to be a security gateway to have any IPsec
processing. So suppose you have the following situation. H1 is
sending to H2 and the packet to be sent from SG1 to R exceeds the
PMTU of the network hop between them.

================
| |
H1 ---- SG1* --- R --- SG2* ---- H2
^ |
|.......|

As described above for case (b), the IP layer in H1 can store/update
the PMTU but only at the granularity of Source and Destination
addresses, and possibly TOS/Class. So the result may be sub-optimal,
since the PMTU for a given SRC/DST/TOS/Class will be the subtraction
of the largest amount of IPsec header used for any communication
association between a given source and destination.

B.3.4 Per Socket Maintenance of PMTU Data

Implementation of the calculation of PMTU (Section B.3.2) and support
for PMTUs at the granularity of individual "communication
associations" (Section B.3.3) is a local matter. However, a socket-
based implementation of IPsec in a host SHOULD maintain the
information on a per socket basis. Bump in the stack systems MUST
pass an ICMP PMTU to the host IP implementation, after adjusting it
for any IPsec header overhead added by these systems. The
determination of the overhead SHOULD be determined by analysis of the
SPI and any other selector information present in a returned ICMP
PMTU message.

B.3.5 Delivery of PMTU Data to the Transport Layer

The host mechanism for getting the updated PMTU to the transport
layer is unchanged, as specified in RFC 1191 (Path MTU Discovery).

B.3.6 Aging of PMTU Data

This topic is covered in Section 6.1.2.4.

Appendix C -- Sequence Space Window Code Example

This appendix contains a routine that implements a bitmask check for
a 32 packet window. It was provided by James Hughes
(jim_hughes@stortek.com) and Harry Varnis (hgv@anubis.network.com)
and is intended as an implementation example. Note that this code
both checks for a replay and updates the window. Thus the algorithm,
as shown, should only be called AFTER the packet has been
authenticated. Implementers might wish to consider splitting the
code to do the check for replays before computing the ICV. If the
packet is not a replay, the code would then compute the ICV, (discard
any bad packets), and if the packet is OK, update the window.

#include <stdio.h>
#include <stdlib.h>
typedef unsigned long u_long;

enum {
ReplayWindowSize = 32
};

u_long bitmap = 0; /* session state - must be 32 bits */
u_long lastSeq = 0; /* session state */

/* Returns 0 if packet disallowed, 1 if packet permitted */
int ChkReplayWindow(u_long seq);

int ChkReplayWindow(u_long seq) {
u_long diff;

if (seq == 0) return 0; /* first == 0 or wrapped */
if (seq > lastSeq) { /* new larger sequence number */
diff = seq - lastSeq;
if (diff < ReplayWindowSize) { /* In window */
bitmap <<= diff;
bitmap |= 1; /* set bit for this packet */
} else bitmap = 1; /* This packet has a "way larger" */
lastSeq = seq;
return 1; /* larger is good */
}
diff = lastSeq - seq;
if (diff >= ReplayWindowSize) return 0; /* too old or wrapped */
if (bitmap & ((u_long)1 << diff)) return 0; /* already seen */
bitmap |= ((u_long)1 << diff); /* mark as seen */
return 1; /* out of order but good */
}

char string_buffer[512];

#define STRING_BUFFER_SIZE sizeof(string_buffer)

int main() {
int result;
u_long last, current, bits;

printf("Input initial state (bits in hex, last msgnum):\n");
if (!fgets(string_buffer, STRING_BUFFER_SIZE, stdin)) exit(0);
sscanf(string_buffer, "%lx %lu", &bits, &last);
if (last != 0)
bits |= 1;
bitmap = bits;
lastSeq = last;
printf("bits:%08lx last:%lu\n", bitmap, lastSeq);
printf("Input value to test (current):\n");

while (1) {
if (!fgets(string_buffer, STRING_BUFFER_SIZE, stdin)) break;
sscanf(string_buffer, "%lu", ¤t);
result = ChkReplayWindow(current);
printf("%-3s", result ? "OK" : "BAD");
printf(" bits:%08lx last:%lu\n", bitmap, lastSeq);
}
return 0;
}

Appendix D -- Categorization of ICMP messages

The tables below characterize ICMP messages as being either host
generated, router generated, both, unassigned/unknown. The first set
are IPv4. The second set are IPv6.

IPv4

Type Name/Codes Reference
========================================================================
HOST GENERATED:
3 Destination Unreachable
2 Protocol Unreachable [RFC792]
3 Port Unreachable [RFC792]
8 Source Host Isolated [RFC792]
14 Host Precedence Violation [RFC1812]
10 Router Selection [RFC1256]

Type Name/Codes Reference
========================================================================
ROUTER GENERATED:
3 Destination Unreachable
0 Net Unreachable [RFC792]
4 Fragmentation Needed, Don't Fragment was Set [RFC792]
5 Source Route Failed [RFC792]
6 Destination Network Unknown [RFC792]
7 Destination Host Unknown [RFC792]
9 Comm. w/Dest. Net. is Administratively Prohibited [RFC792]
11 Destination Network Unreachable for Type of Service[RFC792]
5 Redirect
0 Redirect Datagram for the Network (or subnet) [RFC792]
2 Redirect Datagram for the Type of Service & Network[RFC792]
9 Router Advertisement [RFC1256]
18 Address Mask Reply [RFC950]

IPv4
Type Name/Codes Reference
========================================================================
BOTH ROUTER AND HOST GENERATED:
0 Echo Reply [RFC792]
3 Destination Unreachable
1 Host Unreachable [RFC792]
10 Comm. w/Dest. Host is Administratively Prohibited [RFC792]
12 Destination Host Unreachable for Type of Service [RFC792]
13 Communication Administratively Prohibited [RFC1812]
15 Precedence cutoff in effect [RFC1812]
4 Source Quench [RFC792]
5 Redirect
1 Redirect Datagram for the Host [RFC792]
3 Redirect Datagram for the Type of Service and Host [RFC792]
6 Alternate Host Address [JBP]
8 Echo [RFC792]
11 Time Exceeded [RFC792]
12 Parameter Problem [RFC792,RFC1108]
13 Timestamp [RFC792]
14 Timestamp Reply [RFC792]
15 Information Request [RFC792]
16 Information Reply [RFC792]
17 Address Mask Request [RFC950]
30 Traceroute [RFC1393]
31 Datagram Conversion Error [RFC1475]
32 Mobile Host Redirect [Johnson]
39 SKIP [Markson]
40 Photuris [Simpson]

Type Name/Codes Reference
========================================================================
UNASSIGNED TYPE OR UNKNOWN GENERATOR:
1 Unassigned [JBP]
2 Unassigned [JBP]
7 Unassigned [JBP]
19 Reserved (for Security) [Solo]
20-29 Reserved (for Robustness Experiment) [ZSu]
33 IPv6 Where-Are-You [Simpson]
34 IPv6 I-Am-Here [Simpson]
35 Mobile Registration Request [Simpson]
36 Mobile Registration Reply [Simpson]
37 Domain Name Request [Simpson]
38 Domain Name Reply [Simpson]
41-255 Reserved [JBP]

IPv6

Type Name/Codes Reference
========================================================================
HOST GENERATED:
1 Destination Unreachable [RFC 1885]
4 Port Unreachable

Type Name/Codes Reference
========================================================================
ROUTER GENERATED:
1 Destination Unreachable [RFC1885]
0 No Route to Destination
1 Comm. w/Destination is Administratively Prohibited
2 Not a Neighbor
3 Address Unreachable
2 Packet Too Big [RFC1885]
0
3 Time Exceeded [RFC1885]
0 Hop Limit Exceeded in Transit
1 Fragment reassembly time exceeded

Type Name/Codes Reference
========================================================================
BOTH ROUTER AND HOST GENERATED:
4 Parameter Problem [RFC1885]
0 Erroneous Header Field Encountered
1 Unrecognized Next Header Type Encountered
2 Unrecognized IPv6 Option Encountered

References

[BL73] Bell, D.E. & LaPadula, L.J., "Secure Computer Systems:
Mathematical Foundations and Model", Technical Report M74-
244, The MITRE Corporation, Bedford, MA, May 1973.

[Bra97] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Level", BCP 14, RFC 2119, March 1997.

[DoD85] US National Computer Security Center, "Department of
Defense Trusted Computer System Evaluation Criteria", DoD
5200.28-STD, US Department of Defense, Ft. Meade, MD.,
December 1985.

[DoD87] US National Computer Security Center, "Trusted Network
Interpretation of the Trusted Computer System Evaluation
Criteria", NCSC-TG-005, Version 1, US Department of
Defense, Ft. Meade, MD., 31 July 1987.

[HA94] Haller, N., and R. Atkinson, "On Internet Authentication",
RFC 1704, October 1994.

[HC98] Harkins, D., and D. Carrel, "The Internet Key Exchange
(IKE)", RFC 2409, November 1998.

[HM97] Harney, H., and C. Muckenhirn, "Group Key Management
Protocol (GKMP) Architecture", RFC 2094, July 1997.

[ISO] ISO/IEC JTC1/SC6, Network Layer Security Protocol, ISO-IEC
DIS 11577, International Standards Organisation, Geneva,
Switzerland, 29 November 1992.

[IB93] John Ioannidis and Matt Blaze, "Architecture and
Implementation of Network-layer Security Under Unix",
Proceedings of USENIX Security Symposium, Santa Clara, CA,
October 1993.

[IBK93] John Ioannidis, Matt Blaze, & Phil Karn, "swIPe: Network-
Layer Security for IP", presentation at the Spring 1993
IETF Meeting, Columbus, Ohio

[KA98a] Kent, S., and R. Atkinson, "IP Authentication Header", RFC
2402, November 1998.

[KA98b] Kent, S., and R. Atkinson, "IP Encapsulating Security
Payload (ESP)", RFC 2406, November 1998.

[Ken91] Kent, S., "US DoD Security Options for the Internet
Protocol", RFC 1108, November 1991.

[MSST97] Maughan, D., Schertler, M., Schneider, M., and J. Turner,
"Internet Security Association and Key Management Protocol
(ISAKMP)", RFC 2408, November 1998.

[Orm97] Orman, H., "The OAKLEY Key Determination Protocol", RFC
2412, November 1998.

[Pip98] Piper, D., "The Internet IP Security Domain of
Interpretation for ISAKMP", RFC 2407, November 1998.

[Sch94] Bruce Schneier, Applied Cryptography, Section 8.6, John
Wiley & Sons, New York, NY, 1994.

[SDNS] SDNS Secure Data Network System, Security Protocol 3, SP3,
Document SDN.301, Revision 1.5, 15 May 1989, published in
NIST Publication NIST-IR-90-4250, February 1990.

[SMPT98] Shacham, A., Monsour, R., Pereira, R., and M. Thomas, "IP
Payload Compression Protocol (IPComp)", RFC 2393, August
1998.

[TDG97] Thayer, R., Doraswamy, N., and R. Glenn, "IP Security
Document Roadmap", RFC 2411, November 1998.

[VK83] V.L. Voydock & S.T. Kent, "Security Mechanisms in High-
level Networks", ACM Computing Surveys, Vol. 15, No. 2,
June 1983.

Disclaimer

The views and specification expressed in this document are those of
the authors and are not necessarily those of their employers. The
authors and their employers specifically disclaim responsibility for
any problems arising from correct or incorrect implementation or use
of this design.

Author Information

Stephen Kent
BBN Corporation
70 Fawcett Street
Cambridge, MA 02140
USA

Phone: +1 (617) 873-3988
EMail: kent@bbn.com

Randall Atkinson
@Home Network
425 Broadway
Redwood City, CA 94063
USA

Phone: +1 (415) 569-5000
EMail: rja@corp.home.net

Copyright (C) The Internet Society (1998). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


RFC 1661 – The Point-to-Point Protocol (PPP)

 
Network Working Group                                 W. Simpson, Editor
Request for Comments: 1661 Daydreamer
STD: 51 July 1994
Obsoletes: 1548
Category: Standards Track

The Point-to-Point Protocol (PPP)

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Abstract

The Point-to-Point Protocol (PPP) provides a standard method for
transporting multi-protocol datagrams over point-to-point links. PPP
is comprised of three main components:

1. A method for encapsulating multi-protocol datagrams.

2. A Link Control Protocol (LCP) for establishing, configuring,
and testing the data-link connection.

3. A family of Network Control Protocols (NCPs) for establishing
and configuring different network-layer protocols.

This document defines the PPP organization and methodology, and the
PPP encapsulation, together with an extensible option negotiation
mechanism which is able to negotiate a rich assortment of
configuration parameters and provides additional management
functions. The PPP Link Control Protocol (LCP) is described in terms
of this mechanism.

Table of Contents

1. Introduction .......................................... 1
1.1 Specification of Requirements ................... 2
1.2 Terminology ..................................... 3

2. PPP Encapsulation ..................................... 4

3. PPP Link Operation .................................... 6
3.1 Overview ........................................ 6
3.2 Phase Diagram ................................... 6
3.3 Link Dead (physical-layer not ready) ............ 7
3.4 Link Establishment Phase ........................ 7
3.5 Authentication Phase ............................ 8
3.6 Network-Layer Protocol Phase .................... 8
3.7 Link Termination Phase .......................... 9

4. The Option Negotiation Automaton ...................... 11
4.1 State Transition Table .......................... 12
4.2 States .......................................... 14
4.3 Events .......................................... 16
4.4 Actions ......................................... 21
4.5 Loop Avoidance .................................. 23
4.6 Counters and Timers ............................. 24

5. LCP Packet Formats .................................... 26
5.1 Configure-Request ............................... 28
5.2 Configure-Ack ................................... 29
5.3 Configure-Nak ................................... 30
5.4 Configure-Reject ................................ 31
5.5 Terminate-Request and Terminate-Ack ............. 33
5.6 Code-Reject ..................................... 34
5.7 Protocol-Reject ................................. 35
5.8 Echo-Request and Echo-Reply ..................... 36
5.9 Discard-Request ................................. 37

6. LCP Configuration Options ............................. 39
6.1 Maximum-Receive-Unit (MRU) ...................... 41
6.2 Authentication-Protocol ......................... 42
6.3 Quality-Protocol ................................ 43
6.4 Magic-Number .................................... 45
6.5 Protocol-Field-Compression (PFC) ................ 48
6.6 Address-and-Control-Field-Compression (ACFC)

SECURITY CONSIDERATIONS ...................................... 51
REFERENCES ................................................... 51
ACKNOWLEDGEMENTS ............................................. 51
CHAIR'S ADDRESS .............................................. 52
EDITOR'S ADDRESS ............................................. 52

1. Introduction

The Point-to-Point Protocol is designed for simple links which
transport packets between two peers. These links provide full-duplex
simultaneous bi-directional operation, and are assumed to deliver
packets in order. It is intended that PPP provide a common solution
for easy connection of a wide variety of hosts, bridges and routers
[1].

Encapsulation

The PPP encapsulation provides for multiplexing of different
network-layer protocols simultaneously over the same link. The
PPP encapsulation has been carefully designed to retain
compatibility with most commonly used supporting hardware.

Only 8 additional octets are necessary to form the encapsulation
when used within the default HDLC-like framing. In environments
where bandwidth is at a premium, the encapsulation and framing may
be shortened to 2 or 4 octets.

To support high speed implementations, the default encapsulation
uses only simple fields, only one of which needs to be examined
for demultiplexing. The default header and information fields
fall on 32-bit boundaries, and the trailer may be padded to an
arbitrary boundary.

Link Control Protocol

In order to be sufficiently versatile to be portable to a wide
variety of environments, PPP provides a Link Control Protocol
(LCP). The LCP is used to automatically agree upon the
encapsulation format options, handle varying limits on sizes of
packets, detect a looped-back link and other common
misconfiguration errors, and terminate the link. Other optional
facilities provided are authentication of the identity of its peer
on the link, and determination when a link is functioning properly
and when it is failing.

Network Control Protocols

Point-to-Point links tend to exacerbate many problems with the
current family of network protocols. For instance, assignment and
management of IP addresses, which is a problem even in LAN
environments, is especially difficult over circuit-switched
point-to-point links (such as dial-up modem servers). These
problems are handled by a family of Network Control Protocols
(NCPs), which each manage the specific needs required by their

respective network-layer protocols. These NCPs are defined in
companion documents.

Configuration

It is intended that PPP links be easy to configure. By design,
the standard defaults handle all common configurations. The
implementor can specify improvements to the default configuration,
which are automatically communicated to the peer without operator
intervention. Finally, the operator may explicitly configure
options for the link which enable the link to operate in
environments where it would otherwise be impossible.

This self-configuration is implemented through an extensible
option negotiation mechanism, wherein each end of the link
describes to the other its capabilities and requirements.
Although the option negotiation mechanism described in this
document is specified in terms of the Link Control Protocol (LCP),
the same facilities are designed to be used by other control
protocols, especially the family of NCPs.

1.1. Specification of Requirements

In this document, several words are used to signify the requirements
of the specification. These words are often capitalized.

MUST This word, or the adjective "required", means that the
definition is an absolute requirement of the specification.

MUST NOT This phrase means that the definition is an absolute
prohibition of the specification.

SHOULD This word, or the adjective "recommended", means that there
may exist valid reasons in particular circumstances to
ignore this item, but the full implications must be
understood and carefully weighed before choosing a
different course.

MAY This word, or the adjective "optional", means that this
item is one of an allowed set of alternatives. An
implementation which does not include this option MUST be
prepared to interoperate with another implementation which
does include the option.

1.2. Terminology

This document frequently uses the following terms:

datagram The unit of transmission in the network layer (such as IP).
A datagram may be encapsulated in one or more packets
passed to the data link layer.

frame The unit of transmission at the data link layer. A frame
may include a header and/or a trailer, along with some
number of units of data.

packet The basic unit of encapsulation, which is passed across the
interface between the network layer and the data link
layer. A packet is usually mapped to a frame; the
exceptions are when data link layer fragmentation is being
performed, or when multiple packets are incorporated into a
single frame.

peer The other end of the point-to-point link.

silently discard
The implementation discards the packet without further
processing. The implementation SHOULD provide the
capability of logging the error, including the contents of
the silently discarded packet, and SHOULD record the event
in a statistics counter.

2. PPP Encapsulation

The PPP encapsulation is used to disambiguate multiprotocol
datagrams. This encapsulation requires framing to indicate the
beginning and end of the encapsulation. Methods of providing framing
are specified in companion documents.

A summary of the PPP encapsulation is shown below. The fields are
transmitted from left to right.

+----------+-------------+---------+
| Protocol | Information | Padding |
| 8/16 bits| * | * |
+----------+-------------+---------+

Protocol Field

The Protocol field is one or two octets, and its value identifies
the datagram encapsulated in the Information field of the packet.
The field is transmitted and received most significant octet
first.

The structure of this field is consistent with the ISO 3309
extension mechanism for address fields. All Protocols MUST be
odd; the least significant bit of the least significant octet MUST
equal "1". Also, all Protocols MUST be assigned such that the
least significant bit of the most significant octet equals "0".
Frames received which don't comply with these rules MUST be
treated as having an unrecognized Protocol.

Protocol field values in the "0***" to "3***" range identify the
network-layer protocol of specific packets, and values in the
"8***" to "b***" range identify packets belonging to the
associated Network Control Protocols (NCPs), if any.

Protocol field values in the "4***" to "7***" range are used for
protocols with low volume traffic which have no associated NCP.
Protocol field values in the "c***" to "f***" range identify
packets as link-layer Control Protocols (such as LCP).

Up-to-date values of the Protocol field are specified in the most
recent "Assigned Numbers" RFC [2]. This specification reserves
the following values:

Value (in hex) Protocol Name

0001 Padding Protocol
0003 to 001f reserved (transparency inefficient)
007d reserved (Control Escape)
00cf reserved (PPP NLPID)
00ff reserved (compression inefficient)

8001 to 801f unused
807d unused
80cf unused
80ff unused

c021 Link Control Protocol
c023 Password Authentication Protocol
c025 Link Quality Report
c223 Challenge Handshake Authentication Protocol

Developers of new protocols MUST obtain a number from the Internet
Assigned Numbers Authority (IANA), at IANA@isi.edu.

Information Field

The Information field is zero or more octets. The Information
field contains the datagram for the protocol specified in the
Protocol field.

The maximum length for the Information field, including Padding,
but not including the Protocol field, is termed the Maximum
Receive Unit (MRU), which defaults to 1500 octets. By
negotiation, consenting PPP implementations may use other values
for the MRU.

Padding

On transmission, the Information field MAY be padded with an
arbitrary number of octets up to the MRU. It is the
responsibility of each protocol to distinguish padding octets from
real information.

3. PPP Link Operation

3.1. Overview

In order to establish communications over a point-to-point link, each
end of the PPP link MUST first send LCP packets to configure and test
the data link. After the link has been established, the peer MAY be
authenticated.

Then, PPP MUST send NCP packets to choose and configure one or more
network-layer protocols. Once each of the chosen network-layer
protocols has been configured, datagrams from each network-layer
protocol can be sent over the link.

The link will remain configured for communications until explicit LCP
or NCP packets close the link down, or until some external event
occurs (an inactivity timer expires or network administrator
intervention).

3.2. Phase Diagram

In the process of configuring, maintaining and terminating the
point-to-point link, the PPP link goes through several distinct
phases which are specified in the following simplified state diagram:

+------+ +-----------+ +--------------+
| | UP | | OPENED | | SUCCESS/NONE
| Dead |------->| Establish |---------->| Authenticate |--+
| | | | | | |
+------+ +-----------+ +--------------+ |
^ | | |
| FAIL | FAIL | |
+<--------------+ +----------+ |
| | |
| +-----------+ | +---------+ |
| DOWN | | | CLOSING | | |
+------------| Terminate |<---+<----------| Network |<-+
| | | |
+-----------+ +---------+

Not all transitions are specified in this diagram. The following
semantics MUST be followed.

3.3. Link Dead (physical-layer not ready)

The link necessarily begins and ends with this phase. When an
external event (such as carrier detection or network administrator
configuration) indicates that the physical-layer is ready to be used,
PPP will proceed to the Link Establishment phase.

During this phase, the LCP automaton (described later) will be in the
Initial or Starting states. The transition to the Link Establishment
phase will signal an Up event to the LCP automaton.

Implementation Note:

Typically, a link will return to this phase automatically after
the disconnection of a modem. In the case of a hard-wired link,
this phase may be extremely short -- merely long enough to detect
the presence of the device.

3.4. Link Establishment Phase

The Link Control Protocol (LCP) is used to establish the connection
through an exchange of Configure packets. This exchange is complete,
and the LCP Opened state entered, once a Configure-Ack packet
(described later) has been both sent and received.

All Configuration Options are assumed to be at default values unless
altered by the configuration exchange. See the chapter on LCP
Configuration Options for further discussion.

It is important to note that only Configuration Options which are
independent of particular network-layer protocols are configured by
LCP. Configuration of individual network-layer protocols is handled
by separate Network Control Protocols (NCPs) during the Network-Layer
Protocol phase.

Any non-LCP packets received during this phase MUST be silently
discarded.

The receipt of the LCP Configure-Request causes a return to the Link
Establishment phase from the Network-Layer Protocol phase or
Authentication phase.

3.5. Authentication Phase

On some links it may be desirable to require a peer to authenticate
itself before allowing network-layer protocol packets to be
exchanged.

By default, authentication is not mandatory. If an implementation
desires that the peer authenticate with some specific authentication
protocol, then it MUST request the use of that authentication
protocol during Link Establishment phase.

Authentication SHOULD take place as soon as possible after link
establishment. However, link quality determination MAY occur
concurrently. An implementation MUST NOT allow the exchange of link
quality determination packets to delay authentication indefinitely.

Advancement from the Authentication phase to the Network-Layer
Protocol phase MUST NOT occur until authentication has completed. If
authentication fails, the authenticator SHOULD proceed instead to the
Link Termination phase.

Only Link Control Protocol, authentication protocol, and link quality
monitoring packets are allowed during this phase. All other packets
received during this phase MUST be silently discarded.

Implementation Notes:

An implementation SHOULD NOT fail authentication simply due to
timeout or lack of response. The authentication SHOULD allow some
method of retransmission, and proceed to the Link Termination
phase only after a number of authentication attempts has been
exceeded.

The implementation responsible for commencing Link Termination
phase is the implementation which has refused authentication to
its peer.

3.6. Network-Layer Protocol Phase

Once PPP has finished the previous phases, each network-layer
protocol (such as IP, IPX, or AppleTalk) MUST be separately
configured by the appropriate Network Control Protocol (NCP).

Each NCP MAY be Opened and Closed at any time.

Implementation Note:

Because an implementation may initially use a significant amount
of time for link quality determination, implementations SHOULD
avoid fixed timeouts when waiting for their peers to configure a
NCP.

After a NCP has reached the Opened state, PPP will carry the
corresponding network-layer protocol packets. Any supported
network-layer protocol packets received when the corresponding NCP is
not in the Opened state MUST be silently discarded.

Implementation Note:

While LCP is in the Opened state, any protocol packet which is
unsupported by the implementation MUST be returned in a Protocol-
Reject (described later). Only protocols which are supported are
silently discarded.

During this phase, link traffic consists of any possible combination
of LCP, NCP, and network-layer protocol packets.

3.7. Link Termination Phase

PPP can terminate the link at any time. This might happen because of
the loss of carrier, authentication failure, link quality failure,
the expiration of an idle-period timer, or the administrative closing
of the link.

LCP is used to close the link through an exchange of Terminate
packets. When the link is closing, PPP informs the network-layer
protocols so that they may take appropriate action.

After the exchange of Terminate packets, the implementation SHOULD
signal the physical-layer to disconnect in order to enforce the
termination of the link, particularly in the case of an
authentication failure. The sender of the Terminate-Request SHOULD
disconnect after receiving a Terminate-Ack, or after the Restart
counter expires. The receiver of a Terminate-Request SHOULD wait for
the peer to disconnect, and MUST NOT disconnect until at least one
Restart time has passed after sending a Terminate-Ack. PPP SHOULD
proceed to the Link Dead phase.

Any non-LCP packets received during this phase MUST be silently
discarded.

Implementation Note:

The closing of the link by LCP is sufficient. There is no need
for each NCP to send a flurry of Terminate packets. Conversely,
the fact that one NCP has Closed is not sufficient reason to cause
the termination of the PPP link, even if that NCP was the only NCP
currently in the Opened state.

4. The Option Negotiation Automaton

The finite-state automaton is defined by events, actions and state
transitions. Events include reception of external commands such as
Open and Close, expiration of the Restart timer, and reception of
packets from a peer. Actions include the starting of the Restart
timer and transmission of packets to the peer.

Some types of packets -- Configure-Naks and Configure-Rejects, or
Code-Rejects and Protocol-Rejects, or Echo-Requests, Echo-Replies and
Discard-Requests -- are not differentiated in the automaton
descriptions. As will be described later, these packets do indeed
serve different functions. However, they always cause the same
transitions.

Events Actions

Up = lower layer is Up tlu = This-Layer-Up
Down = lower layer is Down tld = This-Layer-Down
Open = administrative Open tls = This-Layer-Started
Close= administrative Close tlf = This-Layer-Finished

TO+ = Timeout with counter > 0 irc = Initialize-Restart-Count
TO- = Timeout with counter expired zrc = Zero-Restart-Count

RCR+ = Receive-Configure-Request (Good) scr = Send-Configure-Request
RCR- = Receive-Configure-Request (Bad)
RCA = Receive-Configure-Ack sca = Send-Configure-Ack
RCN = Receive-Configure-Nak/Rej scn = Send-Configure-Nak/Rej

RTR = Receive-Terminate-Request str = Send-Terminate-Request
RTA = Receive-Terminate-Ack sta = Send-Terminate-Ack

RUC = Receive-Unknown-Code scj = Send-Code-Reject
RXJ+ = Receive-Code-Reject (permitted)
or Receive-Protocol-Reject
RXJ- = Receive-Code-Reject (catastrophic)
or Receive-Protocol-Reject
RXR = Receive-Echo-Request ser = Send-Echo-Reply
or Receive-Echo-Reply
or Receive-Discard-Request

4.1. State Transition Table

The complete state transition table follows. States are indicated
horizontally, and events are read vertically. State transitions and
actions are represented in the form action/new-state. Multiple
actions are separated by commas, and may continue on succeeding lines
as space requires; multiple actions may be implemented in any
convenient order. The state may be followed by a letter, which
indicates an explanatory footnote. The dash ('-') indicates an
illegal transition.

| State
| 0 1 2 3 4 5
Events| Initial Starting Closed Stopped Closing Stopping
------+-----------------------------------------------------------
Up | 2 irc,scr/6 - - - -
Down | - - 0 tls/1 0 1
Open | tls/1 1 irc,scr/6 3r 5r 5r
Close| 0 tlf/0 2 2 4 4
|
TO+ | - - - - str/4 str/5
TO- | - - - - tlf/2 tlf/3
|
RCR+ | - - sta/2 irc,scr,sca/8 4 5
RCR- | - - sta/2 irc,scr,scn/6 4 5
RCA | - - sta/2 sta/3 4 5
RCN | - - sta/2 sta/3 4 5
|
RTR | - - sta/2 sta/3 sta/4 sta/5
RTA | - - 2 3 tlf/2 tlf/3
|
RUC | - - scj/2 scj/3 scj/4 scj/5
RXJ+ | - - 2 3 4 5
RXJ- | - - tlf/2 tlf/3 tlf/2 tlf/3
|
RXR | - - 2 3 4 5

| State
| 6 7 8 9
Events| Req-Sent Ack-Rcvd Ack-Sent Opened
------+-----------------------------------------
Up | - - - -
Down | 1 1 1 tld/1
Open | 6 7 8 9r
Close|irc,str/4 irc,str/4 irc,str/4 tld,irc,str/4
|
TO+ | scr/6 scr/6 scr/8 -
TO- | tlf/3p tlf/3p tlf/3p -
|
RCR+ | sca/8 sca,tlu/9 sca/8 tld,scr,sca/8
RCR- | scn/6 scn/7 scn/6 tld,scr,scn/6
RCA | irc/7 scr/6x irc,tlu/9 tld,scr/6x
RCN |irc,scr/6 scr/6x irc,scr/8 tld,scr/6x
|
RTR | sta/6 sta/6 sta/6 tld,zrc,sta/5
RTA | 6 6 8 tld,scr/6
|
RUC | scj/6 scj/7 scj/8 scj/9
RXJ+ | 6 6 8 9
RXJ- | tlf/3 tlf/3 tlf/3 tld,irc,str/5
|
RXR | 6 7 8 ser/9

The states in which the Restart timer is running are identifiable by
the presence of TO events. Only the Send-Configure-Request, Send-
Terminate-Request and Zero-Restart-Count actions start or re-start
the Restart timer. The Restart timer is stopped when transitioning
from any state where the timer is running to a state where the timer
is not running.

The events and actions are defined according to a message passing
architecture, rather than a signalling architecture. If an action is
desired to control specific signals (such as DTR), additional actions
are likely to be required.

[p] Passive option; see Stopped state discussion.

[r] Restart option; see Open event discussion.

[x] Crossed connection; see RCA event discussion.

4.2. States

Following is a more detailed description of each automaton state.

Initial

In the Initial state, the lower layer is unavailable (Down), and
no Open has occurred. The Restart timer is not running in the
Initial state.

Starting

The Starting state is the Open counterpart to the Initial state.
An administrative Open has been initiated, but the lower layer is
still unavailable (Down). The Restart timer is not running in the
Starting state.

When the lower layer becomes available (Up), a Configure-Request
is sent.

Closed

In the Closed state, the link is available (Up), but no Open has
occurred. The Restart timer is not running in the Closed state.

Upon reception of Configure-Request packets, a Terminate-Ack is
sent. Terminate-Acks are silently discarded to avoid creating a
loop.

Stopped

The Stopped state is the Open counterpart to the Closed state. It
is entered when the automaton is waiting for a Down event after
the This-Layer-Finished action, or after sending a Terminate-Ack.
The Restart timer is not running in the Stopped state.

Upon reception of Configure-Request packets, an appropriate
response is sent. Upon reception of other packets, a Terminate-
Ack is sent. Terminate-Acks are silently discarded to avoid
creating a loop.

Rationale:

The Stopped state is a junction state for link termination,
link configuration failure, and other automaton failure modes.
These potentially separate states have been combined.

There is a race condition between the Down event response (from

the This-Layer-Finished action) and the Receive-Configure-
Request event. When a Configure-Request arrives before the
Down event, the Down event will supercede by returning the
automaton to the Starting state. This prevents attack by
repetition.

Implementation Option:

After the peer fails to respond to Configure-Requests, an
implementation MAY wait passively for the peer to send
Configure-Requests. In this case, the This-Layer-Finished
action is not used for the TO- event in states Req-Sent, Ack-
Rcvd and Ack-Sent.

This option is useful for dedicated circuits, or circuits which
have no status signals available, but SHOULD NOT be used for
switched circuits.

Closing

In the Closing state, an attempt is made to terminate the
connection. A Terminate-Request has been sent and the Restart
timer is running, but a Terminate-Ack has not yet been received.

Upon reception of a Terminate-Ack, the Closed state is entered.
Upon the expiration of the Restart timer, a new Terminate-Request
is transmitted, and the Restart timer is restarted. After the
Restart timer has expired Max-Terminate times, the Closed state is
entered.

Stopping

The Stopping state is the Open counterpart to the Closing state.
A Terminate-Request has been sent and the Restart timer is
running, but a Terminate-Ack has not yet been received.

Rationale:

The Stopping state provides a well defined opportunity to
terminate a link before allowing new traffic. After the link
has terminated, a new configuration may occur via the Stopped
or Starting states.

Request-Sent

In the Request-Sent state an attempt is made to configure the
connection. A Configure-Request has been sent and the Restart
timer is running, but a Configure-Ack has not yet been received

nor has one been sent.

Ack-Received

In the Ack-Received state, a Configure-Request has been sent and a
Configure-Ack has been received. The Restart timer is still
running, since a Configure-Ack has not yet been sent.

Ack-Sent

In the Ack-Sent state, a Configure-Request and a Configure-Ack
have both been sent, but a Configure-Ack has not yet been
received. The Restart timer is running, since a Configure-Ack has
not yet been received.

Opened

In the Opened state, a Configure-Ack has been both sent and
received. The Restart timer is not running.

When entering the Opened state, the implementation SHOULD signal
the upper layers that it is now Up. Conversely, when leaving the
Opened state, the implementation SHOULD signal the upper layers
that it is now Down.

4.3. Events

Transitions and actions in the automaton are caused by events.

Up

This event occurs when a lower layer indicates that it is ready to
carry packets.

Typically, this event is used by a modem handling or calling
process, or by some other coupling of the PPP link to the physical
media, to signal LCP that the link is entering Link Establishment
phase.

It also can be used by LCP to signal each NCP that the link is
entering Network-Layer Protocol phase. That is, the This-Layer-Up
action from LCP triggers the Up event in the NCP.

Down

This event occurs when a lower layer indicates that it is no

longer ready to carry packets.

Typically, this event is used by a modem handling or calling
process, or by some other coupling of the PPP link to the physical
media, to signal LCP that the link is entering Link Dead phase.

It also can be used by LCP to signal each NCP that the link is
leaving Network-Layer Protocol phase. That is, the This-Layer-
Down action from LCP triggers the Down event in the NCP.

Open

This event indicates that the link is administratively available
for traffic; that is, the network administrator (human or program)
has indicated that the link is allowed to be Opened. When this
event occurs, and the link is not in the Opened state, the
automaton attempts to send configuration packets to the peer.

If the automaton is not able to begin configuration (the lower
layer is Down, or a previous Close event has not completed), the
establishment of the link is automatically delayed.

When a Terminate-Request is received, or other events occur which
cause the link to become unavailable, the automaton will progress
to a state where the link is ready to re-open. No additional
administrative intervention is necessary.

Implementation Option:

Experience has shown that users will execute an additional Open
command when they want to renegotiate the link. This might
indicate that new values are to be negotiated.

Since this is not the meaning of the Open event, it is
suggested that when an Open user command is executed in the
Opened, Closing, Stopping, or Stopped states, the
implementation issue a Down event, immediately followed by an
Up event. Care must be taken that an intervening Down event
cannot occur from another source.

The Down followed by an Up will cause an orderly renegotiation
of the link, by progressing through the Starting to the
Request-Sent state. This will cause the renegotiation of the
link, without any harmful side effects.

Close

This event indicates that the link is not available for traffic;

that is, the network administrator (human or program) has
indicated that the link is not allowed to be Opened. When this
event occurs, and the link is not in the Closed state, the
automaton attempts to terminate the connection. Futher attempts
to re-configure the link are denied until a new Open event occurs.

Implementation Note:

When authentication fails, the link SHOULD be terminated, to
prevent attack by repetition and denial of service to other
users. Since the link is administratively available (by
definition), this can be accomplished by simulating a Close
event to the LCP, immediately followed by an Open event. Care
must be taken that an intervening Close event cannot occur from
another source.

The Close followed by an Open will cause an orderly termination
of the link, by progressing through the Closing to the Stopping
state, and the This-Layer-Finished action can disconnect the
link. The automaton waits in the Stopped or Starting states
for the next connection attempt.

Timeout (TO+,TO-)

This event indicates the expiration of the Restart timer. The
Restart timer is used to time responses to Configure-Request and
Terminate-Request packets.

The TO+ event indicates that the Restart counter continues to be
greater than zero, which triggers the corresponding Configure-
Request or Terminate-Request packet to be retransmitted.

The TO- event indicates that the Restart counter is not greater
than zero, and no more packets need to be retransmitted.

Receive-Configure-Request (RCR+,RCR-)

This event occurs when a Configure-Request packet is received from
the peer. The Configure-Request packet indicates the desire to
open a connection and may specify Configuration Options. The
Configure-Request packet is more fully described in a later
section.

The RCR+ event indicates that the Configure-Request was
acceptable, and triggers the transmission of a corresponding
Configure-Ack.

The RCR- event indicates that the Configure-Request was

unacceptable, and triggers the transmission of a corresponding
Configure-Nak or Configure-Reject.

Implementation Note:

These events may occur on a connection which is already in the
Opened state. The implementation MUST be prepared to
immediately renegotiate the Configuration Options.

Receive-Configure-Ack (RCA)

This event occurs when a valid Configure-Ack packet is received
from the peer. The Configure-Ack packet is a positive response to
a Configure-Request packet. An out of sequence or otherwise
invalid packet is silently discarded.

Implementation Note:

Since the correct packet has already been received before
reaching the Ack-Rcvd or Opened states, it is extremely
unlikely that another such packet will arrive. As specified,
all invalid Ack/Nak/Rej packets are silently discarded, and do
not affect the transitions of the automaton.

However, it is not impossible that a correctly formed packet
will arrive through a coincidentally-timed cross-connection.
It is more likely to be the result of an implementation error.
At the very least, this occurance SHOULD be logged.

Receive-Configure-Nak/Rej (RCN)

This event occurs when a valid Configure-Nak or Configure-Reject
packet is received from the peer. The Configure-Nak and
Configure-Reject packets are negative responses to a Configure-
Request packet. An out of sequence or otherwise invalid packet is
silently discarded.

Implementation Note:

Although the Configure-Nak and Configure-Reject cause the same
state transition in the automaton, these packets have
significantly different effects on the Configuration Options
sent in the resulting Configure-Request packet.

Receive-Terminate-Request (RTR)

This event occurs when a Terminate-Request packet is received.
The Terminate-Request packet indicates the desire of the peer to

close the connection.

Implementation Note:

This event is not identical to the Close event (see above), and
does not override the Open commands of the local network
administrator. The implementation MUST be prepared to receive
a new Configure-Request without network administrator
intervention.

Receive-Terminate-Ack (RTA)

This event occurs when a Terminate-Ack packet is received from the
peer. The Terminate-Ack packet is usually a response to a
Terminate-Request packet. The Terminate-Ack packet may also
indicate that the peer is in Closed or Stopped states, and serves
to re-synchronize the link configuration.

Receive-Unknown-Code (RUC)

This event occurs when an un-interpretable packet is received from
the peer. A Code-Reject packet is sent in response.

Receive-Code-Reject, Receive-Protocol-Reject (RXJ+,RXJ-)

This event occurs when a Code-Reject or a Protocol-Reject packet
is received from the peer.

The RXJ+ event arises when the rejected value is acceptable, such
as a Code-Reject of an extended code, or a Protocol-Reject of a
NCP. These are within the scope of normal operation. The
implementation MUST stop sending the offending packet type.

The RXJ- event arises when the rejected value is catastrophic,
such as a Code-Reject of Configure-Request, or a Protocol-Reject
of LCP! This event communicates an unrecoverable error that
terminates the connection.

Receive-Echo-Request, Receive-Echo-Reply, Receive-Discard-Request
(RXR)

This event occurs when an Echo-Request, Echo-Reply or Discard-
Request packet is received from the peer. The Echo-Reply packet
is a response to an Echo-Request packet. There is no reply to an
Echo-Reply or Discard-Request packet.

4.4. Actions

Actions in the automaton are caused by events and typically indicate
the transmission of packets and/or the starting or stopping of the
Restart timer.

Illegal-Event (-)

This indicates an event that cannot occur in a properly
implemented automaton. The implementation has an internal error,
which should be reported and logged. No transition is taken, and
the implementation SHOULD NOT reset or freeze.

This-Layer-Up (tlu)

This action indicates to the upper layers that the automaton is
entering the Opened state.

Typically, this action is used by the LCP to signal the Up event
to a NCP, Authentication Protocol, or Link Quality Protocol, or
MAY be used by a NCP to indicate that the link is available for
its network layer traffic.

This-Layer-Down (tld)

This action indicates to the upper layers that the automaton is
leaving the Opened state.

Typically, this action is used by the LCP to signal the Down event
to a NCP, Authentication Protocol, or Link Quality Protocol, or
MAY be used by a NCP to indicate that the link is no longer
available for its network layer traffic.

This-Layer-Started (tls)

This action indicates to the lower layers that the automaton is
entering the Starting state, and the lower layer is needed for the
link. The lower layer SHOULD respond with an Up event when the
lower layer is available.

This results of this action are highly implementation dependent.

This-Layer-Finished (tlf)

This action indicates to the lower layers that the automaton is
entering the Initial, Closed or Stopped states, and the lower
layer is no longer needed for the link. The lower layer SHOULD
respond with a Down event when the lower layer has terminated.

Typically, this action MAY be used by the LCP to advance to the
Link Dead phase, or MAY be used by a NCP to indicate to the LCP
that the link may terminate when there are no other NCPs open.

This results of this action are highly implementation dependent.

Initialize-Restart-Count (irc)

This action sets the Restart counter to the appropriate value
(Max-Terminate or Max-Configure). The counter is decremented for
each transmission, including the first.

Implementation Note:

In addition to setting the Restart counter, the implementation
MUST set the timeout period to the initial value when Restart
timer backoff is used.

Zero-Restart-Count (zrc)

This action sets the Restart counter to zero.

Implementation Note:

This action enables the FSA to pause before proceeding to the
desired final state, allowing traffic to be processed by the
peer. In addition to zeroing the Restart counter, the
implementation MUST set the timeout period to an appropriate
value.

Send-Configure-Request (scr)

A Configure-Request packet is transmitted. This indicates the
desire to open a connection with a specified set of Configuration
Options. The Restart timer is started when the Configure-Request
packet is transmitted, to guard against packet loss. The Restart
counter is decremented each time a Configure-Request is sent.

Send-Configure-Ack (sca)

A Configure-Ack packet is transmitted. This acknowledges the
reception of a Configure-Request packet with an acceptable set of
Configuration Options.

Send-Configure-Nak (scn)

A Configure-Nak or Configure-Reject packet is transmitted, as
appropriate. This negative response reports the reception of a

Configure-Request packet with an unacceptable set of Configuration
Options.

Configure-Nak packets are used to refuse a Configuration Option
value, and to suggest a new, acceptable value. Configure-Reject
packets are used to refuse all negotiation about a Configuration
Option, typically because it is not recognized or implemented.
The use of Configure-Nak versus Configure-Reject is more fully
described in the chapter on LCP Packet Formats.

Send-Terminate-Request (str)

A Terminate-Request packet is transmitted. This indicates the
desire to close a connection. The Restart timer is started when
the Terminate-Request packet is transmitted, to guard against
packet loss. The Restart counter is decremented each time a
Terminate-Request is sent.

Send-Terminate-Ack (sta)

A Terminate-Ack packet is transmitted. This acknowledges the
reception of a Terminate-Request packet or otherwise serves to
synchronize the automatons.

Send-Code-Reject (scj)

A Code-Reject packet is transmitted. This indicates the reception
of an unknown type of packet.

Send-Echo-Reply (ser)

An Echo-Reply packet is transmitted. This acknowledges the
reception of an Echo-Request packet.

4.5. Loop Avoidance

The protocol makes a reasonable attempt at avoiding Configuration
Option negotiation loops. However, the protocol does NOT guarantee
that loops will not happen. As with any negotiation, it is possible
to configure two PPP implementations with conflicting policies that
will never converge. It is also possible to configure policies which
do converge, but which take significant time to do so. Implementors
should keep this in mind and SHOULD implement loop detection
mechanisms or higher level timeouts.

4.6. Counters and Timers

Restart Timer

There is one special timer used by the automaton. The Restart
timer is used to time transmissions of Configure-Request and
Terminate-Request packets. Expiration of the Restart timer causes
a Timeout event, and retransmission of the corresponding
Configure-Request or Terminate-Request packet. The Restart timer
MUST be configurable, but SHOULD default to three (3) seconds.

Implementation Note:

The Restart timer SHOULD be based on the speed of the link.
The default value is designed for low speed (2,400 to 9,600
bps), high switching latency links (typical telephone lines).
Higher speed links, or links with low switching latency, SHOULD
have correspondingly faster retransmission times.

Instead of a constant value, the Restart timer MAY begin at an
initial small value and increase to the configured final value.
Each successive value less than the final value SHOULD be at
least twice the previous value. The initial value SHOULD be
large enough to account for the size of the packets, twice the
round trip time for transmission at the link speed, and at
least an additional 100 milliseconds to allow the peer to
process the packets before responding. Some circuits add
another 200 milliseconds of satellite delay. Round trip times
for modems operating at 14,400 bps have been measured in the
range of 160 to more than 600 milliseconds.

Max-Terminate

There is one required restart counter for Terminate-Requests.
Max-Terminate indicates the number of Terminate-Request packets
sent without receiving a Terminate-Ack before assuming that the
peer is unable to respond. Max-Terminate MUST be configurable,
but SHOULD default to two (2) transmissions.

Max-Configure

A similar counter is recommended for Configure-Requests. Max-
Configure indicates the number of Configure-Request packets sent
without receiving a valid Configure-Ack, Configure-Nak or
Configure-Reject before assuming that the peer is unable to
respond. Max-Configure MUST be configurable, but SHOULD default
to ten (10) transmissions.

Max-Failure

A related counter is recommended for Configure-Nak. Max-Failure
indicates the number of Configure-Nak packets sent without sending
a Configure-Ack before assuming that configuration is not
converging. Any further Configure-Nak packets for peer requested
options are converted to Configure-Reject packets, and locally
desired options are no longer appended. Max-Failure MUST be
configurable, but SHOULD default to five (5) transmissions.

5. LCP Packet Formats

There are three classes of LCP packets:

1. Link Configuration packets used to establish and configure a
link (Configure-Request, Configure-Ack, Configure-Nak and
Configure-Reject).

2. Link Termination packets used to terminate a link (Terminate-
Request and Terminate-Ack).

3. Link Maintenance packets used to manage and debug a link
(Code-Reject, Protocol-Reject, Echo-Request, Echo-Reply, and
Discard-Request).

In the interest of simplicity, there is no version field in the LCP
packet. A correctly functioning LCP implementation will always
respond to unknown Protocols and Codes with an easily recognizable
LCP packet, thus providing a deterministic fallback mechanism for
implementations of other versions.

Regardless of which Configuration Options are enabled, all LCP Link
Configuration, Link Termination, and Code-Reject packets (codes 1
through 7) are always sent as if no Configuration Options were
negotiated. In particular, each Configuration Option specifies a
default value. This ensures that such LCP packets are always
recognizable, even when one end of the link mistakenly believes the
link to be open.

Exactly one LCP packet is encapsulated in the PPP Information field,
where the PPP Protocol field indicates type hex c021 (Link Control
Protocol).

A summary of the Link Control Protocol packet format is shown below.
The fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Code | Identifier | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data ...
+-+-+-+-+

Code

The Code field is one octet, and identifies the kind of LCP

packet. When a packet is received with an unknown Code field, a
Code-Reject packet is transmitted.

Up-to-date values of the LCP Code field are specified in the most
recent "Assigned Numbers" RFC [2]. This document concerns the
following values:

1 Configure-Request
2 Configure-Ack
3 Configure-Nak
4 Configure-Reject
5 Terminate-Request
6 Terminate-Ack
7 Code-Reject
8 Protocol-Reject
9 Echo-Request
10 Echo-Reply
11 Discard-Request

Identifier

The Identifier field is one octet, and aids in matching requests
and replies. When a packet is received with an invalid Identifier
field, the packet is silently discarded without affecting the
automaton.

Length

The Length field is two octets, and indicates the length of the
LCP packet, including the Code, Identifier, Length and Data
fields. The Length MUST NOT exceed the MRU of the link.

Octets outside the range of the Length field are treated as
padding and are ignored on reception. When a packet is received
with an invalid Length field, the packet is silently discarded
without affecting the automaton.

Data

The Data field is zero or more octets, as indicated by the Length
field. The format of the Data field is determined by the Code
field.

5.1. Configure-Request

Description

An implementation wishing to open a connection MUST transmit a
Configure-Request. The Options field is filled with any desired
changes to the link defaults. Configuration Options SHOULD NOT be
included with default values.

Upon reception of a Configure-Request, an appropriate reply MUST
be transmitted.

A summary of the Configure-Request packet format is shown below. The
fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Code | Identifier | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options ...
+-+-+-+-+

Code

1 for Configure-Request.

Identifier

The Identifier field MUST be changed whenever the contents of the
Options field changes, and whenever a valid reply has been
received for a previous request. For retransmissions, the
Identifier MAY remain unchanged.

Options

The options field is variable in length, and contains the list of
zero or more Configuration Options that the sender desires to
negotiate. All Configuration Options are always negotiated
simultaneously. The format of Configuration Options is further
described in a later chapter.

5.2. Configure-Ack

Description

If every Configuration Option received in a Configure-Request is
recognizable and all values are acceptable, then the
implementation MUST transmit a Configure-Ack. The acknowledged
Configuration Options MUST NOT be reordered or modified in any
way.

On reception of a Configure-Ack, the Identifier field MUST match
that of the last transmitted Configure-Request. Additionally, the
Configuration Options in a Configure-Ack MUST exactly match those
of the last transmitted Configure-Request. Invalid packets are
silently discarded.

A summary of the Configure-Ack packet format is shown below. The
fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Code | Identifier | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options ...
+-+-+-+-+

Code

2 for Configure-Ack.

Identifier

The Identifier field is a copy of the Identifier field of the
Configure-Request which caused this Configure-Ack.

Options

The Options field is variable in length, and contains the list of
zero or more Configuration Options that the sender is
acknowledging. All Configuration Options are always acknowledged
simultaneously.

5.3. Configure-Nak

Description

If every instance of the received Configuration Options is
recognizable, but some values are not acceptable, then the
implementation MUST transmit a Configure-Nak. The Options field
is filled with only the unacceptable Configuration Options from
the Configure-Request. All acceptable Configuration Options are
filtered out of the Configure-Nak, but otherwise the Configuration
Options from the Configure-Request MUST NOT be reordered.

Options which have no value fields (boolean options) MUST use the
Configure-Reject reply instead.

Each Configuration Option which is allowed only a single instance
MUST be modified to a value acceptable to the Configure-Nak
sender. The default value MAY be used, when this differs from the
requested value.

When a particular type of Configuration Option can be listed more
than once with different values, the Configure-Nak MUST include a
list of all values for that option which are acceptable to the
Configure-Nak sender. This includes acceptable values that were
present in the Configure-Request.

Finally, an implementation may be configured to request the
negotiation of a specific Configuration Option. If that option is
not listed, then that option MAY be appended to the list of Nak'd
Configuration Options, in order to prompt the peer to include that
option in its next Configure-Request packet. Any value fields for
the option MUST indicate values acceptable to the Configure-Nak
sender.

On reception of a Configure-Nak, the Identifier field MUST match
that of the last transmitted Configure-Request. Invalid packets
are silently discarded.

Reception of a valid Configure-Nak indicates that when a new
Configure-Request is sent, the Configuration Options MAY be
modified as specified in the Configure-Nak. When multiple
instances of a Configuration Option are present, the peer SHOULD
select a single value to include in its next Configure-Request
packet.

Some Configuration Options have a variable length. Since the
Nak'd Option has been modified by the peer, the implementation
MUST be able to handle an Option length which is different from

the original Configure-Request.

A summary of the Configure-Nak packet format is shown below. The
fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Code | Identifier | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options ...
+-+-+-+-+

Code

3 for Configure-Nak.

Identifier

The Identifier field is a copy of the Identifier field of the
Configure-Request which caused this Configure-Nak.

Options

The Options field is variable in length, and contains the list of
zero or more Configuration Options that the sender is Nak'ing.
All Configuration Options are always Nak'd simultaneously.

5.4. Configure-Reject

Description

If some Configuration Options received in a Configure-Request are
not recognizable or are not acceptable for negotiation (as
configured by a network administrator), then the implementation
MUST transmit a Configure-Reject. The Options field is filled
with only the unacceptable Configuration Options from the
Configure-Request. All recognizable and negotiable Configuration
Options are filtered out of the Configure-Reject, but otherwise
the Configuration Options MUST NOT be reordered or modified in any
way.

On reception of a Configure-Reject, the Identifier field MUST
match that of the last transmitted Configure-Request.
Additionally, the Configuration Options in a Configure-Reject MUST

be a proper subset of those in the last transmitted Configure-
Request. Invalid packets are silently discarded.

Reception of a valid Configure-Reject indicates that when a new
Configure-Request is sent, it MUST NOT include any of the
Configuration Options listed in the Configure-Reject.

A summary of the Configure-Reject packet format is shown below. The
fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Code | Identifier | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options ...
+-+-+-+-+

Code

4 for Configure-Reject.

Identifier

The Identifier field is a copy of the Identifier field of the
Configure-Request which caused this Configure-Reject.

Options

The Options field is variable in length, and contains the list of
zero or more Configuration Options that the sender is rejecting.
All Configuration Options are always rejected simultaneously.

5.5. Terminate-Request and Terminate-Ack

Description

LCP includes Terminate-Request and Terminate-Ack Codes in order to
provide a mechanism for closing a connection.

An implementation wishing to close a connection SHOULD transmit a
Terminate-Request. Terminate-Request packets SHOULD continue to
be sent until Terminate-Ack is received, the lower layer indicates
that it has gone down, or a sufficiently large number have been
transmitted such that the peer is down with reasonable certainty.

Upon reception of a Terminate-Request, a Terminate-Ack MUST be
transmitted.

Reception of an unelicited Terminate-Ack indicates that the peer
is in the Closed or Stopped states, or is otherwise in need of
re-negotiation.

A summary of the Terminate-Request and Terminate-Ack packet formats
is shown below. The fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Code | Identifier | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data ...
+-+-+-+-+

Code

5 for Terminate-Request;

6 for Terminate-Ack.

Identifier

On transmission, the Identifier field MUST be changed whenever the
content of the Data field changes, and whenever a valid reply has
been received for a previous request. For retransmissions, the
Identifier MAY remain unchanged.

On reception, the Identifier field of the Terminate-Request is
copied into the Identifier field of the Terminate-Ack packet.

Data

The Data field is zero or more octets, and contains uninterpreted
data for use by the sender. The data may consist of any binary
value. The end of the field is indicated by the Length.

5.6. Code-Reject

Description

Reception of a LCP packet with an unknown Code indicates that the
peer is operating with a different version. This MUST be reported
back to the sender of the unknown Code by transmitting a Code-
Reject.

Upon reception of the Code-Reject of a code which is fundamental
to this version of the protocol, the implementation SHOULD report
the problem and drop the connection, since it is unlikely that the
situation can be rectified automatically.

A summary of the Code-Reject packet format is shown below. The
fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Code | Identifier | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Rejected-Packet ...
+-+-+-+-+-+-+-+-+

Code

7 for Code-Reject.

Identifier

The Identifier field MUST be changed for each Code-Reject sent.

Rejected-Packet

The Rejected-Packet field contains a copy of the LCP packet which
is being rejected. It begins with the Information field, and does
not include any Data Link Layer headers nor an FCS. The
Rejected-Packet MUST be truncated to comply with the peer's

established MRU.

5.7. Protocol-Reject

Description

Reception of a PPP packet with an unknown Protocol field indicates
that the peer is attempting to use a protocol which is
unsupported. This usually occurs when the peer attempts to
configure a new protocol. If the LCP automaton is in the Opened
state, then this MUST be reported back to the peer by transmitting
a Protocol-Reject.

Upon reception of a Protocol-Reject, the implementation MUST stop
sending packets of the indicated protocol at the earliest
opportunity.

Protocol-Reject packets can only be sent in the LCP Opened state.
Protocol-Reject packets received in any state other than the LCP
Opened state SHOULD be silently discarded.

A summary of the Protocol-Reject packet format is shown below. The
fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Code | Identifier | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Rejected-Protocol | Rejected-Information ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Code

8 for Protocol-Reject.

Identifier

The Identifier field MUST be changed for each Protocol-Reject
sent.

Rejected-Protocol

The Rejected-Protocol field is two octets, and contains the PPP
Protocol field of the packet which is being rejected.

Rejected-Information

The Rejected-Information field contains a copy of the packet which
is being rejected. It begins with the Information field, and does
not include any Data Link Layer headers nor an FCS. The
Rejected-Information MUST be truncated to comply with the peer's
established MRU.

5.8. Echo-Request and Echo-Reply

Description

LCP includes Echo-Request and Echo-Reply Codes in order to provide
a Data Link Layer loopback mechanism for use in exercising both
directions of the link. This is useful as an aid in debugging,
link quality determination, performance testing, and for numerous
other functions.

Upon reception of an Echo-Request in the LCP Opened state, an
Echo-Reply MUST be transmitted.

Echo-Request and Echo-Reply packets MUST only be sent in the LCP
Opened state. Echo-Request and Echo-Reply packets received in any
state other than the LCP Opened state SHOULD be silently
discarded.

A summary of the Echo-Request and Echo-Reply packet formats is shown
below. The fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Code | Identifier | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Magic-Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data ...
+-+-+-+-+

Code

9 for Echo-Request;

10 for Echo-Reply.

Identifier

On transmission, the Identifier field MUST be changed whenever the
content of the Data field changes, and whenever a valid reply has
been received for a previous request. For retransmissions, the
Identifier MAY remain unchanged.

On reception, the Identifier field of the Echo-Request is copied
into the Identifier field of the Echo-Reply packet.

Magic-Number

The Magic-Number field is four octets, and aids in detecting links
which are in the looped-back condition. Until the Magic-Number
Configuration Option has been successfully negotiated, the Magic-
Number MUST be transmitted as zero. See the Magic-Number
Configuration Option for further explanation.

Data

The Data field is zero or more octets, and contains uninterpreted
data for use by the sender. The data may consist of any binary
value. The end of the field is indicated by the Length.

5.9. Discard-Request

Description

LCP includes a Discard-Request Code in order to provide a Data
Link Layer sink mechanism for use in exercising the local to
remote direction of the link. This is useful as an aid in
debugging, performance testing, and for numerous other functions.

Discard-Request packets MUST only be sent in the LCP Opened state.
On reception, the receiver MUST silently discard any Discard-
Request that it receives.

A summary of the Discard-Request packet format is shown below. The
fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Code | Identifier | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Magic-Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data ...
+-+-+-+-+

Code

11 for Discard-Request.

Identifier

The Identifier field MUST be changed for each Discard-Request
sent.

Magic-Number

The Magic-Number field is four octets, and aids in detecting links
which are in the looped-back condition. Until the Magic-Number
Configuration Option has been successfully negotiated, the Magic-
Number MUST be transmitted as zero. See the Magic-Number
Configuration Option for further explanation.

Data

The Data field is zero or more octets, and contains uninterpreted
data for use by the sender. The data may consist of any binary
value. The end of the field is indicated by the Length.

6. LCP Configuration Options

LCP Configuration Options allow negotiation of modifications to the
default characteristics of a point-to-point link. If a Configuration
Option is not included in a Configure-Request packet, the default
value for that Configuration Option is assumed.

Some Configuration Options MAY be listed more than once. The effect
of this is Configuration Option specific, and is specified by each
such Configuration Option description. (None of the Configuration
Options in this specification can be listed more than once.)

The end of the list of Configuration Options is indicated by the
Length field of the LCP packet.

Unless otherwise specified, all Configuration Options apply in a
half-duplex fashion; typically, in the receive direction of the link
from the point of view of the Configure-Request sender.

Design Philosophy

The options indicate additional capabilities or requirements of
the implementation that is requesting the option. An
implementation which does not understand any option SHOULD
interoperate with one which implements every option.

A default is specified for each option which allows the link to
correctly function without negotiation of the option, although
perhaps with less than optimal performance.

Except where explicitly specified, acknowledgement of an option
does not require the peer to take any additional action other than
the default.

It is not necessary to send the default values for the options in
a Configure-Request.

A summary of the Configuration Option format is shown below. The
fields are transmitted from left to right.

0 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length | Data ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type

The Type field is one octet, and indicates the type of
Configuration Option. Up-to-date values of the LCP Option Type
field are specified in the most recent "Assigned Numbers" RFC [2].
This document concerns the following values:

0 RESERVED
1 Maximum-Receive-Unit
3 Authentication-Protocol
4 Quality-Protocol
5 Magic-Number
7 Protocol-Field-Compression
8 Address-and-Control-Field-Compression

Length

The Length field is one octet, and indicates the length of this
Configuration Option including the Type, Length and Data fields.

If a negotiable Configuration Option is received in a Configure-
Request, but with an invalid or unrecognized Length, a Configure-
Nak SHOULD be transmitted which includes the desired Configuration
Option with an appropriate Length and Data.

Data

The Data field is zero or more octets, and contains information
specific to the Configuration Option. The format and length of
the Data field is determined by the Type and Length fields.

When the Data field is indicated by the Length to extend beyond
the end of the Information field, the entire packet is silently
discarded without affecting the automaton.

6.1. Maximum-Receive-Unit (MRU)

Description

This Configuration Option may be sent to inform the peer that the
implementation can receive larger packets, or to request that the
peer send smaller packets.

The default value is 1500 octets. If smaller packets are
requested, an implementation MUST still be able to receive the
full 1500 octet information field in case link synchronization is
lost.

Implementation Note:

This option is used to indicate an implementation capability.
The peer is not required to maximize the use of the capacity.
For example, when a MRU is indicated which is 2048 octets, the
peer is not required to send any packet with 2048 octets. The
peer need not Configure-Nak to indicate that it will only send
smaller packets, since the implementation will always require
support for at least 1500 octets.

A summary of the Maximum-Receive-Unit Configuration Option format is
shown below. The fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length | Maximum-Receive-Unit |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type

1

Length

4

Maximum-Receive-Unit

The Maximum-Receive-Unit field is two octets, and specifies the
maximum number of octets in the Information and Padding fields.
It does not include the framing, Protocol field, FCS, nor any
transparency bits or bytes.

6.2. Authentication-Protocol

Description

On some links it may be desirable to require a peer to
authenticate itself before allowing network-layer protocol packets
to be exchanged.

This Configuration Option provides a method to negotiate the use
of a specific protocol for authentication. By default,
authentication is not required.

An implementation MUST NOT include multiple Authentication-
Protocol Configuration Options in its Configure-Request packets.
Instead, it SHOULD attempt to configure the most desirable
protocol first. If that protocol is Configure-Nak'd, then the
implementation SHOULD attempt the next most desirable protocol in
the next Configure-Request.

The implementation sending the Configure-Request is indicating
that it expects authentication from its peer. If an
implementation sends a Configure-Ack, then it is agreeing to
authenticate with the specified protocol. An implementation
receiving a Configure-Ack SHOULD expect the peer to authenticate
with the acknowledged protocol.

There is no requirement that authentication be full-duplex or that
the same protocol be used in both directions. It is perfectly
acceptable for different protocols to be used in each direction.
This will, of course, depend on the specific protocols negotiated.

A summary of the Authentication-Protocol Configuration Option format
is shown below. The fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length | Authentication-Protocol |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data ...
+-+-+-+-+

Type

3

Length

>= 4

Authentication-Protocol

The Authentication-Protocol field is two octets, and indicates the
authentication protocol desired. Values for this field are always
the same as the PPP Protocol field values for that same
authentication protocol.

Up-to-date values of the Authentication-Protocol field are
specified in the most recent "Assigned Numbers" RFC [2]. Current
values are assigned as follows:

Value (in hex) Protocol

c023 Password Authentication Protocol
c223 Challenge Handshake Authentication Protocol

Data

The Data field is zero or more octets, and contains additional
data as determined by the particular protocol.

6.3. Quality-Protocol

Description

On some links it may be desirable to determine when, and how
often, the link is dropping data. This process is called link
quality monitoring.

This Configuration Option provides a method to negotiate the use
of a specific protocol for link quality monitoring. By default,
link quality monitoring is disabled.

The implementation sending the Configure-Request is indicating
that it expects to receive monitoring information from its peer.
If an implementation sends a Configure-Ack, then it is agreeing to
send the specified protocol. An implementation receiving a
Configure-Ack SHOULD expect the peer to send the acknowledged
protocol.

There is no requirement that quality monitoring be full-duplex or

that the same protocol be used in both directions. It is
perfectly acceptable for different protocols to be used in each
direction. This will, of course, depend on the specific protocols
negotiated.

A summary of the Quality-Protocol Configuration Option format is
shown below. The fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length | Quality-Protocol |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data ...
+-+-+-+-+

Type

4

Length

>= 4

Quality-Protocol

The Quality-Protocol field is two octets, and indicates the link
quality monitoring protocol desired. Values for this field are
always the same as the PPP Protocol field values for that same
monitoring protocol.

Up-to-date values of the Quality-Protocol field are specified in
the most recent "Assigned Numbers" RFC [2]. Current values are
assigned as follows:

Value (in hex) Protocol

c025 Link Quality Report

Data

The Data field is zero or more octets, and contains additional
data as determined by the particular protocol.

6.4. Magic-Number

Description

This Configuration Option provides a method to detect looped-back
links and other Data Link Layer anomalies. This Configuration
Option MAY be required by some other Configuration Options such as
the Quality-Protocol Configuration Option. By default, the
Magic-Number is not negotiated, and zero is inserted where a
Magic-Number might otherwise be used.

Before this Configuration Option is requested, an implementation
MUST choose its Magic-Number. It is recommended that the Magic-
Number be chosen in the most random manner possible in order to
guarantee with very high probability that an implementation will
arrive at a unique number. A good way to choose a unique random
number is to start with a unique seed. Suggested sources of
uniqueness include machine serial numbers, other network hardware
addresses, time-of-day clocks, etc. Particularly good random
number seeds are precise measurements of the inter-arrival time of
physical events such as packet reception on other connected
networks, server response time, or the typing rate of a human
user. It is also suggested that as many sources as possible be
used simultaneously.

When a Configure-Request is received with a Magic-Number
Configuration Option, the received Magic-Number is compared with
the Magic-Number of the last Configure-Request sent to the peer.
If the two Magic-Numbers are different, then the link is not
looped-back, and the Magic-Number SHOULD be acknowledged. If the
two Magic-Numbers are equal, then it is possible, but not certain,
that the link is looped-back and that this Configure-Request is
actually the one last sent. To determine this, a Configure-Nak
MUST be sent specifying a different Magic-Number value. A new
Configure-Request SHOULD NOT be sent to the peer until normal
processing would cause it to be sent (that is, until a Configure-
Nak is received or the Restart timer runs out).

Reception of a Configure-Nak with a Magic-Number different from
that of the last Configure-Nak sent to the peer proves that a link
is not looped-back, and indicates a unique Magic-Number. If the
Magic-Number is equal to the one sent in the last Configure-Nak,
the possibility of a looped-back link is increased, and a new
Magic-Number MUST be chosen. In either case, a new Configure-
Request SHOULD be sent with the new Magic-Number.

If the link is indeed looped-back, this sequence (transmit
Configure-Request, receive Configure-Request, transmit Configure-

Nak, receive Configure-Nak) will repeat over and over again. If
the link is not looped-back, this sequence might occur a few
times, but it is extremely unlikely to occur repeatedly. More
likely, the Magic-Numbers chosen at either end will quickly
diverge, terminating the sequence. The following table shows the
probability of collisions assuming that both ends of the link
select Magic-Numbers with a perfectly uniform distribution:

Number of Collisions Probability
-------------------- ---------------------
1 1/2**32 = 2.3 E-10
2 1/2**32**2 = 5.4 E-20
3 1/2**32**3 = 1.3 E-29

Good sources of uniqueness or randomness are required for this
divergence to occur. If a good source of uniqueness cannot be
found, it is recommended that this Configuration Option not be
enabled; Configure-Requests with the option SHOULD NOT be
transmitted and any Magic-Number Configuration Options which the
peer sends SHOULD be either acknowledged or rejected. In this
case, looped-back links cannot be reliably detected by the
implementation, although they may still be detectable by the peer.

If an implementation does transmit a Configure-Request with a
Magic-Number Configuration Option, then it MUST NOT respond with a
Configure-Reject when it receives a Configure-Request with a
Magic-Number Configuration Option. That is, if an implementation
desires to use Magic Numbers, then it MUST also allow its peer to
do so. If an implementation does receive a Configure-Reject in
response to a Configure-Request, it can only mean that the link is
not looped-back, and that its peer will not be using Magic-
Numbers. In this case, an implementation SHOULD act as if the
negotiation had been successful (as if it had instead received a
Configure-Ack).

The Magic-Number also may be used to detect looped-back links
during normal operation, as well as during Configuration Option
negotiation. All LCP Echo-Request, Echo-Reply, and Discard-
Request packets have a Magic-Number field. If Magic-Number has
been successfully negotiated, an implementation MUST transmit
these packets with the Magic-Number field set to its negotiated
Magic-Number.

The Magic-Number field of these packets SHOULD be inspected on
reception. All received Magic-Number fields MUST be equal to
either zero or the peer's unique Magic-Number, depending on
whether or not the peer negotiated a Magic-Number.

Reception of a Magic-Number field equal to the negotiated local
Magic-Number indicates a looped-back link. Reception of a Magic-
Number other than the negotiated local Magic-Number, the peer's
negotiated Magic-Number, or zero if the peer didn't negotiate one,
indicates a link which has been (mis)configured for communications
with a different peer.

Procedures for recovery from either case are unspecified, and may
vary from implementation to implementation. A somewhat
pessimistic procedure is to assume a LCP Down event. A further
Open event will begin the process of re-establishing the link,
which can't complete until the looped-back condition is
terminated, and Magic-Numbers are successfully negotiated. A more
optimistic procedure (in the case of a looped-back link) is to
begin transmitting LCP Echo-Request packets until an appropriate
Echo-Reply is received, indicating a termination of the looped-
back condition.

A summary of the Magic-Number Configuration Option format is shown
below. The fields are transmitted from left to right.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length | Magic-Number
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Magic-Number (cont) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type

5

Length

6

Magic-Number

The Magic-Number field is four octets, and indicates a number
which is very likely to be unique to one end of the link. A
Magic-Number of zero is illegal and MUST always be Nak'd, if it is
not Rejected outright.

6.5. Protocol-Field-Compression (PFC)

Description

This Configuration Option provides a method to negotiate the
compression of the PPP Protocol field. By default, all
implementations MUST transmit packets with two octet PPP Protocol
fields.

PPP Protocol field numbers are chosen such that some values may be
compressed into a single octet form which is clearly
distinguishable from the two octet form. This Configuration
Option is sent to inform the peer that the implementation can
receive such single octet Protocol fields.

As previously mentioned, the Protocol field uses an extension
mechanism consistent with the ISO 3309 extension mechanism for the
Address field; the Least Significant Bit (LSB) of each octet is
used to indicate extension of the Protocol field. A binary "0" as
the LSB indicates that the Protocol field continues with the
following octet. The presence of a binary "1" as the LSB marks
the last octet of the Protocol field. Notice that any number of
"0" octets may be prepended to the field, and will still indicate
the same value (consider the two binary representations for 3,
00000011 and 00000000 00000011).

When using low speed links, it is desirable to conserve bandwidth
by sending as little redundant data as possible. The Protocol-
Field-Compression Configuration Option allows a trade-off between
implementation simplicity and bandwidth efficiency. If
successfully negotiated, the ISO 3309 extension mechanism may be
used to compress the Protocol field to one octet instead of two.
The large majority of packets are compressible since data
protocols are typically assigned with Protocol field values less
than 256.

Compressed Protocol fields MUST NOT be transmitted unless this
Configuration Option has been negotiated. When negotiated, PPP
implementations MUST accept PPP packets with either double-octet
or single-octet Protocol fields, and MUST NOT distinguish between
them.

The Protocol field is never compressed when sending any LCP
packet. This rule guarantees unambiguous recognition of LCP
packets.

When a Protocol field is compressed, the Data Link Layer FCS field
is calculated on the compressed frame, not the original

uncompressed frame.

A summary of the Protocol-Field-Compression Configuration Option
format is shown below. The fields are transmitted from left to
right.

0 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type

7

Length

2

6.6. Address-and-Control-Field-Compression (ACFC)

Description

This Configuration Option provides a method to negotiate the
compression of the Data Link Layer Address and Control fields. By
default, all implementations MUST transmit frames with Address and
Control fields appropriate to the link framing.

Since these fields usually have constant values for point-to-point
links, they are easily compressed. This Configuration Option is
sent to inform the peer that the implementation can receive
compressed Address and Control fields.

If a compressed frame is received when Address-and-Control-Field-
Compression has not been negotiated, the implementation MAY
silently discard the frame.

The Address and Control fields MUST NOT be compressed when sending
any LCP packet. This rule guarantees unambiguous recognition of
LCP packets.

When the Address and Control fields are compressed, the Data Link
Layer FCS field is calculated on the compressed frame, not the
original uncompressed frame.

A summary of the Address-and-Control-Field-Compression configuration
option format is shown below. The fields are transmitted from left
to right.

0 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type

8

Length

2

Security Considerations

Security issues are briefly discussed in sections concerning the
Authentication Phase, the Close event, and the Authentication-
Protocol Configuration Option.

References

[1] Perkins, D., "Requirements for an Internet Standard Point-to-
Point Protocol", RFC 1547, Carnegie Mellon University,
December 1993.

[2] Reynolds, J., and Postel, J., "Assigned Numbers", STD 2, RFC
1340, USC/Information Sciences Institute, July 1992.

Acknowledgements

This document is the product of the Point-to-Point Protocol Working
Group of the Internet Engineering Task Force (IETF). Comments should
be submitted to the ietf-ppp@merit.edu mailing list.

Much of the text in this document is taken from the working group
requirements [1]; and RFCs 1171 & 1172, by Drew Perkins while at
Carnegie Mellon University, and by Russ Hobby of the University of
California at Davis.

William Simpson was principally responsible for introducing
consistent terminology and philosophy, and the re-design of the phase
and negotiation state machines.

Many people spent significant time helping to develop the Point-to-
Point Protocol. The complete list of people is too numerous to list,
but the following people deserve special thanks: Rick Adams, Ken
Adelman, Fred Baker, Mike Ballard, Craig Fox, Karl Fox, Phill Gross,
Kory Hamzeh, former WG chair Russ Hobby, David Kaufman, former WG
chair Steve Knowles, Mark Lewis, former WG chair Brian Lloyd, John
LoVerso, Bill Melohn, Mike Patton, former WG chair Drew Perkins, Greg
Satz, John Shriver, Vernon Schryver, and Asher Waldfogel.

Special thanks to Morning Star Technologies for providing computing
resources and network access support for writing this specification.

Chair's Address

The working group can be contacted via the current chair:

Fred Baker
Advanced Computer Communications
315 Bollay Drive
Santa Barbara, California 93117

fbaker@acc.com

Editor's Address

Questions about this memo can also be directed to:

William Allen Simpson
Daydreamer
Computer Systems Consulting Services
1384 Fontaine
Madison Heights, Michigan 48071

Bill.Simpson@um.cc.umich.edu
bsimpson@MorningStar.com

Simpson [Page 52]

RFC 791 – Internet Protocol Version 4 Specification


INTERNET PROTOCOL

DARPA INTERNET PROGRAM

PROTOCOL SPECIFICATION

September 1981

prepared for

Defense Advanced Research Projects Agency
Information Processing Techniques Office
1400 Wilson Boulevard
Arlington, Virginia 22209

by

Information Sciences Institute
University of Southern California
4676 Admiralty Way
Marina del Rey, California 90291

September 1981
Internet Protocol

TABLE OF CONTENTS

PREFACE ........................................................ iii

1. INTRODUCTION ..................................................... 1

1.1 Motivation .................................................... 1
1.2 Scope ......................................................... 1
1.3 Interfaces .................................................... 1
1.4 Operation ..................................................... 2

2. OVERVIEW ......................................................... 5

2.1 Relation to Other Protocols ................................... 9
2.2 Model of Operation ............................................ 5
2.3 Function Description .......................................... 7
2.4 Gateways ...................................................... 9

3. SPECIFICATION ................................................... 11

3.1 Internet Header Format ....................................... 11
3.2 Discussion ................................................... 23
3.3 Interfaces ................................................... 31

APPENDIX A: Examples & Scenarios ................................... 34
APPENDIX B: Data Transmission Order ................................ 39

GLOSSARY ............................................................ 41

REFERENCES .......................................................... 45

[Page i]

September 1981
Internet Protocol

[Page ii]

September 1981
Internet Protocol

PREFACE

This document specifies the DoD Standard Internet Protocol. This
document is based on six earlier editions of the ARPA Internet Protocol
Specification, and the present text draws heavily from them. There have
been many contributors to this work both in terms of concepts and in
terms of text. This edition revises aspects of addressing, error
handling, option codes, and the security, precedence, compartments, and
handling restriction features of the internet protocol.

Jon Postel

Editor

September 1981

RFC: 791
Replaces: RFC 760
IENs 128, 123, 111,
80, 54, 44, 41, 28, 26

INTERNET PROTOCOL

DARPA INTERNET PROGRAM
PROTOCOL SPECIFICATION

1. INTRODUCTION

1.1. Motivation

The Internet Protocol is designed for use in interconnected systems of
packet-switched computer communication networks. Such a system has
been called a "catenet" [1]. The internet protocol provides for
transmitting blocks of data called datagrams from sources to
destinations, where sources and destinations are hosts identified by
fixed length addresses. The internet protocol also provides for
fragmentation and reassembly of long datagrams, if necessary, for
transmission through "small packet" networks.

1.2. Scope

The internet protocol is specifically limited in scope to provide the
functions necessary to deliver a package of bits (an internet
datagram) from a source to a destination over an interconnected system
of networks. There are no mechanisms to augment end-to-end data
reliability, flow control, sequencing, or other services commonly
found in host-to-host protocols. The internet protocol can capitalize
on the services of its supporting networks to provide various types
and qualities of service.

1.3. Interfaces

This protocol is called on by host-to-host protocols in an internet
environment. This protocol calls on local network protocols to carry
the internet datagram to the next gateway or destination host.

For example, a TCP module would call on the internet module to take a
TCP segment (including the TCP header and user data) as the data
portion of an internet datagram. The TCP module would provide the
addresses and other parameters in the internet header to the internet
module as arguments of the call. The internet module would then
create an internet datagram and call on the local network interface to
transmit the internet datagram.

In the ARPANET case, for example, the internet module would call on a

[Page 1]

September 1981
Internet Protocol
Introduction

local net module which would add the 1822 leader [2] to the internet
datagram creating an ARPANET message to transmit to the IMP. The
ARPANET address would be derived from the internet address by the
local network interface and would be the address of some host in the
ARPANET, that host might be a gateway to other networks.

1.4. Operation

The internet protocol implements two basic functions: addressing and
fragmentation.

The internet modules use the addresses carried in the internet header
to transmit internet datagrams toward their destinations. The
selection of a path for transmission is called routing.

The internet modules use fields in the internet header to fragment and
reassemble internet datagrams when necessary for transmission through
"small packet" networks.

The model of operation is that an internet module resides in each host
engaged in internet communication and in each gateway that
interconnects networks. These modules share common rules for
interpreting address fields and for fragmenting and assembling
internet datagrams. In addition, these modules (especially in
gateways) have procedures for making routing decisions and other
functions.

The internet protocol treats each internet datagram as an independent
entity unrelated to any other internet datagram. There are no
connections or logical circuits (virtual or otherwise).

The internet protocol uses four key mechanisms in providing its
service: Type of Service, Time to Live, Options, and Header Checksum.

The Type of Service is used to indicate the quality of the service
desired. The type of service is an abstract or generalized set of
parameters which characterize the service choices provided in the
networks that make up the internet. This type of service indication
is to be used by gateways to select the actual transmission parameters
for a particular network, the network to be used for the next hop, or
the next gateway when routing an internet datagram.

The Time to Live is an indication of an upper bound on the lifetime of
an internet datagram. It is set by the sender of the datagram and
reduced at the points along the route where it is processed. If the
time to live reaches zero before the internet datagram reaches its
destination, the internet datagram is destroyed. The time to live can
be thought of as a self destruct time limit.

[Page 2]

September 1981
Internet Protocol
Introduction

The Options provide for control functions needed or useful in some
situations but unnecessary for the most common communications. The
options include provisions for timestamps, security, and special
routing.

The Header Checksum provides a verification that the information used
in processing internet datagram has been transmitted correctly. The
data may contain errors. If the header checksum fails, the internet
datagram is discarded at once by the entity which detects the error.

The internet protocol does not provide a reliable communication
facility. There are no acknowledgments either end-to-end or
hop-by-hop. There is no error control for data, only a header
checksum. There are no retransmissions. There is no flow control.

Errors detected may be reported via the Internet Control Message
Protocol (ICMP) [3] which is implemented in the internet protocol
module.

[Page 3]

September 1981
Internet Protocol

[Page 4]

September 1981
Internet Protocol

2. OVERVIEW

2.1. Relation to Other Protocols

The following diagram illustrates the place of the internet protocol
in the protocol hierarchy:

+------+ +-----+ +-----+ +-----+
|Telnet| | FTP | | TFTP| ... | ... |
+------+ +-----+ +-----+ +-----+
| | | |
+-----+ +-----+ +-----+
| TCP | | UDP | ... | ... |
+-----+ +-----+ +-----+
| | |
+--------------------------+----+
| Internet Protocol & ICMP |
+--------------------------+----+
|
+---------------------------+
| Local Network Protocol |
+---------------------------+

Protocol Relationships

Figure 1.

Internet protocol interfaces on one side to the higher level
host-to-host protocols and on the other side to the local network
protocol. In this context a "local network" may be a small network in
a building or a large network such as the ARPANET.

2.2. Model of Operation

The model of operation for transmitting a datagram from one
application program to another is illustrated by the following
scenario:

We suppose that this transmission will involve one intermediate
gateway.

The sending application program prepares its data and calls on its
local internet module to send that data as a datagram and passes the
destination address and other parameters as arguments of the call.

The internet module prepares a datagram header and attaches the data
to it. The internet module determines a local network address for
this internet address, in this case it is the address of a gateway.

[Page 5]

September 1981
Internet Protocol
Overview

It sends this datagram and the local network address to the local
network interface.

The local network interface creates a local network header, and
attaches the datagram to it, then sends the result via the local
network.

The datagram arrives at a gateway host wrapped in the local network
header, the local network interface strips off this header, and
turns the datagram over to the internet module. The internet module
determines from the internet address that the datagram is to be
forwarded to another host in a second network. The internet module
determines a local net address for the destination host. It calls
on the local network interface for that network to send the
datagram.

This local network interface creates a local network header and
attaches the datagram sending the result to the destination host.

At this destination host the datagram is stripped of the local net
header by the local network interface and handed to the internet
module.

The internet module determines that the datagram is for an
application program in this host. It passes the data to the
application program in response to a system call, passing the source
address and other parameters as results of the call.

Application Application
Program Program
\ /
Internet Module Internet Module Internet Module
\ / \ /
LNI-1 LNI-1 LNI-2 LNI-2
\ / \ /
Local Network 1 Local Network 2

Transmission Path

Figure 2

[Page 6]

September 1981
Internet Protocol
Overview

2.3. Function Description

The function or purpose of Internet Protocol is to move datagrams
through an interconnected set of networks. This is done by passing
the datagrams from one internet module to another until the
destination is reached. The internet modules reside in hosts and
gateways in the internet system. The datagrams are routed from one
internet module to another through individual networks based on the
interpretation of an internet address. Thus, one important mechanism
of the internet protocol is the internet address.

In the routing of messages from one internet module to another,
datagrams may need to traverse a network whose maximum packet size is
smaller than the size of the datagram. To overcome this difficulty, a
fragmentation mechanism is provided in the internet protocol.

Addressing

A distinction is made between names, addresses, and routes [4]. A
name indicates what we seek. An address indicates where it is. A
route indicates how to get there. The internet protocol deals
primarily with addresses. It is the task of higher level (i.e.,
host-to-host or application) protocols to make the mapping from
names to addresses. The internet module maps internet addresses to
local net addresses. It is the task of lower level (i.e., local net
or gateways) procedures to make the mapping from local net addresses
to routes.

Addresses are fixed length of four octets (32 bits). An address
begins with a network number, followed by local address (called the
"rest" field). There are three formats or classes of internet
addresses: in class a, the high order bit is zero, the next 7 bits
are the network, and the last 24 bits are the local address; in
class b, the high order two bits are one-zero, the next 14 bits are
the network and the last 16 bits are the local address; in class c,
the high order three bits are one-one-zero, the next 21 bits are the
network and the last 8 bits are the local address.

Care must be taken in mapping internet addresses to local net
addresses; a single physical host must be able to act as if it were
several distinct hosts to the extent of using several distinct
internet addresses. Some hosts will also have several physical
interfaces (multi-homing).

That is, provision must be made for a host to have several physical
interfaces to the network with each having several logical internet
addresses.

[Page 7]

September 1981
Internet Protocol
Overview

Examples of address mappings may be found in "Address Mappings" [5].

Fragmentation

Fragmentation of an internet datagram is necessary when it
originates in a local net that allows a large packet size and must
traverse a local net that limits packets to a smaller size to reach
its destination.

An internet datagram can be marked "don't fragment." Any internet
datagram so marked is not to be internet fragmented under any
circumstances. If internet datagram marked don't fragment cannot be
delivered to its destination without fragmenting it, it is to be
discarded instead.

Fragmentation, transmission and reassembly across a local network
which is invisible to the internet protocol module is called
intranet fragmentation and may be used [6].

The internet fragmentation and reassembly procedure needs to be able
to break a datagram into an almost arbitrary number of pieces that
can be later reassembled. The receiver of the fragments uses the
identification field to ensure that fragments of different datagrams
are not mixed. The fragment offset field tells the receiver the
position of a fragment in the original datagram. The fragment
offset and length determine the portion of the original datagram
covered by this fragment. The more-fragments flag indicates (by
being reset) the last fragment. These fields provide sufficient
information to reassemble datagrams.

The identification field is used to distinguish the fragments of one
datagram from those of another. The originating protocol module of
an internet datagram sets the identification field to a value that
must be unique for that source-destination pair and protocol for the
time the datagram will be active in the internet system. The
originating protocol module of a complete datagram sets the
more-fragments flag to zero and the fragment offset to zero.

To fragment a long internet datagram, an internet protocol module
(for example, in a gateway), creates two new internet datagrams and
copies the contents of the internet header fields from the long
datagram into both new internet headers. The data of the long
datagram is divided into two portions on a 8 octet (64 bit) boundary
(the second portion might not be an integral multiple of 8 octets,
but the first must be). Call the number of 8 octet blocks in the
first portion NFB (for Number of Fragment Blocks). The first
portion of the data is placed in the first new internet datagram,
and the total length field is set to the length of the first

[Page 8]

September 1981
Internet Protocol
Overview

datagram. The more-fragments flag is set to one. The second
portion of the data is placed in the second new internet datagram,
and the total length field is set to the length of the second
datagram. The more-fragments flag carries the same value as the
long datagram. The fragment offset field of the second new internet
datagram is set to the value of that field in the long datagram plus
NFB.

This procedure can be generalized for an n-way split, rather than
the two-way split described.

To assemble the fragments of an internet datagram, an internet
protocol module (for example at a destination host) combines
internet datagrams that all have the same value for the four fields:
identification, source, destination, and protocol. The combination
is done by placing the data portion of each fragment in the relative
position indicated by the fragment offset in that fragment's
internet header. The first fragment will have the fragment offset
zero, and the last fragment will have the more-fragments flag reset
to zero.

2.4. Gateways

Gateways implement internet protocol to forward datagrams between
networks. Gateways also implement the Gateway to Gateway Protocol
(GGP) [7] to coordinate routing and other internet control
information.

In a gateway the higher level protocols need not be implemented and
the GGP functions are added to the IP module.

+-------------------------------+
| Internet Protocol & ICMP & GGP|
+-------------------------------+
| |
+---------------+ +---------------+
| Local Net | | Local Net |
+---------------+ +---------------+

Gateway Protocols

Figure 3.

[Page 9]

September 1981
Internet Protocol

[Page 10]

September 1981
Internet Protocol

3. SPECIFICATION

3.1. Internet Header Format

A summary of the contents of the internet header follows:

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| IHL |Type of Service| Total Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |Flags| Fragment Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time to Live | Protocol | Header Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Destination Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Example Internet Datagram Header

Figure 4.

Note that each tick mark represents one bit position.

Version: 4 bits

The Version field indicates the format of the internet header. This
document describes version 4.

IHL: 4 bits

Internet Header Length is the length of the internet header in 32
bit words, and thus points to the beginning of the data. Note that
the minimum value for a correct header is 5.

[Page 11]

September 1981
Internet Protocol
Specification

Type of Service: 8 bits

The Type of Service provides an indication of the abstract
parameters of the quality of service desired. These parameters are
to be used to guide the selection of the actual service parameters
when transmitting a datagram through a particular network. Several
networks offer service precedence, which somehow treats high
precedence traffic as more important than other traffic (generally
by accepting only traffic above a certain precedence at time of high
load). The major choice is a three way tradeoff between low-delay,
high-reliability, and high-throughput.

Bits 0-2: Precedence.
Bit 3: 0 = Normal Delay, 1 = Low Delay.
Bits 4: 0 = Normal Throughput, 1 = High Throughput.
Bits 5: 0 = Normal Relibility, 1 = High Relibility.
Bit 6-7: Reserved for Future Use.

0 1 2 3 4 5 6 7
+-----+-----+-----+-----+-----+-----+-----+-----+
| | | | | | |
| PRECEDENCE | D | T | R | 0 | 0 |
| | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+

Precedence

111 - Network Control
110 - Internetwork Control
101 - CRITIC/ECP
100 - Flash Override
011 - Flash
010 - Immediate
001 - Priority
000 - Routine

The use of the Delay, Throughput, and Reliability indications may
increase the cost (in some sense) of the service. In many networks
better performance for one of these parameters is coupled with worse
performance on another. Except for very unusual cases at most two
of these three indications should be set.

The type of service is used to specify the treatment of the datagram
during its transmission through the internet system. Example
mappings of the internet type of service to the actual service
provided on networks such as AUTODIN II, ARPANET, SATNET, and PRNET
is given in "Service Mappings" [8].

[Page 12]

September 1981
Internet Protocol
Specification

The Network Control precedence designation is intended to be used
within a network only. The actual use and control of that
designation is up to each network. The Internetwork Control
designation is intended for use by gateway control originators only.
If the actual use of these precedence designations is of concern to
a particular network, it is the responsibility of that network to
control the access to, and use of, those precedence designations.

Total Length: 16 bits

Total Length is the length of the datagram, measured in octets,
including internet header and data. This field allows the length of
a datagram to be up to 65,535 octets. Such long datagrams are
impractical for most hosts and networks. All hosts must be prepared
to accept datagrams of up to 576 octets (whether they arrive whole
or in fragments). It is recommended that hosts only send datagrams
larger than 576 octets if they have assurance that the destination
is prepared to accept the larger datagrams.

The number 576 is selected to allow a reasonable sized data block to
be transmitted in addition to the required header information. For
example, this size allows a data block of 512 octets plus 64 header
octets to fit in a datagram. The maximal internet header is 60
octets, and a typical internet header is 20 octets, allowing a
margin for headers of higher level protocols.

Identification: 16 bits

An identifying value assigned by the sender to aid in assembling the
fragments of a datagram.

Flags: 3 bits

Various Control Flags.

Bit 0: reserved, must be zero
Bit 1: (DF) 0 = May Fragment, 1 = Don't Fragment.
Bit 2: (MF) 0 = Last Fragment, 1 = More Fragments.

0 1 2
+---+---+---+
| | D | M |
| 0 | F | F |
+---+---+---+

Fragment Offset: 13 bits

This field indicates where in the datagram this fragment belongs.

[Page 13]

September 1981
Internet Protocol
Specification

The fragment offset is measured in units of 8 octets (64 bits). The
first fragment has offset zero.

Time to Live: 8 bits

This field indicates the maximum time the datagram is allowed to
remain in the internet system. If this field contains the value
zero, then the datagram must be destroyed. This field is modified
in internet header processing. The time is measured in units of
seconds, but since every module that processes a datagram must
decrease the TTL by at least one even if it process the datagram in
less than a second, the TTL must be thought of only as an upper
bound on the time a datagram may exist. The intention is to cause
undeliverable datagrams to be discarded, and to bound the maximum
datagram lifetime.

Protocol: 8 bits

This field indicates the next level protocol used in the data
portion of the internet datagram. The values for various protocols
are specified in "Assigned Numbers" [9].

Header Checksum: 16 bits

A checksum on the header only. Since some header fields change
(e.g., time to live), this is recomputed and verified at each point
that the internet header is processed.

The checksum algorithm is:

The checksum field is the 16 bit one's complement of the one's
complement sum of all 16 bit words in the header. For purposes of
computing the checksum, the value of the checksum field is zero.

This is a simple to compute checksum and experimental evidence
indicates it is adequate, but it is provisional and may be replaced
by a CRC procedure, depending on further experience.

Source Address: 32 bits

The source address. See section 3.2.

Destination Address: 32 bits

The destination address. See section 3.2.

[Page 14]

September 1981
Internet Protocol
Specification

Options: variable

The options may appear or not in datagrams. They must be
implemented by all IP modules (host and gateways). What is optional
is their transmission in any particular datagram, not their
implementation.

In some environments the security option may be required in all
datagrams.

The option field is variable in length. There may be zero or more
options. There are two cases for the format of an option:

Case 1: A single octet of option-type.

Case 2: An option-type octet, an option-length octet, and the
actual option-data octets.

The option-length octet counts the option-type octet and the
option-length octet as well as the option-data octets.

The option-type octet is viewed as having 3 fields:

1 bit copied flag,
2 bits option class,
5 bits option number.

The copied flag indicates that this option is copied into all
fragments on fragmentation.

0 = not copied
1 = copied

The option classes are:

0 = control
1 = reserved for future use
2 = debugging and measurement
3 = reserved for future use

[Page 15]

September 1981
Internet Protocol
Specification

The following internet options are defined:

CLASS NUMBER LENGTH DESCRIPTION
----- ------ ------ -----------
0 0 - End of Option list. This option occupies only
1 octet; it has no length octet.
0 1 - No Operation. This option occupies only 1
octet; it has no length octet.
0 2 11 Security. Used to carry Security,
Compartmentation, User Group (TCC), and
Handling Restriction Codes compatible with DOD
requirements.
0 3 var. Loose Source Routing. Used to route the
internet datagram based on information
supplied by the source.
0 9 var. Strict Source Routing. Used to route the
internet datagram based on information
supplied by the source.
0 7 var. Record Route. Used to trace the route an
internet datagram takes.
0 8 4 Stream ID. Used to carry the stream
identifier.
2 4 var. Internet Timestamp.

Specific Option Definitions

End of Option List

+--------+
|00000000|
+--------+
Type=0

This option indicates the end of the option list. This might
not coincide with the end of the internet header according to
the internet header length. This is used at the end of all
options, not the end of each option, and need only be used if
the end of the options would not otherwise coincide with the end
of the internet header.

May be copied, introduced, or deleted on fragmentation, or for
any other reason.

[Page 16]

September 1981
Internet Protocol
Specification

No Operation

+--------+
|00000001|
+--------+
Type=1

This option may be used between options, for example, to align
the beginning of a subsequent option on a 32 bit boundary.

May be copied, introduced, or deleted on fragmentation, or for
any other reason.

Security

This option provides a way for hosts to send security,
compartmentation, handling restrictions, and TCC (closed user
group) parameters. The format for this option is as follows:

+--------+--------+---//---+---//---+---//---+---//---+
|10000010|00001011|SSS SSS|CCC CCC|HHH HHH| TCC |
+--------+--------+---//---+---//---+---//---+---//---+
Type=130 Length=11

Security (S field): 16 bits

Specifies one of 16 levels of security (eight of which are
reserved for future use).

00000000 00000000 - Unclassified
11110001 00110101 - Confidential
01111000 10011010 - EFTO
10111100 01001101 - MMMM
01011110 00100110 - PROG
10101111 00010011 - Restricted
11010111 10001000 - Secret
01101011 11000101 - Top Secret
00110101 11100010 - (Reserved for future use)
10011010 11110001 - (Reserved for future use)
01001101 01111000 - (Reserved for future use)
00100100 10111101 - (Reserved for future use)
00010011 01011110 - (Reserved for future use)
10001001 10101111 - (Reserved for future use)
11000100 11010110 - (Reserved for future use)
11100010 01101011 - (Reserved for future use)

[Page 17]

September 1981
Internet Protocol
Specification

Compartments (C field): 16 bits

An all zero value is used when the information transmitted is
not compartmented. Other values for the compartments field
may be obtained from the Defense Intelligence Agency.

Handling Restrictions (H field): 16 bits

The values for the control and release markings are
alphanumeric digraphs and are defined in the Defense
Intelligence Agency Manual DIAM 65-19, "Standard Security
Markings".

Transmission Control Code (TCC field): 24 bits

Provides a means to segregate traffic and define controlled
communities of interest among subscribers. The TCC values are
trigraphs, and are available from HQ DCA Code 530.

Must be copied on fragmentation. This option appears at most
once in a datagram.

Loose Source and Record Route

+--------+--------+--------+---------//--------+
|10000011| length | pointer| route data |
+--------+--------+--------+---------//--------+
Type=131

The loose source and record route (LSRR) option provides a means
for the source of an internet datagram to supply routing
information to be used by the gateways in forwarding the
datagram to the destination, and to record the route
information.

The option begins with the option type code. The second octet
is the option length which includes the option type code and the
length octet, the pointer octet, and length-3 octets of route
data. The third octet is the pointer into the route data
indicating the octet which begins the next source address to be
processed. The pointer is relative to this option, and the
smallest legal value for the pointer is 4.

A route data is composed of a series of internet addresses.
Each internet address is 32 bits or 4 octets. If the pointer is
greater than the length, the source route is empty (and the
recorded route full) and the routing is to be based on the
destination address field.

[Page 18]

September 1981
Internet Protocol
Specification

If the address in destination address field has been reached and
the pointer is not greater than the length, the next address in
the source route replaces the address in the destination address
field, and the recorded route address replaces the source
address just used, and pointer is increased by four.

The recorded route address is the internet module's own internet
address as known in the environment into which this datagram is
being forwarded.

This procedure of replacing the source route with the recorded
route (though it is in the reverse of the order it must be in to
be used as a source route) means the option (and the IP header
as a whole) remains a constant length as the datagram progresses
through the internet.

This option is a loose source route because the gateway or host
IP is allowed to use any route of any number of other
intermediate gateways to reach the next address in the route.

Must be copied on fragmentation. Appears at most once in a
datagram.

Strict Source and Record Route

+--------+--------+--------+---------//--------+
|10001001| length | pointer| route data |
+--------+--------+--------+---------//--------+
Type=137

The strict source and record route (SSRR) option provides a
means for the source of an internet datagram to supply routing
information to be used by the gateways in forwarding the
datagram to the destination, and to record the route
information.

The option begins with the option type code. The second octet
is the option length which includes the option type code and the
length octet, the pointer octet, and length-3 octets of route
data. The third octet is the pointer into the route data
indicating the octet which begins the next source address to be
processed. The pointer is relative to this option, and the
smallest legal value for the pointer is 4.

A route data is composed of a series of internet addresses.
Each internet address is 32 bits or 4 octets. If the pointer is
greater than the length, the source route is empty (and the

[Page 19]

September 1981
Internet Protocol
Specification

recorded route full) and the routing is to be based on the
destination address field.

If the address in destination address field has been reached and
the pointer is not greater than the length, the next address in
the source route replaces the address in the destination address
field, and the recorded route address replaces the source
address just used, and pointer is increased by four.

The recorded route address is the internet module's own internet
address as known in the environment into which this datagram is
being forwarded.

This procedure of replacing the source route with the recorded
route (though it is in the reverse of the order it must be in to
be used as a source route) means the option (and the IP header
as a whole) remains a constant length as the datagram progresses
through the internet.

This option is a strict source route because the gateway or host
IP must send the datagram directly to the next address in the
source route through only the directly connected network
indicated in the next address to reach the next gateway or host
specified in the route.

Must be copied on fragmentation. Appears at most once in a
datagram.

Record Route

+--------+--------+--------+---------//--------+
|00000111| length | pointer| route data |
+--------+--------+--------+---------//--------+
Type=7

The record route option provides a means to record the route of
an internet datagram.

The option begins with the option type code. The second octet
is the option length which includes the option type code and the
length octet, the pointer octet, and length-3 octets of route
data. The third octet is the pointer into the route data
indicating the octet which begins the next area to store a route
address. The pointer is relative to this option, and the
smallest legal value for the pointer is 4.

A recorded route is composed of a series of internet addresses.
Each internet address is 32 bits or 4 octets. If the pointer is

[Page 20]

September 1981
Internet Protocol
Specification

greater than the length, the recorded route data area is full.
The originating host must compose this option with a large
enough route data area to hold all the address expected. The
size of the option does not change due to adding addresses. The
intitial contents of the route data area must be zero.

When an internet module routes a datagram it checks to see if
the record route option is present. If it is, it inserts its
own internet address as known in the environment into which this
datagram is being forwarded into the recorded route begining at
the octet indicated by the pointer, and increments the pointer
by four.

If the route data area is already full (the pointer exceeds the
length) the datagram is forwarded without inserting the address
into the recorded route. If there is some room but not enough
room for a full address to be inserted, the original datagram is
considered to be in error and is discarded. In either case an
ICMP parameter problem message may be sent to the source
host [3].

Not copied on fragmentation, goes in first fragment only.
Appears at most once in a datagram.

Stream Identifier

+--------+--------+--------+--------+
|10001000|00000010| Stream ID |
+--------+--------+--------+--------+
Type=136 Length=4

This option provides a way for the 16-bit SATNET stream
identifier to be carried through networks that do not support
the stream concept.

Must be copied on fragmentation. Appears at most once in a
datagram.

[Page 21]

September 1981
Internet Protocol
Specification

Internet Timestamp

+--------+--------+--------+--------+
|01000100| length | pointer|oflw|flg|
+--------+--------+--------+--------+
| internet address |
+--------+--------+--------+--------+
| timestamp |
+--------+--------+--------+--------+
| . |
.
.
Type = 68

The Option Length is the number of octets in the option counting
the type, length, pointer, and overflow/flag octets (maximum
length 40).

The Pointer is the number of octets from the beginning of this
option to the end of timestamps plus one (i.e., it points to the
octet beginning the space for next timestamp). The smallest
legal value is 5. The timestamp area is full when the pointer
is greater than the length.

The Overflow (oflw) [4 bits] is the number of IP modules that
cannot register timestamps due to lack of space.

The Flag (flg) [4 bits] values are

0 -- time stamps only, stored in consecutive 32-bit words,

1 -- each timestamp is preceded with internet address of the
registering entity,

3 -- the internet address fields are prespecified. An IP
module only registers its timestamp if it matches its own
address with the next specified internet address.

The Timestamp is a right-justified, 32-bit timestamp in
milliseconds since midnight UT. If the time is not available in
milliseconds or cannot be provided with respect to midnight UT
then any time may be inserted as a timestamp provided the high
order bit of the timestamp field is set to one to indicate the
use of a non-standard value.

The originating host must compose this option with a large
enough timestamp data area to hold all the timestamp information
expected. The size of the option does not change due to adding

[Page 22]

September 1981
Internet Protocol
Specification

timestamps. The intitial contents of the timestamp data area
must be zero or internet address/zero pairs.

If the timestamp data area is already full (the pointer exceeds
the length) the datagram is forwarded without inserting the
timestamp, but the overflow count is incremented by one.

If there is some room but not enough room for a full timestamp
to be inserted, or the overflow count itself overflows, the
original datagram is considered to be in error and is discarded.
In either case an ICMP parameter problem message may be sent to
the source host [3].

The timestamp option is not copied upon fragmentation. It is
carried in the first fragment. Appears at most once in a
datagram.

Padding: variable

The internet header padding is used to ensure that the internet
header ends on a 32 bit boundary. The padding is zero.

3.2. Discussion

The implementation of a protocol must be robust. Each implementation
must expect to interoperate with others created by different
individuals. While the goal of this specification is to be explicit
about the protocol there is the possibility of differing
interpretations. In general, an implementation must be conservative
in its sending behavior, and liberal in its receiving behavior. That
is, it must be careful to send well-formed datagrams, but must accept
any datagram that it can interpret (e.g., not object to technical
errors where the meaning is still clear).

The basic internet service is datagram oriented and provides for the
fragmentation of datagrams at gateways, with reassembly taking place
at the destination internet protocol module in the destination host.
Of course, fragmentation and reassembly of datagrams within a network
or by private agreement between the gateways of a network is also
allowed since this is transparent to the internet protocols and the
higher-level protocols. This transparent type of fragmentation and
reassembly is termed "network-dependent" (or intranet) fragmentation
and is not discussed further here.

Internet addresses distinguish sources and destinations to the host
level and provide a protocol field as well. It is assumed that each
protocol will provide for whatever multiplexing is necessary within a
host.

[Page 23]

September 1981
Internet Protocol
Specification

Addressing

To provide for flexibility in assigning address to networks and
allow for the large number of small to intermediate sized networks
the interpretation of the address field is coded to specify a small
number of networks with a large number of host, a moderate number of
networks with a moderate number of hosts, and a large number of
networks with a small number of hosts. In addition there is an
escape code for extended addressing mode.

Address Formats:

High Order Bits Format Class
--------------- ------------------------------- -----
0 7 bits of net, 24 bits of host a
10 14 bits of net, 16 bits of host b
110 21 bits of net, 8 bits of host c
111 escape to extended addressing mode

A value of zero in the network field means this network. This is
only used in certain ICMP messages. The extended addressing mode
is undefined. Both of these features are reserved for future use.

The actual values assigned for network addresses is given in
"Assigned Numbers" [9].

The local address, assigned by the local network, must allow for a
single physical host to act as several distinct internet hosts.
That is, there must be a mapping between internet host addresses and
network/host interfaces that allows several internet addresses to
correspond to one interface. It must also be allowed for a host to
have several physical interfaces and to treat the datagrams from
several of them as if they were all addressed to a single host.

Address mappings between internet addresses and addresses for
ARPANET, SATNET, PRNET, and other networks are described in "Address
Mappings" [5].

Fragmentation and Reassembly.

The internet identification field (ID) is used together with the
source and destination address, and the protocol fields, to identify
datagram fragments for reassembly.

The More Fragments flag bit (MF) is set if the datagram is not the
last fragment. The Fragment Offset field identifies the fragment
location, relative to the beginning of the original unfragmented
datagram. Fragments are counted in units of 8 octets. The

[Page 24]

September 1981
Internet Protocol
Specification

fragmentation strategy is designed so than an unfragmented datagram
has all zero fragmentation information (MF = 0, fragment offset =
0). If an internet datagram is fragmented, its data portion must be
broken on 8 octet boundaries.

This format allows 2**13 = 8192 fragments of 8 octets each for a
total of 65,536 octets. Note that this is consistent with the the
datagram total length field (of course, the header is counted in the
total length and not in the fragments).

When fragmentation occurs, some options are copied, but others
remain with the first fragment only.

Every internet module must be able to forward a datagram of 68
octets without further fragmentation. This is because an internet
header may be up to 60 octets, and the minimum fragment is 8 octets.

Every internet destination must be able to receive a datagram of 576
octets either in one piece or in fragments to be reassembled.

The fields which may be affected by fragmentation include:

(1) options field
(2) more fragments flag
(3) fragment offset
(4) internet header length field
(5) total length field
(6) header checksum

If the Don't Fragment flag (DF) bit is set, then internet
fragmentation of this datagram is NOT permitted, although it may be
discarded. This can be used to prohibit fragmentation in cases
where the receiving host does not have sufficient resources to
reassemble internet fragments.

One example of use of the Don't Fragment feature is to down line
load a small host. A small host could have a boot strap program
that accepts a datagram stores it in memory and then executes it.

The fragmentation and reassembly procedures are most easily
described by examples. The following procedures are example
implementations.

General notation in the following pseudo programs: "=<" means "less
than or equal", "#" means "not equal", "=" means "equal", "<-" means
"is set to". Also, "x to y" includes x and excludes y; for example,
"4 to 7" would include 4, 5, and 6 (but not 7).

[Page 25]

September 1981
Internet Protocol
Specification

An Example Fragmentation Procedure

The maximum sized datagram that can be transmitted through the
next network is called the maximum transmission unit (MTU).

If the total length is less than or equal the maximum transmission
unit then submit this datagram to the next step in datagram
processing; otherwise cut the datagram into two fragments, the
first fragment being the maximum size, and the second fragment
being the rest of the datagram. The first fragment is submitted
to the next step in datagram processing, while the second fragment
is submitted to this procedure in case it is still too large.

Notation:

FO - Fragment Offset
IHL - Internet Header Length
DF - Don't Fragment flag
MF - More Fragments flag
TL - Total Length
OFO - Old Fragment Offset
OIHL - Old Internet Header Length
OMF - Old More Fragments flag
OTL - Old Total Length
NFB - Number of Fragment Blocks
MTU - Maximum Transmission Unit

Procedure:

IF TL =< MTU THEN Submit this datagram to the next step
in datagram processing ELSE IF DF = 1 THEN discard the
datagram ELSE
To produce the first fragment:
(1) Copy the original internet header;
(2) OIHL <- IHL; OTL <- TL; OFO <- FO; OMF <- MF;
(3) NFB <- (MTU-IHL*4)/8;
(4) Attach the first NFB*8 data octets;
(5) Correct the header:
MF <- 1; TL <- (IHL*4)+(NFB*8);
Recompute Checksum;
(6) Submit this fragment to the next step in
datagram processing;
To produce the second fragment:
(7) Selectively copy the internet header (some options
are not copied, see option definitions);
(8) Append the remaining data;
(9) Correct the header:
IHL <- (((OIHL*4)-(length of options not copied))+3)/4;

[Page 26]

September 1981
Internet Protocol
Specification

TL <- OTL - NFB*8 - (OIHL-IHL)*4);
FO <- OFO + NFB; MF <- OMF; Recompute Checksum;
(10) Submit this fragment to the fragmentation test; DONE.

In the above procedure each fragment (except the last) was made
the maximum allowable size. An alternative might produce less
than the maximum size datagrams. For example, one could implement
a fragmentation procedure that repeatly divided large datagrams in
half until the resulting fragments were less than the maximum
transmission unit size.

An Example Reassembly Procedure

For each datagram the buffer identifier is computed as the
concatenation of the source, destination, protocol, and
identification fields. If this is a whole datagram (that is both
the fragment offset and the more fragments fields are zero), then
any reassembly resources associated with this buffer identifier
are released and the datagram is forwarded to the next step in
datagram processing.

If no other fragment with this buffer identifier is on hand then
reassembly resources are allocated. The reassembly resources
consist of a data buffer, a header buffer, a fragment block bit
table, a total data length field, and a timer. The data from the
fragment is placed in the data buffer according to its fragment
offset and length, and bits are set in the fragment block bit
table corresponding to the fragment blocks received.

If this is the first fragment (that is the fragment offset is
zero) this header is placed in the header buffer. If this is the
last fragment ( that is the more fragments field is zero) the
total data length is computed. If this fragment completes the
datagram (tested by checking the bits set in the fragment block
table), then the datagram is sent to the next step in datagram
processing; otherwise the timer is set to the maximum of the
current timer value and the value of the time to live field from
this fragment; and the reassembly routine gives up control.

If the timer runs out, the all reassembly resources for this
buffer identifier are released. The initial setting of the timer
is a lower bound on the reassembly waiting time. This is because
the waiting time will be increased if the Time to Live in the
arriving fragment is greater than the current timer value but will
not be decreased if it is less. The maximum this timer value
could reach is the maximum time to live (approximately 4.25
minutes). The current recommendation for the initial timer
setting is 15 seconds. This may be changed as experience with

[Page 27]

September 1981
Internet Protocol
Specification

this protocol accumulates. Note that the choice of this parameter
value is related to the buffer capacity available and the data
rate of the transmission medium; that is, data rate times timer
value equals buffer size (e.g., 10Kb/s X 15s = 150Kb).

Notation:

FO - Fragment Offset
IHL - Internet Header Length
MF - More Fragments flag
TTL - Time To Live
NFB - Number of Fragment Blocks
TL - Total Length
TDL - Total Data Length
BUFID - Buffer Identifier
RCVBT - Fragment Received Bit Table
TLB - Timer Lower Bound

Procedure:

(1) BUFID <- source|destination|protocol|identification;
(2) IF FO = 0 AND MF = 0
(3) THEN IF buffer with BUFID is allocated
(4) THEN flush all reassembly for this BUFID;
(5) Submit datagram to next step; DONE.
(6) ELSE IF no buffer with BUFID is allocated
(7) THEN allocate reassembly resources
with BUFID;
TIMER <- TLB; TDL <- 0;
(8) put data from fragment into data buffer with
BUFID from octet FO*8 to
octet (TL-(IHL*4))+FO*8;
(9) set RCVBT bits from FO
to FO+((TL-(IHL*4)+7)/8);
(10) IF MF = 0 THEN TDL <- TL-(IHL*4)+(FO*8)
(11) IF FO = 0 THEN put header in header buffer
(12) IF TDL # 0
(13) AND all RCVBT bits from 0
to (TDL+7)/8 are set
(14) THEN TL <- TDL+(IHL*4)
(15) Submit datagram to next step;
(16) free all reassembly resources
for this BUFID; DONE.
(17) TIMER <- MAX(TIMER,TTL);
(18) give up until next fragment or timer expires;
(19) timer expires: flush all reassembly with this BUFID; DONE.

In the case that two or more fragments contain the same data

[Page 28]

September 1981
Internet Protocol
Specification

either identically or through a partial overlap, this procedure
will use the more recently arrived copy in the data buffer and
datagram delivered.

Identification

The choice of the Identifier for a datagram is based on the need to
provide a way to uniquely identify the fragments of a particular
datagram. The protocol module assembling fragments judges fragments
to belong to the same datagram if they have the same source,
destination, protocol, and Identifier. Thus, the sender must choose
the Identifier to be unique for this source, destination pair and
protocol for the time the datagram (or any fragment of it) could be
alive in the internet.

It seems then that a sending protocol module needs to keep a table
of Identifiers, one entry for each destination it has communicated
with in the last maximum packet lifetime for the internet.

However, since the Identifier field allows 65,536 different values,
some host may be able to simply use unique identifiers independent
of destination.

It is appropriate for some higher level protocols to choose the
identifier. For example, TCP protocol modules may retransmit an
identical TCP segment, and the probability for correct reception
would be enhanced if the retransmission carried the same identifier
as the original transmission since fragments of either datagram
could be used to construct a correct TCP segment.

Type of Service

The type of service (TOS) is for internet service quality selection.
The type of service is specified along the abstract parameters
precedence, delay, throughput, and reliability. These abstract
parameters are to be mapped into the actual service parameters of
the particular networks the datagram traverses.

Precedence. An independent measure of the importance of this
datagram.

Delay. Prompt delivery is important for datagrams with this
indication.

Throughput. High data rate is important for datagrams with this
indication.

[Page 29]

September 1981
Internet Protocol
Specification

Reliability. A higher level of effort to ensure delivery is
important for datagrams with this indication.

For example, the ARPANET has a priority bit, and a choice between
"standard" messages (type 0) and "uncontrolled" messages (type 3),
(the choice between single packet and multipacket messages can also
be considered a service parameter). The uncontrolled messages tend
to be less reliably delivered and suffer less delay. Suppose an
internet datagram is to be sent through the ARPANET. Let the
internet type of service be given as:

Precedence: 5
Delay: 0
Throughput: 1
Reliability: 1

In this example, the mapping of these parameters to those available
for the ARPANET would be to set the ARPANET priority bit on since
the Internet precedence is in the upper half of its range, to select
standard messages since the throughput and reliability requirements
are indicated and delay is not. More details are given on service
mappings in "Service Mappings" [8].

Time to Live

The time to live is set by the sender to the maximum time the
datagram is allowed to be in the internet system. If the datagram
is in the internet system longer than the time to live, then the
datagram must be destroyed.

This field must be decreased at each point that the internet header
is processed to reflect the time spent processing the datagram.
Even if no local information is available on the time actually
spent, the field must be decremented by 1. The time is measured in
units of seconds (i.e. the value 1 means one second). Thus, the
maximum time to live is 255 seconds or 4.25 minutes. Since every
module that processes a datagram must decrease the TTL by at least
one even if it process the datagram in less than a second, the TTL
must be thought of only as an upper bound on the time a datagram may
exist. The intention is to cause undeliverable datagrams to be
discarded, and to bound the maximum datagram lifetime.

Some higher level reliable connection protocols are based on
assumptions that old duplicate datagrams will not arrive after a
certain time elapses. The TTL is a way for such protocols to have
an assurance that their assumption is met.

[Page 30]

September 1981
Internet Protocol
Specification

Options

The options are optional in each datagram, but required in
implementations. That is, the presence or absence of an option is
the choice of the sender, but each internet module must be able to
parse every option. There can be several options present in the
option field.

The options might not end on a 32-bit boundary. The internet header
must be filled out with octets of zeros. The first of these would
be interpreted as the end-of-options option, and the remainder as
internet header padding.

Every internet module must be able to act on every option. The
Security Option is required if classified, restricted, or
compartmented traffic is to be passed.

Checksum

The internet header checksum is recomputed if the internet header is
changed. For example, a reduction of the time to live, additions or
changes to internet options, or due to fragmentation. This checksum
at the internet level is intended to protect the internet header
fields from transmission errors.

There are some applications where a few data bit errors are
acceptable while retransmission delays are not. If the internet
protocol enforced data correctness such applications could not be
supported.

Errors

Internet protocol errors may be reported via the ICMP messages [3].

3.3. Interfaces

The functional description of user interfaces to the IP is, at best,
fictional, since every operating system will have different
facilities. Consequently, we must warn readers that different IP
implementations may have different user interfaces. However, all IPs
must provide a certain minimum set of services to guarantee that all
IP implementations can support the same protocol hierarchy. This
section specifies the functional interfaces required of all IP
implementations.

Internet protocol interfaces on one side to the local network and on
the other side to either a higher level protocol or an application
program. In the following, the higher level protocol or application

[Page 31]

September 1981
Internet Protocol
Specification

program (or even a gateway program) will be called the "user" since it
is using the internet module. Since internet protocol is a datagram
protocol, there is minimal memory or state maintained between datagram
transmissions, and each call on the internet protocol module by the
user supplies all information necessary for the IP to perform the
service requested.

An Example Upper Level Interface

The following two example calls satisfy the requirements for the user
to internet protocol module communication ("=>" means returns):

SEND (src, dst, prot, TOS, TTL, BufPTR, len, Id, DF, opt => result)

where:

src = source address
dst = destination address
prot = protocol
TOS = type of service
TTL = time to live
BufPTR = buffer pointer
len = length of buffer
Id = Identifier
DF = Don't Fragment
opt = option data
result = response
OK = datagram sent ok
Error = error in arguments or local network error

Note that the precedence is included in the TOS and the
security/compartment is passed as an option.

RECV (BufPTR, prot, => result, src, dst, TOS, len, opt)

where:

BufPTR = buffer pointer
prot = protocol
result = response
OK = datagram received ok
Error = error in arguments
len = length of buffer
src = source address
dst = destination address
TOS = type of service
opt = option data

[Page 32]

September 1981
Internet Protocol
Specification

When the user sends a datagram, it executes the SEND call supplying
all the arguments. The internet protocol module, on receiving this
call, checks the arguments and prepares and sends the message. If the
arguments are good and the datagram is accepted by the local network,
the call returns successfully. If either the arguments are bad, or
the datagram is not accepted by the local network, the call returns
unsuccessfully. On unsuccessful returns, a reasonable report must be
made as to the cause of the problem, but the details of such reports
are up to individual implementations.

When a datagram arrives at the internet protocol module from the local
network, either there is a pending RECV call from the user addressed
or there is not. In the first case, the pending call is satisfied by
passing the information from the datagram to the user. In the second
case, the user addressed is notified of a pending datagram. If the
user addressed does not exist, an ICMP error message is returned to
the sender, and the data is discarded.

The notification of a user may be via a pseudo interrupt or similar
mechanism, as appropriate in the particular operating system
environment of the implementation.

A user's RECV call may then either be immediately satisfied by a
pending datagram, or the call may be pending until a datagram arrives.

The source address is included in the send call in case the sending
host has several addresses (multiple physical connections or logical
addresses). The internet module must check to see that the source
address is one of the legal address for this host.

An implementation may also allow or require a call to the internet
module to indicate interest in or reserve exclusive use of a class of
datagrams (e.g., all those with a certain value in the protocol
field).

This section functionally characterizes a USER/IP interface. The
notation used is similar to most procedure of function calls in high
level languages, but this usage is not meant to rule out trap type
service calls (e.g., SVCs, UUOs, EMTs), or any other form of
interprocess communication.

[Page 33]

September 1981
Internet Protocol

APPENDIX A: Examples & Scenarios

Example 1:

This is an example of the minimal data carrying internet datagram:

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver= 4 |IHL= 5 |Type of Service| Total Length = 21 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification = 111 |Flg=0| Fragment Offset = 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time = 123 | Protocol = 1 | header checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| source address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| destination address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+

Example Internet Datagram

Figure 5.

Note that each tick mark represents one bit position.

This is a internet datagram in version 4 of internet protocol; the
internet header consists of five 32 bit words, and the total length of
the datagram is 21 octets. This datagram is a complete datagram (not
a fragment).

[Page 34]

September 1981
Internet Protocol

Example 2:

In this example, we show first a moderate size internet datagram (452
data octets), then two internet fragments that might result from the
fragmentation of this datagram if the maximum sized transmission
allowed were 280 octets.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver= 4 |IHL= 5 |Type of Service| Total Length = 472 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification = 111 |Flg=0| Fragment Offset = 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time = 123 | Protocol = 6 | header checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| source address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| destination address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
\ \
\ \
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Example Internet Datagram

Figure 6.

[Page 35]

September 1981
Internet Protocol

Now the first fragment that results from splitting the datagram after
256 data octets.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver= 4 |IHL= 5 |Type of Service| Total Length = 276 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification = 111 |Flg=1| Fragment Offset = 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time = 119 | Protocol = 6 | Header Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| source address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| destination address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
\ \
\ \
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Example Internet Fragment

Figure 7.

[Page 36]

September 1981
Internet Protocol

And the second fragment.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver= 4 |IHL= 5 |Type of Service| Total Length = 216 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification = 111 |Flg=0| Fragment Offset = 32 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time = 119 | Protocol = 6 | Header Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| source address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| destination address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
\ \
\ \
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Example Internet Fragment

Figure 8.

[Page 37]

September 1981
Internet Protocol

Example 3:

Here, we show an example of a datagram containing options:

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver= 4 |IHL= 8 |Type of Service| Total Length = 576 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification = 111 |Flg=0| Fragment Offset = 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time = 123 | Protocol = 6 | Header Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| source address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| destination address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Opt. Code = x | Opt. Len.= 3 | option value | Opt. Code = x |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Opt. Len. = 4 | option value | Opt. Code = 1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Opt. Code = y | Opt. Len. = 3 | option value | Opt. Code = 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
\ \
\ \
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Example Internet Datagram

Figure 9.

[Page 38]

September 1981
Internet Protocol

APPENDIX B: Data Transmission Order

The order of transmission of the header and data described in this
document is resolved to the octet level. Whenever a diagram shows a
group of octets, the order of transmission of those octets is the normal
order in which they are read in English. For example, in the following
diagram the octets are transmitted in the order they are numbered.

0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 1 | 2 | 3 | 4 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 5 | 6 | 7 | 8 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 9 | 10 | 11 | 12 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Transmission Order of Bytes

Figure 10.

Whenever an octet represents a numeric quantity the left most bit in the
diagram is the high order or most significant bit. That is, the bit
labeled 0 is the most significant bit. For example, the following
diagram represents the value 170 (decimal).

0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|1 0 1 0 1 0 1 0|
+-+-+-+-+-+-+-+-+

Significance of Bits

Figure 11.

Similarly, whenever a multi-octet field represents a numeric quantity
the left most bit of the whole field is the most significant bit. When
a multi-octet quantity is transmitted the most significant octet is
transmitted first.

[Page 39]

September 1981
Internet Protocol

[Page 40]

September 1981
Internet Protocol

GLOSSARY

1822
BBN Report 1822, "The Specification of the Interconnection of
a Host and an IMP". The specification of interface between a
host and the ARPANET.

ARPANET leader
The control information on an ARPANET message at the host-IMP
interface.

ARPANET message
The unit of transmission between a host and an IMP in the
ARPANET. The maximum size is about 1012 octets (8096 bits).

ARPANET packet
A unit of transmission used internally in the ARPANET between
IMPs. The maximum size is about 126 octets (1008 bits).

Destination
The destination address, an internet header field.

DF
The Don't Fragment bit carried in the flags field.

Flags
An internet header field carrying various control flags.

Fragment Offset
This internet header field indicates where in the internet
datagram a fragment belongs.

GGP
Gateway to Gateway Protocol, the protocol used primarily
between gateways to control routing and other gateway
functions.

header
Control information at the beginning of a message, segment,
datagram, packet or block of data.

ICMP
Internet Control Message Protocol, implemented in the internet
module, the ICMP is used from gateways to hosts and between
hosts to report errors and make routing suggestions.

[Page 41]

September 1981
Internet Protocol
Glossary

Identification
An internet header field carrying the identifying value
assigned by the sender to aid in assembling the fragments of a
datagram.

IHL
The internet header field Internet Header Length is the length
of the internet header measured in 32 bit words.

IMP
The Interface Message Processor, the packet switch of the
ARPANET.

Internet Address
A four octet (32 bit) source or destination address consisting
of a Network field and a Local Address field.

internet datagram
The unit of data exchanged between a pair of internet modules
(includes the internet header).

internet fragment
A portion of the data of an internet datagram with an internet
header.

Local Address
The address of a host within a network. The actual mapping of
an internet local address on to the host addresses in a
network is quite general, allowing for many to one mappings.

MF
The More-Fragments Flag carried in the internet header flags
field.

module
An implementation, usually in software, of a protocol or other
procedure.

more-fragments flag
A flag indicating whether or not this internet datagram
contains the end of an internet datagram, carried in the
internet header Flags field.

NFB
The Number of Fragment Blocks in a the data portion of an
internet fragment. That is, the length of a portion of data
measured in 8 octet units.

[Page 42]

September 1981
Internet Protocol
Glossary

octet
An eight bit byte.

Options
The internet header Options field may contain several options,
and each option may be several octets in length.

Padding
The internet header Padding field is used to ensure that the
data begins on 32 bit word boundary. The padding is zero.

Protocol
In this document, the next higher level protocol identifier,
an internet header field.

Rest
The local address portion of an Internet Address.

Source
The source address, an internet header field.

TCP
Transmission Control Protocol: A host-to-host protocol for
reliable communication in internet environments.

TCP Segment
The unit of data exchanged between TCP modules (including the
TCP header).

TFTP
Trivial File Transfer Protocol: A simple file transfer
protocol built on UDP.

Time to Live
An internet header field which indicates the upper bound on
how long this internet datagram may exist.

TOS
Type of Service

Total Length
The internet header field Total Length is the length of the
datagram in octets including internet header and data.

TTL
Time to Live

[Page 43]

September 1981
Internet Protocol
Glossary

Type of Service
An internet header field which indicates the type (or quality)
of service for this internet datagram.

UDP
User Datagram Protocol: A user level protocol for transaction
oriented applications.

User
The user of the internet protocol. This may be a higher level
protocol module, an application program, or a gateway program.

Version
The Version field indicates the format of the internet header.

[Page 44]

September 1981
Internet Protocol

REFERENCES

[1] Cerf, V., "The Catenet Model for Internetworking," Information
Processing Techniques Office, Defense Advanced Research Projects
Agency, IEN 48, July 1978.

[2] Bolt Beranek and Newman, "Specification for the Interconnection of
a Host and an IMP," BBN Technical Report 1822, Revised May 1978.

[3] Postel, J., "Internet Control Message Protocol - DARPA Internet
Program Protocol Specification," RFC 792, USC/Information Sciences
Institute, September 1981.

[4] Shoch, J., "Inter-Network Naming, Addressing, and Routing,"
COMPCON, IEEE Computer Society, Fall 1978.

[5] Postel, J., "Address Mappings," RFC 796, USC/Information Sciences
Institute, September 1981.

[6] Shoch, J., "Packet Fragmentation in Inter-Network Protocols,"
Computer Networks, v. 3, n. 1, February 1979.

[7] Strazisar, V., "How to Build a Gateway", IEN 109, Bolt Beranek and
Newman, August 1979.

[8] Postel, J., "Service Mappings," RFC 795, USC/Information Sciences
Institute, September 1981.

[9] Postel, J., "Assigned Numbers," RFC 790, USC/Information Sciences
Institute, September 1981.