ipv6 traffic statistics


RFC 3232 – Assigned Numbers: RFC 1700 is Replaced by an On-line Database

Network Working Group                                J. Reynolds, Editor
Request for Comments: 3232                                    RFC Editor
Obsoletes: 1700                                             January 2002
Category: Informational


Assigned Numbers: RFC 1700 is Replaced by an On-line Database


Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

Abstract

   This memo obsoletes RFC 1700 (STD 2) "Assigned Numbers", which
   contained an October 1994 snapshot of assigned Internet protocol
   parameters.

Description

   From November 1977 through October 1994, the Internet Assigned
   Numbers Authority (IANA) periodically published tables of the
   Internet protocol parameter assignments in RFCs entitled, "Assigned
   Numbers".  The most current of these Assigned Numbers RFCs had
   Standard status and carried the designation: STD 2.  At this time,
   the latest STD 2 is RFC 1700.

   Since 1994, this sequence of RFCs has been replaced by an online
   database accessible through a web page (currently, www.iana.org).
   The purpose of the present RFC is to note this fact and to officially
   obsolete RFC 1700, whose status changes to Historic.  RFC 1700 is
   obsolete, and its values are incomplete and in some cases may be
   wrong.

   We expect this series to be revived in the future by the new IANA
   organization.

Security Considerations

   This memo does not affect the technical security of the Internet.







Author's Address

   Joyce K. Reynolds
   RFC Editor
   4676 Admiralty Way
   Marina del Rey, CA  90292
   USA

   EMail: rfc-editor@rfc-editor.org

RFC 2464 – Transmission of IPv6 Packets over Ethernet Networks

Network Working Group                                        M. Crawford
Request for Comments: 2464                                      Fermilab
Obsoletes: 1972                                            December 1998
Category: Standards Track

Transmission of IPv6 Packets over Ethernet Networks

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1998). All Rights Reserved.

1. Introduction

This document specifies the frame format for transmission of IPv6
packets and the method of forming IPv6 link-local addresses and
statelessly autoconfigured addresses on Ethernet networks. It also
specifies the content of the Source/Target Link-layer Address option
used in Router Solicitation, Router Advertisement, Neighbor
Solicitation, Neighbor Advertisement and Redirect messages when those
messages are transmitted on an Ethernet.

This document replaces RFC 1972, "A Method for the Transmission of
IPv6 Packets over Ethernet Networks", which will become historic.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC 2119].

2. Maximum Transmission Unit

The default MTU size for IPv6 [IPV6] packets on an Ethernet is 1500
octets. This size may be reduced by a Router Advertisement [DISC]
containing an MTU option which specifies a smaller MTU, or by manual
configuration of each node. If a Router Advertisement received on an
Ethernet interface has an MTU option specifying an MTU larger than
1500, or larger than a manually configured value, that MTU option may
be logged to system management but must be otherwise ignored.

For purposes of this document, information received from DHCP is
considered "manually configured" and the term Ethernet includes
CSMA/CD and full-duplex subnetworks based on ISO/IEC 8802-3, with
various data rates.
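
The MTU rules above can be sketched as a small selection function. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, and logging of an ignored MTU option is left to the caller.

```python
ETHERNET_DEFAULT_MTU = 1500  # default IPv6 MTU on Ethernet, from the text above


def effective_mtu(ra_mtu=None, manual_mtu=None):
    """Return the MTU a node would use on an Ethernet interface.

    ra_mtu     -- value of the MTU option in a Router Advertisement, if any
    manual_mtu -- manually (or DHCP) configured value, if any
    Both parameter names are illustrative, not taken from the RFC.
    """
    # Manual configuration can only reduce the default.
    limit = ETHERNET_DEFAULT_MTU
    if manual_mtu is not None:
        limit = min(limit, manual_mtu)
    # An advertised MTU larger than 1500 or larger than the manually
    # configured value is ignored; a smaller one reduces the MTU.
    if ra_mtu is not None and ra_mtu <= limit:
        return ra_mtu
    return limit
```

For instance, an advertised value of 9000 would be ignored and the node would keep 1500, while an advertised value of 1400 would be adopted.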

3. Frame Format

IPv6 packets are transmitted in standard Ethernet frames. The
Ethernet header contains the Destination and Source Ethernet
addresses and the Ethernet type code, which must contain the value
86DD hexadecimal. The data field contains the IPv6 header followed
immediately by the payload, and possibly padding octets to meet the
minimum frame size for the Ethernet link.

 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Destination          |
+-                             -+
|            Ethernet           |
+-                             -+
|            Address            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             Source            |
+-                             -+
|            Ethernet           |
+-                             -+
|            Address            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 0 0 0 0 1 1 0 1 1 0 1 1 1 0 1|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             IPv6              |
+-                             -+
|            header             |
+-                             -+
|             and               |
+-                             -+
/            payload ...        /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

(Each tic mark represents one bit.)
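
As a rough illustration of the frame layout, the following sketch assembles the Ethernet header and data field for an IPv6 packet. The function name is hypothetical, and the 46-octet minimum data field is an assumption taken from classic Ethernet rather than from this document; the frame check sequence is not included.

```python
import struct

ETHERTYPE_IPV6 = 0x86DD  # Ethernet type code required by the text above
MIN_DATA_FIELD = 46      # assumption: classic Ethernet minimum, excluding FCS


def build_frame(dst_mac: bytes, src_mac: bytes, ipv6_packet: bytes) -> bytes:
    """Return destination, source, type code, packet and any padding."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("Ethernet addresses must be 6 octets")
    header = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_IPV6)
    # Pad short packets with zero octets to reach the minimum frame size.
    padding = b"\x00" * max(0, MIN_DATA_FIELD - len(ipv6_packet))
    return header + ipv6_packet + padding
```

The bit pattern shown in the diagram, 1000011011011101, is the binary form of 86DD hexadecimal.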

4. Stateless Autoconfiguration

The Interface Identifier [AARCH] for an Ethernet interface is based
on the EUI-64 identifier [EUI64] derived from the interface's built-
in 48-bit IEEE 802 address. The EUI-64 is formed as follows.
(Canonical bit order is assumed throughout.)

The OUI of the Ethernet address (the first three octets) becomes the
company_id of the EUI-64 (the first three octets). The fourth and
fifth octets of the EUI are set to the fixed value FFFE hexadecimal.
The last three octets of the Ethernet address become the last three
octets of the EUI-64.

The Interface Identifier is then formed from the EUI-64 by
complementing the "Universal/Local" (U/L) bit, which is the next-to-
lowest order bit of the first octet of the EUI-64. Complementing
this bit will generally change a 0 value to a 1, since an interface's
built-in address is expected to be from a universally administered
address space and hence have a globally unique value. A universally
administered IEEE 802 address or an EUI-64 is signified by a 0 in the
U/L bit position, while a globally unique IPv6 Interface Identifier
is signified by a 1 in the corresponding position. For further
discussion on this point, see [AARCH].

For example, the Interface Identifier for an Ethernet interface whose
built-in address is, in hexadecimal,

34-56-78-9A-BC-DE

would be

36-56-78-FF-FE-9A-BC-DE.
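
The derivation above can be checked with a short sketch. The function name and the accepted input separators are illustrative assumptions; the steps (insert FFFE, complement the U/L bit) follow the text.

```python
def interface_identifier(mac: str) -> str:
    """Derive the 64-bit Interface Identifier from a 48-bit MAC address."""
    octets = bytes.fromhex(mac.replace("-", "").replace(":", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit address")
    # OUI, then the fixed value FFFE, then the last three octets.
    eui64 = bytearray(octets[:3] + b"\xff\xfe" + octets[3:])
    # Complement the U/L bit: the next-to-lowest order bit of the first octet.
    eui64[0] ^= 0x02
    return "-".join(f"{b:02X}" for b in eui64)


print(interface_identifier("34-56-78-9A-BC-DE"))
```

Run on the address from the example, this prints 36-56-78-FF-FE-9A-BC-DE.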

A different MAC address set manually or by software should not be
used to derive the Interface Identifier. If such a MAC address must
be used, its global uniqueness property should be reflected in the
value of the U/L bit.

An IPv6 address prefix used for stateless autoconfiguration [ACONF]
of an Ethernet interface must have a length of 64 bits.

5. Link-Local Addresses

The IPv6 link-local address [AARCH] for an Ethernet interface is
formed by appending the Interface Identifier, as defined above, to
the prefix FE80::/64.

  10 bits           54 bits                  64 bits
+----------+-----------------------+----------------------------+
|1111111010|         (zeros)       |    Interface Identifier    |
+----------+-----------------------+----------------------------+
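
A minimal sketch of forming the link-local address, assuming the Interface Identifier is already available as 8 octets. The function name is hypothetical; the standard library ipaddress module is used only for formatting.

```python
import ipaddress


def link_local_address(interface_id: bytes) -> ipaddress.IPv6Address:
    """Append a 64-bit Interface Identifier to the prefix FE80::/64."""
    if len(interface_id) != 8:
        raise ValueError("Interface Identifier must be 64 bits")
    prefix = bytes.fromhex("fe80") + bytes(6)  # 1111111010 followed by zeros
    return ipaddress.IPv6Address(prefix + interface_id)


print(link_local_address(bytes.fromhex("365678fffe9abcde")))
```

For the Interface Identifier from section 4, this prints fe80::3656:78ff:fe9a:bcde.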

6. Address Mapping -- Unicast

The procedure for mapping IPv6 unicast addresses into Ethernet link-
layer addresses is described in [DISC]. The Source/Target Link-layer
Address option has the following form when the link layer is
Ethernet.

 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |
+-          Ethernet           -+
|                               |
+-           Address           -+
|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Option fields:

Type           1 for Source Link-layer address.
               2 for Target Link-layer address.

Length         1 (in units of 8 octets).

Ethernet Address
               The 48 bit Ethernet IEEE 802 address, in canonical bit
               order.  This is the address the interface currently
               responds to, and may be different from the built-in
               address used to derive the Interface Identifier.
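
The option is exactly 8 octets on Ethernet, which is why its Length is 1. A small sketch of the encoding follows; the function name and the boolean parameter are illustrative.

```python
import struct


def link_layer_option(mac: bytes, target: bool = False) -> bytes:
    """Encode a Source (Type 1) or Target (Type 2) Link-layer Address option."""
    if len(mac) != 6:
        raise ValueError("Ethernet address must be 6 octets")
    opt_type = 2 if target else 1
    length = 1  # in units of 8 octets: 1 + 1 + 6 = 8
    return struct.pack("!BB6s", opt_type, length, mac)
```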

7. Address Mapping -- Multicast

An IPv6 packet with a multicast destination address DST, consisting
of the sixteen octets DST[1] through DST[16], is transmitted to the
Ethernet multicast address whose first two octets are the value 3333
hexadecimal and whose last four octets are the last four octets of
DST.

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 1 1 0 0 1 1|0 0 1 1 0 0 1 1|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    DST[13]    |    DST[14]    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    DST[15]    |    DST[16]    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
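
The multicast mapping can be sketched as follows. The function name and the colon-separated output format are illustrative choices, not part of the specification.

```python
import ipaddress


def multicast_mac(dst: str) -> str:
    """Map an IPv6 multicast address to its Ethernet multicast address."""
    addr = ipaddress.IPv6Address(dst)
    if not addr.is_multicast:
        raise ValueError("not an IPv6 multicast address")
    # First two octets are 3333 hexadecimal; last four are DST[13]..DST[16].
    mac = b"\x33\x33" + addr.packed[-4:]
    return ":".join(f"{b:02x}" for b in mac)


print(multicast_mac("ff02::1"))
```

For the all-nodes address ff02::1 this prints 33:33:00:00:00:01.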

8. Differences From RFC 1972

The following are the functional differences between this
specification and RFC 1972.

The Address Token, which was a node's 48-bit MAC address, is
replaced with the Interface Identifier, which is 64 bits in
length and based on the EUI-64 format [EUI64]. An IEEE-defined
mapping exists from 48-bit MAC addresses to EUI-64 form.

A prefix used for stateless autoconfiguration must now be 64 bits
long rather than 80. The link-local prefix is also shortened to
64 bits.

9. Security Considerations

The method of derivation of Interface Identifiers from MAC addresses
is intended to preserve global uniqueness when possible. However,
there is no protection from duplication through accident or forgery.

10. References

[AARCH] Hinden, R. and S. Deering, "IP Version 6 Addressing
Architecture", RFC 2373, July 1998.

[ACONF] Thomson, S. and T. Narten, "IPv6 Stateless Address
Autoconfiguration", RFC 2462, December 1998.

[DISC] Narten, T., Nordmark, E. and W. Simpson, "Neighbor Discovery
for IP Version 6 (IPv6)", RFC 2461, December 1998.

[EUI64] "Guidelines For 64-bit Global Identifier (EUI-64)",
http://standards.ieee.org/db/oui/tutorials/EUI64.html

[IPV6] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", RFC 2460, December 1998.

[RFC 2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.

11. Author's Address

Matt Crawford
Fermilab MS 368
PO Box 500
Batavia, IL 60510
USA

Phone: +1 630 840-3461
EMail: crawdad@fnal.gov

12. Full Copyright Statement

Copyright (C) The Internet Society (1998). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


RFC 1885 – Internet Control Message Protocol (ICMPv6) for IPv6 (OBSOLETE)

 
Network Working Group            A. Conta, Digital Equipment Corporation
Request for Comments: 1885                        S. Deering, Xerox PARC
Category: Standards Track                                  December 1995

Internet Control Message Protocol (ICMPv6)
for the Internet Protocol Version 6 (IPv6)
Specification

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Abstract

This document specifies a set of Internet Control Message Protocol
(ICMP) messages for use with version 6 of the Internet Protocol
(IPv6). The Internet Group Management Protocol (IGMP) messages
specified in STD 5, RFC 1112 have been merged into ICMP, for IPv6,
and are included in this document.

Table of Contents

1. Introduction
2. ICMPv6 (ICMP for IPv6)
     2.1 Message General Format
     2.2 Message Source Address Determination
     2.3 Message Checksum Calculation
     2.4 Message Processing Rules
3. ICMPv6 Error Messages
     3.1 Destination Unreachable Message
     3.2 Packet Too Big Message
     3.3 Time Exceeded Message
     3.4 Parameter Problem Message
4. ICMPv6 Informational Messages
     4.1 Echo Request Message
     4.2 Echo Reply Message
     4.3 Group Membership Messages
5. References
6. Acknowledgements
7. Security Considerations
Authors' Addresses

1. Introduction

The Internet Protocol, version 6 (IPv6) is a new version of IP. IPv6
uses the Internet Control Message Protocol (ICMP) as defined for IPv4
[RFC-792], with a number of changes. The Internet Group Membership
Protocol (IGMP) specified for IPv4 [RFC-1112] has also been revised
and has been absorbed into ICMP for IPv6. The resulting protocol is
called ICMPv6, and has an IPv6 Next Header value of 58.

This document describes the format of a set of control messages used
in ICMPv6. It does not describe the procedures for using these
messages to achieve functions like Path MTU discovery or multicast
group membership maintenance; such procedures are described in other
documents (e.g., [RFC-1112, RFC-1191]). Other documents may also
introduce additional ICMPv6 message types, such as Neighbor Discovery
messages [IPv6-DISC], subject to the general rules for ICMPv6
messages given in section 2 of this document.

Terminology defined in the IPv6 specification [IPv6] and the IPv6
Routing and Addressing specification [IPv6-ADDR] applies to this
document as well.

2. ICMPv6 (ICMP for IPv6)

ICMPv6 is used by IPv6 nodes to report errors encountered in
processing packets, and to perform other internet-layer functions,
such as diagnostics (ICMPv6 "ping") and multicast membership
reporting. ICMPv6 is an integral part of IPv6 and MUST be fully
implemented by every IPv6 node.

2.1 Message General Format

ICMPv6 messages are grouped into two classes: error messages and
informational messages. Error messages are identified as such by
having a zero in the high-order bit of their message Type field
values. Thus, error messages have message Types from 0 to 127;
informational messages have message Types from 128 to 255.

This document defines the message formats for the following ICMPv6
messages:

ICMPv6 error messages:

1 Destination Unreachable (see section 3.1)
2 Packet Too Big (see section 3.2)
3 Time Exceeded (see section 3.3)
4 Parameter Problem (see section 3.4)

ICMPv6 informational messages:

128 Echo Request (see section 4.1)
129 Echo Reply (see section 4.2)
130 Group Membership Query (see section 4.3)
131 Group Membership Report (see section 4.3)
132 Group Membership Reduction (see section 4.3)

Every ICMPv6 message is preceded by an IPv6 header and zero or more
IPv6 extension headers. The ICMPv6 header is identified by a Next
Header value of 58 in the immediately preceding header. (NOTE: this
is different than the value used to identify ICMP for IPv4.)

The ICMPv6 messages have the following general format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                         Message Body                          +
|                                                               |

The type field indicates the type of the message. Its value
determines the format of the remaining data.

The code field depends on the message type. It is used to create an
additional level of message granularity.

The checksum field is used to detect data corruption in the ICMPv6
message and parts of the IPv6 header.
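
A minimal sketch of decoding the general format and classifying a message by the high-order bit of its Type field. The function name and the returned dictionary keys are illustrative.

```python
import struct


def parse_icmpv6(message: bytes) -> dict:
    """Split an ICMPv6 message into Type, Code, Checksum and body."""
    if len(message) < 4:
        raise ValueError("ICMPv6 message shorter than its fixed header")
    msg_type, code, checksum = struct.unpack("!BBH", message[:4])
    return {
        "type": msg_type,
        "code": code,
        "checksum": checksum,
        # Error messages have a zero high-order bit (Types 0 to 127).
        "is_error": (msg_type & 0x80) == 0,
        "body": message[4:],
    }
```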

2.2 Message Source Address Determination

A node that sends an ICMPv6 message has to determine both the Source
and Destination IPv6 Addresses in the IPv6 header before calculating
the checksum. If the node has more than one unicast address, it must
choose the Source Address of the message as follows:

(a) If the message is a response to a message sent to one of the
node's unicast addresses, the Source Address of the reply must
be that same address.

(b) If the message is a response to a message sent to a multicast or
anycast group in which the node is a member, the Source Address
of the reply must be a unicast address belonging to the
interface on which the multicast or anycast packet was received.

(c) If the message is a response to a message sent to an address
that does not belong to the node, the Source Address should be
that unicast address belonging to the node that will be most
helpful in diagnosing the error. For example, if the message is
a response to a packet forwarding action that cannot complete
successfully, the Source Address should be a unicast address
belonging to the interface on which the packet forwarding
failed.

(d) Otherwise, the node's routing table must be examined to
determine which interface will be used to transmit the message
to its destination, and a unicast address belonging to that
interface must be used as the Source Address of the message.

2.3 Message Checksum Calculation

The checksum is the 16-bit one's complement of the one's complement
sum of the entire ICMPv6 message starting with the ICMPv6 message
type field, prepended with a "pseudo-header" of IPv6 header fields,
as specified in [IPv6, section 8.1]. The Next Header value used in
the pseudo-header is 58. (NOTE: the inclusion of a pseudo-header in
the ICMPv6 checksum is a change from IPv4; see [IPv6] for the
rationale for this change.)

For computing the checksum, the checksum field is set to zero.
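
The calculation can be sketched as below. The pseudo-header layout used here (source address, destination address, 32-bit payload length, three zero octets, Next Header) is an assumption taken from the IPv6 specification's section 8.1 and is not spelled out in this document; the function names are hypothetical.

```python
import ipaddress
import struct

ICMPV6_NEXT_HEADER = 58


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum, padding odd-length input with a zero."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def icmpv6_checksum(src: str, dst: str, message: bytes) -> int:
    """Checksum of an ICMPv6 message whose checksum field is zero."""
    pseudo_header = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", len(message), ICMPV6_NEXT_HEADER)
    )
    return ~ones_complement_sum(pseudo_header + message) & 0xFFFF
```

A receiver can verify a message by running the same function over the message as received: with a correct checksum in place, the result is zero.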

2.4 Message Processing Rules

Implementations MUST observe the following rules when processing
ICMPv6 messages (from [RFC-1122]):

(a) If an ICMPv6 error message of unknown type is received, it MUST
be passed to the upper layer.

(b) If an ICMPv6 informational message of unknown type is received,
it MUST be silently discarded.

(c) Every ICMPv6 error message (type < 128) includes as much of the
IPv6 offending (invoking) packet (the packet that caused the
error) as will fit without making the error message packet
exceed 576 octets.

(d) In those cases where the internet-layer protocol is required to
pass an ICMPv6 error message to the upper-layer protocol, the
upper-layer protocol type is extracted from the original packet
(contained in the body of the ICMPv6 error message) and used to
select the appropriate upper-layer protocol entity to handle the
error.

If the original packet had an unusually large number of
extension headers, it is possible that the upper-layer protocol
type may not be present in the ICMPv6 message, due to truncation
of the original packet to meet the 576-octet limit. In that
case, the error message is silently dropped after any IPv6-layer
processing.

(e) An ICMPv6 error message MUST NOT be sent as a result of
receiving:

(e.1) an ICMPv6 error message, or

(e.2) a packet destined to an IPv6 multicast address (there are
two exceptions to this rule: (1) the Packet Too Big
Message - Section 3.2 - to allow Path MTU discovery to
work for IPv6 multicast, and (2) the Parameter Problem
Message, Code 2 - Section 3.4 - reporting an unrecognized
IPv6 option that has the Option Type highest-order two
bits set to 10), or

(e.3) a packet sent as a link-layer multicast, (the exception
from e.2 applies to this case too), or

(e.4) a packet sent as a link-layer broadcast, (the exception
from e.2 applies to this case too), or

(e.5) a packet whose source address does not uniquely identify
a single node -- e.g., the IPv6 Unspecified Address, an
IPv6 multicast address, or an address known by the ICMP
message sender to be an IPv6 anycast address.

(f) Finally, to each sender of an erroneous data packet, an IPv6
node MUST limit the rate of ICMPv6 error messages sent, in order
to limit the bandwidth and forwarding costs incurred by the
error messages when a generator of erroneous packets does not
respond to those error messages by ceasing its transmissions.

There are a variety of ways of implementing the rate-limiting
function, for example:

(f.1) Timer-based - for example, limiting the rate of
transmission of error messages to a given source, or to
any source, to at most once every T milliseconds.

(f.2) Bandwidth-based - for example, limiting the rate at
which error messages are sent from a particular interface
to some fraction F of the attached link's bandwidth.

The limit parameters (e.g., T or F in the above examples) MUST
be configurable for the node, with a conservative default value
(e.g., T = 1 second, NOT 0 seconds, or F = 2 percent, NOT 100
percent).
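
The timer-based method (f.1) could look roughly like the sketch below, which limits error messages to one per source every T seconds. The class name, the per-source dictionary and the injectable clock are illustrative assumptions; a real implementation would also need to bound the size of the table.

```python
import time


class ErrorRateLimiter:
    """Allow at most one ICMPv6 error per source every interval_s seconds."""

    def __init__(self, interval_s: float = 1.0, clock=time.monotonic):
        self.interval_s = interval_s  # T, configurable, conservative default
        self.clock = clock
        self._last_sent = {}          # source address -> time of last error

    def allow(self, source: str) -> bool:
        now = self.clock()
        last = self._last_sent.get(source)
        if last is not None and now - last < self.interval_s:
            return False              # too soon: suppress this error message
        self._last_sent[source] = now
        return True
```

A sender would call allow() with the source address of the offending packet and emit the error message only when it returns True.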

The following sections describe the message formats for the above
ICMPv6 messages.

3. ICMPv6 Error Messages

3.1 Destination Unreachable Message

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Unused                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    As much of invoking packet                 |
+               as will fit without the ICMPv6 packet           +
|                       exceeding 576 octets                    |

IPv6 Fields:

Destination Address
               Copied from the Source Address field of the invoking
               packet.

ICMPv6 Fields:

Type           1

Code           0 - no route to destination
               1 - communication with destination
                     administratively prohibited
               2 - not a neighbor
               3 - address unreachable
               4 - port unreachable

Unused         This field is unused for all code values.
               It must be initialized to zero by the sender
               and ignored by the receiver.

Description

A Destination Unreachable message SHOULD be generated by a router, or
by the IPv6 layer in the originating node, in response to a packet
that cannot be delivered to its destination address for reasons other
than congestion. (An ICMPv6 message MUST NOT be generated if a
packet is dropped due to congestion.)

If the reason for the failure to deliver is lack of a matching entry
in the forwarding node's routing table, the Code field is set to 0
(NOTE: this error can occur only in nodes that do not hold a "default
route" in their routing tables).

If the reason for the failure to deliver is administrative
prohibition, e.g., a "firewall filter", the Code field is set to 1.

If the reason for the failure to deliver is that the next destination
address in the Routing header is not a neighbor of the processing
node but the "strict" bit is set for that address, then the Code
field is set to 2.

If there is any other reason for the failure to deliver, e.g.,
inability to resolve the IPv6 destination address into a
corresponding link address, or a link-specific problem of some sort,
then the Code field is set to 3.

A destination node SHOULD send a Destination Unreachable message with
Code 4 in response to a packet for which the transport protocol
(e.g., UDP) has no listener, if that transport protocol has no
alternative means to inform the sender.

Upper layer notification

A node receiving the ICMPv6 Destination Unreachable message MUST
notify the upper-layer protocol.
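
A sketch of assembling a Destination Unreachable message, including the truncation of the invoking packet. It assumes the error packet carries a 40-octet IPv6 header and no extension headers, so that 528 octets remain for the invoking packet; the checksum is left at zero for the caller to fill in. The names are hypothetical.

```python
import struct

MAX_ERROR_PACKET = 576  # limit on the whole error message packet
IPV6_HEADER_LEN = 40    # assumption: no extension headers on the error packet
ICMPV6_ERROR_HEADER_LEN = 8


def destination_unreachable(code: int, invoking_packet: bytes) -> bytes:
    """Build a Type 1 message with a zero checksum and zero Unused field."""
    if code not in range(5):
        raise ValueError("Code must be 0 through 4")
    room = MAX_ERROR_PACKET - IPV6_HEADER_LEN - ICMPV6_ERROR_HEADER_LEN
    header = struct.pack("!BBHI", 1, code, 0, 0)
    # Include as much of the invoking packet as fits within 576 octets.
    return header + invoking_packet[:room]
```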

3.2 Packet Too Big Message

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              MTU                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    As much of invoking packet                 |
+               as will fit without the ICMPv6 packet           +
|                       exceeding 576 octets                    |

IPv6 Fields:

Destination Address
               Copied from the Source Address field of the invoking
               packet.

ICMPv6 Fields:

Type           2

Code           0

MTU            The Maximum Transmission Unit of the next-hop link.

Description

A Packet Too Big MUST be sent by a router in response to a packet
that it cannot forward because the packet is larger than the MTU of
the outgoing link. The information in this message is used as part
of the Path MTU Discovery process [RFC-1191].

Sending a Packet Too Big Message makes an exception to one of the
rules of when to send an ICMPv6 error message, in that unlike other
messages, it is sent in response to a packet received with an IPv6
multicast destination address, or a link-layer multicast or link-
layer broadcast address.

Upper layer notification

An incoming Packet Too Big message MUST be passed to the upper-layer
protocol.

3.3 Time Exceeded Message

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Unused                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    As much of invoking packet                 |
+               as will fit without the ICMPv6 packet           +
|                       exceeding 576 octets                    |

IPv6 Fields:

Destination Address
               Copied from the Source Address field of the invoking
               packet.

ICMPv6 Fields:

Type           3

Code           0 - hop limit exceeded in transit
               1 - fragment reassembly time exceeded

Unused         This field is unused for all code values.
               It must be initialized to zero by the sender
               and ignored by the receiver.

Description

If a router receives a packet with a Hop Limit of zero, or a router
decrements a packet's Hop Limit to zero, it MUST discard the packet
and send an ICMPv6 Time Exceeded message with Code 0 to the source of
the packet. This indicates either a routing loop or too small an
initial Hop Limit value.

The router sending an ICMPv6 Time Exceeded message with Code 0 SHOULD
consider the receiving interface of the packet as the interface on
which the packet forwarding failed when applying rule (c) of section
2.2 for selecting the Source Address of the message.

Upper layer notification

An incoming Time Exceeded message MUST be passed to the upper-layer
protocol.

3.4 Parameter Problem Message

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Pointer                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    As much of invoking packet                 |
+               as will fit without the ICMPv6 packet           +
|                       exceeding 576 octets                    |

IPv6 Fields:

Destination Address
               Copied from the Source Address field of the invoking
               packet.

ICMPv6 Fields:

Type           4

Code           0 - erroneous header field encountered
               1 - unrecognized Next Header type encountered
               2 - unrecognized IPv6 option encountered

Pointer        Identifies the octet offset within the
               invoking packet where the error was detected.

               The pointer will point beyond the end of the ICMPv6
               packet if the field in error is beyond what can fit
               in the 576-byte limit of an ICMPv6 error message.

Description

If an IPv6 node processing a packet finds a problem with a field in
the IPv6 header or extension headers such that it cannot complete
processing the packet, it MUST discard the packet and SHOULD send an
ICMPv6 Parameter Problem message to the packet's source, indicating
the type and location of the problem.

The pointer identifies the octet of the original packet's header
where the error was detected. For example, an ICMPv6 message with
Type field = 4, Code field = 1, and Pointer field = 40 would indicate
that the IPv6 extension header following the IPv6 header of the
original packet holds an unrecognized Next Header field value.

Upper layer notification

A node receiving this ICMPv6 message MUST notify the upper-layer
protocol.
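
On the receiving side, the Pointer can be used to look up the offending octet when it falls inside the portion of the invoking packet that was returned. The sketch below is illustrative; the function name and return shape are assumptions.

```python
import struct


def parse_parameter_problem(message: bytes):
    """Return (code, pointer, offending octet or None) for a Type 4 message."""
    if len(message) < 8:
        raise ValueError("message shorter than the Parameter Problem header")
    msg_type, code, _checksum, pointer = struct.unpack("!BBHI", message[:8])
    if msg_type != 4:
        raise ValueError("not a Parameter Problem message")
    invoking = message[8:]
    # The pointer may reach beyond what fit within the 576-octet limit.
    offending = invoking[pointer] if pointer < len(invoking) else None
    return code, pointer, offending
```

With Code 1 and Pointer 40, the offending octet is the first octet after the 40-octet IPv6 header of the invoking packet.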

4. ICMPv6 Informational Messages

4.1 Echo Request Message

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Identifier          |        Sequence Number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Data ...
+-+-+-+-+-

IPv6 Fields:

Destination Address
               Any legal IPv6 address.

ICMPv6 Fields:

Type           128

Code           0

Identifier     An identifier to aid in matching Echo Replies
               to this Echo Request.  May be zero.

Sequence Number
               A sequence number to aid in matching Echo Replies
               to this Echo Request.  May be zero.

Data           Zero or more octets of arbitrary data.

Description

Every node MUST implement an ICMPv6 Echo responder function that
receives Echo Requests and sends corresponding Echo Replies. A node
SHOULD also implement an application-layer interface for sending Echo
Requests and receiving Echo Replies, for diagnostic purposes.

Upper layer notification

A node receiving this ICMPv6 message MAY notify the upper-layer
protocol.
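
A minimal sketch of encoding an Echo Request. The function name is hypothetical, and the checksum is left at zero here because computing it requires the IPv6 source and destination addresses.

```python
import struct


def echo_request(identifier: int, sequence: int, data: bytes = b"") -> bytes:
    """Encode Type 128, Code 0, with a zero checksum placeholder."""
    return struct.pack("!BBHHH", 128, 0, 0, identifier, sequence) + data
```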

4.2 Echo Reply Message

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Identifier          |        Sequence Number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Data ...
+-+-+-+-+-

IPv6 Fields:

Destination Address
               Copied from the Source Address field of the invoking
               Echo Request packet.

ICMPv6 Fields:

Type           129

Code           0

Identifier     The identifier from the invoking Echo Request message.

Sequence Number
               The sequence number from the invoking Echo Request
               message.

Data           The data from the invoking Echo Request message.

Description

Every node MUST implement an ICMPv6 Echo responder function that
receives Echo Requests and sends corresponding Echo Replies. A node
SHOULD also implement an application-layer interface for sending Echo
Requests and receiving Echo Replies, for diagnostic purposes.

The source address of an Echo Reply sent in response to a unicast
Echo Request message MUST be the same as the destination address of
that Echo Request message.

An Echo Reply SHOULD be sent in response to an Echo Request message
sent to an IPv6 multicast address. The source address of the reply
MUST be a unicast address belonging to the interface on which the
multicast Echo Request message was received.

The data received in the ICMPv6 Echo Request message MUST be returned
entirely and unmodified in the ICMPv6 Echo Reply message, unless the
Echo Reply would exceed the MTU of the path back to the Echo
requester, in which case the data is truncated to fit that path MTU.

Upper layer notification

Echo Reply messages MUST be passed to the ICMPv6 user interface,
unless the corresponding Echo Request originated in the IP layer.
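
An Echo responder essentially copies the Identifier, Sequence Number and Data into a Type 129 message. The sketch below shows that step only; the function name is hypothetical, truncation to the path MTU is omitted, and the checksum is left at zero for the caller to compute.

```python
import struct


def echo_reply(request: bytes) -> bytes:
    """Build the Echo Reply corresponding to an Echo Request message."""
    if len(request) < 8:
        raise ValueError("Echo Request shorter than its header")
    msg_type, code, _checksum, ident, seq = struct.unpack("!BBHHH", request[:8])
    if msg_type != 128 or code != 0:
        raise ValueError("not an Echo Request")
    # Identifier, Sequence Number and Data are returned unmodified.
    return struct.pack("!BBHHH", 129, 0, 0, ident, seq) + request[8:]
```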

4.3 Group Membership Messages

The ICMPv6 Group Membership Messages have the following format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Maximum Response Delay    |          Unused               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                          Multicast                            |
+                                                               +
|                           Address                             |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

IPv6 Fields:

Destination Address

In a Group Membership Query message, the multicast
address of the group being queried, or the Link-Local
All-Nodes multicast address.

In a Group Membership Report or a Group Membership
Reduction message, the multicast address of the
group being reported or terminated.

Hop Limit 1

ICMPv6 Fields:

Type           130 - Group Membership Query
               131 - Group Membership Report
               132 - Group Membership Reduction

Code 0

Maximum Response Delay

In Query messages, the maximum time that responding
Report messages may be delayed, in milliseconds.

In Report and Reduction messages, this field is
initialized to zero by the sender and ignored by
receivers.

Unused Initialized to zero by the sender; ignored by receivers.

Multicast Address

The address of the multicast group about which the
message is being sent. In Query messages, the Multicast
Address field may be zero, implying a query for all
groups.

Description

The ICMPv6 Group Membership messages are used to convey information
about multicast group membership from nodes to their neighboring
routers. The details of their usage are given in [RFC-1112].
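
The fixed 24-octet layout described above could be packed and unpacked
along the following lines. This is only a sketch: the function names and
the LAYOUT constant are hypothetical, and the checksum is left at zero
for the sender to fill in.

```python
import ipaddress
import struct

GROUP_QUERY, GROUP_REPORT, GROUP_REDUCTION = 130, 131, 132
# type, code, checksum, maximum response delay, unused, multicast address
LAYOUT = struct.Struct("!BBHHH16s")


def pack_group_membership(msg_type: int, group: str, max_delay_ms: int = 0) -> bytes:
    """Build a Group Membership message with a zero checksum."""
    if msg_type not in (GROUP_QUERY, GROUP_REPORT, GROUP_REDUCTION):
        raise ValueError("not a Group Membership message type")
    if msg_type != GROUP_QUERY:
        max_delay_ms = 0   # the delay is only meaningful in Query messages
    address = ipaddress.IPv6Address(group).packed
    # Code and Unused are zero; the checksum must be filled in before sending.
    return LAYOUT.pack(msg_type, 0, 0, max_delay_ms, 0, address)


def unpack_group_membership(message: bytes) -> dict:
    """Split a Group Membership message into its named fields."""
    msg_type, code, checksum, delay, _unused, address = LAYOUT.unpack(
        message[:LAYOUT.size])
    return {
        "type": msg_type,
        "code": code,
        "checksum": checksum,
        "max_response_delay_ms": delay,
        "multicast_address": str(ipaddress.IPv6Address(address)),
    }
```

A Query for all groups would pass "::" as the group, which yields the
all-zero Multicast Address field.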

5. References

[IPv6] Deering, S., and R. Hinden, "Internet Protocol, Version
6, Specification", RFC 1883, Xerox PARC, Ipsilon
Networks, December 1995.

[IPv6-ADDR] Hinden, R., and S. Deering, Editors, "IP Version 6
Addressing Architecture", RFC 1884, Ipsilon Networks,
Xerox PARC, December 1995.

[IPv6-DISC] Narten, T., Nordmark, E., and W. Simpson, "Neighbor
Discovery for IP Version 6 (IPv6)", Work in Progress.

[RFC-792] Postel, J., "Internet Control Message Protocol", STD 5,
RFC 792, USC/Information Sciences Institute, September
1981.

[RFC-1112] Deering, S., "Host Extensions for IP Multicasting", STD
5, RFC 1112, Stanford University, August 1989.

[RFC-1122] Braden, R., "Requirements for Internet Hosts -
Communication Layers", STD 3, RFC 1122, USC/Information
Sciences Institute, October 1989.

[RFC-1191] Mogul, J., and S. Deering, "Path MTU Discovery", RFC
1191, DECWRL, Stanford University, November 1990.

6. Acknowledgements

The document is derived from previous ICMP drafts of the SIPP and
IPng working group.

The IPng working group and particularly Robert Elz, Jim Bound, Bill
Simpson, Thomas Narten, Charlie Lynn, Bill Fink, and Scott Bradner
(in chronological order) provided extensive review information and
feedback.

7. Security Considerations

Security issues are not discussed in this memo.

Authors' Addresses:

Alex Conta                         Stephen Deering
Digital Equipment Corporation      Xerox Palo Alto Research Center
110 Spitbrook Rd                   3333 Coyote Hill Road
Nashua, NH 03062                   Palo Alto, CA 94304

Phone: +1-603-881-0744             Phone: +1-415-812-4839
EMail: conta@zk3.dec.com           EMail: deering@parc.xerox.com


RFC 792 – Internet Control Message Protocol


Network Working Group                                          J. Postel
Request for Comments:  792                                           ISI
                                                          September 1981
Updates:  RFCs 777, 760
Updates:  IENs 109, 128

                   INTERNET CONTROL MESSAGE PROTOCOL

                         DARPA INTERNET PROGRAM
                         PROTOCOL SPECIFICATION

Introduction

   The Internet Protocol (IP) [1] is used for host-to-host datagram
   service in a system of interconnected networks called the
   Catenet [2].  The network connecting devices are called Gateways.
   These gateways communicate between themselves for control purposes
   via a Gateway to Gateway Protocol (GGP) [3,4].  Occasionally a
   gateway or destination host will communicate with a source host, for
   example, to report an error in datagram processing.  For such
   purposes this protocol, the Internet Control Message Protocol (ICMP),
   is used.  ICMP uses the basic support of IP as if it were a higher
   level protocol; however, ICMP is actually an integral part of IP, and
   must be implemented by every IP module.

   ICMP messages are sent in several situations:  for example, when a
   datagram cannot reach its destination, when the gateway does not have
   the buffering capacity to forward a datagram, and when the gateway
   can direct the host to send traffic on a shorter route.

   The Internet Protocol is not designed to be absolutely reliable.  The
   purpose of these control messages is to provide feedback about
   problems in the communication environment, not to make IP reliable.
   There are still no guarantees that a datagram will be delivered or a
   control message will be returned.  Some datagrams may still be
   undelivered without any report of their loss.  The higher level
   protocols that use IP must implement their own reliability procedures
   if reliable communication is required.

   The ICMP messages typically report errors in the processing of
   datagrams.  To avoid the infinite regress of messages about messages
   etc., no ICMP messages are sent about ICMP messages.  Also ICMP
   messages are only sent about errors in handling fragment zero of
   fragmented datagrams.  (Fragment zero has the fragment offset equal
   to zero.)

Message Formats

   ICMP messages are sent using the basic IP header.  The first octet of
   the data portion of the datagram is an ICMP type field; the value of
   this field determines the format of the remaining data.  Any field
   labeled "unused" is reserved for later extensions and must be zero
   when sent, but receivers should not use these fields (except to
   include them in the checksum).  Unless otherwise noted under the
   individual format descriptions, the values of the internet header
   fields are as follows:

   Version

      4

   IHL

      Internet header length in 32-bit words.

   Type of Service

      0

   Total Length

      Length of internet header and data in octets.

   Identification, Flags, Fragment Offset

      Used in fragmentation, see [1].

   Time to Live

      Time to live in seconds; as this field is decremented at each
      machine in which the datagram is processed, the value in this
      field should be at least as great as the number of gateways which
      this datagram will traverse.

   Protocol

      ICMP = 1

   Header Checksum

      The 16 bit one's complement of the one's complement sum of all 16
      bit words in the header.  For computing the checksum, the checksum
      field should be zero.  This checksum may be replaced in the
      future.

   Source Address

      The address of the gateway or host that composes the ICMP message.
      Unless otherwise noted, this can be any of a gateway's addresses.

   Destination Address

      The address of the gateway or host to which the message should be
      sent.

Destination Unreachable Message

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             unused                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Internet Header + 64 bits of Original Data Datagram      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IP Fields:

   Destination Address

      The source network and address from the original datagram's data.

   ICMP Fields:

   Type

      3

   Code

      0 = net unreachable;

      1 = host unreachable;

      2 = protocol unreachable;

      3 = port unreachable;

      4 = fragmentation needed and DF set;

      5 = source route failed.

   Checksum

      The checksum is the 16-bit one's complement of the one's
      complement sum of the ICMP message starting with the ICMP Type.
      For computing the checksum, the checksum field should be zero.
      This checksum may be replaced in the future.

   Internet Header + 64 bits of Data Datagram

      The internet header plus the first 64 bits of the original
      datagram's data.  This data is used by the host to match the
      message to the appropriate process.  If a higher level protocol
      uses port numbers, they are assumed to be in the first 64 data
      bits of the original datagram's data.
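
      A receiving host might recover the addresses and port numbers
      from the quoted datagram roughly as follows.  This is a sketch
      only: parse_quoted_datagram is a hypothetical name, and the code
      assumes the higher level protocol begins with a 16-bit source
      port followed by a 16-bit destination port, as TCP and UDP do.

```python
import ipaddress
import struct


def parse_quoted_datagram(quoted: bytes) -> dict:
    """`quoted` is the part of the ICMP message after its first 8 octets."""
    header_len = (quoted[0] & 0x0F) * 4          # IHL counts 32-bit words
    if header_len < 20 or len(quoted) < header_len + 8:
        raise ValueError("quoted datagram is truncated")
    first_64_bits = quoted[header_len:header_len + 8]
    # Assumption: the transport header starts with two 16-bit port numbers.
    src_port, dst_port = struct.unpack("!HH", first_64_bits[:4])
    return {
        "protocol": quoted[9],
        "source": str(ipaddress.IPv4Address(quoted[12:16])),
        "destination": str(ipaddress.IPv4Address(quoted[16:20])),
        "source_port": src_port,
        "destination_port": dst_port,
    }
```

      The protocol number and port pair are what a host would use to
      find the process that sent the original datagram.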

   Description

      If, according to the information in the gateway's routing tables,
      the network specified in the internet destination field of a
      datagram is unreachable, e.g., the distance to the network is
      infinity, the gateway may send a destination unreachable message
      to the internet source host of the datagram.  In addition, in some
      networks, the gateway may be able to determine if the internet
      destination host is unreachable.  Gateways in these networks may
      send destination unreachable messages to the source host when the
      destination host is unreachable.

      If, in the destination host, the IP module cannot deliver the
      datagram  because the indicated protocol module or process port is
      not active, the destination host may send a destination
      unreachable message to the source host.

      Another case is when a datagram must be fragmented to be forwarded
      by a gateway yet the Don't Fragment flag is on.  In this case the
      gateway must discard the datagram and may return a destination
      unreachable message.

      Codes 0, 1, 4, and 5 may be received from a gateway.  Codes 2 and
      3 may be received from a host.

Time Exceeded Message

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             unused                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Internet Header + 64 bits of Original Data Datagram      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IP Fields:

   Destination Address

      The source network and address from the original datagram's data.

   ICMP Fields:

   Type

      11

   Code

      0 = time to live exceeded in transit;

      1 = fragment reassembly time exceeded.

   Checksum

      The checksum is the 16-bit one's complement of the one's
      complement sum of the ICMP message starting with the ICMP Type.
      For computing the checksum, the checksum field should be zero.
      This checksum may be replaced in the future.

   Internet Header + 64 bits of Data Datagram

      The internet header plus the first 64 bits of the original
      datagram's data.  This data is used by the host to match the
      message to the appropriate process.  If a higher level protocol
      uses port numbers, they are assumed to be in the first 64 data
      bits of the original datagram's data.

   Description

      If the gateway processing a datagram finds the time to live field
      is zero it must discard the datagram.  The gateway may also notify
      the source host via the time exceeded message.

      If a host reassembling a fragmented datagram cannot complete the
      reassembly due to missing fragments within its time limit it
      discards the datagram, and it may send a time exceeded message.

      If fragment zero is not available then no time exceeded need be
      sent at all.

      Code 0 may be received from a gateway.  Code 1 may be received
      from a host.
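
      The message layout above can be assembled as in the following
      sketch, which quotes the internet header plus the first 64 bits
      of data from the discarded datagram.  The names
      build_time_exceeded and icmp_checksum are hypothetical.

```python
import struct

TIME_EXCEEDED = 11
TTL_EXCEEDED_IN_TRANSIT, REASSEMBLY_TIME_EXCEEDED = 0, 1


def icmp_checksum(message: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the message."""
    if len(message) % 2:
        message += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", message):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return ~total & 0xFFFF


def build_time_exceeded(original: bytes,
                        code: int = TTL_EXCEEDED_IN_TRANSIT) -> bytes:
    """`original` is the discarded datagram, starting at its internet header."""
    header_len = (original[0] & 0x0F) * 4
    quoted = original[:header_len + 8]       # internet header + 64 bits of data
    # Type, Code, zero Checksum placeholder, 32-bit unused field.
    message = struct.pack("!BBHI", TIME_EXCEEDED, code, 0, 0) + quoted
    return message[:2] + struct.pack("!H", icmp_checksum(message)) + message[4:]
```

      The result is the ICMP message only; the IP layer would address
      it to the source of the original datagram.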

Parameter Problem Message

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Pointer    |                   unused                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Internet Header + 64 bits of Original Data Datagram      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IP Fields:

   Destination Address

      The source network and address from the original datagram's data.

   ICMP Fields:

   Type

      12

   Code

      0 = pointer indicates the error.

   Checksum

      The checksum is the 16-bit one's complement of the one's
      complement sum of the ICMP message starting with the ICMP Type.
      For computing the checksum, the checksum field should be zero.
      This checksum may be replaced in the future.

   Pointer

      If code = 0, identifies the octet where an error was detected.

   Internet Header + 64 bits of Data Datagram

      The internet header plus the first 64 bits of the original
      datagram's data.  This data is used by the host to match the
      message to the appropriate process.  If a higher level protocol
      uses port numbers, they are assumed to be in the first 64 data
      bits of the original datagram's data.

   Description

      If the gateway or host processing a datagram finds a problem with
      the header parameters such that it cannot complete processing the
      datagram it must discard the datagram.  One potential source of
      such a problem is with incorrect arguments in an option.  The
      gateway or host may also notify the source host via the parameter
      problem message.  This message is only sent if the error caused
      the datagram to be discarded.

      The pointer identifies the octet of the original datagram's header
      where the error was detected (it may be in the middle of an
      option).  For example, 1 indicates something is wrong with the
      Type of Service, and (if there are options present) 20 indicates
      something is wrong with the type code of the first option.
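
      To make the pointer values concrete, the sketch below maps a
      pointer to the internet header field that contains that octet.
      The helper name describe_pointer and the field labels are
      illustrative choices; the offsets follow the internet header
      layout in [1].

```python
# (first octet, last octet, field name) for the fixed part of the internet header
_FIXED_FIELDS = [
    (0, 0, "Version / IHL"),
    (1, 1, "Type of Service"),
    (2, 3, "Total Length"),
    (4, 5, "Identification"),
    (6, 7, "Flags / Fragment Offset"),
    (8, 8, "Time to Live"),
    (9, 9, "Protocol"),
    (10, 11, "Header Checksum"),
    (12, 15, "Source Address"),
    (16, 19, "Destination Address"),
]


def describe_pointer(pointer: int, header_len: int = 20) -> str:
    """Name the header field containing the octet at `pointer`."""
    if not 0 <= pointer < header_len:
        raise ValueError("pointer lies outside the internet header")
    for first, last, name in _FIXED_FIELDS:
        if first <= pointer <= last:
            return name
    # Anything past octet 19 falls in the options area.
    return f"Options (octet {pointer - 20} of the option area)"
```

      With a 24-octet header, describe_pointer(20, header_len=24)
      reports octet 0 of the option area, which matches the second
      example in the text.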

      Code 0 may be received from a gateway or a host.

Source Quench Message

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             unused                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Internet Header + 64 bits of Original Data Datagram      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IP Fields:

   Destination Address

      The source network and address of the original datagram's data.

   ICMP Fields:

   Type

      4

   Code

      0

   Checksum

      The checksum is the 16-bit one's complement of the one's
      complement sum of the ICMP message starting with the ICMP Type.
      For computing the checksum, the checksum field should be zero.
      This checksum may be replaced in the future.

   Internet Header + 64 bits of Data Datagram

      The internet header plus the first 64 bits of the original
      datagram's data.  This data is used by the host to match the
      message to the appropriate process.  If a higher level protocol
      uses port numbers, they are assumed to be in the first 64 data
      bits of the original datagram's data.

   Description

      A gateway may discard internet datagrams if it does not have the
      buffer space needed to queue the datagrams for output to the next
      network on the route to the destination network.  If a gateway
      discards a datagram, it may send a source quench message to the
      internet source host of the datagram.  A destination host may also
      send a source quench message if datagrams arrive too fast to be
      processed.  The source quench message is a request to the host to
      cut back the rate at which it is sending traffic to the internet
      destination.  The gateway may send a source quench message for
      every message that it discards.  On receipt of a source quench
      message, the source host should cut back the rate at which it is
      sending traffic to the specified destination until it no longer
      receives source quench messages from the gateway.  The source host
      can then gradually increase the rate at which it sends traffic to
      the destination until it again receives source quench messages.
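
      The text does not say by how much a host should cut back or how
      quickly it should speed up again.  The sketch below picks one
      possible policy purely as an assumption: halve the rate on each
      source quench and add a fixed step after each interval without
      one.  The class name and all parameters are hypothetical.

```python
class QuenchRateController:
    """Assumed policy: halve on each source quench, add a fixed step when quiet."""

    def __init__(self, rate: float, floor: float = 1.0, step: float = 1.0,
                 ceiling: float = 1000.0):
        self.rate = rate          # current sending rate, e.g. datagrams/second
        self.floor = floor        # never drop below this rate
        self.step = step          # increase applied after a quiet interval
        self.ceiling = ceiling    # never exceed this rate

    def on_source_quench(self) -> float:
        """Cut back the rate when a source quench message arrives."""
        self.rate = max(self.floor, self.rate / 2)
        return self.rate

    def on_quiet_interval(self) -> float:
        """Gradually increase the rate when no source quench was received."""
        self.rate = min(self.ceiling, self.rate + self.step)
        return self.rate
```

      A host would keep one such controller per destination, since the
      request applies to traffic for the specified destination.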

      The gateway or host may send the source quench message when it
      approaches its capacity limit rather than waiting until the
      capacity is exceeded.  This means that the data datagram which
      triggered the source quench message may be delivered.

      Code 0 may be received from a gateway or a host.

Redirect Message

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Gateway Internet Address                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Internet Header + 64 bits of Original Data Datagram      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IP Fields:

   Destination Address

      The source network and address of the original datagram's data.

   ICMP Fields:

   Type

      5

   Code

      0 = Redirect datagrams for the Network.

      1 = Redirect datagrams for the Host.

      2 = Redirect datagrams for the Type of Service and Network.

      3 = Redirect datagrams for the Type of Service and Host.

   Checksum

      The checksum is the 16-bit one's complement of the one's
      complement sum of the ICMP message starting with the ICMP Type.
      For computing the checksum, the checksum field should be zero.
      This checksum may be replaced in the future.

   Gateway Internet Address

      Address of the gateway to which traffic for the network specified
      in the internet destination network field of the original
      datagram's data should be sent.

   Internet Header + 64 bits of Data Datagram

      The internet header plus the first 64 bits of the original
      datagram's data.  This data is used by the host to match the
      message to the appropriate process.  If a higher level protocol
      uses port numbers, they are assumed to be in the first 64 data
      bits of the original datagram's data.

   Description

      The gateway sends a redirect message to a host in the following
      situation.  A gateway, G1, receives an internet datagram from a
      host on a network to which the gateway is attached.  The gateway,
      G1, checks its routing table and obtains the address of the next
      gateway, G2, on the route to the datagram's internet destination
      network, X.  If G2 and the host identified by the internet source
      address of the datagram are on the same network, a redirect
      message is sent to the host.  The redirect message advises the
      host to send its traffic for network X directly to gateway G2 as
      this is a shorter path to the destination.  The gateway forwards
      the original datagram's data to its internet destination.

      For datagrams with the IP source route options and the gateway
      address in the destination address field, a redirect message is
      not sent even if there is a better route to the ultimate
      destination than the next address in the source route.
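
      The decision made by gateway G1 can be sketched as below.  The
      function name and its parameters are hypothetical, and the
      attached network is written in prefix notation only for
      convenience with Python's ipaddress module.

```python
import ipaddress


def should_send_redirect(source: str, next_gateway: str,
                         attached_network: str,
                         source_routed: bool = False) -> bool:
    """True if G1 should redirect the source host to the next gateway G2."""
    if source_routed:
        # Datagrams carrying a source route option never trigger a redirect.
        return False
    network = ipaddress.ip_network(attached_network)
    # Redirect only when the host and G2 are on the same attached network.
    return (ipaddress.ip_address(source) in network
            and ipaddress.ip_address(next_gateway) in network)
```

      Whatever the outcome, the gateway still forwards the original
      datagram toward its destination.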

      Codes 0, 1, 2, and 3 may be received from a gateway.

Echo or Echo Reply Message

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |     Code      |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Identifier          |        Sequence Number        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Data ...
   +-+-+-+-+-

   IP Fields:

   Addresses

      The address of the source in an echo message will be the
      destination of the echo reply message.  To form an echo reply
      message, the source and destination addresses are simply reversed,
      the type code changed to 0, and the checksum recomputed.

   ICMP Fields:

   Type

      8 for echo message;

      0 for echo reply message.

   Code

      0

   Checksum

      The checksum is the 16-bit one's complement of the one's
      complement sum of the ICMP message starting with the ICMP Type.
      For computing the checksum, the checksum field should be zero.
      If the total length is odd, the received data is padded with one
      octet of zeros for computing the checksum.  This checksum may be
      replaced in the future.

   Identifier

      If code = 0, an identifier to aid in matching echoes and replies,
      may be zero.

   Sequence Number

      If code = 0, a sequence number to aid in matching echoes and
      replies, may be zero.

   Description

      The data received in the echo message must be returned in the echo
      reply message.

      The identifier and sequence number may be used by the echo sender
      to aid in matching the replies with the echo requests.  For
      example, the identifier might be used like a port in TCP or UDP to
      identify a session, and the sequence number might be incremented
      on each echo request sent.  The echoer returns these same values
      in the echo reply.

      Code 0 may be received from a gateway or a host.
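
      The checksum rule and the formation of an echo reply can be
      sketched together as follows.  The function names are
      hypothetical, and only the ICMP message is handled here;
      reversing the source and destination addresses is left to the IP
      layer.

```python
import struct

ECHO, ECHO_REPLY = 8, 0


def icmp_checksum(message: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the message."""
    if len(message) % 2:
        message += b"\x00"                 # pad odd lengths with one zero octet
    total = 0
    for (word,) in struct.iter_unpack("!H", message):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return ~total & 0xFFFF


def make_echo_reply(echo: bytes) -> bytes:
    """Turn an echo message into the matching echo reply message."""
    if echo[0] != ECHO:
        raise ValueError("not an echo message")
    # Change the type to 0, zero the checksum field, keep everything else.
    reply = bytes([ECHO_REPLY]) + echo[1:2] + b"\x00\x00" + echo[4:]
    return reply[:2] + struct.pack("!H", icmp_checksum(reply)) + reply[4:]
```

      A receiver can verify a message by running icmp_checksum over it
      with the checksum field in place; a valid message yields zero.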

Timestamp or Timestamp Reply Message

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |      Code     |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Identifier          |        Sequence Number        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Originate Timestamp                                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Receive Timestamp                                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Transmit Timestamp                                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IP Fields:

   Addresses

      The address of the source in a timestamp message will be the
      destination of the timestamp reply message.  To form a timestamp
      reply message, the source and destination addresses are simply
      reversed, the type code changed to 14, and the checksum
      recomputed.

   ICMP Fields:

   Type

      13 for timestamp message;

      14 for timestamp reply message.

   Code

      0

   Checksum

      The checksum is the 16-bit one's complement of the one's
      complement sum of the ICMP message starting with the ICMP Type.
      For computing the checksum, the checksum field should be zero.
      This checksum may be replaced in the future.

   Identifier

      If code = 0, an identifier to aid in matching timestamp and
      replies, may be zero.

   Sequence Number

      If code = 0, a sequence number to aid in matching timestamp and
      replies, may be zero.

   Description

      The data received (a timestamp) in the message is returned in the
      reply together with an additional timestamp.  The timestamp is 32
      bits of milliseconds since midnight UT.  One use of these
      timestamps is described by Mills [5].

      The Originate Timestamp is the time the sender last touched the
      message before sending it, the Receive Timestamp is the time the
      echoer first touched it on receipt, and the Transmit Timestamp is
      the time the echoer last touched the message on sending it.

      If the time is not available in milliseconds or cannot be provided
      with respect to midnight UT then any time can be inserted in a
      timestamp provided the high order bit of the timestamp is also set
      to indicate this non-standard value.
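
      The two kinds of timestamp value can be produced as in the sketch
      below.  The function names are hypothetical.  Because a day has
      86,400,000 milliseconds, a standard value never reaches the high
      order bit, so that bit is free to mark non-standard values.

```python
from datetime import datetime, timedelta, timezone

NONSTANDARD_BIT = 0x80000000   # high order bit of the 32-bit timestamp


def standard_timestamp(moment: datetime) -> int:
    """Milliseconds since midnight UT; `moment` must be timezone-aware."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    midnight = utc.replace(hour=0, minute=0, second=0, microsecond=0)
    # Integer division of timedeltas avoids floating point rounding.
    return (utc - midnight) // timedelta(milliseconds=1)


def nonstandard_timestamp(value: int) -> int:
    """Mark an arbitrary 31-bit time value as non-standard."""
    return NONSTANDARD_BIT | (value & 0x7FFFFFFF)


def is_standard(timestamp: int) -> bool:
    """True if the high order bit is clear."""
    return not timestamp & NONSTANDARD_BIT
```

      A sender would call standard_timestamp(datetime.now(timezone.utc))
      for the Originate Timestamp when its clock is usable.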

      The identifier and sequence number may be used by the echo sender
      to aid in matching the replies with the requests.  For example,
      the identifier might be used like a port in TCP or UDP to identify
      a session, and the sequence number might be incremented on each
      request sent.  The destination returns these same values in the
      reply.

      Code 0 may be received from a gateway or a host.

Information Request or Information Reply Message

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |      Code     |          Checksum             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Identifier          |        Sequence Number        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   IP Fields:

   Addresses

      The address of the source in an information request message will be
      the destination of the information reply message.  To form an
      information reply message, the source and destination addresses
      are simply reversed, the type code changed to 16, and the checksum
      recomputed.

   ICMP Fields:

   Type

      15 for information request message;

      16 for information reply message.

   Code

      0

   Checksum

      The checksum is the 16-bit one's complement of the one's
      complement sum of the ICMP message starting with the ICMP Type.
      For computing the checksum, the checksum field should be zero.
      This checksum may be replaced in the future.

   Identifier

      If code = 0, an identifier to aid in matching request and replies,
      may be zero.

   Sequence Number

      If code = 0, a sequence number to aid in matching request and
      replies, may be zero.

   Description

      This message may be sent with the source network in the IP header
      source and destination address fields zero (which means "this"
      network).  The replying IP module should send the reply with the
      addresses fully specified.  This message is a way for a host to
      find out the number of the network it is on.

      The identifier and sequence number may be used by the echo sender
      to aid in matching the replies with the requests.  For example,
      the identifier might be used like a port in TCP or UDP to identify
      a session, and the sequence number might be incremented on each
      request sent.  The destination returns these same values in the
      reply.

      Code 0 may be received from a gateway or a host.

Summary of Message Types

    0  Echo Reply

    3  Destination Unreachable

    4  Source Quench

    5  Redirect

    8  Echo

   11  Time Exceeded

   12  Parameter Problem

   13  Timestamp

   14  Timestamp Reply

   15  Information Request

   16  Information Reply
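
   The table above translates directly into an enumeration.  The sketch
   below is one possible encoding; the class and helper names are
   hypothetical, and the two lookup sets simply restate which messages
   are formed as replies and which formats quote the original datagram.

```python
from enum import IntEnum


class IcmpType(IntEnum):
    ECHO_REPLY = 0
    DESTINATION_UNREACHABLE = 3
    SOURCE_QUENCH = 4
    REDIRECT = 5
    ECHO = 8
    TIME_EXCEEDED = 11
    PARAMETER_PROBLEM = 12
    TIMESTAMP = 13
    TIMESTAMP_REPLY = 14
    INFORMATION_REQUEST = 15
    INFORMATION_REPLY = 16


# Request types and the reply type formed from each.
REPLY_TYPE = {
    IcmpType.ECHO: IcmpType.ECHO_REPLY,
    IcmpType.TIMESTAMP: IcmpType.TIMESTAMP_REPLY,
    IcmpType.INFORMATION_REQUEST: IcmpType.INFORMATION_REPLY,
}

# Types whose format carries the internet header + 64 bits of original data.
QUOTES_ORIGINAL_DATAGRAM = frozenset({
    IcmpType.DESTINATION_UNREACHABLE,
    IcmpType.SOURCE_QUENCH,
    IcmpType.REDIRECT,
    IcmpType.TIME_EXCEEDED,
    IcmpType.PARAMETER_PROBLEM,
})


def icmp_type_of(message: bytes) -> IcmpType:
    """Read the type from the first octet; ValueError for unlisted types."""
    return IcmpType(message[0])
```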

References

   [1]   Postel, J. (ed.), "Internet Protocol - DARPA Internet Program
         Protocol Specification", RFC 791, USC/Information Sciences
         Institute, September 1981.

   [2]   Cerf, V., "The Catenet Model for Internetworking", IEN 48,
         Information Processing Techniques Office, Defense Advanced
         Research Projects Agency, July 1978.

   [3]   Strazisar, V., "Gateway Routing:  An Implementation
         Specification", IEN 30, Bolt Beranek and Newman, April 1979.

   [4]   Strazisar, V., "How to Build a Gateway", IEN 109, Bolt Beranek
         and Newman, August 1979.

   [5]   Mills, D., "DCNET Internet Clock Service", RFC 778, COMSAT
         Laboratories, April 1981.

RFC 2373 – IP Version 6 Addressing Architecture

 
Network Working Group                                          R. Hinden
Request for Comments: 2373                                         Nokia
Obsoletes: 1884                                               S. Deering
Category: Standards Track                                  Cisco Systems
                                                               July 1998

IP Version 6 Addressing Architecture

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1998). All Rights Reserved.

Abstract

This specification defines the addressing architecture of the IP
Version 6 protocol [IPV6]. The document includes the IPv6 addressing
model, text representations of IPv6 addresses, definition of IPv6
unicast addresses, anycast addresses, and multicast addresses, and an
IPv6 node's required addresses.

Table of Contents

1. Introduction
2. IPv6 Addressing
   2.1 Addressing Model
   2.2 Text Representation of Addresses
   2.3 Text Representation of Address Prefixes
   2.4 Address Type Representation
   2.5 Unicast Addresses
       2.5.1 Interface Identifiers
       2.5.2 The Unspecified Address
       2.5.3 The Loopback Address
       2.5.4 IPv6 Addresses with Embedded IPv4 Addresses
       2.5.5 NSAP Addresses
       2.5.6 IPX Addresses
       2.5.7 Aggregatable Global Unicast Addresses
       2.5.8 Local-use IPv6 Unicast Addresses
   2.6 Anycast Addresses
       2.6.1 Required Anycast Address
   2.7 Multicast Addresses
       2.7.1 Pre-Defined Multicast Addresses
       2.7.2 Assignment of New IPv6 Multicast Addresses
   2.8 A Node's Required Addresses
3. Security Considerations
APPENDIX A: Creating EUI-64 based Interface Identifiers
APPENDIX B: ABNF Description of Text Representations
APPENDIX C: CHANGES FROM RFC-1884
REFERENCES
AUTHORS' ADDRESSES
FULL COPYRIGHT STATEMENT

1.0 INTRODUCTION

This specification defines the addressing architecture of the IP
Version 6 protocol. It includes a detailed description of the
currently defined address formats for IPv6 [IPV6].

The authors would like to acknowledge the contributions of Paul
Francis, Scott Bradner, Jim Bound, Brian Carpenter, Matt Crawford,
Deborah Estrin, Roger Fajman, Bob Fink, Peter Ford, Bob Gilligan,
Dimitry Haskin, Tom Harsch, Christian Huitema, Tony Li, Greg
Minshall, Thomas Narten, Erik Nordmark, Yakov Rekhter, Bill Simpson,
and Sue Thomson.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC 2119].

2.0 IPv6 ADDRESSING

IPv6 addresses are 128-bit identifiers for interfaces and sets of
interfaces. There are three types of addresses:

Unicast:   An identifier for a single interface. A packet sent to
           a unicast address is delivered to the interface
           identified by that address.

Anycast:   An identifier for a set of interfaces (typically
           belonging to different nodes). A packet sent to an
           anycast address is delivered to one of the interfaces
           identified by that address (the "nearest" one, according
           to the routing protocols' measure of distance).

Multicast: An identifier for a set of interfaces (typically
           belonging to different nodes). A packet sent to a
           multicast address is delivered to all interfaces
           identified by that address.

There are no broadcast addresses in IPv6, their function being
superseded by multicast addresses.

In this document, fields in addresses are given a specific name, for
example "subscriber". When this name is used with the term "ID" for
identifier after the name (e.g., "subscriber ID"), it refers to the
contents of the named field. When it is used with the term "prefix"
(e.g. "subscriber prefix") it refers to all of the address up to and
including this field.

In IPv6, all zeros and all ones are legal values for any field,
unless specifically excluded. Specifically, prefixes may contain
zero-valued fields or end in zeros.

2.1 Addressing Model

IPv6 addresses of all types are assigned to interfaces, not nodes.
An IPv6 unicast address refers to a single interface. Since each
interface belongs to a single node, any of that node's interfaces'
unicast addresses may be used as an identifier for the node.

All interfaces are required to have at least one link-local unicast
address (see section 2.8 for additional required addresses). A
single interface may also be assigned multiple IPv6 addresses of any
type (unicast, anycast, and multicast) or scope. Unicast addresses
with scope greater than link-scope are not needed for interfaces that
are not used as the origin or destination of any IPv6 packets to or
from non-neighbors. This is sometimes convenient for point-to-point
interfaces. There is one exception to this addressing model:

A unicast address or a set of unicast addresses may be assigned to
multiple physical interfaces if the implementation treats the
multiple physical interfaces as one interface when presenting it to
the internet layer. This is useful for load-sharing over multiple
physical interfaces.

Currently IPv6 continues the IPv4 model that a subnet prefix is
associated with one link. Multiple subnet prefixes may be assigned
to the same link.

2.2 Text Representation of Addresses

There are three conventional forms for representing IPv6 addresses as
text strings:

1. The preferred form is x:x:x:x:x:x:x:x, where the 'x's are the
hexadecimal values of the eight 16-bit pieces of the address.
Examples:

FEDC:BA98:7654:3210:FEDC:BA98:7654:3210

1080:0:0:0:8:800:200C:417A

Note that it is not necessary to write the leading zeros in an
individual field, but there must be at least one numeral in every
field (except for the case described in 2.).

2. Due to some methods of allocating certain styles of IPv6
addresses, it will be common for addresses to contain long strings
of zero bits. In order to make writing addresses containing zero
bits easier, a special syntax is available to compress the zeros.
The use of "::" indicates multiple groups of 16 bits of zeros.
The "::" can only appear once in an address. The "::" can also be
used to compress the leading and/or trailing zeros in an address.

For example, the following addresses:

   1080:0:0:0:8:800:200C:417A   a unicast address
   FF01:0:0:0:0:0:0:101         a multicast address
   0:0:0:0:0:0:0:1              the loopback address
   0:0:0:0:0:0:0:0              the unspecified address

may be represented as:

   1080::8:800:200C:417A        a unicast address
   FF01::101                    a multicast address
   ::1                          the loopback address
   ::                           the unspecified address

3. An alternative form that is sometimes more convenient when dealing
with a mixed environment of IPv4 and IPv6 nodes is
x:x:x:x:x:x:d.d.d.d, where the 'x's are the hexadecimal values of
the six high-order 16-bit pieces of the address, and the 'd's are
the decimal values of the four low-order 8-bit pieces of the
address (standard IPv4 representation). Examples:

0:0:0:0:0:0:13.1.68.3

0:0:0:0:0:FFFF:129.144.52.38

or in compressed form:

::13.1.68.3

::FFFF:129.144.52.38
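
The three text forms above can be checked mechanically. The following
is a minimal sketch, assuming Python's standard ipaddress module is an
acceptable stand-in for a parser; the helper name text_forms is
hypothetical and not part of this specification. Note that the module
prints hexadecimal digits in lower case.

```python
import ipaddress

def text_forms(address: str) -> dict:
    """Return the full and "::"-compressed spellings of one IPv6 address."""
    addr = ipaddress.IPv6Address(address)
    return {
        "exploded": addr.exploded,      # all eight groups, leading zeros kept
        "compressed": addr.compressed,  # longest run of zero groups as "::"
        "value": int(addr),             # the underlying 128-bit integer
    }

forms = text_forms("1080:0:0:0:8:800:200C:417A")
print(forms["compressed"])  # 1080::8:800:200c:417a
print(forms["exploded"])    # 1080:0000:0000:0000:0008:0800:200c:417a

# Form 3: the dotted-decimal tail is another spelling of the low 32 bits
# (129.144.52.38 is 0x81 0x90 0x34 0x26).
mixed = ipaddress.IPv6Address("::FFFF:129.144.52.38")
print(mixed == ipaddress.IPv6Address("::FFFF:8190:3426"))  # True
```

Comparing parsed addresses rather than strings avoids depending on
which of the legal spellings a given implementation chooses to print.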

2.3 Text Representation of Address Prefixes

The text representation of IPv6 address prefixes is similar to the
way IPv4 address prefixes are written in CIDR notation. An IPv6
address prefix is represented by the notation:

   ipv6-address/prefix-length

where

   ipv6-address    is an IPv6 address in any of the notations listed
                   in section 2.2.

   prefix-length   is a decimal value specifying how many of the
                   leftmost contiguous bits of the address comprise
                   the prefix.

For example, the following are legal representations of the 60-bit
prefix 12AB00000000CD3 (hexadecimal):

12AB:0000:0000:CD30:0000:0000:0000:0000/60
12AB::CD30:0:0:0:0/60
12AB:0:0:CD30::/60

The following are NOT legal representations of the above prefix:

   12AB:0:0:CD3/60   may drop leading zeros, but not trailing zeros,
                     within any 16-bit chunk of the address

   12AB::CD30/60     address to left of "/" expands to
                     12AB:0000:0000:0000:0000:0000:0000:CD30

   12AB::CD3/60      address to left of "/" expands to
                     12AB:0000:0000:0000:0000:0000:0000:0CD3

When writing both a node address and a prefix of that node address
(e.g., the node's subnet prefix), the two can be combined as follows:

   the node address      12AB:0:0:CD30:123:4567:89AB:CDEF
   and its subnet number 12AB:0:0:CD30::/60

   can be abbreviated as 12AB:0:0:CD30:123:4567:89AB:CDEF/60
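
The legal and illegal prefix spellings above can be exercised with a
short sketch. It assumes Python's standard ipaddress module, whose
strict network parsing happens to reject text with bits set beyond the
prefix length; the helper name is_clean_prefix is hypothetical.

```python
import ipaddress

# Three legal spellings of the same 60-bit prefix.
spellings = [
    "12AB:0000:0000:CD30:0000:0000:0000:0000/60",
    "12AB::CD30:0:0:0:0/60",
    "12AB:0:0:CD30::/60",
]
networks = {ipaddress.IPv6Network(s) for s in spellings}
print(len(networks))  # 1

def is_clean_prefix(text: str) -> bool:
    """True if text parses and has no bits set beyond the prefix length."""
    try:
        ipaddress.IPv6Network(text)  # strict=True is the default
        return True
    except ValueError:
        return False

# The three spellings listed as NOT legal for this prefix.
for bad in ("12AB:0:0:CD3/60", "12AB::CD30/60", "12AB::CD3/60"):
    print(bad, is_clean_prefix(bad))  # False each time

# The combined node-address/prefix form splits back into its two parts.
iface = ipaddress.IPv6Interface("12AB:0:0:CD30:123:4567:89AB:CDEF/60")
print(iface.network == ipaddress.IPv6Network("12AB:0:0:CD30::/60"))  # True
```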

2.4 Address Type Representation

The specific type of an IPv6 address is indicated by the leading bits
in the address. The variable-length field comprising these leading
bits is called the Format Prefix (FP). The initial allocation of
these prefixes is as follows:

Allocation                              Prefix         Fraction of
                                        (binary)       Address Space
-------------------------------------   ------------   -------------
Reserved                                0000 0000      1/256
Unassigned                              0000 0001      1/256

Reserved for NSAP Allocation            0000 001       1/128
Reserved for IPX Allocation             0000 010       1/128

Unassigned                              0000 011       1/128
Unassigned                              0000 1         1/32
Unassigned                              0001           1/16

Aggregatable Global Unicast Addresses   001            1/8
Unassigned                              010            1/8
Unassigned                              011            1/8
Unassigned                              100            1/8
Unassigned                              101            1/8
Unassigned                              110            1/8

Unassigned                              1110           1/16
Unassigned                              1111 0         1/32
Unassigned                              1111 10        1/64
Unassigned                              1111 110       1/128
Unassigned                              1111 1110 0    1/512

Link-Local Unicast Addresses            1111 1110 10   1/1024
Site-Local Unicast Addresses            1111 1110 11   1/1024

Multicast Addresses                     1111 1111      1/256

Notes:

(1) The "unspecified address" (see section 2.5.2), the loopback
address (see section 2.5.3), and the IPv6 Addresses with
Embedded IPv4 Addresses (see section 2.5.4), are assigned out
of the 0000 0000 format prefix space.

(2) The format prefixes 001 through 111, except for Multicast
Addresses (1111 1111), are all required to have 64-bit
interface identifiers in EUI-64 format. See section 2.5.1 for
definitions.

This allocation supports the direct allocation of aggregation
addresses, local use addresses, and multicast addresses. Space is
reserved for NSAP addresses and IPX addresses. The remainder of the
address space is unassigned for future use. This can be used for
expansion of existing use (e.g., additional aggregatable addresses,
etc.) or new uses (e.g., separate locators and identifiers). Fifteen
percent of the address space is initially allocated. The remaining
85% is reserved for future use.

Unicast addresses are distinguished from multicast addresses by the
value of the high-order octet of the addresses: a value of FF
(11111111) identifies an address as a multicast address; any other
value identifies an address as a unicast address. Anycast addresses
are taken from the unicast address space, and are not syntactically
distinguishable from unicast addresses.
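
As an illustration of how the Format Prefix table can be applied, the
sketch below matches the leading bits of an address against the
assigned prefixes and also recomputes the "fifteen percent" figure. It
is only a sketch: the table constant and the function name allocation
are hypothetical, and Python's ipaddress module is assumed for parsing.

```python
import ipaddress
from fractions import Fraction

# (binary format prefix, allocation) pairs copied from the table above.
# Unassigned ranges are left out and fall through to "Unassigned".
FORMAT_PREFIXES = [
    ("00000000", "Reserved"),
    ("0000001", "Reserved for NSAP Allocation"),
    ("0000010", "Reserved for IPX Allocation"),
    ("001", "Aggregatable Global Unicast Addresses"),
    ("1111111010", "Link-Local Unicast Addresses"),
    ("1111111011", "Site-Local Unicast Addresses"),
    ("11111111", "Multicast Addresses"),
]

def allocation(address: str) -> str:
    """Name the allocation whose format prefix matches the leading bits."""
    bits = format(int(ipaddress.IPv6Address(address)), "0128b")
    for prefix, name in FORMAT_PREFIXES:
        if bits.startswith(prefix):
            return name
    return "Unassigned"

# A prefix of n bits covers 1/2**n of the address space.
allocated = sum(Fraction(1, 2 ** len(p)) for p, _ in FORMAT_PREFIXES)

print(allocation("FE80::1"))  # Link-Local Unicast Addresses
print(allocation("FF02::1"))  # Multicast Addresses
print(float(allocated))       # 0.150390625
```

Because no listed prefix is a prefix of another, the order of the
entries does not affect the result.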

2.5 Unicast Addresses

IPv6 unicast addresses are aggregatable with contiguous bit-wise
masks similar to IPv4 addresses under Class-less Interdomain Routing
[CIDR].

There are several forms of unicast address assignment in IPv6,
including the aggregatable global unicast address, the NSAP
address, the IPX hierarchical address, the site-local address, the
link-local address, and the IPv4-capable host address. Additional
address types can be defined in the future.

IPv6 nodes may have considerable or little knowledge of the internal
structure of the IPv6 address, depending on the role the node plays
(for instance, host versus router). At a minimum, a node may
consider that unicast addresses (including its own) have no internal
structure:

|                           128 bits                              |
+-----------------------------------------------------------------+
|                          node address                           |
+-----------------------------------------------------------------+

A slightly sophisticated host (but still rather simple) may
additionally be aware of subnet prefix(es) for the link(s) it is
attached to, where different addresses may have different values for
n:

|                    n bits                      |   128-n bits   |
+------------------------------------------------+----------------+
|                subnet prefix                   |  interface ID  |
+------------------------------------------------+----------------+

Still more sophisticated hosts may be aware of other hierarchical
boundaries in the unicast address. Though a very simple router may
have no knowledge of the internal structure of IPv6 unicast
addresses, routers will more generally have knowledge of one or more
of the hierarchical boundaries for the operation of routing
protocols. The known boundaries will differ from router to router,
depending on what positions the router holds in the routing
hierarchy.

2.5.1 Interface Identifiers

Interface identifiers in IPv6 unicast addresses are used to identify
interfaces on a link. They are required to be unique on that link.
They may also be unique over a broader scope. In many cases an
interface's identifier will be the same as that interface's link-
layer address. The same interface identifier may be used on multiple
interfaces on a single node.

Note that the use of the same interface identifier on multiple
interfaces of a single node does not affect the interface
identifier's global uniqueness or the global uniqueness of each IPv6
address created using that interface identifier.

In a number of the format prefixes (see section 2.4) Interface IDs
are required to be 64 bits long and to be constructed in IEEE EUI-64
format [EUI64]. EUI-64 based Interface identifiers may have global
scope when a global token is available (e.g., IEEE 48bit MAC) or may
have local scope where a global token is not available (e.g., serial
links, tunnel end-points, etc.). It is required that the "u" bit
(universal/local bit in IEEE EUI-64 terminology) be inverted when
forming the interface identifier from the EUI-64. The "u" bit is set
to one (1) to indicate global scope, and it is set to zero (0) to
indicate local scope. The first three octets in binary of an EUI-64
identifier are as follows:

 0       0 0       1 1       2
|0       7 8       5 6       3|
+----+----+----+----+----+----+
|cccc|ccug|cccc|cccc|cccc|cccc|
+----+----+----+----+----+----+

written in Internet standard bit-order, where "u" is the
universal/local bit, "g" is the individual/group bit, and "c" are the
bits of the company_id. Appendix A: "Creating EUI-64 based Interface
Identifiers" provides examples on the creation of different EUI-64
based interface identifiers.

The motivation for inverting the "u" bit when forming the interface
identifier is to make it easy for system administrators to hand
configure local scope identifiers when hardware tokens are not
available. This is expected to be the case for serial links, tunnel
end-points, etc. The alternative would have been for these to be of the
form 0200:0:0:1, 0200:0:0:2, etc., instead of the much simpler ::1,
::2, etc.

The use of the universal/local bit in the IEEE EUI-64 identifier is
to allow development of future technology that can take advantage of
interface identifiers with global scope.

The details of forming interface identifiers are defined in the
appropriate "IPv6 over <link>" specification such as "IPv6 over
Ethernet" [ETHER], "IPv6 over FDDI" [FDDI], etc.

2.5.2 The Unspecified Address

The address 0:0:0:0:0:0:0:0 is called the unspecified address. It
must never be assigned to any node. It indicates the absence of an
address. One example of its use is in the Source Address field of
any IPv6 packets sent by an initializing host before it has learned
its own address.

The unspecified address must not be used as the destination address
of IPv6 packets or in IPv6 Routing Headers.

2.5.3 The Loopback Address

The unicast address 0:0:0:0:0:0:0:1 is called the loopback address.
It may be used by a node to send an IPv6 packet to itself. It may
never be assigned to any physical interface. It may be thought of as
being associated with a virtual interface (e.g., the loopback
interface).

The loopback address must not be used as the source address in IPv6
packets that are sent outside of a single node. An IPv6 packet with
a destination address of loopback must never be sent outside of a
single node and must never be forwarded by an IPv6 router.

2.5.4 IPv6 Addresses with Embedded IPv4 Addresses

The IPv6 transition mechanisms [TRAN] include a technique for hosts
and routers to dynamically tunnel IPv6 packets over IPv4 routing
infrastructure. IPv6 nodes that utilize this technique are assigned
special IPv6 unicast addresses that carry an IPv4 address in the low-
order 32-bits. This type of address is termed an "IPv4-compatible
IPv6 address" and has the format:

|                80 bits               | 16 |      32 bits        |
+--------------------------------------+--------------------------+
|0000..............................0000|0000|    IPv4 address     |
+--------------------------------------+----+---------------------+

A second type of IPv6 address which holds an embedded IPv4 address is
also defined. This address is used to represent the addresses of
IPv4-only nodes (those that *do not* support IPv6) as IPv6 addresses.
This type of address is termed an "IPv4-mapped IPv6 address" and has
the format:

|                80 bits               | 16 |      32 bits        |
+--------------------------------------+--------------------------+
|0000..............................0000|FFFF|    IPv4 address     |
+--------------------------------------+----+---------------------+
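
Both embedded-IPv4 formats differ only in the 16-bit field that
precedes the IPv4 address. A minimal sketch of building them follows,
assuming Python's ipaddress module; the function name embed_ipv4 is
hypothetical.

```python
import ipaddress

def embed_ipv4(ipv4: str, mapped: bool) -> ipaddress.IPv6Address:
    """Place an IPv4 address in the low-order 32 bits of an IPv6 address.

    mapped=False gives the IPv4-compatible form (16-bit field 0000);
    mapped=True gives the IPv4-mapped form (16-bit field FFFF).
    The high-order 80 bits are zero in both cases.
    """
    low32 = int(ipaddress.IPv4Address(ipv4))
    middle16 = 0xFFFF if mapped else 0x0000
    return ipaddress.IPv6Address((middle16 << 32) | low32)

compatible = embed_ipv4("13.1.68.3", mapped=False)
mapped = embed_ipv4("129.144.52.38", mapped=True)

print(compatible == ipaddress.IPv6Address("::13.1.68.3"))       # True
print(mapped == ipaddress.IPv6Address("::FFFF:129.144.52.38"))  # True
print(mapped.ipv4_mapped)                                       # 129.144.52.38
```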

2.5.5 NSAP Addresses

The mapping of NSAP addresses into IPv6 addresses is defined in
[NSAP]. That document recommends that network implementors who have
planned or deployed an OSI NSAP addressing plan, and who wish to
deploy or transition to IPv6, should redesign a native IPv6
addressing plan to meet their needs. However, it also defines a set
of mechanisms for the support of OSI NSAP addressing in an IPv6
network. These mechanisms are the ones that must be used if such
support is required. [NSAP] also defines a mapping of IPv6
addresses within the OSI address format, should this be required.

2.5.6 IPX Addresses

The mapping of IPX addresses into IPv6 addresses is as follows:

|   7   |                   121 bits                              |
+-------+---------------------------------------------------------+
|0000010|                 to be defined                           |
+-------+---------------------------------------------------------+

The draft definition, motivation, and usage are under study.

2.5.7 Aggregatable Global Unicast Addresses

The aggregatable global unicast address is defined in [AGGR].
This address format is designed to support both the current
provider-based aggregation and a new type of aggregation called
exchanges. The combination will allow efficient routing aggregation
both for sites that connect directly to providers and for sites that
connect to exchanges. Sites will have the choice to connect to
either type of aggregation point.

The IPv6 aggregatable global unicast address format is as follows:

| 3|  13 | 8 |   24   |   16   |          64 bits               |
+--+-----+---+--------+--------+--------------------------------+
|FP| TLA |RES|  NLA   |  SLA   |         Interface ID           |
|  | ID  |   |  ID    |  ID    |                                |
+--+-----+---+--------+--------+--------------------------------+

Where

   001          Format Prefix (3 bit) for Aggregatable Global
                Unicast Addresses
   TLA ID       Top-Level Aggregation Identifier
   RES          Reserved for future use
   NLA ID       Next-Level Aggregation Identifier
   SLA ID       Site-Level Aggregation Identifier
   INTERFACE ID Interface Identifier

The contents, field sizes, and assignment rules are defined in
[AGGR].
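
The field layout in the diagram above can be applied with plain shifts
and masks. The sketch below is illustrative only: the function name
split_fields and the sample address are hypothetical, and the
assignment rules themselves remain those of [AGGR].

```python
import ipaddress

# (field name, width in bits), most significant first, per the diagram.
FIELDS = [("FP", 3), ("TLA ID", 13), ("RES", 8), ("NLA ID", 24),
          ("SLA ID", 16), ("Interface ID", 64)]

def split_fields(address: str) -> dict:
    """Split a 128-bit address into the aggregatable global unicast fields."""
    value = int(ipaddress.IPv6Address(address))
    remaining = 128
    fields = {}
    for name, width in FIELDS:
        remaining -= width  # bit position of this field's lowest bit
        fields[name] = (value >> remaining) & ((1 << width) - 1)
    return fields

# Hypothetical sample address, chosen so that each field is easy to read.
fields = split_fields("2001:00AB:CDEF:0042:0:0:0:1")
print(fields["FP"] == 0b001)        # True
print(hex(fields["NLA ID"]))        # 0xabcdef
print(hex(fields["SLA ID"]))        # 0x42
```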

2.5.8 Local-Use IPv6 Unicast Addresses

There are two types of local-use unicast addresses defined. These
are Link-Local and Site-Local. The Link-Local is for use on a single
link and the Site-Local is for use in a single site. Link-Local
addresses have the following format:

|   10     |
|  bits    |        54 bits          |          64 bits           |
+----------+-------------------------+----------------------------+
|1111111010|           0             |       interface ID         |
+----------+-------------------------+----------------------------+

Link-Local addresses are designed to be used for addressing on a
single link for purposes such as auto-address configuration, neighbor
discovery, or when no routers are present.

Routers must not forward any packets with link-local source or
destination addresses to other links.

Site-Local addresses have the following format:

|   10     |
|  bits    |   38 bits   |  16 bits  |         64 bits            |
+----------+-------------+-----------+----------------------------+
|1111111011|    0        | subnet ID |       interface ID         |
+----------+-------------+-----------+----------------------------+

Site-Local addresses are designed to be used for addressing inside of
a site without the need for a global prefix.

Routers must not forward any packets with site-local source or
destination addresses outside of the site.
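
The two local-use formats can be assembled directly from their fields.
This is a minimal sketch assuming Python's ipaddress module; the
function names link_local and site_local are hypothetical.

```python
import ipaddress

LINK_LOCAL_FP = 0b1111111010  # 10-bit format prefix (FE80::/10)
SITE_LOCAL_FP = 0b1111111011  # 10-bit format prefix (FEC0::/10)

def link_local(interface_id: int) -> ipaddress.IPv6Address:
    """10-bit prefix, 54 zero bits, 64-bit interface ID."""
    if not 0 <= interface_id < 2 ** 64:
        raise ValueError("interface ID must fit in 64 bits")
    return ipaddress.IPv6Address((LINK_LOCAL_FP << 118) | interface_id)

def site_local(subnet_id: int, interface_id: int) -> ipaddress.IPv6Address:
    """10-bit prefix, 38 zero bits, 16-bit subnet ID, 64-bit interface ID."""
    if not 0 <= subnet_id < 2 ** 16:
        raise ValueError("subnet ID must fit in 16 bits")
    if not 0 <= interface_id < 2 ** 64:
        raise ValueError("interface ID must fit in 64 bits")
    return ipaddress.IPv6Address(
        (SITE_LOCAL_FP << 118) | (subnet_id << 64) | interface_id)

print(link_local(1) == ipaddress.IPv6Address("FE80::1"))                 # True
print(site_local(0x12, 1) == ipaddress.IPv6Address("FEC0:0:0:12::1"))    # True
```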

2.6 Anycast Addresses

An IPv6 anycast address is an address that is assigned to more than
one interface (typically belonging to different nodes), with the
property that a packet sent to an anycast address is routed to the
"nearest" interface having that address, according to the routing
protocols' measure of distance.

Anycast addresses are allocated from the unicast address space, using
any of the defined unicast address formats. Thus, anycast addresses
are syntactically indistinguishable from unicast addresses. When a
unicast address is assigned to more than one interface, thus turning
it into an anycast address, the nodes to which the address is
assigned must be explicitly configured to know that it is an anycast
address.

For any assigned anycast address, there is a longest address prefix P
that identifies the topological region in which all interfaces
belonging to that anycast address reside. Within the region
identified by P, each member of the anycast set must be advertised as
a separate entry in the routing system (commonly referred to as a
"host route"); outside the region identified by P, the anycast
address may be aggregated into the routing advertisement for prefix
P.

Note that, in the worst case, the prefix P of an anycast set may be
the null prefix, i.e., the members of the set may have no topological
locality. In that case, the anycast address must be advertised as a
separate routing entry throughout the entire internet, which presents
a severe scaling limit on how many such "global" anycast sets may be
supported. Therefore, it is expected that support for global anycast
sets may be unavailable or very restricted.

One expected use of anycast addresses is to identify the set of
routers belonging to an organization providing internet service.
Such addresses could be used as intermediate addresses in an IPv6
Routing header, to cause a packet to be delivered via a particular
aggregation or sequence of aggregations. Some other possible uses
are to identify the set of routers attached to a particular subnet,
or the set of routers providing entry into a particular routing
domain.

There is little experience with widespread, arbitrary use of internet
anycast addresses, and some known complications and hazards when
using them in their full generality [ANYCST]. Until more experience
has been gained and solutions agreed upon for those problems, the
following restrictions are imposed on IPv6 anycast addresses:

o An anycast address must not be used as the source address of an
IPv6 packet.

o An anycast address must not be assigned to an IPv6 host, that
is, it may be assigned to an IPv6 router only.

2.6.1 Required Anycast Address

The Subnet-Router anycast address is predefined. Its format is as
follows:

|                         n bits                 |   128-n bits   |
+------------------------------------------------+----------------+
|                   subnet prefix                | 00000000000000 |
+------------------------------------------------+----------------+

The "subnet prefix" in an anycast address is the prefix which
identifies a specific link. This anycast address is syntactically
the same as a unicast address for an interface on the link with the
interface identifier set to zero.

Packets sent to the Subnet-Router anycast address will be delivered
to one router on the subnet. All routers are required to support the
Subnet-Router anycast addresses for the subnets on which they have
interfaces.

The subnet-router anycast address is intended to be used for
applications where a node needs to communicate with one of a set of
routers on a remote subnet, for example when a mobile host needs to
communicate with one of the mobile agents on its "home" subnet.
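
Since the Subnet-Router anycast address is the subnet prefix followed
by an all-zero interface identifier, it can be derived from any
address on the link together with the prefix length. A minimal sketch,
assuming Python's ipaddress module; the function name
subnet_router_anycast is hypothetical.

```python
import ipaddress

def subnet_router_anycast(address_with_prefix: str) -> ipaddress.IPv6Address:
    """Zero the low-order 128-n bits, keeping the n-bit subnet prefix."""
    # strict=False lets the caller pass a node address rather than a
    # clean prefix; bits beyond the prefix length are masked off.
    network = ipaddress.IPv6Network(address_with_prefix, strict=False)
    return network.network_address

anycast = subnet_router_anycast("12AB:0:0:CD30:123:4567:89AB:CDEF/60")
print(anycast == ipaddress.IPv6Address("12AB:0:0:CD30::"))  # True
```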

2.7 Multicast Addresses

An IPv6 multicast address is an identifier for a group of nodes. A
node may belong to any number of multicast groups. Multicast
addresses have the following format:

|   8    |  4 |  4 |                  112 bits                   |
+--------+----+----+---------------------------------------------+
|11111111|flgs|scop|                  group ID                   |
+--------+----+----+---------------------------------------------+

11111111 at the start of the address identifies the address as
being a multicast address.

                              +-+-+-+-+
flgs is a set of 4 flags:     |0|0|0|T|
                              +-+-+-+-+

The high-order 3 flags are reserved, and must be initialized to
0.

T = 0 indicates a permanently-assigned ("well-known") multicast
address, assigned by the global internet numbering authority.

T = 1 indicates a non-permanently-assigned ("transient")
multicast address.

scop is a 4-bit multicast scope value used to limit the scope of
the multicast group. The values are:

   0  reserved
   1  node-local scope
   2  link-local scope
   3  (unassigned)
   4  (unassigned)
   5  site-local scope
   6  (unassigned)
   7  (unassigned)
   8  organization-local scope
   9  (unassigned)
   A  (unassigned)
   B  (unassigned)
   C  (unassigned)
   D  (unassigned)
   E  global scope
   F  reserved

group ID identifies the multicast group, either permanent or
transient, within the given scope.

The "meaning" of a permanently-assigned multicast address is
independent of the scope value. For example, if the "NTP servers
group" is assigned a permanent multicast address with a group ID of
101 (hex), then:

   FF01:0:0:0:0:0:0:101 means all NTP servers on the same node as the
                        sender.

   FF02:0:0:0:0:0:0:101 means all NTP servers on the same link as the
                        sender.

   FF05:0:0:0:0:0:0:101 means all NTP servers at the same site as the
                        sender.

   FF0E:0:0:0:0:0:0:101 means all NTP servers in the internet.

Non-permanently-assigned multicast addresses are meaningful only
within a given scope. For example, a group identified by the non-
permanent, site-local multicast address FF15:0:0:0:0:0:0:101 at one
site bears no relationship to a group using the same address at a
different site, nor to a non-permanent group using the same group ID
with different scope, nor to a permanent group with the same group
ID.

Multicast addresses must not be used as source addresses in IPv6
packets or appear in any routing header.
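
The multicast format can be decoded with a few shifts. The sketch
below is illustrative, assuming Python's ipaddress module; the
function name parse_multicast and the dictionary keys are
hypothetical.

```python
import ipaddress

# Assigned scope values from the list above; all others are reserved
# or unassigned.
SCOPES = {0x1: "node-local", 0x2: "link-local", 0x5: "site-local",
          0x8: "organization-local", 0xE: "global"}

def parse_multicast(address: str) -> dict:
    """Split a multicast address into T flag, scope, and group ID."""
    value = int(ipaddress.IPv6Address(address))
    if value >> 120 != 0xFF:
        raise ValueError("not a multicast address")
    flags = (value >> 116) & 0xF
    scope = (value >> 112) & 0xF
    return {
        "transient": bool(flags & 0x1),  # T = 1 means non-permanent
        "scope": SCOPES.get(scope, "reserved or unassigned"),
        "group_id": value & ((1 << 112) - 1),
    }

print(parse_multicast("FF05:0:0:0:0:0:0:101"))
# {'transient': False, 'scope': 'site-local', 'group_id': 257}
print(parse_multicast("FF15:0:0:0:0:0:0:101")["transient"])  # True
```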

2.7.1 Pre-Defined Multicast Addresses

The following well-known multicast addresses are pre-defined:

Reserved Multicast Addresses:   FF00:0:0:0:0:0:0:0
                                FF01:0:0:0:0:0:0:0
                                FF02:0:0:0:0:0:0:0
                                FF03:0:0:0:0:0:0:0
                                FF04:0:0:0:0:0:0:0
                                FF05:0:0:0:0:0:0:0
                                FF06:0:0:0:0:0:0:0
                                FF07:0:0:0:0:0:0:0
                                FF08:0:0:0:0:0:0:0
                                FF09:0:0:0:0:0:0:0
                                FF0A:0:0:0:0:0:0:0
                                FF0B:0:0:0:0:0:0:0
                                FF0C:0:0:0:0:0:0:0
                                FF0D:0:0:0:0:0:0:0
                                FF0E:0:0:0:0:0:0:0
                                FF0F:0:0:0:0:0:0:0

The above multicast addresses are reserved and shall never be
assigned to any multicast group.

All Nodes Addresses:    FF01:0:0:0:0:0:0:1
                        FF02:0:0:0:0:0:0:1

The above multicast addresses identify the group of all IPv6 nodes,
within scope 1 (node-local) or 2 (link-local).

All Routers Addresses:   FF01:0:0:0:0:0:0:2
                         FF02:0:0:0:0:0:0:2
                         FF05:0:0:0:0:0:0:2

The above multicast addresses identify the group of all IPv6 routers,
within scope 1 (node-local), 2 (link-local), or 5 (site-local).

Solicited-Node Address: FF02:0:0:0:0:1:FFXX:XXXX

The above multicast address is computed as a function of a node's
unicast and anycast addresses. The solicited-node multicast address
is formed by taking the low-order 24 bits of the address (unicast or
anycast) and appending those bits to the prefix
FF02:0:0:0:0:1:FF00::/104 resulting in a multicast address in the
range

FF02:0:0:0:0:1:FF00:0000

to

FF02:0:0:0:0:1:FFFF:FFFF

For example, the solicited node multicast address corresponding to
the IPv6 address 4037::01:800:200E:8C6C is FF02::1:FF0E:8C6C. IPv6
addresses that differ only in the high-order bits, e.g. due to
multiple high-order prefixes associated with different aggregations,
will map to the same solicited-node address thereby reducing the
number of multicast addresses a node must join.

A node is required to compute and join the associated Solicited-Node
multicast addresses for every unicast and anycast address it is
assigned.
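
The solicited-node computation described above reduces to one mask and
one OR. A minimal sketch follows, assuming Python's ipaddress module;
the function name solicited_node is hypothetical.

```python
import ipaddress

# The 104-bit prefix FF02:0:0:0:0:1:FF with the low-order 24 bits zero.
SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("FF02:0:0:0:0:1:FF00:0"))

def solicited_node(address: str) -> ipaddress.IPv6Address:
    """Append the low-order 24 bits of a unicast or anycast address."""
    low24 = int(ipaddress.IPv6Address(address)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)

group = solicited_node("4037::01:800:200E:8C6C")
print(group == ipaddress.IPv6Address("FF02::1:FF0E:8C6C"))  # True

# Addresses that differ only in their high-order bits share one group.
print(solicited_node("FE80::800:200E:8C6C") == group)       # True
```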

2.7.2 Assignment of New IPv6 Multicast Addresses

The current approach [ETHER] to map IPv6 multicast addresses into
IEEE 802 MAC addresses takes the low order 32 bits of the IPv6
multicast address and uses them to create a MAC address. Note that
Token Ring networks are handled differently. This is defined in
[TOKEN]. Group IDs less than or equal to 32 bits will generate
unique MAC addresses. Because of this, new IPv6 multicast addresses
should be assigned so that the group identifier is always in the low
order 32 bits as shown in the following:

|   8    |  4 |  4 |          80 bits          |     32 bits     |
+--------+----+----+---------------------------+-----------------+
|11111111|flgs|scop|   reserved must be zero   |    group ID     |
+--------+----+----+---------------------------+-----------------+

While this limits the number of permanent IPv6 multicast groups to
2^32, this is unlikely to be a limitation in the future. If it
becomes necessary to exceed this limit in the future, multicast will
still work but the processing will be slightly slower.

Additional IPv6 multicast addresses are defined and registered by the
IANA [MASGN].
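
The reason for keeping the group ID in the low-order 32 bits can be
seen in a small sketch of the Ethernet mapping. The 33-33 leading
octets are an assumption taken from the IPv6 over Ethernet
specification rather than from this document, and the function name
multicast_mac is hypothetical.

```python
import ipaddress

def multicast_mac(address: str) -> str:
    """Map an IPv6 multicast address to an Ethernet multicast MAC."""
    addr = ipaddress.IPv6Address(address)
    if not addr.is_multicast:
        raise ValueError("not a multicast address")
    low32 = int(addr) & 0xFFFFFFFF
    # Assumed from the IPv6 over Ethernet specification: octets 33-33
    # followed by the low-order 32 bits of the IPv6 address.
    octets = bytes([0x33, 0x33]) + low32.to_bytes(4, "big")
    return "-".join(f"{octet:02X}" for octet in octets)

print(multicast_mac("FF02::101"))          # 33-33-00-00-01-01

# A group ID wider than 32 bits collides with a narrower one, which is
# why the upper 80 bits of the group field are reserved as zero.
print(multicast_mac("FF02::1:0:101"))      # 33-33-00-00-01-01
```

A collision only means that a receiver's link-layer filter passes
frames for both groups; the IPv6 layer still has to discard the ones
it did not join, which is the "slightly slower" processing mentioned
above.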

2.8 A Node's Required Addresses

A host is required to recognize the following addresses as
identifying itself:

o Its Link-Local Address for each interface
o Assigned Unicast Addresses
o Loopback Address
o All-Nodes Multicast Addresses
o Solicited-Node Multicast Address for each of its assigned
  unicast and anycast addresses
o Multicast Addresses of all other groups to which the host
  belongs.

A router is required to recognize all addresses that a host is
required to recognize, plus the following addresses as identifying
itself:

o The Subnet-Router anycast addresses for the interfaces it is
  configured to act as a router on.
o All other Anycast addresses with which the router has been
  configured.
o All-Routers Multicast Addresses
o Multicast Addresses of all other groups to which the router
  belongs.

The only address prefixes which should be predefined in an
implementation are the:

o Unspecified Address
o Loopback Address
o Multicast Prefix (FF)
o Local-Use Prefixes (Link-Local and Site-Local)
o Pre-Defined Multicast Addresses
o IPv4-Compatible Prefixes

Implementations should assume all other addresses are unicast unless
specifically configured (e.g., anycast addresses).

3. Security Considerations

IPv6 addressing documents do not have any direct impact on Internet
infrastructure security. Authentication of IPv6 packets is defined
in [AUTH].

APPENDIX A : Creating EUI-64 based Interface Identifiers
--------------------------------------------------------

Depending on the characteristics of a specific link or node there are
a number of approaches for creating EUI-64 based interface
identifiers. This appendix describes some of these approaches.

Links or Nodes with EUI-64 Identifiers

The only change needed to transform an EUI-64 identifier to an
interface identifier is to invert the "u" (universal/local) bit. For
example, a globally unique EUI-64 identifier of the form:

|0              1|1              3|3              4|4              6|
|0              5|6              1|2              7|8              3|
+----------------+----------------+----------------+----------------+
|cccccc0gcccccccc|ccccccccmmmmmmmm|mmmmmmmmmmmmmmmm|mmmmmmmmmmmmmmmm|
+----------------+----------------+----------------+----------------+

where "c" are the bits of the assigned company_id, "0" is the value
of the universal/local bit to indicate global scope, "g" is
individual/group bit, and "m" are the bits of the manufacturer-
selected extension identifier. The IPv6 interface identifier would
be of the form:

|0              1|1              3|3              4|4              6|
|0              5|6              1|2              7|8              3|
+----------------+----------------+----------------+----------------+
|cccccc1gcccccccc|ccccccccmmmmmmmm|mmmmmmmmmmmmmmmm|mmmmmmmmmmmmmmmm|
+----------------+----------------+----------------+----------------+

The only change is inverting the value of the universal/local bit.
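
As an illustration only, the transformation above can be sketched in
Python. The function name eui64_to_interface_id is hypothetical, and
the sketch assumes the identifier is held as 8 octets in network
order, so that the universal/local bit is the 0x02 bit of the first
octet (the "0"/"1" position in the diagrams above).

```python
def eui64_to_interface_id(eui64: bytes) -> bytes:
    """Invert the universal/local bit of an 8-octet EUI-64 (illustrative)."""
    if len(eui64) != 8:
        raise ValueError("an EUI-64 is exactly 8 octets")
    # The "u" bit is the 0x02 bit of the first octet; the "g" bit (0x01)
    # and every other bit are left untouched.
    return bytes([eui64[0] ^ 0x02]) + eui64[1:]
```

Because the operation is a single bit inversion, applying it twice
returns the original EUI-64.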

Links or Nodes with IEEE 802 48 bit MAC's

[EUI64] defines a method to create an EUI-64 identifier from an IEEE
48 bit MAC identifier. This is to insert two octets, with hexadecimal
values of 0xFF and 0xFE, in the middle of the 48 bit MAC (between the
company_id and vendor supplied id). For example, the 48 bit MAC with
global scope:

|0              1|1              3|3              4|
|0              5|6              1|2              7|
+----------------+----------------+----------------+
|cccccc0gcccccccc|ccccccccmmmmmmmm|mmmmmmmmmmmmmmmm|
+----------------+----------------+----------------+

where "c" are the bits of the assigned company_id, "0" is the value
of the universal/local bit to indicate global scope, "g" is
individual/group bit, and "m" are the bits of the manufacturer-
selected extension identifier. The interface identifier would be of
the form:

|0              1|1              3|3              4|4              6|
|0              5|6              1|2              7|8              3|
+----------------+----------------+----------------+----------------+
|cccccc1gcccccccc|cccccccc11111111|11111110mmmmmmmm|mmmmmmmmmmmmmmmm|
+----------------+----------------+----------------+----------------+

When IEEE 802 48 bit MAC addresses are available (on an interface or a
node), an implementation should use them to create interface
identifiers due to their availability and uniqueness properties.
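
A minimal sketch of this procedure in Python follows. The function
name mac48_to_interface_id is an assumption made for illustration; it
takes the MAC as 6 octets in network order, inserts 0xFF 0xFE after
the company_id, and then inverts the universal/local bit as described
earlier in this appendix.

```python
def mac48_to_interface_id(mac: bytes) -> bytes:
    """Build an interface identifier from a 48 bit MAC (illustrative)."""
    if len(mac) != 6:
        raise ValueError("an IEEE 802 MAC address is exactly 6 octets")
    # Insert 0xFF, 0xFE between the 3-octet company_id and the
    # 3-octet vendor supplied id to obtain an EUI-64.
    eui64 = mac[:3] + b"\xff\xfe" + mac[3:]
    # Invert the universal/local bit (0x02 of the first octet).
    return bytes([eui64[0] ^ 0x02]) + eui64[1:]
```

For example, the MAC 00:AA:00:12:34:56 would yield the interface
identifier 02AA:00FF:FE12:3456.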

Links with Non-Global Identifiers

There are a number of types of links that, while multi-access, do not
have globally unique link identifiers. Examples include LocalTalk
and Arcnet. The method to create an EUI-64 formatted identifier is
to take the link identifier (e.g., the LocalTalk 8 bit node
identifier) and zero fill it to the left. For example, a LocalTalk 8
bit node identifier of hexadecimal value 0x4F results in the
following interface identifier:

|0              1|1              3|3              4|4              6|
|0              5|6              1|2              7|8              3|
+----------------+----------------+----------------+----------------+
|0000000000000000|0000000000000000|0000000000000000|0000000001001111|
+----------------+----------------+----------------+----------------+

Note that this results in the universal/local bit set to "0" to
indicate local scope.
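
The zero fill can be sketched in one line of Python. The helper name
link_id_to_interface_id is hypothetical, and the link identifier is
assumed to be available as a non-negative integer.

```python
def link_id_to_interface_id(link_id: int) -> bytes:
    """Zero fill a short link identifier to the left (illustrative)."""
    # to_bytes() pads with leading zero octets, which leaves the
    # universal/local bit (0x02 of the first octet) at 0 for any
    # identifier of 56 bits or fewer.
    return link_id.to_bytes(8, "big")
```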

Links without Identifiers

There are a number of links that do not have any type of built-in
identifier. The most common of these are serial links and configured
tunnels. Interface identifiers must be chosen that are unique for
the link.

When no built-in identifier is available on a link, the preferred
approach is to use a global interface identifier from another
interface or one which is assigned to the node itself. To use this
approach no other interface connecting the same node to the same link
may use the same identifier.

If there is no global interface identifier available for use on the
link, the implementation needs to create a local scope interface
identifier. The only requirement is that it be unique on the link.
There are many possible approaches to select a link-unique interface
identifier. They include:

Manual Configuration
Generated Random Number
Node Serial Number (or other node-specific token)

The link-unique interface identifier should be generated in a manner
that it does not change after a reboot of a node or if interfaces are
added or deleted from the node.

The selection of the appropriate algorithm is link and implementation
dependent. The details on forming interface identifiers are defined
in the appropriate "IPv6 over <link>" specification. It is strongly
recommended that a collision detection algorithm be implemented as
part of any automatic algorithm.

APPENDIX B: ABNF Description of Text Representations
----------------------------------------------------

This appendix defines the text representation of IPv6 addresses and
prefixes in Augmented BNF [ABNF] for reference purposes.

IPv6address = hexpart [ ":" IPv4address ]
IPv4address = 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT

IPv6prefix = hexpart "/" 1*2DIGIT

hexpart = hexseq | hexseq "::" [ hexseq ] | "::" [ hexseq ]
hexseq = hex4 *( ":" hex4)
hex4 = 1*4HEXDIG
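
As a rough aid to reading the grammar, the rules above can be
transcribed into Python regular expressions. This is a sketch that
mirrors the ABNF literally; the names is_ipv6_address_text and
is_ipv6_prefix_text are assumptions made for illustration. Like the
grammar itself, it does not limit the number of 16-bit groups or
check numeric ranges, so it is not a complete address validator.

```python
import re

# Direct transcription of the ABNF rules (illustrative only).
HEX4 = r"[0-9A-Fa-f]{1,4}"
HEXSEQ = rf"{HEX4}(?::{HEX4})*"
HEXPART = rf"(?:{HEXSEQ}::(?:{HEXSEQ})?|::(?:{HEXSEQ})?|{HEXSEQ})"
IPV4ADDRESS = r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}"

IPV6ADDRESS_RE = re.compile(rf"{HEXPART}(?::{IPV4ADDRESS})?")
IPV6PREFIX_RE = re.compile(rf"{HEXPART}/[0-9]{{1,2}}")


def is_ipv6_address_text(text: str) -> bool:
    """True if text matches the IPv6address rule as written."""
    return IPV6ADDRESS_RE.fullmatch(text) is not None


def is_ipv6_prefix_text(text: str) -> bool:
    """True if text matches the IPv6prefix rule as written."""
    return IPV6PREFIX_RE.fullmatch(text) is not None
```

An implementation that needs real validation would additionally
count groups and check the prefix length against the address length.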

APPENDIX C: CHANGES FROM RFC-1884
---------------------------------

The following changes were made from RFC-1884 "IP Version 6
Addressing Architecture":

- Added an appendix providing an ABNF description of text
representations.
- Clarification that link unique identifiers do not change after
reboot or other interface reconfigurations.
- Clarification of Address Model based on comments.
- Changed aggregation format terminology to be consistent with
aggregation draft.
- Added text to allow interface identifier to be used on more than
one interface on same node.
- Added rules for defining new multicast addresses.
- Added appendix describing procedures for creating EUI-64 based
interface ID's.
- Added notation for defining IPv6 prefixes.
- Changed solicited node multicast definition to use a longer
prefix.
- Added site scope all routers multicast address.
- Defined Aggregatable Global Unicast Addresses to use "001" Format
Prefix.
- Changed "010" (Provider-Based Unicast) and "100" (Reserved for
Geographic) Format Prefixes to Unassigned.
- Added section on Interface ID definition for unicast addresses.
Requires use of EUI-64 in range of format prefixes and rules for
setting global/local scope bit in EUI-64.
- Updated NSAP text to reflect working in RFC1888.
- Removed protocol specific IPv6 multicast addresses (e.g., DHCP)
and referenced the IANA definitions.
- Removed section "Unicast Address Example". Had become OBE.
- Added new and updated references.
- Minor text clarifications and improvements.

REFERENCES

[ABNF] Crocker, D., and P. Overell, "Augmented BNF for
Syntax Specifications: ABNF", RFC 2234, November 1997.

[AGGR] Hinden, R., O'Dell, M., and S. Deering, "An
Aggregatable Global Unicast Address Format", RFC 2374, July
1998.

[AUTH] Atkinson, R., "IP Authentication Header", RFC 1826, August
1995.

[ANYCST] Partridge, C., Mendez, T., and W. Milliken, "Host
Anycasting Service", RFC 1546, November 1993.

[CIDR] Fuller, V., Li, T., Yu, J., and K. Varadhan, "Classless
Inter-Domain Routing (CIDR): An Address Assignment and
Aggregation Strategy", RFC 1519, September 1993.

[ETHER] Crawford, M., "Transmission of IPv6 Packets over Ethernet
Networks", Work in Progress.

[EUI64] IEEE, "Guidelines for 64-bit Global Identifier (EUI-64)
Registration Authority",
http://standards.ieee.org/db/oui/tutorials/EUI64.html,
March 1997.

[FDDI] Crawford, M., "Transmission of IPv6 Packets over FDDI
Networks", Work in Progress.

[IPV6] Deering, S., and R. Hinden, Editors, "Internet Protocol,
Version 6 (IPv6) Specification", RFC 1883, December 1995.

[MASGN] Hinden, R., and S. Deering, "IPv6 Multicast Address
Assignments", RFC 2375, July 1998.

[NSAP] Bound, J., Carpenter, B., Harrington, D., Houldsworth, J.,
and A. Lloyd, "OSI NSAPs and IPv6", RFC 1888, August 1996.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.

[TOKEN] Thomas, S., "Transmission of IPv6 Packets over Token Ring
Networks", Work in Progress.

[TRAN] Gilligan, R., and E. Nordmark, "Transition Mechanisms for
IPv6 Hosts and Routers", RFC 1933, April 1996.

AUTHORS' ADDRESSES

Robert M. Hinden
Nokia
232 Java Drive
Sunnyvale, CA 94089
USA

Phone: +1 408 990-2004
Fax: +1 408 743-5677
EMail: hinden@iprg.nokia.com

Stephen E. Deering
Cisco Systems, Inc.
170 West Tasman Drive
San Jose, CA 95134-1706
USA

Phone: +1 408 527-8213
Fax: +1 408 527-8254
EMail: deering@cisco.com

Full Copyright Statement

Copyright (C) The Internet Society (1998). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

RFC 2463 – Internet Control Message Protocol (ICMPv6)

 
Network Working Group                                           A. Conta
Request for Comments: 2463                                        Lucent
Obsoletes: 1885                                               S. Deering
Category: Standards Track                                  Cisco Systems
                                                           December 1998

Internet Control Message Protocol (ICMPv6)
for the Internet Protocol Version 6 (IPv6)
Specification

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1998). All Rights Reserved.

Abstract

This document specifies a set of Internet Control Message Protocol
(ICMP) messages for use with version 6 of the Internet Protocol
(IPv6).

Table of Contents

1. Introduction
2. ICMPv6 (ICMP for IPv6)
2.1 Message General Format
2.2 Message Source Address Determination
2.3 Message Checksum Calculation
2.4 Message Processing Rules
3. ICMPv6 Error Messages
3.1 Destination Unreachable Message
3.2 Packet Too Big Message
3.3 Time Exceeded Message
3.4 Parameter Problem Message
4. ICMPv6 Informational Messages
4.1 Echo Request Message
4.2 Echo Reply Message
5. Security Considerations
6. References
7. Acknowledgments
8. Authors' Addresses
Appendix A - Changes since RFC 1885
Full Copyright Statement

1. Introduction


The Internet Protocol, version 6 (IPv6) is a new version of IP. IPv6
uses the Internet Control Message Protocol (ICMP) as defined for IPv4
[RFC-792], with a number of changes. The resulting protocol is
called ICMPv6, and has an IPv6 Next Header value of 58.

This document describes the format of a set of control messages used
in ICMPv6. It does not describe the procedures for using these
messages to achieve functions like Path MTU discovery; such
procedures are described in other documents (e.g., [PMTU]). Other
documents may also introduce additional ICMPv6 message types, such as
Neighbor Discovery messages [IPv6-DISC], subject to the general rules
for ICMPv6 messages given in section 2 of this document.

Terminology defined in the IPv6 specification [IPv6] and the IPv6
Routing and Addressing specification [IPv6-ADDR] applies to this
document as well.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC-2119].

2. ICMPv6 (ICMP for IPv6)


ICMPv6 is used by IPv6 nodes to report errors encountered in
processing packets, and to perform other internet-layer functions,
such as diagnostics (ICMPv6 "ping"). ICMPv6 is an integral part of
IPv6 and MUST be fully implemented by every IPv6 node.

2.1 Message General Format


ICMPv6 messages are grouped into two classes: error messages and
informational messages. Error messages are identified as such by
having a zero in the high-order bit of their message Type field
values. Thus, error messages have message Types from 0 to 127;
informational messages have message Types from 128 to 255.

This document defines the message formats for the following ICMPv6
messages:

ICMPv6 error messages:

1 Destination Unreachable (see section 3.1)
2 Packet Too Big (see section 3.2)
3 Time Exceeded (see section 3.3)
4 Parameter Problem (see section 3.4)

ICMPv6 informational messages:

128 Echo Request (see section 4.1)
129 Echo Reply (see section 4.2)

Every ICMPv6 message is preceded by an IPv6 header and zero or more
IPv6 extension headers. The ICMPv6 header is identified by a Next
Header value of 58 in the immediately preceding header. (NOTE: this
is different than the value used to identify ICMP for IPv4.)

The ICMPv6 messages have the following general format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                         Message Body                          +
|                                                               |

The type field indicates the type of the message. Its value
determines the format of the remaining data.

The code field depends on the message type. It is used to create an
additional level of message granularity.

The checksum field is used to detect data corruption in the ICMPv6
message and parts of the IPv6 header.
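
The general format can be illustrated with a small parsing sketch in
Python. The names Icmpv6Header and parse_icmpv6 are hypothetical; the
input is assumed to be the ICMPv6 message alone, with the IPv6 header
and any extension headers already removed. The is_error property
applies the high-order-bit rule given at the start of this section.

```python
import struct
from typing import NamedTuple


class Icmpv6Header(NamedTuple):
    type: int        # message type, 0..255
    code: int        # meaning depends on the type
    checksum: int    # 16-bit checksum as carried in the message
    body: bytes      # everything after the first 4 octets

    @property
    def is_error(self) -> bool:
        # Error messages have a zero high-order Type bit (Types 0..127).
        return self.type < 128


def parse_icmpv6(message: bytes) -> Icmpv6Header:
    """Split an ICMPv6 message into its fixed fields and body."""
    if len(message) < 4:
        raise ValueError("ICMPv6 message shorter than 4 octets")
    msg_type, code, checksum = struct.unpack("!BBH", message[:4])
    return Icmpv6Header(msg_type, code, checksum, message[4:])
```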

2.2 Message Source Address Determination


A node that sends an ICMPv6 message has to determine both the Source
and Destination IPv6 Addresses in the IPv6 header before calculating
the checksum. If the node has more than one unicast address, it must
choose the Source Address of the message as follows:

(a) If the message is a response to a message sent to one of the
node's unicast addresses, the Source Address of the reply must
be that same address.

(b) If the message is a response to a message sent to a multicast or
anycast group in which the node is a member, the Source Address
of the reply must be a unicast address belonging to the
interface on which the multicast or anycast packet was received.

(c) If the message is a response to a message sent to an address
that does not belong to the node, the Source Address should be
that unicast address belonging to the node that will be most
helpful in diagnosing the error. For example, if the message is
a response to a packet forwarding action that cannot complete
successfully, the Source Address should be a unicast address
belonging to the interface on which the packet forwarding
failed.

(d) Otherwise, the node's routing table must be examined to
determine which interface will be used to transmit the message
to its destination, and a unicast address belonging to that
interface must be used as the Source Address of the message.

2.3 Message Checksum Calculation


The checksum is the 16-bit one's complement of the one's complement
sum of the entire ICMPv6 message starting with the ICMPv6 message
type field, prepended with a "pseudo-header" of IPv6 header fields,
as specified in [IPv6, section 8.1]. The Next Header value used in
the pseudo-header is 58. (NOTE: the inclusion of a pseudo-header in
the ICMPv6 checksum is a change from IPv4; see [IPv6] for the
rationale for this change.)

For computing the checksum, the checksum field is set to zero.
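
A sketch of this calculation in Python is given below for
illustration. The function names are assumptions. The pseudo-header
layout used here (source address, destination address, 32-bit
upper-layer packet length, three zero octets, Next Header) is the one
given in [IPv6, section 8.1]; the caller is assumed to pass the
ICMPv6 message with its checksum field already set to zero.

```python
import ipaddress
import struct

ICMPV6_NEXT_HEADER = 58


def ones_complement_sum16(data: bytes) -> int:
    """16-bit one's complement sum, padding odd input with a zero octet."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total


def icmpv6_checksum(src: str, dst: str, message: bytes) -> int:
    """Checksum over the pseudo-header followed by the ICMPv6 message."""
    pseudo_header = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", len(message), ICMPV6_NEXT_HEADER)
    )
    return ~ones_complement_sum16(pseudo_header + message) & 0xFFFF
```

A receiver can use the same function for verification: recomputing
over a message whose checksum field holds the transmitted value
yields zero when the message is intact.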

2.4 Message Processing Rules


Implementations MUST observe the following rules when processing
ICMPv6 messages (from [RFC-1122]):

(a) If an ICMPv6 error message of unknown type is received, it MUST
be passed to the upper layer.

(b) If an ICMPv6 informational message of unknown type is received,
it MUST be silently discarded.

(c) Every ICMPv6 error message (type < 128) includes as much of the
IPv6 offending (invoking) packet (the packet that caused the
error) as will fit without making the error message packet
exceed the minimum IPv6 MTU [IPv6].

(d) In those cases where the internet-layer protocol is required to
pass an ICMPv6 error message to the upper-layer process, the
upper-layer protocol type is extracted from the original packet
(contained in the body of the ICMPv6 error message) and used to
select the appropriate upper-layer process to handle the error.

If the original packet had an unusually large number of
extension headers, it is possible that the upper-layer protocol
type may not be present in the ICMPv6 message, due to truncation
of the original packet to meet the minimum IPv6 MTU [IPv6]
limit. In that case, the error message is silently dropped
after any IPv6-layer processing.

(e) An ICMPv6 error message MUST NOT be sent as a result of
receiving:

(e.1) an ICMPv6 error message, or

(e.2) a packet destined to an IPv6 multicast address (there are
two exceptions to this rule: (1) the Packet Too Big
Message - Section 3.2 - to allow Path MTU discovery to
work for IPv6 multicast, and (2) the Parameter Problem
Message, Code 2 - Section 3.4 - reporting an unrecognized
IPv6 option that has the Option Type highest-order two
bits set to 10), or

(e.3) a packet sent as a link-layer multicast, (the exception
from e.2 applies to this case too), or

(e.4) a packet sent as a link-layer broadcast, (the exception
from e.2 applies to this case too), or

(e.5) a packet whose source address does not uniquely identify
a single node -- e.g., the IPv6 Unspecified Address, an
IPv6 multicast address, or an address known by the ICMP
message sender to be an IPv6 anycast address.

(f) Finally, in order to limit the bandwidth and forwarding costs
incurred sending ICMPv6 error messages, an IPv6 node MUST limit
the rate of ICMPv6 error messages it sends. This situation may
occur when a source sending a stream of erroneous packets fails
to heed the resulting ICMPv6 error messages. There are a
variety of ways of implementing the rate-limiting function, for
example:

(f.1) Timer-based - for example, limiting the rate of
transmission of error messages to a given source, or to
any source, to at most once every T milliseconds.

(f.2) Bandwidth-based - for example, limiting the rate at which
error messages are sent from a particular interface to
some fraction F of the attached link's bandwidth.

The limit parameters (e.g., T or F in the above examples) MUST
be configurable for the node, with a conservative default value
(e.g., T = 1 second, NOT 0 seconds, or F = 2 percent, NOT 100
percent).
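
Rules (e) and (f.1) can be sketched in Python as follows. This is an
illustration under stated assumptions, not a definitive
implementation: the names error_is_suppressed and TimerRateLimiter
and all of their parameters are hypothetical, anycast and link-layer
delivery cannot be derived from the addresses and are therefore
passed in as flags, and the limiter applies one global interval
rather than a per-source one.

```python
import ipaddress

PACKET_TOO_BIG = 2
PARAMETER_PROBLEM = 4


def error_is_suppressed(err_type, err_code, src, dst, *,
                        invoking_is_icmpv6_error=False,
                        link_layer_multicast_or_broadcast=False,
                        src_known_anycast=False,
                        option_type=None):
    """True when rule (e) says the error (err_type/err_code) MUST NOT
    be sent in response to a packet from src to dst."""
    src_addr = ipaddress.IPv6Address(src)
    dst_addr = ipaddress.IPv6Address(dst)
    if invoking_is_icmpv6_error:                              # (e.1)
        return True
    if (src_addr.is_unspecified or src_addr.is_multicast
            or src_known_anycast):                            # (e.5)
        return True
    # Exceptions listed under (e.2); they also apply to (e.3) and (e.4).
    exempt = err_type == PACKET_TOO_BIG or (
        err_type == PARAMETER_PROBLEM and err_code == 2
        and option_type is not None and (option_type >> 6) == 0b10)
    group_delivery = (dst_addr.is_multicast
                      or link_layer_multicast_or_broadcast)   # (e.2)-(e.4)
    return group_delivery and not exempt


class TimerRateLimiter:
    """Timer-based limit (f.1): at most one error every `interval` seconds."""

    def __init__(self, interval: float = 1.0):
        if interval <= 0:
            raise ValueError("interval must be positive")
        self.interval = interval
        self._last_sent = None

    def allow(self, now: float) -> bool:
        """Return True, and record the send, if an error may go out at `now`."""
        if self._last_sent is not None and now - self._last_sent < self.interval:
            return False
        self._last_sent = now
        return True
```

In use, a sender would first consult error_is_suppressed and then
call allow() with a monotonic clock reading such as time.monotonic().
The default interval of one second follows the conservative default
suggested above.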

The following sections describe the message formats for the above
ICMPv6 messages.

3. ICMPv6 Error Messages


3.1 Destination Unreachable Message


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Unused                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    As much of invoking packet                 |
+                as will fit without the ICMPv6 packet          +
|                exceeding the minimum IPv6 MTU [IPv6]          |

IPv6 Fields:

Destination Address

Copied from the Source Address field of the invoking
packet.

ICMPv6 Fields:

Type 1

Code 0 - no route to destination
     1 - communication with destination
         administratively prohibited
     2 - (not assigned)
     3 - address unreachable
     4 - port unreachable

Unused This field is unused for all code values.
It must be initialized to zero by the sender
and ignored by the receiver.

Description

A Destination Unreachable message SHOULD be generated by a router, or
by the IPv6 layer in the originating node, in response to a packet
that cannot be delivered to its destination address for reasons other
than congestion. (An ICMPv6 message MUST NOT be generated if a
packet is dropped due to congestion.)

If the reason for the failure to deliver is lack of a matching entry
in the forwarding node's routing table, the Code field is set to 0
(NOTE: this error can occur only in nodes that do not hold a "default
route" in their routing tables).

If the reason for the failure to deliver is administrative
prohibition, e.g., a "firewall filter", the Code field is set to 1.

If there is any other reason for the failure to deliver, e.g.,
inability to resolve the IPv6 destination address into a
corresponding link address, or a link-specific problem of some sort,
then the Code field is set to 3.

A destination node SHOULD send a Destination Unreachable message with
Code 4 in response to a packet for which the transport protocol
(e.g., UDP) has no listener, if that transport protocol has no
alternative means to inform the sender.
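
The message layout can be illustrated with a short Python sketch. The
function name build_destination_unreachable is hypothetical. The
sketch assumes the minimum IPv6 MTU of 1280 octets from the base IPv6
specification and an error packet that carries no extension headers,
and it leaves the Checksum field at zero for a later checksum step.

```python
import struct

MIN_IPV6_MTU = 1280           # minimum IPv6 MTU per the base specification
IPV6_HEADER_LEN = 40
ICMPV6_ERROR_HEADER_LEN = 8   # Type, Code, Checksum, 4-octet Unused

ASSIGNED_CODES = (0, 1, 3, 4)  # code 2 is not assigned in this document


def build_destination_unreachable(code: int, invoking_packet: bytes) -> bytes:
    """Build a Type 1 message with the Checksum field left at zero."""
    if code not in ASSIGNED_CODES:
        raise ValueError("unassigned Destination Unreachable code")
    # Include only as much of the invoking packet as fits without the
    # resulting IPv6 packet exceeding the minimum IPv6 MTU.
    room = MIN_IPV6_MTU - IPV6_HEADER_LEN - ICMPV6_ERROR_HEADER_LEN
    header = struct.pack("!BBHI", 1, code, 0, 0)   # Unused is zero
    return header + invoking_packet[:room]
```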

Upper layer notification

A node receiving the ICMPv6 Destination Unreachable message MUST
notify the upper-layer process.

3.2 Packet Too Big Message


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             MTU                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    As much of invoking packet                 |
+                as will fit without the ICMPv6 packet          +
|                exceeding the minimum IPv6 MTU [IPv6]          |

IPv6 Fields:

Destination Address

Copied from the Source Address field of the invoking
packet.

ICMPv6 Fields:

Type 2

Code Set to 0 (zero) by the sender and ignored by the
receiver

MTU The Maximum Transmission Unit of the next-hop link.

Description

A Packet Too Big MUST be sent by a router in response to a packet
that it cannot forward because the packet is larger than the MTU of
the outgoing link. The information in this message is used as part
of the Path MTU Discovery process [PMTU].

Sending a Packet Too Big Message makes an exception to one of the
rules of when to send an ICMPv6 error message, in that unlike other
messages, it is sent in response to a packet received with an IPv6
multicast destination address, or a link-layer multicast or link-
layer broadcast address.
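
On the receiving side, extracting the MTU field might look like the
following Python sketch. The name parse_packet_too_big is an
assumption; the Code field is read but deliberately ignored, as the
field description above requires of receivers.

```python
import struct


def parse_packet_too_big(message: bytes):
    """Return (mtu, invoking_packet_fragment) from a Type 2 message."""
    if len(message) < 8:
        raise ValueError("Packet Too Big message shorter than 8 octets")
    msg_type, _code, _checksum, mtu = struct.unpack("!BBHI", message[:8])
    if msg_type != 2:
        raise ValueError("not a Packet Too Big message")
    # _code is ignored on receipt; mtu is the MTU of the next-hop link.
    return mtu, message[8:]
```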

Upper layer notification

An incoming Packet Too Big message MUST be passed to the upper-layer
process.

3.3 Time Exceeded Message


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             Unused                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    As much of invoking packet                 |
+                as will fit without the ICMPv6 packet          +
|                exceeding the minimum IPv6 MTU [IPv6]          |

IPv6 Fields:

Destination Address
Copied from the Source Address field of the invoking
packet.

ICMPv6 Fields:

Type 3

Code 0 - hop limit exceeded in transit

1 - fragment reassembly time exceeded

Unused This field is unused for all code values.
It must be initialized to zero by the sender
and ignored by the receiver.

Description

If a router receives a packet with a Hop Limit of zero, or a router
decrements a packet's Hop Limit to zero, it MUST discard the packet
and send an ICMPv6 Time Exceeded message with Code 0 to the source of
the packet. This indicates either a routing loop or too small an
initial Hop Limit value.
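
The router-side check described above can be sketched as a small
Python function. The name next_hop_limit is hypothetical; a return
value of None stands for "discard the packet and send Time Exceeded
with Code 0", which the caller would then carry out subject to the
rules of section 2.4.

```python
def next_hop_limit(received_hop_limit: int):
    """Return the Hop Limit to forward with, or None if it is exhausted."""
    if not 0 <= received_hop_limit <= 255:
        raise ValueError("Hop Limit is an 8-bit field")
    if received_hop_limit == 0:
        return None                  # arrived with a Hop Limit of zero
    decremented = received_hop_limit - 1
    # Decrementing to zero is treated the same as receiving zero.
    return decremented if decremented > 0 else None
```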

The rules for selecting the Source Address of this message are
defined in section 2.2.

Upper layer notification

An incoming Time Exceeded message MUST be passed to the upper-layer
process.

3.4 Parameter Problem Message


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Pointer                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    As much of invoking packet                 |
+                as will fit without the ICMPv6 packet          +
|                exceeding the minimum IPv6 MTU [IPv6]          |

IPv6 Fields:

Destination Address

Copied from the Source Address field of the invoking
packet.

ICMPv6 Fields:

Type 4

Code 0 - erroneous header field encountered

1 - unrecognized Next Header type encountered

2 - unrecognized IPv6 option encountered

Pointer Identifies the octet offset within the
invoking packet where the error was detected.

The pointer will point beyond the end of the ICMPv6
packet if the field in error is beyond what can fit
in the maximum size of an ICMPv6 error message.

Description

If an IPv6 node processing a packet finds a problem with a field in
the IPv6 header or extension headers such that it cannot complete
processing the packet, it MUST discard the packet and SHOULD send an
ICMPv6 Parameter Problem message to the packet's source, indicating
the type and location of the problem.

The pointer identifies the octet of the original packet's header
where the error was detected. For example, an ICMPv6 message with
Type field = 4, Code field = 1, and Pointer field = 40 would indicate
that the IPv6 extension header following the IPv6 header of the
original packet holds an unrecognized Next Header field value.
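
The example works out because the fixed IPv6 header is 40 octets
long, so octet offset 40 is the first octet of the first extension
header, which is that header's Next Header field. A Python sketch of
building such a message follows; the function name
build_parameter_problem is hypothetical, the Checksum field is left
at zero, and the error packet is assumed to carry no extension
headers of its own.

```python
import struct

MIN_IPV6_MTU = 1280     # minimum IPv6 MTU per the base specification
IPV6_HEADER_LEN = 40


def build_parameter_problem(code: int, pointer: int,
                            invoking_packet: bytes) -> bytes:
    """Build a Type 4 message with the Checksum field left at zero."""
    if code not in (0, 1, 2):
        raise ValueError("unknown Parameter Problem code")
    room = MIN_IPV6_MTU - IPV6_HEADER_LEN - 8   # 8 = Type..Pointer
    # Pointer is a 32-bit octet offset into the invoking packet.
    header = struct.pack("!BBHI", 4, code, 0, pointer)
    return header + invoking_packet[:room]


# Unrecognized Next Header value in the first extension header:
example = build_parameter_problem(1, IPV6_HEADER_LEN, bytes(48))
```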

Upper layer notification

A node receiving this ICMPv6 message MUST notify the upper-layer
process.

4. ICMPv6 Informational Messages


4.1 Echo Request Message


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Identifier          |        Sequence Number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Data ...
+-+-+-+-+-

IPv6 Fields:

Destination Address

Any legal IPv6 address.

ICMPv6 Fields:

Type 128

Code 0

Identifier An identifier to aid in matching Echo Replies
to this Echo Request. May be zero.

Sequence Number

A sequence number to aid in matching Echo Replies
to this Echo Request. May be zero.

Data Zero or more octets of arbitrary data.

Description

Every node MUST implement an ICMPv6 Echo responder function that
receives Echo Requests and sends corresponding Echo Replies. A node
SHOULD also implement an application-layer interface for sending Echo
Requests and receiving Echo Replies, for diagnostic purposes.
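
The sending side of such a diagnostic interface might assemble an
Echo Request as in the following Python sketch. The function name
build_echo_request is an assumption, and the Checksum field is left
at zero to be filled in by a separate checksum step.

```python
import struct


def build_echo_request(identifier: int, sequence: int,
                       data: bytes = b"") -> bytes:
    """Build a Type 128 message with the Checksum field left at zero."""
    # Type 128, Code 0, Checksum 0, then the 16-bit Identifier and
    # Sequence Number, followed by zero or more octets of data.
    return struct.pack("!BBHHH", 128, 0, 0, identifier, sequence) + data
```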

Upper layer notification

Echo Request messages MAY be passed to processes receiving ICMP
messages.

4.2 Echo Reply Message


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Identifier          |        Sequence Number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Data ...
+-+-+-+-+-

IPv6 Fields:

Destination Address

Copied from the Source Address field of the invoking
Echo Request packet.

ICMPv6 Fields:

Type 129

Code 0

Identifier The identifier from the invoking Echo Request message.

Sequence Number

The sequence number from the invoking Echo Request
message.

Data The data from the invoking Echo Request message.

Description

Every node MUST implement an ICMPv6 Echo responder function that
receives Echo Requests and sends corresponding Echo Replies. A node
SHOULD also implement an application-layer interface for sending Echo
Requests and receiving Echo Replies, for diagnostic purposes.

The source address of an Echo Reply sent in response to a unicast
Echo Request message MUST be the same as the destination address of
that Echo Request message.

An Echo Reply SHOULD be sent in response to an Echo Request message
sent to an IPv6 multicast address. The source address of the reply
MUST be a unicast address belonging to the interface on which the
multicast Echo Request message was received.

The data received in the ICMPv6 Echo Request message MUST be returned
entirely and unmodified in the ICMPv6 Echo Reply message.
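
A minimal Echo responder along these lines can be sketched in Python.
The name build_echo_reply is hypothetical. The sketch copies the
Identifier, Sequence Number and Data unchanged, leaves the Checksum
field at zero, and does not deal with source address selection, which
is governed by the rules above and in section 2.2.

```python
import struct


def build_echo_reply(request: bytes) -> bytes:
    """Turn an Echo Request (Type 128) into an Echo Reply (Type 129)."""
    if len(request) < 8:
        raise ValueError("Echo Request shorter than 8 octets")
    msg_type, _code, _checksum, identifier, sequence = struct.unpack(
        "!BBHHH", request[:8])
    if msg_type != 128:
        raise ValueError("not an Echo Request")
    # The data is returned entirely and unmodified; the checksum must
    # be recomputed by the caller because Type and addresses differ.
    return struct.pack("!BBHHH", 129, 0, 0, identifier, sequence) + request[8:]
```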

Upper layer notification

Echo Reply messages MUST be passed to the process that originated an
Echo Request message. They may be passed to processes that did not
originate the Echo Request message.

5. Security Considerations


5.1 Authentication and Encryption of ICMP messages


ICMP protocol packet exchanges can be authenticated using the IP
Authentication Header [IPv6-Auth]. A node SHOULD include an
Authentication Header when sending ICMP messages if a security
association for use with the IP Authentication Header exists for the
destination address. The security associations may have been created
through manual configuration or through the operation of some key
management protocol.

Received Authentication Headers in ICMP packets MUST be verified for
correctness and packets with incorrect authentication MUST be ignored
and discarded.

It SHOULD be possible for the system administrator to configure a
node to ignore any ICMP messages that are not authenticated using
either the Authentication Header or Encapsulating Security Payload.
Such a switch SHOULD default to allowing unauthenticated messages.

Confidentiality issues are addressed by the IP Security Architecture
and the IP Encapsulating Security Payload documents [IPv6-SA, IPv6-
ESP].

5.2 ICMP Attacks


ICMP messages may be subject to various attacks. A complete
discussion can be found in the IP Security Architecture [IPv6-SA]. A
brief discussion of such attacks and their prevention is as follows:

1. ICMP messages may be subject to actions intended to cause the
receiver to believe the message came from a different source than the
message originator. The protection against this attack can be
achieved by applying the IPv6 Authentication mechanism [IPv6-Auth]
to the ICMP message.

2. ICMP messages may be subject to actions intended to cause the
message or the reply to it to go to a destination different from the
one the message originator intended. The ICMP checksum calculation
provides a protection mechanism against changes by a malicious
interceptor in the destination and source address of the IP packet
carrying that message, provided the ICMP checksum field is
protected against change by authentication [IPv6-Auth] or
encryption [IPv6-ESP] of the ICMP message.

3. ICMP messages may be subject to changes in the message fields, or
payload. The authentication [IPv6-Auth] or encryption [IPv6-ESP]
of the ICMP message is a protection against such actions.

4. ICMP messages may be used in attempts to perform denial of service
attacks by sending back-to-back erroneous IP packets. An
implementation that correctly follows section 2.4, paragraph (f)
of this specification would be protected by the ICMP error rate
limiting mechanism.

6. References


[IPv6] Deering, S. and R. Hinden, "Internet Protocol, Version
6, (IPv6) Specification", RFC 2460, December 1998.

[IPv6-ADDR] Hinden, R. and S. Deering, "IP Version 6 Addressing
Architecture", RFC 2373, July 1998.

[IPv6-DISC] Narten, T., Nordmark, E. and W. Simpson, "Neighbor
Discovery for IP Version 6 (IPv6)", RFC 2461, December
1998.

[RFC-792] Postel, J., "Internet Control Message Protocol", STD 5,
RFC 792, September 1981.

[RFC-1122] Braden, R., "Requirements for Internet Hosts -
Communication Layers", STD 3, RFC 1122, August 1989.

[PMTU] McCann, J., Deering, S. and J. Mogul, "Path MTU
Discovery for IP version 6", RFC 1981, August 1996.

[RFC-2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.

[IPv6-SA] Kent, S. and R. Atkinson, "Security Architecture for the
Internet Protocol", RFC 2401, November 1998.

[IPv6-Auth] Kent, S. and R. Atkinson, "IP Authentication Header",
RFC 2402, November 1998.

[IPv6-ESP] Kent, S. and R. Atkinson, "IP Encapsulating Security
Payload (ESP)", RFC 2406, November 1998.

7. Acknowledgments


The document is derived from previous ICMP drafts of the SIPP and
IPng working group.

The IPng working group and particularly Robert Elz, Jim Bound, Bill
Simpson, Thomas Narten, Charlie Lynn, Bill Fink, Scott Bradner,
Dimitri Haskin, and Bob Hinden (in chronological order) provided
extensive review information and feedback.

8. Authors' Addresses


Alex Conta
Lucent Technologies Inc.
300 Baker Ave, Suite 100
Concord, MA 01742
USA

Phone: +1 978 287-2842
EMail: aconta@lucent.com

Stephen Deering
Cisco Systems, Inc.
170 West Tasman Drive
San Jose, CA 95134-1706
USA

Phone: +1 408 527-8213
EMail: deering@cisco.com

Appendix A - Changes from RFC 1885


Version 2-02

- Excluded mentioning informational replies from paragraph (f.2) of
section 2.4.
- In "Upper layer notification" sections changed "upper-layer
protocol" and "User Interface" to "process".
- Changed section 5.2, items 2 and 3 to also refer to AH
authentication.
- Removed item 5. from section 5.2 on denial of service attacks.
- Updated phone numbers and Email addresses in the "Authors'
Addresses" section.

Version 2-01

- Replaced all references to "576 octets" as the maximum for an ICMP
message size with "minimum IPv6 MTU" as defined by the base IPv6
specification.
- Removed rate control from informational messages.
- Added requirement that receivers ignore Code value in Packet Too
Big message.
- Removed "Not a Neighbor" (code 2) from destination unreachable
message.
- Fixed typos and updated references.

Version 2-00

- Applied rate control to informational messages
- Removed section 2.4 on Group Management ICMP messages
- Removed references to IGMP in Abstract and Section 1.
- Updated references to other IPv6 documents
- Removed references to RFC-1112 in Abstract, and Section 1, and to
RFC-1191 in section 1, and section 3.2
- Added security section
- Added Appendix A - changes

Full Copyright Statement


Copyright (C) The Internet Society (1998). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


RFC 1981 – Path MTU Discovery for IP version 6

 
Network Working Group                                          J. McCann
Request for Comments: 1981                 Digital Equipment Corporation
Category: Standards Track                                     S. Deering
                                                              Xerox PARC
                                                                J. Mogul
                                           Digital Equipment Corporation
                                                             August 1996

Path MTU Discovery for IP version 6

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Abstract

This document describes Path MTU Discovery for IP version 6. It is
largely derived from RFC 1191, which describes Path MTU Discovery for
IP version 4.

Table of Contents

1. Introduction
2. Terminology
3. Protocol overview
4. Protocol Requirements
5. Implementation Issues
5.1. Layering
5.2. Storing PMTU information
5.3. Purging stale PMTU information
5.4. TCP layer actions
5.5. Issues for other transport protocols
5.6. Management interface
6. Security Considerations
Acknowledgements
Appendix A - Comparison to RFC 1191
References
Authors' Addresses

1. Introduction

When one IPv6 node has a large amount of data to send to another
node, the data is transmitted in a series of IPv6 packets. It is
usually preferable that these packets be of the largest size that can
successfully traverse the path from the source node to the
destination node. This packet size is referred to as the Path MTU
(PMTU), and it is equal to the minimum link MTU of all the links in a
path. IPv6 defines a standard mechanism for a node to discover the
PMTU of an arbitrary path.
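
As a small illustration of the definition above, the PMTU is simply the
smallest link MTU on the path. The function name and the example MTU
values are hypothetical.

```python
def path_mtu(link_mtus):
    """Return the PMTU: the smallest link MTU along the path."""
    if not link_mtus:
        raise ValueError("a path needs at least one link")
    return min(link_mtus)

# Hypothetical path of three links, the middle one being the narrowest.
print(path_mtu([1500, 1400, 1500]))  # prints 1400
```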

IPv6 nodes SHOULD implement Path MTU Discovery in order to discover
and take advantage of paths with PMTU greater than the IPv6 minimum
link MTU [IPv6-SPEC]. A minimal IPv6 implementation (e.g., in a boot
ROM) may choose to omit implementation of Path MTU Discovery.

Nodes not implementing Path MTU Discovery use the IPv6 minimum link
MTU defined in [IPv6-SPEC] as the maximum packet size. In most
cases, this will result in the use of smaller packets than necessary,
because most paths have a PMTU greater than the IPv6 minimum link
MTU. A node sending packets much smaller than the Path MTU allows is
wasting network resources and probably getting suboptimal throughput.

2. Terminology

node - a device that implements IPv6.

router - a node that forwards IPv6 packets not explicitly
addressed to itself.

host - any node that is not a router.

upper layer - a protocol layer immediately above IPv6. Examples are
transport protocols such as TCP and UDP, control
protocols such as ICMP, routing protocols such as OSPF,
and internet or lower-layer protocols being "tunneled"
over (i.e., encapsulated in) IPv6 such as IPX,
AppleTalk, or IPv6 itself.

link - a communication facility or medium over which nodes can
communicate at the link layer, i.e., the layer
immediately below IPv6. Examples are Ethernets (simple
or bridged); PPP links; X.25, Frame Relay, or ATM
networks; and internet (or higher) layer "tunnels",
such as tunnels over IPv4 or IPv6 itself.

interface - a node's attachment to a link.

address - an IPv6-layer identifier for an interface or a set of
interfaces.

packet - an IPv6 header plus payload.

link MTU - the maximum transmission unit, i.e., maximum packet
size in octets, that can be conveyed in one piece over
a link.

path - the set of links traversed by a packet between a source
node and a destination node

path MTU - the minimum link MTU of all the links in a path between
a source node and a destination node.

PMTU - path MTU

Path MTU
Discovery - process by which a node learns the PMTU of a path

flow - a sequence of packets sent from a particular source
to a particular (unicast or multicast) destination for
which the source desires special handling by the
intervening routers.

flow id - a combination of a source address and a non-zero
flow label.

3. Protocol overview

This memo describes a technique to dynamically discover the PMTU of a
path. The basic idea is that a source node initially assumes that
the PMTU of a path is the (known) MTU of the first hop in the path.
If any of the packets sent on that path are too large to be forwarded
by some node along the path, that node will discard them and return
ICMPv6 Packet Too Big messages [ICMPv6]. Upon receipt of such a
message, the source node reduces its assumed PMTU for the path based
on the MTU of the constricting hop as reported in the Packet Too Big
message.

The Path MTU Discovery process ends when the node's estimate of the
PMTU is less than or equal to the actual PMTU. Note that several
iterations of the packet-sent/Packet-Too-Big-message-received cycle
may occur before the Path MTU Discovery process ends, as there may be
links with smaller MTUs further along the path.
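
The cycle described above can be sketched as a small simulation. It
assumes the whole path is known as a list of link MTUs (a real source
node does not have this knowledge) and ignores the IPv6 minimum link
MTU; the function name is hypothetical.

```python
def discover_pmtu(link_mtus):
    """Simulate discovery over a path given as a list of link MTUs.

    Returns (final_estimate, number_of_packet_too_big_messages).
    """
    estimate = link_mtus[0]          # start from the first-hop MTU
    too_big_messages = 0
    while True:
        # Find the first link that cannot carry a packet of this size.
        blocker = next((m for m in link_mtus if m < estimate), None)
        if blocker is None:
            return estimate, too_big_messages   # packet reached the destination
        too_big_messages += 1        # the node before that link reports its MTU
        estimate = blocker

print(discover_pmtu([1500, 1400, 1300, 1500]))  # prints (1300, 2)
```

In this example two iterations are needed because a link with a smaller
MTU lies further along the path than the first constricting hop.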

Alternatively, the node may elect to end the discovery process by
ceasing to send packets larger than the IPv6 minimum link MTU.

The PMTU of a path may change over time, due to changes in the
routing topology. Reductions of the PMTU are detected by Packet Too
Big messages. To detect increases in a path's PMTU, a node
periodically increases its assumed PMTU. This will almost always
result in packets being discarded and Packet Too Big messages being
generated, because in most cases the PMTU of the path will not have
changed. Therefore, attempts to detect increases in a path's PMTU
should be done infrequently.

Path MTU Discovery supports multicast as well as unicast
destinations. In the case of a multicast destination, copies of a
packet may traverse many different paths to many different nodes.
Each path may have a different PMTU, and a single multicast packet
may result in multiple Packet Too Big messages, each reporting a
different next-hop MTU. The minimum PMTU value across the set of
paths in use determines the size of subsequent packets sent to the
multicast destination.

Note that Path MTU Discovery must be performed even in cases where a
node "thinks" a destination is attached to the same link as itself.
In a situation such as when a neighboring router acts as proxy [ND]
for some destination, the destination can appear to be directly
connected but is in fact more than one hop away.

4. Protocol Requirements

As discussed in section 1, IPv6 nodes are not required to implement
Path MTU Discovery. The requirements in this section apply only to
those implementations that include Path MTU Discovery.

When a node receives a Packet Too Big message, it MUST reduce its
estimate of the PMTU for the relevant path, based on the value of the
MTU field in the message. The precise behavior of a node in this
circumstance is not specified, since different applications may have
different requirements, and since different implementation
architectures may favor different strategies.

After receiving a Packet Too Big message, a node MUST attempt to
avoid eliciting more such messages in the near future. The node MUST
reduce the size of the packets it is sending along the path. Using a
PMTU estimate larger than the IPv6 minimum link MTU may continue to
elicit Packet Too Big messages. Since each of these messages (and
the dropped packets they respond to) consumes network resources, the
node MUST force the Path MTU Discovery process to end.

Nodes using Path MTU Discovery MUST detect decreases in PMTU as fast
as possible. Nodes MAY detect increases in PMTU, but because doing
so requires sending packets larger than the current estimated PMTU,
and because the likelihood is that the PMTU will not have increased,
this MUST be done at infrequent intervals. An attempt to detect an
increase (by sending a packet larger than the current estimate) MUST
NOT be done less than 5 minutes after a Packet Too Big message has
been received for the given path. The recommended setting for this
timer is twice its minimum value (10 minutes).
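
The timing rule above might be checked as follows. This is a sketch
only; the function and constant names are hypothetical, and times are
assumed to be in seconds.

```python
MIN_PROBE_DELAY = 5 * 60           # seconds; the floor stated in the text
RECOMMENDED_PROBE_DELAY = 10 * 60  # seconds; twice the minimum

def may_probe_for_increase(last_too_big_time, now,
                           delay=RECOMMENDED_PROBE_DELAY):
    """Return True if a packet larger than the current estimate may be sent."""
    if delay < MIN_PROBE_DELAY:
        raise ValueError("delay must be at least 5 minutes")
    return now - last_too_big_time >= delay
```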

A node MUST NOT reduce its estimate of the Path MTU below the IPv6
minimum link MTU.

Note: A node may receive a Packet Too Big message reporting a
next-hop MTU that is less than the IPv6 minimum link MTU. In that
case, the node is not required to reduce the size of subsequent
packets sent on the path to less than the IPv6 minimum link MTU,
but rather must include a Fragment header in those packets
[IPv6-SPEC].

A node MUST NOT increase its estimate of the Path MTU in response to
the contents of a Packet Too Big message. A message purporting to
announce an increase in the Path MTU might be a stale packet that has
been floating around in the network, a false packet injected as part
of a denial-of-service attack, or the result of having multiple paths
to the destination, each with a different PMTU.
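
The requirements of this section can be combined into one update rule.
The sketch below is not a specification of node behavior, which the
text deliberately leaves open. The function name is hypothetical, and
the default minimum of 1280 octets is an assumption taken from the
later base specification (RFC 2460); [IPv6-SPEC] as cited here defines
its own value, so the minimum is a parameter.

```python
IPV6_MIN_LINK_MTU = 1280  # assumption: value from RFC 2460; older specs differ

def handle_packet_too_big(current_pmtu, reported_mtu,
                          min_mtu=IPV6_MIN_LINK_MTU):
    """Return (new_pmtu, needs_fragment_header) after a Packet Too Big message."""
    if reported_mtu >= current_pmtu:
        # Never increase the estimate based on a Packet Too Big message.
        return current_pmtu, False
    if reported_mtu < min_mtu:
        # Do not go below the minimum; add a Fragment header instead.
        return min_mtu, True
    return reported_mtu, False
```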

5. Implementation Issues

This section discusses a number of issues related to the
implementation of Path MTU Discovery. This is not a specification,
but rather a set of notes provided as an aid for implementors.

The issues include:

- What layer or layers implement Path MTU Discovery?

- How is the PMTU information cached?

- How is stale PMTU information removed?

- What must transport and higher layers do?

5.1. Layering

In the IP architecture, the choice of what size packet to send is
made by a protocol at a layer above IP. This memo refers to such a
protocol as a "packetization protocol". Packetization protocols are
usually transport protocols (for example, TCP) but can also be
higher-layer protocols (for example, protocols built on top of UDP).

Implementing Path MTU Discovery in the packetization layers
simplifies some of the inter-layer issues, but has several drawbacks:
the implementation may have to be redone for each packetization
protocol, it becomes hard to share PMTU information between different
packetization layers, and the connection-oriented state maintained by
some packetization layers may not easily extend to save PMTU
information for long periods.

It is therefore suggested that the IP layer store PMTU information
and that the ICMP layer process received Packet Too Big messages.
The packetization layers may respond to changes in the PMTU, by
changing the size of the messages they send. To support this
layering, packetization layers require a way to learn of changes in
the value of MMS_S, the "maximum send transport-message size". The
MMS_S is derived from the Path MTU by subtracting the size of the
IPv6 header plus space reserved by the IP layer for additional
headers (if any).
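
As a worked example of this subtraction, assuming the fixed 40-octet
IPv6 header (the function and parameter names are hypothetical):

```python
IPV6_HEADER_SIZE = 40  # octets; the fixed IPv6 header

def mms_s(pmtu, reserved_for_extra_headers=0):
    """Maximum send transport-message size for a given Path MTU."""
    size = pmtu - IPV6_HEADER_SIZE - reserved_for_extra_headers
    if size < 0:
        raise ValueError("headers do not fit in the Path MTU")
    return size

print(mms_s(1500))      # prints 1460
print(mms_s(1500, 8))   # prints 1452
```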

It is possible that a packetization layer, perhaps a UDP application
outside the kernel, is unable to change the size of messages it
sends. This may result in a packet size that exceeds the Path MTU.
To accommodate such situations, IPv6 defines a mechanism that allows
large payloads to be divided into fragments, with each fragment sent
in a separate packet (see [IPv6-SPEC] section "Fragment Header").
However, packetization layers are encouraged to avoid sending
messages that will require fragmentation (for the case against
fragmentation, see [FRAG]).

5.2. Storing PMTU information

Ideally, a PMTU value should be associated with a specific path
traversed by packets exchanged between the source and destination
nodes. However, in most cases a node will not have enough
information to completely and accurately identify such a path.
Rather, a node must associate a PMTU value with some local
representation of a path. It is left to the implementation to select
the local representation of a path.

In the case of a multicast destination address, copies of a packet
may traverse many different paths to reach many different nodes. The
local representation of the "path" to a multicast destination must in
fact represent a potentially large set of paths.

Minimally, an implementation could maintain a single PMTU value to be
used for all packets originated from the node. This PMTU value would
be the minimum PMTU learned across the set of all paths in use by the
node. This approach is likely to result in the use of smaller
packets than is necessary for many paths.

An implementation could use the destination address as the local
representation of a path. The PMTU value associated with a
destination would be the minimum PMTU learned across the set of all
paths in use to that destination. The set of paths in use to a
particular destination is expected to be small, in many cases
consisting of a single path. This approach will result in the use of
optimally sized packets on a per-destination basis. This approach
integrates nicely with the conceptual model of a host as described in
[ND]: a PMTU value could be stored with the corresponding entry in
the destination cache.

If flows [IPv6-SPEC] are in use, an implementation could use the flow
id as the local representation of a path. Packets sent to a
particular destination but belonging to different flows may use
different paths, with the choice of path depending on the flow id.
This approach will result in the use of optimally sized packets on a
per-flow basis, providing finer granularity than PMTU values
maintained on a per-destination basis.
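
The per-destination and per-flow representations above could be
sketched with a single cache whose key depends on whether a flow label
is in use. All names are hypothetical, addresses are plain strings for
brevity, and the IPv6 minimum link MTU floor is left out.

```python
class PmtuCache:
    """PMTU estimates keyed by a local representation of the path."""

    def __init__(self, first_hop_mtu):
        self.first_hop_mtu = first_hop_mtu
        self.entries = {}

    @staticmethod
    def path_key(source, destination, flow_label=0):
        # A non-zero flow label selects per-flow granularity.
        if flow_label != 0:
            return ("flow", source, flow_label)
        return ("dest", destination)

    def lookup(self, key):
        # Unknown paths start at the first-hop link MTU.
        return self.entries.get(key, self.first_hop_mtu)

    def learn(self, key, reported_mtu):
        # Keep the minimum PMTU learned for this key.
        self.entries[key] = min(self.lookup(key), reported_mtu)
        return self.entries[key]
```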

For source routed packets (i.e., packets containing an IPv6 Routing
header [IPv6-SPEC]), the source route may further qualify the local
representation of a path. In particular, a packet containing a type
0 Routing header in which all bits in the Strict/Loose Bit Map are
equal to 1 contains a complete path specification. An implementation
could use source route information in the local representation of a
path.

Note: Some paths may be further distinguished by different
security classifications. The details of such classifications are
beyond the scope of this memo.

Initially, the PMTU value for a path is assumed to be the (known) MTU
of the first-hop link.

When a Packet Too Big message is received, the node determines which
path the message applies to based on the contents of the Packet Too
Big message. For example, if the destination address is used as the
local representation of a path, the destination address from the
original packet would be used to determine which path the message
applies to.

Note: if the original packet contained a Routing header, the
Routing header should be used to determine the location of the
destination address within the original packet. If Segments Left
is equal to zero, the destination address is in the Destination
Address field in the IPv6 header. If Segments Left is greater
than zero, the destination address is the last address
(Address[n]) in the Routing header.
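
The rule in this note can be written as a short selection function. It
assumes the original packet has already been parsed; the function and
parameter names are hypothetical.

```python
def final_destination(ipv6_destination, segments_left=0, routing_addresses=()):
    """Pick the final destination of the original packet.

    routing_addresses is the Address[1..n] list from the Routing header,
    or empty when the packet carried no Routing header.
    """
    if routing_addresses and segments_left > 0:
        return routing_addresses[-1]   # Address[n]
    return ipv6_destination
```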

The node then uses the value in the MTU field in the Packet Too Big
message as a tentative PMTU value, and compares the tentative PMTU to
the existing PMTU. If the tentative PMTU is less than the existing
PMTU estimate, the tentative PMTU replaces the existing PMTU as the
PMTU value for the path.

The packetization layers must be notified about decreases in the
PMTU. Any packetization layer instance (for example, a TCP
connection) that is actively using the path must be notified if the
PMTU estimate is decreased.

Note: even if the Packet Too Big message contains an Original
Packet Header that refers to a UDP packet, the TCP layer must be
notified if any of its connections use the given path.

Also, the instance that sent the packet that elicited the Packet Too
Big message should be notified that its packet has been dropped, even
if the PMTU estimate has not changed, so that it may retransmit the
dropped data.

Note: An implementation can avoid the use of an asynchronous
notification mechanism for PMTU decreases by postponing
notification until the next attempt to send a packet larger than
the PMTU estimate. In this approach, when an attempt is made to
SEND a packet that is larger than the PMTU estimate, the SEND
function should fail and return a suitable error indication. This
approach may be more suitable to a connectionless packetization
layer (such as one using UDP), which (in some implementations) may
be hard to "notify" from the ICMP layer. In this case, the normal
timeout-based retransmission mechanisms would be used to recover
from the dropped packets.
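
A minimal sketch of the postponed-notification approach follows. The
function name, the `transmit` callable, and the choice of EMSGSIZE as
the "suitable error indication" are assumptions made for illustration.

```python
import errno

def send_packet(packet, pmtu_estimate, transmit):
    """Refuse to send a packet larger than the current PMTU estimate.

    `transmit` is a hypothetical callable that puts the packet on the wire.
    """
    if len(packet) > pmtu_estimate:
        raise OSError(errno.EMSGSIZE, "packet larger than PMTU estimate")
    transmit(packet)
    return len(packet)
```

The caller learns of a decreased PMTU only when its next oversized send
fails, so no asynchronous path from the ICMP layer is needed.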

It is important to understand that the notification of the
packetization layer instances using the path about the change in the
PMTU is distinct from the notification of a specific instance that a
packet has been dropped. The latter should be done as soon as
practical (i.e., asynchronously from the point of view of the
packetization layer instance), while the former may be delayed until
a packetization layer instance wants to create a packet.
Retransmission should be done only for those packets that are
known to be dropped, as indicated by a Packet Too Big message.

5.3. Purging stale PMTU information

Internetwork topology is dynamic; routes change over time. While the
local representation of a path may remain constant, the actual
path(s) in use may change. Thus, PMTU information cached by a node
can become stale.

If the stale PMTU value is too large, this will be discovered almost
immediately once a large enough packet is sent on the path. No such
mechanism exists for realizing that a stale PMTU value is too small,
so an implementation should "age" cached values. When a PMTU value
has not been decreased for a while (on the order of 10 minutes), the
PMTU estimate should be set to the MTU of the first-hop link, and the
packetization layers should be notified of the change. This will
cause the complete Path MTU Discovery process to take place again.

Note: an implementation should provide a means for changing the
timeout duration, including setting it to "infinity". For
example, nodes attached to an FDDI link which is then attached to
the rest of the Internet via a small MTU serial line are never
going to discover a new non-local PMTU, so they should not have to
put up with dropped packets every 10 minutes.

An upper layer must not retransmit data in response to an increase in
the PMTU estimate, since this increase never comes in response to an
indication of a dropped packet.

One approach to implementing PMTU aging is to associate a timestamp
field with a PMTU value. This field is initialized to a "reserved"
value, indicating that the PMTU is equal to the MTU of the first hop
link. Whenever the PMTU is decreased in response to a Packet Too Big
message, the timestamp is set to the current time.

Once a minute, a timer-driven procedure runs through all cached PMTU
values, and for each PMTU whose timestamp is not "reserved" and is
older than the timeout interval:

- The PMTU estimate is set to the MTU of the first hop link.

- The timestamp is set to the "reserved" value.

- Packetization layers using this path are notified of the increase.
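
The timer-driven procedure above might look like the following sketch.
The data layout (a dictionary of two-element lists), the use of None as
the "reserved" value, and the `notify` callback are all assumptions.

```python
RESERVED = None  # timestamp value meaning "PMTU equals the first-hop MTU"

def age_pmtu_entries(entries, now, timeout, first_hop_mtu, notify):
    """Reset entries whose last decrease is older than `timeout` seconds.

    entries maps a path key to a [pmtu, timestamp] list; notify(key, pmtu)
    is a hypothetical callback into the packetization layers.
    """
    for key, entry in entries.items():
        pmtu, timestamp = entry
        if timestamp is RESERVED or now - timestamp <= timeout:
            continue
        entry[0] = first_hop_mtu     # restart discovery from the first hop
        entry[1] = RESERVED
        notify(key, first_hop_mtu)   # report the increase
```

Passing float("inf") as the timeout means no entry is ever reset, which
corresponds to the "infinity" setting mentioned earlier in this section.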

5.4. TCP layer actions

The TCP layer must track the PMTU for the path(s) in use by a
connection; it should not send segments that would result in packets
larger than the PMTU. A simple implementation could ask the IP layer
for this value each time it created a new segment, but this could be
inefficient. Moreover, TCP implementations that follow the "slow-
start" congestion-avoidance algorithm [CONG] typically calculate and
cache several other values derived from the PMTU. It may be simpler
to receive asynchronous notification when the PMTU changes, so that
these variables may be updated.

A TCP implementation must also store the MSS value received from its
peer, and must not send any segment larger than this MSS, regardless
of the PMTU. In 4.xBSD-derived implementations, this may require
adding an additional field to the TCP state record.

The value sent in the TCP MSS option is independent of the PMTU.
This MSS option value is used by the other end of the connection,
which may be using an unrelated PMTU value. See [IPv6-SPEC] sections
"Packet Size Issues" and "Maximum Upper-Layer Payload Size" for
information on selecting a value for the TCP MSS option.

When a Packet Too Big message is received, it implies that a packet
was dropped by the node that sent the ICMP message. It is sufficient
to treat this as any other dropped segment, and wait until the
retransmission timer expires to cause retransmission of the segment.
If the Path MTU Discovery process requires several steps to find the
PMTU of the full path, this could delay the connection by many
round-trip times.

Alternatively, the retransmission could be done in immediate response
to a notification that the Path MTU has changed, but only for the
specific connection specified by the Packet Too Big message. The
packet size used in the retransmission should be no larger than the
new PMTU.

Note: A packetization layer must not retransmit in response to
every Packet Too Big message, since a burst of several oversized
segments will give rise to several such messages and hence several
retransmissions of the same data. If the new estimated PMTU is
still wrong, the process repeats, and there is an exponential
growth in the number of superfluous segments sent.

This means that the TCP layer must be able to recognize when a
Packet Too Big notification actually decreases the PMTU that it
has already used to send a packet on the given connection, and
should ignore any other notifications.
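
One possible reading of this note is sketched below: each connection
remembers the PMTU it has been sending with, and only a notification
that lowers that value triggers a retransmission. The class and method
names are hypothetical.

```python
class ConnectionPmtu:
    """Per-connection record of the PMTU used to size outgoing packets."""

    def __init__(self, pmtu):
        self.pmtu_in_use = pmtu

    def on_packet_too_big(self, new_pmtu):
        """Return True if the caller should retransmit at the new size."""
        if new_pmtu < self.pmtu_in_use:
            self.pmtu_in_use = new_pmtu   # first message of a burst
            return True
        return False                      # duplicate or stale notification
```

A burst of three oversized segments that elicits three identical
messages then leads to a single retransmission rather than three.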

Many TCP implementations incorporate "congestion avoidance" and
"slow-start" algorithms to improve performance [CONG]. Unlike a
retransmission caused by a TCP retransmission timeout, a
retransmission caused by a Packet Too Big message should not change
the congestion window. It should, however, trigger the slow-start
mechanism (i.e., only one segment should be retransmitted until
acknowledgements begin to arrive again).

TCP performance can be reduced if the sender's maximum window size is
not an exact multiple of the segment size in use (this is not the
congestion window size, which is always a multiple of the segment
size). In many systems (such as those derived from 4.2BSD), the
segment size is often set to 1024 octets, and the maximum window size
(the "send space") is usually a multiple of 1024 octets, so the
proper relationship holds by default. If Path MTU Discovery is used,
however, the segment size may not be a submultiple of the send space,
and it may change during a connection; this means that the TCP layer
may need to change the transmission window size when Path MTU
Discovery changes the PMTU value. The maximum window size should be
set to the greatest multiple of the segment size that is less than or
equal to the sender's buffer space size.
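
The last rule is a simple rounding-down computation. The function name
and the example sizes are hypothetical.

```python
def max_window(buffer_size, segment_size):
    """Greatest multiple of segment_size that is <= buffer_size."""
    if segment_size <= 0:
        raise ValueError("segment size must be positive")
    return (buffer_size // segment_size) * segment_size

print(max_window(16384, 1024))  # prints 16384
print(max_window(16384, 1440))  # prints 15840
```

With a 16384-octet send space, a 1024-octet segment size already
divides evenly, while a 1440-octet segment size reduces the usable
window to 11 segments.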

5.5. Issues for other transport protocols

Some transport protocols (such as ISO TP4 [ISOTP]) are not allowed to
repacketize when doing a retransmission. That is, once an attempt is
made to transmit a segment of a certain size, the transport cannot
split the contents of the segment into smaller segments for
retransmission. In such a case, the original segment can be
fragmented by the IP layer during retransmission. Subsequent
segments, when transmitted for the first time, should be no larger
than allowed by the Path MTU.

The Sun Network File System (NFS) uses a Remote Procedure Call (RPC)
protocol [RPC] that, when used over UDP, in many cases will generate
payloads that must be fragmented even for the first-hop link. This
might improve performance in certain cases, but it is known to cause
reliability and performance problems, especially when the client and
server are separated by routers.

It is recommended that NFS implementations use Path MTU Discovery
whenever routers are involved. Most NFS implementations allow the
RPC datagram size to be changed at mount-time (indirectly, by
changing the effective file system block size), but might require
some modification to support changes later on.

Also, since a single NFS operation cannot be split across several UDP
datagrams, certain operations (primarily, those operating on file
names and directories) require a minimum payload size that if sent in
a single packet would exceed the PMTU. NFS implementations should
not reduce the payload size below this threshold, even if Path MTU
Discovery suggests a lower value. In this case the payload will be
fragmented by the IP layer.

5.6. Management interface

It is suggested that an implementation provide a way for a system
utility program to:

- Specify that Path MTU Discovery not be done on a given path.

- Change the PMTU value associated with a given path.

The former can be accomplished by associating a flag with the path;
when a packet is sent on a path with this flag set, the IP layer does
not send packets larger than the IPv6 minimum link MTU.

These features might be used to work around an anomalous situation,
or by a routing protocol implementation that is able to obtain Path
MTU values.

The implementation should also provide a way to change the timeout
period for aging stale PMTU information.

6. Security Considerations

This Path MTU Discovery mechanism makes possible two denial-of-
service attacks, both based on a malicious party sending false Packet
Too Big messages to a node.

In the first attack, the false message indicates a PMTU much smaller
than reality. This should not entirely stop data flow, since the
victim node should never set its PMTU estimate below the IPv6 minimum
link MTU. It will, however, result in suboptimal performance.

In the second attack, the false message indicates a PMTU larger than
reality. If believed, this could cause temporary blockage as the
victim sends packets that will be dropped by some router. Within one
round-trip time, the node would discover its mistake (receiving
Packet Too Big messages from that router), but frequent repetition of
this attack could cause lots of packets to be dropped. A node,
however, should never raise its estimate of the PMTU based on a
Packet Too Big message, so should not be vulnerable to this attack.

A malicious party could also cause problems if it could stop a victim
from receiving legitimate Packet Too Big messages, but in this case
there are simpler denial-of-service attacks available.

Acknowledgements

We would like to acknowledge the authors of and contributors to
[RFC-1191], from which the majority of this document was derived. We
would also like to acknowledge the members of the IPng working group
for their careful review and constructive criticisms.

Appendix A - Comparison to RFC 1191

This document is based in large part on RFC 1191, which describes
Path MTU Discovery for IPv4. Certain portions of RFC 1191 were not
needed in this document:

router specification - Packet Too Big messages and corresponding
router behavior are defined in [ICMPv6]

Don't Fragment bit - there is no DF bit in IPv6 packets

TCP MSS discussion - selecting a value to send in the TCP MSS
option is discussed in [IPv6-SPEC]

old-style messages - all Packet Too Big messages report the
MTU of the constricting link

MTU plateau tables - not needed because there are no old-style
messages

References

[CONG] Van Jacobson. Congestion Avoidance and Control. Proc.
SIGCOMM '88 Symposium on Communications Architectures and
Protocols, pages 314-329. Stanford, CA, August, 1988.

[FRAG] C. Kent and J. Mogul. Fragmentation Considered Harmful.
In Proc. SIGCOMM '87 Workshop on Frontiers in Computer
Communications Technology. August, 1987.

[ICMPv6] Conta, A., and S. Deering, "Internet Control Message
Protocol (ICMPv6) for the Internet Protocol Version 6
(IPv6) Specification", RFC 1885, December 1995.

[IPv6-SPEC] Deering, S., and R. Hinden, "Internet Protocol, Version
6 (IPv6) Specification", RFC 1883, December 1995.

[ISOTP] ISO. ISO Transport Protocol Specification: ISO DP 8073.
RFC 905, SRI Network Information Center, April, 1984.

[ND] Narten, T., Nordmark, E., and W. Simpson, "Neighbor
Discovery for IP Version 6 (IPv6)", Work in Progress.

[RFC-1191] Mogul, J., and S. Deering, "Path MTU Discovery",
RFC 1191, November 1990.

[RPC] Sun Microsystems, Inc., "RPC: Remote Procedure Call
Protocol", RFC 1057, SRI Network Information Center,
June, 1988.

Authors' Addresses

Jack McCann
Digital Equipment Corporation
110 Spitbrook Road, ZKO3-3/U14
Nashua, NH 03062
Phone: +1 603 881 2608
Fax:   +1 603 881 0120
Email: mccann@zk3.dec.com

Stephen E. Deering
Xerox Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, CA 94304
Phone: +1 415 812 4839
Fax:   +1 415 812 4471
EMail: deering@parc.xerox.com

Jeffrey Mogul
Digital Equipment Corporation Western Research Laboratory
250 University Avenue
Palo Alto, CA 94301
Phone: +1 415 617 3304
EMail: mogul@pa.dec.com


RFC 1661 – The Point-to-Point Protocol (PPP)

 
Network Working Group                                 W. Simpson, Editor
Request for Comments: 1661                                    Daydreamer
STD: 51                                                        July 1994
Obsoletes: 1548
Category: Standards Track

The Point-to-Point Protocol (PPP)

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Abstract

The Point-to-Point Protocol (PPP) provides a standard method for
transporting multi-protocol datagrams over point-to-point links. PPP
is comprised of three main components:

1. A method for encapsulating multi-protocol datagrams.

2. A Link Control Protocol (LCP) for establishing, configuring,
and testing the data-link connection.

3. A family of Network Control Protocols (NCPs) for establishing
and configuring different network-layer protocols.

This document defines the PPP organization and methodology, and the
PPP encapsulation, together with an extensible option negotiation
mechanism which is able to negotiate a rich assortment of
configuration parameters and provides additional management
functions. The PPP Link Control Protocol (LCP) is described in terms
of this mechanism.

Table of Contents

1. Introduction
   1.1 Specification of Requirements
   1.2 Terminology

2. PPP Encapsulation

3. PPP Link Operation
   3.1 Overview
   3.2 Phase Diagram
   3.3 Link Dead (physical-layer not ready)
   3.4 Link Establishment Phase
   3.5 Authentication Phase
   3.6 Network-Layer Protocol Phase
   3.7 Link Termination Phase

4. The Option Negotiation Automaton
   4.1 State Transition Table
   4.2 States
   4.3 Events
   4.4 Actions
   4.5 Loop Avoidance
   4.6 Counters and Timers

5. LCP Packet Formats
   5.1 Configure-Request
   5.2 Configure-Ack
   5.3 Configure-Nak
   5.4 Configure-Reject
   5.5 Terminate-Request and Terminate-Ack
   5.6 Code-Reject
   5.7 Protocol-Reject
   5.8 Echo-Request and Echo-Reply
   5.9 Discard-Request

6. LCP Configuration Options
   6.1 Maximum-Receive-Unit (MRU)
   6.2 Authentication-Protocol
   6.3 Quality-Protocol
   6.4 Magic-Number
   6.5 Protocol-Field-Compression (PFC)
   6.6 Address-and-Control-Field-Compression (ACFC)

SECURITY CONSIDERATIONS
REFERENCES
ACKNOWLEDGEMENTS
CHAIR'S ADDRESS
EDITOR'S ADDRESS

1. Introduction

The Point-to-Point Protocol is designed for simple links which
transport packets between two peers. These links provide full-duplex
simultaneous bi-directional operation, and are assumed to deliver
packets in order. It is intended that PPP provide a common solution
for easy connection of a wide variety of hosts, bridges and routers
[1].

Encapsulation

The PPP encapsulation provides for multiplexing of different
network-layer protocols simultaneously over the same link. The
PPP encapsulation has been carefully designed to retain
compatibility with most commonly used supporting hardware.

Only 8 additional octets are necessary to form the encapsulation
when used within the default HDLC-like framing. In environments
where bandwidth is at a premium, the encapsulation and framing may
be shortened to 2 or 4 octets.

To support high speed implementations, the default encapsulation
uses only simple fields, only one of which needs to be examined
for demultiplexing. The default header and information fields
fall on 32-bit boundaries, and the trailer may be padded to an
arbitrary boundary.

Link Control Protocol

In order to be sufficiently versatile to be portable to a wide
variety of environments, PPP provides a Link Control Protocol
(LCP). The LCP is used to automatically agree upon the
encapsulation format options, handle varying limits on sizes of
packets, detect a looped-back link and other common
misconfiguration errors, and terminate the link. Other optional
facilities provided are authentication of the identity of its peer
on the link, and determination when a link is functioning properly
and when it is failing.

Network Control Protocols

Point-to-Point links tend to exacerbate many problems with the
current family of network protocols. For instance, assignment and
management of IP addresses, which is a problem even in LAN
environments, is especially difficult over circuit-switched
point-to-point links (such as dial-up modem servers). These
problems are handled by a family of Network Control Protocols
(NCPs), which each manage the specific needs required by their
respective network-layer protocols. These NCPs are defined in
companion documents.

Configuration

It is intended that PPP links be easy to configure. By design,
the standard defaults handle all common configurations. The
implementor can specify improvements to the default configuration,
which are automatically communicated to the peer without operator
intervention. Finally, the operator may explicitly configure
options for the link which enable the link to operate in
environments where it would otherwise be impossible.

This self-configuration is implemented through an extensible
option negotiation mechanism, wherein each end of the link
describes to the other its capabilities and requirements.
Although the option negotiation mechanism described in this
document is specified in terms of the Link Control Protocol (LCP),
the same facilities are designed to be used by other control
protocols, especially the family of NCPs.

1.1. Specification of Requirements

In this document, several words are used to signify the requirements
of the specification. These words are often capitalized.

MUST      This word, or the adjective "required", means that the
          definition is an absolute requirement of the specification.

MUST NOT  This phrase means that the definition is an absolute
          prohibition of the specification.

SHOULD    This word, or the adjective "recommended", means that there
          may exist valid reasons in particular circumstances to
          ignore this item, but the full implications must be
          understood and carefully weighed before choosing a
          different course.

MAY       This word, or the adjective "optional", means that this
          item is one of an allowed set of alternatives. An
          implementation which does not include this option MUST be
          prepared to interoperate with another implementation which
          does include the option.

1.2. Terminology

This document frequently uses the following terms:

datagram  The unit of transmission in the network layer (such as IP).
          A datagram may be encapsulated in one or more packets
          passed to the data link layer.

frame     The unit of transmission at the data link layer. A frame
          may include a header and/or a trailer, along with some
          number of units of data.

packet    The basic unit of encapsulation, which is passed across the
          interface between the network layer and the data link
          layer. A packet is usually mapped to a frame; the
          exceptions are when data link layer fragmentation is being
          performed, or when multiple packets are incorporated into a
          single frame.

peer      The other end of the point-to-point link.

silently discard
          The implementation discards the packet without further
          processing. The implementation SHOULD provide the
          capability of logging the error, including the contents of
          the silently discarded packet, and SHOULD record the event
          in a statistics counter.

2. PPP Encapsulation

The PPP encapsulation is used to disambiguate multiprotocol
datagrams. This encapsulation requires framing to indicate the
beginning and end of the encapsulation. Methods of providing framing
are specified in companion documents.

A summary of the PPP encapsulation is shown below. The fields are
transmitted from left to right.

+----------+-------------+---------+
| Protocol | Information | Padding |
| 8/16 bits|      *      |    *    |
+----------+-------------+---------+

Protocol Field

The Protocol field is one or two octets, and its value identifies
the datagram encapsulated in the Information field of the packet.
The field is transmitted and received most significant octet
first.

The structure of this field is consistent with the ISO 3309
extension mechanism for address fields. All Protocols MUST be
odd; the least significant bit of the least significant octet MUST
equal "1". Also, all Protocols MUST be assigned such that the
least significant bit of the most significant octet equals "0".
Frames received which don't comply with these rules MUST be
treated as having an unrecognized Protocol.

Protocol field values in the "0***" to "3***" range identify the
network-layer protocol of specific packets, and values in the
"8***" to "b***" range identify packets belonging to the
associated Network Control Protocols (NCPs), if any.

Protocol field values in the "4***" to "7***" range are used for
protocols with low volume traffic which have no associated NCP.
Protocol field values in the "c***" to "f***" range identify
packets as link-layer Control Protocols (such as LCP).
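The bit rules and value ranges described above can be checked mechanically. The following is a minimal sketch for 16-bit Protocol values; the function names and the range labels are hypothetical and chosen only for illustration.

```python
def is_valid_protocol(value: int) -> bool:
    """Check the two bit rules for a 16-bit PPP Protocol field value."""
    if not 0 <= value <= 0xFFFF:
        return False
    # Least significant bit of the least significant octet must be 1.
    low_octet_is_odd = (value & 0x0001) == 0x0001
    # Least significant bit of the most significant octet must be 0.
    high_octet_is_even = (value & 0x0100) == 0x0000
    return low_octet_is_odd and high_octet_is_even


def protocol_range(value: int) -> str:
    """Classify a 16-bit Protocol value by its leading hex digit."""
    top_digit = (value >> 12) & 0xF
    if top_digit <= 0x3:
        return "network-layer protocol"
    if top_digit <= 0x7:
        return "low-volume protocol without NCP"
    if top_digit <= 0xB:
        return "network control protocol"
    return "link-layer control protocol"
```

For example, the value c021 passes both bit rules and falls in the "c***" to "f***" range, whereas 0020 (even) and 0121 (odd most significant octet) would be treated as unrecognized Protocols.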

Up-to-date values of the Protocol field are specified in the most
recent "Assigned Numbers" RFC [2]. This specification reserves
the following values:

Value (in hex)  Protocol Name

0001            Padding Protocol
0003 to 001f    reserved (transparency inefficient)
007d            reserved (Control Escape)
00cf            reserved (PPP NLPID)
00ff            reserved (compression inefficient)

8001 to 801f    unused
807d            unused
80cf            unused
80ff            unused

c021            Link Control Protocol
c023            Password Authentication Protocol
c025            Link Quality Report
c223            Challenge Handshake Authentication Protocol

Developers of new protocols MUST obtain a number from the Internet
Assigned Numbers Authority (IANA), at IANA@isi.edu.

Information Field

The Information field is zero or more octets. The Information
field contains the datagram for the protocol specified in the
Protocol field.

The maximum length for the Information field, including Padding,
but not including the Protocol field, is termed the Maximum
Receive Unit (MRU), which defaults to 1500 octets. By
negotiation, consenting PPP implementations may use other values
for the MRU.

Padding

On transmission, the Information field MAY be padded with an
arbitrary number of octets up to the MRU. It is the
responsibility of each protocol to distinguish padding octets from
real information.
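As an illustration of the encapsulation layout, the sketch below splits an already de-framed packet into its Protocol value and the remaining octets (Information plus any Padding), and enforces the MRU. The function name is hypothetical. Accepting a one-octet Protocol field is an assumption of this sketch: it relies on the ISO 3309 extension rule mentioned above (an odd octet ends the field) and presumes the peers have agreed to use the shorter form.

```python
DEFAULT_MRU = 1500  # default Maximum Receive Unit, in octets


def split_packet(packet: bytes, mru: int = DEFAULT_MRU):
    """Split a de-framed PPP packet into (protocol, information_and_padding)."""
    if not packet:
        raise ValueError("empty packet")
    if packet[0] & 0x01:
        # Odd first octet: the Protocol field is a single octet.
        protocol, body = packet[0], packet[1:]
    else:
        if len(packet) < 2:
            raise ValueError("truncated Protocol field")
        # Two octets, most significant octet first.
        protocol = int.from_bytes(packet[:2], "big")
        body = packet[2:]
    # The MRU covers Information and Padding, but not the Protocol field.
    if len(body) > mru:
        raise ValueError("Information field exceeds the MRU")
    return protocol, body
```

Separating padding octets from real information is left to the protocol named in the Protocol field, as the text above requires, so the sketch returns both together.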

3. PPP Link Operation

3.1. Overview

In order to establish communications over a point-to-point link, each
end of the PPP link MUST first send LCP packets to configure and test
the data link. After the link has been established, the peer MAY be
authenticated.

Then, PPP MUST send NCP packets to choose and configure one or more
network-layer protocols. Once each of the chosen network-layer
protocols has been configured, datagrams from each network-layer
protocol can be sent over the link.

The link will remain configured for communications until explicit LCP
or NCP packets close the link down, or until some external event
occurs (an inactivity timer expires or network administrator
intervention).

3.2. Phase Diagram

In the process of configuring, maintaining and terminating the
point-to-point link, the PPP link goes through several distinct
phases which are specified in the following simplified state diagram:

+------+        +-----------+           +--------------+
|      | UP     |           | OPENED    |              | SUCCESS/NONE
| Dead |------->| Establish |---------->| Authenticate |--+
|      |        |           |           |              |  |
+------+        +-----------+           +--------------+  |
   ^               |                        |             |
   |          FAIL |                   FAIL |             |
   +<--------------+             +----------+             |
   |                             |                        |
   |            +-----------+    |           +---------+  |
   |       DOWN |           |    |   CLOSING |         |  |
   +------------| Terminate |<---+<----------| Network |<-+
                |           |                |         |
                +-----------+                +---------+

Not all transitions are specified in this diagram. The following
semantics MUST be followed.
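The transitions drawn in the simplified diagram can be written as a lookup table. This is only a sketch of the arrows shown; as noted, the diagram omits some transitions, and the names `PHASE_TRANSITIONS`, `next_phase` and `walk` are hypothetical.

```python
# (current phase, event label from the diagram) -> next phase
PHASE_TRANSITIONS = {
    ("Dead", "UP"): "Establish",
    ("Establish", "OPENED"): "Authenticate",
    ("Establish", "FAIL"): "Dead",
    ("Authenticate", "SUCCESS/NONE"): "Network",
    ("Authenticate", "FAIL"): "Terminate",
    ("Network", "CLOSING"): "Terminate",
    ("Terminate", "DOWN"): "Dead",
}


def next_phase(phase: str, event: str) -> str:
    """Follow one arrow of the simplified phase diagram."""
    try:
        return PHASE_TRANSITIONS[(phase, event)]
    except KeyError:
        raise ValueError(f"no arrow for {event!r} in phase {phase!r}") from None


def walk(events, start: str = "Dead"):
    """Return the list of phases visited while applying events in order."""
    visited = []
    phase = start
    for event in events:
        phase = next_phase(phase, event)
        visited.append(phase)
    return visited
```

A complete session that authenticates, carries traffic and then closes would pass the events UP, OPENED, SUCCESS/NONE, CLOSING and DOWN to `walk`, ending back in the Dead phase.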

3.3. Link Dead (physical-layer not ready)

The link necessarily begins and ends with this phase. When an
external event (such as carrier detection or network administrator
configuration) indicates that the physical-layer is ready to be used,
PPP will proceed to the Link Establishment phase.

During this phase, the LCP automaton (described later) will be in the
Initial or Starting states. The transition to the Link Establishment
phase will signal an Up event to the LCP automaton.

Implementation Note:

Typically, a link will return to this phase automatically after
the disconnection of a modem. In the case of a hard-wired link,
this phase may be extremely short -- merely long enough to detect
the presence of the device.

3.4. Link Establishment Phase

The Link Control Protocol (LCP) is used to establish the connection
through an exchange of Configure packets. This exchange is complete,
and the LCP Opened state entered, once a Configure-Ack packet
(described later) has been both sent and received.

All Configuration Options are assumed to be at default values unless
altered by the configuration exchange. See the chapter on LCP
Configuration Options for further discussion.

It is important to note that only Configuration Options which are
independent of particular network-layer protocols are configured by
LCP. Configuration of individual network-layer protocols is handled
by separate Network Control Protocols (NCPs) during the Network-Layer
Protocol phase.

Any non-LCP packets received during this phase MUST be silently
discarded.

The receipt of the LCP Configure-Request causes a return to the Link
Establishment phase from the Network-Layer Protocol phase or
Authentication phase.

3.5. Authentication Phase

On some links it may be desirable to require a peer to authenticate
itself before allowing network-layer protocol packets to be
exchanged.

By default, authentication is not mandatory. If an implementation
desires that the peer authenticate with some specific authentication
protocol, then it MUST request the use of that authentication
protocol during Link Establishment phase.

Authentication SHOULD take place as soon as possible after link
establishment. However, link quality determination MAY occur
concurrently. An implementation MUST NOT allow the exchange of link
quality determination packets to delay authentication indefinitely.

Advancement from the Authentication phase to the Network-Layer
Protocol phase MUST NOT occur until authentication has completed. If
authentication fails, the authenticator SHOULD proceed instead to the
Link Termination phase.

Only Link Control Protocol, authentication protocol, and link quality
monitoring packets are allowed during this phase. All other packets
received during this phase MUST be silently discarded.

Implementation Notes:

An implementation SHOULD NOT fail authentication simply due to
timeout or lack of response. The authentication SHOULD allow some
method of retransmission, and proceed to the Link Termination
phase only after a number of authentication attempts has been
exceeded.

The implementation responsible for commencing Link Termination
phase is the implementation which has refused authentication to
its peer.

3.6. Network-Layer Protocol Phase

Once PPP has finished the previous phases, each network-layer
protocol (such as IP, IPX, or AppleTalk) MUST be separately
configured by the appropriate Network Control Protocol (NCP).

Each NCP MAY be Opened and Closed at any time.

Implementation Note:

Because an implementation may initially use a significant amount
of time for link quality determination, implementations SHOULD
avoid fixed timeouts when waiting for their peers to configure a
NCP.

After a NCP has reached the Opened state, PPP will carry the
corresponding network-layer protocol packets. Any supported
network-layer protocol packets received when the corresponding NCP is
not in the Opened state MUST be silently discarded.

Implementation Note:

While LCP is in the Opened state, any protocol packet which is
unsupported by the implementation MUST be returned in a Protocol-
Reject (described later). Only protocols which are supported are
silently discarded.

During this phase, link traffic consists of any possible combination
of LCP, NCP, and network-layer protocol packets.

3.7. Link Termination Phase

PPP can terminate the link at any time. This might happen because of
the loss of carrier, authentication failure, link quality failure,
the expiration of an idle-period timer, or the administrative closing
of the link.

LCP is used to close the link through an exchange of Terminate
packets. When the link is closing, PPP informs the network-layer
protocols so that they may take appropriate action.

After the exchange of Terminate packets, the implementation SHOULD
signal the physical-layer to disconnect in order to enforce the
termination of the link, particularly in the case of an
authentication failure. The sender of the Terminate-Request SHOULD
disconnect after receiving a Terminate-Ack, or after the Restart
counter expires. The receiver of a Terminate-Request SHOULD wait for
the peer to disconnect, and MUST NOT disconnect until at least one
Restart time has passed after sending a Terminate-Ack. PPP SHOULD
proceed to the Link Dead phase.

Any non-LCP packets received during this phase MUST be silently
discarded.

Implementation Note:

The closing of the link by LCP is sufficient. There is no need
for each NCP to send a flurry of Terminate packets. Conversely,
the fact that one NCP has Closed is not sufficient reason to cause
the termination of the PPP link, even if that NCP was the only NCP
currently in the Opened state.
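The per-phase "silently discard" rules stated in sections 3.4, 3.5 and 3.7 can be summarized in a small filter. This is a sketch under assumptions: the packet kinds ("lcp", "auth", "lqm", "ncp", "network") are hypothetical coarse labels, and the finer rules of the Network-Layer Protocol phase (per-NCP state, Protocol-Reject for unsupported protocols) are not modelled.

```python
# Packet kinds that are NOT silently discarded, for the phases that have
# a blanket rule in the text above.
ALLOWED_KINDS = {
    "Establish": {"lcp"},
    "Authenticate": {"lcp", "auth", "lqm"},
    "Terminate": {"lcp"},
}


def silently_discarded(phase: str, kind: str) -> bool:
    """Return True if the phase's blanket rule discards this kind of packet."""
    allowed = ALLOWED_KINDS.get(phase)
    if allowed is None:
        # No blanket rule is modelled for this phase in the sketch.
        return False
    return kind not in allowed
```

For instance, `silently_discarded("Establish", "ncp")` is true, while `silently_discarded("Authenticate", "lqm")` is false because link quality monitoring packets are allowed during authentication.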

4. The Option Negotiation Automaton

The finite-state automaton is defined by events, actions and state
transitions. Events include reception of external commands such as
Open and Close, expiration of the Restart timer, and reception of
packets from a peer. Actions include the starting of the Restart
timer and transmission of packets to the peer.

Some types of packets -- Configure-Naks and Configure-Rejects, or
Code-Rejects and Protocol-Rejects, or Echo-Requests, Echo-Replies and
Discard-Requests -- are not differentiated in the automaton
descriptions. As will be described later, these packets do indeed
serve different functions. However, they always cause the same
transitions.

Events                                   Actions

Up   = lower layer is Up                 tlu = This-Layer-Up
Down = lower layer is Down               tld = This-Layer-Down
Open = administrative Open               tls = This-Layer-Started
Close= administrative Close              tlf = This-Layer-Finished

TO+  = Timeout with counter > 0          irc = Initialize-Restart-Count
TO-  = Timeout with counter expired      zrc = Zero-Restart-Count

RCR+ = Receive-Configure-Request (Good)  scr = Send-Configure-Request
RCR- = Receive-Configure-Request (Bad)
RCA  = Receive-Configure-Ack             sca = Send-Configure-Ack
RCN  = Receive-Configure-Nak/Rej         scn = Send-Configure-Nak/Rej

RTR  = Receive-Terminate-Request         str = Send-Terminate-Request
RTA  = Receive-Terminate-Ack             sta = Send-Terminate-Ack

RUC  = Receive-Unknown-Code              scj = Send-Code-Reject
RXJ+ = Receive-Code-Reject (permitted)
    or Receive-Protocol-Reject
RXJ- = Receive-Code-Reject (catastrophic)
    or Receive-Protocol-Reject
RXR  = Receive-Echo-Request              ser = Send-Echo-Reply
    or Receive-Echo-Reply
    or Receive-Discard-Request

4.1. State Transition Table

The complete state transition table follows. States are indicated
horizontally, and events are read vertically. State transitions and
actions are represented in the form action/new-state. Multiple
actions are separated by commas, and may continue on succeeding lines
as space requires; multiple actions may be implemented in any
convenient order. The state may be followed by a letter, which
indicates an explanatory footnote. The dash ('-') indicates an
illegal transition.

      | State
      |    0         1         2         3         4         5
Events| Initial   Starting  Closed    Stopped   Closing   Stopping
------+-----------------------------------------------------------
 Up   |    2     irc,scr/6     -         -         -         -
 Down |    -         -         0       tls/1       0         1
 Open |  tls/1       1     irc,scr/6     3r        5r        5r
 Close|    0       tlf/0       2         2         4         4
      |
  TO+ |    -         -         -         -       str/4     str/5
  TO- |    -         -         -         -       tlf/2     tlf/3
      |
 RCR+ |    -         -       sta/2 irc,scr,sca/8   4         5
 RCR- |    -         -       sta/2 irc,scr,scn/6   4         5
 RCA  |    -         -       sta/2     sta/3       4         5
 RCN  |    -         -       sta/2     sta/3       4         5
      |
 RTR  |    -         -       sta/2     sta/3     sta/4     sta/5
 RTA  |    -         -         2         3       tlf/2     tlf/3
      |
 RUC  |    -         -       scj/2     scj/3     scj/4     scj/5
 RXJ+ |    -         -         2         3         4         5
 RXJ- |    -         -       tlf/2     tlf/3     tlf/2     tlf/3
      |
 RXR  |    -         -         2         3         4         5

      | State
      |    6         7         8           9
Events| Req-Sent  Ack-Rcvd  Ack-Sent    Opened
------+-----------------------------------------
 Up   |    -         -         -           -
 Down |    1         1         1         tld/1
 Open |    6         7         8           9r
 Close|irc,str/4 irc,str/4 irc,str/4 tld,irc,str/4
      |
  TO+ |  scr/6     scr/6     scr/8         -
  TO- |  tlf/3p    tlf/3p    tlf/3p        -
      |
 RCR+ |  sca/8   sca,tlu/9   sca/8   tld,scr,sca/8
 RCR- |  scn/6     scn/7     scn/6   tld,scr,scn/6
 RCA  |  irc/7     scr/6x  irc,tlu/9   tld,scr/6x
 RCN  |irc,scr/6   scr/6x  irc,scr/8   tld,scr/6x
      |
 RTR  |  sta/6     sta/6     sta/6   tld,zrc,sta/5
 RTA  |    6         6         8       tld,scr/6
      |
 RUC  |  scj/6     scj/7     scj/8       scj/9
 RXJ+ |    6         6         8           9
 RXJ- |  tlf/3     tlf/3     tlf/3   tld,irc,str/5
      |
 RXR  |    6         7         8         ser/9

The states in which the Restart timer is running are identifiable by
the presence of TO events. Only the Send-Configure-Request, Send-
Terminate-Request and Zero-Restart-Count actions start or re-start
the Restart timer. The Restart timer is stopped when transitioning
from any state where the timer is running to a state where the timer
is not running.

The events and actions are defined according to a message passing
architecture, rather than a signalling architecture. If an action is
desired to control specific signals (such as DTR), additional actions
are likely to be required.

[p] Passive option; see Stopped state discussion.

[r] Restart option; see Open event discussion.

[x] Crossed connection; see RCA event discussion.
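One way to exercise the state transition table is to transcribe it as text and parse each cell of the form action/new-state. The sketch below is an assumption-laden illustration rather than a reference implementation: the names `ROWS`, `TABLE`, `step` and `timer_states` are hypothetical, the footnote letters (p, r, x) are simply dropped, and the actions are returned as names without being performed.

```python
STATES = ["Initial", "Starting", "Closed", "Stopped", "Closing",
          "Stopping", "Req-Sent", "Ack-Rcvd", "Ack-Sent", "Opened"]

# One row per event, one cell per state 0..9, transcribed from the table.
ROWS = """
Up    2     irc,scr/6 -         -             -     -     -         -         -         -
Down  -     -         0         tls/1         0     1     1         1         1         tld/1
Open  tls/1 1         irc,scr/6 3r            5r    5r    6         7         8         9r
Close 0     tlf/0     2         2             4     4     irc,str/4 irc,str/4 irc,str/4 tld,irc,str/4
TO+   -     -         -         -             str/4 str/5 scr/6     scr/6     scr/8     -
TO-   -     -         -         -             tlf/2 tlf/3 tlf/3p    tlf/3p    tlf/3p    -
RCR+  -     -         sta/2     irc,scr,sca/8 4     5     sca/8     sca,tlu/9 sca/8     tld,scr,sca/8
RCR-  -     -         sta/2     irc,scr,scn/6 4     5     scn/6     scn/7     scn/6     tld,scr,scn/6
RCA   -     -         sta/2     sta/3         4     5     irc/7     scr/6x    irc,tlu/9 tld,scr/6x
RCN   -     -         sta/2     sta/3         4     5     irc,scr/6 scr/6x    irc,scr/8 tld,scr/6x
RTR   -     -         sta/2     sta/3         sta/4 sta/5 sta/6     sta/6     sta/6     tld,zrc,sta/5
RTA   -     -         2         3             tlf/2 tlf/3 6         6         8         tld,scr/6
RUC   -     -         scj/2     scj/3         scj/4 scj/5 scj/6     scj/7     scj/8     scj/9
RXJ+  -     -         2         3             4     5     6         6         8         9
RXJ-  -     -         tlf/2     tlf/3         tlf/2 tlf/3 tlf/3     tlf/3     tlf/3     tld,irc,str/5
RXR   -     -         2         3             4     5     6         7         8         ser/9
"""


def parse_cell(cell):
    """Turn 'irc,scr/6' into (['irc', 'scr'], 6); '-' means illegal."""
    if cell == "-":
        return None
    actions, _, target = cell.rpartition("/")
    new_state = int(target.rstrip("prx"))  # drop footnote letters
    return (actions.split(",") if actions else [], new_state)


TABLE = {}
for line in ROWS.strip().splitlines():
    event, *cells = line.split()
    assert len(cells) == len(STATES)
    for state, cell in enumerate(cells):
        TABLE[(event, state)] = parse_cell(cell)


def step(state, event):
    """Return (actions, new_state) for an event, or raise if illegal."""
    entry = TABLE[(event, state)]
    if entry is None:
        raise ValueError(f"illegal transition: {event} in {STATES[state]}")
    return entry


def timer_states():
    """States with a TO+ entry, i.e. where the Restart timer is running."""
    return [s for s in range(len(STATES)) if TABLE[("TO+", s)] is not None]
```

Starting from Initial (0), the events Open, Up, RCA and RCR+ lead through Starting, Req-Sent and Ack-Rcvd to Opened (9), and `timer_states()` yields states 4 through 8, matching the remark that the timer states are those with TO events.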

4.2. States

Following is a more detailed description of each automaton state.

Initial

In the Initial state, the lower layer is unavailable (Down), and
no Open has occurred. The Restart timer is not running in the
Initial state.

Starting

The Starting state is the Open counterpart to the Initial state.
An administrative Open has been initiated, but the lower layer is
still unavailable (Down). The Restart timer is not running in the
Starting state.

When the lower layer becomes available (Up), a Configure-Request
is sent.

Closed

In the Closed state, the link is available (Up), but no Open has
occurred. The Restart timer is not running in the Closed state.

Upon reception of Configure-Request packets, a Terminate-Ack is
sent. Terminate-Acks are silently discarded to avoid creating a
loop.

Stopped

The Stopped state is the Open counterpart to the Closed state. It
is entered when the automaton is waiting for a Down event after
the This-Layer-Finished action, or after sending a Terminate-Ack.
The Restart timer is not running in the Stopped state.

Upon reception of Configure-Request packets, an appropriate
response is sent. Upon reception of other packets, a Terminate-
Ack is sent. Terminate-Acks are silently discarded to avoid
creating a loop.

Rationale:

The Stopped state is a junction state for link termination,
link configuration failure, and other automaton failure modes.
These potentially separate states have been combined.

There is a race condition between the Down event response (from
the This-Layer-Finished action) and the Receive-Configure-
Request event. When a Configure-Request arrives before the
Down event, the Down event will supersede by returning the
automaton to the Starting state. This prevents attack by
repetition.

Implementation Option:

After the peer fails to respond to Configure-Requests, an
implementation MAY wait passively for the peer to send
Configure-Requests. In this case, the This-Layer-Finished
action is not used for the TO- event in states Req-Sent, Ack-
Rcvd and Ack-Sent.

This option is useful for dedicated circuits, or circuits which
have no status signals available, but SHOULD NOT be used for
switched circuits.

Closing

In the Closing state, an attempt is made to terminate the
connection. A Terminate-Request has been sent and the Restart
timer is running, but a Terminate-Ack has not yet been received.

Upon reception of a Terminate-Ack, the Closed state is entered.
Upon the expiration of the Restart timer, a new Terminate-Request
is transmitted, and the Restart timer is restarted. After the
Restart timer has expired Max-Terminate times, the Closed state is
entered.

Stopping

The Stopping state is the Open counterpart to the Closing state.
A Terminate-Request has been sent and the Restart timer is
running, but a Terminate-Ack has not yet been received.

Rationale:

The Stopping state provides a well defined opportunity to
terminate a link before allowing new traffic. After the link
has terminated, a new configuration may occur via the Stopped
or Starting states.

Request-Sent

In the Request-Sent state an attempt is made to configure the
connection. A Configure-Request has been sent and the Restart
timer is running, but a Configure-Ack has not yet been received
nor has one been sent.

Ack-Received

In the Ack-Received state, a Configure-Request has been sent and a
Configure-Ack has been received. The Restart timer is still
running, since a Configure-Ack has not yet been sent.

Ack-Sent

In the Ack-Sent state, a Configure-Request and a Configure-Ack
have both been sent, but a Configure-Ack has not yet been
received. The Restart timer is running, since a Configure-Ack has
not yet been received.

Opened

In the Opened state, a Configure-Ack has been both sent and
received. The Restart timer is not running.

When entering the Opened state, the implementation SHOULD signal
the upper layers that it is now Up. Conversely, when leaving the
Opened state, the implementation SHOULD signal the upper layers
that it is now Down.

4.3. Events

Transitions and actions in the automaton are caused by events.

Up

This event occurs when a lower layer indicates that it is ready to
carry packets.

Typically, this event is used by a modem handling or calling
process, or by some other coupling of the PPP link to the physical
media, to signal LCP that the link is entering Link Establishment
phase.

It also can be used by LCP to signal each NCP that the link is
entering Network-Layer Protocol phase. That is, the This-Layer-Up
action from LCP triggers the Up event in the NCP.

Down

This event occurs when a lower layer indicates that it is no
longer ready to carry packets.

Typically, this event is used by a modem handling or calling
process, or by some other coupling of the PPP link to the physical
media, to signal LCP that the link is entering Link Dead phase.

It also can be used by LCP to signal each NCP that the link is
leaving Network-Layer Protocol phase. That is, the This-Layer-
Down action from LCP triggers the Down event in the NCP.

Open

This event indicates that the link is administratively available
for traffic; that is, the network administrator (human or program)
has indicated that the link is allowed to be Opened. When this
event occurs, and the link is not in the Opened state, the
automaton attempts to send configuration packets to the peer.

If the automaton is not able to begin configuration (the lower
layer is Down, or a previous Close event has not completed), the
establishment of the link is automatically delayed.

When a Terminate-Request is received, or other events occur which
cause the link to become unavailable, the automaton will progress
to a state where the link is ready to re-open. No additional
administrative intervention is necessary.

Implementation Option:

Experience has shown that users will execute an additional Open
command when they want to renegotiate the link. This might
indicate that new values are to be negotiated.

Since this is not the meaning of the Open event, it is
suggested that when an Open user command is executed in the
Opened, Closing, Stopping, or Stopped states, the
implementation issue a Down event, immediately followed by an
Up event. Care must be taken that an intervening Down event
cannot occur from another source.

The Down followed by an Up will cause an orderly renegotiation
of the link, by progressing through the Starting to the
Request-Sent state. This will cause the renegotiation of the
link, without any harmful side effects.

Close

This event indicates that the link is not available for traffic;
that is, the network administrator (human or program) has
indicated that the link is not allowed to be Opened. When this
event occurs, and the link is not in the Closed state, the
automaton attempts to terminate the connection. Further attempts
to re-configure the link are denied until a new Open event occurs.

Implementation Note:

When authentication fails, the link SHOULD be terminated, to
prevent attack by repetition and denial of service to other
users. Since the link is administratively available (by
definition), this can be accomplished by simulating a Close
event to the LCP, immediately followed by an Open event. Care
must be taken that an intervening Close event cannot occur from
another source.

The Close followed by an Open will cause an orderly termination
of the link, by progressing through the Closing to the Stopping
state, and the This-Layer-Finished action can disconnect the
link. The automaton waits in the Stopped or Starting states
for the next connection attempt.

Timeout (TO+,TO-)

This event indicates the expiration of the Restart timer. The
Restart timer is used to time responses to Configure-Request and
Terminate-Request packets.

The TO+ event indicates that the Restart counter continues to be
greater than zero, which triggers the corresponding Configure-
Request or Terminate-Request packet to be retransmitted.

The TO- event indicates that the Restart counter is not greater
than zero, and no more packets need to be retransmitted.
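The distinction between TO+ and TO- can be sketched as simple counter bookkeeping. The sketch assumes that the Restart counter is initialized to a maximum number of transmissions and decremented each time a request is sent; that bookkeeping is not spelled out in this section, and the function name and parameter are hypothetical.

```python
def timeout_events(max_transmissions: int) -> list:
    """List the timeout events seen when the peer never answers.

    Assumes the counter starts at max_transmissions (at least 1) and is
    decremented for every request transmitted, including the first.
    """
    counter = max_transmissions   # Initialize-Restart-Count
    counter -= 1                  # the initial request is sent
    events = []
    while True:
        # The Restart timer expires here.
        if counter > 0:
            events.append("TO+")  # retransmit the request
            counter -= 1
        else:
            events.append("TO-")  # nothing more to retransmit
            return events
```

Under these assumptions, a maximum of two transmissions produces one TO+ followed by one TO-, so the timer expires as many times as requests were sent.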

Receive-Configure-Request (RCR+,RCR-)

This event occurs when a Configure-Request packet is received from
the peer. The Configure-Request packet indicates the desire to
open a connection and may specify Configuration Options. The
Configure-Request packet is more fully described in a later
section.

The RCR+ event indicates that the Configure-Request was
acceptable, and triggers the transmission of a corresponding
Configure-Ack.

The RCR- event indicates that the Configure-Request was
unacceptable, and triggers the transmission of a corresponding
Configure-Nak or Configure-Reject.

Implementation Note:

These events may occur on a connection which is already in the
Opened state. The implementation MUST be prepared to
immediately renegotiate the Configuration Options.

Receive-Configure-Ack (RCA)

This event occurs when a valid Configure-Ack packet is received
from the peer. The Configure-Ack packet is a positive response to
a Configure-Request packet. An out of sequence or otherwise
invalid packet is silently discarded.

Implementation Note:

Since the correct packet has already been received before
reaching the Ack-Rcvd or Opened states, it is extremely
unlikely that another such packet will arrive. As specified,
all invalid Ack/Nak/Rej packets are silently discarded, and do
not affect the transitions of the automaton.

However, it is not impossible that a correctly formed packet
will arrive through a coincidentally-timed cross-connection.
It is more likely to be the result of an implementation error.
At the very least, this occurrence SHOULD be logged.

Receive-Configure-Nak/Rej (RCN)

This event occurs when a valid Configure-Nak or Configure-Reject
packet is received from the peer. The Configure-Nak and
Configure-Reject packets are negative responses to a Configure-
Request packet. An out of sequence or otherwise invalid packet is
silently discarded.

Implementation Note:

Although the Configure-Nak and Configure-Reject cause the same
state transition in the automaton, these packets have
significantly different effects on the Configuration Options
sent in the resulting Configure-Request packet.

Receive-Terminate-Request (RTR)

This event occurs when a Terminate-Request packet is received.
The Terminate-Request packet indicates the desire of the peer to
close the connection.

Implementation Note:

This event is not identical to the Close event (see above), and
does not override the Open commands of the local network
administrator. The implementation MUST be prepared to receive
a new Configure-Request without network administrator
intervention.

Receive-Terminate-Ack (RTA)

This event occurs when a Terminate-Ack packet is received from the
peer. The Terminate-Ack packet is usually a response to a
Terminate-Request packet. The Terminate-Ack packet may also
indicate that the peer is in Closed or Stopped states, and serves
to re-synchronize the link configuration.

Receive-Unknown-Code (RUC)

This event occurs when an un-interpretable packet is received from
the peer. A Code-Reject packet is sent in response.

Receive-Code-Reject, Receive-Protocol-Reject (RXJ+,RXJ-)

This event occurs when a Code-Reject or a Protocol-Reject packet
is received from the peer.

The RXJ+ event arises when the rejected value is acceptable, such
as a Code-Reject of an extended code, or a Protocol-Reject of a
NCP. These are within the scope of normal operation. The
implementation MUST stop sending the offending packet type.

The RXJ- event arises when the rejected value is catastrophic,
such as a Code-Reject of Configure-Request, or a Protocol-Reject
of LCP! This event communicates an unrecoverable error that
terminates the connection.

Receive-Echo-Request, Receive-Echo-Reply, Receive-Discard-Request
(RXR)

This event occurs when an Echo-Request, Echo-Reply or Discard-
Request packet is received from the peer. The Echo-Reply packet
is a response to an Echo-Request packet. There is no reply to an
Echo-Reply or Discard-Request packet.

4.4. Actions

Actions in the automaton are caused by events and typically indicate
the transmission of packets and/or the starting or stopping of the
Restart timer.

Illegal-Event (-)

This indicates an event that cannot occur in a properly
implemented automaton. The implementation has an internal error,
which should be reported and logged. No transition is taken, and
the implementation SHOULD NOT reset or freeze.

This-Layer-Up (tlu)

This action indicates to the upper layers that the automaton is
entering the Opened state.

Typically, this action is used by the LCP to signal the Up event
to a NCP, Authentication Protocol, or Link Quality Protocol, or
MAY be used by a NCP to indicate that the link is available for
its network layer traffic.

This-Layer-Down (tld)

This action indicates to the upper layers that the automaton is
leaving the Opened state.

Typically, this action is used by the LCP to signal the Down event
to a NCP, Authentication Protocol, or Link Quality Protocol, or
MAY be used by a NCP to indicate that the link is no longer
available for its network layer traffic.

This-Layer-Started (tls)

This action indicates to the lower layers that the automaton is
entering the Starting state, and the lower layer is needed for the
link. The lower layer SHOULD respond with an Up event when the
lower layer is available.

The results of this action are highly implementation dependent.

This-Layer-Finished (tlf)

This action indicates to the lower layers that the automaton is
entering the Initial, Closed or Stopped states, and the lower
layer is no longer needed for the link. The lower layer SHOULD
respond with a Down event when the lower layer has terminated.

Typically, this action MAY be used by the LCP to advance to the
Link Dead phase, or MAY be used by a NCP to indicate to the LCP
that the link may terminate when there are no other NCPs open.

The results of this action are highly implementation dependent.

Initialize-Restart-Count (irc)

This action sets the Restart counter to the appropriate value
(Max-Terminate or Max-Configure). The counter is decremented for
each transmission, including the first.

Implementation Note:

In addition to setting the Restart counter, the implementation
MUST set the timeout period to the initial value when Restart
timer backoff is used.

Zero-Restart-Count (zrc)

This action sets the Restart counter to zero.

Implementation Note:

This action enables the FSA to pause before proceeding to the
desired final state, allowing traffic to be processed by the
peer. In addition to zeroing the Restart counter, the
implementation MUST set the timeout period to an appropriate
value.

Send-Configure-Request (scr)

A Configure-Request packet is transmitted. This indicates the
desire to open a connection with a specified set of Configuration
Options. The Restart timer is started when the Configure-Request
packet is transmitted, to guard against packet loss. The Restart
counter is decremented each time a Configure-Request is sent.

Send-Configure-Ack (sca)

A Configure-Ack packet is transmitted. This acknowledges the
reception of a Configure-Request packet with an acceptable set of
Configuration Options.

Send-Configure-Nak (scn)

A Configure-Nak or Configure-Reject packet is transmitted, as
appropriate. This negative response reports the reception of a
Configure-Request packet with an unacceptable set of Configuration
Options.

Configure-Nak packets are used to refuse a Configuration Option
value, and to suggest a new, acceptable value. Configure-Reject
packets are used to refuse all negotiation about a Configuration
Option, typically because it is not recognized or implemented.
The use of Configure-Nak versus Configure-Reject is more fully
described in the chapter on LCP Packet Formats.

Send-Terminate-Request (str)

A Terminate-Request packet is transmitted. This indicates the
desire to close a connection. The Restart timer is started when
the Terminate-Request packet is transmitted, to guard against
packet loss. The Restart counter is decremented each time a
Terminate-Request is sent.

Send-Terminate-Ack (sta)

A Terminate-Ack packet is transmitted. This acknowledges the
reception of a Terminate-Request packet or otherwise serves to
synchronize the automatons.

Send-Code-Reject (scj)

A Code-Reject packet is transmitted. This indicates the reception
of an unknown type of packet.

Send-Echo-Reply (ser)

An Echo-Reply packet is transmitted. This acknowledges the
reception of an Echo-Request packet.

4.5. Loop Avoidance

The protocol makes a reasonable attempt at avoiding Configuration
Option negotiation loops. However, the protocol does NOT guarantee
that loops will not happen. As with any negotiation, it is possible
to configure two PPP implementations with conflicting policies that
will never converge. It is also possible to configure policies which
do converge, but which take significant time to do so. Implementors
should keep this in mind and SHOULD implement loop detection
mechanisms or higher level timeouts.

4.6. Counters and Timers

Restart Timer

There is one special timer used by the automaton. The Restart
timer is used to time transmissions of Configure-Request and
Terminate-Request packets. Expiration of the Restart timer causes
a Timeout event, and retransmission of the corresponding
Configure-Request or Terminate-Request packet. The Restart timer
MUST be configurable, but SHOULD default to three (3) seconds.

Implementation Note:

The Restart timer SHOULD be based on the speed of the link.
The default value is designed for low speed (2,400 to 9,600
bps), high switching latency links (typical telephone lines).
Higher speed links, or links with low switching latency, SHOULD
have correspondingly faster retransmission times.

Instead of a constant value, the Restart timer MAY begin at an
initial small value and increase to the configured final value.
Each successive value less than the final value SHOULD be at
least twice the previous value. The initial value SHOULD be
large enough to account for the size of the packets, twice the
round trip time for transmission at the link speed, and at
least an additional 100 milliseconds to allow the peer to
process the packets before responding. Some circuits add
another 200 milliseconds of satellite delay. Round trip times
for modems operating at 14,400 bps have been measured in the
range of 160 to more than 600 milliseconds.
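
One way to generate such a backoff sequence is sketched below. The
function name backoff_schedule and the strict doubling policy are
assumptions made for illustration; the text above only requires that
each value below the final value be at least twice its predecessor.

```python
def backoff_schedule(initial: float, final: float) -> list:
    """Return successive Restart timer periods, in seconds.

    The period starts at `initial` and doubles until it would reach
    or pass `final`, which is then used as the last (steady) value.
    """
    if initial <= 0 or final < initial:
        raise ValueError("need 0 < initial <= final")
    periods = []
    period = initial
    while period < final:
        periods.append(period)
        period *= 2  # each value below the final is twice the previous
    periods.append(final)  # the configured final value caps the schedule
    return periods
```

For example, an initial value of 0.5 seconds with the default final
value of 3 seconds yields the periods 0.5, 1, 2 and 3 seconds.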

Max-Terminate

There is one required restart counter for Terminate-Requests.
Max-Terminate indicates the number of Terminate-Request packets
sent without receiving a Terminate-Ack before assuming that the
peer is unable to respond. Max-Terminate MUST be configurable,
but SHOULD default to two (2) transmissions.

Max-Configure

A similar counter is recommended for Configure-Requests. Max-
Configure indicates the number of Configure-Request packets sent
without receiving a valid Configure-Ack, Configure-Nak or
Configure-Reject before assuming that the peer is unable to
respond. Max-Configure MUST be configurable, but SHOULD default
to ten (10) transmissions.

Max-Failure

A related counter is recommended for Configure-Nak. Max-Failure
indicates the number of Configure-Nak packets sent without sending
a Configure-Ack before assuming that configuration is not
converging. Any further Configure-Nak packets for peer requested
options are converted to Configure-Reject packets, and locally
desired options are no longer appended. Max-Failure MUST be
configurable, but SHOULD default to five (5) transmissions.
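
The Max-Failure rule might be tracked as in the following sketch.
The class FailureCounter and its method names are hypothetical, and
resetting the count whenever a Configure-Ack is sent is an assumption
drawn from the phrase "without sending a Configure-Ack".

```python
class FailureCounter:
    """Hypothetical tracker for Configure-Nak packets sent in a row."""

    def __init__(self, max_failure: int = 5) -> None:
        self.max_failure = max_failure  # default of five transmissions
        self.naks_sent = 0

    def reply_for_unacceptable_values(self) -> str:
        """Choose the reply for a request whose values are unacceptable."""
        # Once Max-Failure Naks have gone out without an Ack, further
        # Naks are converted to Configure-Reject.
        if self.naks_sent >= self.max_failure:
            return "Configure-Reject"
        self.naks_sent += 1
        return "Configure-Nak"

    def ack_sent(self) -> None:
        """Assumed reset point: a Configure-Ack was transmitted."""
        self.naks_sent = 0
```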

5. LCP Packet Formats

There are three classes of LCP packets:

1. Link Configuration packets used to establish and configure a
link (Configure-Request, Configure-Ack, Configure-Nak and
Configure-Reject).

2. Link Termination packets used to terminate a link (Terminate-
Request and Terminate-Ack).

3. Link Maintenance packets used to manage and debug a link
(Code-Reject, Protocol-Reject, Echo-Request, Echo-Reply, and
Discard-Request).

In the interest of simplicity, there is no version field in the LCP
packet. A correctly functioning LCP implementation will always
respond to unknown Protocols and Codes with an easily recognizable
LCP packet, thus providing a deterministic fallback mechanism for
implementations of other versions.

Regardless of which Configuration Options are enabled, all LCP Link
Configuration, Link Termination, and Code-Reject packets (codes 1
through 7) are always sent as if no Configuration Options were
negotiated. In particular, each Configuration Option specifies a
default value. This ensures that such LCP packets are always
recognizable, even when one end of the link mistakenly believes the
link to be open.

Exactly one LCP packet is encapsulated in the PPP Information field,
where the PPP Protocol field indicates type hex c021 (Link Control
Protocol).

A summary of the Link Control Protocol packet format is shown below.
The fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Code      |  Identifier   |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Data ...
+-+-+-+-+

Code

The Code field is one octet, and identifies the kind of LCP
packet. When a packet is received with an unknown Code field, a
Code-Reject packet is transmitted.

Up-to-date values of the LCP Code field are specified in the most
recent "Assigned Numbers" RFC [2]. This document concerns the
following values:

1 Configure-Request
2 Configure-Ack
3 Configure-Nak
4 Configure-Reject
5 Terminate-Request
6 Terminate-Ack
7 Code-Reject
8 Protocol-Reject
9 Echo-Request
10 Echo-Reply
11 Discard-Request

Identifier

The Identifier field is one octet, and aids in matching requests
and replies. When a packet is received with an invalid Identifier
field, the packet is silently discarded without affecting the
automaton.

Length

The Length field is two octets, and indicates the length of the
LCP packet, including the Code, Identifier, Length and Data
fields. The Length MUST NOT exceed the MRU of the link.

Octets outside the range of the Length field are treated as
padding and are ignored on reception. When a packet is received
with an invalid Length field, the packet is silently discarded
without affecting the automaton.

Data

The Data field is zero or more octets, as indicated by the Length
field. The format of the Data field is determined by the Code
field.
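
The header layout and the Length rules above can be sketched as a
small parser. The function name parse_lcp is hypothetical, and
treating a Length below four octets as invalid is an assumption
(four octets is the size of the fixed header).

```python
import struct


def parse_lcp(information: bytes):
    """Split a PPP Information field into (code, identifier, data).

    Returns None when the packet should be silently discarded.
    """
    if len(information) < 4:
        return None  # too short to hold Code, Identifier and Length
    # Code (1 octet), Identifier (1 octet), Length (2 octets),
    # all in network byte order.
    code, identifier, length = struct.unpack("!BBH", information[:4])
    if length < 4 or length > len(information):
        return None  # invalid Length: silently discard
    # Octets beyond Length are padding and are ignored.
    data = information[4:length]
    return code, identifier, data
```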

5.1. Configure-Request

Description

An implementation wishing to open a connection MUST transmit a
Configure-Request. The Options field is filled with any desired
changes to the link defaults. Configuration Options SHOULD NOT be
included with default values.

Upon reception of a Configure-Request, an appropriate reply MUST
be transmitted.

A summary of the Configure-Request packet format is shown below. The
fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Code      |  Identifier   |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options ...
+-+-+-+-+

Code

1 for Configure-Request.

Identifier

The Identifier field MUST be changed whenever the contents of the
Options field changes, and whenever a valid reply has been
received for a previous request. For retransmissions, the
Identifier MAY remain unchanged.

Options

The Options field is variable in length, and contains the list of
zero or more Configuration Options that the sender desires to
negotiate. All Configuration Options are always negotiated
simultaneously. The format of Configuration Options is further
described in a later chapter.
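
A sender might apply the Identifier rule as in this sketch. The
class ConfigureRequestSender is hypothetical, and incrementing the
Identifier modulo 256 is only one possible way of changing it.

```python
import struct


class ConfigureRequestSender:
    """Hypothetical helper that tracks the Identifier between sends."""

    def __init__(self) -> None:
        self.identifier = 0
        self.last_options = None

    def build(self, options: bytes, reply_received: bool = False) -> bytes:
        """Return a Configure-Request carrying the encoded options."""
        # Change the Identifier when the Options change or when a
        # valid reply to the previous request has arrived; a plain
        # retransmission keeps the Identifier unchanged.
        if self.last_options is not None and (
            options != self.last_options or reply_received
        ):
            self.identifier = (self.identifier + 1) % 256
        self.last_options = options
        # Code 1, Identifier, Length (header plus options), Options.
        header = struct.pack("!BBH", 1, self.identifier, 4 + len(options))
        return header + options
```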

5.2. Configure-Ack

Description

If every Configuration Option received in a Configure-Request is
recognizable and all values are acceptable, then the
implementation MUST transmit a Configure-Ack. The acknowledged
Configuration Options MUST NOT be reordered or modified in any
way.

On reception of a Configure-Ack, the Identifier field MUST match
that of the last transmitted Configure-Request. Additionally, the
Configuration Options in a Configure-Ack MUST exactly match those
of the last transmitted Configure-Request. Invalid packets are
silently discarded.

A summary of the Configure-Ack packet format is shown below. The
fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Code      |  Identifier   |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options ...
+-+-+-+-+

Code

2 for Configure-Ack.

Identifier

The Identifier field is a copy of the Identifier field of the
Configure-Request which caused this Configure-Ack.

Options

The Options field is variable in length, and contains the list of
zero or more Configuration Options that the sender is
acknowledging. All Configuration Options are always acknowledged
simultaneously.

5.3. Configure-Nak

Description

If every instance of the received Configuration Options is
recognizable, but some values are not acceptable, then the
implementation MUST transmit a Configure-Nak. The Options field
is filled with only the unacceptable Configuration Options from
the Configure-Request. All acceptable Configuration Options are
filtered out of the Configure-Nak, but otherwise the Configuration
Options from the Configure-Request MUST NOT be reordered.

Options which have no value fields (boolean options) MUST use the
Configure-Reject reply instead.

Each Configuration Option which is allowed only a single instance
MUST be modified to a value acceptable to the Configure-Nak
sender. The default value MAY be used, when this differs from the
requested value.

When a particular type of Configuration Option can be listed more
than once with different values, the Configure-Nak MUST include a
list of all values for that option which are acceptable to the
Configure-Nak sender. This includes acceptable values that were
present in the Configure-Request.

Finally, an implementation may be configured to request the
negotiation of a specific Configuration Option. If that option is
not listed, then that option MAY be appended to the list of Nak'd
Configuration Options, in order to prompt the peer to include that
option in its next Configure-Request packet. Any value fields for
the option MUST indicate values acceptable to the Configure-Nak
sender.

On reception of a Configure-Nak, the Identifier field MUST match
that of the last transmitted Configure-Request. Invalid packets
are silently discarded.

Reception of a valid Configure-Nak indicates that when a new
Configure-Request is sent, the Configuration Options MAY be
modified as specified in the Configure-Nak. When multiple
instances of a Configuration Option are present, the peer SHOULD
select a single value to include in its next Configure-Request
packet.

Some Configuration Options have a variable length. Since the
Nak'd Option has been modified by the peer, the implementation
MUST be able to handle an Option length which is different from
the original Configure-Request.

A summary of the Configure-Nak packet format is shown below. The
fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Code      |  Identifier   |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options ...
+-+-+-+-+

Code

3 for Configure-Nak.

Identifier

The Identifier field is a copy of the Identifier field of the
Configure-Request which caused this Configure-Nak.

Options

The Options field is variable in length, and contains the list of
zero or more Configuration Options that the sender is Nak'ing.
All Configuration Options are always Nak'd simultaneously.

5.4. Configure-Reject

Description

If some Configuration Options received in a Configure-Request are
not recognizable or are not acceptable for negotiation (as
configured by a network administrator), then the implementation
MUST transmit a Configure-Reject. The Options field is filled
with only the unacceptable Configuration Options from the
Configure-Request. All recognizable and negotiable Configuration
Options are filtered out of the Configure-Reject, but otherwise
the Configuration Options MUST NOT be reordered or modified in any
way.

On reception of a Configure-Reject, the Identifier field MUST
match that of the last transmitted Configure-Request.
Additionally, the Configuration Options in a Configure-Reject MUST
be a proper subset of those in the last transmitted
Configure-Request. Invalid packets are silently discarded.

Reception of a valid Configure-Reject indicates that when a new
Configure-Request is sent, it MUST NOT include any of the
Configuration Options listed in the Configure-Reject.

A summary of the Configure-Reject packet format is shown below. The
fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Code      |  Identifier   |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options ...
+-+-+-+-+

Code

4 for Configure-Reject.

Identifier

The Identifier field is a copy of the Identifier field of the
Configure-Request which caused this Configure-Reject.

Options

The Options field is variable in length, and contains the list of
zero or more Configuration Options that the sender is rejecting.
All Configuration Options are always rejected simultaneously.
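
The choice among Configure-Ack, Configure-Nak and Configure-Reject
described in the preceding sections can be sketched as follows. The
names choose_reply and refuse_tiny_mru are hypothetical, the policy
shown is a toy example, and boolean options that are refused are
assumed to be left out of the negotiable set so that they are
rejected rather than Nak'd.

```python
def choose_reply(options, negotiable, suggest):
    """Pick the reply Code and Options for a received Configure-Request.

    options    -- list of (type, data) tuples in the order received
    negotiable -- set of option types this end recognizes and negotiates
    suggest    -- callable(type, data) returning None when the value is
                  acceptable, or replacement data to propose otherwise
    """
    # Unrecognized or non-negotiable options force a Configure-Reject,
    # listing only those options, unmodified and in original order.
    rejected = [(t, d) for t, d in options if t not in negotiable]
    if rejected:
        return "Configure-Reject", rejected

    # Otherwise unacceptable values are Nak'd with an acceptable value,
    # keeping the original order and filtering out acceptable options.
    nakd = []
    for t, d in options:
        proposal = suggest(t, d)
        if proposal is not None:
            nakd.append((t, proposal))
    if nakd:
        return "Configure-Nak", nakd

    # Everything is recognizable and acceptable: acknowledge unchanged.
    return "Configure-Ack", list(options)


def refuse_tiny_mru(opt_type, data):
    """Toy policy: answer an MRU (type 1) under 128 with the default."""
    if opt_type == 1 and int.from_bytes(data, "big") < 128:
        return (1500).to_bytes(2, "big")
    return None
```

Note that a Configure-Reject takes precedence: values are only Nak'd
once every option in the request is recognizable and negotiable.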

5.5. Terminate-Request and Terminate-Ack

Description

LCP includes Terminate-Request and Terminate-Ack Codes in order to
provide a mechanism for closing a connection.

An implementation wishing to close a connection SHOULD transmit a
Terminate-Request. Terminate-Request packets SHOULD continue to
be sent until Terminate-Ack is received, the lower layer indicates
that it has gone down, or a sufficiently large number have been
transmitted such that the peer is down with reasonable certainty.

Upon reception of a Terminate-Request, a Terminate-Ack MUST be
transmitted.

Reception of an unelicited Terminate-Ack indicates that the peer
is in the Closed or Stopped states, or is otherwise in need of
re-negotiation.

A summary of the Terminate-Request and Terminate-Ack packet formats
is shown below. The fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Code      |  Identifier   |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Data ...
+-+-+-+-+

Code

5 for Terminate-Request;

6 for Terminate-Ack.

Identifier

On transmission, the Identifier field MUST be changed whenever the
content of the Data field changes, and whenever a valid reply has
been received for a previous request. For retransmissions, the
Identifier MAY remain unchanged.

On reception, the Identifier field of the Terminate-Request is
copied into the Identifier field of the Terminate-Ack packet.

Data

The Data field is zero or more octets, and contains uninterpreted
data for use by the sender. The data may consist of any binary
value. The end of the field is indicated by the Length.

5.6. Code-Reject

Description

Reception of a LCP packet with an unknown Code indicates that the
peer is operating with a different version. This MUST be reported
back to the sender of the unknown Code by transmitting a Code-
Reject.

Upon reception of the Code-Reject of a code which is fundamental
to this version of the protocol, the implementation SHOULD report
the problem and drop the connection, since it is unlikely that the
situation can be rectified automatically.

A summary of the Code-Reject packet format is shown below. The
fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Code      |  Identifier   |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Rejected-Packet ...
+-+-+-+-+-+-+-+-+

Code

7 for Code-Reject.

Identifier

The Identifier field MUST be changed for each Code-Reject sent.

Rejected-Packet

The Rejected-Packet field contains a copy of the LCP packet which
is being rejected. It begins with the Information field, and does
not include any Data Link Layer headers nor an FCS. The
Rejected-Packet MUST be truncated to comply with the peer's
established MRU.
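
Building a Code-Reject with the required truncation might look like
the sketch below. The function name build_code_reject is
hypothetical, and the peer's MRU is assumed to bound the whole LCP
packet, since that packet is carried in the PPP Information field.

```python
import struct


def build_code_reject(identifier: int, rejected: bytes,
                      peer_mru: int = 1500) -> bytes:
    """Return a Code-Reject (code 7) carrying a copy of `rejected`."""
    # Leave room for the 4-octet header, then truncate the copy so
    # the complete packet does not exceed the peer's MRU.
    room = max(peer_mru - 4, 0)
    copy = rejected[:room]
    return struct.pack("!BBH", 7, identifier, 4 + len(copy)) + copy
```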

5.7. Protocol-Reject

Description

Reception of a PPP packet with an unknown Protocol field indicates
that the peer is attempting to use a protocol which is
unsupported. This usually occurs when the peer attempts to
configure a new protocol. If the LCP automaton is in the Opened
state, then this MUST be reported back to the peer by transmitting
a Protocol-Reject.

Upon reception of a Protocol-Reject, the implementation MUST stop
sending packets of the indicated protocol at the earliest
opportunity.

Protocol-Reject packets can only be sent in the LCP Opened state.
Protocol-Reject packets received in any state other than the LCP
Opened state SHOULD be silently discarded.

A summary of the Protocol-Reject packet format is shown below. The
fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Code      |  Identifier   |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Rejected-Protocol       |      Rejected-Information ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Code

8 for Protocol-Reject.

Identifier

The Identifier field MUST be changed for each Protocol-Reject
sent.

Rejected-Protocol

The Rejected-Protocol field is two octets, and contains the PPP
Protocol field of the packet which is being rejected.

Rejected-Information

The Rejected-Information field contains a copy of the packet which
is being rejected. It begins with the Information field, and does
not include any Data Link Layer headers nor an FCS. The
Rejected-Information MUST be truncated to comply with the peer's
established MRU.
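
The state restriction and the packet layout can be combined in one
sketch. The function name build_protocol_reject and the boolean
lcp_opened parameter are hypothetical stand-ins for the automaton
state of a real implementation.

```python
import struct


def build_protocol_reject(identifier: int, protocol: int,
                          information: bytes, lcp_opened: bool,
                          peer_mru: int = 1500):
    """Return a Protocol-Reject (code 8), or None outside Opened."""
    if not lcp_opened:
        return None  # Protocol-Reject is only sent in the Opened state
    # Header is 4 octets plus the 2-octet Rejected-Protocol field;
    # truncate the copy so the packet fits within the peer's MRU.
    room = max(peer_mru - 6, 0)
    copy = information[:room]
    header = struct.pack("!BBHH", 8, identifier, 6 + len(copy), protocol)
    return header + copy
```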

5.8. Echo-Request and Echo-Reply

Description

LCP includes Echo-Request and Echo-Reply Codes in order to provide
a Data Link Layer loopback mechanism for use in exercising both
directions of the link. This is useful as an aid in debugging,
link quality determination, performance testing, and for numerous
other functions.

Upon reception of an Echo-Request in the LCP Opened state, an
Echo-Reply MUST be transmitted.

Echo-Request and Echo-Reply packets MUST only be sent in the LCP
Opened state. Echo-Request and Echo-Reply packets received in any
state other than the LCP Opened state SHOULD be silently
discarded.

A summary of the Echo-Request and Echo-Reply packet formats is shown
below. The fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Code      |  Identifier   |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Magic-Number                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Data ...
+-+-+-+-+

Code

9 for Echo-Request;

10 for Echo-Reply.

Identifier

On transmission, the Identifier field MUST be changed whenever the
content of the Data field changes, and whenever a valid reply has
been received for a previous request. For retransmissions, the
Identifier MAY remain unchanged.

On reception, the Identifier field of the Echo-Request is copied
into the Identifier field of the Echo-Reply packet.

Magic-Number

The Magic-Number field is four octets, and aids in detecting links
which are in the looped-back condition. Until the Magic-Number
Configuration Option has been successfully negotiated, the Magic-
Number MUST be transmitted as zero. See the Magic-Number
Configuration Option for further explanation.

Data

The Data field is zero or more octets, and contains uninterpreted
data for use by the sender. The data may consist of any binary
value. The end of the field is indicated by the Length.
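
A responder for Echo-Request might be sketched as below. The
function name handle_echo_request is hypothetical. Returning the
received Data in the reply is an assumption; the text above only
requires that the Identifier be copied and that the Magic-Number be
zero until that option has been negotiated.

```python
import struct

ECHO_REQUEST, ECHO_REPLY = 9, 10


def handle_echo_request(packet: bytes, local_magic: int = 0,
                        lcp_opened: bool = True):
    """Return the Echo-Reply for `packet`, or None to discard it."""
    # Echo packets outside the Opened state are silently discarded.
    if not lcp_opened or len(packet) < 8:
        return None
    code, identifier, length = struct.unpack("!BBH", packet[:4])
    if code != ECHO_REQUEST or not 8 <= length <= len(packet):
        return None
    data = packet[8:length]  # skip the sender's Magic-Number
    # Copy the Identifier; send our own Magic-Number (zero if the
    # Magic-Number option has not been negotiated).
    header = struct.pack("!BBHI", ECHO_REPLY, identifier,
                         8 + len(data), local_magic)
    return header + data
```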

5.9. Discard-Request

Description

LCP includes a Discard-Request Code in order to provide a Data
Link Layer sink mechanism for use in exercising the local to
remote direction of the link. This is useful as an aid in
debugging, performance testing, and for numerous other functions.

Discard-Request packets MUST only be sent in the LCP Opened state.
On reception, the receiver MUST silently discard any Discard-
Request that it receives.

A summary of the Discard-Request packet format is shown below. The
fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Code      |  Identifier   |            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Magic-Number                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Data ...
+-+-+-+-+

Code

11 for Discard-Request.

Identifier

The Identifier field MUST be changed for each Discard-Request
sent.

Magic-Number

The Magic-Number field is four octets, and aids in detecting links
which are in the looped-back condition. Until the Magic-Number
Configuration Option has been successfully negotiated, the Magic-
Number MUST be transmitted as zero. See the Magic-Number
Configuration Option for further explanation.

Data

The Data field is zero or more octets, and contains uninterpreted
data for use by the sender. The data may consist of any binary
value. The end of the field is indicated by the Length.

6. LCP Configuration Options

LCP Configuration Options allow negotiation of modifications to the
default characteristics of a point-to-point link. If a Configuration
Option is not included in a Configure-Request packet, the default
value for that Configuration Option is assumed.

Some Configuration Options MAY be listed more than once. The effect
of this is Configuration Option specific, and is specified by each
such Configuration Option description. (None of the Configuration
Options in this specification can be listed more than once.)

The end of the list of Configuration Options is indicated by the
Length field of the LCP packet.

Unless otherwise specified, all Configuration Options apply in a
half-duplex fashion; typically, in the receive direction of the link
from the point of view of the Configure-Request sender.

Design Philosophy

The options indicate additional capabilities or requirements of
the implementation that is requesting the option. An
implementation which does not understand any option SHOULD
interoperate with one which implements every option.

A default is specified for each option which allows the link to
correctly function without negotiation of the option, although
perhaps with less than optimal performance.

Except where explicitly specified, acknowledgement of an option
does not require the peer to take any additional action other than
the default.

It is not necessary to send the default values for the options in
a Configure-Request.

A summary of the Configuration Option format is shown below. The
fields are transmitted from left to right.

 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |    Data ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type

The Type field is one octet, and indicates the type of
Configuration Option. Up-to-date values of the LCP Option Type
field are specified in the most recent "Assigned Numbers" RFC [2].
This document concerns the following values:

0 RESERVED
1 Maximum-Receive-Unit
3 Authentication-Protocol
4 Quality-Protocol
5 Magic-Number
7 Protocol-Field-Compression
8 Address-and-Control-Field-Compression

Length

The Length field is one octet, and indicates the length of this
Configuration Option including the Type, Length and Data fields.

If a negotiable Configuration Option is received in a Configure-
Request, but with an invalid or unrecognized Length, a Configure-
Nak SHOULD be transmitted which includes the desired Configuration
Option with an appropriate Length and Data.

Data

The Data field is zero or more octets, and contains information
specific to the Configuration Option. The format and length of
the Data field is determined by the Type and Length fields.

When the Data field is indicated by the Length to extend beyond
the end of the Information field, the entire packet is silently
discarded without affecting the automaton.
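
Implementation Note (illustrative, not part of the specification):

The Type/Length/Data layout above can be walked as in the following
minimal Python sketch. The name parse_options and the use of
ValueError to signal a list that cannot be walked are assumptions
made for illustration; a real implementation would map that error to
silently discarding the packet or to a Configure-Nak, as described
above.

```python
from typing import List, Tuple


def parse_options(info: bytes) -> List[Tuple[int, bytes]]:
    """Split a list of Configuration Options into (type, data) pairs.

    `info` is assumed to hold only the options, already trimmed to the
    Length field of the LCP packet.
    """
    options = []
    offset = 0
    while offset < len(info):
        if len(info) - offset < 2:
            raise ValueError("truncated option header")
        opt_type = info[offset]
        opt_len = info[offset + 1]  # counts Type, Length and Data octets
        if opt_len < 2:
            raise ValueError("Length smaller than the option header")
        if offset + opt_len > len(info):
            raise ValueError("Data extends beyond the Information field")
        options.append((opt_type, info[offset + 2:offset + opt_len]))
        offset += opt_len
    return options
```

For example, the octets 01 04 05 dc 07 02 yield an option of Type 1
with two Data octets followed by an option of Type 7 with no Data.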

6.1. Maximum-Receive-Unit (MRU)

Description

This Configuration Option may be sent to inform the peer that the
implementation can receive larger packets, or to request that the
peer send smaller packets.

The default value is 1500 octets. If smaller packets are
requested, an implementation MUST still be able to receive the
full 1500 octet information field in case link synchronization is
lost.

Implementation Note:

This option is used to indicate an implementation capability.
The peer is not required to maximize the use of the capacity.
For example, when an MRU is indicated which is 2048 octets, the
peer is not required to send any packet with 2048 octets. The
peer need not Configure-Nak to indicate that it will only send
smaller packets, since the implementation will always require
support for at least 1500 octets.

A summary of the Maximum-Receive-Unit Configuration Option format is
shown below. The fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |      Maximum-Receive-Unit     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type

1

Length

4

Maximum-Receive-Unit

The Maximum-Receive-Unit field is two octets, and specifies the
maximum number of octets in the Information and Padding fields.
It does not include the framing, Protocol field, FCS, nor any
transparency bits or bytes.
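
Implementation Note (illustrative, not part of the specification):

A minimal Python sketch of encoding and decoding this option follows.
The function names are hypothetical, and the two-octet value is
assumed to be carried most significant octet first, consistent with
fields being transmitted from left to right.

```python
import struct

MRU_TYPE = 1
MRU_LENGTH = 4
DEFAULT_MRU = 1500  # default value when the option is not negotiated


def encode_mru(mru: int) -> bytes:
    """Build the option: Type, Length, then the 16-bit MRU value."""
    return struct.pack("!BBH", MRU_TYPE, MRU_LENGTH, mru)


def decode_mru(option: bytes) -> int:
    """Return the MRU from a 4-octet option, checking Type and Length."""
    opt_type, opt_len, mru = struct.unpack("!BBH", option)
    if opt_type != MRU_TYPE or opt_len != MRU_LENGTH:
        raise ValueError("not a Maximum-Receive-Unit option")
    return mru
```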

6.2. Authentication-Protocol

Description

On some links it may be desirable to require a peer to
authenticate itself before allowing network-layer protocol packets
to be exchanged.

This Configuration Option provides a method to negotiate the use
of a specific protocol for authentication. By default,
authentication is not required.

An implementation MUST NOT include multiple Authentication-
Protocol Configuration Options in its Configure-Request packets.
Instead, it SHOULD attempt to configure the most desirable
protocol first. If that protocol is Configure-Nak'd, then the
implementation SHOULD attempt the next most desirable protocol in
the next Configure-Request.

The implementation sending the Configure-Request is indicating
that it expects authentication from its peer. If an
implementation sends a Configure-Ack, then it is agreeing to
authenticate with the specified protocol. An implementation
receiving a Configure-Ack SHOULD expect the peer to authenticate
with the acknowledged protocol.

There is no requirement that authentication be full-duplex or that
the same protocol be used in both directions. It is perfectly
acceptable for different protocols to be used in each direction.
This will, of course, depend on the specific protocols negotiated.

A summary of the Authentication-Protocol Configuration Option format
is shown below. The fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |     Authentication-Protocol   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Data ...
+-+-+-+-+

Type

3

Length

>= 4

Authentication-Protocol

The Authentication-Protocol field is two octets, and indicates the
authentication protocol desired. Values for this field are always
the same as the PPP Protocol field values for that same
authentication protocol.

Up-to-date values of the Authentication-Protocol field are
specified in the most recent "Assigned Numbers" RFC [2]. Current
values are assigned as follows:

Value (in hex) Protocol

c023 Password Authentication Protocol
c223 Challenge Handshake Authentication Protocol

Data

The Data field is zero or more octets, and contains additional
data as determined by the particular protocol.
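
Implementation Note (illustrative, not part of the specification):

The sketch below builds the option and picks the next protocol to
request after a Configure-Nak, following the "most desirable first"
rule in the Description. The names encode_auth and next_auth_choice,
and the representation of preferences as an ordered list, are
assumptions made for illustration only.

```python
import struct
from typing import List, Optional

AUTH_TYPE = 3
PAP = 0xC023   # Password Authentication Protocol
CHAP = 0xC223  # Challenge Handshake Authentication Protocol


def encode_auth(protocol: int, data: bytes = b"") -> bytes:
    """Build the option; Length counts Type, Length, protocol and Data."""
    length = 4 + len(data)
    if length > 255:
        raise ValueError("option does not fit in a one-octet Length")
    return struct.pack("!BBH", AUTH_TYPE, length, protocol) + data


def next_auth_choice(preferences: List[int], nakked: int) -> Optional[int]:
    """Return the protocol to request after `nakked` was Configure-Nak'd.

    `preferences` is ordered from most to least desirable; None means
    the list is exhausted.
    """
    index = preferences.index(nakked)
    if index + 1 < len(preferences):
        return preferences[index + 1]
    return None
```

Only one Authentication-Protocol option is placed in any single
Configure-Request; the second function is what selects the value for
the following request.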

6.3. Quality-Protocol

Description

On some links it may be desirable to determine when, and how
often, the link is dropping data. This process is called link
quality monitoring.

This Configuration Option provides a method to negotiate the use
of a specific protocol for link quality monitoring. By default,
link quality monitoring is disabled.

The implementation sending the Configure-Request is indicating
that it expects to receive monitoring information from its peer.
If an implementation sends a Configure-Ack, then it is agreeing to
send the specified protocol. An implementation receiving a
Configure-Ack SHOULD expect the peer to send the acknowledged
protocol.

There is no requirement that quality monitoring be full-duplex or
that the same protocol be used in both directions. It is
perfectly acceptable for different protocols to be used in each
direction. This will, of course, depend on the specific protocols
negotiated.

A summary of the Quality-Protocol Configuration Option format is
shown below. The fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |        Quality-Protocol       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Data ...
+-+-+-+-+

Type

4

Length

>= 4

Quality-Protocol

The Quality-Protocol field is two octets, and indicates the link
quality monitoring protocol desired. Values for this field are
always the same as the PPP Protocol field values for that same
monitoring protocol.

Up-to-date values of the Quality-Protocol field are specified in
the most recent "Assigned Numbers" RFC [2]. Current values are
assigned as follows:

Value (in hex) Protocol

c025 Link Quality Report

Data

The Data field is zero or more octets, and contains additional
data as determined by the particular protocol.

6.4. Magic-Number

Description

This Configuration Option provides a method to detect looped-back
links and other Data Link Layer anomalies. This Configuration
Option MAY be required by some other Configuration Options such as
the Quality-Protocol Configuration Option. By default, the
Magic-Number is not negotiated, and zero is inserted where a
Magic-Number might otherwise be used.

Before this Configuration Option is requested, an implementation
MUST choose its Magic-Number. It is recommended that the Magic-
Number be chosen in the most random manner possible in order to
guarantee with very high probability that an implementation will
arrive at a unique number. A good way to choose a unique random
number is to start with a unique seed. Suggested sources of
uniqueness include machine serial numbers, other network hardware
addresses, time-of-day clocks, etc. Particularly good random
number seeds are precise measurements of the inter-arrival time of
physical events such as packet reception on other connected
networks, server response time, or the typing rate of a human
user. It is also suggested that as many sources as possible be
used simultaneously.

When a Configure-Request is received with a Magic-Number
Configuration Option, the received Magic-Number is compared with
the Magic-Number of the last Configure-Request sent to the peer.
If the two Magic-Numbers are different, then the link is not
looped-back, and the Magic-Number SHOULD be acknowledged. If the
two Magic-Numbers are equal, then it is possible, but not certain,
that the link is looped-back and that this Configure-Request is
actually the one last sent. To determine this, a Configure-Nak
MUST be sent specifying a different Magic-Number value. A new
Configure-Request SHOULD NOT be sent to the peer until normal
processing would cause it to be sent (that is, until a Configure-
Nak is received or the Restart timer runs out).
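
Implementation Note (illustrative, not part of the specification):

The comparison described above can be sketched in Python as follows.
The function names are hypothetical. The sketch draws its random
values from the operating system through the standard "secrets"
module, which is a substitute for the seed sources suggested earlier
rather than the method this document describes. It also Naks a
received value of zero, which the field definition for this option
declares illegal.

```python
import secrets
from typing import Tuple


def new_magic(avoid: int = 0) -> int:
    """Pick a random non-zero 32-bit Magic-Number different from `avoid`."""
    while True:
        candidate = secrets.randbits(32)
        if candidate != 0 and candidate != avoid:
            return candidate


def on_configure_request(received: int, last_sent: int) -> Tuple[str, int]:
    """Decide the reply to a peer's Magic-Number Configuration Option.

    Returns ("ack", received) when the numbers differ, otherwise
    ("nak", suggestion) carrying a different Magic-Number.
    """
    if received != 0 and received != last_sent:
        return ("ack", received)
    # Equal numbers (possible loop-back) or an illegal zero: suggest
    # a different value in a Configure-Nak.
    return ("nak", new_magic(avoid=last_sent))
```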

Reception of a Configure-Nak with a Magic-Number different from
that of the last Configure-Nak sent to the peer proves that a link
is not looped-back, and indicates a unique Magic-Number. If the
Magic-Number is equal to the one sent in the last Configure-Nak,
the possibility of a looped-back link is increased, and a new
Magic-Number MUST be chosen. In either case, a new Configure-
Request SHOULD be sent with the new Magic-Number.

If the link is indeed looped-back, this sequence (transmit
Configure-Request, receive Configure-Request, transmit Configure-Nak,
receive Configure-Nak) will repeat over and over again. If
the link is not looped-back, this sequence might occur a few
times, but it is extremely unlikely to occur repeatedly. More
likely, the Magic-Numbers chosen at either end will quickly
diverge, terminating the sequence. The following table shows the
probability of collisions assuming that both ends of the link
select Magic-Numbers with a perfectly uniform distribution:

Number of Collisions        Probability
--------------------   ------------------------
        1              1/2**32      = 2.3 E-10
        2              (1/2**32)**2 = 5.4 E-20
        3              (1/2**32)**3 = 1.3 E-29
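
Implementation Note (illustrative, not part of the specification):

The figures in the table follow from raising the probability of a
single collision, 1/2**32, to the number of consecutive collisions.
The short Python sketch below reproduces them; the function name is
hypothetical, and independent, uniformly distributed choices at both
ends are assumed, as stated above.

```python
def collision_probability(collisions: int) -> float:
    """Probability that `collisions` consecutive Magic-Number picks match."""
    single = 1 / 2**32
    return single ** collisions


for n in (1, 2, 3):
    # Prints the three table rows in scientific notation.
    print(n, f"{collision_probability(n):.1e}")
```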

Good sources of uniqueness or randomness are required for this
divergence to occur. If a good source of uniqueness cannot be
found, it is recommended that this Configuration Option not be
enabled; Configure-Requests with the option SHOULD NOT be
transmitted and any Magic-Number Configuration Options which the
peer sends SHOULD be either acknowledged or rejected. In this
case, looped-back links cannot be reliably detected by the
implementation, although they may still be detectable by the peer.

If an implementation does transmit a Configure-Request with a
Magic-Number Configuration Option, then it MUST NOT respond with a
Configure-Reject when it receives a Configure-Request with a
Magic-Number Configuration Option. That is, if an implementation
desires to use Magic Numbers, then it MUST also allow its peer to
do so. If an implementation does receive a Configure-Reject in
response to a Configure-Request, it can only mean that the link is
not looped-back, and that its peer will not be using Magic-
Numbers. In this case, an implementation SHOULD act as if the
negotiation had been successful (as if it had instead received a
Configure-Ack).

The Magic-Number also may be used to detect looped-back links
during normal operation, as well as during Configuration Option
negotiation. All LCP Echo-Request, Echo-Reply, and Discard-
Request packets have a Magic-Number field. If Magic-Number has
been successfully negotiated, an implementation MUST transmit
these packets with the Magic-Number field set to its negotiated
Magic-Number.

The Magic-Number field of these packets SHOULD be inspected on
reception. All received Magic-Number fields MUST be equal to
either zero or the peer's unique Magic-Number, depending on
whether or not the peer negotiated a Magic-Number.

Reception of a Magic-Number field equal to the negotiated local
Magic-Number indicates a looped-back link. Reception of a Magic-
Number other than the negotiated local Magic-Number, the peer's
negotiated Magic-Number, or zero if the peer didn't negotiate one,
indicates a link which has been (mis)configured for communications
with a different peer.
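
Implementation Note (illustrative, not part of the specification):

The three outcomes described above for a received Magic-Number field
can be sketched as a small classification function. The name
classify_magic and the returned strings are assumptions made for
illustration; a value of zero stands for "no Magic-Number negotiated"
on either side.

```python
def classify_magic(received: int, local: int, peer: int) -> str:
    """Classify the Magic-Number field of a received LCP packet.

    `local` is this end's negotiated Magic-Number (0 if none) and
    `peer` is the peer's negotiated Magic-Number (0 if none).
    """
    if local != 0 and received == local:
        return "looped-back"
    if received == peer:
        return "ok"
    return "misconfigured"
```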

Procedures for recovery from either case are unspecified, and may
vary from implementation to implementation. A somewhat
pessimistic procedure is to assume an LCP Down event. A further
Open event will begin the process of re-establishing the link,
which can't complete until the looped-back condition is
terminated, and Magic-Numbers are successfully negotiated. A more
optimistic procedure (in the case of a looped-back link) is to
begin transmitting LCP Echo-Request packets until an appropriate
Echo-Reply is received, indicating a termination of the looped-
back condition.

A summary of the Magic-Number Configuration Option format is shown
below. The fields are transmitted from left to right.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |          Magic-Number
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      Magic-Number (cont)       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type

5

Length

6

Magic-Number

The Magic-Number field is four octets, and indicates a number
which is very likely to be unique to one end of the link. A
Magic-Number of zero is illegal and MUST always be Nak'd, if it is
not Rejected outright.

6.5. Protocol-Field-Compression (PFC)

Description

This Configuration Option provides a method to negotiate the
compression of the PPP Protocol field. By default, all
implementations MUST transmit packets with two octet PPP Protocol
fields.

PPP Protocol field numbers are chosen such that some values may be
compressed into a single octet form which is clearly
distinguishable from the two octet form. This Configuration
Option is sent to inform the peer that the implementation can
receive such single octet Protocol fields.

As previously mentioned, the Protocol field uses an extension
mechanism consistent with the ISO 3309 extension mechanism for the
Address field; the Least Significant Bit (LSB) of each octet is
used to indicate extension of the Protocol field. A binary "0" as
the LSB indicates that the Protocol field continues with the
following octet. The presence of a binary "1" as the LSB marks
the last octet of the Protocol field. Notice that any number of
"0" octets may be prepended to the field, and will still indicate
the same value (consider the two binary representations for 3,
00000011 and 00000000 00000011).

When using low speed links, it is desirable to conserve bandwidth
by sending as little redundant data as possible. The Protocol-
Field-Compression Configuration Option allows a trade-off between
implementation simplicity and bandwidth efficiency. If
successfully negotiated, the ISO 3309 extension mechanism may be
used to compress the Protocol field to one octet instead of two.
The large majority of packets are compressible since data
protocols are typically assigned with Protocol field values less
than 256.

Compressed Protocol fields MUST NOT be transmitted unless this
Configuration Option has been negotiated. When negotiated, PPP
implementations MUST accept PPP packets with either double-octet
or single-octet Protocol fields, and MUST NOT distinguish between
them.
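
Implementation Note (illustrative, not part of the specification):

The extension-bit rule above lends itself to a short Python sketch.
The function names are hypothetical. The values used in the usage
note (0x0021 for a data protocol and 0xc021 for LCP) are taken from
the PPP protocol number assignments and are given only as examples.

```python
from typing import Tuple


def read_protocol(frame: bytes) -> Tuple[int, bytes]:
    """Read a one- or two-octet Protocol field; return (protocol, rest).

    An odd first octet (LSB = 1) is a complete single-octet field; an
    even first octet (LSB = 0) means a second octet follows.
    """
    if not frame:
        raise ValueError("empty frame")
    first = frame[0]
    if first & 0x01:
        return first, frame[1:]
    if len(frame) < 2:
        raise ValueError("truncated Protocol field")
    return (first << 8) | frame[1], frame[2:]


def write_protocol(protocol: int, pfc_negotiated: bool) -> bytes:
    """Emit the Protocol field, compressing only when permitted.

    Values above 0xff cannot be compressed at all, so an LCP packet
    always keeps its two-octet form under this rule.
    """
    if pfc_negotiated and protocol <= 0xFF:
        return bytes([protocol])
    return protocol.to_bytes(2, "big")
```

With the option negotiated, write_protocol(0x0021, True) produces the
single octet 21, while write_protocol(0xc021, True) still produces
c0 21. A receiver using read_protocol obtains the same value from
either form.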

The Protocol field is never compressed when sending any LCP
packet. This rule guarantees unambiguous recognition of LCP
packets.

When a Protocol field is compressed, the Data Link Layer FCS field
is calculated on the compressed frame, not the original
uncompressed frame.

A summary of the Protocol-Field-Compression Configuration Option
format is shown below. The fields are transmitted from left to
right.

 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type

7

Length

2

6.6. Address-and-Control-Field-Compression (ACFC)

Description

This Configuration Option provides a method to negotiate the
compression of the Data Link Layer Address and Control fields. By
default, all implementations MUST transmit frames with Address and
Control fields appropriate to the link framing.

Since these fields usually have constant values for point-to-point
links, they are easily compressed. This Configuration Option is
sent to inform the peer that the implementation can receive
compressed Address and Control fields.

If a compressed frame is received when Address-and-Control-Field-
Compression has not been negotiated, the implementation MAY
silently discard the frame.

The Address and Control fields MUST NOT be compressed when sending
any LCP packet. This rule guarantees unambiguous recognition of
LCP packets.

When the Address and Control fields are compressed, the Data Link
Layer FCS field is calculated on the compressed frame, not the
original uncompressed frame.
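
Implementation Note (illustrative, not part of the specification):

The Python sketch below shows one way the sending and receiving rules
could be applied. It assumes HDLC-like framing in which the Address
and Control fields are the constant octets ff 03; those values come
from the framing specification, not from this section, and the
function names are hypothetical.

```python
ADDRESS_CONTROL = b"\xff\x03"  # assumed constant Address and Control octets


def add_address_control(payload: bytes, acfc_negotiated: bool,
                        is_lcp: bool) -> bytes:
    """Prefix the Address and Control fields unless they may be omitted.

    LCP packets always carry the fields, whatever was negotiated.
    """
    if acfc_negotiated and not is_lcp:
        return payload
    return ADDRESS_CONTROL + payload


def strip_address_control(frame: bytes) -> bytes:
    """Remove the Address and Control fields if they are present."""
    if frame.startswith(ADDRESS_CONTROL):
        return frame[2:]
    return frame
```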

A summary of the Address-and-Control-Field-Compression configuration
option format is shown below. The fields are transmitted from left
to right.

 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Type

8

Length

2

Security Considerations

Security issues are briefly discussed in sections concerning the
Authentication Phase, the Close event, and the Authentication-
Protocol Configuration Option.

References

[1] Perkins, D., "Requirements for an Internet Standard Point-to-
Point Protocol", RFC 1547, Carnegie Mellon University,
December 1993.

[2] Reynolds, J., and Postel, J., "Assigned Numbers", STD 2, RFC
1340, USC/Information Sciences Institute, July 1992.

Acknowledgements

This document is the product of the Point-to-Point Protocol Working
Group of the Internet Engineering Task Force (IETF). Comments should
be submitted to the ietf-ppp@merit.edu mailing list.

Much of the text in this document is taken from the working group
requirements [1]; and RFCs 1171 & 1172, by Drew Perkins while at
Carnegie Mellon University, and by Russ Hobby of the University of
California at Davis.

William Simpson was principally responsible for introducing
consistent terminology and philosophy, and the re-design of the phase
and negotiation state machines.

Many people spent significant time helping to develop the Point-to-
Point Protocol. The complete list of people is too numerous to list,
but the following people deserve special thanks: Rick Adams, Ken
Adelman, Fred Baker, Mike Ballard, Craig Fox, Karl Fox, Phill Gross,
Kory Hamzeh, former WG chair Russ Hobby, David Kaufman, former WG
chair Steve Knowles, Mark Lewis, former WG chair Brian Lloyd, John
LoVerso, Bill Melohn, Mike Patton, former WG chair Drew Perkins, Greg
Satz, John Shriver, Vernon Schryver, and Asher Waldfogel.

Special thanks to Morning Star Technologies for providing computing
resources and network access support for writing this specification.

Chair's Address

The working group can be contacted via the current chair:

Fred Baker
Advanced Computer Communications
315 Bollay Drive
Santa Barbara, California 93117

fbaker@acc.com

Editor's Address

Questions about this memo can also be directed to:

William Allen Simpson
Daydreamer
Computer Systems Consulting Services
1384 Fontaine
Madison Heights, Michigan 48071

Bill.Simpson@um.cc.umich.edu
bsimpson@MorningStar.com

RFC 2406 – IP Encapsulating Security Payload (ESP)

 
Network Working Group                                            S. Kent
Request for Comments: 2406                                      BBN Corp
Obsoletes: 1827                                              R. Atkinson
Category: Standards Track                                  @Home Network
                                                           November 1998

IP Encapsulating Security Payload (ESP)

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1998). All Rights Reserved.

Table of Contents

1. Introduction
2. Encapsulating Security Payload Packet Format
   2.1 Security Parameters Index
   2.2 Sequence Number
   2.3 Payload Data
   2.4 Padding (for Encryption)
   2.5 Pad Length
   2.6 Next Header
   2.7 Authentication Data
3. Encapsulating Security Protocol Processing
   3.1 ESP Header Location
   3.2 Algorithms
       3.2.1 Encryption Algorithms
       3.2.2 Authentication Algorithms
   3.3 Outbound Packet Processing
       3.3.1 Security Association Lookup
       3.3.2 Packet Encryption
       3.3.3 Sequence Number Generation
       3.3.4 Integrity Check Value Calculation
       3.3.5 Fragmentation
   3.4 Inbound Packet Processing
       3.4.1 Reassembly
       3.4.2 Security Association Lookup
       3.4.3 Sequence Number Verification
       3.4.4 Integrity Check Value Verification
       3.4.5 Packet Decryption
4. Auditing
5. Conformance Requirements
6. Security Considerations
7. Differences from RFC 1827
Acknowledgements
References
Disclaimer
Author Information
Full Copyright Statement

1. Introduction

The Encapsulating Security Payload (ESP) header is designed to
provide a mix of security services in IPv4 and IPv6. ESP may be
applied alone, in combination with the IP Authentication Header (AH)
[KA97b], or in a nested fashion, e.g., through the use of tunnel mode
(see "Security Architecture for the Internet Protocol" [KA97a],
hereafter referred to as the Security Architecture document).
Security services can be provided between a pair of communicating
hosts, between a pair of communicating security gateways, or between
a security gateway and a host. For more details on how to use ESP
and AH in various network environments, see the Security Architecture
document [KA97a].

The ESP header is inserted after the IP header and before the upper
layer protocol header (transport mode) or before an encapsulated IP
header (tunnel mode). These modes are described in more detail
below.

ESP is used to provide confidentiality, data origin authentication,
connectionless integrity, an anti-replay service (a form of partial
sequence integrity), and limited traffic flow confidentiality. The
set of services provided depends on options selected at the time of
Security Association establishment and on the placement of the
implementation. Confidentiality may be selected independent of all
other services. However, use of confidentiality without
integrity/authentication (either in ESP or separately in AH) may
subject traffic to certain forms of active attacks that could
undermine the confidentiality service (see [Bel96]). Data origin
authentication and connectionless integrity are joint services
(hereafter referred to jointly as "authentication") and are offered as
an option in conjunction with (optional) confidentiality. The anti-
replay service may be selected only if data origin authentication is
selected, and its election is solely at the discretion of the
receiver. (Although the default calls for the sender to increment
the Sequence Number used for anti-replay, the service is effective
only if the receiver checks the Sequence Number.) Traffic flow
confidentiality requires selection of tunnel mode, and is most
effective if implemented at a security gateway, where traffic
aggregation may be able to mask true source-destination patterns.
Note that although both confidentiality and authentication are
optional, at least one of them MUST be selected.

It is assumed that the reader is familiar with the terms and concepts
described in the Security Architecture document. In particular, the
reader should be familiar with the definitions of security services
offered by ESP and AH, the concept of Security Associations, the ways
in which ESP can be used in conjunction with the Authentication
Header (AH), and the different key management options available for
ESP and AH. (With regard to the last topic, the current key
management options required for both AH and ESP are manual keying and
automated keying via IKE [HC98].)

The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
document, are to be interpreted as described in RFC 2119 [Bra97].

2. Encapsulating Security Payload Packet Format

The protocol header (IPv4, IPv6, or Extension) immediately preceding
the ESP header will contain the value 50 in its Protocol (IPv4) or
Next Header (IPv6, Extension) field [STD-2].

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ----
|               Security Parameters Index (SPI)                 | ^Auth.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Cov-
|                      Sequence Number                          | |erage
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ----
|                    Payload Data* (variable)                   | |   ^
~                                                               ~ |   |
|                                                               | |Conf.
+               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Cov-
|               |     Padding (0-255 bytes)                     | |erage*
+-+-+-+-+-+-+-+-+               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |   |
|                               |  Pad Length   | Next Header   | v   v
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ------
|                 Authentication Data (variable)                |
~                                                               ~
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

* If included in the Payload field, cryptographic
synchronization data, e.g., an Initialization Vector (IV, see
Section 2.3), usually is not encrypted per se, although it
often is referred to as being part of the ciphertext.

The following subsections define the fields in the header format.
"Optional" means that the field is omitted if the option is not
selected, i.e., it is present in neither the packet as transmitted
nor as formatted for computation of an Integrity Check Value (ICV,
see Section 2.7). Whether or not an option is selected is defined as
part of Security Association (SA) establishment. Thus the format of
ESP packets for a given SA is fixed, for the duration of the SA. In
contrast, "mandatory" fields are always present in the ESP packet
format, for all SAs.
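
Implementation Note (illustrative, not part of the specification):

The packet layout shown in the figure can be taken apart as in the
following Python sketch. The names split_esp, strip_trailer, and
icv_len are hypothetical; icv_len stands for the Authentication Data
length fixed for the SA (zero when authentication is not selected).
strip_trailer expects the already decrypted bytes, exclusive of any
IV, and decryption itself is outside the sketch.

```python
import struct
from typing import NamedTuple, Tuple


class EspFields(NamedTuple):
    spi: int
    sequence: int
    body: bytes  # Payload Data, Padding, Pad Length, Next Header
    icv: bytes   # Authentication Data (empty if not selected)


def split_esp(packet: bytes, icv_len: int) -> EspFields:
    """Separate the fixed header, the body and the Authentication Data."""
    if len(packet) < 8 + icv_len:
        raise ValueError("packet shorter than the mandatory fields")
    spi, sequence = struct.unpack("!II", packet[:8])
    end = len(packet) - icv_len
    return EspFields(spi, sequence, packet[8:end], packet[end:])


def strip_trailer(plaintext: bytes) -> Tuple[bytes, bytes, int]:
    """Return (payload, padding, next_header) from decrypted bytes."""
    if len(plaintext) < 2:
        raise ValueError("missing Pad Length and Next Header")
    pad_len = plaintext[-2]
    next_header = plaintext[-1]
    if pad_len > len(plaintext) - 2:
        raise ValueError("Pad Length exceeds available data")
    payload_end = len(plaintext) - 2 - pad_len
    return plaintext[:payload_end], plaintext[payload_end:-2], next_header
```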

2.1 Security Parameters Index

The SPI is an arbitrary 32-bit value that, in combination with the
destination IP address and security protocol (ESP), uniquely
identifies the Security Association for this datagram. SPI values in
the range 1 through 255 are reserved by the Internet Assigned Numbers
Authority (IANA) for future use; a reserved SPI value will not
normally be assigned by IANA unless the use of the assigned SPI value
is specified in an RFC. The SPI is ordinarily selected by the
destination system upon establishment of an SA (see the Security
Architecture document for more details). The SPI field is mandatory.

The SPI value of zero (0) is reserved for local, implementation-
specific use and MUST NOT be sent on the wire. For example, a key
management implementation MAY use the zero SPI value to mean "No
Security Association Exists" during the period when the IPsec
implementation has requested that its key management entity establish
a new SA, but the SA has not yet been established.
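
Implementation Note (illustrative, not part of the specification):

The SPI ranges described in this section can be summarized by a small
Python sketch. The function name and the returned labels are
assumptions made for illustration.

```python
def classify_spi(spi: int) -> str:
    """Label a 32-bit SPI value according to the ranges in Section 2.1."""
    if not 0 <= spi <= 0xFFFFFFFF:
        raise ValueError("SPI must fit in 32 bits")
    if spi == 0:
        return "local-use"       # MUST NOT be sent on the wire
    if spi <= 255:
        return "iana-reserved"   # reserved for future use
    return "assignable"
```

A destination system choosing an SPI for a new SA would draw from the
"assignable" range only.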

2.2 Sequence Number

This unsigned 32-bit field contains a monotonically increasing
counter value (sequence number). It is mandatory and is always
present even if the receiver does not elect to enable the anti-replay
service for a specific SA. Processing of the Sequence Number field
is at the discretion of the receiver, i.e., the sender MUST always
transmit this field, but the receiver need not act upon it (see the
discussion of Sequence Number Verification in the "Inbound Packet
Processing" section below).

The sender's counter and the receiver's counter are initialized to 0
when an SA is established. (The first packet sent using a given SA
will have a Sequence Number of 1; see Section 3.3.3 for more details
on how the Sequence Number is generated.) If anti-replay is enabled
(the default), the transmitted Sequence Number must never be allowed
to cycle. Thus, the sender's counter and the receiver's counter MUST
be reset (by establishing a new SA and thus a new key) prior to the
transmission of the 2^32nd packet on an SA.
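
Implementation Note (illustrative, not part of the specification):

The sender-side rule can be sketched in Python as follows. The class
name and the use of OverflowError to signal that a new SA is needed
are assumptions made for illustration, and only the case with
anti-replay enabled (the default) is modelled.

```python
class SequenceCounter:
    """Sender's per-SA counter with anti-replay enabled."""

    MAX = 2**32 - 1  # largest value the 32-bit field can carry

    def __init__(self) -> None:
        self.value = 0  # initialized to 0 when the SA is established

    def next(self) -> int:
        """Return the Sequence Number for the next outbound packet."""
        if self.value == self.MAX:
            # Sending another packet would make the field cycle.
            raise OverflowError("establish a new SA before sending")
        self.value += 1
        return self.value
```

The first call to next() on a fresh counter returns 1, matching the
first packet sent on the SA.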

2.3 Payload Data

Payload Data is a variable-length field containing data described by
the Next Header field. The Payload Data field is mandatory and is an
integral number of bytes in length. If the algorithm used to encrypt
the payload requires cryptographic synchronization data, e.g., an
Initialization Vector (IV), then this data MAY be carried explicitly
in the Payload field. Any encryption algorithm that requires such
explicit, per-packet synchronization data MUST indicate the length,
any structure for such data, and the location of this data as part of
an RFC specifying how the algorithm is used with ESP. If such
synchronization data is implicit, the algorithm for deriving the data
MUST be part of the RFC.

Note that with regard to ensuring the alignment of the (real)
ciphertext in the presence of an IV:

o For some IV-based modes of operation, the receiver treats
the IV as the start of the ciphertext, feeding it into the
algorithm directly. In these modes, alignment of the start
of the (real) ciphertext is not an issue at the receiver.
o In some cases, the receiver reads the IV in separately from
the ciphertext. In these cases, the algorithm
specification MUST address how alignment of the (real)
ciphertext is to be achieved.

2.4 Padding (for Encryption)

Several factors require or motivate use of the Padding field.

o If an encryption algorithm is employed that requires the
plaintext to be a multiple of some number of bytes, e.g.,
the block size of a block cipher, the Padding field is used
to fill the plaintext (consisting of the Payload Data, Pad
Length and Next Header fields, as well as the Padding) to
the size required by the algorithm.

o Padding also may be required, irrespective of encryption
algorithm requirements, to ensure that the resulting
ciphertext terminates on a 4-byte boundary. Specifically,
the Pad Length and Next Header fields must be right aligned
within a 4-byte word, as illustrated in the ESP packet
format figure above, to ensure that the Authentication Data
field (if present) is aligned on a 4-byte boundary.

o Padding beyond that required for the algorithm or alignment
reasons cited above, may be used to conceal the actual
length of the payload, in support of (partial) traffic flow
confidentiality. However, inclusion of such additional
padding has adverse bandwidth implications and thus its use
should be undertaken with care.

The sender MAY add 0-255 bytes of padding. Inclusion of the Padding
field in an ESP packet is optional, but all implementations MUST
support generation and consumption of padding.

a. For the purpose of ensuring that the bits to be encrypted
are a multiple of the algorithm's blocksize (first bullet
above), the padding computation applies to the Payload
Data exclusive of the IV, the Pad Length, and Next Header
fields.

b. For the purposes of ensuring that the Authentication Data
is aligned on a 4-byte boundary (second bullet above), the
padding computation applies to the Payload Data inclusive
of the IV, the Pad Length, and Next Header fields.
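
Implementation Note (illustrative, not part of the specification):

One reading of items a and b is sketched in Python below: the bytes
to be encrypted are the Payload Data (without any IV), the Padding,
and the Pad Length and Next Header octets, while the 4-byte alignment
is checked with the IV included. The function name, its parameters,
and the choice to return the smallest suitable value are assumptions
made for illustration.

```python
def pad_length(payload_len: int, iv_len: int = 0, block_size: int = 1) -> int:
    """Smallest Padding length meeting both block and 4-byte alignment.

    `payload_len` excludes the IV; `iv_len` is the length of any
    explicit IV carried in the Payload field.
    """
    for pad in range(256):  # the Padding field holds 0-255 bytes
        encrypted = payload_len + pad + 2  # plus Pad Length, Next Header
        if encrypted % block_size == 0 and (iv_len + encrypted) % 4 == 0:
            return pad
    raise ValueError("no Padding length in 0-255 satisfies both rules")
```

For a 4-byte payload with an 8-byte IV and an 8-byte block size, the
sketch yields 2 bytes of Padding, so that 8 bytes are encrypted and
the Authentication Data starts on a 4-byte boundary.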

If Padding bytes are needed but the encryption algorithm does not
specify the padding contents, then the following default processing
MUST be used. The Padding bytes are initialized with a series of
(unsigned, 1-byte) integer values. The first padding byte appended
to the plaintext is numbered 1, with subsequent padding bytes making
up a monotonically increasing sequence: 1, 2, 3, ... When this
padding scheme is employed, the receiver SHOULD inspect the Padding
field. (This scheme was selected because of its relative simplicity,
ease of implementation in hardware, and because it offers limited
protection against certain forms of "cut and paste" attacks in the
absence of other integrity measures, if the receiver checks the
padding values upon decryption.)
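As an informal illustration of rules (a) and (b) and of the default
padding contents, the following Python sketch picks the smallest pad
length that satisfies both constraints and fills it with 1, 2, 3, ...
The function names and parameters (default_padding, padding_is_valid,
block_size, iv_len) are hypothetical and are not defined by this
specification.

```python
def default_padding(payload_len: int, block_size: int = 1, iv_len: int = 0) -> bytes:
    """Return the smallest default Padding (1, 2, 3, ...) meeting both rules.

    payload_len excludes the IV; block_size is 1 for stream-like ciphers.
    """
    for pad_len in range(256):  # Pad Length is one byte, so 0..255
        trailer_end = payload_len + pad_len + 2  # + Pad Length + Next Header
        block_ok = trailer_end % block_size == 0       # rule (a): IV excluded
        align_ok = (iv_len + trailer_end) % 4 == 0     # rule (b): IV included
        if block_ok and align_ok:
            return bytes(range(1, pad_len + 1))
    raise ValueError("no pad length in 0..255 satisfies both constraints")


def padding_is_valid(padding: bytes) -> bool:
    """Receiver-side check that the Padding is the sequence 1, 2, 3, ..."""
    return padding == bytes(range(1, len(padding) + 1))
```

For example, a 10-byte payload with an 8-byte block cipher needs four
pad bytes (10 + 4 + 2 = 16), so the sketch returns the bytes 1, 2, 3, 4.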

Any encryption algorithm that requires Padding other than the default
described above, MUST define the Padding contents (e.g., zeros or
random data) and any required receiver processing of these Padding
bytes in an RFC specifying how the algorithm is used with ESP. In
such circumstances, the content of the Padding field will be
determined by the encryption algorithm and mode selected and defined
in the corresponding algorithm RFC. The relevant algorithm RFC MAY
specify that a receiver MUST inspect the Padding field or that a
receiver MUST inform senders of how the receiver will handle the
Padding field.

2.5 Pad Length

The Pad Length field indicates the number of pad bytes immediately
preceding it. The range of valid values is 0-255, where a value of
zero indicates that no Padding bytes are present. The Pad Length
field is mandatory.

2.6 Next Header

The Next Header is an 8-bit field that identifies the type of data
contained in the Payload Data field, e.g., an extension header in
IPv6 or an upper layer protocol identifier. The value of this field
is chosen from the set of IP Protocol Numbers defined in the most
recent "Assigned Numbers" [STD-2] RFC from the Internet Assigned
Numbers Authority (IANA). The Next Header field is mandatory.

2.7 Authentication Data

The Authentication Data is a variable-length field containing an
Integrity Check Value (ICV) computed over the ESP packet minus the
Authentication Data. The length of the field is specified by the
authentication function selected. The Authentication Data field is
optional, and is included only if the authentication service has been
selected for the SA in question. The authentication algorithm
specification MUST specify the length of the ICV and the comparison
rules and processing steps for validation.

3. Encapsulating Security Protocol Processing

3.1 ESP Header Location

Like AH, ESP may be employed in two ways: transport mode or tunnel
mode. The former mode is applicable only to host implementations and
provides protection for upper layer protocols, but not the IP header.
(In this mode, note that for "bump-in-the-stack" or "bump-in-the-
wire" implementations, as defined in the Security Architecture
document, inbound and outbound IP fragments may require an IPsec
implementation to perform extra IP reassembly/fragmentation in order
to both conform to this specification and provide transparent IPsec
support. Special care is required to perform such operations within
these implementations when multiple interfaces are in use.)

In transport mode, ESP is inserted after the IP header and before an
upper layer protocol, e.g., TCP, UDP, ICMP, etc. or before any other
IPsec headers that have already been inserted. In the context of
IPv4, this translates to placing ESP after the IP header (and any
options that it contains), but before the upper layer protocol.
(Note that the term "transport" mode should not be misconstrued as
restricting its use to TCP and UDP. For example, an ICMP message MAY
be sent using either "transport" mode or "tunnel" mode.) The
following diagram illustrates ESP transport mode positioning for a
typical IPv4 packet, on a "before and after" basis. (The "ESP
trailer" encompasses any Padding, plus the Pad Length, and Next
Header fields.)

           BEFORE APPLYING ESP
      ----------------------------
IPv4  |orig IP hdr  |     |      |
      |(any options)| TCP | Data |
      ----------------------------

           AFTER APPLYING ESP
      -------------------------------------------------
IPv4  |orig IP hdr  | ESP |     |      |   ESP   | ESP|
      |(any options)| Hdr | TCP | Data | Trailer |Auth|
      -------------------------------------------------
                          |<----- encrypted ---->|
                    |<------ authenticated ----->|

In the IPv6 context, ESP is viewed as an end-to-end payload, and thus
should appear after hop-by-hop, routing, and fragmentation extension
headers. The destination options extension header(s) could appear
either before or after the ESP header depending on the semantics
desired. However, since ESP protects only fields after the ESP
header, it generally may be desirable to place the destination
options header(s) after the ESP header. The following diagram
illustrates ESP transport mode positioning for a typical IPv6 packet.

                BEFORE APPLYING ESP
      ---------------------------------------
IPv6  |             | ext hdrs |     |      |
      | orig IP hdr |if present| TCP | Data |
      ---------------------------------------

                AFTER APPLYING ESP
      ---------------------------------------------------------
IPv6  | orig |hop-by-hop,dest*,|   |dest|   |    | ESP   | ESP|
      |IP hdr|routing,fragment.|ESP|opt*|TCP|Data|Trailer|Auth|
      ---------------------------------------------------------
                                   |<---- encrypted ---->|
                               |<---- authenticated ---->|

      * = if present, could be before ESP, after ESP, or both

ESP and AH headers can be combined in a variety of modes. The IPsec
Architecture document describes the combinations of security
associations that must be supported.

Tunnel mode ESP may be employed in either hosts or security gateways.
When ESP is implemented in a security gateway (to protect subscriber
transit traffic), tunnel mode must be used. In tunnel mode, the
"inner" IP header carries the ultimate source and destination
addresses, while an "outer" IP header may contain distinct IP
addresses, e.g., addresses of security gateways. In tunnel mode, ESP
protects the entire inner IP packet, including the entire inner IP
header. The position of ESP in tunnel mode, relative to the outer IP
header, is the same as for ESP in transport mode. The following
diagram illustrates ESP tunnel mode positioning for typical IPv4 and
IPv6 packets.

      -----------------------------------------------------------
IPv4  | new IP hdr* |     | orig IP hdr*  |   |    | ESP   | ESP|
      |(any options)| ESP | (any options) |TCP|Data|Trailer|Auth|
      -----------------------------------------------------------
                          |<--------- encrypted ---------->|
                    |<----------- authenticated ---------->|

      ------------------------------------------------------------
IPv6  | new* |new ext |   | orig*|orig ext |   |    | ESP   | ESP|
      |IP hdr| hdrs*  |ESP|IP hdr| hdrs *  |TCP|Data|Trailer|Auth|
      ------------------------------------------------------------
                          |<--------- encrypted ----------->|
                      |<---------- authenticated ---------->|

      * = if present, construction of outer IP hdr/extensions
          and modification of inner IP hdr/extensions is
          discussed below.

3.2 Algorithms

The mandatory-to-implement algorithms are described in Section 5,
"Conformance Requirements". Other algorithms MAY be supported. Note
that although both confidentiality and authentication are optional,
at least one of these services MUST be selected; hence both algorithms
MUST NOT be simultaneously NULL.

3.2.1 Encryption Algorithms

The encryption algorithm employed is specified by the SA. ESP is
designed for use with symmetric encryption algorithms. Because IP
packets may arrive out of order, each packet must carry any data
required to allow the receiver to establish cryptographic
synchronization for decryption. This data may be carried explicitly
in the payload field, e.g., as an IV (as described above), or the
data may be derived from the packet header. Since ESP makes
provision for padding of the plaintext, encryption algorithms
employed with ESP may exhibit either block or stream mode
characteristics. Note that since encryption (confidentiality) is
optional, this algorithm may be "NULL".

3.2.2 Authentication Algorithms

The authentication algorithm employed for the ICV computation is
specified by the SA. For point-to-point communication, suitable
authentication algorithms include keyed Message Authentication Codes
(MACs) based on symmetric encryption algorithms (e.g., DES) or on
one-way hash functions (e.g., MD5 or SHA-1). For multicast
communication, one-way hash algorithms combined with asymmetric
signature algorithms are appropriate, though performance and space
considerations currently preclude use of such algorithms. Note that
since authentication is optional, this algorithm may be "NULL".

3.3 Outbound Packet Processing

In transport mode, the sender encapsulates the upper layer protocol
information in the ESP header/trailer, and retains the specified IP
header (and any IP extension headers in the IPv6 context). In tunnel
mode, the outer and inner IP header/extensions can be inter-related
in a variety of ways. The construction of the outer IP
header/extensions during the encapsulation process is described in
the Security Architecture document. If there is more than one IPsec
header/extension required by security policy, the order of the
application of the security headers MUST be defined by security
policy.

3.3.1 Security Association Lookup

ESP is applied to an outbound packet only after an IPsec
implementation determines that the packet is associated with an SA
that calls for ESP processing. The process of determining what, if
any, IPsec processing is applied to outbound traffic is described in
the Security Architecture document.

3.3.2 Packet Encryption

In this section, we speak in terms of encryption always being applied
because of the formatting implications. This is done with the
understanding that "no confidentiality" is offered by using the NULL
encryption algorithm. Accordingly, the sender:

1. encapsulates (into the ESP Payload field):
- for transport mode -- just the original upper layer
protocol information.
- for tunnel mode -- the entire original IP datagram.
2. adds any necessary padding.
3. encrypts the result (Payload Data, Padding, Pad Length, and
Next Header) using the key, encryption algorithm, algorithm
mode indicated by the SA and cryptographic synchronization
data (if any).
- If explicit cryptographic synchronization data, e.g.,
an IV, is indicated, it is input to the encryption
algorithm per the algorithm specification and placed
in the Payload field.
- If implicit cryptographic synchronization data, e.g.,
an IV, is indicated, it is constructed and input to
the encryption algorithm as per the algorithm
specification.

The exact steps for constructing the outer IP header depend on the
mode (transport or tunnel) and are described in the Security
Architecture document.
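The sender steps above can be sketched as follows. To stay within the
Python standard library, this sketch assumes the NULL encryption
algorithm, so step 3 leaves the bytes unchanged; a real implementation
would apply the cipher and mode indicated by the SA at that point. The
function name build_esp_null is hypothetical.

```python
import struct


def build_esp_null(spi: int, seq: int, payload: bytes, next_header: int) -> bytes:
    """Assemble SPI | Sequence Number | Payload | Padding | Pad Length | Next Header."""
    # Step 1: `payload` is the upper layer data (transport mode) or the
    # entire original IP datagram (tunnel mode), supplied by the caller.
    # Step 2: pad so that Pad Length and Next Header end a 4-byte word.
    pad_len = (-(len(payload) + 2)) % 4
    padding = bytes(range(1, pad_len + 1))  # default padding: 1, 2, 3, ...
    plaintext = payload + padding + bytes([pad_len, next_header])
    # Step 3: NULL "encryption" (assumption of this sketch) is the identity.
    ciphertext = plaintext
    return struct.pack("!II", spi, seq) + ciphertext
```

With a 5-byte payload and Next Header 6 (TCP), one pad byte is added and
the resulting packet is 16 bytes long.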

If authentication is selected, encryption is performed first, before
the authentication, and the encryption does not encompass the
Authentication Data field. This order of processing facilitates
rapid detection and rejection of replayed or bogus packets by the
receiver, prior to decrypting the packet, hence potentially reducing
the impact of denial of service attacks. It also allows for the
possibility of parallel processing of packets at the receiver, i.e.,
decryption can take place in parallel with authentication. Note that
since the Authentication Data is not protected by encryption, a keyed
authentication algorithm must be employed to compute the ICV.

3.3.3 Sequence Number Generation

The sender's counter is initialized to 0 when an SA is established.
The sender increments the Sequence Number for this SA and inserts the
new value into the Sequence Number field. Thus the first packet sent
using a given SA will have a Sequence Number of 1.

If anti-replay is enabled (the default), the sender checks to ensure
that the counter has not cycled before inserting the new value in the
Sequence Number field. In other words, the sender MUST NOT send a
packet on an SA if doing so would cause the Sequence Number to cycle.
An attempt to transmit a packet that would result in Sequence Number
overflow is an auditable event. (Note that this approach to Sequence
Number management does not require use of modular arithmetic.)

The sender assumes anti-replay is enabled as a default, unless
otherwise notified by the receiver (see 3.4.3). Thus, if the counter
has cycled, the sender will set up a new SA and key (unless the SA
was configured with manual key management).

If anti-replay is disabled, the sender does not need to monitor or
reset the counter, e.g., in the case of manual key management (see
Section 5). However, the sender still increments the counter and
when it reaches the maximum value, the counter rolls over back to
zero.
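The counter behaviour described in this section might be modelled as
below. This is a sketch only: the class name SequenceCounter and the use
of an exception to signal the auditable overflow event are assumptions,
and the 32-bit limit reflects the width of the Sequence Number field.

```python
class SequenceCounter:
    """Per-SA sender counter for the ESP Sequence Number field."""

    MAX = 2**32 - 1  # the Sequence Number field is 32 bits wide

    def __init__(self, anti_replay: bool = True):
        self.anti_replay = anti_replay
        self.value = 0  # initialized to 0 when the SA is established

    def next(self) -> int:
        """Increment the counter and return the value to place in the packet."""
        if self.value == self.MAX:
            if self.anti_replay:
                # Auditable event; the caller is expected to set up a new SA.
                raise OverflowError("Sequence Number would cycle")
            self.value = 0  # anti-replay disabled: roll over back to zero
            return self.value
        self.value += 1
        return self.value
```

The first call to next() on a new SA returns 1, matching the text above.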

3.3.4 Integrity Check Value Calculation

If authentication is selected for the SA, the sender computes the ICV
over the ESP packet minus the Authentication Data. Thus the SPI,
Sequence Number, Payload Data, Padding (if present), Pad Length, and
Next Header are all encompassed by the ICV computation. Note that
the last 4 fields will be in ciphertext form, since encryption is
performed prior to authentication.

For some authentication algorithms, the byte string over which the
ICV computation is performed must be a multiple of a blocksize
specified by the algorithm. If the length of this byte string does
not match the blocksize requirements for the algorithm, implicit
padding MUST be appended to the end of the ESP packet, (after the
Next Header field) prior to ICV computation. The padding octets MUST
have a value of zero. The blocksize (and hence the length of the
padding) is specified by the algorithm specification. This padding
is not transmitted with the packet. Note that MD5 and SHA-1 are
viewed as having a 1-byte blocksize because of their internal padding
conventions.
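As an illustration, the ICV computation could look like the following
sketch, which assumes HMAC-SHA-1 truncated to 96 bits as described in
[MG97b]. Because SHA-1 is treated as having a 1-byte blocksize, no
implicit padding is appended here. The function name compute_icv and its
parameters are hypothetical.

```python
import hashlib
import hmac
import struct


def compute_icv(key: bytes, spi: int, seq: int, ciphertext: bytes) -> bytes:
    """ICV over SPI | Sequence Number | encrypted payload, padding and trailer."""
    # `ciphertext` holds Payload Data, Padding, Pad Length and Next Header
    # after encryption; the Authentication Data itself is not covered.
    covered = struct.pack("!II", spi, seq) + ciphertext
    return hmac.new(key, covered, hashlib.sha1).digest()[:12]  # 96 bits
```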

3.3.5 Fragmentation

If necessary, fragmentation is performed after ESP processing within
an IPsec implementation. Thus, transport mode ESP is applied only to
whole IP datagrams (not to IP fragments). An IP packet to which ESP
has been applied may itself be fragmented by routers en route, and
such fragments must be reassembled prior to ESP processing at a
receiver. In tunnel mode, ESP is applied to an IP packet, the
payload of which may be a fragmented IP packet. For example, a
security gateway or a "bump-in-the-stack" or "bump-in-the-wire" IPsec
implementation (as defined in the Security Architecture document) may
apply tunnel mode ESP to such fragments.

NOTE: For transport mode -- As mentioned at the beginning of Section
3.1, bump-in-the-stack and bump-in-the-wire implementations may have
to first reassemble a packet fragmented by the local IP layer, then
apply IPsec, and then fragment the resulting packet.

NOTE: For IPv6 -- For bump-in-the-stack and bump-in-the-wire
implementations, it will be necessary to walk through all the
extension headers to determine if there is a fragmentation header and
hence that the packet needs reassembling prior to IPsec processing.

3.4 Inbound Packet Processing

3.4.1 Reassembly

If required, reassembly is performed prior to ESP processing. If a
packet offered to ESP for processing appears to be an IP fragment,
i.e., the OFFSET field is non-zero or the MORE FRAGMENTS flag is set,
the receiver MUST discard the packet; this is an auditable event. The
audit log entry for this event SHOULD include the SPI value,
date/time received, Source Address, Destination Address, Sequence
Number, and (in IPv6) the Flow ID.
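For IPv4, the fragment test described above amounts to inspecting the
16-bit flags/offset word of the IP header. The following sketch shows
one way to do so; the function name is_ipv4_fragment is hypothetical,
and the caller is assumed to pass at least the first 8 bytes of the
header.

```python
import struct


def is_ipv4_fragment(ip_header: bytes) -> bool:
    """True if the MORE FRAGMENTS flag is set or the fragment OFFSET is non-zero."""
    # Bytes 6-7 of the IPv4 header hold 3 flag bits followed by a 13-bit offset.
    (flags_and_offset,) = struct.unpack_from("!H", ip_header, 6)
    more_fragments = bool(flags_and_offset & 0x2000)
    offset = flags_and_offset & 0x1FFF
    return more_fragments or offset != 0
```

A packet for which this returns True would be discarded and audited
rather than offered to ESP.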

NOTE: For packet reassembly, the current IPv4 spec does NOT require
either the zero'ing of the OFFSET field or the clearing of the MORE
FRAGMENTS flag. In order for a reassembled packet to be processed by
IPsec (as opposed to discarded as an apparent fragment), the IP code
must do these two things after it reassembles a packet.

3.4.2 Security Association Lookup

Upon receipt of a (reassembled) packet containing an ESP Header, the
receiver determines the appropriate (unidirectional) SA, based on the
destination IP address, security protocol (ESP), and the SPI. (This
process is described in more detail in the Security Architecture
document.) The SA indicates whether the Sequence Number field will
be checked, whether the Authentication Data field should be present,
and it will specify the algorithms and keys to be employed for
decryption and ICV computations (if applicable).

If no valid Security Association exists for this session (for
example, the receiver has no key), the receiver MUST discard the
packet; this is an auditable event. The audit log entry for this
event SHOULD include the SPI value, date/time received, Source
Address, Destination Address, Sequence Number, and (in IPv6) the
cleartext Flow ID.
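The lookup key described above can be pictured as a three-part tuple. In
the sketch below, the SA database is modelled as a plain dictionary; the
names lookup_sa and sad and the contents of the SA record are
illustrative assumptions. A result of None corresponds to the "no valid
Security Association" case, in which the packet is discarded and the
event audited.

```python
ESP_PROTOCOL = 50  # IP protocol number assigned to ESP


def lookup_sa(sad: dict, dst_addr: str, spi: int):
    """Return the SA for (destination address, ESP, SPI), or None if absent."""
    return sad.get((dst_addr, ESP_PROTOCOL, spi))
```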

3.4.3 Sequence Number Verification

All ESP implementations MUST support the anti-replay service, though
its use may be enabled or disabled by the receiver on a per-SA basis.
This service MUST NOT be enabled unless the authentication service
also is enabled for the SA, since otherwise the Sequence Number field
has not been integrity protected. (Note that there are no provisions
for managing transmitted Sequence Number values among multiple
senders directing traffic to a single SA (irrespective of whether the
destination address is unicast, broadcast, or multicast). Thus the
anti-replay service SHOULD NOT be used in a multi-sender environment
that employs a single SA.)

If the receiver does not enable anti-replay for an SA, no inbound
checks are performed on the Sequence Number. However, from the
perspective of the sender, the default is to assume that anti-replay
is enabled at the receiver. To avoid having the sender do
unnecessary sequence number monitoring and SA setup (see section
3.3.3), if an SA establishment protocol such as IKE is employed, the
receiver SHOULD notify the sender, during SA establishment, if the
receiver will not provide anti-replay protection.

If the receiver has enabled the anti-replay service for this SA, the
receive packet counter for the SA MUST be initialized to zero when
the SA is established. For each received packet, the receiver MUST
verify that the packet contains a Sequence Number that does not
duplicate the Sequence Number of any other packets received during
the life of this SA. This SHOULD be the first ESP check applied to a
packet after it has been matched to an SA, to speed rejection of
duplicate packets.

Duplicates are rejected through the use of a sliding receive window.
(How the window is implemented is a local matter, but the following
text describes the functionality that the implementation must
exhibit.) A MINIMUM window size of 32 MUST be supported; but a
window size of 64 is preferred and SHOULD be employed as the default.
Another window size (larger than the MINIMUM) MAY be chosen by the
receiver. (The receiver does NOT notify the sender of the window
size.)

The "right" edge of the window represents the highest, validated
Sequence Number value received on this SA. Packets that contain
Sequence Numbers lower than the "left" edge of the window are
rejected. Packets falling within the window are checked against a
list of received packets within the window. An efficient means for
performing this check, based on the use of a bit mask, is described
in the Security Architecture document.

If the received packet falls within the window and is new, or if the
packet is to the right of the window, then the receiver proceeds to
ICV verification. If the ICV validation fails, the receiver MUST
discard the received IP datagram as invalid; this is an auditable
event. The audit log entry for this event SHOULD include the SPI
value, date/time received, Source Address, Destination Address, the
Sequence Number, and (in IPv6) the Flow ID. The receive window is
updated only if the ICV verification succeeds.

DISCUSSION:

Note that if the packet is either inside the window and new, or is
outside the window on the "right" side, the receiver MUST
authenticate the packet before updating the Sequence Number window
data.
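One possible realization of the sliding receive window, using a bit mask
in the spirit of the approach mentioned above, is sketched below. The
class and method names are hypothetical, and how the window is
implemented remains a local matter. Note that check() only tests the
Sequence Number, while update() is meant to be called after the ICV has
been verified.

```python
class ReplayWindow:
    """Sliding anti-replay window kept as a bit mask."""

    def __init__(self, size: int = 64):
        self.size = size
        self.right = 0   # highest validated Sequence Number ("right" edge)
        self.bitmap = 0  # bit i set => Sequence Number (right - i) was seen

    def check(self, seq: int) -> bool:
        """Return True if seq is new and not to the left of the window."""
        if seq == 0:
            return False  # the first packet on an SA carries 1
        if seq > self.right:
            return True   # to the right of the window
        offset = self.right - seq
        if offset >= self.size:
            return False  # to the left of the window
        return ((self.bitmap >> offset) & 1) == 0  # inside: reject duplicates

    def update(self, seq: int) -> None:
        """Record seq; call only after ICV verification has succeeded."""
        if seq > self.right:
            shift = seq - self.right
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.right = seq
        else:
            self.bitmap |= 1 << (self.right - seq)
```

A receiver would typically call check(), then verify the ICV, and only
then call update().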

3.4.4 Integrity Check Value Verification

If authentication has been selected, the receiver computes the ICV
over the ESP packet minus the Authentication Data using the specified
authentication algorithm and verifies that it is the same as the ICV
included in the Authentication Data field of the packet. Details of
the computation are provided below.

If the computed and received ICVs match, then the datagram is valid,
and it is accepted. If the test fails, then the receiver MUST
discard the received IP datagram as invalid; this is an auditable
event. The log data SHOULD include the SPI value, date/time
received, Source Address, Destination Address, the Sequence Number,
and (in IPv6) the cleartext Flow ID.

DISCUSSION:

Begin by removing and saving the ICV value (Authentication Data
field). Next check the overall length of the ESP packet minus the
Authentication Data. If implicit padding is required, based on
the blocksize of the authentication algorithm, append zero-filled
bytes to the end of the ESP packet directly after the Next Header
field. Perform the ICV computation and compare the result with
the saved value, using the comparison rules defined by the
algorithm specification. (For example, if a digital signature and
one-way hash are used for the ICV computation, the matching
process is more complex.)
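The procedure in this DISCUSSION can be sketched as follows for the case
of HMAC-SHA-1-96 [MG97b], where no implicit padding is needed. The
function name verify_icv and the fixed 12-byte ICV length are
assumptions of this sketch. The use of a constant-time comparison is a
design choice of the example and is not required by this specification.

```python
import hashlib
import hmac

ICV_LEN = 12  # HMAC-SHA-1-96 carries a 96-bit ICV


def verify_icv(key: bytes, esp_packet: bytes) -> bool:
    """Remove and save the trailing ICV, recompute it, and compare."""
    if len(esp_packet) < 8 + ICV_LEN:  # SPI + Sequence Number + ICV
        return False
    covered, received = esp_packet[:-ICV_LEN], esp_packet[-ICV_LEN:]
    expected = hmac.new(key, covered, hashlib.sha1).digest()[:ICV_LEN]
    return hmac.compare_digest(expected, received)
```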

3.4.5 Packet Decryption

As in section 3.3.2, "Packet Encryption", we speak here in terms of
encryption always being applied because of the formatting
implications. This is done with the understanding that "no
confidentiality" is offered by using the NULL encryption algorithm.
Accordingly, the receiver:

1. decrypts the ESP Payload Data, Padding, Pad Length, and Next
Header using the key, encryption algorithm, algorithm mode,
and cryptographic synchronization data (if any), indicated by
the SA.
- If explicit cryptographic synchronization data, e.g.,
an IV, is indicated, it is taken from the Payload
field and input to the decryption algorithm as per the
algorithm specification.
- If implicit cryptographic synchronization data, e.g.,
an IV, is indicated, a local version of the IV is
constructed and input to the decryption algorithm as
per the algorithm specification.
2. processes any padding as specified in the encryption
algorithm specification. If the default padding scheme (see
Section 2.4) has been employed, the receiver SHOULD inspect
the Padding field before removing the padding prior to
passing the decrypted data to the next layer.
3. reconstructs the original IP datagram from:
- for transport mode -- original IP header plus the
original upper layer protocol information in the ESP
Payload field
- for tunnel mode -- tunnel IP header + the entire IP
datagram in the ESP Payload field.

The exact steps for reconstructing the original datagram depend on
the mode (transport or tunnel) and are described in the Security
Architecture document. At a minimum, in an IPv6 context, the
receiver SHOULD ensure that the decrypted data is 8-byte aligned, to
facilitate processing by the protocol identified in the Next Header
field.
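Step 2 above, for the default padding scheme of Section 2.4, might be
implemented along the following lines. The sketch operates on the
already decrypted bytes (Payload Data, Padding, Pad Length, Next
Header); the function name strip_esp_trailer and the choice to raise
ValueError on a bad pad length or pad value are illustrative
assumptions.

```python
def strip_esp_trailer(plaintext: bytes):
    """Return (payload, next_header) after checking and removing the Padding."""
    if len(plaintext) < 2:
        raise ValueError("too short to hold Pad Length and Next Header")
    pad_len, next_header = plaintext[-2], plaintext[-1]
    payload_end = len(plaintext) - 2 - pad_len
    if payload_end < 0:
        raise ValueError("Pad Length exceeds available data")
    padding = plaintext[payload_end:len(plaintext) - 2]
    if padding != bytes(range(1, pad_len + 1)):  # default scheme: 1, 2, 3, ...
        raise ValueError("unexpected Padding contents")
    return plaintext[:payload_end], next_header
```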

If authentication has been selected, verification and decryption MAY
be performed serially or in parallel. If performed serially, then
ICV verification SHOULD be performed first. If performed in
parallel, verification MUST be completed before the decrypted packet
is passed on for further processing. This order of processing
facilitates rapid detection and rejection of replayed or bogus
packets by the receiver, prior to decrypting the packet, hence
potentially reducing the impact of denial of service attacks. Note:
If the receiver performs decryption in parallel with authentication,
care must be taken to avoid possible race conditions with regard to
packet access and reconstruction of the decrypted packet.

Note that there are several ways in which the decryption can "fail":

a. The selected SA may not be correct -- The SA may be
mis-selected due to tampering with the SPI, destination
address, or IPsec protocol type fields. Such errors, if they
map the packet to another extant SA, will be
indistinguishable from a corrupted packet, (case c).
Tampering with the SPI can be detected by use of
authentication. However, an SA mismatch might still occur
due to tampering with the IP Destination Address or the IPsec
protocol type field.

b. The pad length or pad values could be erroneous -- Bad pad
lengths or pad values can be detected irrespective of the use
of authentication.

c. The encrypted ESP packet could be corrupted -- This can be
detected if authentication is selected for the SA.

In case (a) or (c), the erroneous result of the decryption operation
(an invalid IP datagram or transport-layer frame) will not
necessarily be detected by IPsec, and is the responsibility of later
protocol processing.

4. Auditing

Not all systems that implement ESP will implement auditing. However,
if ESP is incorporated into a system that supports auditing, then the
ESP implementation MUST also support auditing and MUST allow a system
administrator to enable or disable auditing for ESP. For the most
part, the granularity of auditing is a local matter. However,
several auditable events are identified in this specification and for
each of these events a minimum set of information that SHOULD be
included in an audit log is defined. Additional information also MAY
be included in the audit log for each of these events, and additional
events, not explicitly called out in this specification, also MAY
result in audit log entries. There is no requirement for the
receiver to transmit any message to the purported sender in response
to the detection of an auditable event, because of the potential to
induce denial of service via such action.

5. Conformance Requirements

Implementations that claim conformance or compliance with this
specification MUST implement the ESP syntax and processing described
here and MUST comply with all requirements of the Security
Architecture document. If the key used to compute an ICV is manually
distributed, correct provision of the anti-replay service would
require correct maintenance of the counter state at the sender, until
the key is replaced, and there likely would be no automated recovery
provision if counter overflow were imminent. Thus a compliant
implementation SHOULD NOT provide this service in conjunction with
SAs that are manually keyed. A compliant ESP implementation MUST
support the following mandatory-to-implement algorithms:

- DES in CBC mode [MD97]
- HMAC with MD5 [MG97a]
- HMAC with SHA-1 [MG97b]
- NULL Authentication algorithm
- NULL Encryption algorithm

Since ESP encryption and authentication are optional, support for the
2 "NULL" algorithms is required to maintain consistency with the way
these services are negotiated. NOTE that while authentication and
encryption can each be "NULL", they MUST NOT both be "NULL".

6. Security Considerations

Security is central to the design of this protocol, and thus security
considerations permeate the specification. Additional security-
relevant aspects of using the IPsec protocol are discussed in the
Security Architecture document.

7. Differences from RFC 1827

This document differs from RFC 1827 [ATK95] in several significant
ways. The major difference is that this document attempts to
specify a complete framework and context for ESP, whereas RFC 1827
provided a "shell" that was completed through the definition of
transforms. The combinatorial growth of transforms motivated the
reformulation of the ESP specification as a more complete document,
with options for security services that may be offered in the context
of ESP. Thus, fields previously defined in transform documents are
now part of this base ESP specification. For example, the fields
necessary to support authentication (and anti-replay) are now defined
here, even though the provision of this service is an option. The
fields used to support padding for encryption, and for next protocol
identification, are now defined here as well. Packet processing
consistent with the definition of these fields also is included in
the document.

Acknowledgements

Many of the concepts embodied in this specification were derived from
or influenced by the US Government's SP3 security protocol, ISO/IEC's
NLSP, or from the proposed swIPe security protocol [SDNS89, ISO92,
IB93].

For over 3 years, this document has evolved through multiple versions
and iterations. During this time, many people have contributed
significant ideas and energy to the process and the documents
themselves. The authors would like to thank Karen Seo for providing
extensive help in the review, editing, background research, and
coordination for this version of the specification. The authors
would also like to thank the members of the IPsec and IPng working
groups, with special mention of the efforts of (in alphabetic order):
Steve Bellovin, Steve Deering, Phil Karn, Perry Metzger, David
Mihelcic, Hilarie Orman, Norman Shulman, William Simpson and Nina
Yuan.

References

[ATK95] Atkinson, R., "IP Encapsulating Security Payload (ESP)",
RFC 1827, August 1995.

[Bel96] Bellovin, S., "Problem Areas for the IP Security
Protocols", Proceedings of the Sixth Usenix Unix Security
Symposium, July 1996.

[Bra97] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Level", BCP 14, RFC 2119, March 1997.

[HC98] Harkins, D., and D. Carrel, "The Internet Key Exchange
(IKE)", RFC 2409, November 1998.

[IB93] Ioannidis, J., and M. Blaze, "Architecture and
Implementation of Network-layer Security Under Unix",
Proceedings of the USENIX Security Symposium, Santa Clara,
CA, October 1993.

[ISO92] ISO/IEC JTC1/SC6, Network Layer Security Protocol, ISO-IEC
DIS 11577, International Standards Organisation, Geneva,
Switzerland, 29 November 1992.

[KA97a] Kent, S., and R. Atkinson, "Security Architecture for the
Internet Protocol", RFC 2401, November 1998.

[KA97b] Kent, S., and R. Atkinson, "IP Authentication Header", RFC
2402, November 1998.

[MD97] Madson, C., and N. Doraswamy, "The ESP DES-CBC Cipher
Algorithm With Explicit IV", RFC 2405, November 1998.

[MG97a] Madson, C., and R. Glenn, "The Use of HMAC-MD5-96 within
ESP and AH", RFC 2403, November 1998.

[MG97b] Madson, C., and R. Glenn, "The Use of HMAC-SHA-1-96 within
ESP and AH", RFC 2404, November 1998.

[STD-2] Reynolds, J., and J. Postel, "Assigned Numbers", STD 2, RFC
1700, October 1994. See also:
http://www.iana.org/numbers.html

[SDNS89] SDNS Secure Data Network System, Security Protocol 3, SP3,
Document SDN.301, Revision 1.5, 15 May 1989, as published
in NIST Publication NIST-IR-90-4250, February 1990.

Disclaimer

The views and specification here are those of the authors and are not
necessarily those of their employers. The authors and their
employers specifically disclaim responsibility for any problems
arising from correct or incorrect implementation or use of this
specification.

Author Information

Stephen Kent
BBN Corporation
70 Fawcett Street
Cambridge, MA 02140
USA

Phone: +1 (617) 873-3988
EMail: kent@bbn.com

Randall Atkinson
@Home Network
425 Broadway,
Redwood City, CA 94063
USA

Phone: +1 (415) 569-5000
EMail: rja@corp.home.net

Full Copyright Statement

Copyright (C) The Internet Society (1998). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.