Network Working Group J. Wu Internet Draft Y. Cui Expiration Date: August 2007 X. Li Tsinghua University C. Metz E. Rosen (Editor) S. Barber P. Mohapatra Cisco Systems, Inc. J. Scudder Juniper Networks, Inc. February 2007 Softwire Mesh Framework draft-wu-softwire-mesh-framework-02.txt Status of this Memo By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Wu, et al. [Page 1] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 Abstract The Internet needs to be able to handle both IPv4 and IPv6 packets. However, it is expected that some constituent networks of the Internet will be "single protocol" networks. One kind of single protocol network can parse only IPv4 packets and can process only IPv4 routing information; another kind can parse only IPv6 packets and can process only IPv6 routing information. It is nevertheless required that either kind of single protocol network be able to provide transit service for the "other" protocol. This is done by passing the "other kind" of routing information from one edge of the single protocol network to the other, and by tunneling the "other kind" of data packet from one edge to the other. The tunnels are known as "Softwires". This framework document explains how the routing information and the data packets of one protocol are passed through a single protocol network of the other protocol. The document is careful to specify when this can be done with existing technology, and when it requires the development of new or modified technology. Wu, et al. [Page 2] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 Table of Contents 1 Specification of requirements ...................... 3 2 Introduction ....................................... 4 3 Scenarios of Interest .............................. 7 3.1 IPv6-over-IPv4 Scenario ............................ 7 3.2 IPv4-over-IPv6 Scenario ............................ 9 4 General Principles of the Solution ................. 11 4.1 'E-IP' and 'I-IP' .................................. 11 4.2 Routing ............................................ 12 4.3 Tunneled Forwarding ................................ 12 5 Distribution of Inter-AFBR Routing Information ..... 12 6 Softwire Signaling ................................. 14 7 Choosing to Forward Through a Softwire ............. 16 8 Selecting a Tunneling Technology ................... 16 9 Selecting the Softwire for a Given Packet .......... 17 10 Softwire OAM and MIBs .............................. 18 10.1 Operations and Maintenance (OAM) ................... 18 10.2 MIBs ............................................... 19 11 Softwire Multicast ................................. 19 12 Inter-AS Considerations ............................ 20 13 Security Considerations ............................ 21 14 Acknowledgments .................................... 21 15 Normative References ............................... 21 16 Informative References ............................. 22 17 Full Copyright Statement ........................... 25 18 Intellectual Property .............................. 25 1. Specification of requirements The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. Wu, et al. [Page 3] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 2. Introduction The routing information in any IP backbone network can be thought of as being in one of two categories: "internal routing information" or "external routing information". The internal routing information consists of routes to the nodes that belong to the backbone, and to the interfaces of those nodes. External routing information consists of routes to destinations beyond the backbone, especially destinations to which the backbone is not directly attached. In general, BGP [RFC4271] is used to distribute external routing information, and an "Interior Gateway Protocol" (IGP) such as OSPF [RFC2328] or IS-IS [RFC1195] is used to distribute internal routing information. Often an IP backbone will provide transit routing services for packets that originate outside the backbone, and whose destinations are outside the backbone. These packets enter the backbone at one of its "edge routers". They are routed through the backbone to another edge router, after which they leave the backbone and continue on their way. The edge nodes of the backbone are often known as "Provider Edge" (PE) routers. The term "ingress" (or "ingress PE") refers to the router at which a packet enters the backbone, and the term "egress" (or "egress PE") refers to the router at which it leaves the backbone. Interior nodes are often known as "P routers". Routers which are outside the backbone but directly attached to it are known as "Customer Edge" (CE) routers. (This terminology is taken from [RFC4364].) When a packet's destination is outside the backbone, the routing information which is needed within the backbone in order to route the packet to the proper egress is, by definition, external routing information. Traditionally, the external routing information has been distributed by BGP to all the routers in the backbone, not just to the edge routers (i.e., not just to the ingress and egress points). Each of the interior nodes has been expected to look up the packet's destination address and route it towards the egress point. This is known as "native forwarding": the interior nodes look into each packet's header in order to match the information in the header with the external routing information. It is, however, possible to provide transit services without requiring that all the backbone routers have the external routing information. The routing information which BGP distributes to each ingress router specifies the egress router for each route. The ingress router can therefore "tunnel" the packet directly to the egress router. "Tunneling the packet" means putting on some sort of Wu, et al. [Page 4] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 encapsulation header which will force the interior routers to forward the packet to the egress router. The original packet is known as the "encapsulation payload". The P routers do not look at the packet header of the payload, but only at the encapsulation header. Since the path to the egress router is part of the internal routing information of the backbone, the interior routers then do not need to know the external routing information. This is known as "tunneled forwarding". Of course, before the packet can leave the egress, it has to be decapsulated. The scenario where the P routers do not have external routes is sometimes known as a "BGP-free core". That is something of a misnomer, though, since the crucial aspect of this scenario is not that the interior nodes don't run BGP, but that they don't maintain the external routing information. In recent years, we have seen this scenario deployed to support VPN services, as specified in [RFC4364]. An edge router maintains multiple independent routing/addressing spaces, one for each VPN to which it interfaces. However, the routing information for the VPNs is not maintained by the interior routers. In most of these scenarios, MPLS is used as the encapsulation mechanism for getting the packets from ingress to egress. There are some deployments in which an IP-based encapsulation, such as L2TPv3 [RFC3931] or GRE [RFC2784] is used. This same technique can also be useful when the external routing information consists not of VPN routes, but of "ordinary" Internet routes. It can be used any time it is desired to keep external routing information out of a backbone's interior nodes, or in fact any time it is desired for any reason to avoid the native forwarding of certain kinds of packets. This framework focuses on two such scenarios. 1. In this scenario, the backbone's interior nodes support only IPv6. They do not maintain IPv4 routes at all, and are not expected to parse IPv4 packet headers. Yet it is desired to use such a backbone to provide transit services for IPv4 packets. Therefore tunneled forwarding of IPv4 packets is required. Of course, the edge nodes must have the IPv4 routes, but the ingress must perform an encapsulation in order to get an IPv4 packet forwarded to the egress. 2. This scenario is the reverse of scenario 1, i.e., the backbone's interior nodes support only IPv4, but it is desired to use the backbone for IPv6 transit. Wu, et al. [Page 5] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 In these scenarios, a backbone whose interior nodes support only one of the two address families is required to provide transit services for the other. The backbone's edge routers must, of course, support both address families. We use the term "Address Family Border Router" (AFBR) to refer to these PE routers. The tunnels that are used for forwarding are referred to as "softwires". These two scenarios are known as the "Softwire Mesh Problem" [SW- PROB], and the framework specified in this draft is therefore known as the "Softwire Mesh Framework". In this framework, only the AFBRs need to support both address families. The CE routers support only a single address family, and the P routers support only the other address family. It is possible to address these scenarios via a large variety of tunneling technologies. This framework does not mandate the use of any particular tunneling technology. In any given deployment, the choice of tunneling technology is a matter of policy. The framework accommodates at least the use of MPLS ([RFC3031], [RFC3032]), both LDP-based ([RFC3036]) and RSVP-TE-based ([RFC3209]), L2TPv3 [RFC3931], GRE [RFC2784], and IP-in-IP [RFC2003]. The framework will also accommodate the use of IPsec tunneling, when that is necessary in order to meet security requirements. It is expected that in many deployments, the choice of tunneling technology will be made by a simple expression of policy, such as "always use IP-IP tunnels", or "always use LDP-based MPLS", or "always use L2TPv3". However, other deployments may have a mixture of routers, some of which support, say, both GRE and L2TPv3, but others of which support only one of those techniques. It is desirable therefore to allow the network administration to create a small set of classes, and to configure each AFBR to be a member of one or more of these classes. Then the routers can advertise their class memberships to each other, and the encapsulation policies can be expressed as, e.g., "use L2TPv3 to tunnel to routers in class X, use GRE to tunnel to routers in class Y". To support such policies, it is necessary for the AFBRs to be able to advertise their class memberships; a standard way of doing this must be developed. Policy may also require a certain class of traffic to receive a certain quality of service, and this may impact the choice of tunnel and/or tunneling technology used for packets in that class. This needs to be accommodated by the softwires framework. The use of tunneled forwarding often requires that some sort of signaling protocol be used to set up and/or maintain the tunnels. Wu, et al. [Page 6] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 Many of the tunneling technologies accommodated by this framework already have their own signaling protocols. However, some do not, and in some cases the standard signaling protocol for a particular tunneling technology may not be appropriate, for one or another reason, in the scenarios of interest. In such cases (and in such cases only), new signaling methodologies need to be defined and standardized. In this framework, the softwires do not form an overlay topology which is visible to routing; routing adjacencies are not maintained over the softwires, and routing control packets are not sent through the softwires. Routing adjacencies among backbone nodes (including the edge nodes) are maintained via the native technology of the backbone. There is already a standard routing method for distributing external routing information among AFBRs, namely BGP. However, in the scenarios of interest, we may be using IPv6-based BGP sessions to pass IPv4 routing information, and we may be using IPv4-based BGP sessions to pass IPv6 routing information. Furthermore, when IPv4 traffic is to be tunneled over an IPv6 backbone, it is necessary to encode the "BGP next hop" for an IPv4 route as an IPv6 address, and vice versa. The method for encoding an IPv4 address as the next hop for an IPv6 route is specified in [V6NLRI-V4NH]; the method for encoding an IPv6 address as the next hop for an IPv4 route is specified in [V4NLRI-V6NH]. 3. Scenarios of Interest 3.1. IPv6-over-IPv4 Scenario In this scenario, the client networks run IPv6 but the backbone network runs IPv4. This is illustrated in Figure 1. Wu, et al. [Page 7] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 +--------+ +--------+ | IPv6 | | IPv6 | | Client | | Client | | Network| | Network| +--------+ +--------+ | \ / | | \ / | | \ / | | X | | / \ | | / \ | | / \ | +--------+ +--------+ | AFBR | | AFBR | +--| IPv4/6 |---| IPv4/6 |--+ | +--------+ +--------+ | +--------+ | | +--------+ | IPv4 | | | | IPv4 | | Client | | | | Client | | Network|------| IPv4 |-------| Network| +--------+ | only | +--------+ | | | +--------+ +--------+ | +--| AFBR |---| AFBR |--+ | IPv4/6 | | IPv4/6 | +--------+ +--------+ | \ / | | \ / | | \ / | | X | | / \ | | / \ | | / \ | +--------+ +--------+ | IPv6 | | IPv6 | | Client | | Client | | Network| | Network| +--------+ +--------+ Figure 1 IPv6-over-IPv4 Scenario The IPv4 transit core may or may not run MPLS. If it does, MPLS may be used as part of the solution. While Figure 1 does not show any "backdoor" connections among the Wu, et al. [Page 8] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 client networks, this framework assumes that there will be such connections. That is, there is no assumption that the only path between two client networks is via the pictured transit core network. Hence the routing solution must be robust in any kind of topology. Many mechanisms for providing IPv6 connectivity across IPv4 networks have been devised over the past ten years. A number of different tunneling mechanisms have been used, some provisioned manually, others based on special addressing. More recently, L3VPN techniques from [RFC4364] have been extended to provide IPv6 connectivity, using MPLS in the AFBRs and optionally in the backbone [V6NLRI-V4NH]. The solution described in this framework can be thought of as a superset of [V6NLRI-V4NH], with a more generalized scheme for choosing the tunneling (softwire) technology. In this framework, MPLS is allowed, but not required, even at the AFBRs. As in [V6NLRI-V4NH], there is no manual provisioning of tunnels, and no special addressing is required. 3.2. IPv4-over-IPv6 Scenario In this scenario, the client networks run IPv4 but the backbone network runs IPv6. This is illustrated in Figure 2. Wu, et al. [Page 9] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 +--------+ +--------+ | IPv4 | | IPv4 | | Client | | Client | | Network| | Network| +--------+ +--------+ | \ / | | \ / | | \ / | | X | | / \ | | / \ | | / \ | +--------+ +--------+ | AFBR | | AFBR | +--| IPv4/6 |---| IPv4/6 |--+ | +--------+ +--------+ | +--------+ | | +--------+ | IPv6 | | | | IPv6 | | Client | | | | Client | | Network|------| IPv6 |-------| Network| +--------+ | only | +--------+ | | | +--------+ +--------+ | +--| AFBR |---| AFBR |--+ | IPv4/6 | | IPv4/6 | +--------+ +--------+ | \ / | | \ / | | \ / | | X | | / \ | | / \ | | / \ | +--------+ +--------+ | IPv4 | | IPv4 | | Client | | Client | | Network| | Network| +--------+ +--------+ Figure 2 IPv4-over-IPv6 Scenario The IPv6 transit core may or may not run MPLS. If it does, MPLS may be used as part of the solution. While Figure 2 does not show any "backdoor" connections among the Wu, et al. [Page 10] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 client networks, this framework assumes that there will be such connections. That is, there is no assumption the only path between two client networks is via the pictured transit core network. Hence the routing solution must be robust in any kind of topology. While the issue of IPv6-over-IPv4 has received considerable attention in the past, the scenario of IPv4-over-IPv6 has not. Yet it is a significant emerging requirement, as a number of service providers are building IPv6 backbone networks and do not wish to provide native IPv4 support in their core routers. These service providers have a large legacy of IPv4 networks and applications that need to operate across their IPv6 backbone. Solutions for this do not exist yet because it had always been assumed that the backbone networks of the foreseeable future would be dual stack. 4. General Principles of the Solution This section gives a very brief overview of the procedures. The subsequent sections provide more detail. 4.1. 'E-IP' and 'I-IP' In the following we use the term "I-IP" ("Internal IP") to refer to the form of IP (i.e., either IPv4 or IPv6) that is supported by the transit network. We use the term "E-IP" ("External IP") to refer to the form of IP that is supported by the client networks. In the scenarios of interest, E-IP is IPv4 if and only if I-IP is IPv6, and E-IP is IPv6 if and only if I-IP is IPv4. We assume that the P routers support only I-IP. That is, they are expected to have only I-IP routing information, and they are not expected to be able to parse E-IP headers. We similarly assume that the CE routers support only E-IP. The AFBRs handle both I-IP and E-IP. However, only I-IP is used on AFBR's "core facing interfaces", and E-IP is only used on its client-facing interfaces. Wu, et al. [Page 11] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 4.2. Routing The P routers and the AFBRs of the transit network participate in an IGP, for the purposes of distributing I-IP routing information. The AFBRs use IBGP to exchange E-IP routing information with each other. Either there is a full mesh of IBGP connections among the AFBRs, or else some or all of the AFBRs are clients of a BGP Route Reflector. Although these IBGP connections are used to pass E-IP routing information (i.e., the NLRI of the BGP updates is in the E-IP address family), the IBGP connections run over I-IP, and the "BGP next hop" for each E-IP NLRI is in the I-IP address family. 4.3. Tunneled Forwarding When an ingress AFBR receives an E-IP packet from a client facing interface, it looks up the packet's destination IP address. In the scenarios of interest, the best match for that address will be a BGP-distributed route whose next hop is the I-IP address of another AFBR, the egress AFBR. The ingress AFBR must forward the packet through a tunnel (i.e, through a "softwire") to the egress AFBR. This is done by encapsulating the packet, using an encapsulation header which the P routers can process, and which will cause the P routers to send the packet to the egress AFBR. The egress AFBR then extracts the payload, i.e., the original E-IP packet, and forwards it further by looking up its IP destination address. Several kinds of tunneling technologies are supported. Some of those technologies require explicit AFBR-to-AFBR signaling before the tunnel can be used, others do not. 5. Distribution of Inter-AFBR Routing Information AFBRs peer with routers in the client networks to exchange routing information for the E-IP family. AFBRs use BGP to distribute the E-IP routing information to each other. This can be done by an AFBR-AFBR mesh of IBGP sessions, but more likely is done through a BGP Route Reflector, i.e., where each AFBR has an IBGP session to one or two Route Reflectors, rather than to other AFBRs. The BGP sessions between the AFBRs, or between the AFBRs and the Route Reflector, will run on top of the I-IP address family. That Wu, et al. [Page 12] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 is, if the transit core supports only IPv6, the IBGP sessions used to distribute IPv4 routing information from the client networks will run over IPv6; if the transit core supports only IPv4, the IBGP sessions used to distribute IPv6 routing information from the client networks will run over IPv4. The BGP sessions thus use the native networking layer of the core; BGP messages are NOT tunneled through softwires or through any other mechanism. In BGP, a routing update associates an address prefix (or more generally, "Network Layer Reachability Information", or NLRI) with the address of a "BGP Next Hop" (NH). The NLRI is associated with a particular address family. The NH address is also associated with a particular address family, which may be the same as or different than the address family associated with the NLRI. Generally the NH address belongs to the address family that is used to communicate with the BGP speaker to whom the NH address belongs. Since routing updates which contain information about E-IP address prefixes are carried over BGP sessions that use I-IP transport, and since the BGP messages are not tunneled, a BGP update providing information about an E-IP address prefix will need to specify a next hop address in the I-IP family. Due to a variety of historical circumstances, when the NLRI and the NH in a given BGP update are of different address families, it is not always obvious how the NH should be encoded. There is a different encoding procedure for each pair of address families. In the case where the NLRI is in the IPv6 address family, and the NH is in the IPv4 address family, [V6NLRI-V4NH] explains how to encode the NH. In the case where the NLRI is in the IPv4 address family, and the NH is in the IPv6 address family, [V4NLRI-V6NH] explains how to encode the NH. If a BGP speaker sends an update for an NLRI in the E-IP family, and the update is being sent over a BGP session that is running on top of the I-IP network layer, and the BGP speaker is advertising itself as the NH for that NLRI, then the BGP speaker MUST, unless explicitly overridden by policy, specify the NH address in the I-IP family. The address family of the NH MUST not be changed by a Route Reflector. In some cases (e.g., when [V4NLRI-V6NH] is used), one cannot follow this rule unless one's BGP peers have advertised a particular BGP capability. This leads to the following softwires deployment restriction: if a BGP Capability is defined for the case in which an E-IP NLRI has an I-IP NH, all the AFBRs in a given transit core MUST Wu, et al. [Page 13] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 advertise that capability. If an AFBR has multiple IP addresses, the network administrators usually have considerable flexibility in choosing which one the AFBR uses to identify itself as the next hop in a BGP update. However, if the AFBR expects to receive packets through a softwire of a particular tunneling technology, and if the AFBR is known to that tunneling technology via a specific IP address, then that same IP address must be used to identify the AFBR in the next hop field of the BGP updates. For example, if L2TPv3 tunneling is used, then the IP address which the AFBR uses when engaging in L2TPv3 signaling must be the same as the IP address it uses to identify itself in the next hop field of a BGP update. In [V6NLRI-V4NH], IPv6 routing information is distributed using the labeled IPv6 address family. This allows the egress AFBR to associate an MPLS label with each IPv6 address prefix. If an ingress AFBR forwards packets through a softwire than can carry MPLS packets, each data packet can carry the MPLS label corresponding to the IPv6 route that it matched. This may be useful at the egress AFBR, for demultiplexing and/or enhanced performance. It is also possible to do the same for the IPv4 address family, i.e. to use the labeled IPv4 address family instead of the IPv4 address family. The use of the labeled IP address families in this manner is OPTIONAL. 6. Softwire Signaling A mesh of inter-AFBR softwires spanning the transit core must be in place before packets can flow between client networks. Given N dual-stack AFBRs, this requires N^2 "point-to-point IP" or "label switched path" (LSP) tunnels. While in theory these could be configured manually, that would result in a very undesirable O(N^2) provisioning problem. Therefore manual configuration of point-to- point tunnels is not considered part of this framework. Because the transit core is providing layer 3 transit services, point-to-point tunnels are not required by this framework; multipoint-to-point tunnels are all that is needed. In a multipoint-to-point tunnel, when a packet emerges from the tunnel there is no way to tell which router put the packet into the tunnel. This models the native IP forwarding paradigm, wherein the egress router cannot determine a given packet's ingress router. Of course, point-to-point tunnels might be required for some reason which goes beyond the basic requirements described in this document. E.g., QoS or security considerations might require the use of point-to-point tunnels. So point-to-point tunnels are allowed, but not required, by this framework. Wu, et al. [Page 14] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 If it is desired to use a particular tunneling technology for the softwires, and if that technology has its own "native" signaling methodology, the presumption is that the native signaling will be used. This would certainly apply to MPLS-based softwires, where LDP or RSVP-TE would be used. A softwire based on IPsec would use standard IKE/IPsec signaling, as that is necessary in order to guarantee the softwire's security properties. A Softwire based on GRE might or might not require signaling, depending on whether various optional GRE header fields are to be used. GRE does not have any "native" signaling, so for those cases, a signaling procedure needs to be developed to support Softwires. Another possible softwire technology is L2TPv3. While L2TPv3 does have its own native signaling, that signaling sets up point-to-point tunnels. For the purpose of softwires, it is better to use L2TPv3 in a multipoint-to-point mode, and this requires a different kind of signaling. The signaling to be used for GRE and L2TPv3 to cover these scenarios is BGP-based, and is described in [ENCAPS-SAFI]. If IP-IP tunneling is used, or if GRE tunneling is used without options, no signaling is required, as the only information needed by the ingress AFBR to create the encapsulation header is the IP address of the egress AFBR, and that is distributed by BGP. When the encapsulation IP header is constructed, there may be fields in the IP whose value is determined neither by whatever signaling has been done nor by the distributed routing information. The values of these fields are determined by policy in the ingress AFBR. Examples of such fields may be the TTL field, the DSCP bits, etc. It is desirable for all necessary softwires to be fully set up before the arrival of any packets which need to go through the softwires. That is, the softwires should be "always on". From the perspective of any particular AFBR, the softwire endpoints are always BGP next hops of routes which the AFBR has installed. This suggests that any necessary softwire signaling should be either be done as part of normal system startup (as would happen, e.g., with LDP-based MPLS), or else should be triggered by the reception of BGP routing information (such as is described in [ENCAPS-SAFI]); it is also helpful if distribution of the routing information that serves as the trigger is prioritized. Wu, et al. [Page 15] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 7. Choosing to Forward Through a Softwire The decision to forward through a softwire, instead of to forward natively, is made by the ingress AFBR. This decision is a matter of policy. In many cases, the policy will be very simple. Some useful policies are: - if routing says that an E-IP packet has to be sent out a "core- facing interface" to an I-IP core, send the packet through a softwire - if routing says that an E-IP packet has to be sent out an interface that only supports I-IP packets, then send the E-IP packets through a softwire - if routing says that the BGP next hop address for an E-IP packet is an I-IP address, then send the E-IP packets through a softwire - if the route which is the best match for a particular packet's destination address is a BGP-distributed route, then send the packet through a softwire (i.e., tunnel all BGP-routed packets). More complicated policies are also possible, but a consideration of those policies is outside the scope of this document. 8. Selecting a Tunneling Technology The choice of tunneling technology is a matter of policy configured at the ingress AFBR. It is envisioned that in most cases, the policy will be a very simple one, and will be the same at all the AFBRs of a given transit core. E.g., "always use LDP-based MPLS", or "always use L2TPv3". However, other deployments may have a mixture of routers, some of which support, say, both GRE and L2TPv3, but others of which support only one of those techniques. It is desirable therefore to allow the network administration to create a small set of classes, and to configure each AFBR to be a member of one or more of these classes. Then the routers can advertise their class memberships to each other, and the encapsulation policies can be expressed as, e.g., "use L2TPv3 to talk to routers in class X, use GRE to talk to routers in class Y". To support such policies, it is necessary for the AFBRs to be able to advertise their class memberships. [ENCAPS-SAFI] specifies a way in which an AFBR may advertise, to other AFBRS, various Wu, et al. [Page 16] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 characteristics which may be relevant to the polcy (e.g., "I belong to class Y"). In many cases, these characteristics can be represented by arbitrarily selected communities or extended communities, and the policies at the ingress can be expressed in terms of these classes (i.e., communities). Policy may also require a certain class of traffic to receive a certain quality of service, and this may impact the choice of tunnel and/or tunneling technology used for packets in that class. This framework allows a variety of tunneling technologies to be used for instantiating softwires. The choice of tunneling technology is a matter of policy, as discussed in section 2. While in many cases the policy will be unconditional, e.g., "always use L2TPv3 for softwires", in other cases the policy may specify that the choice is conditional upon information about the softwire remote endpoint, e.g., "use L2TPv3 to talk to routers in class X, use GRE to talk to routers in class Y". It is desirable therefore to allow the network administration to create a small set of classes, and to configure each AFBR to be a member of one or more of these classes. If each such class is represented as a community or extended community, then [ENCAPS-SAFI] specifies a method that AFBRs can use to advertise their class memberships to each other. This framework also allows for policies of arbitrary complexity, which may depend on characteristics or attributes of individual address prefixes, as well as on QoS or security considerations. However, the specification of such policies is not within the scope of this document. 9. Selecting the Softwire for a Given Packet Suppose it has been decided to send a given packet through a softwire. Routing provides the address, in the address family of the transport network, of the BGP next hop. The packet MUST be sent through a softwire whose remote endpoint address is the same as the BGP next hop address. Sending a packet through a softwire is a matter of encapsulating the packet with an encapsulation header that can be processed by the transit network, and then transmitting towards the softwire's remote endpoint address. In many cases, once one knows the remote endpoint address, one has all the information one needs in order to form the encapsulation header. This will be the case if the tunnel technology instantiating the softwire is, e.g., LDP-based MPLS, IP-in-IP, or GRE without Wu, et al. [Page 17] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 optional header fields. If the tunnel technology being used is L2TPv3 or GRE with optional header fields, additional information from the remote endpoint is needed in order to form the encapsulation header. The procedures for sending and receiving this information are described in [ENCAPS- SAFI]. If the tunnel technology being used is RSVP-TE-based MPLS or IPsec, the native signaling procedures of those technologies will need to be used. IPsec procedures will be discussed further in a subsequent revision of this document. RSVP-TE procedures will be discussed in companion documents. If the packet being sent through the softwire matches a route in the labeled IPv4 or labeled IPv6 address families, it should be sent through the softwire as an MPLS packet with the corresponding label. Note that most of the tunneling technologies mentioned in this document are capable of carrying MPLS packets, so this does not presuppose support for MPLS in the core routers. 10. Softwire OAM and MIBs 10.1. Operations and Maintenance (OAM) Softwires are essentially tunnels connecting routers. If they disappear or degrade in performance then connectivity through those tunnels will be impacted. There are several techniques available to monitor the status of the tunnel end-points (AFBRs) as well as the tunnels themselves. These techniques allow operations such as softwires path tracing, remote softwire end-point pinging and remote softwire end-point liveness failure detection. Examples of techniques applicable to softwire OAM include: o BGP/TCP timeouts between AFBRs o ICMP or LSP echo request and reply addressed to a particular AFBR o [BFD] packet exchange between AFBR routers Another possibility for softwire OAM is to build something similar to the [RFC4378] or in other words creating and generating softwire echo request/reply packets. The echo request sent to a well-known UDP Wu, et al. [Page 18] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 port would contain the egress AFBR IP address and the softwire identifier as the payload (similar to the MPLS forwarding equivalence class contained in the LSP echo request). The softwire echo packet would be encapsulated with the encapsulation header and forwarded across the same path (inband) as that of the softwire itself. This mechanism can also be automated to periodically verify remote softwires end-point reachability, with the loss of reachability being signaled to the softwires application on the local AFBR thus enabling suitable actions to be taken. Consideration must be given to the trade offs between scalability of such mechanisms verses time to detection of loss of endpoint reachability for such automated mechanisms. In general a framework for softwire OAM can for a large part be based on the [RFC4176] framework. 10.2. MIBs Specific MIBs do exist to manage elements of the softwire mesh framework. However there will be a need to either extend these MIBs or create new ones that reflect the functional elements that can be SNMP-managed within the softwire network. 11. Softwire Multicast A set of client networks, running E-IP, that are connected to a provider's I-IP transit core, may wish to run IP multicast applications. Extending IP multicast connectivity across the transit core can be done in a number of ways, each with a different set of characteristics. Among them are: - Extend each client multicast tree through the transit core, so that for each client tree there is exactly one tree through the core. - Use one multicast tree in the core, add all the AFBRs to it, make it look to the client multicast control protocols as if the transit network is a LAN over which they can run transparently. - Use more than one multicast tree in the core, but less than one per client tree, and perform some kind of aggregation of client trees to core trees. Wu, et al. [Page 19] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 - Don't use any multicast trees in the core, have the ingress AFBRs replicate the multicast traffic and then unicast each replica. This list does not exhaust the set of alternatives. There are also additional issues which are somewhat orthogonal, such as whether it is best for the transit core and the clients to be using the same multicast control protocols or not, what multicast control protocols and service models need to be supported, etc. All these issues will be considered more fully in a subsequent revision of this document. 12. Inter-AS Considerations We have so far only considered the case where a "transit core" consists of a single Autonomous System (AS). If the transit core consists of multiple ASes, then it may be necessary to use softwires whose endpoints are AFBRs attached to different Autonomous Systems. In this case, the AFBR at the remote endpoint of a softwire is not the BGP next hop for packets that need to be sent on the softwire. Since the procedures described above require the address of remote softwire endpoint to be the same as the address of the BGP next hop, those procedures do not work as specified when the transit core consists of multiple ASes. There are two ways to deal with this situation. 1. Don't do it; require that there be AFBRs at the edge of each AS, so that a transit core does not extend more than one AS. 2. Specify a new BGP attribute that allows an AFBR to identify itself without using the NH field. This "next AFBR" attribute would be passed unchanged by non-AFBRs, but each AFBR disseminating a given routing update would replace any existing "next AFBR" attribute by its own address. When an ingress AFBR is choosing a softwire to send a packet through, if a "next AFBR" attribute is present, it would use that rather than the next hop to help it choose the proper softwire. Wu, et al. [Page 20] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 13. Security Considerations Security for softwire signaling can be achieved using BGP/TCP MD5- keying. The softwire data plane can employ encryption of the data packets using Ipsec. This will be explained in a companion document. [RFC4111] outlines the L3VPN security framework which in many cases is directly applicable to the softwire mesh framework. 14. Acknowledgments David Ward, Chris Cassar, Gargi Nalawade, Ruchi Kapoor, Pranav Mehta, Mingwei Xu and Ke Xu provided useful input into this document. 15. Normative References [ENCAPS-SAFI] "BGP Information SAFI and BGP Tunnel Encapsulation Attribute", P. Mohapatra and E. Rosen, draft-pmohapat-idr-info-safi- 00.txt, September 2006. [RFC2003] "IP Encapsulation within IP", C. Perkins, October 1996. [RFC2119] "Key words for use in RFCs to Indicate Requirement Levels", S. Bradner, March 1997. [RFC2784] "Generic Routing Encapsulation (GRE)", D. Farinacci, T. Li, S. Hanks, D. Meyer, P. Traina, RFC 2784, March 2000. [RFC3031] "Multiprotocol Label Switching Architecture", E. Rosen, A. Viswanathan, R. Callon, RFC 3031, January 2001. [RFC3032] "MPLS Label Stack Encoding", E. Rosen, D. Tappan, G. Fedorkow, Y. Rekhter, D. Farinacci, T. Li, A. Conta, RFC 3032, January 2001. [RFC3209] D. Awduche, L. Berger, D. Gan, T. Li, V. Srinivasan, and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, December 2001. [RFC3931] J. Lau, M. Townsley, I. Goyret, "Layer Two Tunneling Protocol - Version 3 (L2TPv3)", RFC 3931, March 2005. [V4NLRI-V6NH] F. Le Faucheur, E. Rosen, "Advertising an IPv4 NLRI Wu, et al. [Page 21] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 with an IPv6 Next Hop", draft-lefaucheur-idr-v4nlri-v6nh-00.txt, October 2006. [V6NLRI-V4NH] J. De Clercq, D. Ooms, S. Prevost, F. Le Faucheur, "Connecting IPv6 Islands over IPv4 MPLS using IPv6 Provider Edge Routers (6PE)", RFC 4798, February 2007. 16. Informative References [RFC1195] R. Callon, "Use of OSI IS-IS for Routing in TCP/IP and Dual Environments", RFC 1195, December 1990. [RFC2328] J. Moy, "OSPF Version 2", RFC 2328, April 1998 [RFC3036] "LDP Specification", L. Andersson, P. Doolan, N. Feldman, A. Fredette, B. Thomas, January 2001. [RFC4111] L. Fang, "Security Framework for Provider-Provisioned Virtual Private Networks (PPVPNs)", RFC 4111, July 2005. [RFC4176] Y. El Mghazli, T. Nadeau, M. Boucadair, K. Chan, A. Gonguet, "Framework for Layer 3 Virtual Private Networks (L3VPN) Operations and Management", RFC 4176, October 2005. [RFC4271] Rekhter, Y,, Li T., Hares, S., "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006. [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006. [RFC4378] D. Allan and T. Nadeau, "A Framework for Multi-Protocol Label Switching (MPLS) Operations and Management (OAM)", RFC 4378, February 2006. [BFD] D. Katz and D. Ward, "Bidirectional Forwarding Detection", draft-ietf-bfd-base-04 (work in progress), October 2005. [SW-PROB] X. Li, "Softwire Problem Statement", draft-ietf-softwire- problem-statement-00.txt, December 2005. Authors' Addresses Wu, et al. [Page 22] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 Jianping Wu Tsinghua University Department of Computer Science, Tsinghua University Beijing 100084 P.R.China Phone: +86-10-6278-5983 Email: jianping@cernet.edu.cn Yong Cui Tsinghua University Department of Computer Science, Tsinghua University Beijing 100084 P.R.China Phone: +86-10-6278-5822 Email: yong@csnet1.cs.tsinghua.edu.cn Xing Li Tsinghua University Department of Electronic Engineering, Tsinghua University Beijing 100084 P.R.China Phone: +86-10-6278-5983 Email: xing@cernet.edu.cn Chris Metz Cisco Systems, Inc. 3700 Cisco Way San Jose, Ca. 95134 USA Email: chmetz@cisco.com Wu, et al. [Page 23] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 Eric C. Rosen Cisco Systems, Inc. 1414 Massachusetts Avenue Boxborough, MA, 01719 USA Email: erosen@cisco.com Simon Barber Cisco Systems, Inc. 250 Longwater Avenue Reading, ENGLAND, RG2 6GB United Kingdom Email: sbarber@cisco.com Pradosh Mohapatra Cisco Systems, Inc. 3700 Cisco Way San Jose, Ca. 95134 USA Email: pmohapat@cisco.com John Scudder Juniper Networks 1194 North Mathilda Avenue Sunnyvale, California 94089 USA Email: jgs@juniper.net Wu, et al. [Page 24] Internet Draft draft-wu-softwire-mesh-framework-02.txt February 2007 17. Full Copyright Statement Copyright (C) The IETF Trust (2007). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 18. Intellectual Property The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79. Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr. The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf- ipr@ietf.org. Wu, et al. [Page 25]