Internet-Draft CATS with Generic Metric October 2024
Yuan, et al. Expires 24 April 2025
Workgroup:
IDR
Internet-Draft:
draft-yuan-idr-generic-metric-cats-00
Published:
Intended Status:
Standards Track
Expires:
Authors:
Y.DY. Yuan
ZTE Corporation
Z.FL. Zhou
ZTE Corporation
D.H. Huang
ZTE Corporation
C.QD. Chen
ZTE Corporation
D.CN. Dai
ZTE Corporation

Computing Aware Traffic Steering (CATS) with Generic Metric

Abstract

Steering traffic for computing-related services based on computing resources and conditions is under discussion in the CATS WG. Publishing services and updating computing conditions is consequently a significant issue. Common metrics of the same type are required from both the network and the service instances in order to evaluate overall performance and to steer and schedule traffic appropriately. This draft therefore proposes and discusses an implementation of distributed CATS in which generic metrics are delivered and distributed using BGP.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 24 April 2025.

1. Introduction

For computing-related services such as AR/VR and the metaverse, the performance experienced by clients and customers is determined not only by network metrics but also by computing conditions. Relevant use cases and problem statements are discussed in [I-D.ietf-cats-usecases-requirements]. Within the CATS framework introduced in [I-D.ietf-cats-framework], publishing and updating computing metrics is an essential issue.

In general, the control plane for CATS can be organized and deployed in various ways, depending on the specific schemes for computing metrics collection and notification, instance selection, path calculation, and other workflows. For distributed metrics collection and distributed control plane implementations in particular, protocols including BGP, BGP-LS, and IGPs could be extended to support metrics distribution and collection.

Computing metrics can be classified into multiple types and categories. A typical analysis and discussion of computing metrics is presented in [I-D.ysl-cats-metric-definition]. In general, metrics may be converted, abstract, and generic values, or explicit metadata. In addition, to achieve end-to-end service provisioning, metrics of the same dimension from the network infrastructure and from service instances SHOULD be considered together, while metric types unique to computing MAY be processed independently.

General considerations for metrics that MAY be distributed and utilized in CATS are shown below.


                                        Computing Resources
                                        Inst latency
                                        Service bandwidth
                                        Abstract metrics
                                                            +---+
                                                         +-----+ )
                                                 +----- +|C-SMA|  +
                                                /      ( +-----+   )
                                               /      ( +--+    --  )
+--------------+              +--------------+/     (   |LB|---(  )  )
|CATS-Forwarder|--------------|CATS-Forwarder|------(   +--+    --   )
+--------------+              +--------------+       (              )
                 Network                              +------------+
                 link(policy) latency                 Service Instance
                 link(policy) bandwidth
                 link(policy) metric

Figure 1: Network and Computing Metrics

In distributed control plane scenarios, especially when service traffic needs to traverse multiple ASes, computing metrics SHOULD be distributed among CATS-Forwarders and considered when performing ordered updates of routes. This draft therefore proposes a distribution scheme based on the generic metric introduced in [I-D.ietf-idr-bgp-generic-metric]. The generic metric is designed to accumulate and propagate different types of metrics in support of intent-based end-to-end paths across BGP domains. CATS SHOULD likewise be recognized as an intent-based end-to-end routing scenario. Computing-related services are identified with multiple intents, so these intents and the relevant metrics SHOULD be distributable. Furthermore, computing metrics, especially generic and common types, need to be accumulated and processed along the distribution path. The detailed implementation is discussed in the following sections.

2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Generic Metrics for CATS

The Accumulated Metric (AMetric) is defined in [I-D.ietf-idr-bgp-generic-metric].


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Accumulated Metric Code    |   Accumulated Metric Length   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Accumulated Metric Data...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 2: AMetric TLV

Values of the Metric Type field in the Accumulated Metric Data are taken from the IGP registry for metric types. Parameters of service instances such as latency, upstream/downstream bandwidth, and configured TE metric can therefore be encoded accordingly in a CATS scenario and processed in a general accumulative manner along the path.


 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Metric-type 1 | Metric-flags1 | Metric 1 value...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Metric-type 2 | Metric-flags2 | Metric 2 value...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Service Metric| Metric-flags  | Service Metric value...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 3: Accumulative Metrics
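
As a rough illustration of the layout drawn in Figures 2 and 3, the following Python sketch packs a list of metric entries into an AMetric TLV and parses it back. The 16-bit Code and Length fields and the 8-bit Metric-type and Metric-flags fields follow the figures as drawn. The 4-octet value width, the AMETRIC_CODE code point, and the metric-type numbers are assumptions made for this example only; they are not taken from [I-D.ietf-idr-bgp-generic-metric] or from any registry.

```python
import struct

# Hypothetical code points, chosen only for this sketch.
AMETRIC_CODE = 1            # assumed Accumulated Metric Code
TYPE_DELAY = 1              # assumed metric-type for latency
TYPE_TE_METRIC = 2          # assumed metric-type for the TE metric
TYPE_SERVICE_METRIC = 128   # assumed metric-type for the Service Metric


def encode_entry(metric_type, flags, value):
    """Pack one entry: 1-octet type, 1-octet flags, 4-octet value (assumed width)."""
    return struct.pack("!BBI", metric_type, flags, value)


def encode_ametric(entries):
    """Pack the TLV: 2-octet code, 2-octet length of the data, then the entries."""
    data = b"".join(encode_entry(t, f, v) for t, f, v in entries)
    return struct.pack("!HH", AMETRIC_CODE, len(data)) + data


def decode_ametric(buf):
    """Inverse of encode_ametric; returns a list of (type, flags, value) tuples."""
    code, length = struct.unpack_from("!HH", buf, 0)
    if code != AMETRIC_CODE or length != len(buf) - 4 or length % 6:
        raise ValueError("malformed AMetric TLV")
    return [struct.unpack_from("!BBI", buf, 4 + off) for off in range(0, length, 6)]


# A delay of 10 and a service metric of 12 for one service route.
entries = [(TYPE_DELAY, 0, 10), (TYPE_SERVICE_METRIC, 0, 12)]
tlv = encode_ametric(entries)
```

A fixed value width keeps the parser trivial; an implementation that has to carry metrics of different sizes would need a per-type width or an explicit per-entry length, which the figures above do not specify.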

Besides the metric types defined in the IGP registry, metric types unique to CATS may also be considered as an extension of the current AMetric scheme. For example, a general Service Metric (or Cost) could be introduced that specifies the estimated or measured performance of a service instance as an abstract value. With a normalized Service Metric and the multiple dimensions of existing generic metrics, CATS can be implemented in various patterns. Following a similar classification of the manifestations of discontinuity, typical scenarios are presented in the following sections.

4. Scenario 1: Minimum End-to-end Latency for Computing-related Service


                                                       +---+
                                                    +-----+ )
                                                   +|C-SMA|  +
                                                  ( +-----+   )
                                                 ( +--+    --  )
+---------+     policy     +---------+         (   |LB|---(  )  )
|CATS     |----------------|CATS     |---------(   +--+    --   )
|Forwarder|----------------|Forwarder|---+      (              )
+---------+                +---------+    \      +------------+
                                           \                 +---+
           \\                               \             +-----+ )
            \\                               \           +|C-SMA|  +
             \\                               \         ( +-----+   )
              \\                               \       ( +--+    --  )
               \\                               \    (   |LB|---(  )  )
                \\  policy                       +---(   +--+    --   )
                 \\                                   (              )
                  \\                                   +------------+
                   \\                                  +---+
                    \\                              +-----+ )
                     \\                            +|C-SMA|  +
                      \\                          ( +-----+   )
                         +---------+             ( +--+    --  )
                         |CATS     |           (   |LB|---(  )  )
                         |Forwarder|-----------(   +--+    --   )
                         +---------+            (              )
                                                 +------------+

Figure 4: Minimum End-to-end Latency for Computing-related Service

  1. C-SMAs collect computing-related metrics and pre-process the relevant metadata. C-SMAs are configured to establish BGP sessions with CATS-Forwarders, over which they distribute and update computing metrics using the Generic Metric attribute. If the services deployed here require minimum end-to-end latency, the delay is carried in the UPDATE messages in the Generic Metric. Service routes MAY be advertised with a load balancer as the next hop.

  2. Services are deployed in VRFs or in a public VRF. CATS-Forwarders may be enabled to measure the latency to their associated load balancers. Service routes with the same prefix are then updated with accumulated latency values, each of which comprises the processing delay of the service instance and the measured delay between the CATS-Forwarder and the load balancer. Routes for the same service prefix are compared and ordered by accumulated latency. Once a best route is selected, it is advertised to the remote device with the next hop set to the CATS-Forwarder itself.

  3. Similarly, remote CATS-Forwarders are able to measure the latency of policies or network links. They can therefore calculate the end-to-end latency for each candidate service instance over the resolved TE policies. Ordered updates are performed in the same way, and best routes are determined accordingly. Because the delay is accumulated along the path over which service routes are distributed, remote CATS-Forwarders can perform latency-intent-based path selection.
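
The three steps above can be sketched in Python as follows. This is a minimal model rather than a BGP implementation: ServiceRoute, accumulate, and best_route are hypothetical names, the prefix and device names are placeholders, and the latency values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRoute:
    prefix: str     # service prefix
    next_hop: str   # current next hop of the route
    latency: int    # accumulated latency carried in the Generic Metric


def accumulate(route, segment_latency, self_address):
    """Add the locally measured latency towards the current next hop and
    rewrite the next hop to this CATS-Forwarder before re-advertising."""
    return ServiceRoute(route.prefix, self_address, route.latency + segment_latency)


def best_route(routes):
    """Order the routes for one prefix by accumulated latency; lowest wins."""
    return min(routes, key=lambda r: r.latency)


# Hypothetical input: two instances of the same service, each advertised by
# a C-SMA with a load balancer as next hop, paired with the delay that the
# CATS-Forwarder measured towards that load balancer.
received = [
    (ServiceRoute("203.0.113.0/24", "lb1", 10), 5),
    (ServiceRoute("203.0.113.0/24", "lb2", 20), 6),
]
candidates = [accumulate(route, delay, "fwd1") for route, delay in received]
selected = best_route(candidates)
```

A remote CATS-Forwarder would apply the same two functions again, adding the latency of its resolved policy or link to each route it receives before ordering them.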

The workflow also applies when service traffic needs to traverse multiple ASes. The end-to-end latency is accumulated and calculated along the path over which service routes are distributed.


                                                             +---+
                                                          +-----+ )
       +------------+  +-----------+  +-----------+      +|C-SMA|  +
       |            |  |           |  |           |     ( +-----+   )
       |            |  |           |  |           |    ( +--+    --  )
       |        ASBR|--|ASBR   ASBR|--|ASBR   ASBR|--(   |LB|---(  )  )
       |            |  |           |  |           |  (   +--+    --   )
       |            |  |           |  |           |   (              )
       |            |  +-----------+  +-----------+    +------------+
       |            |
       |            |
       |            |
+---------+         |
|CATS     |         |
|Forwarder|         |
+---------+         |
       |            |                                        +---+
       |            |                                     +-----+ )
       |            |  +-----------+  +-----------+      +|C-SMA|  +
       |            |  |           |  |           |     ( +-----+   )
       |            |  |           |  |           |    ( +--+    --  )
       |        ASBR|--|ASBR   ASBR|--|ASBR   ASBR|--(   |LB|---(  )  )
       |            |  |           |  |           |  (   +--+    --   )
       |            |  |           |  |           |   (              )
       +------------+  +-----------+  +-----------+    +------------+

Figure 5: End-to-end Latency Accumulation among Multiple ASes
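
The hop-by-hop accumulation across ASes can be pictured with a short Python sketch. The function name latency_at_each_hop and all latency values are assumptions for illustration; Figure 5 does not give any numbers.

```python
from itertools import accumulate


def latency_at_each_hop(instance_latency, segment_latencies):
    """Return the accumulated latency advertised at every hop, starting with
    the value originated for the service instance and adding one measured
    segment (intra-AS or inter-AS) per propagation step."""
    return list(accumulate(segment_latencies, initial=instance_latency))


# Hypothetical segment latencies for the two paths drawn in Figure 5,
# listed from the service site towards the ingress CATS-Forwarder.
upper = latency_at_each_hop(10, [2, 4, 1, 3, 1, 5])
lower = latency_at_each_hop(12, [2, 3, 1, 2, 1, 4])

# The ingress CATS-Forwarder compares only the final accumulated values.
preferred = "upper" if upper[-1] <= lower[-1] else "lower"
```

Each intermediate value in the returned list corresponds to what one border router would advertise onwards, so no device needs knowledge of segments beyond its own.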

5. Scenario 2: Minimum Cost for Computing-related Service with constrained latency


                                     (For Inst 1 and 2)

             Delay 15,Cost 30         Delay 10,Cost 20
             Delay 25,Cost 25         Delay 20,Cost 15
            <-----------------       <-----------------

                                                       +---+
                                                    +-----+ )
                                                   +|C-SMA|  +
                                                  ( +-----+   )
                                       Delay 5   ( +--+    --  )
+---------+     policy     +---------+ Cost 10 (   |LB|---(  )  )
|CATS     |----------------|CATS     |---------(   +--+    --   )
|Forwarder|----------------|Forwarder|---+      (              )
+---------+                +---------+    \      +------------+
                                           \                 +---+
           \\                               \             +-----+ )
            \\                               \           +|C-SMA|  +
             \\                               \         ( +-----+   )
              \\                       Delay 6 \       ( +--+    --  )
               \\                      Cost 12  \    (   |LB|---(  )  )
                \\  policy                       +---(   +--+    --   )
                 \\                                   (              )
                  \\                                   +------------+
                   \\                                  +---+
                    \\                              +-----+ )
                     \\                            +|C-SMA|  +
                      \\                          ( +-----+   )
                         +---------+  Delay 8    ( +--+    --  )
                         |CATS     |  Cost 14  (   |LB|---(  )  )
                         |Forwarder|-----------(   +--+    --   )
                         +---------+            (              )
                                                 +------------+
                                       (For Inst 3)

                                      Delay 10,Cost 20
                                     <-----------------

Figure 6: Minimum Cost for Computing-related Service with constrained latency

  1. As in Scenario 1, C-SMAs collect computing-related metrics and distribute them using the Generic Metric attribute. Suppose the services deployed here require minimum end-to-end cost, for instance in terms of TE metric, and end-to-end latency is configured as a constraint for ordered updates of routes. Converted costs and measured latency values are carried in the UPDATE messages.

  2. Service routes with the same prefix are updated with accumulated latency values and costs. The latency value comprises the processing delay of the service instance and the measured delay between the CATS-Forwarder and the load balancer. Similarly, the cost value comprises the advertised cost and the configured cost to the next hop. Additional paths MAY be enabled at CATS-Forwarders so that the service routes are distributed to the remote device with the next hop set to the CATS-Forwarder itself.

  3. Finally, remote CATS-Forwarders calculate the end-to-end latency and overall cost for each candidate service instance over the resolved policies or forwarding paths. Ordered updates are performed under the configured constraints, and the best or otherwise appropriate routes are determined accordingly.
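
A minimal Python sketch of the constrained selection described in these steps is given below. Candidate, extend, and select are hypothetical names, and the instance labels and numbers are illustrative values rather than data taken from this draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    instance: str
    latency: int   # accumulated end-to-end latency
    cost: int      # accumulated end-to-end cost


def extend(candidate, seg_latency, seg_cost):
    """Accumulate one more segment (link or policy) onto a candidate."""
    return Candidate(candidate.instance,
                     candidate.latency + seg_latency,
                     candidate.cost + seg_cost)


def select(candidates, max_latency):
    """Lowest cost among the candidates that meet the latency constraint,
    with latency as tie-breaker; None if no candidate qualifies."""
    feasible = [c for c in candidates if c.latency <= max_latency]
    return min(feasible, key=lambda c: (c.cost, c.latency), default=None)


# Illustrative values: each instance is advertised with (delay, cost) and
# then extended by the segment between the CATS-Forwarder and the LB.
candidates = [
    extend(Candidate("inst-a", 10, 20), 5, 10),   # -> latency 15, cost 30
    extend(Candidate("inst-b", 20, 15), 5, 10),   # -> latency 25, cost 25
]
tight = select(candidates, max_latency=20)   # only inst-a is feasible
loose = select(candidates, max_latency=30)   # both feasible, inst-b is cheaper
```

The example shows why both values have to be accumulated: the cheaper instance is chosen only when the latency bound is loose enough to admit it.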

A generic metric scheme can therefore support multi-factor scenarios, in which one metric is minimized while another acts as a constraint.

6. Scenario 3: Normalized Metrics in Distribution Process

Generic metrics may not be supported by every AS and device along the distribution path. In such circumstances, these metrics are either normalized or transmitted unchanged.


                                   (For Inst 1 and 2)

                                    Delay 10,Metric 10
                                    Delay 20,Metric 12
       Cost+Normalized Metric      <------------------

        +------------------+                         +---+
        |                  |                      +-----+ )
        |                  |                     +|C-SMA|  +
        |                  |         Delay 5    ( +-----+   )
        |                  |         Cost 10   ( +--+    --  )
        |                +---------+         (   |LB|---(  )  )
        |                |CATS     |---------(   +--+    --   )
        |                |Forwarder|---+      (              )
        |                +---------+    \      +------------+
        |                  |             \                 +---+
        |                  |              \             +-----+ )
        |                  |               \           +|C-SMA|  +
        |                  |         Delay 8\         ( +-----+   )
        |                  |         Cost 10 \       ( +--+    --  )
        |                  |                  \    (   |LB|---(  )  )
Service |                  |                   +---(   +--+    --   )
Metric  |                  |                        (              )
Unaware |                  |                         +------------+
        |                  |                         +---+
        |                  |                      +-----+ )
        |                  |                     +|C-SMA|  +
        |                  |        Delay 6     ( +-----+   )
        |              +---------+  Cost 15    ( +--+    --  )
        |              |CATS     |           (   |LB|---(  )  )
        |              |Forwarder|-----------(   +--+    --   )
        |              +---------+            (              )
        |                  |                   +------------+
        |                  |         (For Inst 3)
        |                  |
        |                  |         Delay 10,Metric 15
        +------------------+        <------------------

Figure 7: Minimum Cost for Computing-related Service with constrained latency

Normalization algorithms and strategies can be configured at CATS-Forwarders. When an AS or device is unaware of a specific type of generic metric, such as the service metric shown in Figure 7, the metric value can be converted and normalized. For instance, service metric values could be multiplied by ten to bring them onto the scale of common IGP cost values. The normalized values can then be accumulated with the IGP costs to the next hop. Alternatively, unrecognized values are transmitted unchanged if the remote devices are capable of interpreting such metrics. Ordered updates of service routes can then be performed to minimize the service metric under constraints on end-to-end latency and cost.
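
The two handling options can be contrasted in a short Python sketch. Metrics, add_segment, and normalize_into_cost are hypothetical names. The ten-fold scale is the example given in the text, and the input values follow Inst 1 in Figure 7 (Delay 10, Metric 10, segment Delay 5 and Cost 10); the originated cost of 0 is an assumption.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Metrics:
    delay: int
    cost: int
    service_metric: Optional[int]   # None once folded into the cost


def add_segment(m, seg_delay, seg_cost):
    """Accumulate the delay and IGP cost of the segment to the next hop."""
    return replace(m, delay=m.delay + seg_delay, cost=m.cost + seg_cost)


def normalize_into_cost(m, scale=10):
    """Convert the service metric to the cost scale and merge it into the
    cost, for advertisement into a domain that is unaware of the metric."""
    if m.service_metric is None:
        return m
    return replace(m, cost=m.cost + m.service_metric * scale, service_metric=None)


# Values for Inst 1 in Figure 7; the originated cost of 0 is assumed.
origin = Metrics(delay=10, cost=0, service_metric=10)
at_forwarder = add_segment(origin, seg_delay=5, seg_cost=10)

normalized = normalize_into_cost(at_forwarder)   # option 1: fold into cost
unchanged = at_forwarder                         # option 2: carry as-is
```

Folding the service metric into the cost is lossy: a downstream device that does understand the service metric can no longer separate the two, which is the trade-off against carrying the value unchanged.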


                                     (For Inst 1 and 2)

    Service Metric                    Delay 10,Metric 10
    Accumulated Cost,Delay            Delay 20,Metric 12
    <------------------              <------------------

+---------+    +-------------+                         +---+
|         |    |             |                      +-----+ )
|         |    |             |                     +|C-SMA|  +
|         |    |             |         Delay 5    ( +-----+   )
|         |    |             |         Cost 10   ( +--+    --  )
|         |    |           +---------+         (   |LB|---(  )  )
|         |    |           |CATS     |---------(   +--+    --   )
|         |    |           |Forwarder|---+      (              )
|         |    |           +---------+    \      +------------+
|         |    |             |             \                 +---+
|         |    |             |              \             +-----+ )
|         |    |             |               \           +|C-SMA|  +
|         |    |             |         Delay 8\         ( +-----+   )
|         |    |             |         Cost 10 \       ( +--+    --  )
| Service |--- | Service     |                  \    (   |LB|---(  )  )
| Metric  |    | Metric      |                   +---(   +--+    --   )
| Aware   |--- | Unaware     |                        (              )
|         |    |             |                         +------------+
|         |    |             |                         +---+
|         |    |             |                      +-----+ )
|         |    |             |                     +|C-SMA|  +
|         |    |             |        Delay 6     ( +-----+   )
|         |    |         +---------+  Cost 15    ( +--+    --  )
|         |    |         |CATS     |           (   |LB|---(  )  )
|         |    |         |Forwarder|-----------(   +--+    --   )
|         |    |         +---------+            (              )
|         |    |             |                   +------------+
|         |    |             |         (For Inst 3)
|         |    |             |
|         |    |             |         Delay 10,Metric 15
+---------+    +-------------+        <------------------

Figure 8: Minimum Service Metric for Computing-related Service with constrained latency and cost

7. Conclusion

About Computing Aware Traffic Steering (CATS) with Generic Metric, several considerations SHOULD be noted:

8. Security Considerations

TBA.

9. Acknowledgements

TBA.

10. IANA Considerations

TBA.

11. Normative References

[I-D.ietf-cats-framework]
Li, C., Du, Z., Boucadair, M., Contreras, L. M., and J. Drake, "A Framework for Computing-Aware Traffic Steering (CATS)", Work in Progress, Internet-Draft, draft-ietf-cats-framework-04, <https://datatracker.ietf.org/doc/html/draft-ietf-cats-framework-04>.
[I-D.ietf-cats-usecases-requirements]
Yao, K., Trossen, D., Contreras, L. M., Shi, H., Li, Y., Zhang, S., and Q. An, "Computing-Aware Traffic Steering (CATS) Problem Statement, Use Cases, and Requirements", Work in Progress, Internet-Draft, draft-ietf-cats-usecases-requirements-03, <https://datatracker.ietf.org/doc/html/draft-ietf-cats-usecases-requirements-03>.
[I-D.ietf-idr-bgp-generic-metric]
Sangli, S. R., Hegde, S., Das, R., Decraene, B., Wen, B., Kozak, M., Dong, J., Jalil, L., and K. Talaulikar, "Accumulated Metric in NHC attribute", Work in Progress, Internet-Draft, draft-ietf-idr-bgp-generic-metric-00, <https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-generic-metric-00>.
[I-D.ysl-cats-metric-definition]
Yao, K., Shi, H., and C. Li, "CATS metric Definition", Work in Progress, Internet-Draft, draft-ysl-cats-metric-definition-00, <https://datatracker.ietf.org/doc/html/draft-ysl-cats-metric-definition-00>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.

Authors' Addresses

Dongyu Yuan
ZTE Corporation
Nanjing
China
Fenlin Zhou
ZTE Corporation
Nanjing
China
Daniel Huang
ZTE Corporation
Nanjing
China
Qiudong Chen
ZTE Corporation
Nanjing
China
Chunning Dai
ZTE Corporation
Nanjing
China