Internet-Draft DES Problem Statement July 2024
Du, et al. Expires 9 January 2025
Intended Status:
Z. Du
China Mobile
K. Yao
China Mobile
H. Yang
China Mobile

Use Cases and Problem Statements of Online Data Express Service


Abstract

This document describes use cases and problem statements for the Online Data Express Service (ODES).

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 9 January 2025.

1. Introduction

With the rapid development of digital technologies, conveniently transporting bulk data between two distant sites has become a fundamental requirement for the Internet. However, the user experience of transferring a large file is not always satisfactory. For example, although a cloud service can now be reached from almost anywhere, uploading a large file is sometimes slow and troublesome. As an alternative, a hard disk can be shipped to the recipient to move such a large volume of data.

The Online Data Express Service (ODES) is defined in this document as a convenient online service that transfers a large file or a bundle of files between different entities within a time limit. The file size may be at the GB or TB level, or even larger. The time limit may be several hours, one day, or two days.

The target approach in ODES should be more flexible and convenient than manually shipping hard disks, and faster than transferring the data directly over the Internet with the default TCP transport protocol.
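
One commonly cited reason why a single TCP connection can fall short over a long-distance path is that it can send at most one window of data per round trip. The following minimal sketch illustrates that bound. The function names, the 4 MiB window, and the 40 ms round-trip time are assumptions made for this example and are not values taken from this document; real TCP behavior also depends on loss, congestion control, and window scaling.

```python
def window_limited_gbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when at most one window is sent per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds / 1e9


def window_needed_bytes(target_gbps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to sustain target_gbps."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return target_gbps * 1e9 / 8 * rtt_seconds


if __name__ == "__main__":
    # Hypothetical path: 40 ms round-trip time, 4 MiB effective window.
    ceiling = window_limited_gbps(4 * 1024 * 1024, 0.040)
    in_flight = window_needed_bytes(10, 0.040)
    print(f"throughput ceiling: {ceiling:.2f} Gbit/s")
    print(f"in flight for 10 Gbit/s: {in_flight / 1e6:.0f} MB")
```

Under these assumed values the ceiling is below 1 Gbit/s, while sustaining 10 Gbit/s on the same path would require about 50 MB in flight.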

The ODES use cases include inter-cloud backup and disaster recovery, film and television editing, scientific computing, and cooperation among intelligent computing centers. This document also analyzes the problems that need to be considered for ODES. A related gap analysis can be found in [I-D.zhao-tsvwg-odes-gap-analysis].

2. Use Cases of ODES

2.1. Inter-Cloud Backup and Disaster Recovery

With the development of the cloud computing industry, cloud data centers carry a large number of enterprise IT services. Storing, transmitting, and protecting the rapidly growing data brings new challenges. For example, disaster recovery of core application data is necessary to ensure enterprise data security and business continuity. In the scenario of disaster recovery for an operator's traffic data, the daily backup volume of a single IT cloud resource pool is at the TB level. The primary and backup data centers are normally built in different locations, so the data travels long distances. However, these backups do not have strict requirements on transmission timeliness. By exploiting the tidal effect of the network, idle bandwidth at night can be used for the transmission, which improves transmission efficiency and reduces transmission cost.
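
As a rough illustration of the tidal effect, the sketch below estimates how much data fits into the idle share of a link during a nightly window, and whether a given daily backup fits. The link capacity, idle fraction, and window length are hypothetical values chosen for the example; decimal units are used (1 TB = 10^12 bytes).

```python
def night_window_capacity_tb(link_gbps: float, idle_fraction: float,
                             window_hours: float) -> float:
    """Data volume in TB that the idle share of a link can carry in the window."""
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    idle_bits_per_second = link_gbps * 1e9 * idle_fraction
    total_bits = idle_bits_per_second * window_hours * 3600
    return total_bits / 8 / 1e12


def fits_in_window(backup_tb: float, link_gbps: float, idle_fraction: float,
                   window_hours: float) -> bool:
    """True if the backup volume fits into the idle capacity of the window."""
    return backup_tb <= night_window_capacity_tb(link_gbps, idle_fraction,
                                                 window_hours)


if __name__ == "__main__":
    # Hypothetical: 10 Gbit/s link, 60% idle for 6 hours each night.
    print(f"{night_window_capacity_tb(10, 0.6, 6):.1f} TB per night")
    print(fits_in_window(5, 10, 0.6, 6))
```

With these assumed values the idle window carries about 16 TB per night, so a TB-level daily backup would fit without a dedicated line.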

2.2. Film and Television Editing

The footage shot for films and television variety shows needs to be edited and rendered by a post-production company. Because shooting locations vary, the footage has to be transferred in bulk to the post-production company's site according to the shooting and production cycle. The raw material of a large-scale variety show or film is at the PB level, with a single transmission carrying approximately 10 TB to 100 TB of data. The manual hard disk express method involves two data copies (the upload at the source and the download at the destination) and manual handling (carrying the disk array by airplane or high-speed rail). Each trip takes 2-3 days and requires dedicated personnel, resulting in poor timeliness and low efficiency. The challenge for the network is to provide a convenient online data movement service for the audio and video industry that meets the timeliness requirement while reducing labor costs.
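
To put the hard disk express figures above into network terms, the following sketch converts a data volume and an elapsed time into the average rate an online transfer would need in order to match it. The function name is illustrative, decimal units are assumed (1 TB = 10^12 bytes), and only the 2-3 day trip is counted; the two copy steps would lengthen the elapsed time further.

```python
def equivalent_gbps(data_tb: float, elapsed_days: float) -> float:
    """Average rate in Gbit/s that moves data_tb within elapsed_days."""
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    total_bits = data_tb * 1e12 * 8
    return total_bits / (elapsed_days * 86400) / 1e9


if __name__ == "__main__":
    # Volumes and trip durations taken from the use case above.
    for data_tb, days in [(10, 3), (100, 2)]:
        rate = equivalent_gbps(data_tb, days)
        print(f"{data_tb} TB in {days} days is about {rate:.2f} Gbit/s")
```

Under these assumptions, 10 TB in 3 days corresponds to roughly 0.31 Gbit/s sustained, and 100 TB in 2 days to roughly 4.63 Gbit/s sustained.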

2.3. Scientific Computing

With the development of intelligent computing and supercomputing, big data needs to be imported into and exported from intelligent computing and supercomputing centers. However, efficient and low-cost solutions are lacking, especially for scientific computing scenarios such as astronomy and meteorology. Take the astronomical data of FAST (the Five-hundred-meter Aperture Spherical radio Telescope) in China as an example: FAST has approximately 200 observation projects per year, with a single project generating TB to PB of observation data and an annual output of approximately 15 PB. If the data is exported manually, a data export request may be delayed for several months because of the lack of dedicated personnel for data copy operations. In addition, transmitting the data and importing it at the destination are very time-consuming, which significantly affects the timeliness of data acquisition. In conclusion, there is an urgent need for an efficient and economical online data transmission solution for large-scale data migration in scientific computing.

2.4. Gene Sequencing

Gene sequencing technology is becoming increasingly mature, which significantly shortens sequencing time and promotes its wide application. It provides various gene sequencing and data analysis services to scientific research institutions, medical service institutions, and individuals. Traditional gene sequencing mainly relies on local laboratory analysis, so its timeliness and scale are constrained by local computing resources and are difficult to improve. Cloud-based processing of gene sequencing data has gradually become an industry trend. A domestic gene company may produce about 100 PB of gene sequencing data per year, with each upload to the cloud ranging from the TB level up to approximately 100 TB. Today, the gene sequencing data source and the supercomputing cloud data center are connected through a fixed-bandwidth dedicated line, which is expensive; a cost-effective solution is lacking.

3. Problem Statement of ODES

With the vigorous development of industrial digitalization and cloud computing, the demand for high-capacity data transmission between distant sites is increasing. At the same time, scenarios such as multi-cloud data backup and placing data on a remote cloud impose higher requirements on the throughput of online data transmission. There is an urgent need to achieve high-throughput transmission of massive data over the WAN.

By analyzing and summarizing the above typical application scenarios, we obtain the following common features.

  1. Large amount of data per transfer: TB to PB.

  2. Frequent large-flow transmission: data is transmitted on a regular or irregular basis, with high peak bandwidth requirements.

  3. Low real-time requirements: the data is mainly warm or cold data rather than strongly real-time hot data, but a shorter transmission completion time is still preferred.

  4. Cost sensitivity: customers do not want to pay for a high-bandwidth dedicated line because the transmission frequency is variable, which leads to low utilization of the line and poor cost-effectiveness.
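
The cost sensitivity described in the last feature can be made concrete with a small utilization estimate for a fixed-bandwidth dedicated line. This is only a sketch: the transfer count, transfer size, and line rate below are hypothetical, and a 30-day month with decimal units (1 TB = 10^12 bytes) is assumed.

```python
def line_utilization(transfers_per_month: int, tb_per_transfer: float,
                     line_gbps: float, days_in_month: int = 30) -> float:
    """Fraction of a fixed line's monthly capacity consumed by the transfers."""
    if line_gbps <= 0 or days_in_month <= 0:
        raise ValueError("line_gbps and days_in_month must be positive")
    capacity_bits = line_gbps * 1e9 * days_in_month * 86400
    used_bits = transfers_per_month * tb_per_transfer * 1e12 * 8
    return used_bits / capacity_bits


if __name__ == "__main__":
    # Hypothetical customer: four 50 TB transfers a month on a 10 Gbit/s line.
    print(f"utilization: {line_utilization(4, 50, 10):.1%}")
```

With these assumed values the line is busy for only about 6% of the month, which is the kind of low utilization that makes a permanently provisioned line hard to justify.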

To support differentiated data delivery services and create task-based data delivery services, the target ODES system should meet the following requirements.

  1. High throughput over the WAN is an important goal of the data express service. Technologies such as wide-area RDMA and elephant flow load balancing could be used to achieve high-throughput data transmission.

  2. Compared with the fixed bandwidth of a traditional dedicated line, the data express service should support elastic expansion and contraction of bandwidth, providing users with a plug-and-play, stop-after-use, highly elastic service.

  3. Data is a core asset of users, enterprises, social organizations, etc., and affects the interests of all parties, so great importance must be attached to data security. While providing efficient transmission, the data express service has to strictly guarantee the security of user data and ensure that user data is not stolen or tampered with.

  4. The data express business is characterized by large data volumes and long transmission distances. To better meet business requirements, the service needs to ensure that, when required, data is delivered to the destination in as short a time as possible, so that the flow completion time is shortened.
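
As one possible illustration of the tamper-detection aspect of the security requirement above, the sketch below computes a SHA-256 digest over a stream in fixed-size chunks, so that a multi-terabyte file never has to be held in memory, and compares it with a digest supplied by the sender. This is an assumption made for the example, not a mechanism specified by this document; the function names are hypothetical, the expected digest is assumed to arrive over a trusted channel, and confidentiality (protection against theft) is not addressed here.

```python
import hashlib
import io
from typing import BinaryIO


def stream_sha256(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> str:
    """Hex SHA-256 digest of a binary stream, read in fixed-size chunks."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes object signals end of stream
            break
        digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(stream: BinaryIO, expected_hex: str) -> bool:
    """True if the received stream hashes to the digest sent by the source."""
    return stream_sha256(stream) == expected_hex.lower()


if __name__ == "__main__":
    # In-memory stand-in for a transferred file.
    sent = b"example payload"
    expected = hashlib.sha256(sent).hexdigest()
    print(is_unmodified(io.BytesIO(sent), expected))
    print(is_unmodified(io.BytesIO(sent + b"!"), expected))
```

In practice the stream would be a file opened in binary mode at the destination, and the comparison would run after the transfer completes.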

4. IANA Considerations


5. Security Considerations


6. Acknowledgements


7. Contributors

The following people have substantially contributed to this document:

        Guangyu Zhao
        China Mobile

8. References

8.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119.

8.2. Informative References

[I-D.zhao-tsvwg-odes-gap-analysis] Zhao, G., Yang, H., and Z. Li, "Gap Analysis of Online Data Express Service(ODES)", Work in Progress, Internet-Draft, draft-zhao-tsvwg-odes-gap-analysis-01.

Authors' Addresses

Zongpeng Du
China Mobile
No.32 XuanWuMen West Street
Kehan Yao
China Mobile
No.32 XuanWuMen West Street
Hongwei Yang
China Mobile
No.32 XuanWuMen West Street