Tales & Tips from the Trenches: Data Movement in Edge Computing

Our recent articles, "Demystifying Edge Computing" and "Types of and How to Use Edge Computing," showed how edge computing architecture, with its distributed computing capabilities, is addressing the rising scale and ubiquity of data.

The center of data's gravity shifts continuously between the decentralized edge and its centralized alter ego, the cloud, which in turn makes data movement between the two critically important.

Why Data Movement Is Critical in Edge Computing

Edge computing augments and expands the possibilities of today's primarily centralized, hyperscale cloud model, supports the systemic evolution and deployment of the IoT, and enables entirely new types of next-generation digital business applications. Every business is a data business, and enterprises can now begin to extract previously untapped value from data in this new cloud-edge ecosystem. As data solutions span everything from hyperscale cloud data centers to edge-based home thermostats and tactical warfighters, the distributed, cloud-complementing nature of edge computing raises the classic data management question: what data goes where, and how? Innovations in cloud-service-based offline and online data movement have paved the way for far more efficient use of information, making this new computing model a success by delivering the right data at the right time and in the right location.

How to Define a Data Movement Problem

Data transfer can occur offline or over a network connection. A data movement problem is best defined by three variables: the size of the data to be moved, the frequency of data transfer, and the network bandwidth available between edge and cloud.
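
To make these variables concrete, here is a back-of-the-envelope estimate of transfer time from data size and bandwidth; a minimal sketch in Python, where the 80% effective-utilization factor is an assumption, not a measured figure.

```python
def transfer_hours(size_gb: float, bandwidth_mbps: float,
                   utilization: float = 0.8) -> float:
    """Estimate hours needed to move size_gb over a bandwidth_mbps link.

    Assumes the link sustains `utilization` of its nominal bandwidth;
    real throughput varies with protocol overhead and congestion.
    """
    size_megabits = size_gb * 8 * 1000              # GB -> megabits (decimal)
    effective_mbps = bandwidth_mbps * utilization
    return size_megabits / effective_mbps / 3600    # seconds -> hours

# 10 TB over a 100 Mbps link takes roughly 11-12 days of continuous
# transfer -- the kind of number that motivates offline transfer options.
print(f"{transfer_hours(10_000, 100) / 24:.1f} days")
```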

Possible Scenarios of Data Movement Problems

Scenario 1: Transfer large datasets (a few terabytes to a few petabytes) with no or low network bandwidth (less than 100 Mbps).

Available network bandwidth is limited, or non-existent, and large datasets need to be transferred.

Scenario 2: Transfer large datasets (a few terabytes to a few petabytes) with moderate to high network bandwidth (100 Mbps to 1 Gbps).

Available network bandwidth is moderate to high, and large datasets need to be transferred.

Scenario 3: Transfer small datasets (a few gigabytes to a few terabytes) with limited to moderate network bandwidth (45 Mbps, i.e., a T3 connection in the data center, to 1 Gbps).

Available network bandwidth is limited to moderate, and small datasets need to be transferred.

Scenario 4: Periodic data transfers.

Point-in-time data transfer at regular intervals, or continuous data transfer, is required.
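
The sketch below maps a workload onto these four scenarios using the thresholds quoted above; the cutoffs come straight from the scenario definitions, while the 1 TB large/small boundary is an assumption, since the size ranges for "large" and "small" overlap.

```python
def classify_scenario(size_tb: float, bandwidth_mbps: float,
                      periodic: bool = False) -> str:
    """Map a data movement problem onto one of the four scenarios."""
    if periodic:
        return "Scenario 4: periodic or continuous transfer"
    large = size_tb >= 1.0                # assumed large/small boundary
    if large and bandwidth_mbps < 100:
        return "Scenario 1: large dataset, no/low bandwidth"
    if large:
        return "Scenario 2: large dataset, moderate/high bandwidth"
    return "Scenario 3: small dataset, limited/moderate bandwidth"

print(classify_scenario(500, 45))      # -> Scenario 1: ...
print(classify_scenario(0.2, 1000))    # -> Scenario 3: ...
```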

Types of Data Movement Solutions

Today, evolving cloud and edge computing architectures, combined with the demands of economies of scale, shape data movement solutions around quality of service, latency, self-healing engineering, dynamic provisioning, distributed data experience, risk management, and security challenges.

Major Options Supporting Data Movement Solution Types

Offline transfer using shippable devices: When a one-time offline bulk transfer is required, data can be copied to physical shippable devices or disks, which are then sent to the destination.
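
Whichever device is shipped, the copy should be verifiable end to end. Below is a minimal sketch that builds a SHA-256 manifest before shipment so it can be compared against one built at the cloud ingest site; the mount path is hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_file(path: Path) -> str:
    """Hash a file in 1 MB chunks so multi-GB files don't exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def checksum_manifest(root: str) -> dict[str, str]:
    """Checksum every file under root (e.g., the mounted transfer device)."""
    base = Path(root)
    return {str(p.relative_to(base)): sha256_file(p)
            for p in base.rglob("*") if p.is_file()}

# Build before shipping, rebuild on arrival, and diff the two manifests.
before = checksum_manifest("/mnt/transfer_device")   # hypothetical mount point
```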

Online transfer: Transferring data online between edge and cloud can happen over a network connection with the following types of solutions:

  • Web-based and graphical interface tools can be used where non-automated, occasional data transfers are required.
  • Scripted and programmatic tool-driven data transfers over REST APIs can be used, either with a cloud service provider's software or with its software development kits (SDKs) for .NET, Java, Python, Node.js, C++, Go, PHP, Ruby, etc. (see the sketch after this list).
  • Managed cloud-native data pipelines can be set up to transfer data regularly between multiple edge locations and the cloud, transforming data as required while it is in motion.
  • Object replication methods can support continuous data ingestion, alongside managed data pipelines for periodic transfers, in a synchronous or asynchronous fashion.
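
As an illustration of the scripted/SDK option above, here is a sketch using AWS's boto3 Python SDK (other providers' SDKs follow the same pattern); the file, bucket, and key names are hypothetical.

```python
import boto3
from boto3.s3.transfer import TransferConfig

# Multipart upload settings help large objects survive moderate links.
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # split files larger than 64 MB
    max_concurrency=8,                     # parallel part uploads
)

s3 = boto3.client("s3")
s3.upload_file(
    Filename="/data/edge/sensor_batch.parquet",  # hypothetical local file
    Bucket="example-edge-landing-zone",          # hypothetical bucket
    Key="ingest/sensor_batch.parquet",
    Config=config,
)
```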

Key Features Defining Data Movement Solution Capabilities

The following key data movement solution features define a solution's capabilities through quantitative and qualitative attributes that address data movement problems in edge computing:

  • Data Size: Data volume supported when exporting and importing data between edge and cloud endpoints
  • Data Formats: Capabilities supporting movement of structured data, unstructured data, or both
  • Hardware / Form Factors: Specific hardware / form factors of device required for data movement
  • Network Interface: Specific interface control requirements for online data movements
  • Security & Encryption: Security controls for data at rest, in motion, and in use, along with encryption needs
  • Pricing: Considerations for pricing in terms of data uploads and downloads within certain geographies
  • Data processing: Capabilities in terms of data conditioning and transformations during data movement
  • Performance: Optimized performance in terms of setup, data movement speed, processing, and caching
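
These attributes can double as a requirements checklist when comparing candidate solutions. A minimal sketch follows: the profile fields mirror a subset of the list above, and the example values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SolutionProfile:
    """A subset of the capability attributes listed above."""
    name: str
    max_data_tb: float           # Data Size
    handles_unstructured: bool   # Data Formats
    needs_hardware: bool         # Hardware / Form Factors
    encrypts_in_motion: bool     # Security & Encryption
    transforms_data: bool        # Data processing

def meets_requirements(s: SolutionProfile, size_tb: float,
                       unstructured: bool, transform: bool) -> bool:
    """Screen a candidate solution against a workload's requirements."""
    return (s.max_data_tb >= size_tb
            and (s.handles_unstructured or not unstructured)
            and s.encrypts_in_motion
            and (s.transforms_data or not transform))

pipeline = SolutionProfile("managed data pipeline", max_data_tb=100,
                           handles_unstructured=True, needs_hardware=False,
                           encrypts_in_motion=True, transforms_data=True)
print(meets_requirements(pipeline, size_tb=20,
                         unstructured=True, transform=True))  # True
```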

Hopefully, this article has shown how data movement patterns are critical in joining edge and cloud, addressing the continuously shifting center of data's gravity between them and making this complementary computing model a success.

This quarter’s column is written by Shibasis Mitra of The MITRE Corporation, who has over 25 years of experience in data management. He has helped public- and private-sector clients with diverse data solutions, analytic tool development, and cloud- and edge-based platform builds, from strategy through execution.

He has strong expertise in architecting complex data management systems with cloud-native and open-source services in agile environments following the Software Development Life Cycle (SDLC). Shibasis holds a bachelor’s degree in engineering and is certified in data modeling, cloud technologies, and agile software development.

Approved for Public Release; Distribution Unlimited. Public Release Case Number 21-3342

The author’s affiliation with The MITRE Corporation is provided for identification purposes only, and is not intended to convey or imply MITRE’s concurrence with, or support for, the positions, opinions, or viewpoints expressed by the author. ©2021 The MITRE Corporation. ALL RIGHTS RESERVED


Bonnie O'Neil

Bonnie O'Neil is a Principal Computer Scientist at the MITRE Corporation and is internationally recognized for her expertise in all phases of data architecture, including data quality, business metadata, and governance. She is a regular speaker at many conferences and has been a workshop leader at the Meta Data/DAMA Conference, among others; she was the keynote speaker at a conference on data quality in South Africa. She has been involved in strategic data management projects in both Fortune 500 companies and government agencies, and her expertise includes specialized skills such as data profiling and semantic data integration. She is the author of three books, including Business Metadata (2007), and over 40 articles and technical white papers.
