1 of 76

Edge Computing

Jonathan Fürst

IoT Platform Group, NEC Labs Europe

Heidelberg, Germany

<jonathan.fuerst@neclab.eu>

2 of 76

About me

  • MSc and PhD from IT University of Copenhagen.
  • Since January 2019 researcher at NEC Labs Europe.
  • Background in sensor networks and IoT systems, especially in a non-residential building context.
  • Taught this course before, when it was still called Pervasive Computing.

· 2

3 of 76

Intro

· 3

Dec. 2018, NEC acquires KMD

4 of 76

Intro

NEC Labs Europe

· 4

5 of 76

Instrumenting the Physical World

5

→ Goal: Use sensor data to understand (model) and influence (actuate) physical systems.

Let’s just use the cloud for that!

6 of 76

Intro

Cloud Computing

· 6

Source: https://www.red-gate.com/simple-talk/cloud/cloud-development/a-comprehensive-introduction-to-cloud-computing/

7 of 76

Intro

Cloud Computing

· 7

8 of 76

Intro

What might be the problem with that?

· 8

9 of 76

Intro

· 9

10 of 76

Intro

  • Data flow is being reversed:
    • Traditional: content distribution from core to edge.
    • IoT: generate (huge amounts of) data at the network edge and move it towards the core for processing in the cloud.
  • Diverse and interdependent QoS requirements of IoT applications
    • Complex tradeoffs between responsiveness, accuracy, power consumption, and cost.
  • Rich clients with (some) processing power that make decisions based on local data.
    • E.g., drone, autonomous vehicle, smartphone…

Paradigm Shift in IoT Applications

· 10

11 of 76

Intro

Camera Input

  • Huge amounts of data (would congest network links).
  • High processing power needed, e.g., for computer vision algorithms (cannot just do all on device).
  • Requires low latency for many applications (augmented reality, surveillance…).

Approach:

  • Bring cloud resources closer to the end devices (edge, fog computing).

Example: Camera Input

· 11

12 of 76

Intro

· 12

Basic Idea:

Find and rescue disaster victims supported by drones.

13 of 76

Intro

13

Tasks shown in the figure:

  • T1 - (V)SLAM
  • T2 - 3D model reconstruction
  • T3 - Infrared-based detection
  • T4 - CV people detection

14 of 76

Outline

  • Intro
  • Processing Sensor Data 101
  • Edge Computing Concepts
  • Edge Programming Models & Execution Frameworks
  • IoT Edge Platforms

14

15 of 76

Processing Sensor Data

15

16 of 76

Processing Sensor Data

  • “Something that conveys information.”

E.g., about the state or behavior of a physical system (voltage, current, or magnetic field strength...).

Mathematically:

  • “A function of one or more independent variables.”

Independent variable: Time

What is a Signal?

· 16

17 of 76

Processing Sensor Data

Continuous-time Signal: continuous independent variable (time)

Analog Signal

Discrete-time Signal: discrete independent variable values (sequence of numbers)

→ Dependent variable (signal amplitude) can be continuous or discrete.

Digital signal: Both time and amplitude are discrete.
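The step from a continuous-time signal to a digital one can be sketched in a few lines. This is only an illustration under assumed values: the 1 Hz sine, the 8 Hz sampling rate, and the 5 amplitude levels are made up, and `sample` and `quantize` are hypothetical helper names.

```python
import math

def sample(signal, duration_s, rate_hz):
    """Evaluate a continuous-time signal at regular intervals (discrete time)."""
    n = int(duration_s * rate_hz)
    return [signal(k / rate_hz) for k in range(n)]

def quantize(samples, levels, lo=-1.0, hi=1.0):
    """Map each amplitude to the nearest of `levels` evenly spaced values (discrete amplitude)."""
    step = (hi - lo) / (levels - 1)
    return [lo + round((min(max(x, lo), hi) - lo) / step) * step for x in samples]

def sine_1hz(t):
    """Stand-in for an analog signal: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * t)

# Discrete time, continuous amplitude: 8 samples over one second.
discrete_time = sample(sine_1hz, duration_s=1.0, rate_hz=8)

# Digital signal: discrete time and discrete amplitude (5 levels between -1 and 1).
digital = quantize(discrete_time, levels=5)
print(digital)
```

After quantization every value is one of -1.0, -0.5, 0.0, 0.5, or 1.0, so both the time axis and the amplitude axis are discrete.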

Types of Signals

· 17

18 of 76

Processing Sensor Data

Continuous to Discrete-time Signal

· 18

19 of 76

Processing Sensor Data

Time series represent ordered, real-valued measurements at regular temporal intervals.

A time series X = {x1, x2, ..., xn} for t = t1, t2, ..., tn is a discrete function with value x1 at time t1, value x2 at time t2, and so on.

Time-series Data

· 19

20 of 76

Processing Sensor Data

Data Acquisition

· 20

  • Need to synchronize readings from different sensors (might be distributed)
  • Row of data for each timestamp:

Dataset

  • The rows are called data objects, samples, examples, instances, data points, or observations
  • The columns are called attributes / features (dimensionality of the observations)

Practical implementation: pandas DataFrame or R data frame
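A minimal sketch of the synchronization step, assuming each sensor delivers `(timestamp, value)` pairs sorted by time and that the most recent reading is an acceptable value for each row. The sensor names, readings, and the helpers `latest_at` and `build_dataset` are hypothetical; a dataframe library would normally do this for you.

```python
from bisect import bisect_right

def latest_at(readings, t):
    """Return the most recent value at or before time t, or None if there is none."""
    times = [ts for ts, _ in readings]
    i = bisect_right(times, t)
    return readings[i - 1][1] if i > 0 else None

def build_dataset(streams, timestamps):
    """One row per timestamp, one column (attribute) per sensor."""
    return [
        {"t": t, **{name: latest_at(r, t) for name, r in streams.items()}}
        for t in timestamps
    ]

# Two sensors that report at different, unaligned times (made-up values).
streams = {
    "temperature_c": [(0.0, 21.0), (2.1, 21.4), (4.0, 21.9)],
    "co2_ppm": [(0.5, 410), (1.6, 415), (3.2, 430)],
}
rows = build_dataset(streams, timestamps=[1.0, 2.0, 3.0, 4.0])
for row in rows:
    print(row)
```

Each printed dictionary is one data object; its keys are the attributes.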

21 of 76

Processing Sensor Data

Discrete and Continuous Attributes

· 21

Discrete Attribute

  • Has only a finite or countably infinite set of values
  • Examples: zip codes, counts, or the set of words in a collection of documents
  • Often represented as integer variables.
  • Note: binary attributes are a special case of discrete attributes

Continuous Attribute

  • Has real numbers as attribute values
  • Examples: temperature, height, or weight.
  • Practically, real values can only be measured and represented using a finite number of digits.
  • Continuous attributes are typically represented as floating-point variables.

22 of 76

Processing Sensor Data

· 22

23 of 76

Major Tasks in Data Preprocessing

  • Data cleaning
    • Fill in missing values, smooth noisy data, identify or remove outliers, and resolve inconsistencies
  • Data annotation
    • Annotate the data using the ground truth, e.g., video recordings
  • Data integration
    • Integration of multiple datasets or files
  • Data reduction
    • Dimensionality reduction
    • Numerosity reduction
    • Data compression
  • Data transformation
    • Generalize and/or normalize data
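Two of these tasks, filling in missing values and normalizing, can be sketched as follows. Forward fill and min-max scaling are just one possible choice for each task, and the readings are made up.

```python
def fill_missing(values):
    """Data cleaning: replace None with the previous valid value (forward fill).
    A leading None stays None because there is nothing to carry forward."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def min_max_normalize(values):
    """Data transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

raw = [20.0, None, 22.0, None, None, 25.0]   # None marks a missing reading
cleaned = fill_missing(raw)
normalized = min_max_normalize(cleaned)
print(cleaned)
print(normalized)
```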

24 of 76

Why data cleaning?

  • Incomplete (missing) data
  • Noisy data
  • Inconsistent data
  • Outliers

25 of 76

About noise

  • In signal processing, noise is a general term for unwanted (and, in general, unknown) modifications that a signal may suffer during capture, storage, transmission, processing, or conversion.
  • Noise often consists of random errors in measurements.
  • It is common to model noise as a Gaussian probability distribution with zero mean.
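The zero-mean Gaussian model can be tried out with the standard library. This is a sketch with assumed values: the constant "true" signal of 5.0, the standard deviation of 0.5, and the fixed seed are chosen only for illustration.

```python
import random
import statistics

def add_gaussian_noise(signal, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `sigma` to each sample."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    return [x + rng.gauss(0.0, sigma) for x in signal]

clean = [5.0] * 10_000
noisy = add_gaussian_noise(clean, sigma=0.5)

# Recover the noise and check that it behaves as modeled.
noise = [n - c for n, c in zip(noisy, clean)]
mean_noise = statistics.fmean(noise)
std_noise = statistics.pstdev(noise)
print(f"mean={mean_noise:.3f} std={std_noise:.3f}")
```

With this many samples the measured mean should be close to 0 and the measured standard deviation close to 0.5.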

http://www.gaussianwaves.com/2013/11/simulation-and-analysis-of-white-noise-in-matlab/

26 of 76

What is a filter?

The purpose of a filter is to remove or attenuate unwanted parts of the data (e.g., noise) while keeping the rest for further processing (e.g., triggering an action).

27 of 76

Mean (Moving average) and Median Filter

  • Simple, but often effective
  • Take mean/median of values over window size (e.g., last 5 values)
  • Variant: weighted mean filter

But:

  • Response lag (does not matter for non-live data)
  • No estimation of higher level variables (e.g., speed)
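Both filters fit in one small function. The sketch below assumes a causal window (only the current and previous values), which is what causes the response lag; the readings and the name `sliding_filter` are made up.

```python
from statistics import mean, median

def sliding_filter(values, window, reduce):
    """Apply `reduce` to the last `window` values at each step (causal filter)."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)   # shorter window at the very beginning
        out.append(reduce(values[start:i + 1]))
    return out

readings = [10, 10, 10, 50, 10, 10, 10]   # one spike (outlier) at index 3
mean_filtered = sliding_filter(readings, 3, mean)
median_filtered = sliding_filter(readings, 3, median)
print(mean_filtered)
print(median_filtered)
```

In this example the mean filter smears the spike over three outputs, while the median filter removes it completely. A weighted mean variant would only need a different `reduce` function.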

What about window size?

28 of 76

What about window size?

● Too large: blurs edges, removes small events

● Too small: noise may lead to false detections
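The "removes small events" effect can be shown with a simple threshold detector on a mean-filtered signal. The signal, the threshold of 0.5, and the two window sizes are assumptions chosen for illustration.

```python
from statistics import mean

def moving_mean(values, window):
    """Causal moving average over the last `window` values."""
    return [mean(values[max(0, i - window + 1):i + 1]) for i in range(len(values))]

# A short event (two samples at 1.0) on a flat baseline of 0.0.
signal = [0.0] * 10 + [1.0, 1.0] + [0.0] * 10
threshold = 0.5

def detected(filtered):
    """True if any filtered value crosses the detection threshold."""
    return any(v > threshold for v in filtered)

small = moving_mean(signal, window=2)
large = moving_mean(signal, window=8)
print(detected(small), max(small))
print(detected(large), max(large))
```

With a window of 8 the two-sample event is averaged down to a peak of 0.25 and never crosses the threshold, so the event is lost.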

29 of 76

Mean (Moving average) and Median Filter Example

https://dsp.stackexchange.com/questions/27349/moving-average-vs-moving-median

30 of 76

Location Tracking Example

31 of 76

Processing Sensor Data

  • Which sensors to use?
  • Deal with noisy sensor data
  • Power Accuracy Tradeoff
    • E.g., GSM, WiFi, GPS
    • Sampling frequency
  • Sensors in mobile device or sensors in infrastructure?

Is this enough?

Sensing the Physical World

· 31

32 of 76

Processing Sensor Data

Modeling / Inferring the Physical World

  • Identifying the right parameters; achieving high accuracy and precision in predictions.

· 32

Rules vs. Machine Learning

33 of 76

Processing Sensor Data

Rules (Declarative Approach)

· 33

  • If-then rules
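A rule-based approach can be as small as a chain of if-then statements. The sketch below infers a transport mode from speed; the thresholds and labels are illustrative guesses, not values from the lecture.

```python
def classify_activity(speed_kmh):
    """Hand-written if-then rules; the thresholds are illustrative guesses."""
    if speed_kmh < 0.5:
        return "still"
    if speed_kmh < 7:
        return "walking"
    if speed_kmh < 30:
        return "cycling"
    return "driving"

for speed in (0.0, 4.5, 18.0, 80.0):
    print(speed, classify_activity(speed))
```

Such rules are easy to read and debug, but every threshold has to be chosen and maintained by hand.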

34 of 76

Processing Sensor Data

Basic Steps:

  1. Collect sensor data of different types of situations that the user experiences in the application.
  2. Label data.
  3. (Feature Engineering)
  4. Learn the probabilistic relationships between sensor data and situations using your favorite algorithm.
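The steps above can be sketched with a nearest-centroid classifier standing in for "your favorite algorithm". It is a deliberately simple, non-probabilistic substitute, and the two features (mean speed in km/h, accelerometer variance) and all labeled samples are made up.

```python
from collections import defaultdict
from math import dist

def train(samples):
    """samples: list of (feature_vector, label). Returns one centroid per label."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

# Steps 1-3: collected, labeled, and feature-engineered data (hypothetical).
labeled = [
    ((0.2, 0.1), "still"), ((0.4, 0.2), "still"),
    ((4.8, 1.1), "walking"), ((5.4, 1.3), "walking"),
    ((18.0, 2.5), "cycling"), ((21.0, 3.1), "cycling"),
]

# Step 4: learn the relationship between sensor data and situations.
model = train(labeled)
print(predict(model, (5.0, 1.0)))
```

Unlike hand-written rules, the decision boundaries here come from the labeled data, so adding a new situation means adding labeled samples rather than new thresholds.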

Machine Learning Approach

· 34

35 of 76

Processing Sensor Data

Summary

· 35

  • The real world is messy, signals are messy → need to smooth → use filtering techniques.
  • Often mean/median filters are a good start.
  • Often we cannot directly sense what we are interested in or we might have to deal with incomplete information and uncertainty.
  • We can use rules (for simple inference) or machine learning (for more complicated inferences).

Edge Computing is where some of this processing can happen.

36 of 76

Edge Computing

36

37 of 76

Edge Computing

Goals

· 37

  • Lower Latency → enable responsive, latency-sensitive applications
  • Improve Reliability → don’t depend on cloud response
  • Avoid Bandwidth Limitations → just single hop to edge device
  • Improve Data Privacy → do data processing in own network

→ Bring cloud resources closer to end-users (e.g., IoT devices).

38 of 76

Edge Computing

  1. How to distribute computational tasks across networked devices efficiently?
     Bonus: run them reliably.
  2. How to provide an easy interface to developers?

Challenges

· 38

Distributed Execution Frameworks

Programming Models

39 of 76

Edge Computing

1. The network is reliable

2. Latency is zero

3. Bandwidth is infinite

4. The network is secure

5. Topology doesn’t change

6. There is one administrator

7. Transport cost is zero

8. The network is homogeneous

Peter Deutsch 1994 / James Gosling 1997

The 8 fallacies of distributed computing

· 39

40 of 76

Edge Computing: Terms

  • “a horizontal, physical or virtual resource paradigm that resides between smart end-devices and traditional cloud or data centers. This paradigm supports vertically-isolated, latency-sensitive applications by providing ubiquitous, scalable, layered, federated, and distributed computing, storage, and network connectivity.”
  • Cloudlet → A fog node.
  • Mist computing → more lightweight fog nodes
  • Edge computing → often a synonym for fog computing

Fog Computing

· 40

41 of 76

Edge Computing: Terms

Fog Computing

· 41

42 of 76

CDNs (e.g., Akamai)

On the content delivery side, this is nothing new

· 42

Source: https://developer.akamai.com/blog/2017/06/14/optimizing-cacheability-web-app-performance/

43 of 76

Edge Computing

Edge computing generalizes and extends the CDN concept by leveraging cloud computing infrastructure. As with CDNs, the proximity of cloudlets to end users is crucial. However, instead of being limited to caching web content, a cloudlet can run arbitrary code just as in cloud computing. This code is typically encapsulated in a virtual machine (VM) or a lighter-weight container for isolation, safety, resource management, and metering.

· 43

44 of 76

Edge Computing

Summary

· 44

  • Edge Computing as a generalization of the CDN concept for computation.
  • Avoids some of the limitations of the cloud and makes certain applications feasible.

  • Bootstrapping problem: Without unique applications and services that leverage edge computing, there is no incentive for deploying cloudlets. Yet, without large-enough cloudlet deployments, there is little incentive for developers to create those new applications and services.
  • Plus: Question of a good programming model and execution environment.

45 of 76

Edge Programming Models & Execution Frameworks

45

46 of 76

Cloudlets and MAUI (2009/10)

Motivation:

  • Mobile Computing: → limited computing, battery, memory
  • Cloud Computing: → helps to offload computation, but high WAN latency and bandwidth-induced delays (plus 3G is battery expensive).

· 46

47 of 76

Cloudlets and MAUI (2009/10)

“...mobile users seamlessly utilize nearby computers to obtain the resource benefits of cloud computing without incurring WAN delays and jitter. Rather than relying on a distant “cloud,” a mobile user instantiates a “cloudlet” on nearby infrastructure and uses it via a wireless LAN. Crisp interactive response for immersive applications that augment human cognition is then much easier to achieve because of the proximity of the cloudlet. We confirm that a critical untested aspect of this vision, namely rapid customization of cloudlet infrastructure, is achievable through dynamic VM synthesis.”

Vision

· 47

48 of 76

Cloudlets and MAUI (2009/10)

Mobile-Edge computing framework:

MAUI decides at runtime which methods should be remotely executed, driven by an optimization engine that achieves the best energy savings possible under the mobile device’s current connectivity constraints.

MAUI

· 48

49 of 76

Cloudlets and MAUI (2009/10)

  • Uses code portability to create two versions of a smartphone application, one of which runs locally on the smartphone and the other runs remotely in the infrastructure.
  • Uses programming reflection combined with type safety to automatically identify the remoteable methods and extract only the program state needed by those methods.
  • Profiles each method of an application and uses serialization to determine its network shipping costs (i.e., the size of its state). MAUI combines the network and CPU costs with measurements of the wireless connectivity, such as its bandwidth and latency, to construct a linear programming formulation of the code offload problem.
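The cost trade-off behind the offload decision can be illustrated with a much simpler model than MAUI's. The sketch below is not MAUI's linear programming formulation: it looks at one method in isolation and only compares the energy of running it locally with the energy of shipping its state. The profile fields, the radio power of 800 mW, and all measurements are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class MethodProfile:
    name: str
    local_energy_mj: float   # energy to run the method on the phone (assumed)
    state_bytes: int         # size of the serialized state to ship (assumed)

def transfer_energy_mj(state_bytes, bandwidth_bps, latency_s, radio_power_mw):
    """Energy spent keeping the radio active for the latency plus the upload time."""
    transfer_time_s = latency_s + state_bytes * 8 / bandwidth_bps
    return radio_power_mw * transfer_time_s   # mW * s = mJ

def should_offload(m, bandwidth_bps, latency_s, radio_power_mw=800.0):
    """Offload only if shipping the state costs less energy than local execution."""
    cost = transfer_energy_mj(m.state_bytes, bandwidth_bps, latency_s, radio_power_mw)
    return cost < m.local_energy_mj

method = MethodProfile("recognize_face", local_energy_mj=2000.0, state_bytes=200_000)

# Same method, two connectivity situations (made-up numbers).
print(should_offload(method, bandwidth_bps=10e6, latency_s=0.02))   # fast WiFi link
print(should_offload(method, bandwidth_bps=500e3, latency_s=0.2))   # slow cellular link
```

The same method is worth offloading on the fast link but not on the slow one, which is why the decision has to be made at runtime from current connectivity measurements.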

MAUI: Details

· 49

50 of 76

Cloudlets and MAUI (2009/10)

MAUI: Results

· 50

51 of 76

Cloudlets and MAUI (2009/10)

· 51

52 of 76

Google Stadia 2019

Stadia latency: 166 ms, or roughly 5 frames of input lag at 30 fps.

Assumption:

  • Good Internet connection & close-by data center
  • Controller uses WiFi to connect directly to the game running in Google’s data center
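The "roughly 5 frames" figure follows from dividing the latency by the duration of one frame. A small check, where `frames_of_lag` is a hypothetical helper and the 60 fps case is added only for comparison:

```python
def frames_of_lag(latency_ms, fps):
    """Number of frames that pass while one input is in flight."""
    frame_time_ms = 1000 / fps
    return latency_ms / frame_time_ms

lag_30 = frames_of_lag(166, 30)
lag_60 = frames_of_lag(166, 60)
print(f"{lag_30:.2f} frames at 30 fps, {lag_60:.2f} frames at 60 fps")
```

At 30 fps one frame lasts about 33.3 ms, so 166 ms corresponds to just under 5 frames; at 60 fps the same latency spans about twice as many frames.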

· 52

53 of 76

Google Edge Datacenters

Calder, Matt, Xun Fan, Zi Hu, Ethan Katz-Bassett, John Heidemann, and Ramesh Govindan. "Mapping the expansion of Google's serving infrastructure." In Proceedings of the 2013 conference on Internet measurement conference, pp. 313-326. ACM, 2013.

· 53

54 of 76

FogFlow (IEEE IoT17)

Motivation

· 54

Edge computing has great potential to reduce bandwidth consumption and end-to-end latency, but it raises much more complexity than cloud computing since the cloud-edge environment is more open, heterogeneous, and dynamic.

Can we program applications over cloud-edges easily, like programming them in the cloud?

Can we let the cloud-edge platform automatically manage and optimize its own resources under such dynamics?

Service realization (during the development phase):

  • Services are complicated to realize due to the lack of a programming model and poor interoperability: service/application providers spend months on each service.

Resource management (during the deployment phase):

  • No approach for dealing with dynamics like device mobility, instant service usage, and temporary failure: applications have to face those issues themselves.

New requirements for new services come frequently.

Cheng, Bin, et al. "FogFlow: Easy Programming of IoT Services Over Cloud and Edges for Smart Cities." IEEE Internet of Things Journal (2017).

Slide by Bin Cheng

55 of 76

FogFlow (IEEE IoT17)

What Is FogFlow (1): Cloud-Edge Orchestrator

  • FogFlow is a cloud-edge orchestrator that sets up dynamic NGSI-based data processing flows on demand between producers and consumers, providing timely results for fast actions based on context (system context and data context).

· 55

Figure labels: producers (sensors), raw context information, FogFlow (dynamic processing flows across the cloud and several edge nodes), data context, system context, timely results.

Slide by Bin Cheng

Open Sourced: https://github.com/smartfog/fogflow

56 of 76

Use Case: Smart Awning

· 56

  1. Enable faster response time with lower resource usage
  2. Nearly zero management cost

Slide by Bin Cheng

57 of 76

IoT Edge Platforms

57

58 of 76

Nebbiolo: Industry 4.0

58

59 of 76

Microsoft Azure IoT Edge

· 59

60 of 76

Microsoft Azure IoT Edge

· 60

61 of 76

AT&T and Linux Foundation: Akraino

· 61

62 of 76

OpenStack Edge Computing

· 62

63 of 76

OpenFog Consortium

· 63

https://www.openfogconsortium.org/

64 of 76

End-user Privacy: Databox Project

· 64

https://www.databoxproject.uk

65 of 76

New Hardware: Autonomous Cars

65

https://www.anandtech.com/show/9903/nvidia-announces-drive-px-2-pascal-power-for-selfdriving-cars

66 of 76

New Hardware: Autonomous Cars

66

https://www.anandtech.com/show/9903/nvidia-announces-drive-px-2-pascal-power-for-selfdriving-cars

67 of 76

Take-Aways

67

68 of 76

Take-Aways

  • Edge computing is a concept that brings computational resources (cloud resources) closer to the devices (sensors, mobile phones, drones, cars...) to overcome the latency, bandwidth, reliability, and privacy limitations of cloud computing.
  • Combined with added hardware heterogeneity, this raises system complexity:
    • Much research in academia and industry to design new execution frameworks and programming models
    • Question of who is going to establish platforms (cloud vs. mobile providers). Role of 5G.
    • Killer app for edge computing?

· 68

69 of 76

Intro

NEC Labs Europe

· 69

Internship or Master Thesis

Flexible dates.

Work language is English

Please talk/write to me.

jonathan.fuerst@neclab.eu

The research leading to these results has received funding from the European Community's Horizon 2020 research and innovation programme under grant agreement nº 779747

70 of 76

Bonus Slides

70

71 of 76

MagnetOS (MobiSys’05)

Programming ad-hoc (sensor) networks is complicated:

  • They are treated as a system of standalone systems, as a network of independent, autonomous computers.
    → Forces applications to provide all of the requisite mechanisms and policies for their operation themselves (e.g., how to react to low battery).

Motivation

· 71

Liu, Hongzhou, et al. "Design and implementation of a single system image operating system for ad hoc networks." Proceedings of the 3rd international conference on Mobile systems, applications, and services. ACM, 2005.

72 of 76

MagnetOS (MobiSys’05)

  • Main idea: a single virtual machine on top of ad-hoc nodes as a programming model and OS abstraction. MagnetOS can partition traditional monolithic Java applications on a bytecode level.
  • Object instances are the smallest unit of mobility. These objects communicate through events and can be migrated between network nodes. This migration/placement is based on a set of algorithms that all aim to minimize the mean path length of data packets (because communication has the main energy impact in sensor nets).

Approach

· 72

73 of 76

Dell EMC and VMware

· 73

74 of 76

Edge Computing

Application Areas

· 74

Discuss 5 min: What are applications that benefit from edge computing and how?

Can we avoid cloud computing altogether for all applications?

75 of 76

FogFlow (IEEE IoT17)

What Is FogFlow (2): High Level View

· 75

Slide by Bin Cheng

Open Sourced: https://github.com/smartfog/fogflow

76 of 76

FogFlow (IEEE IoT17)

  • Models a program as a directed graph of the data flowing between operations.
  • Explicitly defined inputs and outputs connect operations.
  • Operations run as soon as their input becomes available.

→ Popular in streaming data processing because it’s easy to parallelize, e.g., Google Cloud Dataflow, Apache Beam, Flink.
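The three properties above can be sketched with a tiny in-memory executor. This is a toy illustration, not the API of FogFlow, Beam, or Flink; the `Dataflow` class, the operation names, and the temperature values are hypothetical.

```python
class Dataflow:
    """Toy dataflow graph: operations are connected by named inputs and outputs."""

    def __init__(self):
        self.ops = []      # (function, input names, output name)
        self.values = {}   # name -> value, filled as data becomes available
        self.trace = []    # order in which operations actually ran

    def op(self, func, inputs, output):
        self.ops.append((func, inputs, output))

    def run(self, **sources):
        self.values = dict(sources)
        self.trace = []
        pending = list(self.ops)
        while pending:
            # An operation is ready as soon as all of its inputs are available.
            ready = [o for o in pending if all(i in self.values for i in o[1])]
            if not ready:
                raise ValueError("missing inputs or cyclic graph")
            # Operations in one ready batch do not depend on each other,
            # so a real engine could run them in parallel.
            for func, inputs, output in ready:
                self.values[output] = func(*(self.values[i] for i in inputs))
                self.trace.append(output)
            pending = [o for o in pending if o not in ready]
        return self.values

flow = Dataflow()
# Declared in "wrong" order on purpose: execution follows the data, not the code.
flow.op(lambda avg, limit: avg > limit, ["avg_temp", "limit"], "alarm")
flow.op(lambda xs: sum(xs) / len(xs), ["temps"], "avg_temp")
result = flow.run(temps=[20.0, 22.0, 27.0], limit=22.5)
print(flow.trace, result["alarm"])
```

The average is computed before the alarm check even though it was declared second, because the alarm operation only becomes ready once `avg_temp` exists.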

Dataflow Programming

· 76