Mobility-Aware and Energy-Efficient Task Offloading Strategy for Mobile Edge Workflows

Abstract: With the rapid growth of the Industrial Internet of Things (IIoT), Mobile Edge Computing (MEC) has become widely used in many emerging scenarios. In MEC, each workflow task can be executed locally or offloaded to the edge to improve Quality of Service (QoS) and reduce energy consumption. However, most existing offloading strategies focus on independent applications and cannot be applied efficiently to workflow applications with a series of dependent tasks. To address this issue, this paper proposes an energy-efficient task offloading strategy for large-scale workflow applications in MEC. First, we formulate the task offloading problem as an optimization problem with the goal of minimizing the utility cost, which is the trade-off between energy consumption and total execution time. Then, a novel heuristic algorithm named Green DVFS-GA is proposed, which includes a task offloading step based on the genetic algorithm and a further step that reduces energy consumption using the Dynamic Voltage and Frequency Scaling (DVFS) technique. Experimental results show that the proposed strategy significantly reduces energy consumption and achieves the best trade-off compared with other strategies.


Introduction
Industry 4.0, with the help of the Internet and other network services, greatly improves the quality of industrial products and services [1]. As the application of the Internet of Things (IoT) in Industry 4.0, the Industrial Internet of Things (IIoT) improves the efficiency of the industrial manufacturing process and accelerates industrial intelligence by connecting terminal devices, mobile devices, sensors, and other devices in industrial systems. With the development of IIoT technology, more and more IIoT terminal devices are deployed for sensing data, collecting information, or submitting requests in emerging scenarios such as smart grids, smart logistics, and intelligent factories [2,3]. With the sharp increase in the number of terminal devices and the data volume, the IIoT will generate a large number of applications with specific needs, such as low latency and low energy consumption. However, IIoT terminal devices are still constrained in their resources (e.g., computation capability, storage, bandwidth, and battery life) [4,5]. If the resource demand cannot be met, the Quality of Service (QoS) of workflow applications degrades; in particular, the battery bottleneck prevents long-duration, complex workflow applications from running [6].
The emergence of Mobile Edge Computing (MEC) [7], an integration of edge computing and mobile computing, provides a promising solution to the resource constraints of the IIoT by offloading computation-intensive, network-intensive, or power-intensive tasks (e.g., workflow applications such as face recognition, image processing, audio editing, and scientific computation) from IIoT terminal devices onto edge servers for execution. In MEC-based IIoT systems, task offloading can significantly improve the capabilities of the network and thus enhance the QoS of workflow applications. Moreover, offloading can reduce energy consumption and extend the battery life of IIoT terminal devices, improving their utilization. However, task offloading inevitably requires communication between the IIoT terminal device and the edge, which increases the cost in communication time, energy consumption, and network bandwidth [8,9]. Therefore, the task offloading decision needs to consider the QoS constraints of workflow applications and make a trade-off between the computation cost and the communication cost to meet the specific requirements of various IIoT applications.
In recent years, many researchers have focused on finding optimal task offloading solutions for workflow applications in MEC-based IIoT [10,11]. Various task offloading methods based on task division have been proposed to allocate mobile resources and edge resources to mobile applications [12,13]. The ultimate goal of these approaches is to improve the quality of service, e.g., to shorten the response time, increase the throughput, or extend the battery life by reducing the energy cost of the IIoT device. However, most existing task offloading approaches focus on a single independent application and do not apply to complicated workflow applications in the IIoT. In practice, a complicated IIoT workflow may contain a large number of dependent sub-applications with different features, so these existing approaches cannot handle this scenario efficiently.
Accordingly, in this paper, we investigate the task offloading strategy for workflow applications running in an MEC-based IIoT system. To maximize the benefits of MEC in saving energy and improving service quality (such as execution time), we consider a utility cost of the terminal device that synthesizes the execution time and the energy consumption. We then formulate the task offloading problem in MEC as an optimization problem with the goal of minimizing the utility cost (namely, finding the best trade-off between execution time and energy consumption) while satisfying the task execution time constraints. To address this problem, we propose a three-phase task offloading algorithm, named Green DVFS-GA, that obtains an approximately optimal solution. In phase one, we partition the large-scale task execution workflow graph into several partial critical paths; in phase two, we use a genetic algorithm (GA) to make the task offloading decision based on the utility cost of the mobile device; and in phase three, we further reduce the energy consumption by adjusting the operating frequency of the mobile device using the Dynamic Voltage and Frequency Scaling (DVFS) technique [14]. To verify the effectiveness of the proposed strategy, we first present a real-world case study on a simple composite workflow application. Afterwards, simulation experiments are conducted in which Green DVFS-GA is compared with three baseline task offloading algorithms. The results show that Green DVFS-GA performs better in minimizing the energy consumption and achieves the best trade-off between execution time and energy consumption.
In summary, the major contributions of our work are as follows:
1) The workflow application considered in our work consists of a series of ordered subtasks that accomplish a specific goal, rather than the standalone application task studied in most existing work. During execution, the subtasks can be executed on the IIoT terminal device or offloaded to the edge servers.
2) To balance the total execution time and the energy consumption, we define a new objective function, the "utility cost", to represent the trade-off between them. The energy-efficient task offloading problem in MEC is then transformed into an optimization problem with the goal of minimizing the utility cost while meeting the QoS constraints. To the best of our knowledge, few other works use a utility cost to combine execution time and energy consumption.
3) We design a three-stage approach, named Green DVFS-GA, to find an approximately optimal task offloading decision for the above optimization problem. Comprehensive experiments show that the proposed approach performs better than the baseline algorithms in most cases on the key measurements.
The remainder of this paper is organized as follows: Section 1 presents the related work. Section 2 defines all the models used in this paper. Section 3 provides our Green DVFS-GA algorithm. Section 4 demonstrates the evaluation results on a real-world case study and some simulation experiments. Finally, Section 5 concludes the paper and introduces future work.

Related Work
MEC is an emerging distributed computing paradigm that allows mobile workflows to offload specific tasks to edge servers for execution and thereby reduce latency or energy consumption. The Cloudlet, proposed in 2009 [15], is the first edge computing framework. It takes advantage of nearby high-capability computers and allows resource-limited mobile devices to connect with them using Wi-Fi Apps to reduce the network delay of the mobile application and the power consumption [16]. Since then, many scholars have carried out research on MEC based on user requirements from two aspects: task offloading strategy and real-time offloading system construction.
The optimal task offloading decision is a key technology that determines the efficiency of the strategy [17]. According to whether all the tasks need to be offloaded, there are two offloading strategies: one based on the full offloading mode [18] and the other based on the partial offloading mode [19]. Generally, there are two types of optimization goals for the task offloading problem in MEC, i.e., the overall execution time of the task and the energy consumption of the mobile device. For example, Ref. [20] proposed a game-theoretic task offloading method for multi-user MEC systems with the objective of minimizing task latency. Combining the correlation among terminal users, Ref. [21] proposed a two-layer task offloading strategy with the goal of minimizing energy consumption to provide low-energy services for users. In addition, many scholars have jointly considered the response time of tasks and the energy consumption of the system and proposed a combined system optimization goal [22,23]. Based on such optimized task offloading decision strategies, many real-time task offloading frameworks have been constructed [24][25][26]. For example, Ranji et al. [25] proposed a delay-aware and energy-efficient offloading scheme named EEDOS in MEC with the goal of addressing the delay and energy costs jointly.
However, these task offloading approaches still focus on a single independent application instead of composite workflow applications, which can only be accomplished by executing a series of sub-applications (also referred to as tasks) with different specifications such as hardware and software services. Therefore, in this paper, we propose a task offloading algorithm for complicated workflow applications containing a large number of dependent sub-applications.

System Model for Task Offloading
In this section, we describe the construction of the optimization model for task offloading in detail. First, we introduce the workflow application model and the MEC system model used in the paper. Then, we present the workflow application execution time model and the power consumption model used for task offloading. Finally, the optimization model with constraints is constructed.

Workflow Application Model
A mobile composite workflow application in MEC is a process that contains a series of sub-applications (used interchangeably with "sub-tasks" in this paper) performed in series or in parallel, automated and executed collaboratively.
We use a directed acyclic graph (DAG) with $n$ tasks (nodes), $G = (V, E, w)$, to represent a mobile workflow application with a general topology. The $n$ tasks are generated by terminal devices. The set of vertices $V = \{v_1, v_2, \ldots, v_i, \ldots, v_n\}$ denotes the set of ordered executable tasks, and the directed edge $e_{ij} \in E$ ($i, j = 1, 2, \ldots, n$ and $i \neq j$) represents the precedence constraint between task $v_i$ and task $v_j$, i.e., the finish time of $v_i$ must be earlier than the start time of $v_j$. A task graph has an entrance node without any predecessors (denoted as the entrance task $v_{entry}$) and an exit node without any successors (denoted as the exit task $v_{end}$). The set of node weights $w = \{w_1, w_2, \ldots, w_i, \ldots, w_n\}$ describes the computation workload of each task. If task $v_i$ executes on the terminal device but its successor $v_j$ executes on the edge server, some data will be transmitted to the edge, denoted as $Tdata_{ij}$. Similarly, when task $v_i$ executes on the edge but its successor $v_j$ executes on the terminal device, some data will be received by the terminal device, denoted as $Rdata_{ij}$. Figure 1 illustrates a graph representation of a workflow application with 12 tasks, which contains an entrance task $v_0$ and an exit task $v_{13}$.
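The graph model above can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the class name `Workflow`, its field names, and the convention of storing `(Tdata_ij, Rdata_ij)` on each edge are assumptions made for the example.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """DAG G = (V, E, w); task ids are integers, with dummy entry/exit tasks included."""
    weights: dict[int, float]  # w_i: computation workload of task v_i
    # edges[(i, j)] = (Tdata_ij, Rdata_ij) for the dependency v_i -> v_j
    edges: dict[tuple[int, int], tuple[float, float]] = field(default_factory=dict)

    def parents(self, j: int) -> list[int]:
        return sorted(i for (i, k) in self.edges if k == j)

    def children(self, i: int) -> list[int]:
        return sorted(k for (h, k) in self.edges if h == i)

    def topological_order(self) -> list[int]:
        """Kahn's algorithm; raises if the graph is not acyclic."""
        indegree = {v: len(self.parents(v)) for v in self.weights}
        ready = sorted(v for v, d in indegree.items() if d == 0)
        order = []
        while ready:
            v = ready.pop(0)
            order.append(v)
            for child in self.children(v):
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
        if len(order) != len(self.weights):
            raise ValueError("graph contains a cycle")
        return order


# Hypothetical four-node chain: dummy entry 0, tasks 1 and 2, dummy exit 3.
wf = Workflow(
    weights={0: 0.0, 1: 4.0, 2: 6.0, 3: 0.0},
    edges={(0, 1): (0.0, 0.0), (1, 2): (2.0, 1.0), (2, 3): (0.0, 0.0)},
)
```

Storing both data sizes on the edge keeps the later time and energy models free of lookups into separate tables.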

MEC System Model
Assuming that the core of the terminal device has $M$ different frequency levels ($l_1 < l_2 < \cdots < l_M = 1$) and the maximum frequency is $f_{max}$, the actual operating frequency of the terminal device can be denoted as $f_M = l_m f_{max}$.
It is well known that the power consumption of the mobile device, $P_M$, depends on its frequency and the square of its operating voltage, and that the operating voltage of the core depends on the frequency [16,17]. Therefore, the power consumption can be denoted as $P_M = \alpha f_M^{\gamma}$, where $\alpha$ and $\gamma$ are device-specific constants that can be obtained by testing. In addition, a 0-1 variable $x_i$ is introduced to represent the offloading decision of task $v_i$. Specifically, $x_i = 0$ denotes that $v_i$ is executed on the terminal device, and $x_i = 1$ denotes that $v_i$ is offloaded to the edge server for execution. The task scheduling decision can be represented as the set $X = \{x_1, x_2, \ldots, x_i, \ldots, x_n\}$. Since the start task and the end task of the workflow application must be executed locally on the device, we always have $x_0 = 0$ and $x_{n+1} = 0$.
The power of the terminal device when it is idle is denoted as $P_0$, and the operating frequency of the edge server is $f_e$, where both $P_0$ and $f_e$ are constants that can be measured. When a task is offloaded to the edge, the data transmission rate is denoted as $R$. The power of the terminal device for sending and receiving data is denoted as $P_s$ and $P_r$, respectively, where $P_s$ is considerably larger than $P_r$.

Execution Time Model
The execution time of a workflow is made up of the computation time and the communication time. Since the computation time of task $v_i$ is affected by its offloading decision, it is denoted as $T^{comp}_i(x_i)$. The communication time of each edge $e_{ij}$ is affected by the offloading decisions of both task $v_i$ and task $v_j$. Therefore, the communication time between task $v_i$ and task $v_j$ is denoted as $T^{comm}_{ij}(x_i, x_j)$, where $x_i$ and $x_j$ denote the offloading decisions of task $v_i$ and task $v_j$, respectively. The concrete definitions of the task execution time are as follows.

Computation time
a) Executed locally: When task $v_i$ is executed on the terminal device, i.e., $x_i = 0$, the execution time of task $v_i$ is denoted as $T^{comp}_i(x_i = 0)$.
b) Executed remotely: If we offload task $v_i$ onto the edge, i.e., $x_i = 1$, the computation time of task $v_i$ is denoted as $T^{comp}_i(x_i = 1)$.
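The closed-form expressions for the two cases are not legible in this copy. A plausible reconstruction, assuming that the workload $w_i$ is measured in CPU cycles and that frequencies are in cycles per second (an assumption, not confirmed by the text), divides the workload by the frequency of the executing processor:

```latex
T^{comp}_i(x_i) =
\begin{cases}
  \dfrac{w_i}{f_M} = \dfrac{w_i}{l_m \, f_{max}}, & x_i = 0 \quad \text{(terminal device)},\\[2ex]
  \dfrac{w_i}{f_e}, & x_i = 1 \quad \text{(edge server)}.
\end{cases}
```

Under this form, lowering the frequency level $l_m$ lengthens only the local case, which is the property the DVFS stage relies on.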

Communication time
When the offloading decisions of task $v_i$ and task $v_j$ are the same, the communication time between the tasks is 0. There are two situations when the offloading decisions of $v_i$ and $v_j$ differ. In the first, task $v_i$ is executed on the terminal device and its immediate successor $v_j$ is offloaded onto the edge server, so the communication time is $Tdata_{ij} / R$. In the second, task $v_i$ is offloaded onto the edge and its immediate successor is executed on the terminal device, so the communication time is $Rdata_{ij} / R$. To summarise, the communication time between task $v_i$ and task $v_j$ is given by Eq. (1).
$$T^{comm}_{ij}(x_i, x_j) = \begin{cases} 0, & x_i = x_j \\ Tdata_{ij} / R, & x_i = 0,\ x_j = 1 \\ Rdata_{ij} / R, & x_i = 1,\ x_j = 0 \end{cases} \tag{1}$$
Based on the above analysis, the execution time of the application can be represented by equation (2), where $T^{finish}_{n+1}$ denotes the finish time of the last task in the workflow application.
$$T(X) = T^{finish}_{n+1} \tag{2}$$
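The three cases of the communication time translate directly into a small function. The function and parameter names below are illustrative, not taken from the paper.

```python
def comm_time(x_i: int, x_j: int, tdata_ij: float, rdata_ij: float, rate: float) -> float:
    """Communication time on edge e_ij for offloading decisions x_i and x_j (0 = local, 1 = edge)."""
    if x_i == x_j:
        return 0.0               # both tasks run in the same place: no transfer
    if x_i == 0:                 # v_i local, v_j on the edge: device sends Tdata_ij
        return tdata_ij / rate
    return rdata_ij / rate       # v_i on the edge, v_j local: device receives Rdata_ij
```

For example, with a rate of 2 data units per second, sending 8 units takes 4 seconds and receiving 4 units takes 2 seconds.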

Power Consumption Model
The power consumption of the terminal device consists of computation power consumption and communication power consumption, both of which are influenced by the offloading decision. The computation power consumption of task $v_i$ with offloading decision $x_i$ is represented as $E^{comp}_i(x_i)$. The communication power consumption of edge $e_{ij}$ with offloading decisions $x_i$ and $x_j$ is denoted as $E^{comm}_{ij}(x_i, x_j)$. Thus, the power consumption of task $v_i$ can be formulated as follows.

Computation power consumption
a) Executed locally: When task $v_i$ is executed on the terminal device, i.e., $x_i = 0$, the power consumption of task $v_i$ only includes the computation power consumed on the terminal device, denoted as $E^{comp}_i(x_i = 0)$.
b) Executed remotely: If task $v_i$ is offloaded onto the edge, i.e., $x_i = 1$, the terminal device incurs only its basic power consumption, denoted as $E^{comp}_i(x_i = 1)$.

Communication power consumption
The communication power consumption is affected by the offloading decisions of the two tasks on edge $e_{ij}$. If the offloading decisions of task $v_i$ and its immediate successor $v_j$ are the same (both executed on the terminal device or both on the edge), the communication power consumption is 0. If task $v_i$ is executed on the edge but its immediate successor $v_j$ is executed on the terminal device, the communication power consumption is $E^{comm}_{ij}(x_i = 1, x_j = 0) = P_r \cdot Rdata_{ij} / R$. If task $v_i$ is executed on the terminal device but its immediate successor $v_j$ is executed on the edge, the communication power consumption is $E^{comm}_{ij}(x_i = 0, x_j = 1) = P_s \cdot Tdata_{ij} / R$. To summarise, the communication power consumption between task $v_i$ and task $v_j$ under the different offloading decisions is given by equation (3).
$$E^{comm}_{ij}(x_i, x_j) = \begin{cases} 0, & x_i = x_j \\ P_s \cdot Tdata_{ij} / R, & x_i = 0,\ x_j = 1 \\ P_r \cdot Rdata_{ij} / R, & x_i = 1,\ x_j = 0 \end{cases} \tag{3}$$
In conclusion, the energy consumption of the terminal device in MEC can be calculated by equation (4).
$$E(X) = \sum_{v_i \in V} E^{comp}_i(x_i) + \sum_{e_{ij} \in E} E^{comm}_{ij}(x_i, x_j) \tag{4}$$
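The energy model can be sketched in code as follows. This is an illustration under stated assumptions rather than the paper's exact formulas: local computation energy is taken as $P_M$ times the local computation time $w_i / f_M$, and the device is assumed to draw its idle power $P_0$ for $w_i / f_e$ seconds while the edge computes. All function and parameter names are hypothetical.

```python
def comp_energy(x_i, w_i, f_m, f_e, alpha, gamma, p_idle):
    """Device-side energy attributed to computing task v_i (assumed forms, see lead-in)."""
    if x_i == 0:
        return alpha * f_m ** gamma * (w_i / f_m)   # P_M = alpha * f_M^gamma, times local time
    return p_idle * (w_i / f_e)                     # idle power while the edge server computes


def comm_energy(x_i, x_j, tdata_ij, rdata_ij, rate, p_send, p_recv):
    """Device-side energy for the transfer on edge e_ij."""
    if x_i == x_j:
        return 0.0
    if x_i == 0:
        return p_send * tdata_ij / rate             # device sends Tdata_ij to the edge
    return p_recv * rdata_ij / rate                 # device receives Rdata_ij from the edge


def total_energy(x, weights, edges, f_m, f_e, alpha, gamma, p_idle, rate, p_send, p_recv):
    """Sum of computation energy over tasks and communication energy over edges."""
    energy = sum(
        comp_energy(x[i], w, f_m, f_e, alpha, gamma, p_idle) for i, w in weights.items()
    )
    energy += sum(
        comm_energy(x[i], x[j], t, r, rate, p_send, p_recv) for (i, j), (t, r) in edges.items()
    )
    return energy
```

Here `edges[(i, j)]` holds the pair `(Tdata_ij, Rdata_ij)` and `x` maps each task id to its 0-1 decision.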

Optimization Model with Constraints
The energy-efficient task offloading problem in MEC can be transformed into an optimization problem that minimizes the total execution time and the total energy consumption while meeting the execution deadline. However, since minimizing the total execution time and minimizing the energy consumption conflict to some extent, the objective of this paper is to find the best trade-off between them. Therefore, the trade-off is defined as the utility cost of the decision in equation (5), where $T(X)$ and $E(X)$ denote the total execution time of the workflow application in the MEC and the total energy consumption of the terminal device, respectively, under the offloading decision $X$. The denominators in equation (5) denote the execution time and the total energy consumption of the terminal device, respectively, when the workflow application is executed entirely on the terminal device.
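Equation (5) itself is not legible in this copy. One form consistent with the description above, assuming the two normalised terms are simply added with equal weight (the weighting is an assumption; the original may differ), is:

```latex
U(X) = \frac{T(X)}{T(X_0)} + \frac{E(X)}{E(X_0)},
\qquad X_0 = \{0, 0, \ldots, 0\} \ \text{(all tasks executed locally)}.
```

Under this assumed form, $U(X_0) = 2$, so any offloading decision with $U(X) < 2$ improves on purely local execution in the combined sense.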
Thus, the task offloading problem can be expressed as minimizing the utility cost subject to constraints, as given in equations (6), (7), (8), and (9), where $T_{max}$ denotes the execution deadline. The optimization goal is to find the task offloading decision $X$ that minimizes the utility cost in equation (6) while meeting the execution deadline in equation (7). Equation (8) constrains each task offloading decision to be a 0-1 variable. Equation (9) guarantees that the start task and the end task of the workflow application are executed locally on the terminal device.
$$\min_{X} \ U(X) \tag{6}$$
$$\text{s.t.} \quad T(X) < T_{max} \tag{7}$$
$$x_i \in \{0, 1\} \tag{8}$$
$$x_0 = 0, \quad x_{n+1} = 0 \tag{9}$$

Green DVFS-GA Algorithm for Task Offloading
The task offloading problem of minimizing the utility cost under these constraints is NP-hard. To obtain an approximately optimal solution, we design a three-stage approach, named Green DVFS-GA, for the ordered task execution of a mobile workflow application. In the first stage, we find the Partial Critical Path (PCP) of the task execution graph. In the second stage, we make the offloading decisions for all tasks on the partial critical path using a genetic algorithm with the goal of minimizing the utility cost. In the last stage, we use DVFS to further reduce the energy consumption of the system.

Task Offloading Algorithm Based on PCP
In a workflow, the longest execution path from the entry task to the exit task is called the critical path, which is widely used in task scheduling (in this paper, task offloading in the task graph can also be seen as task scheduling). Scheduling the tasks on the critical path is the key to designing an optimal task execution solution. In this paper, the overall deadline of the workflow is divided among sub-processes by using PCPs, and we recursively allocate each critical task on a PCP until all tasks in the workflow are allocated.

Basic definition
We define several basic notions for finding all the PCPs in the MEC. The earliest start time of task $v_i$, i.e., the earliest time at which it can start executing, is represented as $EST(v_i)$, and the latest finish time of task $v_i$, i.e., the latest time by which its execution must be completed, is represented as $LFT(v_i)$. The computation time and communication time of a task are affected by the offloading decision and the operating frequency, which makes it hard to calculate $EST(v_i)$ and $LFT(v_i)$ exactly. Therefore, approximate values of $EST(v_i)$ and $LFT(v_i)$ are used for each task.
We assume that all tasks are offloaded onto the edge so as to execute with the maximum operating frequency ($f_{max}$), and the earliest start time $EST(v_i)$ can be calculated by equation (10), where $P(v_i)$ denotes the set of parent tasks of task $v_i$.
Similarly, the latest finish time $LFT(v_i)$ is calculated by equation (11), where $C(v_i)$ denotes the set of child tasks of task $v_i$.
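Equations (10) and (11) are not legible in this copy. The sketch below uses the recurrences commonly paired with partial critical paths, which is an assumption about their form: $EST$ of the entry task is 0 and otherwise the maximum over parents of $EST(p) + t_p + c_{p,v}$, while $LFT$ of the exit task is the deadline and otherwise the minimum over children of $LFT(c) - t_c - c_{v,c}$. The inputs `t` (estimated computation times) and `c` (estimated communication times) are hypothetical names.

```python
def est_lft(order, edges, t, c, deadline):
    """Return (est, lft) dicts for tasks given in topological order.

    edges: iterable of (parent, child) pairs
    t[v]: estimated computation time of task v
    c[(p, v)]: estimated communication time on edge p -> v
    """
    parents = {v: [] for v in order}
    children = {v: [] for v in order}
    for p, v in edges:
        parents[v].append(p)
        children[p].append(v)

    est = {}
    for v in order:  # forward pass: parents are always processed before v
        est[v] = max((est[p] + t[p] + c[(p, v)] for p in parents[v]), default=0.0)

    lft = {}
    for v in reversed(order):  # backward pass: children are processed before v
        lft[v] = min((lft[ch] - t[ch] - c[(v, ch)] for ch in children[v]), default=deadline)
    return est, lft


# Hypothetical diamond graph: 0 -> {1, 2} -> 3, deadline 10.
est, lft = est_lft(
    order=[0, 1, 2, 3],
    edges=[(0, 1), (0, 2), (1, 3), (2, 3)],
    t={0: 0.0, 1: 2.0, 2: 5.0, 3: 0.0},
    c={(0, 1): 0.0, (0, 2): 0.0, (1, 3): 1.0, (2, 3): 1.0},
    deadline=10.0,
)
```

In this example the exit task cannot start before time 6 (through task 2), and the entry task must finish by time 4 to leave room for task 2 and its transfer.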

Task offloading algorithm based on PCP
The proposed task offloading algorithm based on PCP is given in Algorithm 1. Two dummy nodes $v_0$ and $v_{n+1}$ are created and added to the task execution graph $G$ (Lines 1 and 2). Then, the earliest start time and latest finish time of each task in $G$ are obtained by equations (10) and (11) (Lines 3 and 4). Since the entrance task and the exit task can only be executed on the terminal device and cannot be offloaded, we set them as scheduled nodes (Line 5). A task whose offloading decision has been made is defined as a scheduled node, and when all tasks in $G$ are scheduled, the task offloading procedure terminates. Finally, we call the OffloadingParents() procedure to schedule the rest of the tasks.

Finding the partial critical path procedure
All partial critical paths of a task $v$ can be found by applying Algorithm 2 recursively in the MEC system. The outer while loop (Lines 2 to 17) in Algorithm 2 schedules all the tasks. The inner while loop (Lines 4 to 8) finds the unscheduled critical parent of the current task in order to build the PCP of a particular task (Line 5). According to basic graph theory, the critical parent of task $v$ is the parent node that leads to the maximum earliest start time. After the inner while loop, the scheduled critical parent is added as the beginning of the PCP (Lines 9 and 10). Next, we call the procedure Offloading(PCP) (shown in Algorithm 3) to make the offloading decisions of the tasks and schedule the tasks on the PCP before their latest finish times (Line 11). The earliest start time and latest finish time of each task on the PCP are then recalculated, and we call the procedure OffloadingParents() recursively to schedule all tasks in the workflow graph $G$ after the scheduling procedure on the PCP (Lines 12-16).
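The inner loop that walks back through unscheduled critical parents can be sketched as below. This is a simplified illustration, not Algorithm 2 itself: it returns only the unscheduled tasks on the path, and it takes the critical parent to be the unscheduled parent whose result arrives last (an assumed reading of "leads to the maximum earliest start time"). The function name and arguments are hypothetical.

```python
def partial_critical_path(v, parents, est, t, c, scheduled):
    """Return the unscheduled tasks on the PCP ending at v, in execution order.

    parents[u]: list of parent tasks of u
    est[u], t[u]: earliest start time and estimated computation time of u
    c[(p, u)]: estimated communication time on edge p -> u
    scheduled: set of tasks whose offloading decision is already made
    """
    path = []
    current = v
    while True:
        candidates = [p for p in parents[current] if p not in scheduled]
        if not candidates:
            return path
        # Critical parent: the unscheduled parent whose output reaches `current` last.
        critical = max(candidates, key=lambda p: est[p] + t[p] + c[(p, current)])
        path.insert(0, critical)
        current = critical


# Hypothetical diamond graph: 0 -> {1, 2} -> 3 with dummy entry 0 and exit 3.
pcp_parents = {0: [], 1: [0], 2: [0], 3: [1, 2]}
pcp_est = {0: 0.0, 1: 0.0, 2: 0.0, 3: 6.0}
pcp_t = {0: 0.0, 1: 2.0, 2: 5.0, 3: 0.0}
pcp_c = {(0, 1): 0.0, (0, 2): 0.0, (1, 3): 1.0, (2, 3): 1.0}
first = partial_critical_path(3, pcp_parents, pcp_est, pcp_t, pcp_c, scheduled={0, 3})
second = partial_critical_path(3, pcp_parents, pcp_est, pcp_t, pcp_c, scheduled={0, 2, 3})
```

Once task 2 has been scheduled, the second call picks up the remaining branch through task 1, which mirrors the recursive scheduling described above.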
Algorithm 1 Task offloading algorithm ($G = (V, E, w)$, $T_{max}$)
Input: the task execution graph $G = (V, E, w)$ and the time deadline $T_{max}$
Output: the offloading decision $X$
1: Procedure TaskOffloading($G = (V, E, w)$, $T_{max}$)
2: Add the entrance task $v_0$ and the exit task $v_{n+1}$;
3: Compute $EST(v_i)$ of each task according to equation (10);
4: Compute $LFT(v_i)$ of each task according to equation (11);
5: Set the task $v_0$ and the task $v_{n+1}$ to be scheduled;
6: Call OffloadingParents() to schedule the remaining tasks;

Energy-Efficient Task Offloading on PCP Using GA
The genetic algorithm (GA) is a widely used search algorithm for finding exact or approximate solutions to optimization problems. In this paper, we use GA to find the best task offloading decision on the partial critical path. As illustrated in Algorithm 3, an initial population that contains candidate solutions (called individuals) is generated randomly (Line 2). The quality of each individual is evaluated by the fitness function (defined from the optimization objective, Line 3). The outer loop (Lines 4-27) is the generation iteration process. If the iteration terminates, we obtain the individual with the best fitness (i.e., the best solution found for the optimization problem). Otherwise, the algorithm performs the next iteration and produces the next generation (Lines 6-8). Elite individuals are selected by their fitness values as the parents that produce the offspring of the next generation (Line 10). Through the genetic operations of crossover (Lines 11-17) and mutation (Lines 19-25), each generation becomes fitter, and the best individual moves closer to the optimal one. The detailed explanation is as follows.

Individual representation
Because the entrance task and the exit task on the PCP have already been scheduled, we only have to determine the decisions of the tasks between them. We choose a PCP from the workflow graph shown in Fig. 1 as an example to demonstrate the individual representation in the population (as shown in Fig. 2). In the PCP, $v_0$ and $v_{13}$ are two dummy nodes scheduled locally on the mobile device. Each of the other tasks in between can be executed either locally or remotely, and its offloading decision is "0" or "1" accordingly. Thus, an individual in the population can be visualized as a decision sequence that contains both the tasks and their corresponding offloading decisions.

Initial population
We use a random heuristic to generate the initial population with $M$ individuals, where $M$ is the population size. Each individual contains several pairs of a task and its corresponding offloading decision. The length of each individual equals the length of the PCP excluding the two scheduled tasks.
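A minimal sketch of random initialisation, assuming each individual is stored as a list of 0-1 decisions in PCP order (the function name and the optional `rng` argument are illustrative):

```python
import random


def initial_population(pcp_length, size, rng=random):
    """Generate `size` random individuals, each a list of 0/1 offloading decisions."""
    return [[rng.randint(0, 1) for _ in range(pcp_length)] for _ in range(size)]


# Example: 8 individuals for a PCP with 5 unscheduled tasks, seeded for repeatability.
population = initial_population(pcp_length=5, size=8, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps experiments repeatable without touching the global random state.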

Construction of fitness function
A fitness function identifies the quality of an individual in the population. The objective of Green DVFS-GA is to find the task offloading decision that minimizes the energy consumption in MEC while meeting the execution deadline. Thus, we use a fitness function (shown in equation (12)) that combines the energy consumption and execution time of the tasks on the path, where $X_0 = \{0, 0, \ldots, 0\}$ represents that all the tasks on the PCP execute locally on the mobile device, $v_k \in PCP$ denotes a task on the PCP, and $T^{finish}_{v_{end}}(X_0)$ represents the latest finish time of the end task on the PCP. The larger the fitness value, the smaller the utility cost.

Gene selection
We consider the probability that new individuals contribute to the quality of the entire population and select them as parent individuals based on their fitness values. In this paper, the new individuals are selected using the roulette wheel selection strategy [19].
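Roulette wheel selection picks an individual with probability proportional to its fitness. The following is a generic sketch of that technique, assuming non-negative fitness values; the function name is illustrative.

```python
import random


def roulette_select(population, fitness, rng=random):
    """Pick one individual with probability proportional to its (non-negative) fitness."""
    total = sum(fitness)
    if total <= 0:
        raise ValueError("at least one fitness value must be positive")
    pick = rng.random() * total        # a point on the wheel, in [0, total)
    running = 0.0
    for individual, fit in zip(population, fitness):
        running += fit
        if pick < running:             # the point falls in this individual's slice
            return individual
    return population[-1]              # guard against floating-point round-off
```

An individual with zero fitness owns an empty slice of the wheel and is never chosen, while one holding half of the total fitness is chosen about half of the time.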

Gene crossover
The first genetic operator for generating a second-generation population is crossover at the individual level, which generates new offspring from two selected individuals in the current population. Single-point crossover, two-point crossover, and multiple-point crossover are the most common methods of gene crossover. The difference is the number of random points selected as the dividing points. In this paper, we use two-point crossover to generate new individuals, as illustrated in Fig. 3(a).
There are two steps in the crossover. In the first step, we randomly select two points in the selected parents, which divide each parent into three parts. In the second step, we swap the middle parts and combine the pieces into two new offspring.
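The two steps can be sketched as follows, assuming individuals are equal-length lists of 0-1 decisions with at least one gene (names are illustrative):

```python
import random


def two_point_crossover(parent_a, parent_b, rng=random):
    """Swap the middle segment between two cut points and return two offspring."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    # Step 1: choose two distinct cut positions, splitting each parent into three parts.
    i, j = sorted(rng.sample(range(len(parent_a) + 1), 2))
    # Step 2: exchange the middle parts.
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b


# Example with clearly distinguishable parents.
child_a, child_b = two_point_crossover([0] * 6, [1] * 6, rng=random.Random(3))
```

Because the two cut positions are distinct, the swapped middle segment always contains at least one gene.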

Gene mutation
The next operator for generating a second-generation population is gene mutation, which generates an offspring from a single parent. We use a simple gene mutation rule, as shown in Fig. 3(b): we select one task randomly from the parent and flip its execution mode, i.e., change "0" into "1" or "1" into "0".
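A minimal sketch of this single-gene flip, assuming an individual is a non-empty list of 0-1 decisions (the function name is illustrative):

```python
import random


def mutate(individual, rng=random):
    """Return a copy of `individual` with one randomly chosen decision flipped."""
    child = list(individual)
    k = rng.randrange(len(child))   # position of the task whose decision changes
    child[k] = 1 - child[k]         # 0 -> 1 or 1 -> 0
    return child


parent = [0, 1, 0, 1]
mutant = mutate(parent, rng=random.Random(0))
```

Returning a copy leaves the parent intact, so it can still take part in other crossover or mutation operations in the same generation.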

Re-Reducing Energy Consumption Using DVFS
In the previous stage of the Green DVFS-GA algorithm, we obtained the task offloading decision with the minimum utility cost, which combines the energy consumption and the execution time under the deadline. That stage assumes that the terminal device operates at the maximum frequency $f_{max}$. In fact, the terminal device in the MEC system has $M$ different frequency levels ($l_1 < l_2 < \cdots < l_M = 1$), and its actual operating frequency is $f_M = l_m f_{max}$. Therefore, we can use the DVFS technique to further reduce the energy consumption of the terminal device.
Algorithm 4 describes the procedure of re-reducing the system energy by DVFS. The inputs of the algorithm include the workflow graph $G = (V, E, w)$, the operating frequency levels ($f_{max}$, $l_m$), and the set of tasks scheduled on the terminal device ($v_M$), from which the entrance task and the exit task have been removed. The output of the algorithm is the new operating frequency set $F_M$ for the tasks executed locally on the terminal device. The main idea of the algorithm is as follows: for each locally executed task, if there is a time slack between the finish time of the task ($u$) and the start time of its successor ($u'$), we try lower operating frequencies for the task until we find an appropriate frequency that does not postpone the start time of its successor ($u'$) (Lines 3-8). The latest finish time of task $u$ equals the earliest start time of its successor $u'$. $T^{comp}(u, l_m)$ denotes the computation time of task $u$ at frequency level $l_m$, so $EST(u) + T^{comp}(u, l_m)$ is the actual finish time of task $u$. If there is a time slack between the actual finish time and the latest finish time of task $u$, we can use a lower operating frequency to execute the task (Line 7).
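The slack-reclaiming idea can be sketched as follows. This is an illustration under assumptions, not Algorithm 4 itself: the computation time at level $l_m$ is taken as $w_u / (l_m f_{max})$, and the levels are scanned from slowest to fastest, which yields the same lowest feasible level as stepping down from $f_{max}$ because the finish time decreases as the frequency rises. All names are hypothetical.

```python
def lower_frequencies(local_tasks, workload, est, lft, levels, f_max):
    """Choose, for each locally executed task, the lowest frequency level that meets its LFT.

    levels: ascending frequency levels l_1 < ... < l_M = 1
    workload[u]: computation workload of task u (assumed to be in CPU cycles)
    est[u], lft[u]: earliest start time and latest finish time of task u
    """
    chosen = {}
    for u in local_tasks:
        chosen[u] = levels[-1]                      # default: keep running at f_max
        for level in levels:                        # slowest candidate first
            finish = est[u] + workload[u] / (level * f_max)
            if finish <= lft[u]:                    # successor's start is not postponed
                chosen[u] = level
                break
    return chosen


# Hypothetical example: task 1 has slack, task 2 has none.
freqs = lower_frequencies(
    local_tasks=[1, 2],
    workload={1: 4.0, 2: 4.0},
    est={1: 0.0, 2: 0.0},
    lft={1: 4.0, 2: 2.0},
    levels=[0.25, 0.5, 1.0],
    f_max=2.0,
)
```

Task 1 can run at half speed and still finish by its latest finish time, while task 2 must stay at the maximum frequency.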

Experiment Evaluation
In this section, the effectiveness of the proposed task offloading strategy is evaluated on both a real-world application and simulated test cases. The real-world mobile workflow application is implemented on a mobile device that simulates industrial robot inspection, a two-stage process consisting of object recognition and object counting. This typical mobile workflow application involves a large number of transactions, each of which can be executed either on the mobile device or on the edge server. The appropriate task offloading approach for the real application workflow is demonstrated below. Furthermore, comprehensive experiments are designed to evaluate the performance of Green DVFS-GA by simulating mobile application workflows with different specifications in MATLAB and comparing its performance with three baseline algorithms.

A Real-World Case Study
Clearly, task offloading decisions depend not only on the characteristics of the IIoT devices and MEC environments, but also on a global understanding of the workflow applications. In this section, we introduce a real-world case study to explain the main factors affecting the task offloading decision.

Application description
The workflow application consists of a two-stage process including object recognition and object counting. For the robot in the inspection process, the target object must be verified first. Specifically, when the robot verifies the inspection object, a target photo must be uploaded first. The application receives the photo and then sends it to the image acquisition unit. Afterwards, an object detection process segments the photo and determines the potential location of the target object. After that, the image processing unit extracts the special features and tries to match them against all images in the application's object database. If the photo matches one or more existing objects, the program proceeds to the next step and starts object counting (selecting $N$ images). In this workflow application, the input variables include the target photo, the photos of all objects in the database, and the number of images for object counting ($N$). In our experiments, we vary these variables and observe the influence of different task offloading decisions on the energy consumption and the execution time.

Environment setup
We use an Honor mobile phone running Android 6.0 as the testing terminal device, and a Huawei laptop with an eight-core CPU as the testing edge server. PowerTutor is used to obtain the energy usage of the two machines. Table 1 shows the tested parameters of the environment.

Offloading decisions
We consider the workflow application as a task graph consisting of two nodes (v1, v2), where v1 is the object recognition task and v2 is the object counting task. We then add two dummy nodes (v0, v3), so that the workflow application can be denoted as a directed acyclic graph v0→v1→v2→v3. Because both object recognition and object counting are computation-intensive and memory-intensive [27], the task offloading decisions produced by Green DVFS-GA are all X = [0110] (namely, v0 is executed on the terminal device, v1 and v2 are executed on the edge, and v3 is executed on the terminal device) for different inputs: N and the photos of all objects in the database. In addition, we consider the other possible offloading decisions for comparison. Figure 4 illustrates some representative experimental results when the number of images is N=8 and N=10. It is clear that our proposed algorithm, Green DVFS-GA (X = [0110]), outperforms the other methods in both the execution time of the application and the energy consumption of the terminal device. As N increases, the execution time becomes longer and the terminal device consumes more energy when X = [0000]. In addition, when the number of pictures in the database increases, the execution time also becomes longer and the energy consumption becomes larger because of the longer matching process.

a) N=8: Fig. 4 (a) and Fig. 4 (b) show the execution time and the energy consumption results when N=8. In this case, the performance is almost the same when X = [0000] and X = [0010]. That is because the object recognition process is more time-consuming and more computation-intensive than object counting; the same reason applies to the case of X = [0100] and X = [0110]. When the tasks related to object recognition are offloaded to the edge, the execution time and the energy usage are significantly reduced, regardless of the offloading decision for the tasks related to object counting. As shown in Fig. 4 (a) and (b), the execution time of our proposed Green DVFS-GA (X = [0110]) is 21.96% lower on average than that of local execution, and the offloading strategy saves 87.5% of the energy compared with local execution.

b) N=10: Fig. 4 (c) and Fig. 4 (d) show the execution time and the energy consumption results when N=10. Unlike the case of N=8, the offloading decision for the tasks related to object counting is also an important factor in the overall execution time of the application and the overall energy usage of the terminal device. The comparison between X = [0000] and X = [0010] (or between X = [0100] and X = [0110]) in Fig. 4 (c) shows that offloading the tasks related to object counting to the edge decreases the execution time. That is because, as the number of images increases, more time is needed for object counting, and its computation time on the terminal device is much longer than the communication time for offloading the task to the edge. The same reasoning explains the results shown in Fig. 4 (d). In conclusion, the offloading decision of our proposed algorithm, X = [0110], outperforms the other decisions in both the execution time and the energy consumption, as in the case of N=8.
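The decision vectors compared in this case study can be enumerated mechanically. The following is a minimal sketch, assuming that the two dummy nodes are always pinned to the terminal device (0) and that each real task is either local (0) or offloaded to the edge (1); the function names are illustrative and do not come from the paper.

```python
from itertools import product


def candidate_decisions(num_real_tasks):
    """Yield every decision vector for a chain workflow.

    Assumption: the entry and exit dummy nodes always stay on the
    terminal device, so only the real tasks are free to vary.
    """
    for bits in product((0, 1), repeat=num_real_tasks):
        yield (0, *bits, 0)


def as_label(decision):
    """Format a decision vector in the X = [....] notation of the text."""
    return "[" + "".join(str(bit) for bit in decision) + "]"


# Two real tasks: v1 (object recognition) and v2 (object counting).
labels = [as_label(d) for d in candidate_decisions(2)]
print(labels)  # ['[0000]', '[0010]', '[0100]', '[0110]']
```

For the two-task workflow this yields exactly the four decisions discussed above, so an exhaustive comparison is cheap; for larger task graphs the number of candidates doubles with every task, which is why a heuristic search is needed.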

Simulation Experiments
Here, we present simulation experiments that assess the general performance of our proposed task offloading strategy on a large number of simulated mobile application workflows.
We compare the task offloading results of the proposed Green DVFS-GA with three different baselines.
Specifically, Baseline 1 is our proposed algorithm without the last phase for DVFS, which is used to test the effectiveness of the DVFS technique in reducing the energy consumption; Baseline 2 executes all of the tasks locally on the terminal device; Baseline 3 is the one-climb policy without execution time constraints, in which there can be at most one migration from the terminal device to the edge, with the objective of minimum energy consumption.
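The one-climb constraint of Baseline 3 can be illustrated on a decision vector. The sketch below assumes a linear chain of tasks listed in execution order, with 0 meaning the terminal device and 1 meaning the edge; it only checks the constraint and is not the baseline algorithm from Ref. [28]. The function names are hypothetical.

```python
def device_to_edge_migrations(decision):
    """Count the transitions from the device (0) to the edge (1)."""
    return sum(
        1
        for prev, cur in zip(decision, decision[1:])
        if prev == 0 and cur == 1
    )


def satisfies_one_climb(decision):
    """True if execution migrates from the device to the edge at most once."""
    return device_to_edge_migrations(decision) <= 1


print(satisfies_one_climb([0, 1, 1, 0]))     # True: one migration to the edge
print(satisfies_one_climb([0, 1, 0, 1, 0]))  # False: two migrations to the edge
```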

Simulation setup
We test our proposed task offloading strategy on randomly generated task graphs of different scales. We use MATLAB to generate the task graphs randomly and implement the Green DVFS-GA algorithm and the baseline algorithms in Java. We use the same real-world environmental parameters as shown in Table 1 for the simulation experiments. Furthermore, for the purpose of fair comparison, we compare Green DVFS-GA with the Baseline 3 algorithm using the same three task graphs and environmental parameters as used in Ref. [28]. The other settings are as follows: the data transmission power and receiving power of the terminal device are P_s = 0.1 W and P_r = 0.05 W, the idle power and computation power of the terminal device are P_0 = 0.001 W and P_M = 0.5 W, the maximum CPU operating frequency of the terminal device is F_M = 500 MHz, and the number of supported frequency levels is M = 10 (l_m = 0.1, 0.2, …, 1). The CPU frequency of the edge server is F_e = 3000 MHz. The data transmission rate is 50 kb/s.
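To make these settings concrete, the sketch below derives the ten frequency levels and a few basic timing quantities from them. It assumes that execution time is the CPU cycle count divided by the operating frequency and that upload energy is the sending power multiplied by the upload time; the workload figures (cycle count and data size) are hypothetical and are not taken from the experiments. The power drawn at reduced frequency levels is deliberately not modeled, because the relevant formula is not given in this section.

```python
F_M_HZ = 500e6     # maximum device CPU frequency (500 MHz)
F_E_HZ = 3000e6    # edge server CPU frequency (3000 MHz)
M = 10             # number of supported frequency levels
RATE_KBPS = 50.0   # data transmission rate (kb/s)
P_S_W = 0.1        # data sending power of the device (W)


def frequency_levels(m=M):
    """Return the scaling factors l_m = 1/m, 2/m, ..., 1."""
    return [k / m for k in range(1, m + 1)]


def local_time(cycles, level):
    """Execution time on the device running at level * F_M (assumed model)."""
    return cycles / (level * F_M_HZ)


def edge_time(cycles):
    """Execution time on the edge server (assumed model)."""
    return cycles / F_E_HZ


def upload_time(data_kb):
    """Time to send data_kb kilobits at the configured rate."""
    return data_kb / RATE_KBPS


def upload_energy(data_kb):
    """Device energy spent sending data_kb kilobits."""
    return P_S_W * upload_time(data_kb)


# Hypothetical task: 1e9 CPU cycles and 100 kb of input data.
print(frequency_levels())
print(local_time(1e9, 1.0), local_time(1e9, 0.5))
print(edge_time(1e9), upload_time(100.0), upload_energy(100.0))
```

Under this simple model, halving the frequency level doubles the local execution time, which is the kind of slack-for-energy exchange that the DVFS phase makes within the deadline.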

Evaluation results
Tables 2 and 3 present the simulation results for the two simulation environments mentioned above.

a) Results on different scales of task graphs: Table 2 shows the energy consumption and execution time of the different task offloading methods on ten task graphs in which the number of nodes ranges from 12 to 102 (including the two dummy nodes explained in our algorithm). As shown in the table, Green DVFS-GA outperforms Baseline 1, Baseline 2, and Baseline 3 in energy consumption. For the execution time, however, Green DVFS-GA is not the best. Instead, Baseline 1, which does not use DVFS, outperforms the other algorithms, and Baseline 3 also achieves the minimum execution time in the cases of N≤42 and N=62, 92. That is because Green DVFS-GA, whose goal is to minimize the utility cost, tends to obtain lower energy consumption at the expense of a longer execution time within the deadline.

b) Results on different types of task graphs: we use three types of task graphs, namely the mesh, tree, and general topologies introduced in Ref. [28], all of which contain 13 nodes including two dummy nodes. Table 3 shows the comparison results of the different algorithms under different execution deadlines. According to the results, under all three topologies, Green DVFS-GA again achieves lower energy consumption of the terminal device at the expense of execution time, while staying within the deadline. Specifically, for the tree topology, when the deadline is 1.3 s, Green DVFS-GA outperforms Baseline 3 in both the energy consumption and the execution time. In conclusion, Green DVFS-GA can achieve a significant reduction in the energy consumption of terminal devices. In addition, compared with the other baseline algorithms, it has the advantage of finding offloading solutions that satisfy the given execution deadline.

c) Results on the utility cost: Figure 5 illustrates the utility cost of Green DVFS-GA, Baseline 1, and Baseline 3 on ten different task graphs. Since Baseline 2 runs all tasks on the terminal device, its utility cost is always one and is hence not shown in Fig. 5. As can be seen, in most cases Green DVFS-GA outperforms Baselines 1 and 3, except for the case of N=52, in which Baseline 1 is better. Since the utility cost is defined to indicate the trade-off between the energy consumption and the workflow execution time, the results demonstrate that Green DVFS-GA achieves a better trade-off than the other algorithms in most cases.
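The observation that the all-local baseline always has a utility cost of one suggests that the cost is normalized against local execution. The sketch below assumes one common form of such a cost, a weighted sum of energy and time each divided by its all-local value; the exact definition used by Green DVFS-GA is given in the problem formulation and may differ. The weight parameter and all numeric inputs are hypothetical.

```python
def utility_cost(energy, time, energy_local, time_local, weight=0.5):
    """Weighted, normalized trade-off between energy and execution time.

    Assumed form: weight * E / E_local + (1 - weight) * T / T_local.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * energy / energy_local + (1.0 - weight) * time / time_local


# All-local execution is its own reference point, so its cost is 1.
print(utility_cost(2.0, 4.0, 2.0, 4.0))  # 1.0

# Hypothetical offloaded schedule: half the energy, 25% more time.
print(utility_cost(1.0, 5.0, 2.0, 4.0))  # 0.875
```

Under this assumed form, a schedule that is slower than local execution can still have a cost below one if its energy saving is large enough, which matches the pattern reported for Green DVFS-GA.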

Conclusion
MEC is becoming a promising platform for mobile workflow applications, which are pervasive in business process management and people's everyday life. However, service quality and energy consumption are two prominent challenges hindering the success of mobile workflow applications running in MEC. In this paper, in order to achieve an optimal trade-off between the service quality and the energy consumption of the terminal device, we formulated the task offloading problem as an optimization model that minimizes the utility cost (which denotes the trade-off between execution time and energy consumption) while meeting the execution deadline of the mobile workflow applications. A three-phase task offloading strategy, named Green DVFS-GA, has been proposed to find a near-optimal task offloading decision. The results of a real-world case study and simulation experiments verify that the proposed task offloading strategy can significantly reduce energy consumption within the execution deadline compared with the baseline algorithms.
In the future, besides GA, we will explore other optimization algorithms such as ACO (Ant Colony Optimization) and PSO (Particle Swarm Optimization).

Fig. 1 An example of figure

Table 1 Detailed parameters
Parameter    Value
The maximum CPU frequency of the device (f_M)/GHz
CPU frequency of the edge (f_e)/GHz
Idle power of the mobile device (P_0)/W
Computation power of the device (P_M)/W
Computation power of the edge (P_e)/W
Data sending power of the device (P_s)/W
Data receiving power of the device (P_r)/W
Data transmission rate (R)/kb·s^-1

Fig. 5 Utility cost comparison results on different task graphs