- Email: [email protected]

Available online at www.sciencedirect.com

ScienceDirect

Procedia Computer Science 00 (2019) 000–000 Procedia Computer Science 00(2019) (2019)19–27 000–000 Procedia Computer Science 150

www.elsevier.com/locate/procedia www.elsevier.com/locate/procedia

13th International Symposium “Intelligent Systems” (INTELS’18) 13th International Symposium “Intelligent Systems” (INTELS’18) Model Predictive Control for Autonomous Underwater Vehicles Model Predictive Control forFernando Autonomous Rui Gomes, Lobo Pereira∗ Underwater Vehicles ∗ SYSTEC, Faculty ofFernando Engineering,Lobo University of Porto Rui Gomes, Pereira Institute for Systems and Robotics, Porto, Portugal SYSTEC, Faculty of Engineering, University of Porto Institute for Systems and Robotics, Porto, Portugal

Abstract The Attainable Set Model Predictive Control scheme is discussed and shown to meet the needed system behavioral properties Abstract while satisfying real-time requirements underlying the control of Autonomous Underwater Vehicle formations, including the strict The Attainable Setconstraints. Model Predictive Control scheme is discussed and shown meet the needed system complexity behavioral properties on-board resource More specifically, the proposed approach targetstothe on-line computational and relies while satisfying real-time requirements underlying the control of Autonomous Underwater Vehicle formations, strict on taking advantage of the control problem time invariant elements, in order to replace, as much as possible, including on-line bythe off-line on-board resource More specifically, the proposed approach targets the on-line computational complexity and relies computation, whileconstraints. guaranteeing asymptotic stability, and promoting the best trade-off between feedback control near optimality, on advantage of the control problem time invariant elements, in order replace, to as the much as possible,variability. on-line byThe off-line andtaking robustness to perturbations (due to disturbances, and uncertainties), and to adaptivity environment data computation, while guaranteeing asymptotic stability, and promoting the best trade-off between feedback control near optimality, computed off-line is stored onboard in look-up tables, and recruited and adapted on-line with small computation effort according and perturbations to disturbances, and uncertainties), and adaptivity to theimportant environment data to therobustness real-time to context specified(due by communicated or sensed data. This scheme is particularly to anvariability. increasingThe range of computed off-line is stored onboard in look-up tables, and recruited and adapted on-line with small computation effort according applications exhibiting severe real-time constraints. to the real-time context specified by communicated or sensed data. This scheme is particularly important to an increasing range of applications severe real-time constraints. c 2019 The exhibiting Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/) © 2019 The Authors. Published by Elsevier B.V. c 2019 The Author(s). Published by Elsevier B.V. committee of the 13th International Symposium “Intelligent Systems” Peer-review under responsibility This is an open access article underof thethe CC scientific BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/) This is an open access article under thescientific CC BY-NC-ND license (INTELS’18). Peer-review under responsibility of the committee of the(https://creativecommons.org/licenses/by-nc-nd/4.0/) 13th International Symposium “Intelligent Systems” (INTELS’18). Peer-review under responsibility of the scientific committee of the 13th International Symposium “Intelligent Systems” Keywords: model predictive control; AUV formation control; stability; robustness; control architecture. (INTELS’18). Keywords: model predictive control; AUV formation control; stability; robustness; control architecture.

1. Introduction 1. Introduction Systems of autonomous – air, surface, and underwater – robotic vehicles are increasingly regarded as options of choice to gather data required to respond to an widening range of societal needs in science, environment, exploitation Systems of autonomous – air, surface, and underwater – robotic vehicles are increasingly regarded as options of of natural resources (geologic and biomass), surveillance, security, defense, and leisure, [1]. In this article, we dischoice to gather data required to respond to an widening range of societal needs in science, environment, exploitation cuss the Attainable Set Model Predictive Control, AS-MPC, which is essentially concerned with the computational of natural resources (geologic and biomass), surveillance, security, defense, and leisure, [1]. In this article, we discomplexity of control systems designed to optimize motion control of one or more Autonomous Underwater Vehicles cuss the Attainable Set Model Predictive Control, AS-MPC, which is essentially concerned with the computational (AUVs) in order to execute complex missions. The nuclear challenge consists essentially in de-conflicting requirecomplexity of control systems designed to optimize motion control of one or more Autonomous Underwater Vehicles ments inherent to the computational complexity generally associated with typical Model Predictive Control(MPC) (AUVs) in order to execute complex missions. The nuclear challenge consists essentially in de-conflicting requireschemes, and those associated with the satisfaction of very strict constraints on time, power, and computation that ments inherent to the computational complexity generally associated with typical Model Predictive Control(MPC) arises in the real-time running this class of marine systems. More specifically, a computationally efficient architecture schemes, and those associated with the satisfaction of very strict constraints on time, power, and computation that arises in the real-time running this class of marine systems. More specifically, a computationally efficient architecture ∗

Corresponding author. Tel.: +351 22 508 1857; fax: +351 22 508 1443. E-mail address: [email protected] ∗ Corresponding author. Tel.: +351 22 508 1857; fax: +351 22 508 1443. E-mail address: [email protected] c 2019 The 1877-0509 Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/) 1877-0509 © 2019 The Published bybyElsevier B.V. cunder 1877-0509 2019 The Authors. Author(s). Elsevier B.V. Peer-review responsibility of Published the scientific committee oflicense the 13th(https://creativecommons.org/licenses/by-nc-nd/4.0/) International Symposium “Intelligent Systems” (INTELS’18). This is an open access article under the CC BY-NC-ND This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/) Peer-review under responsibility of the scientific committee of the 13th International Symposium “Intelligent Systems” (INTELS’18). Peer-review under responsibility of the scientific committee of the 13th International Symposium “Intelligent Systems” (INTELS’18). 10.1016/j.procs.2019.02.006

20 2

Rui Gomes et al. / Procedia Computer Science 150 (2019) 19–27 F. Lobo Pereira and R. Gomes / Procedia Computer Science 00 (2019) 000–000

to control formations of AUVs is presented and discussed. These features are critical for AUV systems as they operate in contexts of scarce resources and high uncertainty or environmental variability. The proposed control structure, and the various underlying results, enabling the articulation of resources optimization with state feedback control while keeping the onboard computational burden very low, were developed in the context of [2], and were partially published in a number of papers, [3, 4]. The key idea to tackle this challenge consists in, by taking advantage of time-invariant data underlying the formulation of the involved optimization problems, computing a priori (i.e., off-line) a number of, as comprehensive as possible, simple building blocks required for optimal or sub-optimal control synthesis as a function of a set of parameters associated with a satisfactory number of more likely typified on-line situations. This, possibly large, amount of data is stored on onboard in appropriate look-up tables. Then, the actual on-line control synthesis is performed by computing the parameter values that take into account the sampled data to retrieve the best control values from the look-up table. The scheme is such that the computation involved in the on-line adaptivity requires several orders of magnitude less time and computational power relatively to the conventional MPC schemes. For a recent reference, see [5]. The hybrid nature of the controller enables different modes of operation, notably, in dealing with unanticipated obstacles. Optimization and feedback control are brought together in a novel MPC scheme constructed with precomputed Attainable Sets and Value Functions by taking advantage of the time-invariance subsystems and leaving to the realtime context, adaptivity operations requiring relatively low computational burden. The research underlying this article was driven by the need to design optimizing control systems for single and formation of AUVs with requirements extracted from concrete operational applications in marine sciences and environment carried out by LSTS, the Laboratory of Underwater Systems and Technologies (https://lsts.fe.up.pt/) of Faculty of Engineering of Porto University. LSTS has, over the years, been devoting a very significant R&D effort to the advancement of theoretical research in systems and control, as well as technologies to design and build such systems. The notorious achievements that gained worldwide visibility makes LSTS a reference institution with a consolidated leadership in many key niches of the pertinent very interdisciplinary contexts. This is attested by leadership of the EU wide Marine Research Infra-structure Network (https://www.eumarinerobots.eu/) by Porto University. In the next section we provide a statement of the AUV formation control problem, followed by brief outline of some representative developments on AUV control in section 3. The main definitions and the main ideas of the novel MPC scheme, denoted by AS-MPC Scheme are the subject of section 4 which also includes the statement of some key properties as well as a version with an inner feedback loop to ensure additional robustness. In section 5, an extension o hybrid systems revealing how the approach suits the context of high unexpected variability present in most applications. A simulation example involving collision avoidance illustrates the approach. Some general conclusions and prospective research and developments are outlined in the last section. 2. General Problem Statement The AUV formation control problem is formulated in order to define strategies for the relative behaviors of given set the vehicles, regarded as a single reconfigurable generalized vehicle. The operation in formation is required to fulfill the activities requirements to achieve the mission goal. Here, we focus in the mission of gathering payload data in a given water column volume along a pre-planned path. The AUVs motion control should be such that the integral error with respect to a given reference trajectory and the total control effort during a certain period of time is minimized subject to constraints as indicated below. A general formulation of the optimal control problem associated with this mission is: t0 +T (PT ) Minimize g(x(t0 + T )) + f0 (t, x(t), xr (t), u(t))dt t0

subject to x˙(t) = f (t, x(t), xr (t), u(t)) u ∈ U,

x(t0 = x0 ,

h(t, x(t)) ≤ 0,

L − a.e.

x(t0 + T ) ∈ CT

g(t, x(t), u(t)) ≤ 0

I is the endpoint cost functional, f0 : R I ×R I n × R Im →R I is the running cost integrand, being, for e.g., where g : R In →R 1 2 for some Q 0 and R , given by 2 x(t) − xr (t)Q + u(t)R , with f : R Im → R I n, h → R I ×R In → R I q , and I ×R In ×R

Rui Gomes et al. / Procedia Computer Science 150 (2019) 19–27 F. Lobo Pereira and R. Gomes / Procedia Computer Science 00 (2019) 000–000

21 3

g:R I ×R In ×R Im → R I k represent, respectively, the vehicle dynamics, the state constraints, and the mixed constraints, I m : u(t) ∈ Ω}, Ω ⊂ R I m is the set of measurable controls, CT ⊂ R I n is a target set imposed U := {u : [t0 , t0 + T ] → R by the application domain, and xr is the trajectory to be tracked. The mappings g, f , f0 , g and h, and the sets CT and Ω satisfy the following standing assumptions (SA): ∀(t, u) ∈ R I × Ω, are Lipschitz in x with constant Km where m I n+1 , represents the respective map, and continuous in (t, u) for all x ∈ R I n , ∃K ≥ 0 such that f (t, x, Ω) ≤ K, ∀(t, x) ∈ R and CT and Ω are compact. To instance (PT ) in the context of AUV formation of N vehicles tracking a reference trajectory xr , consider, for the AUV i: x = col(ηi , νi ), and u = τi , i = 1, . . . , N; f0 (·, ·, ·) = (ηi (t) − ηir (t))T Q(ηi (t) − ηir (t)) − τiT (t)Rτi (t); g(·) = 0; being ηir (·) the ith vehicle reference trajectory, and τi its control (denoted by u in (PT )). By considering, for each vehicle (we drop the index i), the state and the controls given by η = [x, y, ψ]T and ν = [u, v, r]T and τ = [τu , τr ]T , respectively, the dynamics are given in [6]. For details, [4]. Other types of constraints that might usually be present in AUV formations are: endpoint state constraints, ηi (t + T ) ∈ Ct+T , control constraints, τi (s) ∈ U i , state constraints, (ηi (s), νi (s)) ∈ Si , communication constraints gci, j (ηi (s), η j (s)) ∈ Ci,c j , ∀ j ∈ Gc (i), and formation constraints gi,f j (ηi (s), η j (s)) ∈ Ci,f j , ∀ j ∈ G f (i). Although (PT ) seems to be centralized, it also serves for decentralized implementations in which each vehicle has its own controller involving its own motion and other activities control, and the formation pattern cohesion. Typically, each vehicle communicates acoustically with its neighbors, being the relative positions such that full connectivity is ensured. With the appropriate, possibly distributed, coordinated controllers, the decentralized schemes provide better power, communications and computation budgets. It is important to remark that (PT ), and the motion pattern have to be reconfigurable for various modes of operation. These may include, among others, data gathering, obstacle collision avoidance, communication, and loitering. 3. Brief State-of-the-Art The main difficulties to model accurately, and, hence, to control, AUVs are due to hydrodynamic effects, particularly if they are moving in significant flow fields. Typically, a number of motion modes are considered and, for each one of them, a concentrated parameter model (ordinary differential equation) is obtained by using identification methods, [7, 6]. Extended versions of these control systems for very diverse robot craft have been considered for single and multiple vehicles. Linear, and non-linear control theories provide tools that led to very popular design techniques, [6, 8, 9, 10]. Early on, it became clear that, along with feedback control, resources optimization plays a key role in contexts of scarce resources. Thus, MPC became a design approach of choice for many applications. Thus, the huge available literature on MPC schemes and their properties is no surprising, [5]. Let a given optimization and control horizons T and ∆, with T substantially greater than ∆, be given, then a typical conventional MPC scheme is as follows: 1. Initialization. Let (t0 , x0 ) be the current time and state value estimate, i.e., x(t0 ) = x0 , and set up the various initial parameters or conditions like filter parameters, initial control for the recursive control optimization procedure, cost functional weights, model parameter estimates, etc. 2. Sample the state variable at time t0 and generate the associate state estimate, x¯0 . 3. Compute the optimal control strategy, u∗ , in the prediction horizon, i.e., [t0 , t0 + T ], by solving the optimal control problem (PT ). 4. Apply the obtained optimal control during the current control horizon, [t0 , t0 + ∆]. 5. Slide time by ∆, i.e., t0 = t0 + ∆, and adapt parameters and models as needed. 6. Goto step 2. Here, (PT ) can be any application specific optimal control problem such as the one in the previous section. Moreover, as it can be perceived from the literature sample below, there is a wide range of schemes that can be considered. All these conventional MPC schemes generate feedback control syntheses that attempt to conciliate optimization with feedback control, [11, 12, 13, 14]. For details, see [2]. Moreover, MPC schemes inherit a huge versatility from the optimal control framework that allows the consideration of very complex dynamics subject to very diverse types of constraints, encompassing those arising in vehicle formations. A wide variety of MPC controllers for formations of robotic vehicles were designed to deal with key challenges such as communications failures and delays in continuum and discrete times, centralized and decentralized contexts, linear and nonlinear dynamics, leader-follower and leader-

Rui Gomes et al. / Procedia Computer Science 150 (2019) 19–27 F. Lobo Pereira and R. Gomes / Procedia Computer Science 00 (2019) 000–000

22 4

less schemes, collision-free motion, cooperative and competitive strategies, single and multiple objectives. Moreover, a wide range of applications (surveillance, exploration, tracking paths and trajectories), have been considered in a vast literature of which we single out [15, 16, 17, 18, 19, 20, 21, 22, 23]. The power and computational requirements for running most of these MPC schemes in a are weakly suitable for AUVs: (i) intense on-line computational burden, and (ii) design issues to ensure the required performances do note consider onboard resource constraints. This work considers some of these issues, and, in line of the one proposed in [3] and is an extension of [4],concerns a substantially different approach to AUV formation control. 4. The Attainable Set MPC 4.1. The Basic AS-MPC Let ∆ be the control horizon, and t0 the current time. The AS-MPC scheme is follows: 1. Initialization: t0 , x(t0 ), and other specific parameters 2. Solve (PTA L ) to obtain z∗ and, from the look-up table, extract the associated piecewise constant u∗ on [t0 , t0 + ∆] to steer the system from x0 at t0 to z∗ at t0 + ∆ 3. Apply u∗ during [t0 , t0 + ∆] as time elapses from t0 to t0 + T . 4. Sample x at t0 + ∆ to obtain x¯ = x(t0 + ∆), and get data from neighboring AUVs, some of which might obtain the new x0 as an improvement of the state estimate x¯. 5. Slide time by ∆, i.e., t0 = t0 + ∆ 6. Update the Attainable Set (AS) with the new x(t0 ) by appropriate translation and rotation, and update the Value Function (VF) at the new t0 + ∆, by taking into account the data from Step 4. 7. Goto Step 2. Here, (PTA L ) is stated as follows: Minimize V(t0 + ∆, z) over z ∈ A f (t0 + ∆; t0 , x0 ). We recall the definitions of two key objects invoked, AS and VF, in (PTA L ): (AS) In [24], the Attainable Set at t from (t0 , x0 ) with x(t0 ) = x0 is given by A f (t; t0 , x0 ) = {z : z = x(t), x˙ = f (t, x, u), u ∈ U|[t0 ,t] , x(t0 ) = x0 }, t ≥ t0 (VF) Let T L ≥ 0 be a large number, with, possibly, T L = ∞, and J(ξ, u) = g(ξ) +

t

TL

f0 (τ, x(τ), u(τ))dτ. Then,

V(t, z) := min {J(ξ, u) : ξ = x(T T L ), x˙ = f (τ, x, u), L-a.e., x(t) = z, u∈U} u∈U,ξ∈CT L

First, we remark that, for the sake of simplicity, we omitted the state and the mixed constraints. Observe that the computation of AS and of VF are extremely demanding. However, for a certain number of parameters, some of which may correspond to different operation scenarios, both AS and VF can be computed off-line due to the time invariance of the data of interest. Moreover, we note that for T = T L , by a standard change of state variable that eliminates the running cost, and by using the principle of optimality (we are considering only positional functionals, [25]), we conclude that (without relabelling) (PT ) and (PTA L ) are equivalent. See [4] for details. The need to consider approximations are due to the fact that, in general, the computational complexity of the AS and VF are huge even for relatively small systems. In [2], it is discussed why the “ε-dense cloud of points” defined as endpoints of trajectories generated by piecewise constant controls are preferable to approximate AS relatively to the

Rui Gomes et al. / Procedia Computer Science 150 (2019) 19–27 F. Lobo Pereira and R. Gomes / Procedia Computer Science 00 (2019) 000–000

23 5

other two types of approximations: Polyhedral, [26, 27], and ellipsoidal, [28]. For positional systems, [25], the VF may be computed by solving the Hamilton-Jacobi-Bellman equation (HJBE) which are known to be very computationally intensive, in spite of, the enormous research effort that lead to a number of efficient software packages, [29, 30]. It is important to remark that the main computational effort is due to the need on adapting the AS and VF for the current AUV configuration and the environment context. Given that, typically, this involves only translations and rotations of the AS, incorporating penalization terms in the VF, or, in the worst case recompute an approximation of the VF either by interpolation or solve a low dimension dynamic programming problem in a very short time and space regions, the on-line computational complexity is very small. 4.2. Properties of the AS-MPC We start by considering the asymptotic stability for conventional MPC schemes, and, then, show how these results naturally migrate to the AS-MPC, in the context of absence of perturbations. There is a large number of approaches to design the closed loop stability of MPC schemes which may require the specification of appropriate time endpoint state constraints, terminal cost, control constraints, certain relation between terminal and running costs, of simply, additional assumptions on the data of the problem for sufficiently long optimization horizons, [5, 31, 32]. We focus on the later class of results since they are less intrusive in the specification of the optimal control problem whose statement mirrors the actual specification of the application objectives and constraints. For this purpose, we follow closely the results discussed in [31]. Let us consider (PT ) with t0 = 0 of section 2 and add the following assumptions to the ones already considered I n+1 , (ii) col( f, f0 ) is C (iii) f has a linearly controllable critical there:(i) col( f, f0 )(0, 0) = (0, 0) ∈ R 2 in both arguments, point at (0, 0), (iv) ∃ K0 > 0 such that f0 (x, v) ≤ K0 x2 + u2 , and ∀ T > 0 (including T = ∞), and (v) ∃ a feasible control process with x(0) = x, ( x¯Tx (·), u¯ Tx (·)) that solves (PT ). From optimal control theory, [33], there several classes of assumptions that which are sufficient for optimality, and we will not dwell on this point here. Denote by I n : J T (x) < ∞}, and let J T (x) the cost associated with the optimal control process ( x¯Tx (·), u¯ Tx (·)), ΓT the set {x ∈ R T T T 2 Γr {x ∈ Γ : J (x) ≤ r }. Now, let us state the main result (Theorem 9) from [31]: Proposition 1. Let r > 0 be given and suppose that the terminal g is nonnegative, C2 , and locally quadratically bounded. Then, for each ∆ > 0, ∃T¯ < ∞ such that ∀ T ≥ T¯ , the (T, ∆)-MPC, i.e., the MPC scheme with the optimization and control horizons, respectively, T and ∆, is exponentially stabilizable. Moreover, ΓTs −∆ ⊂ Γr∞ for some s > r is contained in the region of attraction of the (T, ∆)-MPC scheme. This result shows that there exists a finite horizon length in which the stability of the receding horizon scheme is, under a controllability assumption, guaranteed in an appropriate sub-level set of a finite horizon cost for a general quadratically bounded terminal cost. A result ensuring mere asymptotic stability, could be derived under much weaker controllability assumptions, (see, e.g., [32]) but this is not the key point. However, this is marginal to the main objective of this part of this subsection. The key point here, is to recall that the AS-MPC is equivalent to the conventional MPC with T = ∞, and, thus, it turns out that the closed loop stability is even easier to hold since the larger the value of T , the closer to a Lyapunov function JT (x) is in the attraction domain, [31]. Now, we address issues related to robustness. The fact that the AUV is in open loop control during [t0 , t0 + ∆], implies that persistent perturbations, even of low intensity may prevent the point z∗ to e reached at t0 + ∆ by the associated u∗ with serious consequences to AS-MPC performance. In the presence of uncertainties or disturbances, the optimal control problem solved online depends on the sampled state, and, thus, the decision variable is now a sequence of control laws, designated in the literature by feedback MPC. Since, now, the optimization is carried out over the possible realizations of the disturbances or uncertainties, the feedback MPC is naturally superior relatively to that of ordinary MC in the same context. However, there is a price o pay: the optimal control problem is much more complex than the deterministic context one since its solution is a control state feedback law and, thus, infinite dimensional, [5]. This calls for trading optimality for lower complexity of the control problem for each disturbance or uncertainty realization. Among the several schemes to address this huge challenge, the so-called tube-MPC has been gaining popularity. The basic idea is, instead of considering a bundle of trajectories, each one corresponding to a specific disturbance or uncertainty realization, to synthesize a nominal controller by using an ordinary MPC scheme that involves solving the optimal control problem without perturbations, and, at the same time, applying an auxiliary

Rui Gomes et al. / Procedia Computer Science 150 (2019) 19–27 F. Lobo Pereira and R. Gomes / Procedia Computer Science 00 (2019) 000–000

24 6

controller – possibly defined by simple linear feedback – that steers the current realization of the trajectory, obtained by sampling, to the nominal trajectory. In the context of the AS-MPC addressed in this article, we propose a scheme that, in many ways correspond to the tube-MPC but restricted to a certain class of perturbations. In this context, to mitigate the impact of disturbances, the role of the “auxiliary controller” is played by a higher frequency feedback control, i.e., involving sampling at intermediate points of [t0 , t0 + ∆], being the intermediate controls chosen, without any optimization, to steer the state to a neighborhood of the nominal trajectory. The idea becomes clearer with the presentation of the Robust AS-MPC (RAS-MPC), first developed in [2]. For this, consider the set Cεt0 = {z ∈ R I n : V(t0 + ∆, z) ≤ V(t0 + ∆, z∗ ) + εt0 } ∩ ∗ t0 t0 [z + ε B1 (0)] where ε > 0 depends on the expected level of perturbations and B1 (0) is the unit ball around the zero. Compute Nεt0 as a function of the overall level of perturbations in relation with the propulsion of the AUV and let ∆¯ = ∆t0 . Then, the RAS-MPC scheme is obtained from the AS-MPC, by replacing Steps 2. and 3. by Nε

2. Solve (PTA L ) to obtain z∗ , extract the associated piecewise constant u∗ on [t0 , t0 + ∆] to steer the system from x0 ¯ at t0 to z∗ at t0 + ∆ from the look-up table, and compute z¯1 as the state of the system at t0 + ∆. ¯ during the interval [t0 , t0 + ∆). ¯ 3. Let z¯0 = x0 . Apply u∗ obtained in Step 2. restricted to [t0 , t0 + ∆) For i = 1 to Nεt0 − 1, ¯ t0 + i∆), ¯ compute z¯i+1 ∈ A f (t0 + (i + 1)∆; ¯ t + i∆, ¯ z¯i ) ∩ Ab (t0 + (i + 1)∆; ¯ t0 + ∆, Cεt0 ). 3.1 In the interval [t0 + (i − 1)∆, ¯ ¯ ¯ 3.2 Once the subinterval [t0 + (i − 1)∆, t0 + i∆) is elapsed, sample the state at t0 + i∆ to obtain z˜i , compute u¯ i ¯ t0 + (i + 1)∆]. ¯ to steer the state from z˜i to z¯i+1 on [t0 + i∆, 3.3 Let i = i + 1, update the pertinent parameters, including εt0 , if necessary or convenient to either ensure robustness or improve performance. 3.4 Goto intermediate Step [3.1]. In the above, the Backward Attainable Set is given by Ab (τ; t, Cεt0 ) := {z ∈ R I n : z f ∈ A f (t; τ, z), ∀z f ∈ Cε t0 }, with τ < t (see [27, 34, 24]). Since very short time intervals are considered, it is not difficult to devise computationally light schemes, for example, based in linear approximations, in alternative to the off-line computation and on-board storage of backward attainable sets. Although, we will not dwell on this here, it is possible to define abstract procedures to compute the required thresholds based on sensed data that, for some general specification of the class of perturbations, guarantee a predefined level of sub-optimality, see [2] for details. Under the above conditions, it is guarantee that z¯Nεt0 ∈ Cεt0 , where the value of εt0 depends on the significance of the perturbations consistency.

Next, we address the issue, of approximation to optimality in the absence of perturbations, by stating a result ∗ , u∗T,∆ ) the control process resulting from the application of the from [2], where its proof appears. Denote by (xT,∆ AS-MPC scheme with (PT ) as optimal control problem. Notice, that this is quite different from the AS-MPC above where T = ∞. Let J(x, u) be its cost functional, with the state endpoint term replaced by the running time equivalent, evaluated at (x, u) over [0, ∞), and by J(x, u)|[α,β] and Jk (x, u), its restriction to [α, β], and to [k∆, (k + 1)∆]. Proposition 2. Consider (x∗ , u∗ ) to be an optimal control process for the infinite horizon optimal control problem such ∗ ∗ that lim x∗ (t) = ξ∗ , where ξ∗ is an equilibrium in C∞ . Let (xT,∆ , u∗T,∆ ) be s.t. lim xT,∆ (t) = ξ∗ . Then, t→∞

t→∞

lim

∆↓0,T ↑∞

∞ k=1

∗ ∗ Jk (xT,∆ , u∗T,∆ ) = J(x∗ , u∗ ), and lim Jk (xT,∆ , u∗T,∆ ) − J(x∗ , u∗ )|[k∆,(k+1)∆] = 0. k→∞

5. Extension of AS-MPC to Hybrid Systems Now, we show how the AS-MPC can be extended to accommodate significant changes due to the occurrence of uncontrolled discrete events. The notion of optimization is relative to a the known realization of discrete events and considered for, the current state and the known chain of future events. Thus, the dynamics of the AS-MPC scheme is given by a hybrid automaton encompassing the desired modes of operations and the discrete events triggering the transitions between them. Thus, the design of the AS-MPC described previously, now, requires an event-driven control strategy ensuring liveness and nonblocking properties, [35]. The general approach consists in designing an hybrid supervisory controller that, once composed with the original system ensures the specified set of properties.

Rui Gomes et al. / Procedia Computer Science 150 (2019) 19–27 F. Lobo Pereira and R. Gomes / Procedia Computer Science 00 (2019) 000–000

25 7

This “closed loop” hybrid automaton will constitute the dynamics of the AS-MPC scheme. Let us focus in the motion control of a three AUVs formation in a plane carrying the required navigation and payload sensors whose mission consists in gathering data along a given path. The considered tasks are: (i) Gather data along a given path while the AUVs keep the triangle formation and the decentralized controller simultaneously ensures path tracking and formation pattern; (ii) Avoid collision with obstacles by detecting obstacles, characterizing them, collision-avoidance path replanning, and, possibly, formation reconfiguration; (iii) Communication required to transmit gathered data, enable mission follow-up, and receive new commands if necessary. The higher level automata and the set of discrete modes and events triggering transitions are depicted in Fig. 1. The Start Survey event triggers the operation nominal Mission accomplished

Start Survey

Begin

Mission Start

AUV formation plan execution

Begin

Mission accomplished

End

Tx data

EOT Reset

Abort mission

Perturbations to mission plan

Success

Failure

Replan tasks and/or reorganize formation

End

A Wide Passage

Obstacle Overcome

C

D

Thin Passage

Obstacle in range

B

Fig. 1. Main System Automaton (Left) and Main operation modes and events (Right).

mode A – data gathering while tracking the given path in a triangle formation – and the Mission Accomplished event initiates the recovery mode. The mission follow-up requires the monitoring mode D - from time to time one AUV surfaces for data transmission and receiving commands, returning to A once completed. If an Obstacle in range event occurs, then mode B - obstacle characterization and a collision avoidance path computation - is activated. Then, either a Wide passage is available and the formation is kept unchanged and the system returns to mode A tracking the original path, or a Thin passage is available and the system transits to mode C where the formation is reconfigured to overcome de obstacle, and, once the obstacle is overcame, the system returns to mode A. The automata associated with obstacle avoidance and the respective simulation for an AUV triangle formation are depicted in 2. In the simulation t6

B t6’

No obstacle in range

O3

Obst. In range

Reset (Manual)

Characterize obstacle Obst. Overcome

Failed

Complete Failed

Re-plan to overcome obstacle

Obstacle is not distinct

Overcoming obstacle New obstacle detected

Characterize “new” obstacle

No

O2

Abort mission

t4’ O1

Update ahead obstacle

Complete

t5’

t5

t4

Yes

Safe Passage test Obstacle is distinct

Failed

t2 A

t3

t1

Fig. 2. Obstacle collision avoidance automaton (Left) and Simulation results for obstacle avoidance (Right).

run, we considered: obstacle detection and characterization data obtained by a range finder, relatively sparse unmapped obstacles, obstacles locally modelled by circles, and range finder sensor distance much larger than that travelled by the AUV in ∆ time units, as well as event generators to support discrete motion decisions or formation, and cost function changes based on the range finder data. Once an obstacle is detected, it is characterized to compute the remaining path to the target as close as possible to the one to be tracked, while avoiding collision. The formation pattern can be

26 8

Rui Gomes et al. / Procedia Computer Science 150 (2019) 19–27 F. Lobo Pereira and R. Gomes / Procedia Computer Science 00 (2019) 000–000

adjusted to facilitate the obstacle characterization and overcoming. Whenever an AUV is close to an obstacle, then its AS-MPC is modified by adding penalization function to the value function to ensure a given safety distance to the obstacle. The diagram of the simulation results is practically self-explanatory. This mission starts at t1 at point A in the data gathering mode and the goal is to reach B along the segment B − A while avoiding collision with obstacles and maintaining as much as possible the triangle formation. At t2 and t3 obstacles O1 and O2 are detected and at t4 a safe passage between them is detected. At t4 , O3 is detected and a simple specific modification of the AS-MPC optimization at mode B determines which of the two alternatives - Follow through a thin passage between O2 and O3 with longitudinal formation along a path closer to the original one and with loss of the gathered data quality, or Circumvent O3 by the left along a longer route far away from the original path, but preserving the triangle formation to pursue the mission. Once the obstacles are overcome at t6 , the triangle formation is adopted until point B is reached and the mission mode changes to D to proceed with data transmission. 6. Conclusions This article concerns the control of AUVs - either stand alone or in formation - with promising high societal impact applications whose “autonomy” requirements raise huge challenges. The control structure proposed to conciliate the huge computational complexity inherent to the optimal feedback control and decision-making, with the limited onboard computational and power resources, designated by AS-MPC, is discussed and its properties, notably, optimality, stability, and robustness are discussed. In order to handle also the effect of discrete events - that, typically, are extremely relevant, the proposed framework is extended to control systems whose dynamics are given by hybrid automata. The obtained simulation results with the AS-MPC scheme and its robust versions formulated in the hybrid systems context to enable obstacle collision avoidance strategies are extremely encouraging of future developments towards experimentation with AUV systems. Acknowledgements The authors acknowledge the partial support of FCT R&D Unit SYSTEC - POCI-01-0145-FEDER-006933 funded by ERDF — COMPETE2020 — FCT/MEC — PT2020 (extension to 2018), Project STRIDE - NORTE-01-0145FEDER-000033 funded by ERDF — NORTE 2020, and project MAGIC - POCI-01-0145-FEDER-032485 funded by ERDF — COMPETE 2020 - POCI — FCT/MEC - PIDDAC References [1] Sousa, J., Maciel, B., Pereira, F.L.. Sensor systems and networked vehicles. Networks and Heterogeneous Media 2009;4(2):223–247. [2] Gomes, R.. An AUV Systems Model Predictive Control Approach. Ph.D. thesis; Faculty of Engineering, Porto University; 2018. [3] Gomes, R., Pereira, F.. A Reach Set MPC Scheme for the Cooperative Control of Autonomous Underwater Vehicles. In: Procs PhysCon, Florence, Italy. 2017, p. 135–140. [4] Gomes, R., Pereira, F.. A hybrid systems model predictive control framework for auv motion control. In: Procs ECC 2018, Limassol, Cyprus. 2018, p. 152–157. [5] Rawlings, J., Mayne, D.. Model Predictive Control: Theory and Design. Nob Hill Publishing, LLC; 2015. [6] Fossen, T.. Guidance and Control of Ocean Vehicles. Wiley; 1994. [7] Prestero, T.. Verification of a six-degree of freedom simulation model for the REMUS AUV. Master’s thesis; MIT / WHOI; 2001. [8] Kristiansen, R., Nicklasson, P.. Spacecraft formation flying: A review and new results on state feedback control. Acta Astronautica 2009; 65:1537–1552. [9] Ren, W., Beard, R.. Virtual structure based spacecraft formation control with formation feedback. In: AIAA Guidance, Navigation and Control Conf., Monterey CA. 2002, p. 2002–4963. [10] Lv, Y., Hu, Q., Ma, G., Zhou, J.. 6 dof synchronized control for spacecraft formation flying with input constraint and parameter uncertainties. ISA Transactions 2011;50(4):573 – 580. [11] Mayne, D., Rawlings, J., Rao, C., Scokaert, P.. Constrained model predictive control: Stability and optimality. Automatica 2000;36(6):789– 814. [12] Breger, L., How, J., Richards, A.. Model predictive control of spacecraft formations with sensing noise. In: American Control Conf.; vol. 4. 2005, p. 2385–2390. [13] Franco, E., Magni, L., Parisini, T., Polycarpou, M., Raimondo, D.. Cooperative constrained control of distributed agents with nonlinear dynamics and delayed information exchange: A stabilizing receding-horizon approach. IEEE TAC 2008;53:324–338.

Rui Gomes et al. / Procedia Computer Science 150 (2019) 19–27 F. Lobo Pereira and R. Gomes / Procedia Computer Science 00 (2019) 000–000

27 9

[14] Langson, W., Chryssochoos, I., Rakovic, S., Mayne, D.. Robust model predictive control using tubes. Automatica 2004;40:125–133. [15] Franco, E., Parisini, T., Polycarpou, M.. Cooperative control of discrete-time agents with delayed information exchange: A receding-horizon approach. In: IEEE CDC. 2004, p. 4274–4279. [16] Keviczky, T., Borrelli, F., Balas, G.. Decentralized receding horizon control for large scale dynamically decoupled systems. Automatica 2006;42:2105–2115. [17] Keviczky, T., Borrelli, F., Fregene, K., Godbole, D., Balas, G.. Decentralized receding horizon control and coordination of autonomous vehicle formations. IEEE Trans Control Systems Tech 2008;16:19–33. [18] Consolini, L., Morbidi, F., Prattichizzo, D., Tosques, M.. Leader-follower formation control of nonholonomic mobile robots with input constraints. Automatica 2008;44(5):1343–1349. [19] Chao, Z., Ming, L., Shaolei, Z., Wenguang, Z.. Collision-free UAV formation flight control based on nonlinear MPC. In: Conf. on Electronics, Communications and Control. 2011, p. 1951–1956. [20] Quintero, S., Copp, D., Hespanha, J.. Robust UAV coordination for target tracking using output-feedback model predictive control with moving horizon estimation. In: ACC. 2015, p. 3758–3764. [21] Bertrand, S., Marzat, J., Piet-Lahanier, H., Kahn, A., Rochefort, Y.. MPC strategies for cooperative guidance of autonomous vehicles. In: AerospaceLab J.; vol. 8. 2014, p. 250–256. [22] Andrade, R., Raffo, G., Rico, J.. Model predictive control of a tilt-rotor uav for load transportation. In: European Control Conf. 2016, p. 2165–2170. [23] Shen, C., Shi, Y., Buckham, B.. Path-following control of an AUV using multi-objective model predictive control. In: American Control Conf. 2016, p. 4507–4512. [24] Kurzhanski, A., Varaiya, P.. Dynamic optimization for reachability problems. J Optim Th & Appl 2001;108:227–251. [25] Krasovskii, N., Subbotin, A.. Game-Theoretical Control Problems. Springer Series in Soviet Mathematics. Springer-Verlag New York; 1988. ISBN 9781461283188. [26] Baturin, V., Goncharova, E., Pereira, F., Sousa, J.. Polyhedral approximations to the reachable set of impulsive dynamic control systems. Autom & Remote Control 2006;69(3). [27] Graettinger, T., Krogh, B.. Hyperplane method for reachable state estimation for linear time-invariant systems. J of Optim Theory and Appl 1991;69:555–588. [28] Kurzhanski, A., V´alyi, I.. Ellipsoidal Calculus for Estimation and Control. Birkhuser; 1997. ISBN 978-0-8176-3699-9. [29] Sethian, J.. Level Set Methods and Fast Marching Methods. Cambridge University Press; 2 ed.; 1999. [30] Michel, I., Bayen, A., Tomlin, C.. Computing reachable sets for continuous dynamics games using level sets methods. IEEE TAC 2005; 50:980–1001. [31] Jadbabaie, A., Hauser, J.. On the stability of receding horizon control with a general terminal cost. IEEE Trans Autom Control 2005; 50(5):674–678. [32] Gruene, L., Pannek, J.. Nonlinear Model Predictive Control : Theory and Algorithms. Communications and Control Engineering. Springer Verlag; 2017. ISBN 978-3-319-46024-6. [33] Vinter, F.. Optimal Control. Springer; 2000. [34] Varaiya, P.. Reach set computation using optimal control. In: Proc. KIT Workshop on Verification of Hybrid Systems, Verimag, Grenoble. 1998, p. 1–25. [35] Cassandras, C., Lafortune, S.. Introduction to Discrete Event Systems. Springer Verlag; 2008. ISBN 978-0-387-33332-8.