
On the design of regularized explicit predictive controllers

from input-output data ⋆

Valentina Breschi a, Andrea Sassella a, Simone Formentin a

a Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza L. da Vinci 32, 20133 Milano, Italy.

Abstract

On the wave of recent advances in data-driven predictive control, we present an explicit predictive controller that can be constructed from a batch of input/output data only. The proposed explicit law is built upon a regularized implicit data-driven predictive control problem, so as to guarantee the uniqueness of the explicit predictive controller. As a side benefit, the use of regularization is shown to improve the capability of the explicit law in coping with noise on the data. The effectiveness of the retrieved explicit law and the repercussions of regularization on noise handling are analyzed on two benchmark simulation case studies, showing the potential of the proposed regularized explicit controller.

Key words: Data-driven control; learning-based control; predictive control; explicit MPC

1 Introduction

One of the main challenges of modern control theory is finding the most effective (and efficient) approaches to benefit from data when designing a controller for an unknown system. Traditionally, learning-based control strategies rely on a two-step procedure, which entails the identification of a model for the unknown system and the design of a model-based controller. These techniques can profit from established tools for system identification [27], but the resulting models are often not optimized for control, as their goal is to approximate the system dynamics by minimizing some fitting error. Although control-oriented identification approaches have been proposed (see, e.g., [21, 23]), they still do not allow one to avoid the two-stage procedure, with the modeling phase frequently making use of the lion's share of time and resources.

With a change in the data-handling paradigm, several techniques have been proposed to design controllers directly from data, while bypassing an explicit identification phase. Consolidated techniques for data-driven

⋆ This project was partially supported by the Italian Ministry of University and Research under the PRIN'17 project "Data-driven learning of constrained control systems", contract no. 2017J89ARP.

Email addresses: [email protected] (Valentina Breschi), [email protected] (Andrea Sassella), [email protected] (Simone Formentin).

control, such as the Virtual Reference Feedback Tuning (VRFT) method [13, 20, 22], Iterative Feedback Tuning (IFT) [24] and Correlation-based Tuning (CbT) [26, 38], directly employ data to tune the controller, but they have two major drawbacks. Firstly, they rely on the definition of a reference model embedding the desired closed-loop behavior. In this context, reference model selection thus becomes a rather delicate and time-consuming task, with the reference model being the main tuning knob of these approaches [11, 34, 39]. Secondly, state-of-the-art direct control techniques are not naturally equipped to cope with saturation and constraints, thus requiring additional layers in the control structure (see, e.g., [10, 12, 29, 32, 33]) to handle them.

More recently, the regained popularity of results from behavioral theory [41] has led to the introduction of alternative data-based control schemes that rely on a trajectory-based description of the system dynamics. These range from passivity-based [4, 28, 31, 35-37] and model reference [9] controllers, to optimal [16, 17, 19] and predictive [5, 6, 8, 14, 15, 18, 33] ones, with the latter built to tackle constraints.

Contribution and related works. In the spirit of [33], we derive the explicit solution for the data-based predictive control problem introduced in [7]. Like in traditional Model Predictive Control (MPC) [1], transitioning from a data-based implicit scheme to a data-driven explicit law entails that the optimal control action can be computed via simple function evaluations, rather than requiring the solution of an optimization problem in real-time.


This can be computationally advantageous, particularly when the problem at hand is relatively simple and the sampling time rather small. Differently from [7], our shift to the explicit predictive law allows one to obtain the optimal input by constructing a set of data-based matrices only, with the trajectory-based model of the system being ultimately transparent to the final user. Since the structure of the problem proposed in [7] prevents the computation of the explicit solution as it is, we propose to augment the performance-oriented predictive control cost with a regularization term, acting on the trajectory-based model of the plant. As such, we denominate the presented controller Regularized Explicit Data-Driven Predictive Controller (R-EDDPC). Apart from allowing the explicit solution to be found, we show that the regularization can help in coping with noisy data, in line with what is proposed in [7] to robustify the implicit scheme. We show that the obtained explicit predictive law is piecewise affine, retrieving a controller that resembles the ones introduced in [1, 33]. Differently from [33], the explicit law obtained in this work depends on input/output data only, thus not requiring the state of the controlled system to be fully measurable. Nonetheless, we show that the two explicit laws are equivalent under some design assumptions.

Outline. The paper is organized as follows. After recalling some preliminaries, the problem of designing explicit predictive controllers from data is stated in Section 2. In Section 3 we shift from the implicit predictive control problem in [7] to its explicit counterpart. The proposed solution is then compared with that of [33] in Section 4. In Section 5, we show how the explicit predictive controller can be extended to handle set-point changes and to be further robustified against noise. The performance of the proposed method is then discussed in Section 6 by means of two benchmark case studies. The paper ends with some concluding remarks.

Notation. Let N and R be the sets of natural and real numbers, respectively, and let I_N = {0, 1, ..., N−1}. Denote with R^n the set of real column vectors of dimension n and with R^{n×m} the set of real matrices with n rows and m columns. Given A ∈ R^{n×m}, we indicate with A' ∈ R^{m×n} its transpose, while we denote by [A]_i the i-th row of A and by [A]_{i:j} the subset of rows of A, starting from the i-th up to the j-th row (for i < j). When n = m, we indicate the inverse of A as A^{-1}, while A^† denotes its right inverse when n ≠ m. We denote with I_n the identity matrix of dimension n, while we do not specify the dimension of zero vectors or matrices. Given x ∈ R^n, we denote the squared 2-norm of this vector as ‖x‖², while ‖x‖²_Q = x'Qx. Given Q ∈ R^{n×n}, Q ⪰ 0 and Q ≻ 0 indicate that the matrix is positive semi-definite and positive definite, respectively. Let Q_i ∈ R^{n×n}, for i = 1, ..., L. Then, diag(Q_1, ..., Q_L) denotes the block diagonal matrix composed by {Q_i}_{i=1}^L. Given a sequence {u_k}_{k=0}^{N−1}, we denote the associated Hankel matrix as H_L(u), i.e.,

H_L(u) = [ u_0      u_1    ···  u_{N−L}
           u_1      u_2    ···  u_{N−L+1}
           ⋮        ⋮      ⋱    ⋮
           u_{L−1}  u_L    ···  u_{N−1} ],        (1)

while a window of the sequence is indicated as

u_{[a,b]} = [ u_a ; u_{a+1} ; ⋯ ; u_b ],        (2)

with a < b.
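The following sketch (ours, not part of the original paper) shows how these objects can be built numerically; it assumes the samples u_0, ..., u_{N−1} are stored as the rows of a NumPy array, and the function names are chosen for illustration.

import numpy as np

def hankel_matrix(u, L):
    """Block Hankel matrix H_L(u) of (1); u has shape (N, m), the result (m*L, N-L+1)."""
    N, m = u.shape
    cols = N - L + 1
    # Block row i stacks u_i, ..., u_{i+N-L} side by side.
    return np.vstack([u[i:i + cols].T for i in range(L)])

def window(u, a, b):
    """Stacked window u_[a,b] of (2), returned as a column vector."""
    return u[a:b + 1].reshape(-1, 1)

def is_persistently_exciting(u, L):
    """Rank test of Definition 1 below: rank(H_L(u)) = m*L."""
    return np.linalg.matrix_rank(hankel_matrix(u, L)) == u.shape[1] * L

# Quick usage example with a random scalar input of length N = 100.
rng = np.random.default_rng(0)
u = rng.uniform(-5, 5, size=(100, 1))
print(hankel_matrix(u, 5).shape)          # (5, 96)
print(is_persistently_exciting(u, 5))     # True (almost surely)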

2 Problem formulation

The data-driven predictive control formulation proposed in [7] represents the starting point from which we derive R-EDDPC. Therefore, we here recall the results on which the former lays its foundation, starting from a formal definition of persistently exciting sequence.

Definition 1 (Persistence of excitation) Given a signal ν_k ∈ R^η, the sequence {ν_k}_{k=0}^{N−1} is said to be persistently exciting of order L if rank(H_L(ν)) = ηL.

Consider now a linear time-invariant (LTI) system P, with state x_k ∈ R^n, input u_k ∈ R^m and output y_k ∈ R^p. A trajectory of this system can be formally defined as follows.

Definition 2 (System trajectory) An input/output sequence {u_k, y_k}_{k=0}^{N−1} is a trajectory of the LTI system P if there exist an initial condition x̄ ∈ R^n and a state sequence {x_k}_{k=0}^{N} such that

x_{k+1} = A x_k + B u_k,   x_0 = x̄,
y_k = C x_k + D u_k,

for all k ∈ I_N, where (A, B, C, D) is a minimal realization of P.

Given a noiseless trajectory {u_k, y_k}_{k=0}^{N−1} of P, with {u_k}_{k=0}^{N−1} being persistently exciting of order L + n, the following result further holds.

Theorem 1 (Trajectory-based representation [7]) Let {u_k, y_k}_{k=0}^{N−1} be a trajectory of an LTI system P. Assume that {u_k}_{k=0}^{N−1} is persistently exciting of order L + n. Then {ū_k, ȳ_k}_{k=0}^{L−1} is a trajectory of P if and only if there exists α ∈ R^{N−L+1} such that

[ H_L(u) ; H_L(y) ] α = [ ū_{[0,L−1]} ; ȳ_{[0,L−1]} ],        (3)

where y = {y_k}_{k=0}^{N−1}.

According to this result, a single input/output sequence of P can be used to retrieve a representation of the system spanning the vector space of all its trajectories, provided that such a sequence is properly generated.
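As a numerical illustration (ours, not from the paper), condition (3) can be checked by solving the linear system for α in a least-squares sense and inspecting the residual; this reuses the hankel_matrix and window helpers sketched above and, per Theorem 1, is an exact test only for noiseless data with a persistently exciting input of order L + n.

import numpy as np

def trajectory_membership(u_data, y_data, u_traj, y_traj, L, tol=1e-8):
    # Stack the input and output Hankel matrices as on the left-hand side of (3).
    H = np.vstack([hankel_matrix(u_data, L), hankel_matrix(y_data, L)])
    # Right-hand side: the candidate L-step input/output trajectory.
    rhs = np.vstack([window(u_traj, 0, L - 1), window(y_traj, 0, L - 1)])
    alpha, *_ = np.linalg.lstsq(H, rhs, rcond=None)
    residual = np.linalg.norm(H @ alpha - rhs)
    return residual < tol, alpha      # membership flag and one representing alpha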

We can now mathematically formulate the problem tackled in this work. Let P be an unknown LTI system of order n ∈ N, with m ∈ N inputs and p ∈ N outputs, which is here supposed to be controllable and observable¹. Assume that we can carry out experiments on P, so as to collect a sequence of N ∈ N input/output pairs, D_N = {u^d_k, y^d_k}_{k=0}^{N−1}, and assume that the input data satisfy the following condition.

Assumption 1 (Quality of data) The input sequence u^d = {u^d_k}_{k=0}^{N−1} is persistently exciting of order L + 2n, according to Definition 1.

Our goal is to find an explicit data-based solution for the implicit data-driven predictive control (DD-PC) problem introduced in [7], that is defined as follows:

min_{α,u,y}  Σ_{k=0}^{L−1} ℓ(u_k, y_k)        (4a)
s.t.  [ u_{[−n,L−1]} ; y_{[−n,L−1]} ] = [ H_{L+n}(u^d) ; H_{L+n}(y^d) ] α,        (4b)
      [ u_{[−n,−1]} ; y_{[−n,−1]} ] = χ_0,        (4c)
      [ u_{[L−n,L−1]} ; y_{[L−n,L−1]} ] = χ_L,        (4d)
      u_k ∈ U,  y_k ∈ Y,  ∀k ∈ I_L.        (4e)

Therefore, we aim at explicitly finding the optimal input sequence u, the corresponding outputs y and the trajectory-based model α ∈ R^{n_α}, with n_α = N − L − n + 1, so as to minimize the quadratic cost

ℓ(u_k, y_k) = ‖u_k − u^s‖²_R + ‖y_k − y^s‖²_Q,

over a prediction horizon L, with Q ≻ 0 and R ≻ 0, and with u^s ∈ R^m and y^s ∈ R^p verifying the subsequent definition.

1 This assumption is shared with [7].

Definition 3 (Equilibrium [7]) An input/output pair (u^s, y^s) is an equilibrium of the LTI system P if the sequence {u_k, y_k}_{k=0}^{n}, with u_k = u^s and y_k = y^s for all k ∈ I_n, is a trajectory of P.

Meanwhile, our search for the optimal sequences and model is constrained by (i) the initial condition in (4c), (ii) the terminal constraint in (4d), and (iii) the value constraints in (4e). The initial condition is characterized by χ_0 ∈ R^{n(m+p)}, which changes at each time instant t ∈ N. Specifically, this vector collects the n past input/output pairs resulting from the application of the DD-PC law, i.e., at time t it is defined as

χ_0 = [ u_{[t−n,t−1]} ; y_{[t−n,t−1]} ].

The terminal constraint is instead shaped by a constant vector χ_L ∈ R^{n(m+p)}, given by

χ_L = [ u^s_n ; y^s_n ],

where u^s_n and y^s_n stack n copies of u^s and y^s, respectively. Lastly, the sets characterizing the value constraints in (4e), namely U ⊆ R^m and Y ⊆ R^p, are assumed to be polytopic. Note that the terminal constraint further influences the choice of the prediction horizon L, since L ≥ n is required for the problem to be well-posed.

Remark 1 (Stability and recursive feasibility) As proven in [7], within a noiseless setting the data-based formulation in (4) guarantees recursive feasibility and closed-loop exponential stability of the equilibrium (u^s, y^s).
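To make the implicit formulation concrete, the following sketch (ours) sets up one receding-horizon instance of problem (4) with CVXPY, reusing the hankel_matrix helper introduced above. It assumes noiseless data, box sets U and Y, and samples stored as rows; all names and the box-constraint choice are illustrative, not taken from the paper.

import cvxpy as cp
import numpy as np

def implicit_ddpc_step(ud, yd, chi0, us, ys, L, n, Q, R, u_box, y_box):
    # Hankel matrices of depth L + n, as required by constraint (4b).
    m, p = ud.shape[1], yd.shape[1]
    Hu, Hy = hankel_matrix(ud, L + n), hankel_matrix(yd, L + n)
    alpha = cp.Variable(Hu.shape[1])
    u = Hu @ alpha                      # stacked u_[-n, L-1], expressed through (4b)
    y = Hy @ alpha                      # stacked y_[-n, L-1]
    cons = [u[:n * m] == chi0[:n * m],          # (4c): past inputs
            y[:n * p] == chi0[n * m:],          # (4c): past outputs
            u[m * L:] == np.tile(us, n),        # (4d): terminal inputs pinned to u^s
            y[p * L:] == np.tile(ys, n)]        # (4d): terminal outputs pinned to y^s
    cost = 0
    for k in range(L):
        uk = u[m * (n + k): m * (n + k + 1)]
        yk = y[p * (n + k): p * (n + k + 1)]
        cost += cp.quad_form(uk - us, R) + cp.quad_form(yk - ys, Q)   # stage cost l(u_k, y_k)
        cons += [cp.abs(uk) <= u_box, cp.abs(yk) <= y_box]            # (4e) as box sets
    cp.Problem(cp.Minimize(cost), cons).solve()
    return u.value[m * n: m * (n + 1)]          # receding-horizon input u_0 to be applied

In a receding-horizon implementation, such a routine would be called at every sampling instant with the updated χ_0, which is precisely the online burden that the explicit law of Section 3 removes.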

3 From implicit DD-PC to R-EDDPC

The stepping stone for the derivation of the explicit data-driven solution to (4) lies in its reformulation as an optimization problem where the unique decision variable is α. To this end, let us define the following matrices

H^P_γ = [H_{L+n}(γ^d)]_{1:n},      H^F_γ = [H_{L+n}(γ^d)]_{n+1:L+n},        (5a)
H^T_γ = [H_{L+n}(γ^d)]_{L+1:L+n},  H^k_γ = [H_{L+n}(γ^d)]_{k+1},        (5b)

where γ is a generic placeholder, to be replaced with either u or y. We stress that H^P_u, H^F_u and H^T_u are all full row rank matrices, since u^d is persistently exciting of order L + 2n. Accordingly, problem (4) can be manipulated and equivalently recast as:

min_α  ‖α‖²_{W_d} + 2 c_d' α        (6a)
s.t.  [ H^P_u ; H^P_y ] α = χ_0,        (6b)
      [ H^T_u ; H^T_y ] α = χ_L,        (6c)
      H^k_u α ∈ U,  H^k_y α ∈ Y,  ∀k ∈ I_L,        (6d)

where

W_d = (H^F_u)' R H^F_u + (H^F_y)' Q H^F_y,        (6e)
c_d = −[ (H^F_u)' R u^s_L + (H^F_y)' Q y^s_L ],        (6f)

with Q = diag(Q, ..., Q) ≻ 0, R = diag(R, ..., R) ≻ 0, and u^s_L and y^s_L stacking L copies of u^s and y^s, respectively. However, the weighting matrix W_d in (6e) can be shown to be positive semi-definite, as illustrated in the proof of the following lemma.

Lemma 1 (Positive semi-definiteness of W_d) The matrix W_d ∈ R^{n_α×n_α} in (6e) is positive semi-definite.

Proof: Since Q ≻ 0, (H^F_y)' Q H^F_y is positive semi-definite. As Theorem 1 requires N ≥ (m+1)(L+n) − 1, then n_α ≥ m(L+n). Since H^F_u ∈ R^{mL×n_α} is full row rank by construction and mL < n_α, then (H^F_u)' R H^F_u is positive semi-definite, despite R ≻ 0. As such, W_d is positive semi-definite, thus concluding the proof.
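For completeness, here is a sketch (ours) of how the data-based cost matrices of (6e)-(6f) and the regularized weight used later in (8) could be assembled from the partitions (5); it reuses hankel_matrix and assumes samples stored as rows.

import numpy as np
from scipy.linalg import block_diag

def reduced_cost_matrices(ud, yd, us, ys, L, n, Q, R, rho_alpha):
    m, p = ud.shape[1], yd.shape[1]
    Hu, Hy = hankel_matrix(ud, L + n), hankel_matrix(yd, L + n)
    HFu, HFy = Hu[m * n:], Hy[p * n:]            # "future" blocks H^F_u, H^F_y of (5a)
    Qbar = block_diag(*([Q] * L))                # diag(Q, ..., Q)
    Rbar = block_diag(*([R] * L))                # diag(R, ..., R)
    Wd = HFu.T @ Rbar @ HFu + HFy.T @ Qbar @ HFy                              # (6e)
    cd = -(HFu.T @ Rbar @ np.tile(us, L) + HFy.T @ Qbar @ np.tile(ys, L))     # (6f)
    Wd_rho = Wd + rho_alpha * np.eye(Wd.shape[0])                             # W_d^rho used in (8)
    return HFu, HFy, Wd, cd, Wd_rho

Adding ρ_α I to W_d renders the Hessian positive definite, which is exactly the property exploited in Section 3.2.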

This structural property of W_d makes the cost of the optimization problem in (6) convex, but not strictly convex, ultimately hampering the possibility of retrieving a unique explicit solution of the problem. To overcome this limitation, we augment the cost with an L2-regularization term, leading to the following regularized data-based predictive control problem:

min_α  ‖α‖²_{W_d} + 2 c_d' α + ρ_α ‖α‖²        (7a)
s.t.  [ H^P_u ; H^P_y ] α = χ_0,        (7b)
      [ H^T_u ; H^T_y ] α = χ_L,        (7c)
      H^k_u α ∈ U,  H^k_y α ∈ Y,  ∀k ∈ I_L,        (7d)

where ρ_α > 0 is a hyper-parameter to be tuned.

Remark 2 (Small ρ_α) For sufficiently small values of ρ_α, the difference between problems (4) and (7) can become negligible. In this case, R-EDDPC is likely to inherit the same properties as the implicit solution (see [7, Section III.B]).

3.1 The role of regularization

By resulting in the addition of a positive constant to the diagonal of W_d, the regularization term allows us to bypass the structural issue characterizing the cost in (4a). Therefore, the larger ρ_α, the more the regularized cost will differ from the one of problem (6). One should thus pick a relatively small regularization parameter for the explicit solution of (7) to be as close as possible to the one of the implicit MPC problem in (6). Even if the regularized problem (7) has been mainly introduced to allow for the computation of the explicit law, we stress that this alternative formulation of the implicit MPC problem goes in the direction of robustification, along the same line followed in [7, 14]. Indeed, L2-regularization has a shrinkage effect on the (implicit) model of P embedded in α. By penalizing the size of its components, the regularization term steers the elements of α towards zero and, concurrently, towards each other. As such, its use (i) prevents the model from becoming excessively complex, hence hindering overfitting, (ii) helps in handling noisy data, by implicitly reducing the influence of noise on the prediction accuracy, and (iii) alleviates problems that can be caused by highly correlated features. Therefore, in selecting ρ_α one has to further account for the shrinking effect of this additional penalty.

3.2 The derivation of R-EDDPC

To derive the explicit solution of (7), let us firstly merge the constraints in (7b) and (7c) into a single equality H_d α = χ. Thanks to the polytopic structure of the value constraints in (7d), we can rewrite them as G_d α ≤ β, so that the problem in (7) can be equivalently recast as

min_α  (1/2) ‖α‖²_{W^ρ_d} + c_d' α        (8a)
s.t.  H_d α = χ,        (8b)
      G_d α ≤ β,        (8c)

where the cost is scaled with respect to the one in (7) and W^ρ_d = W_d + ρ_α I_{n_α}. Moreover, let us introduce the following assumption on the constraints in (8c).

Assumption 2 (Constraints) Given G_d in (8c), let its rows associated with active constraints be denoted by G̃_d. The rows of G = [ G̃_d' H_d' ]' are linearly independent.

Under this assumption, the closed-form data-driven solution of the implicit problem in (8) is given by the following theorem.

Theorem 2 (R-EDDPC) Let Assumptions 1-2 hold, with the latter satisfied for all the M ∈ N possible combinations of active constraints. Then, the data-driven explicit law coupled with (8) is unique and is given by

u(χ) = { F_{d,1} χ + f_{d,1}   if E_{d,1} χ ≤ K_{d,1},
         ⋮
         F_{d,M} χ + f_{d,M}   if E_{d,M} χ ≤ K_{d,M}.        (9)

Proof: Since problem (8) is strictly convex by construction, it has a unique solution α, whose closed form can be found by applying the Karush-Kuhn-Tucker (KKT) conditions. Therefore, the following holds:

W^ρ_d α + G_d' λ + H_d' µ + c_d = 0,        (10a)
λ'(G_d α − β) = 0,        (10b)
H_d α − χ = 0,        (10c)
G_d α − β ≤ 0,        (10d)
λ ≥ 0,        (10e)

where λ ∈ R^{(m+p)L} and µ ∈ R^{2n(m+p)} are the Lagrange multipliers associated with the inequality and equality constraints in (8), respectively.

Consider a generic set of active constraints in (8c), and let us distinguish the Lagrange multipliers λ̃ ∈ R^{n_λ} associated with the latter from the ones coupled with inactive constraints, here indicated as λ̄. Accordingly, we denote by G̃_d and β̃ the rows of G_d and β associated with active constraints, while we indicate with Ḡ_d, β̄ the remaining rows.

Since both (10b) and the dual feasibility condition in (10e) have to hold for inactive constraints, λ̄ is a vector of zeros and, thus, the KKT conditions in (10b) and (10d) can be equivalently restated as:

G̃_d α − β̃ = 0,        (11a)
Ḡ_d α − β̄ < 0,        (11b)

where the product with λ̃' in (11a) is neglected, since this vector is non-zero by definition. Thanks to this reformulation, we can then merge (10c) and (11a), so as to obtain a single equality condition, here defined as

G α − b − S χ = 0,        (12)

where

G = [ G̃_d ; H_d ],   b = [ β̃ ; 0 ],   S = [ 0 ; I_{n(m+p)} ],

and the dimensions of S depend on the number of active constraints. We can then exploit the stationarity condition in (10a) to express α as a function of the non-zero Lagrange multipliers, i.e.,

α = −(W^ρ_d)^{−1} ( G' δ + c_d ),        (13)

where δ is given by δ = [ λ̃' µ' ]'. By merging (13) and (12), we then obtain the equality

−G_W δ − b̄ − S χ = 0,        (14)

where G_W = G (W^ρ_d)^{−1} G' and b̄ = G (W^ρ_d)^{−1} c_d + b. We can thus retrieve the closed-form expression for δ, which is given by

δ = −G_W^{−1} ( b̄ + S χ ).        (15)

Since G_W can always be inverted according to our assumptions, we can thus substitute (15) into (13), ultimately obtaining the following data-driven expression for α:

α = (W^ρ_d)^{−1} [ G' G_W^{−1} ( b̄ + S χ ) − c_d ].        (16)

Accordingly, we can retrieve the associated data-driven expression for the predicted input sequence for a given set of active constraints as

u_{[0,L−1]}(χ) = H^F_u (W^ρ_d)^{−1} [ G' G_W^{−1} ( b̄ + S χ ) − c_d ].        (17)

By combining the primal and dual feasibility conditions in (10d) and (10e), we can further characterize the polyhedral region where (17) holds, which is shaped by the following combination of inequalities:

Ḡ_d (W^ρ_d)^{−1} [ G' G_W^{−1} ( b̄ + S χ ) − c_d ] − β̄ ≤ 0,        (18a)
[ G_W^{−1} ( b̄ + S χ ) ]_{1:n_λ} ≤ 0,        (18b)

where n_λ denotes the number of active constraints. By considering all M possible combinations of active constraints, straightforward manipulations result in an explicit predictive control sequence defined as

u_{[0,L−1]} = { F_{d,1} χ + f_{d,1}   if E_{d,1} χ ≤ K_{d,1},
                ⋮
                F_{d,M} χ + f_{d,M}   if E_{d,M} χ ≤ K_{d,M},        (19a)

where

F_{d,i} = H^F_u (W^ρ_d)^{−1} G_i' G_{W,i}^{−1} S_i,        (19b)
f_{d,i} = H^F_u (W^ρ_d)^{−1} ( G_i' G_{W,i}^{−1} b̄_i − c_{d,i} ),        (19c)
E_{d,i} = [ Ḡ_d (W^ρ_d)^{−1} G_i' G_{W,i}^{−1} S_i ; G_{W,i}^{−1} S_i ],        (19d)
K_{d,i} = [ β̄_i + Ḡ_d (W^ρ_d)^{−1} ( c_{d,i} − G_i' G_{W,i}^{−1} b̄_i ) ; −G_{W,i}^{−1} b̄_i ],        (19e)

for all i = 1, ..., M, where G_i, b̄_i, c_{d,i} and S_i indicate the rows of G, b̄, c_d and S associated with the i-th set of active constraints, and G_{W,i} is constructed accordingly. Lastly, by selecting the first element of this sequence, it can easily be shown that the control action to be applied is indeed piecewise affine and that it has the structure in (9), thus concluding the proof.
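As an illustration (ours, not from the paper), the quantities in (12)-(19) can be assembled numerically for one candidate active set as sketched below; the enumeration of all M active sets and the pruning of empty regions are left out, Assumption 2 and the invertibility of W^ρ_d are taken for granted, and all names are ours.

import numpy as np

def region_law(Wd_rho, cd, Hd, Gd, beta, active):
    # active: boolean mask selecting the active rows of Gd (the "tilde" quantities in the proof).
    Gt, bt = Gd[active], beta[active]          # active rows  (tilde quantities)
    Gb, bb = Gd[~active], beta[~active]        # inactive rows (bar quantities)
    n_chi = Hd.shape[0]
    G = np.vstack([Gt, Hd])                                        # merged equality (12)
    b = np.concatenate([bt, np.zeros(n_chi)])
    S = np.vstack([np.zeros((Gt.shape[0], n_chi)), np.eye(n_chi)])
    Winv = np.linalg.inv(Wd_rho)
    GW = G @ Winv @ G.T                                            # G_W of (14)
    bbar = G @ Winv @ cd + b                                       # b-bar of (14)
    GWinv = np.linalg.inv(GW)
    F_alpha = Winv @ G.T @ GWinv @ S                               # chi-dependent part of alpha in (16)
    f_alpha = Winv @ (G.T @ GWinv @ bbar - cd)                     # constant part of alpha in (16)
    n_act = Gt.shape[0]
    # Critical region (18): feasibility of inactive constraints and nonnegative active multipliers.
    E = np.vstack([Gb @ F_alpha, GWinv[:n_act] @ S])
    K = np.concatenate([bb - Gb @ f_alpha, -GWinv[:n_act] @ bbar])
    return F_alpha, f_alpha, E, K   # alpha(chi) = F_alpha @ chi + f_alpha, valid where E @ chi <= K

Left-multiplying F_alpha and f_alpha by H^F_u then yields the input-sequence law (17), whose first block row gives one piece of (9).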

Note that the explicit control law in (9) has a structure similar to the one obtained in the standard model-based case [1, 3], but the dependence on the matrices of the state-space model of P has now been replaced with a set of data matrices. Moreover, differently from the explicit predictive law introduced in [33], the obtained piecewise affine law is inherently output-feedback, thus not requiring a direct measurement of the state.

Remark 3 (Relaxation) Assumption 2 can be relaxed in practice, since redundant constraints can be removed with degeneracy handling strategies, like the ones proposed in [2].

Remark 4 (Coping with general constraints) The data-driven law in (9) can be readily adapted to handle more general polytopic constraints of the form

G_d α ≤ β + ϕ χ.        (20)

In this case, the definition of S in (12) changes as follows:

S = [ ϕ̃ ; I_{n(m+p)} ],

with ϕ̃ indicating the rows of ϕ associated with the considered set of active constraints. Nonetheless, the derivations in the proof of Theorem 2 remain unchanged, and so does the form of the explicit control law.

4 A comparison with E-DDPC

The explicit law derived in Section 3.2 is not the first of its kind. Indeed, a data-driven explicit law has already been proposed in [33]. Our aim is thus to compare the two DD-PC problems that R-EDDPC and E-DDPC solve. To this end, let us recall the implicit problem considered in [33], by focusing on the case in which the prediction, control and constraint horizons are all equal to L, i.e.,

min_{u_{[0,L−1]}}  Σ_{k=0}^{L−1} [ ‖x_k‖²_{Q̄} + ‖u_k‖²_{R̄} ] + ‖x_L‖²_{P̄}        (21a)
s.t.  x_{k+1} = X_{1,N} [ U_{0,1,N} ; X_{0,N} ]^† [ u_k ; x_k ],   k = 0, ..., L−1,        (21b)
      x_0 = x̄,        (21c)
      u_k ∈ U,  x_k ∈ X,   k = 0, ..., L−1,        (21d)

where x̄ ∈ R^n denotes the initial state for the prediction at time t and

U_{0,1,N} = [ u^d_n  u^d_{n+1}  ···  u^d_{N+n−1} ],        (22)
X_{0,N} = [ x^d_n  x^d_{n+1}  ···  x^d_{N+n−1} ],        (23)
X_{1,N} = [ x^d_{n+1}  x^d_{n+2}  ···  x^d_{N+n} ],        (24)

with x^d_k denoting the state measured or reconstructed from data at instant k. Note that, since the input is persistently exciting of order L + 2n according to Assumption 1, the condition required for the model in (21b) to represent the behavior of the unknown system P holds when the state is fully measured (see [33, Theorem 1]). For (4) and (21) to be comparable when the state is not fully measured, we solve problem (21) by considering the non-minimal state realization

z_k = [ u'_{k−n}  ···  u'_{k−1}  y'_{k−n}  ···  y'_{k−1} ]'.        (25)

Note that, for the predictive model in (21b) to be well defined in this scenario, the input has to be persistently exciting of order 2n + 1 [16, Theorem 7]. In our setting, this condition still holds thanks to Assumption 1. By relying on the above problem setting, we can prove the existence of the following equivalence relations.

Lemma 2 (Model equivalence) The predictive models in (4b) and (21b) are equivalent.

Proof: See Appendix A.1.

Lemma 3 (Constraints equivalence) The constraints in (4c) and (21c) are always equivalent. Instead, the ones in (4e) and (21d) are equivalent if: (i) X ≡ Y when the state is fully measured; (ii) X ≡ U^n × Y^n otherwise.

Proof: See Appendix A.2.

Since (21) aims at steering both states and inputs to zero, while (4) depends on the equilibrium (u^s, y^s), we make the comparison for (u^s, y^s) = (0, 0). We can now derive the following result on the relationship between (4) and (21).

Theorem 3 (Problem equivalence) Consider the relaxed data-driven problem

min_{α,u,y}  Σ_{k=0}^{L−1} [ ‖y_k‖²_Q + ‖u_k‖²_R ] + ‖ [ u_{[L−n,L−1]} ; y_{[L−n,L−1]} ] ‖²_P + ρ_α ‖α‖²        (26a)
s.t.  [ u_{[−n,L−1]} ; y_{[−n,L−1]} ] = [ H_{L+n}(u^d) ; H_{L+n}(y^d) ] α,        (26b)
      [ u_{[−n,−1]} ; y_{[−n,−1]} ] = χ_0,        (26c)
      u_k ∈ U,  y_k ∈ Y,  ∀k ∈ I_L,        (26d)

where the terminal constraint in (4d) has been softened and added to the cost. Then, (21) and (26) are equivalent in the following cases:

(i) the state is fully measurable, ρ_α → 0 and the weights in (21a) are Q̄ = Q, R̄ = R, P̄ = T' P T, where T ∈ R^{n×(m+p)n} satisfies

x_L = T [ u_{[L−n,L−1]} ; x_{[L−n,L−1]} ],        (27)

(ii) the state is not fully measured, ρ_α → 0 and the weights in (21a) are

Q̄ = V' [ R  0 ; 0  Q ] V,   R̄ = 0,   P̄ = P + Q̄,        (28)

with V ∈ R^{(m+p)×n(m+p)} verifying

[ u_{k−1} ; y_{k−1} ] = V z_k.        (29)

Proof: See Appendix A.3.

Since the steps performed to compute R-EDDPC and E-DDPC are fundamentally the same, the equivalence between (4) and (26) shows that the two explicit laws are likely to match when ρ_α vanishes and the terminal constraint is lifted to the cost in (7) or, alternatively, the terminal cost is replaced with a hard constraint in (21).
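The comparison above hinges on the non-minimal realization (25); a minimal helper (ours) assembling z_k from the last n input/output samples, stored as rows, is sketched below.

import numpy as np

def nonminimal_state(u_past, y_past):
    # u_past: (n, m) array [u_{k-n}, ..., u_{k-1}];  y_past: (n, p) array [y_{k-n}, ..., y_{k-1}].
    # Returns z_k of (25), i.e., the same regressor that also plays the role of chi_0 in (4c).
    return np.concatenate([u_past.reshape(-1), y_past.reshape(-1)])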

5 Extensions

We now highlight how the explicit solution derived in Section 3.2 can be extended to two different scenarios. Firstly, we show how the presented derivation can be adapted to handle tracking tasks, according to the scheme proposed in [6]. Then, along the line followed in [7], we introduce a slack variable to further robustify the approach with respect to noisy data, and discuss how this modifies the obtained explicit solution.

5.1 Handling changing set points in R-EDDPC

When the control objective shifts from reaching a given equilibrium point (u^s, y^s) to tracking an input/output reference behavior (u^r, y^r), the implicit MPC problem to be solved at each time step t ∈ N can be modified as follows [6]:

min_{α,u,y,u^s,y^s}  ℓ̃(u_k, y_k, u^s, y^s)        (30a)
s.t.  [ u_{[−n,L−1]} ; y_{[−n,L−1]} ] = [ H_{L+n}(u^d) ; H_{L+n}(y^d) ] α,        (30b)
      [ u_{[−n,−1]} ; y_{[−n,−1]} ] = χ_0,        (30c)
      [ u_{[L−n,L−1]} ; y_{[L−n,L−1]} ] = Ω [ u^s ; y^s ],        (30d)
      u_k ∈ U,  y_k ∈ Y,  ∀k ∈ I_L,        (30e)
      (u^s, y^s) ∈ U^s × Y^s,        (30f)

where the cost is now given by

ℓ̃(u_k, y_k, u^s, y^s) = Σ_{k=0}^{L−1} [ ‖u_k − u^s‖²_R + ‖y_k − y^s‖²_Q ] + ‖u^s − u^r‖²_Ψ + ‖y^s − y^r‖²_Φ + ρ_α ‖α‖²,

so as to penalize the deviation of the set point (u^s, y^s) from the desired target (u^r, y^r), with Ψ, Φ ≻ 0. Note that Ω ∈ {0,1}^{n(m+p)×(m+p)} in (30d) is such that:

Ω [ u^s ; y^s ] = χ_L.

Moreover, the constraint in (30f) entails some value conditions on the equilibrium point, here assumed to be still characterized via a set of polytopic constraints.

Within this scenario, the resulting explicit predictive law can be computed by relying on the following lemma.

Lemma 4 (Piecewise affine solution) Let ᾱ collect the optimization variables of problem (30) and χ̄ stack the initial conditions χ_0 and the reference to be tracked, i.e.,

ᾱ = [ α ; u^s ; y^s ],   χ̄ = [ χ_0 ; u^r ; y^r ].        (31)

Then, problem (30) is explicitly solved by a piecewise affine law defined as in (9), with χ̄ substituting χ.

Proof: Based on the definition of ᾱ in (31), we can rewrite the constraint in (30b) as

[ u_{[−n,L−1]} ; y_{[−n,L−1]} ] = [ H_{L+n}(u^d)  0 ; H_{L+n}(y^d)  0 ] ᾱ.        (32)

This equivalent representation allows us to recast the tracking MPC problem (30) in the same form as (8), i.e.,

min_ᾱ  (1/2) ‖ᾱ‖²_{W^ρ_d} + χ̄' c_d ᾱ        (33a)
s.t.  H_d ᾱ = S̄ χ̄,        (33b)
      G_d ᾱ ≤ β,        (33c)

where (33c) is obtained by merging (30e)-(30f). Note that the matrices characterizing this equivalent problem are defined as follows:

W^ρ_d = (H̄^F_u)' R H̄^F_u + (H̄^F_y)' Q H̄^F_y + (S_{u^s})' Ψ S_{u^s} + (S_{y^s})' Φ S_{y^s} + ρ_α I,

c_d = − [ 0  0  0 ; 0  Ψ  0 ; 0  0  Φ ],   H_d = [ H^P  0 ; H^T  −Ω ],   S̄ = [ I  0  0 ; 0  0  0 ],

with

H̄^F_u = [ H^F_u  −C^{u^s}_L  0 ],   H̄^F_y = [ H^F_y  0  −C^{y^s}_L ],
H^P = [ H^P_u ; H^P_y ],   H^T = [ H^T_u ; H^T_y ],

and where S_γ selects any variable (denoted generically with the placeholder γ) within a given vector, while C^γ_L allows one to construct L copies of it. This reformulation allows us to follow the same steps presented in Section 3.2 to derive the explicit law, by replacing S in (12) with

S = [ 0 ; S̄ ],

and c_d in (13)-(18) with c_d χ̄. As a consequence, the predicted input sequence has the same closed form as in (19), with

F_{d,i} = H̄^F_u (W^ρ_d)^{−1} ( G_i' G_{W,i}^{−1} S_i − c_{d,i} ),
f_{d,i} = H̄^F_u (W^ρ_d)^{−1} G_i' G_{W,i}^{−1} b̄_i,
E_{d,i} = [ Ḡ_d (W^ρ_d)^{−1} ( G_i' G_{W,i}^{−1} S_i + c_{d,i} ) ; G_{W,i}^{−1} S_i ],
K_{d,i} = [ β̄_i − Ḡ_d (W^ρ_d)^{−1} G_i' G_{W,i}^{−1} b̄_i ; −G_{W,i}^{−1} b̄_i ],

where G, G_W and b̄ can be readily customized to the current problem from the matrices introduced in Section 3.2, while the subscript i indicates the i-th set of active constraints. We can then retrieve the input by extracting the first component of this sequence only. This result leads to a law that takes the same piecewise affine form as (9), thus concluding the proof.

5.2 Robustification with slack variables

Assume that the measured outputs y^d used to construct the Hankel matrices in (4b) are corrupted by noise. In this case, a robust DD-PC formulation similar to the one proposed in [7] can be recovered by introducing an additive slack variable σ ∈ R^{p(L+n)} on the predicted output and the corresponding regularization term as follows:

min_{α,σ,u,y}  Σ_{k=0}^{L−1} ℓ(u_k, y_k) + ρ_α ‖α‖² + ρ_σ ‖σ‖²        (34a)
s.t.  [ u_{[−n,L−1]} ; y_{[−n,L−1]} + σ ] = [ H_{L+n}(u^d) ; H_{L+n}(y^d) ] α,        (34b)
      [ u_{[−n,−1]} ; y_{[−n,−1]} ] = χ_0,        (34c)
      [ u_{[L−n,L−1]} ; y_{[L−n,L−1]} ] = χ_L,        (34d)
      u_k ∈ U,  y_k ∈ Y,  ∀k ∈ I_L.        (34e)

Along the line of [7, Remark 3], we do not include any constraint on the slack variable, leveraging the fact that its values can be practically contained by selecting ρ_σ > 0 sufficiently large. In turn, this entails that (34) can be formulated without any prior information on the measurement noise features, while accounting for the possible mismatch between the outputs predicted from noisy data and the true ones. To derive the explicit predictive law for this robust formulation, let us introduce the extended optimization variable

ᾱ = [ α ; σ ],        (35)

and modify the constraint in (34b) as

[ u_{[−n,L−1]} ; y_{[−n,L−1]} ] = [ H_{L+n}(u^d)  0 ; H_{L+n}(y^d)  −I_{L+n} ] ᾱ = [ H̄_{L+n}(u^d) ; H̄_{L+n}(y^d) ] ᾱ,        (36)

accordingly. By redefining the matrices in (5) as

H̄^P_γ = [H̄_{L+n}(γ^d)]_{1:n},      H̄^F_γ = [H̄_{L+n}(γ^d)]_{n+1:L+n},        (37a)
H̄^T_γ = [H̄_{L+n}(γ^d)]_{L+1:L+n},  H̄^k_γ = [H̄_{L+n}(γ^d)]_{k+1},        (37b)

where γ is still a placeholder, we can retrieve the explicit predictive law by following the same steps described in Sections 3-3.2.
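A small sketch (ours) of the extended data matrices in (36)-(37); it reuses hankel_matrix, assumes the slack is stacked consistently with the p(L+n) output samples, and is only meant to illustrate the bookkeeping.

import numpy as np

def extended_hankel_blocks(ud, yd, L, n):
    m, p = ud.shape[1], yd.shape[1]
    Hu, Hy = hankel_matrix(ud, L + n), hankel_matrix(yd, L + n)
    n_sigma = p * (L + n)                                            # dimension of the slack sigma
    Hu_bar = np.hstack([Hu, np.zeros((Hu.shape[0], n_sigma))])       # input block of (36): [H  0]
    Hy_bar = np.hstack([Hy, -np.eye(n_sigma)])                       # output block of (36): [H  -I]
    return Hu_bar, Hy_bar    # partition as in (37) to rebuild W_d, c_d for the robust law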

6 Benchmark numerical examples

In this section, we analyze the performance of R-EDDPC on two benchmark examples. Initially, we consider the problem introduced in [3, Section 7.1], with the plant P modified so as to force the whole state to be measurable. This choice allows us to compare the performance of R-EDDPC with the one of E-DDPC. We then consider the example introduced in [7]. In this case, we juxtapose the results attained by R-EDDPC with the ones achieved by designing the explicit MPC law with the standard two-stage procedure, i.e., by identifying a state-space model of the system first. We stress that, independently of the considered example, R-EDDPC is always designed by relying on data only. All computations have been carried out on an Intel Core i7-7700HQ processor, running MATLAB 2019b.

6.1 SISO system with fully measurable state

Consider the system introduced in [3, Section 7.1], which is characterized by the following state dynamics:

x_{k+1} = [ 0.7326  −0.0861 ; 0.1722  0.9909 ] x_k + [ 0.0609 ; 0.0064 ] u_k,        (38)

and assume that its states are measurable. To construct the matrices characterizing R-EDDPC, we have fed the plant with a random input sequence of length N = 100, uniformly distributed in [−5, 5], so as to consider an experimental framework similar to the one introduced in [33]. The collected data are here corrupted by additive noise, i.e.,

y^d_k = x^d_k + v_k,

where v_k ∼ N(0, Υ), with Υ being a diagonal matrix chosen so as to yield an average Signal-to-Noise Ratio (SNR) over the two output channels equal to 20 dB.

Figure 1. SISO system with measurable state: comparison between the performance of the oracle explicit MPC, E-DDPC and R-EDDPC over a noiseless test. The oracle is obtained by using the true model of the system. (a) First state component; (b) second state component.

To compare R-EDDPC with E-DDPC, instead of solving (7), we explicitly solve a regularized version of the data-driven problem shown in (26), with weights chosen according to Theorem 3, namely

min_{α,u,x}  Σ_{k=0}^{L−1} [ ‖x_k‖²_Q + ‖u_k‖²_R ] + ‖x_L‖²_P + ρ_α ‖α‖²        (39a)
s.t.  [ u_{[−n,L−1]} ; x_{[−n,L−1]} ] = [ H_{L+n}(u^d) ; H_{L+n}(x^d) ] α,        (39b)
      [ u_{[−n,−1]} ; x_{[−n,−1]} ] = χ_0,        (39c)
      −2 ≤ u_k ≤ 2,  ∀k ∈ I_L,        (39d)

with L = 2, Q = I_2, R = 0.01 and P found by solving a data-driven Lyapunov equation, as explained in [16], and with the input feasibility constraint corresponding to the one considered in [33]. We stress that the explicit law for this alternative MPC formulation can still be retrieved as in Section 3.2, by properly augmenting W^ρ_d and reshaping χ in (8). The regularization parameter ρ_α is instead tuned via cross-validation, i.e., by selecting the value minimizing the cost in (39a) within a set of candidates. Notice that such a procedure requires a closed-loop experiment for each value of ρ_α to be assessed.

Table 1. SISO system with fully measurable state: SNR vs ρ_α and RMSE_O (mean ± standard deviation) over 30 Monte Carlo simulations for each noise level.

SNR [dB]    ρ_α (mean±std)    RMSE_O (mean±std)
40          0.7 ± 0.2         (0.3 ± 0.2)·10⁻²
30          2.0 ± 0.6         (1.0 ± 0.5)·10⁻²
20          6.2 ± 2.2         (2.7 ± 2.0)·10⁻²
10          16.0 ± 10.2       (9.8 ± 5.8)·10⁻²

In this work, cross-validation is performed by considering a set of noisy state/input samples of length N_cv = 100, gathered by feeding the plant with a new random input sequence uniformly distributed in [−5, 5]. This procedure leads to the choice of ρ_α = 5. Let us denote by oracle explicit controller the law obtained by using the actual model of P. Figure 1 reports the comparison between the responses attained with R-EDDPC, E-DDPC and the oracle explicit controller over a noiseless closed-loop test. Clearly, the difference between the three outcomes is generally negligible, with a slight discrepancy in the transient response that can be due to the different strategies employed to handle noise in R-EDDPC and E-DDPC. This result is in line with the expectations of Section 4.

We then assess the robustness of R-EDDPC to noisy data by considering 30 different realizations of the datasets used in cross-validation and another 30 realizations to construct the explicit law, for increasing levels of noise, for a total of 60 datasets for each noise level. We stress that a new hyper-parameter ρ_α is tuned for each of the 30 training sets. To assess the performance of the retrieved R-EDDPC, we consider the following indicator

RMSE_O = (1/2) Σ_{i=1}^{2} sqrt( (1/50) Σ_{t=0}^{49} ( [y_t]_i − [y*_t]_i )² ),        (40)

which compares R-EDDPC with the oracle explicit controller over the same noiseless test considered in Figure 1, with {y*_t}_{t=0}^{N_v−1} being the output resulting from the use of the oracle controller. Table 1 shows that both the sampled mean and the standard deviation of the indicator in (40) remain small. Instead, the values of the regularization parameter obtained via cross-validation increase to cope with the increasing noise level. This highlights that the hyper-parameter ρ_α can be actively exploited to improve the performance of the explicit predictive controller against noise. We remark that the indicators obtained when using the robustified approach presented in Section 5 are comparable to the ones in Table 1, when ρ_σ ≥ 100.
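The cross-validation loop described above reduces to a simple grid search; a sketch (ours) is given below, where closed_loop_cost is a user-supplied routine (hypothetical here) that builds the explicit law for a given ρ_α, runs the closed-loop validation experiment and returns the attained cost (39a).

import numpy as np

def tune_rho_alpha(candidates, closed_loop_cost):
    # Evaluate each candidate regularization weight and keep the best one.
    costs = [closed_loop_cost(rho) for rho in candidates]
    return candidates[int(np.argmin(costs))]

# Illustrative call (grid values are ours, not from the paper):
# rho_star = tune_rho_alpha(np.logspace(-2, 2, 9), closed_loop_cost)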

Table 2. Linearized four tank system: DD-PC [7] vs R-EDDPC. Average time τ [s] and worst-case time τ_wc [s] needed to find the optimal control action, and memory required for storage.

             τ [s]       τ_wc [s]    Memory [kB]
DD-PC [7]    1.9·10⁻²    1.2·10⁻¹    3302
R-EDDPC      8.5·10⁻⁷    2.7·10⁻⁵    2.1

6.2 Linearized four tank system

Consider now the following fourth-order system:

x_{k+1} = A x_k + B u_k + w_k,        (41a)
y_k = [ 1 0 0 0 ; 0 1 0 0 ] x_k + v_k,        (41b)

where

A = [ 0.921  0      0.041  0
      0      0.918  0      0.033
      0      0      0.924  0
      0      0      0      0.937 ],   B = [ 0.017  0.001
                                            0.001  0.023
                                            0      0.061
                                            0.072  0 ],

already considered in [7]. Differently from [7], the state evolution is conditioned by the process noise w_k ∼ N(0, ∆), where ∆ has been randomly chosen as

∆ = 10⁻³ · [ 10  1      2  3
             1   10.01  2  1.5
             2   2      3  4
             3   1.5    4  7 ]

(notice that positive definiteness is verified), while the measurements are affected by an additive noise v_k ∼ N(0, Υ), with Υ chosen equal to 5.76·10⁻⁴ I_2, for the average output Signal-to-Noise Ratio to be comparable to the one in [7]. Our goal is to force the inputs and outputs of the system to reach the equilibrium point

u^s = [ 1 ; 1 ],   y^s = [ 0.65 ; 0.77 ],

without requiring them to satisfy any value constraint over the prediction horizon. Since we share the same objective and specifications as [7], we design the explicit predictive controller by considering the robust formulation in (34) and by selecting the same parameters considered therein, i.e.,

L = 30,   Q = 3 I_2,   R = 10⁻⁴ I_2,   ρ_α = 0.1,   ρ_σ = 10³.
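For reproducibility, a sketch (ours) of the open-loop data-generation experiment used in this example, under the noise model stated above; the input range and length follow the description in the text, and all names are chosen for illustration.

import numpy as np

def simulate_plant(A, B, C, Delta, Upsilon, N, rng):
    # Open-loop simulation of (41): random input in [-1, 1], process noise w_k ~ N(0, Delta)
    # and measurement noise v_k ~ N(0, Upsilon); returns the (N, m) inputs and (N, p) outputs.
    n, m = B.shape
    p = C.shape[0]
    x = np.zeros(n)
    U = rng.uniform(-1.0, 1.0, size=(N, m))
    Y = np.zeros((N, p))
    for k in range(N):
        Y[k] = C @ x + rng.multivariate_normal(np.zeros(p), Upsilon)
        x = A @ x + B @ U[k] + rng.multivariate_normal(np.zeros(n), Delta)
    return U, Y

# Example: U, Y = simulate_plant(A, B, C, Delta, Upsilon, N=400, rng=np.random.default_rng(0))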

Table 3. Linearized four tank system: explicit MPC (E-MPC) vs R-EDDPC. Comparison over 30 Monte Carlo runs with respect to the percentage of unstable instances in closed loop and the values of J (mean ± standard deviation) in (42).

               Unstable runs [%]    J (mean±std)
E-MPC+N4SID    83 %                 81.94 ± 122.18
R-EDDPC        0 %                  9.00 ± 0.04

Analogously, we generate the data to construct the Hankel matrices in (4b) as in [7], by feeding the plant with a random input sequence of length N = 400, uniformly distributed within [−1, 1]. In this setting, we compare the responses attained with the implicit DD-PC in [7] (denoted as y^i_t) and R-EDDPC, by considering the following indicator:

RMSE_IE = (1/2) Σ_{j=1}^{2} sqrt( (1/N_v) Σ_{t=0}^{N_v−1} ( [y_t]_j − [y^i_t]_j )² ),

which allows us to assess the discrepancy in performance attained with the two controllers over a closed-loop test. Over a noiseless test of length N_v = 600, we obtain RMSE_IE = 3.4·10⁻⁷, which shows that the implicit and explicit laws coincide in terms of performance, as expected. Nonetheless, in this case R-EDDPC is far more convenient from a computational perspective and in terms of memory occupation with respect to its implicit counterpart. Indeed, as shown in Table 2, the average and the worst-case CPU times required to compute the optimal input, starting from 10⁴ randomly chosen values of χ_0 in (34c), are approximately 4 orders of magnitude smaller for R-EDDPC, with the latter occupying only 6% of the memory required to store all the matrices needed for the implicit solution of (34). While the first result is somehow expected, since with R-EDDPC the optimization problem is not solved in real-time and the optimal input can be computed via simple function evaluations, the result on memory occupation is mainly due to the features of the considered MPC problem. Indeed, due to the lack of value constraints in the considered problem, R-EDDPC is a linear law. Therefore, this advantage in terms of memory requirements might be lost if value constraints are included in the problem.

We now compare the performance attained by R-EDDPC with that achieved by designing an explicit predictive controller via the conventional two-phase strategy. By using the same data exploited to construct the Hankel matrices for R-EDDPC, we thus identify a model for the system in (41) via the N4SID approach [30, 40] and, then, we design a model-based predictive controller as in [3]. This two-stage solution is here denoted as E-MPC. Note that, since the latter needs the state of the system for the computation of the optimal input, E-MPC further requires the design of a Kalman filter [25] based on the identified model. In this work, this additional step is carried out under the assumption that an oracle provides us with the true covariance matrices of the process and measurement noise, so as to skip the Kalman filter design phase. We point out that the need for a state estimator already highlights an intrinsic advantage of R-EDDPC, which relies on input/output data only and, thus, solely involves the construction of the Hankel matrices to be designed and deployed.

Initially, we evaluate the robustness of R-EDDPC and E-MPC to noisy data. To this end, we consider 30 datasets, characterized by different realizations of the process and measurement noise in (41). For each of them, we identify a model of the system and then design the explicit MPC and the R-EDDPC laws. For each of the explicit controllers obtained with E-MPC and R-EDDPC, we then run the same noiseless test and evaluate the attained closed-loop performance through the following index:

J = Σ_{t=0}^{N_v−1} [ ‖y_t − y^s‖²_Q + ‖u_t − u^s‖²_R ],        (42)

where N_v = 600. Table 3 reports its mean value and standard deviation over stable closed-loop instances only. Clearly, R-EDDPC outperforms the standard model-based approach in terms of robustness with respect to different realizations of the training set, generally leading to better performance, as also confirmed by the results reported in Figure 2. This remarkable difference between the two approaches can be linked to the poor quality of the identified model, which indeed results in an average fit of 15% in validation, thus jeopardizing the performance of both E-MPC and the Kalman filter. On the other hand, avoiding a preliminary identification step, adding the regularization term and further robustifying R-EDDPC via the slack variable have proven to be a valid strategy to handle both measurement and process noise. We point out that similar results are also obtained by choosing alternative covariance matrices for the noise in (41) at random, while maintaining a similar Signal-to-Noise Ratio on the output.

Remark 5 (Case-study dependent conclusions) We wish to stress that the tests presented here, when performed with no process noise, have led to a much better performance of the identification procedure and, thus, of the model-based controller. It follows that this case study must be considered as a special case, nonetheless highlighting that it might be worthwhile to map the data directly onto the predictive controller, instead of first identifying a model of the system. Further research is needed to generalize such a statement.

Figure 2. Linearized four tank system: E-MPC with identified model (left panels) vs R-EDDPC (right panels). Comparison between the mean (line) and standard deviations (shaded areas) of the outputs resulting from the use of the two controllers, limited to stable closed-loop instances only. (a) First output; (b) second output. When looking at the performance of R-EDDPC, the standard deviation is negligible.

7 Conclusions

In this paper, we have presented a data-driven regularized explicit predictive controller (R-EDDPC), which

can be designed with a batch of properly generated input/output data. The effectiveness of R-EDDPC has been proven on two benchmark simulation examples, showing its correspondence with E-DDPC in [33]. The numerical results additionally show that R-EDDPC might outperform an explicit MPC relying on a poorly identified model of the system to be controlled. Due to the crucial role of the regularization parameter on the performance attained in closed loop, future works will be devoted to devising hyper-parameter tuning techniques not involving closed-loop experiments. In addition, future research will explore strategies to extend the proposed approach to handle nonlinear systems.

References

[1] A. Alessio and A. Bemporad. A Survey on Explicit Model Predictive Control, pages 345–369. Springer Berlin Heidelberg, Berlin, Heidelberg, 2009.

[2] A. Bemporad, K. Fukuda, and F.D. Torrisi. Convexity recognition of the union of polyhedra. Computational Geometry, 18(3):141–154, 2001.

[3] A. Bemporad, M. Morari, V. Dua, and E.N. Pistikopoulos. The explicit linear quadratic regulator for constrained systems. Automatica, 38(1):3–20, 2002.

[4] J. Berberich, J. Kohler, F. Allgower, and M.A. Muller. Dissipativity properties in constrained optimal control: A computational approach. Automatica, 114, 2020.

[5] J. Berberich, J. Kohler, M.A. Muller, and F. Allgower. On the design of terminal ingredients for data-driven MPC, 2021. arXiv:2101.05573.

[6] J. Berberich, J. Kohler, M.A. Muller, and F. Allgower. Data-driven tracking MPC for changing setpoints. IFAC-PapersOnLine, 53(2):6923–6930, 2020. 21st IFAC World Congress.

[7] J. Berberich, J. Kohler, M.A. Muller, and F. Allgower. Data-driven model predictive control with stability and robustness guarantees. IEEE Transactions on Automatic Control, 66(4):1702–1717, 2021.

[8] J. Bongard, J. Berberich, J. Kohler, and F. Allgower. Robust stability analysis of a simple data-driven model predictive control approach, 2021.

[9] V. Breschi, C. De Persis, S. Formentin, and P. Tesi. Direct data-driven model-reference control with Lyapunov stability guarantees, 2021. arXiv:2103.12663.

[10] V. Breschi and S. Formentin. Direct data-driven control with embedded anti-windup compensation. In Proceedings of the 2nd Conference on Learning for Dynamics and Control, volume 120 of Proceedings of Machine Learning Research, pages 46–54, 2020.

[11] V. Breschi and S. Formentin. Proper closed-loop specifications for data-driven model-reference control. IFAC-PapersOnLine, 54(9):46–51, 2021.

[12] V. Breschi, D. Masti, S. Formentin, and A. Bemporad. NAW-NET: neural anti-windup control for saturated nonlinear systems. In 2020 59th IEEE Conference on Decision and Control (CDC), pages 3335–3340, 2020.

[13] M.C. Campi, A. Lecchini, and S.M. Savaresi. Virtual reference feedback tuning: a direct method for the design of feedback controllers. Automatica, 38(8):1337–1346, 2002.

[14] J. Coulson, J. Lygeros, and F. Dorfler. Regularized and distributionally robust data-enabled predictive control. In 2019 IEEE 58th Conference on Decision and Control (CDC), pages 2696–2701, 2019.

[15] J. Coulson, J. Lygeros, and F. Dorfler. Distributionally robust chance constrained data-enabled predictive control. IEEE Transactions on Automatic Control, pages 1–1, 2021.

[16] C. De Persis and P. Tesi. Formulas for data-driven control: Stabilization, optimality, and robustness. IEEE Transactions on Automatic Control, 65(3):909–924, 2019.

[17] C. De Persis and P. Tesi. Low-complexity learning of linear quadratic regulators from noisy data. Automatica, 128, 2021.

[18] F. Dorfler, J. Coulson, and I. Markovsky. Bridging direct & indirect data-driven control formulations via regularizations and relaxations, 2021. arXiv:2101.01273.

[19] F. Dorfler, P. Tesi, and C. De Persis. On the certainty-equivalence approach to direct data-driven LQR design, 2021. arXiv:2109.06643.

[20] S. Formentin, M.C. Campi, A. Care, and S.M. Savaresi. Deterministic continuous-time virtual reference feedback tuning (VRFT) with application to PID design. Systems & Control Letters, 127:25–34, 2019.

[21] S. Formentin and A. Chiuso. Control-oriented regularization for linear system identification. Automatica, 127:109539, 2021.

[22] S. Formentin, S.M. Savaresi, and L. Del Re. Non-iterative direct data-driven controller tuning for multivariable systems: theory and application. IET Control Theory & Applications, 6(9):1250–1257, 2012.

[23] M. Gevers. Identification for control: From the early achievements to the revival of experiment design. European Journal of Control, 11(4):335–352, 2005.

[24] H. Hjalmarsson. Iterative feedback tuning—an overview. International Journal of Adaptive Control and Signal Processing, 16(5):373–395, 2002.

[25] R.E. Kalman. A New Approach to Linear Filtering and Prediction Problems. Journal of Basic Engineering, 82(1):35–45, 1960.

[26] A. Karimi, L. Miskovic, and D. Bonvin. Iterative correlation-based controller tuning. International Journal of Adaptive Control and Signal Processing, 18(8):645–664, 2004.

[27] L. Ljung. System Identification (2nd Ed.): Theory for the User. Prentice Hall PTR, USA, 1999.

[28] T. Martin and F. Allgower. Dissipativity verification with guarantees for polynomial systems from noisy input-state data. In 2021 American Control Conference (ACC), 2021.

[29] D. Masti, V. Breschi, S. Formentin, and A. Bemporad. Direct data-driven design of neural reference governors. In 2020 59th IEEE Conference on Decision and Control (CDC), pages 4955–4960, 2020.

[30] MATLAB System Identification Toolbox, 2019.

[31] J.M. Montenbruck and F. Allgower. Some problems arising in controller design from big data via input-output methods. In 2016 IEEE 55th Conference on Decision and Control (CDC), pages 6525–6530, 2016.

[32] D. Piga, S. Formentin, and A. Bemporad. Direct data-driven control of constrained systems. IEEE Transactions on Control Systems Technology, 26(4):1422–1429, 2018.

[33] A. Sassella, V. Breschi, and S. Formentin. Learning explicit predictive controllers: theory and applications, 2021. arXiv:2108.08412.

[34] D. Selvi, D. Piga, and A. Bemporad. Towards direct data-driven model-free design of optimal controllers. In 2018 European Control Conference (ECC), pages 2836–2841. IEEE, 2018.

[35] M. Sharf. On the sample complexity of data-driven inference of the L2-gain. IEEE Control Systems Letters, 4(4):904–909, 2020.

[36] M. Sharf, A. Koch, D. Zelazo, and F. Allgower. Model-free practical cooperative control for diffusively coupled systems. IEEE Transactions on Automatic Control, pages 1–1, 2021.

[37] M. Tanemura and S. Azuma. Efficient data-driven estimation of passivity properties. IEEE Control Systems Letters, 3(2):398–403, 2019.

[38] K. Van Heusden, A. Karimi, and D. Bonvin. Data-driven model reference control with asymptotically guaranteed stability. International Journal of Adaptive Control and Signal Processing, 25(4):331–351, 2011.

[39] M. van Meer, V. Breschi, T. Oomen, and S. Formentin. Direct data-driven design of LPV controllers with soft performance specifications. Journal of the Franklin Institute, 2021.

[40] P. Van Overschee and B. De Moor. N4SID: Subspace algorithms for the identification of combined deterministic-stochastic systems. Automatica, 30(1):75–93, 1994.

[41] J.C. Willems, P. Rapisarda, I. Markovsky, and B.L.M. De Moor. A note on persistency of excitation. Systems & Control Letters, 54(4):325–329, 2005.

Appendix A

A.1 Proof of Lemma 2

(i) Assume that the state is fully measurable and consider the predictive model in (21b). By propagating it over time, we can characterize the stack of predicted states x_{[1,L]} as follows:

x_{[1,L]} = [ Γ_d  Ξ_d ] [ u_{[0,L−1]} ; x_0 ],

where Γ_d and Ξ_d are defined as in [33, Appendix A]. Since u^d is persistently exciting of order L + 2n, there exists α ∈ R^{n_α} such that

x_{[1,L]} = [ Γ_d  Ξ_d ] [ H^F_u ; H^F_x ] α = H_L(x^+_d) α,

where x^+_d = {x^d_k}_{k=n+1}^{N+n}, H^F_u and H^F_x are defined as in (5), and the last equality stems from the definitions of Γ_d and Ξ_d.

Consider now the model in (4b) and decouple the initial state from the predicted ones, by relying on the decomposition in (5). Let us introduce the transformation T ∈ R^{Ln×L(m+p)} such that:

x_{[1,L]} = T [ u_{[0,L−1]} ; x_{[0,L−1]} ].

By premultiplying both sides of (4b) by T, we obtain

x_{[1,L]} = T [ H^F_u ; H^F_x ] α = H_L(x^+_d) α,

where we have exploited the properties of T and we have replaced the measured output with the state, since they coincide. Therefore, the predictive models in (4b) and (21b) are equivalent.

(ii) Assume that the state is not fully measurable. Let T be the transformation that allows one to reconstruct the predicted state sequence z_{[1,L]}, with z_k defined as in (25) for k = 1, ..., L, i.e.,

z_{[1,L]} = T [ u_{[−n,L−1]} ; y_{[−n,L−1]} ].

By premultiplying both sides of (4b) by this transformation matrix, we obtain:

z_{[1,L]} = T [ H_{L+n}(u^d) ; H_{L+n}(y^d) ] α = H_L(z^+_d) α,

where z^+_d = {z^d_k}_{k=n+1}^{N+n} and

z^d_k = [ (u^d_{k−n})'  ···  (u^d_{k−1})'  (y^d_{k−n})'  ···  (y^d_{k−1})' ]'.

Let us now recast the model in (21b) according to its input/output counterpart (see [16, Section VI]), i.e.,

z_{k+1} = Z_{1,N} [ U_{0,1,N} ; Z_{0,N} ]^† [ u_k ; z_k ],        (A.1)

where

Z_{0,N} = [ z^d_n  z^d_{n+1}  ···  z^d_{N+n−1} ],        (A.2)
Z_{1,N} = [ z^d_{n+1}  z^d_{n+2}  ···  z^d_{N+n} ].        (A.3)

By propagating this model over time, we can express the predicted state sequence as

z_{[1,L]} = [ Γ_d  Ξ_d ] [ u_{[0,L−1]} ; z_0 ] = [ Γ_d  Ξ_d ] [ H^F_u ; H^F_y ] α,

where we have exploited the definitions of H^F_u and H^F_y in (5) and the fact that the input sequence is persistently exciting of order L + 2n. Since Γ_d and Ξ_d embed the data-driven counterpart of the (unknown) model of P, it can be straightforwardly proven that the following holds:

z_{[1,L]} = [ Γ_d  Ξ_d ] [ H^F_u ; H^F_y ] α = H_L(z^+_d) α,

where z^+_d is defined as before, thus concluding the proof.

A.2 Proof of Lemma 3

Consider the initial conditions in (4c) and (21c). When the state is fully measured, we can reconstruct the initial state from input/state data as

x_0 = T [ u_{[−n,−1]} ; x_{[−n,−1]} ],

where T ∈ R^{n×(m+p)n} is the matrix characterizing this coordinate transformation. By premultiplying both sides of (4c) by T, it can be straightforwardly proven that the following holds:

x_0 = T χ_0 = x̄,

which corresponds to the initial condition imposed via (21c). Instead, when the state is not measured, we can rely on the non-minimal representation (25) to reformulate (21c) as:

z_0 = z̄.

Based on the definition of z_k in (25), it easily follows that (21c) is equal to (4c). As for the conditions that have to be satisfied for the feasibility constraints to hold, they stem straightforwardly from the definition of y_k when the state is fully measurable, and from that of z_k when y_k ≠ x_k, thus concluding the proof.

A.3 Proof of Theorem 3

The equivalence between the constraints of the problems in (21) and (26) follows from the results in Lemmas 2-3. The conditions on the weighting matrices can instead be proven as follows.

(i) When the state is fully measured, the transformation matrix T allows one to reconstruct the terminal cost characterizing problem (21a). Moreover, since the state is measured, we can replace y_k with x_k. Therefore, the equivalence of (26) and (21) can be straightforwardly verified by choosing the weighting matrices Q̄, R̄ and P̄ as indicated in the statement.

(ii) When the state is not fully measured, according to the non-minimal representation in (25), the problem in (21a) has to be modified as follows:

min_{u_{[0,L−1]}}  Σ_{k=0}^{L−1} [ ‖z_k‖²_{Q̄} + ‖u_k‖²_{R̄} ] + ‖z_L‖²_{P̄}        (A.5a)
s.t.  z_{k+1} = Z_{1,N} [ U_{0,1,N} ; Z_{0,N} ]^† [ u_k ; z_k ],   k = 0, ..., L−1,        (A.5b)
      z_0 = z̄,        (A.5c)
      u_k ∈ U,  z_k ∈ X,        (A.5d)

where Z_{0,N} and Z_{1,N} are defined as in (A.2) and (A.3), respectively. By substituting the matrices in (28) into (A.5), it can be easily seen that the cost J(u_{[0,L−1]}) in (A.5a) is equivalent to:

J(u_{[0,L−1]}) = ‖u_{−1}‖²_R + ‖y_{−1}‖²_Q + Σ_{k=0}^{L−1} [ ‖y_k‖²_Q + ‖u_k‖²_R ] + ‖ [ u_{[L−n,L−1]} ; y_{[L−n,L−1]} ] ‖²_P.

Since in (A.5) we optimize over u_{[0,L−1]}, the two initial terms in the cost can be neglected, thus concluding the proof.
