{% import macros as m with context %} {% set week = 7 %} {{ m.exercise_head(week) }} This week is concerned with linarization, that is, how to (approximately!) turn a non-linear discete dynamical system of the form: .. math:: \mathbf x_{k+1} = \mathbf f_k(\mathbf x_k, \mathbf u_k) into a linear dynamical system suitable for LQR: .. math:: \mathbf x_{k+1} = A \mathbf x_k + B \mathbf u_k + \mathbf d_k. The procedure described in the notes assume we are interested in states and actions that are close to an expansion point :math:`\bar{ \mathbf{x} }_k` and :math:`\bar{ \mathbf{u} }_k` and then approximate the dynamics using a Taylor expansion, see (:cite:t:`herlau`, Section 17.1). .. math:: \mathbf f_k(\mathbf x_k, \mathbf u_k) \approx \mathbf f_k(\bar {\mathbf{x} }, \bar {\mathbf{u} }) + \underbrace{\frac{\partial {\mathbf f}_k}{\partial \mathbf x} (\bar {\mathbf x}, \bar {\mathbf u})}_{=A} ( {\mathbf x}_k - \bar {\mathbf x} ) + \underbrace{\frac{\partial \mathbf f_k}{\partial \mathbf u} (\bar {\mathbf x}, \bar {\mathbf u} )}_{=B} (\mathbf u_k - \bar {\mathbf u}) Thus we can re-arrange this expression to yield: .. math:: \mathbf x_{k+1} & = A \mathbf x_k + B \mathbf u_k + \mathbf d_k \\ \mathbf d_k & = \mathbf f_k(\bar {\mathbf x}, \bar {\mathbf u}) - A \bar {\mathbf x} - B \bar {\mathbf u} To implement this, we need to be able to compute the terms above. As an example, we will use the Pendulum-model we have seen several times during the course. In our example we will assume that :math:`\bar{\mathbf{x} } = \begin{bmatrix} 0 \\ 1 \\ 0.5\end{bmatrix}` and :math:`\bar{\mathbf{u}} = \begin{bmatrix} 0 \end{bmatrix}` We can compute :math:`f_k(\bar{\mathbf x}_{k}, \bar{\mathbf u}_{k})` as follows: .. runblock:: pycon >>> from irlc.ex04.model_pendulum import DiscreteSinCosPendulumModel >>> import numpy as np >>> model = DiscreteSinCosPendulumModel() >>> x_bar = [0, 1, 0.5] # [cos(theta), sin(theta), d theta/dt] >>> u_bar = [1] # Get an arbitrary action >>> print("Compute f_k(xbar, ubar):", model.f(x_bar,u_bar)) Meanwhile the two Jacobians can be computed as follows: .. runblock:: pycon >>> from irlc.ex04.model_pendulum import DiscreteSinCosPendulumModel >>> import numpy as np >>> model = DiscreteSinCosPendulumModel() >>> x_bar = [0, 1, 0.5] # [cos(theta), sin(theta), d theta/dt] >>> u_bar = [1] # Get an arbitrary action >>> A, B = model.f_jacobian(x_bar,u_bar) >>> print(A) >>> print(B) See also :ref:`ex4`. Iterative LQR ------------------------------------------------------------------------------------------------------- The iterative LQR exercise closely follow the pseudo code in ... In the following is an overview of the various functions that are provided as part of the exercise: The cost and dynamic approximation ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ In iterative LQR the cost-function will be approximated using a second-order Taylor expansion. In other words, we assume a cost-function of the standard form: .. math:: c_N(\mathbf x_N) + \sum_{k=0}^{N-1} c_k(\mathbf x_k, \mathbf u_k) .. note:: :class: margin See (:cite:t:`herlau`, Subequation 17.8) for a definition of these expressions. This cost-function can then be taylor expanded around expansion points :math:`\bar {\mathbf x}_k, \bar {\mathbf u}_k` to get a quadratic approximation of the form: .. math:: c_k(\delta \mathbf x_k, \delta \mathbf u_k) & = \frac{1}{2} \delta \mathbf x_k^\top c_{\mathbf x\mathbf x,k} \delta \mathbf x_k + c_{\mathbf x,k}^\top \delta \mathbf x_k + \frac{1}{2} \delta \mathbf u_k^\top c_{\mathbf u\mathbf u,k} \delta \mathbf u_k + c_{\mathbf u,k}^\top \delta \mathbf u_k + \delta \mathbf u_k^\top c_{\mathbf u\mathbf x,k} \delta \mathbf x_k + c_k \\ c_N(\delta \mathbf x_N) & = \frac{1}{2} \delta \mathbf x_N^\top c_{\mathbf x\mathbf x,N} \delta \mathbf x_N + c_{\mathbf x,N}^\top \delta \mathbf x_N + c_k This should be compared to the form of the cost-function the :func:`~irlc.ex06.dlqr.LQR` algorithm assumes which is: .. math:: \frac{1}{2}\mathbf x_N^\top Q_N \mathbf{x}_N + \frac{1}{2}\mathbf q_N^\top \mathbf{x}_N + q_{N} + \sum_{k=0}^{N-1} \left[ \frac{1}{2} \mathbf{x}_k^\top Q_k \mathbf{x}_k + \frac{1}{2} \mathbf{u}_k^\top R_k \mathbf{u}_k + \cdots \right] So we can identify these forms by assuming that e.g. :math:`Q_N \equiv c_{\mathbf x \mathbf x, N}` and so on. Within the code all of these cost terms will be represented as follows: - :python:`c` = :math:`\begin{bmatrix} c_0, c_1, \dots, c_{N-1} c_N\end{bmatrix} = \begin{bmatrix} q_0 & q_1 & \dots & q_N \end{bmatrix}` - :python:`c_x` = :math:`\begin{bmatrix} c_{\mathbf x, 0} & c_{\mathbf x, 1} & \dots & c_{\mathbb x, N-1} & c_{\mathbf x, N} \end{bmatrix}` - :python:`c_xx` = :math:`\begin{bmatrix} c_{ {\mathbf xx}, 0} & c_{ {\mathbf xx}, 1} & \dots & c_{ {\mathbf xx}, N-1} & c_{ {\mathbf xx}, N} \end{bmatrix}` Meanwhile, the dynamics is of the form :math:`\mathbf x_{k+1} = A_k \mathbf x_k + B_k \mathbf u_k` where: - :python:`A` = :math:`\begin{bmatrix} A_0 & A_1 & \dots & A_{N-1} \end{bmatrix}` - :python:`B` = :math:`\begin{bmatrix} B_0 & B_1 & \dots & B_{N-1} \end{bmatrix}` Classes and functions ---------------------------------------------------------------------------------- .. autofunction:: irlc.ex07.ilqr.backward_pass .. autofunction:: irlc.ex07.ilqr.cost_of_trajectory .. autofunction:: irlc.ex07.ilqr.get_derivatives .. autofunction:: irlc.ex07.ilqr.forward_pass Solutions to selected exercises ------------------------------------------------------------------------------------------------------- {% if show_solution[week] %} .. admonition:: Solution to problem 1 :class: dropdown **Part a:** The dynamics is: .. math:: x_{k+1} = x_k + \Delta f(x_k, u_k) = x_k(1 + \Delta 4u_k) = f_k(x_k, u_k) **Part b:** The derivative give us .. math:: A & = \frac{\partial f}{\partial x} = 1 + 4\Delta = 3\\ B & = \frac{\partial f}{\partial u} = 4\Delta \bar x = 2\bar x Finally we compute :math:`d` to be: .. math:: d = f_k(\bar x, \bar u) - A \bar x - B \bar u = \bar x(1+4\Delta) - 3 \bar x - 2\bar x = -2 \bar x So therefore: .. math:: x_{k+1} \approx 3x_k + 2\bar x u_k - 2\bar x Note that we just applied the procedure outlined at the top of this page. {% endif %} {{ m.embed('https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=72924f5b-6617-4f9b-8362-b1330169acbc', 'Problem 7.1', True) }}