{% import macros as m with context %}
{% set week = 5 %}
{{ m.exercise_head(week) }}

This week's exercises will be slightly different, since we will focus on implementing a single method, namely the direct control method we saw during the lecture. Direct optimization is fundamentally about translating the control problem into an optimization problem of the form:

.. math::
    \mathbf{z}^* & = \min\left[ J(\mathbf{z}) \right] \\
    \mathbf z & = (\mathbf x_0, \mathbf u_0, \mathbf x_1, \mathbf u_1, \dots, \mathbf x_{N-1}, \mathbf u_{N-1}, t_0, t_F) \\
    \mathbf z_\text{lb} \leq \mathbf z & \leq \mathbf z_\text{ub} \\
    \mathbf x_k - \mathbf x_{k+1} + \frac{h_k}{2}(\mathbf f_{k+1} + \mathbf f_{k} ) & = 0

That is, a minimization problem subject to both linear and non-linear constraints.

.. note::
    :class: margin

    You can read more about the SciPy optimizer we use `here <https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html>`__.

This optimization problem is then fed into an optimizer, which gives us an answer in the form of a vector :math:`\mathbf{z}^*`. In our case, we will use the built-in non-linear optimizer :func:`scipy.optimize.minimize`. This optimizer is very powerful, but that also means that a good place to begin is to understand what sort of inputs it expects, and what outputs it gives in return. Consider the following minimization problem:

.. math::
    J(\mathbf z) & = z_1^2 + 3z_2^2 \\
    \mathbf{z}^* & = \min\left[ J(\mathbf{z}) \right] \\
    \begin{bmatrix} 0 \\ -0.5 \end{bmatrix} \leq \mathbf z & \leq \begin{bmatrix} 1 \\ 2 \end{bmatrix} \\
    1 - z_1 - 2 z_2 & \geq 0 \\
    1 - z_1^2 - z_2 & \geq 0 \\
    2 z_1 + z_2 - 1 & = 0

This problem can be solved by specifying the inequality and equality constraints as dictionaries and calling :func:`~scipy.optimize.minimize` as follows:

.. runblock:: pycon

    >>> import numpy as np
    >>> from scipy.optimize import Bounds, minimize
    >>> def J_fun(z):
    ...     return z[0]**2 + 3 * z[1] ** 2
    >>>
    >>> def J_jac(z):
    ...     return np.array([2 * z[0], 6 * z[1]])
    >>>
    >>> ineq_cons = {'type': 'ineq',
    ...              'fun': lambda x: np.array([1 - x[0] - 2 * x[1],
    ...                                         1 - x[0] ** 2 - x[1]]),
    ...              'jac': lambda x: np.array([[-1.0, -2.0],
    ...                                         [-2 * x[0], -1.0]])}
    >>>
    >>> eq_cons = {'type': 'eq',
    ...            'fun': lambda x: np.array([2 * x[0] + x[1] - 1]),
    ...            'jac': lambda x: np.array([2.0, 1.0])}
    >>>
    >>> z_lb, z_ub = [0, -0.5], [1.0, 2.0]  # lower and upper bounds.
    >>> z0 = np.array([0.5, 0])  # Initial guess
    >>> bounds = Bounds(z_lb, z_ub)
    >>> res = minimize(J_fun, z0, method='SLSQP', jac=J_jac, constraints=[eq_cons, ineq_cons], bounds=bounds)
    >>> res

As shown by the output, the optimal solution is :python:`res.x`. The minimize function can accept any number of non-linear equality and inequality constraints as a list (in this case it is given two), and the simple bound constraints as a :class:`scipy.optimize.Bounds`-object. Note that the equality and inequality constraints are specified as functions, for instance :python:`lambda x: np.array([2 * x[0] + x[1] - 1])`, and that we also need to specify the Jacobian of each constraint, i.e. for the equality constraint

.. math::
    f(\mathbf{z} ) & = 2z_1 + z_2 - 1 \\
    \nabla_{\mathbf z} f(\mathbf{z}) & = \begin{bmatrix} 2 \\ 1 \end{bmatrix}

Similarly, we need to specify the Jacobian of the objective function, :math:`\nabla J`. To make this feasible, our strategy will be to specify all equality and inequality constraints as symbolic objects using :python:`sympy` and then compute the Jacobians explicitly.
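To make this strategy concrete, here is a minimal sketch (with variable names of our own choosing) of how a single constraint and its Jacobian can be obtained symbolically with :python:`sympy` and turned back into the plain numerical functions that :func:`scipy.optimize.minimize` expects:

.. code-block:: python

    import sympy as sym

    # The equality constraint f(z) = 2 z_1 + z_2 - 1 from the example above,
    # but written as a symbolic expression.
    z1, z2 = sym.symbols('z1 z2')
    f = 2 * z1 + z2 - 1

    # sympy computes the Jacobian [df/dz1, df/dz2] for us.
    jac = sym.Matrix([f]).jacobian([z1, z2])   # Matrix([[2, 1]])

    # lambdify converts the symbolic objects into ordinary numerical
    # functions, which is the form scipy.optimize.minimize expects.
    f_fun = sym.lambdify([(z1, z2)], f)
    jac_fun = sym.lambdify([(z1, z2)], jac)
    print(f_fun([0.5, 0.0]), jac_fun([0.5, 0.0]))  # 0.0 and [[2 1]]

The exercise code follows the same recipe, just applied to whole lists of variables and constraints.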
That means that in your implementation:

- :math:`x_k, u_k, t_0, \dots` and all other variables are symbolic (sympy) variables
- :math:`\mathbf z` is therefore a :python:`list` of symbolic variables, corresponding to the layout above
- :math:`\mathbf z_\text{lb}, \mathbf z_\text{ub}` are lists of numbers
- :math:`\mathbf z_0` (the initial guess) is a list of numbers
- :math:`J(\mathbf{z})`, i.e. the objective function we minimize, is a symbolic expression depending on :math:`\mathbf{z}`
- The collocation constraints are defined as lists of symbolic expressions :python:`Ieq = [..., c_eq, ...]` with the convention that :math:`c_\text{eq} = 0`.
- Any non-linear inequality constraints are also defined as lists of symbolic expressions :python:`Iineq = [..., c_ineq, ...]` with the convention that :math:`c_\text{ineq} \leq 0`.

Your job is, in other words, to define the above variables (i.e., to transcribe the problem); feeding them into the optimizer will then happen automatically in the code. A small end-to-end sketch of such a transcription is shown below.
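The following sketch transcribes a toy problem similar to Problem 1 (scalar dynamics :math:`\dot x = u + \cos(x)`, :math:`N = 3`). The names and the simplified cost are our own, and the bookkeeping is done for you by the exercise code; this only illustrates what the lists above could look like:

.. code-block:: python

    import numpy as np
    import sympy as sym

    # Toy transcription: dynamics x' = f(x, u) = u + cos(x), N = 3 grid points.
    N = 3
    xs = sym.symbols('x0:3')   # symbolic states x_0, x_1, x_2
    us = sym.symbols('u0:3')   # symbolic actions u_0, u_1, u_2
    t0, tF = sym.symbols('t0 tF')

    def f(x, u):               # the dynamics as a symbolic expression
        return u + sym.cos(x)

    # Decision vector z = (x_0, u_0, x_1, u_1, x_2, u_2, t_0, t_F): symbolic.
    z = [v for xu in zip(xs, us) for v in xu] + [t0, tF]

    # Bounds and initial guess: plain lists of numbers.
    # Here: x_0 = 0, x_2 = pi/2, t_0 = 0 and 0 <= t_F <= 10.
    pi = float(np.pi)
    z_lb = [0, -np.inf, -np.inf, -np.inf, pi / 2, -np.inf, 0, 0]
    z_ub = [0,  np.inf,  np.inf,  np.inf, pi / 2,  np.inf, 0, 10]
    z0   = [0, 0, pi / 4, 0, pi / 2, 0, 0, 4]

    # Trapezoid collocation constraints: one symbolic expression per
    # interval, with the convention that each expression equals zero.
    h = (tF - t0) / (N - 1)
    Ieq = [xs[k] - xs[k + 1] + h / 2 * (f(xs[k + 1], us[k + 1]) + f(xs[k], us[k]))
           for k in range(N - 1)]

    # A (hypothetical, simplified) symbolic cost. Since it is symbolic,
    # the Jacobian that scipy needs comes for free:
    J = sum((2 * x - sym.pi / 2) ** 2 for x in xs)
    J_fun = sym.lambdify([z], J)                             # numeric J(z)
    J_jac = sym.lambdify([z], [sym.diff(J, v) for v in z])   # numeric gradient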
Classes and functions
-------------------------------------------------------------------------

.. autofunction:: irlc.ex05.direct.collocate

.. autofunction:: irlc.ex05.direct.trapezoid_interpolant

Solutions to selected exercises
-------------------------------------------------------------------------------------------------------

{% if show_solution[week] %}

.. admonition:: Solution to problem 1
    :class: dropdown

    **Part a+b:** The dimensions are the same. When :math:`N=3` the states are :math:`x_0, x_1, x_2` and the actions are :math:`u_0, u_1, u_2`. Since these are all 1-dimensional, :math:`\mathbf z` has 8 dimensions:

    .. math::
        \mathbf z = \begin{bmatrix} x_0 & u_0 & x_1 & u_1 & x_2 & u_2 & t_0 & t_F \end{bmatrix}

    The bound constraints are therefore:

    .. math::
        \mathbf z_{\text{lb} } & = \begin{bmatrix} 0 \\ -\infty \\ -\infty \\ -\infty \\ \frac{\pi}{2} \\ -\infty \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf z_{\text{ub} } = \begin{bmatrix} 0 \\ \infty \\ \infty \\ \infty \\ \frac{\pi}{2} \\ \infty \\ 0 \\ 10 \end{bmatrix}

    **Part c:** The two collocation constraints are, for :math:`k=0, 1` and using that :math:`f_k = u_k + \cos(x_k)`,

    .. math::
        \mathbf x_{k+1} - \mathbf x_k = \frac{t_F - t_0}{4} \left(u_{k+1} + \cos(x_{k+1} ) + u_{k} + \cos(x_{k} ) \right)

    Applying the constraints gives the desired result.

    **Part d:** According to trapezoid collocation, the cost function will take the general form:

    .. math::
        E(u_0, u_1, u_2) = \frac{c_0 + c_1}{2} + \frac{c_1 + c_2}{2} = \frac{c_0 + c_2}{2} + c_1.

    where:

    .. math::
        c_k = \frac{t_F - t_0}{2}\left( (2x_k - \frac{\pi}{2})^2 + \frac{\pi^2}{4} (2u_k + 2\cos(x_k) + 1 )^2 \right)

    If we insert the constraints :math:`x_0=u_0=u_2=0` and :math:`x_2=\frac{\pi}{2}` and simplify, we get the expression in the problem.

    **Part e:** The first step is to simplify the two constraints by (i) forming an equation where they are added together and (ii) forming an equation by subtracting one from the other. Doing this gives two new equations:

    .. math::
        \frac{\pi}{2} & = \frac{t_F}{4}(2 u_1 + 2\cos x_1 + 1) \\
        2x_1 - \frac{\pi}{2} & = \frac{t_F}{4}

    We observe that the right-hand side of the first equation corresponds to the second term in the cost-function, and the left-hand side of the second corresponds to the first term of the cost-function. Thus, we can use these two equations to eliminate all terms in the cost-function except :math:`t_F`. This means that

    .. math::
        E & = \frac{c_0 + c_2}{2} + c_1 \\
        \text{where} \quad \frac{c_0 + c_2}{2} & = \frac{3\pi^2}{4} t_F \\
        \text{and} \quad c_1 & = \frac{t_F}{2} \left[ \left( \frac{t_F}{4} \right)^2 + \frac{\pi^2}{4} \left( \frac{4 \frac{\pi}{2} }{ t_F} \right)^2 \right]

    Applying a bit of algebra and collecting terms gives us:

    .. math::
        E = \frac{t_F^3}{32} + t_F \frac{3\pi^2}{4} + \frac{\pi^4}{2}\frac{1}{t_F}

    To get the value of :math:`t_F` that minimizes this expression, we set the derivative equal to zero, :math:`E'(t_F)=0`. If we do that, we get:

    .. math::
        \frac{3}{32}t_F^2 + \frac{3\pi^2}{4} - \frac{\pi^4}{2} \frac{1}{t_F^2} = 0

    Whenever you encounter something like this, it is a good idea to see if it can be re-written as a polynomial. So we multiply by :math:`t_F^2` and get:

    .. math::
        \frac{3}{32}t_F^4 + \frac{3\pi^2}{4} t_F^2 - \frac{\pi^4}{2} = 0

    Polynomials containing only even powers have the nice property that they can be simplified by introducing :math:`y=t_F^2`. This gives a second-order polynomial in :math:`y`, which (after multiplying through by :math:`\frac{32}{3}`) reads :math:`y^2 + 8\pi^2 y - \frac{16\pi^4}{3} = 0` and can be solved with the quadratic formula. Keeping the positive root and carefully simplifying, we get:

    .. math::
        y = 4\pi^2 \left( \frac{2\sqrt{3} }{3} - 1 \right)

    Thus, the final answer is:

    .. math::
        t_F = 2\pi \sqrt{ \frac{2\sqrt{3} }{3} - 1 }.

    By using the constraint equations we could potentially recover :math:`u_1` and :math:`x_1`, but that is left as an exercise.

{% endif %}

{{ m.embed('https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=6fb73624-55b0-4457-8986-b125017d31ae', 'Problem 5.1', True) }}

.. warning::
    :class: margin

    I make a small mistake in the first video (a numpy ndarray is not flattened correctly when I define :python:`z0`). The problem is corrected in the video for Problem 5.4 below.

{{ m.embed('https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=76ab2ffd-9e98-453d-9b6f-b125017cec15', 'Problem 5.2', True) }}

{{ m.embed('https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=3fe9c035-57b1-4a96-8ff3-b125017cec15', 'Problem 5.3', True) }}

{{ m.embed('https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=0dc8fa60-2985-4d24-bfa4-b125017cec15', 'Problem 5.4', True) }}

{{ m.embed('https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=09ac11f3-463c-49e8-b28c-b125017cec0c', 'Problem 5.5', True) }}