M3C 2017 Homework 2

Due date/time: 16/11/2017, 23:59 GMT

Getting Started: If needed, update your local copy of the course repo (sync and pull). The homework/hw2/ directory contains the files, hw2_template.py, hw2_template.f90, and cost.f90. Make a new directory outside of the course repo, and place dev copies of the template files named in this new directory along with hw2_module. Ultimately, the final files that you submit should be titled hw2.py, hw2.f90, and cost.f90. Browse through the three files; the template files contain a number of function/subroutine headers, and you will have to add code as described below. First, however, you should add your name and 8-digit college id to the docstring/comment at the top of each file.

You will implement and analyze a few optimization methods for solving problems of the form:

Find \(\mathbf{x} \in \mathbb{R}^n\) such that \(\mathcal{J}(\mathbf{x})\) is minimized.

Note: As you work through the tasks below, you may add module variables and additional subroutines and functions as needed. Do not remove module variables or modify input/output of existing routines without first asking for permission.


1.(10 pts) Orientation. Browse through cost.f90. This module will be used/imported by both hw2_dev.f90 and hw2_dev.py. The module contains a number of module variables as well as subroutines for evaluating a cost function and adding noise to the cost. Now, consider the case when \(\mathbf{x} \in \mathbb{R}^2\). Add code to visualize in hw2_dev.py to generate well-designed contour (or surface) plots of this cost function both with and without added noise. In the former case, set the noise amplitude to be 1. Try \(-10 \le x_1,x_2 \le 10\) to begin with and feel free to adjust these values to customize your figures. Save the two figures as hw211.png and hw212.png and submit them with your codes. Add code in the name==main to call visualize and generate these figures.

2. (25 pts) Newton’s method You will now develop code to implement Newton’s method to find the minimum of \(\mathcal{J}\) as defined by costj. Recall that Newton’s method is an iterative scheme which, given a guess for the location of the minimum, \(\tilde{\mathbf{x}}\), computes a step, \(\mathbf{h}\), towards an improved guess using:

\[H(\tilde{\mathbf{x}}) \mathbf{h} = - \mathbf{g}(\tilde{\mathbf{x}})\]

Where \(H\) and \(g\) are the Hessian and gradient of the cost function, respectively.

i) Complete the subroutine newton in hw2_dev.f90 to implement this scheme. You should assume that the input variable, xg, has an arbitrary (but sane) dimension, and iterations should stop after itermax iterations or when \(\frac{|\mathcal{J}_i - \mathcal{J}_{i-1}|}{max(|\mathcal{J}_i)|,|\mathcal{J}_{i-1}|,1)} \le tol\), whichever comes first (\(\mathcal{J}_i\) is the cost after the ith iteration, see also convergence_check in the template file). The estimated location of the minimum after iterations have terminated should be stored in the output variable, xf, with the corresponding cost in jf.

Note: You may develop your code assuming that the grad and hess routines in cost return correct values for \(n>2\) (though they do not), or you may add the appropriate code. It is the implementation in newton that must be correct.

ii) Modify newton to store the path from xguess to xf in the module variable xpath. Complete the python function newton_test to call newton and, when display is True, generate 1-2 figures illustrating the rate of convergence of the method. The function should return the computed minimum and its location. Add code in the name==main section wich uses newton_test to investigate how its convergence depends on the initial guess. Generate up to 3 figures and submit them with your assignment naming them hw22x.png with x=1,2,3. A few representative initial guesses is sufficient; an exhaustive survey of the parameter space is not needed.

3. (40 pts) Bracket descent Consider the following bracket descent algorithm for finding the minimum of a cost function when \(\mathbf{x} \in \mathbb{R}^2\).

a) Having set \(xguess = (x0_1,x0_2)\), construct an equilateral triangle with xguess as its centroid and with one vertex at \((x0_1,x0_2+1)\). We will iteratively adjust these vertices to systematically decrease the values of the cost function at the vertices.

b) Label the three vertices as \(v_a,v_b,v_c\), and the cost function evaluated at each vertex as \(J_a,J_b,J_c\) with \(J_a \ge J_b \ge J_c\). We will now try to replace \(v_a\) with a new and ‘better’ vertex. Let the midpoint of \(l_{bc}\) be \(\mathbf{x}_m\) where \(l_{bc}\) is the line segment between \(v_b\) and \(v_c\), and let the line segment between \(v_a\) and \(\mathbf{x}_m\) be \(l^*\).

c) Rotate \(l^*\) 180 degrees about \(\mathbf{x}_m\) (in the \(x_1 - x_2\) plane) and evaluate the cost, \(J^*\) at the endpoint ( \(\mathbf{x}^*\)) of the rotated line segment which is outside of the triangle. The next step depends on the value of \(J^*\).

1) If \(J^*\) is smaller than the costs at the current vertices, extend the rotated line segment away from the triangle so that its length is doubled, and evaluate \(J^{**}\) at the new endpoint, \(:\mathbf{x}^{**}\). If \(J^{**}<J^*\), replace \(v_a\) with \(\mathbf{x}^{**}\) ; otherwise, replace \(v_a\) with \(\mathbf{x}^*\). Then return to step (b).

2) if \(J_c \le J^* \le J_a\), replace \(v_a\) with \(\mathbf{x}^*\) and return to step b)

3) if \(J^*\) is larger than \(J_a\) then shrink the rotated line segment towards the triangle such that its length is halved, and evaluate \(J^{**}\) at the new endpoint, \(:\mathbf{x}^{**}\).

  1. If \(J^{**}< J_a\), replace \(v_a\) with \(\mathbf{x}^{**}\), and return to step (b).
  2. Otherwise, rotate the shrunk line segement 180 degrees about \(\mathbf{x}_m\) and evaluate \(J^{**}\) at the new endpoint, \(\mathbf{x}^{**}\).

I) If \(J^{**}< J_a\), replace \(v_a\) with \(\mathbf{x}^{**}\), and return to step (b).

II) Otherwise, move \(v_a\) and \(v_b\) to the midpoints of \(l_{ac}\) and \(l_{bc}\), respectively, and return to step b)

Note: During rotation, extension, and contraction, one endpoint of \(l^*\) remains fixed at \(\mathbf{x}_m\).

This procedure terminates after either itermax iterations or when \(|J_a|+|J_b|+|J_c|\) satisfies the convergence check described in question 2, whichever comes first.

i) Add code to bracket_descent in hw2_dev.f90 to implement this algorithm with the cost function given in costj. The guess for the location of the minimum is provided as input, and the final computed minimum and its location should be stored in the output variables jf and xf.

ii) Store the location of the triangle’s centroid at each iteration (including the intial guess) in xpath with the corresponding costs stored in jpath. Add code to bracket_descent_test in your Python file to analyze how closely each step of the triangle’s centroid corresponds to a Newton step (computed as in question 2) and what implications (if any) the observed trends have on performance. Generate 1-2 appropriate figures in the function and submit them with your assignment (with names hw231.png, hw232.png). Add code in the name==main section to generate these figures, and discuss+explain the trends shown in the figures in the function’s docstring.

4. (25 pts) Performance and design

i) Add code to performance in hw2_dev.py to assess the performance of your bracket descent code and scipy’s l-bfgs-b minimizer for the cost function (without added noise) in costj. Compare the performance of the two methods, and consider different values of the initial guess setting \(x_2=-3\) and varying \(x_1=-100,-50,-10,-1\). Create 1-3 figures displaying the most important trends. Concisely explain these trends – why they are important and what is causing them – in the function docstring. You may make modifications to the bracket-descent scheme (in a new Fortran subroutine) if desired, but clearly explain any such changes in the perfomance docstring.

ii) Analyze the influence of noise on the performance of these two methods by adjusting the module variable, c_noise_amp. Again make 1-2 figures displaying the most important trends and discuss/explain the trends in the function docstring.

For both parts i) and ii): Save and submit your figures following the hw24x.png naming convention. Add code in the name==main section to call performance and generate the figures**

iii) Describe 3 distinct features that could be (or have been) added to your Python code to improve its suitability for use as scientific software (e.g. something that could be used and developed further by another scientifically-literate person). You do not need to add these features to your code.

Notes: 1. All figures created by your code should be well-made and properly labeled. The title of each figure should include your name and the name of the function which created it.

2. Marking will primarily focus on the functionality of your code, i.e. does it produce the right answer(s) and function as required? However, points may be deducted if there are substantial inefficiencies in the code, e.g. if it is ludicrously slow.

Submitting the assignment

To submit the assignment for assessment, go to the course blackboard page, and click on the “Assignments” link on the left-hand side of the page, and then click on Homework 2. Click on “Write Submission” and add the statement: “This is all my own unaided work unless stated otherwise.” If you are a CDT student, add a 2nd line with “CDT Fluids”, or “CDT TSM” as appropriate.

Click on “Browse My Computer” and upload your final files, cost.f90, hw2.f90, hw2.py, hw2xx.png. Finally, click “Submit”.