Introduction to reinforcement learning and control theory

This page contains material and information related to the spring 2025 version of the course Introduction to reinforcement learning and control, offered at DTU.

If you are thinking about taking the course, you can read more about the course here or look at the Pre-requisites. If you are enrolled and just starting out, begin with the Installation. You can find the exercises and project descriptions in the menu to the left.

I have uploaded previous exam sets and practical information.

Practicalities

Note

This page is continuously updated to fix typos and make other adjustments. I therefore recommend bookmarking it and using the newest version of the exercises.

Time and place: Building B341, auditorium 21, 08:00–12:00
DTU Learn: 02465
Exercise code: https://lab.compute.dtu.dk/02465material/02465students.git
Course descriptions: kurser.dtu.dk
Lecture recordings: panopto.dtu.dk
Discord: Discord channel (invitation link)
ChatTutor AI help: chattutor.dk signup link
DTU python support: pythonsupport.dtu.dk
Contact: Tue Herlau, tuhe@dtu.dk.

Course schedule

The schedule and reading can be found below. Click on the titles to read the exercise and project descriptions.

#	Date	Title	Reading	Homework	Exercise	Slides
	Jan 31st, 2025	Installation and self-test	Chapter 1-3, [Her25]		[PDF]
1	Feb 7th, 2025	The finite-horizon decision problem	Chapter 4, [Her25]	1, 2	[PDF]	[1x] [6x]
2	Feb 14th, 2025	Dynamic Programming	Chapter 5-6.2, [Her25]	1	[PDF]	[1x] [6x]
3	Feb 21st, 2025	DP reformulations and introduction to Control	Section 6.3; Chapter 10-11, [Her25]	1, 2	[PDF]	[1x] [6x]
4	Feb 28th, 2025	Discretization and PID control	Chapter 12-14, [Her25]	1, 2	[PDF]	[1x] [6x]
	Mar 6th, 2025	Project 1: Dynamic Programming
5	Mar 7th, 2025	Direct methods and control by optimization	Chapter 15, [Her25]	1	[PDF]	[1x] [6x]
6	Mar 14th, 2025	Linear-quadratic problems in control	Chapter 16, [Her25]	1	[PDF]	[1x] [6x]
7	Mar 21st, 2025	Linearization and iterative LQR	Chapter 17, [Her25]	1	[PDF]	[1x] [6x]
8	Mar 28th, 2025	Exploration and Bandits	Chapter 1; Chapter 2-2.7; 2.9-2.10, [SB18]	1	[PDF]	[1x] [6x]
	Apr 3rd, 2025	Project 2: Control theory
9	Apr 4th, 2025	Bellman's equations and exact planning	Chapter 3; 4, [SB18]	1, 2	[PDF]	[1x] [6x]
10	Apr 11th, 2025	Monte Carlo methods and TD learning	Chapter 5-5.4+5.10; 6-6.3, [SB18]	1	[PDF]	[1x] [6x]
		🥚 🐤 Easter Holiday 🐤 🥚	🎮	🏖️	🍹
11	Apr 25th, 2025	Model-Free Control with tabular and linear methods	Chapter 6.4-6.5; 7-7.2; 9-9.3; 10.1, [SB18]	1	[PDF]	[1x] [6x]
12	May 2nd, 2025	Eligibility traces	Chapter 10.2; 12-12.7, [SB18]	1	[PDF]	[1x] [6x]
	May 8th, 2025	Project 3: Reinforcement Learning
13	May 9th, 2025	Deep Q-learning	Chapter 6.7-6.9; 8-8.4; 16-16.2; 16.5; 16.6, [SB18]	1	[PDF]	[1x] [6x]

You can find the course reading material further down on this page. The exam QA slides are available here. Details about the exam QA session will be announced on DTU Learn.

Note

Chapters 1–3 are background information about Python and are therefore not part of the main course content (pensum). Knowledge of Python is required for the exams.
The Homework column lists the problems from the exercise PDF sheets (see the table above) that will be discussed during class. They are also marked in the margin of the exercises. I encourage you to prepare them at home and present your solution during the exercise session.

Exercise sessions

Hint

I will upload solutions to the programming problems on GitLab.

The teaching assistants will be available Fridays 10:00–12:00 after the lecture.

Location	Instructor	Email
Building B341, auditorium 21	Tue Herlau	tuhe@dtu.dk
Building B341, IT-015	Adam Bøttcher Haupt-Hansen	s224202@student.dtu.dk
Building B341, IT-015	Marius Emil Thornit	s224217@student.dtu.dk
Building B341, IT-019	Nikolaj Severin Stæhr Hertz	s214644@student.dtu.dk

For the exercises, you are encouraged to prepare the homework problems at home (see the course schedule above) and present your solution during the exercise session.

Reading material

The two books referenced in the course syllabus are available here:

[Her25]: Sequential Decision-Making
[SB18]: Reinforcement Learning: An Introduction (2020) (Author's homepage)

Additional reading material

The following references are mentioned in the course as background information but are not part of the course syllabus.

Reference name	Download
(Tassa et al. [TET12])	tassa2012.pdf
(Kelly [Kel17])	kelly2017.pdf
(Herlau et al. [HMS24])	02450Book.pdf

Bibliography

[Her25]

Tue Herlau. Sequential decision making. (Freely available online), 2025. URL: https://www2.compute.dtu.dk/courses/02465/#reading-material.

[HMS24]

Tue Herlau, Morten Mørup, and Mikkel N. Schmidt. Introduction to Machine Learning and Data Mining. 02450 Lecture notes, 2024. (Freely available online). URL: https://www2.compute.dtu.dk/courses/02465/#reading-material.

[Kel17]

Matthew Kelly. An introduction to trajectory optimization: how to do your own direct collocation. SIAM Review, 59(4):849–904, 2017. (See kelly2017.pdf). URL: https://epubs.siam.org/doi/pdf/10.1137/16M1062569.

[SB18]

Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. The MIT Press, second edition, 2018. (Freely available online). URL: https://www2.compute.dtu.dk/courses/02465/#reading-material.

[TET12]

Yuval Tassa, Tom Erez, and Emanuel Todorov. Synthesis and stabilization of complex behaviors through online trajectory optimization. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 4906–4913. IEEE, 2012. (See tassa2012.pdf). URL: https://ieeexplore.ieee.org/abstract/document/6386025.

>>> from datetime import datetime
>>> print("This page was last updated at:", datetime.now().strftime("%d/%m/%Y %H:%M:%S"))
This page was last updated at: 18/06/2025 18:28:17