STP 425 – Stochastic Processes

Fall 2012

Instructor: Dr. Jay Taylor, office: PSA 447; phone: 965-2641; e-mail: jtaylor@math.asu.edu
Time: Tuesdays and Thursdays 1:30-2:45
Location: PSA 306
Office Hours: Wednesdays 12:00-2:00 in PSA 447; also by appointment.
Text: Markov Decision Processes: Discrete Stochastic Dynamic Programming by Martin L. Puterman (Wiley, 2005).

Course Description: Our focus will be on a special class of stochastic processes known as Markov decision processes, which are used to model randomly evolving systems that are under the partial control of an external observer. At each time step, the observer chooses from a set of possible actions that influence how the system evolves at the next step. Each action has a cost, and the observer's goal is to control the behavior of the process as cost-effectively as possible. Applications of Markov decision processes can be found in many disciplines, including economics, manufacturing, engineering, natural resources management and conservation, evolutionary game theory, epidemiology, and statistics. While studying Markov decision processes, we will also cover some of the basic theory of discrete- and continuous-time Markov chains.
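
To make the setup concrete, here is a minimal sketch, written in Python with invented transition probabilities and costs (it is not an example from the text), of a two-state Markov decision problem solved by value iteration, one of the algorithms we will study in the second half of the semester:

    import numpy as np

    # A toy Markov decision problem with two states (0 = working, 1 = broken)
    # and two actions.  All of the numbers below are hypothetical, chosen only
    # to illustrate the ingredients: P[a][s, t] is the probability of moving
    # from state s to state t under action a, and c[a][s] is the immediate
    # cost of taking action a in state s.
    P = {"wait":   np.array([[0.9,  0.1],
                             [0.4,  0.6]]),
         "repair": np.array([[0.99, 0.01],
                             [0.9,  0.1]])}
    c = {"wait":   np.array([0.0, 5.0]),   # waiting is free, but state 1 is costly
         "repair": np.array([2.0, 7.0])}   # repairing adds an up-front cost
    beta = 0.95                            # discount factor

    # Value iteration: repeatedly apply the Bellman optimality operator
    #   v(s) <- min_a [ c[a](s) + beta * sum_t P[a](s, t) v(t) ]
    v = np.zeros(2)                        # initial guess for the value function
    for _ in range(1000):
        q = {a: c[a] + beta * P[a] @ v for a in P}   # expected cost of each action
        v_new = np.minimum(q["wait"], q["repair"])   # act greedily in each state
        converged = np.max(np.abs(v_new - v)) < 1e-10
        v = v_new
        if converged:
            break

    policy = {s: min(P, key=lambda a: q[a][s]) for s in (0, 1)}
    print("optimal expected discounted costs:", v)
    print("optimal stationary policy:", policy)

Because the problem is discounted (beta < 1), the Bellman operator is a contraction, so the iteration converges to the optimal value function from any starting guess; this is precisely the theory developed in Chapter 6 of Puterman.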

Problem Sets: These will be announced in class and posted on the course web page. Solutions will also be posted there a few days after each set is collected. You are welcome to work in groups, but you should write up your solutions individually, and you should always give due credit if a solution came from another source, such as a textbook, an online resource, or one of your classmates. You should attempt every problem and submit a description of your efforts even if you are unable to solve one; partial credit will be given if there is evidence that you have seriously grappled with a problem. Lastly, your solutions should be neatly written and presented in an organized, easy-to-read fashion; in particular, if you have sloppy handwriting, you should consider typing your solutions.

Student Projects: In addition to the problem sets, each student will be expected to complete a small research project involving the application of Markov decision theory to a problem of interest. This could come from any of the areas mentioned in the course description or, indeed, any other area where stochastic models are used. You may work individually or in pairs, although group projects will be expected to be more substantial than individual projects. A short (1-2 page) project proposal will be due on October 9, and a final report (7-15 pages) will be due on the last day of class (December 11).

Grades: The problem sets will collectively contribute 300 points to the final grade, the project proposal will be worth 20 points, and the final report will be worth 80 points. Of the 400 points possible, 360 points or more (90%) will guarantee an A, 320 or more (80%) a B, 280 or more (70%) a C, and 240 or more (60%) a D.

Lecture notes: These will be posted as they are written. For reference, my lecture notes on Probability Theory can be found here.


Schedule: Readings labeled "Put" refer to sections of the Puterman text; "Taylor" refers to my lecture notes.

Date     Topic                                                                                  Reading
23 Aug   Overview                                                                               Put 1.1 – 1.7
28 Aug   Discrete-time Markov chains: formulation and examples                                  Taylor 2.1
30 Aug   Markov chains: class structure; stationary distributions                               Taylor 2.2
4 Sept   Markov decision problems: formulations                                                 Put 2.1 – 2.2
6 Sept   Two-state Markov decision processes; single-product inventory control                  Put 3.1 – 3.2
11 Sept  Deterministic dynamic programs; optimal stopping                                       Put 3.3 – 3.4
13 Sept  Controlled discrete-time dynamical systems; bandit models                              Put 3.5 – 3.6
18 Sept  Finite-horizon MDPs: optimality criteria; policy evaluation                            Put 4.1 – 4.2
20 Sept  Finite-horizon MDPs: optimality equations; deterministic policies; backward induction  Put 4.3 – 4.5
25 Sept  Finite-horizon MDPs: examples                                                          Put 4.6
27 Sept  Finite-horizon MDPs: monotone policies                                                 Put 4.7
2 Oct    Infinite-horizon MDPs: value; expected total reward                                    Put 5.1 – 5.3
4 Oct    Infinite-horizon MDPs: value; expected total reward                                    Put 5.1 – 5.3
9 Oct    Infinite-horizon MDPs: optimality criteria; Markov policies                            Put 5.4 – 5.6
11 Oct   Infinite-horizon MDPs: optimality criteria; Markov policies                            Put 5.4 – 5.6
16 Oct   Fall break
18 Oct   Discounted MDPs: policy evaluation                                                     Put 6.1
23 Oct   Discounted MDPs: policy evaluation                                                     Put 6.1
25 Oct   Discounted MDPs: optimality equations                                                  Put 6.2
30 Oct   Discounted MDPs: optimality equations                                                  Put 6.2
1 Nov    Discounted MDPs: value iteration                                                       Put 6.3
6 Nov    Discounted MDPs: value iteration                                                       Put 6.3
8 Nov    Discounted MDPs: value iteration                                                       Put 6.3
13 Nov   Discounted MDPs: policy iteration                                                      Put 6.4
15 Nov   Discounted MDPs: policy iteration                                                      Put 6.4
20 Nov   Discounted MDPs: policy iteration                                                      Put 6.4
22 Nov   Thanksgiving – no class
27 Nov   Discounted MDPs: modified policy iteration                                             Put 6.5
29 Nov   Discounted MDPs: action elimination                                                    Put 6.7
4 Dec    Dynamic state variable models: energy allocation in stochastic environments           Fischer et al. (2009)
6 Dec    Dynamic state variable models: optimal sampling of unfamiliar prey                     Sherratt (2011)
11 Dec   Dynamic state variable models                                                          Taylor notes