STP 425 – Stochastic Processes
Fall 2012
Instructor: Dr. Jay Taylor, office: PSA 447; phone: 965-2641; email: jtaylor@math.asu.edu
Time: Tuesdays and Thursdays 1:30–2:45
Location: PSA 306
Office Hours: Wednesdays 12:00–2:00 in PSA 447; also by appointment.
Text: Markov Decision Processes: Discrete Stochastic Dynamic Programming by Martin Puterman (Wiley, 2005).
Course Description: Our focus will be on a special class of stochastic processes known as Markov decision processes, which are used to model randomly evolving systems under the partial control of an external observer. At each time step, the observer chooses from a set of possible actions that influence how the system develops at the next step. Each action has a cost, and the observer's goal is to control the behavior of the process in the most cost-effective manner possible. Applications of Markov decision problems can be found in many disciplines, including economics, manufacturing, engineering, natural resources management and conservation, evolutionary game theory, epidemiology, and statistics. While studying Markov decision processes, we will also cover some of the basic theory of discrete- and continuous-time Markov chains.
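To make the control problem above concrete, here is a minimal sketch of a finite-horizon Markov decision process solved by backward induction (the method covered in Put 4.3–4.5). The two-state model, its rewards, and its transition probabilities are invented purely for illustration; they do not come from the course text.

```python
# Toy finite-horizon MDP: 2 states, 2 actions, solved by backward induction.
# All numbers below are made up for illustration.

# P[a][s][s2] = probability of moving from state s to s2 under action a.
P = {
    "a": [[0.9, 0.1], [0.4, 0.6]],
    "b": [[0.2, 0.8], [0.7, 0.3]],
}
# r[a][s] = immediate reward for taking action a in state s.
r = {
    "a": [1.0, 0.0],
    "b": [0.0, 2.0],
}

def backward_induction(horizon):
    """Return the optimal value function and the decision rule at each epoch."""
    n_states = 2
    V = [0.0] * n_states          # terminal values
    policy = []
    for _ in range(horizon):
        new_V = [0.0] * n_states
        decisions = [None] * n_states
        for s in range(n_states):
            # Pick the action maximizing immediate reward + expected future value.
            best = max(
                P,
                key=lambda a: r[a][s] + sum(P[a][s][s2] * V[s2] for s2 in range(n_states)),
            )
            decisions[s] = best
            new_V[s] = r[best][s] + sum(P[best][s][s2] * V[s2] for s2 in range(n_states))
        V = new_V
        policy.append(decisions)
    policy.reverse()              # policy[t] is the decision rule at epoch t
    return V, policy

V, policy = backward_induction(horizon=3)
```

In this toy instance the optimal rule at every epoch is to take action "a" in state 0 and "b" in state 1; the same recursion, with costs in place of rewards and a minimum in place of the maximum, handles the cost-minimization framing used in the description above.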
Problem Sets: These will be announced in class and posted on the course web page at the following link. Solutions will also be posted on this page a few days after each set is collected. You are welcome to work in groups, but you should write up your solutions individually, and you should always give due credit if your solution came from another source, such as a textbook, an online resource, or one of your classmates. Also, you should attempt every problem and submit a description of your efforts even if you have been unable to solve it; partial credit will be given if there is evidence that you have seriously grappled with a problem. Lastly, your solutions should be neatly written and presented in an organized, easy-to-read fashion; in particular, if you have sloppy handwriting, you should consider typing your solutions.
Student Projects: In addition to the problem sets, each student will be expected to complete a small research project involving the application of Markov decision theory to a problem of interest. This could come from any of the areas mentioned in the course description or, indeed, any other area where stochastic models are used. You can work either individually or in pairs, although group projects will be expected to be more substantial than individual projects. A short (1–2 page) project proposal will be due on October 9, and a final report (7–15 pages) will be due on the last day of class (December 11).
Grades: The problem sets will collectively contribute 300 points to the final grade, while the project proposal will be worth 20 points and the final report will be worth 80 points. Of the 400 points possible, 360 points or more will guarantee an A, 320 points or more will guarantee a B, 280 points or more will guarantee a C, and 240 points or more will guarantee a D.
Lecture notes: These will be posted as they are written. For reference, my lecture notes on Probability Theory can be found here.
Date      Topic                                                                        Reading
23 Aug    Overview                                                                     Put 1.1–1.7
28 Aug    Discrete-time Markov chains: formulation and examples
30 Aug    Markov chains: Class Structure; Stationary Distributions
4 Sept    Markov Decision Problems: Formulations                                       Put 2.1–2.2
6 Sept    Two-state Markov Decision Processes; Single-Product Inventory Control        Put 3.1–3.2
11 Sept   Deterministic Dynamic Programs; Optimal Stopping                             Put 3.3–3.4
13 Sept   Controlled Discrete-time Dynamical Systems; Bandit Models                    Put 3.5–3.6
18 Sept   Finite-Horizon MDPs: Optimality Criteria; Policy Evaluation                  Put 4.1–4.2
20 Sept   FH MDPs: Optimality Equations; Deterministic Policies; Backward Induction    Put 4.3–4.5
25 Sept   FH MDPs: Examples                                                            Put 4.6
27 Sept   FH MDPs: Monotone Policies                                                   Put 4.7
2 Oct     Infinite-Horizon MDPs: Value; Expected Total Reward                          Put 5.1–5.3
4 Oct     Infinite-Horizon MDPs: Value; Expected Total Reward                          Put 5.1–5.3
9 Oct     IH MDPs: Optimality Criteria; Markov Policies                                Put 5.4–5.6
11 Oct    IH MDPs: Optimality Criteria; Markov Policies                                Put 5.4–5.6
16 Oct    Fall break
18 Oct    Discounted MDPs: Policy Evaluation                                           Put 6.1
23 Oct    Discounted MDPs: Policy Evaluation                                           Put 6.1
25 Oct    Discounted MDPs: Optimality Equations                                        Put 6.2
30 Oct    Discounted MDPs: Optimality Equations                                        Put 6.2
1 Nov     Discounted MDPs: Value Iteration                                             Put 6.3
6 Nov     Discounted MDPs: Value Iteration                                             Put 6.3
8 Nov     Discounted MDPs: Value Iteration                                             Put 6.3
13 Nov    Discounted MDPs: Policy Iteration                                            Put 6.4
15 Nov    Discounted MDPs: Policy Iteration                                            Put 6.4
20 Nov    Discounted MDPs: Policy Iteration                                            Put 6.4
22 Nov    Thanksgiving – no class
27 Nov    Discounted MDPs: Modified Policy Iteration                                   Put 6.5
29 Nov    Discounted MDPs: Action Elimination                                          Put 6.7
4 Dec     Dynamic State Variable Models: Energy Allocation in Stochastic Environments
6 Dec     Dynamic State Variable Models: Optimal Sampling of Unfamiliar Prey
11 Dec    Dynamic State Variable Models                                                Taylor notes