Stephanie Milani

Computer Science (B.S.), Psychology (B.A.)
“R-AMDP: Model-Based Learning for Abstract Markov Decision Process Hierarchies”

Decision-making agents face immensely challenging planning problems when operating in large environments to solve complex tasks. A hierarchy of abstract Markov decision processes (AMDPs) provides a framework for decomposing such problems into distinct, related subtasks. AMDP hierarchies grant considerable speedup over related recursively and hierarchically optimal methods such as MAXQ and options. Each AMDP serves as a subgoal, and each is itself a planning problem with a local model and state space abstracted from a ground MDP. Agents are able to plan more efficiently by using a reduced state space at the appropriate level of abstraction; however, they require their subtask models to be specified by a human expert [5]. We describe an approach for automating model estimation by combining the R-MAX algorithm with AMDPs. We compare the resulting structures, R-AMDPs, with a similar approach, RMAXQ, and motivate their advantages. Ultimately, R-AMDPs represent the first step in learning AMDP hierarchies dynamically, completely from an agent’s experience.
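
As a rough illustration of the kind of model estimation R-MAX performs, the sketch below keeps visit counts for each state-action pair and treats any pair seen fewer than a threshold number of times as "unknown", giving it the maximum reward so that a planner is drawn to explore it. This is a generic tabular simplification, not the R-AMDP implementation: the class name `RMaxModel`, the parameters `r_max` and `known_threshold`, and the optimistic self-loop (used here in place of R-MAX's fictitious absorbing state) are all assumptions made for the example. One plausible reading of the abstract is that each AMDP in a hierarchy would keep a model like this over its own abstract states and actions.

```python
from collections import defaultdict


class RMaxModel:
    """Tabular R-MAX-style model estimate (illustrative sketch only)."""

    def __init__(self, r_max=1.0, known_threshold=3):
        self.r_max = r_max                      # optimistic reward for unknown pairs
        self.known_threshold = known_threshold  # visits needed before trusting data
        self.visits = defaultdict(int)          # (state, action) -> visit count
        self.reward_sum = defaultdict(float)    # (state, action) -> summed reward
        # (state, action) -> {next_state: count}
        self.next_counts = defaultdict(lambda: defaultdict(int))

    def update(self, s, a, r, s_next):
        """Record one observed transition (s, a) -> s_next with reward r."""
        self.visits[(s, a)] += 1
        self.reward_sum[(s, a)] += r
        self.next_counts[(s, a)][s_next] += 1

    def is_known(self, s, a):
        """A pair is 'known' once it has been tried often enough."""
        return self.visits[(s, a)] >= self.known_threshold

    def reward(self, s, a):
        """Empirical mean reward if known, otherwise the optimistic r_max."""
        if not self.is_known(s, a):
            return self.r_max
        return self.reward_sum[(s, a)] / self.visits[(s, a)]

    def transitions(self, s, a):
        """Empirical next-state distribution if known, else a self-loop."""
        if not self.is_known(s, a):
            return {s: 1.0}
        n = self.visits[(s, a)]
        return {s2: c / n for s2, c in self.next_counts[(s, a)].items()}
```

A planner would call `reward` and `transitions` in place of a hand-specified model; because unknown pairs look maximally rewarding, planning against the estimate pushes the agent toward untried actions until enough data has been collected.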

This material is based upon work supported by the National Science Foundation under Grant No. IIS-1426452, and by DARPA under grants W911NF-15-1-0503 and D15AP00102.

What research experiences have you had?
I have been involved with the Multi-Agent Planning and Learning (MAPLE) lab since April 2016. This winter, under the guidance of the P.I., Dr. Marie desJardins, I collaborated with other students to research Abstract Markov Decision Processes (AMDPs). My tasks were to design an AMDP hierarchy and develop an ontology for composite objects for a house building domain. This work required me to learn about Markov Decision Processes (MDPs), partially observable Markov Decision Processes (POMDPs), Bayesian reinforcement learning (RL), and hierarchical model-based RL to make decisions about structuring the AMDP. My previous experience researching the role of dopamine and endocannabinoids in RL in rats at the University of Maryland School of Medicine provided me with a good theoretical foundation for understanding learning and planning from a computational perspective. This semester, I am collaborating with another student to implement a full AMDP for house building to test whether an agent, using various planning algorithms, can successfully plan to construct suitable houses.

How did you find the research opportunity?
I took Dr. desJardins’ honors seminar about the sources and effects of complexity in natural and artificial systems and fell in love with computer science! I wanted to do something that integrated both of my degrees, so the natural choice seemed to be Artificial Intelligence. I went to her lab meeting and the rest is history.


Who is your mentor for your research, scholarship, or artistic project? (give full name and department) How did you arrange to work with this person?
Dr. Marie desJardins, Associate Dean of Academic Affairs for the College of Engineering and Information Technology, Professor of Computer Science in the Department of Computer Science and Electrical Engineering. I also work under Shawn Squire and John Winder, two of the graduate students in my lab.

Do you get course credit for this work? Paid? How much time do you put into it?
I do not get course credit and I do not get paid. I spend about 7 hours per week on MAPLE research.

What academic background did you have before you started?
Before I started, I had only taken one computer science class, but a lot of psychology and gender and women’s studies classes. I had also taken a few courses in Africana studies, Mathematics, Linguistics, Interdisciplinary Studies, Biology, and Chemistry.

How did you learn what you needed to know to be successful in this project?
The graduate students and Dr. desJardins provided clear instructions and expectations. I also felt comfortable asking questions if I was unsure of something and asking for a task if I felt like I was starting to drift.

What was the hardest part about your research?
The most difficult part was coming into research in a specialized field with virtually no computer science background. I had to put a lot of effort into understanding enough background information to begin making meaningful contributions. It’s also hard to make time for everything you want to do!

What was the most unexpected thing?
How much learning about learning and planning from a computational perspective has helped me learn and plan better! It’s not a completely surprising concept, but the degree to which it has helped certainly surprised me.

What is your advice to other students about getting involved in research?
Most importantly: learn how to manage your time so you can make meaningful contributions in everything you do. Start early and get organized, if you aren’t already. Ask questions if you are unclear about something, but also expect to figure out a lot of things yourself.

What are your career goals?
I want to do research in Artificial Intelligence or a related field. I want to continue with outreach in computer science education.

What else are you involved in on campus?
I am the Vice President of the Computer Science Education club, a member of the Retriever Robotics Club, a CWIT affiliate, and a Research Assistant for CS Matters in Maryland. I am also in the Honors College. Previously, I was the Secretary of QUMBC.