## Dissertation and Thesis

• Optimization of Stochastic, Partially Observed Systems using Switched Policies and a Sampling-based Approach
Salvatore Candido
Doctoral Dissertation
University of Illinois, Urbana, Illinois, 2011

Abstract: We propose a new method for learning policies for large, partially observable Markov decision processes (POMDPs) that require long time horizons for planning. Computing optimal policies for POMDPs is an intractable problem and, in practice, dimensionality renders exact solutions essentially unreachable for even small real-world systems of interest. For this reason, we restrict the policies we learn to the class of switched belief-feedback policies in a manner that allows us to introduce domain expert knowledge into the planning process. This approach has worked well for the systems on which we have tested it, and we conjecture that it will be useful for many real-world systems of interest.

Our approach is based on a method like value iteration to learn a switching law. Because the POMDP problem is intractable, we use a Monte Carlo approximation to evaluate system behavior and optimize a switching law based on sampling. We explicitly analyze the sensitivity of expected cost (performance) with respect to perturbations introduced by our approximations, and use that analysis to avoid approximation errors that are potentially disastrous when using the computed policy. We demonstrate results on discrete POMDP problems from the literature and a resource allocation problem modeled after a team of robots attempting to extinguish a forest fire.

We then utilize our approach to build two algorithms that solve the minimum uncertainty robot navigation problem. We demonstrate in simulation that our approach can improve on techniques in the literature in terms of solution quality. Our second approach utilizes information-theoretic heuristics to drive the sampling-based learning procedure. We conjecture that this is a useful direction towards an efficient, general stochastic motion planning algorithm.
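
The sampling-based idea described above — evaluating candidate local policies by Monte Carlo rollouts and switching to whichever has the lowest estimated expected cost — can be sketched in a few lines. Everything below is illustrative: the simulator hook, the policy representation, and the horizon and sample-count values are hypothetical stand-ins, not the dissertation's implementation.

```python
import random

def rollout_cost(policy, belief, simulate_step, horizon, gamma=0.95):
    """Estimate the discounted cost of following one local policy from a
    state sampled out of the current belief.

    `simulate_step(state, action)` is a hypothetical black-box simulator
    returning (next_state, stage_cost); the model is never enumerated.
    """
    state = random.choices(list(belief), weights=belief.values())[0]
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        state, cost = simulate_step(state, action)
        total += discount * cost
        discount *= gamma
    return total

def choose_policy(local_policies, belief, simulate_step, horizon=20, n_samples=100):
    """Switching law: pick the local policy with the lowest Monte Carlo
    estimate of expected cost from the current belief."""
    def estimate(policy):
        return sum(rollout_cost(policy, belief, simulate_step, horizon)
                   for _ in range(n_samples)) / n_samples
    return min(local_policies, key=estimate)
```

In this sketch the approximation error the dissertation analyzes shows up directly: with too few samples, `estimate` can rank two local policies incorrectly and the switching law commits to the wrong one.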


@phdthesis{candido_switchedpomdp,
title = {Optimization of Stochastic, Partially Observed Systems using Switched Policies and a Sampling-based Approach},
author = {Salvatore Candido},
school = {University of Illinois at Urbana-Champaign},
year = {2011}
}

• Motion Planning for Bipedal Walking Robots
Salvatore Candido
Master's Thesis
University of Illinois, Urbana, Illinois, 2007

Abstract: In this thesis I consider the problem of motion planning for walking, bipedal robots, including humanoid robots. Planning for such robots is complicated by their many degrees of freedom and dynamic constraints, and the general motion planning problem is intractable; nevertheless, a simplified framework can be used to find approximate solutions to some planning problems. I discuss the difficulties of motion planning for walking robots and summarize previous research addressing the problem. I propose a hierarchical planning framework and outline and discuss two algorithms within that framework. I present results from implementations of these algorithms in simulation environments and discuss the future research required to turn this work into robust and viable motion planning software for deployment on mechanical robotic systems.


@mastersthesis{candido_motionplanninghumanoid,
title = {Motion Planning For Bipedal Walking Robots},
author = {Salvatore Candido},
school = {University of Illinois at Urbana-Champaign},
year = {2007}
}

## Journal Articles

• A Sampling-based Approach to Exploiting Domain Knowledge to Learn Policies for POMDPs
Salvatore Candido, James Davidson, and Seth Hutchinson
submitted to the Journal of Machine Learning Research, 2011 (under revision)
Abstract: We propose a new method for learning policies for large partially-observable Markov decision processes (POMDPs) that require long time horizons for planning. Computing optimal policies for POMDPs is an intractable problem and, in practice, dimensionality renders exact solutions essentially unreachable for even small real-world systems of interest. For this reason, we restrict the policies we learn to the class of switched belief-feedback policies in a manner that allows us to introduce domain expert knowledge into the planning process. This approach has worked well for the systems on which we have tested it, and we conjecture that it will be useful for many real-world systems of interest.

Our approach is based on a method like value iteration to learn a switching law. Because the POMDP problem is intractable, we use a Monte Carlo approximation to evaluate system behavior and optimize a switching law based on sampling. We explicitly analyze the sensitivity of expected cost (performance) with respect to perturbations introduced by our approximations, and use that analysis to avoid approximation errors that are potentially disastrous when using the computed policy. We demonstrate results on POMDP problems from the literature and a resource allocation problem modeled after a team of robots attempting to extinguish a forest fire.

• Control and Planning of 3D Dynamic Walking with Asymptotically Stable Gait Primitives
Robert D. Gregg, Adam K. Tilton, Salvatore Candido, Timothy Bretl, and Mark W. Spong
submitted to the IEEE Transactions on Robotics, 2010 (under revision)
Abstract: In this paper we present a hierarchical control framework that enables motion planning for three-dimensional bipedal dynamic walkers in the same way that planning is already possible for quasi-static walkers. This framework is based on the construction of asymptotically stable gait primitives for a class of hybrid dynamical systems with impacts. Each primitive corresponds to an asymptotically stable hybrid limit cycle that admits rules for sequential composition with other primitives, reducing a high-dimensional feedback motion planning problem to a low-dimensional discrete tree search. As an example, we construct asymptotically stable gait primitives for a 3D compass-gait biped using geometric reduction-based control, where each primitive corresponds to walking along an arc of constant curvature for a fixed number of steps. We apply a discrete search algorithm to plan a sequence of these primitives that takes the biped stably from start to goal in three-dimensional workspaces with obstacles.
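
The composition idea — reducing feedback motion planning to a discrete search over a small library of gait primitives — can be illustrated with a toy planner. The primitive set, the displacement motion model, and the tolerance below are invented for illustration; the paper's primitives are asymptotically stable hybrid limit cycles with composition rules, not simple pose displacements.

```python
import math
from collections import deque

# Hypothetical primitives: each represents walking an arc of constant
# curvature for a fixed number of steps, abstracted here as a net
# (forward distance, heading change) applied to a planar pose.
PRIMITIVES = {
    "straight":   (1.0,  0.0),
    "turn_left":  (0.8,  math.pi / 4),
    "turn_right": (0.8, -math.pi / 4),
}

def apply_primitive(pose, name):
    """Advance the (x, y, heading) pose by one primitive's net displacement."""
    x, y, theta = pose
    dist, dtheta = PRIMITIVES[name]
    theta += dtheta
    return (x + dist * math.cos(theta), y + dist * math.sin(theta), theta)

def plan(start, goal_xy, tol=0.5, max_depth=6):
    """Breadth-first search over primitive sequences; returns the first
    sequence whose endpoint lies within `tol` of the goal (obstacles and
    stability constraints are omitted from this sketch)."""
    queue = deque([(start, [])])
    while queue:
        pose, seq = queue.popleft()
        if math.hypot(pose[0] - goal_xy[0], pose[1] - goal_xy[1]) <= tol:
            return seq
        if len(seq) < max_depth:
            for name in PRIMITIVES:
                queue.append((apply_primitive(pose, name), seq + [name]))
    return None
```

Because each primitive is guaranteed stable on its own, a planner of this shape only has to search the low-dimensional discrete sequence space, which is the point of the framework.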

## Proceedings of Technical Meetings

• Minimum Uncertainty Robot Navigation using Information-guided POMDP Planning
Salvatore Candido and Seth Hutchinson
in the IEEE International Conference on Robotics and Automation, Shanghai, China, 2011

A ubiquitous problem in robotics is determining policies that move robots with uncertain process and observation models (partially-observed state systems) to a goal configuration while avoiding collision. We propose a new method to solve this minimum uncertainty navigation problem. We use a continuous partially-observable Markov decision process (POMDP) model and optimize an objective function that considers both probability of collision and uncertainty at the goal position. By using information-theoretic heuristics, we are able to find policies that are effective for both minimizing collisions and stopping near the goal configuration. We additionally introduce a filtering algorithm that tracks collision free trajectories and estimates the probability of collision.
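
The collision-probability term in such an objective can be approximated by simple particle propagation. The following is a generic Monte Carlo sketch under an invented additive-noise motion model and a caller-supplied obstacle test; it is not the paper's filtering algorithm.

```python
import random

def estimate_collision_probability(start, controls, in_collision,
                                   noise_std=0.05, n_particles=500, seed=0):
    """Propagate noisy particles along a nominal control sequence and
    report the fraction that ever enter an obstacle.

    `in_collision((x, y))` is a hypothetical obstacle test supplied by
    the caller; motion noise is i.i.d. Gaussian per step in this sketch.
    """
    rng = random.Random(seed)
    collisions = 0
    for _ in range(n_particles):
        x, y = start
        hit = False
        for ux, uy in controls:
            x += ux + rng.gauss(0.0, noise_std)
            y += uy + rng.gauss(0.0, noise_std)
            if in_collision((x, y)):
                hit = True
                break
        collisions += hit
    return collisions / n_particles
```

A planner can then trade this estimate off against a measure of terminal uncertainty when scoring candidate policies.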


@inproceedings{candido_icra2011,
title={Minimum Uncertainty Robot Navigation using Information-guided {POMDP} Planning},
author={Salvatore Candido and Seth Hutchinson},
booktitle={Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
year={2011},
month={May},
}

• Minimum Uncertainty Robot Path Planning using a POMDP Approach
Salvatore Candido and Seth Hutchinson
in the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Taipei, Taiwan, 2010

We propose a new minimum uncertainty planning technique for mobile robots localizing with beacons. We model the system as a partially-observable Markov decision process and use a sampling-based method in the belief space (the space of posterior probability density functions over the state space) to find a belief-feedback policy. This approach allows us to analyze the evolution of the belief more accurately, which can result in improved policies when common approximations do not model the true behavior of the system. We demonstrate that our method performs comparably to, and in certain cases better than, current methods in the literature.
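
The belief-space reasoning can be illustrated with one step of a discrete Bayes filter for a robot localizing against a beacon. The 1-D grid, deterministic motion, and Gaussian range likelihood below are simplifying assumptions for illustration; the paper works with posterior densities over a continuous state space.

```python
import math

def belief_update(belief, motion, measurement, beacon, meas_std=1.0):
    """One predict/correct step of a Bayes filter over a 1-D grid.

    belief: dict mapping cell index -> probability
    motion: integer cell shift, applied deterministically in this sketch
    measurement: observed distance to `beacon` (a cell index)
    """
    # Predict: shift probability mass by the commanded motion.
    predicted = {cell + motion: p for cell, p in belief.items()}
    # Correct: weight each cell by the Gaussian likelihood of the range reading.
    posterior = {}
    for cell, p in predicted.items():
        expected = abs(cell - beacon)
        likelihood = math.exp(-0.5 * ((measurement - expected) / meas_std) ** 2)
        posterior[cell] = p * likelihood
    total = sum(posterior.values())
    return {cell: p / total for cell, p in posterior.items()}
```

A belief-feedback policy maps distributions like the one returned here, rather than point state estimates, to actions, which is what lets the planner reason about how future measurements will sharpen or blur the belief.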


@inproceedings{candido_iros2010,
title={Minimum Uncertainty Robot Path Planning using a {POMDP} Approach},
author={Salvatore Candido and Seth Hutchinson},
booktitle={Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year={2010},
month={October},
pages={1408--1413},
}

• Exploiting Domain Knowledge in Planning for Uncertain Robot Systems Modeled as POMDPs
Salvatore Candido, James Davidson, and Seth Hutchinson
in the IEEE International Conference on Robotics and Automation (ICRA), Anchorage, USA, 2010

We propose a planning algorithm that allows user-supplied domain knowledge to be exploited in the synthesis of information-feedback policies for systems modeled as partially observable Markov decision processes (POMDPs). POMDP models, which are increasingly popular in the robotics literature, permit a planner to consider future-stage uncertainty in both the application of actions and sensor observations.

With our approach, domain experts inject specialized knowledge into the planning process by providing a set of local policies that are used as primitives by the planner. If the local policies are chosen appropriately, the planner can evaluate further into the future, even for large problems, which can lead to better overall policies at decreased computational cost. We use a structured approach to encode the provided domain knowledge into the value function approximation.

We demonstrate our approach on a multi-robot fire fighting problem, in which a team of robots cooperates to extinguish a spreading fire (modeled as a stochastic process). The state space for this problem is significantly larger than is typical in the POMDP literature, and the geometry of the problem allows for the application of an intuitive set of local policies, thus demonstrating the effectiveness of our approach.


@inproceedings{candido_icra2010,
title={Exploiting Domain Knowledge in Planning for Uncertain Robot Systems Modeled as {POMDP}s},
author={Salvatore Candido and James Davidson and Seth Hutchinson},
booktitle={Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
year={2010},
month={May},
pages={3596--3603},
}

• Detecting Intrusion Faults in Remotely Controlled Systems
Salvatore Candido and Seth Hutchinson
in the American Control Conference (ACC), St. Louis, USA, 2009

In this paper, we propose a method to detect an unauthorized control signal being sent to a remote-controlled system (termed an "intrusion fault" or "intrusion") despite attempts to conceal the intrusion. We propose adding a random perturbation to the control signal and using signal detection techniques to determine the presence of that signal in observations of the system. Detection of these perturbations indicates that an authorized or "trusted" operator is in control of the system. We analyze a worst case scenario (in terms of detection of the intrusion), discuss construction of signal detectors, and demonstrate our method through a simple example of a point robot with dynamics.
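
The detection idea can be sketched as a watermark-style correlation test: the trusted operator injects a pseudorandom perturbation known only to itself, and the detector correlates system observations with that known sequence. The signal model, sequence design, and threshold below are illustrative assumptions, not the paper's worst-case construction or exact detector.

```python
import random

def watermark(length, seed):
    """Pseudorandom +/-1 perturbation sequence; the seed is the shared
    secret between the trusted operator and the detector."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def correlation_detector(observed, key, threshold=0.5):
    """Normalized correlation of the observed signal with the known
    perturbation; a score above threshold indicates the trusted operator
    is in control, a low score flags a possible intrusion."""
    score = sum(o * k for o, k in zip(observed, key)) / len(key)
    return score > threshold
```

When the perturbation is present, the score concentrates near the perturbation's power; an intruder who does not know the secret sequence produces a score near zero, so the threshold separates the two hypotheses.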


@inproceedings{candido_acc2009,
title={Detecting intrusion faults in remotely controlled systems},
author={Salvatore Candido and Seth Hutchinson},
booktitle={Proceedings of the American Control Conference},
year={2009},
month={June},
pages={4968--4973},
}

• An Improved Hierarchical Motion Planner for Humanoid Robots
Salvatore Candido, Yong-Tae Kim, and Seth Hutchinson
in the IEEE-RAS International Conference on Humanoid Robots (Humanoids), Daejeon, Korea, 2008

Abstract: In our previous work, we proposed a hierarchical planner for bipedal and humanoid robots navigating complex environments, based on a motion primitives framework. In this paper, we extend and improve that planner by proposing a different approach for its global and subgoal components. We continue to use a workspace decomposition that consists of a passage map, obstacle map, gradient map, and local map. We verify our approach both in simulation and experimentally on a mechanical humanoid system.


@inproceedings{candido_humanoids2008,
title     = {An Improved Hierarchical Motion Planner for Humanoid Robots},
author    = {Salvatore Candido and Yong-Tae Kim and Seth Hutchinson},
booktitle = {Proceedings of the IEEE-RAS International Conference on Humanoid Robots},
year      = {2008},
month     = {December}
}

• Motion Strategies for Surveillance
Sourabh Bhattacharya, Salvatore Candido, and Seth Hutchinson
in Robotics: Science and Systems III (RSS), Atlanta, USA, 2007


@INPROCEEDINGS{bhattacharya_rss2007,
AUTHOR    = {S. Bhattacharya and S. Candido and S. Hutchinson},
TITLE     = {Motion Strategies for Surveillance},
BOOKTITLE = {Proceedings of Robotics: Science and Systems},
YEAR      = {2007},
MONTH     = {June}
}

• A Workspace Decomposition for Hierarchical Motion Planning with Humanoid Robots
Yong-Tae Kim, Salvatore Candido, and Seth Hutchinson
in the International Conference on Advanced Robotics (ICAR), Jeju, Korea, 2007