Manipulation planning using learned symbolic state abstractions


Robotics and Autonomous Systems 62 (2014) 355–365. http://dx.doi.org/10.1016/j.robot.2013.09.015


Richard Dearden*, Chris Burbridge
School of Computer Science, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
* Corresponding author. Tel.: +44 121 414 6687.

Highlights

• Robotic manipulation planning system.
• Goals specified symbolically so different geometric solutions can be found.
• Learned two-way mapping between symbolic and geometric states.
• Mapping allows a symbolic state to be extracted from a scene.
• Allows a plan to be translated into a sequence of geometric configurations.

Article history: Available online 11 October 2013

Keywords: Intelligent robots; Supervised learning; Automatic planning; Symbolic reasoning

Abstract

We present an approach for planning robotic manipulation tasks that uses a learned mapping between geometric states and logical predicates. Because manipulation planning involves both task-level and geometric reasoning, such a mapping is needed to convert between the two. Consider a robot tasked with putting several cups on a tray. The robot needs to find positions for all the objects, and may need to nest one cup inside another to get them all on the tray. This requires translating back and forth between symbolic states that the planner uses, such as stacked(cup1,cup2), and geometric states representing the positions and poses of the objects. We learn the mapping from labelled examples, and importantly learn a representation that can be used in both the forward (from geometric to symbolic) and reverse directions. This enables us to build symbolic representations of scenes the robot observes, but also to translate a desired symbolic state from a plan into a geometric state that the robot can achieve through manipulation. We also show how such a mapping can be used for efficient manipulation planning: the planner first plans symbolically, then applies the mapping to generate geometric positions that are then sent to a path planner.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

The vast majority of robots are currently controlled using hand-written programs. These programs are typically highly specialised to a single task, and include human-specified geometric information about the objects and behaviours being performed. This greatly limits the opportunities for reuse of the programs, and also makes creating new ones time consuming, as it involves a mixture of writing code and recording geometric states of the robot and the objects the robot is interacting with. In this paper, we investigate how we can use parts of existing hand-written programs as atomic actions to produce a higher-level interface to the robot—one where a user specifies desired goals and the robot reassembles the parts of existing programs to generate a new program to achieve the goal. In particular, we are interested in robot manipulation tasks; we will use tidying a house as a motivating example throughout the paper.

To allow the user to create new robot behaviours by specifying goals, we use an automatic planning approach. We represent the purpose of a fragment of one of the hand-built programs in terms of the preconditions that must be true to execute the code fragment and the effects it has on the state of the world. We refer to this representation as a planning action. The planner then searches for a sequence of these actions that will achieve the user-specified goal.

Planning is challenging for robotic manipulation tasks because they contain a mixture of symbolic and geometric constraints. For example, we can plan to tidy a table by placing cups and plates on a tray and then moving them to the kitchen to be cleaned, but to accomplish the task we need to reason both symbolically—how to achieve that no objects are on the table—and geometrically—where should I put each object, and how should I move the robot arm to achieve that? Many manipulation planning approaches (see Chapter 7 of [1] for an overview) assume that the task can be treated entirely as a geometric problem, with the challenge being to place all the objects in their desired positions.


However, this approach means that we are essentially providing part of the plan in advance. In our example, if one cup must be nested in another to get them all on the tray, this must be specified in the goal. By planning symbolically as well as geometrically, we make fewer assumptions about the domain information available to us, and can find a much richer set of plans for a task.

Planning with a mixture of symbolic and geometric states requires a set of symbolic predicates that correspond to geometric relationships. For example, we need to be able to express that a cup is touching the tray, which we write symbolically as touching(cup,tray), but this predicate corresponds to a set of geometric configurations of the two objects, and some way must be found to represent this if we are to generate geometries for which the predicate is true. Previous approaches (see, for example, [2]) have manually built functions to do this. However, since we have the example programs available to us, we take the approach of learning the mapping between symbolic predicates and geometric states. By simulating the example programs we can generate a number of geometric states, and through a combination of comments in the code and knowledge of the program semantics we can label them with the predicates that hold in each one. This removes the burden on users to create the mapping themselves. By selecting the right representation, we can also use it both to translate from geometric to symbolic states (so as to determine what the symbolic representation of the current state is) and vice versa (to generate a geometric state corresponding to some symbolic predicate in a plan). Because the training data come from the existing robot programs, the learned mapping will also naturally reflect any implicit constraints in the programs, for example ensuring that no objects are placed too close to the edge of a table.

We learn a kernel density estimate [3,4] from the training data for each predicate. This allows us not only to label unseen geometric states (the forward direction for the mapping), but also, via a hill-climbing search in the probability density function, to find geometric states that make predicates true with high probability (the backwards direction). We extend this to find geometric states for conjunctions of symbolic predicates and also – for backtracking – to find multiple, significantly different, geometric states corresponding to a symbolic state (this work was first published in [5]).

Because of the interactions between the symbolic and geometric parts of the planning problem, the ideal planning approach would be to interleave the two, with geometric planning potentially causing symbolic backtracking when it fails, and vice versa. Unfortunately, generating paths and geometric states is very time consuming [6], so this hybrid planning approach is often unacceptably slow due to the many geometric states and robot path plans generated during the search for a plan. To overcome this, we take a different approach, preferring to generate a complete plan at a purely symbolic level; we then translate that plan into a geometric one and generate paths to achieve the geometric configurations. If this process fails, we allow the system a limited amount of purely geometric backtracking – proposing new positions for the objects moved by the robot – before giving up and backtracking at the symbolic level to generate a different plan.

The structure of this paper is as follows.
In the next section, we discuss related work on both mapping between symbolic and geometric representations and on planning for robotic manipulation tasks. In Section 3, we describe the symbolic to geometric mapping in detail, and in Section 4 we present the planning algorithm that uses it. We present experiments to demonstrate the effectiveness of the approach in Section 5, and present our conclusions and future work in Section 6.

2. Related work

A number of planning systems have been developed that operate in a mixture of geometric and symbolic states and therefore need to map between them. Probably the best known is aSyMov [7], which solves tasks very similar to ours using a planner that combines a symbolic planner with a probabilistic roadmap [8] for geometric planning. In common with most of these approaches, the authors assume that they are given a mapping from geometric to symbolic states. In addition, they restrict the objects to certain fixed world positions and only consider a single symbolic–geometric predicate ‘‘on’’, so the translation from symbolic to geometric states is trivial.

Kaelbling et al. [2] have proposed an alternative approach to manipulation planning which does full hybrid planning but, to reduce the complexity of the problem, uses a hierarchical planner and interleaves planning with execution. As with the above approach, it again only seems to use a single symbolic predicate for a geometric concept. Their mapping from symbolic to geometric states is handled by geometric suggesters, which are hand-built functions for generating geometric states. Similar approaches have been proposed by Wolfe et al. [9] and Karlsson et al. [10], which both use a hierarchical task network planner to constrain the search space to make full hybrid planning more tractable. In [10], the authors use a uniform sampling approach to generate geometric states when backtracking, and in [6] the same authors extend this by using constraints to further reduce the amount of backtracking needed to find plans.

Other recent ‘‘fully hybrid’’ approaches include work on semantic attachment by Dornhege et al. [11,12], and on planning modulo theories by Gregory et al. [13]. In the semantic attachment approach, the symbolic planning language includes calls to external functions that check preconditions of actions or calculate their effects and may update a number of state variables. For a manipulation domain, as described in [12], the RRT planner used to compute arm motions, and the geometric grasp planner used to check if an object can be grasped, would be linked to the symbolic planner through semantic attachments. Similarly, the planning modulo theories approach uses specialised solvers for specific theories – path planning might again be an example – to assist the symbolic planner. In both cases, it is unclear whether the specialised solvers could be used in a general way to update our learned geometric predicates.

The more traditional approach to manipulation planning is surveyed in Chapter 7 of [1]. These approaches treat planning as a completely geometric problem of searching in the set of reachable geometric configurations to reach a goal. However, this requires a geometric goal to be specified, which means that alternative geometric solutions that would solve the same symbolic goal cannot be generated—for example, placing two cups side by side on a tray versus nesting the two cups on the tray.

A recent alternative proposed by Mösenlechner and Beetz [14] is to specify goals symbolically but evaluate the plan geometrically. The idea is to use a high-fidelity physics simulation to predict the effects of actions and a hand-built mapping from geometric to symbolic states. Planning is conducted by a forward search, with the effects of actions determined by simulating them and the mapping then used to update the symbolic state.
This has advantages in terms of robustness, since multiple simulator runs can be performed, but is potentially very expensive for complex plans.

The idea of learning a mapping from geometric to symbolic states has been considered before in the literature. However, what sets our approach apart from others is the idea of using the learned mapping in reverse, generating geometry from a symbolic description to enable symbolic planning operators to be utilised to produce geometric effects.


Jäkel et al. [15] present an approach to action learning and geometric constraint learning in the context of programming by demonstration. After segmenting logs of human-demonstrated actions, constraints on the possible motion of the robot arm are learned. However, constraints on object positions are not directly considered. Unlike our approach, they never need to use the mapping in the reverse direction, because there is no task-level planning in their system: the task is only for the robot to execute the learned actions in different geometric configurations. Abdo et al. [16] also look at action learning from demonstration, but unlike [15] they do learn a symbolic representation of the actions. Where we concentrate on learning the geometric predicates, they pre-define the feature space and focus on learning the preconditions and effects of the actions, a direction in which we plan to extend this work in future.

The recent work of Sjöö et al. [17,18] has also addressed spatial relationships. In [17], they consider the geometric meaning of on, and use hand-built rules to map from the geometric to the symbolic state. Although they go on to use the rules to construct a probability function for the on predicate, which gives them similar representational capabilities to the approach we take, the representation chosen does not generalise to other predicates, and requires hand construction. In [18], the same authors address the issue of hand-building rules, presenting a system that learns to classify object relationships by generating samples in simulation. While classification of several relations is successful, they are only interested in learning the mapping from geometric to symbolic states; using the mapping in reverse is not considered. Furthermore, the geometric state is described by a 93-dimensional feature vector comprising contact points, surface normals, and object poses, and the symbolic object relationships are learned from this feature vector. The high dimensionality means that very large amounts of training data are required to learn a good mapping, and it makes adapting their approach to compute the reverse mapping impractical. An approach similar to that in [18] is introduced by Rosman and Ramamoorthy in [19]. Again, contact points are used, this time between segmented point clouds. However, the use of K-nearest neighbours for classification again prevents using the classifier in reverse.

3. Symbolic to geometric mapping

We learn a model of the probability density function Pr(C | G), where C is a symbolic predicate, G is a geometric state, and Pr(C | G) is the probability that C holds in G. We represent the complete geometric state G as the position, scale, and rotation of the bounding box of every object in the scene, so, for a scene with n objects, G is a vector of size 10n, as described in Section 3.1. While bounding boxes provide a much coarser object representation than, for example, a mesh of each object, the simplicity of this representation keeps the dimensionality small, so learning is possible. As we describe below, we further reduce the dimensionality by learning in the space of relative poses where possible.

Given an entire symbolic state S = C_1 ∧ C_2 ∧ · · · ∧ C_n, and assuming that the probability of each predicate is independent,¹ the probability of it holding in a given geometric state can be calculated as

Pr(S | G) = \prod_{i=1}^{n} Pr(C_i | G).    (1)


Our aim is to learn a classifier capable of producing class probability estimates from a number of labelled positive and negative examples of (G, C) pairs, as discussed above. Most importantly, however, we choose both a probability distribution function (PDF) model and a geometric state representation that allow optimising the probability and turning a given symbolic state representation back into an actual geometric state. The training examples could be hand-labelled geometric states, but, as we said above, they can also be extracted from suitably annotated programs. We do this by adding an annotation after each action which specifies what symbolic effect the action has, so, after an action that grasps a mug and lifts it, we annotate the code with

#E (holds mug1_0), (not (touching mug1_0 table))

where #E indicates this is an effect annotation. At the start of the program, we require a similar annotation to record the initial state. By simulating the execution of the program we can then quickly generate a sequence of geometric states with corresponding sets of predicate labels.
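To make this concrete, the following is a minimal sketch of how labelled (G, C) training pairs could be harvested from such annotated programs run in simulation. The #E format follows the paper; the simulator interface (simulator.execute, simulator.geometric_state) and all function names are illustrative assumptions, not the authors' code.

import re

EFFECT_RE = re.compile(r"#E\s+(.*)")

def parse_effects(line):
    # Return the predicate clauses from an effect annotation line, e.g.
    # "#E (holds mug1_0), (not (touching mug1_0 table))"
    m = EFFECT_RE.search(line)
    if not m:
        return []
    return [c.strip() for c in re.split(r",\s*(?=\()", m.group(1))]

def collect_training_data(program_lines, simulator):
    # Execute a program line by line, pairing the simulated geometric state
    # with the set of predicates known to hold after each annotated action.
    state_labels = set()
    examples = []
    for line in program_lines:
        effects = parse_effects(line)
        if effects:
            for clause in effects:
                if clause.startswith("(not "):
                    state_labels.discard(clause[5:-1].strip())
                else:
                    state_labels.add(clause)
            examples.append((simulator.geometric_state(), frozenset(state_labels)))
        else:
            simulator.execute(line)
    return examples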



3.1. Geometric state representation

For a symbolic predicate C of two objects A and B, we define G_C, the three-dimensional geometric state representation relevant to C, as

G_C = [S_A, S_B, T_B^A, R_A, R_B],    (2)

where S_o and R_o are the size and rotation of object o, respectively, and T_B^A is the relative translation between objects A and B (computed from the absolute positions in G). Using the difference in position between the objects rather than absolute positions provides better generalisation from the examples. However, the same cannot be done with the rotation, as many predicates require specific orientations (pouring, for example). We represent the rotations using quaternions, taking the geometric state vector to 17 dimensions.

Similarly, for a predicate C with a single argument, such as (upside-down ?x), the representation is ten dimensional:



G_C = [S, P, R],    (3)

where S and R are the size and rotation of the object as before, and P is its absolute position.² We need to use the absolute position both because we will be optimising it when translating a symbolic state to a geometric one, and to support the learning of predicates which are position dependent, such as (reachable ?x), representing that the arm can reach the object.

The size of each object is expressed as the dimensions of its bounding box. This has two important advantages. First, it allows us to generalise from one object to another, so we can learn the predicate (above ?x ?y) from examples of (above cup tray). Second, it keeps the dimensionality of the state space for learning sufficiently small that we can learn without requiring an extremely large training set.
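As a concrete illustration, both representations can be assembled as flat vectors from bounding-box data. This is a minimal sketch in which the function names are assumptions and quaternions are taken as already-normalised 4-vectors.

import numpy as np

def binary_predicate_state(size_a, size_b, pos_a, pos_b, quat_a, quat_b):
    # Eq. (2): [S_A (3), S_B (3), T_B^A (3), R_A (4), R_B (4)] -> 17 dimensions
    t_ba = np.asarray(pos_b) - np.asarray(pos_a)  # relative translation, not absolute positions
    return np.concatenate([size_a, size_b, t_ba, quat_a, quat_b])

def unary_predicate_state(size, pos, quat):
    # Eq. (3): [S (3), P (3), R (4)] -> 10 dimensions; the absolute position is kept
    return np.concatenate([size, pos, quat])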

3.2. Probability model

It is important to have a smooth, continuous representation of Pr(C | G_C). This allows points of high probability in the multidimensional space to be located using numerical methods, which in turn allows finding geometric representations for a given symbolic state. Modelling the probability using Gaussian kernel density estimation (KDE) provides such a smooth function.

¹ The predicates will often not be independent in practice, but learning a joint distribution over all the predicates would require infeasibly many training examples.

² In the case of upside-down, the S and P parameters are irrelevant. However, removing them from the representation would imply additional knowledge of the predicate that we do not wish to assume.



Fig. 1. One-dimensional example of a probability distribution modelled using Eq. (4).

Given a set of n^+ example geometric states G^+ in which a predicate holds, and a set of n^- geometric states G^- in which the same predicate does not hold, we model the probability as

Pr(C | G_C) = \frac{Pr(G_C | C)}{Pr(G_C | C) + Pr(G_C | \neg C)},    (4)

where Pr(G_C | C) and Pr(G_C | \neg C) are the multivariate KDEs of the positive and negative datasets for C, respectively, evaluated at G_C:

Pr(G_C | C) = \frac{1}{n^+} \sum_{i=1}^{n^+} e^{-\|G_i^+ - G_C\|^2 / h^2},    (5)

Pr(G_C | \neg C) = \frac{1}{n^-} \sum_{i=1}^{n^-} e^{-\|G_i^- - G_C\|^2 / h^2},    (6)

where h is the kernel bandwidth. We use the FigTree Fast Gaussian Transform library [4] for an efficient KDE implementation. Since we have no reason to believe C is more likely to be true than false, Eq. (4) is simply Bayes' rule with an uninformative prior over C. The norm we use in Eqs. (5) and (6) is the L1-norm. If, for example, the range of values for object positions was significantly greater than the range of values for rotations, this might lead to the distance measure being dominated by position. We have not observed this in practice, but it could easily be solved by normalisation.

Fig. 1 shows an example of the probability model for a one-dimensional case. In areas of negative examples the function yields values near zero, and in areas of positive examples it yields values near 1. Importantly, at the boundary between positive and negative examples the function varies smoothly, with a smoothness dependent upon the width of the Gaussians on the example points either side. This is what allows the numerical hill-climbing techniques described in the following section.

3.3. Using the learned mapping

The learned probability distribution can now be used both to classify given geometric states into symbolic states, and to generate geometric states that comply with desired symbolic states. Classification can be achieved by thresholding the class probability at some sensible value, depending upon how certain the classification needs to be. The more interesting case of using the learned distribution to generate geometric states matching desired symbolic configurations is more demanding. For a single-predicate goal we achieve this by maximising the 17-dimensional probability function, keeping the six elements corresponding to object size constant. The smooth and continuous nature of the function allows the maximisation to be carried out numerically using hill-climbing from an initial estimate selected randomly from G. We found the Broyden–Fletcher–Goldfarb–Shanno (BFGS) optimisation method [20] to perform suitably well for this purpose.
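Both directions of the mapping can be sketched compactly. The listing below is a minimal illustration of Eqs. (4)–(6) and the hill-climb, using plain NumPy and SciPy rather than the FigTree library used in the paper; class and parameter names are assumptions, and for brevity the sketch optimises over the whole vector rather than holding the six size elements fixed as the paper does.

import numpy as np
from scipy.optimize import minimize

class PredicateModel:
    def __init__(self, pos_examples, neg_examples, h=0.05):
        self.pos = np.asarray(pos_examples)  # states in which C holds, shape (n+, d)
        self.neg = np.asarray(neg_examples)  # states in which C does not hold, shape (n-, d)
        self.h = h                           # kernel bandwidth

    def _kde(self, data, g):
        # Eqs. (5)/(6): (1/n) * sum_i exp(-||G_i - G_C||^2 / h^2), L1 norm as in the paper
        dist = np.linalg.norm(data - g, ord=1, axis=1)
        return np.mean(np.exp(-dist ** 2 / self.h ** 2))

    def prob(self, g):
        # Eq. (4): Bayes' rule with an uninformative prior over C
        p, n = self._kde(self.pos, g), self._kde(self.neg, g)
        return p / (p + n) if (p + n) > 0 else 0.5

    def generate_state(self, dim, tries=2, p_min=0.7, rng=np.random):
        # Reverse mapping: BFGS hill-climbing on Pr(C|G_C) from random starts
        for _ in range(tries):
            g0 = rng.uniform(-1.0, 1.0, dim)
            result = minimize(lambda g: -self.prob(g), g0, method='BFGS')
            if self.prob(result.x) >= p_min:
                return result.x
        return None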

Since the function is a sum of Gaussians, different start points are likely to result in different local maxima. When considering a sentence consisting of the conjunction of more than one predicate using Eq. (1), the dimensionality of the space increases, and dependencies between objects have to be taken into account. When an object appears in multiple predicates, the size and orientation of the object appear only once in the geometric state. In addition, the translations between objects are never allowed to form a cycle: if the state contains the translation between A and B, and between B and C, then the translation between A and C is already fixed. We avoid these problems by optimising over the absolute poses of all non-fixed objects (that is, we hill-climb in G), but evaluating the poses using the relative translations between the objects (from G_C) as described in Section 3.1.
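A sketch of this evaluation, reusing the illustrative PredicateModel above (each model scoring the 17-dimensional relative representation of one binary predicate), might look as follows; the optimiser varies the absolute poses, and each predicate's relative translation is recomputed on the fly, so no cycle of relative transforms can become inconsistent.

import numpy as np

def conjunction_prob(abs_poses, predicates):
    # abs_poses: {object name: (size, position, quaternion)} - the search variables
    # predicates: [(model, object_a, object_b)] for each conjunct of Eq. (1)
    p = 1.0
    for model, a, b in predicates:
        size_a, pos_a, quat_a = abs_poses[a]
        size_b, pos_b, quat_b = abs_poses[b]
        g_c = np.concatenate(
            [size_a, size_b, np.asarray(pos_b) - np.asarray(pos_a), quat_a, quat_b])
        p *= model.prob(g_c)  # independence assumption of Eq. (1)
    return p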

As an example, consider using the approach to find a geometric state for the five-predicate symbolic state

(and (above cup1 tray) (touching cup1 tray) (above cup2 tray) (touching cup2 tray) (not (touching cup1 cup2))).

If each of the five predicates were treated independently, then the search space would contain 5 × 17 = 85 dimensions. Removing the repeated dimensions for the orientations of objects appearing in the state multiple times, and setting the sizes of all the objects constant, leaves a search space of 27 dimensions. However, optimising directly in this space could lead to an impossible geometric state, as the space contains a loop of transformations: if a transformation between cup1 and the tray is found, and a transformation between cup2 and the tray is found, then the transformation between cup1 and cup2 has to be fixed, and cannot be found independently. To avoid inconsistencies, we optimise in the space of absolute object poses, not relative poses. In this example, the optimisation vector comprises a quaternion and a position for each object considered in the state (3 objects, 21 dimensions). When calculating the probability of a state in this form, each relative translation between objects is calculated on the fly during the numerical hill-climbing.

3.4. Generating multiple candidate geometric states

Our approach for generating geometric states for given symbolic states, outlined in the previous section, will provide only a single geometric state with a high probability of complying with the requested symbolic description. However, the reverse mapping is of a one-to-many nature, and in many cases multiple candidate geometric states are required. A key example of this is when performing hybrid planning, in which the robot may need to decide where to place a cup. If the first chosen place is not possible, because it is blocked by an obstacle, for example, alternative candidate locations that are significantly different from the first may need to be selected and tested during a backtracking stage. While we could avoid this problem by including all the objects in the scene in the optimisation, this leads to much slower optimisation times, and in practice we find that the overall performance is worse. Also, we would still have to perform this step due to other sources of failure at the path planning stage, including impossible arm configurations, unreachable positions, and the hand colliding with one object while trying to place another.

In earlier work [5], we proposed masking the entire joint probability distribution (Eq. (1)), temporarily decreasing the probability of geometric states already created, thereby forcing the maximisation to select an alternative point. Although this approach was shown to work successfully, it is inefficient when trying to find multiple candidate geometries for a symbolic state involving more than one object. For example, when generating geometric states for two cups on a tray, the first and second geometric states could be identical, except with the positions of the cups swapped.


If backtracking in planning occurred because a cup does not fit in one of the locations, the newly generated state will also fail. We have found that it is better to mask each predicate independently, so for each predicate we generate a masking function consisting of a Gaussian centred on the geometric state found by hill-climbing for that predicate. For each point in the optimisation space, we then subtract the value of the masking Gaussian that is largest at that point.

Let S = {C_1, . . . , C_n} be a complete symbolic state, let \mu_1, . . . , \mu_m be the geometric states found so far for predicate C_i, and let k = |G_{C_i}| be the number of dimensions in the representation of C_i. To mask the predicate for backtracking, we create a Gaussian with mean \mu_j and covariance matrix Q_i for each previous geometric state of that predicate:

K_i(G) = \sum_{j=1}^{m} \frac{1}{\sqrt{(2\pi)^k |Q_i|}} e^{-\frac{1}{2}(G - \mu_j)^T Q_i^{-1} (G - \mu_j)}.    (7)

We then modify Eq. (1) as follows:

Pr(S | G) = \prod_{i=1}^{n} Pr(C_i | G) - \max_i K_i(G).    (8)

This forces the hill-climbing to avoid masked geometric states. The set of masked states G^{mask}_{C_i} (one set per predicate) expands as candidate states are generated, so, if we backtrack multiple times, we end up with multiple masks for each predicate. Each mask point also has a covariance associated with it, the effect of which is to vary the amount of masking achieved. In the experiments reported here, we use a covariance selected to mask an area about the size of the object being moved.
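Under the same illustrative conventions as the earlier sketches, Eqs. (7) and (8) could look as below; for brevity this uses a single shared geometric space and one covariance matrix, where the paper keeps a separate representation G_{C_i} and covariance Q_i per predicate.

import numpy as np

def mask_value(g, prev_states, cov):
    # Eq. (7): sum of Gaussians centred on previously generated states
    k = len(g)
    norm = 1.0 / np.sqrt((2 * np.pi) ** k * np.linalg.det(cov))
    cov_inv = np.linalg.inv(cov)
    return sum(norm * np.exp(-0.5 * (g - mu) @ cov_inv @ (g - mu))
               for mu in prev_states)

def masked_joint_prob(g, predicate_models, prev_states_per_pred, cov):
    # Eq. (8): joint probability minus the largest per-predicate mask
    p = np.prod([m.prob(g) for m in predicate_models])
    masks = [mask_value(g, prev, cov) for prev in prev_states_per_pred]
    return p - max(masks) if masks else p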

4. Task planning

As we have already said, the planning task is made challenging by the mixture of geometric and symbolic states. To plan optimally for both of these we would need a hybrid planner that interleaved symbolic and geometric planning. However, this means that the search for a plan may involve finding many geometric states and robot path plans that are not actually used in the final plan. Since the translation from symbolic to geometric states (described in Section 3) is quite slow, a full hybrid approach is often impractical for complex problems. Instead, our approach is to plan symbolically, and then to generate a geometric plan from the symbolic plan and test it in the simulator. If the plan fails in simulation, we attempt to generate different geometric solutions by backtracking over the translation from symbolic to geometric states, and if that fails we then backtrack in the symbolic planner.

Our hybrid planning approach is summarised in Algorithm 1. We describe each step in more detail below, using as a running example the task of placing two stackable cups on a saucer. In the example, the goal is

(above cup1 saucer) (above cup2 saucer),

and the initial state is

(above cup1 table) (touching cup1 table) (above cup2 table) (touching cup2 table) (above saucer table) (touching saucer table).

We assume that there are three actions available to the robot: grasp(?x) moves the robot's hand to object ?x and grips it, release(?x) releases object ?x if it has been grasped, and move(?x,?y) moves grasped object ?x to be above and touching object ?y.

The first step is to create a symbolic plan. The symbolic planner we use is Fast Downward [21], using the FF heuristic.


Algorithm 1 Hybrid planning algorithm

Require: MAX_G_BACKTRACKS: maximum allowed geometric backtracks
Require: MAX_S_BACKTRACKS: maximum allowed symbolic backtracks

procedure hybrid_planner(problem, domain, plan = None, gt = 0, st = 0)
    if plan is None then
        s_plan ← fast_downward(problem, domain)
    else
        s_plan ← plan
    end if
    if gt < MAX_G_BACKTRACKS then
        g_plan ← hybridize_plan(s_plan)
        if g_plan ≠ FAIL and does_plan_work(s_plan, g_plan) then   # test the plan in the simulator
            return g_plan, s_plan   # success
        end if
        Mask each predicate in the failed action's add list
        return hybrid_planner(problem, domain, s_plan, gt + 1, st)
    else if st < MAX_S_BACKTRACKS then
        Add the negation of the symbolic state resulting from s_plan to the goal in problem
        return hybrid_planner(problem, domain, None, 0, st + 1)
    end if
    return FAIL
end procedure

This typically generates initial plans for these simple manipulation domains in under a second—significantly faster than the translation to geometric states or the generation of paths for the robot. For our example, we might expect the initial plan generated to be

grasp(cup1), move(cup1, saucer), release(cup1), grasp(cup2), move(cup2, saucer), release(cup2).

4.1. Turning a symbolic plan into a geometric plan

Given a symbolic plan without any attached geometry, we add geometry to it as follows. We first simulate the symbolic plan to generate the sequence of symbolic states the robot will visit when executing the plan. Each of these states consists of a set of predicates that are either true or false at that point in the plan. For each state, we then use the geometric to symbolic mapping in reverse to generate a geometric state in which the symbolic state predicates would be satisfied with high probability (any predicates not specified in the state are irrelevant to the validity of the symbolic plan, although, as we will discuss later, they may not be irrelevant to the geometric plan). The algorithm is presented as Algorithm 2.

We do the translation to geometric states starting at the end of the plan and working towards the beginning, as this allows most of the geometric state to be computed as a single optimisation—generating the geometry starting at the beginning of the plan typically leads to more backtracking, as geometries inconsistent with subsequent plan steps are generated. We also restrict the geometric optimisation to only move objects that have been moved in the symbolic plan. If a predicate changes value in the plan, we generate the corresponding geometric state only once. This ensures that the geometric state will be consistent for as long as the predicate persists, so we do not need to check consistency between geometric solutions for a particular predicate.

The generate_geometry procedure in Algorithm 2 requires two threshold parameters, T and Pr_min, that control the maximum number of geometry generation attempts and the minimum accepted probability for generated states, respectively. Multiple hill-climbing attempts are used to reduce the likelihood of the gradient ascent being captured by a local maximum. This is particularly likely as the predicates become increasingly masked using Eq. (8). In experiments, we found setting T = 2 to be sufficient. The threshold Pr_min controls the quality of the generated geometry. Setting it too high results in good generated geometric states being rejected, while setting it too low can result in poor geometric states being accepted. Based on the experiments in Section 5.2, we set Pr_min = 0.7. This compromises between accepting erroneous states and rejecting good states that, for example, simply have two cups close to each other but not touching.


Algorithm 2 Generation of geometry for a symbolic plan

procedure hybridize_plan(P)
    Input: Symbolic plan P = {a_1, ..., a_n}
    Output: Sequence of geometric states G = G_1, ..., G_n, where G_i is a geometric state only for the objects whose state changed in step i of the plan
    Let s_0 be the initial state
    Let s_i, i ∈ 1, ..., n be the state after action a_i
    Let m_i, i ∈ 1, ..., n be the set of objects moved up to action a_i
    # Determine the geometric states to generate
    for state s_i ∈ s_0, ..., s_n do
        Remove all non-geometric predicate clauses from s_i
    end for
    for state s_i ∈ s_n, ..., s_0 do
        Remove any clause from s_i that is unchanged from the initial state
    end for
    for state s_i ∈ s_n, ..., s_0 do
        if s_i = ∅ then
            # No geometry needs to be generated, return an empty geometry
            Let G_i ← []
        else
            Let o be the set of all objects in s_i
            Let G_i ← generate_geometry(s_i, o ∩ m_i)
            if G_i = fail then
                return fail
            end if
            for state s_j ∈ s_1, ..., s_{i-1} do
                Remove any clause from s_j that is unchanged between s_j and s_i
            end for
        end if
    end for
    return G = {G_1, ..., G_n}
end procedure

Require: T: maximum number of times to attempt hill-climbing for each state
Require: Pr_min: minimum probability to accept a generated state as correct

procedure generate_geometry(s, o)
    Input: Set of symbolic predicates s to generate geometry for
    Input: Set of objects o that can be moved in the geometry generation
    Output: Geometric state, or fail if no state can be found
    Construct the geometric search space for sentence s
    for i ∈ 1, ..., T do
        Generate a random start state in the space
        Perform hill-climbing via the BFGS algorithm
        if the probability of the resulting state ≥ Pr_min then
            return the state
        end if
    end for
    return fail
end procedure

Continuing our example above, we generate a geometric plan via Algorithm 2. It is possible that the geometric translation fails immediately at this point if no position with the two cups on the saucer exists.

4.2. From geometry to a path plan

If the translation from a symbolic to a geometric plan succeeds, we then generate paths for the robot motions to grasp and move the objects using the OpenRAVE RRT path planner, and if this step succeeds we now have a useable plan in the geometric space, and the robot can go ahead and execute it. If the path planning fails, this indicates that it is impossible to move an object without some kind of collision occurring (either between the object and an obstacle, between the arm and an obstacle, or a self-collision of the robot arm), and backtracking will be required (see below).

Collisions between the object being moved and an obstacle occur because the geometric state generated in the previous step only includes the objects that appear in the symbolic state. These collisions could be avoided by including all objects in the gradient ascent rather than just those that appear in the predicates, but in our experience this leads to much increased times for generating geometric states, because the gradient ascent is in a higher-dimensionality space, and this cost far outweighs the cost of replanning on the rare occasions when this kind of collision occurs.

Because the path planning uses the actual object model rather than the bounding box, it is also occasionally possible to find solutions where the bounding boxes intersect but the objects do not. In domains where there are many more obstacles, and so backtracking due to collisions is more frequent, it may be that considering all the objects during gradient ascent would be more efficient.

4.3. Geometric and symbolic backtracking

If the path planner fails due to a collision, backtracking is used to try to find a plan that will succeed. The first backtracking is in the geometric translation, as described in Section 3. For the action that failed, we mask (as described in Section 3.4) each of the predicates it would add to the state and perform the gradient ascent again in the masked space. In our example, if the action that failed is move(cup2, saucer), the predicates masked are above(cup2, saucer) and touching(cup2, saucer), both of which correspond to the same point in the geometric space—the relative positions of the two objects.

If geometric backtracking fails to generate a geometric state that we can plan a path for, we then backtrack the symbolic planner to search for a different plan to accomplish the goal. To ensure that the plan found is different, we add the negation of the state for which a geometric solution could not be found to the goal, so that state cannot be generated in the plan. This helps prevent plans that simply move objects in a different order from being generated, and thus encourages ‘‘significantly different’’ plans.
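As an illustration of this goal modification, one can extend the PDDL goal directly with the negated state; the helper below is a hypothetical sketch operating on predicate strings, not the authors' implementation.

def add_negated_state_to_goal(goal_clauses, failed_state_clauses):
    # goal_clauses, failed_state_clauses: lists of PDDL predicate strings
    negated = "(not (and {}))".format(" ".join(failed_state_clauses))
    return goal_clauses + [negated]

# For example, if no geometry could be found for cup2 directly on the saucer:
# add_negated_state_to_goal(
#     ["(above cup1 saucer)", "(above cup2 saucer)"],
#     ["(above cup2 saucer)", "(touching cup2 saucer)"])
# appends "(not (and (above cup2 saucer) (touching cup2 saucer)))" to the goal.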

In our example, this might produce the new plan

grasp(cup1), move(cup1, saucer), release(cup1), grasp(cup2), move(cup2, cup1), release(cup2),

which nests cup2 inside cup1 on the saucer. This is again translated to a geometric plan and the robot motions planned by the RRT planner. If the path planner succeeds, the plan is passed to the robot for execution. The numbers of symbolic and geometric backtracks available to the planner are intentionally limited, left as parameters that can be varied by the user, as there are generally infinitely many geometric states for a given symbolic one. We discuss the settings we use in Section 5.

5. Experiments and results

To test our system, example geometric states of two cups and a tray were used for the geometric–symbolic mapping. Each state was labelled to indicate whether one cup was ‘above’ and/or ‘touching’ the tray or the other cup. The dataset for the ‘above’ predicate comprised 328 positive and 1672 negative examples, and the dataset for the ‘touching’ predicate comprised 384 positive and 1872 negative examples. These samples were then used to create the 17-dimensional probability distribution detailed in Section 3.2. Examples of the x and y dimensions of the distribution, with z fixed 3 cm above the tray, are shown in Fig. 2. Each point on the plane is coloured according to the probability that a cup placed at that (X, Y, Z) position meets the specific symbolic state: the subimages show the probability that the position is (b) above the tray; (c) above the tray and not touching either other cup; (d) above and touching the cup on the right.

The bandwidth parameter (variance of the Gaussians) for the kernel density estimation was selected manually (it depends on the size of the workspace and the measurement units). Larger values make the boundary transitions slower, while low values lead to abrupt transitions that are more difficult for the optimisation. Ideally, it could be computed through leave-one-out cross-validation.


Fig. 2. Probability distributions superimposed on a geometric state. (a) Tray and two cups. (b) Probability of a cup above the table. (c) Probability of a cup above the table and not touching other cups. (d) Probability of a cup above and touching other cup. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 3. The geometric states tested in Table 1.

Fig. 4. Geometric states generated for the symbolic states given in Table 2.

5.1. The forward mapping

To evaluate the predicate classifiers in the forward direction, we carried out leave-one-out cross-validation for each predicate. Cross-validation for several classification thresholds suggested an optimum value of about 0.8. We found that, with significantly fewer training samples than the already small number that we demonstrate with here, the classification results in large numbers of false negatives. The resulting confusion matrices when using a threshold of 0.8 were as follows.

                         Predicted above      Predicted not above
Actual above                  328                      0
Actual not above               24                   1648

                         Predicted touching   Predicted not touching
Actual touching               376                      8
Actual not touching            24                   1848

Fig. 3 illustrates the forward mapping by showing four simulated geometric configurations. For each of them, we compute the probability of a number of symbolic predicates, with the results in Table 1. From the table it can be seen that, even with a higher threshold in the region of 0.9, these states are classified correctly. Although the geometric configurations differ from the examples that the system was trained with, the choice of representation for the probability distribution means that states falling between two positive examples are awarded near-1 probability.

Table 1. Classification probabilities for the states shown in Fig. 3.

State                    (a)      (b)      (c)      (d)
(above cup tray)         0.999    1.000    1.000    0.999
(touching cup tray)      0.971    0.000    0.001    0.643
(above cup2 tray)        0.999    0.016    0.999    0.999
(touching cup2 tray)     0.974    0.000    0.962    0.962
(touching cup cup2)      0.038    0.000    0.105    0.925
(above cup cup2)         0.009    0.000    0.983    0.970

The smooth nature of the probability estimate can be further seen in example (d) (one cup stacked inside another) when deciding if the upper cup is touching the tray or not. A probability of 0.643 indicates that it is close to the tray, but not as close as the cup below, which is touching with probability greater than 0.9.

5.2. The reverse mapping

Fig. 4 shows four geometric states automatically generated to match the symbolic states listed in Table 2. States (b) and (c) demonstrate the result of combining the individual predicate distributions. Trying to find a geometric state that has multiple cups on the tray but not touching each other results in a process somewhat like a spring-and-mass system during hill-climbing, where the cups push each other apart while the tray pulls them towards it. The end configuration is the cups on the tray and as far apart from each other as possible.

We also measured the time taken to generate a geometric state as the number of predicates grows. This is shown in Fig. 5.


Table 2. Classification probabilities for the states shown in Fig. 4.

     Goal state supplied                                                Probability
(a)  (and (above cup tray) (touching cup tray))                         0.989
(b)  (and (above cup tray) (touching cup tray) (above cup2 tray)
     (touching cup2 tray) (not (touching cup cup2)))                    0.977
(c)  (and (above cup tray) (touching cup tray) (above cup2 tray)
     (touching cup2 tray) (above cup3 tray) (touching cup3 tray)
     (not (touching cup cup2)) (not (touching cup cup3))
     (not (touching cup2 cup3)))                                        0.941
(d)  (and (above cup tray) (touching cup tray) (above cup2 tray)
     (touching cup2 tray) (above cup3 cup2) (touching cup3 cup2)
     (not (touching cup cup2)))                                         0.887

We show two datasets. In the first, labelled ‘‘One Predicate per Object’’, the predicates consist of above(cup_i, tray) for one to five cups, while in the second, labelled ‘‘Complete Scene’’, the predicates consist of

above(cup_i, tray), touching(cup_i, tray), and not touching(cup_i, cup_j) for one to four cups, leading to scenes with more predicates but fewer objects, so fewer dimensions are added to the search space per predicate. The geometries in Fig. 4 correspond to this second dataset with two, five, nine, and nine predicates respectively. In all cases, the times are for just the call to generate_geometry in Algorithm 2.

5.3. Backtracking

To demonstrate generating multiple different geometric solutions, Fig. 6 shows a two-dimensional slice through the masked PDFs on a single X, Y plane 3 cm above the tray, resulting from generating six positions consistent with the symbolic state (above cup tray) (touching cup tray) using the masking approach detailed in Section 3.4. The effect of the Gaussian shape of the masks where the cups are placed can be seen in the distribution; a wider kernel width will mask more of the distribution, leading to the generated positions being further apart, while a narrower kernel width would have the opposite effect.

5.4. Robotic manipulation planning

To demonstrate the applicability of our planning approach in a real-world robotics context, we used the learned predicates and planning system to generate plans for execution by a robot manipulator. We constructed a planning domain description in PDDL [22] comprising actions to pick objects and place them on other objects. For each symbolic action we have a corresponding fragment of robot code to perform the action, with parameters to fill in for the geometric positions. The actions make calls to an RRT path planner to move the 6-DOF (degree of freedom) arm between poses.

We tested our planning approach by setting the goal to be three cups above and touching the tray and the fourth above and touching another. The planner generated the sequence of actions necessary to arrange the cups and turned the symbolic plan into a full geometric plan. The plan was then directly executed on the real robot, as shown in Fig. 7 (see footnote 3). The robot used was an industrial arm (Kuka KR5-Sixx R850) with a two-finger gripper (RoadNarrows Graboid). Cup localisation was performed using a Kinect RGBD camera. The point cloud was segmented into clusters above the table plane, with each cluster centroid giving the location of an object. The transformation from the Kinect frame to the robot frame was calculated using a calibration board at a known location in the robot frame.

We then tested our planning system in simulation to evaluate both the symbolic and the geometric backtracking. In total, 24 scenarios were evaluated, with the maximum number of geometric backtracks permitted set at 6 and the maximum number of symbolic backtracks set at 4. For each experiment, the goal

³ Video of execution available at: http://goo.gl/Qrm6B

Fig. 5. Time to generate a geometric state (without backtracking) as a function of number of predicates. The red data correspond to n objects with one predicate involving each object, while the green data correspond to a more realistic scene with a number of cups that must be above and touching a tray but not touching each other. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

supplied to the planner was to remove all cups from the table. The environment consisted of a variable number of cups and a tray on the table. The tray had a varying number of immovable objects on it. The results are shown in Table 3. The first column indicates which of the tray states shown in Fig. 8 was being tested, and the second column indicates the number of cups that were in the environment—i.e., the number of cups that the planner needed to fit on the tray. The third and fourth columns indicate the total number of backtracks that the planner took to solve the problem, and the final column gives the end configuration found.

The results show that typically only one or two geometric backtracks are necessary, even on a cluttered tray. This is because the masking of the predicates quickly pushes the hill-climbing away from occupied areas. Masking each predicate individually rather than simply using one mask for the complete state helps too: if it is found that one cup does not fit in a given location on the tray, then this information is stored at the predicate level, and it prevents the system from trying to put any other cups in that location.

The efficiency of the approach hinges on the number of geometric backtracks that occur, with each backtrack requiring geometry to be generated for each part of the symbolic plan. For the plans created here, a single geometric backtrack typically requires less than a minute using our Python-based implementation. The symbolic planning time was negligible, and the RRT planning time was minimal due to the nature of the environment and robot and the efficient OpenRAVE-based implementation.

Symbolic backtracking only occurred when trying to place four cups on trays (d) and (f) (rows D and F in Table 3). For tray (d), the symbolic planner first generated a plan to place all four cups separately on the tray. Geometry was then generated for this plan, but, as some of the chosen cup positions were in collision with the objects already on the tray, path planning failed, and geometric backtracking occurred. Geometric backtracking occurred six times (the maximum allowed number of geometric backtracks per


Table 3. Resulting states, backtrack counts, and total planning time when creating plans to place different numbers of cups onto the tray states shown in Fig. 8. For four cups in tray state (d), three cups were on the tray while one was nested in another. In tray state (f), the three reported results correspond to three different outcomes. In F1, the planner failed to return a plan; in F2, it returned a plan with all four cups individually on the tray; and in F3, it placed three cups on the tray and nested one in another.

Tray state   No. of cups   Geometric backtracks   Symbolic backtracks   Total time (s)   End state
A            1             0                      0                     6.01             Cup on tray
A            2             0                      0                     16.33            Cups on tray
A            3             0                      0                     27.51            Cups on tray
A            4             0                      0                     46.03            Cups on tray
B            1             1                      0                     11.17            Cup on tray
B            2             0                      0                     15.27            Cups on tray
B            3             0                      0                     28.46            Cups on tray
B            4             0                      0                     49.84            Cups on tray
C            1             0                      0                     5.95             Cup on tray
C            2             1                      0                     30.67            Cups on tray
C            3             1                      0                     55.29            Cups on tray
C            4             1                      0                     91.41            Cups on tray
D            1             0                      0                     5.87             Cup on tray
D            2             2                      0                     54.39            Cups on tray
D            3             1                      0                     70.35            Cups on tray
D            4             6                      1                     246.26           One cup stacked
E            1             0                      0                     6.27             Cup on tray
E            2             1                      0                     36.35            Cups on tray
E            3             1                      0                     61.47            Cups on tray
E            4             1                      0                     124.73           Cups on tray
F            1             0                      0                     6.40             Cup on tray
F            2             2                      0                     44.37            Cups on tray
F            3             1                      0                     59.13            Cups on tray
F1           4             6                      4                     528.60           FAIL
F2           4             1                      0                     110.85           Cups on tray
F3           4             1                      1                     270.08           One cup stacked

Fig. 6. The effect of the masking during geometric backtracking. The PDF of the learned predicates for a cup on a tray is shown with successive masks applied due to repeated backtracking.

symbolic plan), each time failing to find a configuration that would work. Symbolic backtracking then took place, and a new symbolic plan was created to place three of the cups on the tray and the fourth stacked on one of them. Geometry was then generated for this plan, and the plan successfully executed without further backtracking.

When trying to clear four cups on to tray (f) (row F1 in Table 3), the planner failed after reaching the maximum allowed four symbolic backtracks, each with six geometric attempts. The initial symbolic plan was again to place all four cups on the tray. The first symbolic backtrack then again suggested stacking one cup on to one of the others, for which still no geometric solution was

found. Symbolic backtracking occurred again, but each following symbolic plan simply swapped which cup was stacked on which, not changing the geometry of the situation in any way.

Finally, we examine the times required to generate a subset of the plans in Table 3. For each plan, we show the symbolic planning time, the time to generate the geometry, the time for the path planner, and the total time. The results are shown in Table 4, and are the total time spent in each part of the process, which may include multiple backtracks. Here, the geometric planning time is the time required for the whole of the hybridize_plan procedure in Algorithm 2, so it will include multiple calls to generate_geometry even with no backtracking.


Fig. 7. The test planning scenario on the Kuka arm. (a) The initial configuration. (b) Ready to grasp a cup. (c) Cup above table. (d) Cup above another cup.

Fig. 8. The tray states for evaluating the planner. The objects on the tray are not movable by the robot.

Table 4. Times for plan generation for selected problems from Table 3.

State   Number of cups   Symbolic plan (s)   Geometric plan (s)   Path plan (s)   Total plan time (s)
A       1                0.02                3.50                 2.49            6.01
B       1                0.02                6.20                 4.56            11.17
D       1                0.02                3.49                 2.36            5.87
D       2                0.04                40.09                13.46           54.39
D       4                0.92                210.60               34.74           246.26

Here, state A with one cup and state D with one cup involve no backtracking; state B with one cup involves one geometric backtrack, so the geometry and path planning times are the sums of the times for the two attempts; and similarly, state D with two cups involves a total of three geometry generations and three path plans, and state D with four cups involves two symbolic plans (each takes just under half a second), six geometry generations, and three path plans (there

are fewer path plans because the geometry generation often failed to find a high-probability solution). While the 4 min required to generate a plan for this case is quite large, the problem involves seven objects in total (not including the table) and the generation of two different plans, each of which has 16 steps. It is also designed to demonstrate something close to worst-case behaviour for the algorithm.


6. Discussion and future work

As we argued above, the mapping between geometric and symbolic states is a necessary part of any robotic system that needs high-level reasoning to achieve tasks. We have shown that such a mapping can be learned from examples and that the mapping can successfully be used in both directions. Furthermore, as we show in the experiments, the reverse mapping can cope with conjunctions of predicates, although performance does depend on the number of predicates in the conjunction.

In addition, we have demonstrated the usefulness of the mapping in the context of hybrid planning. The combination of geometric and symbolic planning makes manipulation planning a difficult task. The learned mapping allows planning to be done purely symbolically and the resulting plan to be translated to a geometric one in a more efficient manner than by using a complete hybrid planner. Since the computational cost is far larger for generating geometric states and robot motions, minimising the number of times this needs to be done is critical to keeping the overall computation time down. Performing the geometric translation backwards through the symbolic plan makes this even more efficient by optimising the positions of as many objects as possible simultaneously, thus greatly reducing the number of optimisations that must be performed: even though the optimisation is in a higher-dimensional space, this is less significant than doing repeated searches with backtracking.

The approach we describe is not complete: there are solvable problems for which it will not find a plan. The final problem in Table 3 shows an example of this, although, if we allowed further symbolic backtracking for the runs that failed, a solution – with two stacks of two cups – would eventually be found. However, there are also examples where our approach will never generate a plan. One example is if an object must be moved off the tray entirely to make room for all the cups. Because that object is not required in the symbolic plan, it will never appear in that plan, so picking it up will not be considered. While we are choosing to sacrifice completeness for efficiency, we are nevertheless considering ways to solve this problem. One possibility is to allow the geometric path generation to change the symbolic problem by reporting the reasons it has failed to the symbolic planner. Another, which we have explored in [23], is to repair the plan at execution time.

Finally, we are currently looking at how to learn the action models automatically from the example programs, so that a complete system can be constructed directly from well-commented robot code. Part of the challenge of this is to learn models of actions that can be applied to objects other than the ones that appear in the programs, thus allowing generalisation to novel objects or combinations of objects.

Acknowledgement

This work was supported by the European Community's Seventh Framework Programme as part of the GeRT project, grant agreement number ICT-248273.

References

[1] S.M. LaValle, Planning Algorithms, Cambridge University Press, 2006.
[2] L.P. Kaelbling, T. Lozano-Pérez, Hierarchical task and motion planning in the now, in: Proceedings of Robotics and Automation, ICRA, 2011, pp. 1470–1477.
[3] D.W. Scott, Multivariate Density Estimation: Theory, Practice, and Visualization, John Wiley and Sons, Inc., Hoboken, NJ, USA, 2008.
[4] V.I. Morariu, B.V. Srinivasan, V.C. Raykar, R. Duraiswami, L.S. Davis, Automatic online tuning for fast Gaussian summation, in: Advances in Neural Information Processing Systems, NIPS, 2008.


[5] C. Burbridge, R. Dearden, Learning the geometric meaning of symbolic abstractions for manipulation planning, in: G. Herrmann, M. Studley, M.J. Pearson, A.T. Conn, C. Melhuish, M. Witkowski, J.-H. Kim, P. Vadakkepat (Eds.), TAROS, in: Lecture Notes in Computer Science, vol. 7429, Springer, ISBN: 978-3-642-32526-7, 2012, pp. 220–231.
[6] F. Lagriffoul, L. Karlsson, A. Saffiotti, Constraints on intervals for reducing the search space of geometric configurations, in: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, 2012.
[7] F. Gravot, S. Cambon, R. Alami, aSyMov: a planner that deals with intricate symbolic and geometric problems, in: ISRR, in: Springer Tracts in Advanced Robotics, vol. 15, Springer, 2003, pp. 100–110.
[8] T. Siméon, J.-P. Laumond, C. Nissoux, Visibility based probabilistic roadmaps for motion planning, Advanced Robotics Journal 14 (6) (2000).
[9] J. Wolfe, B. Marthi, S. Russell, Combined task and motion planning for mobile manipulation, in: Proceedings of the 20th International Conference on Automated Planning and Scheduling, 2010.
[10] L. Karlsson, J. Bidot, F. Lagriffoul, A. Saffiotti, U. Hillenbrand, F. Schmidt, Combining task and path planning for a humanoid two-arm robotic system, in: Proceedings of TAMPRA: Combining Task and Motion Planning for Real-World Applications, ICAPS 2012 Workshop, 2012.
[11] C. Dornhege, P. Eyerich, T. Keller, S. Trüg, M. Brenner, B. Nebel, Semantic attachments for domain-independent planning systems, Towards Service Robots for Everyday Environments (2012) 99–115.
[12] C. Dornhege, M. Gissler, M. Teschner, B. Nebel, Integrating symbolic and geometric planning for mobile manipulation, in: 2009 IEEE International Workshop on Safety, Security & Rescue Robotics, SSRR, IEEE, 2009, pp. 1–6.
[13] P. Gregory, D. Long, M. Fox, J.C. Beck, Planning modulo theories: extending the planning paradigm, in: Proceedings of the International Conference on Automated Planning and Scheduling.
[14] L. Mösenlechner, M. Beetz, Using physics- and sensor-based simulation for high-fidelity temporal projection of realistic robot behavior, in: 19th International Conference on Automated Planning and Scheduling, ICAPS, 2009.
[15] R. Jäkel, S. Schmidt-Rohr, M. Lösch, A. Kasper, R. Dillmann, Learning of generalized manipulation strategies in the context of Programming by Demonstration, in: 2010 10th IEEE-RAS International Conference on Humanoid Robots, Humanoids, IEEE, 2010, pp. 542–547.
[16] N. Abdo, H. Kretzschmar, C. Stachniss, From low-level trajectory demonstrations to symbolic actions for planning, in: ICAPS Workshop on Combining Task and Motion Planning for Real-World Applications, 2012.
[17] K. Sjöö, A. Aydemir, T. Morwald, K. Zhou, P. Jensfelt, Mechanical support as a spatial abstraction for mobile robots, in: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, IEEE, 2010, pp. 4894–4900.
[18] K. Sjöö, P. Jensfelt, Learning spatial relations from functional simulation, in: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, IEEE, 2011, pp. 1513–1519.
[19] B. Rosman, S. Ramamoorthy, Learning spatial relationships between objects, The International Journal of Robotics Research 30 (11) (2011) 1328–1342.
[20] R. Fletcher, Practical Methods of Optimization, second ed., John Wiley & Sons, 1987.
[21] M. Helmert, The Fast Downward planning system, Journal of Artificial Intelligence Research 26 (2006) 191–246.
[22] D. McDermott, M. Ghallab, A. Howe, C.A. Knoblock, A. Ram, M. Veloso, D.S. Weld, D.E. Wilkins, PDDL—the planning domain definition language, Tech. Rep. DCS TR-1165, Yale University, New Haven, Connecticut, 1998.
[23] C. Burbridge, Z. Saigol, F. Schmidt, C. Borst, R. Dearden, Learning operators for manipulation planning, in: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, 2012.

Richard Dearden is Reader in Intelligent Robotics in the School of Computer Science at the University of Birmingham, UK, where he has been since 2005. He works in the broad area of reasoning under uncertainty, and in particular on planning, execution, and fault diagnosis, all applied to autonomous robots. Before that he spent 5 years leading the Model-based Diagnosis and Recovery group at NASA Ames Research Centre, where he worked primarily on Mars rover diagnosis and planning. His Ph.D. (2000) was in Markov decision process planning at the University of British Columbia.

Chris Burbridge is a post-doctoral Research Fellow at the University of Birmingham; he is currently working in the areas of autonomous planning and robot manipulation. He obtained his Ph.D. in mobile robot navigation at the University of Ulster in 2010, and his M.Sc. in embedded systems and robotics at the University of Essex in 2006.