A dynamic programming approach to optimal control of robotic manipulators

A dynamic programming approach to optimal control of robotic manipulators

Pergamon Mechanics Re.aem~ Commtmicatiom, Vol. 25, No. 2, pp. 225-230, 1998 Copyright © 1998 Ehevi~ Science Ltd Printed in the USA. All tights reae~e...

339KB Sizes 2 Downloads 55 Views


Mechanics Re.aem~ Commtmicatiom, Vol. 25, No. 2, pp. 225-230, 1998 Copyright © 1998 Ehevi~ Science Ltd Printed in the USA. All tights reae~ed 0093-6413/98 $19.00 + .O0

Pli S0093-6413(98)0~29-9


Tuna BALKAN Department of Mechanical Engineering, CAD/CAM/Robotics Center, Middle East Technical University, 06531, Ankara, Turkey

(Received 21 May 1996; acceptedfor print 9 January 1998) Introduction The motion planning problem, in its simplest form, is to find a trajectory from a specified starting robot configuration to a specified goal configuration that avoids collision with a known set of obstacles in the working space. While the problem of avoiding obstacles in the work space of robot [1] is not a control theory problem in the normal sense, the problem of moving a mechanical system along the desired trajectory with a minimum performance index is an important control problem. Therefore, optimal trajectory planning problem can be divided into two sub-problems, namely planning a geometric path that accomplishes the desired task while avoiding collisions with obstacles, then tracking this geometric path optimally by finding optimal velocity and associated torque/force profiles along this path. Thus, an optimal control problem can be treated as an optimal trajectory planning problem, if a prescribed path is to be followed. Luh and Lin [2] presented an algorithm using linear programming method where it is assumed that the path consists of end points and a set of comer points and there are constraints on velocity and accelerations but without any bound on the control. Vukobratovic and Kircanski [3] suggested a method using dynamic programming for the solution of minimum energy trajectory planning problem, where a prescribed path is divided into number of segments, and the velocities are assumed to be constant along each segment. Shin and McKay described an algorithm for time optimal trajectory planning in their study [4]. The trajectory is assumed to be given in joint space by a single parameter. Then, by inserting this variable into the dynamic equations, the constraints on control torques are transformed to the constraints on second derivative of the path parameter. The basic idea to solve this problem is to select the control variable (pseudo-acceleration profiles) that produces the largest admissible velocity profile. Although they proved the time optimality by generating the trajectories as close as possible to the edge of the admissible region without going outside of it, the idea of selecting the largest admissible velocity does not work for the other optimization problems other than time. Pfeiffer and Johanni [5] came to same conclusions by following the same objective and formulation although their numerical algorithm is slightly different. Singh and Leu [6] presented an algorithm for optimal trajectory planning under torque and velocity constraints for a given path in either Cartesian or joint space, but results were shown only for one and two degrees of freedom manipulators. Although velocity constraints were taken into consideration, their interactions with the torque constraints can not be observed since velocity bounds were not reached in the simulations they performed. Chen and Desrochers [7] applied minimum principle of Pontryagin with a slight variation which originates from the fact that the constrained control set is replaced by a control set which is a function of the state. Zalzala and Morris [8] suggested an algorithm to find the time optimal trajectories by fitting optimal cubic and quadratic splines along each segment on the prescribed path. 225



Formulation of Ontimal Traiectorv Plannine Problem If a dynamic system is de,~ribed with the initial and final states as q = f(q(t),u(t))


q(to)=qo, q(tf)=qf

and q(to)=Clo, q ( t f ) : q f


then the problem is finding the controls u (t) which optimize the performance index J = K ( q (tf), tf) + flc tf L (q (t),u (t)) dt



where the functions K and L are constructed according to the various types of optimization problems, while the optimal solution satisfies the following constraints on torque, position, velocity and available power of the system.

q (t)- f(3.)= 0

q (t)c_ q* (t)


Dynamics of an n-degree of freedom robotic manipulator can be represented as A (q)q + B (q)lqq] + C (q)lq ] + G (q)= I


where A, B, C and G represent the kinetic energy, Coriolis, centrifugal and gravitational torque matrices, respectively. Then r" represents the control torque vector for known position vector q and its first and second time derivatives. In this study, a PUMA 560 manipulator is used for simulation and the above matrices have been obtained from a previous research [9]. The coordinate frame attachments for PUMA 560 are due to Denavit and Hartenberg, where frame i is attached to link i and axis zi lies along the axis of rotation of joint i [10]. It is difficult to find a closed-form solution to the above problem because of the complexity of the equations of motion of an n-degree of freedom manipulator and various performance indeces and constraints. However, dynamic programming approach suggests to divide the problem into sub-problems and to perform the optimization for each segment. Since the final time is not specified, discretized form of the problem can be obtained by dividing the given path into N segments. Then, discrctized optimal trajectory planning problem can be formulated by defining a dynamic system with initial and final states as

Cl(k)= f(q(k),u(k)) qlk--o=qN,


qlk=N= qN


qlk:0= qo • qlk:N: tin


where qk denotes the position at any point k, and k = N denotes the final state; then, the problem of finding the controls u(k) to optimize one of the performance index is given below. N

J~= ~ k=l


St(k), J2 =

Z k=l


u'r(k)cl(k)St(k) and J3= ~

(O~l+CO2u'r(k)q(k)~t (k)



where Jl, J2 and J3 represents the performance indices for time, energy, mixed time-energy minimization problems, respectively and the solution should satisfy the following constraints


q (k)- f ( ~ ) = 0

¢1(k)~ ¢1" (k)

u (k)~ u* (k)


Pi (k)< Pmax



where i denotes the i'th joint, and Pmaxis the available power of a power source common to all joints. Then, assuming a constant acceleration along a segment, the equations of motion of the manipulator can be rewritten as A (q(k)) [ 4 (k+l)- ~I (k)] + { g (q (k)[ 4 (k)q (k)] .2 + C (q (k) [ q (k) ] + G (q (k)) } ~ t (k) = P (k) ~ t (k)


Solution to the Ontimal Traiectorv Plannine Problem Let k be the index of states along the prescribed path and let Spi (k) denotes the set of possible velocities of joint i, which is obtained by discretizing the range of velocity at this point. During the travel of link i along the stage k (or between states k and k+l), all combinations of possible velocities at state k and all admissible velocities at state k+l create the alternatives for that stage. For a possible velocity at state k and an admissible velocity at state k+l, the acceleration of joint i at state k, and the travel time from state k to k+ 1 can be calculated by using the following equations. ~ii(k)= [qi(k+l)]2-[Cli(k)12 2 [ qi(k+l)- qi(k)]


St(k) = 2[qi(k+l)-qi(k)] [ ~li(k+l)] - [ ~li(k)]


Since the equations of motion are coupled, in order to determine the necessary torque to be applied at joint i, the remaining joints velocities and accelerations should be calculated. This is easily accomplished by simply recalling that each joint completes its respective motion within the same time interval calculated above. Therefore, the other joint velocities can be calculated by substituting the calculated time into the following equation. qj(k) = 2 [qj(k+l)-qj(k)] St(k) - qj (k+l)


where j denotes the remaining joints other than joint i. If any of the calculated joint velocities violates its constraint, or

cij(k) ~ q*(k)


then qi (k) is said to be inadmissable and discarded from the possible velocity set defined at state k. If, on the other hand, all joint velocity constraints satisfy their constraints, accelerations of these joints can be calculated by simply employing j instead of i in equation (11). If positions, velocities, and accelerations of all joints are substituted in the equations of motion of the manipulator, then each joint torque necessary to travel from state k to k+l can be calculated. However, the admissible set of controls should be checked at that point and if Ti,j(k) ~ ti* (k)


where i and j denotes all joints, then the velocity of joint i is said to be inadmissible and rejected from the possible velocity set at that state again. If all joint torque constraints are satisfied, then



using calculated velocity and torques, the necessary power at that stage can be calculated using the equation given below. P = TT(k)q (k)


Let AJ (qi (k), k) denotes the incremental performance index between states k and k+l, and J ° (qi (k), k) denotes the minimum performance index to reach the final state from state k, then the optimality principle of Bellman gives the following expression. j o (qi (k),k) = min { A J (qi (k),k) + J o (q (k+ l),k+l) }


This equation is applied for each velocity in the admissible velocity set Sai (k). Considering the alternatives obtained by applying "all admissible velocities at state k+l for one velocity at state k, then a unique optimal alternatives set is obtained for state k. The optimization process starts from the final state and ends at the initial state, which is called backward scheduling in dynamic programming. When the initial state (initial position on the path) is reached, the optimal position, velocity and torque profiles are obtained by simply tracing back to the final state by following the pointers. PUMA 560 Simulation Results The method is applied to a PUMA 560 manipulator and it is observed that dividing the path into about 35 segments and discretizing the velocity range to have about 40 possible velocities at a point is enough to have a rough solution. Then treating this velocity profile as an input, the algorithm is applied recursively for several times. Case 1 - Time Ootimal Control of PUMA 561) The arbitrarily selected constraints and prescribed paths in joint spaces are given in Table 1, where ~. is the path parameter and its initial and final values are 1 and 0, respectively. In addition to the above constraints, available power is limited to 500 W. Fig. l shows the optimal torque profiles. It is observed from this figure that, the first link applies its full torque to accelerate, and follows its torque bound to decelerate up to about t = 0.35 s. At that instant, optimal torque for the second link saturates and link 1 gets lower torque values than its maximum till the end of the motion where the optimal travel time was found to be 0.766 s. It is also observed that link 3 follows link 1 and link 2, respectively without using its full effort throughout the motion. The remaining three links also follows links I and 2 without using their full torque. This shows one and only one link is always saturated for an interval. From the optimal velocity profiles obtained, it is observed that none of the velocity constraints are reached during the motion. Also the consumed power at each instant is observed to be less than its limit at all times. Case 2 - Time Ot)timal Constrol of PUMA 560 with an Effective Velocitv Constraint In order to observe the effect of the velocity constraint, the above problem is solved by changing only the velocity constraint of the second link to + 2.0 rad/s. Fig.2a shows the optimal torque profiles for the first three links and Fig.2b shows time optimal position and velocity profiles for the ,second link. When Fig.2a is carefully investigated, it can be observed that link 1 is the leading link up to about 0.2 s, then none of the links uses their full torques up to about t = 0.8 s where torque applied at link 1 saturates for an instant and then link 2 becomes the leading link up to end of the motion. What happens between approximately t = 0.2 and 0.6 s can be clarified by investigating



Table 1. Constraints and prescribed paths for time optimal control Link 1

Link 2

Link 3

Torque constraints [N-m] +30 :t:90 Velocity constraints [rad/s] -t-4 +5 Prescribed path (~.-1)rd3 (l-~.)rd2

+50 +6 (1-~.)~3

,o?o,~.t,~-,., 6o

+10 +15 +3 +3 (~.-1)rd4 (l-~.)rd4


• ..

Link 5

Link 6 +10 +7 n/2




Link 4









o '


r a








4¢ t

r. , , ,


L oe

o* TI~[(*)


~[~ TI

06 I|1

FIG. 1 Optimal torque profiles (Case l)

TORQUE I~-ml sol







~ __.-~._

"tt _




"--,~_ _,




~ Ir'd/'~2 ~







I LO -

OS" 06-



071 ' --6o.



-too , ~





00 L ~ : . . : " ~ . . . . . . . . . . 0 O:

........... c,


~0o 0M

I i',1! i~)


-I~IE i , j

FIG.2 a) Time optimal torque profiles for the first three links, b) Time optimal position and velocity profiles for the second link (Case 2) Fig.2b, which shows that link 2 reaches to its velocity bound and follows it, such that the second link is the leading link between this time interval. This simulation shows that when a velocity constraint is reached, the applicable torques between this interval is redrawn without violating the constraint on the velocity. Case 3 - Time and Energy Optimal Control of the PUMA 560 Weightings for time and energy are taken as 1.0 and 2.5, respectively for this simulation by using the assumed constraints and paths given in Table 1. Fig.3 shows that each link travels with small torques applied while link 2 needs zero torque for almost 2/3 of the total travel time, to save energy• Then, for a small time interval, torque at link 1 saturates, but it decreases when torque at link 2 saturates, to decrease the travel time. In order to observe the effect of power limit to a time optimal control problem, power limit was reduced to 250 W with no changes in Table 1. It is observed that for such a case, the leading constraint is the available power.


T. BALKAN ,o.~SF_'?.""

............. "".r :


r~ ...........









FIG.3 Time-Energy optimal torque profiles for links i, 2 and 3 Conclusion In this study, an optimal control algorithm is derived by using dynamic programming. Since it is assumed that the desired path is prescribed, the optimal control problem is treated as an optimal trajectory planning problem. This algorithm is then applied for the optimal control of the PUMA 560 manipulator. The simulation results show that, there is always one and only one leading link at an instant of the motion for the time optimal control case. This is in contrast to the point-topoint minimum-time control law which requires that at least one of the control torques is always in saturation• For the mixed time-energy optimal problem, the results show that each link may use the inertial, gravitational and such effects to save energy. It is also observed that, some of the links needs zero torque to achieve their maximum velocities throughout the motion. R~'erences 1. T. Lozano-Perez, "A Simple Motion-Planning Algorithm for General Robot Manipulators", IEEE Journal of Roboti¢~ and Automation, VoI.RA-3, No.3, pp.224-238 (1987)• 2. Luh, J.Y.S. and Lin, O.S., "Optimum Path Planning for Robotic Manipulators", ASME Journal of Dynamic Systems, Measurement. and Control, Vol. 102, pp. 142-151 (1981). 3. Vukabratovic, M. and Kircanski, M., "A Method for Optimal Synthesis of Manipulation Robot Trajectories", ASME Journal of Dvnamic Systems. Measurement. and Control, Vol. 104, pp. 188-193 (1982). 4. Shin, K.G. and McKay, N.D., "A Dynamic Programming Approach to the Trajectory Planning of Robotic Manipulators", IEEE Transactions on Automatic Control, VoI.AC-31, No.6, pp.491-500 (1986). 5. Pfeiffer, F. and Johanni, R., "A Concept for Manipulator Trajectory Planning", IEEE Journal of Robotics and Automation, Vol.RA-3, No.2, pp. 115-123 (1987). 6. Singh S. and Leu M.C., "Optimal Trajectory Generation for Robotic Manipulators Using Dynamic Programming", ASME Journal of Dyn~ni~ Systems. Measurement, and Control, Vol.109, pp.88-96 (1987). 7. Desrochers, A.A. and Chen, Y., "Structure of Minimum-Time Control Law for Robotic Manipulators with Constrained Paths", Pr0c, IEEE GOTIf. on Roboti~:s and Automation, Arizona, May 14-19, pp.971-976, IEEE Computer Society Press (1989). 8. Zalzala, A.M.S. and Morris, A.S., "Structured Motion Planning in the Local Configuration Space", Robotica. Vol.9, pp.81-92 (1991). 9. Armstrong, B. Khatip O. and Burdick J., "The Explicit Dynamic Model and Inertial Parameters of the PUMA 560 Arm", Proc. IEEE Conf. on Robotics and Automation, San Francisco, CA, pp.510-518 (1986). 10. Denavit, J. and Hartenberg, R. S., "A Kinematic Notation for Lower-Pair Mechanisms Based on Matrices", Tran~;, of ASME. J. of Applied Mechanics, pp.215-221 (1955).