
Multibody dynamics for human-like locomotion

Mario Acevedo (a), Hiram Ponce (b)

(a) Universidad Panamericana, Facultad de Ingeniería, Zapopan, Jalisco, Mexico
(b) Universidad Panamericana, Facultad de Ingeniería, Ciudad de Mexico, Mexico

Design and Operation of Human Locomotion Systems, https://doi.org/10.1016/B978-0-12-815659-9.00003-2
© 2020 Elsevier Inc. All rights reserved.

1 Introduction

The human body can be considered as a multibody system, and the general methods of multibody dynamics have been applied to study its motion just as they have been for machines and mechanisms.

The study of machines that walk probably originated at the end of the 19th century. An early walking model appeared in about 1870. It used a linkage designed by Chebyshev to move the body along a straight horizontal path while the feet moved up and down to exchange support during stepping [1]. The performance of such machines was limited by their fixed patterns of motion, and by the late 1950s it had become clear that linkages providing fixed motion would not suffice: control would be needed. Digitally controlled legged robots started to appear in the late 1960s, but by the end of the 1970s they were still limited to quasistatic gaits, that is, slow walking motions with the center of mass (CoM) of the robot always kept above its feet. Dynamic legged locomotion started with the first dynamically walking bipedal robot at Tokyo University and the hopping and running monopedal, bipedal, and quadrupedal robots developed at MIT [2]. Meanwhile, it was demonstrated that stable dynamic walking motions could be obtained by purely mechanical means, which led to the concept of passive dynamic walking. The analysis and design of legged robots remained a matter of research in universities and laboratories until the P2 humanoid robot from Honda appeared, followed in 2000 by the Asimo humanoid robot. Progress over the last decades has been remarkable: how to make legged robots walk and run dynamically is understood, but how best to make them walk and run efficiently is still an open issue. This is where multibody dynamics for human-like locomotion becomes relevant.

Multibody dynamics for human-like locomotion is a topic of great relevance in current research in a number of fields. It could help in illness diagnosis and in rehabilitation, to restore or improve lost mobility. In robotics, it can be very helpful in developing human-like robots. Both fields meet in the prosthetic and orthotic sector [3]. Several efforts have been made to develop software tools that apply multibody dynamics in these areas; see, for example, Sherman et al. [4].

Robotics for rehabilitation treatment is an emerging field that is expected to grow. In this area, efforts are being made to replace the physical training effort of a therapist, allowing more intensive repetitive motions and delivering therapy at a reasonable cost, and to assess quantitatively the level of motor recovery by measuring force and movement patterns. Although passive robotic rehabilitation devices are less complex, they are limited compared with active devices, for which a multibody model is essential. See Díaz et al. [5] for a complete account of applications.

One of the most popular uses of multibody dynamics is the analysis of real motions, where the inverse dynamics problem is solved for a model animated with some acquired motion; see Gong et al. [6]. As a result, the driving torques at the joints are obtained. Inverse and forward dynamics are used for true motion prediction, a very challenging problem that is currently a topic of intensive research. In addition, compliance at the joints deserves special attention: it has been somewhat neglected in research but has great influence [7].

In the case of humanoid robots, methods from multibody dynamics can be applied to simulate their motion. Practical difficulties can appear in general due to the unstable nature of gait (see, for example, Khusainov et al. [8]) and to the use of foot-ground contact models [9]. Dynamic balancing concepts have also contributed to human-like locomotion; see, for example, Takayuki et al. [10] and Stojicevic et al. [11].

In this chapter, a force-balanced mechanism is proposed as a leg to be part of a biped robot. Stability is analyzed through the application of learning approaches based on an artificial intelligence method, namely artificial hydrocarbon networks (AHN). Modeling and results from multibody dynamics simulation are presented.

2 Stability in human-like locomotion

The problem of equilibrium is critical for the planning, control, and analysis of legged robots [2]. Control algorithms for legged robots use equilibrium criteria to avoid falls [12, 13]. Regardless of the application, the computational efficiency of the equilibrium tests is critical, as computation time is often the bottleneck of control algorithms.

To achieve static equilibrium (i.e., a dynamic wrench equal to zero), the total wrench of gravity and contact forces must be equal to zero. This means that the CoM of the mechanism projects vertically inside the convex hull of the contact points when these points are located in the same horizontal plane. To check this condition, it is necessary to calculate the horizontal momentum rotation with respect to the CoM, whose location must, in general, also be calculated at every instant. For arbitrary contact geometries, more complex and computationally expensive techniques are required to check equilibrium, and some applications require doing so 100–1000 times per second [14].

Designing equilibrium controllers for legged robots is a challenging problem [15]. Inverted pendulums, ballbots, hoppers, and cart-pole systems have been reviewed in the literature [15–17]. It has been shown that nonlinear or more complex control systems have to be designed to balance them, which also increases the computational cost and demands robust actuators [17, 18].

In this work, we propose a force-balanced mechanism as a building element for the synthesis of legged robots whose balance can be easily controlled. The mechanism has two degrees of freedom (DOFs), as opposed to the more traditional one-DOF linkages generally used as legs in robotics [19]. As the mechanical system is balanced, the CoM of the whole system lies at a specific location within the mechanism. This facilitates the efficient use of the "projection of the center of mass" criterion with the aid of a counter-rotating inertia, and reduces the number of calculations required by the control algorithm to check equilibrium and correct the CoM position. We conducted different experiments to balance the mechanism and to track unstable set-point positions. To do so, we implemented proportional error controllers with different strategies as well as learning approaches based on an artificial intelligence method, namely AHN.
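The "projection of the center of mass" criterion can be illustrated with a minimal sketch for the planar case, where the convex hull of ground contact points reduces to an interval on the horizontal axis. The function names and the optional `margin` parameter are assumptions made for this illustration and do not come from the chapter.

```python
def center_of_mass_x(masses, xs):
    """Horizontal CoM coordinate of a set of point masses (planar case)."""
    total = sum(masses)
    return sum(m * x for m, x in zip(masses, xs)) / total


def is_statically_balanced(com_x, contact_xs, margin=0.0):
    """True if the CoM projects inside the support interval.

    In the plane, the convex hull of contact points lying on the ground
    is the interval [min(contact_xs), max(contact_xs)]. `margin` is a
    hypothetical safety distance from the interval ends.
    """
    return min(contact_xs) + margin <= com_x <= max(contact_xs) - margin
```

For a balanced mechanism whose CoM sits at a known point, the first function is not needed at run time: only the interval test remains, which is the computational saving the text refers to.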

3 One-leg mechanism model

The proposed mechanism is presented in Fig. 1. It is a closed-loop mechanism formed by two inverted double pendula balanced by force [20]. The right-side pendulum is formed by bars OA and CF, while the left one is formed by bars OB and DG. Both double pendula have a counterweight (green disc), at points F and G, respectively.


Fig. 1 Proposed one-leg mechanism.

This configuration places the CoM of the mechanism exactly at point E. Thus it is very easy to check the equilibrium condition: in this case, the CoM of the system has to be kept in a vertical position. The mechanism has five DOFs if it is not in contact with the ground; otherwise it has two DOFs when point O is fixed (the lock-up case) or three DOFs when point O is sliding. The driven DOFs are the angle θ1, which stretches and shrinks the leg, and the angle θ2, which moves a counter inertia, defined by the bar EH and a mass, that helps the system to equilibrate. In this way, the CoM of the whole leg system can be positioned so that static equilibrium is reached in a tilted pose, where the angle θ3 is different from 90 degrees.

3.1 Kinematics

The dynamic equations of motion of the mechanism have been obtained following a multibody approach. In this approach the multibody system is built from a group of rigid bodies whose motion depends on the kinematic constraints and on the applied forces.


3.1.1 Kinematics

If the proposed planar mechanism is made up of b moving rigid bodies, the number of Cartesian generalized coordinates is n = 3b. Thus the vector of generalized coordinates for the system can be written as Eq. (1), where $q_b = [x_b, y_b, \phi_b]^T$, in this case for all b = 1, …, 5:

\[ q = [q_1, q_2, q_3, q_4, q_5]^T \tag{1} \]

On the other hand, a revolute kinematic joint between bodies i and j introduces a pair of constraints that in general can be described by Eq. (2), where $r_i$ is the position vector of the CoM of body i, $A_i$ is its rotation matrix, and $s'_i$ is the local coordinates vector that positions the kinematic pair with respect to its local reference frame. A similar description applies to the terms corresponding to body j, thus we can write:

\[ \Phi_m = (r_i + A_i s'_i) - (r_j + A_j s'_j) \tag{2} \]

The complete set of m kinematic constraints, dependent on the generalized Cartesian coordinates, can be expressed as:

\[ \Phi(q, t) = 0 \tag{3} \]

The first derivative of Eq. (3) with respect to time is used to obtain the velocities, Eq. (4), while the second derivative of Eq. (3) with respect to time yields the acceleration equations, Eq. (5), where $\Phi_q$ is the Jacobian matrix, and b and c are the right-hand sides of the velocity and acceleration equations, respectively, containing time-dependent expressions.

\[ \Phi_q \dot{q} = b \tag{4} \]

\[ \Phi_q \ddot{q} = c \tag{5} \]
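As an illustration of Eqs. (2) to (4), the following sketch evaluates the revolute-joint constraint between two planar bodies and its Jacobian with respect to their coordinates. It is a minimal example written for this text, assuming the usual planar rotation matrix; the function names are hypothetical.

```python
import numpy as np


def rot(phi):
    """Planar rotation matrix A(phi)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s], [s, c]])


def rot_derivative(phi):
    """Derivative of A(phi) with respect to phi."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[-s, -c], [c, -s]])


def revolute_constraint(qi, qj, si, sj):
    """Eq. (2): (r_i + A_i s'_i) - (r_j + A_j s'_j), with q = [x, y, phi]."""
    return (qi[:2] + rot(qi[2]) @ si) - (qj[:2] + rot(qj[2]) @ sj)


def revolute_jacobian(qi, qj, si, sj):
    """2x6 Jacobian of Eq. (2) with respect to [q_i, q_j]."""
    Ji = np.hstack([np.eye(2), (rot_derivative(qi[2]) @ si).reshape(2, 1)])
    Jj = np.hstack([-np.eye(2), -(rot_derivative(qj[2]) @ sj).reshape(2, 1)])
    return np.hstack([Ji, Jj])
```

Each body contributes a block of the form [I, r] or [-I, -r] to the Jacobian, which is the pattern that appears in the system Jacobian of the one-leg mechanism.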

3.2 Dynamics

The equations of motion for the constrained multibody system are obtained by applying the virtual power principle [21], and can be expressed as:

\[ M\ddot{q} + \Phi_q^T \lambda = f \tag{6} \]

where M is the diagonal mass matrix, λ is the vector of Lagrange multipliers, directly associated with the reactions at the joints, and f is the vector of applied external forces. For dynamic analysis, the kinematic constraint equations determine the algebraic configuration, while the dynamical behavior is defined by the second-order differential equations. Therefore, Eqs. (5) and (6) are arranged to form a set of differential-algebraic equations, Eq. (7):

\[
\begin{bmatrix} M & \Phi_q^T \\ \Phi_q & 0 \end{bmatrix}
\begin{bmatrix} \ddot{q} \\ \lambda \end{bmatrix}
=
\begin{bmatrix} f \\ \gamma \end{bmatrix}
\tag{7}
\]

where γ denotes the right-hand side c of the acceleration equation, Eq. (5). The position problem (3) and the velocity problem (4) have to be solved every specific number of time steps to reduce the error accumulated in numerical simulations and obtain an accurate solution.

For the proposed one-leg mechanism, three different cases can be analyzed: lock up or stiction (when point O is fixed to the ground at some location), sliding (when point O is in contact with the ground and moving in the horizontal direction), and contact (when point O loses contact with the ground because gait simulation is in process). These alternatives are described in the following sections.

3.2.1 Lock up or stiction

In this case Eq. (7) is modified to include the stiction constraint equation:

\[ \Phi^r = 0 \tag{8} \]

so that Eq. (7) becomes:

\[
\begin{bmatrix} M & \Phi_q^T & \Phi_q^{rT} \\ \Phi_q & 0 & 0 \\ \Phi_q^r & 0 & 0 \end{bmatrix}
\begin{bmatrix} \ddot{q} \\ \lambda \\ \lambda^r \end{bmatrix}
=
\begin{bmatrix} f \\ \gamma \\ \gamma^r \end{bmatrix}
\tag{9}
\]
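The structure of the augmented system in Eq. (7) can be sketched on a much smaller problem than the one-leg mechanism: a single uniform bar pinned to the ground at one end. The example is hypothetical; the bar mass and length are borrowed from the chapter's data only as convenient numbers, and the inertia formula for a slender uniform bar is an assumption of this sketch.

```python
import numpy as np


def solve_augmented(M, Phi_q, f, gamma):
    """Solve Eq. (7) for the accelerations and the Lagrange multipliers."""
    n, m = M.shape[0], Phi_q.shape[0]
    K = np.block([[M, Phi_q.T], [Phi_q, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([f, gamma]))
    return sol[:n], sol[n:]


# Hypothetical test case: bar of length L pinned at the origin, at rest,
# horizontal (phi = 0); coordinates q = [x, y, phi] of its CoM.
m_bar, L, g = 0.0915, 0.30, 9.81
J_bar = m_bar * L**2 / 12.0          # slender uniform bar (assumption)
half = L / 2.0
phi, omega = 0.0, 0.0

M = np.diag([m_bar, m_bar, J_bar])
# Constraints: x - half*cos(phi) = 0 and y - half*sin(phi) = 0
Phi_q = np.array([[1.0, 0.0, half * np.sin(phi)],
                  [0.0, 1.0, -half * np.cos(phi)]])
gamma = np.array([-half * np.cos(phi) * omega**2,
                  -half * np.sin(phi) * omega**2])
f = np.array([0.0, -m_bar * g, 0.0])

qdd, lam = solve_augmented(M, Phi_q, f, gamma)
```

For this configuration the angular acceleration returned in `qdd[2]` can be checked against the closed-form value for a pinned bar released from horizontal, -3g/(2L).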

With reference to Fig. 1, all terms in Eq. (9) can be obtained. Body 1 is the bar OA, body 2 the bar OB, body 3 the bar CF, body 4 the bar DG, and body 5 the bar EH. For this case the complete set of m kinematic constraints, m = 12, dependent on the n = 15 generalized Cartesian coordinates, can be expressed as:

\[
\Phi(q) =
\begin{bmatrix}
l_{OD}\cos(\phi_2) - l_{OC}\cos(\phi_1) - x_2 + x_1 \\
l_{OD}\sin(\phi_2) - l_{OC}\sin(\phi_1) - y_2 + y_1 \\
y_2 - l_{OD}\sin(\phi_2) \\
-l_{CI}\cos(\phi_3) - x_1 + x_3 \\
-l_{CI}\sin(\phi_3) - y_1 + y_3 \\
l_{DJ}\cos(\phi_4) - x_4 + x_2 \\
l_{DJ}\sin(\phi_4) - y_4 + y_2 \\
l_{EJ}\cos(\phi_4) - l_{EI}\cos(\phi_3) - x_4 + x_3 \\
l_{EJ}\sin(\phi_4) - l_{EI}\sin(\phi_3) - y_4 + y_3 \\
l_{HE}\cos(\phi_5) - l_{EI}\cos(\phi_3) - x_5 + x_3 \\
l_{HE}\sin(\phi_5) - l_{EI}\sin(\phi_3) - y_5 + y_3
\end{bmatrix}
= 0
\tag{10}
\]

where $l_{MN}$ is the distance between points M and N. The stiction constraint is:

\[ \Phi^r(q) = [\, x_2 - l_{OD}\cos(\phi_2) \,] = 0 \tag{11} \]

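Eq. (10) can be reorganized in a single set of constraints, $\Phi^s = 0$, to facilitate calculation. The Jacobian matrix of the reorganized system, $\Phi_q^s$, can then be expressed as:

\[
\Phi_q^s =
\begin{bmatrix}
I_2 & r_{OC} & -I_2 & -r_{OD} & 0_2 & 0_1 & 0_2 & 0_1 & 0_2 & 0_1 \\
0_2 & 0_1 & I_2 & r_{OD} & 0_2 & 0_1 & 0_2 & 0_1 & 0_2 & 0_1 \\
-I_2 & 0_1 & 0_2 & 0_1 & I_2 & r_{CI} & 0_2 & 0_1 & 0_2 & 0_1 \\
0_2 & 0_1 & I_2 & 0_1 & 0_2 & 0_1 & -I_2 & -r_{DJ} & 0_2 & 0_1 \\
0_2 & 0_1 & 0_2 & 0_1 & I_2 & r_{EI} & -I_2 & -r_{EJ} & 0_2 & 0_1 \\
0_2 & 0_1 & 0_2 & 0_1 & I_2 & r_{EI} & 0_2 & 0_1 & -I_2 & -r_{HE}
\end{bmatrix}
\tag{12}
\]

where $I_2$ is the 2 × 2 identity matrix, $0_2$ is the 2 × 2 zero matrix, $0_1$ is the 2 × 1 zero vector, and:

\[
r_{OC} = l_{OC}\begin{bmatrix} \sin(\phi_1) \\ -\cos(\phi_1) \end{bmatrix};\quad
r_{OD} = l_{OD}\begin{bmatrix} \sin(\phi_2) \\ -\cos(\phi_2) \end{bmatrix};\quad
r_{CI} = l_{CI}\begin{bmatrix} \sin(\phi_3) \\ -\cos(\phi_3) \end{bmatrix}
\]

\[
r_{EI} = l_{EI}\begin{bmatrix} \sin(\phi_3) \\ -\cos(\phi_3) \end{bmatrix};\quad
r_{DJ} = l_{DJ}\begin{bmatrix} \sin(\phi_4) \\ -\cos(\phi_4) \end{bmatrix};\quad
r_{EJ} = l_{EJ}\begin{bmatrix} \sin(\phi_4) \\ -\cos(\phi_4) \end{bmatrix}
\]

\[
r_{HE} = l_{HE}\begin{bmatrix} \sin(\phi_5) \\ -\cos(\phi_5) \end{bmatrix}
\]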

In this case b = 0 because the constraints do not depend explicitly on time, and c can be expressed as:

\[
c =
\begin{bmatrix}
l_{OD}\cos(\phi_2)\,\omega_2^2 - l_{OC}\cos(\phi_1)\,\omega_1^2 \\
l_{OD}\sin(\phi_2)\,\omega_2^2 - l_{OC}\sin(\phi_1)\,\omega_1^2 \\
-l_{OD}\cos(\phi_2)\,\omega_2^2 \\
-l_{OD}\sin(\phi_2)\,\omega_2^2 \\
-l_{CI}\cos(\phi_3)\,\omega_3^2 \\
-l_{CI}\sin(\phi_3)\,\omega_3^2 \\
l_{DJ}\cos(\phi_4)\,\omega_4^2 \\
l_{DJ}\sin(\phi_4)\,\omega_4^2 \\
l_{EJ}\cos(\phi_4)\,\omega_4^2 - l_{EI}\cos(\phi_3)\,\omega_3^2 \\
l_{EJ}\sin(\phi_4)\,\omega_4^2 - l_{EI}\sin(\phi_3)\,\omega_3^2 \\
l_{HE}\cos(\phi_5)\,\omega_5^2 - l_{EI}\cos(\phi_3)\,\omega_3^2 \\
l_{HE}\sin(\phi_5)\,\omega_5^2 - l_{EI}\sin(\phi_3)\,\omega_3^2
\end{bmatrix}
\tag{13}
\]


The diagonal mass matrix is:

\[
M = \mathrm{diag}\left(m_1 I_2,\; J_1,\; m_2 I_2,\; J_2,\; m_3 I_2,\; J_3,\; m_4 I_2,\; J_4,\; m_5 I_2,\; J_5\right)
\tag{14}
\]

where $m_i$ and $J_i$ are the mass and moment of inertia of body i, respectively. Finally,

\[
f = [\,0,\; -g m_1,\; 0,\; 0,\; -g m_2,\; 0,\; 0,\; -g m_3,\; \tau_1 + \tau_2,\; 0,\; -g m_4,\; -\tau_1,\; 0,\; -g m_5,\; -\tau_2\,]^T
\tag{15}
\]

where g is the acceleration of gravity, τ1 is the applied torque associated with the angle θ1 between bodies 3 and 4, and τ2 is the applied torque associated with the angle θ2 between bodies 3 and 5.

3.2.2 Sliding

In the case of sliding, the constraints are those of Eq. (10), because point O can move in the x direction. The number of DOFs is increased by one. Unlike lock up, in sliding the external forces vector depends on some elements of the vector λ, so Eq. (6) changes to:

\[ M\ddot{q} + \Phi_q^T \lambda = f + f(\lambda) \tag{16} \]


where f(λ) are the forces that depend on some joint reactions. In our case the friction force at point O depends on the reaction between the contact point O and the ground, the reaction associated with the third constraint in Eq. (10). To solve these multibody dynamics equations it is necessary to use an iterative process to calculate the correct reactions at the sliding joints, as described in García de Jalón and Bayo [21]. In Eq. (16), f(λ) can depend on joint geometry and position. Thus the external forces vector is expressed as:

\[
f = [\,\mu\,\mathrm{sign}(\dot{x}_O)\,\lambda_1,\; -g m_1,\; \mu\,\mathrm{sign}(\dot{x}_O)\,\lambda_1\,(y_1 - l_{OC}\sin(\phi_1)),\; 0,\; -g m_2,\; 0,\; 0,\; -g m_3,\; \tau_1 + \tau_2,\; 0,\; -g m_4,\; -\tau_1,\; 0,\; -g m_5,\; -\tau_2\,]^T
\tag{17}
\]

where μ is the friction coefficient, $\mathrm{sign}(\dot{x}_O)$ is the sign of the velocity of point O in the x direction, and λ1 is the ground reaction (the normal force). The reactions at the joints can be calculated with an algorithm based on fixed-point iteration:

Algorithm 1 Iterative process to calculate joint reactions in sliding condition

    Estimate λ0
    repeat
        Compute f(λ0)
        Calculate λ from Eq. (6) adjusted with f(λ0)
        converged ← abs(λ − λ0) < tolerance
        λ0 ← λ
    until converged
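A minimal sketch of the fixed-point iteration of Algorithm 1 is given below. It assumes the acceleration constraints are enforced through the augmented system of Eq. (7) and that the λ-dependent force is supplied as a callable; the function name, the `friction` callback, and the default tolerances are hypothetical choices for illustration.

```python
import numpy as np


def solve_with_friction(M, Phi_q, f_applied, gamma, friction, lam0,
                        tol=1e-10, max_iter=50):
    """Fixed-point iteration on the multipliers for Eq. (16).

    `friction(lam)` returns the lambda-dependent force vector f(lambda).
    """
    n, m = M.shape[0], Phi_q.shape[0]
    K = np.block([[M, Phi_q.T], [Phi_q, np.zeros((m, m))]])
    lam_old = np.asarray(lam0, dtype=float)
    for _ in range(max_iter):
        rhs = np.concatenate([f_applied + friction(lam_old), gamma])
        sol = np.linalg.solve(K, rhs)
        qdd, lam = sol[:n], sol[n:]
        if np.max(np.abs(lam - lam_old)) < tol:
            return qdd, lam
        lam_old = lam
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical check: a block of mass 2 kg pushed with 20 N on flat ground.
mass, g, mu, sign_vx = 2.0, 9.81, 0.5, 1.0
M = np.diag([mass, mass])
Phi_q = np.array([[0.0, 1.0]])            # constraint y = 0
gamma = np.array([0.0])
f_applied = np.array([20.0, -mass * g])
friction = lambda lam: np.array([mu * sign_vx * lam[0], 0.0])

qdd, lam = solve_with_friction(M, Phi_q, f_applied, gamma, friction, [0.0])
```

With the sign convention of Eq. (6) the multiplier of an upward ground reaction is negative, so the term μ sign(ẋ) λ opposes the motion in this toy case.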


3.3 Biped robot, gait design

The proposed leg mechanism can be part of a new concept for a biped robot, presented in Fig. 2 (the counterweights are omitted for clarity). This robot has a trunk contributing three DOFs: the displacement of point A, with coordinates (x1, y1), and its orientation with respect to the horizontal line, ϕ1. The robot also has two of the proposed one-leg mechanisms connected to the trunk, contributing four additional DOFs: the angle between the two upper bars of the red leg, θ1; the angle between the two upper bars of the blue leg, θ3; the angle between the right upper bar of the red leg and the trunk, θ2; and the angle between the right upper bar of the blue leg and the trunk, θ4. A balancing pendulum connected to the trunk contributes one additional DOF, θ5. In total this system has eight DOFs.

Fig. 2 Biped robot in the plane, using two of the proposed leg mechanisms.


For this system the balancing principles can also help in designing the gait. Considering that point P2 is in contact with the ground, the robot really has six DOFs. We can then define a specific trajectory for point P1, being careful to impose zero velocity and acceleration at its start and end. In this way we avoid impact between P1 and the ground. It is then possible to define the motion of point A by reducing the shaking force of the robot through optimal control of the acceleration of the total mass center of the moving links. This means minimizing the norm of the CoM acceleration along the trajectory. This can be done using the "bang-bang" profile, as described in Briot et al. [22], by defining a specific gait time. The shaking force can be calculated in a straightforward way following the procedure described in Acevedo [23]. To define the complete motion of the robot, it is possible to assign a constant value to the angle θ5 and minimize the shaking moment by imposing counter-rotation of both legs. This is very easy because the CoM of the system is located at point A. Once the complete motion of the system is defined, it is possible to use Eq. (6) to solve the inverse dynamics problem as:

\[ \Phi_q^T \lambda = f - M\ddot{q} \tag{18} \]
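The "bang-bang" profile mentioned above can be sketched for a one-dimensional rest-to-rest move: constant acceleration during the first half of the motion and the opposite acceleration during the second half. This is a generic illustration, not the procedure of Briot et al. [22]; the function name and arguments are assumptions.

```python
def bang_bang(t, distance, duration):
    """Position, velocity, and acceleration of a rest-to-rest bang-bang move.

    The acceleration magnitude 4*distance/duration**2 is the constant value
    that covers `distance` in `duration` when the sign switches at mid-time.
    """
    a = 4.0 * distance / duration**2
    half = duration / 2.0
    if t <= 0.0:
        return 0.0, 0.0, 0.0
    if t >= duration:
        return distance, 0.0, 0.0
    if t < half:
        return 0.5 * a * t**2, a * t, a
    remaining = duration - t
    return distance - 0.5 * a * remaining**2, a * remaining, -a
```

For example, `bang_bang(0.5, 0.1, 1.0)` evaluates the profile at mid-time of a 0.1 m move lasting 1 s, where the velocity reaches its peak of 2 × distance / duration.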

4 Control of and learning the balancing task

At first glance, designing the balancing control system for the one-leg mechanism might seem a challenging task [15]. Nevertheless, the force-balanced mechanical design provides a simple way to tackle it. Thus, in this section, we propose two approaches to solve the balancing problem in the mechanism: (i) designing an intelligent controller and (ii) learning control actions from scratch using reinforcement learning (RL). We introduce these strategies and also describe AHN, which suit the balancing task under disturbances.

4.1 Overview of artificial hydrocarbon networks

In machine learning, the AHN algorithm is a supervised learning method inspired by the inner mechanisms and interactions of chemical hydrocarbon compounds [24]. This method aims to model data points as packages of information, called molecules. The interaction among these units allows capturing the nonlinearity of data correlation. From this point of view, an artificial hydrocarbon compound is built, and it can be seen as a net of molecules. If required, more than one artificial hydrocarbon compound can be added up to finally get a mixture of compounds.

More precisely, in AHN the molecule is the basic unit of information processing. It produces an output response φ(x) due to an input $x \in \mathbb{R}^k$, as expressed in Eq. (19), where $v_C \in \mathbb{R}$ represents a carbon value, $h_{i,r} \in \mathbb{R}$ are the hydrogen values attached to this carbon atom, and d represents the number of hydrogen atoms in the molecule.

\[ \varphi(x) = v_C \sum_{r=1}^{k} \prod_{i=1}^{d \le 4} (x_r - h_{i,r}) \tag{19} \]

Consider that molecules are unsaturated (i.e., d < 4); they are then able to join with other unsaturated molecules, forming chains. In AHN, these chains are called hydrocarbon compounds, or simply compounds. Throughout this chapter, compounds are made of n molecules: a linear chain of (n − 2) CH2 molecules with two CH3 molecules, one at each side of the CH2 chain, as shown in Eq. (20), where the symbol CHd represents a molecule with d hydrogen atoms [25].

\[ CH_3 - CH_2 - \cdots - CH_2 - CH_3 \tag{20} \]

A piecewise function ψ, given in Eq. (21), is associated with the compound, where $L_t = \{L_{t,1}, \ldots, L_{t,k}\}$ for all t = 0, …, n are bounds where molecules can act over the input space. This function represents the behavior of the compound due to an input x. Other compound functions have been developed, as presented in Ponce et al. [25], but the piecewise function has been the most widely adopted one [26–29].

\[
\psi(x) =
\begin{cases}
\varphi_1(x) & L_{0,r} \le x_r < L_{1,r} \\
\cdots & \cdots \\
\varphi_n(x) & L_{n-1,r} \le x_r \le L_{n,r}
\end{cases}
\tag{21}
\]

Finally, different compounds can be selected and added up to form complex structures called mixtures. In AHN, a mixture is a linear combination of compound behaviors $\psi_j$ in finite ratios $\alpha_j$, called stoichiometric coefficients or simply weights of compounds, as expressed in Eq. (22).

\[ S(x) = \sum_{j=1}^{c} \alpha_j \psi_j(x) \tag{22} \]
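To make Eqs. (19), (21), and (22) concrete, the following sketch evaluates a molecule, a compound, and a mixture for a scalar input (k = 1). It only covers evaluation, not training; the data layout (tuples of a carbon value and a list of hydrogen values, plus a list of bounds) is an assumption chosen for this illustration.

```python
import numpy as np


def molecule(x, v_c, hydrogens):
    """Eq. (19) for a scalar input: v_C * prod_i (x - h_i)."""
    return float(v_c * np.prod([x - h for h in hydrogens]))


def compound(x, molecules, bounds):
    """Eq. (21): evaluate the molecule whose interval contains x.

    `molecules` is a list of (v_C, hydrogens); `bounds` has one more entry
    than `molecules`. The last interval is closed on the right.
    """
    for t, (v_c, hydrogens) in enumerate(molecules):
        is_last = t == len(molecules) - 1
        if bounds[t] <= x < bounds[t + 1] or (is_last and x == bounds[t + 1]):
            return molecule(x, v_c, hydrogens)
    raise ValueError("x lies outside the bounds of the compound")


def mixture(x, compounds, weights):
    """Eq. (22): weighted sum of compound behaviors."""
    return sum(a * compound(x, mols, b)
               for a, (mols, b) in zip(weights, compounds))


# Hypothetical CH3-CH2-CH3 compound with made-up parameter values.
mols = [(1.0, [0.0, 1.0, 2.0]), (2.0, [0.5, 1.5]), (-1.0, [3.0, 4.0, 5.0])]
bounds = [-1.0, 1.0, 2.0, 6.0]
```

The two end molecules carry three hydrogen values and the inner one carries two, mirroring the CH3 and CH2 units of Eq. (20).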

For training purposes, the least squares error is used for obtaining carbon and hydrogen values, while bounds are computed using a gradient descent method with learning rate 0 < η < 1 based on the energy of adjacent molecules [25]. To this end, AHN is trained using the so-called AHN algorithm. Details can be found in Refs. [24, 25, 27, 28].

4.2 Artificial organic controller for balancing the one-leg mechanism

Intelligent control systems are able to work well in nonlinear and dynamic systems, such as those subject to changes in the operating point, environmental noise, disturbances, uncertainty in sensor measurements, and miscalibration, among other factors [30]. Intelligent control systems are primarily built on artificial intelligence approaches in order to tackle the above-mentioned issues. In particular, the literature reports several intelligent control systems that handle uncertainties and disturbances, for example, fuzzy controllers [31], neural controllers [30], and neuro-fuzzy controllers [32]. Recently, artificial organic controllers (AOCs) have also been proposed [29, 33]. In this regard, we propose to use an AOC to tackle the balancing problem in the one-leg mechanism, since AOCs have proved to be very effective in uncertain domains [29, 33–35].

4.2.1 Artificial organic controllers

An AOC is an intelligent control system that computes the control law using an ensemble method called the fuzzy-molecular inference (FMI) system [34], as shown in Fig. 3. FMI consists of three steps: fuzzification, fuzzy inference engine, and defuzzification based on AHN. The fuzzification and fuzzy inference engine steps are quite similar to those of fuzzy logic. An input x is mapped to a set of fuzzy sets using membership functions. Then, an inference operation, represented as a fuzzy rule, is applied to obtain a consequent value $y_p$. Considering the pth fuzzy rule $R_p$, denoted as Eq. (23), inference computes $y_p$ in terms of an artificial hydrocarbon compound with n molecules, $M_j$, each one with compound function $\varphi_j(x)$ for all j = 1, …, n. In this work, the membership value of $y_p$ is calculated using the min function, expressed as $\mu_\Delta(x_1, \ldots, x_k)$, over the fuzzy inputs.

\[ R_p:\ \text{if } x_1 \in A_1 \wedge \cdots \wedge x_k \in A_k, \text{ then } y_p = \varphi_j(\mu_\Delta(x_1, \ldots, x_k)) \tag{23} \]

Fig. 3 Block diagram of the fuzzy-molecular inference system (blocks: inputs, fuzzification, fuzzy inference engine with if-then fuzzy rules, defuzzification, outputs).

In the defuzzification step, the crisp output value y is computed using the center of gravity approach [34], as expressed in Eq. (24).

\[
y = \frac{\sum_p \mu_\Delta(x_1, \ldots, x_k)\, y_p}{\sum_p \mu_\Delta(x_1, \ldots, x_k)}
\tag{24}
\]
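The three FMI steps of Eqs. (23) and (24) can be sketched for a single input as follows. Everything specific here is an assumption for illustration: triangular membership functions, one molecule per rule with a single hydrogen value, and the numeric parameters. With one input, the min operation over the fuzzy inputs reduces to the membership value itself.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    return float(max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0))


def fmi(x, partitions, molecules):
    """Fuzzy-molecular inference for one input.

    partitions: list of (a, b, c) triangles, one per fuzzy rule.
    molecules:  list of (v_C, h), so that y_p = v_C * (mu - h), Eq. (23).
    Returns the center-of-gravity output of Eq. (24).
    """
    num = den = 0.0
    for (a, b, c), (v_c, h) in zip(partitions, molecules):
        mu = tri(x, a, b, c)          # fuzzification and rule firing
        y_p = v_c * (mu - h)          # molecular consequent
        num += mu * y_p
        den += mu
    return num / den if den > 0.0 else 0.0


# Hypothetical N / Z / P partitions and MN / MZ / MP molecules.
partitions = [(-2.0, -1.0, 0.0), (-1.0, 0.0, 1.0), (0.0, 1.0, 2.0)]
molecules = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
```

With these made-up values the output is antisymmetric in the input, which is the qualitative behavior expected from a proportional action.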

4.2.2 Design of the artificial organic controller

Two torque inputs are expected for the one-leg mechanism: τ1 and τ2. In order to design a controller for the balancing task in this system, we decided to decouple the two inputs such that τ1 regulates the distance R between point E and the origin O, and τ2 regulates the angle θ3 (see Fig. 1).

To design the AOC for balancing the one-leg mechanism, we propose to use a proportional (P) controller for each torque input. Thus, a P-based AOC, or P-AOC, is developed as follows. First, the error signal e(t) is considered as the input to the P-AOC, with three fuzzy partitions: "negative" (N), "zero" (Z), and "positive" (P). Fig. 4 shows the input membership functions used for both controlled variables. Then, the fuzzy rules for the P-AOC are proposed as summarized in Table 1. These rules consider the proportional action of the controller. Finally, a hydrocarbon compound of three molecules is proposed for the defuzzification step. This compound represents a molecular partition of the output signal of the control law u(t), considering "negative" (MN), "zero" (MZ), and "positive" (MP). Fig. 5 shows the two output hydrocarbon compounds, one for each controlled variable. Finally, Fig. 6 shows the overall balancing control system for the one-leg mechanism using the P-AOC strategy.

Fig. 4 Input membership functions of the error signal e(t), for both distance R and angle θ3.

Table 1 Fuzzy rules in the P-AOC

Error signal: e(t)    Control signal: u(t)
N                     MN
Z                     MZ
P                     MP

Fig. 5 Molecular partitions for the output signal u(t): (left) of the distance value R and (right) of the angle θ3.

Fig. 6 Block diagram of the balancing control system using the P-AOC strategy.

4.3 AHN-based reinforcement learning for balancing the one-leg mechanism

Alternatively, a learning approach using RL is proposed for tackling the balancing task in the one-leg mechanism. The goal of RL is to learn a policy π (i.e., a sequence of actions) that is performed by an agent or robot to reach a goal. RL in continuous states and actions has been studied on limited occasions [36]. For robotics, however, continuous RL is required for learning complex tasks [37], especially when control expertise is limited or the system is not completely known. Thus, we propose the use of AHN as a continuous RL method, called AHN-RL, to find a strategy for balancing the one-leg mechanism from scratch. It consists of four main steps: (i) initial data collection, (ii) learning the dynamics model, (iii) policy search, and (iv) updating the training dataset. Algorithm 2 depicts the overall functionality of AHN-RL.

In this work, we consider the dynamics function model $\hat{f}_\omega$ of a continuous system as Eq. (25), where $s_t \in \mathbb{R}^D$ represents the state and $a_t \in \mathbb{R}^F$ the action applied at time t.

\[ \hat{s}_{t+1} = s_t + \hat{f}_\omega(s_t, a_t) \tag{25} \]

The dynamics function $\hat{f}_\omega$ is parameterized with AHN, where the parameter vector ω represents the hydrogen, carbon, and bound values over the hydrocarbon compound (i.e., $\{h, v_C, L\}$ for all molecules). Then, $\hat{s}_{t+1}$ represents the predicted next state that occurs after the predicted change in state $s_t$ over the time step Δt.

4.3.1 Initial data collection

For training the AHN in AHN-RL, the first step is to collect data from the system. Initially, a set of roll-outs of random action signals is applied to retrieve data. The samples are tuples of the form $(s_{t-1}, a_{t-1}) \in \mathbb{R}^{D+F}$ as training inputs and differences $\Delta s_t = s_t - s_{t-1} \in \mathbb{R}^D$ as training outputs, collected in a dataset $\mathcal{D}$.

4.3.2 Learning of the dynamics model

Then, the dynamics function model $\hat{f}_\omega$ is trained by minimizing the error (26) using the gradient descent method over the training dataset $\mathcal{D}$. As described below, the dataset is updated several times to increase the accuracy of the learning process.

\[
E(\omega) = \frac{1}{2} \sum_{(s_{t-1},\, a_{t-1},\, s_t) \in \mathcal{D}} \left\| \Delta s_t - \hat{f}_\omega(s_{t-1}, a_{t-1}) \right\|^2
\tag{26}
\]

4.3.3 Policy search

Once the dynamics model is learned, a policy search method is applied to the learned model $\hat{f}_\omega$. For simplicity, we use the random-sampling shooting method described in Rao [38] and employed in Nagabandi et al. [36]. At each time step t, this policy evaluation generates K candidate action sequences of H actions each. Then, the learned dynamics model is applied to predict the future states resulting from running each action sequence, which is evaluated with a prior cost function $c(s_t, a_t)$ that encodes the task. The optimal action sequence is then selected, but only the first action $a_t$ is executed. Replanning is then done at the next step, as suggested in model predictive control (MPC) [36].

4.3.4 Updating of the dataset

Lastly, tuples $(s_{t-1}, a_{t-1})$ and $\Delta s_t$ resulting from this policy evaluation are collected in $\mathcal{D}_{\hat{f}}$ and added to the current training dataset $\mathcal{D}$ in order to update the learned model with this new information. Finally, the proposed method iterates over the dynamics model training, policy search, and collection of new training data until the task is learned.

Algorithm 2 Proposed AHN-based RL method

    Apply random control signals and collect initial training dataset D.
    repeat
        Learn the dynamics model f̂ω using D.
        π ← Policy search using f̂ω and MPC.
        Df̂ ← Collect new data (st−1, at−1) and Δst from π.
        Add new data to the training dataset: D ← D ∪ Df̂.
    until task learned
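The policy-search step of Algorithm 2 can be sketched as a random-sampling shooting routine. The sketch treats the learned dynamics model as any callable returning the predicted state change of Eq. (25), so an AHN is not required to run it; the function names, the uniform sampling of actions, and the toy model mentioned afterwards are assumptions for illustration.

```python
import numpy as np


def rollout_cost(model, cost, s0, actions):
    """Accumulated cost of applying `actions` from state s0 with the model."""
    s, total = np.asarray(s0, dtype=float), 0.0
    for a in actions:
        s = s + model(s, a)            # Eq. (25): next state = state + change
        total += cost(s, a)
    return total


def random_shooting(model, cost, s0, K, H, low, high, rng):
    """Sample K action sequences of length H and return the first action
    of the cheapest one, as done at every MPC step."""
    candidates = rng.uniform(low, high, size=(K, H, np.size(low)))
    costs = [rollout_cost(model, cost, s0, seq) for seq in candidates]
    return candidates[int(np.argmin(costs))][0]
```

As a usage example, a hypothetical one-dimensional model `lambda s, a: 0.1 * a` with cost `lambda s, a: float(np.sum(s**2))` drives the state toward zero; in the full loop, the executed action and the observed state change would then be appended to the training dataset.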

5 Experimental results

We numerically simulated the one-leg mechanism (Fig. 1). The distances are OA = OB = CF = DG = 30 cm, OC = CA = 15 cm, OD = DB = 15 cm, CE = EF = 15 cm, DE = EG = 15 cm, and EH = 12 cm. The physical parameters of the bodies are presented in Table 2.

Table 2 Physical parameters of the mechanism bodies

Body                  Mass (kg)   Moment of inertia (kg m²)
OA                    0.0915      0.0007621
OB                    0.0915      0.0007621
CF                    0.0915      0.0007621
DG                    0.0915      0.0007621
Disc at F             0.0915      0.000097
Disc at G             0.0915      0.000097
Counter inertia EH    0.271       0.0000736

Fig. 7 Output membership functions of the control signal u(t): (top) for the distance value R and (bottom) for the angle θ3.

To analyze the performance of the above-mentioned control and learning approaches, we applied the two AHN-based strategies to the balancing problem in the one-leg mechanism. To compare the performance of the AHN-based strategies, we developed: (i) a conventional P-controller, as expressed in Eq. (27), with proportional gain KP = diag{1, 10} and (ii) a fuzzy P-controller using almost the same input membership functions reported in Fig. 4, with the fuzzy rules of Table 1 and output membership functions like the ones shown in Fig. 7. Notice that the output control signal u(t) represents the applied torques, u(t) = [τ1(t), τ2(t)].

\[ u(t) = K_P\, e(t) + u(t - 1) \tag{27} \]
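A direct transcription of Eq. (27) into a discrete update is sketched below, using the gain KP = diag{1, 10} stated in the text. The function name and the ordering of the error vector (distance error first, angle error second) are assumptions of this sketch.

```python
import numpy as np

K_P = np.diag([1.0, 10.0])  # proportional gain from the text


def p_controller_step(error, u_prev, gain=K_P):
    """Eq. (27): u(t) = K_P e(t) + u(t - 1), returning [tau_1, tau_2]."""
    return gain @ np.asarray(error, dtype=float) + np.asarray(u_prev, dtype=float)
```

Because of the u(t − 1) term, the command accumulates the scaled error from one step to the next, so a constant error produces a steadily growing torque.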

For comparison purposes, four common performance indices for control systems were applied [39]: the integral of squared error (28), the integral of absolute error (29), the integral of time-multiplied squared error (30), and the integral of time-multiplied absolute error (31), where e(t) is the error signal at time t.

\[ \mathrm{ISE} = \int_0^\infty e^2(t)\,dt \tag{28} \]

\[ \mathrm{IAE} = \int_0^\infty |e(t)|\,dt \tag{29} \]

\[ \mathrm{ITSE} = \int_0^\infty t\,e^2(t)\,dt \tag{30} \]

\[ \mathrm{ITAE} = \int_0^\infty t\,|e(t)|\,dt \tag{31} \]
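For a sampled error signal, the indices of Eqs. (28) to (31) can be approximated numerically. The sketch below uses the trapezoidal rule over the simulated time window, which replaces the infinite upper limit; the function names are hypothetical and this is not necessarily how the chapter's values were computed.

```python
import numpy as np


def _trapezoid(y, t):
    """Trapezoidal integration of samples y over time stamps t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))


def performance_indices(t, e):
    """ISE, IAE, ITSE, and ITAE of an error signal sampled at times t."""
    t = np.asarray(t, dtype=float)
    e = np.asarray(e, dtype=float)
    return {
        "ISE": _trapezoid(e**2, t),
        "IAE": _trapezoid(np.abs(e), t),
        "ITSE": _trapezoid(t * e**2, t),
        "ITAE": _trapezoid(t * np.abs(e), t),
    }
```

The time-weighted indices penalize errors that persist late in the response, which is why two controllers with similar ISE can differ noticeably in ITSE or ITAE.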

5.1 Control of balancing task under disturbances This experiment aims to measure the output response of the P-AOC under instant disturbances of 20N horizontal force, one per 500 ms, during 10 s. The one-leg mechanism has initial conditions at distance R ¼ 0.21 m and angle θ3 ¼ 90 degrees. Fig. 8 shows a comparison between the output response of the P-AOC and the other controllers, and Table 3 summarizes the performance indices for this experiment. As notice, all the controllers have almost the same output response. Particularly to P-AOC, it performs better than the fuzzy-P controller in terms of the indices, and it performs the fastest response (see Fig. 8). In addition, Fig. 9 shows the torques τ1 and τ2 applied to the one-leg mechanism using the different controllers. It can be seen that P-AOC has advantage in targeting the distance value R more than the other controllers since it applies less torque to maintain the reference. Also, in terms of balancing associated to the angle θ3, Fig. 9 shows that P-AOC generates the fastest torque τ2. It is important to note that the actions performed assured that the one-leg mechanism did not slip.

5.2 Control of balancing task under noisy conditions

The goal of this experiment is to measure the output response of the P-AOC under noisy conditions (10% NSR) in addition to the 20 N horizontal force disturbances, applied in the same way as in the previous experiment. Again, the one-leg mechanism has initial conditions of distance R = 0.21 m and angle θ3 = 90 degrees. Fig. 10 compares the output response of the P-AOC with that of the other controllers under noisy conditions, and Table 4 summarizes the performance indices for this experiment. The P-AOC performs slightly better than the others in controlling the angle θ3, although all the controllers can deal with the noisy conditions. Moreover, Fig. 11 depicts the torques applied under noisy

Fig. 8 Comparative output response for the balancing task under disturbances. Panels: distance (m) and angle (°) versus time (s); curves: reference, conventional-P, fuzzy-P, and P-AOC.

Table 3 Performance indices of the controllers under disturbances

| Controller type | ISE (R) | IAE (R) | ITSE (R) | ITAE (R) | ISE (θ3) | IAE (θ3) | ITSE (θ3) | ITAE (θ3) |
|---|---|---|---|---|---|---|---|---|
| Conventional-P | 0.0002 | 0.0216 | 0.1104 | 0.0008 | 0.0053 | 0.1141 | 0.5767 | 0.0266 |
| Fuzzy-P | 0.0009 | 0.0278 | 0.0032 | 0.0900 | 0.0678 | 0.2828 | 0.2248 | 0.9598 |
| P-AOC | 0.0008 | 0.0182 | 0.0025 | 0.0536 | 0.0306 | 0.1322 | 0.1060 | 0.4395 |

Fig. 9 Applied torques to the one-leg mechanism using the controllers. Panels: torque 1 (Nm) and torque 2 (Nm) versus time (s); curves: conventional-P, fuzzy-P, and P-AOC.

Fig. 10 Comparative output response for the balancing task under noisy conditions. Panels: distance (m) and angle (°) versus time (s); curves: reference, conventional-P, fuzzy-P, and P-AOC.

Table 4 Performance indices of the controllers under noisy conditions

| Controller type | ISE (R) | IAE (R) | ITSE (R) | ITAE (R) | ISE (θ3) | IAE (θ3) | ITSE (θ3) | ITAE (θ3) |
|---|---|---|---|---|---|---|---|---|
| Conventional-P | 0.0002 | 0.0232 | 0.1169 | 0.0008 | 0.0055 | 0.1279 | 0.6396 | 0.0275 |
| Fuzzy-P | 0.0002 | 0.0230 | 0.1149 | 0.0008 | 0.0055 | 0.1277 | 0.6322 | 0.0275 |
| P-AOC | 0.0002 | 0.0252 | 0.1270 | 0.0009 | 0.0054 | 0.1291 | 0.6487 | 0.0271 |

Fig. 11 Applied torques to the one-leg mechanism using the controllers. Panels: torque 1 (Nm) and torque 2 (Nm) versus time (s); curves: conventional-P, fuzzy-P, and P-AOC.

conditions. Again, the P-AOC performs better in terms of τ1 and reacts faster in τ2. Notably, the control actions ensured that the one-leg mechanism did not slip.

5.3 Disturbance rejection

Another concern in balancing the one-leg mechanism is determining the maximum disturbance load that a controller can reject. This experiment therefore measures the maximum horizontal force that can be applied to the system before it slides and falls when using the P-AOC. The one-leg mechanism is subjected to instantaneous horizontal force disturbances starting at time t = 0 s with F = 0 N and increasing by 20 N every 500 ms over a maximum period of 10 s. Again, the one-leg mechanism has initial conditions of distance R = 0.21 m and angle θ3 = 90 degrees. Slipping is assumed to occur when the magnitude of the resultant horizontal force Rx at the contact point is greater than the friction force f, that is, when the condition in Eq. (32) holds, where μ is the static friction coefficient and Ry is the normal force.

$$|R_x| > |f = \mu R_y| \tag{32}$$

Fig. 12 shows the resultant horizontal force Rx in comparison with the friction force f in order to determine whether the mechanism is slipping, considering μ = 1 for rubber on dry concrete. As can be seen, all the controllers can reject disturbances above 200 N. In particular, the P-AOC significantly reduces the resultant horizontal force Rx applied at the contact point for disturbances in the range from 0 to 160 N, in contrast with the other controllers.
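The slipping test of Eq. (32) and the increasing disturbance described in this experiment can be sketched as follows. This is an illustration only: the function names are hypothetical, the disturbance magnitude is assumed to follow a staircase that grows by 20 N every 500 ms, and the reaction forces in the example are made-up numbers rather than simulation outputs.

```python
import numpy as np

def is_slipping(rx, ry, mu=1.0):
    """Eq. (32): slipping when |R_x| exceeds the friction force |mu * R_y|."""
    rx = np.asarray(rx, dtype=float)
    ry = np.asarray(ry, dtype=float)
    return np.abs(rx) > np.abs(mu * ry)

def disturbance_magnitude(t, step_force=20.0, step_period=0.5):
    """Disturbance magnitude (N): 0 N at t = 0, plus step_force every step_period seconds."""
    return step_force * np.floor(np.asarray(t, dtype=float) / step_period)

t = np.array([0.0, 0.4, 0.5, 1.0, 9.9])
force = disturbance_magnitude(t)

# Hypothetical reaction forces (N), for illustration only.
rx = np.array([10.0, 150.0, 300.0])
ry = np.array([100.0, 200.0, 250.0])
slip = is_slipping(rx, ry, mu=1.0)
```

With μ = 1, as used in the chapter for rubber on dry concrete, the check reduces to comparing the magnitudes of the horizontal and normal reaction forces.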

In summary, the P-AOC can deal with disturbances and noisy conditions in the balancing task of the one-leg mechanism. It can also be observed that the P-AOC rejects larger disturbances than the other control strategies. Furthermore, the mechanical balance of the one-leg mechanism makes it easy to control, even with the conventional-P control strategy.

5.4 Learning of balancing task

Alternatively, we ran an experiment for learning the balancing task in the one-leg mechanism from scratch and without any prior knowledge. This experiment aims to determine the performance of the output response of the policy π obtained through the AHN-RL approach.

Fig. 12 Graph showing the slipping condition when applying the controllers: (top) conventional-P, (middle) fuzzy-P, and (bottom) P-AOC. Each panel plots the horizontal reaction and friction forces (N) against the disturbance force (N).

Thus, the balancing task was set up as follows. A state of the mechanism comprises the distance R and the angle θ3, such that $s_t = (R_t, \theta_{3,t})$. The goal state is $s_{\mathrm{goal}} = (R, \theta_3)$, and the cost function $c(s_t, a_t)$ is expressed as Eq. (33), where penalty is a positive value that punishes slipping behavior in the mechanism.

$$c(s_t, a_t) = \begin{cases} \sqrt{\sum_i \left(s_{\mathrm{goal},i} - s_{t,i}\right)^2}, & |R_x| \le |f = \mu R_y| \\ \mathit{penalty}, & |R_x| > |f = \mu R_y| \end{cases} \tag{33}$$
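A direct reading of Eq. (33) can be sketched as below. The penalty value, the function name and the example states are assumptions made for illustration; the chapter only states that the penalty is positive. The state is taken as the pair (R, θ3), in whatever units the simulation uses.

```python
import numpy as np

def cost(state, goal, rx, ry, mu=1.0, penalty=100.0):
    """Eq. (33): Euclidean distance to the goal state, or a fixed penalty when slipping."""
    if abs(rx) > abs(mu * ry):
        # Slipping condition of Eq. (32) holds: return the penalty.
        return float(penalty)
    diff = np.asarray(goal, dtype=float) - np.asarray(state, dtype=float)
    return float(np.sqrt(np.sum(diff**2)))

goal = (0.21, 90.0)                                      # goal state (R, theta3)
c_ok = cost((0.21, 93.0), goal, rx=50.0, ry=200.0)       # no slipping
c_slip = cost((0.21, 93.0), goal, rx=250.0, ry=200.0)    # slipping
```

Since the distance and the angle enter the same Euclidean norm, their relative scaling affects which error dominates the cost.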

The initial dataset D was collected from 10 roll-outs with random initial states $s_{\mathrm{initial}} = (0.21, 90 \pm 15)$ for coverage. The number of training iterations for the AHN was fixed at five. We set the AHN with 10 molecules and learning rate η = 0.01. We preprocessed the training data by normalizing it to mean 0 and standard deviation 1, ensuring that the loss function weights all parts of the state equally. For the policy search, we set K = 100 candidate sequences and H = 3 horizon steps. After training, we performed an online policy search π via MPC with K = 3 and H = 3, using the trained AHN-RL.

For the training step, Fig. 13 shows the estimates of the AHN model in contrast with the target values from D. The estimates closely approximate the target values, with an accuracy of 99.42%, so the model can be considered well trained. After that, the policy π was computed. Fig. 14 shows the output response of the policy π found by AHN-RL in the two control experiments explained earlier: output response under disturbances and under noisy conditions. Table 5 summarizes the performance indices for the entire experiment. As shown, the output response of the AHN-RL controller under disturbances and under noisy conditions is almost the same. The reference distance is not reached, but the indices confirm only a slight variation, less than 0.0008 in ISE and 0.0039 in ITAE, which are considered good values. In terms of the angle, the controller performs well under both disturbances and noisy conditions, showing that the AHN-RL controller can learn to balance the one-leg mechanism from scratch. Moreover, Fig. 15 shows the torques applied during the two experiments. The torques are similar in both conditions, and they are the smallest applied over the entire experimentation (i.e., τ1 ∈ [−0.1, 0.1] and τ2 ∈ [−0.05, 0.05]). This behavior gives insight into the power of learning a task with the RL approach.
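The policy search with K candidate sequences and H horizon steps can be illustrated with a random-shooting MPC loop in the style of [36]. This is a sketch under stated assumptions, not the chapter's implementation: the sampling scheme (uniform random actions), the action bounds, the toy linear dynamics that stands in for the trained AHN model, and all function names are hypothetical.

```python
import numpy as np

def random_shooting_mpc(state, dynamics, cost, k=100, horizon=3,
                        action_low=-1.0, action_high=1.0, rng=None):
    """Return the first action of the cheapest of k random action sequences."""
    rng = np.random.default_rng() if rng is None else rng
    state = np.asarray(state, dtype=float)
    low = np.asarray(action_low, dtype=float)
    high = np.asarray(action_high, dtype=float)
    # k candidate sequences, each with `horizon` actions drawn uniformly.
    candidates = rng.uniform(low, high, size=(k, horizon) + low.shape)
    best_cost, best_action = np.inf, None
    for sequence in candidates:
        s, total = state.copy(), 0.0
        for action in sequence:
            s = dynamics(s, action)      # predicted next state
            total += cost(s, action)     # accumulate cost along the horizon
        if total < best_cost:
            best_cost, best_action = total, sequence[0]
    return best_action

# Toy stand-ins (assumptions): a linear model replaces the trained dynamics model.
goal = np.array([0.21, 90.0])

def toy_dynamics(s, a):
    return s + 0.1 * a

def toy_cost(s, a):
    return float(np.linalg.norm(goal - s))

action = random_shooting_mpc(
    state=[0.21, 95.0], dynamics=toy_dynamics, cost=toy_cost,
    k=100, horizon=3, action_low=[-1.0, -1.0], action_high=[1.0, 1.0],
    rng=np.random.default_rng(0),
)
```

Only the first action of the best sequence is applied; the search is then repeated from the next measured state, which is what makes the scheme a receding-horizon (MPC) controller.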

Fig. 13 Comparison between estimates and targets of the AHN-based dynamics model $\hat{f}_\theta$: (top) changes in distance R, and (bottom) changes in angle θ3. Each panel plots the estimated change against the target change.

6 Conclusions

This chapter presented a force-balanced one-leg mechanism designed to be easy to control. The proposed design was modeled and simulated. In addition, we proposed the use of AHN for controlling and learning the balancing task in the mechanism. The experimental results indicate that the proposed mechanism is easily controlled by simple P-control

Fig. 14 Comparative output response for the balancing task using AHN-RL. Panels: distance (m) and angle (°) versus time (s); curves: reference, AHN-RL (disturbance), and AHN-RL (noise).

Table 5 Performance indices of the AHN-RL controller

| AHN-RL controller | ISE (R) | IAE (R) | ITSE (R) | ITAE (R) | ISE (θ3) | IAE (θ3) | ITSE (θ3) | ITAE (θ3) |
|---|---|---|---|---|---|---|---|---|
| Disturbance | 0.0008 | 0.0847 | 0.4255 | 0.0039 | 0.0052 | 0.0788 | 0.3944 | 0.0258 |
| Noisy conditions | 0.0008 | 0.0865 | 0.4304 | 0.0039 | 0.0053 | 0.1026 | 0.5133 | 0.0264 |

Fig. 15 Applied torques to the one-leg mechanism using AHN-RL. Panels: torque 1 (Nm) and torque 2 (Nm) versus time (s); curves: AHN-RL (disturbance) and AHN-RL (noise).

strategies. In addition, both the P-AOC control and the AHN-RL learning approaches performed well when the mechanism was subjected to disturbances and noisy conditions. A disturbance rejection analysis was also carried out to validate the performance of the controllers. In future work, we will investigate the controllability and learnability of more complex robotic tasks, such as walking, jumping and climbing stairs, using this mechanism in legged robots. We will also characterize the dynamics of the one-leg mechanism in detail.

Acknowledgments

This research has been funded by the Universidad Panamericana through the grant “Fomento a la Investigación UP 2017”, under project code UP-CI-2017-ING-MX-03, and it has been partially supported by the Google Research Awards for Latin America 2017.

References

[1] M.H. Raibert, Legged robots, Commun. ACM 29 (6) (1986) 499–514.
[2] P.-B. Wieber, R. Tedrake, S. Kuindersma, Modeling and control of legged robots, in: B. Siciliano, O. Khatib (Eds.), Springer Handbook of Robotics, second ed., Springer International Publishing, Switzerland, 2015 (Chapter 48).
[3] J. Cuadrado, U. Lugris, F. Michaud, F. Mouzo, Role of multibody dynamics based simulation in human, robotic and hybrid locomotion benchmarking, in: Workshop on Benchmarking Bipedal Locomotion, 2014 IEEE-RAS Int. Conference on Humanoid Robots, Poster, Madrid, Spain, 2014.
[4] M.A. Sherman, A. Seth, S.L. Delp, Simbody: multibody dynamics for biomedical research, Procedia IUTAM 2 (2011) 241–261.
[5] I. Díaz, J.J. Gil, E.E. Sánchez, Lower-limb robotic rehabilitation: literature review and challenges, J. Robot. 2011 (1) (2011) 1–11.
[6] D. Gong, J. Shao, Y. Li, G. Zuo, Study of human-like locomotion for humanoid robot based on human motion capture data, in: 2016 IEEE International Conference on Robotics and Biomimetics (ROBIO), 2016, pp. 933–938.
[7] D. Torricelli, J. Gonzalez, W. Maarten, R. Jimenez-Fabián, B. Vanderborght, M. Sartori, S. Dosen, D. Farina, D. Lefeber, J.L. Pons, Human-like compliant locomotion: state of the art of robotic implementations, Bioinspiration Biomim. 11 (5) (2016) 051002.
[8] R. Khusainov, I. Shimchik, I. Afanasyev, E. Magid, Toward a human-like locomotion: modelling dynamically stable locomotion of an anthropomorphic robot in Simulink environment, in: 2015 12th International Conference on Informatics in Control, Automation and Robotics (ICINCO), Colmar, France, 2015.
[9] J. Jackson, C. Hass, B. Fregly, Development of a subject-specific foot-ground contact model for walking, J. Biomech. Eng. 138 (9) (2016) 091002 (12 pages).
[10] N. Takayuki, Y. Araki, H. Omohi, A new gravity compensation mechanism for lower limb rehabilitation, in: 2009 International Conference on Mechatronics and Automation, 2009, pp. 943–948.
[11] M. Stojicevic, M. Stoimenov, Z. Jeli, A bipedal mechanical walker with balancing mechanism, Tech. Gazette 25 (1) (2018) 118–124.
[12] J.R. Rebula, P.D. Neuhaus, B.V. Bonnlander, M.J. Johnson, J.E. Pratt, A controller for the LittleDog quadruped walking on rough terrain, in: Proceedings of the IEEE International Conference on Robotics and Automation, 2007, pp. 1467–1473.
[13] E. Yoshida, O. Kanoun, C. Esteves, J.P. Laumond, Task-driven support polygon reshaping for humanoids, in: Proceedings of the 2006 6th IEEE-RAS International Conference on Humanoid Robots, HUMANOIDS, 2006, pp. 208–213.
[14] A. Del Prete, S. Tonneau, N. Mansard, Fast algorithms to test static equilibrium for legged robots, in: IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 2016, pp. 1601–1607.
[15] E. Najafi, G.A.D. Lopes, R. Babuska, Balancing a legged robot using state-dependent Riccati equation control, IFAC Proc. 47 (3) (2014) 2177–2182.
[16] F. Grasser, A. D’Arrigo, S. Colombi, A.C. Rufer, JOE: a mobile, inverted pendulum, IEEE Trans. Ind. Electron. 49 (1) (2002) 107–114.
[17] M. Kashki, J. Zoghzoghy, Y. Hurmuzlu, Adaptive control of inertially actuated bouncing robot, IEEE Trans. Robot. 33 (3) (2017) 509–522.
[18] P. Wensing, A. Wang, S. Seok, D. Otten, J. Lang, S. Kim, Proprioceptive actuator design in the MIT Cheetah: impact mitigation and high-bandwidth physical interaction for dynamic legged robots, IEEE/ASME Trans. Mechatron. 22 (5) (2017) 2196–2207.
[19] K. Komoda, H. Wagatsuma, Energy-efficacy comparisons and multibody dynamics analyses of legged robots with different closed-loop mechanisms, Multibody Syst. Dyn. 40 (2017) 123–153.
[20] V. van der Wijk, J. Herder, Synthesis of dynamically balanced mechanisms by using counter-rotary countermass balanced double pendula, ASME J. Mech. Des. 131 (11) (2009) 111003 (8 pages).
[21] J. García de Jalón, E. Bayo, Kinematic and Dynamic Simulation of Multibody Systems: The Real-Time Challenge, Springer-Verlag, New York, NY, 1994.
[22] S. Briot, V. Arakelian, J.-P. Le Baron, Shaking force minimization of high-speed robots via centre of mass acceleration control, Mech. Mach. Theory 57 (2012) 1–12.
[23] M. Acevedo, An efficient method to find the dynamic balancing conditions of mechanisms: planar systems, in: ASME International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, vol. 5B: 39th Mechanisms and Robotics Conference, 2015.
[24] H. Ponce, P. Ponce, Artificial organic networks, in: 2011 IEEE Conference on Electronics, Robotics and Automotive Mechanics, IEEE, Cuernavaca, Morelos, Mexico, 2011, pp. 29–34.
[25] H. Ponce, P. Ponce, A. Molina, Artificial organic networks: artificial intelligence based on carbon networks, in: Studies in Computational Intelligence, vol. 521, Springer, New York, NY, 2014.
[26] H. Ponce, P. Ponce, A. Molina, Adaptive noise filtering based on artificial hydrocarbon networks: an application to audio signals, Expert Syst. Appl. 41 (14) (2014) 6512–6523.
[27] H. Ponce, P. Ponce, A. Molina, The development of an artificial organic networks toolkit for LabVIEW, J. Comput. Chem. 36 (7) (2015) 478–492.
[28] H. Ponce, A novel artificial hydrocarbon networks based value function approximation in hierarchical reinforcement learning, in: 15th Mexican International Conference on Artificial Intelligence, Springer, 2016, pp. 211–225.
[29] H. Ponce, P. Ponce, A. Molina, A novel robust liquid level controller for coupled-tanks systems using artificial hydrocarbon networks, Expert Syst. Appl. 42 (22) (2015) 8858–8867.
[30] D. Qian, S. Tong, H. Lium, X. Liu, Load frequency control by neural-network-based integral sliding mode for nonlinear power systems with wind turbines, Neurocomputing 173 (2016) 875–885.
[31] A. Roose, S. Yahyam, H. Al-Rizzo, Fuzzy-logic control of an inverted pendulum on a cart, Comput. Electr. Eng. 61 (2017) 31–47.
[32] M. Stogiannos, A. Alexandridis, H. Sarimveis, Model predictive control for systems with fast dynamics using inverse neural models, ISA Trans. 72 (2018) 161–177.
[33] P. Ponce, H. Ponce, A. Molina, Doubly fed induction generator (DFIG) wind turbine controlled by artificial organic networks, Soft Comput. 22 (9) (2018) 2867–2879.
[34] H. Ponce, P. Ponce, A. Molina, Artificial hydrocarbon networks fuzzy inference system, Math. Probl. Eng. 2013 (2013) 1–13.
[35] A. Molina, H. Ponce, P. Ponce, G. Tello, M. Ramirez, Artificial hydrocarbon networks fuzzy inference systems for CNC machines position controller, Int. J. Adv. Manuf. Technol. 72 (9–12) (2014) 1465–1479.
[36] A. Nagabandi, G. Yang, T. Asmar, G. Kahn, S. Levine, R.S. Fearing, Neural network dynamics models for control of under-actuated legged millirobots (2017) arXiv:1711.05253.
[37] E.M. de Cote, E.O. Garcia, E.F. Morales, Transfer learning by prototype generation in continuous spaces, Adapt. Behav. 24 (6) (2016) 464–478.
[38] A. Rao, A survey of numerical method for optimal control, Adv. Astronaut. Sci. 135 (2009) 497–528.
[39] S.-K. Oh, W. Pedrycz, S.-B. Rho, T.-C. Ahn, Parameter estimation of fuzzy controller and its application to inverted pendulum, Eng. Appl. Artif. Intell. 17 (1) (2004) 37–60.