1. Introduction
The development of anthropomorphic robotic arms has received increasing attention in the robot industry in recent years [1]. By analyzing the human arm skeleton, joint movements and muscle forces in depth, research on anthropomorphic robotic arms aims to develop human-like mechanical structures, motion planning and control theory [2,3]. Practice has shown that anthropomorphic robotic arms can assist or even replace humans in auxiliary and heavy tasks across multiple fields, such as industry, biomedicine and social services [4,5]. In particular, anthropomorphic robotic arms with humanized control, such as the YuMi, Robonaut 2 and Justin robotic arms, can operate with greater flexibility and coordination.
Traditional control of anthropomorphic robotic arms suffers from various limitations. Direct use of control instructions to command robot actions is a complex and mechanical operation that easily causes operator fatigue and low efficiency [6], and it requires prior training of the operator. Although graphical-interface human–computer interaction improves the quality of interaction, it still cannot remove mouse-and-keyboard command input [7,8], and the efficiency gain is not obvious. Additionally, a manual controller restricts the operator's working space.
Teleoperation control based on body sensors is stable and flexible; it can better reproduce human movement and meets the need for contactless service under epidemic conditions [9,10,11,12]. Among the many available signal sources, the motion of the arm is an accurate and stable source for hand tracking and arm movement [13]. Currently, arm-based teleoperated devices can be divided into wearable and non-wearable types [14]. Wearable sensors, for example, Myo armbands and body gloves [15], take sensor signals (inertial sensors, bending sensors) or surface EMG signals from the hand as input [16,17]; they receive arm movement information quickly and accurately and are not affected by lighting, occlusion and other factors. However, wearing a sensor for a long time causes discomfort, and the operation is cumbersome because calibration is required before use [18]. Non-wearable sensors are mainly depth cameras, such as the Kinect, Leap Motion and other binocular cameras, which use arm images as input to identify the joints of the hand and thus calculate the posture of the human arm without any wearable device [19]. This is more flexible and convenient, but it is susceptible to occlusion and lighting conditions and requires computationally heavy, complex algorithms [20].
Because single-mode equipment is limited, arm tracking is incomplete [21], making complex control applications impossible. The Leap Motion sensor (horizontal field of view 140°, vertical field of view 120°, interaction depth 10–60 cm) can fully recognize the hand only within 60° of palm rotation. When the palm rotates by 90° or even 180°, the fingers occlude each other and the sensor cannot identify their correct movement [22,23]. Research results on vision-based robot control, especially visual servo-control algorithms [24], are numerous [25]. Andrzej Milecki calibrated a vision system that recognizes the positions of three markers mounted on the operator's hand and arm and used it in the control system of an electro-hydraulic servo-driven two-axis manipulator to track the human hand [26]. Y. Benezeth proposed a vision-based static-camera indoor human detection and tracking system, which improves the performance of the entire system and reduces the complexity of the task [27].
Some researchers combine different teleoperation device modes [28] to improve arm tracking accuracy and stability. Experimental results of multimodal fusion, for example, Myo armband and Leap Motion camera fusion [23,29], Kinect and EMG fusion [30], Leap Motion and data glove fusion [31,32] and vision and EEG fusion [33], have effectively verified the complementary advantages of multimodal information.
Conventional robots are mainly controlled with single-modal signals, so their control robustness is poor and vulnerable to the external environment. Dual-arm robot control based on multi-modal signal sensing, by contrast, not only greatly improves the flexibility and accuracy of control under unknown environmental conditions but also ensures that dual-arm robots work more reliably during cooperative operation. In this paper, we design a dual-arm tracking control system based on multimodal signal perception, algorithmically fuse the signals from the Kinect vision sensor, data glove and force sensor, use them to control the anthropomorphic robotic arm and dexterous hand in real time, and verify the effectiveness and stability of multimodal signal perception and control. The advantages are as follows:
Multimodal sensor fusion control overcomes the inherent defects of a single mode and is more flexible and convenient for the operator.
Multimodal sensor fusion control conforms to the bionic characteristics of humanoid motion, improving the coordination and naturalness of dual-arm robot control.
Multimodal sensor fusion teleoperation control is more efficient and proceeds much more smoothly.
In general, one human arm has a scapula, a clavicle, a humerus, a radius, an ulna, 8 carpal bones, 5 metacarpals and 14 phalanges. The shoulder joint corresponds to a spherical pair (rotation about the X, Y and Z axes) with three degrees of freedom; the elbow joint corresponds to a revolute pair (rotation about the X axis) with one degree of freedom; the wrist joint has two directions of rotation (about the X and Z axes) with two degrees of freedom; and the radius and ulna, which connect the wrist joint to the elbow joint, rotate across each other (rotation about the Y axis), providing one degree of freedom.
2. Methodology
2.1. Mechanical Structure
As new materials, new structures and advanced control methods develop rapidly, bionic robots are gradually moving beyond imitating the shape and gait of creatures toward biomimetic systems that combine electromechanical systems with biological characteristics. Bionic arm robots are mainly designed to imitate the human arm system and reproduce its characteristics and functions. Compared with traditional mechanical arms, bionic arm robots can perform their assigned tasks under various complex conditions with high efficiency, low energy consumption and high environmental adaptability.
Bionic arm robots are mainly divided into bionic single-arm robots and bionic dual-arm robots. Compared with anthropomorphic single-arm robots, bionic dual-arm robots can not only work with both arms together but are also highly fault tolerant. This greatly improves their operational performance and application range when they face complex tasks. Therefore, bionic dual-arm robots are urgently needed in the industrial, medical, educational and other fields [34].
The structural design of the anthropomorphic dual-arm robot is shown in Figure 1.
The anthropomorphic robotic arm is divided into a performing part and a functional part. The performing part has five degrees of freedom: two rotational degrees of freedom in the shoulder joint, two in the elbow joint and one in the wrist. It moves and rotates the functional part within a certain space. The functional part imitates human hand movements and is designed to grasp and release objects. In this way, through the cooperative operation of the two parts, the anthropomorphic robotic arm can grasp and release objects in a certain space. Therefore, anthropomorphic robotic arms are widely accepted in the industrial and medical fields [35]. Other detailed parameters are shown in Table 1.
2.2. Acquisition System Components
The bionic dual-arm robots use multiple sensors to collect signals, such as the Kinect and data gloves, and these signals are fused and processed for accurate control of the robots. The Kinect collects the posture data of the human arms, and the data gloves receive the signals of human hand movements. Their detailed functions and parameters are as follows.
The Kinect's core technology is skeleton tracking, which not only accurately identifies the relative positions of human skeletal points in space but also obtains the 3D coordinates of each joint of the human body through a depth-image processing algorithm. Kinect-based skeleton tracking converts the identified skeletal data into human joint angles, and intelligent algorithms map the collected joint data to each servo of the dual-arm robots; through the cooperation of the servos, precise control and fast response of the dual-arm robots are achieved.
The Kinect sensor captures the operator's arm joint position and angle information. Its effective detection range is 0.5–4.5 m, with a field of view of 70 degrees horizontally and 60 degrees vertically. It can track the position of up to 6 users and supports up to 25 skeletal nodes. As shown in Figure 2, the Kinect has three main components: an RGB camera on the left that acquires 1920 × 1080 color images at up to 30 frames per second, a depth sensor in the middle that detects the relative position of the operator and an infrared emitter on the right that actively projects modulated near-infrared light.
The data glove collects the operator's finger joint angles and palm rotation. Figure 3 shows the structure of the data glove. It has an embedded Bluetooth HC-08 4.0 BLE module, whose master mode automatically connects to the BLE slave. It contains five potentiometers that detect the bending angle of each of the five fingers and a gyroscopic acceleration sensor that detects the angular velocity, acceleration and tilt of the palm.
The control board is an Arduino UNO with an ATmega328 processor core. As shown in Figure 3, the board has 14 digital inputs/outputs (6 of which can be used as PWM outputs), 6 analog inputs, a 16 MHz crystal oscillator, a USB port, a power socket, an ICSP header and a reset button.
The operator needs to initialize the glove before use: extend the arm and clench the fist, point the fist vertically down and turn on the power. The middle LED flashes rapidly several times; the user then extends the hand as far as possible. The LED flashes again, indicating that initialization is complete.
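As a concrete illustration, the glove's readings could be parsed on the PC side as follows. This is a minimal sketch under assumptions: the comma-separated packet layout (five finger bend angles followed by palm roll, pitch and yaw) is hypothetical, not the glove's documented protocol.

```python
# Hypothetical parser for one data-glove packet received over the
# HC-08 Bluetooth serial link. The field order (five finger bend
# angles, then palm roll/pitch/yaw in degrees) is an assumption
# for illustration, not the glove's documented protocol.
from dataclasses import dataclass

@dataclass
class GloveFrame:
    fingers: tuple   # bend angle of thumb..little finger, degrees
    roll: float      # palm rotation angles from the IMU, degrees
    pitch: float
    yaw: float

def parse_packet(line: str) -> GloveFrame:
    values = [float(v) for v in line.strip().split(",")]
    if len(values) != 8:
        raise ValueError(f"expected 8 fields, got {len(values)}")
    return GloveFrame(fingers=tuple(values[:5]),
                      roll=values[5], pitch=values[6], yaw=values[7])

frame = parse_packet("12.0,45.5,50.1,48.9,30.2,5.0,-2.5,90.0")
```

A fixed field count with an explicit error on malformed packets keeps a dropped Bluetooth byte from silently corrupting the servo commands downstream.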
2.3. Data Preprocessing
Kalman filtering is a mathematical technique used for estimating the state of a system, given noisy and uncertain measurements. It is often used in robotics and control systems to improve the estimation and prediction of system states. The algorithm provides a way to optimally combine prior knowledge, current measurements and a model of the system’s dynamics to estimate the current state. In the context of a robotic arm following an experimenter’s arm motion, the Kalman filter can be used to estimate the position, velocity and acceleration of the experimenter’s arm using sensor measurements, such as data from cameras, inertial measurement units (IMUs) or other tracking devices. The measurements may be noisy, and there could be latency or delays in the processing pipeline. The Kalman filter helps in dealing with these uncertainties, providing a more accurate and timely estimate of the experimenter’s arm motion. By improving the estimation of the experimenter’s arm motion, the Kalman filter can help the robotic arm to react more quickly and accurately, reducing the delay time difference between them.
Based on the discrete-time linear stochastic difference equation, the state space model of the arm is

$$x_k = A x_{k-1} + B u_{k-1} + w_{k-1}, \qquad z_k = H x_k + v_k$$

where $x_{k-1}$ is the arm state at the previous moment, $x_k$ is the estimated value of the arm state at the current moment and $z_k$ is the measured value of the arm state at the current moment. $A$, $B$ and $H$ are the state transition model, the control input model and the observation model, respectively. $w_{k-1}$ and $v_k$ are the process noise at the previous moment and the measurement noise, respectively, assumed to be independent and normally distributed:

$$p(w) \sim N(0, Q), \qquad p(v) \sim N(0, R)$$
where $Q$ is the covariance of the process noise and $R$ is the covariance of the measurement noise. For hand motion analysis, the control input $u_{k-1}$ produced by the muscle system is considered unknown, so the motion model of the simplified arm is

$$x_k = A x_{k-1} + w_{k-1}, \qquad z_k = H x_k + v_k$$

The state of the arm position, including the shoulder, elbow and wrist, is

$$x_k = \begin{bmatrix} p_s & p_e & p_w \end{bmatrix}^T$$

where $p_j = (x_j, y_j, z_j)$ is the position of joint $j$. For this position-only state, the state transition model and observation model of the simplified arm are, respectively, $A = I$ and $H = I$.
The Kalman filter eliminates the regular error due to jitter and abrupt changes to estimate the arm (shoulder, elbow and wrist) position. The Kalman filter, a recursive predictive filtering algorithm, estimates the state of the process while minimizing the mean square estimation error. The prior state estimate and the covariance matrix of the prior error in each cycle are

$$\hat{x}_k^- = A \hat{x}_{k-1}, \qquad P_k^- = A P_{k-1} A^T + Q$$

where $\hat{x}_k^-$ is the prior state estimate, obtained from the posterior estimate of the previous moment, and $P_k^-$ is the prior mean squared error (the covariance matrix of the prior estimate). The Kalman gain is updated in each loop to correct the posterior state estimate:

$$K_k = P_k^- H^T (H P_k^- H^T + R)^{-1}, \qquad \hat{x}_k = \hat{x}_k^- + K_k (z_k - H \hat{x}_k^-)$$

where $K_k$ is the Kalman gain and $\hat{x}_k$ is the posterior state estimate. Finally, the covariance matrix of the errors is updated:

$$P_k = (I - K_k H) P_k^-$$

where $P_k$ is the posterior mean squared error and $Q$ is the system noise covariance matrix.
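The prediction/update cycle can be sketched per joint coordinate. This is a minimal illustration assuming a scalar constant-position model (A = H = 1), so each shoulder, elbow or wrist coordinate is filtered independently; the noise covariances are illustrative values, not tuned ones.

```python
# Minimal scalar Kalman filter for one joint coordinate, with
# A = H = 1 (constant-position model). The noise covariances q
# and r are illustrative, not values tuned for the real system.
class ScalarKalman:
    def __init__(self, x0, p0=1.0, q=1e-3, r=1e-2):
        self.x = x0   # posterior state estimate
        self.p = p0   # posterior error covariance
        self.q = q    # process noise covariance Q
        self.r = r    # measurement noise covariance R

    def step(self, z):
        # Prediction: prior estimate and prior error covariance
        x_prior = self.x
        p_prior = self.p + self.q
        # Update: Kalman gain, posterior estimate, posterior covariance
        k = p_prior / (p_prior + self.r)
        self.x = x_prior + k * (z - x_prior)
        self.p = (1.0 - k) * p_prior
        return self.x

kf = ScalarKalman(x0=0.0)
# 0.50 below simulates a jitter spike in an otherwise steady signal.
smoothed = [kf.step(z) for z in [0.1, 0.12, 0.50, 0.11, 0.13]]
```

Running this shows the spike being pulled back toward the steady trajectory, which is exactly the jitter suppression the filter provides before the angles are mapped to servo commands.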
2.4. Implementation
The proposed sensor fusion teleoperation control system is shown in Figure 4; it consists of three parts: the signal acquisition terminal, the control terminal and the communication protocol.
The acquisition terminal consists of the operator, Kinect devices and data gloves. The Kinect camera collects the posture data of the operator's two arms, and the data gloves collect the posture data of the operator's two hands.
The signal control terminal is a miniature control processor that communicates wirelessly through Bluetooth and TCP/IP. The control terminal is mainly composed of two parts: the anthropomorphic robotic arm and the bionic dexterous hand. The anthropomorphic robotic arm serves primarily as a mobile carrier. Each arm is driven by 5 servos: two 60 kg torque servos in the shoulder joint, two 20 kg torque servos in the elbow joint and one 20 kg torque servo in the wrist joint. The five servos cooperate with each other to achieve humanoid arm movement. Therefore, the anthropomorphic robotic arm is primarily used to carry the dexterous hand to a designated location in space.
The dexterous hand consists mainly of the hand mechanical structure, 6 driving servos, 5 fingertip force feedback sensors and soft and hard connection interfaces. Figure 5 shows the appearance and installation of the dexterous hand. The force feedback sensors collect pressure signals at the fingertips, and our intelligent algorithm further processes these signals so that the hand perceives pressure changes in the external environment more accurately. Grip strength is therefore controlled through force feedback, which improves the human–computer interaction performance of the dexterous hand. At the same time, the dexterous hand can not only grasp adaptively according to the shape of an object but also imitate basic human hand functions, such as opening doors and carrying boxes.
In the teleoperation communication protocol, the acquisition terminal collects the operator's arm and hand posture information. After MATLAB filtering and data fusion, the control commands are transmitted over Wi-Fi to the micro control processor of the dual-arm robots, where each servo is driven for effective and precise control. Similarly, the microcontroller can feed the servo positions back to MATLAB. In this way, the actual trajectory data of the dual-arm robots are calibrated against the theoretical control data, and the intelligent algorithm ensures the stability and reliability of the control. The communication connection between the MATLAB PC and the micro control processor is established in the same Wi-Fi environment, and the data are transmitted through the TCP/IP protocol.
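The PC-to-microcontroller link can be sketched as follows. This is a hedged illustration: the `id:angle` message format, host and port are assumptions for the example, since the paper does not specify the wire format of its TCP/IP protocol.

```python
# Sketch of the PC-side TCP link to the robot's micro control
# processor. The newline-terminated "id:angle" message format is a
# hypothetical encoding for illustration, not the paper's protocol.
import socket

def encode_command(servo_angles):
    """Pack servo angles (degrees) into one newline-terminated message."""
    return (";".join(f"{i}:{a:.1f}" for i, a in enumerate(servo_angles))
            + "\n").encode("ascii")

def send_pose(host, port, servo_angles):
    """Open a TCP connection in the shared Wi-Fi network and send one pose."""
    with socket.create_connection((host, port), timeout=1.0) as sock:
        sock.sendall(encode_command(servo_angles))

# Encoding one 5-servo arm pose (no live robot needed to inspect it):
msg = encode_command([90.0, 45.5, 120.0, 60.0, 30.0])
```

A text protocol with one pose per line keeps the microcontroller parser trivial; `send_pose` itself would only be called against a listening controller on the same Wi-Fi network.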
3. Kinematic Modeling
Although bionic single-arm robots are widely used in the industrial, medical and service fields, they face many limitations in practical operation and functional realization, which greatly restrict collaborative operations such as equipment assembly or heavy lifting. In contrast, bionic dual-arm robots can accomplish complex tasks that single-arm robots cannot, through the collaborative cooperation of both arms. Therefore, bionic dual-arm robots have become a research focus in the field of intelligent robotics.
3.1. Denavit–Hartenberg (DH) Modeling
Firstly, the links of the arm are numbered: the fixed base is defined as link 0, the first movable link as link 1, and so on, up to the end link of the arm, link i. The modified DH method establishes transformation coordinate system i on link i and is a common rule for defining the link coordinate systems. The coordinate systems of the 10-degree-of-freedom dual-arm system are established according to the modified DH parameters in Table 2, as shown in Figure 6.
The modified DH method is an XZ transformation process. The joint angle $\theta_i$ represents the rotation angle about the $z_i$ axis; the link offset $d_i$ represents the distance along the $z_i$ axis; the link twist angle $\alpha_{i-1}$ represents the rotation angle about the $x_{i-1}$ axis; and the link length $a_{i-1}$ represents the distance along the $x_{i-1}$ axis. The modified DH kinematic model is formulated as follows:

$${}^{i-1}_{i}T = \begin{bmatrix} \cos\theta_i & -\sin\theta_i & 0 & a_{i-1} \\ \sin\theta_i \cos\alpha_{i-1} & \cos\theta_i \cos\alpha_{i-1} & -\sin\alpha_{i-1} & -d_i \sin\alpha_{i-1} \\ \sin\theta_i \sin\alpha_{i-1} & \cos\theta_i \sin\alpha_{i-1} & \cos\alpha_{i-1} & d_i \cos\alpha_{i-1} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
In this formula, $i$ is the link index: $i = 0$ indicates the base and $i = 1$ the first link. ${}^{i-1}_{i}T$ is the homogeneous transformation matrix of link $i$ relative to link $i-1$. The overall equation is obtained by superposition as follows:

$${}^{0}_{5}T = {}^{0}_{1}T\,{}^{1}_{2}T\,{}^{2}_{3}T\,{}^{3}_{4}T\,{}^{4}_{5}T$$
3.2. MATLAB-Based Modeling
The mathematical model of the five-degree-of-freedom humanoid robot arm is obtained using the MATLAB toolbox. As shown in Figure 7, the maximum working range of the anthropomorphic dual-arm robot in three-dimensional space was obtained. Firstly, the kinematic model of the anthropomorphic dual-arm robot was established in the MATLAB toolbox from the modified DH parameters; then a point cloud map of the maximum range reachable in the three-dimensional workspace was generated using the Monte Carlo method. The Monte Carlo method, also known as the statistical simulation method, is a numerical computation method guided by probabilistic and statistical theory. Its advantage for computing the workspace is the short computation time compared with numerical methods.
The steps of Monte Carlo method workspace solving are as follows:
(1) Random variables are generated for each joint, producing a set of random joint space vectors for the robotic arm.
(2) The forward kinematic solution is calculated, mapping the joint space to the end workspace (Cartesian coordinate system).
(3) The workspace distribution is plotted.
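The steps above can be sketched in a few lines. This is a simplified illustration: a 3-DOF arm (2-DOF shoulder plus 1-DOF elbow) with assumed link lengths stands in for the full 5-DOF model, and step (3) is reduced to checking the extent of the point cloud rather than plotting it.

```python
# Monte Carlo workspace sampling following steps (1)-(3): draw random
# joint vectors, map them through forward kinematics, collect reachable
# end points. Link lengths and joint limits are illustrative assumptions.
import random
from math import cos, sin, pi

L1, L2 = 0.30, 0.25   # upper-arm and forearm lengths, metres (assumed)

def forward(q1, q2, q3):
    """End position of a shoulder(yaw q1, pitch q2) + elbow(pitch q3) arm."""
    ex = L1 * cos(q2) * cos(q1)            # elbow position
    ey = L1 * cos(q2) * sin(q1)
    ez = -L1 * sin(q2)
    wx = ex + L2 * cos(q2 + q3) * cos(q1)  # wrist: forearm pitched by q2+q3
    wy = ey + L2 * cos(q2 + q3) * sin(q1)
    wz = ez - L2 * sin(q2 + q3)
    return (wx, wy, wz)

random.seed(0)
cloud = []
for _ in range(5000):                      # step (1): random joint vectors
    q = [random.uniform(-pi / 2, pi / 2) for _ in range(3)]
    cloud.append(forward(*q))              # step (2): forward kinematics
# Step (3) would scatter-plot `cloud`; here we only check its extent.
max_reach = max((x * x + y * y + z * z) ** 0.5 for x, y, z in cloud)
```

The farthest sampled point approaches but never exceeds the fully extended length L1 + L2, which is the sanity check usually applied to such point clouds.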
3.3. Control Model
The recognized human joint angles are converted into servo control commands by the algorithm for dual-arm robot motion control. The shoulder of the dual-arm robot is controlled by two servos, driving its two rotational degrees of freedom. Similarly, the elbow joint has two rotational degrees of freedom, and the wrist joint has one rotational degree of freedom about its axis. The Kinect therefore acquires the angle information of the human arm movement in 3D space, and the intelligent algorithm converts this angle information into servo commands. The principle is as follows:
(1) Shoulder joint: Degree of freedom 1 of the robot arm is controlled in real time by the angle between the projection of the upper arm onto the body's side plane and the front plane, and degree of freedom 2 is controlled in real time by the angle between the projection of the upper arm onto the front plane and the side plane. These two angles are mapped to the two servos of the robot's shoulder joint, determining the rotation angles the servos must reach; the angle-to-command conversion algorithm then produces the precise servo commands, completing the accurate control of the shoulder joint.
Equation (21) gives the projection of a line on a plane and Equation (22) the angle between a line and a plane, so that the spatial angles of the two shoulder degrees of freedom can be accurately calculated to control the dual-arm robot:

$$\vec{v}_p = \vec{v} - \frac{\vec{v}\cdot\vec{n}}{\|\vec{n}\|^2}\,\vec{n} \quad (21)$$

$$\theta = \arcsin\frac{|\vec{v}\cdot\vec{n}|}{\|\vec{v}\|\,\|\vec{n}\|} \quad (22)$$

where $\vec{v}$ is the direction vector of the line and $\vec{n}$ is the normal vector of the plane.
(2) Elbow joint: As shown in Figure 8, degree of freedom 3 of the elbow joint is controlled in real time by the dynamic change of the angle between the plane formed by the shoulder, elbow and wrist points and the ZOX plane; degree of freedom 4 is controlled in real time by the dynamic change of the angle between the elbow–wrist and elbow–shoulder vectors.
(3) Wrist joint: Degree of freedom 5 of the wrist joint is precisely controlled by the inertial sensor of the glove.
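The angle computations in steps (1) and (2) reduce to plain vector algebra on the Kinect's 3D joint coordinates. The sketch below shows the elbow flexion angle (angle between the elbow-to-shoulder and elbow-to-wrist vectors) and the angle of a vector to a plane via the arcsine of its normalized dot product with the plane normal; the sample skeleton coordinates are illustrative.

```python
# Vector-algebra sketch of the joint-angle mapping: elbow flexion is the
# angle between the elbow->shoulder and elbow->wrist vectors; the angle
# of a vector to a plane uses the plane's unit normal. Sample skeleton
# coordinates below are illustrative, not real Kinect output.
from math import acos, asin, degrees, sqrt

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return sqrt(dot(u, u))

def angle_between(u, v):
    """Angle between two 3D vectors, in degrees."""
    return degrees(acos(dot(u, v) / (norm(u) * norm(v))))

def angle_to_plane(v, n):
    """Angle between vector v and the plane with unit normal n, degrees."""
    return degrees(asin(abs(dot(v, n)) / norm(v)))

shoulder, elbow, wrist = (0, 0, 0), (0.3, 0, 0), (0.3, -0.25, 0)
elbow_flex = angle_between(sub(shoulder, elbow), sub(wrist, elbow))
upper_arm_pitch = angle_to_plane(sub(elbow, shoulder), (0, 1, 0))
```

With these sample points the forearm hangs perpendicular to the horizontal upper arm, so the computed flexion angle is 90 degrees; in the live system the same two functions would run on every skeleton frame before the angle-to-command conversion.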
In this way, image acquisition technology, finger motion sensing technology and the intelligent control algorithm are combined to accomplish real-time control of the bionic dual-arm robots and achieve complex collaborative operations.
An absolute coordinate system is established at the Kinect and a relative coordinate system at the operator's shoulder. The absolute coordinate system is defined as a spatial coordinate system with the Kinect's depth camera as the origin. The human skeletal points recognized by the Kinect are shown in Figure 9.
5. Conclusions
In this paper, humanoid dual-arm robots are designed and controlled from the perspective of bionics, with a Kinect depth camera and data gloves collecting the operator's dual-arm posture data. This overcomes the shortcomings of single-modal robot control. The data-glove signals are fused with the vision data, the fused data are analyzed and processed, and an intelligent control algorithm is added to achieve more stable and accurate tracking control of the anthropomorphic dual-arm robot. This not only implements remote operation control of the humanoid robot but also effectively strengthens the naturalness and flexibility of the control. Therefore, anthropomorphic dual-arm robots are capable of collaborative operation and environmental adaptation under complex conditions and will be widely used in the industrial, medical and educational fields because of their high operational efficiency, low energy consumption and high flexibility.
Although we achieved promising control performance with sensor fusion, the time delay, which is of vital importance for teleoperation control, is not yet fully addressed in this study. In future work, we will fuse multiple vision sensors and myoelectric sensors and apply multi-sensor fusion and intelligent control algorithms to further improve control accuracy and coordination. At the same time, the effect of time delay on control performance will be addressed to improve transparency.