1. Introduction
The ability of humans to grasp and manipulate objects with their hands and arms is one of their greatest strengths in navigating their environment. Transferring this extraordinary ability to robotics is a complicated and worthwhile challenge, as it could lend robots the adaptability and generalizability of humans and human hands. In recent years, soft materials have played an essential role in advances in robotics [1,2]. Taking advantage of softness greatly simplifies the grasping process, as it allows more room for error in the placement of the fingers and offers other modes of grasping, such as suction or adhesion gripping [3,4]. The compliance of these mechanisms also increases the system's safety when compared to rigid bodies [5]. Soft materials allow robots to approximate the softness of human hands, leading to robust and adaptable robotic systems.
However, using soft components as an alternative to rigid bodies presents challenges in both control and sensing. The main difficulties in developing soft robotic systems are implementing sensors without interfering with the material's softness and controlling the inherent nonlinearities of compliant bodies [6]. Due to the flexible nature of compliant structures, these mechanisms are often designed to be underactuated while retaining a high number of degrees of freedom [7]. Underactuated robotic hands effectively decrease the complexity of traditional robotic hands, reducing costs and simplifying control architectures [8,9]. Soft materials have shown promise for solving many problems associated with robotic gripping. Using flexible components to grasp objects can produce grasps similar to how humans manipulate objects.
Strategies have been developed for the force control of rigid robots [10], and many techniques have been investigated for the implementation of soft grippers, ranging from sensing and control to the design of compliant mechanisms, with much of this work focusing on motion and shape control [11,12]. Regulating the force applied to the target object is crucial to ensure safe and efficient manipulation during the gripping process. Force sensing is a fundamental and difficult challenge in the force control of a robotic gripper because sensors can add unwanted stiffness and change the friction of the gripper. Thus, the main objective of this study was to investigate visual force sensing to control the force applied by a soft robotic gripper. Many studies have examined different methods of soft grasping, such as granular jamming [13,14], surface adhesion [15], and the use of deformable structures that bend around the applied force [16]. The final method was the focus of this study. Fin ray effect grippers, such as the one used in this study, are popular for adaptive gripping solutions. These biomimetic, passively compliant mechanisms are designed to mimic the function of fish fins and can perform soft gripping without additional actuators [17]. Constructed entirely from soft materials, with connecting bars placed between two beams that join at the tip, fin ray grippers deform around externally applied forces, allowing the gripper to adapt to the object in contact with the finger.
Visual force sensing is the focus of significant research in soft robotics. Many of these solutions use complex models to approximate the shape and estimate the torques on the joints, while others relate the images to data gathered offline [18,19,20,21,22]. Sensing methods for fin ray effect grippers fall into two main streams: embedding or attaching sensors in the body or on the surface of the finger to measure the gripping force directly, or using the shape of the gripper together with FEA or mathematical modeling to estimate the force visually. Estimating the shape of the gripper is often done visually, with some work attaching a flex sensor to measure deflections [19]. When using the shape of the finger to estimate the force, the most common approach is to formulate a model-based method and use analytical techniques to estimate the force. Some research approximates the finger as a chained-beam constraint model (CBCM) [20]. CBCMs are used to model and discretize large deflections in compliant mechanisms [23]. Modeling soft grippers in this way is effective at low numbers of chained beams but loses accuracy as the chain grows. Other work, such as [19], uses an array of deflection sensors to measure the deformation of the gripper finger and map the values to an estimated force. Others use pseudo-rigid body models to approximate the finger as a group of connected four-bar mechanisms and estimate the stiffness of the joints using finite element modeling, combining the two solutions mentioned above [24].
Despite extensive research on soft robotics, only a few studies have implemented closed-loop force control of a compliant gripping mechanism, with many applications using an on–off approach to grasping, which fails for fragile objects [25]. Existing solutions to the closed-loop control problem include robust control of continuum and soft robots to handle uncertainties in modeled dynamics [26]. Because control is difficult for many types of soft robots, a standard solution is to operate the robot in open-loop mode, using finite element methods to pre-compute the desired position [27]. While soft robotics is a quickly emerging field, most open-loop and closed-loop control methods are designed for and applied to position and shape control problems [28]. When force control is implemented, it is frequently done indirectly by controlling a separate quantity, such as pressure, commonly using strategies such as PID or model predictive control [27,29]. As many robot manipulation processes require efficient and reliable force control, filling this research gap is vital.
In summary, this study aims to investigate a method for noninvasive force sensing using visual methods and to close the loop in the force control problem of a flexible robotic gripper. Accurate force feedback from visual sensing of soft gripper fingers can be used in many manipulation applications, from handling food products to standard pick-and-place operations. Integrating visual force sensing and adaptive control into a soft robotic system will contribute to the knowledge of force control for soft robotics. It will aid in developing versatile and efficient control strategies for soft grippers, capable of performing reliable grasps on a wide range of target objects.
2. Materials and Methods
This research investigated methods for estimating and controlling the force applied by a flexible robotic gripper, specifically a Festo fin ray effect gripper. The proposed solution uses finite element analysis (FEA) to generate an extensive dataset for training an autoencoder for feature extraction. The autoencoder compresses images into a latent set of features that describe the shape of the gripper fingers. These features are then used to construct a regression model that estimates the normal force applied by the gripper based on the deformed shape. Using feedback generated by this visual force sensor, an adaptive control algorithm, model reference adaptive control (MRAC), was designed to control the uncertain and nonlinear dynamics of the gripping process. The combination of visual force feedback with an MRAC control scheme is referred to in this work as virtual reference adaptive control (VRAC). One goal of this work was to design the VRAC to be generally applicable to a wide range of objects and operating conditions, which also makes it transferable to other soft grippers, supporting general-purpose control strategies for the force control of compliant mechanisms. To define the scope of this work, all objects were rigid; the methods presented here were not tested on objects that the gripper could deform significantly. The proposed methods would likely remain effective on such objects with some loss of accuracy, although the controller would need to be re-tuned to better match the dynamics of a flexible structure.
2.1. Adaptive Gripping
Adaptive gripping using compliant or soft materials has become valuable in many robotics applications. Soft materials that deform around objects allow for a more generalized mechanism design, moving away from the application-specific design of rigid robots [30]. In this work, the sensor and controller were designed for a Festo fin ray effect adaptive gripper. As the name suggests, these grippers are inspired by the fin ray effect and intended to mimic fish fins [31]. The finger mechanism bends around the force applied to the gripper, causing the fin to adapt to the shape of the object [32]. The fin ray finger used in this work can be seen in Figure 1 [33], and the labeled components are described in Table 1.
Adaptive grippers can be actuated in multiple ways. Tendon-driven actuation drives the finger's deformation by shortening a tendon wire attached to the finger; this gives finer control over the deformation but increases mechanical complexity [34]. Other actuation methods involve using suction cups to attach to the object's surface or granular jamming, which uses beads to conform around an object before vacuuming out the air to lock the grains into place. Gripping by actuation is another common adaptive gripping strategy, in which a compliant structure is passively deformed by the object, so that the gripper partially matches the object's shape. Gripping by actuation is the strategy used to test the methods proposed here. The gripper chosen for this work is driven by a parallel jaw servo-electric gripper attached to the finger's base. Mounting the finger in this manner allows it to be deformed almost entirely by the object's shape.
2.2. Force Sensing
Sensing for soft robots is a common area of interest among researchers, with many teams focusing on using computer vision instead of physical sensors as a means of measurement. Using cameras to observe the mechanisms avoids issues that arise from the implementation of sensors, which often have undesirable physical properties that interfere with the operation of soft robots. Flexible gripping mechanisms often have complex structures that make them difficult to model analytically. Studies have been conducted to solve the sensing problem using FEA and U-Net-structured neural networks [18]. The proposed workflow for force sensing is shown in Figure 2.
The above workflow can be split into two main sub-processes: image normalization and force estimation. Image normalization converts the image of the finger taken at an angle into a format consistent across all viewing angles and extracts the finger’s shape from the image. As the name suggests, force estimation senses the force that the gripper finger applies. This is done using feature extraction algorithms combined with regression models to allow the system to sense the applied force based on training data.
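As a concrete illustration of the image normalization step, the following Python sketch uses OpenCV to warp a raw camera frame to a canonical view and extract a binary finger mask. The 512 × 512 output size is inferred from the 262,144-feature images described later; the source points, Otsu thresholding, and function names are illustrative assumptions, not the paper's exact implementation.

```python
import cv2
import numpy as np

# src_pts: four calibrated corners of the finger plane in the raw camera
# image (hypothetical values); dst_pts: the normalized fronto-parallel view.
src_pts = np.float32([[112, 80], [431, 95], [452, 470], [98, 455]])
dst_pts = np.float32([[0, 0], [511, 0], [511, 511], [0, 511]])

def normalize_image(frame: np.ndarray) -> np.ndarray:
    """Warp the raw frame to a canonical view and extract a binary finger mask."""
    H = cv2.getPerspectiveTransform(src_pts, dst_pts)
    warped = cv2.warpPerspective(frame, H, (512, 512))
    gray = cv2.cvtColor(warped, cv2.COLOR_BGR2GRAY)
    # Threshold so the finger appears as a solid white shape on black,
    # matching the binary format of the FEA training images.
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask
```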
The full gripper assembly, which included the Weiss WSG 110 (Weiss, Lauffen am Neckar, Germany) parallel jaw servo-electric gripper, Festo fin ray effect adaptive gripper fingers, Luxonis OAK-D-S2 cameras (Luxonis, Boulder, CO, USA), and the camera bracket assembly, can be seen in Figure 3. The labels shown in the image are described in Table 2.
2.2.1. Data Collection
A significant issue in training a visual force sensor is the high dimensionality of the images captured of the gripper finger. An autoencoder was used to solve this problem. Autoencoders are neural networks trained to recreate their input at the output, with a bottleneck between an encoder and a decoder; this creates a low-dimensional encoding of the original input in the middle of the network [35]. Here, the autoencoder compressed each input image into a latent vector of 1024 features, compared to the 262,144 features of the original image.
The autoencoder was trained on images of the gripper to recreate the input at the network's output. This process has several benefits besides compression, especially concerning noise reduction. The network was trained on synthetic data generated by FEA to eliminate the undesirable effects of real-world image processing. Because the training targets were clean synthetic images, the network learned to correct abnormal data, such as occlusions and noise, during real-time operation.
Ansys Mechanical generated the synthetic data to train the autoencoder. The fin and object were modeled and placed in the Ansys simulation environment, with a displacement load placed at the flat edge of the target object. The object was displaced 25 mm into the fin, deforming the fin ray finger around it. This process was repeated at five points along the fin's length. Examples of the FEA simulation at undeformed and fully deformed positions for a 50 mm and a 10 mm round object are shown in Figure 4 and Figure 5, respectively.
The images above show a significant difference in the shape of the gripper finger when deformed by the larger and smaller objects; the force applied by the gripper therefore depends on its deformed shape. Other work has assumed that the gripper will consistently deform in the same manner, which is not the case in reality [1]. The models proposed in this work account for the effect of different deformation modes on the gripper's stiffness.
2.2.2. Model Training
An autoencoder is an unsupervised learning method that trains encoder and decoder models to recreate the encoder's input at the output of the decoder. The reconstruction at the decoder's end is compared with the desired output, often the same image as the input, and is used to calculate the loss and train the network. Binary cross-entropy was used as the loss function to train the network and is shown in Equation (1) [36]:

$$\mathcal{L}_{\mathrm{BCE}} = -\frac{1}{N}\sum_{i=1}^{N}\left[\,y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\,\right] \qquad (1)$$

where $y$ is the ground truth, in this case the pixel value of the input image, and $\hat{y}$ is the predicted pixel value of the reconstructed image. The network structure is further broken down into two blocks: down-convolution and up-convolution. The down-convolution block contains four layers: a convolution layer, a batch normalization layer, a rectified linear unit activation function, and a max pooling layer. The up-convolution block is similar, with the convolution replaced by a transpose convolution layer and the max pooling replaced by upsampling. The structure of these blocks is illustrated in Figure 6.
The encoder network was built from four down-convolution blocks in series, outputting a compressed feature matrix of dimensions 256 × 2 × 2. This matrix was flattened into a vector of 1024 features before being sent to the regression model. The compressed features were also passed into the decoder network, built from four consecutive up-convolution blocks followed by a sigmoid layer. The entire autoencoder network is shown in Figure 6.
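A minimal PyTorch sketch of the described architecture is given below. The block composition (convolution, batch normalization, ReLU, max pooling, and their transpose counterparts) follows the text; the channel counts, the stride-2 convolutions needed to reach the 256 × 2 × 2 bottleneck from an assumed 512 × 512 input, and the layout of the final decoder stage are assumptions.

```python
import torch
import torch.nn as nn

def down_block(c_in, c_out):
    # One down-convolution block: convolution -> batch norm -> ReLU -> max pool.
    # A stride-2 convolution plus pooling (assumed) reduces the spatial size
    # by 4x per block, taking 512 x 512 down to 2 x 2 over four blocks.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

def up_block(c_in, c_out):
    # One up-convolution block: transpose convolution -> batch norm -> ReLU -> upsample.
    return nn.Sequential(
        nn.ConvTranspose2d(c_in, c_out, kernel_size=3, stride=2,
                           padding=1, output_padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
        nn.Upsample(scale_factor=2),
    )

class FinRayAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            down_block(1, 32), down_block(32, 64),
            down_block(64, 128), down_block(128, 256),  # -> 256 x 2 x 2
        )
        # The last decoder stage omits batch norm and ReLU (a sketch design
        # choice) so the sigmoid can output the full (0, 1) pixel range.
        self.decoder = nn.Sequential(
            up_block(256, 128), up_block(128, 64), up_block(64, 32),
            nn.ConvTranspose2d(32, 1, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.Upsample(scale_factor=2),
            nn.Sigmoid(),  # pixel probabilities for binary cross-entropy
        )

    def forward(self, x):
        z = self.encoder(x)              # 256 x 2 x 2 feature matrix
        latent = z.flatten(start_dim=1)  # 1024-feature vector for regression
        return self.decoder(z), latent
```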
A random forest regression algorithm was trained on pairs of encoded images and force measurements collected from a load cell, estimating the force applied by the gripper from the compressed and flattened feature vectors. The size of the forest and other hyper-parameters were chosen through testing.
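The regression stage can be sketched with scikit-learn as follows. The placeholder data, split ratio, and `n_estimators` value are illustrative, since the paper only states that hyper-parameters were chosen through testing.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the real dataset: 1024-feature encoded
# images paired with load cell force readings (in Newtons).
X = np.random.rand(1000, 1024)
y = np.random.rand(1000) * 30.0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# n_estimators is a placeholder; the paper chose hyper-parameters by testing.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print("test MSE [N^2]:", mean_squared_error(y_test, forest.predict(X_test)))
```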
The gripping force considered here acts normal to the finger surface at the contact between the fin and the object. The ejection force created by the gripper pushing the object along the finger was not considered but could be added in the future using a three-axis load cell and multi-output regression models.
2.3. Gripper Control
To solve the force control problem, a cascaded three-stage controller was proposed. The controller consists of a force control stage that regulates the force and generates a position setpoint, which is fed to the position controller, which in turn creates the setpoint for the velocity controller. A high-level block diagram of the proposed controller is shown in Figure 7.
As can be seen in the block diagram, visual force sensing provides feedback to the force controller. This sensor is developed from the synthetic data generated by FEA, creating the virtual component of the VRAC control scheme.
Force Control
During initial testing of the normal force applied by the gripper, it was found that the gripper acted as a nonlinear spring with an approximately exponential stiffness within the gripper operating region. As the gripper deforms significantly, the layers of the fin ray begin to mesh together, creating an effect called layer jamming, which causes the stiffness of the mechanism to increase substantially. This increased stiffness results in nonlinear dynamics, particularly in the system's gain, and requires more advanced control schemes to regulate the force. The proposed controller is illustrated in the block diagram in Figure 8 [37].
As the object's shape and location along the finger are not known before the gripping process, many system parameters are uncertain, necessitating an online adaptive controller. The usual strategy for controlling uncertain systems is to use either robust control or adaptive methods. For this application, it was desirable that the controller performance improve over time with more gripping cycles, making model reference adaptive control (MRAC) a good choice. Combining the MRAC with the virtual force sensing described above creates the VRAC controller.
The measurement of the system was performed using a camera and autoencoder system, which provided a fairly noisy measurement. To estimate the states of the system and deal with this noise, an extended Kalman filter (EKF) was used [38]. The control output of the MRAC is given by Equation (2):

$$u = k_r r + k_x^\top \hat{x} + u_{ad} \qquad (2)$$

where $u$ is the control action, $k_r$ is the feed-forward gain, $r$ is the reference signal, $k_x$ is the feedback gain vector, $\hat{x}$ is the vector of estimated system states, and $u_{ad}$ is the adaptive control action, derived from the disturbance model shown in Equation (3):

$$u_{ad} = \hat{w}^\top \phi(\hat{x}) \qquad (3)$$

where $\hat{w}$ is the vector of learned disturbance model weights.
The MRAC controller learns the controller gains to drive the closed-loop performance of the nonlinear fin ray gripper system to match the reference model. The parameter update equations are as follows:

$$\dot{k}_r = -\gamma_r \, r \, e^\top P B_m \qquad (4)$$

$$\dot{k}_x = -\gamma_x \, \hat{x} \, e^\top P B_m \qquad (5)$$

$$\dot{\hat{w}} = \gamma_w \, \phi(\hat{x}) \, e^\top P B_m \qquad (6)$$

where $\gamma_r$, $\gamma_x$, and $\gamma_w$ are the parameter learning rates; $\phi(\hat{x})$ is a feature vector of known Gaussian functions, all having the same width but with differing means; $r$ is the gripping force setpoint; $\hat{x}$ is the estimated state vector for the gripper dynamics; and $e$ is the error between the estimated plant states and the reference model states. Here, $P$ is the solution of a Lyapunov equation formed from the reference model dynamics, and $B_m$ is the reference model input matrix. The reference model is the state-space model shown in Equation (7):

$$\dot{x}_m = A_m x_m + B_m r \qquad (7)$$
The model error is the difference between the observed and reference model states, as defined in Equation (8):

$$e = \hat{x} - x_m \qquad (8)$$
The gripper was programmed to operate in grip and release movements defined by a force setpoint and holding time. The gripping process is expected to perform poorly at the start of its life as the parameters are initialized with a guess. As more gripping cycles are performed, the performance should converge to the reference model dynamics. To improve the controller’s performance at the start of the operation, the controller gains and the disturbance model can be initialized to values close to the converged solution. If a separate object needs to be gripped, the controller will adapt and find the new gains.
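A discrete-time sketch of one VRAC update step, consistent with the equations as reconstructed above, is shown below. The state definition, Gaussian feature centers and width, time step, and learning-rate handling are assumptions for illustration; the parameter-freeze behavior during release follows the description in the text.

```python
import numpy as np

# Assumptions: x_hat = [force, force-rate] from the EKF, reference model
# matrices (A_m, B_m) from Equation (7), and P solving the Lyapunov
# equation A_m^T P + P A_m = -Q. Numeric values are illustrative.
dt = 0.01                               # controller sample time [s]
centers = np.linspace(0.0, 80.0, 10)    # Gaussian feature means over the force range
width = 8.0                             # shared Gaussian width

def phi(x_hat: np.ndarray) -> np.ndarray:
    """Feature vector of Gaussian functions with equal widths, differing means."""
    return np.exp(-((x_hat[0] - centers) ** 2) / (2.0 * width**2))

def mrac_step(x_hat, x_m, r, k_x, k_r, w, P, B_m, rates, gripping=True):
    e = x_hat - x_m                                  # model error, Eq. (8)
    u = k_r * r + k_x @ x_hat + w @ phi(x_hat)       # control law, Eqs. (2)-(3)
    if gripping:                                     # parameters frozen during release
        epb = float(e @ P @ B_m)                     # scalar e^T P B_m
        k_r = k_r - rates["r"] * r * epb * dt        # Eq. (4)
        k_x = k_x - rates["x"] * x_hat * epb * dt    # Eq. (5)
        w = w + rates["w"] * phi(x_hat) * epb * dt   # Eq. (6)
    return u, k_x, k_r, w
```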
In summary, a method for controlling the force applied by a soft robotic gripper was proposed, designing the complete workflow for the virtual reference adaptive controller, from building a force sensor to outlining the architecture of a control algorithm. A force sensor was proposed to extract the shape of a fin ray effect gripper and encode the image using an autoencoder into a latent representation of the gripper’s features. These encoded features were then used to train a regression algorithm to estimate the force applied by the gripper finger, completing the design of the force sensor. Due to the nonlinear and uncertain dynamics inherent in many soft robots, including the gripper used in this work, an adaptive force control method was designed to drive the gripper to match a defined reference model, with the goal of reliably and consistently regulating the force in the same manner for all shapes.
3. Results and Discussion
This section presents the results of implementing the VRAC control and feedback schemes on a real fin ray gripper attached to a six-degree-of-freedom Kuka robotic arm. The control scheme was designed to be adaptable to many different objects and displayed a significant degree of generality in the types of objects the system could effectively grasp.
3.1. Visual Force Sensor
3.1.1. Force Sensor Training
A finite element analysis model was generated to train the autoencoder using objects of different shapes to deform the gripper. Five shapes were used: three circles with diameters of 10 mm, 25 mm, and 50 mm, and two rectangles, one 25 mm wide and one much longer than the gripper. Data were generated for each object at five points along the gripper, starting from the tip of the finger and moving 80 mm into the fin, with 500 images collected for each test. A design objective of this work was to address the changing stiffness along the length of the fin; as such, the contact point on the finger does not affect the estimation accuracy, as long as the object is within 80 mm of the tip. The images were then filled in so the gripper appeared as a solid white shape over a black background in a binary image. Overall, 11,100 images were generated using FEA. In real operation of the robot, there will be times when the gripper is not completely visible to the camera. To address this, the input images were stochastically obscured by circles of random size and location on the fin before training, while the output images were left unobscured, as sketched below. The obscuring process was applied to the original data, creating a total dataset of 33,300 samples. The training configuration is shown in Table 3.
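The occlusion augmentation can be sketched as follows. The circle size range and placement strategy are assumptions, as the paper specifies only random size and location on the fin.

```python
import cv2
import numpy as np

def occlude(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Stochastically obscure the binary fin image with a random circle.

    Only the autoencoder input is occluded; the training target stays clean,
    so the network learns to reconstruct the hidden shape. The size and
    placement ranges below are illustrative assumptions.
    """
    occluded = image.copy()
    ys, xs = np.nonzero(image)          # candidate pixels on the fin itself
    if len(xs) == 0:
        return occluded
    i = rng.integers(len(xs))
    radius = int(rng.integers(10, 60))  # random occlusion size in pixels
    cv2.circle(occluded, (int(xs[i]), int(ys[i])), radius,
               color=0, thickness=-1)   # filled black circle
    return occluded
```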
The data were split into 521 batches of 64 images during training. The network, with the structure outlined in Figure 6, was trained for approximately nine minutes over five epochs, using binary cross-entropy as the loss function. Similar results could likely be achieved with less data; in this case, however, the data were readily available and training was not extensive, so large quantities of data could be used with minimal effort.
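A minimal training loop consistent with this setup might look as follows, reusing the `FinRayAutoencoder` sketch above. The Adam optimizer, learning rate, and placeholder tensors are assumptions; the paper specifies only the batch size, epoch count, and loss function.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder tensors standing in for the (occluded input, clean target)
# image pairs of shape (N, 1, 512, 512); N=33,300 gives ~521 batches of 64.
clean_imgs = torch.rand(256, 1, 512, 512)
occluded_imgs = clean_imgs.clone()

model = FinRayAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed optimizer
criterion = torch.nn.BCELoss()                             # Equation (1)

loader = DataLoader(TensorDataset(occluded_imgs, clean_imgs),
                    batch_size=64, shuffle=True)

for epoch in range(5):
    total = 0.0
    for x, target in loader:
        recon, _ = model(x)                # reconstruction and latent vector
        loss = criterion(recon, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total += loss.item()
    print(f"epoch {epoch + 1}: mean BCE = {total / len(loader):.4f}")
```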
Figure 9 plots loss across the epochs during the training process.
As seen in Figure 9, the loss converges reasonably quickly, in about three epochs. Extra training time allowed the model to reduce the loss slightly, but it appears to have had a minimal effect. The autoencoder was evaluated using five-fold cross-validation, holding out 6660 images as the validation dataset. The results of the autoencoder training process are summarized in Table 4.
Figure 10 shows a sample set of the test results, demonstrating that the model can effectively recreate the shape of the gripper even in the presence of large obstructions. The mean squared error on the test set was calculated to be 0.172 pixels.
In Figure 10, the first row shows the original images, the second row shows the occlusions, and the bottom row shows that the model could reconstruct the fin's original shape in this test set while ignoring the occlusions. Some failure cases were found, especially when a larger occlusion covered most of the tip of the gripper.
3.1.2. Force Sensing
The generalization of the visual force sensor was tested by training a random forest regression on two separate round shapes, with diameters of 10 mm and 50 mm. The regression model took the encoded image as input and output the predicted force applied by the gripper.
Table 5 summarizes the random forest regression model. This model was then tested on a third shape between the two, a 25 mm circle. To ensure that the model was trained on high-resolution force data, the data were collected by pushing the finger into the object in steps of one millimeter over a range of 32 mm. This indentation depth was chosen because the gripper stalls at 80 N, and 32 mm is the depth that can be reached without stalling for all shapes tested. The training objects and results are shown in Figure 11 and Figure 12, and the results for the novel test object are shown in Figure 13. The results below compare the true value recorded by a load cell against the force measured by the trained camera sensor as the gripper performs a slow gripping motion across its trained range of motion.
The labeled components describing the gripping process are summarized in Table 6. Table 7 shows the camera sensor's performance on the three objects of different diameters. As can be seen in the table, the sensor performs well on both the training and test shapes. The novel shape performs the worst while still achieving a mean squared error of 1.029 N², which is acceptable. Because the input image is perspective-warped to view the gripper from a normalized angle, the camera sensor is fairly robust to changes in camera angle, provided it is calibrated correctly after the change. This, however, is a limitation of the design, as the calibration must be repeated every time the camera is moved.
A comparison of the methods developed in this paper to other work can be seen in Table 8. The force estimation was compared to four methods, each using a different strategy. De Barrie et al.'s work is the most similar to the methods presented here: a U-Net neural network was used to predict the stress in the fin ray structure, and a dense neural network was used to estimate the gripping force, with FEA generating the data [18]. Taking a more analytical approach, Yao et al. developed a solution using a chained-beam constraint model, a method used to model flexible cantilever beams, and converted the deflections into a force [20]. Other works used more hardware-driven approaches. Using a series of deformation sensors along the back of the finger, Chen et al. measured the deflection and mapped the measurements to an estimated force [19]. The final method compared, developed by Xu et al., used visual markers placed at nodes on the finger to measure the deflections and converted these measurements into forces, once again using FEA tools [22].
As seen in the table, the proposed method performs as well as the other existing methods. Direct comparison is difficult, as the methods of evaluating force sensor performance vary between researchers. Another complication is the differing stiffnesses of the grippers used in the experiments. To make the comparison easier, a normalized absolute error was computed by dividing the absolute error by the maximum force value; while still imperfect, it gives a better basis for comparing the metrics. The method proposed in this paper also has benefits that are not quantifiable using the metrics above. The most significant difference between the proposed method and the others outlined above is its ability to handle large forces, which was not tested in the other research. Other works use fin ray effect grippers with much lower stiffness than the Festo gripper. The Festo gripper deforms significantly less than the custom grippers made for other works, requiring a more sensitive model to predict the force accurately, and the Festo finger can provide significant force for real-world gripping applications. However, one issue affecting the sensor in this work is that forces below five Newtons are difficult to measure, as the gripper can remain deformed after contact, resulting in unreliable readings at low forces. This could be fixed by using a gripper made of a more elastic material, which would snap back into shape more quickly, or potentially resolved in software.
3.2. Force Control
The force controller was tested by grasping a cylindrical object of approximately 10 mm in diameter. The controller gains and feature weights were initialized at zero, demonstrating the ability of the system to converge to an appropriate set of gains and weights; the controller converges faster if the gains start closer to the desired values. To reduce the complexity of the control problem, the system's dynamics were simplified to a second-order approximation. The reference model was defined by a damping ratio ($\zeta$) and settling time ($t_s$). The system was tested with different reference models to observe the performance, and the values used here were chosen for the reliability of the parameter convergence; the controller could converge for settling times up to 0.5 s and follow alternate damping ratios. The chosen reference model used in the experiments in this paper is shown in Equation (9):

$$G_m(s) = \frac{\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2} \qquad (9)$$

where $\omega_n$ is the natural frequency, determined by $\zeta$ and $t_s$.
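The reference model matrices can be derived from $\zeta$ and $t_s$ as sketched below. The 2%-band settling-time approximation and the numeric values are assumptions, since the exact design values are not reproduced here; the paper reports reliable convergence for settling times up to 0.5 s.

```python
import numpy as np

# Hypothetical design values; the standard 2%-band approximation
# t_s = 4 / (zeta * w_n) is assumed to recover the natural frequency.
zeta, t_s = 0.9, 0.4
w_n = 4.0 / (zeta * t_s)

# Controllable canonical form with x_m = [force, force-rate]; unity DC gain
# so the first reference state tracks the force setpoint r (Equation (7)).
A_m = np.array([[0.0, 1.0],
                [-w_n**2, -2.0 * zeta * w_n]])
B_m = np.array([0.0, w_n**2])
```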
Finding a set of learning rates that allowed the gains to converge was a challenging problem. The chosen learning rates, denoted as $\gamma$ in Equations (4)–(6), are shown in Table 9 for each parameter.
The experimental results of the gripping are shown in Figure 14. The test was conducted using a repeating grip and release pattern, with gripping setpoints of 30 N and 15 N and a release between grasps. During the release, the controller parameters remained frozen, updating only while the gripper was in the gripping portion of the cycle. For the test, the object was gripped twenty times, ten at each setpoint, over three minutes. The convergence of the controller gains can be seen in Figure 15.
As shown in the figure, examining the force response, the controller starts with very poor performance, unable to reach the setpoint. After a few cycles, the gripping force approaches the desired setpoint, with a steady-state error that eventually disappears over further gripping cycles. A similar trend appears for the derivative state, where the real system approaches the dynamics of the reference model.
A practical issue in the MRAC controller is parameter drift, where noise in the measured signal causes the controller parameters to continue growing despite good performance, which leads to instability. This was resolved by placing a lower limit on the error signal, where the controller gains remain frozen if the error is below a set threshold. Saturating the error signal in this manner was an effective method of limiting parameter drift. After tuning the saturation limit for each parameter, the controller maintained good performance.
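The dead-zone can be sketched as a gate on the error used in the parameter updates, as below. The threshold values are illustrative, not the tuned limits from the paper.

```python
import numpy as np

# Adaptation is frozen whenever a model-error component is inside the dead
# zone, preventing noise-driven parameter drift. Thresholds are hypothetical.
deadband = np.array([0.5, 2.0])  # per-state limits: [force, force-rate]

def gated_error(e: np.ndarray) -> np.ndarray:
    """Zero out error components inside the dead zone so the gains stay frozen."""
    return np.where(np.abs(e) < deadband, 0.0, e)

# Usage within the update step: e = gated_error(x_hat - x_m) before
# computing e^T P B_m in Equations (4)-(6).
```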
The controller gains and weights were then saved and reloaded for a new gripping process, shown in Figure 16, which provides a zoomed view of the gripping process. As seen in the zoomed plot, the gripping force closely matches the reference model for both system states. The controller struggles to match the derivative state exactly, as the derivative estimates are often noisy, but performance remains acceptable. An intentionally designed limitation of the controller can also be seen in Figure 16: the controller does not match the reference during the release cycle. This is intentional, as the release portion of a grasp does not require precise force control, so the adaptive mechanisms are turned off for this motion.
To summarize the results of this work, the virtual reference adaptive control method using visual sensing, an autoencoder, and random forest regression effectively estimated the gripping force of compliant gripper fingers. Utilizing visual force estimation, an adaptive control scheme was implemented to regulate the applied force of the gripper fingers. The VRAC proposed in this work generally applies to many objects of various shapes and sizes. The methods outlined in this work could also be easily transferred to other designs of soft grippers deformed by either actuation or tendons. The method of estimating the force that the gripper applies is easily transferable, as the estimation only requires the FEA model to encode the shape and a load cell for the force measurement.
4. Conclusions
The methods employed in this work showed that a visual force sensor can provide good force feedback when applied to the physical operation of a soft robotic gripper. The force sensor achieved a mean squared error of approximately one Newton squared on a novel object. These results are comparable to those found in [18], where a more computationally intensive network was used. Direct comparison is difficult, as the maximum force value for this work was approximately thirty Newtons, compared to four Newtons in [18] and much of the other work. The major limitation of the force sensor, as presented here, is that it is not completely robust to occlusions of the fin. The simulated occlusions during training helped significantly in this regard but could only do so much; when severe occlusions occurred during the gripping process, the measured gripping force appeared to saturate early, potentially resulting in unreliable readings. Due to the need for perspective warping to standardize the viewing angle of captured images, it was crucial to calibrate the camera position and warping source points to create the desired final image. This moderately tedious process had to be repeated every time the camera was bumped. A potential solution to this limitation is discussed in the future work below.
An adaptive force controller was designed to incorporate visual force feedback into a closed-loop control scheme and further investigate the effectiveness of a visual force sensor. The results show the algorithm learning the controller gains and the system nonlinearity over time, resulting in closed-loop behavior that closely matches the designed reference model. Qualitatively, the system follows the reference model, giving the results desired of a closed-loop controller. One limitation of this method was the inability of the system to accurately match the derivative state of the reference model due to noisy measurements, which created the need to limit the error signal at low values. Limiting the controller in this way prevents parameter drift but creates a permanent model-tracking error in the system. A further limitation of the control scheme was the time required to achieve acceptable control performance: the first few grip cycles were prolonged, with poor control performance, because the learning rates were set to small values, as larger values often led to instability and non-converging gains. A further improvement to the gripping architecture would be a more sophisticated algorithm for the position control of the gripper; the current structure uses a PI controller for position, which in this system is the variable manipulated by the force controller.
In conclusion, a model reference adaptive controller was designed and implemented using visual force feedback to control the applied force by a soft robotics gripper. The solution used an autoencoder and a regression algorithm to estimate the force optically, and an MRAC controller effectively regulated the desired force after a few grip and release cycles of the gripper. Using model reference adaptive control and visual force sensing has several vital implications in soft robotics. First, a visual force sensor can remove the need for embedding sensors into the gripper’s soft material, reducing the manufacturing time and implementation complexity. Cameras are simple to set up and allow for an intuitive approach to sensing. An effective closed-loop control strategy allows for precise and delicate gripping of potentially fragile parts and can create a more efficient gripping process.
Future Work
While this work presents a good framework for controlling this class of adaptive grippers, more work remains. A more sophisticated method could be used to extract the fin ray finger from the image. An autoencoder with a more complex structure could be trained by varying the observation angle of the input while leaving the output image untouched. This would allow the autoencoder to deal with input variations, removing the need for the perspective warping step in the force sensor. Similar methods could be used to handle variations in lighting and background, which would improve the segmentation of the fin from the background.
It would also be interesting to investigate other control methods, such as nonlinear model predictive control, dual iterative control, or even robust control strategies, and compare the results. This could potentially lead to discovering a method that can converge to a solution more rapidly or reduce the complexity of the control algorithm. Further development of the control architecture of this system could lead to a model with memory of the shape and dynamics of the gripper interacting with objects of differing shapes, which could allow the system to learn ideal methods of interacting with the target object.
The overall future direction of this project will be to embed more intelligence into the gripping operation, focusing on generating setpoints for the gripper before interacting with the object. One of the most promising technologies across robotics sub-disciplines is the use of large language models to add reasoning to robotic systems. There has been much research on applications of transformer-based models in recent years, and it is no surprise that this has translated to robotics: these models can ingest large web-scale datasets and have shown impressive emergent reasoning ability [39]. Large language models could be applied to the robotic gripping process to estimate the materials and density of target objects and use these estimates to approximate the mass and force required for effective manipulation. A slip detection algorithm could be added to the system to further the impact of an object estimation method capable of inferring the weight and other properties of objects. Given slip feedback from the environment, the models would be able to learn the best approach to gripping objects, improving as they interact with the environment.