1. Introduction
According to the UEFA Elite Club Injury Study Report for the 2016/17 season, most injuries in professional football occur in the thigh and knee area [1]. The report further shows that the most common type of injury is muscle rupture, strain, or cramps, representing 45% of all injuries in professional football [1]. Muscle strains in the hamstrings account for 12–16% of all injuries among football players and have high recurrence rates (14 to 62%) [2,3,4]. Among professional football clubs, a typical hamstring injury results in an average of 17 days of absence from training and matches, with an average cost of EUR 16.666 per day that the player is unavailable to the team [5].
Several modifiable risk factors for muscle injuries have been identified, including fatigue, high-speed running loads, overall muscle strength, and inter- and intralimb asymmetry [6].
Both inter- and intralimb asymmetry have been proposed to increase the risk of sustaining a hamstring strain injury [6]. Interlimb asymmetry has typically been assessed using the bilateral Nordic Hamstring Exercise, but according to Cuthbert et al., this exercise is not necessarily the best way to evaluate hamstring strength and interlimb imbalances [7]. The authors stated that in bilateral exercises, a compensatory strategy is typically adopted and can shift between test sessions. One way of minimizing the risk of compensatory strategies is to use a unilateral exercise instead. Furthermore, regarding intralimb asymmetry, the strength ratio between the hamstrings and quadriceps (H/Q ratio) has been identified as particularly important [8,9]. Since the assessment of muscle strength is considered key to preventing injuries in the thigh area, objective measurements with high reliability and validity are essential [10]. Currently, stationary isokinetic dynamometers (ID) have proven to be both reliable and valid and are considered the “gold standard” [11]. However, IDs are expensive, time- and space-consuming, and often difficult to operate, warranting further development and investigation of more practically applicable measurement methods.
Previous investigations have evaluated the reliability and validity of less expensive, portable equipment for measuring peak force at different joints, such as field-based tests [12] and handheld dynamometers [10,13,14,15]. Handheld dynamometers are generally known to be a reliable measure of muscle strength; however, a systematic review highlights that the concurrent validity of handheld dynamometers varies depending on which joint is measured [13]. Correlations between handheld dynamometers and ID have been found to be very high (ICC from 0.73 to 0.98) for hip abductors/adductors and hip flexors/extensors [16,17] and moderate (ICC from 0.37 to 0.91) for ankle plantar/dorsiflexion and knee flexors/extensors [16,17,18,19,20]. Even though handheld dynamometers are less expensive and portable, they still require a strong and experienced practitioner to obtain reliable and valid measures [21].
Recently, a novel device called an H-station (FysioMeter, Aalborg, Denmark) has been developed with the purpose of monitoring athletes’ quadriceps and hamstrings strength, using two Nintendo Wii balance boards (WBB). The equipment is cheaper, easier to use, and portable compared to the ID and may therefore be more useful for in-season monitoring of players. However, no studies have investigated the between-session reliability and concurrent validity of the H-station compared to the ID and its potential to quantify the H/Q ratio.
Therefore, the purposes of this study were: (a) to determine the between-session reliability of the H-station when measuring isometric quadriceps and hamstrings strength and H/Q ratio, and (b) to determine the concurrent validity of these measures when compared to a gold standard ID (Humac NORM, CSMi, Stoughton, MA, USA).
2. Materials and Methods
2.1. Experimental Approach to the Problem
Measurements were performed on two separate occasions exactly seven days apart at Aalborg University, Aalborg, Denmark, between 1 October 2022 and 1 December 2022. All tests were conducted by the same rater (F.H.M.). The study followed a counterbalanced testing pattern, so half of the participants started in the H-station and the other half started in the ID (Figure 1). The same order of measurements was followed on both test occasions. Measurements were performed at the same time of day to avoid circadian rhythm influencing the measurements [22]. All participants were tested on their dominant leg, defined as the leg they prefer to kick a ball with. They were instructed to wear the same shoes on both testing days. At the initial testing session, descriptive characteristics of the participants were collected, including height, body mass (kg), body fat percentage (%), muscle mass (kg), and length of the lower leg (from the lateral femoral epicondyle, the estimated knee axis of rotation, to the posterior part of the sole of the foot). Body composition was measured with a bioimpedance apparatus (InBody 270, Biospace, San Francisco, CA, USA).
The study followed the guidelines for reporting reliability and agreement studies (GRRAS) [23]. The study was conducted in accordance with Danish Legislation and the Helsinki Declaration. The North Denmark Region Committee on Health Research Ethics (LBK nr. 1083) was contacted, and the study was granted exemption from requiring ethics approval. All participants provided written informed consent before any study activities were initiated.
2.2. Participants
A convenience sample of 20 asymptomatic and physically active (>3.5 h a week) participants (17 males and 3 females) was recruited from Aalborg University (Aalborg, Denmark). Participants were excluded if they met one or more of the following criteria: (1) a history of traumatic spine or lower extremity injury within the past three months, (2) pain in the spine or lower extremity, (3) strenuous exercise within the last 24 h before testing, and (4) caffeine or nicotine intake within the last 8 h. A short questionnaire was completed to ensure that none of the exclusion criteria applied. An a priori sample size calculation was not conducted, as there were no established data on which to base it.
2.3. Procedures
2.3.1. Procedure in the H-Station
Participants were tested in an H-station (FysioMeter, Aalborg, Denmark) using a standard Nintendo WBB (Nintendo, Kyoto, Japan) connected to a standard PC. Initially, participants performed a short quadriceps-specific warm-up followed by the isometric quadriceps strength test (Quad-H). Then, they undertook a short hamstring-specific warm-up before the isometric hamstring strength test (Ham-H). The warm-up consisted of four isometric sub-maximal contractions followed by one maximal contraction, in accordance with procedures in previous studies [24,25,26]. Besides improving performance and preventing injuries, the warm-up also served to familiarize the participants with the tests. Following warm-up, the participants had two minutes of rest before the test to minimize the potential effect of fatigue.
For the Quad-H test, participants were seated on a standard treatment table with their hips flexed at approximately 85 degrees. The thigh was secured to the treatment table with two Velcro® straps placed at the proximal and distal parts of the thigh (Figure 2a). The H-station was then placed as close to the participant as possible, and participants were instructed to place the tip of their shoe on the marked center of the WBB. Then, the tester adjusted the height of the treatment table to align the sole of the shoe parallel to the floor. The height of the treatment table and the knee angle, measured with a goniometer, were noted and reproduced for the second testing occasion. To ensure that the H-station did not move during testing, a weight plate of 25 kg was placed in front of the H-station with the tester standing on the weight plate and securing both the top and bottom of the H-station (Figure 2a). The Quad-H test consisted of three sets of one isometric maximal knee extension, with one minute of rest between each set. Participants were instructed to keep their arms crossed over their shoulders, to push “as fast and hard as possible” with their toes, and to hold the contraction for 3 s. Before the Ham-H test, a two-minute rest period was given to the participants.
For the Ham-H test, the H-station was rotated 180 degrees horizontal to the treatment table (Figure 2b). Thereby, the WBB was positioned posterior to the participant’s feet and the dominant knee of the participant was touching the knee padding of the H-station. The participant’s thigh was secured using two straps, similar to the Quad-H test. The Ham-H test followed the same procedure as the Quad-H test, except participants were instructed to perform a knee flexion.
For both the Quad-H and the Ham-H tests, verbal encouragement and real-time visual biofeedback of the force trace were provided to ensure maximum effort during each repetition [27]. The trial was repeated if a countermovement was detected, to ensure a contraction starting from rest, or if one of the three repetitions deviated by more than 20% from the maximum peak value.
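As an illustration of the 20% repetition criterion described above, the following minimal Python sketch flags a set of three peak values for repetition. The function name `needs_repeat` and the example values are hypothetical, and the check assumes that deviation is expressed relative to the highest peak in the set; countermovement detection is not covered.

```python
def needs_repeat(peaks_kg, tolerance=0.20):
    """Return True if any repetition is more than `tolerance` below the best peak.

    Assumption: deviation is computed relative to the highest peak of the set.
    """
    best = max(peaks_kg)
    return any((best - peak) / best > tolerance for peak in peaks_kg)


# Hypothetical peak forces (kg) for three repetitions
print(needs_repeat([40.0, 38.5, 30.0]))  # 30.0 is 25% below 40.0
print(needs_repeat([40.0, 38.5, 36.0]))  # all repetitions within 20%
```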
Force data were collected via one Nintendo WBB with four strain gauge transducers positioned at each of the four corners. Data from the H-station were sampled at 100 Hz from each transducer and were transferred via Bluetooth to FysioMeter’s software version 5.0.1 on a tablet. Subsequently, FysioMeter software filtered data by using a 4th-order Butterworth lowpass filter with a cutoff frequency of 20 Hz. The FysioMeter software calculates one variable: absolute maximum force measured in kg.
2.3.2. Procedure in the ID
Participants were tested in a Humac NORM ID (CSMi, Stoughton, MA, USA) connected to a Windows laptop with Humac NORM software installed (HUMAC 2009, v.9.7.1). Participants performed a short warm-up followed by the isometric strength test, initially for the quadriceps (Quad-I) and then, the hamstring (Ham-I). The warm-up consisted of four isometric sub-maximal contractions followed by one maximal knee contraction. Following warm-up, the participants had two minutes of rest before the test.
For both the Quad-I and the Ham-I test, participants were seated with their hips flexed at 85 degrees and their thighs and torso were secured using a strap and a safety belt. An ankle strap was placed five centimeters proximal to the distal aspect of the lateral malleolus. The mechanical lever arm of the ID was aligned with the lateral epicondyle of the knee. The knee angle was adjusted to the same angle as in the Quad-H and Ham-H tests, respectively.
After the warm-up was performed, participants completed the Quad-I test, which consisted of three sets of one maximal isometric knee extension lasting three seconds, with one minute of rest in between each set. After the Quad-I test, a two-minute rest period was given to the participants before the Ham-I test. After the rest period, the participants completed the Ham-I test. The procedure for the Ham-I test was the same as Quad-I except participants were instructed to perform a knee flexion. For both the Quad-I and Ham-I tests, verbal encouragement and real-time visual biofeedback of the torque signal were provided.
2.4. Data Processing
To compare the two measuring devices, the outcome values from the H-station (peak force in kg) were converted to maximum quadriceps and hamstring torque (Nm). For the Quad-H test, the kg values were first converted to Newton (N) by multiplying by gravitational acceleration (9.81 m/s²). Secondly, the lever arm perpendicular to the force vector applied on the WBB was calculated, and finally the quadriceps torque was calculated by multiplying the applied force by the lever arm.
Maximum hamstring torque for the Ham-H test was calculated by first converting kg values to N. Since knee angle for the Ham-H test was 90 degrees, torque was calculated by multiplying the force applied on the WBB with the lever arm being the length of the lower leg.
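The conversion from kg to Nm can be sketched in Python as follows. This is a minimal illustration and not the exact calculation used in the study: the function names and the example values are hypothetical, and the perpendicular lever arm is assumed to equal the lower-leg length multiplied by the sine of the angle between the lower leg and the line of action of the force. With an angle of 90 degrees, as in the Ham-H test, the lever arm reduces to the lower-leg length.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2, as used in the text


def kg_to_newton(force_kg):
    """Convert a force reading expressed in kg to Newton."""
    return force_kg * G


def knee_torque_nm(force_kg, lower_leg_m, angle_deg=90.0):
    """Torque (Nm) = force (N) x perpendicular lever arm (m).

    Assumption: lever arm = lower-leg length x sin(angle between the
    lower leg and the force vector). At 90 degrees this is simply the
    lower-leg length.
    """
    lever_arm = lower_leg_m * math.sin(math.radians(angle_deg))
    return kg_to_newton(force_kg) * lever_arm


# Hypothetical example: 30 kg peak force, 0.45 m lower leg, 90 degree angle
print(round(knee_torque_nm(30.0, 0.45), 1))
```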
Data processing for the ID was performed by using a custom-made MATLAB version R2021a (MathWorks, Natick, MA, USA) script. Maximum torque values were extracted for each maximal isometric knee extension and flexion.
The mean of the maximum quadriceps and hamstring torque values was calculated for each of the four tests in both session 1 and session 2 and was used for further statistical analysis. The H/Q ratio was calculated by dividing the mean maximum hamstring torque by the mean maximum quadriceps torque.
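For illustration, the averaging and H/Q ratio calculation can be written as a short Python sketch. The function name `hq_ratio` and the torque values are hypothetical and only show the arithmetic.

```python
from statistics import mean


def hq_ratio(ham_torques_nm, quad_torques_nm):
    """H/Q ratio: mean maximum hamstring torque / mean maximum quadriceps torque."""
    return mean(ham_torques_nm) / mean(quad_torques_nm)


# Hypothetical maximum torques (Nm) from three repetitions per test
ham = [100.0, 110.0, 120.0]
quad = [200.0, 220.0, 240.0]
print(hq_ratio(ham, quad))  # 110 / 220
```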
2.5. Statistical Analysis
Statistical analysis was completed using SPSS version 27 (IBM, Chicago, IL, USA). p-values ≤ 0.05 were interpreted as significant. All torque values are presented in Nm. Histograms and the Shapiro–Wilk test of normality (p > 0.05) were used to test for normal distribution of the mean maximum quadriceps and hamstring torque and of the H/Q ratio measures. All outcomes met the assumption of normal distribution.
As recommended, both relative and absolute reliability were reported [28]. The relative reliability was quantified using an ICC(2,1) two-way mixed model using absolute agreement, and the corresponding 95% confidence intervals (95% CI) were also calculated for the ICC values. The ICCs were interpreted using the following criteria: poor (<0.5), moderate (0.5–0.75), good (0.75–0.90), and excellent (>0.90) [3].
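As a sketch of how a single-measure, absolute-agreement ICC can be obtained from two-way ANOVA mean squares, the following Python code implements the standard formula (MSR − MSE) / (MSR + (k − 1)·MSE + k·(MSC − MSE)/n) and the interpretation thresholds listed above. It is not the SPSS procedure used in the study; the function names are hypothetical, the example data are the classic Shrout and Fleiss ratings rather than study data, and values falling exactly on a threshold are assigned to the higher category as an assumption.

```python
def icc_absolute_single(data):
    """Single-measure, absolute-agreement ICC from an n x k table
    (rows = participants, columns = sessions or raters)."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-participant mean square
    msc = ss_cols / (k - 1)               # between-session mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def interpret_icc(icc):
    """Thresholds from the text; boundary handling is an assumption."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# Example data from Shrout and Fleiss (6 targets x 4 judges), not study data
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_absolute_single(ratings), 2))
print(interpret_icc(0.91), interpret_icc(0.89), interpret_icc(0.65))
```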
Absolute reliability was investigated with Bland–Altman plots with 95% limits of agreement (LOA) represented in absolute values and in percentage, standard error of measurement (SEM), coefficient of variation (CV), and minimal detectable change (MDC). Lastly, a paired-samples t-test was used to test whether a systematic bias was present between session 1 and session 2 for the Quad-H, Ham-H, Quad-I, and Ham-I tests.
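The absolute reliability measures can be sketched in Python as shown below. The formulas are commonly used definitions and are assumptions here, since the exact formulas applied in the study are not stated: SEM = SD × √(1 − ICC), MDC = 1.96 × √2 × SEM, and LOA = mean difference ± 1.96 × SD of the differences. Function names and example values are hypothetical.

```python
import math
from statistics import mean, stdev


def sem_from_icc(sd, icc):
    """Standard error of measurement (assumed formula: SD * sqrt(1 - ICC))."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at 95% confidence (assumed: 1.96 * sqrt(2) * SEM)."""
    return 1.96 * math.sqrt(2.0) * sem


def bland_altman(session1, session2):
    """Return (bias, lower LOA, upper LOA) for paired measurements."""
    diffs = [b - a for a, b in zip(session1, session2)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical torque values (Nm) from two sessions
s1 = [140.0, 155.0, 132.0, 160.0, 148.0]
s2 = [150.0, 158.0, 140.0, 171.0, 149.0]
bias, lower, upper = bland_altman(s1, s2)
print(round(bias, 1), round(lower, 1), round(upper, 1))

# An SEM of 10.4 Nm gives an MDC of about 28.8 Nm under the assumed formula
print(round(mdc95(10.4), 1))
```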
The concurrent validity of the H-station compared to the ID was investigated using the Pearson product moment correlations (Pearson r), and Bland–Altman plots with 95% LOA. The Pearson r correlation values were interpreted using the following criteria: ±0.0–0.09 indicates no correlation, ±0.1–0.3 indicates a small correlation, ±0.3–0.5 indicates a moderate correlation, and ±0.5–1.0 indicates a large correlation [14].
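A minimal Python sketch of the Pearson correlation and the interpretation criteria above is given below. The function names and example data are hypothetical. Because the stated ranges share their boundary values (0.3 and 0.5), the sketch assigns a boundary value to the higher category, which is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def interpret_r(r):
    """Criteria from the text; boundary values go to the higher category (assumption)."""
    size = abs(r)
    if size < 0.1:
        return "no correlation"
    if size < 0.3:
        return "small"
    if size < 0.5:
        return "moderate"
    return "large"


# Hypothetical torques (Nm) measured with two devices
h_station = [120.0, 150.0, 135.0, 170.0, 160.0]
isokinetic = [140.0, 165.0, 150.0, 200.0, 170.0]
r = pearson_r(h_station, isokinetic)
print(round(r, 2), interpret_r(r))
```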
3. Results
Of the 20 individuals screened, 19 participants were included in the analysis (16 males, 3 females).
Table 1 displays all demographic data.
3.1. Between-Session Reliability
Table 2 reports the descriptive statistics for the H-station and ID between-session reliability data for quadriceps and hamstrings strength tests and H/Q ratio. The Quad-H and Ham-H tests showed excellent reliability (ICC: 0.91 for both tests), while the H/Q-H had good reliability (ICC: 0.89). The relative reliability for the Quad-I and Ham-I tests was good (ICC of 0.80 to 0.89), while it was moderate for the H/Q-I (ICC: 0.65).
For the H-station recordings, SEM and CV between sessions were 22.5 Nm and 8.8% for Quad-H, 10.4 Nm and 7.2% for Ham-H, and 0.05 and 9.3% for H/Q-H. For the ID recordings, SEM and CV between sessions were 18.6 Nm and 6.5% for Quad-I, 17.0 Nm and 11.7% for Ham-I, and 0.06 and 12.4% for H/Q-I.
Figure 3 displays the Bland–Altman plots with LOA, investigating absolute reliability of the H-station and ID. For the Quad-H test (plot A), the Bland–Altman analysis showed a mean difference of 14.5 Nm (p = 0.063) with LOA% of 24.4%. For the Ham-H test (plot B), a significant mean difference of 11.4 Nm (p = 0.049) with LOA% of 20.0% was found. For H/Q-H (plot C), a mean difference of 0.01 (p = 0.768) with LOA% of 22.6% was found. Mean differences of 10.2 Nm (p = 0.108) and 5.4 Nm (p = 0.344) and LOA% of 17.9% and 32.4% were found for the Quad-I and Ham-I tests, respectively. For the H/Q-I, a mean difference of 0.04 (p = 0.082) and LOA% of 29.4% were found.
MDC was 62.5 Nm for Quad-H, 28.8 Nm for Ham-H, 0.15 for H/Q-H, 51.5 Nm for Quad-I, 47.1 Nm for Ham-I, and 0.17 for H/Q-I.
3.2. Concurrent Validity
Table 3 reports the correlations between the H-station and the ID for each condition, and Figure 4 displays the Bland–Altman plots with LOA. Maximum hamstring torque in the H-station showed a large correlation with the ID (Pearson r = 0.79). According to the criteria in Section 2.5, the correlation for the maximum quadriceps torque was also large (Pearson r = 0.69). The H/Q ratio showed a moderate correlation between the H-station and the ID (Pearson r = 0.37).
Bland–Altman plots displayed a significant systematic bias of 32.4 Nm (p = 0.027) with LOA and LOA% of 115.1 Nm and 43.3% for the maximum quadriceps torque (plot A). A mean difference of 9.9 Nm (p = 0.138) with LOA and LOA% of 54.3 Nm and 37.9% was found for the maximum hamstring torque (plot B). A mean difference of 0.05 (p = 0.228) with LOA and LOA% of 0.33 and 59.5% was found for the H/Q ratio measures (plot C).
4. Discussion
This study evaluated (i) the between-session reliability and (ii) the concurrent validity of quadriceps and hamstring maximum strength and the associated ratio (H/Q ratio) measured with a novel, portable, and easy-to-use alternative to gold standard IDs. The FysioMeter H-station exhibited good to excellent relative between-session reliability for the quadriceps, hamstring, and H/Q ratio measures. Furthermore, we found an acceptable absolute reliability for all the above measures. Lastly, compared to the ID, the quadriceps and hamstring tests showed large correlations (r = 0.69 and r = 0.79, respectively), while the H/Q ratio showed a moderate correlation (r = 0.37).
4.1. Between-Session Reliability
The absolute maximum torque values obtained by the H-station in the present study were very similar to findings of studies investigating a similar population [24,26,29]. Several studies have investigated the reliability of a handheld dynamometer (HHD) in relation to quadriceps and hamstring strength, but only a few have expressed the absolute values in Nm. Instead, most of the studies have used kg or N to express the absolute values obtained by HHD, which makes it difficult to compare them with the results of the current study.
In the current study, the H-station was found to be highly reliable for measuring both maximum quadriceps and hamstring torque (ICC = 0.91 for both). Furthermore, the reliability of the H-station measuring H/Q ratio was good (ICC = 0.89). A systematic bias was found between the mean of session 1 (138.2 Nm) and session 2 (149.6 Nm) for the Ham-H test, which could indicate a learning effect of the test [30]. Only one study has previously evaluated the reliability of the Nintendo Wii balance boards (WBB) for measuring lower limb strength in older adults [25]. This previous study found very similar reliability values compared to the current study, with ICC values for WBB ranging from 0.91 to 0.97, SEM from 9.7 to 13.9%, and LOA ranging from 20.3 to 28.7%. Other portable strength measuring devices, such as HHDs, have shown ICC values between 0.86 and 0.96, SEM/CV between 4.2 and 14.7%, and LOA (%)/MDC between 11.6 and 24.9% [17,18,29,31]. For the isometric hamstring strength, previous studies found ICCs between 0.89 and 0.96, SEM/CV between 4.8 and 8.6%, and LOA/MDC between 13.3 and 23.8% [17,29]. Our results for both relative and absolute reliability were within the same ranges. One study has evaluated the HHD’s ability to measure H/Q ratio [29]. This latter study found ICC values ranging from 0.87 to 0.90, which corresponds to our study’s ICC (ICC = 0.89) for H/Q ratio of the H-station.
The SEM values reported here represent variation in torque due to three overall factors: instrumental, biological, and/or experimental protocol variations [30]. Instrumental variation refers to the error or noise in the measuring equipment used. In relation to measurement error, one study has found excellent correlations (ICC = 0.83–0.99) of WBB compared to a laboratory-based force plate measuring force during a dynamic task [32]. This finding implies that the WBB is a reliable tool for measuring force in dynamic tasks, but it does not eliminate the possibility of variation due to signal noise or calibration errors. When testing human performance, random variations are usually caused by biological and experimental protocol differences [25,30]. Among biological factors, changes in mental and physical states between sessions 1 and 2 can account for some of the random variation [13]. We attempted to control the mental state by providing verbal encouragement and visual biofeedback, which are suggested to have a positive influence on participants’ physical performance [27]. It was not possible to completely control the participants’ physical state; thus, biological variation could account for the relatively high between-subject variance in the data.
4.2. Concurrent Validity
In the present study, the maximum quadriceps torque measured with the ID was significantly higher than the maximum quadriceps torque measured with the H-station. These results are in line with previous studies investigating the concurrent validity of HHD compared to an ID, which reported ICC values ranging from 0.42 to 0.85 for isometric quadriceps strength [17,18,19,20] and 0.66 to 0.91 for isometric hamstring strength [17,19].
The paired-samples t-test showed that the H-station significantly underestimated the maximum quadriceps torque values, with a mean difference of 32.4 Nm (SD = 58.7 Nm). This underestimation may be explained by the manual calculation of torque values for the H-station. This calculation is only valid on the premise that the foot contact with the WBB is perpendicular. In case of dorsiflexion of the ankle joint during the isometric knee extension, the force applied on the WBB would not have acted perpendicularly, resulting in an erroneous lever arm. This could introduce a measurement error that may have contributed to the underestimation of the maximum quadriceps torque measured in the H-station.
Aside from the previously discussed instrumental and/or biological variations causing random error, the current experimental protocol had limitations that could account for some of the variation within and between participants and the low correlation between devices. The Quad-H test depended on the practitioner’s ability to keep the H-station in place during the tests. Chamorro et al. highlight that a higher reliability of HHD can be achieved when the practitioner is stronger than the participant [13]. This aspect is especially important when testing large muscle groups like the quadriceps. Movement of the H-station during the tests would result in lower maximum quadriceps or hamstrings torque. The experimenter noticed that the H-station did move during a few tests. This was compensated for by repeating the specific test. For the Quad-H test, some participants reported discomfort (tip of the toes) during the isometric knee extension. This may have affected the motivation of the participant to produce a maximal and consistent knee extension. Furthermore, the type of shoe worn during testing differed between participants. It is uncertain whether the shoe type influences torque generation, but it is recommended that further studies try to control this parameter by providing shoes for the participants or by modifying the H-station (e.g., adding a foot pad).
4.3. Practical Applications
Our findings indicate that the H-station is a reliable tool for measuring relative strength changes in quadriceps, hamstring, and H/Q-ratio. Moreover, the H-station is relatively cheap, enabling the device to be placed in public and private clinics to, e.g., monitor patients after an anterior cruciate ligament injury, as well as in professional or semi-professional sport clubs. This has important implications for practitioners, coaches, and clinicians, as the device will enable monitoring of the strength of the lower limb of athletes over a season or after an injury for handball or football players. In that respect, SEM and MDC values are very useful to ensure that gains in quadriceps and hamstring strength are above measurement errors and clinically relevant.