2.1. Smartwatches
Ten different smartwatches from nine manufacturers, all available on the market as of May 2024, were included in this study (Table 1). The smartwatches were selected to cover a wide range of manufacturers with adequate market shares, ranging from entry-level products at a recommended retail price of 120 €, such as the Mi Watch (XIA) (Xiaomi, Inc., Beijing, China), to an advanced triathlon-specific model, the Garmin Forerunner® 955 Solar (GAF) (Garmin AG, Schaffhausen, Switzerland), at 650 €. An overview of global sales figures for wearables from 2014 to 2022 listed eight leading manufacturers, seven of which are included in this study: Apple, Samsung, Xiaomi, Huawei, Fitbit, Garmin, and Fossil (in order of rank) [19]. Of the most popular smartwatch brands in Germany in 2023, the top seven are represented in this study, supplemented by Fossil at rank 10. Moreover, Polar was included owing to the popularity of its smartwatches in Germany [19] and because Polar watches were analysed in previous studies [15,16,18].
With regard to movement tracking, a multi-GNSS analysis is more accurate than a single-GNSS analysis [20]. For this reason, and to achieve better comparability, the GNSS function of the watches was set to utilise both the American GPS and the Russian GLONASS systems in parallel where applicable. If this option was not available, the watch’s default settings (commonly GPS only) were used. The possible GNSS options are shown in Table 1, based on the information from the respective manufacturers. Further hardware specifications can be found in Table A5. Notably, Fossil does not specify the supported GNSSs for the Fossil Gen 6 Smartwatch (FOS) (Fossil, Inc., Richardson, TX, USA). However, as its Snapdragon Wear 4100/4100+ (Qualcomm Technologies, Inc., San Diego, CA, USA) processor supports GPS, GLONASS, Galileo, and BeiDou (BDS), that smartwatch is expected to perform accordingly [21]. To compensate for any variability in production, GNSS data were simultaneously collected using two separate watches of the same model, worn unilaterally by the same individual when running or mounted to the handlebar when cycling. During running and swimming, each smartwatch was double-checked to be firmly attached and tightly worn on the athlete’s wrist, ensuring direct skin contact and minimising any relative motion between watch and wrist (Figure 1).
Table 1.
Overview of the ten investigated smartwatches, their manufacturers, and available GNSS options.
Smartwatch | Manufacturer | Abbreviation | GNSS Options | References |
---|---|---|---|---|
GTS3 | Amazfit (Zepp North America, Inc., Irvine, CA, USA) | AMA | GPS, GLONASS, Galileo, BDS, QZSS | [22,23] |
Watch SE | Apple, Inc. (Cupertino, CA, USA) | APP | GPS, GLONASS, Galileo, QZSS | [24] |
Versa 4 | Fitbit, Inc. (San Francisco, CA, USA) | FIT | GPS, GLONASS | [25] |
Gen 6 Smartwatch | Fossil, Inc. (Richardson, TX, USA) | FOS | GPS, GLONASS, Galileo, BDS | [21,26,27] |
Forerunner® 955 Solar | Garmin AG (Schaffhausen, Switzerland) | GAF | GPS, GLONASS, Galileo, BDS, QZSS, IRNSS | [28,29,30] |
Venu® 2 | Garmin AG (Schaffhausen, Switzerland) | GAV | GPS, GLONASS, Galileo | [31,32] |
Watch GT 3 | Huawei (Shenzhen, China) | HUA | GPS, GLONASS, Galileo, BDS, QZSS | [33,34] |
Ignite 2 | Polar, Inc. (Kempele, Finland) | POL | GPS, GLONASS, Galileo, QZSS | [35,36] |
Galaxy Watch 4 | Samsung Electronics Co., Ltd. (Seoul, South Korea) | SAM | GPS, GLONASS, Galileo, BDS | [37,38] |
Mi Watch | Xiaomi, Inc. (Beijing, China) | XIA | GPS, GLONASS, Galileo, BDS | [39] |
For all measurements and watches, the most suitable sport mode for the planned measurement was selected. For running, this was either outdoor running or track running; for watches that did not provide a specific track running mode, the regular outdoor running mode was selected. For road cycling, the outdoor cycling mode was used. If the watch did not offer this mode, the outdoor running mode was chosen instead, which was necessary for the FOS watch. Because of the higher average movement speed in cycling, it can be assumed that this differing choice of mode did not significantly influence the cycling distance measurements. For swimming, the indoor swimming mode was selected for every watch, and the lap length was set to 50 m. If a software update for a device or its corresponding application became available during the measurement period, it was installed to keep the device software up to date.
2.3. Heart Rate Measurements
The arterial blood pressure at rest was measured manually and bilaterally with a sphygmomanometer (boso med 1, Bosch + Sohn GmbH & Co. KG, Jungingen, Germany) by a physician. The resting heart rate was determined by electrocardiography (ECG), which was also used to rule out any deviations in the cardiac currents that could have led to an incorrect measurement. Except for two extrasystoles in one subject, which did not lead to exclusion due to their statistical insignificance, no abnormalities were detected that could have contributed to false measurements. In particular, heart rate and pulse rate could be treated as equivalent for all subjects. In addition, the resting ECG and blood pressure measurements were examined to detect any pathologies that would have contraindicated a stress test. If the ECG results were normal and the systolic blood pressure was ≤160 mmHg, the treadmill test was started.
For this purpose, the participants completed 5 × 3 min intervals at a self-selected running speed (min.: 8 km/h) on a motorised treadmill (h/p/cosmos saturn®, 250/100, h/p/cosmos sports and medical GmbH, 83365 Nußdorf–Traunstein, Germany or Star Trac 10 FreeRunner™, MERCOR Fitnesskonzepte GmbH Leipzig, Germany, respectively). A 1.5% incline was set to compensate for the absence of air resistance. While running, optical heart rate data (i.e., pulse rate data) were collected continuously by two different smartwatch models in parallel using their photoplethysmography sensors, with one watch being worn on each wrist. As a result, the total number of intervals required to test all 10 smartwatches was reduced from 10 to 5. As a reference, the true heart rate was measured electronically using a chest strap heart rate monitor (Garmin HRM-PRO, Garmin AG, Schaffhausen, Switzerland). The validity of the chest strap measurement has been confirmed in previous studies [40,41,42,43]. The measurements using the smartwatches and the chest strap started simultaneously and ended after 3 min of running at the target speed. All heart rate measurements were conducted with a data rate of at least 1 Hz.
2.4. Tracking of Running and Cycling Distances
To evaluate the accuracy of the distance tracking measurements, the smartwatches were tested on reference routes for running and cycling. To avoid environmental influences [44], the measurements were performed at the same location and on the same date and time under a clear sky. Before the GNSS measurements were performed, a bike computer (Sigma BC 8.12, SIGMA-ELEKTRO GmbH, Neustadt, Germany) was calibrated with the tyre size. The rolling length of one tyre’s circumference was measured by marks on the ground while the rider was sitting on the bike (tyre size: 25 × 700c at 7.5 bar/109 PSI). This calibration was cross-checked and confirmed by measuring the length of a five-lap course on a 400 m stadium track in lane 1 as described below. With this calibration, all running and cycling distances were measured as reference values by riding the same course at least twice on the bike (with the tyre pressure kept at its calibration-time value given above). Additionally, the true lengths of the running and cycling routes were cross-checked using OpenStreetMap and Google Maps. These three distance results per course, i.e., by bike ride, OpenStreetMap and Google Maps, differed only marginally in the second decimal place (3.41 km vs. 3.4(0) km and 36.84 km vs. 36.8(0) km). Notably, this small difference was inevitable as the bike computer’s display provided two decimal places, whereas the map material yielded only one.
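The rollout calibration and the cross-check against the map-based distances can be illustrated with a minimal Python sketch. The function names and the rollout figures (6.315 m over three wheel revolutions) are hypothetical and serve only as an example; only the route lengths of 3.41 km vs. 3.40 km and 36.84 km vs. 36.80 km are taken from the text.

```python
def wheel_circumference_m(rollout_distance_m: float, revolutions: int) -> float:
    """Rolling circumference from a rollout test between ground marks."""
    if revolutions <= 0:
        raise ValueError("revolutions must be positive")
    return rollout_distance_m / revolutions


def relative_deviation_percent(measured: float, reference: float) -> float:
    """Signed deviation of a measured distance from a reference, in percent."""
    return 100.0 * (measured - reference) / reference


# Hypothetical rollout: three loaded wheel revolutions covering 6.315 m.
circumference = wheel_circumference_m(6.315, 3)

# Route lengths from the text: bike computer vs. map material (km).
run_route_check = relative_deviation_percent(3.41, 3.40)
bike_route_check = relative_deviation_percent(36.84, 36.80)

print(f"circumference: {circumference:.3f} m")
print(f"running route deviation: {run_route_check:.2f} %")
print(f"cycling route deviation: {bike_route_check:.2f} %")
```

Both route deviations stay well below half a percent, which is consistent with the rounding difference between two and one decimal places described above.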
For each run, 4 smartwatches were worn simultaneously to reduce the number of runs: two on the left forearm and two on the right forearm. The positions were numbered as (T1) proximal forearm, left; (T2) distal forearm, left; (T3) distal forearm, right; and (T4) proximal forearm, right, and were noted for each watch and trial. Five runners (3 males, 2 females) ran 4000 m on a 400 m standard tartan stadium track. The runners were instructed to stay in lane one, which is exactly 400 m in length, at a distance of 30 cm from the inner line [45]; running was maintained in this lane (with the exception of, at maximum, 6 quick overtaking manoeuvres per 4000 m). Additionally, an outdoor run of 3.41 km on an asphalt road in profiled terrain was performed. The tests on the stadium track and the asphalt road were performed in triplicate by five runners, each with four smartwatches in positions T1–T4, resulting in 5 × 3 × 4/10 = 6 separate measurements per watch. The watches were rotated among the runners, as well as the forearm position, after each run. For the cycling measurements, four cyclists (three males, one female) completed a fixed road bike course of 36.84 km on asphalt roads in both directions with five smartwatches attached to the handlebars of each bike or to the cyclist’s forearms (the latter if a non-zero pulse rate was required for correct operation), resulting in 4 × 5 × 2/10 = 4 separate measurements per smartwatch model.
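The measurement counts per watch model follow from dividing the total number of watch recordings evenly among the ten models. A minimal Python sketch of this arithmetic is given below; the function name is hypothetical, and an even distribution of recordings across models is assumed.

```python
def measurements_per_model(participants: int, repetitions: int,
                           watches_per_trial: int, n_models: int = 10) -> int:
    """Recordings per watch model, assuming an even split across all models."""
    total = participants * repetitions * watches_per_trial
    if total % n_models != 0:
        raise ValueError("recordings cannot be split evenly across models")
    return total // n_models


# Running: 5 runners x 3 repetitions x 4 watches per run, 10 models.
running = measurements_per_model(5, 3, 4)

# Cycling: 4 cyclists x 2 directions x 5 watches per ride, 10 models.
cycling = measurements_per_model(4, 2, 5)

print(running, cycling)
```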
2.6. Data Analysis and Statistics
The data collected from all smartwatches were transferred to the mobile phone apps of the respective manufacturers via Bluetooth. The following apps were used: Fitbit (Fitbit, Inc., San Francisco, CA, USA), Fossil Smartwatch (Fossil, Inc., Richardson, TX, USA), Garmin Connect (Garmin AG, Schaffhausen, Switzerland), Apple Health (Apple, Inc., Cupertino, CA, USA), Huawei Health (Huawei, Shenzhen, Guangdong, China), Mi Fitness (Xiaomi, Inc., Beijing, China), Polar Flow (Polar Electro, Inc., Kempele, Finland), Samsung Health (Samsung Electronics Co., Ltd., Seoul, South Korea), and Zepp (Zepp North America, Inc., Irvine, CA, USA). Afterwards, the data were transferred manually to Microsoft® Excel® for Microsoft 365 (Microsoft Corporation, Redmond, WA, USA). All mathematical analyses and statistical tests were performed with Microsoft® Excel® for Microsoft 365, MATLAB R2023a (MathWorks Inc., Natick, MA, USA), and JASP (JASP Team (2023), Version 0.17.1, University of Amsterdam, Amsterdam, The Netherlands).
To evaluate the accuracy of the heart rate measurement, the mean heart rate and the peak heart rate were each determined for 3 min intervals and then compared to the heart rate values from the chest strap reference measurements. Average and peak heart rate values for the 3 min intervals were directly reported by the smartwatch under study and the chest strap reference, respectively, so that no further averaging or peak detection had to be conducted. The 3 min interval per stage was chosen because of its commonality and significance in performance diagnostics. Derived descriptive statistical parameters comprised minimum deviation (from reference), maximum deviation, mean absolute error, mean absolute percentage error, median, and interquartile range. Pearson and Spearman correlation coefficients were calculated between measured and reference heart rates, along with their coefficients of determination (R²) and levels of significance.
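As an illustration of the error metrics named above, the following Python sketch computes the minimum and maximum deviation, mean absolute error, mean absolute percentage error, and the Pearson correlation with its R². It is not the authors' analysis (which was run in Excel, MATLAB, and JASP); the function names and the heart rate values are hypothetical.

```python
from math import sqrt
from statistics import mean


def error_metrics(measured, reference):
    """Deviation statistics of measured values against paired reference values."""
    deviations = [m - r for m, r in zip(measured, reference)]
    return {
        "min_dev": min(deviations),
        "max_dev": max(deviations),
        "mae": mean(abs(d) for d in deviations),
        "mape": 100.0 * mean(abs(d) / r for d, r in zip(deviations, reference)),
    }


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical mean heart rates (bpm) for five 3 min intervals.
watch = [150, 160, 171, 178, 185]   # smartwatch under study
strap = [152, 161, 170, 180, 186]   # chest strap reference

metrics = error_metrics(watch, strap)
r = pearson_r(watch, strap)
r_squared = r ** 2

print(metrics)
print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
```

A Spearman coefficient could be obtained the same way by applying `pearson_r` to the ranks of both sequences, with ties assigned their average rank.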
To verify the accuracy of the GNSS-based distance measurements, the distances measured by the smartwatches were compared to the true reference distance. In particular, descriptive statistics included the arithmetic mean, minimum, maximum, mean absolute error, mean absolute percentage error, standard deviation, and interquartile range. To test for statistical significance of possible deviations between measured distances and the reference value, t-tests were carried out with effect sizes characterised by Cohen’s d. To compare the accuracy of the watches among each other, a one-way repeated-measures ANOVA was conducted with the smartwatch model being the independent variable. In addition, a t-test was conducted for the 4000 m stadium runs to investigate if wearing the watch on the left or right forearm (i.e., inside or outside the lane) had an impact on the measured distance.
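The comparison of measured distances with a fixed reference value can be sketched as a one-sample t-test with Cohen's d. The text does not state which t-test variant was used, so the one-sample form is an assumption, and the function name and distance values are hypothetical. The p-value is omitted because it requires the t-distribution, which a statistics package would supply.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, reference):
    """Return (t statistic, Cohen's d, degrees of freedom) against a fixed reference."""
    n = len(sample)
    s = stdev(sample)                    # sample standard deviation (n - 1)
    d = (mean(sample) - reference) / s   # Cohen's d for a one-sample design
    t = d * sqrt(n)                      # equals (mean - reference) / (s / sqrt(n))
    return t, d, n - 1


# Hypothetical distances (km) reported by one watch model for six 4000 m runs.
distances = [3.98, 4.02, 4.01, 3.99, 4.03, 4.00]

t_stat, cohens_d, dof = one_sample_t(distances, reference=4.00)
print(f"t({dof}) = {t_stat:.3f}, d = {cohens_d:.3f}")
```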
For the swimming tests, the number of metres swum by the participants, the number of strokes used, and the SWOLF index were evaluated. The SWOLF value is the time in seconds plus the number of strokes required to swim a given distance, i.e., SWOLF = time in seconds/lap + strokes/lap. Descriptive statistics comprised the arithmetic mean, minimum, maximum, mean absolute error, mean absolute percentage error, standard deviation, and interquartile range.
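The SWOLF definition above translates directly into code. The following Python sketch uses a hypothetical function name and hypothetical lap data for a 50 m pool.

```python
from statistics import mean


def swolf(lap_time_s: float, strokes: int) -> float:
    """SWOLF for one lap: time in seconds plus number of strokes."""
    return lap_time_s + strokes


# Hypothetical 50 m laps as (time in seconds, stroke count).
laps = [(45.0, 38), (46.5, 39), (47.0, 40)]

per_lap = [swolf(t, s) for t, s in laps]
mean_swolf = mean(per_lap)

print(per_lap, mean_swolf)
```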
For all statistical tests, the level of significance was set to p < 0.05. If not stated otherwise, results are given in terms of mean ± standard deviation (SD).