Using Machine Learning to Understand Injuries in Female Agricultural Operators in the Central United States
Abstract
1. Introduction
2. Materials and Methods
2.1. Sample
2.2. Measures
2.3. Statistical Analysis
2.3.1. Sample
2.3.2. XGBoost
2.3.3. Logistic Regression
3. Results
3.1. Sample
3.2. XGBoost
3.3. Logistic Regression
3.4. Comparing XGBoost to Logistic Regression
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Variable | Total Sample (n = 1529), n (%) | Training Sample (n = 1070), n (%) | Test Sample (n = 459), n (%) |
---|---|---|---|
Injury (outcome) | |||
Yes | 156 (10.2) | 107 (10.0) | 49 (10.7) |
No | 1373 (89.8) | 963 (90.0) | 410 (89.3) |
Operator | |||
Primary | 202 (13.2) | 147 (13.7) | 55 (12.0) |
Secondary | 1218 (79.7) | 849 (79.4) | 369 (80.4) |
Tertiary | 109 (7.13) | 74 (6.92) | 35 (7.63) |
Primary occupation | |||
Farm/ranch work | 847 (56.1) | 589 (55.8) | 258 (57.0) |
Other | 662 (43.9) | 467 (44.2) | 195 (43.0) |
Percentage time working on operation | |||
100% | 385 (25.9) | 269 (25.9) | 116 (26.1) |
75–99% | 227 (15.3) | 158 (15.2) | 69 (15.5) |
50–74% | 223 (15.0) | 150 (14.4) | 73 (16.4) |
25–49% | 332 (22.4) | 234 (22.5) | 98 (22.0) |
0–24% | 318 (21.4) | 229 (22.0) | 89 (20.0) |
Farm or ranch | |||
Farm | 926 (60.6) | 654 (61.1) | 272 (59.3) |
Ranch | 231 (15.1) | 158 (14.8) | 73 (15.9) |
Both | 372 (24.3) | 258 (24.1) | 114 (24.8) |
Estimated revenue | |||
<50,000 | 245 (16.1) | 179 (16.8) | 66 (14.5) |
50,000–100,000 | 143 (9.41) | 105 (9.87) | 38 (8.33) |
100,000–200,000 | 233 (15.3) | 155 (14.6) | 78 (17.1) |
200,000–300,000 | 172 (11.3) | 127 (11.9) | 45 (9.87) |
300,000–400,000 | 150 (9.87) | 99 (9.30) | 51 (11.2) |
400,000–500,000 | 118 (7.76) | 82 (7.71) | 36 (7.89) |
500,000–1,000,000 | 287 (18.9) | 193 (18.1) | 94 (20.6) |
1,000,000–2,000,000 | 128 (8.42) | 92 (8.65) | 36 (7.89) |
2,000,000–3,000,000 | 27 (1.78) | 19 (1.79) | 8 (1.75) |
3,000,000–5,000,000 | 12 (0.79) | 10 (0.94) | 2 (0.44) |
>5,000,000 | 5 (0.33) | 3 (0.28) | 2 (0.44) |
Respiratory condition | |||
Yes | 417 (27.3) | 282 (26.4) | 135 (29.4) |
No | 1112 (72.7) | 788 (73.6) | 324 (70.6) |
Skin disorder | |||
Yes | 294 (19.2) | 217 (20.3) | 77 (16.8) |
No | 1235 (80.8) | 853 (79.7) | 382 (83.2) |
High work-related stress | |||
Yes | 338 (22.1) | 226 (21.1) | 112 (24.4) |
No | 1191 (77.9) | 844 (78.9) | 347 (75.6) |
Sleep deprivation | |||
Yes | 293 (19.2) | 192 (17.9) | 101 (22.0) |
No | 1236 (80.8) | 878 (82.1) | 358 (78.0) |
Exhaustion | |||
Yes | 345 (22.6) | 230 (21.5) | 115 (25.1) |
No | 1184 (77.4) | 840 (78.5) | 344 (74.9) |
Musculoskeletal discomfort exposure | |||
Yes | 1070 (70.0) | 747 (69.8) | 323 (70.4) |
No | 459 (30.0) | 323 (30.2) | 136 (29.6) |
Noise exposure * | |||
Yes | 990 (64.7) | 675 (63.1) | 315 (68.6) |
No | 539 (35.3) | 395 (36.9) | 144 (31.4) |
Respiratory exposures | |||
Yes | 786 (51.4) | 548 (51.2) | 238 (51.8) |
No | 743 (48.6) | 522 (48.8) | 221 (48.2) |
Skin exposures | |||
Yes | 1042 (68.2) | 717 (67.0) | 325 (70.8) |
No | 487 (31.8) | 353 (33.0) | 134 (29.2) |
Use MSD prevention techniques | |||
Yes | 1218 (79.7) | 851 (79.5) | 367 (80.0) |
No | 311 (20.3) | 219 (20.5) | 92 (20.0) |
Continuous variables | Mean (SD) | Mean (SD) | Mean (SD) |
Age | 59.7 (12.2) | 60.1 (12.2) | 58.9 (12.0) |
Number of musculoskeletal disorders | 1.26 (1.61) | 1.25 (1.62) | 1.28 (1.59) |
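The split sizes and injury percentages reported in the table above can be checked with simple arithmetic. The following is a minimal sketch, not the authors' code: the counts are copied from the table, and the helper name `pct` is hypothetical.

```python
# Counts copied from the table above, ordered as (total, training, test).
SAMPLE = (1529, 1070, 459)
INJURED = (156, 107, 49)


def pct(part: int, whole: int, digits: int = 1) -> float:
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


# Share of the full sample assigned to the training and test sets.
train_share = pct(SAMPLE[1], SAMPLE[0])
test_share = pct(SAMPLE[2], SAMPLE[0])

# Injury prevalence in each sample, for comparison with the "Yes" row under Injury.
prevalence = [pct(injured, n) for injured, n in zip(INJURED, SAMPLE)]

print("training share (%):", train_share)
print("test share (%):", test_share)
print("injury prevalence (%), total/training/test:", prevalence)
```

The counts correspond to a split of roughly 70% training and 30% test, with injury prevalence close to 10% in each sample.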
Variable | Univariable OR (95% CI) | Multivariable OR (95% CI) |
---|---|---|
Operator | ||
Primary | Reference | Reference |
Secondary | 0.91 (0.56, 1.47) | 0.78 (0.45, 1.34) |
Tertiary | 1.01 (0.48, 2.13) | 0.72 (0.31, 1.69) |
Primary occupation | ||
Farm/ranch work | Reference | Reference |
Other | 0.55 (0.38, 0.78) | 0.68 (0.38, 1.22) |
Percentage time working on operation | ||
100% | Reference | Reference |
75–99% | 1.05 (0.64, 1.73) | 1.09 (0.64, 1.85) |
50–74% | 1.03 (0.63, 1.70) | 1.39 (0.78, 2.45) |
25–49% | 0.82 (0.51, 1.31) | 1.50 (0.75, 3.00) |
0–24% | 0.33 (0.18, 0.61) | 0.93 (0.40, 2.17) |
Farm or ranch | ||
Both | Reference | Reference |
Farm | 0.70 (0.48, 1.02) | 0.70 (0.47, 1.06) |
Ranch | 0.94 (0.57, 1.56) | 0.86 (0.49, 1.49) |
Estimated revenue | 1.11 (1.03, 1.18) | |
Respiratory condition | ||
No | Reference | Reference |
Yes | 1.78 (1.26, 2.51) | 1.24 (0.84, 1.84) |
Diagnosed skin disorder | ||
No | Reference | Reference |
Yes | 1.70 (1.17, 2.48) | 1.45 (0.94, 2.22) |
High work-related stress | ||
No | Reference | Reference |
Yes | 2.97 (2.10, 4.19) | 1.61 (1.03, 2.52) |
Sleep deprivation | ||
No | Reference | Reference |
Yes | 2.43 (1.70, 3.48) | 1.06 (0.67, 1.68) |
Exhaustion | ||
No | Reference | Reference |
Yes | 2.54 (1.80, 3.59) | 0.96 (0.60, 1.52) |
Work positions leading to musculoskeletal discomfort | ||
No | Reference | Reference |
Yes | 3.40 (2.08, 5.57) | 1.80 (0.94, 3.42) |
Noise exposure | ||
No | Reference | Reference |
Yes | 1.85 (1.26, 2.72) | 1.03 (0.64, 1.66) |
Respiratory exposures | ||
No | Reference | Reference |
Yes | 2.23 (1.57, 3.18) | 0.97 (0.62, 1.52) |
Skin exposures | ||
No | Reference | Reference |
Yes | 2.19 (1.44, 3.33) | 0.94 (0.56, 1.57) |
Use MSD prevention techniques | ||
No | Reference | Reference |
Yes | 2.80 (1.59, 4.92) | 1.84 (0.93, 3.62) |
Age | 0.97 (0.96, 0.99) | 0.98 (0.96, 0.99) |
Number of musculoskeletal symptoms | 1.39 (1.27, 1.51) | 1.25 (1.13, 1.40) |
Variable | Univariable OR (95% CI) | Multivariable OR (95% CI) |
---|---|---|
Musculoskeletal symptoms | 1.39 (1.27, 1.51) | 1.28 (1.17, 1.41) |
Age | 0.97 (0.96, 0.99) | 0.98 (0.96, 0.99) |
Sleep deprivation | 2.43 (1.70, 3.48) | 1.14 (0.74, 1.75) |
High work-related stress | 2.97 (2.10, 4.19) | 1.57 (1.03, 2.39) |
Respiratory exposures | 2.23 (1.57, 3.18) | 1.42 (0.97, 2.09) |
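For readers who want to move between the odds ratios in these tables and the underlying logistic regression coefficients, the sketch below shows the usual relationship. It assumes the intervals are Wald-type (coefficient ± z × standard error on the log-odds scale), which the tables do not state. The function names are hypothetical, and the example values are the multivariable estimate for musculoskeletal symptoms in the table above.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def coef_from_or(or_point: float, lower: float, upper: float) -> tuple[float, float]:
    """Back out the log-odds coefficient and its standard error.

    Assumes a Wald-type interval, so the CI is symmetric on the log scale.
    """
    beta = math.log(or_point)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return beta, se


def or_with_ci(beta: float, se: float) -> tuple[float, float, float]:
    """Convert a coefficient and standard error back to an OR with 95% CI."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )


# Multivariable OR for musculoskeletal symptoms: 1.28 (1.17, 1.41).
beta, se = coef_from_or(1.28, 1.17, 1.41)
point, low, high = or_with_ci(beta, se)

print(f"beta = {beta:.3f}, SE = {se:.3f}")
print(f"OR = {point:.2f} ({low:.2f}, {high:.2f})")
```

Because the published values are rounded to two decimals, the recovered standard error is approximate. The round trip only confirms that the reported interval is consistent with a symmetric interval on the log scale.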
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Beseler, C.L.; Rautiainen, R.H. Using Machine Learning to Understand Injuries in Female Agricultural Operators in the Central United States. Safety 2025, 11, 9. https://doi.org/10.3390/safety11010009