Machine Learning Models for Predicting Mental Health Indicators Using Digital Physical Activity Data: A Systematic Literature Review

  • Dirga Febrian, Universitas Logistik dan Bisnis Internasional
  • Rolly Maulana Awangga, Universitas Logistik dan Bisnis Internasional
Keywords: Machine Learning, Mental Health, Digital Physical Activity, Feature Selection

Abstract

This systematic literature review examines 40 studies (2020–2025) on the use of machine learning to predict mental health indicators from digital physical activity data. It addresses two research questions: how the algorithms compare in performance, and which factors determine model effectiveness. Survey data is the most widely used data source (43.9%). Logistic Regression is the most frequently used algorithm (29.3%) because of its interpretability, whereas Random Forest (26.8%) offers the best balance between performance and interpretability. XGBoost delivers the strongest performance, with an average accuracy of 80.1% ± 4.2% and an AUC of 87.1% ± 1.8%. The best-performing study achieves an AUC above 0.98 through sophisticated feature engineering using SHAP and recursive feature elimination. Critical success factors include careful feature selection, temporal dynamics, cross-validation, and clinical interpretability. Although machine learning has significant potential, challenges remain in standardization, generalizability, and real-world implementation. Future research requires longitudinal studies, external validation, and standardized protocols to realize this technology's potential for improving mental health outcomes.

Published
2025-07-30
How to Cite
Dirga Febrian, & Rolly Maulana Awangga. (2025). Machine Learning Models for Predicting Mental Health Indicators Using Digital Physical Activity Data: A Systematic Literature Review. Journal Informatic, Education and Management (JIEM), 7(2), 258-275. https://doi.org/10.61992/jiem.v7i2.142