Machine learning vs. regression models to predict the risk of Legionella contamination in a hospital water network

Authors

  • Osvalda De Giglio, Interdisciplinary Department of Medicine, Hygiene Section, University of Bari Aldo Moro, Bari, Italy
  • Fabrizio Fasano, Interdisciplinary Department of Medicine, Hygiene Section, University of Bari Aldo Moro, Bari, Italy
  • Giusy Diella, Interdisciplinary Department of Medicine, Hygiene Section, University of Bari Aldo Moro, Bari, Italy
  • Valentina Spagnuolo, Interdisciplinary Department of Medicine, Hygiene Section, University of Bari Aldo Moro, Bari, Italy; Department of Precision and Regenerative Medicine and Ionian Area (DiMePre-J), University of Bari Aldo Moro, Bari, Italy
  • Francesco Triggiano, Interdisciplinary Department of Medicine, Hygiene Section, University of Bari Aldo Moro, Bari, Italy
  • Marco Lopuzzo, Interdisciplinary Department of Medicine, Hygiene Section, University of Bari Aldo Moro, Bari, Italy; Department of Precision and Regenerative Medicine and Ionian Area (DiMePre-J), University of Bari Aldo Moro, Bari, Italy
  • Francesca Apollonio, Interdisciplinary Department of Medicine, Hygiene Section, University of Bari Aldo Moro, Bari, Italy
  • Carla Maria Leone, Azienda Ospedaliero Universitaria Policlinico di Bari, Hygiene Section, Bari, Italy
  • Maria Teresa Montagna, Interdisciplinary Department of Medicine, Hygiene Section, University of Bari Aldo Moro, Bari, Italy

Keywords:

Machine learning; Water network; Hospital; Artificial Intelligence; Legionella

Abstract

Introduction. Periodic monitoring of Legionella in hospital water networks allows preventive measures to be taken to reduce the risk of legionellosis for patients and healthcare workers.

Study design. The aim of the study was to standardize a method for predicting the risk of Legionella contamination in the water supply of a hospital facility by comparing Machine Learning, conventional, and combined models.

Methods. From July 2021 to October 2022, water sampling for Legionella detection was performed in the rooms of an Italian hospital pavilion (89.9% of the total number of rooms). Fifty-eight parameters describing the structural and environmental characteristics of the water network were collected. Models were built on 70% of the dataset and tested on the remaining 30% to evaluate accuracy, sensitivity, and specificity.

Results. A total of 1,053 water samples were analyzed, and 57 (5.4%) were positive for Legionella. Of the Machine Learning models tested, the most efficient had an input layer (56 neurons), a hidden layer (30 neurons), and an output layer (two neurons). Its accuracy was 93.4%, sensitivity was 43.8%, and specificity was 96%. The regression model had an accuracy of 82.9%, a sensitivity of 20.3%, and a specificity of 97.3%. The combination of the models achieved an accuracy of 82.3%, a sensitivity of 22.4%, and a specificity of 98.4%. The parameters that most influenced the model results were the type of water network (hot/cold), the replacement of filter valves, and atmospheric temperature. Among the models tested, Machine Learning obtained the best results in terms of accuracy and sensitivity.

Conclusions. Future studies are required to improve these predictive models by expanding the dataset with other parameters and other pavilions of the same hospital.
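The three evaluation measures named in the abstract (accuracy, sensitivity, and specificity) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `classification_metrics` and the ten toy labels are hypothetical, chosen only to show how each measure follows from the counts of true and false positives and negatives on a held-out test set.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # Share of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Share of truly positive samples that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of truly negative samples that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical toy test set, NOT data from the study:
# 2 positive samples and 8 negative samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because positive samples are rare in this kind of dataset (57 of 1,053 in the study), accuracy on its own says little about how well contaminated samples are detected, which is presumably why sensitivity and specificity are reported alongside it.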

Published

2025-04-06

Section

Original research

How to Cite

De Giglio O, Fasano F, Diella G, et al. Machine learning vs. regression models to predict the risk of Legionella contamination in a hospital water network. Ann Ig. 2025;37(1):128-140. doi:10.7416/ai.2024.2644