Natural language processing and String Metric-assisted Assessment of Semantic Heterogeneity method for capturing and standardizing unstructured nursing activities in a hospital setting: a retrospective study
Keywords: Cross-mapping, natural language processing, standardized nursing terminology, professional assessment instrument, nursing activities, clinical nursing information system
Abstract
Background. Nurses record data in electronic health records (EHRs) using different terminologies and coding systems. The purpose of this study was to identify unstructured free-text nursing activities recorded by nurses in EHRs using natural language processing (NLP) techniques and to map these activities to standard nursing activities using the String Metric-assisted Assessment of Semantic Heterogeneity (SMASH) method.
Study design. A retrospective study using NLP techniques with a unidirectional mapping strategy called SMASH.
Methods. The unstructured free-text nursing activities recorded in the Medicine, Neurology and Gastroenterology inpatient units of the Agostino Gemelli IRCCS University Hospital Foundation, Rome, Italy were collected over 6 months in 2018. Data were analyzed in three phases: a) text summarization with NLP techniques, b) a consensus analysis by four experts to identify the category of each word stem, and c) cross-mapping with SMASH. The SMASH method compared strings and computed the similarity and distance of words using the Levenshtein distance (LD) and the Jaro-Winkler distance, and applied the following cross-mapping cut-offs: map [0.80-1.00] with LD < 13, partial-map [0.50-0.79] with LD < 13, and no map [0.0-0.49] with LD > 13.
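The cut-off logic described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the lowercasing step, the Winkler prefix scale of 0.1 and the handling of boundary cases the abstract leaves open (for example an LD of exactly 13, or a high similarity combined with a large LD) are assumptions made only for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def jaro(a: str, b: str) -> float:
    """Jaro similarity in [0, 1]."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_matched[j] and b[j] == ca:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_chars = [c for c, m in zip(a, a_matched) if m]
    b_chars = [c for c, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_chars, b_chars)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(a: str, b: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted by a shared prefix of up to four characters."""
    sim = jaro(a, b)
    prefix = 0
    for ca, cb in zip(a[:4], b[:4]):
        if ca != cb:
            break
        prefix += 1
    return sim + prefix * prefix_scale * (1.0 - sim)


def classify(free_text: str, standard_term: str, max_ld: int = 13) -> str:
    """Apply the abstract's cut-offs; boundary handling is an assumption."""
    a = free_text.strip().lower()   # assumed normalization, not from the source
    b = standard_term.strip().lower()
    sim = jaro_winkler(a, b)
    ld = levenshtein(a, b)
    if ld < max_ld and sim >= 0.80:
        return "map"
    if ld < max_ld and sim >= 0.50:
        return "partial-map"
    return "no map"                 # assumed fallback for every other case


# Identical strings have similarity 1.0 and LD 0, so they fall in "map".
print(classify("blood pressure measurement", "Blood pressure measurement"))
```

In this sketch the similarity score and the LD threshold must both be satisfied for a map or partial-map; how the original method combines the two criteria in edge cases is not stated in the abstract.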
Results. During the study period, 491 patient records were assessed, in which nurses recorded 548 different unstructured free-text nursing activities. Of these, 451 (82.3%) were mapped to standard Professional Assessment Instrument (PAI) nursing activities, 47 (8.7%) were partially mapped, and 50 (9.0%) were not mapped. The automated mapping yielded a recall of 0.95, precision of 0.94, accuracy of 0.91 and F-measure of 0.96. The F-measure indicates good reliability of this automated cross-mapping procedure.
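For readers unfamiliar with these evaluation measures, the following Python sketch shows their usual definitions. The abstract does not report the underlying counts of correct and incorrect mappings or state which F-measure variant was used, so the counts below are hypothetical and the balanced F-measure (harmonic mean of precision and recall) is an assumption.

```python
def precision_recall_f(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall and balanced F-measure from confusion counts."""
    precision = tp / (tp + fp)   # share of proposed mappings that are correct
    recall = tp / (tp + fn)      # share of true mappings that were found
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all decisions (mapped or not mapped) that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts, chosen only to exercise the functions.
p, r, f = precision_recall_f(tp=90, fp=10, fn=10)
print(round(p, 2), round(r, 2), round(f, 2))
print(round(accuracy(tp=90, tn=40, fp=10, fn=10), 2))
```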
Conclusions. Lexical similarities were found between unstructured free-text nursing activities and standard nursing activities. NLP combined with the SMASH method is a feasible approach to extract data related to nursing concepts that are not recorded through structured data entry.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Transfer of Copyright and Permission to Reproduce Parts of Published Papers.
Authors retain the copyright for their published work. No formal permission will be required to reproduce parts (tables or illustrations) of published papers, provided the source is quoted appropriately and reproduction has no commercial intent. Reproductions with commercial intent will require written permission and payment of royalties.